

How Stuff Works: Spring Component Scanning

Spring can scan the classpath for component classes and register them as beans automatically. The configuration is thus tied to the application code through annotations. Spring JavaConfig also supports convention over configuration. There are plenty of documents and references explaining how to configure Spring; here I look at how component scanning works under the hood.

A minimal entry in the application context XML

<context:component-scan base-package="packageName"/>

will scan all the component classes in the package and its sub-packages. The component classes on the classpath are detected and bean definitions are auto-registered for them.

As per the Schema URI and Schema XSD, the context namespace will be like this - Reference


Stereotype annotations are markers for any class that fulfills a role within an application. This is well showcased in Spring MVC. More about the annotations

For efficient configuration, we can keep multiple context XMLs to organize resources. The application can have one for DAOs, one for services and so on, so that the context loader can scan each layer separately. MVC applications will thus have separate XMLs for @Repository (data access tier), @Service (service tier) and @Controller (web tier) components.
For the example, a simple Java app, I used them in a single XML, since this is not an MVC app.

A UserDAO Interface

package com.sample.data;

import java.util.List;

public interface UserDAO {

    List<String> getUsers();
}


Its Implementation

package com.sample.data;

import java.util.ArrayList;
import java.util.List;

import org.springframework.stereotype.Repository;

@Repository("userDAO")
public class UserDAOImpl implements UserDAO {

    @Override
    public List<String> getUsers() {

        List<String> l = new ArrayList<String>();
        l.add("Roger Moore");
        l.add("Pierce Brosnan");

        return l;
    }

}


Service Layer

package com.sample.service;

import java.util.List;

public interface UserService {
    List<String> getUsers();
}

And its implementation

package com.sample.service;

import java.util.List;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

import com.sample.data.UserDAO;


@Service("userService")
public class UserServiceImpl implements UserService {

    @Autowired
    private UserDAO userDAO;

    @Override
    public List<String> getUsers() {

        return userDAO.getUsers();
    }

}


The client

package com.sample.client;

import org.springframework.context.ApplicationContext;
import org.springframework.context.support.ClassPathXmlApplicationContext;

import com.sample.service.UserService;

public class SpringClient {

    private UserService service;

    public SpringClient() {
        ApplicationContext appContext = new ClassPathXmlApplicationContext("resource/applicationContext.xml");
        service = (UserService) appContext.getBean("userService");
        ((ClassPathXmlApplicationContext) appContext).close();
    }

    public void showUsers() {
        for (String s : service.getUsers()) {
            System.out.println(s);
        }
    }

    public static void main(String[] args) {
        SpringClient spc = new SpringClient();
        spc.showUsers();
    }

}




applicationContext

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:context="http://www.springframework.org/schema/context"
xmlns:aop="http://www.springframework.org/schema/aop"
xmlns:tx="http://www.springframework.org/schema/tx"
xsi:schemaLocation="http://www.springframework.org/schema/beans
http://www.springframework.org/schema/beans/spring-beans-2.5.xsd
http://www.springframework.org/schema/context
http://www.springframework.org/schema/context/spring-context-2.5.xsd
http://www.springframework.org/schema/aop
http://www.springframework.org/schema/aop/spring-aop-2.5.xsd
http://www.springframework.org/schema/tx
http://www.springframework.org/schema/tx/spring-tx-2.5.xsd"
default-autowire="byName">

<!-- Enable autowiring via @Autowired annotations -->
<context:annotation-config/>


<context:component-scan base-package="com.sample.data">
    <context:include-filter type="annotation"
        expression="org.springframework.stereotype.Repository"/>
</context:component-scan>

<context:component-scan base-package="com.sample.service">
    <context:include-filter type="annotation"
        expression="org.springframework.stereotype.Service"/>
</context:component-scan>

</beans>



After bootstrapping, the application components and services that need to be created are identified. AbstractBeanDefinitionReader reads the resource definitions, and DefaultListableBeanFactory is used as the default bean factory, built on bean definition objects. XmlBeanDefinitionReader.loadBeanDefinitions() loads bean definitions from the specified XML file; the BeanDefinitionParser implementations identify the context namespace and parse the applicationContext XML. Resources are located by an implementation of ResourcePatternResolver, i.e. PathMatchingResourcePatternResolver, which resolves location patterns in Ant style. Internally it uses the ClassLoader.getResources(String name) method, which returns an Enumeration of URLs representing classpath resources. Then ComponentScanBeanDefinitionParser parses the context:component-scan definition nodes. If annotation configuration is enabled, the detected candidate components can be autowired: a default AutowiredAnnotationBeanPostProcessor is registered by both the "context:annotation-config" and "context:component-scan" XML tags. If filters are added, the type filters are parsed as well. In the example the filter type is annotation, so an AnnotationTypeFilter is used to match classes carrying the Repository annotation, which marks the DAO.
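
The annotation filter step can be illustrated without Spring on the classpath. The following is a minimal sketch and not Spring's implementation: SketchComponent and SketchRepository are hypothetical stand-ins for @Component and @Repository, the candidate list is passed in by hand instead of being scanned, and matches() only looks one level deep for meta-annotations.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for Spring's @Component.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface SketchComponent {}

// Hypothetical stand-in for @Repository; it is itself marked as a component.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@SketchComponent
@interface SketchRepository {}

@SketchRepository
class SketchUserDao {}

class SketchPlainClass {}

class AnnotationFilterSketch {

    // True if the class carries the annotation directly or, when considerMeta
    // is set, through an annotation that is itself annotated with it.
    static boolean matches(Class<?> candidate, Class<? extends Annotation> wanted, boolean considerMeta) {
        if (candidate.isAnnotationPresent(wanted)) {
            return true;
        }
        if (!considerMeta) {
            return false;
        }
        for (Annotation a : candidate.getAnnotations()) {
            if (a.annotationType().isAnnotationPresent(wanted)) {
                return true;
            }
        }
        return false;
    }

    // Keeps the simple names of the candidates that pass the filter.
    static List<String> select(List<Class<?>> candidates, Class<? extends Annotation> wanted, boolean considerMeta) {
        List<String> names = new ArrayList<>();
        for (Class<?> c : candidates) {
            if (matches(c, wanted, considerMeta)) {
                names.add(c.getSimpleName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        List<Class<?>> candidates = Arrays.<Class<?>>asList(SketchUserDao.class, SketchPlainClass.class);
        System.out.println(select(candidates, SketchComponent.class, true));
        System.out.println(select(candidates, SketchComponent.class, false));
    }
}
```

With meta-annotations considered, SketchUserDao is selected as a component because its repository marker is itself a component marker; without them, nothing matches.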

Spring classes are designed to be extended. Going through the API docs, I found that include and exclude filters can be added programmatically too.

So I added a showComponents method to the client code, using a selected base package

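// Needs these imports in SpringClient:
// java.util.Set
// org.springframework.beans.factory.config.BeanDefinition
// org.springframework.context.annotation.ClassPathScanningCandidateComponentProvider
// org.springframework.core.type.filter.AnnotationTypeFilter
// org.springframework.stereotype.Repository
public void showComponents() {

    // true = start with the default filters (stereotype annotations)
    ClassPathScanningCandidateComponentProvider provider =
            new ClassPathScanningCandidateComponentProvider(true);
    String basePackage = "com/sample/data";

    // First pass: exclude everything annotated with @Repository
    provider.addExcludeFilter(new AnnotationTypeFilter(Repository.class, true));
    Set<BeanDefinition> filteredComponents = provider.findCandidateComponents(basePackage);
    System.out.println("No of components :" + filteredComponents.size());

    for (BeanDefinition component : filteredComponents) {
        System.out.println("Component: " + component.getBeanClassName());
    }

    // Second pass: reset the filters and include @Repository explicitly
    provider.resetFilters(true);
    provider.addIncludeFilter(new AnnotationTypeFilter(Repository.class, true));
    filteredComponents = provider.findCandidateComponents(basePackage);
    System.out.println("No of components :" + filteredComponents.size());

    for (BeanDefinition component : filteredComponents) {
        System.out.println("Component: " + component.getBeanClassName());
    }

}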

So the output will be

No of components :0
No of components :1
Component: com.sample.data.UserDAOImpl



When code and configuration are static, as with ordinary beans, annotation scanning is useful. For resources such as JNDI or JDBC settings, or anything similarly dynamic, XML is the better choice, since it can be modified without a code change. Annotation scanning is widely used for request mappings and controllers in Spring MVC. Where both are used, XML configuration overrides the annotation-based configuration. When the number of classes grows, scanning becomes more expensive, so the candidates should be filtered by the type required.


More Reading

Classpath scanning and managed components

ETags - Roles in Web Application to Cloud Computing

A web server can return a value in a response header known as the ETag (entity tag), which helps the client know whether the content at the requested URL has changed. When a page is loaded in the browser, it is cached along with its ETag. On the next request the browser sends that ETag as the value of the "If-None-Match" header. The server reads this header and compares it with the current ETag of the page. If the values are the same, i.e. the content has not changed, status code 304 (Not Modified) is returned. This HTTP metadata can be used to predict page downloads and thereby optimize bandwidth usage. A combination of a checksum (MD5) of the data as the ETag value and a correct modification time-stamp could possibly give better results in predicting a re-download. An analysis of the effectiveness of choosing the ETag value is described in this paper.

According to http://www.mnot.net/cache_docs/

A resource is eligible for caching if:

  • There is caching info in the HTTP response headers
  • The response is non-secure (HTTPS responses won't be cached)
  • An ETag or Last-Modified header is present
  • The cached representation is fresh

Entity tags can be strong or weak validators. A strong validator guarantees the uniqueness of a representation: if we use MD5 or SHA1, the entity tag changes when a single bit of the data changes. A weak validator changes only when the meaning of an entity (which can be a set of semantically related representations) changes.
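
The difference shows up when two entity tags are compared. The sketch below assumes the convention from the HTTP conditional-request specification, which is not spelled out in this post: a weak tag is written with a W/ prefix, a strong comparison succeeds only if neither tag is weak, and a weak comparison ignores the prefix. The class and method names are made up for illustration.

```java
class EtagComparisonSketch {

    // A weak entity tag is marked with the W/ prefix, e.g. W/"abc".
    static boolean isWeak(String etag) {
        return etag.startsWith("W/");
    }

    // The quoted part of the tag, without the weakness marker.
    static String opaqueTag(String etag) {
        return isWeak(etag) ? etag.substring(2) : etag;
    }

    // Strong comparison: both tags must be strong and identical.
    static boolean strongMatch(String a, String b) {
        return !isWeak(a) && !isWeak(b) && a.equals(b);
    }

    // Weak comparison: the quoted parts must be identical.
    static boolean weakMatch(String a, String b) {
        return opaqueTag(a).equals(opaqueTag(b));
    }

    public static void main(String[] args) {
        System.out.println(strongMatch("\"1\"", "\"1\""));
        System.out.println(strongMatch("W/\"1\"", "\"1\""));
        System.out.println(weakMatch("W/\"1\"", "\"1\""));
    }
}
```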

More info on conditional requests explaining strong and weak ETags in here

In Spring MVC, support for ETags is provided by the servlet filter ShallowEtagHeaderFilter. If you look at the source here

String responseETag = generateETagHeaderValue(body);
.... ......

protected String generateETagHeaderValue(byte[] bytes) {
    StringBuilder builder = new StringBuilder("\"0");
    Md5HashUtils.appendHashString(bytes, builder);
    builder.append('"');
    return builder.toString();
}


The default implementation generates an MD5 hash of the response body, for example the output of a rendered JSP. Whenever the same page is requested again, the filter checks the If-None-Match header, and if it matches the generated ETag a 304 is sent back.


String requestETag = request.getHeader(HEADER_IF_NONE_MATCH);
if (responseETag.equals(requestETag)) {
    if (logger.isTraceEnabled()) {
        logger.trace("ETag [" + responseETag + "] equal to If-None-Match, sending 304");
    }
    response.setStatus(HttpServletResponse.SC_NOT_MODIFIED);
}
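
The same idea can be tried outside a servlet container. This is a minimal sketch using only the JDK, not the filter's actual code: ShallowEtagSketch and its methods are hypothetical names, the hash is formatted by hand instead of with Md5HashUtils, and the status is returned as a plain int.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class ShallowEtagSketch {

    // Builds a quoted ETag value: a leading 0 followed by the MD5 of the body in hex.
    static String etagFor(byte[] body) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(body);
            StringBuilder builder = new StringBuilder("\"0");
            for (byte b : digest) {
                builder.append(String.format("%02x", b));
            }
            return builder.append('"').toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // 304 if the client's If-None-Match value equals the current ETag, else 200.
    static int statusFor(byte[] body, String ifNoneMatch) {
        return etagFor(body).equals(ifNoneMatch) ? 304 : 200;
    }

    public static void main(String[] args) {
        byte[] page = "<html>hello</html>".getBytes(java.nio.charset.StandardCharsets.UTF_8);
        String etag = etagFor(page);
        System.out.println(statusFor(page, null)); // first visit, no cached ETag
        System.out.println(statusFor(page, etag)); // revisit with the cached ETag
    }
}
```

Note that the body has to be fully generated before the hash can be computed, so a scheme like this only decides whether the body is sent.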



This reduces bandwidth usage, but not server processing: the response still has to be rendered before its hash can be computed. Since it is a plain servlet filter, it can be used in combination with any web framework. An MD5 hash ensures that the hash part of the ETag is only 32 characters long, while collisions are highly unlikely. A deeper ETag implementation that reaches into the model layer for uniqueness is also possible. It could be related to the revisions of row data; matching those would predict more reliably when a download can be skipped.

As per the JSR 286 portlet specification, a portlet should set the ETag property (validation token) and an expiration time when rendering. New render/resource requests are only issued after the expiration time is reached, and the new request carries the ETag. The portlet should examine it and determine whether the cached content is still good; if so, it sets a new expiration time and does not render. This specification is implemented in Spring MVC (see JIRA).

A hypothetical model for REST responses using deeper ETags could be effective when an API is exposed or two applications are integrated. I have seen such an implementation using Python here

In cloud computing, when Amazon S3 receives a PUT request with the Content-MD5 header, it computes the MD5 of the object received and returns a 400 error if it doesn't match the MD5 sent in the header. Here Amazon or Azure uses Content-MD5, which carries the base64-encoded 128-bit (16-byte) MD5 digest.
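
The check on the receiving side can be sketched with the JDK alone. This is an illustration of the idea, not Amazon's code: ContentMd5Sketch and its method names are hypothetical, and the status codes are returned as plain ints.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

class ContentMd5Sketch {

    // Content-MD5 value: the 16-byte MD5 digest of the body, base64-encoded.
    static String contentMd5(byte[] body) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(body);
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // 400 if the digest of the received bytes differs from the header, else 200.
    static int statusForPut(byte[] receivedBody, String contentMd5Header) {
        return contentMd5(receivedBody).equals(contentMd5Header) ? 200 : 400;
    }

    public static void main(String[] args) {
        byte[] sent = "hello".getBytes(java.nio.charset.StandardCharsets.UTF_8);
        byte[] corrupted = "hellp".getBytes(java.nio.charset.StandardCharsets.UTF_8);
        String header = contentMd5(sent);
        System.out.println(statusForPut(sent, header));
        System.out.println(statusForPut(corrupted, header));
    }
}
```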

According to the article here, if for some reason an entity in S3 is updated with exactly the same bits it previously had, the ETag will not have changed, but that is probably OK anyway.

According to S3 REST API,

Amazon S3 returns the first ten megabytes of the file, the Etag of the file, and the total size of the file (20232760 bytes) in the Content-Length field.

To ensure the file did not change since the previous portion was downloaded, specify the if-match request header. Although the if-match request header is not required, it is recommended for content that is likely to change.
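
A resumed download along these lines could look as follows. This is only a sketch under assumptions: ResumeDownloadSketch is a hypothetical helper, the headers are collected in a map rather than sent, and the status codes are the generic HTTP ones (206 Partial Content, 412 Precondition Failed), not something taken from the S3 documentation quoted above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ResumeDownloadSketch {

    // Headers for fetching the rest of a file, guarded by the ETag of the first part.
    static Map<String, String> resumeHeaders(long bytesAlreadyDownloaded, String etagOfFirstPart) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Range", "bytes=" + bytesAlreadyDownloaded + "-");
        headers.put("If-Match", etagOfFirstPart);
        return headers;
    }

    // Server-side view: serve the range only if the file still has the expected ETag.
    static int statusFor(String currentEtag, String ifMatch) {
        return (ifMatch == null || ifMatch.equals(currentEtag)) ? 206 : 412;
    }

    public static void main(String[] args) {
        Map<String, String> headers = resumeHeaders(10485760L, "\"abc\"");
        System.out.println(headers);
        System.out.println(statusFor("\"abc\"", headers.get("If-Match")));
        System.out.println(statusFor("\"changed\"", headers.get("If-Match")));
    }
}
```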


The ETag header in the HTTP specification lets developers implement caching, which can be very effective at the transport level for REST services as well as web applications. The trade-off is that there may be security implications to having data reside at the transport level.

But for static files with a large "Expires" value, and for clustered files, ETags will not be effective, because the unique checksum of the distributed files is transported to the client for each GET request. By removing the ETag header, you prevent caches and browsers from validating files, so they are forced to rely on your Cache-Control and Expires headers. This also reduces the header size, since the checksum value is no longer sent.
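
How freshness and validation interact on the client side can be sketched as a small decision function. This is a simplified model under assumptions, not a browser's real algorithm: CacheDecisionSketch, its Action values and the age/max-age parameters are made up for illustration.

```java
class CacheDecisionSketch {

    enum Action { USE_CACHE, REVALIDATE, REFETCH }

    // A fresh copy is used directly; a stale copy is revalidated if an ETag
    // is available, and fetched again in full otherwise.
    static Action decide(long ageSeconds, long maxAgeSeconds, String etag) {
        if (ageSeconds < maxAgeSeconds) {
            return Action.USE_CACHE;
        }
        return etag != null ? Action.REVALIDATE : Action.REFETCH;
    }

    public static void main(String[] args) {
        System.out.println(decide(60, 3600, null));        // still fresh, no request needed
        System.out.println(decide(7200, 3600, "\"abc\"")); // stale, conditional GET possible
        System.out.println(decide(7200, 3600, null));      // stale, no validator available
    }
}
```

In this model a file with a long freshness lifetime never reaches the validation branch, which is why dropping the ETag costs little for such files.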