Showing posts with label hibernate. Show all posts

Apache Lucene - Indexing - Part 1

"Information retrieval (IR) is the science of searching for documents, for information within documents and for metadata about documents, as well as that of searching relational databases and the World Wide Web."

Most applications need some kind of search feature. If you want to add a powerful text search engine to your application, use Lucene, which can add advanced search engine capabilities to an application. It is a really powerful Java API that gave birth, directly or indirectly, to powerful tools such as Nutch, Hadoop, Hibernate Search and so on. Lucene was started in 1997 and adopted by Apache in 2001. Lucene's main job is powerful full-text indexing of data.
Indexing with Lucene breaks down into three main operations: converting data to text, analyzing it, and saving it to the index. Lucene works on plain text only, so documents in other formats have to be parsed into text before they can be indexed.
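The three operations can be sketched in plain JavaScript. This is only a toy illustration and not Lucene's API: the names `toText`, `analyze`, `addToIndex` and `STOP_WORDS`, the tiny stop-word list, and the `Map`-based index are all assumptions made for the example.

```javascript
// Toy sketch of the three indexing steps; NOT Lucene's API.
// All names here (toText, analyze, addToIndex, STOP_WORDS) are hypothetical.

// Step 1: convert the source data (here, a small HTML snippet) to plain text.
function toText(html) {
  return html.replace(/<[^>]*>/g, ' ');
}

// Step 2: analyze the text: lowercase it, split it into tokens, drop stop words.
const STOP_WORDS = new Set(['a', 'an', 'the', 'is', 'of', 'to']);
function analyze(text) {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter(function (token) {
      return token.length > 0 && !STOP_WORDS.has(token);
    });
}

// Step 3: save the terms to the index (term -> set of document ids).
function addToIndex(index, docId, terms) {
  terms.forEach(function (term) {
    if (!index.has(term)) {
      index.set(term, new Set());
    }
    index.get(term).add(docId);
  });
}

const index = new Map();
addToIndex(index, 1, analyze(toText('<p>Lucene is a <b>search</b> library</p>')));
addToIndex(index, 2, analyze(toText('<p>The index makes search fast</p>')));
```

After these two calls, looking up a term such as `index.get('search')` returns the ids of the documents that contain it without scanning the original text.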
To search large amounts of text quickly, you must first index that text and convert it into a format that will let you search it rapidly, eliminating the slow sequential scanning process. This conversion process is called indexing, and its output is called an index. Searches then run against this index rather than the original data, at the cost of the extra space needed to store the index.
These index files are stored in a directory. A Lucene index is divided into segments, each made up of several index files, and a single index can hold many documents. When new documents are indexed, they are added as new segments rather than by modifying the existing index files. This is what Lucene calls incremental indexing: after the initial full indexing, only the new documents need to be indexed to become searchable. The segments, or sub-indexes, are independently searchable, and the results from each segment are merged.

Structurally, a Lucene index is an inverted index: it maps each term to the documents that contain it. While searching, Lucene loads the index into memory. Indexing is high performance, and the index size is roughly 20-30% of the size of the text indexed, which keeps memory use low.

Each document in an index is a collection of fields, and each field is a named collection of terms, so a term is identified by a <field, term> pair. These fields are independent search spaces defined at run time. Suppose a wiki article is indexed: we can set the field properties so that a field's content is indexed for searching, stored so that the original article data can be returned with the results, or both.
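A rough sketch of how <field, term> postings, segments, incremental indexing and merged results fit together. It is a toy in-memory model, not Lucene's file format or API; `buildSegment`, `searchSegment` and `search` are hypothetical names, and the sample documents are made up for illustration.

```javascript
// Toy model of a segmented inverted index; NOT Lucene's file format or API.
// buildSegment, searchSegment and search are hypothetical names.

// Build one segment from a batch of documents.
// Each document is { id, fields: { fieldName: text } }.
// Postings are keyed by the <field, term> pair, written here as "field:term".
function buildSegment(docs) {
  const postings = new Map();
  docs.forEach(function (doc) {
    Object.keys(doc.fields).forEach(function (field) {
      doc.fields[field].toLowerCase().split(/\W+/).forEach(function (term) {
        if (term === '') { return; }
        const key = field + ':' + term;
        if (!postings.has(key)) { postings.set(key, []); }
        const ids = postings.get(key);
        // Record each document once per <field, term> pair.
        if (ids[ids.length - 1] !== doc.id) { ids.push(doc.id); }
      });
    });
  });
  return postings;
}

// Each segment is searchable on its own.
function searchSegment(segment, field, term) {
  return segment.get(field + ':' + term.toLowerCase()) || [];
}

// A search over the whole index merges the per-segment results.
function search(segments, field, term) {
  return segments.reduce(function (hits, segment) {
    return hits.concat(searchSegment(segment, field, term));
  }, []);
}

// Initial indexing creates the first segment.
const segments = [buildSegment([
  { id: 1, fields: { title: 'Lucene basics', body: 'Indexing text with Lucene' } },
  { id: 2, fields: { title: 'Hibernate Search', body: 'Built on Lucene' } }
])];

// Incremental indexing: new documents go into a new segment;
// the existing segment is left untouched.
segments.push(buildSegment([
  { id: 3, fields: { title: 'Nutch', body: 'A crawler that uses Lucene' } }
]));
```

Because each field is its own search space, `search(segments, 'title', 'lucene')` and `search(segments, 'body', 'lucene')` return different sets of document ids even though the term is the same.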



More about Lucene index file formats - here

Develop an Open social application in 60 seconds

OpenSocial application development has made a giant leap: there is now an Eclipse plugin that eases the development of OpenSocial apps. I have written about OpenSocial applications before and have had a chance to work on them. Apache Shindig is the initiative to build an SNS container for application development and testing. I would say the OSDE plugin developed by Yoichiro Tanaka rocks! It uses Apache Shindig and Hibernate for dynamic development, so the developer can create a single application for different data models. As Apache Shindig provides Java REST support, application development becomes more extensible. The database packaged with the plugin is H2 (a Java SQL database). Before this plugin, we had to develop and run applications inside a sandbox, which was really tiring. The plugin offers wizard-style development for both JavaScript widgets and Java REST client applications. We can also define our own custom social data, which is easily persisted thanks to the excellent plugin architecture. So I tried to develop a simple application ...

After the plugin is installed, create a new OSDE project.


Specify the gadget.xml, the API specs, etc.


During development we need to run Apache Shindig in the background.



To have custom social data, create people and add relationships between them.





Write a simple gadget ... (templates can be generated by the plugin if needed)
----src-------


<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Module>
<ModulePrefs author_email="harisa@pramati.com" description="A friendly os app" title="Friends">
<Require feature="opensocial-0.8"/>
<Require feature="dynamic-height"/>
</ModulePrefs>
<Content view="canvas" type="html"><![CDATA[

<!-- Fetching People and Friends -->
<div>
<button onclick='fetchPeople();'>Fetch</button>
<div style="margin-left:20px;">
I am ... <span id='viewer' style="background-"></span><br/>My friends are ...
<ul id='friends' style="margin-top:5px;list-style:none;margin-left:75px;"></ul>
</div>
</div>
<script type='text/javascript'>
function fetchPeople() {
  // Batch both requests into a single round trip to the container.
  var req = opensocial.newDataRequest();
  req.add(req.newFetchPersonRequest(opensocial.IdSpec.PersonId.VIEWER), 'viewer');

  // IdSpec that selects the viewer's friends.
  var params = {};
  params[opensocial.IdSpec.Field.USER_ID] = opensocial.IdSpec.PersonId.VIEWER;
  params[opensocial.IdSpec.Field.GROUP_ID] = 'FRIENDS';
  var idSpec = opensocial.newIdSpec(params);
  req.add(req.newFetchPeopleRequest(idSpec), 'friends');

  req.send(function(data) {
    // Stop if the container reported an error for the request.
    if (data.hadError()) {
      return;
    }

    // Show the viewer's id.
    var viewer = data.get('viewer').getData();
    document.getElementById('viewer').innerHTML = viewer.getId();

    // Rebuild the friends list from scratch.
    var friends = data.get('friends').getData();
    var list = document.getElementById('friends');
    list.innerHTML = '';
    friends.each(function(friend) {
      list.innerHTML += '<li>&#187;' + friend.getId() + '</li>';
    });

    // Resize the gadget to fit the new content (dynamic-height feature).
    gadgets.window.adjustHeight();
  });
}
</script>
]]></Content>
</Module>

----src ends---------
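One optional refinement, sketched here as an assumption rather than anything the plugin generates: escape the ids and build the list markup in a single pass before it is assigned to `innerHTML`. `escapeHtml` and `renderFriendItems` are hypothetical helper names and are not part of the OpenSocial API.

```javascript
// Hypothetical helpers; not part of the OpenSocial API or the OSDE templates.

// Escape characters that are significant in HTML.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Turn an array of friend ids into <li> markup in a single pass.
function renderFriendItems(ids) {
  return ids.map(function (id) {
    return '<li>&#187;' + escapeHtml(id) + '</li>';
  }).join('');
}
```

Inside the `req.send` callback, the ids could be collected with `friends.each(function (friend) { ids.push(friend.getId()); })` and the result of `renderFriendItems(ids)` assigned once to the `friends` element.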

Run the application

Gadget --->


Cool... It's simple. But this plugin will be really useful when we develop complex applications and aim for multiple containers that support the OpenSocial API.

More ....
opensocial-development-environment
screencasts
youtube