Archive for June 2008
time based cache expiry for rails action cache
rails has excellent support for caching actions, pages, queries and so on.
rails' default behavior is more than enough for most projects, though i was looking for a time based expiry option on the “caches_action” functionality. unfortunately there wasn’t one, so here is a simple trick i used to make it work with different urls and time based expiration.
i added “caches_action :recent” to my controller and added the following protected method -
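# needs require 'digest/md5' at the top of the file
protected

# get_from_cache / add_to_cache are helpers (not shown in this post)
# that read from and write to memcached
def fragment_cache_key(p_args)
  # lookup key built from the request path and query string
  cache_key = "cache_key_#{request.path}#{request.headers["QUERY_STRING"]}".gsub(/=/, "")
  action_cache_key = get_from_cache(cache_key)
  return action_cache_key if action_cache_key

  # nothing live in memcached: generate a fresh key that expires in an hour
  action_cache_key = Digest::MD5.hexdigest("#{rand}#{Time.now}")
  add_to_cache(cache_key, action_cache_key, {:expiry => 1.hours})
  action_cache_key
end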
what actually happens is that i generate a key and store it in my memcached instance with a one hour expiry.
so when memcached expires that key, the next request generates a new one and the old action cache entry is no longer used.
this way the default rails action cache works with a time limit.
this isn’t the whole story though: i still need to clean up the previously created cache files so they don’t consume storage unnecessarily.
semantic-repository-0.5.2: deployment update
hi,
as some of you know, semantic repository version 0.5.2 has a problem with an IOException (too many open files)
which stopped the repository from performing further searches. that was the reason for the empty search results.
after searching for a while, we found the cause (which is also mentioned in the lucene FAQ).
by default a bash shell allows only a limited number of open files, and since we had many index files, lucene has to open them while performing a search.
so when it exceeded the limit of 1025 files (the default on our production environment), our application threw the following exception -
Caused by: java.io.FileNotFoundException: /var/indexes/ads-index/_gm.tvd (Too many open files)
    at java.io.RandomAccessFile.open(Native Method)
    at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
    at org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor.<init>(FSDirectory.java:506)
    at org.apache.lucene.store.FSDirectory$FSIndexInput.<init>(FSDirectory.java:536)
    at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:445)
    at org.apache.lucene.index.TermVectorsReader.<init>(TermVectorsReader.java:70)
after increasing this limit to 10,000 the problem no longer occurred. we had to run the “ulimit -n 10000” command in the bash shell to make it work.
TODO: possible bug, probably our system is opening several index searchers.
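to follow up on that TODO it helps to watch how many file descriptors the jvm actually holds. below is a minimal sketch, not something from our code base: the class name OpenFileCheck is made up, and it assumes a unix jvm that exposes com.sun.management.UnixOperatingSystemMXBean (on other jvms it simply reports that the numbers are unavailable).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// hypothetical helper, assumes a unix jvm with the com.sun.management extension
public class OpenFileCheck {

    // returns {open, max} descriptor counts, or null when the jvm does not expose them
    public static long[] descriptorCounts() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            com.sun.management.UnixOperatingSystemMXBean unix =
                (com.sun.management.UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return null;
    }

    public static void main(String[] args) {
        long[] counts = descriptorCounts();
        if (counts == null) {
            System.out.println("file descriptor counts are not available on this jvm");
        } else {
            System.out.println("open file descriptors: " + counts[0] + " of " + counts[1]);
        }
    }
}
```

if the open count keeps growing after every search, that would support the suspicion that several index searchers are being opened and never closed.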
lucene based semantic-repository 0.5.2: major performance improvement, now 24,000 items imported in 19 minutes
when we started using semantic repository, we had only one lucene index to make our content searchable,
later we added another integration with a php based service, aawaj.
the aawaj service had more than 150,000 items to index. we tried to index all of the content with our current release 0.5.1, but we ended up with extremely poor performance. later we released another version, 0.5.2, where we added queued request handling and moved index optimization out to a restful service uri – /rest/service/optimize/
here is a simple benchmark report -
version – 0.5.1 – first 100 items finished in – 13.611 seconds.
version – 0.5.2 – first 100 items finished in – 5.6152 seconds.
the difference is significant (roughly 2.4 times faster). later today we ran another import on our repository, and interestingly it took 1 hour to index 150,000 items, which was a bit surprising since we were unable to do it at all with 0.5.1
what we actually did was add a single thread executor, which keeps every task in a queue and executes them one by one, so we could remove the synchronized method.
here is some example code -
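// needs java.util.concurrent.Executor and java.util.concurrent.Executors
// one worker thread: tasks are queued and executed one at a time
private final Executor mIndexTaskExecutor = Executors.newSingleThreadExecutor();

public void addDocument(final Document pDocument) {
  mIndexTaskExecutor.execute(new Runnable() {
    public void run() {
      getLuceneIndexTemplate().addDocument(pDocument);
    }
  });
}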
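to see why the queue makes the synchronized method unnecessary, here is a small self-contained sketch of the same idea. it is only an illustration: QueuedIndexer, finish() and demo() are made up names, and a plain list stands in for the lucene index. several threads call addDocument at once, but only the single worker thread ever touches the list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class QueuedIndexer {
    // stand-in for the lucene index (hypothetical); only the worker thread writes to it
    private final List<String> index = new ArrayList<String>();
    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    // returns immediately; the write happens later on the worker thread
    public void addDocument(final String document) {
        executor.execute(new Runnable() {
            public void run() {
                index.add(document);
            }
        });
    }

    // stops accepting work, waits for the queue to drain, returns a copy of the index
    public List<String> finish() {
        executor.shutdown();
        try {
            if (!executor.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("queue did not drain in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new ArrayList<String>(index);
    }

    // four producer threads add 250 documents each
    public static List<String> demo() {
        final QueuedIndexer indexer = new QueuedIndexer();
        Thread[] producers = new Thread[4];
        for (int t = 0; t < producers.length; t++) {
            final int id = t;
            producers[t] = new Thread(new Runnable() {
                public void run() {
                    for (int i = 0; i < 250; i++) {
                        indexer.addDocument("doc-" + id + "-" + i);
                    }
                }
            });
            producers[t].start();
        }
        try {
            for (Thread producer : producers) {
                producer.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return indexer.finish();
    }

    public static void main(String[] args) {
        System.out.println("indexed " + demo().size() + " documents");
    }
}
```

the callers never block on the index itself, they only hand over a task. the trade-off is that addDocument returns before the document is really written, so anything that needs the finished index has to wait for the queue to drain first.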




