When we started using the semantic repository, we had only one Lucene index to make our content searchable.
Later we added another integration, this time with a PHP-based service called aawaj.
The aawaj service had more than 150,000 items to index. We tried to index all of that content with our then-current release, 0.5.1, but ran into severe performance problems. We then released version 0.5.2, in which we added queued request handling and moved index optimization out to a RESTful service URI - /rest/service/optimize/
Here is a simple benchmark report -
version 0.5.1 - first 100 items indexed in 13.611 seconds.
version 0.5.2 - first 100 items indexed in 5.6152 seconds.
The difference is significant. Later today we ran another import into our repository and, interestingly, it took 1 hour to index 150,000 items, which was a bit surprising since we had been unable to do it at all with 0.5.1.
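To put those numbers side by side, here is a small sketch of the arithmetic. The class and method names are made up for this post, and the figures are just the ones quoted above. Keep in mind that the first-100-items timings and the 1 hour bulk import were measured in different runs, so the rates are only roughly comparable.

```java
/** Back-of-the-envelope math for the benchmark figures quoted in this post. */
public class IndexBenchmarkMath {

    /** Items processed per second. */
    static double itemsPerSecond(long items, double seconds) {
        return items / seconds;
    }

    /** How many times faster the new run is than the old one. */
    static double speedup(double oldSeconds, double newSeconds) {
        return oldSeconds / newSeconds;
    }

    public static void main(String[] args) {
        // First 100 items, per release.
        System.out.printf("0.5.1: %.2f items/s%n", itemsPerSecond(100, 13.611));
        System.out.printf("0.5.2: %.2f items/s%n", itemsPerSecond(100, 5.6152));
        System.out.printf("speedup: %.2fx%n", speedup(13.611, 5.6152));

        // Full import: 150,000 items in 1 hour (3600 seconds).
        System.out.printf("bulk import: %.2f items/s%n", itemsPerSecond(150_000, 3600.0));
    }
}
```

So 0.5.2 handled the first 100 items roughly 2.4 times faster than 0.5.1, and the full import averaged a little under 42 items per second.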
What we actually added is a single-thread executor, which keeps every request in a queue and executes them one by one. Since only that one thread ever writes to the index, we could remove the synchronized method.
Here is some example code -
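    // needs java.util.concurrent.Executor and java.util.concurrent.Executors
    private final Executor mIndexTaskExecutor = Executors.newSingleThreadExecutor();

    public void addDocument(final Document pDocument) {
        // queue the write; the single worker thread runs queued tasks one by one
        mIndexTaskExecutor.execute(new Runnable() {
            public void run() {
                getLuceneIndexTemplate().addDocument(pDocument);
            }
        });
    }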
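The snippet above depends on the rest of our service, so here is a minimal standalone sketch of the same idea that you can run on its own. It is not our production code: the class name QueuedIndexer, the shutdownAndGet helper, and the plain ArrayList standing in for the Lucene index are all assumptions made up for this example, and it uses lambdas where the original uses an anonymous Runnable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Sketch of queued index writes; the list is a stand-in for the real index. */
public class QueuedIndexer {

    // Deliberately not thread-safe: only the single worker thread touches it.
    private final List<String> index = new ArrayList<>();

    // One worker thread; submitted tasks wait in a FIFO queue until it is free.
    private final ExecutorService indexTaskExecutor = Executors.newSingleThreadExecutor();

    /** Returns immediately; the write happens later on the worker thread. */
    public void addDocument(final String document) {
        indexTaskExecutor.execute(() -> index.add(document));
    }

    /** Stops accepting work, waits for queued writes, then returns the index. */
    public List<String> shutdownAndGet(long timeoutSeconds) throws InterruptedException {
        indexTaskExecutor.shutdown();
        if (!indexTaskExecutor.awaitTermination(timeoutSeconds, TimeUnit.SECONDS)) {
            throw new IllegalStateException("index queue did not drain in time");
        }
        return index;
    }

    public static void main(String[] args) throws Exception {
        final QueuedIndexer indexer = new QueuedIndexer();

        // Four caller threads submit concurrently, with no synchronized block.
        Thread[] callers = new Thread[4];
        for (int t = 0; t < callers.length; t++) {
            final int id = t;
            callers[t] = new Thread(() -> {
                for (int i = 0; i < 250; i++) {
                    indexer.addDocument("doc-" + id + "-" + i);
                }
            });
            callers[t].start();
        }
        for (Thread caller : callers) {
            caller.join();
        }

        // All 1000 submitted documents end up in the index.
        System.out.println(indexer.shutdownAndGet(10).size());
    }
}
```

The trade-off is that callers no longer block while a document is being written, but they also do not learn about a failed write unless the task reports it somewhere, and the queue can grow without bound if documents arrive faster than they can be indexed.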






