hasan's blog (বল্গ)

work for fun!!!

Archive for the ‘ruby on rails’ Category

Ruby process & ActiveRecord data set executing in multi cores

with one comment

you know what! in one of our (tekSymmetry LLC) projects we have a lot of background calculations,
which usually take many hours to complete. ever since we introduced those processes,
we have had problems with their execution time. sometimes it gets on our nerves.

as you know, a single ruby (MRI) process can only use one processor core at a time.
this is probably one of the reasons why the ruby on rails community
favours a multi-process deployment strategy.

anyway, these days our servers have more than one core! more precisely,
in our case each production server has an 8-core intel xeon processor.

so the question arose: if we could run those long-running, expensive processes on multiple cores,
our system would have a much better chance of getting faster!

this blog post shows the technique we used to do it in ruby on rails.

for better understanding, here are some hints about the context -

  • our database table has a huge number of rows!
  • processing a single row doesn’t require anything from the same database table.
  • we are using linux (in our case debian lenny)

so here is how we did it -

  1. we took the total row count for the main query
  2. and divided it by the number of cores we have
  3. then we forked one child process for each subset of the rows
  4. each child executed the logic and related stuff on its own subset
  5. in the parent process we started a loop that keeps checking the status of the forked processes
  6. once all the pid files (which are created by the forked children) are removed,
    the parent process treats the run as finished and ends the loop.

so you see, it is damn simple :) and it is working for us :)
our execution became roughly 8x faster, because the new server gives us 8 cores to work with.

here is the ruby code. (we created a helper “multicore_execution_helper.rb” and included it in the model, so execute_in_multicores became usable there)

require 'fileutils'

module MulticoreExecutionHelper

  def execute_in_multicores(
      p_cores, p_total_rows, p_model, p_conditions = {}, &block)

    # Fall back to 2 processes when no usable core count is given
    p_cores = p_cores.to_i
    p_cores = 2 if p_cores <= 0
    # Round up so the remainder rows are not left unprocessed
    total_items_per_core = (p_total_rows + p_cores - 1) / p_cores
    logger.info "[BATCH-PROCESS-LOG] Total processes - #{p_cores}, " +
                "total rows - #{p_total_rows} [#{total_items_per_core} / 1 core]"

    # Create job id for each process
    job_ids = p_cores.times.collect{|i| rand.to_s }

    # Fork process for each core and execute the block
    p_cores.times do |offset|
      Process.fork do
        logger.info "[BATCH-PROCESS-LOG] Starting process - #{offset} " +
                    "assigned # #{job_ids[offset]}"

        # Keep job track through the created process pid file.
        pid_file = File.join(RAILS_ROOT, 'tmp/pids/', "#{job_ids[offset]}.pid")
        File.open(pid_file, 'w') {|f| f.puts Process.pid.to_s}

        # Since the forked process is a copy of the parent process's
        # memory, we need to reconnect all live connections.
        begin
          ActiveRecord::Base.connection.reconnect!

          # Retrieve this process's slice of rows through the defined
          # offset and limit (pass an :order in p_conditions if the
          # slices must be stable between processes)
          teams = p_model.find(
              :all, {
                  :offset => (offset * total_items_per_core),
                  :limit => total_items_per_core}.merge(p_conditions))

          block.call(teams)
        rescue => e
          logger.error "[BATCH-PROCESS-LOG] Exception raised during " +
                       "execution - #{e.inspect}"
        ensure
          # Remove pid since we are done here, even after an exception
          FileUtils.rm_f(pid_file)
        end
      end
    end

    # monitor whether the processes are completed or still in progress
    # don't return from this method unless all the forked processes have
    # completed their job
    sleep(2)

    loop do
      pending = job_ids.any? do |job_id|
        File.exist?(File.join(RAILS_ROOT, 'tmp/pids/', "#{job_id}.pid"))
      end

      break unless pending
      sleep(2)
      logger.debug '[BATCH-PROCESS-LOG] again...'
    end
  end

end

here is the usage code -

execute_in_multicores(p_total_cores, SomeStuff.count, SomeStuff) do |some_stuffs|
  # Do.. whatever you wanna do with the stuff here! these are gonna run on multicores!
end
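
a variation worth considering: instead of polling pid files, the parent can simply wait for its children to exit. the following is only a minimal sketch of that idea, not the helper shown above. `slices_for` and `run_in_forks` are made-up names, and the block receives an offset and a limit rather than loaded records, so the sketch runs without rails (in a rails app each child would still have to reconnect to the database first).

```ruby
# Split total_rows into at most `cores` [offset, limit] slices.
def slices_for(total_rows, cores)
  per_core = (total_rows + cores - 1) / cores # ceiling division, no rows left over
  (0...cores).map { |i| [i * per_core, per_core] }
             .select { |offset, _limit| offset < total_rows }
end

# Fork one child per slice, run the block in it, and wait for all of them.
# Returns one true/false per child telling whether it exited cleanly.
def run_in_forks(total_rows, cores)
  pids = slices_for(total_rows, cores).map do |offset, limit|
    Process.fork do
      begin
        yield(offset, limit)
        exit!(0) # leave the child without running the parent's at_exit hooks
      rescue StandardError
        exit!(1)
      end
    end
  end

  # Process.wait2 blocks until that child is done, so no sleep loop is needed
  pids.map { |pid| Process.wait2(pid).last.success? }
end
```

with this shape the parent also learns whether a child failed, which the pid file approach cannot tell.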

see, it is really simple! :) if you like it, let me know how much you like it :)
you can find the code on github.

best wishes!

Written by nhm tanveer hossain khan

January 29, 2010 at 7:34 pm

Ruby on Rails demo application presentation is picked by slideshare’s editor

with 3 comments

this morning i was informed by email that my slides on slideshare were picked by their editors for the featured slides list.

it was a really great thing for me. kudos to the slideshare guys!

you can check out the slides here -

here is the moment, captured in a screenshot!

Written by nhm tanveer hossain khan

January 24, 2010 at 5:09 am

Posted in ruby on rails

debugging rails internal query execution

without comments

while working on the somewhere in… ads project we came up with some debugging and performance measuring tools. in this post i will describe how you can use them yourself.

query debugging –
the query debugging tool logs every query executed by active record and keeps them in memory. with a little template code it then displays all the queries executed for the current page.

it also runs each query with the mysql “explain” keyword, so in the same window you can see the mysql query execution plan.
it helped us track down queries which were not hitting the right index.
this is a very simple trick - go through the code below -

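module DebugUtil
  class QueryDebug
    # Executed queries mapped to their EXPLAIN rows
    @@QUERIES = {}

    def self.add(p_query, p_report)
      @@QUERIES[p_query] = p_report
    end

    # Returns the collected queries and resets the store for the next page
    def self.queries
      q = @@QUERIES
      clean
      return q
    end

    def self.clean
      @@QUERIES = {}
    end
  end
end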

the QueryDebug class keeps every executed query and its explain result set in a class-level hash. later, in the template, QueryDebug::queries is invoked to get all the queries executed for the current page.

here is how we trap the query execution from active record -

if defined?(QUERY_DEBUG_ENABLED) && QUERY_DEBUG_ENABLED
  ActiveRecord::ConnectionAdapters::MysqlAdapter.class_eval do
    # Keep a handle on the original implementation
    alias __existing_execute_method execute

    def execute(sql, name = nil)
      if sql.match(/^SELECT/i)
        report = []
        # Run EXPLAIN for every SELECT and keep the plan rows
        @connection.query("explain #{sql}").each do |row|
          report << row
        end
        DebugUtil::QueryDebug.add(sql, report)
      end
      __existing_execute_method(sql, name)
    end
  end

  Object.class_eval do
    # Call from anywhere to dump the collected queries as an exception
    def raise_during_query_debug
      raise DebugUtil::QueryDebug::queries.inspect
    end
  end
end
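
the trick here is "alias the original method, then redefine it". if you want to try it without a database, here is a small sketch of the same pattern. `FakeAdapter` and `QUERY_LOG` are hypothetical stand-ins for the mysql adapter and for QueryDebug, and no real explain is executed.

```ruby
# Hypothetical stand-in for a real connection adapter
class FakeAdapter
  def execute(sql, name = nil)
    "result of #{sql}"
  end
end

# Hypothetical stand-in for DebugUtil::QueryDebug
QUERY_LOG = {}

FakeAdapter.class_eval do
  # Keep the original implementation reachable under a new name
  alias_method :execute_without_debug, :execute

  # The replacement records SELECT statements, then delegates to the original
  def execute(sql, name = nil)
    QUERY_LOG[sql] = "explain #{sql}" if sql =~ /\A\s*SELECT/i
    execute_without_debug(sql, name)
  end
end
```

callers of execute see the same return value as before, and only the SELECT statements end up in the log.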

as you can see, we used the “QUERY_DEBUG_ENABLED” constant to make sure this is only enabled intentionally.
now see how we render it in our template.

query debug

  1. checked
    <%= row.join(" ") %>

we put this code in the common layout, so it renders on every page. that's all :)
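
the template markup did not survive in this post, so here is a rough, hypothetical reconstruction of the idea using ruby's standard erb library. the markup and the `render_query_debug` name are assumptions, not our original layout code. it expects the hash returned by QueryDebug::queries (sql => explain rows).

```ruby
require 'erb'

# Hypothetical markup: one list item per query, one line per EXPLAIN row
QUERY_DEBUG_TEMPLATE = <<~HTML
  <h3>query debug</h3>
  <ol>
  <% queries.each do |sql, report| %>
    <li><%= ERB::Util.html_escape(sql) %>
    <% report.each do |row| %>
      <pre><%= ERB::Util.html_escape(row.join(" ")) %></pre>
    <% end %>
    </li>
  <% end %>
  </ol>
HTML

# queries: hash of sql string => array of EXPLAIN rows (each row an array)
def render_query_debug(queries)
  ERB.new(QUERY_DEBUG_TEMPLATE).result(binding)
end
```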

Written by nhm tanveer hossain khan

September 15, 2008 at 6:03 am

time based cache expiry for rails action cache

without comments

rails has excellent support for caching actions, pages, queries and so on.
the default behavior is more than enough for most projects. still, i was looking for a time based expiry option for “caches_action”. unfortunately there wasn't one, so here is a simple trick i used to make it work with different urls and time based expiration.

i added “caches_action :recent” to my controller along with the following protected method -

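protected
def fragment_cache_key(p_args)
  # Lookup key built from the request path and query string
  cache_key = "cache_key_#{request.path}#{request.headers["QUERY_STRING"]}".gsub(/=/, "")
  # get_from_cache / add_to_cache are our own memcached wrappers
  action_cache_key = get_from_cache(cache_key)
  if action_cache_key
    return action_cache_key
  else
    # No live key in memcached: generate a fresh one that expires in an hour
    action_cache_key = Digest::MD5::hexdigest("#{rand}#{Time.now}")
    add_to_cache(cache_key, action_cache_key, {:expiry => 1.hours})
    return action_cache_key
  end

end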

what happens is that i generate a key and store it in my memcached instance with a one hour expiry limit.
when memcached drops that key, a new key gets generated, so the old action cache entry is never looked up again.

this is how the default rails action cache ends up working with a time limit :)
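
to see the key indirection in isolation, here is a small sketch that runs without rails or memcached. `TtlStore` is a hypothetical in-memory stand-in for memcached, `action_cache_key_for` is a made-up name, and the time is passed in explicitly so the expiry can be exercised without waiting an hour.

```ruby
require 'digest/md5'

# Hypothetical in-memory stand-in for memcached: values vanish after their ttl
class TtlStore
  def initialize
    @data = {}
  end

  def get(key, now = Time.now)
    value, expires_at = @data[key]
    return nil if value.nil? || now >= expires_at
    value
  end

  def set(key, value, ttl_seconds, now = Time.now)
    @data[key] = [value, now + ttl_seconds]
  end
end

# Returns the same random cache key for a url until the stored key expires,
# then hands out a new one, which makes the old cached action unreachable.
def action_cache_key_for(store, path, query_string, ttl_seconds = 3600, now = Time.now)
  lookup_key = "cache_key_#{path}#{query_string}".gsub(/=/, "")
  existing = store.get(lookup_key, now)
  return existing if existing

  fresh = Digest::MD5.hexdigest("#{rand}#{now.to_f}")
  store.set(lookup_key, fresh, ttl_seconds, now)
  fresh
end
```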

don't think this is all, though. i am still supposed to clean up the previously created cache files, so they don't consume storage unnecessarily.

Written by nhm tanveer hossain khan

June 23, 2008 at 10:54 pm

nginx on debian box

without comments

i had a tough time configuring nginx on my debian production environment.
the latest stable nginx release is 0.6.x but the debian repository only had 0.4.x, so i had to build it from source and install it.

since i already had an old 0.4.x instance of nginx, the installation wasn't as smooth as i expected. here i will try to show how i resolved the breakage and got nginx running as a reverse proxy in front of my backend mongrel instances.

i made several attempts to remove the existing 0.4.x instance of nginx but failed.
when i ran “aptitude remove nginx” i ended up with the following error -

Reading package lists… Done
Building dependency tree
Reading state information… Done
The following packages will be REMOVED:
nginx
0 upgraded, 0 newly installed, 1 to remove and 0 not upgraded.
Need to get 0B of archives.
After unpacking 582kB disk space will be freed.
Do you want to continue [Y/n]?
(Reading database … 78227 files and directories currently installed.)
Removing nginx …
Stopping nginx: nginx.
Stopping nginx: invoke-rc.d: initscript nginx, action "stop" failed.
dpkg: error processing nginx (--remove):
subprocess pre-removal script returned error exit status 1
Starting nginx: nginx.
Errors were encountered while processing:
nginx
E: Sub-process /usr/bin/dpkg returned an error code (1)

this is not my exact error output, but it is very similar. i took it from the following url -

http://sudhanshuraheja.com/2007/09/remove-nginx-from-ubuntu-fiesty-fawn.html

the author of that blog had some suggestions, but they didn't work for me, so i tried a different way -
i executed “sudo apt-get build-dep nginx”. i found this tip in one of the blog comments.
the comment author explained it this way -

“this should install everything required to build the package (compiler, headers/libs, packaging tools). Usually on a fresh install I do this to get everything required to build zope.Then issue “apt-get source nginx” (you need deb-src sources in /etc/apt/sources.list). This will download nginx sources (original tarball, diff, and uncompressed sources with patches applied). Just cd in source dir, make your modifications and use “dpkg-buildpackage -rfakeroot -b” (this requires fakeroot package). In parent directory you should get new deb files ready to install, with start/stops scripts and your patches. Just take care of package update that will surely remove your nginx version.”

if you want a service script that starts nginx on boot, follow this link -
http://blog.labratz.net/articles/2006/10/03/rails-deployment-apache-lighttpd-nginx-mongrel-cluster

best wishes,

Written by nhm tanveer hossain khan

May 24, 2008 at 10:58 pm

upcoming project mojar_workflow, workflow engine in ruby

without comments

hi,
we are just kicking off a new open source, ruby based workflow engine project, mojar workflow.
we named it after our deshi word “mojar”. the reason is simple,
to spread this word.

mojar workflow is an integrated solution for executing a flow of business
rules. for example -

you have an action with the following set of rules -
1. start transaction
2. verify user account
3. verify user balance
4. verify user dues
5. reduce dues from balance
6. complete transaction

after a few days you get a new requirement, where you are supposed to reduce
user dues by 10% because of the company's new discount policy.
so you have to implement the following rules -
1. start transaction
2. verify user account
3. verify user balance
4. verify user dues
5. reduce dues by the 10% discount
6. reduce discounted dues from balance
7. complete transaction

to implement such a scenario you would have to change code in your stable
release again. with mojar workflow, you can add that new concern in
the abstract flow maintenance layer, where the flow is defined in a
yaml file or an xml document.
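
the project is only starting, so there is no real api to show yet. purely as an illustration of the idea, here is a hypothetical sketch: the yaml layout, the step names, `STEPS` and `run_flow` are all invented for this example and are not the mojar workflow api.

```ruby
require 'yaml'

# Hypothetical flow definition; adding the discount is a one-line change here
FLOW_WITH_DISCOUNT = <<~YAML
  steps:
    - verify_user_balance
    - reduce_dues_by_discount
    - reduce_dues_from_balance
YAML

# Hypothetical step registry: each rule is a small callable working on a context hash
STEPS = {
  "verify_user_balance"      => ->(ctx) { raise "insufficient balance" if ctx[:balance] < ctx[:dues] },
  "reduce_dues_by_discount"  => ->(ctx) { ctx[:dues] -= ctx[:dues] * 10 / 100 },
  "reduce_dues_from_balance" => ->(ctx) { ctx[:balance] -= ctx[:dues] }
}

# Runs the named steps in the order given by the yaml document
def run_flow(yaml_text, ctx)
  YAML.safe_load(yaml_text).fetch("steps").each do |name|
    STEPS.fetch(name).call(ctx)
  end
  ctx
end
```

the point is that the order and selection of rules live in the yaml text, so the new discount requirement does not touch the code of the existing steps.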

keep your eyes on -
http://rubyforge.org/projects/mojarworkflow/

best wishes,

Written by nhm tanveer hossain khan

February 10, 2008 at 1:59 pm

rails plugin symlinked broken on 1.2.5, fixed from 2.0

with one comment

i was trying to build a rails plugin. my project lived in a different directory, so i symlinked it under “vendor/plugins/..”, but i couldn't get it working.

after spending some time on it, i could successfully run my plugin under rails 2.0-RC2. so i compared the lookup.rb file from the 1.2.5 and 2.0-RC2 releases.

the defective code was the following lines – (1.2.5)

def use_component_sources!
  # ….
  sources << PathSource.new(:lib, "#{::RAILS_ROOT}/lib/generators")
  sources << PathSource.new(:vendor, "#{::RAILS_ROOT}/vendor/generators")
  sources << PathSource.new(:plugins, "#{::RAILS_ROOT}/vendor/plugins/**/generators")
  # ….
end

the fixed version – (2.0-RC2)

def use_component_sources!
  # …

    sources << PathSource.new(:lib, "#{::RAILS_ROOT}/lib/generators")
    sources << PathSource.new(:vendor, "#{::RAILS_ROOT}/vendor/generators")
    sources << PathSource.new(:plugins, "#{::RAILS_ROOT}/vendor/plugins/*/**/generators")
    sources << PathSource.new(:plugins, "#{::RAILS_ROOT}/vendor/plugins/*/**/rails_generators")
  end # closes a block opened in the elided lines above
  # …
end

i also checked the rails bug tracker and found a bug filed for this issue, which was apparently fixed in the following changeset.
http://dev.rubyonrails.org/changeset/6101
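
why does that one extra “*/” matter? as far as i can tell, ruby's Dir.glob does not descend into symlinked directories while expanding “**/”, but a plain “*” happily matches the symlink itself. here is a small sketch to check it on your own box. the directory layout and the `glob_generators` / `demo_symlinked_plugin` names are made up for the demo, and it needs a filesystem that supports symlinks.

```ruby
require 'tmpdir'
require 'fileutils'

# Expand the 1.2.5 style pattern and the 2.0 style pattern under rails_root
def glob_generators(rails_root)
  {
    old: Dir.glob(File.join(rails_root, "vendor/plugins/**/generators")),
    new: Dir.glob(File.join(rails_root, "vendor/plugins/*/**/generators"))
  }
end

# Build a throwaway app whose only plugin is a symlink to an outside directory
def demo_symlinked_plugin
  Dir.mktmpdir do |tmp|
    rails_root = File.join(tmp, "app")
    plugin_src = File.join(tmp, "my_plugin")
    FileUtils.mkdir_p(File.join(rails_root, "vendor/plugins"))
    FileUtils.mkdir_p(File.join(plugin_src, "generators"))
    File.symlink(plugin_src, File.join(rails_root, "vendor/plugins/my_plugin"))
    glob_generators(rails_root)
  end
end
```

calling demo_symlinked_plugin returns both result lists, so you can compare what each pattern finds for a symlinked plugin.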

Written by nhm tanveer hossain khan

November 29, 2007 at 1:27 pm

simple fragment cache implementation on ruby on rails

without comments

i was having serious performance problems with one of my projects, so i came up with a simple fragment cache implementation for ruby on rails.

after implementing it, i replaced “render(:partial => …)” with the following method -

render_from_cache_or_render(
  :cache_key => "cache key",
  :cache_expire_after => ConstantHelper::TAG_CLOUD_EXPIRED_IN, # minutes
  :partial => "….")

let's have a look at my implementation -

def render_from_cache_or_render(p_args)

  return render(p_args) if true == p_args[:cache_off]

  # check from cache
  cache_key = p_args[:cache_key]
  cached_content = CacheService.get_cache(cache_key)

  if not cached_content.nil? and not cached_content.empty?
    return cached_content
  else
    content = render(p_args)
    # cache expire time if defined, 60 minutes otherwise
    cache_expire_time_in_minutes = p_args[:cache_expire_after] || 60
    CacheService.add_cache(cache_key, cache_expire_time_in_minutes, content)
    return content
  end
end

my “CacheService” class simply stores all cached content in a hash map.
when a cached item is requested, its expiry is checked before the cached value is returned.

for the CacheService implementation look at the bottom of this post.

anyway, after implementing and using this, throughput went up to 70+ requests per second. fyi, before applying the cache it was around 10 per second.

module Cache
  class Item
    attr_accessor :key, :expire_time, :content, :created_on

    def initialize(p_key, p_expire_time, p_content)
      @key = p_key
      @expire_time = p_expire_time * 60 # given in minutes, stored in seconds
      @content = p_content
      @created_on = Time.now
    end
  end
end

class CacheService
  @@CACHES = {}

  def self.add_cache(p_key, p_expire_time, p_content)
    cache_item = Cache::Item.new(p_key, p_expire_time, p_content)
    @@CACHES[p_key.to_sym] = cache_item
  end

  def self.get_cache(p_key)
    # load content from cache
    cached_content = @@CACHES[p_key.to_sym]
    return nil if cached_content.nil?

    # verify cache validity
    return cached_content.content if not expired?(cached_content)
    return nil
  end

  def self.expired?(p_cache)
    # find time difference in seconds
    time_difference = Time.now - p_cache.created_on
    return time_difference > p_cache.expire_time
  end
  # a bare "private" does not apply to "def self." methods
  private_class_method :expired?
end
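
if you want to play with the expiry logic outside rails, here is a compact, self-contained sketch of the same "return from cache or compute and store" idea. `ExpiringCache` and its `fetch` method are made-up names for this example, not part of the code above, and the current time is passed in so expiry can be tested without waiting.

```ruby
# Minimal in-memory cache with per-entry expiry (illustrative only)
class ExpiringCache
  Entry = Struct.new(:content, :expires_at)

  def initialize
    @entries = {}
  end

  # Returns the cached content for key while it is fresh; otherwise runs
  # the block, stores its result for expire_after_minutes and returns it.
  def fetch(key, expire_after_minutes = 60, now = Time.now)
    entry = @entries[key]
    return entry.content if entry && now < entry.expires_at

    content = yield
    @entries[key] = Entry.new(content, now + expire_after_minutes * 60)
    content
  end
end
```

in a view helper the block would wrap the expensive render call, so the rendering only runs when the entry is missing or stale.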

best wishes,

Written by nhm tanveer hossain khan

November 17, 2007 at 3:00 am

thats why i like ruby!!! thanks dynamic scripting…

without comments

if you have a rails deployment on windows with mongrel running as a service, you might face the following problem -

Errno::EINVAL (Invalid argument):
/app/models/index_service.rb:63:in `write'
/app/models/index_service.rb:63:in `puts'

this problem was caused by a “puts” which i forgot to remove before deploying to the test server.
if you deploy as a windows service and your code has a few “puts” calls, you will most likely hit this problem with mongrel.

on the mongrel group i found that they are working on this. hopefully they will replace puts with a logger or something similar.
anyway, the quickest solution i had in mind was to just use the dynamic behavior of ruby. here is what i did -

# redefine puts so stray calls go to the logger instead of STDOUT
def puts(*p_args)
  logger.debug(p_args.join("\n"))
end

that's all, it fixed my problem :)
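
if you would rather not redefine puts globally, the same trick can be scoped to the classes that need it. a minimal sketch, assuming the class exposes a `logger` method. `PutsToLogger` and this `IndexService` are hypothetical names used only for the demo.

```ruby
require 'logger'

# Mix this in wherever stray puts calls should end up in the log
module PutsToLogger
  def puts(*args)
    logger.debug(args.join("\n"))
  end
end

# Hypothetical service class with a leftover puts call
class IndexService
  include PutsToLogger
  attr_reader :logger

  def initialize(logger)
    @logger = logger
  end

  def reindex
    puts "reindexing" # goes to the logger, never touches STDOUT
    :done
  end
end
```

this works because the included module sits in front of Kernel in the method lookup chain, so its puts wins for that class only.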

thanks ruby, thanks for dynamic scripting…

Written by nhm tanveer hossain khan

November 14, 2007 at 5:46 pm

Posted in Ruby, mongrel, ruby on rails

Fat Refactoring: use include module to reduce number of lines

without comments

if i didn't mention it before, i should tell you now: here at the somewhere in… rnd team we are playing a lot with ruby on rails. these days our rails team is completely focused on a product (which is secret for the time being :) ) where we
found a lot of interesting stuff. for instance -

a few days back, we found that our application_helper and a few controllers were growing too fast and getting extra fat (lines of code). so we did some refactoring to reduce the extra fat.

now have a look at the code we had in application_helper.rb, taken from tag/v-0.3
[screenshot: fat_refactoring_before]
the code is not completely visible in the screenshot. it is 340 lines long, which was the output of our 3 iterations.

that many lines is not a huge problem by itself, but it was getting difficult to keep each piece of code focused on a single concern.

now have a look at our code taken from the current trunk,
[screenshot: fat_refactoring_after]
Wow, now it is only 50 lines, including the copyright header.
the trick was very simple, we followed these conventions -

1. find all related functions that share the same concern
2. stick them together in a module
3. include the module to statically import all the functions

no integration errors, nothing broke.
we are happy with this :)
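
since the screenshots are hard to read, here is a tiny sketch of the three steps. the module and method names are invented for illustration (the real product code is still secret :) ).

```ruby
# Step 1 and 2: functions with the same concern live together in one module
module TagCloudHelper
  def tag_css_class(count)
    count > 10 ? "tag-large" : "tag-small"
  end
end

module DateFormatHelper
  def short_date(time)
    time.strftime("%b %d, %Y")
  end
end

# Step 3: the fat helper shrinks to a list of includes
module ApplicationHelper
  include TagCloudHelper
  include DateFormatHelper
end

# Anything that includes ApplicationHelper still sees every function
class DemoView
  include ApplicationHelper
end
```

callers don't change at all, which is why nothing broke for us: DemoView.new.tag_css_class(12) works exactly as if the method were still defined in ApplicationHelper itself.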

i think our ruby learning process is going smoothly :)

Written by nhm tanveer hossain khan

November 3, 2007 at 10:48 am