Code license: MIT License
Labels: Ruby, Rails, Plugin
People details
Project owners:
  peter.boling
Project committers:
aminharis7

Boling For Batches v1.0.2

Plugin for Rails

I often need to run really large computations on really large data sets. I usually end up writing a rake task that calls methods in my models. But something about the process bugged me: each time, I had to re-implement the 'batching code' that kept me from chewing up GB after GB of memory due to klass.find(:all, :include => [:everything_under_the_sun]). Re-implementing the same logic over and over across many projects is not very DRY, so I got out my blow torch and lit it up. The difficulty was that the part that changed each time I batched sat at the center of the code, right in the middle of the batch loop. But I didn't let that stop me!
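
To make the "variable part in the middle of the loop" problem concrete, here is a minimal sketch of the general pattern in plain Ruby. It is not the plugin's code: each_batch and the fetch lambda are hypothetical names, and an in-memory array stands in for an ActiveRecord class queried with a limit and offset.

```ruby
# Hypothetical helper, not part of the plugin: walks a data set in
# fixed-size slices and yields each slice to the caller's block.
def each_batch(fetch, batch_size)
  offset = 0
  loop do
    records = fetch.call(offset, batch_size) # stand-in for a LIMIT/OFFSET query
    break if records.empty?
    yield records                            # the part that differs per task
    offset += batch_size
  end
end

# Stand-in data source: an array instead of a database table.
data = (1..7).to_a
fetch = lambda { |offset, limit| data[offset, limit] || [] }

sums = []
each_batch(fetch, 3) { |records| sums << records.sum }
# sums now holds one total per batch
```

Only one slice is held in memory at a time, and the loop itself never changes. The per-task work is handed in as a block.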

Install

SVN:

svn checkout http://boling-for-batches.googlecode.com/svn/trunk/boling_for_batches

Rails Plugin:

./script/plugin install http://boling-for-batches.googlecode.com/svn/trunk/boling_for_batches

Example

  # Set up your new batch: tell it which options to use and which class to run batches of
  batch = BolingForBatches::Batch.new(:klass => Payment, :select => "DISTINCT transaction_id", :batch_size => 50, :order => 'transaction_id')
  
  #Run a specific instance method on each record in each batch, and send the rest of the params to that method.
  batch.run(:remove_duplicates, false, true, true)
  
  #Print the results!
  batch.print_results
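
As a rough illustration of "run an instance method on each record and pass the rest of the params along", the sketch below uses Ruby's public_send. This is an assumption about the general technique, not the plugin's actual implementation; FakePayment and run_on_batch are hypothetical names invented for the example.

```ruby
# Hypothetical stand-in for a model; the example above uses Payment.
class FakePayment
  attr_reader :calls

  def initialize
    @calls = []
  end

  # Records the arguments it was called with so the dispatch can be inspected.
  def remove_duplicates(a, b, c)
    @calls << [a, b, c]
  end
end

# Hypothetical helper: calls method_name on every record in one batch,
# forwarding any extra arguments unchanged.
def run_on_batch(records, method_name, *args)
  records.each { |record| record.public_send(method_name, *args) }
end

payments = [FakePayment.new, FakePayment.new]
run_on_batch(payments, :remove_duplicates, false, true, true)
```

Each record receives remove_duplicates(false, true, true), mirroring the argument order of the batch.run call shown above.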

Configuration

Options for the initializer (Batch.new) method are:

  :klass         - Usage: :klass => MyClass
                    Required, as this is the class that will be batched

  :include       - Usage: :include => [:assoc]
                    Optional

  :select        - Usage: :select => "DISTINCT field_name"
                              or
                          :select => "field1, field2, field3"

  :order         - Usage: :order => "field DESC"

  :conditions    - Usage: :conditions => ["field1 is not null and field2 = ?", x]

  :verbose       - Usage: :verbose => 'yes' or 'no'
                    Sets verbosity of output
                  Default: yes (if not provided)
                  
  :batch_size    - Usage: :batch_size => x
                    Where x is some number.
                    How many AR Objects should be processed at once?
                  Default: 50 (if not provided)
                  
  :last_batch    - Usage: :last_batch => x
                    Where x is some number.
                    Only process up to and including batch #x.
                      Batch numbers start at 0 for the first batch.
                  Default: won't be used (no limit if not provided)
                  
  :first_batch   - Usage: :first_batch => x
                    Where x is some number.
                    Begin processing at batch #x.
                      Batch numbers start at 0 for the first batch.
                  Default: won't be used (no offset if not provided)
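
The zero-based batch numbers used by :first_batch and :last_batch can be related to record offsets with a little arithmetic. The sketch below assumes that batch #n starts at offset n * batch_size, which is the natural reading of the options above but is not confirmed by the plugin source. batch_windows is a hypothetical helper, not part of the plugin.

```ruby
# Hypothetical helper: lists the [offset, limit] pair for each batch
# that would be processed, given a total record count.
def batch_windows(total, batch_size: 50, first_batch: 0, last_batch: nil)
  batch_count = (total + batch_size - 1) / batch_size # ceiling division
  final = last_batch ? [last_batch, batch_count - 1].min : batch_count - 1
  (first_batch..final).map { |n| [n * batch_size, batch_size] }
end

# 230 records in batches of 50, processing only batches 1 through 3.
windows = batch_windows(230, batch_size: 50, first_batch: 1, last_batch: 3)
```

Under that assumption, batch 0 (records 0 to 49) and batch 4 (records 200 to 229) are skipped, and :last_batch is inclusive, as described above.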

Output

Interpreting the output:

License

Copyright (c) 2008 Peter H. Boling, released under the MIT license

Or in other words have fun, and don't blame me!








