We are a user experience design and software development firm
Hire us to design your site, build your application, serve billions of users and solve real problems.
Those of you using ferret 0.11.6 (the latest released gem) and acts_as_ferret 0.4.3 (the latest stable version) may have noticed that rebuilding an index can be painfully slow when working with a large number of documents. Even if each document contains only a small amount of text, indexing crawls once the document set gets large. The problem lies in how bulk update works: "bulk indexing" actually processes a single document at a time! Fortunately, there is a simple patch that provides a significant speed boost.
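To get a feel for why one-at-a-time updates hurt, here is a toy model in plain Ruby. It is only a sketch and does not use ferret at all: ToyIndex, update and update_group are made-up names, and it assumes the per-document cost is dominated by a fixed step such as a commit, which the toy simply counts.

```ruby
# Toy stand-in for an index. This is NOT ferret's API; every name here is
# hypothetical. It only counts how often the expensive "commit" step runs.
class ToyIndex
  attr_reader :commits, :docs

  def initialize
    @docs = {}
    @commits = 0
  end

  # One-at-a-time update: every document pays for its own commit.
  def update(id, doc)
    @docs.delete(id)
    @docs[id] = doc
    @commits += 1
  end

  # Grouped update: drop all stale entries, add all new ones, commit once.
  def update_group(group)
    group.each_key { |id| @docs.delete(id) }
    group.each { |id, doc| @docs[id] = doc }
    @commits += 1
  end
end

# 1000 invented documents keyed by id.
group = (1..1000).map { |i| [i, "document #{i}"] }.to_h

one_by_one = ToyIndex.new
group.each { |id, doc| one_by_one.update(id, doc) }

grouped = ToyIndex.new
grouped.update_group(group)

puts "one at a time: #{one_by_one.commits} commits"
puts "grouped:       #{grouped.commits} commits"
```

Both indexes end up holding the same documents; only the number of commits differs (1000 versus 1 in this toy). How much a real index gains depends on what its commit actually costs.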
There is a fairly old trac ticket where Francois Lagunas posted a clever patch that makes bulk indexing process documents as a group. Here is a monkey patch based on what he submitted (in Rails, just drop it into a file under config/initializers).
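class Ferret::Index::Index
  # Replace a whole group of documents in one pass.
  # docs maps each document id to its new document.
  def update_batch(docs)
    @dir.synchrolock do
      ensure_writer_open()
      commit = false
      # Delete the old copy of every document first.
      docs.each do |id, _value|
        delete(id)
        # Deletes by String or Symbol id require a commit before adding.
        commit = true if id.is_a?(String) || id.is_a?(Symbol)
      end
      @writer.commit if commit
      ensure_writer_open()
      # Then add all the new documents to the writer.
      docs.each do |_id, new_doc|
        @writer << new_doc
      end
      flush() if @auto_flush
    end
  end
end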
class ActsAsFerret::BulkIndexer
  def index_records(records, offset)
    docs = {}
    # Build all documents for this batch, then hand them to the index at once.
    batch_time = measure_time {
      records.each { |rec| docs[rec.id] = rec.to_doc if rec.ferret_enabled?(true) }
      @index.update_batch(docs)
    }.to_f
    # Progress as a percentage, plus a rough estimate of the time left
    # based on how long this batch took per record.
    @work_done = offset.to_f / @model_count * 100.0 if @model_count > 0
    remaining_time = ( batch_time / @batch_size ) * ( @model_count - offset + @batch_size )
    @logger.info "#{@reindex ? 're' : 'bulk '}index model #{@model.name} : #{'%.2f' % @work_done}% complete : #{'%.2f' % remaining_time} secs to finish"
  end
end
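The progress line logged by index_records is simple arithmetic, and a worked example may make it easier to read. The sketch below copies the two formulas into a standalone helper; progress_estimate is a hypothetical name and the input numbers are invented purely for illustration.

```ruby
# Standalone copy of the arithmetic in index_records.
# progress_estimate is a hypothetical helper; it is not part of acts_as_ferret.
def progress_estimate(batch_time, batch_size, model_count, offset)
  # Percentage of records reached so far (0.0 when there is nothing to index).
  work_done = model_count > 0 ? offset.to_f / model_count * 100.0 : 0.0
  # Seconds per record in the last batch, times the records still to process.
  remaining_time = (batch_time / batch_size) * (model_count - offset + batch_size)
  [work_done, remaining_time]
end

# Invented numbers: 10,000 records, batches of 1,000, offset 3,000,
# and a last batch that took 2 seconds.
work_done, remaining_time = progress_estimate(2.0, 1000, 10_000, 3000)
puts "#{'%.2f' % work_done}% complete : #{'%.2f' % remaining_time} secs to finish"
# prints "30.00% complete : 16.00 secs to finish"
```

The estimate assumes every remaining batch takes about as long per record as the one just finished, so treat it as a rough guide rather than a promise.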
If you are using a newer version of ferret by building the gem yourself, the ferret side of this patch is already included, although you still need to make a slight change on the acts_as_ferret side. Stay tuned for another post about how to do this.
Topics: acts_as_ferret, ferret, fulltext search, rails
great work!
Comment by Thomas, Wednesday, November 5, 2008 @ 10:25 pm