Further Beyond Zebra

Previously on Agile Ajax: I wrote about a complicated problem that went beyond ActiveRecord. I resolved to add a database column with pre-generated results to make the search logic easier. Commenters questioned my reluctance to use SQL directly.

I wanted to come back to this topic and talk about the process of creating and maintaining the pre-generated data. To reiterate the problem, I needed a report that would display:

  • All individual times for all swimmers within search criteria
  • The most recent relay time for a swimmer in each event, but only if the most recent relay matches the critera, otherwise don't show any relay times

The solution being pursued is to add a "most_recent_relay" field to the database and use that in a two-phase database search to display the report.


Turned out it was very easy to adapt the report code to use the new field -- just a question of adding a criteria for swimmer_id being in the already found list and the most_recent_relay field being true.

It was a bit more interesting to manage the new field. ActiveRecord callbacks are an obvious choice for the code, since they are automatically triggered when a new record is created or an existing one is saved. The basic idea is to check whether the record being saved is the most recent relay, and if so, mark it and then make sure that all other relays for the same user are correctly unmarked.

There are just a couple of problems that I had to work around.

The first one is that saving any other object in the middle of one object's callback triggers a new set of callbacks for the other object. Since those callbacks will pass through the same method, infinite loops are pretty easy to stumble into.

Secondly, changing an objects values in it's after_save callback is problematic -- it seems to change the database, but not the in-memory object, and does not return the object you expect.

Finally, I also wanted the code to be stable in the face of previous errors -- in other words, the current state of the most_recent_relay flag was not considered when determining the new most recent relay. This is to make migration and data recovery easier.

I built this test-first. My test cases, sketched out, and roughly in the order I wrote them were:

  1. With a new user, add a relay, and confirm it is the most recent relay.
  2. With a new user, add an individual event, and confirm it is not the most recent relay.
  3. Starting with a new user, add two relays and confirm that only the second ends up as most recent relay.
  4. With a new user, add a relay then an individual event, and confirm that the relay is the only most recent relay.
  5. Same as the last one, but doing the individual event first.

I should say that by the time I got to the fourth and fifth tests, i expected them to pass, but was guarding against regressions in refactoring.

The code itself was a bit tricker, as I stumbled through attempts to do the whole thing in the before filter, resulting in a few infinite loops, then tried to do the whole thing in the after filter, causing different problems.

Eventually I settled on a two phase process -- I should say, I'm not prepared to defend this code as the final word on the topic. The before filter determines if the current record is the most recent relay.

  before_save :update_if_most_recent_relay

  def update_if_most_recent_relay
    return true unless relay?
    current_mrr = user.most_recent_relay(event)
    self.most_recent_relay = (current_mrr.nil?) ||
        (current_mrr == self) ||
        (race_date > current_mrr.race_date)
    end
  end

The callback asks the user for their most recent relay for the given event -- very roughly, here's that code, using some named scopes that you can probably figure out.

def most_recent_relay(event)
  records.relays.for_event(event).by_race_date.limited_to(1)
end

At that point Rails saves the record, but now there could be other relay records for that user in an invalid state -- I covered that in an after save callback:

  after_save :clean_up_relays

  def clean_up_relays
    return true unless relay?
    records = user.records.relays.by_event(event).by_race_date
    if !records.first.most_recent_relay
      records.first.update_attributes(:most_recent_relay => true)
    end
    records[1..-1].each do |record|
      if record.most_recent_relay
        record.update_attributes(:most_recent_relay => false)
      end
    end
    true
  end

What this does is grab all relays for the user and the event, sorted by date. If the most recent one is incorrectly tagged false, switch it. If any other record is incorrectly tagged true, switch it. Those records go through the same filter, but would not trigger cascades, because correctly tagged records don't get resaved. This method is stable even if the most_recent flag gets messed up somewhere (or if a date gets edited).

Performance is much better than the per-user method in the last post, though since pagination is still happening after the database, large calls are still a problem that we're working on. Write performance is obviously a hair slower, but not something that a user would notice.

The best part about this filter setup is that the rake task to initially populate the most recent field is super simple:

  task :reset_relays => :environment do
    Record.relays.each do |record|
      record.save
    end
  end

Touching the record triggers the filters, and the filters do the right thing.

I'm now wondering if, with the new database field in hand, it makes sense to try for a single SQL method -- it's still probably requiring a subquery, but it's probably easier with the new field.

Related posts:

  1. On Beyond Zebra: When ActiveRecord Isn’t Enough
  2. Using Mocha for ActiveRecord Partial Mocks with Finders
  3. ActionMailer Callbacks: In the Spirit of ActionController Filters
  4. Rails Development for iPhone with rails_iui
  5. Using the Null Object Pattern With ActiveRecord

Topics:

Leave a comment

Powered by WP Hashcash

Launch: Pathfinder Newsletter

    Get a monthly update on best practices for delivering successful software.

    Subscribe via email


    Subscribe via RSS      RSS icon

Topics

Search

WordPress

Comments about this site: info@pathf.com