(Ab)using memoize to quickly solve tricky n+1 problems

I recently ran a survey at work using FluidSurveys. Their survey building tools are excellent and they have great support, but I ran into a time-consuming issue when it came time to process the responses, because they’re double-byte Unicode, UTF-16LE to be specific. Turns out knowing that is 90% of the battle.

The files on first inspection are a bit strange: although they spring from a CSV export button, they’re tab-delimited, but with CSV-style quoting conventions. That’s easy enough to work around, but R and Ruby both barfed reading the files. I cottoned on to the fact that the files had some odd characters in them, so I recruited JRuby and Ruby 1.9, with their better Unicode support, to try to load them, but still couldn’t quite get the parameters right.

Then I thought of iconv, the character set conversion utility. Since in this case the only special character was the ellipsis, I was happy to strip it out, and the following command does the trick:

iconv -f UTF-16LE -t US-ASCII -c responses.csv > converted_responses.csv

And, as they say, Bob’s your uncle.
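Once converted, the tab-delimited-with-CSV-quoting format is easy to read with Ruby's standard CSV library by switching the column separator. A minimal sketch (the file path and method name are hypothetical):

```ruby
require "csv"

# Read a tab-delimited file that uses CSV-style quoting conventions.
# The col_sep option switches the delimiter from comma to tab; quoting
# (including embedded tabs inside quoted fields) is still handled.
def read_responses(path)
  CSV.read(path, col_sep: "\t")
end
```

Each row comes back as an array of strings, with quoted fields parsed correctly even when they contain tabs.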
At the risk of being forever branded a grammar elitist, let’s take a quick look at use of the phrase “your an idiot” on Twitter.

Inspired by the tweet by @doctorzaius referencing a URL to Twitter’s search page for “your an idiot”, I used Twitter’s streaming API to download a sample of 6581 tweets containing the word “idiot” overnight, over about 12 hours.

Of these 6581 tweets, 65 contained our friend “your an idiot”. 161, two and a half times as many, contained “you’re an idiot”. Additionally, there were 2 tweets with “your such an idiot”, and just one “you’re such an idiot”. The forces of good grammar have won this round?

Note: This is a very small sample. It may be interesting to compare Facebook status updates to see what the you’re/your ratio looks like there one day…
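Once the tweets are captured, the counting step is a few lines of Ruby. A sketch (the input is a hypothetical array of tweet texts, one per element; curly apostrophes are normalized so both quote styles match):

```ruby
# Count how many tweets contain each grammar variant.
# Matching is case-insensitive; the right single quote (U+2019)
# is normalized to a plain apostrophe before matching.
def phrase_counts(tweets, phrases = ["your an idiot", "you're an idiot"])
  phrases.each_with_object({}) do |phrase, counts|
    counts[phrase] = tweets.count do |tweet|
      tweet.downcase.tr("\u2019", "'").include?(phrase)
    end
  end
end
```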
Usually, discovering n+1 problems in your Rails application that can’t be fixed with an :include statement means lots of changes to your views. Here’s a workaround that skips the view changes, which I discovered while working with Rich to improve the performance of some Dribbble pages. It uses memoize to convince your n model instances that they already have all the information needed to render the page.

While simple belongs_to relationships are easy to fix with :include, let’s take a look at a concrete example where that won’t work:

class User < ActiveRecord::Base
  has_many :likes
end

class Item < ActiveRecord::Base
  has_many :likes

  def liked_by?(user)
    likes.by_user(user).present?
  end
end

class Like < ActiveRecord::Base
  belongs_to :user
  belongs_to :item
end

A view presenting a set of items that called Item#liked_by? would be an n+1 problem that wouldn’t be well solved by :include. Instead, we’d have to come up with a query to get the Likes for the set of items by this user:

Like.of_item(@items).by_user(user)

Then we’d have to store that in a controller instance variable, and change all the views that called item.liked_by?(user) to access the instance variable instead.

Active Support’s memoize functionality stores the results of function calls so they’re only evaluated once. What if we could trick the method into thinking it’s already been called? We can do just that by writing data into the instance variables that memoize uses to save results on each of the model instances. First, we memoize liked_by:

memoize :liked_by?

Then bulk load the relevant likes and stash them into memoize’s internal state:

def precompute_data(items, user)
  likes = Like.of_item(items).by_user(user).index_by { |like| like.item_id }
  items.each do |item|
    item.write_memo(:liked_by?, likes[item.id].present?, user)
  end
end
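Here index_by is ActiveSupport’s addition to Enumerable, which builds a hash keyed by the block’s return value. Its effect can be sketched in plain Ruby with each_with_object (the Like struct below is a stand-in for the real model, not the ActiveRecord class):

```ruby
# Stand-in for the ActiveRecord Like model
Like = Struct.new(:item_id, :user_id)

# Plain-Ruby equivalent of likes.index_by { |like| like.item_id }
def index_by_item(likes)
  likes.each_with_object({}) { |like, hash| hash[like.item_id] = like }
end
```

With the hash in hand, likes[item.id].present? becomes a single in-memory lookup per item instead of a query per item.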

The write_memo method is implemented as follows.

def write_memo(method, return_value, args = nil)
  ivar = ActiveSupport::Memoizable.memoized_ivar_for(method)
  if args
    if hash = instance_variable_get(ivar)
      hash[Array(args)] = return_value
    else
      instance_variable_set(ivar, { Array(args) => return_value })
    end
  else
    instance_variable_set(ivar, [return_value])
  end
end

The problem described here could be solved with some crafty left joins added to the query that fetched the items in the first place, but when there are several different hard-to-prefetch properties, such a query would likely become unmanageable, if not terribly slow.
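The whole trick can be exercised without Rails. Here is a minimal self-contained sketch that mimics Memoizable’s convention of storing results for an argument-taking method in an instance-variable hash keyed by the argument array (the class and ivar name are simplified assumptions, not ActiveSupport’s actual internals):

```ruby
# A minimal model that memoizes liked_by? by hand, mimicking the
# ivar-hash convention ActiveSupport::Memoizable uses for methods
# that take arguments. Simplified sketch, not the real library.
class Item
  MEMO_IVAR = :@_memoized_liked_by

  def liked_by?(user)
    memo = instance_variable_get(MEMO_IVAR) || instance_variable_set(MEMO_IVAR, {})
    # Hash#fetch only runs the block on a miss, so a pre-seeded
    # answer is returned without touching the expensive lookup.
    memo.fetch([user]) { memo[[user]] = expensive_lookup(user) }
  end

  # Pre-seed the memo so rendering never triggers expensive_lookup
  def write_memo(return_value, user)
    memo = instance_variable_get(MEMO_IVAR) || instance_variable_set(MEMO_IVAR, {})
    memo[[user]] = return_value
  end

  private

  # Stands in for the per-item database query we are trying to avoid
  def expensive_lookup(user)
    raise "n+1 query executed for #{user}!"
  end
end
```

After a bulk precompute step calls write_memo on each item, every liked_by? call during rendering is just a hash lookup.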

One thought on “(Ab)using memoize to quickly solve tricky n+1 problems”

  1. Nice article, thanks for describing the technique.

    I’ve been doing something similar (well, similar problem) using ivar ‘caching’ (not sure what to call this!) in the method calls, i.e.:

    def something_heavy(opts)
      @something_heavy ||= the_heavy_method(opts)
      ..
      @something_heavy
    end

    Are there any pitfalls (practical / performance) using either approach you think?
