Just skip to the last part if you are in a rush 😉
Last year I did an upgrade for a legacy Rails website, from version 3.2 to 4.2. One daunting task was to upgrade the ActiveRecord RubyGem since the API changed drastically in the new release. Luckily there was a RubyGem built specifically for preserving the functionality using the deprecated API, at least for now. To prepare for the inevitable, I want to write down some of the changes made recently for future reference so I don’t have to search through the documentation and Stackoverflow again.
What’s the problem?
We have a method that returns a report that aggregates on some data. 7 years ago when this application was built, there wasn’t much data so this was fine but now the report won’t load for an obvious reason: the performance of the query to calculate col2 was so bad given the current data volume. Therefore, we have to optimize it.
The above Rails code translate to the following SQL query:
What’s the solution?
We decided to use a nested query to group by both the col1 and column2 first and select from that result:
Works like a charm except that I’m having problem translate this back to Rails code since I haven’t worked with these Rails code for almost a year now. I did some research but couldn’t find any straight answers to my question so I had to switch to trial and error by composing different API combinations and comparing the logged query with the desired one.
It turned out this one page in the documentation has all you need but you need to be creative to combine them.
Here’s how
- find all with :select can be replaced by the new .select() API.
- :conditions can be replaced by .where() API.
- :group can be replace by .group() API.
- :from can be replaced by .from() API.
The nested part took me some time to figure out. It turned it’s just the .from() API.
In summary, just do the normal select for the inner query and save the result into a variable inner_result. Then pass it to the outer query’s .from() API.