Once upon a time there was a complex data download in our GFDB application, powered by a raw SQL query that consisted of 2,149 lines, had 45 joins, filtered through over 30 columns and searched for data from 12 different tables. (Side note: we utilize the SqlQuery gem to load raw SQL queries from templates which makes using pure SQL in Ruby on Rails applications pretty seamless.)

Ashley Levy, gap intelligence’s Market Analyst for floor care and laundry categories, was running this convoluted query to retrieve the data she needed to create a gap intelligence ScoreCard for home appliances. The ScoreCard, designed by our VP of Product Development, Katie Hess, leverages brand presence, advertising activity, and promotional activity across the retail channel to determine a score and rank for each brands retail execution.

gap Intelligence ScoreCard

Ashley was getting concerned that the query was taking over an hour to run so she reached out to the Prod Dev team for support. (We’re always here for you Ashley!) Despite the sheer amount of data she was attempting to export and the complexity of the query alone, we knew we could make it better. We started by reviewing every SELECT,WHERE, and JOIN statement within the query in painstaking detail to identify any missing indexes. We found that we were in fact missing one more multi-field index on our Promotions table which did already have 23 indexes. This helped speed up a slow portion of the query that filters for promotions by product, merchant and promotion type. We were able to reduce the query execution time from hours to less than 10 minutes by adding a migration with the following index to our Promotions table:

add_index :promotions, [:product_id, :merchant_id, :promotion_type],
  name: :index_promos_on_product_merchant_type

Without these indexes this specific query had to scan through our entire Promotions table to retrieve the data needed. Indexes provide pointers to data that have been categorized into groups just the way some books include an index to help the reader find the information they need without reading through every single word. Our new index created pointers to promotion data that was split up into groups based on the product_id, merchant_id, and promotion_type so the query would only need to scan within the appropriate, smaller group of promotion objects. One line of code was a big win for Ashley.

Rules to Live By

  • Add an index to all foreign keys (belongs_to or has_many relationships) to optimize for a fast lookup
  • Add an index for everything that is referenced in the WHEREJOIN and ORDER BY parts of your frequently used ActiveRecord or pure SQL queries
  • Add a dedicated index for any frequently used sort
  • Add a double index [:owner_type, :owner_id] for all polymorphic associations

The Twist

Indexes do require the use of storage space and in some cases can even be the cause of a hit to performance. A table’s indexes needs to be updated every time a new object is created, updated, or deleted. This process will block concurrent writes until it is finished. While, most of the time this impact is small compared to the benefits realized for frequently used select queries on larger tables, if you’re unsure, taking real world measurements can help you make the correct indexing decisions. Query plans, which can be generated by running your query with EXPLAIN ANALYZE preceding it, are great for investigating how to optimize complex queries.

Moral of the Story

The moral here is, don’t settle for slow requests when simply adding database indexes can make significant improvements to query performance. Ashley, working closely with gap intelligence’s Prod Dev department, didn’t settle and neither should you. Be like Ashley.

For more than 16 years, gap intelligence has served manufacturers and sellers by providing world-class services monitoring, reporting, and analyzing the 4Ps: prices, promotions, placements, and products. Email us at info@gapintelligence.com or call us at 619-574-1100 to learn more.