
The denormalized query engine design pattern


Formal Metadata

Title: The denormalized query engine design pattern
Part Number: 47 of 48
License: CC Attribution - ShareAlike 3.0 Unported. You are free to use, adapt, copy, distribute and transmit the work or content, in adapted or unchanged form, for any legal and non-commercial purpose, as long as the work is attributed to the author in the manner specified by the author or licensor, and any adapted form is shared under the conditions of this same license.

Abstract
Most web applications need to offer search functionality. Open source tools like Solr and Elasticsearch are a powerful option for building custom search engines… but it turns out they can be used for way more than just search. By treating your search engine as a denormalization layer, you can use it to answer queries that would be too expensive to answer using your core relational database. Questions like “What are the top twenty tags used by my users from Spain?” or “What are the most common times of day for events to start?” or “Which articles contain addresses within 500 miles of Toronto?”. With the denormalized query engine design pattern, modifications to relational data are published to a denormalized schema in Elasticsearch or Solr. Data queries can then be answered using either the relational database or the search engine, depending on the nature of the specific query. The search engine returns database IDs, which are inflated from the database before being displayed to a user - ensuring that users never see stale data even if the search engine is not 100% up to date with the latest changes. This opens up all kinds of new capabilities for slicing, dicing and exploring data. In this talk, I’ll be illustrating this pattern by focusing on Elasticsearch - showing how it can be used with Django to bring new capabilities to your application. I’ll discuss the challenge of keeping data synchronized between a relational database and a search engine, and show examples of features that become much easier to build once you have this denormalization layer in place. Use-cases I explore will include:

- Finding interesting patterns in your data
- Building a recommendation engine
- Advanced geographical search and filtering
- Reacting to recent user activity on your site
- Analyzing a new large dataset using Elasticsearch and Kibana
Transcript (English, auto-generated)
So I will start with a little bit of a career introduction, which I promise is very relevant to the talk.
I'll be talking about a design pattern that I've sort of stalked throughout my career. So I started out many years ago at a tiny little local newspaper in Kansas called the Lawrence Journal-World, working on a web framework that eventually became Django about a year after I left. I moved on to work at Yahoo, where I briefly tinkered with the Flickr team and then worked
on various product development and research projects. I did data journalism at the Guardian, which was the most fun job ever, because you get to work with data on journalism deadlines, which sort of ties back into the original tagline for Django as well. Then after the Guardian, I did a startup. I co-founded Lanyrd with my wife Natalie, who's there in the front, ran that for three
years and then sold that to Eventbrite. So now, through a varied path of different machinations, I'm an engineering director at Eventbrite over in San Francisco. But the thing I want to talk about today is a design pattern.
Design patterns, really the power of design patterns is almost entirely in the name. Nobody really invents design patterns. You more sort of look at something that other people are doing and you slap a name on it, and then it becomes something which people can talk about. The pattern I want to describe today is one which, to my surprise, no one else seems to have slapped a name on yet, so I'm slapping the name on it and I want to start getting
discussions going, because I think it's a pattern that can help out with a lot of different projects in a lot of different ways. The name I picked for this pattern is the denormalized query engine, which I hope is just snappy enough that it'll work for people. Essentially, this is a way of taking a system built on a relational database and
enhancing it using a search index such that you can do a huge amount of additional interesting things with it. So the key idea is you have your relational database as your single point of truth, and we all got kind of infatuated with NoSQL a few years ago. I feel like that infatuation has worn off a little bit.
It turns out that 40 years of computer science has made relational databases a particularly reliable place to keep the data that you care about. But anyway, you have your data in your relational database. You then denormalize the relevant data into a separate search index. So you take all of that data, you think about the bits that would make most sense
to be denormalized, to be queryable in different ways, and you get those into a search index. And then you invest an enormous amount of effort in synchronization between the two, making sure that whenever somebody changes something in that database, you get that into the search index as quickly as possible. And that's the hard bit, and I'll be talking a little bit more about some strategies for doing that towards the end of this talk.
But why would you want to do this? Well, really this is a way of addressing some of the weaknesses that most relational databases have. The first one, which I'd imagine many people have run into, is relational databases aren't really very good at counting things. If anyone's ever implemented pagination where you have like 200,000 rows in
a table and you want to do page one, page two, page three, you'll find that the bit where you count star against that table is the bit that actually starts to hurt you first because the database has to scan through all 200,000 rows just to generate that count. As a general rule, any time you're doing something that end users
are going to be accessing, you need to avoid queries that read more than, say, a few thousand rows at a time. Relational databases are insanely fast at primary key lookups, and they're insanely fast at range queries against an index. But if you've got a query that needs to look at 10,000 rows,
that's gonna add up to one, two, three seconds, and it's gonna be something you can't deploy in an application that end users are hitting all of the time. And there's one that is currently specific to MySQL. I believe Postgres fixed this one. But MySQL can only use one of the indexes defined in the database for a query that's being executed.
So you might think that you can slap indexes on the age column and on the job title column, do searches across both of those at once. But actually, it'll pick one of those two indexes, and it'll use that to speed up your query. So actually, the moment you start doing more complicated lookups, the database indexing scheme really starts undermining you.
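To make that concrete, here is a rough Django ORM sketch (the model and column names are hypothetical):

```python
from django.db import models

class Person(models.Model):  # hypothetical model for illustration
    age = models.IntegerField(db_index=True)
    job_title = models.CharField(max_length=100, db_index=True)

# Even with separate indexes on both columns, MySQL will typically pick
# just one of them to execute this combined lookup:
Person.objects.filter(age=35, job_title="Engineer")
```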
Meanwhile, search engines have a whole bunch of strengths. Firstly, modern search engines, and I'll be mainly talking about Elasticsearch in this talk, but the same is true for Solr and other search engines as well, are really good at scaling horizontally. You can take a system like Elasticsearch and
literally just throw more machines, throw more nodes at it, and it will rebalance across that full cluster and give you more read performance, more write performance, and just general improvements in your capacity as you scale that up. And they're really good at counting. Database is not so great at counting. Search engines are really, really fast at this.
And they're great at aggregations as well, which I'll talk about in more detail in a moment. You can run queries across multiple indexed fields. So if you have a super complicated query where you're looking at four or five different fields and finding those together, a search engine will make short work of that.
They're unsurprisingly very good at relevance calculations and scoring, because that's kind of the nature of the beast. And they give you text search. You get all of these benefits and you can implement full text search as well. I deliberately left that one to last because my interest in search engines goes way beyond just using them to search for text that users have entered in.
I think that this entire design pattern revolves around the fact that search engines have strengths beyond just being able to implement a full text search. So, I'm gonna roll back in time to 2005 to talk about the first time I saw this pattern in the wild. And that was at Flickr, the photo sharing site.
So back in 2005, Flickr were having enormous scaling problems, because it was the birth of Web 2.0, it was social, I don't think social media was even a term back then. They had an enormous quantity of users coming in, joining the service and uploading photos. And as an industry, we hadn't really figured out how to do
this web scale engineering thing yet. Sites like Flickr were having to figure this stuff out from scratch and figure out how to scale up to handle these giant numbers, these enormous numbers of users and huge amounts of data. The CTO at Flickr, Cal Henderson, wrote a book about this called Building Scalable Websites, which came out over a decade ago now.
And I think it's still very relevant today, because it essentially talks through the lessons they learned at Flickr, figuring out how to scale these things up, how to build a scalable web application. And the technique that they used at Flickr that got them out of their hole was database sharding.
So this is a very common technique to this day. Essentially what you do is you say, okay, we had one MySQL database and we couldn't keep up. There were just too many writes coming into this database. So what we'll do is we'll split it into multiple databases and we'll put different users on different shards. So maybe we'll put users one through 10,000 on this database,
10,000 through 20,000 on that database, and so on and so forth. And this is a very naive description of sharding, but I hope it illustrates the concept. So if you do this, life becomes a lot easier, because as your user base grows, you just add more hardware, you add more databases. And if you want to do things like show me the most recent
photographs uploaded by Simon, you say, okay, well, Simon's on shard three, so I'll go to shard three and I'll select photos from there ordered by whatever. And that'll give me an answer to my question. So that sounds, well, it sounds relatively straightforward, but there's one massive problem, which is what you do with data that lives across multiple different shards.
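A toy sketch of that per-user shard routing (the ranges and names are hypothetical):

```python
# Each user's rows live on exactly one shard, so per-user queries
# only ever need to touch one database.
SHARD_RANGES = [
    (1, 10_000, "shard-1"),
    (10_000, 20_000, "shard-2"),
    # ...more shards as the user base grows
]

def shard_for_user(user_id):
    for low, high, shard in SHARD_RANGES:
        if low <= user_id < high:
            return shard
    raise LookupError(f"no shard configured for user {user_id}")

# "Most recent photos uploaded by Simon" hits a single shard:
# connect(shard_for_user(simon_id)), then SELECT ... ORDER BY uploaded DESC
```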
A great example at Flickr, Flickr were very early adopters of the idea of sort of user provided tags. And so you can go to Flickr today and you can see all of the photos that have been tagged with the raccoons tag and see the most recent uploads and all of that kind of stuff. And the obvious problem here is if you've got photos across
five or six or a dozen or a hundred different sharded databases, and you need to find all of the photos tagged raccoons, are you gonna do a query that hits a hundred databases at once and then try and combine the results as they come back? That's not really a sort of practical way of solving this problem. So what the Flickr team did is they took advantage of the
fact that they were now within Yahoo, and they leaned on a piece of Yahoo internal technology called Vespa, which was pretty much what Elasticsearch is today, but 10 years ago and written in C++ and kind of gnarly to work with. And so what they did is they said, OK, we're going to have our sharded databases. We'll have different users' photos living in different places. That's all fine. And then we'll have a search index, which we load all of the photos from all of the shards into. And this was on Vespa, which could scale horizontally and gave them all of those benefits. And then when somebody makes a query against Flickr, we can make a decision. We can say, if it's you and you're looking at your own
photos, that's going to be a database query. If it's you looking at other people's photos or if it's you looking at every public photo tagged raccoons, we'll turn that into a search query instead. This is a diagram from Aaron Straup Cope, one of the engineers at Flickr who worked on this at the time. And this turns out to work really, really well.
There's a key concept embedded in here, which is this idea of smart query routing. Because if you're building software that human beings use, one of the worst things you can do is have a bug where somebody makes an edit, and then they refresh the page or the UI refreshes, and their edit isn't shown back to them. If a user tags a photo raccoons and then goes and
looks at their photos tagged raccoons and it's not in there, this is a bug. And they're justifiably annoyed by it. So the solution Flickr used was to say, if you're looking at your own data, that should be a relational database hit, because we can't guarantee that the search index has got those changes yet. Generally with these systems, it can take a few seconds up
to a few minutes for the underlying search index to reflect those changes. If you're looking at other people's data, you're never going to know if your friends just uploaded photo tagged raccoons five seconds ago and you can't see it yet. That's not something you'll be able to observe. So at that point, it's safe for us to use the search
index for public and other people's data and the relational database for our own data. So fast forward a few years to 2010, when we used Solr, another open source search engine, to solve a similar kind of problem at Lanyrd.
So my wife and I launched Lanyrd on our honeymoon. It was supposed to be a side project, and it ended up growing way, way beyond that. But the idea was we were in Casablanca in Morocco, and we had food poisoning, and were unable to keep on traveling. And in Casablanca it was during Ramadan, when none of
the restaurants were open, so we couldn't get anything to eat anywhere else either. So we figured, OK, we'll rent an apartment for two weeks, we'll look after ourselves, get ourselves better, and we'll try and ship this side project that we'd been working on. We made the mistake of building a side project with user account logins, which you should never do, because users have expectations.
And we also built it on top of Twitter. And so the core feature of Lanyrd when we launched was you sign in with your Twitter account, and we show you conferences that your Twitter friends, the people you follow on Twitter, are speaking at or attending. Which, when you think about that in terms of a database query, ends up being a SQL query where you say, select star from events where it's in the future, and at least one
of the attendees is in this list of 1,000 IDs that I've pulled back from Twitter. This is the kind of thing that relational databases are incredibly bad at. And so we launched, we got a flurry of initial activity, we ended up on TechCrunch UK unexpectedly, and the site just died instantly. Because the most popular page on our site was the page
with the most expensive database query. And so the way we resolved this was we'd started using Solr to provide search. And we thought, well, maybe this is something we can redefine as a search problem. So we turned this feature from a giant, hairy SQL query into a Solr search where we said, hey, Solr, I want events that are in the future where the attendee IDs
field in Solr matches any one of this list of up to 2,000 IDs that I've pulled back from Twitter. And do that and order by date. We built this thinking, this will probably do us for a few months until we can figure out a better solution. And now five years later, that's still how this page works, because it turns out search engines are incredibly
good at exactly that kind of query. Normally, a search engine expects to be dealing with words, because users type words in. But actually, things like user IDs or other terms work exactly as well. And the underlying architecture of the search engine is really good at dividing those up, at merging together
all of the documents in the index that have fields that match whatever criteria it is you're passing in. So this is a good time to switch over to talking a little bit about Elasticsearch. As I mentioned, Flickr used Vespa, and I looked this morning, and it turns out Yahoo have actually open-sourced it, which is kind of interesting.
But I don't know if it's gained an enormous amount of community adoption yet. We used Solr at Lanyrd, which is a very fine search engine, albeit one which was clearly a product of the time in which it was designed. It's very XML heavy.
They sort of added JSON as a later detail. Elasticsearch is what you would get if you designed Solr today. If you said, OK, clearly the world speaks JSON, and the world speaks HTTP, and we're going to want things to scale horizontally. Let's combine all those things together and build a really good search engine. And so it's an open-source search engine.
It's built on top of the same Lucene search library that Solr and other projects have used in the past. The interface is entirely JSON over HTTP, which is great because it means you can talk to it from any programming language that speaks those two things, which I'm pretty sure is everything these days. And it makes it very easy to use. You can use it as a sort of central point between different
programming languages very easily as well. The marketing blurb claims it's a real-time search engine. In practice, it's close enough. We're talking a few seconds between you submitting a document into Elasticsearch and that document becoming available across the cluster for you to run queries against. It has an insanely powerful query language. I'll show you a little bit of that as we go along.
But essentially, there's a domain-specific language written in JSON that lets you construct extremely powerful and complicated search queries. And it also has a very strong focus on analytics. If you go to the Elasticsearch website, you'll see that they mainly talk about it as an analytics engine for things like log analysis and so forth.
And that's where a lot of their focus has been. And as I said earlier, this is the thing that excites me about search engines is, sure, full-text search is nice. But being able to do these more complicated analytical queries is where they get really interesting. And then finally, the elastic really does mean elastic. Elasticsearch scales horizontally. In the past, I've run a cluster of four nodes and just
killed one of them at random to see what would happen. And you can watch the documents rebalancing across the remaining nodes in real time using various visualization plugins. So it does live up to the elastic in its name. So I've talked a little bit about how I'm excited about more than just search.
The feature that I'm specifically excited about is aggregations. And the best way to illustrate those is with an example. So this is a project that I built last year. It's a little side project. Embarrassingly, since I'm speaking at DjangoCon, this is actually the only thing I've ever written in Flask, because I decided to try it out and see what Flask looked like.
Flask worked very well. It was a very nice way of building this. So what this is, it's called DC Inbox Explorer. And there is a project at the Stevens Institute of Technology called DC Inbox, which collects the emails that senators and
congresspeople send out to their constituents. So this researcher subscribes to all of these different mailing lists and gathers all of those emails and puts them in a giant JSON file that you can then use to run research about who's emailing about what and when. And because it was a giant JSON file, it was very easy
for me to take this and import it into Elasticsearch. The source code for this is all available on GitHub, and there's not very much of it. So if you want to see how this works and get an example of Elasticsearch, this is a pretty good starting point. So basically what this does is it shows you all 57,000 emails that have been collected by this project. And it lets you search.
So I can search for, say, security. And then I get back 15,000 results. It shows the number of emails sent by month at the top. And then down the side, this is my sort of pet favorite feature of any piece of search software. It has these things, which are sometimes called facets, sometimes called filters. So what this is saying is that without a search term,
there are 56,000 emails. 37,000 of those were sent by Republicans. 19,000 were sent by Democrats. 280 were sent by Independents. You can see them broken down by representatives versus senators, by the states that that politician represents. And if I then search for, say, security, those numbers
update, and I can see that of the emails that mention the term security, 2,500 were sent by senators. If I click on that, I'm now seeing emails sent by a senator that mention security. And I can see that of those, 810 are from Democrats, 1,600 from Republicans. I can now see that the state whose senators care the most about security is apparently ME. Is that Maine? And I get this. And so I can keep on drilling down and say, OK, emails mentioning security from Maine. I've got gender in there. I can look by male or female senators. I can see the actual senators themselves. So Susan Collins is the most prolific emailer from the
state of Maine on the subject of security. But the key thing I'm illustrating here is that when you've got a search engine like Elasticsearch, these kinds of calculations, these aggregate counts, become essentially free; they become very easy from an implementation point of view.
The performance is super fast. So you can build this kind of highly interactive interface very easily on top of that underlying engine. And if you go on sites like Amazon and booking.com and so forth, they all make very extensive use of this faceted navigation pattern. If you try to do this with a relational database, you're likely to run into some pretty nasty performance
problems pretty quickly. What I can also show you is, under the hood, I'll show you a little bit of what Elasticsearch itself looks like to work with. This is a tool called Sense, which is kind of an IDE for talking to Elasticsearch. And as I mentioned, Elasticsearch is all JSON and HTTP, so I'm going to do a GET against the slash email
slash underscore search endpoint. And this returns, essentially, all of the emails. It's paginated. I think it returns 10 to a page. And you can see at the very top, it says that there are 56,000 of them. This is what an email looks like. It's got all of this data that I ingested when I indexed the documents.
And so then I can say, actually, I want to run a search. So this is illustrating the Elasticsearch domain-specific language. I'm doing a GET against /email/_search with a query, and I want to match the term security in the body field. And I'll run that, and now it gives me back 15,000 emails.
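That request, reproduced as a hedged sketch from Python (the index and field names are from the demo; the localhost URL is an assumption):

```python
import requests

# Elasticsearch accepts a query body on a GET request, as noted below.
response = requests.get(
    "http://localhost:9200/email/_search",
    json={"query": {"match": {"body": "security"}}},
)
print(response.json()["hits"]["total"])  # roughly 15,000 in the demo
```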
And those are the ones that match this particular query. There's a slight oddity of Elasticsearch here: this is an HTTP GET which includes a body, as if it were an HTTP POST. I had no idea this was even possible, but apparently it is. Elasticsearch uses it for everything. So that's something I learned from playing around with this. But let's go a step further and say, OK, we're going to
search for all of the emails matching security, but I want to also get numbers broken down by role type, which is senator versus representative, and by party. And if I run this search here, I get back all of these different search results. And then at the bottom, I get this aggregations block where it says role type representative has 12,500, senator has 2,500; Republicans 10,000, Democrats 5,000, Independents 123. These are the numbers that you saw in the interface earlier. But this illustrates how it's just a little bit of extra JSON that you add to your query. And the query time for this was four milliseconds, which
I think is pretty good. I'll show one last example just to illustrate something that I think is unique to Elasticsearch, which is that Elasticsearch lets you take these aggregations and nest them. So here what I'm doing is I'm saying, I want to aggregate counts by the party. And then within that party, I'd like to do counts by the role type.
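A sketch of that nested aggregation in the query DSL (the field names follow the interface shown earlier; the exact mapping is an assumption):

```python
# Count emails per party, and within each party, per role type.
# "size": 0 skips the matching documents and returns only the counts.
query = {
    "size": 0,
    "aggs": {
        "by_party": {
            "terms": {"field": "party"},
            "aggs": {
                "by_role_type": {"terms": {"field": "role_type"}}
            },
        }
    },
}
```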
So if I run this, I get back results where you can see that the republicans have sent 37,000 emails. Of those, 31,000 were sent by representatives, 5,000 sent by senators. The democrats, 19,000 emails, of which 15,000 were representatives and 4,000 were senators. And then the independents at the bottom, there's apparently one email sent by an independent representative, and the rest were all sent by senators.
I'm intrigued. Let's have a look. So if I do independent here, and then representatives, sure enough, Representative Gregorio Sablan has sent a single email apologizing for spam, which, considering
he's only sent one email, is a little bit surprising. So there we go. So that's sort of some of the power that you get once you start adding an engine like Elasticsearch into your stack.
I said earlier I'd talk about the difficult problem, which is the synchronization strategy between the search index and your relational database. And I've tried a whole bunch of different ways of doing this. The three that I've had the most luck with are these, so I'll talk through them in a little bit more detail. But to repeat, the problem we're trying to solve here is
you've got users who are making changes in your relational database. They're updating things, adding things. You want those to be reflected in your search index as quickly as possible, because any delay could result in strange behavior that the users don't understand. And you want to do this in a way that is performant and efficient and doesn't cause too much overhead on the
various parts of your stack. So the simplest way to do this is to basically keep it in the database, and it's to have a last touched or a changed timestamp on the actual rows of your database, which gets updated any time somebody changes that row. This is a very common pattern. Here's what it might look like in the Django ORM.
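A minimal sketch of that slide (the model name is hypothetical; the field matches the description that follows):

```python
from django.db import models
from django.utils import timezone

class Conference(models.Model):  # hypothetical model name
    # ...the record's real fields go here...
    last_touched = models.DateTimeField(db_index=True, default=timezone.now)
```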
I've got the last_touched column. It's a DateTimeField with db_index=True. It's important to stick an index on this, because you're going to be pulling from it in a cron job once a minute. So it needs to be able to return results quickly. And you set it to default to now. And that's fine. And then once you've got this set up, the simplest thing to do is just have a cron that runs once a minute,
does a select star from that table where the changed date is within the last minute, and then re-indexes those items. The nice thing about having this as a timestamp is that an indexer can keep track of the last timestamp it saw. So if your indexer doesn't run for five minutes, when it runs
again, it can catch up on five minutes' worth of changes all at once. There is a subtlety to this, which is that quite often when you're building a search index from a relational database, there are changes that happen to other tables which still should trigger an update. So on Lanyrd, we had a concept called a guide, where somebody might create "my guide to JavaScript conferences in Europe". And they'll then add events into that guide. The thing is, we try and include the name of that guide in searches. So if you search for JavaScript Europe, as long as an event has been added to the JavaScript in Europe guide, it should show up. What that means, though, is anytime somebody changes the guide, we need to re-index all of the events that link to that guide, because there's a bit of dependent data that has now been updated. And if you're using the last-touched mechanism, that's pretty easy. You can say, any time a guide is edited, do guide.conferences.all(), find all the conferences attached, and update their last_touched date to the current timestamp as well.
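A sketch of both halves of that, assuming the hypothetical Conference model above and a reindex() helper that writes one document to Elasticsearch:

```python
from django.utils import timezone

def touch_guide_conferences(guide):
    # Cascading touch: dependent data changed, so mark every conference
    # attached to this guide as modified.
    guide.conferences.all().update(last_touched=timezone.now())

def index_changes(last_seen):
    # Run from cron once a minute; catches up however far behind it is.
    for conference in Conference.objects.filter(last_touched__gt=last_seen):
        reindex(conference)  # hypothetical Elasticsearch writer
```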
For the most part, that works fantastically well. And this means that you get these cascading changes happening within your database, which your search index can then catch up on. So a slightly more sophisticated way of doing this is with a queue. You can have application logic that says, anytime somebody
updates an event or updates a document, write that document's ID into a queue, and then have something at the other end of the queue which is consuming from it and re-indexing those documents. A really nice side effect of this is that you can have de-duping, although you get de-duping with the previous mechanism as well. But you can write your indexer so that it says,
OK, there's been a flurry of activity around this particular document, but I'm going to batch those up. And a few seconds later, I'll do one re-indexing call to recreate that in Elasticsearch. So I've built this a few times. I've built this on top of Redis, which worked great as a little Heroku app.
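A rough sketch of that Redis-backed queue (the key name, batching window, and reindex() helper are all assumptions):

```python
import time
import redis

client = redis.Redis()
QUEUE_KEY = "reindex-queue"  # hypothetical key name

def enqueue(doc_id):
    # Called from application code whenever a document changes.
    client.rpush(QUEUE_KEY, doc_id)

def consume(window_seconds=2):
    while True:
        time.sleep(window_seconds)  # let a flurry of edits accumulate
        ids = set()
        while (raw := client.lpop(QUEUE_KEY)) is not None:
            ids.add(int(raw))  # the set de-dupes repeated updates
        for doc_id in ids:
            reindex(doc_id)  # hypothetical: one re-indexing call per doc
```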
I've built this at Eventbrite, where we use Kafka for it. And in fact, we have a slightly more sophisticated system, which I'll dive into in a moment. Another nice thing about queues is if you have a persistent queue, you get replayability as well. So you can replay all of the indexing changes from the past five minutes. The most sophisticated way that I've seen this done is what we do at Eventbrite, which is to tap into the database's replication log itself. So MySQL has very robust replication. It's very easy to have a MySQL leader database and then set up multiple replicas that reapply all of the changes made to that leader. It turns out the replication stream is this slightly weird
binary protocol, but if you know what you're doing, you can tap into that yourself, and you can write your own code that reacts to changes that have been made to the database. There's a fantastic open source Python library that we use for this at Eventbrite called Python MySQL Replication. So at Eventbrite, we built a system called Dilithium. And Dilithium is essentially a way of listening to those
database changes and using them to trigger other actions around the Eventbrite system. The way it works is you have your master MySQL database with all of the writes going to it. You have a replica MySQL database that's
replicating off of that. Then Dilithium listens to that replica. So it's replicating from a replica to figure out what changes are going on. It sees things like event row 57 has been updated, attendee row so and so has had these fields changed. And it takes that flow of data and turns them into
what we call interesting moments, because we can't use the word event at Eventbrite because it's already taken by one of our main domain objects. So those moments that come through, we translate into things like event 57 has been updated, event 23 has been created, order 37 has been placed.
Those we then write into Kafka, which is a very robust, high-performance message queue that LinkedIn put out a few years ago. And our search indexes are one of many different components that can then listen to that Kafka queue and decide when they need to re-index things. So it's a pretty complicated flow of data once you stick
it in a diagram. But essentially what this means is any time any piece of code at Eventbrite updates one of the rows in our events table, Dilithium will pick that up and turn that into a Kafka message. Our indexing code will listen to that and say, oh, event 57 has been updated. It'll then query the database to figure out the
current details of that particular event and then write those changes into Elasticsearch. So the end result is we have something which scales extremely well and which can be run on many different machines at once and gives us a very robust path from initial database change to updates in our Elasticsearch index.
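A minimal sketch of tailing that replication stream with the python-mysql-replication library mentioned above (the connection settings and the downstream publish function are assumptions):

```python
from pymysqlreplication import BinLogStreamReader
from pymysqlreplication.row_event import (
    DeleteRowsEvent,
    UpdateRowsEvent,
    WriteRowsEvent,
)

stream = BinLogStreamReader(
    connection_settings={
        "host": "replica.internal",  # read from a replica, as described
        "port": 3306,
        "user": "repl",
        "passwd": "secret",
    },
    server_id=100,  # must be unique among MySQL replication clients
    blocking=True,
    only_events=[WriteRowsEvent, UpdateRowsEvent, DeleteRowsEvent],
)

for binlog_event in stream:
    for row in binlog_event.rows:
        # e.g. turn "a row in the events table changed" into a queue message
        publish_interesting_moment(binlog_event.table, row)  # hypothetical
```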
So I've got a few tips and tricks that I wanted to dive into, just little bits and pieces that I picked up that have helped with implementing this overall pattern. And the first one is one that can really help avoid serving stale data to your users.
And that's to do everything with your search engine in terms of object IDs as opposed to the raw data itself. Generally with your search index you'll be writing a lot of data into it. It needs to know the titles of things, the descriptions of things, any fields that you might want to search by. So there's a temptation to hit the search index, get that data back, and then use that to construct objects
that you would present back to your user. The moment you do that, though, you're setting yourself up for some really nasty latency risks. Because as I said earlier, there's going to be a three to five, maybe 10 second delay between changes in your database and changes in your index. You really don't want to be showing that stale data to your users. So the trick here is very simple.
When you run searches, all you ask back from the search engine are the IDs of the underlying records. So you run a search, you get back a list of, say, 20 integer IDs. You can then hit the database directly to inflate those into actual finished objects. Now, the downside of this is
you're adding additional load to your database. The good news is that databases are insanely quick at primary key lookups. Any time you're doing primary key lookups or lookups against an index, that's going to return really fast. So with Django, you can use the in_bulk() helper method and make extensive use of prefetch_related() as well, which again is a very fast way of retrieving data.
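A sketch of that inflate step, in the older elasticsearch-py style (the index and model names are illustrative):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch()  # assumed local cluster

def search_events(query_body):
    hits = es.search(index="events", body=query_body)["hits"]["hits"]
    ids = [int(hit["_id"]) for hit in hits]

    # One fast primary-key query inflates the IDs into fresh model objects.
    events_by_id = Event.objects.in_bulk(ids)

    # Preserve the search engine's ranking while serving fresh rows;
    # skipping missing IDs also sets up the self-repair trick described next.
    return [events_by_id[pk] for pk in ids if pk in events_by_id]
```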
And you can set it up so that your users will never see stale data, because even if the data was stale in the index, by the time it's pulled from the database, it's going to be the most recent version of things. A related concept to this is, if you're pulling things directly from the database,
what do you do if something's been deleted? What do you do if your search engine gives you back ID 57, and then when you go to fetch it from the database, ID 57 has been deleted in the time it took for that search to come through? The way we've handled this in the past is essentially to have a self-repairing mechanism. The code that queries the database can notice when there's an ID that's missing, and then stick it on a Celery queue, or use some other mechanism, so that the search index knows to then remove that document from the index entirely. The downside of this is somebody might ask for a page with 10 results on it, and one of those results is missing. So you end up giving them back nine results and then
quietly filing that 10th away to be deleted. My hunch is that no one will ever notice this. I don't think people go around counting the number of results they get on a page. So it's probably OK. Then one last trick, which again ties into this idea. This is something I'm using on a project at work at the moment.
And I'm calling it the accurate filter trick. Essentially, this is an additional way of solving for this latency between your database and your search index. So imagine, if you will, that you're building a system where users can save events. So they'll see an event that they want to go to.
They hit a Save button, and that event is saved to their account in some way. I would like to be able to answer queries about what events this user has saved by hitting the search index. Because if I can hit the search index, I can combine it with all of these other benefits. I can let users search for text within the
events they've saved. I can do filters by geography. There's all sorts of useful things I can do with this. And that's easy enough to implement. You have a field on your event document in Elasticsearch containing the IDs of the users who have saved that event. And then you can do a search like this. You can say, search for events where one of the saved-by-users values is the user ID that I'm dealing with.
There's just one obvious problem with this. If a user saves an event and then goes and looks at their list of saved events within a few seconds, and that event isn't shown to them, then obviously something is broken. This is the latency problem that we've been fighting since the beginning of the talk. So what you can do is you can say, OK, any time I'm
running that query, the first thing I'm going to do is hit my relational database to figure out what are the events that this user has saved in the last x minutes, let's say the last five minutes. So this is guaranteed to give me an accurate model of the user's recent activity. And it'll give me back, say, four or five IDs of
documents that we know that the user has saved. Once you've got that list, these are the ones that were saved in the last five minutes, you can construct an Elasticsearch query where you say, give me back any event where either the user is listed in the list of users who have saved this event, or the event itself is one of these five that we know they've saved recently. And as a search query, this will run crazy fast.
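A sketch of the whole trick (the model and field names, SavedEvent and saved_by_users, are hypothetical stand-ins for the ones described):

```python
from datetime import timedelta
from django.utils import timezone

# Step one: the database gives an exact picture of very recent saves.
recent_ids = list(
    SavedEvent.objects.filter(
        user=user,
        created__gte=timezone.now() - timedelta(minutes=5),
    ).values_list("event_id", flat=True)
)

# Step two: OR that guaranteed-fresh list into the search query.
query = {
    "query": {
        "bool": {
            "should": [
                {"term": {"saved_by_users": user.id}},  # index may lag
                {"ids": {"values": recent_ids}},        # guaranteed fresh
            ],
            "minimum_should_match": 1,
        }
    }
}
```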
It gives you all of those benefits. But this is guaranteed to be exactly up to date with the activity of your users. Saved events are one obvious application of this. There are a whole bunch of other things where, if you want to get a precise, up-to-date reflection of the state of your system, you can use tricks like this
to pull those out of Elasticsearch. So, a few more use cases that I've applied Elasticsearch to in particular. One that we use at Eventbrite is for recommendations. Because it turns out recommending events to a user is essentially just another search problem.
There are a bunch of ways you can do this. You can say, find events where one of my friends has saved that event. This is the way we did our calendar for Lanyrd earlier. And you can also say, find events that are similar to the last 10 events that I've saved. A very straightforward way of doing that is to look at the last 10 events saved by a user, collect together the
text from the title and description of all of those events into a giant blob of words, and then just search for those words with a Boolean OR clause, which is enough for Elasticsearch's relevance to kick in. And it'll give you back other events that are similar in textual content to the events that the user has saved as well. And search engines are really good at relevance scoring and boosting.
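A sketch of that blob-of-words recommender (the field and relation names are hypothetical; a match query ORs its terms together by default, and excluding the already-saved events is an extra touch, not something spelled out in the talk):

```python
saved = list(user.saved_events.order_by("-created")[:10])
blob = " ".join(f"{e.title} {e.description}" for e in saved)

query = {
    "query": {
        "bool": {
            "must": {"match": {"text": blob}},  # relevance does the work
            "must_not": {"ids": {"values": [e.pk for e in saved]}},
        }
    }
}
```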
So you can fine-tune this stuff to a huge extent using the tools that are built into the search engine. Another thing Elasticsearch is great at is geographic search. It's got built-in support for geo. You can add latitude and longitude points to your documents. And then you can do things like, give me all documents within five kilometers of this point. You can even send it a polygon and say, here is the shape of Canada, give me everything that falls inside that polygon. And again, this is stuff which, if you're not using Postgres, can be quite difficult to do with a relational database. But more importantly, you can combine these with all of the other searches and filters. So if I've got a recommendation system built on Elasticsearch, I can say, recommend the events similar to these events that fall within this geographic area, and further combine that with other options as well. Search engines in general are great for visualizations. You saw this earlier with the DC Inbox Explorer. I've got this little graph at the top, which is actually
generated from just another one of these aggregations. When I search against Elasticsearch, I can say, give me back counts per month for this time period. And once I've got those counts back, I can turn those into a bar chart. And if you look at the way Elasticsearch is used for log analysis, people do some really exciting visualizations on top of the raw data that's being
collected by these aggregations. And one way to think about this is it's kind of like having a real-time MapReduce engine. If you've ever used MapReduce on something like Hadoop, it's a very powerful way of running a query across many different machines and getting results. But it's generally something you want to run as a batch job, because it might take 30 seconds to a minute for it
to return results. Elasticsearch, under the hood, is doing pretty much exactly that. If you give it a search, it will spread it out across the nodes in your cluster, combine the results together, and use that to return documents to you. But it's designed to work in real time. You're getting response times measured in milliseconds,
which means that you can expose these directly to your users. So in summary, you should denormalize your data to a query engine. It's definitely a good idea. It lets you build all sorts of things you couldn't build before. And Elasticsearch, it turns out, is a pretty good option for this. And I've left lots of time for questions, so thank you very much.
I will take that question. Thank you. You were talking about how you could use a queue to sort of deduplicate repeated indexer actions. Can you speak a little more to that? So the thing that you want to avoid is 500 people interact with something in your database, and then you
send 500 updates to your Elasticsearch index in a giant flurry. So really, is this de-duping? Yeah, it's de-duping, it's rate limiting. It's being able to get smart about this and say, OK, there were 500 updates within a short space of time, but I'm actually going to turn that into a single, combined
update to the index. The way I've built this on top of a queue is the code that listens to the queue. So actually, one way that I built this against Redis was to say, when an indexing request comes in, if the indexer hasn't done anything in five seconds, just index that thing straight away.
If the indexer has run within the past five seconds, that suggests that there's a lot of activity going on. So then, hold out for a couple of seconds to see if other updates come in for the same event ID. And if they do, bundle those together and send that all at once. I think, actually, the de-duping becomes a lot easier if you use the last modified timestamp, because then
it's just your cron job runs once a minute, and if there were 500 updates to an event in the last minute, you'll still only re-index it once. Hi. I was wondering, are you the only person who's put a name to this and is educating people about it?
As far as I know, I am, which I find really surprising, because I've seen lots of places doing exactly this. I just don't think anyone's put a name on it before. So if somebody does have an alternative name, then I'd love to hear about it. But yeah, as far as I can tell, I'm the only person who's said, let's give this a name and start discussing this as a general strategy. So there's no post about it? Definitely not yet. I should write a blog entry. Yes, please write a blog entry. That would be great. So you talked a lot about getting stuff out. I was wondering, specifically for your DC Inbox example,
do you have to do a lot of pre-processing of the data before it goes in when you're actually structuring your documents and stuff like that? Is there an extra step on that side? Yeah. So one concept I didn't really talk about is there's a thing in Elasticsearch called a mapping, which is basically the same thing as a SQL schema.
You don't have to use mappings. You can just start blasting JSON documents at it, and it'll work. But if you actually want to be able to differentiate dates and times from geographic points and so forth, you need to use them. And so here's the mapping for the DC Inbox thing. And actually, there's a lot of data in this one. So it's got caucuses and congress numbers and C-SPAN IDs and so forth.
The source code for this is all available on GitHub, so you can see. So yeah, I actually used a library called Elasticsearch DSL for this, which is a Python library that tries to be a slightly higher level way of working with Elasticsearch. You can also compose this as a JSON blob and then post that to Elasticsearch itself.
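For flavor, a heavily trimmed sketch in the elasticsearch-dsl style (current releases of the library use Document; the handful of fields shown is a tiny subset of the real DC Inbox mapping):

```python
from elasticsearch_dsl import Date, Document, Keyword, Text

class Email(Document):
    subject = Text()
    body = Text()
    party = Keyword()      # exact-value fields are what power the facets
    role_type = Keyword()
    date = Date()

    class Index:
        name = "email"
```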
But yeah, so your first step is going to be designing a mapping for your data. The actual indexing is pretty trivial once you've got the mapping in place, because you really are just constructing JSON documents and then posting them back up to the server. But yeah, the mapping design is quite important. Thanks for the talk. That was really great. So I know that you're talking about how performant this
is to find documents based on whatever criteria. So is it equally performant if I wanted to do something like aggregates, to get average scores or standard deviations on something inside the document? Absolutely. The strength of aggregations is that they're insanely fast for this stuff. As you get more complicated with the aggregations,
the performance cost can start to add up. But honestly, the most complex queries I've come up with are in the order of 100 milliseconds as opposed to 10 milliseconds. So generally, the performance is really good. And a lot of the aggregations I've shown so far are what are called bucket aggregations, where you divide documents into different named buckets. There are also aggregations for metrics
that can calculate things like standard deviations and sums and medians. And even in geospatial terms, there are aggregations that will calculate a bounding box around all of the documents. So there's a whole bunch of additional power and flexibility you get around that. All right, thank you so much, Simon. Thank you.