Accurate polygon search in Lucene Spatial (with performance benefits to boot!)

Formal Metadata

Title
Accurate polygon search in Lucene Spatial (with performance benefits to boot!)
Title of Series
Number of Parts
188
Author
License
CC Attribution 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language
Producer
Production Year
2014
Production Place
Portland, Oregon, United States of America

Content Metadata

Subject Area
Genre
Abstract
Lucene, and the NoSQL stores that leverage it, support storage and searching of polygonal records. However, the spatial index implementation has traditionally returned false matches to spatial queries. We have contributed a new spatial indexing strategy to Lucene Spatial that returns fully accurate results (i.e. exact matches only). Better still, this new spatial search strategy often enables keeping a smaller index and faster retrieval of results.
I will illustrate why false matches happen -- this requires a high-level walkthrough of spatial index trees -- and real-world cases where it makes a difference.
Our initial workaround was to query Elasticsearch through a separate server layer that post-filters Elasticsearch results against the query shape, removing the false matches. We've now built a similar approach into Lucene Spatial itself. Because it lives inside Lucene, this new solution can take advantage of numerous efficiencies:
1. it filters away false matches before fetching their document contents;
2. it uses a binary serialization that is far faster than the GeoJSON we used before;
3. it optimizes the tradeoff between work done in the index tree vs. post-filtering, often resulting in a smaller index and faster querying.
I will provide benchmark numbers. I'll illustrate how developers and database administrators can use this improvement in their own databases (it's easy!).
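The false-match problem and the post-filtering fix described in the abstract can be illustrated with a minimal sketch. This is not the Lucene Spatial implementation: it assumes a flat, single-level grid instead of a spatial index tree, approximates each polygon by the cells of its bounding box, and uses a point query rather than a query shape. All names (`CELL`, `build_index`, `coarse_search`, `accurate_search`) are hypothetical.

```python
from collections import defaultdict

CELL = 1.0  # hypothetical grid cell size; a real index uses a multi-level tree


def cells_for_bbox(poly):
    """Return the grid cells covering the polygon's bounding box (a coarse approximation)."""
    xs = [x for x, _ in poly]
    ys = [y for _, y in poly]
    x0, x1 = int(min(xs) // CELL), int(max(xs) // CELL)
    y0, y1 = int(min(ys) // CELL), int(max(ys) // CELL)
    return {(cx, cy) for cx in range(x0, x1 + 1) for cy in range(y0, y1 + 1)}


def point_in_polygon(pt, poly):
    """Exact containment test using ray casting (even-odd rule)."""
    x, y = pt
    inside = False
    n = len(poly)
    for i in range(n):
        xa, ya = poly[i]
        xb, yb = poly[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point matter.
        if (ya > y) != (yb > y):
            x_cross = xa + (y - ya) * (xb - xa) / (yb - ya)
            if x < x_cross:
                inside = not inside
    return inside


def build_index(docs):
    """Map each grid cell to the ids of documents whose approximation touches it."""
    index = defaultdict(set)
    for doc_id, poly in docs.items():
        for cell in cells_for_bbox(poly):
            index[cell].add(doc_id)
    return index


def coarse_search(index, pt):
    """Index-only lookup: fast, but may return false matches."""
    cell = (int(pt[0] // CELL), int(pt[1] // CELL))
    return set(index.get(cell, set()))


def accurate_search(index, docs, pt):
    """Coarse lookup followed by an exact geometry check on each candidate."""
    return {d for d in coarse_search(index, pt) if point_in_polygon(pt, docs[d])}


docs = {
    "triangle": [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)],
    "square": [(2.0, 2.0), (5.0, 2.0), (5.0, 5.0), (2.0, 5.0)],
}
index = build_index(docs)
query = (3.5, 3.5)
coarse = coarse_search(index, query)           # includes the triangle: a false match
exact = accurate_search(index, docs, query)    # only the square survives the exact check
```

In this toy setup a coarser grid means fewer index entries but more candidates to post-filter, which is a simplified version of the tree-versus-post-filter tradeoff the abstract mentions.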
Keywords