Apache Lucene is a Java toolkit that provides a rich set of search capabilities such as keyword search, query suggesters, relevancy, and faceting. It also includes a spatial module for searching and sorting with geometric data using either a flat-plane model or a spherical model. These capabilities are leveraged to varying degrees by Apache Solr and ElasticSearch, the two leading search servers based on Lucene.

In this talk I will start by briefly covering some core features of this search platform so that the audience appreciates the unique role it plays in the crowded world of information retrieval. I will then show examples of using some spatial features in Apache Solr, such as:
- indexing points, polygons, and other shapes into a Lucene document
- filtering search results by a query shape, including the use of different search predicates
- sorting by distance between indexed points and a query point

Next I will review some spatial features in Lucene spatial and ElasticSearch, such as:
- sorting bounding boxes by overlap percentage with a query box
- aggregating geohash grid counts for heatmaps

The talk will also cover the internal architecture and dependencies of Lucene spatial, and discuss a key library it depends on, Spatial4j. At the end of the talk I will note some limitations to be aware of, as well as planned improvements. Finally, key advances in geodesic (spherical geometry) information retrieval in Spatial4j will be highlighted.