Spatial in Lucene and Solr

Speech transcript
This talk is on spatial in Lucene and Solr. Before I get going, I'd like to see a show of hands: who is familiar with Lucene, Solr or Elasticsearch and knows what it is? Maybe you've used it, maybe you haven't. I was going to go fairly quickly over what search is all about, since many of you already know. One reason I wanted to come to this conference was to mingle with all the other spatial folks. I've never been to a spatial conference before; I've only talked at search conferences. So I wanted to come and tell people about search, even though many of you already know about it.

A bit about me: I'm basically a Lucene/Solr search expert, I spend most of my time on spatial on that platform, and I'm a freelancer. That's what I do.

The talk is about search: what's interesting about search, and why it is its own thing and not just a couple of features on other products. The bulk is about spatial in Solr, and a bit about how it fits together under the hood. So, search: what is there to search?
Search is more than just a search box. We take it for granted, because there's a lot going on that may not be readily apparent, but when you start working with it there's a lot to it: stuff like stemming and tokenization and language-specific details. Then of course there's relevancy, ordering the results by relevance. Compare this to a database with normal SQL features: there you sort by a column; you don't sort by how relevant a record is to the query. So that's a big deal in search.

Query completion is a really useful feature of a good search interface: being able to suggest search queries or search terms based on what you're typing in the search box. "Did you mean": if you mistype a query, the search platform can often figure out a better one for you. Highlighted snippets: if you return some number of results, a little snippet of context around where the matching words are in the document gives the user a better sense of whether it's relevant or not.

Many of us see mostly these features on Google's front page, for example. But as a first mental model I wouldn't choose Google. It's a great introduction for those who are new to search, but you have to break that model of what the platform can do. It gives you a sense of the technology, but not necessarily how it comes together as a platform that you work with.

Faceting is also on this slide; I'll cover it on the next one. Result clustering is clustering the search results based on word statistics. You don't see that too often, but sometimes, for more analytical interfaces where there are a lot of results, you might want groupings based on certain themes in the text. And the query side has some interesting features besides the standard Boolean stuff, like fuzzy matching: I might mistype a word, or a letter or a character, and it still finds stuff that's close enough.
So, faceting. Show of hands: do you know what I mean by faceting? OK, good. This is awesome, and it's amazing how many people don't know what it is, although more and more people are recognizing it. For those of you who don't know, I strongly recommend that you learn more about it than just this one slide.

Essentially, it aggregates counts over a set of results: summary counts. We see it all the time as users, typically on e-commerce platforms, but it is definitely not only an e-commerce feature, although the search vendors have certainly been selling hard to e-commerce vendors.

Here we see two different cases. On the left is an e-commerce site, Newegg. I cut off the main body of the page, the search results with the products. On the left of the page we see the faceted results: aggregated counts for different categories. The idea is that the document is a product that might have a field called department or something like that, or hard drive size, or price, and you want aggregated counts over the entire result set: not just the top 10 showing right now, but over everything. At scale this is hard to do. It takes technology that is designed to do it, and the search platforms do it very well, as a dedicated feature. It's not impossible to do if you just had standard SQL at your disposal, but it's much harder, so much harder that you wouldn't think to build an app that had this feature. I see people build apps that conspicuously don't have this feature, when it's obvious to me that the app would be great with faceted navigation, and so you should be using a search platform.

On the right we see analytics, where there's not even necessarily search involved. Solr and Elasticsearch are becoming more and more commonly seen in the analytics space, where you build dashboards like this on the fly, easily, and faceting is a big part of that. Again, it's not that it's actually impossible to do with a database, but it's so easy with Elasticsearch.
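To make the idea of facet counts concrete, here is a minimal Python sketch. It is not Solr or Elasticsearch code; the product records and the field names `department` and `price` are made up for illustration. The point it shows is that the counts are computed over every matching document, not just the first page of results.

```python
from collections import Counter

# Hypothetical product documents; the field names are illustrative only.
DOCS = [
    {"id": 1, "name": "2TB hard drive", "department": "storage", "price": 89},
    {"id": 2, "name": "4TB hard drive", "department": "storage", "price": 129},
    {"id": 3, "name": "hard drive enclosure", "department": "accessories", "price": 25},
    {"id": 4, "name": "usb cable", "department": "accessories", "price": 5},
]


def search(docs, keyword):
    """Return every document whose name contains the keyword."""
    return [d for d in docs if keyword in d["name"]]


def facet_counts(results, field):
    """Aggregate counts of a field's values over the whole result set."""
    return Counter(d[field] for d in results)


results = search(DOCS, "hard drive")
top_page = results[:2]                        # what the user sees first
facets = facet_counts(results, "department")  # computed over all matches
print(facets)
```

A real search platform computes these counts from its index structures rather than by scanning documents, which is what makes it practical at scale.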
That was an overview of some of the technology in search. Now, to take that technology and actually have it in a platform, an actual solution you can install, run and work with, there are some options out there. Search platforms have the technology I listed on the previous slide, but they need a bunch of other stuff that's necessary to actually use it effectively. A query language, because that's the basics. Then there's stuff that I wouldn't call a search feature, but practical features to get the job done: spatial, joins, grouping, administrative stuff. If you have a lot of data, you want a means to scale horizontally. Then crawlers, usually web crawlers for example: some apps need a crawler of some sort, especially in what's called the enterprise search market, where they really need that. In the open source space that is usually a separate, distinct open source solution that you pair with Solr or Elasticsearch, but I think some of the commercial vendors tend to bundle it.
Solr and Elasticsearch are the leading open source platforms. They're extremely popular, about equal in terms of popularity these days, and they're both based on Lucene. There are others, of course; the most prominent one I've heard of is Sphinx. On the commercial side there's a bunch of vendors, many of which have been bought by other companies, and there are cloud options as well. Notably, CloudSearch is based on Solr internally.
So what is the relationship between Lucene, Solr and Elasticsearch? The way I view it, with somewhat fuzzy lines: Lucene provides most of the raw search technology that I listed at the beginning, and Solr or Elasticsearch brings it to the next level to make a platform, a system, out of it. It does all the other things, but it also adds its own technology as well. Faceting, for example, is a case where it is actually implemented separately in Lucene, Solr and Elasticsearch, so it's not a perfect example of core technology versus the making of a platform. There's some blurring of lines there, but that's how I would express it.

The key thing is that Lucene is just a library. If you want to build search into something, starting with Lucene is very low level. It's just a toolkit; there is no configuration file, and you have to write code to get something done. Solr and Elasticsearch bring it to more of a server-level interaction, where you're interacting with a server process over HTTP.

So, spatial.
Most of this is through the lens of Solr. I have examples from Solr, because that's the one I'm most familiar with, but there are some Lucene references in here and some Elasticsearch as well.

There's a bunch of features you see here that are in the platform. With a search platform I tend to think it's always just text, but it really indexes words, numbers, dates, and spatial too. With regard to spatial in particular, by far most people who use spatial on this platform index points. The document might represent, say, a yellow-pages listing on a website, and the point is the address of where that business is. That's one example, and an extremely common one. Then at search time you want to be able to ask for all of the matching documents within a given radius of a given point, both provided at search time. That's probably most of what people need to do, and the platform often does more.

Of note here: there's a Cartesian mathematical model of the world as well as a geodesic one. By geodesic I mean spherical geometry, with haversine distance calculations; by Cartesian I mean a flat 2D plane. The platform does not do projections itself. You would have to project your data and give it to the platform in an x and y space, and it will work with that, but it doesn't apply the projection for you.
So, the very basics. I'll start with something called LatLonType in Solr; it's been around in Solr the longest. In your schema you basically define a field, and you define the field type that that field uses. This is copied right out of the schema that comes with Solr as an example, which in effect also serves as documentation.

Then, actually providing a document to Solr with a point: Solr supports XML and JSON and some other formats. If you want to use the JSON format, it's a one-liner, the smallest possible document I could imagine, with two fields: one for the ID and another for the spatial field. In this case it's a point, and it's lat comma lon. It's a very simple example.

Again, this is LatLonType. There are other spatial types in Solr. If your data is not latitude and longitude, say you've projected your data, then you'd use PointType, for example, but it would be very similar.
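As a rough illustration of the two-field JSON document described here, the following Python sketch builds one without sending it anywhere. The field name `store`, the ID and the coordinates are assumptions chosen for the example; the only thing taken from the talk is that the point value is written as latitude, a comma, then longitude.

```python
import json


def make_point_doc(doc_id, field, lat, lon):
    """Build a minimal Solr-style document with a "lat,lon" point field."""
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError("latitude/longitude out of range")
    # The point is serialized as "lat,lon", latitude first.
    return {"id": doc_id, field: f"{lat},{lon}"}


# Hypothetical document: an ID plus one spatial field.
doc = make_point_doc("biz-1", "store", 45.17614, -93.87341)
payload = json.dumps([doc])
print(payload)
```

The resulting string is what you would post to the server's update handler; how you send it depends on your client and is left out here.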
Now there's another field type that has a very long name; I just call it RPT, short for recursive prefix tree. Here I have an example of indexing a polygon. The only thing that varies from the previous example is the syntax: I adapted the previous slide to insert a WKT-formatted shape, at the very bottom. With the same field type you can index basically anything, not just points. Even circles, which are represented with a special extension to WKT called BUFFER: you buffer a point and you end up with a circle. I'll go into some details a bit later.
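A small Python sketch of the two shape strings mentioned here, a WKT polygon and a buffered point standing in for a circle. The helper names are made up for this example, and the coordinates are arbitrary. Note that WKT writes coordinates as x then y, so longitude comes before latitude, the reverse of the "lat,lon" point format.

```python
def polygon_wkt(ring):
    """Format (x, y) pairs as a WKT POLYGON, closing the ring if needed."""
    pts = list(ring)
    if len(pts) < 3:
        raise ValueError("a polygon needs at least three points")
    if pts[0] != pts[-1]:
        pts.append(pts[0])  # WKT rings must end where they start
    coords = ", ".join(f"{x} {y}" for x, y in pts)
    return f"POLYGON(({coords}))"


def circle_wkt(x, y, distance):
    """Format a buffered point, the WKT extension described for circles."""
    return f"BUFFER(POINT({x} {y}), {distance})"


shape = polygon_wkt([(-10, 30), (-40, 40), (-10, -20), (40, 20)])
circle = circle_wkt(-10, 30, 5)
print(shape)
print(circle)
```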
If you want to filter your search results not just by keywords but also spatially, there are a few ways to do it, depending on what you want. If you effectively want to find everything close to a given point, then internally I think of that as: your spatial query is a circle, and you want to match everything that it touches. Most people think of that as "find everything within X km of this point", which is effectively the same thing. For that you use geofilt, the standard way to do it in Solr: you basically provide the field, the point and the distance in kilometers. It's really straightforward.

If you want to search for everything within a provided rectangle, then use the range query syntax in Solr, where the left side is the lower-left point and the right side is the upper-right point. That's pretty straightforward as well. If you want a rectangle that crosses the dateline, then you have to use the RPT type; that does work there.

The last example: once you want to query by something other than a circle or rectangle, you need to use this third syntax. It's kind of funky, sorry; I want to work on making it better. The key elements: you see WKT used to express the shape, and you see Intersects before it. That's the spatial predicate. By default you get intersects with the previous two, but here you can plug in something else, like contains, within or disjoint. For some other fields, like the bounding box field I'm about to show you, you can do equals, and there are a few others as well.
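The three filter styles described here can be sketched as plain string builders in Python. The syntax follows the Solr 4 forms as I understand them (`{!geofilt}` with `sfield`, `pt` and `d`; a range query between two "lat,lon" corners; and a quoted predicate wrapped around a WKT shape). The field names `store` and `geo` and all coordinates are assumptions for illustration, so check the exact syntax against your Solr version.

```python
from urllib.parse import urlencode


def geofilt(field, lat, lon, km):
    """Point-radius filter: everything within `km` of (lat, lon)."""
    return f"{{!geofilt sfield={field} pt={lat},{lon} d={km}}}"


def rect_range(field, lower_left, upper_right):
    """Rectangle filter as a range query between two "lat,lon" corners."""
    (lat1, lon1), (lat2, lon2) = lower_left, upper_right
    return f"{field}:[{lat1},{lon1} TO {lat2},{lon2}]"


def shape_filter(field, predicate, wkt):
    """Arbitrary-shape filter: a spatial predicate applied to a WKT shape."""
    return f'{field}:"{predicate}({wkt})"'


circle_fq = geofilt("store", 45.15, -93.85, 5)
rect_fq = rect_range("store", (45.0, -94.0), (46.0, -93.0))
poly_fq = shape_filter("geo", "Intersects", "POLYGON((-10 30, -40 40, -10 -20, -10 30))")

# A keyword query plus a spatial filter, URL-encoded as request parameters.
params = urlencode({"q": "coffee", "fq": circle_fq})
print(circle_fq, rect_fq, poly_fq, params, sep="\n")
```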
The main thing you might want to do with a search platform, aside from filtering results by some spatial predicate, is ordering the results by distance, and/or influencing the relevancy score you have from the keyword search by taking that score and mathematically combining it with distance in some manner. Here the upper example is a straightforward sort by distance: you just put a function in there, geodist, which is the geospatial distance. This assumes a geodesic LatLonType point; if you want Cartesian, there's another function you would use.

For the relevancy boost, there are many ways you might want to combine things, and it also depends a lot on the field involved. I could have expressed the lower half of the slide using geodist as well, but this particular one shows reciprocal distance, which is a little more convenient to use because it varies between 1 and 0. That's a more convenient number to apply than something that goes from zero to infinity, or as far as around the world.

Next, BBoxField: that's for indexing rectangles. You certainly could use the RPT field I showed earlier. For the syntax describing the rectangle you can use either ENVELOPE, which you can see in the sample here, or a polygon that has five coordinates, so it's closed, which is the more standard form. But as you're about to see, this particular field has some more options; it's especially configurable. It's brand new in Solr 4.10, and this field uniquely supports relevancy based on the area of overlap between the indexed rectangle and a query rectangle. This was actually ported from, I think, Geoportal, an open source project. Basically the idea is that you have a bunch of documents, and a document is something on the globe somewhere with a spatial extent, and your search screen, your map, is looking at some spatial extent. Some things are really small, some things are really big, maybe an entire state, and you might want to relevancy-rank the results by how close the indexed rectangle is to your search map window rectangle. So there's a kind of custom area-overlap formula involved here; that's what's special about the bbox field. It also supports more predicates than RPT, and it's more precise too, but the main purpose is the overlap.
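To show why a reciprocal of distance is convenient for boosting, here is a hedged Python sketch. `haversine_km` is the standard great-circle formula on a sphere with an approximate mean Earth radius. `recip` mirrors the shape of Solr's reciprocal function, a / (m * x + b), as I understand it. The constants passed to it below are arbitrary illustrative choices, picked so that the value is exactly 1 at distance zero and falls toward 0 as distance grows.

```python
import math

EARTH_RADIUS_KM = 6371.0  # approximate mean radius; an assumption for this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def recip(x, m, a, b):
    """Reciprocal function a / (m*x + b); with a == b it starts at 1 and decays to 0."""
    return a / (m * x + b)


def distance_boost(doc_point, query_point, scale_km=10.0):
    """Map a distance in km to a boost in (0, 1]; scale_km sets how fast it decays."""
    d = haversine_km(*doc_point, *query_point)
    return recip(d, 1.0, scale_km, scale_km)


near = distance_boost((45.15, -93.85), (45.15, -93.85))
far = distance_boost((45.15, -93.85), (40.71, -74.00))
print(near, far)
```

How the boost is combined with the keyword score (multiplied, added, weighted) is a separate design choice that depends on the application.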
Now one big disclaimer, if you will, about indexing polygons. The RPT field, recursive prefix tree, is called that partly because it uses a prefix tree, a trie, of grid squares, and grid squares only get so accurate. For points this is completely a non-issue, because you can very scalably index points with grid squares to very fine precision; that's not an issue per se. But when you come to shapes that are beyond a point, typically polygons, though this also applies to lines, you can only scalably use so many grid squares to represent anything before the performance starts getting really terrible or your index requirements jump exponentially. So this is really a trade-off with accuracy.

With the RPT field, if you're indexing polygons with Solr: that's a segment of Massachusetts, and it takes a ton of grid squares to index Massachusetts, for example. You can see the jaggy edges right there.

Recently, at the Lucene layer of the architecture, there's a new serialized geometry option that is designed to be used in conjunction with RPT, such that you can use RPT as a fast filter that is not necessarily accurate, and then combine it with the serialized geometry to check the candidates and get actually accurate results. They're designed to work together. It's not yet exposed at the Solr or Elasticsearch layer.
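The prefix-tree idea can be illustrated with geohash, one of the grid schemes mentioned later in the talk. The Python sketch below is the standard geohash encoding, not Lucene's implementation: each extra character subdivides the current cell into 32 smaller ones, so a longer string is a smaller grid square and every shorter prefix names an enclosing square. The sample coordinates are arbitrary.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash(lat, lon, length):
    """Encode a point as a geohash string of the given length."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate between longitude and latitude, longitude first
    while len(chars) < length:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


coarse = geohash(57.64911, 10.40744, 3)
fine = geohash(57.64911, 10.40744, 5)
print(coarse, fine)
```

A point needs only one cell per level, which is why points index cheaply. A polygon has to be covered by many cells, and the count grows quickly as the cells get smaller, which is the trade-off described above.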
Heat maps. This is more of a to-do than an implemented feature, but I'm impressed that Elasticsearch has something called geohash grid aggregations. If you want a heat map, this is one piece of the puzzle; it makes that puzzle easier. You can do the same thing with Solr, though there's still work to do. This slide is less of a how-to and more of a pointer: if you want heat maps and wonder what to use, well, whether you use Solr or Elasticsearch, these are what you need to look into. The problem is that if you're matching millions of points, or some ridiculously large number, it's not really feasible or scalable to say "I want all of them" and put them on the map. You need to summarize them in some way, so that's what this is all about.
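The summarizing step can be sketched in a few lines of Python. This is a deliberately simple stand-in that buckets points into fixed-size latitude/longitude cells; it is not the geohash grid aggregation itself, and the cell size and sample points are made up. The idea is the same, though: return one count per grid cell instead of every matching point.

```python
import math
from collections import Counter


def cell_of(lat, lon, cell_deg):
    """Return the (row, column) index of the grid cell containing a point."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def heatmap_counts(points, cell_deg):
    """Count matching points per grid cell."""
    return Counter(cell_of(lat, lon, cell_deg) for lat, lon in points)


# Hypothetical matches: two near Boston, one near New York.
matches = [(42.36, -71.06), (42.37, -71.05), (40.71, -74.00)]
counts = heatmap_counts(matches, 1.0)
print(counts)
```

A map client would then shade each cell by its count, and could ask for smaller cells as the user zooms in.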
Behind the scenes. There's one open source project distinct from Lucene spatial and Solr, and that's Spatial4j. It's something that I created with some other developers out of Lucene spatial, to decouple the spatial-centric concerns that don't have anything to do with Lucene, Solr and search. So Spatial4j is an open source project that's mostly about shapes, certain shapes. There are already some products out there, like JTS for example. They're pretty awesome, but there's a certain mismatch of requirements. In particular, JTS doesn't have a circle shape: everything is points, lines or polygons. But I need a bona fide circle, not a polygon representation of a circle but an actual circle. I also needed a geodesic version, and that's kind of the main focus of this library lately, to better differentiate it from other options: it's more geodesic-oriented. By geodesic I mean representing the shape on the surface of a sphere, and that's hard. Spatial4j is joining LocationTech; it's currently in incubation.

So, as described, Spatial4j is a very small and focused library, just concerned with shapes and distances. And a key operation, to jump back to the previous slide for just a moment: a foundational thing is that these shapes are not just objects with points and coordinates; they do stuff. In particular, they compute the relationship with a rectangle. That's key because the grid schemes are based on rectangles, so the index is constantly asking the shape: do you contain this rectangle, is it within you, is it disjoint from you, does it intersect? Computing these things fast matters a lot.
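The "relate to a rectangle" operation can be sketched for the simplest case, a circle on a flat Cartesian plane. This Python example is not Spatial4j code and ignores the geodesic case the library focuses on; the names `Relation` and `circle_relate_rect` are invented for the example. It answers the same four questions a grid-based index keeps asking about each grid square.

```python
import math
from enum import Enum


class Relation(Enum):
    DISJOINT = "disjoint"      # circle and rectangle share no point
    INTERSECTS = "intersects"  # they overlap only partially
    CONTAINS = "contains"      # the circle contains the whole rectangle
    WITHIN = "within"          # the circle lies entirely inside the rectangle


def circle_relate_rect(cx, cy, r, min_x, min_y, max_x, max_y):
    """Relate a Cartesian circle to an axis-aligned rectangle."""
    # Nearest point of the rectangle to the circle's center.
    nx = min(max(cx, min_x), max_x)
    ny = min(max(cy, min_y), max_y)
    if math.hypot(cx - nx, cy - ny) > r:
        return Relation.DISJOINT
    # Farthest corner of the rectangle from the center.
    fx = max(abs(cx - min_x), abs(cx - max_x))
    fy = max(abs(cy - min_y), abs(cy - max_y))
    if math.hypot(fx, fy) <= r:
        return Relation.CONTAINS
    # The circle's own bounding box fits inside the rectangle.
    if cx - r >= min_x and cx + r <= max_x and cy - r >= min_y and cy + r <= max_y:
        return Relation.WITHIN
    return Relation.INTERSECTS


print(circle_relate_rect(0, 0, 2, 1, -1, 5, 1))
```

In a prefix-tree index, a square the shape contains can be accepted whole, a disjoint square can be skipped, and only partially overlapping squares need to be subdivided further.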
So, Lucene spatial. It's a module of Lucene; Lucene has many modules and spatial is one of them. There are multiple of what are called spatial strategies, because the key concept here is that there's not one way to go about indexing your data spatially in Lucene. Depending on your requirements, there are different ways to do it. The bounding box case is a great example of that: if you want relevancy based on area overlap, then the bounding box strategy is perfect for that. It indexes the four coordinates plus a dateline-cross Boolean, and it can materialize that bounding box really fast. If you want to do distance sorting, then there's point vector, which is like the LatLonType on the Solr side. So there are multiple strategies involved, and some of them use Spatial4j more or less.

Solr, ideally, is just an adapter to that capability layer, but for historical reasons, and sometimes other reasons, there's some stuff that exists at the Solr level: LatLonType predated the current Lucene spatial module. Elasticsearch has got some neat stuff there as well. It's got its own way of computing polygon intersection. And, this is really nice, they've adopted the GeoJSON syntax, I think, if I'm not mistaken: because Elasticsearch accepts JSON for requests and responses, they've adapted that to take a polygon directly, as opposed to Solr, where I just use the WKT string.
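To give a feel for relevancy based on area overlap, here is a simplified Python sketch. It is an assumption-laden stand-in, not the formula Lucene's bounding box strategy actually uses: it works on flat Cartesian rectangles, ignores dateline crossing, and simply multiplies the fraction of the query window covered by the fraction of the indexed rectangle covered. Rectangles are `(min_x, min_y, max_x, max_y)` tuples.

```python
def area(rect):
    """Area of an axis-aligned rectangle given as (min_x, min_y, max_x, max_y)."""
    min_x, min_y, max_x, max_y = rect
    return max(0.0, max_x - min_x) * max(0.0, max_y - min_y)


def intersection(a, b):
    """Overlapping rectangle of a and b (possibly with zero area)."""
    return (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))


def overlap_score(query, indexed):
    """Score in [0, 1]: 1 when the rectangles coincide, 0 when they do not overlap."""
    inter = area(intersection(query, indexed))
    if inter == 0.0:
        return 0.0
    query_ratio = inter / area(query)      # how much of the map window is covered
    indexed_ratio = inter / area(indexed)  # how much of the document's extent is visible
    return query_ratio * indexed_ratio


window = (0.0, 0.0, 10.0, 10.0)  # hypothetical map window
print(overlap_score(window, (0.0, 0.0, 10.0, 10.0)))
print(overlap_score(window, (0.0, 0.0, 5.0, 5.0)))
print(overlap_score(window, (20.0, 20.0, 30.0, 30.0)))
```

With a score like this, a small document extent sitting inside a large map window, or a state-sized extent seen through a tiny window, both rank below an extent that roughly matches the window.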
I'll move on quickly. This is mostly where I cut my teeth; I really like the grid squares and stuff. That's a good segue, actually, because I've been working with a variety of interns over the last year, and I want to recognize them. The first [name unclear] is working on a new flexible prefix tree. It's more optimized; I've seen between a 10 percent and [inaudible] percent performance increase, depending on various factors. I've also worked with a variety of interns on Spatial4j, on geodesic stuff like polygons and buffered lines. In Spatial4j a buffered line is a first-class concept, as opposed to taking a line and buffering it into a polygon. I was also really impressed by a talk we had today [topic unclear]; I think there's some stuff I potentially want to borrow from it based on what I've seen, because it's geodesic at its core, which is what Spatial4j is about these days.

Future plans: a variety of things. Some things are currently in progress, so there's progress being made for sure. I've been excited lately about the performance of the flexible prefix tree; it is mostly a performance-based feature. Up to this point the only grids you could choose internally were quad and geohash, but they weren't really ideal. Ideally you would have a flexible grid that isn't fixed at either 4 or 32 subcells; ideally it would vary depending on the level of detail, for various reasons.

That's the end of my talk, and I'm happy to take questions.
Audience question: Do both Solr and Elasticsearch support both geodesic and Cartesian coordinate systems?

Answer: Yes, they both usually support both. Cartesian is easy to support because it's simple. But when it comes to geodesics, there are usually tricks being done that may or may not work for your use case. For example, polygons: no, I have not found a geodesic polygon library, a bona fide one, where the edge between two points is truly an arc on the surface of the globe, so that it becomes a curve in flat space. I don't know of a library that will compute the intersection of a geodesic polygon edge with a grid square. So for some of those, what's on the map is not what is actually computed. But if your shape is a circle, a point-radius, then that's pure geodesic, no lie. So there are some nitty-gritty details, but they both have it.

Audience question: Thanks for your talk. There was a big push for the Solr 4.10 release. Could you say at a higher level what's in there, what's new?

Answer: The biggest thing by far is distributed pivot faceting. For a long time it was the top-voted issue in Solr's JIRA. Pivot faceting has been in Solr since 4.0, but it has not supported horizontal scalability; you were basically fixed at a single shard. It took forever to get done because it was really challenging, and that's the big thing. Then there's the bounding box field that you saw here; I got that in. And something called the terms query: if you query Solr on a field where you want to search for all documents where the category is either this or this or this, and that becomes a long list, like hundreds, you should use the terms query parser. Actually that was one I added, and it's much more scalable than the Boolean query that people would otherwise normally use. Those are a couple of the things in 4.10.

Audience question: We're currently combining searches: we go to Solr for the document pieces and to [PostGIS, unclear] for the polygons, run the same query against both, and intersect the sets that come back. One of the problems we have to deal with is that many of the Solr documents correspond to the same polygon. If you have, say, 100 Solr documents that all have the same polygon, does that blow up the disk space used, or do you have some optimizations for that?

Answer: That's not a problem whatsoever. Assuming you store the geometry in there, it is stored 100 times; it's not shared. But the index has no problem with that: it'll just use the very same terms, and it'll just add the document ID onto what Lucene calls the postings list. So that's not a problem at all.

One thing I hadn't quite highlighted enough is this tension between search platforms and spatial requirements, and text search requirements on databases. Most mature databases have text search modules, but, with all due respect to the database vendors, that's like a checkbox on the feature list of things that they have to do. As opposed to spatial, where I'll admit Solr doesn't hold a candle to what PostGIS does; PostGIS is amazing. So if your problem is predominantly a spatial problem and you've got lots of complex spatial requirements, go use PostGIS. On the other hand, if you have very simple spatial requirements, like find all matching documents or records within X kilometers of something, PostGIS is way overkill. I've heard of people who used MySQL spatial, and the moment they brought all the data into Solr they saw an amazing performance increase and met their requirements. So it's important to think about these things. You can try to use the best of breed for everything, the best for search, the best for spatial, but you can't always pick all of them, because then you have to somehow join them and decide where to put your data. So pick based on your requirements and figure out which is the best. If you're in a situation where you want to use PostGIS because you need the accuracy, you can probably get that now, especially with the serialized geometry strategy I mentioned; there are ways of getting accuracy. A couple of clients I've worked with had to do a secondary search, which makes the search performance just terrible.

Audience question: I saw one of the talks yesterday comparing spatial in MongoDB and PostGIS [and Lucene, unclear], and one of the things they said about Lucene spatial is that you don't want to use it if you're doing a lot of updates to your documents. Is that true?

Answer: That is true, and it's not just spatial, it's in general. It's valuable to know exactly what these platforms are good for; they're not good for everything. If you have a very high update volume, a Lucene-based platform, be it Solr or Elasticsearch, won't do so well, because internally it can't actually update a document: it has to add it again and effectively mark the other one as deleted, which eventually needs to get merged away. That's just more work compared to some systems that are better optimized for that. It's been very slow going, but there have been some signs of that being improved. Right now it's limited to numeric fields [inaudible], a very narrow use case. Fields are being made updatable directly, but there's still stuff to be done.

Audience comment: On the plus side, it's all files, so as opposed to databases, which are very complicated, you can use rsync to scale it out [inaudible]. It's not complicated to do.

Answer: The key reason that is true is that Lucene is write-once, append-only. Its internal indices are append-only files: it doesn't update the middle of any file, it only appends. So it lends itself to being put on HDFS, for example. And it's rsync-able, because once it writes a segment, it's done; it doesn't have to go back. The original horizontal scaling for Solr, replication, was based on rsync scripts, because it just worked well.

Audience: That's cool, thanks.
Versionsverwaltung
Parser
Abstraktionsebene
Polygon
Rechenbuch
Raum-Zeit
Computeranimation
Demoszene <Programmierung>
Wechselsprung
Kugel
Gewicht <Mathematik>
Flächentheorie
Programmbibliothek
Punkt
Abstand
Gerade
Kugel
Nichtlinearer Operator
Shape <Informatik>
Kreisfläche
Matching <Graphentheorie>
Open Source
Geodätische Linie
Mathematisierung
Kartesisches Produkt
Biprodukt
Fokalpunkt
Gerade
Konfiguration <Informatik>
Rechenschieber
Abstand
Suite <Programmpaket>
Polygon
Projektive Ebene
URL
Schlüsselverwaltung
Shape <Informatik>
Einfügungsdämpfung
Punkt
Mengentheoretische Topologie
Quader
Kreisfläche
Vektorraummodell
Rechteck
Parser
Abstraktionsebene
Unternehmensmodell
Polygon
Rechenbuch
Vektorraummodell
Computeranimation
Demoszene <Programmierung>
Gewicht <Mathematik>
Datentyp
Endogene Variable
Punkt
Abstand
Softwaretest
Kugel
Shape <Informatik>
Mathematisierung
Nummerung
Quellcode
Kartesisches Produkt
Modul
Gerade
Quick-Sort
Objekt <Kategorie>
Abstand
Suite <Programmpaket>
Polygon
Flächeninhalt
Automatische Indexierung
Strategisches Spiel
Shape <Informatik>
Zeichenkette
Offene Menge
Automatische Indexierung
Inverses Problem
Kreisfläche
Geodätische Linie
Güte der Anpassung
Aeroelastizität
Geodätische Linie
Automatische Handlungsplanung
Benchmark
Polygon
Term
Gerade
Raum-Zeit
Computeranimation
Netzwerktopologie
Quadratzahl
Polygon
Datenstruktur
Standardabweichung
Speicherabzug
Energieerhaltung
Gerade
Varietät <Mathematik>
Retrievalsprache
Punkt
Momentenproblem
Versionsverwaltung
t-Test
Parser
Komplex <Algebra>
Gerichteter Graph
Raum-Zeit
Computeranimation
Kreisbogen
Übergang
Gradient
Eins
Netzwerktopologie
Arithmetischer Ausdruck
Skalierbarkeit
Vorzeichen <Mathematik>
Eigenwert
Datenreplikation
Visualisierung
Kurvenanpassung
Schnitt <Graphentheorie>
Figurierte Zahl
Zentrische Streckung
App <Programm>
Shape <Informatik>
Kategorie <Mathematik>
Datenhaltung
Güte der Anpassung
Geodätische Linie
Abfrage
Partielle Differentiation
Ereignishorizont
Datenfeld
Menge
Automatische Indexierung
Rechter Winkel
Strategisches Spiel
Varietät <Mathematik>
Orientierung <Mathematik>
Quader
Wasserdampftafel
Klasse <Mathematik>
Zahlenbereich
Fluss <Mathematik>
Term
Räumliche Anordnung
Polygon
Systemplattform
Demoszene <Programmierung>
Bildschirmmaske
Datensatz
Arithmetische Folge
Flächentheorie
Mini-Disc
Programmbibliothek
Polstelle
Spezifisches Volumen
Elastische Deformation
Videospiel
Radius
Kreisfläche
Transformation <Mathematik>
Axonometrie
Mailing-Liste
Physikalisches System
Elektronische Publikation
Modul
Quick-Sort
EINKAUF <Programm>
Mapping <Computergraphik>
Druckertreiber
Quadratzahl
Polygon
Mereologie
Räumliche Anordnung
Serielle Schnittstelle
Wort <Informatik>
Boolesche Algebra

Metadata

Formal Metadata

Title Spatial in Lucene and Solr
Series Title FOSS4G 2014 Portland
Author Smiley, David
License CC Attribution 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
DOI 10.5446/31603
Publisher FOSS4G, Open Source Geospatial Foundation (OSGeo)
Release Date 2014
Language English
Producer FOSS4G
Open Source Geospatial Foundation (OSGeo)
Production Year 2014
Production Place Portland, Oregon, United States of America

Content Metadata

Subject Area Computer Science
Abstract Apache Lucene is a Java toolkit that provides a rich set of search capabilities such as keyword search, query suggesters, relevancy, and faceting. It also includes a spatial module for searching and sorting with geometric data using either a flat-plane model or a spherical model. The capabilities therein are leveraged to varying degrees by Apache Solr and ElasticSearch, the two leading search servers based on Lucene.
In this talk I'm going to start by briefly covering some core features of this search platform so that the audience appreciates the unique role it plays in the crowded world of information retrieval. I will then show examples of using some spatial features in Apache Solr such as:
- indexing points, polygons, and other shapes into a Lucene document
- filtering search results by a query shape, including the use of different search predicates
- sorting by distance between indexed points and a query point
Next I will review some spatial features in Lucene spatial and ElasticSearch such as:
- sorting bounding boxes by overlap percentage with a query box
- aggregating geohash grid counts for heatmaps
The talk will also note the internal architecture and dependencies of Lucene spatial, and discuss a key dependent library called Spatial4j. At the end of the talk I will note some limitations to be aware of, as well as planned improvements. Finally, key advances in geodesic (spherical geometry) information retrieval in Spatial4j will be highlighted.
Keywords search
information-retrieval
NoSQL
spatial
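
Two of the features named in the abstract, sorting by distance to a query point and aggregating geohash grid counts for heatmaps, can be illustrated without any search server. The following is a minimal plain-Java sketch of the underlying ideas only; it does not use the Lucene, Solr, ElasticSearch, or Spatial4j APIs, and every name in it (`SpatialSketch`, `haversineKm`, `sortByDistance`, `geohash`, `gridCounts`) is hypothetical. It assumes a spherical Earth with a mean radius of about 6371 km and the standard geohash base-32 alphabet.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: hypothetical helper class, not part of Lucene or Spatial4j.
class SpatialSketch {
    private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";
    private static final double EARTH_RADIUS_KM = 6371.0088; // assumed mean radius

    /** Great-circle distance (haversine formula); inputs are degrees. */
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.pow(Math.sin(dLat / 2), 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.pow(Math.sin(dLon / 2), 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    /** Returns a copy of the {lat, lon} points ordered nearest-first from the query point. */
    static List<double[]> sortByDistance(List<double[]> points, double qLat, double qLon) {
        List<double[]> sorted = new ArrayList<>(points);
        sorted.sort(Comparator.comparingDouble(
                (double[] p) -> haversineKm(qLat, qLon, p[0], p[1])));
        return sorted;
    }

    /** Encodes a point as a geohash string with the given number of characters. */
    static String geohash(double lat, double lon, int precision) {
        double latLo = -90, latHi = 90, lonLo = -180, lonHi = 180;
        StringBuilder sb = new StringBuilder();
        boolean useLon = true; // bits alternate, starting with longitude
        int bits = 0, value = 0;
        while (sb.length() < precision) {
            if (useLon) {
                double mid = (lonLo + lonHi) / 2;
                if (lon >= mid) { value = (value << 1) | 1; lonLo = mid; }
                else { value <<= 1; lonHi = mid; }
            } else {
                double mid = (latLo + latHi) / 2;
                if (lat >= mid) { value = (value << 1) | 1; latLo = mid; }
                else { value <<= 1; latHi = mid; }
            }
            useLon = !useLon;
            if (++bits == 5) { // every 5 bits become one base-32 character
                sb.append(BASE32.charAt(value));
                bits = 0;
                value = 0;
            }
        }
        return sb.toString();
    }

    /** Counts points per geohash cell; the counts could drive a heatmap. */
    static Map<String, Integer> gridCounts(List<double[]> points, int precision) {
        Map<String, Integer> counts = new TreeMap<>();
        for (double[] p : points) {
            counts.merge(geohash(p[0], p[1], precision), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<double[]> cities = Arrays.asList(
                new double[] {37.77, -122.42},  // San Francisco
                new double[] {47.61, -122.33}); // Seattle
        // Query point: roughly Portland, Oregon.
        for (double[] p : sortByDistance(cities, 45.52, -122.68)) {
            System.out.println(Arrays.toString(p));
        }
        System.out.println(gridCounts(cities, 2));
    }
}
```

A real search server would not compute distances for every document or re-encode points at query time as this sketch does; the sketch only shows what "sort by distance" and "grid counts" mean at the data level.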
