Add to Watchlist

Magical PostGIS in three brief movements

34 views

Citation of segment
Embed Code
Purchasing a DVD Cite video
Series
Annotations
Transcript
but I services matter approaches and
3 brief movements I like to start by acknowledging the books Carribean employability work on posters and like a lot of Socrates serves companies currently has a strong open source at that stronger than most actually because their system is built on top of open source components the DB encourage is actually PostgreSQL PostGIS so much what I'm actually talking about today can be run on the parity the cloud and some of my examples actually that they can also be run on a local post posters instances well In fact I built a lot of these examples originally on top of the balance that so it's really it's fairly agnostic as far as what you were put on top the database and the rest posters what it's all about so this is supposed to be magical posters and perhaps the the it's magical closest as an apple last name is kind of thing in here at the but the apple user Conference it's it's magical stuff the and and maybe that's true you to the room but in retrospect I could have called show and tell since I seem to have a lot of material about my favorite toys or maybe stupid extension tracks would been more honest since I got some great examples of phrase extensions but regardless of the playbill it is medical colleges the In 3 brief movements and like in this time because I feel like people are appreciating the kind of the yes but it was
in I guess really video and recording electrons and 0 here I want to give the soccer feel like folks art appreciating the kind of deep and beautiful magic that they can create using a little more than the standard back and database too often people have this utilitarian view of the database they there don't really like the database that much to them it's just a bit bucket was hold about tables they stuffed data in the dragged it out some people hate their database so much they hide away behind an object relational mapper mapper there and or so they can pretend that it but it is there in the background doing the hard work so they can pretend they're all alone they're beautiful little middleware language and a really missing out because once you get to know it more intimately you come to realize that your database is a beautiful beautiful thing it's not just a bit by bit it's a magical to lots of all kinds of good stuff inside this talk is actually not so much about posters it's about the kinds of things you can do with posters when you combine it with the magic that's already inside the PostgreSQL database post rescue so magical because it was designed from the start to be more than a bit bucket Michael Stonebraker had already spent a decade pushing around that's the interest project when he dreamed up his next-generation database 1986 and he wrote a paper called the design of post-stressed which laid out his goals for the new and at that point unwritten database and those girls which form the foundation of those stresses awesomeness and posters itself in particular support for complex objects geometry geography are complex objects and saw rostas and user extensibility is what allows users to insist all allows anyone to add types and functions the database at runtime and most the fun stuff posters postural takes advantage of extension points the active features are pretty common database feature now other database they can be managing data flow and relational models what Kaiser in together that's what makes the system as powerful as it every piece of information is couple and topples collection tables suppose press lived as stonebreakers academic project for almost a decade but useful enough that by the time he
moved on to other projects and the topics it already had a user base who kept alive if it out with the new sequel standard and eventually grew into the PostgreSQL all development the we have today so for the 1st movement or 3 movements a lot of what's possible when you start making use of PostgreSQL roles native full-text search support because if there is a phrase that makes you want to put my head in the oven it is where using posters Elastic search there also I and I acknowledge Lucene lattices denies tools but know I I sure hope you need every scrap of functionality they offer because once you put which of 2 different data storage and persistency strapped together everything in your system gets more complex and unclear assuming hopefully that you're relational databases source of truth all the changes have been replicated over to your last it's a system which adds a synchronization step of the work that the data changes fast enough that meant to be quite complex but that's actually easy problem the hard problem is that once you have to query and points any query that involves both a text search and a spatial search of sufficient complexity to require posters requires that the middleware starts to coordinate the query so 1st to talk to 1 systems and says and give me all the records on this text search query and then it has to take all that information walk over the other systems are and give me all records you have better in this set and that MySpace plots and depending on the query the order you want to do that in text 1st where spatial 1st very so basically you have to build a little quote planner in you know where was a terrible idea because PostgreSQL will only has acquiring letter it already has a full text search system built into it post press to search has all the basic capabilities you want in a full-text search engine as stemming Fox and Fox running and run that has waited searches so you can get more press results matching the the title at ability to create your own dictionaries say can handle both different languages and a specific profession domains jargon and so on it's got the ability to rank results based the quality of search it's got highlighting of matched terms the in but what is the source of the magic posters well if you're full text that engine and a spatial engine or the same database you can run compound spatial and text queries and you don't have to think about the execution path for efficiency the database engine just does it for you automatically so here's a fine example application it's built using geographic names in this case the geonames . org because geographic names of basically words they're just there is really short documents that complications but any document type with location can be used to build a cool test spatial location application so little data mangling you can turn the geographic location names filed a table looks like this of primary key the name and location In order to get the a full text search enabled you have to add the inspector column so are column type a full-text searching is uh is that the expected that we populate it with the expected data using the English configuration will talk a little bit more about that later and finally index using the full text index for test vectors i energy in the index at Stanford generalize inverted index is also used in most press support for array types so are all set up and the semantic parameter here the word English so we have specified English configurations of English grammar rules are used to determine things that like that that open Oaks an agent agent basically the same thing to identify all the articles and pronouns can be ignored but to reduce the phrase invisible vector like structure suitable for indexing so to TS vector gives us a column of the inspectors requiring but how do to do that the the quietest battery needed to square which is itself a logical filter you can construct 1 of the combination of and more clauses optionally with weighting and partial matching is a simple 1 was a query that would match entries with both both entry or only in rich and we can use a query in a full-text search of our 2 million geographic names Table to remember geographically still using the the AC operator find all the tears vectors that match the tears query it turns out there's only 3 out but the really interesting thing is how quickly find the answer is to 70 ms such a good search of 2 . 2 million records and the best part is now the full-text search is handled inside the database it's possible to build efficient compound spatial text queries to select a scoring which combined the search for all the records with all entry with the spatial filter restricting the result nearest kilometers and because both clauses or hand of a database all database machine meters bozo figure out the most efficient way to access the roads so this is the explain allies output for last query and we even the bottom up you can see that in this case a database ran full-text search 1st because it was the most selective title return 59 rose as opposed to the all things in 100 kilometers filter which would return so that and then apply the spatial filter which removes 57 and 59 giving just the 2 we got results so I showed you sequel the and maybe Daghestan wonder where the magic as so if you have nothing sequels magic but usually the magic company behind the power sequel into a user interface and make that power visually manifest so take all those place names and subset them quickly using text search and then those past the result into heat map so here's a map for a unique regionalism in the Pacific Northwest of North America and in Cascadia we call mountain lions cougars so all placing that Cuba and turn the mostly the last with all kinds of ideas about how we do things and thus how we perceive ourselves so those of sees some cachet Northern I Southern this makes sense the the following it is placed in the US the because I'm using English the kinds of stuff to particular places Eastern appears makes sense and in Western it some catch being western I think everything is West of Europe and maybe that's why all that random there so that was magical of production practical enough and by the way if you go to my blog stage that appear in the last page you can see this example liven you can type in your own the neuron names and see what's going on I 2 more so that was natural pressed great practical so as to this 1 suppose you were a county with some standard partial interested and you want to set up a simple partial finder provokes define their home how would you do it I 1 have a Google style interface just 1 input field the magical or complete so you can use city you have GIS data so your GIS data probably has a street name address number of inner-city for every site address so will make use of that the but how that our own street address greater the during the
development the so we we trick to do it may go on and on and on a complete war it is you have to be a look up not just the words that uses oriented but the words that they're in the middle of typing and fortunately post breast text search can do that too so in this 2 T query function I'm not just looking further the 256 the I'm also looking for the words that start with in AI so instead of looking for 256 Main Street and I haven't quite finished yet so that's what I could stop impose grass textures calls this prefix match and with prefix matching and a simple Javascript you know Jake you are in this case this example and another of the 4 you never really fast complete search text up and running in a few minutes and it's uncannily accurate doesn't care a word word order but you wanna get fancy in addition the used and 1 row for each tree addressing also had rose to table so we can add rows for street intersections like main 2nd so the last example here is interesting because the search field we've got 349 these Main stood another map out of the rule based this based on that we get the east all spelled out names to for the mismatch here so what happens if we go back to a form and try to search for the names as they appear on the based the 349 East Main St using the fully spell the word East I now back no city to search for Main Street but going straight out in full or searching for the dresses South 2nd Street that appears on the map but no success so what's going on here but what's happening is we broke street names down into words each token got the word and we say the words the full text and and the words are like English words and they have their own grammar incentives street names look English but not so the search is failing can lead exit from why instead of treating the words in the indexes parts of language which is the most parts of addresses so that the system would know that if you wrote stood immense Street and if you're a man you met men off then the searches using abbreviations work and searches against data that was abbreviated work and post prestige search allows dictionaries for a set the words you wanted more words will replace words so I create a custom dictionary for street addresses articles just well addressing dictionary the and for my base example I used a simple a standard simple dictionaries that which was doing any special crossing the words but this is actually better than the English dictionary which will drop things that aren't English words like but still not that good since search would have use exactly the same relation style as the data don't work come of it so in this example 128 is considered a word and he is considered were and still is considered a word but we passed the same thing using the addressing dictionary the cost of address dictionary comes into play in abbreviations expanded out he becomes East stood become street so by altering the search application to use the addressing dictionary instead we get much better behavior East Main Street works and South 2nd Street words things even work when the users makes up the correct address in order but put the directions last or what the House Numbers last the so that's because 1 I couldn't demonstrate encouraging because the King and uh the system-level stuff like that dictionaries In the current you have to take the extensions that are already in already in place of but my custom crescent induces a printer so rather than go about your else here is that the modern version of the AOL Keywords for this section this type in PostgreSQL full text 9 . 4 4 latest information but full-text search of particles presence language 0 5 if you type in the Ramsey get have you can find addressing dictionaries repository 1 that database so the 1st so possibly while they're Christoph literature music and the 2nd 2nd movement Federated systems so 1st I upwards Federation pushing data up from my local database into a cloud storage system and in deference my employer and because it's so easy to sing to a system was no impedance mismatch problem from posters the posters is pretty easy I always showing how I federated a local posters art departed be accomplished so further got some open data from the city Victoria I my own city is the shape of public art in the city and I loaded into my local posters using shape biggest you all and you didn't you just so areas was what and then I load the same data into party there it is FIL more comprehensible to base that I and I can use the currently visualization tools to make a pretty In this case the crateral style but how I connect the 2 systems that get changes in the local posters to propagate to the cloud the well Carter B is a web service so we need a web transport to push the changes over and it happened that that happened there wrote 1 of those on the ATP extension oppose press but nothing spatial about this extension at allowed you to make http called using post rescue all functions so you can run and AGB get function and get back the results through web services are not content but also the MIME type status codes letters and so on and not just gets you can post inputs and leads us to interact with any AGB web service you want and here's the here's the mixes were are not currently has a Web service called the sequel API and by the name you can measure what is this equally guys actually diabolically simple you called at the end point you tell what format what he returned to be adjacent Jason if you're altering the data or the data is privately provided the Ikeda group you are and then you just provide a sequel you want executed it is so diabolical actually describe a couple years before it was invented as the architecture of evil I since with an unprotected sequel passing so much evil the outside world work in a database of course the current guys protected against sequel injection and user isolated on databases and everything is only run at the Commission's level of the user the flying in which is basically a very low level read read access unless you have an API key so it's not exactly the same thing ascribed 2009 but it's incredible light to pass the sequel through the database and unfiltered the simplicity of the approach allows for incredible flexibility in building absence there's no new data for the http interface level to reinvent things a proxy for sickle I give you the OGC WFS you might look to see that it's an altered this looks like a really unpleasant sequel and that's all inferences the budget and supporters filters and so on and why why reinvent that anything is directly so for this example Federation used uses an editor and I directly added a local posters database each database update in turn triggered native peoples call was called the ATP extension passed an update to the current be sequel
API and this in turn supplied authority being database which made visible to me in from looking at the current the rent so diabolical pure evil so here's a local database triggers updates Scotty be that's only tied up in advance but if we're doing a full Fred implementation we can make it in certain delete triggers well the to write to kind of you we need authenticates we provide API key the singled out in this table is simple since we're only updating location field a little more complete version might the all the fields we have to your code the sequel the passage of each of the form and then rather than HTTP POST to get a party to inject the return code nature is actually a good response that was accepted and returning to topple back to writing the local database that so here's an action right this is my went on top reading in the car even at the bottom receive the results I added a point the movement then refreshed currently window and the change and set up set up so new new and fresh refrained from that but on the end so the trigger method is that I think middle solution but if you want something even simpler operating in batch the solution from Martin Jensen is even more diabolical I just dumps the table directly into curl and this exactly the CSP so the 4 points and the slams a table right under currently being important that's probably the smallest distance equality for moving data directly from 1 database to carry the so that the example of a push up moving data from local posters remote BGP host have to push down pushing data down from the cloud to local posters database we do the reverse sure we can always use a fancy simple standard called sequel mad that stands for management of external data which is exposed in impose breast as a real world person real world peace functionality called for data wrappers arrested you foreign data rabbits exposed what looked to the client just like a table database you get access by running select queries on um can change it by running insert update delete amounts on it but behind the scenes of for data rapidly will be anything all so it can be a table on a remote database and not necessarily even remote host graph databases rabbits Oracle nice to well I another database that could be a non database to source flat file our could be a non tabular database like Twitter worry there's FTW implications for all those things I would like to talk about today is that you get the wrapper front yard the spatial data extraction library is a perfect fit for w rapid many ways and expose the very tabular kind of data the audio data models very much a tabular model I was spatial spatial objects in and the multi-format spatial library by implementing a giant w we had access to all formats although Java the price running just 1 wrapper so here looks like to expose a file database post procesing or during the W so 1st you turn on the get extensions are then you create a server OK that references the data source in the space of biology database file you can see the nomenclature really seems to be working against other database before so that's not actually a real restriction and finally creative on table that in turn references certified it the final columns and the foreign server you want expose you local database but it's the same thing only using currently is the foreign server you know currently is oppose pressed posters and if we don't have access to low level also we define our server using the Curry B cardiology arrive rather than the post presses your right now we define our foreign table the match the part of the table so the breakpoint people statement once that's defined we can run queries locally on the table and get results just as the data were local so here's a distance query finding the 7 nearest pieces of public art to the piece being fire in the belly the ODR and yet at that the drive is getting better all the time so as a In a multitude consent falls so that's where the where clause into the where clause is to remote servers so that only subsets of the data is sent back to the client I think when extradition add spatial filters and finally updated delete supports it's possible to actually integral data without ever leaving the friendly confines of course press so is the 3rd group that they have to give a talk about model the 4 seasons at some point the Summer Fall Winter Spring the cool and but for now the new remains a larger is the 3rd movement and time tied in that in the abstract maybe right this talk I promised I do at a time travel portional look into the future so for this section i wanna turn to overcome the magnificent as also in the blocks you're all and so is what contact as you type a city in the got complete form which is driven off oppose press query and based of the city codec tells you it's going to rain tomorrow and this is all done with PostgreSQL PostGIS the the video was taken that couple months ago so the answer wrong but if you go to the demo page you can find a lot fun that is only for the states and a lot that that should be pretty accurate or as accurate as a fortuneteller can be expected be so it's appeal that covers and see how the trick works I know up National Oceanic in the atmosphere administrations nice enough to publish the forecasting and web directory that's kept costly updates so you can see that I create the slide February and and for the rate prediction the file were interested in is the pop 12 well probably precipitation um given in 12 hour forecast windows if you download that file conversion to and look and you just as what you see which is like totally awesome should be the His due is the following the loading the test with that 1 is rare event is green in band 3 is blue which gives a really cool picture of lots of fun mixing I'm actually status might be viewed separately as the forecast period self-esteem and get the you can see the forecaster precipitation moving from west to east in this case you expect the general direction of whether the gesture in North America right so once the data are intuitive we can put them in a post yes but actually getting optimal conversion from the knowing it's CF due to the signals bit of adventure and learning experience for the lecture on the Fall conversion did preserve all 5 bands and included the spatial reference information and the metadata metadata the book about school but the input file is 1 point 5 makes the output file files 113 next so the 1st thing you notice when you probably have files pixel type is double that's a bit 8 bytes per pixel but the input datasets integers from 0 to 100 with no data which basically fits in a single by so there's an 8 fold improvement in storage available figures change the pixel type which is pretty easy that gets the upper . 14 May so no longer 100 times larger in but only 10 times larger which is still pretty terrible i we look at the tip computers it has some some awful imperfections where the 9 and I noted pixels of course down at the same date range as your own or same range 0 100 so we need to explicitly map the no data values into a slot in the Member space there's no real data so since our data run from 0 to 100 we can map 255 safely for the files still are and so what's going on the true that the Google is producing an uncompressed due to buy people so we have to ask for compression deflate is does an excellent job in our down to 1 . 4 8 megabytes about original but that actually the
plate has an extra options we can all the time we had a higher compression level which is used a little more and we and we had a scanline predictor of we get
down to just over 1 it was pretty nice improvement over 130 megabyte the default conversion us the so the key part of our solution is what kind means when he says tomorrow and the figure that helps to look at the output from new info there's actually finance due and the great job preserving the original real format metadata which is what you so we can figure out what each band means are the 1st that is good for a 10 hours after the forecaster generated the 2nd bond is good until 22 hours at the forecast for advanced 3 4 hours for the benefit of 46 hours each Ben has a valid time which gives a UTC timestamp of forecast expires but could be very useful work this acid 2 ways to solve a problem others the right way speaker careful look at the metadata and figure out which forecast and we need based on the kind and is my way which was just an average band 3 and 4 together those pretty close to tomorrow between 12 and 36 hours I possible ICA do any average inversely to load the data which involves this so the usual pick command-line syntax for there's nothing too crazy here I would choose a 32 by 32 chips that because because 1 by per pixel time 3 2003 2005 bands implies a 5 killed by tile which is slightly smaller than the page size suppose press the APP side of was all the data are loaded we just need to simple queries to run but I and that sort of stuff here using query the 1st equation dry the complete where it's just reads the populated places order itself by size of cities that's really easy and 2nd 1 that generates a probability guess for codec given the cities of step 1 is a nice big cock the sequel step 1 selected city to generate a buffering which we use to summarize opposite probabilities step 2 we apply that covered the rest table and find all Rastas intersect the buffer in Moscow just the pixels of the rest the fall inside the book and then step 3 we take those mass rostas and andsummarize and find the maximum value of the probably diversification for every pixel that becomes the number we use to drive context yes here's here's a looks like visually so we calculate the blue buffer and then we use that blue buffer defined intersecting chips into the red boundaries of chips we just hit 2 of barely and we mascot chip defined just the pixels that intersect the and the Nordic you contact up-to-date apple process for in my home computer every 6 hours pulls the precipitation forecast from NOAA converts due to stuffed up into an S 3 bucket and from there currently be automatically since every day using the standard tables and capability currently are basically every refresh period just slips the file down replaces the current data with the new and that's contacted the magnificent looking in future I have a little code official off but it that much time so say thank you very much for the and I don't have time for 1 question I'll that 1 question but just some question place cerebrum not in but these easiest question ever and if you do it you are shy it you can have 2 questions so you run until we went back that's Parker hold you 1 yes you yes the the I am I wouldn't overturned so I would say that the a lot like typical it the north end of
Trail
Open source
Parity (mathematics)
Connectivity (graph theory)
Chemical equation
Multiplication sign
Point cloud
Instance (computer science)
Web service
Computer animation
Database
Extension (kinesiology)
Local ring
Physical system
Complex (psychology)
Building
Greatest element
Multiplication sign
View (database)
Scientific modelling
Source code
Function (mathematics)
Parameter (computer programming)
Data dictionary
Mereology
Disk read-and-write head
Weight
Food energy
Formal language
Subset
Mathematics
Synchronization
Elasticity (physics)
Extension (kinesiology)
Partial derivative
Physical system
Theory of relativity
Mapping
Relational database
Software developer
Interface (computing)
Bit
Functional (mathematics)
Vector space
Data storage device
Telecommunication
Order (biology)
output
Configuration space
Website
Figurate number
Data type
Middleware
Resultant
Row (database)
Point (geometry)
Web page
Metre
Dataflow
Domain name
Sequel
Geometry
Virtual machine
Rule of inference
Field (computer science)
Power (physics)
Number
Goodness of fit
Latent heat
Term (mathematics)
Database
Operator (mathematics)
Software testing
Selectivity (electronic)
Object-relational mapping
Data structure
Subtraction
Address space
Run time (program lifecycle phase)
User interface
Standard deviation
Dialect
Myspace
Matching (graph theory)
Information
Key (cryptography)
Projective plane
Set (mathematics)
Cartesian coordinate system
Table (information)
Plot (narrative)
Subject indexing
Uniform resource locator
Word
Computer animation
Query language
Search engine (computing)
Personal digital assistant
Blog
Formal grammar
Object (grammar)
Pixel
Group action
Curvature
Java applet
Scientific modelling
System administrator
Equaliser (mathematics)
Range (statistics)
Source code
Shape (magazine)
Client (computing)
Data dictionary
Web service
Mathematics
Videoconferencing
Physical system
Texture mapping
Theory of relativity
Spacetime
Mapping
Simultaneous localization and mapping
Block (periodic table)
Software developer
Interface (computing)
Bit
Prediction
Demoscene
Process (computing)
Data storage device
Order (biology)
Text editor
Figurate number
Reading (process)
Point (geometry)
Web page
Slide rule
Sequel
Computer file
Point cloud
Code
Rule of inference
Event horizon
Number
Frequency
Database
Authorization
Energy level
Selectivity (electronic)
Address space
Metropolitan area network
Form (programming)
Authentication
Matching (graph theory)
Graph (mathematics)
Information
Key (cryptography)
Demo (music)
Directory service
Grass (card game)
Cartesian coordinate system
Local Group
Table (information)
Word
Uniform resource locator
Spring (hydrology)
Visualization (computer graphics)
Query language
Personal digital assistant
Statement (computer science)
Codec
Window
Library (computing)
Musical ensemble
Building
Ferry Corsten
Code
State of matter
Multiplication sign
Sheaf (mathematics)
Insertion loss
Complete metric space
Function (mathematics)
Mereology
Inference
Dressing (medical)
Formal language
Subset
Web 2.0
Bit rate
Data compression
Data conversion
Extension (kinesiology)
Area
Injektivität
Adventure game
File format
Wrapper (data mining)
Fitness function
3 (number)
Wellenwiderstand <Strömungsmechanik>
Functional (mathematics)
Data model
Network topology
Repository (publishing)
output
Right angle
Remote procedure call
Data type
Data management
Resultant
Row (database)
Filter <Stochastik>
Color confinement
Server (computing)
Implementation
Batch processing
Perfect group
Real number
Time travel
Distance
Field (computer science)
Metadata
Twitter
Revision control
Natural number
Operator (mathematics)
Integer
Software testing
Proxy server
Addition
Computer
Subject indexing
Particle system
Computer animation
Formal grammar
Dependent and independent variables
Local ring
Matching (graph theory)
Web page
Pixel
Musical ensemble
Context awareness
Computer file
Sequel
Code
Multiplication sign
Real number
ACID
Mass
Function (mathematics)
Mereology
Computer
Metadata
Number
Maxima and minima
Frequency
Computer configuration
Data compression
Energy level
Boundary value problem
Data conversion
Default (computer science)
Standard deviation
Process (computing)
Information
File format
Timestamp
Table (information)
Exergie
Tessellation
Process (computing)
Computer animation
Query language
Buffer solution
Order (biology)
Equation
Right angle
Figurate number
Quicksort
Computer animation
Lecture/Conference
Loading...

Metadata

Formal Metadata

Title Magical PostGIS in three brief movements
Title of Series FOSS4G Seoul 2015
Author Ramsey, Paul
License CC Attribution - NonCommercial - ShareAlike 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.
DOI 10.5446/32156
Publisher FOSS4G, Open Source Geospatial Foundation (OSGeo)
Release Date 2015
Language English
Producer FOSS4G KOREA
Production Year 2015
Production Place Seoul, South Korea

Content Metadata

Subject Area Information technology
Abstract Everyone knows you can query a bounding box or even spatially join tables in PostGIS, but what about more advanced magic? This short symphony of PostGIS examples will look at using advanced features of PostGIS and PostgreSQL to accomplish surprising results: * Using full text search to build a spatially interactive web form. * Using raster functionality to look into the future. * Using standard PostgreSQL features to track and visualize versioning in data. PostGIS is a powerful tool on it's own, but combined with the features of PostgreSQL, it is almost magical.
Loading...
Feedback

Timings

  583 ms - page object

Version

AV-Portal 3.7.0 (943df4b4639bec127ddc6b93adb0c7d8d995f77c)