
Processing and publishing Maritime AIS data with GeoServer and Databricks in Azure


Formal Metadata

Title
Processing and publishing Maritime AIS data with GeoServer and Databricks in Azure
Title of Series
Number of Parts
156
License
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Abstract
The amount of data we have to process and publish keeps growing every day. Fortunately, the infrastructure, technologies, and methodologies to handle such streams of data keep improving and maturing. GeoServer is an open source web service for publishing your geospatial data using industry standards for vector, raster, and mapping. It powers a number of open source projects like GeoNode and geOrchestra, and it is widely used throughout the world by organizations to manage and disseminate data at scale. We integrated GeoServer with some well-known big data technologies like Kafka and Databricks, and deployed the systems in the Azure cloud, to handle use cases that required near-real-time display of the latest received AIS data on a map as well as background batch processing of historical maritime AIS data. This presentation will describe the architecture put in place and the challenges that GeoSolutions had to overcome to publish big data through GeoServer OGC services (WMS, WFS, and WPS), finding the right balance between ingestion performance and visualization performance. We had to integrate with a stream processing platform that took care of most of the processing and of storing the data in an Azure data lake, which allows GeoServer to efficiently query for the latest available features while respecting all the authorization policies that were put in place. A few custom GeoServer extensions were implemented to handle the authorization complexity, the advanced styling needs, and the big data integration needs.
Transcript: English (auto-generated)
Thank you. So I'm going to cover processing and publishing maritime data with GeoServer and Databricks in Azure. Skipping over the company, because we only have four minutes: big data, maritime security. So what's the use case? We have a large database of information about the sea, in particular moving points: the ship positions, which the ships constantly deliver through their AIS sensors. But we also have all the maritime assets: ports, navigation light systems and so on. So we have a bunch of static or semi-static objects and a lot of moving points.

The use case in numbers: in 24 hours we receive 50 million position reports from up to 500,000 different ships, with a peak of 250,000 messages per second, and we keep seven years of data online. That's 125 billion positions. So is that big data? Well, I'd say that with the previous slide I should have convinced you that we have both the velocity, in terms of incoming data, and the volume, in terms of data storage. Do I have the variety as well? Well, yeah, because in order to support informed decision-making we have to provide a variety of context information.
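As a rough check on the volume figures quoted above, the arithmetic can be written out as below. This is a back-of-the-envelope sketch only: it assumes a constant 50 million reports on every day of a 365-day year, which the talk does not claim.

```python
# Back-of-the-envelope check of the volume figures quoted in the talk.
# Assumption (not from the talk): the daily rate is constant, years have 365 days.
REPORTS_PER_DAY = 50_000_000   # position reports received in 24 hours
YEARS_ONLINE = 7               # retention period kept online
SECONDS_PER_DAY = 24 * 60 * 60

average_per_second = REPORTS_PER_DAY / SECONDS_PER_DAY
total_positions = REPORTS_PER_DAY * 365 * YEARS_ONLINE

print(f"average rate: {average_per_second:,.0f} messages/s")
print(f"positions after {YEARS_ONLINE} years: {total_positions / 1e9:.1f} billion")
```

Under these assumptions the total comes out a little under 128 billion, the same order of magnitude as the roughly 125 billion positions mentioned.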
So we don't just have the moving points; we also have a bunch of contextual information that needs to be displayed at the same time, and we need to do so interoperating with other data sets as well. In addition to that, we have a complex authorization rights matrix, so that certain entities and certain users can only see specific subsets of the data: a specific port, a specific part of the sea, data from a specific nation, and so on. So forget about caching. It's just impossible.

Now a couple of use cases, one that deals with velocity and one that deals with volume. The first one: visualizing the ship positions in real time.
We display the latest position of all the ships, all together, on the client, updated about every two seconds. The ship positions come in; they are validated, enriched and processed in general, and then added into a database that GeoServer uses. This is a picture of all the ships in the Mediterranean Sea. In this case we are coloring each vessel according to its type, but we can also do it according to, say, its fishing gear if it's a fishing boat. And here is another example, where we are supporting an aircraft search and rescue operation. So a different use case with the same data, and in this case a display of the real-time positions from its navigation system. Another example with all the ships: you can also see North America here, and that's why we reach 500,000 ships. This one is in a polar projection, so we don't only support Web Mercator but a bunch of useful projections for the output.

There is advanced styling. The client is really rich: the user in the control room can choose particular types of ships and have them highlighted, and have them move around the map every two seconds with a continuous update. So as you can see, we have to process and render a lot of data very quickly. How does it work?
Well, we have a Kafka cluster with a set of queues that receive the data. We have an ingestion cluster that processes the data, validating it and enriching it, adding extra data depending on the position and so on, and eventually writing everything to a PostgreSQL database. And then there is a GeoServer cluster that pulls all this fresh information out of the PostgreSQL database and renders it on a map depending on the current user, the current filter, the current highlights and so on. So everything is on the fly, because here things are moving fast.
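To make the "current filter" part of on-the-fly rendering concrete, here is a minimal sketch of a client building a WMS GetMap request that carries a per-request filter through GeoServer's `CQL_FILTER` vendor parameter. The host, the layer name `ais:latest_positions` and the attribute `ship_type` are hypothetical and do not come from the talk. The per-user authorization is handled by custom GeoServer extensions in the described system and is not shown here.

```python
from urllib.parse import urlencode


def build_getmap_url(base_url, layer, bbox, cql_filter, width=1024, height=512):
    """Build a WMS 1.1.1 GetMap URL with a GeoServer CQL_FILTER vendor parameter."""
    params = {
        "service": "WMS",
        "version": "1.1.1",
        "request": "GetMap",
        "layers": layer,
        "styles": "",                # empty value selects the layer's default style
        "srs": "EPSG:4326",
        "bbox": ",".join(str(v) for v in bbox),  # minx,miny,maxx,maxy
        "width": width,
        "height": height,
        "format": "image/png",
        "transparent": "true",
        "cql_filter": cql_filter,    # evaluated by GeoServer on every request
    }
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint, layer and attribute names, for illustration only.
url = build_getmap_url(
    "https://example.org/geoserver/wms",
    "ais:latest_positions",
    (-6.0, 30.0, 36.5, 46.0),        # rough Mediterranean extent
    "ship_type = 'fishing'",
)
print(url)
```

Because the filter changes from request to request and from user to user, responses like this one are poor candidates for tile caching, which matches the point made earlier in the talk.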
Okay. Another use case: historical positions. In this case we have all the positions for the past seven years. And what can you do with them? For example, ship correlation: have two ships met at sea to do something maybe illegal, or, you know, washing the tanks of a petroleum vessel, or something like that?
We have the ability to choose and export from an Azure data lake the positions of the ships, correlate them through an algorithm, and figure out whether they had occasion to meet at sea or not. In this case we have an Azure data lake and an Apache Spark process that does the extraction.
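The talk does not describe the correlation algorithm itself. As an illustration of the idea only, the sketch below flags a possible encounter when two ships report positions close together in both space and time. The `Position` type, the function names and the thresholds (500 m, 60 s) are assumptions made for this example, and the brute-force comparison stands in for whatever the Spark job actually does at scale.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Position:
    mmsi: int         # ship identifier
    timestamp: float  # seconds since epoch
    lat: float        # degrees
    lon: float        # degrees


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres on a spherical Earth."""
    radius = 6_371_000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))


def find_encounters(track_a, track_b, max_distance_m=500.0, max_time_gap_s=60.0):
    """Return pairs of reports that are close in both time and space.

    Brute force, O(len(track_a) * len(track_b)); fine for two short tracks only.
    """
    encounters = []
    for a in track_a:
        for b in track_b:
            if abs(a.timestamp - b.timestamp) > max_time_gap_s:
                continue
            if haversine_m(a.lat, a.lon, b.lat, b.lon) <= max_distance_m:
                encounters.append((a, b))
    return encounters


# Two made-up tracks: the ships are about 111 m apart at the start, then diverge.
track_a = [Position(111, 0, 36.0, 15.0), Position(111, 600, 36.0, 15.1)]
track_b = [Position(222, 30, 36.001, 15.0), Position(222, 630, 36.5, 15.1)]
encounters = find_encounters(track_a, track_b)
print(len(encounters))
```

A production version would need to sort or bucket reports by time and space rather than compare every pair, since the historical data set holds billions of positions.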
This extraction can be slow, it can take 20 or 30 seconds, and then we have a cache in PostgreSQL to do interactive mapping out of it. And that's it.

Okay, awesome. So that's like MarineTraffic, but is it a private solution or a public service?

Okay, okay. Thank you.