MobilityDB
Formal Metadata
Number of Parts | 490
License | CC Attribution 2.0 Belgium: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/46950 (DOI)
Transcript: English (auto-generated)
00:05
Hello everyone. I am Mahmoud, a professor at ULB, on this campus. We are working on MobilityDB, a moving object database system.
00:22
It is meant for trajectory data management. Already, in the talk of Vissarion, we got an idea of what a trajectory is. In MobilityDB, the time dimension is also taken into account.
00:41
So, if you have a GPS track from your mobile or some GPS tracker or navigation device and so on, it gives you data more or less like this: you get some identifier for the trip and then a sequence of points and times.
01:02
You can put this in PostGIS or in MySQL, or you can use a Boost library, in order to start doing data management on this. What MobilityDB does is put the whole trip together.
01:24
So, it encapsulates, it creates this structure for a trip where you have a point at a timestamp and then a second point at a timestamp. So, you put the whole sequence together in one data type and that becomes a data element in your table.
01:41
And then you can start writing functions over it in SQL to calculate the speed, to calculate the heading, to do selections, to do joins and so on. So, generally speaking, this is the general architecture.
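To make the idea of a trip as a single value concrete, here is a minimal Python sketch. It is not MobilityDB code: the tuple layout, the units (metres and seconds) and the helper name `segment_speeds` are assumptions made for illustration only.

```python
from math import hypot

# A trip as one value: a time-ordered sequence of (x, y, t) instants.
# Assumed units: x and y in metres (projected coordinates), t in seconds.
trip = [(0.0, 0.0, 0.0), (30.0, 40.0, 10.0), (30.0, 100.0, 20.0)]

def segment_speeds(trip):
    """Return [(t_start, t_end, speed)] for each pair of consecutive instants."""
    speeds = []
    for (x0, y0, t0), (x1, y1, t1) in zip(trip, trip[1:]):
        speeds.append((t0, t1, hypot(x1 - x0, y1 - y0) / (t1 - t0)))
    return speeds

print(segment_speeds(trip))
```

Because the whole sequence lives in one value, a function such as speed needs no self-join over point rows; it just walks consecutive instants.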
02:02
You have PostgreSQL providing the relational types, and some advanced types like XML and JSON as well. And then on top you have PostGIS, which provides two main abstractions, geometry and geography. So, you can be doing maps. And then you have MobilityDB on top, which provides the temporal information,
02:27
that adds the temporal information to PostGIS and PostgreSQL types. So, you get a temporal geometry point representing a car, for example, or a person moving over time. You also get temporal geography points, depending on whether the coordinates are geographic or geometric.
02:47
But also you get temporal float, temporal integer, temporal text, temporal boolean. And these are important for evaluating functions and predicates over trajectories. For example, the speed of a trajectory is changing over time, so that's a temporal float.
03:03
Say you want to check a predicate over a trajectory: is the car inside Brussels? The result would be a temporal boolean, because sometimes it's true, sometimes it's false. So, these types also have to be supported. And you can imagine that the list of temporal types can be extended to support different applications.
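The "inside Brussels" predicate can be sketched in Python as follows. This is a simplification under stated assumptions: the city is a hypothetical bounding box, and the predicate is evaluated only at the recorded instants, whereas a real temporal boolean would also carry the exact times at which the value flips.

```python
def inside(box, x, y):
    """Static predicate: is the point inside the axis-aligned box?"""
    xmin, ymin, xmax, ymax = box
    return xmin <= x <= xmax and ymin <= y <= ymax

def temporal_inside(trip, box):
    """Temporal boolean sampled at the trip's own instants: [(t, bool)]."""
    return [(t, inside(box, x, y)) for x, y, t in trip]

trip = [(0, 0, 0), (5, 5, 10), (12, 5, 20)]   # (x, y, t) instants
city = (4, 4, 10, 10)                         # hypothetical box standing in for a city
print(temporal_inside(trip, city))
```

The answer is a value that changes over time rather than a single true or false, which is why a temporal boolean type is needed.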
03:26
Right now these are the main ones supported in MobilityDB. So, MobilityDB is a vertical extension that extends PostgreSQL at all data management levels.
03:41
So, it extends the data model with both time types and temporal types that I just mentioned. So, you can use them as attribute types in your table. It extends the indexes so that one can process fairly large tables quickly.
04:02
So, it extends the GiST index of PostgreSQL, which is an R-tree actually. The SP-GiST is space partitioning, so it's a kind of grid structure. And the B-tree index. It also extends the query optimizer so that one can do VACUUM ANALYZE to collect statistics about these temporal types.
04:28
This is so that the optimizer can estimate the selectivity of predicates and do its optimization: invoke the indexes when relevant and try to find the best execution plan.
04:42
And then a big set of operations. For example, you can always project, remove the time dimension, project to a line string, and then do all kinds of processing on a line string.
05:01
If you want to include the time, then you use the lifted operations. We call them lifted because they lift the static operations with time. So, you can do arithmetic on temporal numbers, binary operations on temporal booleans, distance and topology operations on temporal points, and so on.
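One way to picture lifting is as a higher-order function that turns a static operator into one over temporal values. The Python sketch below assumes a temporal value is a list of (t, value) pairs and that both operands share the same timestamps; the names `lift`, `t_add` and `t_and` are hypothetical, and a real system would also synchronize operands with different timestamps.

```python
import operator

def lift(op):
    """Turn a static binary operator into one over temporal values.

    A temporal value is a list of (t, value) pairs; both operands are
    assumed to be sampled at exactly the same timestamps.
    """
    def lifted(a, b):
        if len(a) != len(b):
            raise ValueError("operands must have the same number of instants")
        out = []
        for (ta, va), (tb, vb) in zip(a, b):
            if ta != tb:
                raise ValueError("operands must share timestamps")
            out.append((ta, op(va, vb)))
        return out
    return lifted

t_add = lift(operator.add)    # arithmetic on temporal numbers
t_and = lift(operator.and_)   # boolean operation on temporal booleans

speed_a = [(0, 10.0), (60, 12.0)]
speed_b = [(0, 1.5), (60, 2.0)]
print(t_add(speed_a, speed_b))
print(t_and([(0, True), (60, True)], [(0, True), (60, False)]))
```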
05:27
This is all built as an extension, not a fork, of PostgreSQL. And because it's an extension, it should by default be, we hope, compatible with other tools in the PostgreSQL ecosystem.
05:46
We have tested with some, and we wish to test with more. For example, integrating with pgRouting for calculating shortest distances, in order to support network points.
06:07
You know you can represent the coordinates either as absolute, as latitude and longitude, or as map-matched, as an identifier of a certain road in a road network plus the fraction of the distance that has been travelled, to get more context.
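The map-matched representation can be illustrated with a short Python sketch that converts a (road identifier, fraction) pair back to absolute coordinates. The `roads` dictionary and the `locate` helper are hypothetical; this is not how MobilityDB or pgRouting store a network.

```python
from math import hypot

# Hypothetical road network: road id -> polyline of (x, y) vertices.
roads = {7: [(0.0, 0.0), (100.0, 0.0), (100.0, 50.0)]}

def locate(road_id, fraction):
    """Absolute (x, y) of a point given as (road id, fraction of road length)."""
    line = roads[road_id]
    segs = list(zip(line, line[1:]))
    lengths = [hypot(bx - ax, by - ay) for (ax, ay), (bx, by) in segs]
    target = fraction * sum(lengths)          # distance to walk along the road
    for ((ax, ay), (bx, by)), seg_len in zip(segs, lengths):
        if seg_len > 0 and target <= seg_len:
            r = target / seg_len
            return (ax + r * (bx - ax), ay + r * (by - ay))
        target -= seg_len
    return line[-1]

print(locate(7, 0.5))
```

The network form keeps the context (which road the object is on), while the absolute form can always be derived from it.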
06:25
PipelineDB provides some stream processing for PostgreSQL. We did some, let's say, toy experiments with it. We do have good integration with Citus. Citus is for scalability, so that you can run a PostgreSQL database on a cluster, and your queries get distributed.
06:51
So, combining MobilityDB and Citus, one can query big data sizes.
07:01
So, my colleague there is responsible for this part. For a quick start using MobilityDB, there is a Docker image, and so on. So, what does loading the data look like? The most common format is comma-separated. You can always transform whatever tracking format you have into comma-separated.
07:29
Most important: CREATE EXTENSION MobilityDB CASCADE. That's how to start supporting these spatio-temporal operations in your PostgreSQL database.
07:41
CASCADE will also create the PostGIS extension if it is not there. Basically, MobilityDB manages the temporal part and its relation to the spatial part, and whenever it is about spatial processing, it delegates to PostGIS.
08:01
So, this table on the left is a pretty standard one that you would use for just loading the flat information you get from a GPS device. So, you have longitude, latitude, and time. These are the key things, and also some identifier for the moving object and the trip.
08:23
And that's the one you can make with MobilityDB. Here, you see one column called trip, which is a temporal geometry point. This is the one that's going to carry the complete trip. So, every row in this table will represent a trip.
08:40
And in order to load that table into this table and create your trips, basically you combine every point with its timestamp, do whatever projection you want in order to put it in the coordinate system required,
09:00
create an instance of these, and then aggregate all instances that are for the same trip and same car, aggregate them in an array, and then put this array in a temporal geometry point. So, now you have this complete trip in a single data item.
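The loading steps just described (pair each point with its timestamp, group by car and trip, order by time, collect into one value) can be sketched in Python. The column names and sample rows are invented for illustration, the projection step is left out, and in the talk this is done in SQL rather than in application code.

```python
import csv
import io
from collections import defaultdict

# Hypothetical flat GPS table: one row per observation, in arbitrary order.
raw = """car_id,trip_id,lon,lat,t
1,1,4.38,50.81,2020-02-01T08:00:10
1,1,4.37,50.80,2020-02-01T08:00:00
2,1,4.35,50.85,2020-02-01T09:00:00
"""

def build_trips(text):
    """Group observations by (car, trip) and order each group by time."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        instant = (float(row["lon"]), float(row["lat"]), row["t"])
        groups[(int(row["car_id"]), int(row["trip_id"]))].append(instant)
    # ISO 8601 timestamps of equal length sort correctly as plain strings.
    return {key: sorted(pts, key=lambda p: p[2]) for key, pts in groups.items()}

trips = build_trips(raw)
print(trips[(1, 1)])
```

After this step there is one entry per trip instead of one row per observation, which mirrors the single trip column in the second table.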
09:23
You can also use other formats, such as GTFS for public transport schedules, and there's a tutorial about this on the GitHub of MobilityDB that shows step-by-step how to create trajectories from GTFS.
09:45
With Google location data, you can download your own track and then start playing with it. Actually, Google stores a lot, so it will be fun. You can start calculating aggregates: how much time you spent driving or walking, where you go.
10:09
If you manage to get the location data of someone else, that becomes more interesting. And that's also another workshop, so you get a step-by-step tutorial.
10:21
A third workshop that's nice is managing AIS data. This is ship data, basically. Here we use some data published by the Danish Maritime Authority. They have a huge amount of data, like three terabytes of ship tracking data.
10:41
This only shows a single day after some filtering. So the original file is 10 million rows and one gigabyte, not very big, but you can go bigger. This is an application done by colleagues in Moscow who use MobilityDB to play with the public transport,
11:09
well, not to play, however you want to put it, with the public transport network in Moscow. And they made these nice velocity maps, basically by aggregating the spatio-temporal trajectories.
11:26
Very nice visualization, if it works. So this is normally, yeah, it will start moving.
11:40
So that was only yesterday. I had a very nice meeting with our colleague, Yangsook, from the Artificial Intelligence Research Center in Japan. This is actually visualizing MF-JSON. MF-JSON is a soon-to-appear OGC standard.
12:05
I know from the talk of Jody that not many are fans of OGC, but it is changing. OGC now is doing pretty cool stuff, including standards for moving features. So this data basically was exported from MobilityDB as MF-JSON, imported into this Cesium extension,
12:25
and then visualized as it moves. This is one of the ships in the AIS example. They even did some 3D, so you can see a space-time cube. This is the same ship, but now just the spatio-temporal movement.
12:52
So let's look more at queries in MobilityDB and what kind of operations you can get there.
13:00
So in this example, and in the following few, the same ship database will be used. Basically, we have a table of ships that has an identifier for a trip, the trip trajectory as a temporal geometry point, and speed over ground, a typical observation reported by the AIS sensors,
13:29
that tells you the speed of the ship. And because it changes over time, it's loaded into a temporal float. So this is coming from the source information, not calculated. Course over ground, also temporal float, and then I pre-calculate the spatial trajectory,
13:44
which is a line string, and project the same trip into a projected coordinate system, ETRS, which fits the area of Denmark. Now, if we want to list all the ships that commute between these two ports,
14:03
Rødby and Puttgarden. So you have two ports, and you want to see which ships commute between them. So I express this in this SQL query. Basically, the two ports are represented as two rectangles, and then we are interested in the trips that intersect both rectangles,
14:25
so that's two intersects predicates. This predicate is a MobilityDB one. It accepts a trajectory, a temporal geometry point, and some geometry, and it returns a boolean. And then in order to do this efficiently, you need an R-tree,
14:44
so you create a GiST index over the projected trip column. So a GiST index will be used behind the execution of this query. And this is the result. The two ports are the red rectangles,
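The logic of the commuting query, a trip that intersects both port rectangles, can be sketched in plain Python. The rectangles and ship tracks below are made up, the segment test is a standard Liang-Barsky clipping check rather than the routine PostGIS uses, and the index-based prefiltering is not modelled.

```python
def segment_hits_box(p, q, box):
    """Liang-Barsky test: does the segment p-q touch the axis-aligned box?"""
    xmin, ymin, xmax, ymax = box
    dx, dy = q[0] - p[0], q[1] - p[1]
    t0, t1 = 0.0, 1.0
    for d, lo, hi in ((dx, xmin - p[0], xmax - p[0]),
                      (dy, ymin - p[1], ymax - p[1])):
        if d == 0:
            if lo > 0 or hi < 0:      # parallel to this slab and outside it
                return False
            continue
        a, b = lo / d, hi / d         # parameter range where the slab is crossed
        if a > b:
            a, b = b, a
        t0, t1 = max(t0, a), min(t1, b)
        if t0 > t1:
            return False
    return True

def intersects(line, box):
    """True if any segment of the polyline touches the box."""
    return any(segment_hits_box(p, q, box) for p, q in zip(line, line[1:]))

port_a = (0, 0, 2, 2)       # hypothetical port rectangles
port_b = (10, 0, 12, 2)
ships = {
    "ferry": [(1, 1), (6, 5), (11, 1)],
    "coaster": [(1, 1), (1, 8)],
}
commuters = [name for name, line in ships.items()
             if intersects(line, port_a) and intersects(line, port_b)]
print(commuters)
```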
15:03
and then these are all the ships traversing. Another query finds the ships that have speed over ground different from the speed calculated from trajectory. Basically, in this table, you have two speed components,
15:21
one that is coming from the sensor, SOG, and one that you can calculate from the spatio-temporal trajectory that has been constructed. Normally, they should be the same. If they are not the same, then the sensor is providing some wrong information. So I don't know why the query should be interesting.
15:41
I thought it was interesting, so I tried it: get everything from the table ships, and then perform a minus between the two speed components. This will convert both to kilometers per hour. And this minus is a temporal minus between two temporal floats. So it's going to be calculating the difference at every time instant
16:02
and producing a temporal float. And the time-weighted average will summarize this temporal float into a single float. Then compare: if the speed difference is greater than 10, show me this trip. So this is what I just explained.
16:21
You have two speed components. The blue one is the one calculated from the spatio-temporal trajectory. The orange one is the SOG. If you do a temporal minus, this is what you get. And then you summarize this using the time-weighted average to get a single float.
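A small Python sketch of the temporal minus followed by a time-weighted average, under the assumptions that both temporal floats are sampled at the same timestamps and interpolated linearly in between. The helper names `t_minus` and `tw_avg` and the sample values are illustrative only.

```python
def t_minus(a, b):
    """Temporal minus of two temporal floats sampled at the same timestamps."""
    return [(ta, va - vb) for (ta, va), (_, vb) in zip(a, b)]

def tw_avg(tf):
    """Time-weighted average of a linearly interpolated temporal float.

    Integrates with the trapezoid rule (exact for linear pieces) and
    divides by the total duration.
    """
    area = sum((t1 - t0) * (v0 + v1) / 2
               for (t0, v0), (t1, v1) in zip(tf, tf[1:]))
    return area / (tf[-1][0] - tf[0][0])

sog = [(0, 20.0), (10, 20.0), (40, 50.0)]        # speed reported by the sensor
derived = [(0, 20.0), (10, 20.0), (40, 20.0)]    # speed computed from positions
diff = t_minus(sog, derived)
print(diff, tw_avg(diff))
suspicious = tw_avg(diff) > 10
```

Here the plain mean of the three sampled differences is 10.0, while the time-weighted average is 11.25, because the large deviation builds up over the long final interval. Weighting by time is what makes the summary independent of how densely the sensor happened to report.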
16:41
I didn't put a result here, but we got some trips that really showed noise. This is another query for an aggregation. So here it aggregates the length of the trips per hour.
17:00
So basically we want to see the distance traveled per hour. So for every hour in the day, what is the total distance traveled by all ships, as an indicator of how busy the Danish waters are every hour. So we create a relation of 24 periods, the 24 hours.
17:24
And for every period, we restrict the trajectory to this period. And then just calculate the length in kilometers of the trip and do a regular sum over this group by the period.
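The restrict-then-measure step can be sketched in Python as below. Assumptions: trips are (x, y, t) instants with strictly increasing t expressed in hours, coordinates are in kilometres, and movement is linear between instants; `position_at` and `length_in_period` are hypothetical helpers, not MobilityDB functions.

```python
from math import hypot

def position_at(trip, t):
    """Linearly interpolated (x, y) at time t; t must lie within the trip."""
    for (x0, y0, t0), (x1, y1, t1) in zip(trip, trip[1:]):
        if t0 <= t <= t1:
            r = (t - t0) / (t1 - t0)
            return (x0 + r * (x1 - x0), y0 + r * (y1 - y0))
    raise ValueError("t outside trip")

def length_in_period(trip, start, end):
    """Length of the part of the trip that falls within [start, end]."""
    lo, hi = max(start, trip[0][2]), min(end, trip[-1][2])
    if lo >= hi:
        return 0.0                      # trip and period do not overlap
    inner = [(x, y) for x, y, t in trip if lo < t < hi]
    pts = [position_at(trip, lo)] + inner + [position_at(trip, hi)]
    return sum(hypot(bx - ax, by - ay) for (ax, ay), (bx, by) in zip(pts, pts[1:]))

ships = [
    [(0.0, 0.0, 0.5), (30.0, 0.0, 1.5), (30.0, 40.0, 2.0)],
    [(0.0, 0.0, 0.0), (0.0, 10.0, 1.0)],
]
# Regular sum over all ships, grouped by hourly period.
per_hour = {h: sum(length_in_period(s, h, h + 1) for s in ships) for h in range(3)}
print(per_hour)
```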
17:41
So you get something like this. So per hour, this is the total traversed distance by all the ships. Nothing interesting. More or less, it is the same all the day. But the query is interesting.
18:01
This is a temporal aggregate. So in the previous one, we did a sum. Here it's a temporal sum. And it is using the operator cumulative length. The cumulative length, at every time instant, tells you the distance that has been traversed so far from the beginning of the trip.
18:23
So the result is a temporal float. And now you have multiple temporal floats per trip. So you do a temporal sum, which will sum these temporal floats at every time instant. And the result is a temporal float that looks like this.
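A rough Python sketch of cumulative length followed by a temporal sum. It is a simplification under stated assumptions: each temporal float is linearly interpolated, the sum is sampled only at the union of all recorded timestamps, and a trip contributes only while it is defined. How an actual temporal sum treats trip boundaries may differ.

```python
from math import hypot

def cumulative_length(trip):
    """Temporal float [(t, distance so far)] for a trip of (x, y, t) instants."""
    out, total = [(trip[0][2], 0.0)], 0.0
    for (x0, y0, _), (x1, y1, t1) in zip(trip, trip[1:]):
        total += hypot(x1 - x0, y1 - y0)
        out.append((t1, total))
    return out

def value_at(tf, t):
    """Linearly interpolated value at t, or None outside the definition time."""
    for (t0, v0), (t1, v1) in zip(tf, tf[1:]):
        if t0 <= t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    return None

def temporal_sum(tfs):
    """Sum several temporal floats at every timestamp any of them records."""
    times = sorted({t for tf in tfs for t, _ in tf})
    out = []
    for t in times:
        vals = [v for v in (value_at(tf, t) for tf in tfs) if v is not None]
        out.append((t, sum(vals)))
    return out

a = cumulative_length([(0, 0, 0), (3, 4, 10)])
b = cumulative_length([(0, 0, 5), (0, 6, 10)])
print(temporal_sum([a, b]))
```

Unlike the hourly query, the output is itself a temporal float, so the total can be read off at any instant rather than per fixed bucket.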
18:42
So this is the time, this is the distance. And you see that it is steadily increasing, which confirms the previous query: there is no difference according to the hour or the time. This is another query, to do a join.
19:01
Here we want to see whether there were some dangerous situations in the dataset where two ships came very close to one another. And for this there is the predicate distance within, taking two trajectories and checking whether they have ever come within 300 meters of one another.
19:26
So this query is self-joining the ships table. And if so, for these trajectories, show me the shortest line between the two trajectories.
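The geometry behind a "came within 300 meters" check can be sketched for one pair of synchronized segments. This Python example assumes both ships move linearly over the same time interval and that closeness is meant at the same instant; the function names are hypothetical, and a full implementation would first synchronize the two trips and then test every segment pair.

```python
from math import hypot

def min_distance(a0, a1, b0, b1):
    """Smallest distance between two points moving linearly over the same
    time interval, one from a0 to a1 and the other from b0 to b1."""
    # Relative position is r(s) = r0 + s * dr for s in [0, 1].
    r0 = (a0[0] - b0[0], a0[1] - b0[1])
    r1 = (a1[0] - b1[0], a1[1] - b1[1])
    dr = (r1[0] - r0[0], r1[1] - r0[1])
    dd = dr[0] ** 2 + dr[1] ** 2
    if dd == 0:
        s = 0.0                          # constant separation
    else:
        # Minimizer of |r(s)|, clamped to the interval.
        s = max(0.0, min(1.0, -(r0[0] * dr[0] + r0[1] * dr[1]) / dd))
    return hypot(r0[0] + s * dr[0], r0[1] + s * dr[1])

def came_within(a0, a1, b0, b1, limit=300.0):
    return min_distance(a0, a1, b0, b1) <= limit

# Two ships passing each other on parallel lanes 200 m apart.
print(min_distance((0, 0), (1000, 0), (1000, 200), (0, 200)))
print(came_within((0, 0), (1000, 0), (1000, 500), (0, 500)))
```

In the first case the ships are more than a kilometre apart at both endpoints yet pass 200 m from each other midway, which is why the check has to look between the recorded instants.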
19:45
So the result is something like this. The blue ships and the green ships, whenever a blue ship comes close to a green ship, 300 meters or less, we see this red line.
20:02
So the ones above are at the entrance of the port, so that's not a big deal. Ships are going slow, they are entering the port. But maybe this one is more interesting because it happens just in the middle. Two ships are coming close to one another. Maybe one would like to look further at their direction, their speed, whether they are heading towards one another.
20:24
So you can continue adding more complexity to the SQL. This is what I mentioned in the beginning. You can run this on a cluster. Just put on every node, Citus, MobilityDB, PostgreSQL, and PostGIS.
20:42
The data will be sharded and replicated. You have a few management commands to shard your data, to create reference tables and so on. And then everything else is transparent. You just write the query, and it gets distributed.
21:01
And we saw that on four machines, the queries become 20 times faster. So to end, MobilityDB is a moving object database system. It's an extension of PostgreSQL and PostGIS. It's developed by the team in ULB.
21:23
It's open source. It's available on GitHub. And it's compliant with the OGC standard for moving features, the new development that is happening over there. So it doesn't show the last slide.
21:40
Thank you very much. We have one minute for one question, and even a short one. Please. Could you speak up a little?
22:13
Yes and no. It builds on top of the PostGIS point, which can be three-dimensional. But I know that the distance functions in PostGIS are not very accurate when it comes to the third dimension.
22:27
So the distance is done by PostGIS basically. MobilityDB will manage the time. Thank you very much again. Thank you.