
State of MovingPandas: analyze all those tracks (not just GPS)


Formal Metadata

Title
State of MovingPandas: analyze all those tracks (not just GPS)
Title of Series
Number of Parts
351
Author
License
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language
Production Year
2022

Content Metadata

Subject Area
Genre
Abstract
This talk presents the current state of MovingPandas (movingpandas.org) and related movement data analysis tools. MovingPandas has been growing steadily since its first publication in 2018 (with more than 24 contributors to date). Building on GeoPandas and GeoViews, MovingPandas provides movement data analysis tools that support efficient exploratory data analysis through interactive (visual) analysis. Early functionality and demos were focused on dealing with GPS tracking data (including vehicle and animal tracks). This talk presents recent developments towards supporting other track data, including examples from sports tracking (movement in real space, extracted from video footage) and eye or mouse tracking (movement in virtual space). Among many other details, this includes support for local coordinate systems, integration of context beyond geographic base maps, as well as trajectory generalization, segmentation, and distance measures. Finally, we revisit the origins of MovingPandas: the QGIS plugin Trajectools; and review the steps necessary to bring MovingPandas' trajectory analysis tools to QGIS.
Keywords
Transcript: English (auto-generated)
Thanks so much, Piermin, and thank you all for coming. It's really great that we already had the GeoPandas introduction, so I really enjoyed this kind of lineup, because the project that I want to talk about, MovingPandas, is actually built on top of GeoPandas. But let me take one step back,
like what kind of data is MovingPandas actually built for? And the answer is everything that moves, and not just in geographic space. So I want to make it general purpose, but it should also work for geographic movement. So I started using it a lot with ship movement data,
so you can also use it for other things that move freely in space. You could use it for flight data, as long as it's two dimensional, because we don't have three dimensions really right now. But you can also use it if you're doing eye tracking analysis, or sports analysis, or my last little pet project is the bottom right one, which uses data from the Mars rover,
so I'm actually leaving Earth and going to a different planet, and you can also do the calculations there. It just gets a little more tricky with the coordinate reference systems, because you need to figure that stuff out as well. So, a bit of a headache. But in general, what I want you to take away is that MovingPandas is supposed to work for all kinds of movement,
whether it's human, animals, or whatever other object that can be moving around. And probably you would imagine that it's not the only project that does that kind of thing. Actually, there are a ton of movement analysis tools, and these are just the ones that I'm aware of. And I'm super cheating here,
because particularly for the R guys, there is a whole review paper that is just looking at movement data analysis tools in R, and it's like 40 different libraries that all do something with movement data, so I haven't included them all here. But if you look at the Python side, we have like 10 different ones
over the last couple of years. It really started out very much in late 2018, early 2019, and I have indicated with the small icons on the left whether they're using just pandas or whether they're using GeoPandas. So there is no general agreement whether you should be using one or the other.
I talked with a couple of people who consciously decided against using GeoPandas. Maybe now with the speedups in reading and writing, we could nudge them towards the geo side of things, but right now it's very split. But what I find super surprising is that everyone uses pandas.
Like, I'm not aware of a Python library for movement data analysis that doesn't use pandas. And maybe also interesting, if you're generally into the larger things and not just a Python user, there are also developments in the database space and in the C++ space.
So we have an extension for PostGIS that is called MobilityDB. It's actually currently in incubation with OSGeo. So they are the extension to PostGIS for movement data, very much like MovingPandas is the extension to GeoPandas for movement data.
And they are also working on an equivalent to GEOS, which is called MEOS, for Mobility Engine, Open Source. So they're trying to pull out the analysis functions and make them more accessible for others to use as a backend solution.
I guess in this room, I probably don't have to explain why everyone is using pandas or GeoPandas, but when you look at movement data, it's really just a time series that has coordinates attached to it at every timestamp. So all these libraries,
they take a step back from the usual GIS layer approach, where everything is a geometry that has some attributes, and look at it the other way around. Everything is a time series that has geometries attached, and usually they only allow for point geometries. Even though I think what is really interesting about the approach of using a GeoDataFrame
is that we could actually also support moving area features. I haven't implemented that yet, but I think that would be a really standout feature if we could do that as well. So everyone is kind of digging into this time indexing that pandas provides and the time series support.
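To make the "time series with point geometries attached" idea concrete, here is a minimal sketch in plain Python. It is not MovingPandas code: the `Fix` record and the `speeds` helper are hypothetical names introduced for illustration, and the coordinates are assumed to be in a local or projected system with metres as units.

```python
from dataclasses import dataclass
from datetime import datetime
from math import hypot


@dataclass(frozen=True)
class Fix:
    """One observation: a timestamp with a point position attached."""
    t: datetime
    x: float  # assumed: local or projected coordinate in metres
    y: float


def speeds(track):
    """Return the speed (units per second) of each segment between consecutive fixes."""
    # Time is the index, so order the observations by timestamp first.
    ordered = sorted(track, key=lambda fix: fix.t)
    result = []
    for a, b in zip(ordered, ordered[1:]):
        seconds = (b.t - a.t).total_seconds()  # assumes distinct timestamps
        result.append(hypot(b.x - a.x, b.y - a.y) / seconds)
    return result


track = [
    Fix(datetime(2022, 8, 24, 12, 0, 0), 0.0, 0.0),
    Fix(datetime(2022, 8, 24, 12, 0, 10), 30.0, 40.0),
    Fix(datetime(2022, 8, 24, 12, 0, 30), 30.0, 100.0),
]
segment_speeds = speeds(track)
```

In a pandas-based library the timestamps would typically live in a datetime index and the positions in a geometry column, but the principle is the same: everything is ordered and queried by time first.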
And like you saw before, around half of the Python libraries that do movement analysis also use the geometry support from GeoPandas as a natural base to build everything upon. MovingPandas right now is in version 0.11.
So it's still in a very experimental stage, I would say. I'm still trying to grow the whole community. There's a couple of contributors that you will see on one of the next slides. It has really good coverage in tests and documentation, which I'm quite happy about.
And something that I found very helpful in the beginning of the development process was also this pyOpenSci review, which is like a volunteer effort: if you're doing scientific library development and you want some feedback from the community, it's a pretty good process to go through. They look into how the project is set up, how the code is structured, how the whole CI is set up
so that you really have something that others can contribute to easily enough. So I would encourage you to try that as well. At least for me, it was very helpful and I learned a lot from the process. The preferred installation for MovingPandas is via conda,
particularly for Windows users. I haven't tried lately whether it would be possible to already pip install it, but at least in the past, it just did not work. So I'm right now recommending everyone use conda-forge for the installation.
There's actually two main repos for the project. There's the one with the library, with MovingPandas, where you can see we currently have 24 different contributors that have already contributed in one way or another. And then there is a second repository where you will find all the example notebooks.
So the documentation is pretty lean, but if you go to the example notebooks, there you will find a lot of either tutorials that focus on explaining how a certain functionality works. That is one set of the Jupyter notebooks that are in there. And then there's a second collection of Jupyter notebooks
that take a particular data set, for example a ship tracking data set or a bird migration data set, and go through all the common steps that you would usually do when you first try to explore, clean and understand a data set like this.
So that's why this is in a separate MovingPandas examples repository. And here you can already see these two subfolders with all the different examples. Here it also uses all the freedom from GeoPandas and Fiona that we have to read different file formats.
So the OSM traces are the GPX files that you get from the OpenStreetMap project from their services, whatever people upload in public. Then these sports tracking things are just CSV files, and the horse collar is a GeoPackage. So it kind of shows users who are not so familiar
with GIS software as well that they can just use all these things. And when you go to one individual notebook, you can either launch it in Binder so that you don't have to install anything. You just use it online and you can step through all the processing steps.
Or, of course, you can get the original IPython notebook, the ipynb file, or you can view the rendered version in HTML. So if you don't have time and you just want to see the results, you can go there. Also all the interactive graphics, they still work: you can zoom in, you can move around
and experience and see whether the functionality provided there would be something for your project. I already mentioned that we of course also have a standard documentation on Read the Docs. And here you can see a bit which classes are available in the library. We have a class for individual trajectories.
We have a class for trajectory collections, which is basically just a glorified list of trajectories. Then we have a class that deals with generalizing trajectories. So you put in a whole bunch of different trajectories and you can generalize either purely geometrically, like a Douglas-Peucker algorithm would do.
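As an illustration of what purely geometric generalization does, the following is a minimal, self-contained sketch of the Douglas-Peucker algorithm in plain Python. It is a textbook version written for this transcript, not the MovingPandas implementation; the function names and the `tolerance` parameter are assumptions made for the example.

```python
from math import hypot


def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b (all are (x, y) tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, tolerance):
    """Drop vertices that lie within `tolerance` of the simplified line."""
    if len(points) < 3:
        return list(points)
    # Find the vertex farthest from the straight line between the end points.
    index, farthest = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], points[0], points[-1])
        if d > farthest:
            index, farthest = i, d
    if farthest <= tolerance:
        return [points[0], points[-1]]
    # Keep that vertex and simplify both halves recursively.
    left = douglas_peucker(points[: index + 1], tolerance)
    right = douglas_peucker(points[index:], tolerance)
    return left[:-1] + right


line = [(0, 0), (1, 0.1), (2, 0), (3, 3), (4, 0)]
simplified = douglas_peucker(line, tolerance=0.5)
```

Here only the vertex `(1, 0.1)` is dropped, because it deviates by just 0.1 from the line between its neighbours. Note that this version ignores timestamps entirely, which is exactly what "purely geometrically" means.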
But there are also generalizers that are temporal or spatiotemporal, and they remove the extra nodes based on different kinds of rules. With the aggregator, on the other hand, you throw in a trajectory collection and it will merge sub-trajectories together.
So it will give you like an aggregated view of the whole data set. We have an example later in the slides as well. Of course the other thing is you don't want to only aggregate trajectories. Quite often you also get a continuous observation of some moving object and you want to actually split it
into individual instances, into individual trips. For example between stops, or you want to have one trajectory per day or per hour. That's what the trajectory splitter is for. It has different sets of rules, either by stops or by time, that you can use. So if you want to split by stops,
it actually uses the stop detector which is another class which will try to detect when the moving object stays within a certain area for at least a minimum amount of time and it puts a stop there. And of course you can also smooth trajectories. If there are too many outliers jumping around
because of GPS error or whatever, then you can put it into the smoother. And I decided against showing you too much code, but I wanted to focus on another aspect of these libraries, and that for me is visualization, because I think particularly with spatial data there are so many potential errors
that you only see once you visualize the data. So it actually took quite a lot of time for me when developing MovingPandas to choose which visualization library I want to use, because there's a couple of options. We saw before that GeoPandas explore now has this preview with folium.
So folium is one option that we have. Another option that we have is GeoViews, which is based on the Bokeh library. Of course we have Matplotlib, particularly if we want static visualizations. And the fourth contender, which is also used there, is of course kepler.gl for the fancy, potentially 3D visualizations.
So these are the four visualization libraries that are used by the movement analysis libraries that I have here on the left. So also here there is no agreed-upon best solution so far, and they all live happily next to each other. This is what they look like, at least.
GeoViews and folium, both next to each other. Obviously they all provide access to some background maps, and I use the OpenStreetMap background tiles here to make it a fairer comparison. But then of course the differences are in the details. Like, everyone knows what a Leaflet map looks like,
so everyone is familiar with how to interact with it. A lot of people quite like that. On the other hand, on the left side you see what GeoViews looks like, based on Bokeh. The interaction is slightly different, which might put some people off, but it also has some really neat features
like being able to save the graphs, like being able to make linked graphs between either multiple maps right next to each other or other diagram types that are linked to the map like a scatter plot of some different values in the attribute table
which can be really useful for data exploration. Both of them have the ability of course to have pop-ups and they are interactive to zoom in and out. None of them is very good at drawing arrows. So for movement data it would be kind of useful to actually indicate the direction of the movement by putting some arrowhead or something on the line
and I'm a bit spoiled by QGIS's cartographic capabilities. None of these libraries comes even close; it's like a poor man's data visualization in that sense. There's still a lot of things to do in the cartography. But one thing that I couldn't do in QGIS
is just making these small multiples right in the map view, right? So here these are actually linked. So if I zoom in on the left example, the other ones zoom in as well, to the same extent, all the time. And here you can just see the different versions of cleaning and smoothing that are implemented with a Kalman filter
and with other outlier removal functions that you can use. This is an example of the stop detection, where on the left-hand side you can see the original trajectory with the locations of the points, and on the right-hand side what you get if you use those points to actually split up the trajectory into individual shorter ones.
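The stop detection rule described earlier (the object stays within a certain area for at least a minimum amount of time) and the splitting at those stops can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions, not the MovingPandas implementation: the function names, the `max_diameter` and `min_duration` parameters, and the `(timestamp, x, y)` tuple layout are all chosen for this example, and the brute-force diameter check is only suitable for small tracks.

```python
from datetime import datetime, timedelta
from math import hypot


def diameter(fixes):
    """Largest pairwise distance between the positions of the given fixes."""
    return max(
        (hypot(a[1] - b[1], a[2] - b[2]) for a in fixes for b in fixes),
        default=0.0,
    )


def detect_stops(track, max_diameter, min_duration):
    """Return (start_index, end_index) pairs of stops in a time-ordered track.

    Each fix is a (timestamp, x, y) tuple. A stop is a run of consecutive
    fixes that stays within `max_diameter` for at least `min_duration`.
    """
    stops = []
    start = 0
    while start < len(track):
        end = start
        # Grow the window while the object stays within the allowed area.
        while end + 1 < len(track) and diameter(track[start:end + 2]) <= max_diameter:
            end += 1
        if track[end][0] - track[start][0] >= min_duration:
            stops.append((start, end))
            start = end + 1  # continue searching after the stop
        else:
            start += 1
    return stops


def split_at_stops(track, stops):
    """Return the moving parts of the track between the detected stops."""
    trips, cursor = [], 0
    for start, end in stops:
        if start - cursor >= 1:  # a trip needs at least two fixes
            trips.append(track[cursor:start + 1])
        cursor = end
    if len(track) - cursor >= 2:
        trips.append(track[cursor:])
    return trips


t0 = datetime(2022, 8, 24, 9, 0)
positions = [(0, 0), (100, 0), (200, 0), (202, 1), (201, -1), (200, 1), (300, 0), (400, 0)]
track = [(t0 + timedelta(minutes=i), x, y) for i, (x, y) in enumerate(positions)]

stops = detect_stops(track, max_diameter=10, min_duration=timedelta(minutes=3))
trips = split_at_stops(track, stops)
```

In this toy track the object lingers around `(200, 0)` from minute 2 to minute 5, so one stop is found and the track is split into the trip before it and the trip after it.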
This is the aggregator that I was talking about. So we have migration data of birds from Scandinavia to Africa and back. And where the line is wider, there are more birds that travel along the same route. Here you can see another shortcoming of the plotting libraries, by the way,
because I have movement in both directions, right? Because they're migrating north-south and south-north back but I cannot offset the line to the right and left to properly distinguish between the movement directions. So now it's just both directions plotted on top of each other which is ugly but right now that's what I'm left with.
I would have to separate it between one map for summer and one map for winter I guess. What's also particularly convenient because I said I want it to be general purpose is that you don't have to use background tiles. You can also use any image that you have
with your local coordinates. So I just loaded this PNG file of a soccer pitch for the movement data analysis. And I can then use the same functionality that I used before for geographic data here with these local coordinates. And again I would really love to be able to put an arrow on these lines.
Right now I just have an orange marker that indicates the start position of the player, but it's an ugly hack to be honest. You can also see how I put the layers on top of each other, because there's actually some code here. You just have the image, then there's a star sign, and then there's one layer
and the next layer is put on top. Again we're putting the star sign there. So it's very easy compared to Matplotlib, where you always have to throw the axes objects around to put stuff on top of each other. Here it's just always the star sign in between. And I find it very convenient to read. So I'm a big fan of that.
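The "star sign" is Python's `*` operator, which HoloViews-based libraries such as GeoViews use to overlay plot elements. How such an operator can compose layers is sketched below with two toy classes. These are hypothetical stand-ins written for this transcript, not the HoloViews classes, and the layer names are made up.

```python
class Layer:
    """A named plot layer; hypothetical stand-in for an image or a track."""

    def __init__(self, name):
        self.name = name

    def __mul__(self, other):
        # `a * b` means "draw b on top of a" and yields an overlay.
        return Overlay([self]) * other


class Overlay:
    """An ordered stack of layers, drawn bottom to top."""

    def __init__(self, layers):
        self.layers = list(layers)

    def __mul__(self, other):
        # Appending keeps the left-to-right reading order as the draw order.
        extra = other.layers if isinstance(other, Overlay) else [other]
        return Overlay(self.layers + extra)

    def draw_order(self):
        return [layer.name for layer in self.layers]


pitch = Layer("pitch image")
player_track = Layer("player track")
start_marker = Layer("start marker")

plot = pitch * player_track * start_marker
```

The expression reads left to right as bottom to top, which is why no axes object has to be passed around: the operator returns a new overlay each time.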
Another thing that is really neat with this, with GeoViews and the whole HoloViz environment, is that it's also very convenient to build these kinds of data exploration apps, where you have other UI elements that you can specify with just one or two statements in the code
and you can have your users play around with the algorithm and evaluate whether they like it or not. You get, as I said, maps. Here in this example I also have a table linked to it that shows you more information about the stops. So I find this a really interesting solution
for talking with clients about the data analysis and how to advance with it without having to write too much extra code that is then just thrown away if they don't like it. Another thing that is related to this whole endeavor is of course data exchange. So we don't really have any standardized formats
maybe except for GPX from GPS trackers. But there's this initiative right now at the OGC, it's been going on for a couple of years, but it still needs to gain some traction I think. It's called Moving Features. So they are trying to think about what should be the standard exchange formats,
what should be the standard functions that should be available in these kinds of tools, so that we have a common nomenclature to talk about. Like, everyone knows what a buffer is if we talk in geospatial, but in movement data analysis every team calls the same thing different names. So I hope standardization will also be
one of the next things. And of course I'm also very busy with bug fixing still, because it's in a very early stage and also a lot of the underlying libraries are in very quick development as well. So what's in the future for us? I already mentioned the Mobility Engine, Open Source, the MEOS project, which will hopefully
be a C++ library that can serve as a back end for a lot of these analytics libraries. So it is going to extract the analytical capabilities that are now in MobilityDB and make them generally available. And we are also looking into how to provide the Python bindings naturally. So I think this idea of using the Arrow format
might come in pretty handy maybe. If you're interested I've put the links here also to these projects. And if you want to know more about the scientific side of mobility data analysis I invite you to go to my website. There's a couple of talks from the last years
that have been recorded. And then you can dive deeper into the topic or just get in touch with me during the conference till Friday evening. And with that I want to thank you for coming and I'm looking forward to your questions.