
JuliaGeo: A Fresh Approach to Geospatial Computing

Formal Metadata

Title
JuliaGeo: A Fresh Approach to Geospatial Computing
Number of Parts
295
License
CC Attribution 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Abstract
2018 saw the release of [Julia](https://julialang.org/) 1.0, a high-level, high-productivity programming language, with the performance of a low-level language like C. That means looping over all features or cells and applying your own functions is encouraged. The [JuliaGeo](http://juliageo.org/) organization was first started in 2015 to provide support for working with geospatial data in the Julia programming language. JuliaGeo currently provides high-level APIs with comprehensive support for OSGeo libraries like GDAL, GEOS and PROJ. In this talk, we introduce Julia and the JuliaGeo packages, showcase interactive workflows, and talk about next steps.
Transcript: English (auto-generated)
I'm personally quite excited about this presentation. It's the first time we have a presentation on Julia at FOSS4G. So I hope this is just the beginning of something that can grow to be another way to use FOSS4G,
another way that we can use open data. So Martijn Visser and Maarten Pronk will introduce us to Julia. Good luck. Thank you. We're also quite excited to present here for the first time at FOSS4G.
We presented before at JuliaCon last year. But that's kind of preaching to the choir. Now, for the first time, it's nice to present to the wider geospatial community. So I wonder, show of hands please. Like, who knows what Julia is? Who's heard about it?
Okay, that looks very good. Who's actually installed it and given it a try? Even if it was just, ah, also quite a few. So we're going to present about JuliaGeo. JuliaGeo is a GitHub organization where a bunch of us work together
on creating packages that make it easier to use Julia for geospatial applications. We're not only going to talk about packages that are strictly inside this GitHub repository,
but in a wider sense as well. So first off, let's introduce ourselves. Who are we? Martijn and Maarten; we are not the same person. Dutch names.
So we both work at Deltares, but I just want to be clear: I will explain a bit how we came to it and how we are using it at Deltares. But in this sense we are presenting it more from the community, from the wider group of people that develop these packages.
Personally I'm a hydrologist and Maarten is more from a geoinformatics background. And those are our GitHub handles. So context. Basically at Deltares I'm guilty of being the first one to introduce Julia within projects.
And that project was about processing point clouds. And it was clear very early on that existing software wouldn't work for us because we had a very clear idea that we wanted to do our own kind of algorithms
that work on a point-by-point level. Myself, I'm mostly familiar with Python. I program in Python a lot. I enjoy it. But for this application if you really need to iterate over points, like billions of points, it's really hard to scale it.
Of course there are some solutions like Numba or Cython. But it's also difficult. So basically I could start prototyping it in Python and get something together quite quickly.
But I knew it would be slow to run. I don't know how to program C++ myself, like many people that come from an earth science background. So I would have to, then if I'm done prototyping, ask somebody to make it fast basically.
And the idea of Julia is that it solves this two-language problem, so to say. The fundamental thing they state is there's no need for there to be a separation between fast to implement and fast to run.
So it's a dynamic language. It's general purpose, but it was specifically designed to be good at scientific and numerical computing. And it does that by compiling functions to native code, so it runs at a speed similar to C.
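As a rough illustration of the point-by-point style described here, the following sketch loops over a made-up array of point elevations in plain Julia. The function name `count_above` and the data are hypothetical and not taken from the talk. No type annotations are needed; Julia compiles a specialized native version for each argument type the function is called with.

```julia
# Hypothetical example: count the points whose elevation exceeds a threshold.
# A plain loop like this is compiled to native code the first time it is called.
function count_above(elevations, threshold)
    n = 0
    for z in elevations
        if z > threshold
            n += 1
        end
    end
    return n
end

elevations = [1.5, 3.2, 0.7, 4.8, 2.9]   # made-up elevations in metres
n_high = count_above(elevations, 2.0)
println(n_high)

# The same untyped function also works for other element types, such as Float32.
n_high32 = count_above(Float32[1.5, 3.2, 0.7], 2.0f0)
println(n_high32)
```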
Version 1.0, so the first stable release, was only released last year. But I think now, since a week ago, it's already been 10 years since the first commit of Julia started. And it started at MIT. So it's an open source effort.
It's licensed under the MIT license, similar to GDAL itself. And so the ecosystem of packages, now that the language is stable, the packages are themselves still evolving and getting better all the time.
So what does it look like? Just a small slide of some syntax just to show you it's not some kind of scary syntax. Those familiar with MATLAB or Python syntax might recognize a couple of things. This is how we print things. As you see here in the VFA, this is a two-dimensional array.
So we talk about types a lot. But the nice thing about Julia is that it doesn't force you to talk about types at all. So you can leave out like, OK, I don't care what type this N is.
And it will just compile on the fly for the right types. And yeah, so one step back, more about the interest for using Julia in the geosciences. So of course, there's always, we have a lot of public channels.
This is from the old mailing list; now we're on Discourse. From 2014, back then, an email from Fabian Gans from Germany, just asking people like, hey, who else is interested in using Julia for this? And can we maybe get together and get a bit organized?
So this got the ball rolling a little bit. And I think half a year later, we started this JuliaGeo GitHub organization. So first, discussions, like, OK, let's make Julia nice for this.
What should we do? And I think a lot of us agreed from the start that to get to a useful kind of level, we need to just start wrapping the big OSGeo projects, like GDAL, GEOS, PROJ, and NetCDF.
Well, that's not OS Geo, but because there's a lot of very useful functionality already implemented, we can try to implement that ourselves. But why? It will take a very long time before we can be nearly as productive.
And at the same time, for those who, for some reason, cannot or don't want to rely on binary dependencies, people are working on native packages. OK, thank you, Martijn.
So what can we do now at the moment with JuliaGeo? For that, I'm going to switch to a Jupyter-like notebook. So here, basically. And I think one of the main nice things about Julia is that you have a built-in package manager.
So in the Julia environment, if you type the bracket and say add Proj4, it will download Proj4, the Julia library, but it will also download a precompiled library for you, depending on which operating system you're on. It will work on Mac, it will work on Linux and Windows. So it basically adds the library, and you can start using it right afterwards.
No hard installation things. You can look up documentation about the projections that are actually in there. We can define two common projections and eventually transform a coordinate between those two projections.
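The notebook itself is not part of this transcript. A minimal sketch of such a workflow, assuming the Proj4.jl package and its `Projection` and `transform` functions as they existed around the time of the talk, could look like this. The two PROJ strings and the coordinate are chosen only for illustration.

```julia
using Pkg
Pkg.add("Proj4")   # same effect as typing `]add Proj4` in the Julia REPL

using Proj4

# Two common projections defined from PROJ strings (illustrative choices)
wgs84 = Projection("+proj=longlat +datum=WGS84 +no_defs")
utm56 = Projection("+proj=utm +zone=56 +south +datum=WGS84 +units=m +no_defs")

# Transform one longitude/latitude/height coordinate (degrees, metres)
# into UTM zone 56 south (metres).
lonlat = [150.0 -27.0 0.0]
xy = transform(wgs84, utm56, lonlat)
println(xy)
```

The binary PROJ library is downloaded together with the package, so no separate system installation is assumed here.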
And in a few lines of code, you basically have everything working. If we go back. And this also works for GDAL, this works for GEOS, this works for NetCDF. So no more spending days on getting GDAL to work on Windows in Python.
This should work out of the box. Of course, I'm omitting many packages here now, so please take a look at the GitHub repository. And see if you recognize some packages there that you could use. So let's talk about what's nearly there, what we're working on now.
These are open pull requests, and they're being reviewed. So as we've seen before in the Opera house next door, we have PROJ 6 going on and GDAL 3 releases. And these need to be wrapped. There are new interfaces there that need to be tested for us.
So I can also show you that if I go back now. By the way, this is a public notebook, you can find the link later. In the package manager in Julia, you can also install a git branch. It's basically a git package manager. So here we add with the wrap6 version, we install another branch, which actually gets you the latest PROJ version.
Basically, it works exactly the same. Again, start using it, and here you can see that we use the new interface, proj_create, to actually create a projection. So this is a new interface.
And secondly, for example, we produce the WKT2 version of this projection. And this works pretty well already. And later on, we want to make this wrapping a bit more convenient, more Julian. Because in the C library, there are no optional arguments or keyword arguments.
So we'll put those in ourselves. For the rest, all the wrapping is basically automatic. So GDAL, we didn't produce that much code ourselves. It's just wrapping all the C libraries. So this will be in the next release of these Julia packages, probably in a month or so.
From then on, from a month later, talking about a year from now, where do we want to go with the JuliaGeo ecosystem? We have a stable language now, and now we want to find out what we can actually do with it. So make use of the language itself.
I think first we need some better high-level documentation, so that you actually have one site where you can find all the spatial packages, so you don't get lost. We can make some more Julian packages, because wrappers are nice, but they're very C-like. We can add more GDAL drivers. There's a lot of drivers out there, more added every day.
And I think on the high-level side, plotting and things like data cubes and more of those hypercubes would be nice to have, which is something we're working on, but it hasn't crystallized yet, I would say.
Now I want to talk about one strong thing about the Julia language, which is the definition of common interfaces. And to explain this, I will make use of the Tables.jl package in Julia. So nothing geospatial just yet. It's just a definition of which operations should be defined on a table.
I can demo this as well. So here we install in Julia the CSV package, the DataFrames package, and the TypedTables package. These are three different definitions of what a table is.
And you can imagine that one is very generic and high-level and easy to use. One is more used for speed, and another one is just for a very specific table type thing. So there will always be different types of one base thing. But the nice thing is, in the Tables.jl package, we defined how you can talk to a table and what you should get back.
So basically how to get it in and out. So if you look at this, we create a CSV, just a string, and read it in as a CSV. And now you get a CSV type. But then you say, using DataFrames, and just say to the data frame, now load that CSV that we just created.
And now it's a data frame. You can also do this in Python, but now you need both packages to work together and know of each other, and start subclassing things, and you need to know a lot.
You can actually go on and say, using TypedTables, and now use that data frame as a typed table, and now you have another type already. So it's very easy to come up with your own type, define a few common operations on your own type, and now it's understood by the complete ecosystem. You can imagine that if you start doing this for geospatial-related stuff, that would be wonderful.
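The round trip described in this demo can be sketched as follows, assuming the CSV.jl, DataFrames.jl and TypedTables.jl packages are installed. The CSV content and variable names are made up; the conversions rely on each package implementing the Tables.jl interface.

```julia
using CSV, DataFrames, TypedTables

# A small CSV document held in a string (made-up data)
text = """
name,elevation
a,1.5
b,3.2
"""

csv = CSV.File(IOBuffer(text))   # first table type: a CSV.File
df  = DataFrame(csv)             # second table type: a DataFrame built from the CSV.File
tt  = Table(df)                  # third table type: a TypedTables.Table built from the DataFrame

println(typeof(csv))
println(typeof(df))
println(typeof(tt))
```

None of the three packages needs to know about the others; each only has to say how its rows or columns can be read and how it can be built from them.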
So you can introduce your own point type. We all differ about what a point should be. Should it be a C pointer to GEOS? Is it a JSON thing? Is it just an array without labels? If they just define some common operations on them, we can use your own point type, or polygon, or raster, across the whole language.
And then you can really get your iteration speed in terms of development going. And we hope to include all the geo-related stuff with the different ecosystems that there are already. So images, geometry, and the data environments.
Okay. Thank you. But before I clap, I want to thank all the other contributors to the JuliaGeo ecosystem. We've put up the GitHub handles here. There are many more people who contributed. You know who you are. And please, join us on JuliaGeo.
See if you like something. Make pull requests, issues, and we're happy to help. Thanks. So, questions? Yes, I will come to you and give you the mic.
Are there any downsides using Julia? Oh, difficult question. Probably not. I'd say it's mainly the ecosystem now. So, we have these packages, and they're still young.
We're still trying to crystallize what such a common interface would look like. And if you search for your specific use case, you probably can find a package. And if you can't, you really have to write your own. So it's a factor still smaller than the Python ecosystem, for example. But I would say, come join us.
Okay. Maybe one thing to add to that, because a lot of people say, yeah, Julia is fast, and then they try it. And then the first time they call a function, for instance. They're like, hmm, this is not that fast. So how it works, what happens is the first time you run a function, it looks at the types, and it compiles the code that is necessary.
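A small sketch of this first-call effect, using only Base Julia, is given below. The function `sumsq` is a hypothetical stand-in, and the measured times depend on the machine and Julia version, so no specific numbers are claimed.

```julia
# Hypothetical function used only to illustrate first-call compilation latency.
sumsq(xs) = sum(x -> x^2, xs)

xs = rand(1_000)

t_first  = @elapsed sumsq(xs)   # includes compiling sumsq for Vector{Float64}
t_second = @elapsed sumsq(xs)   # reuses the already compiled code

println("first call:  ", t_first, " s")
println("second call: ", t_second, " s")
```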
And the second time you call it, that code is already created, so it will be fast. And the core developers, who now form their own company, Julia Computing, it's called, they have this now as the highest priority to reduce the latency
and make the compilation faster. If I'm right, it's also possible to just import Python modules in Julia. So what about Shapely and all this stuff? It's easy to just import, but then you don't have these performance improvements, I think.
Yeah, good question. So there's PyCall.jl, which allows good interop between Julia and Python. And one of the first main use cases: for instance, it's a lot of work to create a good plotting package.
Look at matplotlib, for instance. So the author, Steven Johnson, MIT professor, created PyPlot to just call the Python matplotlib and save a lot of work like that. In the case of Shapely, I haven't tried it, but it should work just fine,
except that Shapely is, of course, a wrapper for GEOS, and we already wrapped GEOS directly, so I don't think it would make much sense to go through Python to GEOS when we can go directly.
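As a minimal sketch of the interop mentioned here, assuming PyCall.jl and a working Python installation, a Python module can be imported and called from Julia like this. Python's built-in `math` module is used only as a stand-in; Shapely itself is not exercised.

```julia
using PyCall

# Import a Python module and call one of its functions from Julia.
pymath = pyimport("math")
root = pymath.sqrt(16.0)   # the Python float is converted back to a Julia Float64
println(root)
```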
Yeah, indeed. So you lose some performance. Indeed, that is just hard to overcome because of how dynamic Python is.
So I don't really know the language, and I'm wondering what is the story about interop, the foreign function interface. Are there any tools to import C headers, because you said that wrapping is more or less automatic?
And also, can you export functions to C code, just in case anyone wants to do that? And also the same question for C++, because that's usually problematic. So the thing for a foreign function interface, that's the easiest.
So there's a function called ccall in Julia, and with it you can directly call C functions from a shared library. This is what we use mainly, except that the code that makes all these ccalls is generated automatically by a package called Clang.jl,
which uses a compiler to figure out which functions are available, and we then only do some post-processing. For instance, we take the Doxygen XML from the GDAL repository, and we actually look up all the documentation and add it automatically.
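To make the `ccall` mechanism concrete, here is a minimal sketch that calls `strlen` from the C standard library rather than a geospatial library. The generated wrappers mentioned above would contain many calls of this general shape, but this particular example is only an illustration.

```julia
# Call the C standard library function `strlen` directly from Julia.
# ccall takes the function name, the return type, a tuple of argument types,
# and then the actual arguments.
len = ccall(:strlen, Csize_t, (Cstring,), "JuliaGeo")
println(len)   # prints 8, the number of bytes in "JuliaGeo"
```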
It's also possible to export functions to C, basically. You could have a look at PackageCompiler.jl for that. And for C++, there's actually this really interesting package called Cxx.jl
that basically allows you to directly call C++ from Julia, and also CxxWrap, which allows you to define this interface in a Boost.Python kind of manner.
But I haven't used the Cxx interface myself yet. Other questions?
Okay, thank you, Martin.