GRASS GIS 7: Efficiently processing big geospatial data

Speech transcript
... whatever is of interest to you. There is also a graphical representation. In this plot you see point data, then continuous but interrupted time series, and so forth. You get an overview and can see at a glance whether a time series is complete or not, which is particularly interesting if you are dealing with millions of points, for example, or with a long time series. This plot here shows chlorophyll versus time. These are MODIS observations from the southern hemisphere, which have been analyzed to see how the chlorophyll evolves there over the various years. Next I will show the MODIS land surface temperature reconstruction we have been doing. Before doing so, a few words about visualization.
You have already seen the animation tool included in GRASS GIS 7. On this slide a time series has been animated. It comes from multi-annual time-series observations made along the coastline in North Carolina. These data are publicly available. In Portland last year we gave a workshop on that, and you can search for this data set and do the exercises yourself, so it is pretty easy to get something like this. You can see how this dune is moving over time because of sand transport by wind, and you also see houses being built or even disappearing, probably destroyed by some bad weather event, and so forth.
Another point is multi-temporal data analysis. This is the tsunami event in Japan in 2011. For disaster management you can look at the situation before and after the event: you have a slider and can visually compare what happened, to get an idea of the impact in a graphical way.
Williams has already shown how to look into a volume, which is not so easy. There are two options. One is to make slices in any direction, which you can see over there. The other is a semitransparent visualization of your volume content. You can then move around.
If you have this kind of data, like here in North Carolina, where much of this has been developed, you can get a coastline visualization that is really something like realistic [...].
OK, connecting to other software, which is also of interest.
GRASS has been added to the Processing toolbox. I don't want to go into detail; many of you know that. We have updated the toolbox for GRASS 7, so I think in the next release there will be a GRASS 7 entry as well, so that you can go for that.
Published only, I think, 24 hours ago is the SP GRASS 7 extension for R, so you can now directly connect GRASS and R as before, but with the new GRASS 7 version. I just made a plot of elevation versus geological classes. You get your data into the R space (raster and vector are both supported) and you can draw a boxplot and so on. Like this you can really do sophisticated statistics in no time.
New in 7 is WPS support. If you want to define WPS processes, there are different software packages supporting GRASS 7: ZOO-Project, PyWPS and 52°North all come with GRASS providers, and maybe there are more, I don't know. The interesting part is that each command can describe itself in this XML style here. This also applies to your own scripts: if you write a script and make use of this kind of parser, for example by just cloning an existing script, which is pretty easy to set up, it will also deliver this kind of information, which you can then integrate into your workflow.
This is just a quick view of what you can do. You don't always want to import your data, because this duplicates disk space: if you have something like 1 terabyte and you import it, you get another terabyte of space consumption, which is not that fun. For this there is a command called r.external (v.external exists as well). You just register the external data set, which could be a GeoTIFF or whatever, in your GRASS location, which you can also create automatically from the original data set. Then you define that as output you want GeoTIFF. At this point you are saying: the original data set is there, you don't import it, you just tell GRASS where it is, and everything which is calculated will be saved as GeoTIFF in this case. Then you do your computation. You can see here that I put the file ending on the output name: the new map is equal to some function, and the result immediately appears as a GeoTIFF in the directory specified here. So you no longer have to bother with import and export. This is especially interesting for WPS: the data just go through, you get your GeoTIFF or whatever format you prefer, and then you can make use of it.
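The link-instead-of-import workflow described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the commands shown on the slide: the Python code only assembles the command lines and does not run GRASS. r.external, r.external.out and r.mapcalc are GRASS GIS 7 modules; the file paths, map names and the map algebra expression are hypothetical placeholders.

```python
# Sketch only: assembles the command lines of a link-instead-of-import
# workflow. Paths, map names and the expression are hypothetical.

def build_link_workflow(input_tif, out_dir, in_map, out_map, expression):
    """Return the GRASS commands as argument lists, in execution order."""
    return [
        # Register the external raster in the current location without copying it.
        ["r.external", f"input={input_tif}", f"output={in_map}"],
        # Route all raster maps written from now on to GeoTIFF files in out_dir.
        ["r.external.out", f"directory={out_dir}", "format=GTiff"],
        # The result of this computation is written directly as a GeoTIFF file,
        # which is why the output name carries the .tif ending.
        ["r.mapcalc", f"expression={out_map} = {expression}"],
        # Switch back to writing native GRASS raster maps.
        ["r.external.out", "-r"],
    ]


commands = build_link_workflow(
    input_tif="/data/elevation.tif",
    out_dir="/data/results",
    in_map="elevation",
    out_map="elevation_doubled.tif",
    expression="elevation * 2",
)

# Each list could be handed to subprocess.run() inside a GRASS session;
# here the commands are only printed.
for argv in commands:
    print(" ".join(argv))
```

Keeping the commands as argument lists avoids shell quoting problems with the map algebra expression, which contains spaces.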
On programming: there is the new Python API, which I don't show here because it is being shown in the next talk, so just stay in the room. You can also read the slides later on, I think, from the website.
Now some words about massive data support. What is massive? Just quickly: limiting factors are memory and, if you have lots of data, processing time. Disk space is something which nowadays I would consider solved, in a period of terabytes and going towards petabytes maybe, and large file support is also no longer an issue. What is an issue, and what can be solved in the software itself (this obviously applies to any software), is making it faster. Here is an example of a query: how much time it takes as you increase the number of points. Say you have a lidar point cloud and you want to query something within 10 million points; it should be fast. You can see the difference between 6 and 7: it is now really fast, and this is due to the new vector format which has been implemented. So the GRASS vector engine has been improved quite a bit, and you can also easily upgrade between both formats. The same goes for computational time in the raster world: this cost surface calculation in GRASS 6 showed nonlinear growth in time consumption, which has been turned into a linear problem, and that is quite a bit better. You see my small laptop here, so this is nothing fancy, but I can do a PCA, that is, a principal component analysis, of 30 million points in, what did I write, 6 seconds on this machine. Try this in some other software and I think it would take a little bit more time.
So what have we done? We have been using MODIS land surface temperature data, which is a known example of a large data set [...]. (If you want to move over there to see more, that is no problem for me.) This is Europe; you can hardly see it. This is one particular overpass. It has been cloud-contaminated, and what we wanted to do is reconstruct the values which are not there. This is a fairly complex algorithm which we have published in this paper here. From there to there, everything is done in multiple steps: outlier detection, multiple regression (multiple regression is also in 7), and you eventually get out this map. It looks like magic to get from here to there, but what we do is not consider only the single map. We look back and forth in time and apply weights: the closer we are to the observation itself, the more weight we give, and the further away we go, the less. So maybe on the day before or the day after there were no clouds in this particular pixel. We also take the seasons and their change into account, of course, which naturally has to be considered here.
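The idea of weighting observations by their temporal distance can be illustrated with a small sketch. It is not the published reconstruction algorithm, which according to the talk also involves outlier detection and multiple regression. It assumes a simple weighted mean with weights of 1/|lag| for a single pixel; the function name `fill_gap`, the `max_lag` parameter and the sample values are hypothetical.

```python
def fill_gap(series, index, max_lag=3):
    """Estimate series[index] from valid neighbours within max_lag time steps.

    Missing observations are None. Weights are 1 / |lag|, so the day before
    and the day after count most. Returns None if no valid neighbour exists.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for lag in range(1, max_lag + 1):
        for neighbour in (index - lag, index + lag):
            if 0 <= neighbour < len(series) and series[neighbour] is not None:
                weight = 1.0 / lag
                weighted_sum += weight * series[neighbour]
                weight_total += weight
    if weight_total == 0.0:
        return None
    return weighted_sum / weight_total


# Hypothetical daily temperatures (deg C) for one pixel; None marks a cloudy day.
pixel_series = [10.0, 12.0, None, 14.0, 16.0]
estimate = fill_gap(pixel_series, 2)
print(estimate)
```

In this sample the two adjacent days carry weight 1 each and the days two steps away carry weight 0.5 each, so the nearest observations dominate the estimate.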
So this is an example: one map out of 17,000 maps at the time. We have been processing the entire archive covering Europe. Each map has something like 450 million pixels, and to calculate this map we have 9 different input maps, so multiplying this by 9 we are close to 4 billion pixels at this point. This is something which you can now easily do in GRASS 7. And this is now the animation of monthly averages out of the 17,000 maps. Including the averaged data, this is approximately 20 terabytes which we have been generating.
We used our cluster for this. Due to lack of time I won't speak about the technical side too much, but just to give you an idea, this is what we have been setting up, and I would be happy to discuss it if you are doing similar things. Yesterday there was a talk about the Gluster file system; we are also using the Gluster file system here, with small low-cost boxes, each of which contains four hard disks of 3 terabytes. This part here is already something like 96 terabytes for the raw data storage. Then we have the chassis here connected to the front-end node, and using a job manager we are doing the computation here. We also have two high-speed devices for the GRASS data management and so on. So if you are interested, we can discuss.
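The figures quoted above can be checked with a few lines of arithmetic. This is a back-of-envelope sketch using only the rounded numbers from the talk; the number of storage boxes is derived here and was not stated by the speaker, and it ignores any redundancy or file system overhead.

```python
# Rounded figures as quoted in the talk.
pixels_per_map = 450_000_000   # "something like 450 million pixels" per map
input_maps = 9                 # input maps needed to calculate one output map
disks_per_box = 4              # hard disks per low-cost storage box
tb_per_disk = 3                # terabytes per disk
raw_storage_tb = 96            # quoted raw storage capacity

# Pixels touched when computing a single reconstructed map.
pixels_per_job = pixels_per_map * input_maps

# Capacity of one box, and the box count this implies (derived, not quoted).
tb_per_box = disks_per_box * tb_per_disk
implied_boxes = raw_storage_tb // tb_per_box

print(f"pixels per job: {pixels_per_job:,}")
print(f"capacity per box: {tb_per_box} TB")
print(f"implied number of boxes: {implied_boxes}")
```

The product of 450 million and 9 is 4.05 billion, which matches the "close to 4 billion" mentioned for one job.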
Some big data challenges. This is something I have been doing for many years now. We always had the problem of saturating connections, such as the internal network connection, and of tuning the TCP protocol for that. Then we exceeded the ext3 specifications, so we switched to [...]; then we exceeded those specifications as well, and so forth. The more data you get, the more this becomes a problem [...]. Now, data processing: I think things have changed a bit. Ten years ago, maybe 15, you were lacking data; now we are almost overwhelmed by data, which is a nice problem to have, but we need to get our hardware and software right. This was a nice benchmark for us to see whether GRASS can handle this kind of data, and we would say we can do so now. I already mentioned the issue of running the computations in parallel: having something like 4 billion points in one job is one thing, but when you launch, let's say, a few of them in parallel, then you really know whether your I/O works.
So where is the stuff? Everything is ready to use. We are currently at release candidate number 1, so probably (what is today, Sunday?) in 2 or 3 days we will release the next release candidate, and this is hopefully also the last one. You get free sample data to play with, including the time series which I have already mentioned, along with a tutorial by the way, so that you can easily explore climate data analysis or lidar time series, which not everybody has at home. You can just download it from there and find out about the new features on this dedicated page, which is also linked everywhere. You are welcome to test it out, and if you are a user of GRASS 6, consider upgrading rather sooner than later. Thank you.

Metadata

Formal metadata

Title GRASS GIS 7: Efficiently processing big geospatial data
Alternative title Geospatial - Grass 7
Series title FOSDEM 2015
Author Neteler, Markus
License CC Attribution 2.0 Belgium:
You may use and modify the work or content for any legal purpose and reproduce, distribute and make it publicly available in unchanged or changed form, provided that you credit the author/rights holder in the manner specified by them.
DOI 10.5446/34394
Publisher FOSDEM VZW
Release year 2016
Language English
Production year 2015

Content metadata

Subject area Computer science
