Towards Big Earth Data Analytics: The EarthServer Approach

Video in TIB AV-Portal: Towards Big Earth Data Analytics: The EarthServer Approach

Formal Metadata

Towards Big Earth Data Analytics: The EarthServer Approach
Title of Series
CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.
Release Date
Production Place

Content Metadata

Subject Area
Big Data in the Earth sciences, the Tera- to Exabyte archives, consists mostly of coverage data, where the term "coverage", according to ISO and OGC, is defined as the digital representation of some space-time varying phenomenon. Common examples include 1-D sensor timeseries, 2-D remote sensing imagery, 3-D x/y/t image timeseries and x/y/z geology data, and 4-D x/y/z/t atmosphere and ocean data. Analytics on such data requires on-demand processing of sometimes significant complexity, such as computing the Fourier transform of satellite images. As network bandwidth limits prohibit the transfer of such Big Data, it is indispensable to devise protocols that allow clients to task flexible and fast processing on the server. The EarthServer initiative, funded by EU FP7 eInfrastructures, unites 11 partners from computer and earth sciences to establish Big Earth Data Analytics. One key ingredient is flexibility for users to ask what they want, not impeded and complicated by system internals. The EarthServer answer to this is to use high-level query languages; these have proven tremendously successful on tabular and XML data, and we extend them with a central geo data structure, multi-dimensional arrays. A second key ingredient is scalability. Without any doubt, scalability ultimately can only be achieved through parallelization. In the past, parallelizing code has been done at compile time and usually with manual intervention. The EarthServer approach is to perform a semantic-based dynamic distribution of query fragments based on network optimization and further criteria. The EarthServer platform is built on rasdaman, an Array DBMS enabling efficient storage and retrieval of any-size, any-type multi-dimensional raster data.
In the project, rasdaman is being extended with several functionality and scalability features, including: support for irregular grids and general meshes; in-situ retrieval (evaluation of database queries on existing archive structures, avoiding data import and, hence, duplication); and the aforementioned distributed query processing. Additionally, Web clients for multi-dimensional data visualization are being established. Client/server interfaces are strictly based on OGC and W3C standards, in particular the Web Coverage Processing Service (WCPS), which defines a high-level raster query language. We present the EarthServer project with its vision and approaches, relate it to the current state of standardization, and demonstrate it by way of large-scale data centers and their services using rasdaman.
[Introduction partly unintelligible] ... from Jacobs University Bremen [unclear]. So I'm here to talk about the EarthServer approach to Big Earth Data. What we will see in this presentation is a brief overview of the project itself and of the open standards that we are using in the project, then something about the technical platform [unclear], and some demonstration of the services that are being built, from access to the datasets themselves [unclear].
EarthServer is a consortium of 11 partners from both computer and Earth sciences, putting together state-of-the-art elements of technology to build an infrastructure for serving scientific datasets effectively and efficiently, and in a flexible way. The real point of the technology is to provide additional services on top of data access [unclear]. So, the approach: we use database systems for server-side processing [unclear], and we visualize the results on the Web using 3-D clients and mobile clients, all of it based on open source and open standards.
To begin with, [unclear] the interoperability of data and services [unclear]. The system focuses its vision on the data model, which is the GML coverage model. What, basically, is a coverage? It is the digital representation of a space-time varying phenomenon [unclear]. The GML application schema for coverages provides a complete implementation [unclear], which can serve coverage data to and from other systems in an interoperable way. [unclear] gridded data, and actually not just maps but going n-dimensional: multi-dimensional grids like data cubes in space and time [unclear]. There are also irregular grids, which are quite complex [unclear]. Behind the coverage concept you will also find other kinds of data, like multi-point, multi-curve, multi-surface and multi-solid coverages.

Looking at the standard itself, this is the principle [unclear]: a coverage is basically a feature from GML [unclear], and it is defined by three main elements. The range set is the container of the actual values that you want to access; [unclear] all the pixel values are contained in this element. The values can be structured, so there has to be a way to deliver information about the structure of the values themselves: the range type tells you how the values are structured, and it can carry metadata about each band [unclear]. Then you have to locate the values in space, and that is done with the domain set. The domain set element comes from GML [unclear], and it also defines the type of coverage, for example a rectified grid [unclear]. It delivers the coordinates of the data and can take on different topologies [unclear]. So what is the idea behind having this model? It is the integration of data: a unified model that takes observations from very different kinds of sensors and puts them within a generic scheme for n-dimensional data [unclear]. On one side you can serve coverages from the sources, and on the other side you can access and process them, with the aim of interoperability [unclear].
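The three-part structure described here (domain set, range type, range set) can be sketched as a small data structure. This is only an illustration of the concept, not the GML schema: the class names, field names and the example values below are assumptions made up for the sketch.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Axis:
    """One axis of a regular grid in the domain set (hypothetical simplification)."""
    name: str      # axis label, e.g. "Lat", "Long", "t"
    low: float     # lower bound in coordinate reference system units
    high: float    # upper bound in coordinate reference system units
    size: int      # number of grid points along this axis


@dataclass
class Band:
    """One component of the range type: metadata about a band."""
    name: str
    unit: str


@dataclass
class Coverage:
    domain_set: List[Axis]               # where the values are located
    range_type: List[Band]               # how each value is structured
    range_set: List[Tuple[float, ...]]   # the values, one tuple per grid cell

    def cell_count(self) -> int:
        """Number of grid cells implied by the domain set."""
        n = 1
        for axis in self.domain_set:
            n *= axis.size
        return n

    def is_consistent(self) -> bool:
        """Check that the range set matches both the domain set and the range type."""
        return (len(self.range_set) == self.cell_count()
                and all(len(cell) == len(self.range_type) for cell in self.range_set))


# A made-up 2 x 3 grid with two bands per cell.
example = Coverage(
    domain_set=[Axis("Lat", 40.0, 41.0, 2), Axis("Long", 5.0, 7.0, 3)],
    range_type=[Band("red", "W/m2"), Band("green", "W/m2")],
    range_set=[(0.1, 0.2)] * 6,
)
```

The consistency check makes the division of labour visible: the domain set fixes how many cells exist, the range type fixes how many components each cell has, and the range set only carries the values.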
There are many standards that build on this coverage model, with the WCS core and its extensions, but the key aspect that we want to show is which part of the standards gives you flexibility for analytics and for accessing the datasets: a high-level query language approach, which allows you to write queries directly over your coverage model [unclear].

We do that with the Web Coverage Processing Service, which defines this query language on n-dimensional datasets [unclear]. What can you do with this language? [unclear] The language has a for clause, where you define which coverages you want to work on, [unclear] and you decide what to return. For example, here we have band math [unclear]: subsets, a selection of bands, with some computation among them.

If the coverage is large you want to subset it, and the language allows that: you specify the subset in the coordinate system in which the coverage is defined. Here the example is latitude, longitude and time for a three-dimensional coverage. The interesting thing is that you can put different coverages together by specifying them as different variables in the for clause, and operate over these different coverages in the same way, to provide an integrated result. [unclear]

What we are doing on top of that, in the frame of the project, is the integration of not only the processing but also of access to the metadata of the coverages. What does that mean? You don't have to know all the coverages by name and know what they mean: you can select them by describing the coverages [unclear] on the metadata, and specify that in the query. For example, you can say that you want to process only the coverages that [unclear] a given extent, or you can get metadata from the result of the processing [unclear]. The implementation of this extension of the query language is now ongoing [unclear]. So this is for the storage; then I want to visualize some of the data [unclear].
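As a rough illustration of the subsetting queries mentioned in this part of the talk, the sketch below assembles a WCPS query string and wraps it in a key-value `ProcessCoverages` request. The coverage name `ExampleCoverage`, the axis labels, the output format and the endpoint URL are placeholders invented for the example; a real server defines its own names. Only numeric trim bounds are handled, and no request is actually sent.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def wcps_subset_query(coverage, subsets, fmt="image/tiff"):
    """Build a WCPS query that trims one coverage and encodes the result.

    `subsets` maps an axis label to a (low, high) pair of numeric bounds.
    """
    trims = ", ".join(f"{axis}({lo}:{hi})" for axis, (lo, hi) in subsets.items())
    return f'for c in ({coverage}) return encode(c[{trims}], "{fmt}")'


def process_coverages_url(endpoint, query):
    """Wrap a WCPS query in a WCS ProcessCoverages key-value request URL."""
    params = {
        "service": "WCS",
        "version": "2.0.1",
        "request": "ProcessCoverages",
        "query": query,
    }
    return endpoint + "?" + urlencode(params)


# Hypothetical coverage, axes and endpoint, for illustration only.
query = wcps_subset_query("ExampleCoverage", {"Lat": (40, 50), "Long": (5, 15)})
url = process_coverages_url("https://example.org/rasdaman/ows", query)

# Decoding the URL again recovers the original query text.
decoded_query = parse_qs(urlsplit(url).query)["query"][0]
```

Because the whole request is a single query string, the client never has to download the full coverage: the trimming happens on the server and only the encoded result travels over the network.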
With these, the standards that are being employed in the project [unclear], once you do your extraction you have to find out [unclear]. What we do, basically, is again leverage the standards to build Web interfaces that build the query for you and display the results [unclear].

[unclear] Here you can see the result of a query extracting data from different coverages: one provides the green and red bands of a dataset, and the other provides an alpha channel that is built from a digital elevation model, so that you can display the result as an image [unclear].
Now we talk about the technical platform that we are using for storing and processing the data, which is rasdaman, the array database [unclear].

[unclear] The system also deals with the scalability issues of Big Data, and so we have the parallelization of the system. The approach is that [unclear] queries are distributed based on the content of the query and on the data location.

[unclear] What we aim to do is the parallelization at the query level, so that when you receive a query you are able to split it [unclear] among the available servers, according to the content of the query. [unclear] A single query is split across different servers, and then the results are joined [unclear].
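The split-and-join idea can be sketched in a few lines. This is a toy model and not the rasdaman mechanism: the "servers" are threads in one process, the data is a plain list, the split is a fixed equal partition of an index range rather than a dynamic decision based on data location, and all function names are invented for the sketch.

```python
from concurrent.futures import ThreadPoolExecutor


def split_range(lo, hi, parts):
    """Split the half-open index range [lo, hi) into contiguous, near-equal chunks."""
    step, extra = divmod(hi - lo, parts)
    chunks, start = [], lo
    for i in range(parts):
        end = start + step + (1 if i < extra else 0)
        if end > start:              # skip empty chunks when parts > range length
            chunks.append((start, end))
        start = end
    return chunks


def partial_sum_count(data, chunk):
    """What one worker computes: a partial aggregate over its own subset."""
    lo, hi = chunk
    part = data[lo:hi]
    return sum(part), len(part)


def distributed_average(data, lo, hi, workers=3):
    """Average over data[lo:hi], computed as per-chunk partials joined at the end."""
    chunks = split_range(lo, hi, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(lambda ch: partial_sum_count(data, ch), chunks))
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count             # assumes hi > lo, so count is non-zero
```

Each worker returns only a sum and a count, so the join step moves two numbers per fragment instead of the subset itself, which is the point of pushing the query fragments to where the data lives.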
[unclear] rasdaman is used for the storage of the data [unclear]. At present it is restricted to gridded coverages [unclear], and we are moving [unclear] so that they can be used in these systems. One interesting feature is that it avoids duplication of data [unclear].

[unclear] The system handles n-dimensional data, so you can have not only maps but also much bigger data cubes, like in climate data [unclear].

[unclear] One of the features of the platform is tiling: custom tiling [unclear], so that you can optimize the access according to the layout of the data. Another interesting feature is the in-situ feature. What does it mean? [unclear] Instead of importing the data into the database, the database essentially stores references to the files themselves, to the existing archives [unclear]. With this approach [unclear] you don't have duplicated data [unclear]. Several services are being built on top of this technology stack for storage and processing, dealing with different domains of data and providing solutions for accessing and processing it.
The first one is the cryospheric service, which provides a Web interface for accessing [unclear] snow cover data [unclear], and you can combine that with a digital elevation model; the combination of these two datasets can be done directly in the Web interface [unclear], and you get the results.

The second domain is the atmosphere data service, which employs the same technology and provides similar interfaces [unclear]. With this service you can access, for example, [unclear] stored at the centre [unclear] for several parameters.

Another service deals with ocean data and provides Web access to marine datasets that can be analysed and [unclear] in the Web interface. Again, this leverages the same software stack [unclear].

[unclear] three-dimensional visualization [unclear] in the geology domain, where you can have [unclear] geological surfaces, and the access can be done by visualizing them and moving in space to see how they are located [unclear].

Finally, not only Earth is considered: we are also implementing planetary services. This system is different from the earlier ones: it is for analyzing multispectral and hyperspectral data [unclear]. Again, it works with a Web interface, and it provides examples of queries to compute statistics or [unclear] of the dataset itself, so by writing a single query you get the data [unclear].
So, to conclude: the aim of this project [unclear] is to offer a flexible way of querying the data without having to program and without having to deal with the internals of the access to the datasets, providing a high level of access [unclear], as well as the integration of data and metadata to provide search capabilities [unclear]. [unclear] The system is implemented as open source, and so are the visualization tools. [unclear]

[Question and answer session, largely unintelligible] [unclear] it is a query language [unclear], and you can provide a different implementation of it [unclear]. [unclear] Roughly, well, it depends on which [unclear] you are using and on what you are demanding [unclear].