Continental Scale Point Cloud Data Management with Entwine

Video in TIB AV-Portal: Continental Scale Point Cloud Data Management with Entwine

Formal Metadata

Title
Continental Scale Point Cloud Data Management with Entwine
Alternative Title
Continental Scale Point Cloud Data Management and Exploitation with Entwine
Title of Series
Author
Contributors
License
CC Attribution 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
2019
Language
English

Content Metadata

Subject Area
Abstract
The defining characteristic of point cloud data is that they are large, and tools such as [Entwine](https://entwine.io) and the Entwine Point Tile specification can help you overcome their bigness. We will discuss how we used Entwine and EPT to construct point cloud web services for the [USGS 3DEP LiDAR data](https://usgs.entwine.io) of the United States as an Amazon Public Dataset. We will also demonstrate how to leverage EPT web services with open source software such as [PDAL](https://pdal.io) to extract information, enhance data utility, and reduce data volume for tasks such as filtering, object identification, and visualization. You will learn about how these tools work together with others such as [GDAL](https://www.gdal.org/) and [PROJ](https://proj4.org/) to provide data management and processing pipelines for expansive data holdings.
Keywords General

Let's get started. Thanks for coming to this last session. Now we have Connor Manning from the United States talking to us about continental scale point cloud management with Entwine. It really is continental scale, so here he goes.
Like he said, I'm Connor Manning, and I'm here to talk to you about really, really large point clouds managed with some open source software called Entwine, and a little bit of PDAL. First I'm going to go over some of the open source software tools that make up these projects: PDAL (either pronunciation is fine) and Entwine.
So first I want to talk about PDAL a bit. It is the Point Data Abstraction Library, and it is used to translate and manipulate point cloud data. For people who are familiar with GDAL, which is probably quite a few of you, it has a similar scope in point cloud land to what GDAL has in raster and vector land. PDAL provides you a processing pipeline to develop workflows, which are composed of stages, and stages are readers, writers, and filters. An example of a simple pipeline might be something like: read a couple of LAS files, reproject one of them to match the other, and write the output to a TIFF.
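The simple pipeline just described can be sketched as PDAL pipeline JSON, built here from Python with only the standard library. This is a minimal sketch, not the pipeline shown in the talk: the file names, the target coordinate system, and the raster resolution are all assumptions chosen for illustration.

```python
import json


def simple_pipeline(to_reproject, reference, out_srs, out_tif):
    """Build a PDAL pipeline: reproject one LAS file, merge it with another, rasterize."""
    return {
        "pipeline": [
            # Read the file whose coordinate system has to change.
            {"type": "readers.las", "filename": to_reproject},
            # Reproject it into the coordinate system of the other file.
            {"type": "filters.reprojection", "out_srs": out_srs, "tag": "reprojected"},
            # Read the file that is already in the target coordinate system.
            {"type": "readers.las", "filename": reference, "tag": "reference"},
            # Combine both point streams into one.
            {"type": "filters.merge", "inputs": ["reprojected", "reference"]},
            # Rasterize to a GeoTIFF; resolution and output_type are assumed values.
            {"type": "writers.gdal", "filename": out_tif,
             "resolution": 1.0, "output_type": "mean"},
        ]
    }


# Hypothetical inputs, for illustration only.
pipeline = simple_pipeline("a.las", "b.las", "EPSG:26915", "out.tif")
print(json.dumps(pipeline, indent=2))
```

The printed JSON can be saved to a file and run with `pdal pipeline`, assuming PDAL is installed and the input files exist.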
But they can also get quite complex.
Because these stages are composable, you can develop some pretty complex workflows. I'm not going to go through the details of this one, but we're doing some reading from an EPT data source, which is what I'm going to go over shortly. We do some reprojection and denoising, and what we end up with is just the ground points from this dataset, and we write the output to both a TIFF and a LAZ file. So the building blocks that PDAL gives you are very powerful. It's pretty unopinionated about how you compose your workflows; it gives you small building blocks on which to build.
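A more complex workflow of the kind described above might look roughly like the following. This is a sketch under stated assumptions rather than the pipeline on the slide: the EPT URL, the output names, the target coordinate system, and the choice of `filters.outlier` and `filters.smrf` for denoising and ground segmentation are illustrative. The two writers are chained on the assumption that PDAL passes points through the first writer to the next.

```python
import json

# Hypothetical EPT endpoint, used only for illustration.
EPT_URL = "https://example.com/my-dataset/ept.json"

ground_pipeline = {
    "pipeline": [
        # Read from an EPT data source.
        {"type": "readers.ept", "filename": EPT_URL},
        # Reproject into an assumed target coordinate system.
        {"type": "filters.reprojection", "out_srs": "EPSG:26915"},
        # Denoise: outliers are flagged as Classification 7 (noise).
        {"type": "filters.outlier"},
        # Ground segmentation: ground points become Classification 2.
        {"type": "filters.smrf", "ignore": "Classification[7:7]"},
        # Keep only the ground points.
        {"type": "filters.range", "limits": "Classification[2:2]"},
        # Write the ground surface as a raster...
        {"type": "writers.gdal", "filename": "ground.tif",
         "resolution": 2.0, "output_type": "min"},
        # ...and the ground points themselves as compressed LAZ.
        {"type": "writers.las", "filename": "ground.laz", "compression": "laszip"},
    ]
}

print(json.dumps(ground_pipeline, indent=2))
```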
So you might imagine some workflows; probably a lot of people who work with point clouds have a lot of workflows in mind. For example, you might be seeing how close your trees are to your power line or your train track, or maybe you're concerned with stripping all that out and you're interested in the train itself.
Maybe you have some post-earthquake point cloud model and you'd like to figure out how to turn that into a DEM at different resolutions, so you're playing around with some settings to figure out how to do volumetric change detection type stuff.
City planning type things: setbacks from curbs, figuring out where to put signs, et cetera.
Or maybe just measuring something in a place that's not very easy to reach all the time. So probably everybody has workflows in mind, and a lot of people can think of software and tools that do that. But what about when your data, instead of looking like those, looks a little more like this? This is, I think, a sixty billion point city gathered with mobile lidar, so they drove cars around with lidars attached to them.
Or countries: this is all of the Netherlands, six hundred forty billion points and many terabytes. Or large states: this is Kentucky in the USA. Data at this magnitude is sometimes delivered as flight lines, but more frequently it will be delivered as lots and lots of full density tiles with fixed bounds, which can be very difficult to work with.
I mean, people aren't delivering map tiles and raster tiles this way, but you do see point clouds delivered this way a lot. So I'm going to talk to you next about the software that I think is a better way to do delivery of lidar data.
That software is called Entwine. It's point cloud organization software that enables you to efficiently query, analyze, visualize, and enrich your very large point cloud collections. It's very scalable, up to trillions of points, which we'll see shortly, and it's built with parallelization in the cloud in mind. What Entwine does is generate a new format called Entwine Point Tiles, or the EPT format. This is a static file structure that's agnostic to the encoding, so you can swap out the backend compression depending on your use case, or you can use industry standards like LASzip, et cetera.
It's got a flexible attribute schema, so you're not bound by a fixed, predefined set of types, and it is fully lossless. A really important thing here is that it's lossless in the strictest sense of the word, such that the input dataset is fully reconstructable from the EPT. That is really important when you're looking at multiple-terabyte datasets, because if you're going to undertake a transformation that converts these multiple terabytes to another multiple terabytes, it would be great if you didn't have to keep both of them around and you could put one in cold storage. So the EPT format has been designed to maintain every aspect of the information from the input in the EPT itself, so that you could theoretically reproduce the inputs completely from the EPT.
So this is just a visual representation. I mentioned that EPT is an octree structure, and this is kind of what an octree looks like visually. You can see the point budget slider being moved up and down, and as we decide "yes, I can load more points" or "no, I want fewer points", we can discard the ones that are least relevant depending on what we're currently looking at.
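The octree and point budget idea can be sketched in a few lines. In the EPT specification, nodes are keyed as `D-X-Y-Z` (depth and grid position), and each node's eight children sit one level deeper with doubled coordinates. The sketch below walks a made-up hierarchy coarse to fine until a point budget is spent. It is a simplification: it ignores what the viewer is looking at, which a real renderer would use to rank nodes, and the hierarchy values are hypothetical.

```python
from collections import deque


def children(key):
    """Return the eight child keys of an EPT-style octree node key 'D-X-Y-Z'."""
    d, x, y, z = (int(v) for v in key.split("-"))
    return [
        f"{d + 1}-{2 * x + dx}-{2 * y + dy}-{2 * z + dz}"
        for dx in (0, 1) for dy in (0, 1) for dz in (0, 1)
    ]


def select_nodes(hierarchy, budget):
    """Pick nodes coarse-to-fine until adding another would exceed the budget."""
    selected, total = [], 0
    queue = deque(["0-0-0-0"])  # the root node
    while queue:
        key = queue.popleft()
        count = hierarchy.get(key)
        if count is None:
            continue  # this node holds no points
        if total + count > budget:
            break  # budget spent: stop refining
        selected.append(key)
        total += count
        queue.extend(children(key))
    return selected, total


# Hypothetical hierarchy: node key -> number of points stored in that node.
hierarchy = {"0-0-0-0": 100, "1-0-0-0": 80, "1-1-0-0": 90, "2-0-0-0": 50}
selected, total = select_nodes(hierarchy, budget=200)
print(selected, total)
```

Raising the budget lets the walk descend into finer nodes; lowering it stops the walk nearer the root, which is the effect of moving the slider.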
So it's kind of like slippy map tiles, or map tile services, for point clouds.
So this is a bit longer video, but I'm just going to show some of the visualization and scalability of how big the EPT stuff can scale. This is a four trillion point dataset, I think slightly less, but approximately four trillion individual points for the entire United States interstate system. As we're zooming around you can see loading happening all over the country, but the part that we're interested in fills in quite quickly. Some people are probably thinking, "well, yeah, visualization, I mean it's kind of cool, but that's really not why we're using point clouds, right?" But the key thing to think about here is that if we can load the data that we care about very quickly, on demand, with a millisecond response time, we can probably do a lot of other things too.
So after this video finishes playing I'll talk about some of the analytics and exploitation that you can do with this kind of structure. We've got another twenty seconds or so; anybody have any early questions, real quick? (Audience question, inaudible.) It's from a company that I don't think I can say the name of, sorry.
The video's the only public thing.
So the data is structured as a whole bunch of files on disk in an octree format, so there's no server. Typically you would store them in something like S3, a distributed file system, or a bare metal server. You can use any encoding you wish for the data; in particular, for most lidar it's almost always LAZ, so it's a bunch of LAZ files with some metadata that lets you access them this way. So like I said, Entwine's scope isn't really just about visualization. We try to be somewhere in the middle of this gradient between being able to view your data and being able to do things with your data, probably a little bit more towards the exploitation side, but somewhere in the middle. There are a lot of projects way over in the green, and a number of them over in the blue, and not quite as many in the middle. So EPT doesn't try to be the best at visualization, but it tries to be the most useful all-round format. So I'll talk about some analysis and exploitation, and like I said, this is going to be using PDAL.
So I'll go through just a couple of workflows and how you might use EPT to solve them, similar to before. For example, maybe you're interested in this lovely patch of forested hill, but what you're really interested in is modeling how the water might flow over it. So you'd like to get rid of all the vegetation, and what you really want is some sort of watertight mesh or raster or some other derivative product from the point cloud.
Like before, you can probably think of ways to do this, right? You may have done this before, or something similar. But what if that patch of land is in, I think this one is, a five hundred billion point dataset that spans multiple terabytes?
Now you're probably thinking, "I need to go query that tile database, figure out which tiles overlap, and then I have to do a bunch of downloads." It can be difficult when your data exists in an ecosystem this large.
But with the EPT reader in PDAL, using the spatially accelerated data structure, it's actually quite easy. You can see at the top there we have an EPT reader, and the important part is that we're querying by only the bounds we care about. Then we do some operations on it: we're detecting noise, running a ground algorithm, filtering out the non-ground points, and writing the output to a TIFF. Even in a multi-terabyte dataset like that one, this would probably take only a few seconds, or on the order of minutes.
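A bounds-limited query like the one described above can be sketched as follows. The part that matters is the `bounds` option on `readers.ept`, which restricts the read to a region of interest. The EPT URL, the coordinates, the filter choices, and the raster settings are assumptions for illustration, not the values from the slide.

```python
import json


def ept_bounds(xmin, ymin, xmax, ymax):
    """Format a 2D region in the '([xmin, xmax], [ymin, ymax])' form PDAL expects."""
    return f"([{xmin}, {xmax}], [{ymin}, {ymax}])"


def ground_raster_pipeline(ept_url, bounds, out_tif):
    """Read only the requested bounds from EPT and rasterize the ground points."""
    return {
        "pipeline": [
            # Only points inside `bounds` are fetched from the EPT dataset.
            {"type": "readers.ept", "filename": ept_url, "bounds": bounds},
            # Detect noise (flagged as Classification 7).
            {"type": "filters.outlier"},
            # Run a ground algorithm (ground becomes Classification 2).
            {"type": "filters.smrf", "ignore": "Classification[7:7]"},
            # Filter out the non-ground points.
            {"type": "filters.range", "limits": "Classification[2:2]"},
            # Write the result to a GeoTIFF; resolution is an assumed value.
            {"type": "writers.gdal", "filename": out_tif,
             "resolution": 1.0, "output_type": "idw"},
        ]
    }


# Hypothetical endpoint and coordinates (in the dataset's own coordinate system).
bounds = ept_bounds(-10425171, 5164494, -10423171, 5166494)
pipeline = ground_raster_pipeline(
    "https://example.com/my-dataset/ept.json", bounds, "ground.tif"
)
print(json.dumps(pipeline, indent=2))
```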
Another, kind of orthogonal, example: this is the state of Kentucky, also half a trillion points. How would you do something like generate a reasonable boundary for it? You might think of taking the headers of the files and mashing all the bounds together, but then you get a bunch of jagged edges. That's not necessarily something you want to display as your user-facing footprint.
And this is also really easy with the EPT reader, because it's structured in a hierarchical manner by resolution. The key part here is the resolution there, which is quite coarse: I'm querying for points that are typically four hundred meters apart, and then I'm just using PDAL's hexbin filter to create a hex boundary on that data.
Like I said, that Kentucky dataset is multiple terabytes, and I think this takes about five or six seconds or so on my laptop.
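The coarse boundary query described above can be sketched like this. The `resolution` option on `readers.ept` limits how deep into the hierarchy the read goes, and `filters.hexbin` tessellates the sparse points into hexagons whose outline serves as the footprint. The EPT URL is hypothetical, and the hexagon `edge_size` and `threshold` values are assumptions that would need tuning for a real dataset.

```python
import json

# Hypothetical EPT endpoint, used only for illustration.
EPT_URL = "https://example.com/kentucky/ept.json"

boundary_pipeline = {
    "pipeline": [
        # Coarse read: only points spaced roughly 400 units (meters here) apart.
        {"type": "readers.ept", "filename": EPT_URL, "resolution": 400},
        # Bin the sparse points into hexagons; a hexagon counts as occupied
        # once it holds at least `threshold` points.
        {"type": "filters.hexbin", "edge_size": 1000, "threshold": 1},
    ]
}

print(json.dumps(boundary_pipeline, indent=2))
```

The boundary itself is reported in the metadata that the hexbin stage produces when the pipeline runs, rather than in an output point file.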
Another thing you can do with the EPT structure and with PDAL is enrichment of the data. What I mean by that is that you can add new attributes later, and, at the bottom there, you don't need to add them to the EPT dataset itself, so there are no rewrites involved. You can write these locally, and this is something we'll see a little bit later. So for example, if you have a web service but you don't have write access to it, you can swap out its attributes for attributes that you've derived. Examples of that might be things like normals that you're going to reuse for lots of different algorithms, or workflow results like classifiers.
A typical example would be replacing the classification from some service with the output of a better version of a classification algorithm.
There's a lot of stuff on here, and a lot of it's not all that important, but the point to note is the EPT add-on writer, which maps a dimension that's the result of a workflow to a path. Here we've assigned a classification with some awesome ground algorithm, and then we map that to a path, which in this example is on our local computer. Later we can map these paths back into the attributes in the point cloud. So if you have all sorts of different classifications for different contexts, or you're comparing different algorithms, you can swap them out dynamically this way.
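The add-on round trip described above can be sketched as two pipelines. The first writes a derived `Classification` to a local add-on path with `writers.ept_addon`; the second reads the same remote EPT dataset with that add-on mapped back in. The EPT URL and the local path are hypothetical, and `filters.smrf` stands in for whatever classification algorithm is being evaluated.

```python
import json

# Hypothetical remote dataset (read-only) and local add-on directory.
EPT_URL = "https://example.com/my-dataset/ept.json"
ADDON_PATH = "/data/addons/my-dataset/smrf"

# Step 1: derive a new classification and store it as an add-on.
write_addon = {
    "pipeline": [
        {"type": "readers.ept", "filename": EPT_URL},
        # Reset the existing classification, then classify ground with SMRF.
        {"type": "filters.assign", "assignment": "Classification[:]=0"},
        {"type": "filters.smrf"},
        # Writer side: map an output path to the dimension stored there.
        {"type": "writers.ept_addon", "addons": {ADDON_PATH: "Classification"}},
    ]
}

# Step 2: read the original dataset with the add-on swapped in.
read_with_addon = {
    "pipeline": [
        # Reader side: map a dimension name to the path that supplies it.
        {"type": "readers.ept", "filename": EPT_URL,
         "addons": {"Classification": ADDON_PATH}},
        {"type": "filters.range", "limits": "Classification[2:2]"},
        {"type": "writers.las", "filename": "ground.laz", "compression": "laszip"},
    ]
}

print(json.dumps(write_addon, indent=2))
print(json.dumps(read_with_addon, indent=2))
```

Pointing the reader at a different add-on path is what makes it possible to compare classification algorithms without touching the original data.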
Now we're going to shift gears a little bit. I think there have been a lot of talks about Cesium 3D Tiles; actually, I see a 3D Tiles shirt out there. So somebody might have been thinking, "well, this sounds kind of like 3D Tiles; what are the differences, and why would I use one or the other?" First, what they are: Cesium is a rendering library and 3D Tiles is a format. So the analogy would be that Cesium is like Potree, which I guess I haven't mentioned (Potree was the visualizer I was using earlier), and 3D Tiles is the format. In general Cesium is really good for mixed media types, because you can do things like mix up your building models and terrain models and point clouds, and you can load them all up in a single renderer. They've also got a flexible tiling format, so you can define how you want to split your data, and it's just a really robust rendering library. But for point clouds in particular, and I'm going to compare it with EPT here, there are some drawbacks. These aren't really things that Cesium is missing that they should have added, but their scope is a little more toward the visualization side, so when you start to look at it for things like exploitation, you're missing some important things. One example of an advantage of EPT over 3D Tiles is that you can build EPT with open source tools, which would be Entwine. With Cesium you need to use Cesium ion for building things. And in general the format's just more oriented toward visualization: you can't use standard lidar encodings, and the compression is optimized for the GPU, so if you wanted to write a lot of things against it, it might be a little clunky. In general, non-renderable attributes are deprioritized. For example, if you upload something to Cesium ion, it strips out the things that aren't renderable, like your GPS time, your scan angle, and your return number, which are really important to people who really care about lidar and are using it for deriving things.
And the last one there is that the metadata for equivalent data is much larger in Cesium. Because EPT is an implicit octree, we can embed a lot of information just in our node structure, while Cesium explicitly has to list a lot of implicit things. And that's on the roadmap for them.
So I'll come back to that in a little bit, and we're going to switch gears again real quick to a new project I've been working on called EPT Tools. This is a JavaScript library that can run in the browser or in Node.js, and it has tools to work with EPT data. Right now there aren't very many; you can see there are only three tools. One of them is validate, so we can check out the metadata and make sure it looks good, which would be useful if you are creating your own EPT and not using Entwine.
Then, to go back to the 3D Tiles stuff, there's a tile command that translates EPT to 3D Tiles as a one-time transformation, so this would be duplicating your data in yet another format. Or, perhaps more interesting, would be the live translation of EPT to 3D Tiles, which looks something like this: you just serve an EPT project root.
Then you point Cesium at that root, and Cesium makes 3D Tiles requests that are automatically converted by the server from the EPT. So the serve command responds to that with 3D Tiles data directly. More interesting than that, though, is that it's actually idempotent and stateless, so you can run it in a lambda. With something like AWS Lambda and API Gateway, for example, or the equivalent in some other cloud, you can have a serverless reflection of all of your point clouds in EPT as 3D Tiles.
And that's very cheap, because you're not paying for a server, you're paying for the milliseconds of the actual transformations, so you don't have a server running all the time. So the last thing I'm going to talk about, with just a couple of minutes left, is an example project of using tools like this to manage a very large data collection at scale.
So lidar people from the US, or people who have worked with lidar from there, might be familiar with the US Geological Survey 3DEP program, which gathers lots and lots of lidar. Here are some stats about the dataset: it goes back about fifteen years, it's seventy-plus terabytes, and it existed as tiled data in S3 as LAZ.
So, leveraging this data: the USGS just had it sitting there for a long time and people were downloading it, but how can we leverage it and do other things? Can we look at all of it? Can we write software against it? Can we query and filter it? And most importantly, can we get Amazon to pay for it, perhaps? The answer was yes, actually, to all of these. Through the AWS Public Datasets program, Amazon paid for the compute for converting all the data from tiled data to EPT, as well as hosting in S3 for at least the next two years.
And here's kind of the portal of what we ended up with. You can see all the footprints, and I talked about how these were generated before. You can see the point count at the top, over ten trillion, but all these boundaries were generated in, I think, four or five minutes on my laptop. So they're not perfect, they're quite coarse, but it's pretty good for something like this, and you wouldn't be able to do this with every structure. You can see down at the bottom there, there's Potree and Plasio, the little dots, and those are two open source renderers, and on the right you have Cesium, which goes to the on-demand reflected 3D Tiles transformation. So this data only exists as EPT, but we do reflect it to Cesium as well.
And here's an example of just what you can do with this website: filtering down to the five hundred billion points or so and just loading it up, and this is Potree. You can also run analytics; I mean, all of the EPT datasets are available over HTTP. So you can do things like sample the Z values. Actually this dataset's quite noisy: as you can see, the standard deviation of the Z values is three hundred, but this state has an elevation change of approximately ten or twenty meters, because it's Iowa.
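The statistics query described above might be expressed as a coarse EPT read followed by `filters.stats`, sketched below with a hypothetical EPT URL and an assumed resolution. The second half uses made-up elevations, not the real dataset, to illustrate why a handful of noise returns can push the standard deviation far beyond the true terrain relief.

```python
import json
import statistics

# Hypothetical EPT endpoint; the resolution value is an assumed coarse setting.
stats_pipeline = {
    "pipeline": [
        {"type": "readers.ept",
         "filename": "https://example.com/iowa/ept.json", "resolution": 1000},
        # Report min, max, mean, and standard deviation for the Z dimension.
        {"type": "filters.stats", "dimensions": "Z"},
    ]
}
print(json.dumps(stats_pipeline, indent=2))

# Made-up flat terrain: 1000 elevations within about 5 m of 300 m.
clean = [300 + (i % 11) - 5 for i in range(1000)]
# The same terrain plus ten spurious returns far above the ground.
noisy = clean + [10000] * 10

clean_std = statistics.pstdev(clean)
noisy_std = statistics.pstdev(noisy)
print(f"clean std: {clean_std:.1f} m, noisy std: {noisy_std:.1f} m")
```

In this toy data, one percent of noise points is enough to move the standard deviation from a few meters to hundreds of meters, which is consistent with the pattern the speaker points out.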
You can also do sampling on the classification. This is querying that same dataset, a really large one, at a really low resolution, and counting the values that come back as classification. You can see one that's kind of weird there, the two twenty-nine; I'm not sure what that would be. But that's all I've got. I'm not going to put up a whole bunch of links; you should only need this one, because after the session's over I'm going to have a blog post on the main page there with these slides and with all the links to all the projects I've talked about. So if you need to remember a link, that's the only one you should need.
The post isn't there yet when you check, so give me a little bit of time. But thank you, that's all I've got.
Thanks, Connor. Does anyone have any questions? I do have swag for questions.
So what's the roadmap for EPT Tools? There are only three that you listed; what more do you plan on having?
Well, this is the first time I've actually publicized it. I've given it out to a couple of people who have used it, but it's probably going to depend on community involvement, on what people think would fit in this space and say, "hey, I need this, there should be this tool." So people like you will be the drivers of the roadmap. So what do you get? A spork. Entwine sporks.
When you're writing add-ons, do they consume the same amount of space in your storage as the original dataset?
No. For an add-on you can specify the type; for classification I think it's eight or sixteen bits, one of the two. Eight? Thanks, Martin, it's eight bits. So if you're writing your own classification, it will take up eight bits times the number of points that you actually run the classifier on. I mentioned the octree structure: you don't have to write an add-on for the full set, you can write it for a subset. So with all those queries, like the bounds queries we were talking about or the resolution queries, if you have add-ons that only go to a certain resolution, you can write them as a subset, and they will only take the amount of space of the attribute type times the number of points that you actually apply them to.
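The add-on storage estimate from the answer above is simple arithmetic: the size of the attribute type times the number of points the add-on covers. The sketch below assumes whole-byte attribute types and ignores any compression or per-file overhead; the subset size is a made-up figure, while the 500 billion point count echoes the dataset size mentioned in the talk.

```python
def addon_size_bytes(bits_per_value, point_count):
    """Uncompressed add-on size: attribute width times number of points covered."""
    if bits_per_value % 8 != 0:
        raise ValueError("this sketch assumes whole-byte attribute types")
    return (bits_per_value // 8) * point_count


# An 8-bit classification written for every point of a 500 billion point dataset.
full = addon_size_bytes(8, 500_000_000_000)

# The same add-on written only for a hypothetical 1 billion point subset,
# such as a bounds query or a coarse resolution query.
subset = addon_size_bytes(8, 1_000_000_000)

print(f"full dataset: {full / 1e9:.0f} GB, subset: {subset / 1e9:.0f} GB")
```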
I appreciated your slide on lossless in the beginning, and I appreciated that it included ordering. Does that mean that if I have the state of Iowa in, I don't know, one thousand seven hundred tiles, and I give it to the Entwine encoder, I can get back those one thousand seven hundred tiles with the same naming, with every point in the same order? And if yes, how do you implement that?
Yes. On the entwine.io website there's a link called Entwine Point Tiles on the sidebar, and that's the description of the format. The key parts, as far as specifically what you asked: for every file that comes in, we add a new attribute called the OriginId, which maps back to that file's metadata. So everything in the LAS headers, all the VLRs, all the EVLRs too, we store all that, and we store the filename itself. By default we don't include a point ID, because typically the points are already ordered by GPS time, so we use GPS time as an implicit ordering. But there's a flag in Entwine that you can set to say "store point ID", and it will tag every single point with its order in its origin file. So hopefully that answers it.
Anything else? We have a few minutes, so more questions are welcome. Or more sporks to give away.
Well, first of all, thanks. One point: I think I saw a demo already on the web where you switch between the US and the Netherlands. Was the reprojection done on the fly, or was everything reprojected beforehand?
In that specific case I think everything was reprojected. For the public services that we host as demos, we actually do something that all the lidar people won't like very much: we put them in Web Mercator, because then everybody can interact with them very easily, right? I mean, it's demo data; it's not really meant for everything. But yes, in those instances that was all Web Mercator. Something I didn't mention about the 3D Tiles work, and it's actually a pretty nice benefit of the EPT Tools stuff, is that Cesium only supports, I think, lat/long, Web Mercator, and ECEF, or some combination thereof; I think the points are ECEF but the metadata must be lat/long. Part of the on-the-fly transformation of the EPT Tools stuff is doing the reprojection from whatever your dataset is in. So if your data is in UTM or some local coordinate system, as long as you have the coordinate systems and they can be reprojected to show up correctly on a globe, you can do that translation. But in your specific case I think all the data was in the same SRS.
Anyone else? One more short question.
My question was about Greyhound.
Oh, I actually should mention this, I guess. Some of you might have seen me talking about Greyhound a couple of years ago, which was a server that did a lot of these features, and this is the first time I'm presenting EPT. Greyhound was a live server, so you had to have a live server up and running at all times, and what Greyhound did was translate a black box format that wasn't documented to anybody, but it served it kind of like EPT does. That black box format, which you weren't supposed to look at because we wanted the abstraction layer of Greyhound, has been solidified into EPT. And because we've implemented the ability to read it statically in things like Potree and PDAL, there's no server involved there. So the need for Greyhound kind of goes away when you have a static format that's more recognizable and more usable. I think the space for Greyhound, or what was Greyhound, might move towards the EPT Tools kind of thing: if you do want a live server for some reason, or maybe it's a series of lambdas, I think that's where it would go, if you want to be filtering on the server or something like that. So Greyhound probably doesn't exist anymore, but it would be in EPT Tools.
Thanks, everyone.