Dotloom Project - Next Generation Point Cloud Platform

Video in TIB AV-Portal: Dotloom Project - Next Generation Point Cloud Platform

Formal Metadata

Title: Dotloom Project - Next Generation Point Cloud Platform
License: CC Attribution 3.0 Unported. You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

So, this is not a remote sensing talk, even though it was tagged as one. This whole project follows a sort of conference-driven development: we always set ourselves milestones tied to conferences, and as usual with milestones, you are not always able to meet them. The project has seen so many unforeseen changes that what I wrote initially, a few months ago, may not be exactly how things stand today.
Georepublic is not another country, but the people working for this company are quite distributed; our newest colleague joined only recently, so I had to readjust the map, because it did not really work any more, and I will have to redo it completely.
This is a project funded by an organization called AIST, the National Institute of Advanced Industrial Science and Technology in Japan. They have a big plan: they want to build a geospatial platform for point clouds, for everyone who generates point clouds in Japan, and a lot of such data is generated there. They also have a data center with a supercomputer called ABCI, just brought online in August and, I think, one of the most powerful around. So our job is to help them collect the data. It is mostly point cloud data, but later we found out that this does not apply only to point clouds.
So what is Dotloom? Initially we described it as peer-to-peer distribution and processing software for very large geospatial datasets. Why do we need this?
We found out that big data is really, really hard to handle, and we have many projects that involve big data. For my mother, 10 megabytes is already big data if she wants to send it to me; for other people it is 100 megabytes; and for me, even one gigabyte is already quite annoying to send anywhere. We have customers that send us 50 gigabytes or a terabyte, and how do you send that? By mail, on a hard disk. So before we can do the cool stuff, we have to struggle with transporting the data, and the way we do it is also quite inefficient. How often have I downloaded an OSM planet file, forgotten about it, and downloaded it again a week later? My theory is that 99% of all downloaded data is thrown away a few days or weeks later, and maybe my neighbor had already downloaded it anyway. The way we handle big data at the moment is very inefficient. Then we came across the distributed web, the new trend: peer-to-peer. We read about the Dat project, a data sharing and synchronization project. The biggest problem with that project is its name, because when you search for "dat" you find many things, but not the project. So we gave our own project a different name and called it Dotloom; that is how the name was decided.
What we actually needed was a way to easily, quickly and securely share very large point cloud datasets. We wanted to browse and search these datasets, and to selectively pick what we want to download rather than everything, the way a cloud-friendly geo format lets you take just what you need. We also wanted it to be streamable, so that processing can already start while the data is coming in. So: quickly share a terabyte, browse it, selectively choose the right parts, and run processing pipelines on top. So what is this Dat project?
The Dat project defines the Dat protocol. It does distributed sync, it improves download speed because it downloads from multiple peers at the same time, and it stores data efficiently. I would say it is a bit of Git, a bit of Dropbox, and a bit of BitTorrent.
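The "bit of Git, bit of BitTorrent" comparison comes down to content addressing: data is split into chunks, each identified by its hash, so identical chunks are stored and transferred only once, and any peer can verify what it received. Below is a minimal Python sketch of that idea; it is not the actual Dat wire format, whose hashing and tree scheme differ.

```python
import hashlib

CHUNK_SIZE = 4  # tiny for illustration; real protocols use kilobyte-sized chunks


def chunk_hashes(data: bytes, size: int = CHUNK_SIZE):
    """Split data into fixed-size chunks and return (hash, chunk) pairs."""
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    return [(hashlib.sha256(c).hexdigest(), c) for c in chunks]


class ChunkStore:
    """Deduplicating store: identical chunks are kept (and would be sent) once."""

    def __init__(self):
        self.chunks = {}

    def add(self, data: bytes):
        manifest = []
        for h, c in chunk_hashes(data):
            self.chunks.setdefault(h, c)   # store each unique chunk only once
            manifest.append(h)
        return manifest                    # ordered hash list reassembles the file

    def get(self, manifest):
        return b"".join(self.chunks[h] for h in manifest)


store = ChunkStore()
m1 = store.add(b"AAAABBBBAAAA")   # "AAAA" appears twice but is stored once
assert store.get(m1) == b"AAAABBBBAAAA"
assert len(store.chunks) == 2     # only two distinct chunks kept
```

Because peers exchange chunks by hash, a downloader can fetch different parts of the manifest from different peers in parallel, which is where the BitTorrent-like speedup comes from.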
To show you a bit what it looks like: there is a command-line interface. [The live demo did not cooperate, so a screencast was shown instead.] The screencast simply shows how you run dat sync, and you can try it out yourself; sharing feels very similar, and you can see how many peers you have. Because a command-line tool is not really useful for the people we are targeting, we are currently working on something called Dat Desktop, an Electron-based application that should already be able to do many things, but we started rewriting it in React, so we are quite late. With it you can create a Dat archive from local folders, and you can download and synchronize such data archives. For point clouds we want to add features like previewing the data inside an archive and publishing to specific nodes, and we also want to introduce a centralized place where you can register your URLs and trigger processing and indexing.
Let's see if this one works... OK. It is very similar: you have the desktop application at the top and something you want to share at the bottom. All you need for sharing something is a URL, which is a hash key that does not change. You add it, and then you can watch your data synchronize. With these three dots you can also see how many peers already have a copy of your data. On the command-line interface you can see that you are sharing 1.9 gigabytes of data in 200 files, how many connections you have, and the upload and download rates.
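That "hash key that does not change" is what distinguishes Dat-style addressing from pure content addressing: the address is derived from the publisher's key rather than from the data, so the same URL can keep pointing at new versions (Dat actually uses the Ed25519 public key itself as the address). A rough sketch of the difference, using SHA-256 stand-ins rather than real key material:

```python
import hashlib
import os


def content_address(data: bytes) -> str:
    """IPFS-style: the address is derived from the content itself."""
    return hashlib.sha256(data).hexdigest()


def key_address(public_key: bytes) -> str:
    """Dat-style: the address is derived from the publisher's key."""
    return hashlib.sha256(public_key).hexdigest()


public_key = os.urandom(32)        # stand-in for a real signing keypair
v1, v2 = b"points v1", b"points v2"

# Content addressing: every update yields a new address.
assert content_address(v1) != content_address(v2)

# Key addressing: the archive URL stays stable while the contents evolve.
url_before = key_address(public_key)
# ... publish v2 under the same key ...
url_after = key_address(public_key)
assert url_before == url_after
```

The stable address is why you can bookmark one Dat URL and keep receiving updates, while only the holder of the matching private key can sign new versions.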
It is a bit BitTorrent-style. When you stop it, say you lose your internet connection, then, like in Dropbox, it stops syncing, and when you connect again it starts syncing again. You can actually look inside your archive and see what kind of data you have; this is just metadata, so you do not need to download everything before you have an index of your directory. You also get information about the version, because there might be different versions out on the internet, depending on when people were online and offline. It is quite exciting to try out. Later I would like to tell you why it is not so much fun to actually work on it.
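This "metadata first" behavior, where you get an index of the directory without downloading the contents, is also what makes selective sync possible: a client can inspect the file listing and request only the entries it wants. A small sketch of that flow, with a hypothetical index structure (not Dat's actual metadata feed):

```python
# Hypothetical archive index: the metadata is tiny compared to the payload.
index = [
    {"path": "tiles/tokyo.laz", "size": 1_200_000_000},
    {"path": "tiles/osaka.laz", "size": 800_000_000},
    {"path": "README.md",       "size": 2_000},
]


def select(index, predicate):
    """Browse the metadata and pick only the entries worth downloading."""
    return [entry["path"] for entry in index if predicate(entry)]


# Download only the small files, or only a region of interest, etc.
small = select(index, lambda e: e["size"] < 10_000)
tokyo = select(index, lambda e: "tokyo" in e["path"])
assert small == ["README.md"]
assert tokyo == ["tiles/tokyo.laz"]
```

In practice the predicate could match a spatial extent or a file type, so a terabyte-scale archive is browsed cheaply and only the chosen entries are fetched from peers.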
Now, something we want to add, and have actually already started: we use the file index that is created and share the link, to index data remotely. We do not want to download a dataset just to index it, so instead we let bots crawl through the data, read only the header of each file, and write an index. You do not have to collect all the data in one place first.
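Reading "just the header files" works because formats like LAS put everything an index needs into a fixed-size header at the start of the file, so a crawler only has to fetch a few hundred bytes per archive entry. A sketch of pulling the version and point count out of the first bytes of a LAS file (offsets from the public LAS 1.2 specification; error handling omitted):

```python
import struct


def index_las_header(header: bytes) -> dict:
    """Extract index info from the first bytes of a LAS file (LAS 1.2 layout)."""
    if header[0:4] != b"LASF":                              # file signature
        raise ValueError("not a LAS file")
    version = (header[24], header[25])                      # major, minor
    point_count = struct.unpack_from("<I", header, 107)[0]  # legacy point count
    return {"version": version, "points": point_count}


# Build a minimal synthetic header to demonstrate (real headers are 227+ bytes).
fake = bytearray(227)
fake[0:4] = b"LASF"
fake[24], fake[25] = 1, 2
struct.pack_into("<I", fake, 107, 5_000_000)

info = index_las_header(bytes(fake))
assert info == {"version": (1, 2), "points": 5_000_000}
```

A remote indexing bot would request only this leading byte range from a peer, extract the fields, and append them to a shared index, never touching the point records themselves.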
In the same way, we want to make remote processing possible: you read from a Dat link, which can pull data from multiple peers, and you can immediately publish the result again into the peer-to-peer network. We have some users in mind for this.
some users in mind so one one case is
for example a surveying company that's some customers we have they require an easy upload of raw surveillance data and at the moment they often send us hard disks because it's much faster and they have so much they have troubles with the network and and policies and yeah they need some processing pipelines to get this data like further processed and then they want to quickly also create this created data they want to also make it available to their customers and the
The other use case we had in mind is the data scientist. That person wants good access to data resources, and also needs to know that a data source is exactly what they want, not just some osm planet PBF or a file name that says nothing. The Dat archive gives you that guarantee: at a given URL you get a version that only the holder of the private key can have created, so it guarantees authorship. Then they want to apply custom processing algorithms and publish the results easily.
I also thought about the self-driving vehicle, because that is the word you have to say in Japan these days if you want research grants. Cars actually collect a lot of data, but it is so much that you cannot get it out of the car; cars connect to and disconnect from the internet all the time, but you have a very good connection point while charging. We thought that with this kind of synchronization you could get all your data out at that point and make it available to other users. Otherwise it is a big waste of sensors; I do not know whether this data is currently used anywhere outside the car.
So we are currently developing here and there. We have a prototype of the processing framework; Dat Desktop has been rewritten, so now we can add some more interesting features; the indexing service we did not have time for. We actually wanted to add support for reading a Dat URL directly in a point cloud processing tool, which would mean you can read the data from multiple peers, and at that point the EPT format appeared. Everything is a bit delayed, so I do not know the current status of that format. There is a lot to do, but the reality is that we are struggling with the following.
Once more, about motivation and difficulty: downloading is already quite hard, but the motivation for uploading is even lower. People have a high motivation to download data, but not to upload it. We had built some CKAN platforms, and they were filled once and never updated; the motivation to keep data current is simply too low. At the same time, acceptance of the current scheme is quite OK: people upload and download using FTP or browsers. We want to change that; we want to make downloading easier and, above all, uploading easier. The problem is that we are using peer-to-peer, and that is a bad word, so acceptance is our big issue.
I recently went to a FOSS4G conference where I got this T-shirt, and I had to sign that I was not using peer-to-peer software in order to get internet access. It would actually be much easier to develop software for labeling and shipping data by mail; we would already be finished. But this is too much fun to stop.
So we have lots of obstacles, and the biggest one is peer-to-peer itself, with few people opening ports. The community is also a bit of a problem: we are using a lot of Node.js, the project is very distributed, and people tend to make tons of npm libraries, so it is much more work than we thought. And this technology is for early adopters, so you never know what happens.
Many people say "bring the software to the data", and that that is the best approach, but I think it is better to say "make the data accessible where it is created". With this kind of Dat link you could collect the data where it is created, and people could just retrieve what they want; nobody has to upload anything, you only need to download. Any questions? If you are interested, that is roughly where development currently stands. [Applause]

Question: I got the impression that a complete file is stored on a node. Or is the content also partially scattered around?

Answer: You mean in peer-to-peer mode? The publisher of course has the whole data, and when you receive a link and save it to your node, you can actually choose: the protocol, and also the implementation of the protocol, allows you to sync selectively, so you do not have to synchronize everything. The current tools, however, usually just sync everything.

Question: If the publisher updates the dataset, do users just get the changes?

Answer: Yes, when the publisher updates the dataset, only the changes are synced. There is actually a competitor to this project, IPFS, and one big difference here is that the URL does not change when the data changes. You can start with an empty directory, share it, and as you add data the version number increases, so you can tell whether your copy is current or not. By default it does not store history; if you have large data that changes a lot, storage would otherwise grow into a huge problem. So it does not keep the full historical data, but you could implement something on top that takes snapshots of specific versions from time to time; you just have to store them somewhere else. Another point is that you can have multiple copies: if somebody in your local network has already created a copy, you do not have to fetch the data over the wide network again, so you can have fast local copies, and there can be multiple locations to sync from, because it is a folder format; it is stored as plain folders and files. [Music] Actually, quite early on we wanted to use an existing point cloud format and contribute that back, but we decided we did not want a country-specific format, so the exact format we will work with is still somewhat open.
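The versioning behavior described in the answers above, where the URL stays fixed, a version counter increases with every change, and history is only kept if you snapshot it elsewhere, can be pictured as a latest-state archive with a change counter. A toy sketch (not Dat's actual append-only hypercore log):

```python
class Archive:
    """Toy fixed-address archive: latest state only, version = changes applied."""

    def __init__(self):
        self.version = 0
        self.files = {}          # latest state only; no history by default

    def put(self, path, data):
        self.files[path] = data
        self.version += 1        # every change bumps the version counter

    def snapshot(self):
        """Optional: copy a specific version somewhere else to keep history."""
        return self.version, dict(self.files)


a = Archive()
a.put("points.laz", b"v1")
tagged = a.snapshot()            # keep version 1 in separate storage
a.put("points.laz", b"v2")       # the old bytes are gone from the archive itself
assert a.version == 2
assert a.files["points.laz"] == b"v2"
assert tagged == (1, {"points.laz": b"v1"})
```

Peers can compare version counters to tell who has the newer state, which is exactly how you know "whether your copy is current or not" even though the address never changes.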

