Big Spatial Data seminar Part 2: Analytics in the Age of Big Data
Formal Metadata
Title 
Big Spatial Data seminar Part 2: Analytics in the Age of Big Data

Title of Series  
Part Number 
2

Number of Parts 
2

Author 

License 
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor. 
Identifiers 

Publisher 

Release Date 
2013

Language 
English

Content Metadata
Subject Area  
Abstract 
Big Spatial Data seminar, University of Alabama, July 25th, 2013. Automated Event Services that utilize Big Data technologies to enable interactive and collaborative scientific data analysis on big data and to share data and analysis methods seamlessly, in order to relieve scientists from data management, empower scientists to focus on science, and boost science productivity.

00:00
Dr. Rahul Ramachandran is deputy editor for Earth Science Informatics and a principal research scientist at the
00:09
Information Technology and Systems Center at the University of Alabama in Huntsville. So this is the second project that we
00:18
are working on right now. It's Automated Event Services [unclear], and the goal of this project is to look at different Big Data technologies to find event phenomena in Earth science data. The three technologies that we are looking at are [unclear] and Polaris [unclear]; we are focusing more on SciDB right now. So,
00:47
you know, the first thing that we did was try to scope the problem down. If you look at the research papers in [unclear] science, there are basically five common approaches. You do event analysis, where you're actually looking in detail at what happened in the event. There are event climatology papers, where people are analyzing the data to find [unclear] an event: what the spatial-temporal distribution is, and what the cycle, the duration, and things like that are. The third common one is synoptic climatology, where you're trying to look at the evolution of an event: what happened before it and after it. And then the fourth is forecast methods, where the research is focusing on trying to come up with new methods for predicting the event. So the goal is that if you build a new analytics tool for big data, you should be able to support four of the five common approaches that are used in [unclear] science, with the fifth [unclear] going
01:58
to be more simulation stuff. So we're focusing on the analytics. The tool will provide these value propositions to any user. For event analysis, it can point out interesting events in the data. The event climatology is a no-brainer: if you have large volumes of data, you should be able to build event climatologies very easily with the data sets. You can do synoptic climatologies the same way: if you have found events in your data, then you can do causality analysis [unclear]. And finally, for the forecast methods, if you can build the tool to be more interactive, then you can allow researchers to [unclear] the rules to refine their methods for finding events [unclear]. And again, this is still evolving
02:52
in terms of how we envision the scientific analysis workflow to be, and we think there are two stages to this. The first stage is working with the big data on an HPC, where you're focusing on the analytics part, which is more of an interactive exploration part. Then you go to the second stage, where you've done your discovery of the data, the discovery of your event, and the segmentation and characterization of the event, and you reach the descriptive data part, which is a much smaller piece that you can bring down for your detailed scientific analysis. But it's not as clear-cut; there are obviously overlaps between the two, and the struggle is finding out what you can do at the different stages and what would be the most useful. So
03:40
this is our design, a high-level system architecture. This is on the HPC; we're looking at SciDB, which is chunking the data on the HPC, and building an event service on top. The goal is to develop an events package that can then be run within an IDE or a UI through a browser. We're using the CMS to handle the authentication information and also the collaboration part, which is to allow people to share what they're discovering. The initial focus is to build the simple stuff: simple visualization for event detection, like bar charts, contour plots, and maps; and queries to do detection, segmentation, characterization, correlation, and statistics. Tracking is a trickier problem to implement with this technical architecture, so it's on our list, but it may be a little down the road in terms of when we actually
04:44
get to it. So, the notion of the event analytics package: the goal is to build a package that can work with Python and R, so that users who are familiar with using Python and R as part of their analysis can use it. That improves adoption; usability is part of it. This is a simple example, obviously based on the data mining work that we've done earlier. You can import a package, this is ADaM [unclear], and then you can run these functions. These functions will actually be running on the HPC, but the user is using the desktop to do the analysis.
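Editor's sketch: the idea of a desktop package whose functions actually execute on the HPC can be illustrated roughly as follows. This is a minimal sketch under stated assumptions, not the project's package: `EventClient`, `fake_hpc_transport`, and the operation names are hypothetical, and the "remote" side is a local stand-in so the example runs on its own.

```python
import json
from typing import Any, Callable

# Hypothetical names throughout: nothing here is the project's real API.
Transport = Callable[[str], str]  # JSON request in, JSON response out


def fake_hpc_transport(request_json: str) -> str:
    """Stand-in for the HPC service; a real one would send the request over the network."""
    request = json.loads(request_json)
    op, args = request["op"], request["args"]
    if op == "threshold":
        result: Any = [v for v in args["values"] if v > args["cutoff"]]
    elif op == "mean":
        result = sum(args["values"]) / len(args["values"])
    else:
        return json.dumps({"error": f"unknown operation: {op}"})
    return json.dumps({"result": result})


class EventClient:
    """Desktop-side wrapper: each method only packages a request and unpacks the reply."""

    def __init__(self, transport: Transport = fake_hpc_transport) -> None:
        self.transport = transport

    def _call(self, op: str, **args: Any) -> Any:
        reply = json.loads(self.transport(json.dumps({"op": op, "args": args})))
        if "error" in reply:
            raise RuntimeError(reply["error"])
        return reply["result"]

    def threshold(self, values, cutoff):
        return self._call("threshold", values=list(values), cutoff=cutoff)

    def mean(self, values):
        return self._call("mean", values=list(values))


client = EventClient()
strong = client.threshold([3.0, 12.5, 8.0, 15.0], cutoff=10.0)
average = client.mean(strong)
```

Keeping the transport pluggable is one possible design choice: the same calls could be pointed at a real service later without changing the analysis script.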
05:23
So we have done some initial prototypes with this. This is using Pilatus [name unclear], which is a homegrown system. We tested it out with SSM/I microwave data, with the different parameters: rain rate, wind speed, water vapor, liquid water. This is a very small data set, roughly 1 terabyte. We [unclear] it on our own cluster, and then we have an engine with which we can query the data set very fast. So this is a
05:55
pretty simple UI where you can select which data sets you want. We have data from 1987 to 2012. You can process very basic queries [unclear]: a simple threshold on the data, and then a statistical analysis of the data. You can select your data channels and your time range of interest. So this is a
06:20
really simple example, but what is nice about it is that it's interactive: you're actually interactively playing with 20 years' worth of satellite data using a simple threshold. So, a simple question: I want to look at hurricanes [unclear]. All right, so your query: you want to find extreme rain events in this region. You get results based on the year, and you can pick a particular year, and then you can see the seasonal distribution of these events. Then you see that one month looks [unclear]. So this is the value: a small slice [unclear] of the data pointing you deeper to what's actually going on. Then you can drill down
06:58
to a particular month, and you can actually pick out the particular case; this is Katrina. And you have a heat map, which is basically a spatial frequency: it's showing the location of the rain for
07:12
that month. Then you can link it to the actual data, so you can go see the actual SSM/I [unclear] and browse the actual image for that particular event. So this is a really simple example, but I presented this at a conference
07:27
and, after my talk, a scientist who works with this data said, you know, I want to play with your tool. So she and I sat down for an interactive query session. This is the work that she's doing right now: she's looking at gap winds in Central America. I will [unclear] not pronounce it correctly. It's basically a phenomenon that occurs when there's the right weather condition, and the topography causes this regional jet to form. This regional jet then causes ocean upwelling [unclear], which is very important for the [unclear] industry, because that's when they do the fishing and stuff like that. So they are working on algorithms to detect these things, and I said, okay,
08:12
I want to play with your tool to see whether I can do this by just forming a simple query. So we ran our simple threshold query on the wind speed for that particular region, and then we selected a particular year, and she could start seeing the seasonal distribution of those events. And when
08:29
you select a particular month, the heat map shows exactly where it is. You can actually see the three winds [unclear]. And this is not the only case that we did. And again, you can
08:41
verify by linking back to the actual data to make sure [unclear]. All right.
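Editor's sketch: the drill-down described above (threshold query on wind speed over a region, counts per year, seasonal distribution, then a spatial-frequency heat map for one month) can be approximated in a few lines. This is only an illustration under stated assumptions: the observations are synthetic values invented for the example, not SSM/I data, and every function name, the threshold, and the bounding box are hypothetical.

```python
import math
from collections import Counter
from datetime import date

# Synthetic observations invented for illustration: (day, lat, lon, wind_speed_m_s)
observations = [
    (date(2005, 1, 10), 11.0, -87.0, 16.0),
    (date(2005, 1, 11), 11.2, -86.6, 14.5),
    (date(2005, 2, 3), 15.1, -95.0, 18.0),
    (date(2005, 7, 20), 11.0, -87.0, 6.0),     # below the threshold
    (date(2006, 1, 5), 11.4, -87.1, 15.0),
    (date(2006, 12, 24), 30.0, -60.0, 20.0),   # outside the region of interest
]


def threshold_query(rows, min_speed, lat_range, lon_range):
    """Keep rows inside the bounding box whose wind speed exceeds the threshold."""
    return [
        r for r in rows
        if r[3] > min_speed
        and lat_range[0] <= r[1] <= lat_range[1]
        and lon_range[0] <= r[2] <= lon_range[1]
    ]


def counts_by_year(events):
    """Number of events in each year."""
    return Counter(e[0].year for e in events)


def counts_by_month(events, year):
    """Seasonal distribution: number of events per month within one year."""
    return Counter(e[0].month for e in events if e[0].year == year)


def heat_map(events, year, month, cell_deg=1.0):
    """Spatial frequency: number of events per cell_deg x cell_deg grid cell."""
    cells = Counter()
    for day, lat, lon, _speed in events:
        if day.year == year and day.month == month:
            cells[(math.floor(lat / cell_deg), math.floor(lon / cell_deg))] += 1
    return cells


events = threshold_query(observations, 12.0, (5.0, 20.0), (-100.0, -80.0))
per_year = counts_by_year(events)
per_month = counts_by_month(events, 2005)
january_cells = heat_map(events, 2005, 1)
```

Each step narrows the previous result, which mirrors the year, month, heat map sequence in the demo; linking a heat-map cell back to the underlying granules is left out of the sketch.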
08:46
The other case that she looked at is the Somali jet, which is a really important precursor to the Indian monsoon. It is again a low-level jet that occurs off the coast of Somalia. So again,
08:59
simple stuff: a threshold query looking at a particular region. You can see the heat map picks up exactly where it's happening. You can see the seasonal distribution of these events. One question is when the onset [unclear] starts, so you can select the particular month and see where the onset of this jet starts [unclear]. It's really nice if you have a tool, when you have this large data, that you can actually play with and explore interactively. As for the kinds of questions you can ask, these examples demonstrate that we can actually do kind of neat things
09:40
with this. That's the presentation. If there are any questions, we'll be happy to answer.