
Hyperconvergence meets BigData


Formal Metadata

Title
Hyperconvergence meets BigData
Title of Series
Part Number
9
Number of Parts
169
Author
Rafael Monnerat
License
CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose, as long as the work is attributed to the author in the manner specified by the author or licensor, and the work or content is shared, also in adapted form, only under the conditions of this license.
Identifiers
Publisher
Release Date
Language
English

Content Metadata

Subject Area
Genre
Abstract
Rafael Monnerat - Hyperconvergence meets BigData This presentation shows how to deploy **[Wendelin]**, the free software platform for Big Data & Machine Learning, using **[SlapOS]**, the free software hyperconverged Operating System (hOS). Written 100% in Python, SlapOS and Wendelin can create a complete Big Data infrastructure with out-of-core capabilities, ready to use and operate, in just a few hours. ----- This presentation aims to demonstrate how to use [SlapOS] (hyperconverged OS) to deploy an entire Big Data infrastructure, and to show how the "data life cycle" can be managed with [Wendelin] - covering ingestion, analysis, visualization and weaving it into an application. We'll show how Wendelin and SlapOS can handle acquisition, analysis and exploitation of data, making them a potential solution for IoT scenarios where data is available and needs some logic applied before being presented as a web application, possibly on a commercial basis. The agenda of the presentation includes an introduction to SlapOS, as a tool used to deploy a wide range of different services, and an introduction to Wendelin, as a tool for making out-of-core Python applications. After this short introduction, we progress to the steps to deploy a SlapOS infrastructure, and later to deploy Wendelin on the just-deployed SlapOS, including a use case in which SlapOS deploys a Fluentd instance to ingest data into the Wendelin database. To conclude, we make a live demo with Jupyter using out-of-core Python to handle WAV files stored in Wendelin, and a second short demo on handling computer resource consumption data.
Transcript: English (auto-generated)
Hello, good afternoon. My name is Rafael Monnerat and I'm going to talk about hyperconvergence meeting big data. I work for Nexedi, from Paris. I will present today the hyperconvergence we do with SlapOS and how we deploy big data projects using Wendelin: how we deploy it, how we normally upload data, and I will make quick demos at the end of the presentation. The goal of this presentation is to be a bit modular: you are not necessarily going to use SlapOS, and not necessarily going to do big data with Wendelin. But the combination of the two, which we have been using, is somehow the key we can imagine for the future of hyperconvergence with big data, in order to collect data and automate big data deployments for the Internet of Things. These two tools and this presentation reflect how Nexedi works with its customers. Nexedi is one of the largest open source publishers in Europe, despite the fact that it is a small company with just 30 or 40 employees. We have produced a large amount of open source software, and I will cover two of those projects today. This is the stack I'm going to base myself on today.
These tools were mostly created to fill the need of a customer who couldn't find an alternative solution for the problem they had. During the presentation I will give some examples of how SlapOS was designed and implemented, and how, over the evolution of the tool, it targeted topics which are not exactly covered by other tools. So this is the stack, which is fully open source and mostly based on Python, except for Fluentd, which is actually written in Ruby and is not software written by Nexedi; we simply couldn't find an equally reliable solution in Python on the market these days. Wendelin Core provides out-of-core PyData, which means that we can process data which is larger than the RAM of the computer. NEO is a distributed database: for those who know Zope, it's a distributed ZODB. ERP5 is an open source ERP, not so exciting in this presentation. SlapOS is the tool that I will present and follow. re6st is something that we developed to provide interconnected mesh networks worldwide, Fluentd is used to collect data, and scikit-learn is for doing machine learning, among others. So this was the start of SlapOS.
It started in 2009 or 2010, I don't remember exactly the date. When it was built for the first time, we were proposing at that time to put servers in people's homes. So we designed a system that could be distributed in a way that it could work in more than one data center, so it could work on Amazon, Rackspace, OVH or any other provider, and also be able to host services in people's homes or offices in a distributed way. It started to work very well, and then, with the Internet of Things and other projects coming up, SlapOS became a tool that could be installed on machines in cars or trucks to provide a mobile cloud; we have an ongoing project on this. It can be used to host services in people's boxes, like the Freebox in France: you could produce an equivalent of it with SlapOS. It's being used in wind turbines to collect data, in order to tell when there is a need for preventive maintenance. And it's also used to create Internet of Things routers: you get a Raspberry Pi and install it, and then you can collect data from the several devices that happen to be connected to the network, or manage certain services at your home or elsewhere. So now we go beyond the data centers: we can manage at the same time a mobile cloud and a normal CDN using data centers, using exactly the same system without significant modifications in any of the parts. And it's important to show that SlapOS can provide nodes everywhere, and it uses a central server. For now it's only one, but in the future there can be more than one master, to control any number of computers and devices which have SlapOS installed.
So, just to illustrate a bit: what is SlapOS? SlapOS is composed of whatever Linux is installed, whatever Linux is available today, as a base. There are three core components: the SlapOS core, which coordinates everything; Buildout, so we can reconstruct a software from scratch or use some cache automation to share already-compiled software between two machines based on the same architecture; and supervisord to manage the processes. This is what is present on all machines. On top of it you have the software releases. The software releases are a kind of equivalent of a group of packages, placed in a special way in the system in order to provide the binaries to run whatever service those binaries are meant for. And you can have several configurations on one machine, so the same machine can have more than one version of MariaDB or Apache or a word processor running at the same time without conflicting with each other. The software release itself doesn't provide any running process; it's just the software sitting there. The software instances are the ones that run the services. You can imagine that when you install a package it only provides the binaries, and the instance side tells how the binaries will run, how the service will be composed. Software instances are a bit similar to a kind of micro-container: they are containers, but lighter, in the sense that they don't add the overhead of copying the same group of files everywhere. That's why there is this major separation. This represents, for example, one machine running anywhere: it can be a machine in a data center, or my laptop, which I move around everywhere, and it could be hosted in a car if there is a need for it. One important thing to remark is that it can run VMs, virtual machines, at the same time, a bit like what OpenStack does, alongside other services which are not virtualized, which are just processes. So a machine can compose many different kinds of services in a distributed way, and a cluster can use one machine or several machines; it depends on how the composition is configured.
Later in this presentation I will show the case of big data with Wendelin. Based on this configuration, by sharing services and sharing computers, we can supply several projects and run them all at the same time on the same machines. And from this list you can see that they are significantly different in terms of goals. For example, today we are running a worldwide CDN which is present in China, so we have services in China. We have KVM clusters for big data in Teralab, which is a French project that provides big data for large French companies at the Institut Mines-Télécom. We have Wendelin for big data, which I will mention next; it's in production today providing preventive maintenance for wind turbines in Germany. And for development we have distributed test nodes, a kind of equivalent of Jenkins distributed over several machines worldwide. We have a system that can produce VM images; I don't know if people know it or not, but we automate the work that Packer does to generate VMs, so we can generate pre-built VMs. And we also use it to provide Chromium OS images for Chromebooks: we have our own distribution of Chromium OS, which is called NayuOS, and we use SlapOS to build the images, for ourselves or for anyone who wants them. To make this work, we need a uniform way to install it everywhere: if we had 20 different ways to install the same thing on different architectures, it would require much more effort to deploy anything. I will come back to this one.
So how do we deploy it? We have a one-line installation script that asks you if you want to connect to a master; if you want to connect to whatever master, you can tell it which one, and you connect your device to that machine. For example, this laptop is connected to the master, so I can use the master to deploy services on my laptop. Or if you have a mobile cloud, it's the same way: I can control machines through the master, deploying whatever service to whatever machine is connected to it. We use Ansible, which is also a Python tool, to automate the setup of the node. It allows us, with the same line, to set up a Raspberry Pi, a laptop, a Chromebook, or a production data center; Ansible takes care of the particularities of the system that is being installed. So we can support a very large number of Linux distributions just by using this command. If your preferred distribution is not supported for whatever reason, we will be happy to add it; we just add them on demand.
You don't actually have to always connect to a master: you can deploy a standalone node by just skipping the questions and running these two commands here. If you type slapos node configure local, you get your computer configured to use any software that is available in SlapOS, and this one prepares the machine's partitions, folders and so on. When you install SlapOS you also get a command line tool that allows you to supply and request, and a console to automate the deployment of the software that you want to deploy. So here is an example of a request and a supply: I'm deploying the software release of a monitor to a computer, and then I'm requesting to run one instance of this monitor on that computer. It's the equivalent of setting up a monitor for a wind turbine, for example. Here are just variations of the requests that can be done.
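Roughly, those two operations look like this inside the slapos console (a sketch: the software release URL and computer ID are placeholders, and the calls follow the console's usual conventions):

```python
# Inside "slapos console", a Python prompt connected to the master.
software_url = 'https://example.com/software/monitor/software.cfg'  # placeholder

# supply: tell the master this computer should install the software release
supply(software_url, 'COMP-1234')

# request: ask the master to run one instance of it on that computer
instance = request(software_url, 'wind-turbine-monitor',
                   filter_kw={'computer_guid': 'COMP-1234'})
print(instance.getConnectionParameterDict())   # connection info, once ready
```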
When you deploy this monitor, you already deploy Fluentd, which is what collects the logs from the machines, and that leads me to the Wendelin part.
As SlapOS is everywhere and standardized in a way that we can put it anywhere, we were able to quickly deploy a Wendelin stack, which is a tool for providing big data analysis and out-of-core Python. The advantage of using SlapOS in this case is that it doesn't require hours of setup to have the stack, so even a data scientist who has no background in setting up a cluster can set it up. It also allows people who are not data scientists to have the full stack of, for example, scikit-learn, NumPy, out-of-core, a distributed database and a Jupyter notebook, ready to use for making some kind of calculation. So both sides can benefit from the quick setup, by not spending time on learning how to pip install NumPy on ARM on your Raspberry Pi, examples like this. Wendelin was also designed to work on commodity hardware, so it doesn't require a super powerful machine to be deployed. Keeping in mind the dimension of what you are going to do, you can do big data with machines that you can buy in a supermarket, for example. You can buy a few i7 machines in a supermarket, and then you can start to do big data, because it's quite easy to find an i7 with 16 or 32 gigabytes of RAM anywhere, and SSDs are becoming cheaper and cheaper, so you can buy a one-terabyte SSD these days quite easily. And as everything was designed to be distributed with SlapOS, you can buy several cheap machines and then you have big data. You don't have to spend 100,000 euros buying expensive hardware to start to do big data. So the stack is composed of average hardware.
Of course, people who can afford it can buy more reliable hardware, but it doesn't require special servers. So we use SlapOS; ERP5 is just the base tool, with NEO to provide an object database that we are going to manipulate soon, and scikit-learn to provide machine learning and other features that you can use in big data. ERP5 is also used to provide an already old but equivalent component, comparable to joblib: we have had distributed activities, so we have been able to do background and asynchronous programming for 10 or 12 years already; since I started, I have been doing asynchronous programming with it. But the stack is nothing if the data doesn't arrive there: you can only do big data if the data arrives at the tool.
So we use Fluentd, mostly because it's one of the most reliable tools that exists today. We ran a test in the office when we were selecting it: we put a laptop in our normal office and left it on over the weekend, pushing data to Wendelin, which is the use case, although we were not benchmarking Fluentd itself. It lost just one record out of a million over the space of two weeks, which is very, very reliable, because a laptop is turned off and on all the time: suspended, resumed, suspended, resumed; the person goes home with the laptop and it connects from another network, then it turns off, suspends, resumes, goes to 3G, then goes to Wi-Fi again. Through all of this it lost just one record. And for the places where we cannot afford to run a Fluentd process, we can just run an HTTP server and crawl it with Fluentd. What we extended Fluentd with, in this case, is the ability to stream binary data, as I will show soon: we can stream wave sounds in .wav format to Wendelin and plot an FFT out of them. So how do we deploy it? Here we saw how to request, on whatever computer, a monitor, which comes with Fluentd. And here, with just two lines, we can request Wendelin. The full stack with all the tools, scikit-learn, NumPy, Wendelin out-of-core and several other scientific tools installed on it, is available just by typing these two commands on whatever node you want. Or, if you are in a standalone fashion and don't want to connect, you just want an instance in your VM on Amazon, you just type this command and you get everything. Soon, it was not ready for this conference, but soon we are going to release ready-to-use images for QEMU, EC2, DigitalOcean, VMware, VirtualBox and so on, which can provide ready-to-use, ready-to-try, I would say, instances of Wendelin.
So you don't have to prepare or install big data yourself anymore. Here is the configuration, what I ran earlier today to upload the data. You generate a file which is basically like this, which says: look in this folder for WAV files, and save the position to know what you have already sent or not. Then you can tag your data: you can have different tags for different data and send them to different ingestion policies, so you can classify or shard, or do whatever you want with your data. And here I just used the Wendelin plugin, which already comes with the monitor that I just mentioned, but if you are just installing Fluentd from Treasure Data, you can easily install it, it's just a file in a folder. Then you say where you are sending to, and which is the user and which is the password. And here you can see that it found six WAV files, yes, six, and it streamed everything to this ingestion. So in a few minutes you can start to ingest files into your big data.
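As an illustration, here is a minimal sketch of the kind of Fluentd configuration described: a tail-style source watching a folder, with a position file and a tag, matched to the Wendelin output plugin. The option names follow common fluent-plugin-wendelin usage, and the paths, tag and credentials are placeholders; the demo itself relied on a Nexedi extension able to stream binary WAV files.

```python
# Sketch only: writes a Fluentd config like the one described in the talk.
# Paths, tag and credentials are placeholders, not the exact file shown.
FLUENTD_CONF = """
<source>
  @type tail                          # the demo used a binary-capable input
  path /srv/sensors/*.wav             # look in this folder for WAV files
  pos_file /var/run/fluentd/wav.pos   # remember what was already sent
  tag sensor.wav                      # tag to route/classify/shard the data
</source>

<match sensor.**>
  @type wendelin                      # fluent-plugin-wendelin output
  streamtool_uri https://example.com/erp5/portal_ingestion_policies/default
  user ingestion_user
  password secret
</match>
"""

with open("wendelin-ingest.conf", "w") as f:
    f.write(FLUENTD_CONF)
# then run it manually:  fluentd -c wendelin-ingest.conf
```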
It probably also works with the other equivalents of Fluentd, like Logstash, and I forget the name of the other one, which is also written in Ruby. You can write compatible plugins just by making POSTs and making sure you are consistent when you send the data. Coming back: this is what I just showed. You can limit the buffer and so on: you can buffer in memory, or if you have too much data streaming you can buffer on disk. And you can just run it manually like this: just fluentd, and then you pass the configuration file. Or, depending on your setup, and whether you are using SlapOS or not, you can write a very complex configuration file. So it takes just a few minutes. This was just what I showed: when you run it, it will just say that you sent the data. Then if you are using different plugins, for example to get syslog or the machine resource consumption that we use too, you just have to adjust the source part. I can show another example of data if I have time.
So, where does the data go? I will just jump quickly. The data goes here: this is the UI of ERP5, which will just store the data. With a few inputs you can create the entire path, ready to use with that configuration file that you saw. It will create an ingestion policy where you can use Python so that, depending on the tag, you can write into different data streams. I'm not doing anything complex here: whatever arrives, I just put into the same data stream. I can just go to data streams and search for a WAV. So here I have the WAV file that I sent earlier, with the amount of data. I can manually upload a file, which will overwrite the file that is already there, overwrite the data that is there, and I can append files manually. So you don't have to rely only on Fluentd to send data: you can upload some data that you have to manipulate, or you can make POSTs, for example, to upload a certain data store that you already have.
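For the "just make POSTs" route, a hedged sketch: the endpoint URL, reference parameter and credentials below are invented placeholders for illustration, not Wendelin's documented API.

```python
# Hypothetical sketch: push one file to a Wendelin ingestion endpoint over
# plain HTTP instead of Fluentd. URL, parameter and credentials are
# placeholders only.
import requests

with open("sample.wav", "rb") as f:
    resp = requests.post(
        "https://example.com/erp5/portal_ingestion_policies/default/ingest",
        params={"reference": "sensor.wav"},          # tag-like reference
        data=f.read(),
        auth=("ingestion_user", "secret"),
        headers={"Content-Type": "application/octet-stream"},
    )
resp.raise_for_status()
```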
Then, how do we use this data? This data represents several wave sounds that were streamed to this computer; I will show the data arrays later, when I start to do the demos. So here, a demo. I installed SlapOS everywhere and I have my Wendelin set up; at normal speed you can set up everything in one day, or less. And I go to my demo. I hope I still have IPv6, otherwise I will not make the demo, but just show the thing. And believe it or not, I have IPv6 here, even if you don't. So, instead of using the normal IPython, we just made a small extension to the normal IPython notebook. Not the entire tool: we just created a different kernel that we can use with some magics, which can make our life easier and be more reliable when we deal with data. I will assume that you know IPython minimally; if you get lost, raise your hand, because this will be the time to do it. So, some magics. We just say where we are going to connect our notebook; it's just a reference. Then when you finish, you get an object called context. You can see it as a kind of proxy. It's not exactly a proxy, but whenever you call, say, context.getId(), you are doing a remote call.
you are doing a remote call. Let's see. Okay, I don't have IPv6. No. I have.
Yes, I have. Again. So, I can just call whatever I want. I'm doing a remote call to that object. For relative world.
So, based on this, I can get the data that I streamed. So the data that you saw, it's under this path. But of course, nobody remembers the ID of every object that you want to manipulate. So you can search the object by using the catalog.
So you can make queries to the database to know where it is, the paths, and so on. So you can just use portal catalog. Then it starts. This is an interesting part. So, even if it's out of core,
you are not calling the methods on the IPython notebook, but you are doing a remote call to manipulate the object, you can still do bad things. So for example, this gets the entire data which is in the stream. So if there is one terabyte data,
it will get everything as string to your browser. Not good at all. So, the only thing that you have to take care is to use different approaches. Not so much different, but you have to take care
to not load the data all at once when you want to manipulate. Here is just two examples that we are on core, that we are making the IPython notebook handling all the data. And here we are just managing small chunks of data.
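A hedged sketch of that pattern: portal_catalog is the standard ERP5 catalog API, while readChunk() stands in for whatever chunked-read method the data stream actually exposes.

```python
# Find the stream through the catalog instead of remembering its ID.
stream = context.portal_catalog.getResultValue(
    portal_type='Data Stream', reference='sensor.wav')

# BAD ("in core"): stream.getData() would fetch the whole stream, possibly
# a terabyte, into the notebook as a single string.

# BETTER: bounded memory, one chunk at a time.
CHUNK = 1024 * 1024                      # 1 MiB per remote read
offset, size = 0, stream.getSize()
while offset < size:
    chunk = stream.readChunk(offset, min(CHUNK, size - offset))  # hypothetical API
    # ... process chunk ...
    offset += len(chunk)
```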
Here are just a few parts, because I have to import scipy to handle the waves that I want. Then I wanted to make an FFT without loading the entire data. For this, if you read the code of scipy's read function, it expects a file. As a file is expected, it could use Python's StringIO; however, if I used StringIO, I would have to load the entire data to give it to StringIO. That's why I made this class as a wrapper, to make an out-of-core stream look like a file. When I pass this file-like reader, it behaves like a file, but without loading the entire data, because you can imagine that the data can be one terabyte. In this way, by using this wrapper, I can manipulate a one-terabyte file as if it were an average file, without requiring me to have one terabyte of memory.
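The wrapper described might look roughly like this; getSize() and readChunk() are assumed accessors for the remote stream, not necessarily Wendelin's real method names.

```python
import io

class DataStreamFile(io.RawIOBase):
    """Expose a remote data stream through the file protocol, so that
    scipy.io.wavfile.read() can consume it without loading it entirely."""

    def __init__(self, stream):
        self._stream = stream
        self._size = stream.getSize()   # assumed remote size accessor
        self._pos = 0

    def readable(self):
        return True

    def seekable(self):
        return True

    def tell(self):
        return self._pos

    def seek(self, offset, whence=io.SEEK_SET):
        if whence == io.SEEK_SET:
            self._pos = offset
        elif whence == io.SEEK_CUR:
            self._pos += offset
        else:                            # io.SEEK_END
            self._pos = self._size + offset
        return self._pos

    def read(self, size=-1):
        if size is None or size < 0:
            size = self._size - self._pos
        data = self._stream.readChunk(self._pos, size)  # hypothetical remote read
        self._pos += len(data)
        return data

# usage: rate, samples = scipy.io.wavfile.read(DataStreamFile(stream))
```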
So here I just get one channel. And here I'm just saving, oh, I'm running out of time. Here I'm just saving the arrays and plotting them, making the FFT here. I get the array, I save the array, I re-fetch the array in order to make it out-of-core, and then I save it and I plot. I can save images to the database too. So here is where the previous times I invoked the FFT are, so I can see the files. Or I can re-fetch, much later, an array that I already processed, and replot it; so I can save and recover the arrays that I'm using.
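The FFT-and-plot step itself is plain NumPy and Matplotlib; here is a self-contained sketch using a synthetic signal in place of the remote channel:

```python
# Self-contained sketch of the FFT step, with a synthetic signal standing
# in for one channel of the remotely stored WAV data.
import numpy as np
import matplotlib.pyplot as plt

rate = 44100
t = np.arange(rate) / rate                  # one second of samples
channel = np.sin(2 * np.pi * 440 * t)       # 440 Hz tone as the "channel"

spectrum = np.abs(np.fft.rfft(channel))
freqs = np.fft.rfftfreq(channel.size, d=1.0 / rate)

plt.plot(freqs, spectrum)
plt.xlabel("frequency (Hz)")
plt.ylabel("amplitude")
plt.show()
```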
And I move to the second demo, where I'm now going to emulate asynchronous processing. It's the same thing as before. Here I made just a calculation to see how much data I have on the site: 33 gigabytes. And here I make the same calculation, saving the calculation ten by ten, and I put it in background processing. So I'm just putting the processing in the background, and later I'm checking whether the processing is already finished. Then I can make the same calculation again, but doing a kind of map-reduce, using a cluster of instances instead of programming it myself at the level of the IPython notebook.
So this is the kind of thing that you can do after one day of setup. I went through the asynchronous part very quickly. You can follow the tutorial in the link, which I will make available on Twitter and on the site, to plot data directly in the browser using JavaScript; there is a short tutorial. And you can pip install the core of Wendelin and use it without installing the full stack, so you can use Wendelin and its out-of-core features just on your computer, to make small calculations which exceed the RAM that you have on your computer.
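The package in question is wendelin.core on PyPI; as a runnable stand-in for the out-of-core idea, numpy.memmap shows the same larger-than-RAM, chunked access pattern (this is not Wendelin's actual API).

```python
# Out-of-core idea sketched with numpy.memmap as a plain stand-in for
# wendelin.core: the array lives on disk and pages load lazily, so it can
# be bigger than RAM if you work on it chunk by chunk.
import numpy as np

N = 200_000_000                      # ~1.6 GB of float64
a = np.memmap("big.dat", dtype=np.float64, mode="w+", shape=(N,))

CHUNK = 10_000_000
for start in range(0, N, CHUNK):     # bounded resident memory per step
    stop = min(start + CHUNK, N)
    a[start:stop] = np.arange(start, stop)

print(a[0], a[-1])                   # 0.0 199999999.0
```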
Well, thank you very much. I ran over a bit, sorry. So now I think it's the coffee break; I stole a bit of your coffee break. If anyone has questions, you can ask, or feel free to go to the coffee break.