Enabling cloud for e-Science with OpenNebula
Formal Metadata
Number of Parts | 90
License | CC Attribution 2.0 Belgium: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/40282 (DOI)
FOSDEM 2013 (89 / 90)
Transcript: English (auto-generated)
00:03
My name is Ishan and I work as a system administrator at PDC, the High Performance Computing Center. We'll discuss how we bring e-Science users into the cloud using OpenNebula.
00:20
This is not a product showcase, and I'm not affiliated with OpenNebula. If some of the developers are here, they can help me with that. There are various ways you can use different platforms; here you can learn how we are actually using it.
00:43
So, today's agenda: who the e-Science users are, some of the projects, and the challenges we face. At the end we'll talk about federating different cloud centers with EGI-related cloud platforms. And definitely Q&A.
01:01
PDC is a high performance computing center located at KTH. We host supercomputers, like the Cray on the left side of the slide (my right side, your left side), and we have other high performance computing systems. The reason I'm telling you this is that the users we serve basically come with an HPC mindset.
01:24
And who are the e-Science users? Initially we focused on bioinformaticians, who were currently running on HPC machines. And that was an issue, because they think in HPC style: that they are to get big machines for a continuous number of hours.
01:46
They want it at large scale. That was the challenge for us: to change this mindset. But not all of them; some of the users need short peaks, like elasticity. And what we realized is that, if you look at this picture of the long tail,
02:05
at the start you have the big users who are using HPC, and then the long-tail users who are not using HPC but need these short peaks. So the sets of users are different. Some of the users need HPC machines for their scientific problems.
02:24
And right now, the cloud is not in good shape to host those HPC users. Giving them thousands of cores 24 hours a day, 7 days a week is not a cloud workload. We had many talks and discussions about this.
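The contrast between a sustained HPC-style allocation and short long-tail peaks can be made concrete with a little arithmetic. The following is only a sketch: the core counts, durations and the helper name `core_hours` are illustrative assumptions, not figures from the talk.

```python
def core_hours(cores: int, hours: float) -> float:
    """Core-hours consumed by holding `cores` cores for `hours` hours."""
    if cores < 0 or hours < 0:
        raise ValueError("cores and hours must be non-negative")
    return cores * hours


# Assumed, illustrative workloads (not numbers from the talk).
HOURS_PER_WEEK = 24 * 7

# HPC style: a thousand cores held around the clock for a week.
sustained = core_hours(cores=1000, hours=HOURS_PER_WEEK)

# Long-tail style: three peaks per week, each 100 cores for 2 hours.
bursty = 3 * core_hours(cores=100, hours=2)

print(f"sustained: {sustained:.0f} core-hours per week")
print(f"bursty:    {bursty:.0f} core-hours per week")
print(f"ratio:     {sustained / bursty:.0f}x")
```

Under these assumptions the sustained allocation consumes a few hundred times more capacity than the bursty one, which is why the two groups of users need different kinds of infrastructure.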
02:40
The question was who the driving force behind the cloud is. Since they come from an HPC background, they have this idea that they can get peaks while still holding thousands of cores. The cloud is not for that. The cloud is for these long-tail users who have small problems but want peaks for a couple of hours, two hours, three hours. Their usage goes like this. If you have seen the Amazon slides, you see this elastic fashion,
03:04
where it goes up and down, like on this slide. And this was a core proposition for us. So we started with a new project in 2009. At that time, the cloud was just starting to boom. And we found Eucalyptus.
03:20
At that time, it was at version 2.0, so we started working with that. We hosted a Eucalyptus environment and then connected it with another center. Later on, this project finished. Last year, in 2012, there was the VENUS-C project, where we collaborated with several centers across Europe.
03:41
And there we hosted OpenNebula. We moved from Eucalyptus to OpenNebula, together with CDMI; the CDMI interface was developed by Ian Hansen, whom I happened to find sitting over there. So he developed the CDMI interface for us, we ran that CDMI interface, and we did a different part.
04:00
The OVA part was developed by ENG.IP, an Italian group. They were running OpenNebula, and we were running OpenNebula. Another partner, BSC, the Barcelona Supercomputing Center, was running another solution. So we collaborated and combined this in a way that a user can submit jobs to any of the centers.
04:21
This project ended last year. The SNIC Cloud project is basically a Swedish project, and it is currently running. We started first with the public cloud. Why? Because, to put it in plain words,
04:41
we wanted users to become addicted to the cloud. And at that time, private cloud computing was not so mature, honestly speaking. They needed a nice interface, point and click, with all this elasticity built in and all this storage built in. And basically, scientists are scientists.
05:00
Don't treat them as system or, you know, batch experts. So they needed a nice interface with elasticity, you know, all this balancing business. Basically we wanted them to feel comfortable with the cloud, and at that time the public cloud was the only option. So we started with Amazon. We gave a few workshops. We built images for them so that they could become addicted to the cloud.
05:20
So, to change their minds, we gave workshops in Oslo, in Baryan, in Sweden, in Aksara and other places, just to familiarize them with the cloud idea. Since we already had our private cloud, we wanted to move them away from the public cloud because of the speed.
05:42
And if you write down the reason why: the problem we are facing is that, when we talk about scientific users, the biggest part is the data. These are three-terabyte, five-terabyte files. So if you transfer that to S3, and right now in Europe, the region,
06:02
this is not just a big problem; this is a really big problem. One issue is privacy. Swedish researchers, and bioinformaticians in particular, would, as they say, rather share their toothbrushes than their data.
06:21
So they don't want to share data with anyone, especially their results. And they don't want the data to leave the Swedish border, so hosting it on Amazon was an issue. And even if they solve some of the privacy issues, it is still very cumbersome because of the speed.
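To see why multi-terabyte files make transfer speed a real obstacle, here is a rough back-of-the-envelope sketch. The link speeds and the `efficiency` factor are assumptions chosen for illustration; the talk does not state the actual bandwidth available.

```python
def transfer_hours(size_tb: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Hours needed to move `size_tb` terabytes (decimal, 1 TB = 1e12 bytes)
    over a link of `link_gbps` gigabits per second.

    `efficiency` is the assumed fraction of the nominal link rate that is
    actually achieved (1.0 means the full rate).
    """
    if size_tb < 0:
        raise ValueError("size_tb must be non-negative")
    if link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_gbps must be > 0 and efficiency in (0, 1]")
    bits = size_tb * 1e12 * 8
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / 3600


# File sizes mentioned in the talk; link speeds are hypothetical.
for size_tb in (3, 5):
    for link_gbps in (0.1, 1.0):
        hours = transfer_hours(size_tb, link_gbps)
        print(f"{size_tb} TB over {link_gbps} Gbit/s: {hours:.1f} h")
```

Even at a full, uncontended gigabit per second, a single five-terabyte file takes many hours to upload, and real wide-area transfers usually achieve less than the nominal rate.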
06:43
So we faced the issue of moving those users, with their fancy Amazon cloud management console, back to the private cloud, with a few challenges. And these are the two challenges.
07:00
Basically, I am a system administrator myself, so I faced these issues. The most important kind is the non-technical issues. As you will see later, it's a lot about changing hearts and minds. So I think the technical issues are rather easier. What are the non-technical issues?
07:22
It suddenly comes up that the cloud is not secure. My background is that I started working at PDC on the grid side, grid computing. So, coming from the grid side, I faced this issue.
07:43
And several times when we were deploying the cloud, we faced this both from the user side and from other people. They were thinking of the cloud in HPC style, and it was hard for them to digest elasticity.
08:07
Telling them to launch more instances and then cut down instances is very hard. It's hard to get them to think in this way.
08:20
So we faced an issue: when users were supposed to run instances in an elastic fashion, they were thinking that they just get the instance, configure it, and then keep running it,
08:42
even during vacations. Human provisioning and self-provisioning are also problems, non-technical problems. And the last one is very funny, because the cloud has to live with other people. It has to live with other infrastructure,
09:02
and then you get different groups.
09:37
And this was the non-technical issue.
09:59
One of the next steps is the use case,
10:08
and since the sysadmin is always busy, he says, okay, I'll do this, and then it takes time, maybe hours, maybe days. Or maybe they ask for five virtual machines and get them after one week, and they say, okay, I'll give it one hour.
10:28
Besides all of this, we then create the VMs. And when they see this, when you give this self-provisioning to the user,
10:41
and they were afraid of managing everything themselves. Both are the same.
11:07
You have to tune yourself, you have to choose yourself what kind of starts you want to prepare. You want either to control, if it's secure about your position,
11:21
more big issues, ways to solve this issue.
11:51
This is the way we solved it, and this can be extended or improved later. So, application data, and server network are for control.
12:04
We solved it with different image repositories, which we don't necessarily, as slaves,
12:41
should not get. This is one of the issues that we solved, because we know that, basically,
13:04
down there, they want to post outside.
13:43
They have to do this analysis, and so on, combined with cloud management. And the sharing of images, because we face this issue,
16:00
but even if it is cloud, you have to go outside of your IDE and go to Sunstone, or use the Spark here,
17:58
which are the legal software,
18:04
so that it's managed and passed, default password, pre-processing, what the user installed.
18:44
So, this is the real problem which we faced.
19:22
EGI is the successor of EGEE, Enabling Grids for E-sciencE, and you already know that. You can read more about the Fed Cloud Task Force. It is actually federating different centers across the world. So, when you talk about federation,
20:13
in front of their database. So, OCCI is part of our federation,
20:32
using an LDAP server. There are a lot of user records, but basically it is using the same user record
20:41
on how you federate users.
21:01
This also comes from the grid side. We were thinking in grid terms, because we were trying to reuse everything we had spent time on. So we did federation with X.509, because X.509 is heavily used in the grid world, with support for virtual organizations. So we created a VO to get this federation working.
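The idea of authorizing federated users through virtual organization (VO) membership can be sketched as follows. This is only an illustration of the concept: the class names, the VO name and the provider names are hypothetical, no certificate validation is performed, and it is not the actual EGI or OpenNebula implementation.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class User:
    """A federated user, identified by the subject DN of an X.509 certificate."""
    subject_dn: str
    vos: FrozenSet[str]  # virtual organizations the user belongs to


@dataclass(frozen=True)
class ResourceProvider:
    """A cloud site that accepts users from a set of virtual organizations."""
    name: str
    accepted_vos: FrozenSet[str]


def providers_for(user: User, providers: Iterable[ResourceProvider]) -> List[str]:
    """Names of providers that accept at least one of the user's VOs."""
    return [p.name for p in providers if user.vos & p.accepted_vos]


# Hypothetical example data.
alice = User("CN=Alice Example,O=Example University", frozenset({"fedcloud.example"}))
sites = [
    ResourceProvider("site-a", frozenset({"fedcloud.example"})),
    ResourceProvider("site-b", frozenset({"other.example"})),
    ResourceProvider("site-c", frozenset({"fedcloud.example", "other.example"})),
]
print(providers_for(alice, sites))
```

The design point is that each site only needs to know which VOs it trusts, while the user's identity and VO membership travel with the credential, so no site has to keep its own copy of every user account.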
21:22
When a user launches a VM on any of the resource providers, the question is where to get the list of images. So we use the StratusLab Marketplace. I noticed that OpenNebula has also launched one, but we were using StratusLab
21:41
Marketplace software to share the images. We have two schema, three institutions, and more are coming up,
22:02
more are trying.
22:28
Three of them are running StratusLab, which is based on OpenNebula. So, in total, we have 15 resource providers,
22:41
silo and technology providers, five user communities, user communities are the ones who are testing that. So, these are use cases. We were running. So, they were all the interface in front of that. I mean, you can, you can
23:15
GWT in between.
23:25
And each resource provider has to run this. It has to run a solution system, data management, and accounting. Not the monitoring itself; it has to accept monitoring into the resource provider. And accounting data will be pushed out.
23:42
The user can access everything using the same client, without changing anything, as long as he has access to all the resource providers through the centralized, federated authentication service. So he can launch a virtual machine on any of the 15 resource providers
24:04
without changing anything at those resource providers. This is truly federated. And this was another, this was the same thing, but different.
24:22
And these are the issues which we are facing now. Not solved yet.
24:45
This is a simple workflow that stops.
25:05
Honestly speaking, specify
25:25
that in the morning they need this number of instances, but when they go back home, these instances should be scaled back. This is not where we need to audit
25:52
or to inspect the running VM. Either way, the number of processes it is consuming, I mean, we need to audit the VM, for privacy regulation,
26:06
and the student is going to speak about that. This is to get an application running,
26:33
like SPARA. And there we have really prepared
26:44
applications like Galaxy, like for SPARA, on the fly. Infrastructure as a service, not software as a service, so it's not a platform as a service,
27:01
it's not software as a service, it's like application as a service, so it just comes in between. So they want to get a bit more work. Some of the users know how to configure and install software on virtual machines. Some of the users don't know, so they need this. Platform security
27:24
is even harder, because with platform security you have to decide whether a user should be able to access this file system or not.
27:48
At the port level, with nmap, if something is running on strange ports, you can notify the users. But for platform security it's very hard; you need some artificial intelligence, so you should apply machine learning techniques there.
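The port-level check described here can be sketched as a comparison between the ports a scan reports as open and a per-site allow-list. The sketch below does not run nmap or touch the network; the scan results, VM names and allowed ports are hypothetical inputs, and the function names are made up for illustration.

```python
from typing import Dict, Iterable, List, Set


def unexpected_ports(open_ports: Iterable[int], allowed: Set[int]) -> List[int]:
    """Open ports that are not on the allow-list, sorted ascending."""
    return sorted(set(open_ports) - allowed)


def build_notifications(
    scan: Dict[str, Iterable[int]], allowed: Set[int]
) -> Dict[str, List[int]]:
    """Map each VM with at least one unexpected open port to those ports.

    `scan` is assumed to come from an external port scanner and maps a VM
    identifier to the ports found open on it.
    """
    report = {}
    for vm, ports in scan.items():
        strange = unexpected_ports(ports, allowed)
        if strange:
            report[vm] = strange
    return report


# Hypothetical scan results and allow-list (SSH, HTTP, HTTPS).
ALLOWED = {22, 80, 443}
scan_results = {
    "vm-galaxy-01": [22, 80],
    "vm-user-17": [22, 6667, 31337],
}
for vm, ports in build_notifications(scan_results, ALLOWED).items():
    print(f"notify owner of {vm}: unexpected open ports {ports}")
```

A rule like this covers the simple infrastructure-level case; it says nothing about whether a given user should be touching a given file system at a given time, which is the harder platform-security question.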
28:01
You need to know whether this user is allowed to access this part at this time. This is the problem; platform security is even harder from this point of view. So this is the last slide, I think. Thank you very much; this is my email address and contact link
28:21
later on this talk. If you are interested, you can send me an email.
28:49
How are your HPC users?
29:03
HPC users, and HPC users is with thousands of cores and coming back to us, they want to go to that.