GPU Acceleration of a Global Atmospheric Model using Python based Multiplatform
Video in TIB AV-Portal:
GPU Acceleration of a Global Atmospheric Model using Python based Multiplatform
Formal Metadata
Title 
GPU Acceleration of a Global Atmospheric Model using Python based Multiplatform

Title of Series  
Author 

License 
CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and noncommercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license. 
Identifiers 

Publisher 

Release Date 
2017

Language 
English

Content Metadata
Subject Area  
Abstract 
GPU Acceleration of a Global Atmospheric Model using Python based Multiplatform [EuroPython 2017 - Talk - 2017-07-10 - PyCharm Room] [Rimini, Italy] Global atmospheric models play an important role in short-term weather forecasting and long-term climate prediction. Such a model requires enormous computing resources because all atmospheric states must be calculated at every time step (usually tens of seconds to several minutes). However, since most atmospheric models run only on CPU machines, they cannot use modern high-performance, low-power microprocessors such as NVIDIA GPUs and Intel MIC. Converting code from one machine to another is often costly. Although a model can be accelerated on GPU and MIC using OpenMP and OpenACC directives, it is not easy to achieve peak performance. I developed a new Python module named PyMIP (Python based Machine Independent Platform) to integrate C, Fortran, CUDA and OpenCL code behind a simple user interface. The main code, which includes configuration, flow control, I/O and MPI parallelism, is written in Python. Only the hotspots, which contain heavy number-crunching code, are written in compiled languages such as C, Fortran, CUDA and OpenCL. The hotspot code is compiled and imported using PyMIP at runtime. PyMIP lets a user switch machines with a simple flag. I am developing a new global atmospheric model based on PyMIP to make it easy to utilize various modern microprocessors. In this presentation, I will introduce PyMIP and show the computational performance on an NVIDIA GPU of the model's dynamical core developed with PyMIP.

00:05
Hello everyone. [unclear]
00:11
It's my honor to give this presentation. My name is [name unclear], and I'm from the Korea Institute of Atmospheric Prediction Systems. The title of this presentation is "GPU Acceleration of a Global Atmospheric Model using Python based Multiplatform". I'm going to talk about my efforts to take advantage of various modern processors for large-scale scientific computations. My presentation has four parts, as follows. First, I will introduce some modern processors, focusing on GPUs and CPUs.
00:57
Second, I will introduce a new small Python module named PyMIP, which I developed for multiplatform support. Third, I will present the results of applying PyMIP to our global atmospheric model. Finally, I will talk about some future plans. Let's start with the first part.
01:29
[unclear] If you are interested in computer hardware, you will know that there is a variety of processors in the computer market today, in addition to traditional CPUs. The most representative of these modern processors is the graphics processing unit, or GPU, developed by NVIDIA and AMD. The GPU was originally developed for dedicated graphics processing, but now it is used well beyond that domain [unclear]. Another is the many integrated core processor, named MIC, developed by Intel, which is made by integrating about sixty CPU cores [unclear]. In recent years, [FPGAs] have also been [unclear]; they improve performance by [unclear] the logic gates to fit the purpose of the numerical algorithm. In common, these modern processors offer high computational performance and high energy efficiency. They also necessarily require parallelization. This figure
02:48
shows the trend of computing technology over the last [unclear] or so years. Conventionally, computing performance was proportional to the clock speed of the CPU core, and the clock speed had been increasing steadily, according to Moore's law, for a long time. However, [unclear] the clock speed has hardly increased, due to power consumption limitations. Instead, as you can see from the lines in the chart, the number of cores is increasing rapidly. For instance, recently released GPUs have thousands of cores, [unclear] with a relatively slow clock speed. This trend is expected to last a long time. Let me
03:43
introduce the differences between GPUs, representing the modern processors, and traditional CPUs. The first is the proportion of transistors assigned to the arithmetic logic units, named ALUs. While CPUs allocate many transistors to cache and flow control, GPUs allocate most of their transistors to ALUs. Therefore, from the point of view of the FLOPS index, which indicates how many arithmetic operations can be performed per second, GPUs are much higher than CPUs. The second is that the memory bandwidth of GPUs, which use GDDR5 or [unclear] memory, is much higher than that of CPUs, which use DDR3 or DDR4 memory. However, to take full advantage of the GPU's performance, a highly parallel algorithm is required; otherwise, the performance may be lower than on CPUs.
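To make the FLOPS index mentioned above concrete, here is a minimal sketch of how a theoretical peak is usually estimated: the number of arithmetic units, times the clock frequency, times the floating-point operations each unit completes per cycle. The function name and both sets of numbers are illustrative assumptions only; they are not the specifications of any processor shown in the talk.

```python
def peak_flops(units, clock_hz, flops_per_unit_per_cycle):
    """Theoretical peak floating-point operations per second."""
    return units * clock_hz * flops_per_unit_per_cycle


# Hypothetical "few fast cores" processor:
# 16 cores at 3 GHz, each completing 16 FLOPs per cycle.
cpu_like = peak_flops(16, 3.0e9, 16)

# Hypothetical "many slow cores" processor:
# 3000 cores at 1 GHz, each completing 2 FLOPs per cycle
# (one fused multiply-add is counted as two operations).
gpu_like = peak_flops(3000, 1.0e9, 2)

print(f"CPU-like peak: {cpu_like / 1e12:.2f} TFLOPS")
print(f"GPU-like peak: {gpu_like / 1e12:.2f} TFLOPS")
print(f"ratio: {gpu_like / cpu_like:.1f}x")
```

The point of the sketch is only that many slower units can add up to a higher peak than a few fast ones; reaching that peak still depends on the algorithm being parallel enough, as noted above.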
05:04
This table shows the specifications of some CPUs and GPUs sold on the market. [unclear] Take a look at the CPU and the GPU in the [unclear] boxes. For the GPU, the memory bandwidth is more than 4 times higher, and the FLOPS performance is more than 8 times higher in single precision and 5 times higher in double precision. I did not make a direct comparison in the table, but [unclear; a remark about prices]. [unclear] Of course, the GPU has its own advantages and disadvantages. The strengths of GPUs are high performance for data-parallel computation, [unclear] cost and high energy efficiency. However, if the problem is not a data-parallel computation, then the GPU may be of little use. You should also consider that the GPU is currently attached through the PCI Express bus on the motherboard; if the computation requires [unclear] the host, the bottleneck might become more serious. OK, my
06:32
important question about the modern processors is this: how can I utilize these various modern processors in our model? The model code being developed in my company is written in Fortran, so directive-based methods such as OpenACC and OpenMP are considered the most natural. As in the examples, you insert the appropriate directives before the loops; the compiler then generates the code for the target machine. The directive-based approach is elegant and [unclear]. However, in my personal experience, directives are easy to start with, but it is difficult to achieve high performance with them. On the
07:28
other hand, I think that each processor has a native programming language which is designed to support it best: for example, CUDA for NVIDIA GPUs, OpenCL for AMD GPUs, and [unclear] for the Intel MIC [unclear]. These native languages should be more beneficial for maximizing performance on each processor. So my question has changed like this: is there a way to integrate code written in various native languages such as C, Fortran, CUDA and OpenCL? So I made a
08:13
new small Python module named PyMIP, which stands for Python based Machine Independent Platform. The goal of PyMIP is to switch the processor and the language easily. PyMIP relies on three components: [unclear] is used to [unclear] and to allocate array variables, [unclear], and [unclear] is used to wrap Fortran, C and [unclear] libraries. This figure shows the multilayer structure of PyMIP. The main code, along with the application parameters, is written in Python. It includes the parts that have relatively little effect on computational performance, such as configuration, flow control, pre- and post-processing, and so on. The heavy computation parts are written in lower-level languages according to the target processor; they are compiled [unclear] and imported as Python modules. For convenience, I also made array [unclear] that depend on the target processor, because GPUs and the Intel MIC have dedicated memory space.
09:47
This is a very simple example of how to use PyMIP. It calculates an expression of the form a times x plus y, where x and y are vectors and a is a scalar. Below is a simple Python code using [unclear]. Now let's convert this code to compiled languages such as Fortran, C, CUDA and OpenCL. First, these are the Fortran
10:21
and C versions, and then the
10:28
OpenCL and [CUDA] versions. [unclear; a remark about what is shown on the screen] compared to the Python code.
10:45
Finally, this is the Python main code, using PyMIP to integrate the previous four low-level codes. [unclear] The second part initializes the data. The following statements specify the processor and the language; in this example, CPU and Fortran are specified. The next part allocates the arrays; these are array objects for the target processor. The next part compiles the code and loads the functions defined in the previous kernel code [unclear]. The next part calls the functions with the prepared array arguments. At the end, the result is checked [unclear]. What I want to emphasize here is this: if I change the processor to GPU and set the language to CUDA, then the code runs on an NVIDIA GPU without changing any of the other parts. This is my goal: to use the hardware by simply modifying the options.
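The workflow just described (choose processor and language, allocate arrays, load a kernel, call it, check the result) can be sketched in plain Python as follows. This is not PyMIP's actual interface, which is not shown in the transcript: `Platform`, `register`, `zeros`, `get_function` and the backend names are all hypothetical, and two pure-Python stand-ins take the place of compiled Fortran, C, CUDA or OpenCL kernels so that the sketch runs anywhere.

```python
from array import array

# Hypothetical registry: (processor, language) -> {kernel name: callable}.
# In the talk these would be compiled kernels; here they are stand-ins.
_BACKENDS = {}


def register(processor, language, name):
    """Decorator that files a kernel under a (processor, language) pair."""
    def decorator(func):
        _BACKENDS.setdefault((processor, language), {})[name] = func
        return func
    return decorator


@register("cpu", "loop", "axpy")
def _axpy_loop(a, x, y, out):
    # Explicit index loop, the way a C or Fortran kernel would be written.
    for i in range(len(x)):
        out[i] = a * x[i] + y[i]


@register("cpu", "vector", "axpy")
def _axpy_vector(a, x, y, out):
    # Whole-array formulation, standing in for a data-parallel kernel.
    out[:] = array("d", (a * xi + yi for xi, yi in zip(x, y)))


class Platform:
    """Selects a backend once; the rest of the main code does not change."""

    def __init__(self, processor, language):
        try:
            self._kernels = _BACKENDS[(processor, language)]
        except KeyError:
            raise ValueError(f"no backend for {processor}/{language}") from None

    def zeros(self, n):
        # A real platform would allocate on the target device here.
        return array("d", [0.0] * n)

    def get_function(self, name):
        return self._kernels[name]


def run(processor, language, n=8, a=2.0):
    platform = Platform(processor, language)      # the only backend-specific line
    x = array("d", (float(i) for i in range(n)))  # initialize the data
    y = array("d", (1.0 for _ in range(n)))
    out = platform.zeros(n)                       # allocate the result array
    axpy = platform.get_function("axpy")          # look up the kernel
    axpy(a, x, y, out)                            # call it
    return list(out)


# Check both backends against a reference computed directly in Python.
reference = [2.0 * i + 1.0 for i in range(8)]
assert run("cpu", "loop") == reference
assert run("cpu", "vector") == reference
```

Switching backends changes only the two strings passed to `run`, which mirrors the "simple flag" idea from the abstract.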
12:15
A simple performance test was performed using a two-dimensional wave equation. The governing wave equation is defined with the Laplacian operator, as in this equation. I discretized the equation using the central finite-difference method. Using this equation, a simple circular wave can be simulated, as shown in the animation. [unclear]
12:49
[unclear; the chart apparently compares a pure Fortran code with the PyMIP-based Fortran, C, CUDA and OpenCL versions, respectively.] The bars show that the performance of the PyMIP-based Fortran version is about 4 percent lower than that of the pure Fortran code, while the PyMIP-based CUDA version has good performance on the NVIDIA GPU. The PyMIP-based OpenCL version does not perform as expected on the Intel CPU [unclear]. PyMIP did not suffer a serious decrease in performance compared to pure Fortran [unclear], and it shows good performance on the NVIDIA GPU.
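The test problem named above can be sketched as follows. The talk states only that the two-dimensional wave equation, written with the Laplacian as $\partial^2 u/\partial t^2 = c^2 \nabla^2 u$, was discretized with central finite differences. This sketch additionally assumes a second-order central difference in time, a square grid with fixed zero boundaries, a point disturbance that starts at rest, and the parameter `c2` standing for $(c\,\Delta t/\Delta x)^2$. It is a plain-Python stand-in for the kind of kernel that was benchmarked, not the speaker's code.

```python
def step(u_old, u, c2):
    """Advance one time step; boundary values stay fixed at zero."""
    n, m = len(u), len(u[0])
    u_new = [[0.0] * m for _ in range(n)]
    for i in range(1, n - 1):
        for j in range(1, m - 1):
            # Five-point central-difference Laplacian (grid spacing folded into c2).
            laplacian = (u[i + 1][j] + u[i - 1][j]
                         + u[i][j + 1] + u[i][j - 1]
                         - 4.0 * u[i][j])
            # Central difference in time, solved for the new value.
            u_new[i][j] = 2.0 * u[i][j] - u_old[i][j] + c2 * laplacian
    return u_new


def simulate(n=21, steps=5, c2=0.25):
    """Propagate a point disturbance from the centre of an n x n grid."""
    u = [[0.0] * n for _ in range(n)]
    u[n // 2][n // 2] = 1.0
    u_old = [row[:] for row in u]   # starting at rest: previous state equals current
    for _ in range(steps):
        u_old, u = u, step(u_old, u, c2)
    return u


field = simulate()
centre = len(field) // 2
# The disturbance spreads symmetrically, which is what gives the circular wave.
print(field[centre][centre + 2], field[centre + 2][centre])
```

The value `c2 = 0.25` is chosen to stay inside the usual stability limit for this explicit scheme in two dimensions, $(c\,\Delta t/\Delta x)^2 \le 1/2$.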
13:42
Next are the results of applying PyMIP to the global atmospheric model being developed in my company. The global
13:51
atmospheric model makes a grid, horizontally and vertically, on the Earth, as shown in the figure, and solves the governing equations at each grid point. The atmospheric model consists of three parts: the dynamical core, the physics processes and data assimilation. I applied PyMIP only to the dynamical core. These parts are a
14:21
little complicated. For a description of the global atmospheric model, it is much more effective to show a video published by NASA in the United States. This will give a short
14:50
visualization of the [GEOS-5] model developed at NASA. I'm sorry, [unclear]
16:09
[unclear]
16:38
[unclear]
16:48
Yes. This visualization is [unclear]; this model was developed by NASA. It is interesting that [unclear] and there are some [typhoons] near my country, [Korea]. The global atmospheric model simulates the atmosphere on the Earth like this.
17:27
[unclear; a remark about a home page, apparently where the video can be found]
17:51
The governing equations of the dynamical core consist of the momentum, continuity and [unclear] equations. [unclear] the time evolution of [unclear] wind speed, temperature, surface pressure, [unclear] and water vapor. The [latitude-longitude grid] has been widely
18:11
used as a grid on the Earth, as shown in the left figure. However, the higher the grid resolution, the [unclear; apparently the problem at the poles becomes more serious], so we use the cubed-sphere grid, as shown in the right figure. The cubed sphere
18:27
[unclear] rectangular elements, and each element has internal Gauss [unclear] quadrature points, as shown in the figure. We use the spectral element method for the spatial discretization and a [third-order] Runge-Kutta method for the time integration. The spectral element method has excellent parallel scalability, and
18:58
I counted the code lines of the model being developed in my company, [unclear]. The total is about [unclear] lines, [unclear], excluding comments and blank lines. Note that the dynamical core accounts for 60 to 70% of [unclear]. However, the [unclear] is about [unclear] 600 lines, which occupies only about 4 percent of the total lines. I ported this part into CUDA and OpenCL code, and integrated it with the dynamical core using PyMIP. As in many scientific computations, a large share of the computation time is spent in a small portion of the code. That's why I think the method using PyMIP is useful for many scientific computations. [unclear] is used in the dynamical core [unclear]
20:24
[unclear] the workflow of the dynamical core using PyMIP.
20:32
This figure compares the wall-clock time of the dynamical core using PyMIP on an Intel CPU, an NVIDIA GPU and an Intel MIC, respectively. The horizontal resolution of the model is about 100 kilometres; it's a very low resolution. The forecast time is 1 day. The wall-clock time with the [unclear] code on the CPU [unclear] was 30 [unit unclear]. Based on this time, the Fortran and C code on the [unclear] was about 2.29 times [figure uncertain] faster than the Fortran code on the [unclear], and on the NVIDIA GPU it was about 7.2 times [figure uncertain]. [unclear; a further series of speedup figures, ending with "and 11 times faster, respectively".] The specifications of the processors used in this experiment are shown in the table below. I think that using OpenCL on the MIC is [unclear]; I plan to try [unclear] later. Note that I don't want to say that the NVIDIA GPU is better than the Intel CPU or [unclear]. I would like to show you that using PyMIP makes it easier to use various modern processors. To finish the
22:09
presentation, [unclear]. I recently found a new project named [OCCA], which is similar to PyMIP. While PyMIP requires that the main program be Python, OCCA also allows the main code to be Fortran, C or C++ [unclear], so OCCA has more advantages than PyMIP. I might change from PyMIP to OCCA. [unclear] I also found another project, named [Loopy]. [unclear] generates optimized target code for the modern processors from [unclear] the loop structure and dependencies. The nice thing about Loopy is that it can provide various optimizations, such as tiling and loop unrolling, [unclear]. If I can combine OCCA and Loopy, as shown in the figure, it would be very great. To summarize: I proposed a way to utilize modern processors for large-scale scientific computations. I made a new Python module named PyMIP to integrate low-level code such as Fortran, C, CUDA and OpenCL, and it was applied to a real problem, the global atmospheric model. I have two future plans. One is to apply OpenCL to the model [unclear]. The other is to rewrite the model using other platforms such as Loopy and OCCA [unclear]. Thank you so much for your attention.
24:05
[unclear] The problem that I see, when you have a higher-level language which is compiled to a low-level language, is [unclear] the traceability: when there is an error, you have to debug it on the device [unclear]. What is [unclear] in this system?
I'm sorry, I didn't understand your question.
Let me rephrase. When there is a mistake or a problem and you need to debug the generated code, do you have an easy relationship between the generated code and the original code? What would you use for that?
[unclear] In my presentation, I did not use [unclear]. I converted the [unclear] code to [CUDA and OpenCL] [unclear], so I do not know about using that. [unclear] I hope to use that [unclear].
Thank you for the talk. Do you have any ideas on where we can use this technology outside of scientific domains, outside of [unclear] processing? Have you thought about other industries where this might make sense?
[unclear; the speaker restates the question, whether the technology can be used outside of science.] Actually, [unclear]. Python is a good choice, but for heavy computations many scientific people use Fortran. [unclear] Many scientists only use Fortran, so I wanted to integrate Fortran and other [unclear].
So this is not a question; it may be an answer to the question that you just got. I see use cases for this other than the scientific environment, in industries where you crunch a lot of data [unclear]
The use cases are [unclear] and stuff like that, and recent libraries in the deep learning industry [unclear] are very demanding for this kind of usage. So maybe it is worth trying to use your library for that, in fields other than the scientific industry.
OK, thanks for your comment.
[unclear]