Big Data Analytics at the MPCDF: GPU Crystallography with Python

Speech transcript
Thanks for coming to my talk. First, I would like to say that I work for the Max Planck Computing and Data Facility, in short MPCDF, which is a cross-institutional competence centre of the Max Planck Society.
We collaborate with scientists from different Max Planck Institutes to support them from the high-performance computing point of view, and in data science as well. One of the main tasks of the Max Planck Computing and Data Facility is the development of applications and algorithms for high-performance computing, as well as the design and implementation of solutions for data-intensive projects.
So we not only operate state-of-the-art supercomputers but also provide up-to-date infrastructure for data management and long-term archiving. In collaboration with one of these institutes, to be precise the Max-Planck-Institut für Eisenforschung in Düsseldorf, we are working in the big-data-driven materials science domain.
In particular, I am supporting them in the emerging area of atom probe crystallography.
Their aim is to reveal both the structure and the composition of crystals, and basically they ask us to perform Fourier transforms of large data sets. By large I mean something like billions of atoms inside a crystal. They plan to acquire high-quality crystalline data and to apply the Fourier transform to it, which is imperative if you are interested in looking for [unclear] subvolumes. This set of operations is used to improve the reconstruction parameters used for atom probe tomography, and this is why it is state of the art in data mining and visualization.
Fourier-transform scattering maps for nanostructures can in principle be computed with the formula in the upper box, as long as the electronic, atomic or molecular density is well defined on a grid fine enough to resolve the atomic positions inside the crystal. You can compute the Fourier transform either with the well-known fast Fourier transform algorithm, which is pretty fast (it scales like N log N) and is suitable for large crystalline structures, or by direct summation, which is a bit slower.
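The formula shown on the slide is not captured in the transcript. A plausible reconstruction, assuming the standard kinematic scattering sum over atoms (the symbols $f_j$, $\mathbf{r}_j$ and $\mathbf{h}$ are notation chosen here, not taken from the slide), would be:

```latex
F(\mathbf{h}) \;=\; \sum_{j=1}^{N_\text{atoms}} f_j \,
    \exp\!\bigl( 2\pi i \, \mathbf{h}\cdot\mathbf{r}_j \bigr),
\qquad
\mathbf{h} = (h, k, l), \quad \mathbf{r}_j = (x_j, y_j, z_j)
```

Under this assumption, $f_j$ is the scattering factor of atom $j$ and $\mathbf{r}_j$ its position. Evaluating the sum directly costs a number of operations proportional to $N_\text{atoms} \times N_\mathbf{h}$, whereas an FFT on a regular grid of $N$ points costs on the order of $N \log N$.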
But we prefer to follow this way because it is the most general case. In principle you can compute the amplitude of the scattering maps starting from the atomic positions and the scattering factors inside the crystal for any structural model. I mean, it can be the ideal or the non-ideal case, where the crystal contains some deformation, some tension, some strain. That is why it is very useful to compute the Fourier transform by direct summation.
Just to give you some numbers: usually we have to deal with a lot of atoms, let's say billions of atoms. In this presentation I am showing results for about 10 to the 8 atoms. The reciprocal lattice space, denoted by h, usually requires a very fine resolution, so let's say 10 to the 6 points. In terms of floating point operations, you have 10 to the 8 times 10 to the 6 times a factor which accounts for the floating point operations due to the algebraic operations involved in the Fourier transform; essentially you are evaluating a sine and a cosine function. In general you end up with 10 to the 15 flops, which is a very huge number, but comparable to the peak performance of modern architectures. This algorithm is well suited to GPU computation, which delivers on the order of 10 to the [unclear] floating point operations per second, and I am going to show you how this algorithm scales perfectly on multiple GPUs, with a computation time of the order of a minute.
So what we are doing at the MPCDF is working with PyCUDA.
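The direct summation and the operation count quoted above can be sketched in a few lines of NumPy. This is a minimal illustration, not the speaker's code: the function names are hypothetical, a single scattering factor `f` is assumed for all atoms, and the factor of 10 operations per atom-reflection pair is only inferred from the totals quoted in the talk.

```python
import numpy as np


def direct_scattering_sum(h, k, l, x, y, z, f=1.0):
    """Evaluate F(hkl) = sum_j f * exp(2*pi*i*(h*x_j + k*y_j + l*z_j)) directly.

    Hypothetical helper: one scattering factor f is assumed for every atom.
    """
    h, k, l = (np.asarray(a, dtype=np.float64).ravel() for a in (h, k, l))
    x, y, z = (np.asarray(a, dtype=np.float64).ravel() for a in (x, y, z))
    # Phase matrix of shape (n_reflections, n_atoms); only viable for small demos.
    phase = 2.0 * np.pi * (np.outer(h, x) + np.outer(k, y) + np.outer(l, z))
    return f * np.exp(1j * phase).sum(axis=1)


def estimated_flops(n_atoms, n_reflections, flops_per_pair=10):
    """Rough operation count; flops_per_pair=10 is an assumption, not a measurement."""
    return n_atoms * n_reflections * flops_per_pair


# Demo: a perfect 4x4x4 simple cubic crystal in fractional coordinates.
cell = np.arange(4)
x, y, z = (a.ravel() for a in np.meshgrid(cell, cell, cell, indexing="ij"))
h = np.array([0.0, 1.0, 0.5])
k = np.zeros(3)
l = np.zeros(3)
amplitude = direct_scattering_sum(h, k, l, x, y, z)

total_flops = estimated_flops(10**8, 10**6)
```

For integer `h` every atom scatters in phase, so the amplitude equals the number of atoms, while halfway between two Bragg peaks the contributions cancel. The full phase matrix is what makes this approach memory-hungry on a CPU and a natural fit for a GPU kernel that accumulates the sum per reflection.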
At first sight, GPU programming in combination with high-level scripting languages like Python looks like a pair of polar opposites. GPU programming is highly parallel and very sensitive to the architecture and the hardware; it is used to optimize floating point and memory throughput in order to give you tremendously high performance when you want to accomplish a scientific task. On the other hand, Python favours ease of use. PyCUDA is the package that aims to join these two aspects together. PyOpenCL follows the same philosophy and can be considered a sister project, but for the rest of the talk I am just referring to PyCUDA.
And why Python? Python is easy and user-friendly, easy to learn and general purpose [unclear]. It is a very valuable tool for the scientific community because it contains a lot of packages ready to use for addressing your scientific task, and it allows you to write your code in a dozen lines instead of hundreds or even more in other programming languages. In particular, this avoids reinventing the wheel every time. It is also successful in displaying your data, since visualization is an essential part of the scientific process. And then NumPy, which is the fundamental package for scientific computing: it gives you a powerful N-dimensional array object, broadcasting functions, linear algebra, Fourier transforms, and tools for integrating C, C++ and Fortran code.
Perhaps the most simple and useful program you can write in PyCUDA is one that multiplies a four-by-four array by two, element-wise. There are two things to import: the PyCUDA driver, pycuda.driver, which gives you access to the driver-level CUDA interface, and pycuda.autoinit, which automatically picks an available GPU and runs on it. Then, simply, you define your four-by-four matrix, you allocate memory on the device, that is on the GPU, and with a host-to-device transfer you move your array to the device. And now, in the red box, the most interesting part: you have pure CUDA C code wrapped in Python. Essentially this code executes on the device, and it is called from Python.
The same result can be obtained with much less effort using GPUArray, PyCUDA's first abstraction, which is the GPU equivalent of the NumPy array. This is in agreement with the edit-run-repeat style of PyCUDA as a tool. The aim is first to simplify the usage of existing CUDA C by wrapping it, to avoid reinventing the basics of GPU programming, and then, on top of this first layer, PyCUDA modules offer abstractions.
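The slide code is not part of the transcript. The sketch below follows the steps as described (import the driver and autoinit, allocate device memory, copy host to device, run a CUDA C kernel wrapped in Python) and then the shorter GPUArray route. The kernel name `doublify` and the wrapper functions are illustrative choices, and the NumPy fallback is added here only so the example still runs on a machine without PyCUDA or a GPU.

```python
import numpy as np

# CUDA C source wrapped in a Python string; it runs on the device.
KERNEL_SOURCE = """
__global__ void doublify(float *a)
{
    // One thread per element of a 4x4 array stored row by row.
    int idx = threadIdx.x + threadIdx.y * 4;
    a[idx] *= 2;
}
"""


def double_with_kernel(a):
    """Double a 4x4 float32 array with a hand-written CUDA C kernel."""
    import pycuda.autoinit  # noqa: F401  (creates a context on an available GPU)
    import pycuda.driver as cuda
    from pycuda.compiler import SourceModule

    a = np.ascontiguousarray(a, dtype=np.float32)
    a_gpu = cuda.mem_alloc(a.nbytes)       # allocate memory on the device
    cuda.memcpy_htod(a_gpu, a)             # host-to-device transfer
    func = SourceModule(KERNEL_SOURCE).get_function("doublify")
    func(a_gpu, block=(4, 4, 1))           # launch a single 4x4 thread block
    result = np.empty_like(a)
    cuda.memcpy_dtoh(result, a_gpu)        # device-to-host transfer
    return result


def double_with_gpuarray(a):
    """Same result through the GPUArray abstraction, without writing a kernel."""
    import pycuda.autoinit  # noqa: F401
    import pycuda.gpuarray as gpuarray

    a_gpu = gpuarray.to_gpu(np.ascontiguousarray(a, dtype=np.float32))
    return (2 * a_gpu).get()


def double_array(a, use_kernel=True):
    """Try the GPU path; fall back to NumPy when PyCUDA or a GPU is unavailable."""
    try:
        return double_with_kernel(a) if use_kernel else double_with_gpuarray(a)
    except Exception:
        return 2 * np.asarray(a, dtype=np.float32)


a = np.random.randn(4, 4).astype(np.float32)
doubled = double_array(a)
doubled_via_gpuarray = double_array(a, use_kernel=False)
```

The second function shows why the GPUArray abstraction is attractive: allocation, transfers and the kernel itself are all hidden behind an array type that behaves much like a NumPy array.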
So PyCUDA gives you easy and complete access to the GPU. It guarantees automatic resource management and error checking, convenience in the sense that it provides abstractions, as I showed you before, and tight integration with NumPy arrays. Of course it is fast, and it is very well documented.
Here I am just reporting some links where you can get more information about PyCUDA.
What we are using to analyse our crystals is PyNX, which stands for Python tools for nanostructures crystallography. It is an open-source library written in Python, and the core has been developed at the European Synchrotron Radiation Facility. I am just talking about the main module, which is in charge of computing X-ray scattering by taking advantage of graphical processing units. Just to give a complete overview, I am also mentioning the remaining modules, but these are not touched on in the rest of the discussion. The main aim of PyNX is, given a large sample of atoms, let's say billions of atoms, to compute the Fourier transform in 1D, 2D or 3D coordinates in reciprocal space with very fine resolution, using the performance of GPUs. This performance can be obtained by using either NVIDIA's CUDA toolkit along with the PyCUDA library, or OpenCL with PyOpenCL.
At the MPCDF, the default Python environment is provided by the Anaconda distribution. PyNX supports Python version 2.7 and above. It can simply be downloaded from the website of the project, where you can ask for an account and become a developer, or you can simply pip install PyNX. It is required to have PyCUDA, of course, if you want to run on a GPU, and matplotlib if you want to display data.
With PyOpenCL, if GPUs are not available, you can run on the CPU as well. It is also recommended to import the pyFFTW package, and optionally you can use an external library, cctbx, which stands for Computational Crystallography Toolbox; luckily you can install it with conda under your Python distribution.
What makes PyNX very valuable is that you can simply use the Python interface; you do not need to write CUDA code. I just finished saying you do not need CUDA, and now I am showing you a piece of CUDA code, but at least it is useful and nice to have a view of what is going on. Essentially, the CUDA Fhkl code is the device kernel; you have a device kernel for each module used in the PyNX library. A couple of remarks. The indexing is arranged by combining threads and blocks of threads. For each block you allocate shared memory, because threads within a block can use it for communication and synchronization. From global memory you transfer your input data to the shared memory, so that the threads in a single block have access to the same portion of the data. Importantly, each thread computes a single reflection, and you also have fast, optimized trigonometric functions. The last part just takes care of the remaining atoms included in your data set.
And now the Python interface. Simply, you read your data. Essentially, our data providers give us a file with the extension .pos, which is essentially made of four columns: x, y and z, the atomic coordinates, and a fourth column with the mass-to-charge ratio, which helps to identify the different atomic species in your input data file. It is a good habit to convert the real-space nanometre units into non-dimensional units called fractional coordinates, which depend on the crystal you are exploring. Then you define your 3D reciprocal space h, k, l.
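The data-preparation steps just described (read the four-column .pos file, convert nanometres to fractional coordinates, define a reciprocal-space grid) might look roughly like this. It is a sketch under stated assumptions: the .pos layout is taken to be the common atom-probe convention of big-endian 32-bit floats, the crystal is assumed cubic with a single lattice parameter, and the function names, the 0.405 nm value and the grid range are hypothetical.

```python
import os
import tempfile

import numpy as np


def read_pos(path):
    """Read a .pos file as records of (x, y, z, mass-to-charge).

    Assumption: big-endian float32 values, four per atom.
    """
    data = np.fromfile(path, dtype=">f4")
    if data.size % 4:
        raise ValueError("file size is not a multiple of four float32 values")
    data = data.reshape(-1, 4).astype(np.float64)
    return data[:, 0], data[:, 1], data[:, 2], data[:, 3]


def to_fractional(x, y, z, lattice_parameter_nm):
    """Convert nanometre coordinates to fractional ones, assuming a cubic cell."""
    return (x / lattice_parameter_nm,
            y / lattice_parameter_nm,
            z / lattice_parameter_nm)


# Demo: write two synthetic atoms to a temporary .pos file and read them back.
records = np.array([[0.0, 0.0, 0.0, 27.0],
                    [0.405, 0.81, 1.215, 13.5]], dtype=">f4")
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.pos")
    records.tofile(demo_path)
    x, y, z, mass_to_charge = read_pos(demo_path)

# Hypothetical cubic lattice parameter in nanometres.
xf, yf, zf = to_fractional(x, y, z, lattice_parameter_nm=0.405)

# A small 3D reciprocal-space grid; the range and resolution are illustrative.
axis = np.linspace(-0.5, 0.5, 8)
h, k, l = np.meshgrid(axis, axis, axis, indexing="ij")
```

Working in fractional coordinates means that Bragg peaks of the ideal lattice fall on integer values of h, k and l, which makes the resulting maps easier to read.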
You run the function Fhkl_thread to compute the Fourier transform, with the name of the available GPU cards passed on the command line, so that you can choose whatever you have; for instance, here I am running on our Maxwell architecture. Essentially, what this computes is the formula in the box, the discrete Fourier transform distributed over several GPUs, and what it returns is a complex NumPy array, Fhkl, and also the computation time, which is nice sometimes, especially at the beginning when you want to perform some speed tests.
This is just to give an example of what you can get and display. For instance, on the left I am showing a [unclear] plot of scattering maps for a monoatomic cubic structure [unclear]. In this case you can also appreciate the fact that I am working with non-ideal structures: you can see a slight offset along the vertical direction. On the right I am just computing the complex refraction index in terms of the scattering angle.
Some performance figures. On the left is what I ran on our infrastructure, [unclear] the Pascal and Maxwell architectures, the GTX 980, varying the number of grid points in the reciprocal space as well as the number of atoms. Please compare this with the plot reported in the seminal paper in which PyNX was introduced to the scientific community for the first time. Nowadays we can reach a throughput per GPU of the order of 4 times 10 to the [unclear] reflections times atoms per second.
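To make the kernel scheme and the returned values more concrete, here is a CPU-only stand-in written for this transcript. It is not PyNX's `Fhkl_thread` and does not reproduce its signature: the name `fhkl_blocked`, its parameters and the block size are hypothetical. It mimics the described structure (atoms staged in blocks, each reflection accumulating its own sum, a final partial block for the remaining atoms), returns the complex amplitudes together with the elapsed time, and derives the throughput in reflections times atoms per second.

```python
import time

import numpy as np


def fhkl_blocked(h, k, l, x, y, z, atoms_per_block=1024):
    """CPU stand-in for a blocked direct Fourier sum; returns (fhkl, seconds)."""
    h, k, l = (np.asarray(a, dtype=np.float64).ravel() for a in (h, k, l))
    x, y, z = (np.asarray(a, dtype=np.float64).ravel() for a in (x, y, z))
    fhkl = np.zeros(h.size, dtype=np.complex128)
    start = time.perf_counter()
    # Atoms are processed block by block, loosely like staging them in shared memory;
    # the last slice is shorter and covers the remaining atoms.
    for first in range(0, x.size, atoms_per_block):
        block = slice(first, first + atoms_per_block)
        phase = 2.0 * np.pi * (np.outer(h, x[block])
                               + np.outer(k, y[block])
                               + np.outer(l, z[block]))
        # Every row is one reflection accumulating its own partial sum.
        fhkl += np.exp(1j * phase).sum(axis=1)
    elapsed = time.perf_counter() - start
    return fhkl, elapsed


rng = np.random.default_rng(0)
n_atoms, n_reflections = 2500, 50
x, y, z = rng.random((3, n_atoms))
h, k, l = rng.uniform(0.0, 2.0, size=(3, n_reflections))

fhkl, seconds = fhkl_blocked(h, k, l, x, y, z)

# Throughput in the unit used in the talk: reflections x atoms per second.
throughput = n_reflections * n_atoms / max(seconds, 1e-12)

# Unblocked reference for comparison.
reference = np.exp(2j * np.pi * (np.outer(h, x) + np.outer(k, y)
                                 + np.outer(l, z))).sum(axis=1)
```

Blocking keeps the temporary phase array small regardless of the total number of atoms, which is the same reason the GPU kernel works through the atom list in chunks.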
Benchmarks: please pay attention to the fact that the vertical axis is on a logarithmic scale. This is the difference in computation time between running PyNX on a GPU and on 64 logical CPUs. For instance, looking at the bar chart in the middle, using a resolution of 64 cubed in the reciprocal space, you can complete your computation in roughly 5 minutes, compared to 2 hours on the CPU, so there is a factor of 24 between the two. Here is just one more example on a newer generation of GPU, in this case the Pascal architecture. In fact, between the Maxwell and the Pascal architectures you gain roughly half an hour, for instance in the most extreme resolution case.
And this is how we deploy this data science project. Basically we submit the job on the supercomputing cluster, and in the box is just a small script with which you submit your job to the queues of the Slurm workload manager.
You have raw data, and in the end you would like to convert it into a form which is viewable and understandable to you when you want to visualize your data, because that is important as well. First we are going to use scikit-image, which is a collection of algorithms for image processing developed by the SciPy community and written in the Python language. In particular I am using the marching cubes algorithm, which iterates across your data volume trying to find the regions which match your isosurface value. So in the end, from this 3D cube, this data volume, you want to extract the surface of equal value, the isosurface. It takes two inputs: the data volume [unclear], which is essentially the scattering amplitude computed with PyNX, and the value you are choosing.
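As an illustration of this step, the sketch below extracts an isosurface from a synthetic volume with scikit-image. The Gaussian "spot" stands in for a scattering amplitude computed elsewhere, and the grid size and isovalue are arbitrary choices. Recent scikit-image releases expose the algorithm as `skimage.measure.marching_cubes`; older releases (around the time of the talk) used differently named variants, so the exact call may need adapting.

```python
import numpy as np
from skimage import measure

# Synthetic data volume: a Gaussian spot centred in a 32^3 grid.
n = 32
axis = np.linspace(-1.0, 1.0, n)
gx, gy, gz = np.meshgrid(axis, axis, axis, indexing="ij")
volume = np.exp(-(gx**2 + gy**2 + gz**2) / 0.1)

# The two inputs named in the talk: the data volume and the chosen isovalue.
level = 0.5
verts, faces, normals, values = measure.marching_cubes(volume, level=level)

# Vertices are returned in voxel-index coordinates; measure the surface radius.
centre = (n - 1) / 2.0
mean_radius = np.linalg.norm(verts - centre, axis=1).mean()
```

The result is a triangle mesh (`verts` and `faces`) that can be plotted directly or exported for an interactive viewer.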
To be more interactive with the visualization, it is convenient to use VTK, the Visualization Toolkit, which is open-source software for computer graphics, image processing and scientific visualization. It is a collection of C++ libraries, but wrappers are also available, for instance in Python. So what we need to do is to convert our 3D NumPy array, given back by PyNX, into the VTK XML-based formats, and I am going to use PyEVTK, which is a very easy-to-use Python package [unclear]: it exports NumPy arrays straight into VTK XML-based files. Once you have these VTK XML files, you can process them using one of the most common applications, like VisIt, Mayavi or ParaView. We are using ParaView as our main workhorse for 3D analysis.
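The talk relies on PyEVTK for this conversion (its high-level module offers an `imageToVTK` helper for regular grids). To keep the example free of extra dependencies, the sketch below instead writes a minimal ASCII VTK ImageData (`.vti`) file by hand; `write_vti`, the array name `amplitude` and the demo volume are illustrative and are not part of PyEVTK or PyNX.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

import numpy as np


def write_vti(path, volume, name="amplitude",
              spacing=(1.0, 1.0, 1.0), origin=(0.0, 0.0, 0.0)):
    """Write a 3D array indexed [x, y, z] as an ASCII VTK ImageData file."""
    volume = np.asarray(volume, dtype=np.float32)
    nx, ny, nz = volume.shape
    extent = "0 %d 0 %d 0 %d" % (nx - 1, ny - 1, nz - 1)
    root = ET.Element("VTKFile", type="ImageData", version="0.1",
                      byte_order="LittleEndian")
    image = ET.SubElement(root, "ImageData", WholeExtent=extent,
                          Origin="%g %g %g" % tuple(origin),
                          Spacing="%g %g %g" % tuple(spacing))
    piece = ET.SubElement(image, "Piece", Extent=extent)
    point_data = ET.SubElement(piece, "PointData", Scalars=name)
    array = ET.SubElement(point_data, "DataArray", type="Float32",
                          Name=name, format="ascii")
    # VTK expects x to vary fastest, i.e. Fortran order for an [x, y, z] array.
    array.text = " ".join("%.7g" % float(v) for v in volume.ravel(order="F"))
    ET.ElementTree(root).write(path, xml_declaration=True, encoding="utf-8")


# Demo: export a small volume and read the raw file back for inspection.
demo_volume = np.arange(24, dtype=np.float32).reshape(2, 3, 4)
with tempfile.TemporaryDirectory() as tmp:
    vti_path = os.path.join(tmp, "demo.vti")
    write_vti(vti_path, demo_volume)
    with open(vti_path, "rb") as handle:
        vti_bytes = handle.read()
```

ASCII output is only sensible for small volumes; for real data a binary writer such as PyEVTK is the practical choice.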
ParaView is an open-source, multi-platform, parallel and scalable application for visualizing huge amounts of data in 3D. It is scalable in the sense that you can run it from your notebook up to clusters or distributed-memory supercomputers. It has an intuitive user interface.
When you are dealing with scientific visualization, you need to deal with well-defined representations of the data. The data types used in ParaView are meshes, and for our purposes we are using structured grids, mostly uniform.
This is just to say that ParaView is very popular in academia and also in government institutions.
When you think about ParaView, you just think about the small client application. In reality ParaView is a whole stack of libraries, and VTK is the core, which provides all the functionality for doing your visualization and volume rendering. Concerning Python, ParaView comes with pvpython, which is a nice application that allows you to automate your tasks and do your Python scripting for visualization.
This is the graphical user interface. There are three basic steps when you visualize your data: reading, filtering and rendering. Like most applications, it has a menu bar with all the features included in ParaView; a toolbar with the most common features used for visualizing your data; a pipeline browser, with the collection of pipeline objects [unclear]; you can have a look inside your data and your pipeline collection and change parameters in the object inspector or properties panel; then, of course, a help; and finally the 3D view. You are probably wondering why I am talking about ParaView at a Python conference. If you look at this plot, which is an isosurface of my simulations done with PyNX, behind this plot there is essentially Python scripting.
You set up your camera and your parameters, you apply all the steps you need, contour plots, thresholds or whatever, and finally you visualize your data. This is just a collection of filters you may want to apply; a collection of these filters gives you your pipeline objects. For my ordinary work I am using just three of them.
For instance, on the right, I want to make a contour plot of my data, and then I want to, for instance, make a subset extraction by just looking at a range of values. I can also inspect the data by opening a spreadsheet view, and you can make a query on your data according to some threshold criteria; let's say I want to extract from my data just the cells or the grid points that satisfy some criteria.
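The kind of script that sits behind such a plot can be sketched with the `paraview.simple` module, which is available when running under pvpython. This is an assumed, minimal pipeline (read, contour, render), not the speaker's script: the function name, the array name `amplitude`, the isovalue and the output file name are all placeholders. The import is kept inside the function so that merely defining it does not require ParaView.

```python
def render_isosurface(vti_path, array_name="amplitude", isovalue=0.5,
                      screenshot="isosurface.png"):
    """Read a .vti file, extract one isosurface and save a screenshot.

    Intended to be run with pvpython; all argument values are illustrative.
    """
    from paraview.simple import (Contour, GetActiveViewOrCreate, Render,
                                 ResetCamera, SaveScreenshot, Show,
                                 XMLImageDataReader)

    # Step 1, reading: load the VTK XML image data.
    reader = XMLImageDataReader(FileName=[vti_path])

    # Step 2, filtering: a contour filter on the chosen point array.
    contour = Contour(Input=reader)
    contour.ContourBy = ["POINTS", array_name]
    contour.Isosurfaces = [isovalue]

    # Step 3, rendering: show the result, frame it and write an image.
    view = GetActiveViewOrCreate("RenderView")
    Show(contour, view)
    ResetCamera(view)
    Render(view)
    SaveScreenshot(screenshot, view)
    return contour
```

Further filters, such as a threshold on a range of values, would be chained onto the same pipeline in the same way, each taking the previous object as its `Input`.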
A nice feature is that you can select the data you are interested in by clicking and dragging in your view. I am going to conclude with the next steps. We want to have this if you are interested in looking for some subvolumes in your Fourier space, let's call them spots. You want to identify these spots, measuring angles and distances, [unclear] in order to improve their atom probe tomography reconstructions. So ParaView and Python are very useful for addressing this task. Thank you for your attention.
We have time for questions.
Hello, thank you for the talk. You say that you use the direct Fourier transform because your atoms are not on a regular grid, right, you are not dealing with perfect crystals. Did you try to use the nonuniform fast Fourier transform, which does not require you to have an equispaced grid?
No, I haven't used that [unclear].
The nonuniform one actually runs on top of the fast Fourier transform, but there is actually a mathematical theorem that lets you go from equispaced to non-equispaced data, and it is almost as fast as the fast Fourier transform.
As far as I know, there is a lot of software used in crystallography that has been around for decades, and when I started, crystallographers were using computers that are mostly unknown today, like SGI Silicon Graphics machines. Are you aware of any integration of the things that you do with software that has been used by crystallographers more traditionally?
[unclear] As far as I know, there is the integration of PyNX with cctbx, the Computational Crystallography Toolbox. They also extended the library for other scientific tasks, like [unclear] grazing incidence, the ptychography technique and so on. So this is the only thing I am aware of.
OK, any other questions? Then let's thank the speaker again.

Metadata

Formal metadata

Title Big Data Analytics at the MPCDF: GPU Crystallography with Python
Series title EuroPython 2017
Author Bernardo, Giuseppe di
License CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You may use, modify, and reproduce, distribute and make publicly available the work or its content, in unchanged or modified form, for any legal and non-commercial purpose, provided that you credit the author/rights holder in the manner specified by them and that you pass on the work or this content, including in modified form, only under the terms of this license.
DOI 10.5446/33715
Publisher EuroPython
Publication year 2017
Language English

Content metadata

Subject area Computer Science
Abstract Big Data Analytics at the MPCDF: GPU Crystallography with Python [EuroPython 2017 - Talk - 2017-07-12 - Anfiteatro 1] [Rimini, Italy] In close collaboration with scientists from MPG, the Max Planck Computing and Data Facility is engaged in the development and optimization of algorithms and applications for high performance computing, as well as in the design and implementation of solutions for data-intensive projects. Python is now used at MPCDF in the emerging area of “atom probe crystallography” (APT): a Fourier spectral analysis in 3D reciprocal space can be simulated in order to reveal both composition and crystallographic structure at the atomic scale in APT experimental data sets containing billions of atoms. The Python data ecosystem has proved to be well suited to this, as it has grown beyond the confines of single machines to embrace scalability. This talk aims to describe our approach to scaling across multiple GPUs, and the role of our visualization methods too. Our data workflow analysis relies on the GPU-accelerated Python software package called PyNX, an open source Python library which provides fast parallel computation of scattering. The code is well suited for GPU computing, using both the pyCUDA and pyOpenCL libraries. Exploratory data analysis and performance tests are initially carried out through Jupyter notebooks and Python packages, e.g. pandas, matplotlib, plotly. In the production stage, interactive visualization is realized by using standard scientific tools, e.g. ParaView, an open-source 3D visualization program which requires Python modules to generate visualization components within VTK files.
