
When Models Query Models


Formal Metadata

Title
When Models Query Models
Series Title
Number of Parts
112
Author
Contributors
License
CC Attribution - NonCommercial - ShareAlike 4.0 International:
You are free to use, adapt, and copy, distribute, and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose, as long as the work is attributed to the author in the manner specified by the author or licensor, and the work or content, including in adapted form, is shared only under the conditions of this license.
Identifiers
Publisher
Publication Year
Language

Content Metadata

Subject Area
Genre
Abstract
The design of large-scale engineering systems, including but not limited to aerospace, particle accelerators, and nuclear power plants, is carried out with a wide range of numerical models, such as CAD files, finite-element models, and machine-learning surrogate models, to name a few. In order to provide a uniform modelling interface, we encapsulate numerical models in notebooks. A notebook controls model creation, execution, and the querying of results. Numerical solvers are embedded in Docker containers, which provide an isolated and reproducible environment exposing a language-agnostic REST API. A model registry enables efficient queries of models. The overall system is represented as a collection of models that exchange data. Design optimization then involves executing a dependency tree of models to study the impact of a parameter change and perform its optimization. In this contribution, we present a model query mechanism allowing notebook models to query one another. The model dependencies are represented as a graph with suitable processing algorithms. In order to ensure that only affected models are executed, we derive and cache a model resolution order. The presented modelling framework relies on open-source technologies (packages: pydantic, FastAPI, Jupyter, papermill, scrapbook; containers: Docker and OpenShift; databases: MongoDB and Redis), and the talk focuses on good practices and design decisions encountered in the process.
Transcript: English (automatically generated)
Hello everyone, my name is Michal Maciejewski, and today I have the great pleasure to present to you my talk titled "When Models Query Models". Let me start by expressing my gratitude to the organizers for allowing me to participate remotely. I wish I could join you, but some personal issues prevented me from that. This is work I carried out at ETH Zurich together with my supervisor and colleagues, as well as a collaborator from Lawrence Berkeley National Laboratory.

I will start by mentioning a nice coincidence: today is almost the 10th anniversary of the Higgs boson discovery. It was made possible by the great effort of engineers and scientists who built and operated the Large Hadron Collider along with its four detectors. Here we can see an image of a collision of beams of particles taking place in one of the detectors. By reconstructing that collision and matching it to physical models, scientists could identify this blip here as a Higgs boson being created, and that proved a theory formulated 50 years earlier. That is quite incredible, and it was a big step for particle physics and our understanding of matter and the universe as a whole.
Now, how does it work? Let me start with a very brief accelerator principle. To me it is like a swing on a playground: we have a charged particle which is kept on a circular trajectory by a dipole magnet, just two north and south poles put on top of each other. A positively charged particle travelling perpendicular to that field experiences a force that bends it, so as the particle comes out of the screen it is steered towards the middle of the circle. At some point there is a radio-frequency cavity that gives it an electric kick, accelerating the particle as it goes round and round the circle. There is just this one equation, the Lorentz force, which you may remember from high-school physics, that governs the movement of particles to a good approximation. What is really interesting here is that these magnets are actually super powerful: they operate at 8.3 tesla; just to compare, an MRI magnet runs at 1.5 tesla.
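For reference, the Lorentz force mentioned here, in the form you may remember from high school:

$$\vec{F} = q\,\left(\vec{E} + \vec{v} \times \vec{B}\right)$$

The magnetic term bends the trajectory of a particle with charge $q$ and velocity $\vec{v}$, while the electric term, delivered by the radio-frequency cavity, accelerates it.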
The magnets need to be cooled down to very low temperatures, 1.9 kelvin, which is even colder than outer space. That is quite impressive. And we need quite a big circle, 27 kilometres in circumference, because the larger the circle and the stronger the field, the more powerful the collisions, and the more powerful the collisions, the more we can learn about the behaviour of particles. We use superconducting magnets, and the design of those magnets is the objective of my project: with higher fields and a compact design we can build better accelerators. And it is actually not only the Large Hadron Collider: there is an entire complex of accelerators which provide particles, pre-accelerate them before they go to the next stage, and serve different experiments at CERN. What is really striking here is that, for instance, the Proton Synchrotron was built in 1959 and is operated until today. This shows that design is a really important part that has to be done right, and that we have to maintain knowledge about the systems
and the relevant information to keep them operating for a long time after they were put into operation. So the motivation for my project, the design of superconducting accelerator magnets, comes from the fact that these are multi-stage design processes that involve various disciplines, ever-changing teams of experts, and software tools that are themselves subject to updates and replacement, and that may last, as you can see, from years to several decades. We are therefore aiming at a consistent, sustainable, and reproducible organization of numerical models, together with the construction and evaluation data and the tools used in the process. As you can see here, the Future Circular Collider would be about four times bigger than the LHC; that design and engineering effort is taking place right now, and I am contributing to it.
That is the inspiration for this project, called pyMBSE, a Python implementation of certain MBSE concepts, where MBSE stands for model-based systems engineering: the formalized application of modelling to support system requirements, design, analysis, verification, and validation activities, starting in the conceptual design phase and continuing throughout later lifecycle phases. One of the concepts brought by MBSE is the design structure matrix. We try to decompose a bigger system into subsystems, much like we decompose our software solutions into modules with interfaces between them; the idea is quite similar. This matrix shows the connections and couplings between subsystems, and also the definition of interfaces, that is, what data flows between systems as they are put together. If we look into one box, which is the superconducting magnet,
we can see that it is itself composed of several sub-models that also have such dependencies and exchange information between them. When we design a magnet we want to put these models together, but we do not want to duplicate information; that, of course, is not good practice in software or in modelling. Instead we want specialized models that are good on their own, and we want to provide query mechanisms so that these models can access each other's data; that is what the MBSE framework is all about. It is especially important because we have many models at different scales: it could be a circuit composed of many magnets, a magnet itself, a cable used to wind that magnet, or even a strand of that cable. So we need, first of all, a dependency tree of these models, and a query mechanism to put them together.
What we do is rely on model-based systems engineering, which first of all introduces a quite generic concept of a model: a simplified version of something. It can be a graphical, mathematical, machine-learning, deep-learning, or even physical representation like measurements, abstracting reality to eliminate some complexity so that we can focus on one particular aspect of a system and its design. With MBSE we shift systems engineering from documents to interconnected models, so that models are queried, generate views, and are traceable and repeatable. If you look at a typical project management pipeline, we have initialization, then a study that finishes with a conceptual design report telling what the suitable solutions are; if we want to realize the project,
we then move to the design of a particular solution, which ends with a technical design report; then we build it, write the documentation, commission it, and finalize the project, and at every stage there are procedures and reports. What is quite common in this community, and I would say in science in general, is that when we write a report we have some models, some measurements, some Excel spreadsheets, some analysis scripts, and all of that is put together in the report. But typically the link between them is only available to the creator of the document, and once the document is created, one cannot trace back where the data came from. So if we have to re-analyze something, this information might be missing, and we need to recreate it, which is sometimes even impossible when people leave or software changes, and we can no longer find the particular setup that was used to create that information. Instead of these static documents, we propose to introduce models, and with these models always be able to retrieve information, trace it back, and
put it together; that is exactly what the Python MBSE framework is all about, and I will talk about its microservice architecture. Here I would like to note the similarity between system and software architecture: a system architecture is represented as subsystems along with the interfaces between them, and it is quite similar for software; a software architecture, in our case, is microservices and the interfaces between them. So there is this similarity, and in fact
I tried to leverage good practices from software development and use them in systems engineering to improve the design process. The framework itself is composed of several components. From the user perspective, there is a configuration that tells which models are involved in a particular design. This forms a model dependency graph, which has to be a directed acyclic graph, so that there are no loops and we can always execute it from start to end. The models typically live in notebooks or scripts. When we want to perform, say, a magnetic analysis, the framework performs a model query: it finds the dependent models, executes them, and gets the information back. When a model is executed, be it a notebook or a script, it calls an electromagnetic solver to solve for the particular physical quantity we are interested in. As you can imagine, we might have some redundant calls: if two models depend on one and the input information is the same, we can reuse the result immediately by accessing a cache database, so every model execution is cached.
We take certain snapshots so that we can reuse that information and, after a study, also do some analytics: which was the fastest model, which was the most performant, which gave us the biggest margins; all of that remains available. We chose notebooks because, after executing each of them, we can create an HTML report and then build a book from those HTML reports, so that together they form a bigger document, a report.
So, what are the three pillars, based on this initial overview of the project? First of all, containers for numerical models and, in general, for reproducible environments: whatever we use, we put it into a container and expose a generic interface; once we have that, we know how to use the particular solvers. Second, a model query mechanism with two components: the dependency tree, which indicates the dependencies between different models and how a change would propagate across the model tree, and the cache database, so that when a cached result is available we simply return it; otherwise we need to run the model again. The third pillar are model views: auto-generated documents where we can profit from notebooks and build, on demand, a representation of our design, so that it is also available to decision makers and managers in a convenient form. They do not need to run any code; they have a view, and it is actually a quite cheap way of generating that view. With that we also keep track of information for reproducibility, like the versions of the particular packages we used, the version of the container, and the version of the software in general, so that we can always redo the analysis later.
Starting with containers for numerical solvers: we have a solver which solves a particular physical problem, be it magnetic, mechanical, or thermal. We encapsulate it in a Docker container so that it carries all its dependencies and running environment, and then we provide a generic REST API to interface with that solver. The API has four methods: initialize, upload files, run these files, and download results. We have already created some containers, and some are available, provided by other companies.
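As an illustration, here is a minimal FastAPI sketch of what a solver container API with these four methods could look like; the endpoint names, paths, and payloads are assumptions, not the actual interface:

```python
# Minimal sketch of a solver REST API with four methods (initialize,
# upload files, run, download results). Names are illustrative only.
from pathlib import Path

from fastapi import FastAPI, UploadFile
from fastapi.responses import FileResponse

app = FastAPI()
WORKDIR = Path("/tmp/solver")

@app.post("/initialize")
def initialize() -> dict:
    WORKDIR.mkdir(parents=True, exist_ok=True)
    return {"status": "initialized"}

@app.post("/upload")
async def upload(file: UploadFile) -> dict:
    (WORKDIR / file.filename).write_bytes(await file.read())
    return {"uploaded": file.filename}

@app.post("/run")
def run(input_file: str) -> dict:
    # here the containerized solver would be invoked, e.g. via subprocess
    return {"status": "finished", "input": input_file}

@app.get("/download/{name}")
def download(name: str) -> FileResponse:
    return FileResponse(WORKDIR / name)
```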
Once we have a numerical solver API, we can enable model queries: how to get information between different models. For that we have the pyMBSE class, which builds an execution query based on the config, a source model, a target model, and some inputs, and we can get back either reports, figures of merit, or artifacts if these are larger files. As I mentioned already, in case the query is executed again with the same inputs and neither the underlying model nor its dependencies have changed, we return the output from the cache database.
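As an illustration, a self-contained sketch of what such a query could look like from the user side; the class name, fields, and method are assumptions for illustration, not the actual pyMBSE API:

```python
# Hypothetical sketch of a model query object; names and fields are
# assumptions, not the actual pyMBSE interface.
from dataclasses import dataclass, field

@dataclass
class ModelQuery:
    config: str            # path to the design configuration
    source_model: str      # model issuing the query
    target_model: str      # model being queried
    inputs: dict = field(default_factory=dict)

    def get_figures_of_merit(self) -> dict:
        # the real framework would execute the dependency tree or
        # return a cached result; here we only echo the request
        return {"source": self.source_model, "target": self.target_model}

fom = ModelQuery("models.yaml", "mechanical", "geometry",
                 {"current_A": 11_000}).get_figures_of_merit()
```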
The cache database is a document store implemented in MongoDB. The schema holds the name of a model, its path, execution time, last-modification timestamp, a hash of its content, the content hashes of all dependent models, the input parameters, and the output figures of merit or artifacts that we may want to return immediately once the model has been executed. We chose MongoDB because it gives us freedom for dictionary fields that we do not know a priori and that could have a different structure, which was quite a convenient choice. We have an index based on the model hash, which has high cardinality, and that is actually the only collection we need to
store in our cache, so we do not really need a relational database here. One important thing to note: once we have our model dependency tree, we realize that, for instance, a mechanical model depends on geometry but also depends on magnetic, which in turn also depends on geometry. If we want to find the shortest execution path through the dependencies, we need to perform a tree linearization, which is actually inspired by Python's method resolution order. Once we have a model that we want to execute, we check its state: whether its content changed with respect to what is in the cache, but also whether any of its dependencies changed. We do this by simply walking the linearized order of the tree, so we do not waste time; if we have a cache hit, we return what we had, but if one of the dependencies or the model itself changed, we need to rerun it so that we have a current result. With that tree linearization we only rerun those models that changed, which is also a big gain in computation.
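The analogy can be seen directly in Python: for the diamond dependency just described (mechanical depends on geometry and magnetic, magnetic depends on geometry), the C3 method resolution order yields one deterministic order that visits each node exactly once:

```python
# Python's C3 linearization on the diamond dependency from the talk:
# each model appears exactly once, dependents before their dependencies.
class Geometry: ...
class Magnetic(Geometry): ...
class Mechanical(Magnetic, Geometry): ...

print([c.__name__ for c in Mechanical.__mro__])
# ['Mechanical', 'Magnetic', 'Geometry', 'object']
```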
One important step is change propagation. What the dependency tree and its linearization allow is that if, for instance, one model changes, say an electrical parameters model, and you want to rerun the particle accelerator study again, you check whether its content, input files, and parameters changed. If not, you then check all the dependencies. Here we see that only something in the magnet changed, so we go into the magnet and check the dependencies of the magnet; only the changed model is rerun, while the others return cached results. We can then put the entire model back together and run the particle accelerator study after a change of one particular model, updating only those parts that changed.
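A hedged sketch of this change detection during propagation, under the assumption that models are files hashed by content; the data layout is invented for illustration:

```python
# A model is rerun only if its own content or anything below it in the
# dependency tree changed; everything else comes from the cache.
import hashlib

def content_hash(path: str) -> str:
    """Hash a model file so changes can be detected cheaply."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

def needs_rerun(model: dict, cached: dict[str, str]) -> bool:
    """model = {'name': ..., 'path': ..., 'deps': [sub-model dicts]}"""
    changed = content_hash(model["path"]) != cached.get(model["name"])
    return changed or any(needs_rerun(d, cached) for d in model["deps"])
```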
That brings us to the third pillar, model views. Once we perform a model re-execution, a report is auto-generated, and we can see it here; let me open the link. If we click here, we have a CDR example, a conceptual design report of a magnet design. We have an introduction, and the geometry with code, so that we can see what was used, along with versions of the API, the time of execution, and a table of parameters. And we can still have an interactive view of the model results, which is pretty convenient: we can zoom and look at what we got. The same holds for other model results, where we have information on the geometry again and some input files that we can print out, so it is all available, and some physical results, in this case the magnetic field, which we can still read and inspect. That is quite convenient for everyone on the team, but also for people from other groups and other systems: they can quickly check, for instance, what the peak field or the stored energy was, which can inform their design, and this information is cross-referenceable and quickly accessible.
Okay, so just to summarize this part: we have the pyMBSE microservice architecture. The first step is containers for numerical solvers; we also allow for command-line interface, REST API, or RPC calls, but the tools that we provide, we package in Docker. We presented the numerical solver calls with four endpoints in the REST API, and then there are the pyMBSE query calls that allow models to exchange information and get data from each other. This all goes into the pyMBSE cache, which stores information on model execution, so that on the one hand we can retrieve it quickly if we run a model again with the same parameters, and on the other hand we can do some analytics afterwards, check our models, see how they change, and track that information.
We support two execution modes. One is local execution, when designers play with notebooks, do some analysis, make plots, and get the data into the right shape; that runs via a Jupyter server, Python with a virtual environment on the local machine, and a local Dockerized MongoDB instance. It produces a notebook output, which we can then put together into a book and a report that people can share with others. The other is distributed execution: once we have a certain design and want to change a parameter, we do not do it manually; we can do on-demand compute. Here we, in a way, abuse GitHub CI by triggering it through its REST API on demand with certain parameters, and we use papermill to execute the notebooks programmatically. For that we rely on either a GitHub runner or OpenShift instances to run the Docker containers and allow the computation to be done in the cloud.
Some implementation details: as I mentioned, MongoDB for the cache database; Docker locally and OpenShift for distributed execution; FastAPI for REST API development, which makes for really fast development; Poetry for the virtual environment and dependencies; papermill and scrapbook for notebook execution; Jupyter Book for creating the books; and eventually GitHub for versioning and CI/CD pipelines, which is actually quite neat with Poetry. Here is a sample pipeline that we use for our packages.
Now I would like to come to one application, to show how we use the tool in practice. Here is an example of an optimization where the objective is to obtain a certain field quality, so that particles stay focused and travel on the desired trajectory. The accelerator of course needs a certain safety margin: we do not want to degrade the material too much, nor its insulation. The optimization works on the cross-section of the magnet, where we place certain cables around the x-y plane and adjust the positioning, the angle, the inclination, and the number of conductors. For this optimization we have four models, geometry, electromagnetic, mechanical, and thermal, that run in a loop until a genetic optimizer finds a set of parameters that minimizes all the objectives. The optimization execution is actually quite simple: we execute the models through the pyMBSE API, so we need one line per model to retrieve its figures of merit. These are returned as dictionaries, so we put them together, collect some artifacts that we may use for further processing, and compute one score value that represents the design. We then take the figures of merit, the score, and the artifacts, and return them as the result.
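A hedged sketch of the objective evaluation just described, assuming a hypothetical query_model helper standing in for the pyMBSE query call; the model names follow the talk, while the figures of merit and weights are invented for illustration:

```python
# One query per model, figures of merit merged into one dict, reduced
# to a single score. query_model is a hypothetical stand-in.
def query_model(name: str, inputs: dict) -> dict:
    # placeholder: the real call executes the model or hits the cache
    return {}

def evaluate_design(params: dict) -> float:
    foms: dict = {}
    for model in ("geometry", "electromagnetic", "mechanical", "thermal"):
        foms.update(query_model(model, inputs=params))
    # collapse the objectives into one score (weights are assumptions)
    return sum(weight * foms.get(key, 0.0) for key, weight in
               [("field_error", 10.0), ("peak_stress", 0.01),
                ("hotspot_temperature", 0.003)])
```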
Once we have it, we can run, and the run is very similar to what we have seen so far: we have different solvers for different models, each with solver API calls. The optimization notebook performs pyMBSE query calls to each of the models, combines them into one, computes the score, and that score is then stored in the cache database, so that after an optimization run we can retrieve it and visualize it in a cockpit of the optimizer, which is itself a notebook. In the cockpit we can see the progress of the optimization; if we click any dot, we can see the cross-section of the magnet that was used at that point and was the best individual in the genetic optimizer, along with some parameters for that model. We can also see how the design variables compare to the best individuals so far, and this is quite valuable information: first of all, whether we are hitting the limits and should maybe expand the design space in that variable, or whether we are somewhere in between and still exploring the parameter space.
So, putting it all together: with pyMBSE we could provide a query mechanism for models, so that the physical connections and physical interfaces that exist in a particular system are represented in the design phase, when we use computer-aided design, as queries. That gives a very clear interface between the tools: different components of the accelerator can query each other, and also, within one system, different models can query one another. A particle accelerator is a system of systems, implemented as multiple projects with remote queries for data exchange, while one system, a system of components, is implemented as a single project with local queries for data exchange. One important part of a design is versioning. Here we rely on GitLab, a standard way of keeping track of information, where we can have a reference main branch of the design, but also sub-branches for particular versions of a design, which can be updated and tracked, and sometimes we keep information about major changes and major versions of our design. For each of these commits we can run a continuous integration pipeline and at the end produce the report.
So for every change we can keep track of what was done, and also maintain the book as an artifact in GitLab, so that anyone can access it rather quickly. Okay, so to conclude: the pyMBSE framework is a Python implementation of model-based systems engineering concepts. It is quite generic: it is not only for our work, like magnets, but can be used in other engineering fields. It is a set of containers with numerical models, a model query mechanism with a cache database, and model views based on Jupyter notebooks and Jupyter Book for documentation. On top of that, we added a set of optimization algorithms and an interactive cockpit for multi-objective, multimodal optimization. One important thing is that we rely on a standard open-source software technology stack and really pay attention to testing and documenting, so that this solution will last for years, as needed. Thank you very much for your attention; it was a great pleasure to deliver this talk. If you have any questions, I would be very happy to answer. We have a few minutes for questions.
My first question would be to ask if you could expand a little bit on the motivation for this project. I am curious how expensive these model API calls were, to justify this caching approach; and secondly, why develop pyMBSE yourself, was there no other package available, for example?
On the first point: yes, some models may take a few hours to run; for more advanced ones, if you look at the entire accelerator study, this can be a few days. And sometimes the system did not change, in which case we would otherwise rerun something that we already know the output for, so here is a big gain. As for an already existing package or library for that: we of course did a search to see what is out there, and we did not find a tool that would exactly match our requirements. There is a tool developed in Java, but that would require us to go via py4j and perhaps complicate our stack; plus, it did not natively support notebooks, and that was one of our requirements.
Okay, thanks. We have one more question in the room. Thank you for your talk; it is really nice to see how this works in action. The examples we have seen have a very strict order in which the models are run. Are there more exotic examples where you iterate over two models that feed into each other,
and try to get them to converge, or where you have multiple versions of a model side by side? Yes, thank you, that was a good question. When two models require one another, this is called co-simulation: we indeed run one and then the other, and to resolve that we would have a model which is a co-simulation, calling the two internally. That way we can run the two and let them respond. So you encapsulate them in another model? Exactly, right, and we run until convergence. Thank you. Okay, it looks like we have no more questions, so once again, thank you Michal for your interesting talk.