Research data and mathematical modeling
Formal Metadata
Number of Parts | 20
License | CC Attribution - NoDerivatives 3.0 Germany: You are free to use, copy, distribute and transmit the work or content in unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/35361 (DOI)
Production Place | Leipzig
Leibniz MMS Days 2018 (5 / 20)
Transcript: English (auto-generated)
00:00
So, the aim of my talk is quite different. I would like to give you a bit of our view on the MMS network and do a little promotion, though I am aware that it's not such a big thing.
00:21
Perhaps you know our software database, swMATH, or the mathematical database zbMATH, which is the successor of the former Zentralblatt. Why do I do this? Because within the network we would be a possible partner for projects in the direction of information infrastructure.
00:49
Perhaps this is a basis for discussion on further projects. The title was mathematical modeling and research data.
01:02
The first question, what mathematical modeling is, is no question for you, I think. I asked Wikipedia, and it says it's the process of developing a description of a system using mathematical concepts and language. Mathematics is our topic, so I asked myself where it is used.
01:25
There are three main application areas: natural sciences, social sciences, and engineering. I'm an editor at Zentralblatt, and I think in MSC codes.
01:42
Natural sciences are physics as well as geosciences, physical areas, and biology. Social sciences are a part of MSC class 91, and engineering is a general applied theme, and especially computer science is a part of it.
02:10
If you haven't seen it, this is the zbMATH entry site. The question to you may perhaps be whether you use it or not. Why do I stand here?
02:29
Since the beginning of Zentralblatt, as well as of the former
02:40
Jahrbuch, which is also integrated in the corpus, we have asked: what is the role of applications? Because we are a mathematical database, but applications are always necessary for promoting mathematical ideas and developing new mathematical ideas.
03:03
Applications have played an important role since the beginning, also in the 19th century. The Jahrbuch, which has been digitized and part of zbMATH since 2004, had in all its volumes about 25% of its pages on mechanics, physics, geodesy, etc.
03:29
Today, too, more than half of the Zentralblatt entries come from applied areas.
03:45
I've compiled a few statistics. This is publication year 2016, and you see the shares of our database's publications from that year.
04:03
There are about 110,000 documents, sorted by MSC codes. It begins with general mathematics, education, and algebraic areas; then the biggest part is analysis and geometry.
04:21
Then, I would say, there's the border to the more or less applied areas, such as probability and statistics, numerical analysis, computer science, physics, especially fluid dynamics, which may be the biggest part of it,
04:40
and other areas of applied mathematics, which also include social sciences, business, finance, and so on. If you search for fluid dynamics from the year 2016 or MSC code 76, you find about 4200 documents.
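To illustrate the kind of counting behind such a statistic, here is a minimal sketch in Python. It assumes each document is a record carrying a list of MSC codes and groups documents by the two-digit top-level class of the first (primary) code. The record layout, the field name `msc`, and the sample entries are hypothetical and are not the database's actual schema.

```python
from collections import Counter


def top_level_msc(code: str) -> str:
    """Return the two-digit top-level MSC class, e.g. '76D05' -> '76'."""
    return code.strip()[:2]


def count_by_msc_class(records):
    """Count documents per top-level MSC class, using each record's first code."""
    counts = Counter()
    for record in records:
        codes = record.get("msc", [])
        if codes:  # skip records without any classification
            counts[top_level_msc(codes[0])] += 1
    return counts


# Hypothetical sample records, for illustration only.
records = [
    {"title": "A", "year": 2016, "msc": ["76D05", "65M60"]},
    {"title": "B", "year": 2016, "msc": ["91B24"]},
    {"title": "C", "year": 2016, "msc": ["76M10"]},
]
counts = count_by_msc_class(records)
```

Counting only the primary code is a design choice of this sketch; counting every code of a document would give each class a larger share and the shares would no longer add up to the number of documents.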
05:06
What this slide shows is that we are a document-based reference database, so it may not be what you need. We know that when we talk about research data, publications are only one part of the research process.
05:27
The spectrum today is broad. You have models, theories, applications; you produce data, run simulations, parameterize models, test data sets, and so on.
05:46
A big part is software, with source code, programs, libraries, and so on. What we think you need is not only a publication-based infrastructure, but
06:03
a more differentiated way to reference, for example, software or models. Publications are nevertheless containers for research data. We can try to extract from
06:28
the text, which is in more or less natural language, symbols, theories, proofs, and methods. An example, which is also the result of a project, is the software database swMATH.
06:48
It's an open-source database containing at the moment more than 20,000 software packages. That means R packages for statistics, or also references to big software like R or Python or Mathematica or whatever.
07:13
There are references to zbMATH articles, so you can also see how a software package is used.
07:23
If I look for fluid dynamics here, you find about 640 entries, starting with names, sparse matrix, and so on. You can click and have a closer look. You will find information and links, and see the references within zbMATH.
07:47
Then the question is, what could we do besides that? One important thing is these references.
08:01
That means the references to zbMATH articles mostly come from references within the articles themselves. We can look in the reference texts to see which software authors are using, because typically they cite the article where a piece of software was presented.
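As a rough illustration of looking for software in reference texts, the following Python sketch counts how many reference strings mention a name from a list of known software. It is only a naive whole-word match and not the method actually used for swMATH; the list and the name `FluidSolverX` are hypothetical. A very short name such as "R" is left out on purpose, because a plain word match would also hit author initials and would need extra context.

```python
import re
from collections import Counter

# Illustrative list only; "FluidSolverX" is a made-up name.
KNOWN_SOFTWARE = ["Mathematica", "Python", "FluidSolverX"]


def find_software(reference, names=KNOWN_SOFTWARE):
    """Return the known software names that occur as whole words in a reference."""
    found = []
    for name in names:
        # The lookarounds prevent matches inside longer words, e.g. "Pythonic".
        pattern = r"(?<!\w)" + re.escape(name) + r"(?!\w)"
        if re.search(pattern, reference):
            found.append(name)
    return found


def count_software_mentions(references, names=KNOWN_SOFTWARE):
    """Count, per software name, how many references mention it."""
    counts = Counter()
    for reference in references:
        counts.update(find_software(reference, names))
    return counts
```

In practice one would rather match the title of the article in which the software was presented, since that is what authors typically cite; the name match above is only the simplest possible starting point.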
08:26
My colleague Olaf Teschke asked himself what else besides software can be found in the references we have. At the moment, just a number: there are 26 million references in our database.
08:49
He had a look. Normally a reference has an author, a title, and a source. If this is not the case, you have other things, for example software names or URLs or whatever.
09:09
The interesting thing was that there are many dead links, starting with HTTPS and leading nowhere.
09:25
Or it was too old or whatever. It would indeed be better to have something persistent. The second most frequent thing was software. We try to handle software.
09:46
The third thing is institutions or projects or funding information or whatever. These are not interesting if we have research data in mind. The problem is that the rest is quite small.
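The sorting described here can be sketched as a small heuristic in Python. The sketch assumes a parsed reference is a dictionary with optional `author`, `title`, and `source` fields plus the raw string under `raw`; these field names and the category labels are hypothetical. It only separates standard references from those containing a URL. Checking whether a link is actually dead would need a network request and is left out.

```python
import re

# Matches http:// or https:// followed by non-whitespace characters.
URL_PATTERN = re.compile(r"https?://\S+")


def classify_reference(ref: dict) -> str:
    """Classify a parsed reference as 'standard', 'url', or 'other'."""
    # A normal reference has an author, a title, and a source.
    if ref.get("author") and ref.get("title") and ref.get("source"):
        return "standard"
    # Otherwise look at the raw text for a link.
    if URL_PATTERN.search(ref.get("raw", "")):
        return "url"
    # Software names, institutions, projects, funding information, and so on.
    return "other"
```

For example, `classify_reference({"raw": "See https://example.org/data"})` falls into the `url` category, while a reference with all three bibliographic fields is `standard` even if its raw text also contains a link.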
10:07
The result was that nothing really interesting besides software can be found in the reference texts. If we want to extract other structured information from texts, we need full texts or whatever.
10:28
I think, and there are talks in this direction, of models, for example. All of you in the network say, okay, I have this problem and I use this model and then we have this solution.
10:48
It would be very nice to have a database for models, but this is a big thing. How do we identify such things if we don't have well-structured information, as in references, for example?
11:03
The easiest thing, or the first idea, would be the name. That's indeed quite a good idea. If we know the name, then we can look in the database for where it occurs. You could see how it is used, or other things around it, and so on.
11:31
For that, we need some manpower. The second was references. They are good for automatic methods to identify things, and a positive example is swMATH.
11:48
Other methods are an open question. One idea could be to look for formulae; we have made a first step in this direction with formula search.
12:07
This is possible in zbMATH within the database and especially also in arXiv full texts. There are many problems, and especially if we look at the history, you see there are different descriptions and different terminology.
12:31
Historically, and also between several schools or several subjects. People from different directions speak different languages.
12:45
Symbols are different if you ask a physicist or a mathematician, for example: formulae, notations, and even names. The next thing is that we would need full texts.
13:05
References are quite easy to get. Some publishers cause problems, but we try to have a big corpus. If you are looking for model names, they are normally not mentioned in the abstract or in the references.
13:33
The third thing is that data may be incomplete; especially parameters and other things are missing.
13:45
To sum up, I would say there is no standardization, and you can't cite a model at the moment. You can't say, okay, I use model 40510 with this and this parameter, but this would be nice.
14:05
This leads to repeatedly reinventing the wheel. We think from publications; that's our point of view. If we want to try to expand the database, they would always be the starting point.
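To make the wish a bit more concrete, here is a minimal Python sketch of what a citable model reference could look like: an identifier plus the parameter values used. No such identifier scheme exists at the moment; the class, the citation format, and the parameter names are purely hypothetical, and the identifier is the example number from the talk.

```python
from dataclasses import dataclass, field


@dataclass
class ModelCitation:
    """A hypothetical citable reference to a model and its parameter values."""
    model_id: str
    parameters: dict = field(default_factory=dict)

    def cite(self) -> str:
        """Render a short citation string with parameters in sorted order."""
        params = ", ".join(
            f"{name}={value}" for name, value in sorted(self.parameters.items())
        )
        return f"model {self.model_id}" + (f" ({params})" if params else "")


# Example with made-up parameter names.
citation = ModelCitation("40510", {"nu": 0.01, "L": 1.0})
text = citation.cite()
```

Sorting the parameters keeps the citation string stable regardless of the order in which they were entered, so two papers using the same model and the same values would produce the same string.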
14:30
There are two talks I want to mention. The first one is by Wolfram Sperber, my colleague from Berlin, tomorrow. He will give you a short idea about swMATH.
14:43
And the second is by Thomas Koprucki; he speaks about model pathway diagrams, which are a possibility to structure models in a general way. This would be one possibility to build up such a database for models, for example.
15:13
One project idea would be to identify models and to build up such a reference database,
15:20
so that it's possible to find and reuse existing models and especially to reference them. So that everyone knows, okay, this and that, and you don't have to read two pages to understand which
15:41
model the author is using and whether what the publication uses is the same model as the one you want to use. For that, we have to analyze and collect existing information about these models from publications and develop concepts for semantic representation.
16:08
FIZ Karlsruhe has some experience in this direction. One thing would be the publication side from zbMATH.
16:21
It's structured information. We have keywords, MSC, and so on. We also know how to aggregate other information. And one positive example for another idea in this direction was swMATH, planned for some thousand software packages.
16:48
That was the first idea, and now it has more than 20,000. Yes. One problem may be full texts, but it would be possible, for example, to start extracting information from full texts from the arXiv.
17:10
We do this also at the moment, for example, for our formula search.
17:24
The other thing, if we think of further information from other journals: this would not be that easy, and there's always the problem of what is mathematics and what is not.
17:42
At the moment, it's quite strict, and many applied mathematicians notice, if they look at their profile in zbMATH, that not all of their publications are contained.
18:01
We would need more journals from applied mathematics, but we could not handle them like the core journals at the moment.
18:21
If there is the need, we should think about that. Okay. Thank you and I'm open for questions.