
Towards a Guideline Affording Overarching Knowledge Building in Data Analysis Projects


Formal Metadata

Title
Towards a Guideline Affording Overarching Knowledge Building in Data Analysis Projects
License
CC Attribution 4.0 International:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Abstract
Tight and competitive market situations pose a serious challenge to enterprises in the manufacturing industry domain. Competing in the use of data analytics to enhance products and processes requires additional resources to deal with the complexity. At the same time, the possibilities afforded by digitization and data analysis-based approaches make for a valuable asset. In this paper we suggest a guideline to a systematic course of action for the data-based creation of holistic insight. Building an overlaying corpus of knowledge accelerates the learning curve within specific projects as well as across projects by exceeding the project-specific view towards an integrated approach.
Transcript: English (auto-generated)
I'm a Ph.D. student at the University of Applied Sciences in Zwickau and at the Technische Universität Dresden, and today I would like to present a paper that I created together with Bobo Theer-Schneider, also from Dresden, titled Towards a Guideline Affording Overarching Knowledge Building in Data Analysis Projects. My presentation is structured as follows. First of all, I would like to say a few introductory words about the motivation and the derivation of the research question. Then I will say something about the theoretical foundations of sensemaking and the knowledge-making approach we have derived from it. In the third part of my presentation, I present the artifact we have designed, our reference model, and then I close with a small discussion and outlook.
The major purpose of this long-term design science research project is to elaborate methodological support for data-driven knowledge extraction projects in the manufacturing industry. Therefore, our main objective is to help artifact users gain a sophisticated understanding of the principles by which to conduct data-driven knowledge extraction projects, to reduce the associated hurdles for manufacturing companies in the addressed domain, and to create a basis to address and solve such projects in the future in a repeatable manner.
The addressed application field of the presented reference model comprises tasks in the industrial domain which require a high level of innovation and are conducted in a development or research project. The motivation comes from specialists: specialists dealing with data analysis projects in the industrial domain face the necessity to cover the methodological skill set required in data science, as well as a deep understanding of the domain fundamentals, in order to consider relevant causalities and interactions and to purposefully derive and interpret results according to their context.
Regarding relevance, a pre-study in the form of an exploratory study with six qualitative expert interviews, which aimed to identify the challenges that appear while setting up a data-driven knowledge extraction process, confirmed these hurdles. The interviewees expressed their wish for more structure and guidance in data analysis projects, while they found existing standard processes too generic to apply to their domain, as well as not sufficiently considerate of real-life problems like data acquisition, data quality, and operational data processing. The main hurdles are the intricate communication between domain experts and data scientists, the scarcity of human resources for data analytics projects, and the lack of domain-specific standardized procedures, which leads to a singular, one-off quality of the execution and the use of results of data-driven analyses. These shortfalls especially hold true where a limited number of experts must realize data analytics projects alongside their prevailing work tasks, as is the case in small and medium-sized enterprises, in startups, and in R&D or planning departments. This leads us to the formulation of our topic: towards a guideline affording overarching knowledge building in data analysis projects.
The artifact creation in the form of a reference model follows the design science research paradigm of Hevner. To address the relevance, design, and rigor of the developed artifact we follow Hevner as well, and our step-by-step approach follows the steps recommended by Peffers for DSR research. Therefore, our work is driven by the following research question: How can a reference model be provided for complex tasks in the industrial domain
which provides methodological support for the data-driven construction and utilization of an overlaying corpus of knowledge? We use the foundations of sensemaking for our knowledge-making approach. The concept of sensemaking originated in social psychology and describes how human beings in a social setting derive an understanding of their surroundings by combining various pieces of information, creating connections, and finally adding their own reasoning to them. The literature includes five key activities, which you can see on the left side, that constitute the making of sense and thereby act as design goals for our developed reference model. The developed framework should not only support the understanding of facts and the creation of insight, but also their use for internal and cross-project improvement.
Therefore, another key activity is needed. By including the creation and use of a knowledge base, we want to establish a link to the field of knowledge management and thus coin the notion of knowledge-making. The term knowledge-making is intended to emphasize the creative, intuitive, and iterative nature of the approach, which is oriented towards human behavior and its underlying cognitive and social processes. Resulting from this, we formulated the design principles of the presented reference model based on the derived knowledge-making key activities, which you can see on the right side.
To represent and reduce reality and to create an understandable formulation of complex facts for a class of similar problems, a reference model is provided. We base our approach on three widely established concepts: standard procedure models, the concept of data aggregation, and the field of knowledge management. We attempt to provide the means for the effective combination and domain-specific adaptation of these concepts while additionally overcoming their shortcomings. We especially want to emphasize the importance of considering the various aggregation levels in which information fragments can occur, calling attention in particular to the intense interaction of all five levels of aggregation, which you will see later, implying the necessity to expand awareness to each of them and their interrelations within each step of action.
Relevant objects within aggregation level 1 (AL1), the analogous level, can be controllers, motors, GPS trackers, and sensors or transport systems, complemented by their respective digital counterparts in AL2, the representation level, like output data of controllers, performance data of motors, GPS data, and other sensor data. Furthermore, AL2 additionally addresses descriptions of the target systems, e.g. as conceptual models. Within aggregation level 3 (AL3), the transfer level, a suitable concept must be chosen to gather, process, and contain any relevant information fragments, to transfer them to a higher level of aggregation, and to refine and utilize insight. A suitable concept can be an enterprise-specific analysis framework, an individual adaptation of the data mining standard processes like ASUM-DM or CRISP-DM. Within aggregation level 4 (AL4), the implementation level, the found facts and interrelations are implemented by integrating the derived insight within physical instantiations, like digital shadows or digital twins, simulations, or visualizations. And AL5 is the information level; this is our knowledge base, which can take many forms, like classical SQL databases or ontologies.
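To make the five levels more tangible, here is a minimal sketch of how the aggregation levels and the information fragments living on them could be encoded; all identifiers and the example fragment are illustrative assumptions, not artifacts from the paper.

```python
# A minimal sketch: encoding the five aggregation levels (AL1-AL5) and
# tagging information fragments with them. All names and the example
# fragment are illustrative assumptions, not part of the published model.
from dataclasses import dataclass
from enum import Enum

class AggregationLevel(Enum):
    AL1_ANALOGOUS = 1        # physical objects: controllers, motors, GPS trackers
    AL2_REPRESENTATION = 2   # digital counterparts: sensor data, conceptual models
    AL3_TRANSFER = 3         # analysis frameworks, e.g. an adapted CRISP-DM
    AL4_IMPLEMENTATION = 4   # digital shadows/twins, simulations, visualizations
    AL5_INFORMATION = 5      # knowledge base: SQL databases, ontologies

@dataclass
class InformationFragment:
    name: str
    level: AggregationLevel
    source: str  # origin of the fragment, e.g. a concrete sensor

fragment = InformationFragment("forklift position trace",
                               AggregationLevel.AL2_REPRESENTATION,
                               "GPS tracker")
print(fragment.name, "lives on", fragment.level.name)
```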
Attempting to recap the essence of the various data mining procedure models, a refined, generalized version of the data mining project phases can be seen in our general procedure model. Based on the specification of the analysis project goal in phase 1, a conceptualization phase follows in phase 2. The data analysis core activities are performed in phases 3 and 4. Phase 5 builds on the previous phases and should be carried out in parallel, as it provides the methodological and meta information of the data analysis project and includes monitoring and implementation during and after the project.
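As a rough illustration, the five phases could be chained like the following sketch, with phase 5 recording meta information alongside the others; the function names are assumptions for illustration, not taken from the paper.

```python
# A rough sketch of the generalized procedure model as a pipeline; phase 5
# (meta information, monitoring) runs alongside the other phases.
# All function names are illustrative assumptions.
def specify_goal(task):         return {"goal": task}      # phase 1: project goal
def conceptualize(goal):        return {"concept": goal}   # phase 2: conceptualization
def prepare_and_analyze(c):     return {"insight": c}      # phases 3 and 4: core analysis
def document(log, phase, msg):  log.append((phase, msg))   # phase 5: meta information

def run_project(task):
    log = []
    document(log, 5, "project started")
    goal = specify_goal(task)
    concept = conceptualize(goal)
    insight = prepare_and_analyze(concept)
    document(log, 5, "insight produced; monitoring continues after the project")
    return insight, log

print(run_project("estimate logistics process duration"))
```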
Now we combine the aggregation levels and our generalized procedure model into the complete reference model. You can see that AL1 and AL2 support all phases, which are shown on top, and that they interact with each other. The data from AL1 and AL2 feed AL3 across all phases, too. From AL3, the information flows on to AL4 and AL5, and, as you can see, AL4 and AL5 in turn interact with each other.
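The flows just described could be summarized as a small directed graph; this sketch paraphrases the talk rather than reproducing the paper's figure.

```python
# A sketch of the information flows in the combined model, paraphrased from
# the talk (not the paper's figure): AL1/AL2 interact and feed AL3; AL3
# passes insight on to AL4 and AL5; AL4 and AL5 interact with each other.
flows = {
    "AL1": ["AL2", "AL3"],
    "AL2": ["AL1", "AL3"],
    "AL3": ["AL4", "AL5"],
    "AL4": ["AL5"],
    "AL5": ["AL4"],
}
for source, targets in flows.items():
    print(source, "->", ", ".join(targets))
```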
Using an example, I would like to explain the interaction of the aggregation levels and the procedure level and show the most important aspects per level. Imagine you want to estimate the duration of an individual logistical process in a project; that's the task. You are doing this for a one-of-a-kind manufacturer who hardly has any routines and cannot reuse the knowledge from old projects one to one. First of all, you check in level 1 which data you need and which sensors can record and provide this data. For example, GPS sensors and timestamps are important for you. In level 2, you use software like Tableau to digitally represent the real world. Then, in the third level, you derive measures and rules based on the findings from level 2. In this step, the actual data analysis or data mining takes place. For example, you use ASUM-DM, CRISP-DM, or the KDD process, and here you carry out, for example, cluster analysis or case-based reasoning in order to identify similar old cases and to draw a conclusion about the duration of new processes to be planned. The findings from AL3, e.g. the analyses and the rules, flow into AL4 and AL5, after which a digital twin can be built in AL4. In level 5, the knowledge can then be made available, for example, in the form of an ontology.
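To illustrate what the AL3 analysis step could look like in this example, here is a minimal case-based-reasoning sketch; the features and case data are invented for illustration and would in practice be derived from the AL2 sensor data (GPS traces and timestamps).

```python
# A minimal case-based-reasoning sketch for the logistics example: estimate
# the duration of a new process from the k most similar old cases.
# Features and case data are invented for illustration only.
old_cases = [
    # (route length in km, number of handling steps, observed duration in min)
    (1.2, 3, 25.0),
    (0.8, 2, 15.0),
    (2.5, 5, 48.0),
    (1.0, 3, 22.0),
]

def estimate_duration(length_km, steps, k=2):
    """k-nearest-neighbour estimate: mean duration of the k most similar cases."""
    def distance(case):
        return ((case[0] - length_km) ** 2 + (case[1] - steps) ** 2) ** 0.5
    nearest = sorted(old_cases, key=distance)[:k]
    return sum(case[2] for case in nearest) / k

print(estimate_duration(1.1, 3))  # -> 23.5 min, averaged from the two closest cases
```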
Oh, I'm over time. With this contribution, we create a framework supporting the systematic, data-based creation of insight. Future work will be devoted to the demonstration, evaluation, and revision of the concept in practice, and will comprise the development of a taxonomy of methodological principles. Finally, the integration with standardized approaches, like the Reference Architecture Model Industrie 4.0 (RAMI 4.0), or with data management aspects, like the data lifecycle approach, can create synergies and add a helpful dimension to support the organizational implementation of the suggested method within enterprises. Okay, that's it. Do we have questions?