
ediarum - from bottom-up to generic programming


Formal Metadata

Title
ediarum - from bottom-up to generic programming
Title of Series
Number of Parts
60
Author
Contributors
License
CC Attribution 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
With ediarum, the TELOTA group of the Berlin-Brandenburg Academy of Sciences and Humanities has been developing a working and publication environment for digital scholarly editions for several years. In contrast to similar projects, a bottom-up approach was chosen, i.e. the research environment was developed for the needs of individual scholarly editions projects. Due to the success of the research software, ediarum is used for more and more editions. This led to the change of the development model to a generic approach. The advantages and disadvantages of the development approaches and the organizational structure are discussed in the article.
Transcript: English (auto-generated)
To introduce myself, my name is Martin Fechner and I'm working at the digital humanities department of our institution, the Berlin-Brandenburg Academy of Sciences and Humanities. Our institution carries out basic research in the field of the humanities, and our digital humanities department helps all the research projects with their research and with bringing it into the digital realm. This lecture deals with a software project which
was developed in the context of scholarly editions. What are scholarly editions? Scholarly editions make historical sources such as letters, diaries, etc. from archives accessible to researchers. For this purpose the sources are transcribed and commented on by the researchers, and at our institution, the Berlin-Brandenburg Academy of Sciences and Humanities, there are many such editions: on philosophers like Leibniz and Kant, and on other famous personalities such as Karl Marx or Alexander von Humboldt, who wrote a lot of letters and papers which are not published yet. So our
software project, which we built, establishes a digital workflow for the scholarly editions, and this is known as digital scholarly editions.
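To make the single-source idea behind such a digital workflow concrete, here is a minimal, hypothetical sketch: one XML transcription (with invented element names, not ediarum's or the TEI's actual schema) is rendered both as HTML for the web and as a plain-text print layout.

```python
# Sketch of the "single source" idea behind a digital scholarly edition:
# one XML transcription feeds both a web and a print-oriented rendering.
# Element names are simplified stand-ins, not a real TEI schema.
import xml.etree.ElementTree as ET

LETTER = """<letter>
  <heading>To Alexander von Humboldt, 12 May 1820</heading>
  <p>Dear friend, the enclosed pages contain my notes.</p>
</letter>"""

def to_html(xml_text: str) -> str:
    """Web publication: wrap the transcription in HTML elements."""
    root = ET.fromstring(xml_text)
    head = root.findtext("heading")
    body = "".join(f"<p>{p.text}</p>" for p in root.findall("p"))
    return f"<article><h1>{head}</h1>{body}</article>"

def to_print(xml_text: str) -> str:
    """Print publication: the same data, laid out as plain text."""
    root = ET.fromstring(xml_text)
    lines = [root.findtext("heading").upper()]
    lines += [p.text for p in root.findall("p")]
    return "\n\n".join(lines)

print(to_html(LETTER))
print(to_print(LETTER))
```

Both outputs come from the same XML document, so a correction in the transcription propagates to every publication format automatically.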
Digital scholarly editions follow a genuinely digital paradigm. That means the digital is understood not only as a tool (we don't just build a tool) but as an independent method, a way of thinking about how to establish the editions. In our field, digital scholarly editions are usually encoded in the XML format, according to the guidelines of the Text Encoding Initiative, which is a standard. The aim for our project, and at our institution, is to make the edited historical sources available for subsequent use and to be able to link them with further information and to other
databases. That's bringing it into the digital: not only books in the library that you can read, but all this material and other databases you can use. A further goal, from our perspective a more technical one, is the single source principle: being able to generate everything from one data source, in this case the XML documents. Both web publications and print publications are relevant presentation
formats for scholarly editions today. Our initial situation in 2011 and 2012 was as follows. It was clear that we needed a new workflow for many
projects in the house. The previous one no longer met the requirements for digital scholarly editions: before, the main focus had been on printed scholarly editions, but now we had to produce websites and more. So
the edition project Schleiermacher in Berlin (Friedrich Schleiermacher was a famous Berlin theologian of the 19th century) offered at that time, in 2011, the opportunity to implement a new
workflow. At that time, however, there was no software that would have met the requirements of the edition projects at our institution for the creation and publication of digital scholarly editions. Furthermore, we had only few resources, as is usual in the sciences and the humanities, that we could use to create a software solution. Consequently the decision was made not to attempt a completely new development, like the one we heard about in the video talk before, but to use our resources carefully and to build on existing software, developing it further where necessary. In our case that was the XML database eXist-db, the XML editing tool Oxygen XML Author,
and the typesetting engine ConTeXt. And this approach led to success. We
were able to create a software solution that we now call ediarum. This solution still supports the workflow of the project today, relies on a sustainable data format, XML, and makes all desired forms of
publication on the web and in print possible. This successful digitization project was characterized by the following key points. We used a bottom-up approach, meaning concrete development for one existing project,
that was the Schleiermacher scholarly edition, and we had to provide more resources for development than was usual for previous projects. For this
digitization project we had about two developers with one full-time equivalent over a period of about half a year to a full year; for comparison, previously we had planned only with periods of a few weeks or months
for smaller projects. We had close communication with the research project, the scholarly edition project, and we made use of existing stable software with good support, which was the key. We didn't need to program
everything ourselves; we developed adaptations and extensions of these different software solutions on our own. As a result of this pilot project there was very positive feedback. We took this
feedback as an opportunity to immediately transfer our development and software concept to another research project. And that was the project
Commentaria in Aristotelem Graeca et Byzantina, a manuscript research project. There we started from scratch and put together a software package similar to the one for the first pilot project. This project was
also subject to the key points which I mentioned, and it was very successful. We moved one step closer to our goal, the digitization of the scholarly editions at our institution. But with the success grew
the desire to use ediarum, as we called our software solution by then, in many other projects. But we were only able to realize one or two new projects at the same pace. We had to become more productive. At the same
time we discovered that we often reused code already developed for one project in another project. This copy-and-paste approach
accelerated the implementation but also led to existing errors being duplicated: if we found an error in one project, we had to remove that error in all our projects. And that's work we didn't
want to do. So the operation of several software solutions for individual projects, including further development and bug fixing, as well as the desire to be able to implement new projects easily, led to the consideration of
restructuring our development process. As a result we no longer programmed for each project alone, the bottom-up approach I've talked about, but
switched ediarum to a generic, easily adaptable program core. The concept of ediarum today includes common core components which are used in all projects. New features are implemented and bug fixes are
made in the core components, from which all projects profit at the same time. Furthermore there are project-specific extensions: for each
project there is its own program code that supports its special requirements, because our projects are very different and heterogeneous, so we also need specific code for each project. The basis for this
approach was standardization of the data model used by the projects. The importance of one common data model for all projects, even very different ones, cannot be emphasized clearly enough. This is because all ediarum modules build on the data model in one way or another. It's the basis of all the software, and a generic development of ediarum requires a standardized data model. In 2015 the standardization of the data model took place, first for a
certain project type of which we have many: editions of modern German texts. There are other kinds of editions too, but we have a lot of scholarly editions of modern German, and based on this, several ediarum modules were developed which take over individual tasks within a digital scholarly
edition. There is ediarum.DB, which provides project, user and data management within the XML database. There is ediarum.BASE.edit, which extends the XML editor with the necessary features to provide the
researchers with a meaningful data input interface, because the researchers shouldn't have to learn the technical stuff behind it; they have to do their research, read the manuscripts, transcribe and comment on them, so they don't have the time to learn all the technical details. We have ediarum.PDF, which contains the program code to generate a PDF from
the XML files via the typesetting engine, following the layout of common print editions. We don't print that PDF as a book ourselves, there are publishers who do that, but the researchers wanted
to have a PDF to read their editions and to correct them. And then we have of course ediarum.WEB, which contains a program library with which digital presentations for digital scholarly editions can be created without
much effort. Here I have visualized our structure with different layers; the different modules represent the different layers of a digital edition.
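The split between shared core components and thin project-specific extensions can be sketched roughly as follows. The function and setting names here are invented for illustration, not ediarum's actual code or API.

```python
# Hypothetical sketch of the core/extension split: generic core code is
# shared by all projects, and each project supplies only the variables
# that adapt the core to its needs. Names are invented for illustration.

# --- generic core component, identical for every project ---
DEFAULTS = {"siglum_prefix": "", "show_line_numbers": False}

def render_document_label(doc_id: str, project_settings: dict) -> str:
    """Build a display label; project settings override the core defaults."""
    cfg = {**DEFAULTS, **project_settings}
    label = f"{cfg['siglum_prefix']}{doc_id}"
    if cfg["show_line_numbers"]:
        label += " (with line numbers)"
    return label

# --- project-specific extensions: only variables, no duplicated core code ---
schleiermacher = {"siglum_prefix": "KGA-"}
manuscript_project = {"show_line_numbers": True}

print(render_document_label("B123", schleiermacher))        # KGA-B123
print(render_document_label("Ms-7", manuscript_project))    # Ms-7 (with line numbers)
```

A bug fixed once in `render_document_label` benefits every project at the same time, while each project's settings dictionary plays the role of the project-specific extension described in the talk.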
You see at the bottom the data model, and the layers above are the data itself, the management, and the publishing and presentation components. In color are the core components, and in
gray the project-specific components. Our development workflow today is as follows. If there is a new feature request from one project, we check to
what extent this requirement also exists in other projects. If it is only a project-specific need, the implementation takes place in the
project-specific extension of ediarum, but if there are similar requirements in other projects, the generic development process starts. That means for the development specification all similar requirements of the different
projects are brought together, and sometimes during the implementation the possibility must be provided that the feature can be adapted project-specifically. We usually do this by integrating variables in the code that are then defined in the project-specific extension. The generic
development approach has enabled us to strengthen and continuously develop the core components, which is good for all projects, and furthermore it's
relatively easy for us now to set up new projects and make them ready for work. Finally, the project-specific components, as in the bottom-up approach, make it possible to implement specific requirements for individual projects. So I come to the conclusion and summarize the advantages and
disadvantages as well as the path of a change from bottom-up to generic programming. Let us start with the path from bottom-up to generic programming. First, experience is gained in individual pilot projects, and we saw that more resources than usual must be made
available for these pilot projects. They have to be well developed, and you shouldn't cut the resources short. Then three conditions must be met before a changeover to generic programming can take place. First, the
pilot projects were successful, that's clear. Second, further projects are to be implemented. And third, a common core, such as a common data model or repeating program code, can be identified. The next step in the
changeover is the development of core components without project-specific requirements; that is new work which has to be done. The implementation of the generic components for concrete projects is the next step. And the last point here, and it's very important: you must not forget the migration of the pilot projects to the generic components, because by then you have developed a new structure and the pilot projects
are still individual projects, you have to migrate them. And once all this has been done, the further development and maintenance of the software can take place in the generic program, as I have described. This
approach requires addressing the following challenges. As we heard in the video talk before, in science there is only project financing; that
means a generic development as I have described here must be project-independent and requires additional funds of its own, because you can't fund it out of the projects. The next one: the generic development and the
project-specific development compete with each other. That means the completion of important milestones for the projects (you have these milestones for evaluations, publication dates, etc.) leads to the
postponement of necessary generic development. That hurts a lot, because you see the importance of the generic development and a lot of projects want a special feature, but you have to reach a milestone for one
project and you can't do the other development. The next one: when switching from individual projects to a generic approach, similar but not identical program code must be brought together, and this can be difficult and
can mean additional work as well. Migrating the pilot projects, as I have said, and other existing projects to the new approach can be very time-consuming.
This improves the sustainability of the project, which is good, but there are no visible new features for the project, which is bad for the evaluation and the outcome of that specific project. Nevertheless, this approach which I have
described has proved successful for us because we see the following advantages. The first prototype is ready for use more quickly in the bottom-up
process, because at first only the requirements of one project have to be considered. You don't have to think about everything; you can focus on one project. The goals of your software are supported by the strong focus on
the projects: you don't lose contact with the project, you communicate with the projects very closely, and you know what they want and what they need. And the software is strongly oriented to the concrete needs of the
users. The software is not designed too theoretically: the danger of developing past the real needs is much lower than in generically planned software where you only have the general view. Also, not
too much is developed: the necessary features are implemented, but no unnecessary ones. The use of generic core components and project-specific extensions achieves a reasonable balance between desired standardization and required project-specific adaptations. And the last point, which is very important: the generic core components significantly
simplify the maintenance of many projects. We have a lot of projects to maintain at the same time, and we don't want different software code beneath them. So if you are interested in our software solution,
here is some further information, and with that I thank you for your attention.