New Approaches towards User Research and Software Architecture in Research Software Engineering: A Humanities Example
Formal Metadata
Number of Parts | 60
License | CC Attribution 3.0 Germany: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/42509 (DOI)
Transcript: English (auto-generated)
00:00
Thank you. And after corpus linguistics and aerodynamics and partial differential equations, I think I'm kind of the exotic bird here in the room or maybe also in this conference because I don't come from the STEM side. I'm a research software engineer in the
00:23
humanities. Well, and the humanities can be a little broad. In this project here, we are focusing on art history, so very exotic maybe. And I work for the Census LOD project, and no, it has nothing to do with the big controversy back in the 80s where the German
00:46
state tried to inquire a lot about the German population. No, it is about art-historical reception data. Basically, what did artists in post-antique times, for example in the
01:04
Renaissance, know about art in antique times, for example ancient Greece? So we basically collect data to answer questions like: what kind of artworks did Michelangelo see, and how
01:23
did it influence his work? And this database dates back to the early 80s in the U.S., then later migrated to Germany, and has been in a, well, for some people
01:40
usable state, but for some people also a not usable state. This has been a long-running project; the funding phase, at least, was I think about 20 years, and then the funding ended. The people contributing to the project applied
02:05
for another grant to continue this database, collect more data, and enrich it, but the grant was rejected. And one of the reasons was: well, your application is running on a commercial system that is kind of out of date, and the
02:25
graphical user interface is very difficult to use. This might look very familiar to you; this is how websites maybe looked in the 90s or early 2000s, with a lot of fields where you can put stuff in and get stuff out. However, people have been
02:46
Googlified and are now accustomed to, let me say, easier approaches to information retrieval. So what I want to talk about now is how we are going to fix
03:07
that problem with that kind of database I just showed you, because we are in the fortunate position that the city of Berlin granted us another three years to come up with new concepts and recommendations on how to go from there and to apply for that kind of grant
03:26
once more. And that grant is not for three years, not six years, but 25 years. So we have to think about the software consequences over a much longer term. And unfortunately, my colleague Andrea cannot be here today, who focuses on user-centered design
03:45
in art history, or user research in general, but I will try to summarize her key points anyway. And later on, I will also talk a little about the software architecture evaluation we did for this project, and how we go on from there in the future. So, we had this database.
04:07
There were some users, and it has been used. The user numbers are not very high, but in the art-historical community across the world, our database was always seen as a kind of
04:22
treasure: what you can actually do with the data, how to use that data, for example for your own research, for publications and so on. But for some users, it was not always clear what we actually collect, what is in there, and why it is in there.
04:46
And these kinds of conclusions we could only draw by doing some user research at conferences, talking to art historians about what their goals were, and what their actions and their means were to do their research. And I have, can you read those papers?
05:09
It's just the notes from my colleague. And, whoops. Basically, times are changing and people
05:29
are getting accustomed to more modern solutions, to looking things up on their phones, and actually want to have the data in a FAIR format, which in the humanities, well, we're getting there,
05:43
but not in every sub-discipline are we there yet. So, we have talked to a lot of people about what they know and what they want from us, so we can take this into consideration. But since we are, well, not really software engineers per se, that's not our background,
06:07
I come from information science, my colleague actually from gender studies, so we both just kind of wiggled in there and learned, well, software engineering on the go.
06:25
We were kind of like: well, now we have this information, what do we do with it? And since our time is short and, well, resources are always kind of small, we also thought: well, if we don't have the expertise, maybe somebody else has,
06:41
and we did a rare thing for a humanities project: we actually got in contact with UX designers, and now we're working with her to understand our users better, to observe them, and to have her accompany us along the process of
07:06
creating new prototypes, applications, user profiles and so on, so we get to the goal we want to reach. But, unfortunately, I cannot summarize it as well as my colleague would have, so I
07:25
skip to the software evaluation part: well, where did we come from and where do we go now, right? As I said, the software, the database, is based on a commercial product provided by
07:41
a company based in Berlin called Programmfabrik, so basically "program factory" if you want to translate it literally, and they offer a digital asset management solution called easydb.
08:03
Our easydb instance has been modified to the max, so it's really hard to upgrade it to newer versions and to develop new things from there, and on top of that it's not even open source. So, well, I can understand why some reviewers of our grant application said, well, you should,
08:25
you should go open source, dude. And so what we basically did is ask: well, what are other people doing, right? This is the first thing you usually do when you're diving into a new subject.
08:41
We looked at our colleagues in the fields of digital humanities and art history and what kind of database solutions they were using, and a few of them are listed on the left side there. And basically we just started trying them out, and I don't mean, well, I am now a user and I click
09:03
here and I click there. No, it actually means: well, how easy is it to install this in our local server infrastructure? Is it easily deployable? Is there well-documented code, and is there long-term support? Who are the maintainers? Is there a big community?
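The questions just listed can be turned into a simple comparison sheet. The following is a minimal sketch in Python, not the project's actual evaluation method: the criteria names paraphrase the questions from the talk, while the weights, the 0-to-2 scoring scale, the choice of open source as a must-have, and the candidate names and scores are all hypothetical placeholders.

```python
# Hypothetical weights: how much each criterion matters (assumption, not from the talk).
CRITERIA_WEIGHTS = {
    "easy_to_install": 2,
    "easily_deployable": 2,
    "well_documented_code": 3,
    "long_term_support": 3,
    "active_maintainers": 2,
    "big_community": 1,
    "open_source": 3,
}

# Assumption: a score of 0 on a must-have criterion rules a candidate out entirely.
MUST_HAVE = {"open_source"}


def weighted_score(scores):
    """Return the weighted sum of per-criterion scores (0 = no, 1 = partly, 2 = yes)."""
    missing = set(CRITERIA_WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(CRITERIA_WEIGHTS[name] * scores[name] for name in CRITERIA_WEIGHTS)


def rank(candidates):
    """Drop candidates failing a must-have, then sort the rest by descending score."""
    eligible = [
        name for name, scores in candidates.items()
        if all(scores[criterion] > 0 for criterion in MUST_HAVE)
    ]
    return sorted(eligible, key=lambda name: weighted_score(candidates[name]), reverse=True)


# Made-up candidates and scores, purely for illustration.
candidates = {
    "candidate_a": {**{name: 2 for name in CRITERIA_WEIGHTS}, "open_source": 0},
    "candidate_b": {**{name: 1 for name in CRITERIA_WEIGHTS}, "open_source": 2},
}

for name in rank(candidates):
    print(name, weighted_score(candidates[name]))
```

In this made-up data, the closed-source candidate has the higher raw score but is filtered out by the must-have rule, which is the kind of trade-off such a sheet makes visible.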
09:22
And so forth and so on. The Software Sustainability Institute also provides a good guidebook and catalog of criteria on how to evaluate software for those kinds of projects, which helped us a lot. Other factors were, because
09:42
a focus of the project also was that maybe we should go into a graph database or linked open data, to also look at the capabilities of that. And, well, there were some solutions like ResearchSpace which provided a, well, very, very nice user interface and queryability,
10:05
if that's the word, where you can query complex graph data in a very open format. However, there were basically, well, one or two big walls in the way of actually
10:22
testing it the right way. And this was porting the data we have. Well, it is in a Postgres database, but, well, if we want to just take our data and put it into a different system, we would have to do a lot of adaptations so that it actually works there. So this was also not a
10:47
good idea to actually pursue in the long run. Well, what could we do now? Well, just a few days ago, we had a meeting, and we kind of have to scrap that kind of evaluation approach, and we somehow have to do it differently. Because, well,
11:09
the recommendations we make now have an impact for the next 25 years. And I will, how am I doing with time? Okay. Okay. If there's still time tomorrow,
11:23
I will showcase some of the system, but I think that's not the main point here. And, well, maybe one or two of you have delved into DevOps or something, and if you do that, you come across the Rugged Manifesto. And I'm usually not a big fan
11:41
of reading quotations out loud in presentations, but since this is being recorded, and so people can listen, I will do it anyway. So the third point in the Rugged Manifesto says: "I recognize that my code will be used in ways I cannot anticipate, in ways it was not
12:01
designed, and for longer than it was ever intended." And I think we've all been there, that we've kind of used some application not as it was intended, but totally differently, and I have anecdotes from the humanities. Maybe I just drift off here, because
12:21
I'm, okay. So where do I go? Yes, for example, in another project that two colleagues of mine are working very hard on, they are creating a digital edition and geographical information
12:42
system about historical data, about the Prussian state, or more like the royals in Prussia and, what's it called, the important people. And since they wanted to start entering
13:03
data right away, they did not use some kind of database, or even XML, or some other structured format, no. Well, they said: okay, we want to create organigrams, what do we do? We take Excel and create clip art. And now, if you want to use clip art for a web application,
13:24
for example, well, at least it's XML-y, so you can go from there and start migrating the data. Or another example is people creating transliterations of Quran codices in Word documents,
13:47
using, well, some kind of template, but it's not always very pure. It's like, okay, it works, but it doesn't really work if you want to have FAIR data in the sense of the FAIR principles. So, well,
14:01
this is basically the problem. How do we actually guard against this kind of misuse, and how do we make sure that the data and the applications are robust enough to last at least 25 years? Because, well, if you think back, who was computing 25 years ago?
14:28
One, one person, right? And it has changed a lot, right? Well, and so I don't know what the future brings. I can't even tell what the new features of Android phones will be like in
14:42
the next two years. It is very hard to anticipate how technology changes and how this technology in turn changes user behavior. So, what are the recommendations we can actually provide? You know, everybody's very tense now, right? And there are only a few.
15:04
Well, although we have this kind of runtime of 25 years, the grant will only contain a developer position for, what's it called, a half-time or maybe two-thirds position. So, while
15:22
the actual art historians are collecting and collecting data, doing their research and so on, there's this one gal or guy who has to keep this service up and running. And it is actually not an option to just keep the old version of this database up and running. So, what do we do?
15:44
Okay, we could of course upgrade it to the newer version, which would first of all cost a lot of money. Well, it is a little more extensible, but we're still faced with a problem. Oops, still not open source. So, that's basically a no-go. And so, it has to be maintainable for
16:07
basically a research software engineer in the humanities, who has to be in the mindset of both an art historian and a proper research software engineer or a
16:22
software engineer in general. So you kind of have to create a more modularized approach to your whole software. I think that a monolithic software architecture for web applications is not proper anymore for these kinds of runtimes. And I don't
16:43
think I'm telling some of you anything new here. So, basically, if we started, for example, migrating, where's the, so basically
17:02
by modularized approach, I mean: just break it up into different components and then replace each one when it needs to be replaced. Right now, everything is one application; basically the database is connected to the CMS, which directly outputs the data. We have no API, no means of extensibility, but if we had to start
17:26
redesigning the software tomorrow, an idea would be: well, we keep the current easydb up and running, since it is configured to enter the data properly in the current way and it works. And then we start maybe to create a REST API or GraphQL API,
17:46
talking directly to the database, providing APIs for the open source community and humanities community, and a more usable front-end for, well, the art historians and interested people in
18:09
general. Because implementing the front-end is actually not the hard part, but since this is such intricate data, redeveloping all the features to
18:24
enter all the data we need in the proper way for the researchers in the project would just take another two years. So it's not a sensible approach to start there, I think. So going from there, we always have to have in mind a kind of trade-off between sustainability
18:51
and up-to-date-ness: to always take the approach of minimal effort in the long run so it stays maintainable, but you also always want to have an up-to-date code base with the best
19:04
principles and a test suite and such. So it's very hard to find the right degree of up-to-date-ness, to really be up to date with the code then. And I'm really looking forward to how this will evolve in the next years, in the hope that the grant will get accepted.
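The modular idea from a few minutes earlier in the talk (keep the existing system for data entry, and put a separate read-only API on top of the database) could be sketched roughly as follows. This is a minimal illustration in Python using only the standard library, not the project's actual design: the `monument` table, its columns, the `/monuments` routes, and the sample rows are hypothetical, and an in-memory SQLite database stands in for the Postgres database mentioned in the talk.

```python
import json
import sqlite3
from wsgiref.util import setup_testing_defaults


def make_demo_db():
    """Create an in-memory stand-in for the existing database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE monument (id INTEGER PRIMARY KEY, label TEXT NOT NULL)")
    conn.executemany(
        "INSERT INTO monument (id, label) VALUES (?, ?)",
        [(1, "Example antique statue"), (2, "Example antique relief")],
    )
    conn.commit()
    return conn


def make_app(conn):
    """Build a read-only WSGI app that serves records from the database as JSON."""

    def app(environ, start_response):
        parts = [p for p in environ.get("PATH_INFO", "").split("/") if p]
        if environ.get("REQUEST_METHOD") != "GET":
            # Data entry stays in the existing system; this component never writes.
            status, body = "405 Method Not Allowed", {"error": "this API is read-only"}
        elif parts == ["monuments"]:
            rows = conn.execute("SELECT id, label FROM monument ORDER BY id").fetchall()
            status, body = "200 OK", [{"id": i, "label": label} for i, label in rows]
        elif len(parts) == 2 and parts[0] == "monuments" and parts[1].isdecimal():
            row = conn.execute(
                "SELECT id, label FROM monument WHERE id = ?", (int(parts[1]),)
            ).fetchone()
            if row is None:
                status, body = "404 Not Found", {"error": "no such monument"}
            else:
                status, body = "200 OK", {"id": row[0], "label": row[1]}
        else:
            status, body = "404 Not Found", {"error": "unknown resource"}
        payload = json.dumps(body).encode("utf-8")
        start_response(status, [("Content-Type", "application/json"),
                                ("Content-Length", str(len(payload)))])
        return [payload]

    return app


def call(app, method, path):
    """Invoke the WSGI app in-process and return (status, decoded JSON body)."""
    environ = {"REQUEST_METHOD": method, "PATH_INFO": path}
    setup_testing_defaults(environ)  # fills in the remaining required WSGI keys
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], json.loads(body)


app = make_app(make_demo_db())
print(call(app, "GET", "/monuments/1"))
```

Because the API is its own small component, a new front-end can be built against it and later replaced without touching the data-entry system. To try it over HTTP, the same `app` object could be passed to `wsgiref.simple_server.make_server`.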
19:25
So where do we go from here? We're currently in the process of actually creating our own new project website, where we will share our findings. We are hoping to get more insight about UI and UX in complicated research applications,
19:44
and we will start prototyping our new research approaches, or information retrieval approaches, in a few weeks with our UX designers, so we might be able to produce the first prototype at the end of the year or, since we have to consider other projects as well, maybe at the start of
20:04
next year. And with that, I want to thank you for staying this late, and I'm thankful for the questions that you hopefully have. Thanks. Thank you very much for the talk, and we have
20:23
two questions, I see. So yeah, I'm working in Mainz in an archaeological research institute, so you're not the only humanities guy here, and I know all the problems you mentioned. And I just want to know one thing. You should be part of the Union of Academies in Germany, right?
20:43
So did you get in touch with the Academy of Sciences and Literature in Mainz, with the Digital Academy inside, because there are a lot of people and they have a lot of ideas about all these things. Have you talked to them somehow? We have been talking
21:05
to them a lot, because I'm also working with Torsten Schade and Sarah Pitroff on different projects, and yeah, so we are in contact. Thank you very much. One comment or
21:21
question more. I know the humanities are not like physics, where you have a lot of money, but still, if it's a 25-year project, and you just outlined how much work has to be done, and you have a bus factor of less than one, shouldn't you kind of say: well, this is actually a large project, we want to have two people? Which I think is kind of impossible in the landscape of
21:46
research funding, because it seems like so much work, and also work for the actual 25 years. Yes, we are still in the process of deciding what to recommend, and the current state is that they actually want to put in a two-thirds position, but we urge them to do
22:09
actually more, because the long tail will be that the architectures and the whole application will get even more complicated in the future, because I don't know what will be there,
22:21
like VR or AR, maybe that might be where people are in a few years. And unfortunately, just from experience within the humanities and digital humanities world in the
22:41
Union of Academies, there are now projects that have been accepted with one full-time position, but usually if you apply for more, they just say nah, because the main focus in those projects is not on the digital side but still on the
23:02
humanities side: identifying the art-historical data, doing publications, and completing a corpus of, maybe, let me just make something up, all antiquities situated in a
23:20
specific part of Italy. So you have to find a trade-off there. And another thing is that the person employed might not be there for the whole 25 years, since the contracts are not necessarily bound to the whole runtime of the project but to
23:46
phases of evaluation, which take place every three or six years. So, so that other people can also adapt, you have to create a good kind of approach to your code
24:02
to make it maintainable, so other people in the future might as well take over. Does that answer your question? Yeah, okay, we have another question. Yeah, [name unclear] from Humboldt University. I just wondered why you're talking just about application sustainability and giving a time
24:23
frame of 25 years. I would just talk about data sustainability, because it's for sure that you will have to change your applications every 10 years at the latest. I mean, looking back 25 years, it was 1994, and if you try to use software from 1994, that won't be too successful
24:42
today, I suppose. So why not just talk about data sustainability and then change your software stack every 10 years, as is normal for many infrastructure organizations? Well, data sustainability, or research data management in general, is also a thing we
25:02
have on our minds, and at the academy we now have two people focusing on that, so this ball is rolling. And yes, changing your application every decade or so, well,
25:21
this is kind of normal, but the problem that I personally experience within my team of, I think, 15 people like me in different kinds of projects is that as the funding of these projects runs out, right now
25:46
we are still inclined to keep those services up and running, so we have to keep up a kind of maintenance, and we also don't want to, and cannot, change the basic UI and UX because there is no time for that.
26:03
So these are basically the challenges we face there. Of course, if we have a proper research data management plan and the corresponding repositories, and export the data in a human- and machine-readable format, we can go from there, but this will also be, I think,
26:21
a step that has to be taken within these 25 years, to port this data into that kind of format. We actually have this on our minds.
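The export step mentioned in this last answer, getting the data out in a human- and machine-readable format independent of the application, could look roughly like the following. It is a minimal sketch in Python under stated assumptions, not the project's export routine: the field names (`id`, `label`, `period`) and the two sample records are hypothetical, and JSON and CSV are just two plain-text formats chosen for illustration.

```python
import csv
import io
import json

# Hypothetical field list; a real export would follow the project's own data model.
FIELDS = ["id", "label", "period"]


def to_json(records):
    """Serialize records as indented UTF-8-friendly JSON with stable key order."""
    return json.dumps(records, ensure_ascii=False, indent=2, sort_keys=True)


def to_csv(records):
    """Serialize records as CSV with a header row; the csv module handles quoting."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Made-up sample records, purely for illustration.
records = [
    {"id": 1, "label": "Example antique statue", "period": "antiquity"},
    {"id": 2, "label": "Example drawing, after the statue", "period": "Renaissance"},
]

print(to_json(records))
print(to_csv(records))
```

Both outputs are plain text that can be opened in an editor and parsed by standard tools, so the data can outlive whichever application currently serves it.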