
Challenges for Scientific Databases

Formal Metadata

Title
Challenges for Scientific Databases
Title of Series
Number of Parts
9
Author
License
CC Attribution - NoDerivatives 4.0 International:
You are free to use, copy, distribute and transmit the work or content in unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
Interview with Frédérique Lisacek (Swiss Institute of Bioinformatics, Geneva, Switzerland), recorded at the Beilstein Open Science Symposium (22–24 May 2017). Frédérique Lisacek talks with Carsten Kettner about challenges for scientific databases. At the Swiss Institute of Bioinformatics in Switzerland, two groups are responsible for setting up bioinformatics databases: one in Geneva, which traditionally looks at protein data (for example SwissProt), and one in Lausanne, which looks at DNA data. The main challenge in hosting and maintaining data collections is infrastructure, in terms of both hardware and information curation carried out by experts in the corresponding field. Financing curation is a big challenge because it does not traditionally fit into either research or infrastructure. Frédérique explains how important standards are. They need to be developed jointly by people who have the scientific knowledge and people who have the technical knowledge. The coexistence of several standards for the same thing is probably an advantage, as no one standard will be able to capture all aspects; however, we should narrow the number down
Keywords
Chemistry, Computer animation
Activity (UML), Protein, Host (biology), Sea level, Infrastructure, Wine tasting descriptors, Health disorder, Disinfectant, Collecting, DNA synthesis, Meeting/Interview
Infrastructure, Base (chemistry), Host (biology), Hardness, Meeting/Interview
Separation process, Stiffness, Growth medium, Breed standard, Addition reaction, Wine tasting descriptors, Health disorder, Water association, Germanic peoples, Titanate, Meeting/Interview
Transcript: English (auto-generated)
Frédérique Lisacek from Geneva. You are from the Swiss Institute of Bioinformatics. I guess that Geneva is a very nice place for hosting big data, as CERN is doing and the SIB is doing. So do you have an idea why Geneva is such a great location for that?
Well, you're putting me in a difficult situation, because I cannot say it's only Geneva; it's Geneva and Lausanne. And there's really cooperation, especially at the level of SIB; there's a certain division of labour. There's a tradition in Geneva to look at protein data and in Lausanne to look at DNA data. And the big data at the moment is rather on the DNA side, so there's a lot of activity happening in Lausanne as well, on DNA. And because SwissProt was born in Geneva, there is this sort of tradition, yes, to have a specialty in protein information, annotation and curation.
So was it born by coincidence in Geneva, or was there any plan behind that?
How would I know? No, I think it was that Amos Bairoch was in Geneva; his family decided to settle in Geneva, and so it happened there.
Yeah, I see. So what would you say are the largest challenges to maintain and host data collections, let's say in general?
Infrastructure is a challenge for national agencies who are funding research, as if you didn't need the means to actually complete your research.
Are we really talking about just the infrastructure? I mean, infrastructure is hardware.
Yeah, but it's not only that. It's hardware, but you have to fill in the content. I mean, database information curation is part of it, and actually it's halfway between research and infrastructure, and it doesn't belong to either, in a way. And this is why it was so much of a challenge to actually finance this aspect.
But data is not all, I guess. I mean, if you just collect data, let's say raw data, you also need something additional. Let's say the descriptors of these data.
Yeah, so this is what I call curation and annotation, and this is all thought about by people who design databases. I mean, if I take again the example of SwissProt, the schema of the database was certainly designed by someone who knew; I mean, it was Amos in collaboration with a few people, but there was actually some exchange between the people who do the schema and the people who have the notion of the content of the data. So communication is the only way to actually solve problems, as far as I'm concerned. And we saw that in the presentations just then. I mean, communication now is helped by a number of media that are adding to science.
So what about standards?
Standards, yes, this is one aspect. Thank you for helping me go through the items on the list. So this is essential, but it's part, again, of the exchange between people who have the scientific knowledge and people who have the technical knowledge. So how can you actually exchange and really frame the data in such a way that it's not going to be too framed, and there's still going to be some flexibility?
I still think that the coexistence of several standards is a good thing because you cannot actually capture everything in one box. So we speak different languages: you speak German natively, I speak French, and yet we use English. So let's have a few languages and a few standards and a few different means coexist, but let's narrow them down so that it's not scattered all over the place.