
Open Data & Culture - Creating the Cultural Commons

Video in TIB AV-Portal: Open Data & Culture - Creating the Cultural Commons

Formal Metadata

Open Data & Culture - Creating the Cultural Commons
CC Attribution - ShareAlike 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.

Content Metadata

Overview and goals

Open data is gaining considerable attention within political, organizational, scientific and citizen communities internationally. As awareness of open data grows in those spaces, it has also grown in the cultural heritage field. More data is being made available online, free for the public to reuse. The attention and interest that open data is garnering has led cultural organizations to ask a number of questions: what are the benefits and challenges of open cultural data for both institutions and society? How do institutions open up in an effective way? During this session, we will answer these questions and more.
[Moderator; opening partly unintelligible] This talk is in English. We have heard a lot about open data so far, and now we come to a special field: open cultural data. The next two speakers will explain what open cultural data is, why it is important, what its benefits are for both society and institutions, and how cultural institutions can open up in an effective way. Please give a warm welcome to both speakers. [Name unclear] is from Amsterdam, works for the Open Knowledge Foundation and specializes in digital heritage. Daniel [surname unclear] is an open data evangelist at the Open Knowledge Foundation in Germany.
My German is not good enough to talk to you in German today. I live in Amsterdam and I work for the Open Knowledge Foundation, for the OpenGLAM initiative, and I am about to tell you what that is. Next to me is Daniel, the chairperson of the Open Knowledge Foundation Germany, who I guess does lots of other stuff here too. [unclear] A brief overview of what I am going to talk about: first of all, I am going to go into the notion of the cultural commons, and more specifically a digital cultural commons. I will talk a bit about why it is important that this is an open commons, go
briefly into a few open projects, and then in the end go into curating this vast amount of data in the commons. First of all, I work for the Open Knowledge Foundation. People who were in this room earlier have, I think, seen a few other things that we do around transport data. Most people, I think, know us for our work on open government data and the CKAN portal, which is being used by governments, such as the UK one and the EU one, to publish datasets. Basically, we work to open up data and knowledge, make it freely available for everybody to reuse, and also make it used and useful. I think those last two things are really important, because just opening up stuff is often not enough. So this is the OpenGLAM
initiative. We do what I just said, but with cultural data: we go to institutions, help them open up their digitized content and metadata, and bring it to the people who can then make use of it. It is a bit of a weird name, so first of all I will explain what it actually is. Why "GLAM"? I think that is the biggest question mark. "Open" means that digital content or data is free to use, reuse and redistribute without any technical or legal restrictions. "GLAM" is an acronym for galleries, libraries, archives and museums. It was coined by somebody from the Wikimedia movement a few years ago, I believe, because saying "galleries, libraries, archives, museums" every time you give a presentation takes a lot of time. If you want to know more about what "open" is, please refer to the Open Definition. There you get a really clear description of what we mean by open and, particularly, what open is not. For example, using a non-commercial restriction does not count as open. OK, so I will first talk a bit about the commons, and I will start with the traditional commons from economics. From Wikipedia: resources accessible to all members of a society. So traditionally you had
environmental examples: forests, fish, air. This is all stuff that everybody can make use of. Everybody can walk around in a forest, everybody can chop down a tree and bring it home, everybody can fish in the sea, and we can all breathe the same air. So traditionally people say this is non-excludable and also non-rivalrous. In 2013 this notion is a bit debated, because people do own forests and the fish are running out. I have not seen people claiming air, but perhaps that can happen one day. So what is the digital
commons? This is a more thorough description: information and knowledge resources that are collectively created and owned or shared between or among a community, that are freely available to third parties, and that are thus oriented to favor use and reuse rather than exchange as a commodity. So the promise of the digital commons is that, besides being non-excludable because everybody can access the same dataset, it is non-rivalrous: you do not lose your data when you share it with people, and people can use it indefinitely. I can take a certain dataset or a piece of content and reuse it in any way I want, and at the same time you can still do that as well, and you can do it again and again, and your neighbor can do it, and so on. So digital artifacts can be curated, remixed and annotated by anyone at any time. We can build upon each other's work and continue it, but we can also start with a new dataset and do a whole different thing, or replicate the thing. When I talk about cultural content, I think it is important to make clear the distinction we also use when we talk with cultural institutions: you have content and metadata. Content is quite
easy: it is digitized material from institutions, which can be anything: a video, a painting, a book, a 3D model. Here we have a medieval manuscript (16th century, or rather 15th century) showing the Ascension, Christ going to heaven, and on the right we have an Egon Schiele from the early 20th century, one of his landscapes. That is content. The content also comes with
metadata. The pure basics of metadata is that it is about finding the data. You want to know who made it, what he named it, and what the style of the work is; when it was created is also very useful; and then some more technical details: what is the size of the painting, what kind of paint did he use, and such. Metadata can go even further, for example by adding a transcription, so that it becomes machine readable and people can work with it: you can translate the content, you can even make an analysis of it and link to further reading. As you see, at this point the line between metadata and content becomes a bit vague, because somebody who did a transcription can say: well, this is actually my original piece of work, this is content, so I will copyright this, and so on. So there are all kinds of difficult discussions around content versus metadata, which I am not going into today because we could spend hours on it. So why should this data be open? This is what I usually tell institutions; I will go into each point in more detail in the next slides. First of all, it helps GLAMs fulfill their public mission. They will reach out to a larger audience and allow it to participate. Connecting and contextualizing collections, I think, is very important. And finally, it keeps memory institutions relevant in the digital age. So first of all, the public mission. If you just look at mission statements and responsibilities, or whatever they call it on their websites: they are public institutions, they are publicly funded, and they serve the public, which means you, the taxpayer. Every single institution we come across, basically (maybe a few research institutions do not), says: we enable access, accessibility, sustainability, access again, there is a little bit of education, and the collections are made available for use and such. It is all there. And these are not mission statements from the last few years, but these
are mission statements which have been there ever since they started, and they are still working on these missions. So by opening up your data, by giving it to a much wider audience than you could ever imagine, you are simply serving your public mission, and you can reach out to a global audience.
This is a small case study from the Netherlands: the Netherlands Institute for Sound and Vision and the Open Images project. What they did: they took some of their audiovisual collection and uploaded it to Wikimedia Commons so it could be used in Wikipedia. As a result, 500 articles started embedding these different video clips, internationally they also started using them, and they now get two and a half million views a month, just by putting it on Wikipedia. And the thing is, this is a video of a
cow. That is one example: it is a cow that wandered off from its mother into the woods [unclear], from the Netherlands in the fifties. And the thing is, it is 0.015 percent of their collection. They have hardly digitized anything, and they can make even less openly available. So 0.015 percent of their collection is available on Wikimedia Commons, and it results in two and a half million views a month.
Participation: you can invite your visitors and users to contribute to aspects of your collection. They can take data from different collections, curate it and put it in a different context. They can enrich it: a lot of institutions that opened their data got responses from users saying, "Well, actually, you say this is 1947, but this is not 1947; I can see from the helmets worn during this battle that this is 1945," and they changed it afterwards. To some extent, users can also provide new content by adding information. There have also been road shows, for example, where people bring their own collections from World War I or World War II to be digitized and added.
Connecting and contextualizing: this is Van Gogh's work. His paintings are all over the world. We have the Van Gogh Museum in Amsterdam, but they do not have everything at all: there is stuff in Berlin, and there is stuff in the US. What I think the Internet in particular can do really well is bring all these different collections together. As a researcher, for example, you can get access from your computer to all these different things, bring them together, look at them, compare them, show them in a timeline, and also add secondary literature, like the diaries he wrote while he created this. You can actually do much more for your research than you could if you did it physically. And of course the
issues. I have decided not to go too much into the issues, but there is still a lot of work that needs to be done. Institutions are concerned about the loss of revenue streams. They say: if we open it all up, who is going to pay us for it? Then there is the attraction of private schemes that lock down data. Opening up and digitizing is pretty expensive, and an institution often cannot do it by itself, so often they go into some public-private partnership. I think the most famous one is the Google Books project: they digitize your stuff, but in return you give them the exclusive rights to work with it for 15 years. On the one hand it is nice that it is there, but on the other hand it is really weird when stuff that is in the public domain, which everybody should have access to, is all of a sudden not accessible anymore except via a Google service. Then there are worries about the misuse of data and content. What you very often hear is: what if [unclear] use my stuff? It is a concern, but I do not think it is a very legitimate one, because people can do that anyway, and as an institution you often do not have the means to sue them. Then, and I think this is by far the biggest one, there are legal uncertainties: licensing and orphan works. You have a work and you do not know who made it, so how are you going to ask him or her, or the heirs, what you can do with it? Can I make it accessible? No idea. So it just lies there, untouched by anybody. And there are the technical challenges: we should all use the same standards, and we need tools to get access to the data and reuse it. I will briefly talk about a few open
projects to show what has been done over the last few years. This is Europeana, a really big project funded and initiated by the European Union. It has been there since 2008. It originally started with the political idea of the European Union that culture can bring us closer together: we can learn about each other's culture and we can show stories from different angles. If you just look at major events in history, there is always a different story from a different country, and even from a different village. So that was the original idea. What they did not say is that it was also a response to the Google Books project. Google came along and started digitizing European books, which all went well until they started touching the French books. The French got really upset with these Americans touching French heritage and said: we are going to do it ourselves. Europeana now has 28 million records from 2,600 institutions in its database. It is all metadata, so information about the work, and it all links to an actual digitized object, but via Europeana you can access all this stuff from one entry point. [unclear] The metadata in Europeana is now all under the CC0 license, which means that it is as freely available as possible: anybody can do anything with it; it is basically out of copyright. To do that with content is much harder, because people are still living [unclear]. Metadata is something the institutions created themselves; basically they own the rights, so they can open that up. For content it is a much more difficult story, and Europeana links to the institutions. This
is a visualization we made of how much Europeana has grown over the years. Looking at the time, I can show you the link if you want [unclear]; you can just look it up and see how much Europeana has grown over the years. This
is the DPLA, the Digital Public Library of America, launched I think three weeks ago now. Basically, what they are doing is a similar thing to Europeana, but in the US. They started off with two and a half million records from [number unclear] institutions, metadata again, with links. Luckily the DPLA and Europeana very much work together: they realize that in order to do this right, we need to align our efforts to make one big database where everybody can get access to these things. And
this is Wikimedia Commons, the media repository of the Wikimedia Foundation. People can upload content to it. It is volunteer-driven, but they do have a lot of collaborations with cultural institutions: they talk to institutions, they upload the material to Commons, and Wikipedians can use it to enrich Wikipedia articles, for example, which leads to major outreach for these institutions. And again people come back to them saying, "Well, this metadata is wrong," and so on. There are 17 million media files. By far not all of them are very interesting, but there is a lot of stuff there. I do not know how many institutions actually work together with Wikimedia Commons, because sometimes volunteers themselves just find public domain material and upload it there. It is content, so you can actually see something, and there is always very clear licensing visible. The default is quite strict: if they are not sure that something is, for example, in the public domain, they will remove it. Finally, the
Internet Archive, run by [unclear] volunteers: 9 million media files, and also lots of different stuff. If you are a Grateful Dead fan, then you are really well served there. It is content, under various licenses.
So in summary, we have 30 million metadata records within Europeana and the DPLA alone, and there are more. There are 25 million open media files, and more is coming, because in Europe, which is doing pretty well when it comes to digitization, less than 10 percent is actually digitized at the moment; I think it is far less than 10 percent if you take the average over all of Europe. So more is coming. So here we go: how do we make sense of that? That is what I want to go into now, because if you go to the Europeana portal right now and type in "Van Gogh", you get 25 thousand hits. How useful is that? So we need a way to curate this commons. Recently the co-founder of the Open Knowledge Foundation, Rufus Pollock, wrote a blog post about small data [unclear]. I do not know if you have read it; it is on the blog of the Open Knowledge Foundation and I recommend it. It is basically about this: instead of creating these incredibly huge sets of data owned by a company on one server, for example, we need different packages of data. To take the Van Gogh example again: for me it would be very useful to get a torrent file with just all of his works, which I download, and then I have it all. Then, finding relevant links: that is something the web is getting better and better at. You have an artist you find interesting: who influenced him, where did he live, which other artists lived there, and such, so you can continue exploring and researching. And finally, we need tools to make sense of this vast amount of data and to work with it, because we should allow people to collect the data, bringing it together; to visualize it, not only by putting it on maps, but also, for example, by showing a timeline of an artist, which periods, where and why, with extra information on top of that; and to annotate. In research that is something they are really looking at: the idea of annotating academic texts, and actually all kinds of texts, doing it together in a collaborative way, and being able to say: well,
this paragraph was inspired by this author, this is a critique of this, he said this also in this book, and so on, and so create this whole web of annotations. Mash up: just do anything you want with it. And much more. When we talk to cultural institutions about doing this, they often say: well, you know, we are the experts, we are the curators, we have studied for years and years, we know how to do it best; if the users start doing this, they will rip our collection apart, take all meaning out of it, and our work is basically all for nothing. I was on a train yesterday from Amsterdam [unclear], with a lot of free time and no internet, so I made this. Basically, what we are trying to do a lot with institutions is letting them work together with the users, because very often they cannot do everything themselves, and the community is very strong: people really want to help, and in some cases they even are the experts. So you can make use of that [unclear]: people are willing to contribute their expertise to your collection to make it better. I do not know what the exact situation in Germany is, but in the Netherlands institutions are getting less and less funding every year, and what you see as an outcome of that is that they have started working together a lot with other cultural institutions and with a community. Within the Open Knowledge Foundation we have been working on a couple of software tools to facilitate this. First of all we have TEXTUS, which is a tool to collaborate around texts. It allows you to pull in texts from all kinds of different sources. Wikisource, another Wikimedia project, has a lot of different texts from philosophers, all transcribed; you can pull them in and start annotating, linking, and so on. We have the Timeliner, of which I am a
big fan. It is just a really simple application: it allows you to populate a Google spreadsheet and it shows you the timeline. If you are a bit familiar with querying databases, you can do this in, I figure, 20 minutes, maybe, plus cleaning up your data a bit. What we did here: search for all medieval philosophers between 1000 and 1400, pull in the fields for geolocation, description, dates and image, and you can show it.
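The query just described (select entries within a date range, keep a few fields, show them in time order) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Timeliner's actual code or spreadsheet layout: the `Entry` fields, the `timeline` function and the hand-typed sample rows (with approximate dates) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """One timeline row; the field names are illustrative assumptions."""
    title: str
    start: int        # year of birth (approximate)
    end: int          # year of death (approximate)
    place: str        # stands in for the geolocation field
    description: str


# Hand-typed sample rows standing in for the result of a real query.
SAMPLE = [
    Entry("Thomas Aquinas", 1225, 1274, "Roccasecca", "Scholastic philosopher"),
    Entry("Rene Descartes", 1596, 1650, "La Haye en Touraine", "Early modern philosopher"),
    Entry("Anselm of Canterbury", 1033, 1109, "Aosta", "Scholastic philosopher"),
    Entry("William of Ockham", 1287, 1347, "Ockham", "Scholastic philosopher"),
]


def timeline(entries, first_year, last_year):
    """Keep entries that fall entirely inside the range, ordered by start year."""
    selected = [e for e in entries if first_year <= e.start and e.end <= last_year]
    return sorted(selected, key=lambda e: e.start)


if __name__ == "__main__":
    for entry in timeline(SAMPLE, 1000, 1400):
        print(f"{entry.start}-{entry.end}  {entry.title} ({entry.place}): {entry.description}")
```

In a real setup the sample list would be replaced by rows exported from a database query or a spreadsheet; the filtering and ordering step stays the same.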
And crowdsourcing: we have a tool called CrowdCrafting, which allows you to easily set up applications for institutions to work together with the community. Here, this is a transcription application. This is something that we are talking about a lot with different cultural institutions, and it is really interesting to see that there is a certain need to get this implemented. Basically the idea is that I, with my fairly limited programming skills, could go to an institution and say: we can set this up for you in half a day and get this going, and your volunteers, of which institutions very often have a lot, can help you improve your data. That is very easy. Finally, curation: this is
also one of ours, the Public Domain Review: picking the most beautiful stuff out of the public domain, highlighting it, showing it, and asking people to write an article about it. When it comes to public outreach, this is by far our most popular project; it gets, I think, more than 100 thousand hits a month. It just shows what kind of value there still is in the public domain, but it needs to be curated to some extent: if you just give people a huge database and they cannot make sense of it, they lose interest. So, to
conclude: the 21st-century GLAM remains, as it has always been, the key preserver of our shared cultural heritage, a source of information and expertise about its collection, and it curates, contextualizes and tells stories about the collection. That was there, it will be there, and institutions are still very relevant in the 21st century, even when all their stuff can be found online. But at the same time they can make use of an audience: they can reach out to an audience far beyond their walls [unclear], they can connect to other collections that contextualize the stories about their objects, and they gain a closer connection to their audience and the improvements to digital collections that come with that. At the OpenGLAM initiative we try to help institutions reach these goals. We are relatively small, so what we do is work a lot with different networks in other countries. I am doing this in the Netherlands, together with a colleague in the UK, and Daniel here does [unclear] in Germany. In other countries there are similar initiatives, for example in Finland, in France and in Spain, and [unclear] in the US. What we try to do is connect the different people who work on a similar thing, basically getting this material accessible. We work together, we share experiences, we help each other out and share documentation, because in the end it is far more useful when somebody in Italy actually goes to an Italian institution and talks with them about what the issues are and how they can be overcome, than when it is just me sending an email saying, "I noticed that your data is not open, can you do something about it?" So, thanks. [Second speaker] One more thing: if you are interested in the topic of open data and culture in Germany, I think it is at the end of November, we are organizing, with a couple of partners, Wikimedia, [the] Jewish Museum and
[unclear], a conference: two days of talks about open data and culture, well, not only open data but [unclear].