Multimedia as data source - 11 September 2014

Multimedia is used and generated as a part of research across all domains. Multimedia can function as both an information source and a data source. The management of multimedia, as research material, needs to be shaped by the intent for use and the research methods being applied to the material. For example, it is possible to read or mine a text or an image and be informed by both processes of investigation. So the critical questions for data managers are: what does the researcher need and want to do with this type of research material? How are support services and systems designed and provided to enable that?
Today I'm going to talk a bit about multimedia as
a data source, particularly about data management in support of research where multimedia is an artifact and also is the research material. I guess I need to confess right up front that I'm not a new media expert; I'm what could be
referred to as a digital librarian. So I come at this with a very different kind of understanding, and potentially a more exploratory approach, and I'm very interested in any questions and thoughts that those listening in have. What I wanted to start off today
with was to have a bit of a think about what multimedia is, because I have to say that this is what went through my head when I was thinking about offering this up. I also wish to acknowledge colleagues over at Titus Kahn University who triggered the idea for this, because I am interested in looking at the different types of data, particularly the sort of data that is generated in support of performance studies, and that is where this idea came from. But I really wanted to look at how multimedia was defined, to get a bit of an idea about the way that it's referred to by different groups of people and also what it is from a technical point of view. So you can see there on this slide I've given a mixture of definitions of what multimedia is. The one that really holds my attention, probably the most, is that it's about different types of data actually contained in the file, and there are different ways of referring to those components. I've picked some of those out in the third point; I was quite intrigued to discover that there are chunks and engines and parts, and that's because I don't have any media background. I come at this as someone who is looking at the material nature, in some ways, of multimedia. I
wanted to have a think about where multimedia appears in the research environment, and here I've listed films as an example of multimedia, and web pages, where you can have a mixture of moving image and sound and possibly some graphics, and digitized documents, which can have both image and annotations, mark-up and transcripts, and also satellite images, which are something quite new to me. I had the great benefit of a lecture within the Australian National Data Service recently from Stuart Minchin from Geoscience Australia, and it opened up my eyes to what a satellite image is and all the layers that actually exist in this one image. That was really interesting. He gave a great talk on the data cube that they've developed, but that's a whole other topic. I just thought I'd pause at this point to see if any of those in the group have a background in multimedia or have some questions about the definitions of multimedia at this point. So I guess
looking into where multimedia appears in the research domains got me thinking about the fact that this topic had come up through those working with people who undertake performance studies. So I've listed some of the research domains there where multimedia is generated, to help me kind of unpick who is actually using or creating multimedia, where that's taking place, and why they're using multimedia, to get a better understanding of how it's created and also perhaps used. So I guess I'm looking at this from a
cultural production perspective, and my sense is to look at the methods that researchers use to create or process information. I found that a very iterative process, bouncing across from the idea of something as information to something as data, and having a bit of trouble identifying what was what. I keep circling back to asking myself: what is the researcher doing? I think it really helps to look at research methods to understand when a digital object is being treated as a piece of information and when it's being treated as a source of data. I'm sure there could be quite an exhaustive conversation about this, but I really thought it helped to understand the purpose to which this material was being put and how it was being used, whether there was a human looking at it or whether it was a computer looking at the multimedia, and whether that was a useful distinction or not. I can't say that I have all the answers, and if anyone out there has some of the answers that would be great. Oh, there's lots of text coming through, that's really good. [unclear]
Yes, so that's the multimedia you're working with or creating [unclear]. Anyway, where I got to with looking at multimedia was trying to understand what happens to a digital object, which is what I call a piece of digital material. So I tackled annotation as an area of research, I guess as interpretation, or as a process in which information is applied or data is applied to data, and I became very tangled. So I picked out four areas where I could see the word annotation being used. The first was genetics, which was really interesting. I'm quite fascinated by the idea of automated annotation and how that actually operates in genetics. It's more out of curiosity than from ever having studied genetics, but it was really interesting to understand the capacity for generating very large amounts of automated annotation. Then I moved into geoscience, to look at what happens when people annotate geospatial information and at the terminology that's actually used, to understand what's actually happening: whether data is being applied to data, or information is being applied to information, or data is being applied to information. I really don't have the answers to this, but I wanted to unpick what was actually going on to get a better understanding of how multimedia was being used in enabling research. So I moved on to linguistics, where again it was quite fascinating to discover the different types of annotations that are applied to languages. I've listed them there: descriptive, analytic, time sequence and text. It was really interesting for me to take apart, say, a descriptive annotation from a time sequence annotation and try to understand what's data and what's information. It helped to clearly distinguish different annotations, and it did help me out of that data and information kind of dilemma: what's data and what's
information. I realized that the material was becoming, or seemed to be becoming, increasingly multimedia in nature, if it hadn't started off that way in the first place. The last area I looked at was biomedicine and images, following this line, where researchers annotate images, and they do that in different ways, by drawing and adding notes and marking areas. I feel this is really fascinating, and it made the idea of simplifying the management of multimedia into what is data and what is information kind of meaningless in a way, because it might be a theoretical concept rather than something which actually helps the researcher to do their research. So I'm going to put some
examples in front of us here. This is a biomedical slide and it's been annotated; you can see it's been annotated with a line shape and also with some words, and that's a scanned image [unclear]. OK, the next image I've
got is this wonderful gene annotation image. I looked at it and really couldn't make head or tail of it, but it made me pretty interested in understanding how geneticists actually manage the data and what information they derive from the data. It's really, really complex. The other thing, as I've mentioned, is the fact that machines actually generate these annotations, which made me want to understand where those annotations are actually put and how they link to the gene sequence. I think that's a whole investigation unto itself. The
next image I've got, which is slightly more familiar for many people, is a Google Earth image. It's a satellite image which has got street markings and bubble pop-ups and some line tracing, and it made me want to understand a little more about how that information was being captured, to know how to support researchers who want to manage the data effectively and be able to make it available, to cite it or to present it as part of their research. This is the last one,
which I hope those of you who have ever been to Portland enjoy. I found this on Flickr. It's a graphic image in the background, and on top of that it looks like there are [unclear] very carefully placed in alignment with what's called a spectrogram, which is of someone saying "it rains a lot in Portland". This is really interesting. I wanted to understand a whole lot more about how these discrete pieces of data were actually brought together, whether you capture that as one thing or whether you capture that separately, and whether the researcher uses the separate components as part of that multimedia. It made me understand that where the annotations or the combinations occur may be very critical in supporting some of the research findings. The first question has come up: someone's interested to know whether anyone in the audience has worked with mag tape archives, or whether you have worked with a collection, or if anyone could suggest some points of contact. Yes, [unclear]. Maybe they could add some commentary, so we can understand a little more about what prompted that question in relation to multimedia, if it's an opportunity to sort of unpack that a bit more. There's another question here: is it accurate to say that multimedia is not the raw or primary data, but a visualization of it? I don't know, I really don't have the answer to that, and that's why I've danced around what's data and what's information and how that fits into a discussion of multimedia, because it made me want to understand less about the defining of them and more about what was important to the researchers to enable them to do their research. What is the information, what is it that's going to enable them to do their research, and does data become information in the context of that research? [unclear] Someone would like to know if
you have any thoughts about how multimedia might, or will, be used in forensic linguistics. Well, I guess if you're capturing sound files, and I know that this has potentially been in the news of late with analyzing the voice of someone who's been involved in [unclear] overseas, I guess what's underlying that is actually looking at how a sound file is created and what you can pick out from the different sounds that are captured. I really don't know how sound files work, and I'd love it if there is someone who's got a bit more expertise in sound to contribute. But I imagine this is where those layers, and being able to pick out different spectrums and particularly changes in modulation, are really important, to see if there's a signature associated with people's voices. But for that we need a linguist. [unclear] Going back to the earlier question about who you might be working with [unclear], one of the audience members suggested [unclear], and another audience member just happens to be trying to recover NASA magnetic tape archives to mine [unclear] spectrogram and audio for radio astronomy. Well, what an interesting project. What I'm fascinated by in looking into multimedia is also discovering language that I've never used before, for instance a word like spectrogram. It's still something that makes me think of the doctor's, but I'm not sure whether that's an appropriate association. I thought what I would do is
introduce a particular project that's happening here in Australia that, for me, kind of emphasizes this idea of what's information and what's data and what the researchers are looking at. It's a project based at Griffith, but I think with people dotted around Australia, led by Mark Finnane, called the Prosecution Project. It's at the Centre
of Excellence in Policing and Security, and they're looking at criminal trials over time. They've been digitizing archival materials and transcribing them, and you'll see
there on the slide a spliced image that Mark supplied, to give you a view of the digitized image, which to me is information, but also, on the lower part of the image, of where the data entry occurs for transcription. What I found interesting in the exchange with Mark about this project (I met him [unclear] recently) is that they're really looking at making the absolute most of this digitized material: looking at it from an informational point of view, being able to read the records of these criminal cases here in Australia, and also looking at what the data underlying that information can tell them. It's been a pretty interesting process to get to grips with what it is that they're doing, and I hope that this offers some insight into what's perhaps important: to understand what the researcher is trying to do, and that they are interested in using more than one method, and more than one feature of multimedia, to enable them to do their research.
So, as Mark has emphasized here, they're looking at a mixture of research methods, both quantitative and qualitative, [unclear] analytical, and I will put the link into the slides so that others can have a chance to go and have a read about it. One aspect of it was something that was a little more familiar to me; the quantitative aspect of it was something quite different, and it made me realize that looking at mixed research methods was also a way of understanding how multimedia is operating as both an information source and a data source. [unclear] I guess enlightening is the word to use. They're getting that done through transcription, human transcription; in other cases of digitization it can be character recognition. So this is where
I kind of got to with multimedia as a data source. I got to a point where I decided that it could be both information and data at the same time, because what matters is the way that the researchers are using it in building whatever they are learning from that multimedia, whether it is being looked at as a piece of information or as a data source to do their research. Reading the cases, or reading the court records, and also doing text analysis or data mining is enabling this research group to do their research, which I think is pretty incredible potential from one source of digitized material, and I think that's quite an exciting prospect. So
from a management point of view, it made me think about how they were going to approach managing that, and Mark has been kind enough to give me a description of how the back end to the Prosecution Project is going to work. They've got archival materials as digital images, and they're going to transcribe those images into an SQL database, which supports doing quantitative analysis of longitudinal and comparative patterns. This is from an email that he sent me last week. They're looking to extend that database by accessing and linking other data sources [unclear], and possibly other projects or other digitized material [unclear], to enable qualitative (what he's referring to as case-level) as well as quantitative analysis, and then they're looking also to [unclear] transcribing the trial transcripts and other text archives. So I guess what I understood from this was that my notions of splitting something into information and data were helping me to understand what it was that enabled this research group to do the research, but also that I'd need to dig even deeper into what sits underneath this application, to understand how they're storing the digitized images, how they're storing the transcriptions, and where they're wanting to store the linkages between those two things. I realized that multimedia in this context is very complex, and all that language that I introduced at the beginning about layers and components is important, I think, to inform how we need to support the management of this material. So that's where I got to with the
Prosecution Project. So I'll just stop there, before I get onto the three applications at the end, and ask
if there are any questions. [unclear] sources of people working on [unclear], including the Smithsonian and the National [unclear]. And last but not
least, I decided to have a look at three applications that enable a person to manipulate multimedia. I've picked three that seem to be reasonably familiar to many, and I just wanted to have a look at how they enable material to be brought in, how they enable information or data to be applied, and what happens in these three applications. These are, I guess, reasonably ubiquitous applications: Final Cut Pro, ArcGIS and WordPress. They're certainly not very domain-specific; I guess they are less so than the applications that you might find in biomedicine or [unclear] more specifically. So I had a look at Final Cut Pro, just to try to understand some of the language that's used and what's actually happening when you use Final Cut Pro. If we've got any confident users in the group today, it would be great if you could offer some advice. I just wanted to look at what Final Cut Pro does to digital material, and from what I can understand, it technically consists of separate files: there is something called a project file, a media source file, and render or cache files. To me, that gave an understanding that the multimedia was being captured in different ways, potentially for different purposes. I have to confess I have never used Final Cut Pro, and I think this is an interesting way for us to understand how multimedia is brought into an application and where it is saved, but also to try to understand what happens when you want to get that material out of the application and how you store it: whether you store it as a combined object or as separate part objects. The Open Archival Information System (OAIS) model, which is used in the digital archiving world, has been interpreted in different ways. To give you an example, a long time ago when I was working on the National Digital Heritage Archive in New Zealand, we decided to be very clear that we would capture metadata separately from the
material that we were hoping to keep, and I think it was the Dutch National Library that decided to go a different way: they decided to build the digital objects with both the metadata and the object that was being collected. To me, those are two very simple ways of approaching capturing multimedia: to separate concerns, if you like, into different types of digital information, or to actually build them into a bundle. It made me realize that if I was trying to get material out of Final Cut Pro, I would want to understand how this material could be linked back together again, in case I ever wanted to work on that multimedia material again. [unclear] I did look at how the output of Final Cut Pro is captured and how the components are captured, and I don't have the answer to that. ArcGIS: this is another tool that I haven't used, and I wanted to have a look at how material was brought into that application and what happened to it when it was being used. What I can understand from this is that it's possible to pull in images, geospatial images, and it's possible to pull in geospatial data, if you like, such as longitude and latitude, that kind of data, and to build up layers within this application. Again, it made me think that being able to maintain those components separately, but also to maintain the final output, which may be a combination of those components, could be critical to a researcher, and it might not. But when you're dealing with different parts of material, how is that used by the researcher? Looking at a map from a human point of view, we can read it; is that an important aspect in the research, or is it the annotations on the maps that are more important? I can't answer that, but I guess in terms of being able to support researchers who use or create multimedia, it's important to ask what it is that they want to do with it and whether they want to deconstruct or reconstruct from those
original components. The last one is WordPress, and this is one that I expect many more people have had experience with. I've always wondered how people get the content out of WordPress, so I had a look to get an understanding of what happens if you have a website up using the WordPress application and you want to suck all the content out, so you can capture it and perhaps put it into a different application. It may be very important to keep discrete the narrative, that's the posts or pages or comments, separate from the categories and tags. [unclear] I think you can get that out as separate pieces of data, but it made me wonder how a researcher might actually use that material: whether they would just want to reimport it into another application, or whether they actually want to look at those tags or categories, to see how much content has been given categories or tags. There are a couple of comments and questions. Firstly, Owen says [unclear] are using a system [unclear] that's been built [unclear], and they use an extended [unclear], and [unclear] can add digital file objects such as video, audio and images [unclear] with metadata, and also a [unclear] record to which files can be linked. The metadata itself can be embedded in the objects or exported to XML, and they are currently working on being able to publish objects and records into WordPress, with export of the metadata. That's really interesting, because it sounds like that kind of understanding of the components, if I understand correctly, has really informed the way that the application has been designed, so that you can maintain digital material discretely, irrespective of whether it's straight data or multimedia (I'm not using that distinction very seriously). It sounds like it's possible to pull everything apart and put it back together again. So I wonder if it's a feature of multimedia
that's useful for us as a research data management community to be aware of, and to realize perhaps that being able to pull it apart, and to understand when you pull it apart how it was constructed so you can put it together again, might be quite important, depending on what it is that you want to do with that material. So that's really interesting, and so is the fact that the content is being made available to go into WordPress. There's a fascinating question here [unclear] co-occurring words [unclear]; I really don't [unclear]. I think I am intrigued by Owen's discussion of going from a data capture application to what can be
construed as a publication application, where the material becomes a means to communicate the research, but the data capture application is where all those pieces of information and data actually get brought together. So, [unclear] on web archiving: a long, long time ago I was a web archivist at the National Library of New Zealand, so this has been something that's interested me for quite some time, and I have been really thinking hard, I guess, about how to assist with enabling material to be published to the web in a way that you can have a pipeline of content, whether it's data or information, going into an application on the web, and then to harvest that website as a whole and also to suck that material back out again as discrete components. While I was at the National Library of New Zealand we developed something called the Web Curator Tool, which was used to do that harvesting from the web. [unclear] the way that it looks on the web. The example that I have in mind here is about what kind of result you get from that. The website that I was particularly interested in was the Encyclopedia of New Zealand that's been put online [unclear]. The front end to that is really important, from a collecting point of view, to capture, because it shows you how the interface is designed and how the information is presented, but the back end to that is a content management system, and before that is a records management system, if that's still correct, at the Ministry for Culture and Heritage. So capturing websites can be done: you could use the Web Curator Tool, which has an engine underlying it called Heritrix, which is also used by the Internet Archive. But I'd like to see that as a workflow to support researchers: potentially capturing information using a data capture device like Owen's, or potentially one that I worked on a while back, called
ExSite9, for some linguists here in Australia, then being able to put that into a content management system, and then using the content management system to publish to the web. I think if there are multiple purposes to which this multimedia material can be put, then at each step through that it's important to understand what needs to be brought together and what needs to be able to be pulled apart, and also what it is that you want to keep at the interface. I really hope we see that eventually, that it's possible to actually see that life cycle and ensure that that material is retained in different ways. [unclear] A couple more questions. Firstly, Owen's also saying that extremely large file sizes would be another consideration [unclear] imaging [unclear] multimedia [unclear]. Whoever made that comment, you might be reading my mind, as tomorrow I'm giving a talk at a workshop at [unclear] on digitization, data management and large images, and I've been digging around to learn about a file format called BigTIFF. It sounds like an enormous [unclear], but it's a kind of TIFF file. I think that's a really important point to make. One of the things I learnt by looking into large image formats is that there are areas of research that use these, like the biomedicine example given before, where they have these microscopic images that are just enormous, and they need to retain the image and also the annotations on the image, and they may wish to align those annotations with the image because they're looking at particular shapes, the morphology of, say, cells perhaps. It made me realize that the way that I understood multimedia from a library perspective was really limited, and that the way a cancer researcher might look at large microscope images as multimedia, and the way that they work with that material, is quite different, and that there are real constraints:
certain applications handle only certain file sizes, so the images need to be converted and compressed, and it's also sometimes necessary to cut them up and tile them so that you can actually move those large images across a network. That really made me think quite hard about how you support researchers to manage their data, especially if you're pulling apart a large scanned image. Yes, I think that side of research is very new to me, and it's certainly challenging the way that I've understood multimedia, given, I guess, my cultural heritage background. A couple of questions have come through, and the questions are related to each other. Firstly: we want to archive and preserve a WordPress website, 23 Research Things, that has just been running over the past 30 weeks, so this person would appreciate any insights you have into the extraction and preservation [unclear] exporter relating to this. On the same track, they are also asking: should we be considering exploring export options, or similar, for the many applications that our research services are offering to researchers, for example [unclear], and, if yes, how would this be best done? First and foremost, I think that there is an export function in WordPress. I haven't had a go at it, but I think it's important to test, and I'll be very happy to have an exchange about that, because I don't know what happens when you hit the Export button, what sort of package you get, and how discrete those components are. I think that's quite critical: to understand how well you can prise things apart and know about their relationships, even though you are prising them apart, if you want to dump them out of an application. I do think it's important to think about the import and export and what happens within the applications with multimedia, because there may be other
data types, and multimedia may be a bit of a furphy, because if it's coming in from different sources and then combined to create an output, from a data management point of view you want to understand the provenance of all those components, irrespective of whether it's data or information, and potentially you want to be able to keep those apart, particularly if you've added value to something which already existed that you don't own. So, for example, you have access to some data and you create your own annotations: well, those are your annotations, and you may share those annotations with the party that allowed you to use the initial data, but keeping those things discrete might be just as important as being able to link them together. So I do think we need to kind of explore how material
is brought in and processed, and how we might support that coming out the other end, so that we can enable researchers to make their own data available, if that's appropriate, but also to maintain the data that they may wish to add to as they go through and undertake different types of research. I don't know how this could all be done, because it's so diverse. It made me realize that perhaps putting on the hat of not understanding, and trying to find out what was going on, what needed to come in, what was happening, and what needed to come out of that research process through an application with digital material, was the starting point, and then in each case I think it would have to be explored to know what to do and how much effort to apply. The WordPress example was where I started: this was to understand how I would get that material out and how I would want to keep it, and that may or may not be important to a researcher. Our colleagues [unclear] sparked that, because it sort of set me off on a bit of an exploratory process. I really hope that some of the people participating today undertake a bit more of that, to extend our collective understanding, because, well, I have to say I found it quite intimidating. I realized that there was so much to know that it was quite overwhelming. The question is about 3D rendering files. A long time ago I had to write some information on annotation, which is why I picked it for this presentation, to try and understand how you would locate an annotation in a three-dimensional object. It was a research project where I very incidentally worked with some of the researchers, in Sydney, looking at the likes of games or virtual platforms like Second Life, to see how you apply an annotation in that kind of 3D environment. It made me realize that I needed to understand a whole lot more about positioning in three dimensions, x, y and z, and where you would locate a piece of
information in there, and with what sort of tools. I don't even know the kinds of tools that generate three-dimensional objects, probably CAD tools of the kind used in architectural or industrial design, but I certainly am not familiar with those types of tools. I am curious, though, whether data that deals with space ends up as 3D files too. That is again a whole other area of multimedia where I think having some time with those who work with that kind of material, or having a background in, say, three-dimensional design, would be incredibly useful, because I am very much looking at that from the point of view of an outsider. I really don't know, and I certainly can't comment on rendering 3D files, except perhaps that, depending on what's in there, they could be pretty large.
[Question from the audience, partly unclear, about the speaker's work in New Zealand:] How did you separate the metadata from the object? Where does the object end and the metadata begin, and how did you decide what was essential to the object? I'm trying to understand how the metadata was fragmented out from the object.
I think the answer to that is that we had a collection management system which had different modules to it. It had a module which collected descriptive information, so that was where metadata was captured. There was also a module that allowed digital objects to be loaded into it, which was a kind of online catalog, but underneath that was what we called the object management system, where the objects themselves were loaded in and linked to the metadata that was in the collection management system. So we had two separate systems, and that to some degree dictated the way that we managed that material: we managed the metadata separately from the actual digital object itself. In other systems that's not the case, and they manage that material within the same application. As to where the object ended and the metadata began: when I
started exploring again and, in my case, reacquainting myself with TIFF as a file format, I really woke up to the fact that a TIFF file has metadata in it as well as data, and I really didn't understand enough about file formats to begin to understand how I would describe it. Once I started peeling back the layers of a TIFF file, I realized that there was a whole lot more information in there. And when I looked at Landsat images, I discovered that there were layers within those file formats. Within netCDF, if that's right, there are three layers: the data access layer, the coordinate system layer and the scientific data type layer. I am reading that from a piece of paper, but I realized that these were treated quite separately in the structuring of a file format. Depending on the way data and information is captured, I think you could move that line between the object and the metadata that describes it and supports its retention.
As to the last question, how did you decide what was essential to the object: when you're in the collecting business you have to make decisions about what it is that you think is important to keep, because the whole point of keeping material as a collection is to enable others to potentially use or view it in some way, and ideally you want to capture the essence of the object. So, on my comments about web archiving: when we were looking at what we could capture, someone I worked with, who is now up here in Australia [name unclear],
said to me, "You're just cutting holes in the Internet. That's all you're doing, cutting holes. It's not the network." And I think that's so: what we could represent was only a piece of the Internet at a point in time, and I found that pretty interesting. But also, some websites were really difficult to harvest using the kind of tools we had. We used a tool called HTTrack [unclear], and you got variable results depending on how you used the settings on those tools. Flash was notoriously difficult to capture. So in some instances other collecting institutions around the world have filmed footage of a website in order to capture it, and in other instances they have gone to the back end to capture what's in it and taken screenshots to reflect what the interface was like at the front. So I think in each case it's important to understand what it is that's of value to the researcher in a research context, and also in a collecting context, because if you go to the bother of capturing information, effectively it is with a view to making that accessible and potentially reusable again.
That's a really
really good bundle of questions in there. Cracking.
[unclear] and will be made available online for those who have been unable to attend. It's been a real pleasure, and I really hope we get to have a bit more discussion and input, because there's plenty to learn out there. We need the collective brain to work differently.