Andrew Treloar, ANDS at DataCite summer meeting 2012

Formal Metadata

Title Andrew Treloar, ANDS at DataCite summer meeting 2012
Subtitle Seeking Serendipity: repurposing DataCite metadata to augment ANDS discovery
Title of Series DataCite summer meeting 2012
Part Number 4
Number of Parts 10
Author Treloar, Andrew
License CC Attribution 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
DOI 10.5446/6571
Publisher DataCite
Release Date 2012
Language English
Producer DataCite
Production Year 2012

Content Metadata

Subject Area Computer Science
Transcript
I think I spend more time here than in Australia, but that's not really true. Today I'm going to talk to you about how we are using DataCite metadata to augment our existing discovery systems. I feel a little embarrassed talking after the previous speaker, because that was a really amazing talk about a whole range of fantastic things, and ours is more mundane and, to me at least, not quite as exciting, but hopefully still of interest to you.
First, some background, because even though I apparently spend all my life in Europe, some of you may not have heard me talk about the Australian National Data Service (ANDS). We are an initiative, unsurprisingly, of the Australian government. We got two tranches of funding: one from a program focused on collaborative research infrastructure, and then a second tranche from something called Super Science (I'm not entirely sure why it was called that). We are a collaboration between Monash University, the university that I work at, the Australian National University, and a federally funded R&D organization called the Commonwealth Scientific and Industrial Research Organisation; I can't think of a good equivalent in Europe. We have about 50 staff.
We have a couple of things that we say we're interested in. We're interested in more researchers reusing more data more often, so a big focus on data reuse, which is a topic at this conference. Secondly, there is this idea of data as a first-class object, resonating very nicely with the previous talks. So it's not just about publications; it's about data being as important as, and in some cases more important than, publications.
More recently we've started talking about what we're trying to do in ANDS as enabling four transformations. On the slide, that column is headed "data" and that column is headed "structured collections" [unclear], but essentially we want to move:
from things that are unmanaged (by unmanaged I mean stuff that's in people's pockets on USB drives, on hard drives, on their laptops, whatever) to structured collections of data that are now much better managed for the long term;
from data that is disconnected from the context in which it was created to data that is now deliberately connected to its context, much like those graphs that we saw in the previous talk, making connections between the data and the publication and the research project and the researcher and the institution and the instrument and services, this rich mesh of connections;
from data that is largely invisible, because it's in someone's pocket, to data that is now much more findable;
and lastly [unclear] from data that is single-use to data that is now much more reusable.
Those build on each other, if you like: you need to get data managed, then you need to get it connected; once it's connected it's much more findable; once you've found the thing you can reuse it. So our overall goal is that Australian researchers can work better with data, but of course we're not just about Australian researchers; we're making the data available to the whole world.
As part of this we're building this thing that we call the Australian Research Data Commons. The metaphor is a common area that people can come to, which brings together the data and descriptions and the relationships between them, and the infrastructure that lets people use the things in the commons. The reason I'm telling you this is that what I want to focus on for the rest of my talk is what we call the window into the commons. The window into the commons is our discovery system.
When we were trying to build a discovery environment for the Australian Research Data Commons we had some deliberate things in mind. The first was that we were not trying to replace discipline-specific discovery environments. If you're a marine researcher and you want just marine data, there are places that you can go to for that, and you probably know about those already because you work in that space. So we weren't trying to replace those; we were trying to complement them with discovery across disciplines, or discovery of data by people who are not within that discipline.
The second thing that we were trying to do with our discovery system is that we didn't want to say to people, "You have to come to us to find data." So we deliberately went for a model where we go to where the users are. That is, we make our data descriptions discoverable through the search environments that people use, which primarily at the moment means making sure that people can find our stuff in Google and Bing and Yahoo and DuckDuckGo or whatever else they're using. So we make the data easily accessible for people, and they don't have to change their information-seeking behavior.
The third thing we were trying to do was provide context around the data. So it's not just the data on its own; it's the data linked to the publications, to the institutions, to the researchers, to the research projects. We did that for two reasons. One was to provide more context to help with discovery. What I mean by that is: maybe you met someone at a conference, maybe you followed a talk on Twitter at the time, so you can remember someone's name, or you can remember the research project, but that's all you can remember. We want to make it easy for you to find, first of all, that person or project, and then follow the links to the organization they work for, the projects they work on, the data they produced. But that context is also useful for assisting with assessment: once you've found some data you need some way of assessing its value, and making links to funding bodies or to institutions is a way of helping you say, "Well, that was funded by the National Institutes of Health, so it's probably good", or, "That's associated with that research group, so I don't want anything to do with it." So the context is helpful.
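The "remember a name, then follow the links" idea in the talk can be pictured as a short walk over typed links between records. The sketch below is purely illustrative: the record identifiers, the relation names and the `follow_links` and `reachable_collections` helpers are invented for this example and are not the ANDS data model or code.

```python
from collections import deque

# Hypothetical records and links; identifiers and relation names are invented.
LINKS = {
    "party:jane-smith": [("isMemberOf", "party:reef-lab"),
                         ("isParticipantIn", "activity:reef-survey")],
    "party:reef-lab": [("isManagerOf", "collection:coral-cores")],
    "activity:reef-survey": [("hasOutput", "collection:sediment-traps")],
    "collection:coral-cores": [],
    "collection:sediment-traps": [],
}


def follow_links(start, links, max_hops=2):
    """Breadth-first walk: map each record within max_hops of start to its hop count."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if seen[current] == max_hops:
            continue  # do not expand beyond the hop limit
        for _relation, target in links.get(current, []):
            if target not in seen:
                seen[target] = seen[current] + 1
                queue.append(target)
    return seen


def reachable_collections(start, links, max_hops=2):
    """Return the sorted collection records reachable from start."""
    found = follow_links(start, links, max_hops)
    return sorted(r for r in found if r.startswith("collection:"))
```

Starting from the one name a user remembers, `reachable_collections("party:jane-smith", LINKS)` walks through the person's group and project to the data they produced.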
As I said, we don't replace discipline portals; we link to them, and what we've tried to do is provide this window into the commons. What this looks like is this: this is the production version of Research Data Australia. We have about 40 thousand collections and about 5 thousand parties (a party is a person or an organization), we have a number of services associated with data, and we have about 27 thousand research projects. So you can put in a search here and it will search the underlying database of all those things. Each of those 40 thousand collections and 27 thousand activities is also a web page of its own that is being indexed by the web search engines.
So that's the context. Let me now talk about where the DataCite metadata comes in. One of the things that we decided to do fairly early on in the development of our discovery service was to not just support people looking with a specific query, but to support serendipity: to make it easy for people to find things that they hadn't deliberately searched for. I think in the back of our heads when we were doing this we were thinking of something a little bit like the Amazon system: "people who searched for this also searched for that". We didn't decide to do it exactly like that, as it turns out, but that was the kind of idea. So we provide suggested links, and we started small and we are gradually adding functionality.
The first stage was what you might think of as internal suggestions. So if I go here and I search for, I don't know, kangaroos (as an Australian researcher I'm contractually required to do this when I'm overseas), OK, here is a survey of kangaroos from [unclear]. Down the bottom of the page it says "Suggested Links". So in addition to the search that I've done, it's saying, "Here are another 892 data collections with matching subjects." So if I want more stuff on kangaroos, because you know you can't get too many kangaroos, I click on that and, in the fullness of time... it is actually doing something... that was not what I wanted. It is still running, but it's gone invisible. OK, let's see if it comes back.
Yes, this is of course exactly what you want during a demo: "stop script". So here we are: the first block, called "Suggested Links", for this particular query. These are other things that have kangaroo stuff in them. So that was stage one.
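The stage-one behavior described in the talk, suggesting other collections "with matching subjects", can be sketched as a set-overlap lookup. This is a minimal illustration under assumptions: the record identifiers, the subject terms and the `suggest_by_subject` function are hypothetical, and the talk does not say how Research Data Australia actually computes or orders its matches.

```python
def suggest_by_subject(current_id, records):
    """Return ids of other records sharing at least one subject with current_id.

    records maps a record id to a set of subject terms (a hypothetical layout).
    Results are ordered by number of shared subjects, then by id.
    """
    current_subjects = records[current_id]
    scored = []
    for other_id, subjects in records.items():
        if other_id == current_id:
            continue
        shared = len(current_subjects & subjects)
        if shared:
            scored.append((-shared, other_id))  # negative count sorts most-shared first
    return [other_id for _neg_shared, other_id in sorted(scored)]


# Invented sample records for illustration only.
RECORDS = {
    "kangaroo-survey": {"kangaroos", "ecology"},
    "roo-tracking": {"kangaroos"},
    "macropod-census": {"kangaroos", "ecology", "survey"},
    "coral-cores": {"coral"},
}
```

With these sample records, `suggest_by_subject("kangaroo-survey", RECORDS)` lists the record sharing two subjects ahead of the one sharing a single subject, and leaves out the unrelated coral record.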
Stage two. As Jan said, we were involved early on in DataCite, so we thought: now that DataCite has a search API, what could we do with that? What I'm going to show you this morning is not yet in production, but it will be in production by the end of this month, I think. What we're doing is using the title of the record that someone is looking at as a search query against the DataCite metadata. So rather than just doing suggested links inside our own collections, we're broadening that. We search in real time. We start with the best possible match, and we reduce the match percentage if we don't get any results: we start by looking for all the words in the title; if that doesn't work we look for most of the words in the title; if that doesn't work, some of the words in the title. And we preferentially rank dataset results ahead of the other resource types that the DataCite metadata describes.
So, what this looks like. I'm living dangerously now, but I'm not doing it with the browser that's been shown to be problematic. I'm now doing this on alpha code, running in the cloud, so I'm stretching out the degree of difficulty here; it's like walking a tightrope with a blindfold on. If I search for kangaroo... actually, kangaroos are a bad example for this one; there are probably not going to be a lot of good matches in DataCite. Let me look for marine sediment, where I know that this is going to work better. So again I do a search (it is helpfully reminding me that it's a demonstration server) and I get the same kind of results.
So let's say I'm interested in sedimentation stress in this particular kind of coral on the Great Barrier Reef in Australia. I'll click on this one.
OK, and what have we got again? "Suggested Links". It starts off by providing the internal records, and then in real time it painted in the external websites: 2,144 collections. If I do a refresh you'll probably see that: a little delay, internal records, external websites. So that's a search in real time against the DataCite metadata. We haven't pulled the DataCite metadata across; we're using the API to query in real time and bring stuff back. So I can do exactly the same thing as I did before (we might get the same script error, but let's give it a go), but now I'm pulling back the DataCite metadata using the search API. Let's say I'm interested in this one [unclear; a record concerning partial pressure]. So I click on it and it tells me: here is a description, here is a citation, it's a collection (as I said, we preference collections up), and I can now look at it in DataCite, so I bounce off to the DataCite service. More interestingly, I can say, "I don't want to see the DataCite record, I've looked at enough metadata already" (everybody loves metadata, right? But I've now read enough of the metadata) and decide that I want to go and look at the record at its source.
And here (please do not tweet this, or I won't be let back in the country) I can just click directly, bypass DataCite and go straight to the original source.
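The stage-two matching described in the talk (try all the title words, then most, then some, and rank datasets ahead of other resource types) can be sketched as follows. This is a minimal illustration, not the ANDS implementation and not the DataCite search API: it filters an in-memory list instead of issuing live queries, and the threshold values, the stopword list, the `title`/`type` field names and the `suggest` function are all assumptions made for the example.

```python
import re

# Assumed stopword list; the talk does not say which words are ignored, if any.
STOPWORDS = {"a", "an", "and", "in", "of", "on", "the", "to"}


def title_words(title):
    """Lower-case the title and keep its distinct non-stopword words."""
    return {w for w in re.findall(r"[a-z0-9]+", title.lower()) if w not in STOPWORDS}


def match_fraction(query_words, candidate_title):
    """Fraction of the query words that also occur in the candidate title."""
    if not query_words:
        return 0.0
    return len(query_words & title_words(candidate_title)) / len(query_words)


def suggest(title, candidates, thresholds=(1.0, 0.75, 0.5)):
    """Try the strictest threshold first and relax it only when nothing matches.

    The thresholds stand in for "all", "most" and "some" of the title words;
    the actual percentages used by ANDS are not given in the talk.
    """
    words = title_words(title)
    for threshold in thresholds:
        hits = [c for c in candidates if match_fraction(words, c["title"]) >= threshold]
        if hits:
            # Datasets sort first (False < True), then higher match fractions.
            return sorted(
                hits,
                key=lambda c: (c["type"] != "Dataset", -match_fraction(words, c["title"])),
            )
    return []


# Invented sample candidates for illustration only.
CANDIDATES = [
    {"title": "Marine sediment cores", "type": "Text"},
    {"title": "Sediment traps in marine reefs", "type": "Dataset"},
    {"title": "Kangaroo survey", "type": "Dataset"},
]
```

For the title "Marine sediment stress", no candidate contains all three words, so the sketch falls through to the loosest threshold and returns the two sediment records, with the dataset listed ahead of the text resource.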
Now I can relax, because that's the demo done. So that was stage two. Stage three: we're still thinking. We're building this stuff in stages, so what other things could we mine for those additional suggested links? We have some ideas. The National Library of Australia is one possible target; they have a very comprehensive system called Trove which we could search against. Another thing we could search is specialist data directories: there is the Atlas of Living Australia, which is associated with GBIF (in fact GBIF stole the director of the Atlas of Living Australia to run it, but I'm not bitter; we've got over it) [unclear]. We're in conversations with DANS in the Netherlands about using NARCIS, which is a research portal the Dutch have, as a possible target. And I'm sure there are other ideas that will come along as well.
In terms of DataCite, we have a number of possible enhancements that we're now looking at. One: at the moment we're simply taking the DataCite search rankings using the search API. We might want to think about tweaking that so that the DataCite results rank highly the things we care about; we're in discussions with the DataCite developers. It would be nice if you could do this sort of "see also" suggested-links thing at the search level, because at the moment, if I do a search, I only get to see those suggested links when I go into the record. It would be nice if I could see suggested links for the entire set of records rather than have to look at each one. It would be nice if we could use the path that the user followed to get to the page they're on to re-rank terms in the search query; in other words, to say, "I can see they followed this sequence of things to get here, so let's use that to tweak the resulting search." At the moment we're just using the title as the pool of things to search on. It would be nice if we could use our subject tags against the DataCite subject tags to give you much richer results, but that pushes the search API a little bit harder.
We've got some other ideas. It would be nice if you could say, "I'm looking for things with a similar spatial coverage, or things with a similar temporal coverage, or both." It would be nice if we could mine information about the co-authors: "Here is some other stuff written by these people." So again it's that sort of Amazon richness. Or we could search for things that share keywords. There are lots of ways that we could improve the ways in which we help people stumble across serendipitous stuff that they would otherwise miss.
There are issues for the future. One came up in a conversation I had with Jan: technically, when someone clicks on "view resolved record" they bypass DataCite altogether and just jump directly to the underlying thing. We haven't had a detailed conversation about whether we should tweak that [unclear]; he shook his head and said he didn't think that was a problem.
Scale: this is obviously an issue as soon as you start doing federated search. I've been in this game for a while, and scale in federated search has always been a concern. What happens when you have lots of "see also" services? Imagine that in this example, as well as 2,144 collections from DataCite, we have another 10 thousand from somewhere, about 3 thousand from somewhere else, and so on and so on. How is that going to work? What you may not have noticed was that in fact you can use the web page as soon as the page loads; you don't have to wait for those things to paint in. But you could imagine someone sitting there twiddling their thumbs waiting for six or seven of these to fill in, and that's not a good user experience.
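One common way to keep a page responsive when several external sources are queried is to run the lookups in parallel and give them a shared time budget, dropping whatever has not answered in time. The sketch below illustrates that general technique only; the talk does not say how ANDS handles this. The source functions are stand-ins that sleep or fail instead of making network calls, and `gather_suggestions` and the budget value are assumptions for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def gather_suggestions(sources, title, budget_seconds=0.5):
    """Query every source in parallel; keep only answers that arrive within the budget.

    sources maps a source name to a callable taking the title and returning a list.
    Returns (results_by_source, names_of_sources_that_were_too_slow).
    """
    pool = ThreadPoolExecutor(max_workers=max(1, len(sources)))
    futures = {pool.submit(fetch, title): name for name, fetch in sources.items()}
    done, not_done = wait(futures, timeout=budget_seconds)
    results = {}
    for future in done:
        if future.exception() is None:  # a failing source is skipped, not fatal
            results[futures[future]] = future.result()
    # Do not block on stragglers; their late answers are simply ignored.
    pool.shutdown(wait=False)
    return results, sorted(futures[f] for f in not_done)


# Stand-in sources for illustration; a real source would make a network call here.
def fast_source(title):
    return [title + " (related record)"]


def slow_source(title):
    time.sleep(1.5)  # simulates a source that exceeds the budget
    return ["arrived too late"]


def broken_source(title):
    raise RuntimeError("source unavailable")
```

Calling `gather_suggestions` with these three stand-ins and a budget of 0.3 seconds returns the fast source's answer, reports the slow source as skipped, and silently leaves out the failing one. The trade-off is that a slow source contributes nothing to that page view, which may or may not be acceptable depending on how much each source matters.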
And how would the user interface work? [unclear] You can see that that's fine with one external collection, and it's going to be OK with a few, but it's not really going to work with 20 or 30 or a hundred. We're still thinking through how that's going to work in practice. And of course there's the fundamental "who cares?" question: we worry about all of these issues, but does the actual user?
Is it going to be enough to simply say to the user, "Here are some suggested links", and leave it at that? You, the user, probably don't care whether this comes from DataCite or from ANDS internally or from somewhere else in Australia; you just think, "Here's some other stuff I'm interested in." I suspect that's where we're going to go.
Last slide: I have information about me in the middle, and links to the two demos I was using. The first [unclear] is the website for all of ANDS; it links to the production environment that I did the first searches on, Research Data Australia. The second [unclear] is the pre-release beta. I can't guarantee that it will always be up, because we're fiddling with it, but it's the one that lets you play with the "see also" feature, so I figured I should show it to you [unclear]; I won't be telling the developers that I've done this. And to those who have been tweeting: it's a delight to see tweeting at this particular session, and it has been really, really helpful [unclear], so I encourage you to keep doing it. Thank you.