Infrastructure for data on open access: openness, sustainability, reproducibility

Video in TIB AV-Portal: Infrastructure for data on open access: openness, sustainability, reproducibility


Formal Metadata

Title
Infrastructure for data on open access: openness, sustainability, reproducibility
Title of Series
Author
Laakso, Mikael
License
CC Attribution 4.0 International:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Technische Informationsbibliothek (TIB)
Release Date
2019
Language
English

Content Metadata

Subject Area
Abstract
There are still considerable limitations for conducting comprehensive open access monitoring. Over the last decade science policy has been pushing hard for open access, but open data and tools for monitoring the current status, and in particular development over time, are still lacking. Current bibliometric databases used for publication analysis in the context of openness have their biases and limitations in how comprehensively journals across disciplines, countries, and languages are selected for inclusion. Because these databases are commercial, access to them is limited, and datasets created on the basis of such data can rarely be freely redistributed in their most usable form. In order to understand how the landscape is changing over time, it would be important to be able to capture the publisher, journal, and article-level developments in a consistent and reliable way. Key questions that will be discussed: What pieces of the infrastructure are still missing? What questions still go unanswered because of this? Why and how should the information environment for open access be improved?
I know my title sounds kind of scary and specific, but this is mostly about problems I've been having for most of my adult life in getting good data on open access: getting reliable, reproducible, sustainably available, and shareable data on how open access is doing. That's what it's about, expressed in this convoluted and nice academic form. But what I'm going to talk about today isn't just a problem with open access; it's a problem with scholarly publishing in general. We really don't know a lot about what is happening, considering all the types of outputs, journals, articles, and publishing models we have. It's kind of a black box that we can take small peeks at and observe from different, limited angles, but there are many unanswered questions.
But I think open access is the key. It's key in opening up science to everyone, but it's also the key to making information about scholarly publishing more available. So this is a chance to open up the outputs as well as to provide more transparency to the whole system at large: all the metadata and everything that goes with it. And as a caveat here: I don't have the answers to how things should be infrastructurally provided, but through my experience of doing research on open access for over ten years I've found problems, some blind spots, and things that could easily be improved, and those I'm going to present to you today.
So to start with, to get your minds massaged into the right mindset, here are some simple questions that still don't really have good answers. Take certain science policy decisions, for example Plan S: how can we figure out a way to assess the impact of trying to tweak the publishing environment? How can we know what the causal effect of implementing a specific science policy intervention is on the scholarly publishing landscape? It affects everything, and everything is so interconnected. How should we inform ourselves about how science policy steers the landscape if we have bad data on the landscape overall?
And how much of open access growth is really about better indexing, web crawling, and better discovery, and how much is about actual organic growth, with new journals being started and more articles being put up on the web for free? How should we discern between better indexing and just more content? I think that's a question that is still unanswered. We can all see the percentages rising, and the numbers included in the DOAJ, the Directory of Open Access Journals, rising too, but many of those journals have been around for years publishing open access; they just haven't been indexed. So how should we figure out how much open access is actually growing compared to just being better indexed?
Then something that many people have tried to answer, and are still working on, probably for the rest of their lives, is how open access article processing charges are developing over time and how they differ between publisher types or research disciplines. There are many different angles from which you can look at this, but there's really bad longitudinal data for knowing exactly how the pricing levels have evolved and will continue to evolve, since we're so bad at keeping track of it. I know this is provocative for many of you, but I'll show you some evidence for it.
And then, thinking about science and publishing at large: how much is science growing compared to how open access is growing? Is the tide just raising all boats in the water, or is open access outpacing the growth of science overall? We know that more and more is getting published each year, but is open access growing faster than science overall? We can think about that question for a while.
Most of my studies, and most studies on open access, have been comparable to taking a camera, not an actual camera, but going on the web and taking a snapshot of what, from a certain population, is available open access today by some criteria. So taking a snapshot with a camera, so to say: you get a static picture of what's available on the web today of what has been published in the past. This has been the standard methodology for many open access studies.
But say you have an old photograph from 2014. I know I'm becoming very abstract here: we can think of this house as open access percentages, or the landscape of open access journals, in 2014. You have a neat image of how things used to look, and we can see certain features around this house. But if you today attempt to compare how much has changed since 2014, trying to use the same equipment, the same scenarios, the same circumstances, you're still going to end up with something that is really hard to compare. You try to take a picture of the same house and figure out what's changed and why it's changed, but it's still so different that you can't really figure out causal relationships. These pictures just aren't reproducible or comparable, because the whole landscape has changed so much.
So it's a problem of providing reproducible measurements and data on open access. Even if you do your best, put on your academic cap and really try to be strict about it, mark the spot where you take a photograph at five-year intervals, and tell the house owner not to change anything because you want to look at how the environment is changing around the house, you're still going to end up with small differences that you can't really explain. There are small changes in these pictures, these measurements, that still make it hard to figure out what is the effect of science policy, what is the effect of just better indexing, and what is the effect of the actors in the landscape themselves. So it's pretty much this: you're guessing why certain pieces are being added to or missing from the picture between measurements.
OK, this went very abstract pretty fast, but I hope you understand the idea. Rather than showing you open access percentages and graphs, the same idea converts to this photograph analogy: currently we don't really have good data for comparing the longitudinal development of open access.
But how could we improve? Rather than a camera, should we set up some other measurement equipment, either bottom-up, lacing the landscape with sensors and some kind of identifiers to monitor change, or top-down, comparable to satellites, with some surveillance organization keeping track of everything? Or should it be a video camera, so that we have a kind of audit trail and monitor every change every time it happens, so that we can understand science better? Of course all of these analogies are hard to implement directly for bibliometrics and the economics of publishing, but we should think along these lines: rather than having a camera, should we have some other ways of monitoring, of creating data and observations of scholarly publishing? So this is just getting you warmed up to the idea that some things may be broken and could be improved when it comes to the data.
For my talk today I don't have a long table of contents or agenda, but it's split into two. I want to talk a bit about bibliometric data, and then about how economic data intertwines with said bibliometric data. I think the most interesting part is somewhere in between, where it becomes difficult to assign responsibility and real domain expertise, since it falls between accounting, political economy, and the traditional information sciences. It becomes complicated pretty fast when you want to know more about the economic aspect than just the price you've been paying for publications. So it's very challenging, but that's why I really like the slice in between: it's easy to look at things in their own context, purely economic or purely bibliometric, but combining them is where you get to most of the interesting juices of what's actually happening.
But first, the more purely bibliometric perspective.
It's not just me and researchers like me, who make their living researching open access, scholarly publishing, and scholarly communication. I think society at large and the actors within scientific publishing would all benefit from better data and better bibliometrics being available out in the open. It's not just curiosity for curiosity's sake; it would facilitate better and more efficient knowledge about the landscape.
And then there's the problem with the web. Of course you're being monitored wherever you go by cookies and trackers, and commercial companies know your every move and are probably tracking your eyes and where you're looking. But somehow scholarly journals have been pretty secure from this surveillance. They may be the worst-monitored aspect of the web: there is a lot of structured data, but still we don't really know the complete universe of how many journals there are and how many articles there are. We have different guesstimates and different indexes, but we don't really know the full universe. We can look things up in different indexes, but even though everything is now web-based, we still don't have an authoritative index that would include everything, collected from the web. I think that's problematic. We should have a baseline for everything that's out there and filter from there.
There are three key obstacles, as I see it, in the current environment. The first is that commercial dominance, also for bibliometric information, is problematic. Scopus, Web of Science, all our friends at these institutions have good data. It's fairly good; people work at cleaning it up and keeping it curated and recent. But you're limited in what you can do with it: you can't really share it, you can't really build on it, and you're limited to whatever they've selected for you to view, as a pay-per-view option. I compare it here to a pair of binoculars that is coin-operated and directed at exactly the mountain they want you to see. You're limited in what you can do with that data.
Then the other aspect is the longitudinal one, which I see as a kind of amnesia: the database usually forgets. Most sensible people are probably interested in what the active journals are right now, what their APCs are right now, article numbers and so forth, and that's a reasonable wish. But some people would also like to know historical data: what was the APC five years ago, what was the publishing model five years ago? Things that are not just about the snapshot right now but about the trajectory of the journal over the longer term. The problem is that most of these databases are pretty much black boxes when it comes to knowing how a journal ended up there and what it was doing before the day you are looking at the database. It's of course a challenge to solve this, but it's problematic for figuring out how things are moving, since you can't see anything other than the individual snapshots of the database.
And, similar to the binoculars idea, these indexes that are curated and cleaned up and have nice metadata have very selective coverage. They don't attempt to index everything; they index certain things that fulfil their criteria, and that's not good for science overall. It would be better if we had an open bibliometric database that we could consult and then filter from there based on certain criteria.
So my wish is that we would have a more open alternative that would get rid of some of these problems. I know certain things, like this amnesia aspect, are hard to solve, but there still should be some historical records in addition to just the most recent view of the journal. I wouldn't like to be a historian a hundred or two hundred years from now looking at scholarly communication in 2019, because we're so bad at keeping track and recording everything we're doing. It's a problematic aspect.
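One way to picture the difference between a database that "forgets" and one that keeps a trajectory is an append-only record per journal. The following is only a minimal sketch: the class names, field names, placeholder ISSN, and APC figures are all invented for illustration and do not describe any existing index.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class JournalSnapshot:
    """One dated observation of a journal (all field names are hypothetical)."""
    observed_on: date
    publishing_model: str       # e.g. "subscription", "hybrid", "full_oa"
    apc_eur: Optional[float]    # None when no APC is charged or it is unknown


@dataclass
class JournalHistory:
    """Append-only record: observations are added, never overwritten."""
    issn: str
    snapshots: List[JournalSnapshot] = field(default_factory=list)

    def record(self, snapshot: JournalSnapshot) -> None:
        self.snapshots.append(snapshot)
        self.snapshots.sort(key=lambda s: s.observed_on)

    def as_of(self, when: date) -> Optional[JournalSnapshot]:
        """Return the latest observation made on or before `when`, if any."""
        earlier = [s for s in self.snapshots if s.observed_on <= when]
        return earlier[-1] if earlier else None


# Illustrative values only; "0000-0000" is a placeholder, not a real journal.
history = JournalHistory(issn="0000-0000")
history.record(JournalSnapshot(date(2014, 1, 1), "hybrid", 2000.0))
history.record(JournalSnapshot(date(2019, 1, 1), "full_oa", 1500.0))

earlier_state = history.as_of(date(2014, 6, 30))
print(earlier_state.publishing_model, earlier_state.apc_eur)
```

With a structure like this, "what was the APC five years ago" becomes a lookup rather than a question nobody can answer; the hard part, which the sketch does not address, is collecting the dated observations in the first place.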
And then, for reproducibility, there's large variation in what you use as your population when you're looking at open access. You can consult these proprietary databases, but you can of course also include open access information from the DOAJ or the ROAD database that is managed by the ISSN organization. But then it's hard to compare things that have been selected on different criteria. It would be nice to have a more complete universe of publishing, and a more complete definition of what open access is, to really figure out which publications out in the open currently fulfil certain criteria.
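To make the population problem concrete, here is a small sketch of how one might quantify how much two journal lists, keyed by ISSN, actually overlap before comparing results built on them. The function name and the identifiers are placeholders, not data from any real index.

```python
def compare_populations(a: set, b: set) -> dict:
    """Summarise how two journal populations (sets of ISSNs) overlap."""
    union = a | b
    return {
        "only_in_a": len(a - b),
        "only_in_b": len(b - a),
        "in_both": len(a & b),
        # Jaccard similarity: shared journals divided by all journals seen
        "jaccard": len(a & b) / len(union) if union else 1.0,
    }


# Placeholder identifiers standing in for ISSNs drawn from two indexes
index_a = {"J1", "J2", "J3", "J4"}
index_b = {"J3", "J4", "J5"}

summary = compare_populations(index_a, index_b)
print(summary)
```

A low overlap is a warning that two open access percentages computed on these populations are measuring different things, even if the method applied to each is identical.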
So this is where you have to be very creative every time you put together a measurement of open access and collect data on the population, because there are so many different ways you can go about it, good and bad.
And then something that even open access experts, people who have been long in the game, can't agree on is what open access is. Is it that only the most liberal forms of licensing can be considered open access, and everything else is just free for the moment? Or can you also consider content that is available through secondary open access mechanisms providing free access, like green open access, if it's not CC-licensed, for example? And then what about illegal means? Some would say Sci-Hub is an illegal enterprise, and maybe most would agree, but should you consider that open access? What about copies on ResearchGate that maybe shouldn't be there: is that open access?
So it's all very complicated when it comes down even to defining what open access is, once you've decided on a population. Of course flexibility is good, but there should be some common standards and definitions so we can move forward with actually monitoring, because everyone interprets this differently and has a different sieve that they go around filtering stuff with. Studies often mix together journal perspectives with article perspectives, and that makes life hard for understanding how things are actually developing. The two are connected, since journals publish individual articles, but the open access status of an individual article is not necessarily dictated by the journal. You can have self-archived copies, you can have hybrid open access, you can have a lot of different variations between complete full open access and complete subscription access. So you can spend a lifetime trying to figure things out this way, determining article open access status from journal-level information or vice versa, and you would just get a lot of grey hairs trying to get answers or bring clarity into the situation.
So it's all complicated, and I know I'm a bit of a downer now; this is bad news, not good. But we're getting forward: we've identified that things are complicated. We need to think about the individual articles, we need to think about the individual journals, but we shouldn't get confused. We should have clarity on how we move forward, and I think one way of providing clarity is having categorizations and classifications. This is the smallest font you'll be exposed to today, so excuse me, but I think it's useful to present a developed framework for looking at the different ways in which content can be open to various degrees. A couple of years back this Open Access Spectrum was published, with six different categories of how the openness of articles can be assessed and categorized. I think approaches like this are a way forward to tagging, to making it machine-understandable in what way content is provided open access. You can consult this more closely yourselves; I just wish to present the idea that there should be some agreement on how we categorize things if we want to move forward, since things are messy.
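The point about keeping journal and article perspectives apart can be sketched as a record that stores both levels as separate, machine-readable fields. The category names below are illustrative assumptions chosen for this example; they are not the six categories of the spectrum mentioned in the talk, and the DOIs are placeholders.

```python
from dataclasses import dataclass
from enum import Enum


class JournalModel(Enum):
    FULL_OA = "full_oa"
    HYBRID = "hybrid"
    SUBSCRIPTION = "subscription"


class ArticleAccess(Enum):
    PUBLISHER_OA = "publisher_oa"      # open on the journal website
    SELF_ARCHIVED = "self_archived"    # open copy in a repository or similar
    CLOSED = "closed"


@dataclass(frozen=True)
class ArticleRecord:
    doi: str
    journal_model: JournalModel   # journal-level fact
    access: ArticleAccess         # article-level fact, recorded independently


def share_open(records) -> float:
    """Fraction of articles with any open copy, regardless of journal model."""
    records = list(records)
    if not records:
        return 0.0
    open_count = sum(r.access is not ArticleAccess.CLOSED for r in records)
    return open_count / len(records)


# Placeholder DOIs; the combinations show that the two levels can disagree.
records = [
    ArticleRecord("10.0000/example.1", JournalModel.FULL_OA, ArticleAccess.PUBLISHER_OA),
    ArticleRecord("10.0000/example.2", JournalModel.HYBRID, ArticleAccess.PUBLISHER_OA),
    ArticleRecord("10.0000/example.3", JournalModel.SUBSCRIPTION, ArticleAccess.SELF_ARCHIVED),
    ArticleRecord("10.0000/example.4", JournalModel.SUBSCRIPTION, ArticleAccess.CLOSED),
]

open_in_subscription = [
    r for r in records
    if r.journal_model is JournalModel.SUBSCRIPTION
    and r.access is not ArticleAccess.CLOSED
]
print(share_open(records), len(open_in_subscription))
```

Because the two fields are stored separately, a study can state explicitly whether it counts journals or articles, instead of inferring one from the other.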
This is an image from a study I did a couple of years back where I looked at ethics research: 1,600 articles within ethics research, looking at where they are available on the web and at which different web locations I can find them. As in most other open access studies, about half of the articles were available in at least one location as open access, but quite many were available in multiple locations as multiple different versions. So how should you collect data about this in a structured way? Should you only consider journal websites? Should you only consider repositories? Should you prioritise one or the other, or should there be some kind of hierarchy? No one knows. I'm just saying that things are diverse and complicated and challenging to provide infrastructure and data for, since people currently use many different mechanisms to provide open access. So it's hard to really pin down metadata for what is a messy reality.
I mentioned that this talk has some elements of a lecture, and this may be the most lecture-like part of it: educating you on how the methodology of open access studies has developed since the dawn of open access studies. I think the earliest ones came somewhere in the mid-nineties, when most of the methodology and results were anecdotal, so to say: people had come across a list of journals that made content available on the web, and they wrote it up into a brief article. But time has moved on. That gave a limited perspective, and when I jumped into the game around 2008, 2009 we still used manual sampling: we took a small random sample of articles, looked at their open access status, and then extrapolated. Today people are much smarter and computers are much better at crawling and scraping the web, so we have evolved into automated sampling, where we can take hundreds of thousands, if not millions, of articles and check their open access status, or even something that is close to real time, as I'll come to.
One of the most important cornerstones for providing open access data today is of course provided by the ISSN organization. They provide a fundamental cornerstone of journal metadata: the authoritative ISSN number, publisher, and journal title. This is not longitudinal data, but they are improving and trying to keep up with how fast things are changing in the digital realm, and I do think that this is a good foundation to stand on.
Then in the late nineties Crossref was founded and started with the DOI identifier, which is now the de facto way of having persistent identifiers for content and, as we'll see, also a key to real-time open access measurement.
The DOAJ was started in 2003 with a couple of hundred journals, and it has now grown to around thirteen thousand, almost fourteen thousand, journals today. It keeps rising a lot. It's been a key resource for me, and it's still the authoritative repository for a list of open access journals that fulfil common criteria. It's not all journals, and not all open access journals, but it's a very transparent and well cleaned-up list of vetted journals.
Google Scholar has been used for many studies looking bottom-up at what articles are available through mechanisms other than DOAJ journals: where on the web can we find stuff? And then very recently, in 2017, Unpaywall entered the picture, and I would say that's kind of a game changer, since they provide article-level open access categorization based on DOI identifiers. That's now the gold standard for evaluating article-level open access status and for getting open access data on articles.
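The earlier question of whether there should be "some kind of hierarchy" among multiple open copies of the same article can be sketched as a simple priority rule. This is a hypothetical illustration only: the location kinds, their ordering, and the record layout are assumptions made for the example, and it is not a description of how Unpaywall or any other service actually chooses a location.

```python
from typing import Dict, List, Optional

# Hypothetical preference order: earlier entries win when an article has
# several open copies. A different study could justify a different order.
PRIORITY = ["journal_website", "repository", "author_homepage"]


def best_location(locations: List[Dict]) -> Optional[Dict]:
    """Pick one open location for an article according to PRIORITY."""
    candidates = [
        loc for loc in locations
        if loc["is_open"] and loc["kind"] in PRIORITY
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda loc: PRIORITY.index(loc["kind"]))


# One article found at three web locations (illustrative data)
found = [
    {"kind": "author_homepage", "version": "accepted", "is_open": True},
    {"kind": "repository", "version": "accepted", "is_open": True},
    {"kind": "journal_website", "version": "published", "is_open": False},
]

chosen = best_location(found)
print(chosen["kind"] if chosen else "no open copy")
```

Writing the rule down explicitly, whatever order is chosen, is what makes two measurements comparable: the same article with the same copies is always coded the same way.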
The best journal-level analysis and data there is comes from Walt Crawford, who produces an annual e-book and an open data set based on the DOAJ. He has in the past manually collected article numbers from journal websites: entering things by hand, copy-pasting, visiting each website. That doesn't sound like a good time. If that's the gold standard of knowledge about scholarly publishing, that we have to manually collect things that are already on the web, I think we can do better in making Walt's life easier: not having to do this manual annual exercise of aggregating bibliometric information between the covers of a book, but having some kind of system that would do this more automatically and provide APC numbers, article volumes, and a lot of different metadata for journals. It's very valuable research and I really applaud the effort, but I think we should try to automate this and make it more real-time, so that there wouldn't need to be a person acting as the glue between the bibliometrics and knowledge and insight into the phenomenon. But this is a great e-book, and I recommend everyone have a look at it. It focuses on the DOAJ open access journal set.
I already praised Unpaywall once and I'll do it again: I think it's the gold standard for figuring out open access status and data for individual articles. They published a great paper where they look at the data set that they have. Of course this is again a snapshot study.
I don't think much content was open access in the 1950s; the figure just shows what share of the content published at that point in time was available in 2017 or 2016. So it's always problematic to go back in time and see what the status was, for example, in the early nineties, but this is the best we can do without a time machine.
And this is also a paper I recommend to both advanced open access people and beginners to consult, because it's a really interesting look at the different mechanisms and the growth that has been happening within them in recent years.
And one thing I'm really happy about, which I saw on Twitter some months ago, is that Unpaywall mentioned that based on their data they know which journals are one hundred percent open access without being indexed in the DOAJ. So basically they are using article-level data about what is open on the web, based on the DOIs, to classify journals as open access journals and release this as a data set. It seems like they will be able to add a long list, thousands, of newly known open access journals based on bottom-up crawling of the web. I think that's really promising, and I think this is the right approach to try to get a universal listing of open access journals. The DOAJ is good, we know those journals adhere to certain criteria, but having a complete list for anyone and everyone that's interested is also valuable and something I'm really looking forward to.
But then, where things get messy, which is where people create messes (no, I'm joking here), is where journals split, like for example Lingua to Glossa, or the Journal of Informetrics to Quantitative Science Studies. How should this be interpreted in the data? How should we code this? This is a social thing happening, and maybe it will spread, maybe it will become a big thing, and we need to be able to code it somehow to follow it, monitor it, and understand how this phenomenon works. There are many exceptions like this in scholarly publishing that we should be able to capture in essence, to understand how things work: entire editorial boards are leaving journals to form new journals that are based on open access. That's something I think science policy would be interested in monitoring, but it's not really something we're able to capture other than anecdotally, by following headlines or things like that. Things are hidden under
the layer of the bibliometrics. Then another issue I'm interested in is journals disappearing. This is a figure from a study I did a couple of years ago together with my colleagues, where we looked at the open access journals that were active in 2002 to see how many were still active in 2014. Around half of those journals had become inactive, and some of them had even disappeared. Usually these are cleaned out from indexes: the DOAJ, for example, removes journals that are inactive, so you're losing information, and other indexes are also mostly interested in active outlets. So I think it's interesting to also follow what's happening outside of what is active and going well today, to have data that also captures things changing over time. As I said, I don't have the answers for how this would happen, but I think it needs to be captured somehow how things are also, not failing, but shall we say not remaining active, if I can be so diplomatic.
Then another study (this sounds like a self-promotion pitch here) is one I did on a journal within the field of philosophy, where they evaluated whether to go open access or to make compromises. They got different offers from publishers: to be delayed open access, or to be subscription-based but with free access for society members. They had to pay a lot more if they wanted to be open access compared to if they remained subscription-based, so they had to wrestle with feasibility. For a journal that is going well and has a healthy number of subscribers, going open access could be a compromise economically, but also a compromise in the editorial board
doing more volunteer work, like taking more control of the journal, taking more responsibility, and being more independent from a publisher, if they don't want to go the APC route where they get money funneled in. It can be problematic in the humanities to have an APC-driven journal that actually manages to break even. So open access is not the magic solution; you need to cater for it and figure out what kind of model works for a journal.
And this is where some journals have then reverse-flipped. This is also a recent study I participated in, where we looked at open access journals that have gone back, or converted back, to subscription-based. There were around one hundred fifty journals still active today that had at one point been open access. So how should this be coded and noted in any open access data that we have? These journals have been thrown out of the DOAJ. Most of them are in Scopus, but not all of them. How should we understand what this phenomenon of journals going back to subscription-based can tell us about how science policy is doing, or how different disciplines are taking to open access? This is all very tricky. These are marginal things maybe, but I think they are aspects that need to be considered because they inform us about how things are going.
And the best the European Commission can do when it comes to open access monitoring is roughly this: a high-level division into gold open access, green open access, and not open access. I think this is very problematic, because just looking at these kinds of high-level figures for global publishing overall, over multiple years, how much can you really say about what's going on, other than that the percentage is going either up or down, or comparing these snapshots with each other from different years?
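One possible way to code the conversions and reverse flips discussed above is to look at a journal's access model year by year rather than in a single snapshot. This is only a sketch: the labels `oa` and `subscription`, the category names, and the function `classify_trajectory` are hypothetical, and real journal histories would need more categories than these.

```python
def classify_trajectory(yearly_status):
    """Classify a journal from its observed per-year access model.

    yearly_status maps a year to "oa" or "subscription" (hypothetical labels).
    Only observed years are considered, so "always_oa" means open access in
    every observed year, not necessarily since the journal was founded.
    """
    if not yearly_status:
        raise ValueError("at least one observed year is required")
    sequence = [yearly_status[year] for year in sorted(yearly_status)]
    first, last = sequence[0], sequence[-1]
    changed = any(a != b for a, b in zip(sequence, sequence[1:]))
    if not changed:
        return "always_oa" if first == "oa" else "always_subscription"
    if first == "subscription" and last == "oa":
        return "converted_to_oa"
    if last == "subscription" and "oa" in sequence:
        return "reverse_flipped"
    return "other"


example = classify_trajectory({2002: "oa", 2010: "oa", 2018: "subscription"})
```

A journal that is simply dropped from an index when it flips back would never show up in a yearly gold, green, or closed percentage, whereas a per-journal trajectory keeps that history visible.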
I think we need to do better than this, because this level of data is not really good enough to inform science policy, or even to inform the world about how scientific publishing is doing, because a lot is bubbling under the surface and exists in tension with economic interests. So the most viable way, and it's not really the video camera, that's not what I argue for: to get information about how things used to be in the past we would basically need a time machine. We need to be able to go back in time and figure out what was open access in 1993, you know.
Marty, we've got to go back and figure out exactly what happened, because we can't rely on the information we have today, because we've been bad at keeping track. This is something I've been exploring a lot in this research about reverse flips and disappeared journals, but also for figuring out the origins of current open access journals. So looking at journals indexed in Scopus (I know I just bad-mouthed the index, but here I am looking at open access journals indexed in Scopus), looking at which ones were born open access and which ones converted to the model can bring some insight into how open access is developing. We can see growth here; most open access figures show growth, which is I guess a positive thing. But if we look on a year-by-year basis at how many journals started open access publishing, we can see that early on, before the year 2000, there were a lot of journals that had converted to open access already, and in the more recent years (that's all recent from my perspective, ten years back) most of them were born open access journals. So I think there are trends happening in commercial interests coming into the picture for creating new journals, while early on it might have been more, not philanthropic, but what shall we say, journals converting to open access not purely out of business interest but perhaps because digitalization facilitated such a model. I think all of this is interesting because we shouldn't just look at percentages and increasing diagrams; we should try to understand what's really happening over time. That is the key thing I'm barking at here.
Therefore this smaller slice I have here about economics: we know relatively little about the relationship between cost and price. I think most of us would agree that scholarly publishing from the customer's side is expensive, extortionately expensive, and we should strive to make it more cost-efficient and also try to cut more of the pure commercial profits and dividends out of the picture. So we can agree on that, but just how much are we paying too much, and how much can we improve, is the big question. And here is a hard fact that I want everyone to understand: if you are a
stock-exchange-listed company, you're not going to accept zero growth in profits. That is not on your agenda. You want to avoid any scenario where things are standing still, because all stock exchange valuations are based on expectations of future profit and growth. So as long as we have commercial actors involved, they are not going to want to make things cheaper and less expensive; they are going to do the exact opposite. We can't expect them to deliver us free, comprehensive open access data that exposes the weak spots and all the interesting in-betweens, because currently they want to entrench the market with different products and services and own large journal portfolios. It's problematic, this relationship between economics, profits, and bibliometrics, and it is something that would need more digging into.
Many of you might be familiar with this graph, but I think it's a classic and needs to be shown in these circumstances. It shows the evolution, again over time, that's beautiful, of how journals migrated from smaller publishers to bigger publishers in tandem with digitalization. In the early days digital journal publishing was hard. There weren't templates and frameworks and open source solutions and service providers for various things in the workflow. It was pretty problematic to set up a good web-based website in the nineties, so that's why many people
jumped on the bandwagon to the big publishers and facilitated the transition of growing these oligopolies to the size they are today.
And this does not only have to do with bibliometrics and the economics of the publishing landscape. These companies are also capturing markets around publishing: for writing, for analysis, for registering, for evaluating, be it what it may. This is an interesting study of academic or scholarly communications startups, where the author looked at what the time is for exit strategies, when these startups were purchased by an investor, usually a large commercial publisher, and turned into part of their portfolio of services and products. So this is where much of the action is happening. Of course I'm happy for the startups that made money on this, they achieved their goals, but we shouldn't build our open access data on a commercial ground that has a seven-year perspective on exiting the market. We should figure out a way to have a sustainable ground that won't be sold to the highest bidder when times get tough. It should be more sustainable than that.
And one thing which I think is interesting for anyone who's into this stuff (I don't know if anyone is) is how things are moving from hand to hand. It's kind of like a stock exchange, how journals get transferred between publishers. The ISSN has a website called journal transfer dot org, where you can see incoming and outgoing journals from publishers, and you can see how they are traded like any other securities, in a way, that they are bought and sold. I am happy that there is a trace of this, that you can monitor how journals are changing hands and going from independent to the big publishers, or between big publishers, and can figure out things in this way. But I've seen very few studies or any kind of observations of what this really means, because you can look at it in different ways, like who is giving away the journal or selling it and who is receiving it. I think that would inform things as well for science policy or for information science in general: how are things evolving?
Because things are changing hands constantly, figuring out what the trends are could help us know more about open access. How does this influence open access? Are they transferred and flipped to open access? Are they transferred and closed? What happens? I don't have the answers, I just know that there are things happening.
And one thing that would help the economic aspect of this is if we had more transparency, both on a higher level but also on a closer zoom level of the landscape, where we would figure out where the money is coming from, from funders or public bodies, and how it's funneled through various institutions. And even there, part of it is paid as individual invoices and part of it as big deals. Figuring out all the money flows that go into the pockets of publishers and facilitate publishing would, I think, be a healthy exercise. As part of the paper where I took this figure from, they did it for the United Kingdom.
I really applaud the effort, because I've tried to do the same in Finland, and I think I stopped after a week, when I got so buried in the sources of money going from taxpayer funds, how it's funneled through ministries, through institutions, and through too many pockets before it ends up in some bank account in the Bahamas. It's really a challenge to follow the money because it changes hands. It's not the same euro that goes from the public bodies to the publishers; something happens along the way, and you need to be aware of how things are distributed once that happens.
But one thing which is a great foundation, and I think an excellent starting point for getting insight into this, is the OpenAPC initiative, which has been featured here as well in multiple sessions. I think that's the key: in order to get this top-level view of where the money is coming from and where it's going, we at least need to connect the dots between the bibliometrics and economics, so that we know that a publication is associated with a euro or dollar or yuan amount, and can go from there. We need to have the article as the smallest molecule, no, atom I guess is the smallest (I'm getting way out of my depth here), but we need to start with the detailed data in order to zoom out. That's what I am getting at. We can't start from the top. We shouldn't look at the source of the money; we should look at where it's going and work ourselves backwards. I think that's a much more helpful approach, and that's something that the OpenAPC initiative is helping with.
And something I'm happy with is that Finland has also been very active in publishing publisher agreements. This helps with transparency in many ways, in trying to put economic pressure on publishers to slightly slow down the thirst for more money, and also to have some communication on the customer side, because the publishers are global enterprises that have all the information, while customers are usually regional or national, maybe communicating with other countries but rarely international in the sense that they would have international leverage. So I think it's a very good development that there is more international information sharing happening in this regard, even though there's a lot to do to turn this into some kind of data. These are mostly just scanned PDFs or bare-bones tables.
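The idea of starting from article-level cost records and zooming out can be sketched as follows. The records, field names, and amounts are invented for illustration and do not reflect the actual OpenAPC data format; `total_by` is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical article-level payment records; every value here is made up.
payments = [
    {"doi": "10.1234/a1", "institution": "Univ A", "publisher": "Pub X", "euro": 1500.0},
    {"doi": "10.1234/a2", "institution": "Univ A", "publisher": "Pub Y", "euro": 900.0},
    {"doi": "10.1234/a3", "institution": "Univ B", "publisher": "Pub X", "euro": 2000.0},
]


def total_by(payments, key):
    """Sum the paid amounts grouped by one field of the article-level records."""
    totals = defaultdict(float)
    for payment in payments:
        totals[payment[key]] += payment["euro"]
    return dict(totals)


by_publisher = total_by(payments, "publisher")
by_institution = total_by(payments, "institution")
```

Because each amount is attached to a single article, the same records can be summed per publisher or per institution without a separate data collection for each view.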
And then, bridging the gap between cost and price, there is a good preprint recently published where they looked at what it actually costs to provide most of the central steps that go into professional publishing nowadays, trying to assess what the gap is between a three thousand euro APC for hybrid and what the actual cost might be if you do all the work yourself or outsource it. They came to an amount of around four hundred dollars. They have multiple scenarios there that you can have a look at, but this is a great way of trying to figure out what we are talking about: what is the value of the labour and the time put into the different steps that go into producing a journal article from a publisher's perspective.
And there are developments, I've also seen SCOSS coming up in other sessions here, for funding development of what I'm talking about, these open infrastructures that would help us understand more about open access and what's happening. But the problem is that there isn't really free money in large quantities to go around these days, because we are paying through the nose for subscription access to commercial publishers, so only whatever is left can then be funneled to SCOSS. It will need to improve and grow in scope, both in contributions and in the infrastructures that are included; primarily it's now the DOAJ and SHERPA/RoMEO. But there could also be others that could sign up to this and make it a viable alternative for institutions to contribute to financially.
This is again too much reading for a presentation, really, but I think there is a need for further collective action on this kind of collaboration: to agree on data standards, to agree on data collection methods. OpenAPC is a great bottom-up collective action example of how you can feed into the same database the information you're sitting on, which wouldn't otherwise be shared. And one radical commitment, by Lewis some years ago, was about libraries reserving some percentage of their budget to fund open infrastructures, to fund things that are common, open source projects, things that don't really get the money they would deserve, in order for us to get the more open data future that we want. I think this is something worth thinking about. It's hard to argue that an optional invoice is a priority when you're cutting down access to a valuable resource, but something has to be done. If we want the future to be open, to have more open data on what's happening, we also need to fund it. It can't just be a shoestring operation.
And there needs to be better metadata for all the actors. I'm usually against web surveillance or any tracking, but I think scholarly publishing is not really about personal data. We should be able to identify individuals, journals, funders, organisations, you name it, well enough so that we could have good databases of how these interact, so that we could analyze what's happening. It's not about the negative connotation of surveillance; I think it's just a matter of accounting, bookkeeping. Scholarly publishing shouldn't be comparable to the horrible things that Facebook does with your data. It's not the same thing. I've heard people really against opening up scholarly publishing and trying to tag everything and anything, but I don't think that argument holds. I think we should do better in having identifiers for the central actors so that we can also see how things are moving. Of course ORCID is a good initiative in this regard, trying to at least identify authors so that we are not just relying on Scopus's proprietary mechanisms for identifying authors to figure out affiliations or collaboration networks. But we can do better: it should spread to organizations and funders and so forth.
And an interesting development that is in line with this, about being more metadata-centric and having everything identified, is something I heard advertised at the OASPA conference some weeks back: the Open Access Switchboard. I haven't really digested it all, but it sounds like a kind of data matching service. An author goes into the service and tries to find journals that match their funder criteria and their organizational policy, so it's almost like a dating service: authors look into the service and don't have to navigate the jungle of price capping or embargo times or copyright. They would have all of that coded from the start, based on who they are funded by, where they work, and which journals within their field fulfill these properties. I guess this is a good initiative, but I haven't really figured out what this means. It sounds like a good development that it's easier to navigate the jungle of different varieties that are moving around in open access, but I think we still need to figure out how this will work without again being a very publisher-controlled environment that is able to guide contributions to outlets that are premium partners of the service. Maybe I'm too cynical.
But my key takeaway here is that things are messy, but things are also improving. Some years ago, when I started studying open access, there was nothing like this. You had the DOAJ
and you could manually sample journals and make something out of it, but you couldn't get a lot of knowledge out of it. Now you have APIs provided by Unpaywall, you have Crossref providing fantastic metadata for the publications that they index. There are not limitless possibilities, but a lot more possibilities, and new ones are coming up every day. But we still should have a vision for where we want to end up, what kind of questions we want answered, and who should be controlling the data. I know I use these buzzwords here of sustainability and reproducibility, and I still hold onto that. It should be financially sustainable. We should motivate institutions to contribute financially to this, so that it's not just publishers shoveling in wheelbarrow loads of money to fund infrastructures that cater to their interests. There should be some pockets of money also coming in from universities and governments to facilitate open infrastructures.
So not bad, and I think reproducibility is getting better. Now we have better possibilities to really compare how the landscape was five years ago to how it is now. We have gotten better, the tools have gotten better, and things are constantly improving, so it's not just negative here. But what I'd like to emphasize is that bibliometrics is still very snapshot-driven. We need to do better at having longitudinal data. We should treat publications like human lives: we have a birth certificate, then major life events, and then maybe a death certificate, some closure. I just need some kind of closure on what's happening to journals, because it seems like we are mostly flying in the dark when it comes to knowing what's happening on the journal level.
But thank you, that is most of what I had for today. If you're still thirsty for more, I produced a YouTube video in the summer where I review some of these topics. You can pause it and enjoy all the details there, digest it in a slower fashion. And I also published a blog post on the Elephant in the Lab blog about some of the things I mentioned here, where I go more into depth about the motivations.
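The birth-certificate idea from the talk could be sketched as an event log per journal that is replayed to answer what a journal's status was in a given year. The event names, the years, and the function `status_in` are all hypothetical and only illustrate the approach; no existing data source is assumed to use this vocabulary.

```python
from dataclasses import dataclass

# Hypothetical event vocabulary and the state each event leads to.
STATE_AFTER = {
    "founded_as_subscription": "subscription",
    "founded_as_oa": "oa",
    "flipped_to_oa": "oa",
    "reverse_flipped": "subscription",
    "ceased": "inactive",
}


@dataclass(frozen=True)
class JournalEvent:
    year: int
    kind: str  # must be one of the keys of STATE_AFTER


def status_in(events, year):
    """Replay a journal's events in order and return its state in `year`."""
    state = "not_yet_founded"
    for event in sorted(events, key=lambda e: e.year):
        if event.year > year:
            break
        state = STATE_AFTER[event.kind]
    return state


# An invented journal history used only to exercise the function.
history = [
    JournalEvent(1995, "founded_as_subscription"),
    JournalEvent(2004, "flipped_to_oa"),
    JournalEvent(2012, "reverse_flipped"),
    JournalEvent(2016, "ceased"),
]
```

With a log like this, asking what was open access in an earlier year becomes a lookup rather than something that needs a time machine, provided the events were recorded when they happened.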