Science is broken

Speech transcript
Many of you probably know me from doing things around IT security, but I'm going to surprise you by almost not talking about IT security today. Instead I want to ask the question: can we trust the scientific method? And I want to start with a quite simple example.

If we do science, we start with a theory and then we try to test whether it's true. Now, I said I'm not going to talk about IT security, but I will show an example from IT security, or something adjacent to it. A while ago there was a post on Reddit with a picture from some book, which claimed that if you use a magic crystal, it can protect you from computer viruses. You buy these crystals, put them on your computer, and the book claims this protects you from viruses. To me that doesn't sound very plausible. But if we really want to know, we could do a study on it. And if you say people don't do studies on crazy things, that's wrong: people do studies on homeopathy and all kinds of completely implausible things.

So we can do a study on this, and what we will do is a randomized controlled trial, which is the gold standard for testing these kinds of questions. Our research question: do magic crystals prevent malware infections? Our study design: we take a group of maybe 20 computer users and split them randomly into two groups. One group gets one of these crystals and is told to put it on their desk or on their computer. The other group is the control group, and that's very important: if we want to know whether the crystals help, we need another group to compare against, and to rule out any kind of placebo effect we give the control group a fake magic crystal, so we can compare the groups against each other. Then we wait for maybe six months and check how many malware infections they had.
Now, I didn't actually do that study, but I simulated it with a Python script, and given that I don't believe the theory is true, I simulated it with random data. I won't go through the whole script, but I'm generating random data: I assume there can be between 0 and 3 malware infections per person, totally at random, and then I compare the two groups and calculate something called the p-value, which is a very common thing in science whenever you do statistics. The p-value is a bit technical, but it's the probability that you would get this result if there were no effect. Which, put another way, means: if you have 20 results under idealized conditions, one of them is a false positive, one of them says something happens although it doesn't. In many fields of science a p-value of 0.05 is considered significant, which corresponds to one error in 20 studies, but all under idealized conditions.

And as it's a script and I can run it in less than a second, I just ran it 20 times instead of once. So here are my twenty simulated studies, and most of them look not very interesting: of course there are a few random variations, but nothing significant. Except if you look at this one study: it says the people with the magic crystal had on average 1.8 malware infections and the people with the fake crystal had 0.8. So actually the crystal made things worse. But this result is significant, because it has a p-value of 0.03. So of course we can publish that, assuming I had really done the study, and the other studies we just forget about; they were not interesting, and who cares about non-significant results?

So you have just seen that I created a significant result out of random data, and that's concerning, because people in science can really do that.
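The simulation described here can be sketched in a few lines of Python. This is not the speaker's actual script; the group size, the fixed seed, and the normal-approximation significance test are my assumptions for a dependency-free sketch:

```python
import random
import statistics
from math import sqrt, erfc

random.seed(42)  # assumption: fixed seed, just for reproducibility

def simulate_study(n_per_group=10):
    # Both groups draw from the SAME distribution (0-3 infections,
    # uniformly at random): the crystal has no effect by construction.
    crystal = [random.randint(0, 3) for _ in range(n_per_group)]
    control = [random.randint(0, 3) for _ in range(n_per_group)]
    return crystal, control

def two_sample_p(a, b):
    # Rough two-sided p-value via a normal approximation -- a stand-in
    # for a proper t-test, to keep the sketch dependency-free.
    se = sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    if se == 0:
        return 1.0
    z = (statistics.mean(a) - statistics.mean(b)) / se
    return erfc(abs(z) / sqrt(2))

significant = 0
for i in range(20):  # run the "study" 20 times instead of once
    crystal, control = simulate_study()
    p = two_sample_p(crystal, control)
    if p < 0.05:
        significant += 1
        print(f"study {i}: crystal mean {statistics.mean(crystal):.1f}, "
              f"control mean {statistics.mean(control):.1f}, p = {p:.3f}")

print(f"{significant} of 20 pure-noise studies came out 'significant'")
```

On average roughly one run in twenty clears the 0.05 bar even though the data are pure noise; publishing only that run is exactly the trick being described.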
This phenomenon is called publication bias. What's happening is: you run studies, and if they get a positive result, meaning you see an effect, you publish them; if there's no effect, you just forget about them. We learned earlier that a p-value threshold of 0.05 means one in 20 studies is a false positive, but you usually don't see the studies that are not significant, because they don't get published. You may wonder: what's stopping scientists from doing exactly this, from doing so many experiments that one of them looks like a real result although it's just a random fluke? The disconcerting answer is: usually nothing.

And this is not just a theoretical example, so I want to give you an example that had quite some impact in the real world, and that is the research on antidepressants, the so-called SSRIs. In 2008 there was a study on this. The interesting situation was that the US Food and Drug Administration, which is the authority that decides whether a medical drug can be put on the market, had knowledge of all the studies that had been done to register these medications. Some researchers looked at that and compared it with what had been published. They figured out that there were 38 studies which saw a real effect from these medications, real improvements for patients, and of those 38 studies, 37 got published. But there were also 36 studies which said these medications don't really have any effect, that they're not really better than a placebo, and of those only 14 got published. And even among those 14, there were 11 where the researchers had phrased the results in a way that made it sound as if these medications do something. So a whole bunch of studies were simply not published because they had a negative result. And it's clear: if you look only at the published studies and ignore the unpublished ones with negative results, these medications look much better than they really are. Unlike the earlier example, there is a real effect from antidepressants, but they are not as good as people believed in the past.

So we have learned that, in theory, with publication bias you can create a result out of nothing. But if you're a researcher with a theory that isn't true and you really want to publish something about it, that's not very efficient: you would have to run 20 studies on average to get one of those random results that look real. There are more efficient ways to get results from nothing. If you're doing a study, there are a lot of micro-decisions you have to make. For example, you may have dropouts from your study group: people who moved to another place or whom you can no longer reach, and who are therefore no longer part of the study.
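Stepping back to the publication-bias mechanism in the antidepressant example: a real but modest effect looks bigger in the published record, and that can be reproduced in simulation. A sketch with assumed numbers (effect size of 0.2 standard deviations, 30 people per group, a simple normal-approximation test):

```python
import random
import statistics
from math import sqrt, erfc

random.seed(1)
TRUE_EFFECT = 0.2  # assumed: a real but modest effect, in SD units
N = 30             # assumed participants per group

def one_study():
    treated = [random.gauss(TRUE_EFFECT, 1) for _ in range(N)]
    control = [random.gauss(0, 1) for _ in range(N)]
    diff = statistics.mean(treated) - statistics.mean(control)
    se = sqrt(statistics.variance(treated) / N + statistics.variance(control) / N)
    p = erfc(abs(diff / se) / sqrt(2))  # two-sided normal approximation
    return diff, p

all_effects, published = [], []
for _ in range(1000):
    diff, p = one_study()
    all_effects.append(diff)
    if p < 0.05:              # journals only see the "significant" studies
        published.append(diff)

print(f"true effect:                    {TRUE_EFFECT}")
print(f"mean effect, all 1000 studies:  {statistics.mean(all_effects):.2f}")
print(f"mean effect, published studies: {statistics.mean(published):.2f}")
```

The published-only average comes out well above the true effect. Nothing here is faked: the significance filter alone does the inflating.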
There are different ways you can handle such dropouts. Then you may have borderline cases where you're not entirely sure: is this an effect or not? How do you decide what exactly to measure? You may be looking at different outcomes, because there are different tests you can run on people. And you may control for certain variables: do you look at men and women separately, or do you split the groups by age? There are many decisions you can make while doing a study, and each of them has a small effect on the result. So very often, just by trying all the combinations, you can get a p-value that looks statistically significant although there is no real effect.

There's a term for this, p-hacking, which means adjusting your analysis for long enough that you get a significant result. And I'd like to point out that this is usually not scientists saying to themselves, "today I'm going to p-hack my result, because I know my theory is wrong but I want to show it's true." It's a subconscious process. Usually scientists honestly believe in their theories; they honestly think their research will confirm them, and their mind subconsciously says, "if I analyze my data like this, it looks a bit better, so I'll do that." So they may subconsciously p-hack themselves into getting a result that isn't really there. And again we can ask what is stopping scientists from p-hacking, and the concerning answer is the same: usually nothing.

So I came to the conclusion that the scientific method, as practiced, is a way to create evidence for whatever theory you like, no matter whether it's true or not. You might say that's a pretty bold thing to say, and I'm saying it even though I'm not even a scientist, just some hacker. But I'm not alone.
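The analysis flexibility just described can also be simulated: take pure-noise data, try a handful of defensible-looking analysis variants, and report only the best p-value. The specific variants below (subgroup by sex, optional outlier exclusion) are my own invented examples of such micro-decisions:

```python
import random
import statistics
from math import sqrt, erfc

random.seed(7)

def p_value(a, b):
    # Two-sided p-value via a normal approximation (dependency-free sketch).
    se = sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    if se == 0:
        return 1.0
    return erfc(abs((statistics.mean(a) - statistics.mean(b)) / se) / sqrt(2))

def best_p_after_hacking(n=40):
    # Null data: the outcome is noise, and "sex" is assigned at random.
    treat = [(random.gauss(0, 1), random.choice("mf")) for _ in range(n)]
    ctrl = [(random.gauss(0, 1), random.choice("mf")) for _ in range(n)]
    p_values = []
    for sex in (None, "m", "f"):             # full sample or a subgroup
        for drop_outliers in (False, True):  # optional exclusion criterion
            a = [x for x, s in treat if sex in (None, s)]
            b = [x for x, s in ctrl if sex in (None, s)]
            if drop_outliers:
                a = [x for x in a if abs(x) < 2]
                b = [x for x in b if abs(x) < 2]
            if len(a) > 2 and len(b) > 2:
                p_values.append(p_value(a, b))
    return min(p_values)  # report only the "best" analysis

false_positives = sum(best_p_after_hacking() < 0.05 for _ in range(1000))
print(f"nominal rate: 5%, rate after trying six analyses: {false_positives / 10:.1f}%")
```

Six correlated analysis variants are already enough to push the false-positive rate well above the nominal five percent, and real studies offer far more degrees of freedom than this.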
There's a paper by the famous researcher John Ioannidis titled "Why Most Published Research Findings Are False", published in 2005. Look at the title: he doesn't really question that most research findings are false; he only wants to give reasons why this is the case. He makes some very plausible assumptions, for example that many negative results don't get published and that there is some bias, and he comes to the very plausible conclusion that this is indeed the case. And this is not even very controversial: if you ask the people doing what you could call science of science, or meta-science, the people who study scientific methodology, they will tell you "yes, of course that's the case"; some will even say "that's how science works, what did you expect?" I find that concerning. And if you take this seriously, it means that when you read about a study, say in a newspaper, the default assumption should be that it's probably not true, while people usually assume the opposite.

If science is a method to create evidence for whatever you like, you can test that by thinking of something really crazy. Can people see into the future? Can they read minds, do they have some extrasensory perception? Those ideas contradict very basic things we know about the world.
There is a psychologist called Daryl Bem, and he thought this is the case, and he published a study on it titled "Feeling the Future". He did a lot of experiments where he did something, and then something happened later, and he believed he had statistical evidence that the thing that happened later influenced what happened earlier. I don't think that's very plausible, based on what we know about the universe, but it was published in a real psychology journal. And a lot of things were wrong with this study; it's basically a very nice example of p-hacking. There is even a text by Daryl Bem himself in which he describes, as advice to psychologists, something that basically amounts to p-hacking. But the study was absolutely in line with the existing standards of experimental psychology, and that left a lot of people concerned: if you can show with these standards that precognition is real, that people can see into the future, then what else can you show? How can we trust the results?

Psychology has debated this a lot in the past couple of years; there is a lot of talk about a replication crisis in psychology. For many effects psychology thought were solid, it turned out that when researchers tried to repeat the experiments, they couldn't get the same results, even though whole subfields had been built on them. One recent example, one that is not discussed so much yet: there's a theory called moral licensing, and the idea is that if you do something good, or something you think is good, then afterwards you basically behave like an asshole, because you think "I already did something good, now I don't have to be so nice anymore." There were some famous studies with the theory that people who consume organic food later become less social, less nice to their peers. Just last week, someone tried to replicate the original experiments; they tried three times, with more subjects and better research methodology, and they totally couldn't find the effect. But what you have seen is lots of media articles about the original result; I have not found a single article reporting that it could not be replicated. Maybe they'll come, but this is just a very recent example.
Now I want to make this a bit more uncomfortable for you, because you may be thinking: yeah, these psychologists, that all sounds very fishy, they even believe in precognition. But maybe your field is not much better; maybe you just don't know it yet, because nobody has started replicating studies in your field. There are other fields that have replication problems, some much worse. The biotech company Amgen published an article in 2012 about their attempts to replicate cancer research, preclinical research, that is, the stuff done in a petri dish or in animal experiments, not drugs tested on humans, but what happens before you develop a drug. They were unable to replicate 47 out of 53 studies, and these were a set of landmark studies, studies that had been published in the best journals. Now, there are a few problems with this publication, because they did not publish their replications, and they did not tell us which studies it was that they could not replicate. In the meantime they have published three of these replications, but most of it remains in the dark. And that points to another problem: they say they collaborated with the original researchers, and they could only do that by agreeing not to publish the results. Still, it's pretty concerning. And some fields don't have a replication problem only because nobody is trying to replicate previous results; but then you will never know whether your results hold up.

So what can be done about all of this? Fundamentally, I think the core issue is that the scientific process is tied together with its results. We do a study, and only afterwards do we decide whether it gets published; we do a study, and only after we have the data do we decide how to analyze it. Essentially, we need to decouple the scientific process from its results. One way of doing that is pre-registration.
Before you start doing a study, you register it in a public register and say: I'm going to do a study on this medication, or on this psychological effect, and this is how I'm going to do it. Later, people can check whether you really did what you registered. This is more or less standard practice in medical drug trials, and the summary is: it does not work very well, but it's better than nothing. The problem is mostly enforcement: people register a study and then don't publish it, and nothing happens to them, even though they are actually required to publish it. There are two campaigns I'd like to point out here. One is the AllTrials campaign, started by Ben Goldacre, a doctor from the UK; they demand that every trial done on a medication be published. There's also a project by the same people, the COMPare project: they check, when a medical trial has been registered and later published, whether the researchers did what they registered, or whether they changed something in the protocol, and whether there was a good reason for it or whether it was changed just to get a result they otherwise wouldn't have gotten.

These issues in medicine often get a lot of attention, and for good reasons: if you have bad science in medicine, people die; that's pretty immediate and pretty drastic. But when you read about this, you always have to keep in mind that these issues exist in drug trials, which at least have pre-registration; most scientific fields don't bother doing anything like that. So whenever you hear about publication bias in medicine, you should always think: the same thing happens in many fields of science, and mostly nobody is doing anything about it.

And particularly to this audience I'd like to say: there is currently a big trend of people from computer science wanting to revolutionize medicine with big data and machine learning. In principle that's okay.
But I know that a lot of people in medicine are very worried about this, and the reason is that these computer science people often don't bring the scientific standards that people in medicine expect. They might say: we don't need to do a study on this, it's obvious that it helps. That is worrying, and I come from computer science, and I understand why people from medicine are worried about this.

There's an idea that goes even further than pre-registration, called registered reports. A couple of years ago, some scientists wrote an open letter that was published in The Guardian. The idea is to turn the scientific publication process upside down: if you want to do a study, the first thing you do with a registered report is submit your study design, your protocol, to the journal, and the journal decides whether it will publish the study before anyone has seen any results. That way you prevent publication bias; you prevent journals from publishing only the nice findings and ignoring the negative ones. Then you do the study, and it gets published, independent of what the result was.

There are of course other things you can do to improve science. There is a lot of discussion about sharing data, sharing code, sharing methods, because if you want to replicate a study, it's of course much easier if you have access to all the details of how the original study was done. We could do large collaborations, because many studies are just too small: with a study of 20 people you just don't get a very reliable outcome, so in many situations it would be better to get ten teams of scientists together and let them all do one big study, which can then answer a question reliably. And some people propose higher statistical thresholds, because a p-value of 0.05 means practically nothing. There was a paper proposing to simply move the threshold one digit to the left, to 0.005, and that would already avoid an awful lot of problems. And in particle physics they use something called five sigma, which is about 0.0000003, so physics has much higher statistical thresholds.

Whatever scientific field you're working in, you may ask yourself: are our results pre-registered in any way? Do we publish negative results, as in, "we tested an effect and got nothing"? Are there replications of all relevant results?
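The thresholds just mentioned can be put side by side. A small sketch; the five-sigma value is computed as the one-sided tail probability of a normal distribution, which matches the rough number quoted above:

```python
from math import erfc, sqrt

def sigma_to_p(sigma):
    # One-sided tail probability of a standard normal beyond `sigma`.
    return 0.5 * erfc(sigma / sqrt(2))

thresholds = [
    ("p < 0.05 (common in psychology, medicine)", 0.05),
    ("p < 0.005 (proposed stricter default)", 0.005),
    ("five sigma (particle physics)", sigma_to_p(5)),
]
for label, p in thresholds:
    print(f"{label}: about one false positive per {round(1 / p):,} true-null tests")
```

Five sigma works out to roughly three in ten million, which is part of why a physics "discovery" survives replication far more often than a p = 0.049 effect.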
If you answer all these questions with no, which I think many people will, then you're not really doing science; what you're doing is the alchemy of our time. Thank you very much.

I have a few more points that didn't fit into the main talk. There are also bad incentives in science. A very standard way to evaluate the impact of science is counting citations: if your study is cited a lot, that's considered a good thing, and if your journal is cited a lot, that's a good thing; that's, for example, the impact factor, but there are other such metrics too. And universities like publicity: if your study gets a lot of media reports, your press department is happy. These incentives tend to favor interesting results, but they don't favor correct results, and that's bad, because realistically, most results are not that interesting; most results will be "we had this interesting, counter-intuitive theory, and it's totally wrong."

Then there's this idea that science is self-correcting. If you confront scientists with these issues, with publication bias, they will say: science is self-correcting, we will eventually fix that, it's what scientists do, right? I want to quote something here; it's a bit long: "There is some evidence that in fields where statistical tests of significance are commonly used, research which yields nonsignificant results is not published." That is publication bias. And it goes on: "significant results published in these fields are seldom verified by independent replication." So it seems we also have a replication problem. These wise words were written in 1959 by the statistician Theodore Sterling. And because science is so self-correcting, in 1995 he complained: "this article presents evidence that published results of scientific investigations are not a representative sample of all scientific studies", and "these results also indicate that practice leading to publication bias has not changed over a period of 30 years." And here in 2018, publication bias is still a problem. So if science is self-correcting, it's pretty damn slow at correcting itself.

Finally, I would like to ask whether you are prepared for boring science, because ultimately I think we have a choice between what I would call TED-talk science and boring science. With TED-talk science, we get mostly positive, surprising, interesting results, large effects, many citations, lots of media attention, and you may get a TED talk out of it; unfortunately, it's usually not true. I would like to propose the alternative, boring science: mostly negative results, pretty boring, small effects, but it may be closer to the truth. I would like to have boring science, but I know it's a pretty tough sell. Thanks for listening.

Question: I just want to comment, Hanno, that you missed a very critical topic here, which is the use of Bayesian probability. The talk put, I feel, an unnecessary anti-science slant on p-values, as if they were the be-all and end-all of the scientific method. The p-value calculates the probability that your data would occur given that the null hypothesis is true, while Bayesian probability calculates the probability that your hypothesis is true given the data, and more and more scientists realize that this is probably a better way of doing science than p-values. So is there a Bayesian alternative to your proposal of boring science?

Answer: Actually, I agree with you; unfortunately I only had half an hour here. If you want to go deeper into this, you'll find me somewhere at a bar.
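The questioner's point can be made concrete with a two-line Bayes calculation: how likely is a hypothesis to be true after a single p < 0.05 result, depending on how plausible it was beforehand? The assumed statistical power of 0.8 is a conventional textbook value, not a number from the talk:

```python
def posterior(prior, alpha=0.05, power=0.8):
    # P(hypothesis true | significant result), by Bayes' rule:
    # true positives vs. false positives among significant outcomes.
    true_pos = prior * power
    false_pos = (1 - prior) * alpha
    return true_pos / (true_pos + false_pos)

for prior in (0.5, 0.1, 0.01):
    print(f"prior {prior:5.0%} -> posterior after one p < 0.05 result: "
          f"{posterior(prior):.0%}")
```

For a coin-flip-plausible hypothesis, one significant result is strong evidence; for an implausible one (precognition, anti-virus crystals), the posterior stays low even with p < 0.05. This is essentially Ioannidis's argument in miniature.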
Moderator: "Science is broken", then; it sounds a little bit like "you scratch my back and I scratch yours" for publication. Maybe two more questions, one minute.

Question: Thank you for your talk. I'm curious: you talked about ways we can address this assuming good actors, people who want to do better science, where the problems happen out of ignorance or willful ignorance. What do we do about bad actors? For example, drug companies really like being profitable, and there are incentives to make things look better than they are. How do we begin to address people who try to maliciously p-hack or maliciously abuse the system?

Answer: I mean, it's a big question. But I think if the standards confine you so much that there isn't much room to cheat, that's the way out. And basically, I also don't think deliberate cheating is that much of a problem; I really think the bigger problem is people who honestly believe that what they do is true.

Question: The value of a scientist is measured by the count of their publications, and so on. So for this tool you described to improve the situation, registered reports: who should impose this higher standard? Should the journals raise the bar and enforce submission of all protocols before results exist, or can we, as regular scientists, do something else?

Answer: You can publish in the journals that have better standards; there are journals that offer these registered reports. But of course, as a scientist, that's always difficult, because you're playing in a system that has all these wrong incentives.

Moderator: Okay, that's it; there was also a reference to a book by Ben Goldacre. And one last request: please give him a really warm applause. Thank you.

Metadata

Formal metadata

Title: Science is broken
Subtitle: How much can we trust science in light of failed replications, bogus results and widespread questionable research practices?
Series title: 34th Chaos Communication Congress
Author: hanno
License: CC Attribution 4.0 International: You may use, adapt, copy, distribute and make the work or its content publicly available in unchanged or adapted form for any legal purpose, as long as you credit the author/rights holder in the manner specified by them.
DOI: 10.5446/34851
Publisher: Chaos Computer Club e.V.
Release year: 2017
Language: English

Content metadata

Subject area: Computer science
Abstract: We're supposed to trust evidence-based information in all areas of life. However disconcerting news from several areas of science must make us ask how much we can trust scientific evidence.
Keywords: Science

Related material

The following resource is accompanying material for the video
The video is cited in the following resource
