You're Just Complaining Because You're Guilty: A DEF CON Guide to Adversarial Testing of Software Used in the Criminal Justice Systems

Video in TIB AV-Portal: You're Just Complaining Because You're Guilty: A DEF CON Guide to Adversarial Testing of Software Used in the Criminal Justice Systems

Formal Metadata

CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Software is increasingly used to make huge decisions about people's lives and often these decisions are made with little transparency or accountability to individuals. If there is any place where transparency, third-party review, adversarial testing and true accountability is essential, it is the criminal justice system. Nevertheless, proprietary software is used throughout the system, and the trade secrets of software vendors are regularly deemed more important than the rights of the accused to understand and challenge decisions made by these complex systems. In this talk, we will lay out the map of software in this space from DNA testing to facial recognition to estimating the likelihood that someone will commit a future crime. We will detail the substantial hurdles that prevent oversight and stunning examples of real problems found when hard won third-party review is finally achieved. Finally, we will outline what you as a concerned citizen/hacker can do. Nathan Adams will demo his findings from reviewing NYC's FST source code, which was finally made public by a federal judge after years of the city's lab fighting disclosure or even review. Jerome Greco will provide his insight into the wider world of software used in the criminal justice system—from technology that law enforcement admits to using but expects the public to trust without question to technology that law enforcement denies when the evidence says otherwise. Jeanna Matthews will talk about the wider space of algorithmic accountability and transparency and why even open source software is not enough.
Hi, I'm Jeanna Matthews. I'm a computer science professor at Clarkson, and we have Nathan Adams, a systems engineer at Forensic Bioinformatics Services, and Jerome Greco, a defense attorney at the Legal Aid Society. We're all super thankful that you got up for 10:00 a.m. to come hear what we think is a really important topic for all citizens of the modern world and for people interested in technology. We're going to be talking about adversarial testing of software used in the criminal justice system: "You're just complaining because you're guilty." Software is increasingly used to make huge decisions about all of our lives. I think in this room we know that: from hiring to housing, to how we find partners and friends, how we navigate streets, how we get our news. And the weightier the decision, the more crucial it is that we can understand it and question it. What input is being given to that decision? Is the decision correct, for whatever metric you would like to measure it by? Is there other information that really needs to be considered that's not being considered? And what kind of bias is involved in that decision? Are there protected attributes being considered, like race and gender? And even if those attributes are not considered directly, what about proxies for those characteristics that are just as effective as the characteristics themselves? The criminal justice system is just one example of this, but it's a pretty important one. Software and algorithmic decision-making is increasingly used throughout the criminal justice system, and usually it's black boxes for which trade secret protection is aggressively claimed. Often the intellectual property rights of companies are deemed more important than the rights of individual defendants to understand or question the decisions that are made about them, or the public's right to attend a public trial and understand a public trial process. And even besides that, there are many evidences of problems that bubble
up. So it's not just that it's a black box; we have evidence that there's trouble. And how are we going to find bugs and fix the problems if the answer is always, "You can't question that, you're just complaining because you're guilty"? For
example, can you imagine being sent to prison rather than given probation because proprietary software says you're likely to commit another crime, but you can't ask how the software makes that decision? That's the Loomis v. Wisconsin case. What about the primary evidence against you in a murder trial being the results of DNA software, but one program says you did it and another says you didn't? That's the Hillary case. What about being accused of murder solely because of DNA transferred by paramedics to the scene, but they don't figure that out for months? That's the Anderson case. Those are real examples. For those of us who build
technology and software, we know that software and complex systems need an iterative process of debugging and improvement. That's just a fact. Anyone who uses technology, let alone builds it, knows that there are glitches and bugs and unintended consequences, and you know how easy it is for there to be substantial bugs that you just haven't found yet, bugs that shock you when you finally find them. There's a huge advantage to independent third-party testing; we just know that, it's well documented. You need teams that are incentivized to find problems, rather than teams that have a vested interest in showing that the system is working just fine, thank you very much. And we're dealing with a system that actively disincentivizes that. If only those with an interest in the success of the software see the details, we have a huge problem and a big recipe for injustice, and that's what we're going to be talking about this morning. I'll hand
it over to Nathan, oh, to Jerome. So black boxes and proprietary software and trade secrets are increasingly becoming a problem in the criminal justice system, unfortunately so much so that we can't discuss all of them today, or in as much detail as we'd like, but I'm going to give you an overview with some examples so you understand what the problem is and how it's actually affecting cases. Just quickly, this is a graph from OSAC which shows all the different forensic disciplines being used in the criminal justice system; some are a lot more accurate and reliable than others. We've
broken down the technology being used by law enforcement into four distinct categories, although they're not as distinct as they may appear; some technology fits in multiple categories. In fact, the evidence gathering and evidence assessment categories often bleed into each other, but I will be giving at least one example from each category. Before we get into that, I've also broken down a lot of the technology into three different secrecy levels. There's "secret," which is: we don't want you to know this exists, and if you find out it exists, we don't want you to know that we have it at all. Then there's "secret as applied," which is: we have it, but we don't want to tell you when we're using it or how we're using it. And then there's the "trust us" category, which is: okay, we have it, yeah, we used it in this case, but don't look at the man behind the curtain, stop asking questions, just trust us, it works exactly like we say it does. I mean, why wouldn't it?
Starting with predictive policing. Predictive policing is basically using data and algorithms to make decisions that were traditionally left up to human law enforcement officers, which in theory sounds great, right? You can remove the bias from the system. Except in reality that's not actually how it works, because if you've trained the algorithm on data that came from years and decades of racist policing, you're going to end up with a racist output. And the problem with that is that you now have officers who can say, "Well, the computer told me, right? It's not my fault, the computer made the decision, and the computer has no bias." And it's like, well, you kind of trained it to have a bias. So for example, if you over-police a neighborhood, you're going to make more arrests in that neighborhood, whether or not there's more crime there. If you feed that into the algorithm, the algorithm is going to think: oh, there are more arrests there, so there's more crime there, so we'll send more officers there. That in turn leads to more officers making more arrests, to meet their quotas and to justify their jobs and their existence, and it becomes a self-feeding circle, which is not always actually the best method, and in fact often is not. With this comes a lack of transparency: most of these companies require non-disclosure agreements, claiming that they have proprietary trade secrets so you can't see how it works under the hood, and also saying that the data they use to train these programs, or to make the programs work, is sensitive, so you can't review it. All of this is preventing public scrutiny of the programs themselves and how they're being used.
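That self-feeding circle can be sketched in a few lines of Python. This is a deliberately simplified toy model, not any vendor's actual algorithm: two districts with identical underlying crime rates, where the only difference is a biased historical arrest record, and extra patrols are sent wherever past arrests were highest.

```python
# Toy model of the arrest-driven feedback loop (hypothetical sketch,
# not any vendor's real system). Two districts have IDENTICAL true
# crime rates, but district 0 starts with more recorded arrests.
def simulate(rounds=10):
    arrests = [60.0, 40.0]       # biased historical arrest record
    true_crime = [0.5, 0.5]      # same underlying crime everywhere
    base_patrols = [50.0, 50.0]
    surge = 20.0                 # extra officers sent to the "hot spot"
    for _ in range(rounds):
        hot = 0 if arrests[0] >= arrests[1] else 1
        patrols = base_patrols[:]
        patrols[hot] += surge    # patrol where past arrests were made
        # new arrests track patrol presence, not true crime differences
        new = [p * c for p, c in zip(patrols, true_crime)]
        arrests = [a + n for a, n in zip(arrests, new)]
    return arrests

final = simulate()
print(final)  # district 0 keeps pulling further ahead despite equal crime
```

Every round, district 0 stays the "hot spot" and collects extra arrests, so the recorded gap grows even though the true crime rates never differ; the algorithm's output simply launders the initial bias.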
For evidence gathering, today we're going to focus on cell-site simulators and mobile device forensics. A cell-site simulator, for those of you who don't know, is a device that mimics a cell phone tower and forces all the cell phones within range of it to connect to it. It can then lock on to a particular phone and use that to get a very precise location, for example a particular apartment in a multi-story building, which is the US v. Lambis case. Some of them also have the capability of intercepting content, meaning they can intercept text messages and voice phone calls. The reason why I don't use the term "Stingray device," which is what probably a lot of you have heard them called, is that the Stingray is one specific model; there are other models, like the Hailstorm, and they all have their own capabilities and differences, so "cell-site simulator" covers all those models instead of just referring to one specific one. Most people had no idea, actually pretty much anyone outside of law enforcement and the military had no idea, that these were being used, because they all required non-disclosure agreements. State and local law enforcement were signing non-disclosure agreements with the company, usually Harris Corporation, and also with the federal government, and so these devices were being used in criminal cases without defense attorneys knowing, without defendants knowing, and of course without the general public knowing. That has obviously changed, but they're still making efforts to keep it secret. In fact, the NYPD used one of these devices over a thousand times between 2008 and 2015 without ever once getting a warrant, and thanks to the NYCLU for their great work in being able to prove that. On top of that, to this day we still don't know which model the NYPD is using, because they still refuse to give up that information, and they are doing everything they can to keep that quiet, including spending lots of money litigating against it. But we're gonna talk
about a case which I worked on, the People v. Gordon case out of Brooklyn.
So in People v. Gordon, Matthew Kureta was the attorney of record on the case, and we also had a lot of help from Becca Wexler, who was at that time a legal fellow at the Legal Aid Society. Essentially, the police found our client at a location that really was not connected to him, and traditional cell phone tracking was not accurate enough to get them to where he was. So we said the only possible way they could have done this is a cell-site simulator. In our motion we said: we're moving to suppress, you used a cell-site simulator without a warrant; and if you didn't use a cell-site simulator, explain to us what you did, because we can't think of another technologically possible way. The prosecutor responds, concedes, and goes, yeah, okay, we did, we used one. For us this is the big deal: this is the first time we're aware of, in New York State on an open case, that we have been able to identify when a cell-site simulator has been used. So we're ecstatic, we think we're gonna win, we've got a great thing going. The judge issues a decision a few
months later, granting our motion to suppress the alleged ID, and we're on top of the world; we think we've broken new ground. The decision gets published, a New York Times article comes out, and then all of a sudden the NYPD says: no, no, no, you're all wrong, we didn't use one. Well, a prosecutor in the case just filed in court, affirmed in court, that you did use one, which was obviously not beneficial to their case, so it seems weird that they would lie about something that was only going to hurt them. And on top of it, they had months to correct that record and did nothing. It was only when it became very public and there was an article about it that all of a sudden they said, you know, we have to deny this. The only thing we could possibly think of is that, because they're bound by their non-disclosure agreement, they felt it was necessary to continue their denials, especially when it went public. That's particularly problematic because now it's already been established and they're still in denial; they're still trying to keep it secret even afterwards. In keeping with this
year's DEF CON theme, from 1983 looking a year into the future at 1984, this basically is a description of all of us and what we have in our pockets. As most of us know, our cell phones are the most successful, largest mass surveillance tool ever created, and so we're going to talk a little about mobile digital forensics. Riley v. California was the case in which the US Supreme Court said a warrant is required to look through somebody's phone or pull data from somebody's phone. This is often done with a device called the Cellebrite UFED Touch; this is the version 2 that you see on the screen, but there are other companies, like Magnet and Paraben and others, that also provide similar hardware and software. The purpose of these is to extract data from your phone so it can be reviewed and tagged by law enforcement. Now this isn't really as terrible, right, because it's available outside law enforcement; we can test it, we can see if we get the same results, we can see what mistakes it makes. There is a financial barrier, but beyond that, my office has one, and I can see if I get the same thing as law enforcement, if I pick up something different, or the mistakes it may make, right? But that's not true with Cellebrite
Advanced Services and GrayKey. We all probably remember the 2015 San Bernardino shooting case, with law enforcement and the Department of Justice saying: Apple, you need to help us get into this iPhone, right, you need to put back doors in your encryption; and Apple saying: hell no, we're not going to do that. Thank you, Apple, for once, the one time that you deserve applause. But then the FBI and Department of Justice later withdrew the request and said: we got into the phone, we don't need your help anyway. And everybody goes, well, you just told us it was impossible and you couldn't crack it, so what did you do? Shortly after, Cellebrite Advanced Services pops up, and what that is, is it allows law enforcement to send the phone to Cellebrite, they conduct some secret process, and then they send it back to the law enforcement agency unlocked, with the encryption no longer a problem. Recently GrayKey, which is a product by Grayshift, has appeared, and it also does a similar process, or has a similar result I should say, but instead of sending the phone off to a lab, they actually ship a product called GrayKey to law enforcement agencies, and they can do it in-house. The problem with these, though, is that they won't sell it to me and I can't look at any of this; as you can see, there's even an email telling me no. I can't know exactly how it works, and I can't verify that it's not deleting information or changing metadata, but law enforcement is still trying to put this into evidence, and nobody, including law enforcement, is really sure how it works. They're not in Cellebrite's lab, and they're not taking apart the GrayKey device; they're just trusting it because it benefits them. And to be clear, I don't think Cellebrite or Grayshift are doing anything intentionally malicious, but of course we all know that just because you program something to work one way doesn't mean it actually is going to work that way, right? There are bugs, there are flaws, there
are plenty of problems. In fact, if that weren't true, most of this audience wouldn't have jobs, or at least we'd have to find a different hobby, and DEF CON would probably be a very different conference. So in terms of
evidence assessment, I'm talking about facial recognition. That is one of the big things we're seeing now in the media, especially with the ACLU recently challenging Amazon's foray into this, where it matched actual politicians instead of the actual people it was meant to match. The problems we're having with this are multiple. One is that we're often not being told which company's facial recognition is actually being used to determine the match, and even if we are, we don't know how the algorithm works or how it's programmed. We're being told that this blurry surveillance still has a seventy percent confidence match to either a mug shot or a driver's license photo, or sometimes a social media profile, right? And the whole thing comes down to: okay, let's assume that's right, let's say it is 70%, and I can't even verify that because I don't know how it works, or you won't let me see how it works, but is that enough for it to be used as evidence in a trial? Is that enough for you to arrest somebody? And okay, let's say 70 percent's not enough; is 80 percent? Is that really what we want as evidence in court cases? And if you say 70 percent is more than enough, okay, then what about 60 percent? Where do we start drawing that line, and who gets to make that determination, right? Most law enforcement has very limited rules, if any, on how they're using this and how they're being trained, including examples of them actually manipulating photos to make it more likely to get a match, which seems just like evidence tampering to me. And in the limited testing we have been able to do on this facial recognition and facial identification, through examples like the Perpetual Line-Up, which is from Georgetown Law, and also the Gender Shades project, we have seen significant flaws based on race, gender, and age. For example, the Gender Shades project showed that dark-skinned women were more likely to be misgendered
by the program than, say, a light-skinned man, right, leading to more false identifications. And part of the reason that's believed to happen is the data a lot of these programs use to train their algorithms: it's mostly light-skinned men, so they tend to be more accurate for them than they would be for a dark-skinned woman. As a public defender, a large percentage of my clients are people of color, and this makes people who are already vulnerable in the criminal justice system even more vulnerable and more likely to be falsely identified. So to talk
about individualized assessment, there are a couple of different examples; today I'm talking about
sentencing algorithms, in particular the State v. Loomis case. This is the case that came out of Wisconsin; the US Supreme Court chose not to take it up, so it is not law across the country, but it is indicative of the fights that are happening everywhere right now. And if local defense attorneys are not challenging it, they should be, and if there are any of them in the room, I'd be happy to talk to you about that later. In this case, they used a risk assessment tool called COMPAS, made by Northpointe, in order to get a report with a recommendation of sentence for the defendant, Loomis. One of the things that is acknowledged, even by the company, is that it takes gender into account when it makes its decision. One of the ways it does that is through the idea that men are more likely to be recidivists, meaning they're more likely to reoffend, and therefore should be less likely to be given probation. So if you take a man and a woman who are exactly the same in all other aspects, same crime, same criminal record, everything else, this program is less likely to suggest probation for the man than for the woman. It's also more likely to suggest a higher sentence for the man than for the woman. That seems extremely problematic, especially when we're calling these individualized assessments and the assessment is based upon the history of a group that you were born into; that sounds terrible to me. The other problem is that oftentimes we don't know what factors are being included at all; we don't know exactly what factors it's using. And this is not just for sentencing; we're having this problem for bail, for parole. These decisions are being made, and it's not just COMPAS; there are other programs out there and more coming up every day trying to take over the market. And even when we know the factors, we don't know how they're weighted. So for example, I don't know how much COMPAS took gender into account, how significant it was when it makes its
decision. That seems like a pretty important thing, but of course they're hiding behind proprietary trade secrets, saying: well, if we released this information, a competitor would steal it from us and all of a sudden we'll have no jobs. And look, I'm sympathetic to some extent, but that doesn't trump somebody's rights. We're talking about people's liberty here; this isn't a minor thing, we're talking about going to prison, right? It's really important that defendants be able to challenge it, and that their defense attorneys be able to give them a full defense. It's not that we don't want to; it's that we're actually being hamstrung from doing so, and that's obviously very problematic. These black boxes and these claimed trade secrets should not be able to be used in the criminal justice system to override somebody's right to face their accuser and to challenge what's happening to them. Thank you. With
that, I'll leave it to Nathan. Hi, I'm Nathan Adams. I work for a forensic DNA consulting company in Ohio, and my background is in computing, so I have a little different flavor than a lot of the folks who work in forensic DNA, who are typically biologists. We had the opportunity in a criminal case to examine a previously secret software program that evaluates forensic DNA information, developed by the New York City Office of the Chief Medical Examiner; when I say OCME, that's the lab that developed this program. FST, the Forensic Statistical Tool, is the name they used for it. A little background on it: FST is
approved for use on DNA mixtures containing DNA from two or three individuals. As a general rule, the more DNA from different individuals you have in a mixture, the harder it is to evaluate whether any single person could have contributed. The program does attempt to account for missing data, for instance if you have an incomplete, low-level sample where the signal isn't very clear. It also allows for spurious noise, that is, drop-in of DNA information. Its output is intended to be a very concise likelihood ratio, which is a statistical weight. In the United States, at least, all DNA conclusions suggesting that some defendant could be included as a contributor to a sample, that is, that their DNA is possibly present on the item of evidence in question, need to provide a statistical weight. Because if every other person in the world could have contributed their DNA, if that's as specific as we get from the test, that doesn't give us very much information at all; half the jury box could similarly be contributors. On the other hand, we can get statistics suggesting that only one person in the world could have DNA matching this item, and look, here it's the defendant. So FST is supposed to streamline this process for complex mixture interpretation problems. They never sold it to other labs, although they tried to. What we learned ultimately is that it is a fairly straightforward Visual Studio project written in C# with a SQL back-end. And the timeline
is next. During the middle of the timeline we'll take a break and go into the problems that we looked into and identified. FST's initial use was approved by the New York State Commission on Forensic Science in 2010. It evaluates data at 15 locations on the human genome, so we're looking at 15 separate genetic locations; a single one of those is a locus, plural loci. Those are the locations it evaluates for those mixtures of up to three people. They initially attempted to go up to four people, and published an article that expressly stated their intent to do so, but they never did. It took them a little time to bring it online, but it was online as of April 2011 and started to be used on criminal casework. Keep in mind that all forensic science is fairly expensive to conduct, and DNA testing is no exception, so a lot of these investigations are reserved for particularly serious crimes. In New York City that would include possession of a firearm, a lot of sexual assault and homicide investigations that use DNA, or at least evaluate DNA, and sometimes property crimes. These are particularly significant investigations, so if someone is incriminated by the software, they are in a pretty serious situation where they could be facing a lot of time in prison. Then in 2011, and I don't know if anybody wants to predict what happens next, but that same month they modified their production version of the system and had to take it offline. This goes back to what Jeanna and Jerome were saying: everybody makes mistakes, and OCME made a big one by modifying their live version of the software and having to take it offline. We now have documentation, through Freedom of Information requests, suggesting this happened either the first or second week that it was being used on casework, again likely including sexual assault and homicide investigations. So these are pretty serious
investigations. We were only told that it was taken offline last fall; it went seven years without us knowing that they had messed up and had to take it offline. They fixed the
problem that caused them to take the system offline, but they also made some additional modifications to the program. It was later claimed that these modifications, made after the system was validated, after it was approved for use in casework, after it was brought online for casework, did not affect the underlying methodology of the program. But they didn't tell us any of this until 2017, because nobody knew it had happened in the first place. In July 2011 they finally brought it back online, so it took them three months or so to actually get it up and running again after they broke it. In 2016 my company was hired to work on a case where the source code to the software had been ordered turned over by a federal court. Chris Flood and Sylvie Levine, the two public defenders in that case, contacted us and asked us to take a look at the software. So we were involved in an investigation into what FST was actually doing, because this was not only the first time anybody had access to the source code; it was the first time anybody had access to an executable version of the system. Nobody had been able to put different sets of data through it to test it at any point, and it took a long time until we got it. Here is a short, fairly small set of output that FST produces. When it's run in a single case on a single evidentiary item, it produces a single PDF as output; this is a portion of a PDF that we generated during our investigation. If you look at the columns, they have alphanumeric designations that indicate which genetic location on the human genome we're looking at; these are the 15 loci that FST evaluates. The first row is a reference profile, that is, somebody's DNA profile, typically the defendant's, obtained from a cheek swab or a blood draw, and we're making an evaluation to see if that person could be a contributor to the sample. The three lines below the
reference profile are the evidentiary profiles. OCME tests each evidence item two or three times to develop DNA profiles from those items. FST then compares the reference profile to the evidence profiles to see whether there is support that the person could have contributed their DNA to that item. The statistical weight is reported for four different subpopulations in New York: Asian, Black, Caucasian, and Hispanic are the designations. Because DNA has a tendency to be more similar within a population than between populations, a different statistic is reported for each one. In an effort to be conservative, the laboratory will report the lowest of those four statistics. The higher that number is, the more support there is for the person to be included as a contributor, and typically that is incriminating; if the defendant's DNA is present on an item, that's usually a bad situation for the defense. So reporting the lowest statistic is the laboratory's attempt to be conservative. So this is the
significance of the statistic. It is a likelihood ratio, and it says that the evidence, that is, these three rows of DNA profiles generated from evaluating that sample, is 70.6 times more probable if the sample originated from the reference profile, that is, the defendant, plus two unknown, unrelated individuals. So this is a three-person mixture, and that is the prosecution's hypothesis: they posit that the defendant and two unknown, unrelated individuals contributed their DNA to this item, as opposed to the defense's theory that it's just three random people whose identities we don't know. In the comparison of these two hypotheses, the statistical weight says the evidence is 70.6 times more probable if the prosecution hypothesis is true than if the defense hypothesis is true.
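A likelihood ratio is just a ratio of two probabilities, one under each hypothesis. Here is a minimal sketch in Python; the probabilities are invented purely so the ratio lands near the 70.6 in the example, since a real probabilistic genotyping system estimates them from drop-out and drop-in models over the electropherogram data:

```python
# A likelihood ratio (LR) compares how probable the observed evidence is
# under two competing hypotheses. Both probabilities below are invented
# for illustration only.

# P(evidence | prosecution: defendant + two unknown, unrelated people)
p_given_prosecution = 3.53e-3

# P(evidence | defense: three unknown, unrelated people)
p_given_defense = 5.0e-5

lr = p_given_prosecution / p_given_defense
print(f"LR = {lr:.1f}")  # LR > 1 favors the prosecution hypothesis
```

An LR of 70.6 does not mean the defendant is 70.6 times more likely to be guilty; it means the evidence is 70.6 times more probable under one hypothesis than the other, which is exactly why the hypotheses being compared, and the data fed into them, matter so much.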
The issue with this is that we have documentation of the validation studies conducted by OCME in 2010, and for this same sample, the same evaluation I just showed you, 70.6 is not what should have been reported; it was 157. We were scratching our heads when we came across this. This was just a sample that we put through the executable version of the source code we were provided in this case, and we found that something was wrong. At first, of course, I thought that I hadn't configured my version correctly, but after double- and triple-checking everything, we realized that these were in fact two different values being reported. There had been no notice from OCME that this should be the case, that this is the case, that this will be the case, so it was upon us to identify what happened. These are the different likelihood ratios reported between the 2010 validation study and the calculations I was doing in 2016. We
realized, in the bottommost image, that there are three columns, three genetic locations, at which I did not give the system any information about the DNA present on the item, and it came up with the same 70.6 value as when I gave it information at all 15 locations. This was the smell test that led us to uncover code that was actually tossing data. Without acknowledging it to the analyst running the system, to the defendant, or even saying to the world that they do this, OCME in 2011 had started tossing data based on some rule within their system. So this calls into question whether
the validation study is relevant at all, because it studied a system that has since been modified, and pieces of information were no longer being considered in the casework version. As a bit of a refresher: a likelihood ratio above one is incriminating, and the higher it is above one, the stronger that evidence is supposed to be; a likelihood ratio below one is generally exculpatory, generally supporting the defense's theory. When we did a breakdown of this, we saw that these likelihood ratios are calculated at each genetic location and then multiplied together using the product rule, and we identified that one of the locations was actually exculpatory. OCME had told FST to throw out certain types of data, and it turns out that sometimes that data is exculpatory. So they are removing information that supports the defense's theory without telling anybody, including the casework analyst running the system. Two of the other locations for this particular sample are inculpatory, so they would support the inclusion of this person as a contributor. But what we do know from the validation study is that this particular individual, whose reference profile we're comparing, is not a contributor to the sample, so it is a false positive that the statistic is above one at all, and we're at a loss as to why it would be as high as 157 in the validation study. And then, after they make modifications to the system, which now in 2017 and 2018 they're purporting to be an improvement, or to have no substantial impact, we're finding out that they're throwing out data, some of which should be considered exclusionary, should be considered supportive of
the defense's theory of the case. The first public acknowledgement of this came in 2017, from an assistant US attorney. So the first person to publicly disclose that this was in fact happening was a prosecutor: not a biologist, not a laboratory director, not a scientist of any sort, but a prosecutor. The protective order that had been covering our investigation was vacated after substantial effort by ProPublica and the Yale Media Freedom and Information Access Clinic, who wrote to the judge and asked for the protective order to be vacated in the public interest. OCME for some reason did not oppose this, and then ProPublica posted the code online. So if you want to, this GitHub repo has everything that you need, data-wise, to get this running on your system; you will need some Microsoft products, though, at least to make it expeditious. As a brief recap, and I'm sorry I need to wrap this up quickly: twelve samples were tested as regression tests when this modification was made, and only two of those twelve were samples where data was tossed. So they're modifying FST in a way that throws out data, and then, to demonstrate that this doesn't affect the operation of the system, they evaluated only twelve samples, of which only two were even affected by the modification. That is an incredibly small sample size for them to be basing their conclusions on, and it's out of a total of 439 possible samples they could have evaluated in this regression test. That's a problem. And we recently learned that they have had 16 additional quality control tests, as they call them, which could indicate that additional modifications to FST have been made that we're not aware of yet. The modification made in 2011 is only 70 lines, including whitespace and comments, so that just demonstrates how much a little bit of code can affect
it.
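The locus-by-locus arithmetic described above, per-locus likelihood ratios combined by the product rule, shows why silently tossing data matters. A sketch with invented numbers (these are not FST's values):

```python
from math import prod

# Hypothetical per-locus likelihood ratios, invented for illustration.
# Below 1 is exculpatory (favors the defense); above 1 is inculpatory.
per_locus_lrs = {
    "D8S1179": 2.5,   # inculpatory
    "D21S11": 0.04,   # exculpatory
    "TH01": 1.8,      # inculpatory
    # ...a real FST run covers 15 loci
}

# Product rule: the overall LR is the product of the per-locus LRs.
lr_all = prod(per_locus_lrs.values())

# If the software silently discards the exculpatory locus, evidence
# favoring the defense simply vanishes from the reported number.
lr_tossed = prod(v for v in per_locus_lrs.values() if v >= 1)

print(f"All loci:           LR = {lr_all:.2f}")
print(f"Exculpatory tossed: LR = {lr_tossed:.2f}")
```

With these made-up numbers, the full profile is exculpatory overall (LR = 0.18), while the version with the exculpatory locus tossed reports an inculpatory LR of 4.5; the sign of the conclusion flips without the analyst ever being told.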
We're just going to run you through a few quotes from another case that involved similar probabilistic genotyping software, and the reasons given why defendants should not have access to the source code. The responses include, from a developer of one of these probabilistic genotyping systems, these complex software systems: you don't use source code to validate software. And it doesn't get better. A professor of medicine says the only reason you would need source code is if you want to modify the program. A DNA technical leader, this is at a forensic DNA laboratory, says: we don't need the source code because the source code isn't normally used when we validate software. So we don't need it because we don't need it. Another laboratory director said that DNA analysts only get one class in statistics and typically none in computer science, so what are we going to do with it? Well, I'm here, so, you know. And I think this is a great one to cap it off, and then Jeanna will take over. One of them poses a question to the court, and these are sworn declarations to a court, by the way, not just things they told me while we were chatting: if we were to discuss errors in DNA testing, wouldn't you want to capture an error rate for the entire workflow, for the entire DNA testing process? Yeah, exactly. Is that really a question? So Jeanna is going to explain
what else we have going on. In addition to this talk, we have a Brown Institute Magic Grant to do comparison testing of probabilistic genotyping software systems. We're focusing on FST, with and without this CheckFrequencyForRemoval function, and also on comparison to other systems; there are other open-source systems available. We're trying to tell this story to a variety of audiences, hopefully including coverage in the press for a general audience, to a technology audience like you, and also to a legal audience; we recently had articles in The Champion, the magazine of the National Association of Criminal Defense Lawyers. Basically we're arguing what I think all of us in this room know perfectly well: that independent third-party testing is essential. Unfortunately,
independent third-party testing of these systems is really hard. It's hard to get access to the executables, to the hardware, and even if you do, it's often under protective order, or it's very expensive: it might be thirty thousand dollars to get the program and five thousand dollars to go to training, or it might not even be sold to independent testers at all. It's difficult to get old copies of the software, or to match up results reported for particular defendants to the version that was applicable at that time. And that's just access to executables, let alone source code or bug databases or testing plans or design documentation or other things that would be incredibly relevant to understanding what the heck is going on. Even if you acquire these things, there are often terms of service that limit the publishing of results; how crazy is that? This problem of trade secret protection being aggressively claimed over the rights of the public and of defendants is, we feel, often really done to shield vendors from legitimate questions of quality and fairness, much more so than to protect them from competitors, and it is fundamentally thwarting the essential iterative improvement and accountability to stakeholders beyond buyers. If the people purchasing the software are basically just asking, are we sending people to jail, okay, good, then we have a big problem. It's also difficult to connect audiences: if you were to find a bug in FST, how would you be hooked up with the particular defense attorney to whom that bug might be relevant? And we would
really love your help. What are some
ways you can help? I hope that just listening to this talk, you say to yourself: I could have debunked those quotes, and I have some serious credentials, why didn't they ask me? If you say that to yourself, we can hook you up with defense attorneys whom you could help say those things. Also, we would really love to see advocacy for requirements in the procurement phase of software; once it's in use, under certain terms of service or whatever, it's harder. Why not say: if we're going to use public money for criminal justice software, require, or at least give a lot of credit for in the procurement phase, the following: source code; software artifacts like bug reports, internal testing plans, and software requirements; no clauses preventing third-party review (how hard would that be? come on); access to executables for third-party testing on reasonable conditions; and, here's a big one, scriptable interfaces to facilitate automated testing, because if you get your hands on these things and want to run a thousand-sample test, are you going to do that by hand? Bug bounties for finding things would be great, as would funds for nonprofit third-party entities to do independent testing. All these things would be on our wish list.
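As one example of what a scriptable interface would buy you: a silent change in behavior, like the one described above, would be caught by automatically re-running every validation sample against the current build. A sketch of such a harness, where `tool` stands in for any callable wrapper around a probabilistic genotyping system; all names and numbers here are hypothetical, since FST exposes no such interface, which is exactly the problem:

```python
def regression_test(tool, validated_results, tolerance=1e-6):
    """Re-run every validation-study sample through the current build and
    flag any likelihood ratio that drifted from the validated value."""
    failures = []
    for sample_id, validated_lr in validated_results:
        actual_lr = tool(sample_id)
        if abs(actual_lr - validated_lr) > tolerance:
            failures.append((sample_id, validated_lr, actual_lr))
    return failures

# Usage: a stand-in "modified system" whose output for one sample has
# silently changed (157 -> 70.6, echoing the discrepancy in the talk).
validated = [("sample_A", 157.0), ("sample_B", 12.3)]
modified_tool = {"sample_A": 70.6, "sample_B": 12.3}.get
print(regression_test(modified_tool, validated))
```

Run over all 439 validation samples instead of twelve hand-picked ones, a loop like this takes minutes and makes a regression like FST's 2011 modification impossible to miss.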
We'd love to see you become third-party reviewers. Go get your hands on FST, or Lab Retriever, or LRmix, or likeLTD, or EuroForMix, or other open-source probabilistic genotyping systems, or predictive policing software like CivicScape. Take a look, find some bugs or bad code; please do something about it yourself, but also let us know. Construct software yourself based on published alternatives and then compare the results you're getting to the black boxes. There are so many things our community could do to change this conversation. Please
help us do that. The big picture is that black-box decision-making is happening all around us, and our community could do a lot to bust open those black boxes, to compare them to one another, or to fight for accountability and transparency. The Association for Computing Machinery's tech policy groups came out with a set of principles for algorithmic accountability and transparency that could be a place to start. If you're involved in building software systems, you could point your team and your boss and your company to these professional ethics guidelines, which say we should be building in awareness; access and redress; accountability; explanation; data provenance; auditability; and validation and testing. We can all do a lot to provide the evidence that's needed to improve systems for all stakeholders, so that we're not running our society on buggy, or possibly even malicious, algorithms that are hidden from view. We would like to thank the
many people without whom our work would not be possible, with a special shout-out to Sue. We have four of our students working with us here in the crowd: Mariama, Marcia, Stephen, and Abby, you guys can wave. And I will simply end with: please get in
touch with us if you think you can help with this effort. In full disclosure, the best way to get hold of us is probably our three direct emails, but we have tried to set up some more joint channels. We set up a Twitter account, Software Justice; it was just recently set up, so be gentle with it. We also set up a Discord channel; if you find Software Justice on Twitter, there's a recent link with an invitation to our Discord. We're working on setting up a subreddit for Software Justice, but it's not up yet. Please get hold of us and let's all work on this, because I think our community could make a big difference, and a big difference is really needed. [Applause]