Terminology and classification in the Prosecution Project

Video in TIB AV-Portal: Terminology and classification in the Prosecution Project

Formal Metadata

Title: Terminology and classification in the Prosecution Project
License: CC Attribution 3.0 Unported. You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

A recording of a presentation from Mark Finnane for the December 2018 AVSIG meeting
So, Mark, welcome. Please introduce yourself a bit more, if I've missed important things, and your project.

Yeah, and this is a complete change from the more technical side that she's very expertly presented there, even if there's the sense of her being a bit distant from the research domain, because I'm sort of a meat-and-potatoes researcher. I'm a historian, a criminologist, and professor of history at Griffith University. For the last five years I've been directing this project, the Prosecution Project, which is a history of the criminal trial in Australia. What's unique about it is that we're building a database of, as far as we can get them, all criminal prosecutions in Australian criminal jurisdictions, which are mainly the states (the six states and the Northern Territory), over very long periods of time, so we have records dating from 1788 through to the 1960s. This has been a
digital project that has relied on partnerships with archives that provide the data. The typical data is from original court registers, and we extract that data and transcribe it, because it's mostly manuscript data, so there is no way of accessing the data via machine technologies at the moment. So I've had to organize transcription, using the research team and the volunteer community, into a database that we built with research services at Griffith University. On this topic today we probably really should have somebody from our research [unclear] to talk about some of the issues that are likely to be of most interest to this group. But yeah, I mean, you've indicated an interest in this new type of research, so I might just introduce a little bit about it and show you some of the tools we have, and particularly the issue around what we do with the data once we
get it. Because, let me say, there are two types of users of this kind of data. There are researchers like ourselves, who may be interested in telling the individual stories or, in kind of conventional social science terms, looking at aggregated data and analyzing that in terms of what the factors are that shape how a criminal trial develops and what its outcomes are. At the individual level, we also have a very large community of people involved in family history, genealogy and so on, who also access our database. Those sorts of users are really interested in individual stories, and really in descriptions of events and individuals as they were recorded originally, not classified into some sort of higher aggregate.

But for the purpose of thinking about patterns in the events we're talking about, visualization of our data is becoming quite important, and it's at that point that we have to think about how we aggregate into meaningful categories that respect historical forms but also make sense in terms of [unclear]. So this
public search page (I think you can all see that here) just outlines the project. We have "Search historical trials" here, which has got a basic keyword search that works across a select number of attributes of our data and simply searches, in an uncontrolled way, for any term that somebody might choose to investigate. So somebody may come in and want to know about a particular individual, and they type that in. Or they may want to know about some particular offense and, without having to go into the more advanced search, they may wish to see whether we've got anything on forgery, and there's plenty of stuff there for them to look at.

But if they've got more information about the area in which they want to search, then they are able to search across a number of our attributes. Now, this is a select number of attributes for a specified period of time, which is constrained by archive access conditions: some of our records are from closed periods or are under restricted access of other kinds, such as children's court material. But most of the records we have are searchable across this range of attributes, and we're in the process at the moment, as we're getting to a more complete data set, of starting to consider releasing a bit more of our data.

So how do we derive these things? I think, in terms of any kind of application of principles of classification, the original data challenge is just at the transcription level, of getting accurate terminology off the page and into the data. First name and surname are significant challenges, so it's very important for our data that they be as accurate as possible. The offense category is one where we have the possibility of both keeping an original transcription and considering how it might be aggregated, and I'll show you that in a minute. Most of the other terms we have available we simply transcribe from the original record, and we have an open search that enables people to establish, for example, whether somebody was found guilty of offenses in New South Wales. Yes, so that's just how that search function works.

Well, I might just draw attention to what lies behind
this, and this is probably of more interest to a lot of people. Our first challenge was that we were dealing with a number of jurisdictions in which terms that we'd regard as, you know, common to them might have been represented differently in the original records, and the records in any case vary in the extent to which they cover all aspects of the criminal process. So Queensland and Victoria are particularly rich datasets in terms of including the earlier stages of the trial as well as the later ones. But we had to develop a process that would enable the researchers to define the different registers, as we call them (the different state jurisdictions and the particular courts from which we were accessing data), and have an approach that would allow us to add attributes as they emerged over time and to have registers that had different numbers of attributes, while at the same time respecting the original data.
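
The register approach described here (each register carrying its own attribute list, attributes added as they emerge, values stored as transcribed) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's actual schema: the `Register` class, its methods, and the attribute names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Register:
    """Hypothetical model of one court register with its own attribute list."""
    name: str
    jurisdiction: str
    attributes: List[str] = field(default_factory=list)
    records: List[Dict[str, str]] = field(default_factory=list)

    def add_attribute(self, attribute: str) -> None:
        # Attributes can be added over time; existing records are left untouched.
        if attribute not in self.attributes:
            self.attributes.append(attribute)

    def add_record(self, **values: str) -> Dict[str, str]:
        # Reject values for attributes this register does not define.
        unknown = set(values) - set(self.attributes)
        if unknown:
            raise ValueError(f"undefined attributes: {sorted(unknown)}")
        # Keep values exactly as transcribed; attributes absent from the
        # source are simply absent from the record.
        record = dict(values)
        self.records.append(record)
        return record


def shared_attributes(a: Register, b: Register) -> Set[str]:
    """Attributes that two registers have in common."""
    return set(a.attributes) & set(b.attributes)


# Illustrative registers; the attribute names are assumptions.
qld = Register("Supreme Court", "Queensland",
               ["surname", "first_name", "offense", "committal_place"])
nsw = Register("Supreme Court", "New South Wales",
               ["surname", "first_name", "offense"])

qld.add_attribute("sentence")  # an attribute that emerged later
# Made-up example record; only the offense wording comes from the talk.
qld.add_record(surname="Doe",
               offense="Breaking open a locked showcase and stealing therefrom")
```

Keeping the attribute list on the register, rather than fixing one schema for every jurisdiction, is one way to let registers differ in size while still being comparable on the attributes they share.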
So this is a typical example, maybe: the Queensland state Supreme Court, 67 attributes here. Some of these attributes will be shared with other states and others not. Some of the data is available in original sources; other data is very inconsistent. It's very important in this area, looking at Indigenous identity for example, but for the most part these records don't contain that, and it tends to be derived from other reports such as historical newspapers, which can be searched through a Trove API that we link to our records. I'll just show you quickly how this looks in practice, with, again, examples from Queensland. So a key
thing for us is verifying the data, and the system, for most of our states, enables us to check the data extracted against the original record. That's very important because our data has been prepared both by researchers on the research team and, as I mentioned, by quite a large number of volunteers. This record itself was entered by a volunteer just in the last day or so, and I'm able to check the accuracy of this record; this is a pretty experienced transcriber.

One of the key classification challenges for us is making sense of this offense here, "breaking open a locked showcase and stealing therefrom", which is a very specific definition of an offense that, if you looked at crime statistics, you wouldn't find an entry for. So we've done quite a lot of work over the last couple of years coding our offense data in particular, to enable us to visualize the records. So, out of that work, back on the main page, people are able to visualize the records through this facility, and here, as I say, we've run a code over...

Well, one second. Sorry, Mark, you just cut out for about a sentence there. If you can run just that last sentence
please.

Yes. So the visualization is a product of work we do on aggregating, particularly, our offense categories, because this is obviously a key area of interest for people looking at this in social science or historical terms. We run a code over our offense data for whole jurisdictions over long periods of time and generate levels of aggregation through that code. The classifications are pretty familiar to people working in criminal justice: anybody looking at criminal statistics since the 19th century would recognize these as generally the kind of categories that are used, and really across national borders now as well. So there's a lot of work gone into that, and we have both meta-level categories (homicide offenses, property offenses, person offenses) and then, within those categories, more refined aggregations that still have their reference point in historical statistics, or now in contemporary criminal justice statistics of the kind you see on [unclear].

The other areas are pretty much drawn direct from our data, although we do aggregate, again, the fields and sentences particularly, because there's some interesting sentencing during this period when the death penalty was still in place, occurrences in which the death penalty in fact was applied, particularly in the nineteenth century. For the trial place and committal place we just use the original data. We're involved in some mapping exercises at the moment where we geocode. We've got an ARC grant to look at more detailed studies of interpersonal violence over long periods of time, using this database and extending it, and we'll be very interested in geocoding crime events if we can get more specific information.

So look, that's sort of what we're about, and as much as I think I can say at this point. I'm very happy to answer any questions.

Thank you very much, Mark. That's very, very interesting stuff.
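
As a rough illustration of the offense coding described in the talk (running code over transcribed offense descriptions to produce a meta-level category and a finer one, while leaving the original wording untouched), a minimal sketch might look like this. The keyword rules, category names, and function names are assumptions made for the example; they are not the project's actual coding scheme.

```python
import re
from collections import Counter
from typing import Iterable, Tuple

# Hypothetical keyword rules: (pattern, meta-level category, finer category).
# Order matters: the first rule that matches wins.
RULES = [
    (r"\b(murder|manslaughter)\b", "homicide", "murder/manslaughter"),
    (r"\b(assault|wounding)\b", "person", "assault"),
    (r"\b(forgery|forging|uttering)\b", "property", "forgery"),
    (r"\b(breaking|burglary|housebreaking)\b", "property", "breaking"),
    (r"\b(stealing|larceny|theft)\b", "property", "theft"),
]


def classify(offense: str) -> Tuple[str, str]:
    """Return (meta category, finer category) for a transcribed offense.

    The original description is only read, never modified, so the
    transcription can be stored alongside the derived categories.
    """
    text = offense.lower()
    for pattern, meta, fine in RULES:
        if re.search(pattern, text):
            return meta, fine
    return "unclassified", "unclassified"


def aggregate(offenses: Iterable[str]) -> Counter:
    """Count offenses per meta-level category, e.g. for a visualization."""
    return Counter(classify(o)[0] for o in offenses)
```

With rules like these, a highly specific entry such as "Breaking open a locked showcase and stealing therefrom" falls under a property category that can be counted over time, while anything the rules do not recognize is flagged as unclassified for manual review rather than guessed at.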