
Python for Arts, Humanities and Social Sciences

Formal Metadata

Title
Python for Arts, Humanities and Social Sciences
Title of Series
Number of Parts
112
Author
License
CC Attribution - NonCommercial - ShareAlike 4.0 International:
You may use, adapt, copy, distribute and make publicly available the work or content, in unchanged or adapted form, for any legal and non-commercial purpose, provided that you credit the author/rights holder in the manner they specify and pass on the work or content, including in adapted form, only under the terms of this license.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
The various areas within the humanities and social sciences, such as political science, sociology, psychology and economics, have evolved to a point where they complement existing qualitative and quantitative methods with methods rooted in data science. This paradigm shift is primarily driven by real-world, publicly available data sets that cover a variety of scholarly domains and have the potential to answer fundamental research questions in these fields. There is, however, a major bottleneck to overcome before the full potential of data science in arts, humanities and social sciences can be realized: a fear of programming among students and researchers in these disciplines. Our talk presents some tips and tricks from course modules taught at Technological University Dublin; the fundamental idea is to present an overview of the data science toolkit and how it relates to problem solving in the real world. First we will present ways to make the data science lifecycle easier via tools such as Google Colab and Jupyter notebooks, followed by an explanation of how showing students the big picture and the workflow lifecycle of a data science technique helps them grasp concepts effectively. We will present examples from exploratory data analysis and classification using data-driven research questions, and look into elegant solutions that can easily be plugged into a social scientist's skillset. Examples of continuous assessment projects based on different Python libraries are also presented, with a view to further establishing Python as a valuable tool for arts, humanities and social sciences. Finally, we will give an overview of an Irish Research Council funded project, emanating from a combination of STEM and humanities disciplines, that aims to perform an economic assessment of anti-immigrant sentiment in Ireland.
Various phases of the project will be explained, with particular emphasis on those where Python tools play a major role in the interpretation of research outcomes. These research outcomes, obtained through smart use of Python data analytics tools, play a key role in building relationships between data scientists and policy makers in both government and the not-for-profit sector.
Transcript: English (auto-generated)
Thank you very much. Hello everyone, my name is Atif. I'll be presenting the first half of the talk, and then Arjumand will present the second half. The topic is Python for arts, humanities and social sciences.
Both of us have a computer science background in terms of training, but both of us are now applying Python, and computer science as a whole, in arts, humanities and social sciences. The talk is organized as follows. We are going to touch on what AHSS is,
a bit of our experience and exposure coming from data science and artificial intelligence, because it is very dominant, and some data science roles where there is a difference between STEM and AHSS, so I'll shed a bit of light on that. Then come some bits on Python and technology, followed by case studies, which will be covered by Arjumand.
So, arts, humanities and social sciences. The academic counterpart to this is STEM: science, technology, engineering and mathematics. But arts, humanities and social sciences is where the user, the human, exists. A lot of the time when we are trying to solve a problem,
it's mostly about humans. Even when we solve a problem from within computer science, we are mostly trying to solve a problem that exists outside of computer science, which is why we see a lot of interdisciplinary roles and applications within computer science as well.
One of the challenges with arts, humanities and social sciences is that less attention has been given to them, because there has more often been a drive towards STEM. However, the problems are now coming from AHSS more often than before.
Within universities, arts, humanities and social sciences get less attention compared to STEM, but recently there has been a change: there is a push towards arts, humanities and social sciences in Europe and also abroad.
Searching for a common good in the domain of humanities is a big thing for causes such as social good and social welfare, for uplifting society, for futuristic smart cities and all those kinds of things. The whole point here is the human in the loop: we are trying to solve a problem for the human.
That is where the humanities are quite important. In the UK it has been rebranded, because the attention has not been there: as soon as people hear "arts, humanities and social sciences", for some reason there is a stigma around it in the minds of students as well as in society.
So they tried to change that and brought in SHAPE, which simply means Social Sciences, Humanities and the Arts for People and the Economy. And of course where there are people, there is society and there is an economy. Now, when we look at the big picture of data science, or perhaps artificial intelligence,
we see a lot of intersections where a lot of things happen. For instance, when we merge computer science with business or business problems, we get traditional software such as Excel and PowerPoint, because these are tools that try to solve some business problems.
When we mix computer science with maths and statistics, this is where we usually see machine learning. And when we mix maths and stats with business or any domain, which could be arts, humanities and social sciences as well, we get data analysis, where most traditional research takes place.
But when we combine all three, we get what we call data science: computer science as a tool, maths and stats providing the theory, and the business or domain providing the context of the problem you're trying to solve.
In this intersection there is great support coming from Python. Statistics and analytics are two of the common terms used here. When you talk about statistics with a computer science person, they are mostly talking about
descriptive statistics, or perhaps EDA, exploratory data analysis, where the idea is to describe what has happened in the past rather than to predict what is going to happen in the future. This is what is traditionally called descriptive statistics, which is also my background, since I come from computer science.
On the other side, a person coming from a computer science background might be interested in what they call predictive analytics, where they use the data to predict something in the future. Then there are two other areas in which arts, humanities and social sciences people are mostly interested.
One, which comes from statistics, is inferential statistics, where they build a hypothesis and test it, which is a little different from predictive analytics, where we try to predict the future. The other discipline in which business and arts, humanities and social sciences people are interested is called prescriptive analytics,
which is the domain of what should be done: simulation, rule-based approaches to solving a problem, or perhaps recommending a different path. The interesting bit is that when we discuss Python, or the application of computer science or data science, you can see all four of them playing a huge role. A person coming from computer science is mostly
in these two areas, pointed out with the arrows, whereas somebody coming from a statistical background would be more inclined towards these two. Then there are a lot of applications of AI. As I mentioned earlier,
computer science is not solving its own problems as much as it is trying to solve other people's problems. We are trying to make the traffic experience better through automotive applications, and to solve problems in business, education, finance, manufacturing, gaming, government and health care. You name it: the applications are limitless.
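The distinction drawn in this part of the talk between descriptive statistics (summarising what happened) and inferential statistics (building and testing a hypothesis) can be illustrated with a minimal sketch. It uses only the Python standard library; the two groups of survey scores are invented for illustration and are not data from the talk, and the function names `describe` and `welch_t` are hypothetical. In practice one would more likely reach for pandas and SciPy.

```python
import math
import statistics

# Hypothetical 1-5 survey scores for two groups (illustrative values only).
group_a = [4, 5, 3, 4, 4, 5, 3, 4]
group_b = [2, 3, 3, 2, 4, 3, 2, 3]


def describe(scores):
    """Descriptive statistics: summarise what was observed."""
    return {
        "n": len(scores),
        "mean": statistics.mean(scores),
        "median": statistics.median(scores),
        "stdev": statistics.stdev(scores),
    }


def welch_t(a, b):
    """Inferential statistics: Welch's t statistic for the null
    hypothesis that both groups have the same mean."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    standard_error = math.sqrt(var_a / len(a) + var_b / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / standard_error


summary_a = describe(group_a)
summary_b = describe(group_b)
t_stat = welch_t(group_a, group_b)

print(summary_a)
print(summary_b)
print("Welch t statistic:", round(t_stat, 2))
```

A large absolute t statistic suggests the difference in means is unlikely under the null hypothesis; turning it into a p-value needs the t distribution, which a statistics library would provide.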
But most of them come from domain expertise outside of computer science. Now, when it comes to data science, there are two distinct kinds of roles: the first kind is, I'd say, more suited to STEM people, while the other kind is more suited to AHSS.
When we talk about a data scientist, and of course when we talk about data science we automatically talk about Python, because Python is the dominant language here as well, we mean a person who does almost everything: collection of data, analysis, visualization and presentation. If a person can do all that, we call that person a data scientist. However, we know that having an all-rounder,
the best all-rounder, is very difficult to achieve. So you might come across people with stronger abilities in one of these areas. For example, they might be very good at analysis but only able to do a bit of visualization, and so on.
It's in demand: lots of companies want to have a data scientist. Then there are machine learning experts, who are mostly about creating new methods and new models, doing research breakthroughs. They are always chasing
an improvement in accuracy and so on. They might come from an academic background and might just be trying to publish their new research. So these two are mostly STEM
roles. Then we have the data engineer and the data architect. These two are about how we build pipelines on the data: how we manage the data, the SQL, the storage, the big data and all these kinds of things. There is a subtle difference between a data engineer and a data architect. Data engineers are responsible for designing, developing and maintaining the entire data pipeline,
testing the ecosystem the business requires, and preparing the data for everybody else in the team, but mostly for data scientists. Data architects try to provide well-formatted data: they produce a schema and think about how to bring in structure so that everybody can access the information in a structured way.
Again, these two roles are STEM as well. Then there are database administrators, who are commonly known to businesses too. These are the people who can pull information out of the databases, and they are of course STEM as well. Then there are
specialized technological roles such as NLP experts, deep learning experts and so on, whatever the industry requires. Again, they mostly come from STEM areas. Now there are other roles, such as data analyst and machine learning engineer. They are more suited to AHSS; however, they might come from STEM too. There is a blurred boundary here.
But think of analyzing something in the context of business: if they don't understand the business, if they don't understand the importance of the human who is behind it, then performing such an analysis or a visualization and trying to
work out what is important becomes very difficult. So you have to become more of a domain expert and then know just enough of the right tools. Instead of showcasing that you know a lot of tools, you need to know only a few tools but really tell the story.
The machine learning engineer is a different role from the machine learning scientist. Scientists are chasing a one percent accuracy boost or coming up with a new modeling strategy, whereas engineers plug in what is already available in machine learning and try to solve a problem. They understand the problem more than they develop
machine learning and advanced machine learning in the theoretical space. Then there are data storytellers, who can tell the story from the data and inspire people, businesses and society, wherever the application is.
They need to connect with the real society, the real people who are going to benefit from it, and that's why this role belongs to AHSS. Likewise, there are people in business intelligence development, called BI developers. They understand the business and try to tell the story or solve the problem for the business. They might not know data science in detail, but they understand just enough to
make use of the available tools to answer queries for businesses. Now, when it comes to Python, we have great support for machine learning algorithms. Just to show you, this is how it looks. This is just a small subset of the real
support that exists within Python. These are all machine learning algorithms, and all of them are supported in Python. With scikit-learn, you can see there is a path: if you have a given data set, where should you start, and which path should you take from there to solve the problem? So there is a strategy.
There is a cheat sheet available, and of course it is growing as well. This is just to highlight what already exists. Then we have different types of neural networks. You can think of it like this: a rat's brain is different from an elephant's brain, a monkey's brain or a human's brain. You just need to borrow whatever is suited to the problem.
If a rat's brain is enough for the problem, you need only a small neural network to solve it, but if the problem is different, you have to go for a different architecture. So there is a variety of choices out there. Now, what has happened over the last couple of years is that we have lots of technology.
Here are some of the quotes I would like to point out. We have lots of technology available now, so it's mostly about selecting the right technology for businesses to benefit from it. We don't need to say, "Yes, there is another technology out there." So what? The problem is: what is the right set of technologies? Another thing is that computer science has become something like a calculator for the other disciplines, but
this calculator is quite different from the calculator we knew a few years or decades ago. It requires a person who understands it and can then prescribe what you need from it.
Another thing is that computer science is analogous to the spice in a cuisine, and that cuisine belongs to different disciplines, which could be outside of computer science. If you put in too much spice, it will spoil the dish.
Likewise, too much technology is not good, and with too little technology the result is tasteless. This is where Python is quite good, because it has a lot of support and just requires people to tailor it and prescribe it as a solution for other disciplines. So Python is not just for computer science
but of course goes beyond that. Here I'll stop and hand over to Arjumand for the rest of the talk. So my colleague covered what data science is, why social science matters, and how the boundaries are blurring now. But when it comes to actually teaching
programming and Python to arts, humanities and social science students, there are quite a few challenges. The topmost is the fear within them, as the minion shows.
The real issue we face, coming from a computer science background to an AHSS school or AHSS students, is how to make students overcome that fear. Over my one year, or maybe two years, of exploring this area, what I have learned is this:
don't make them think so much about the programming task at hand. Give them what they want in terms of their social science problem: curiosity-driven exploration. Social science and humanities disciplines are about solving problems of humanity, problems of society.
So tell them: for this problem of society that you want to solve, you do surveys, you do interviews. Computer science has not just the right number of tools but also a mix of different tools that can help you get the data and do some analysis,
some EDA, over it to make your job easy. That's when they stop thinking about the fear aspect and get curious about how the technology can give them more insights. For example, this is a postdoc position
advertised by an interdisciplinary team in Germany. You can see that the topic is the politics of inequality, which comes from a social science problem, but they want somebody who also has a background in Python and some of the technology. So this is the kind of work that is happening in research departments.
These are the kinds of research problems that computer scientists, or social scientists with a little bit of technology, can solve. So you motivate them: let's take a problem of society. I'll explain this as I go on, with what I did with a business school and
in another project on immigration. But first, another way to make their fear go away is to break the problem down into smaller problems, and for each one give them a notebook or a small demo that fascinates them
and makes them less worried about the programming and the syntax. That has worked well. This one is available on my GitHub. During the last year, I taught a fintech module at TU Dublin, Technological University Dublin.
Coming from a computer science background, this was my first such experience, and I wondered how I was going to teach business school students. Their backgrounds were very varied. There was a rugby player in my class, and there were two students from direct provision.
For those of you who don't know, direct provision is Ireland's asylum system, where refugees are housed. So the range was huge, and two or three were already quite good at programming. Taking them all along in one class was quite a challenge. Some were at times nervous that
the students who already knew programming were not letting them learn: "They're going too fast, they're answering all your questions", things like that. So I had to come up with something that would be challenging for everybody, not just the students who were
already familiar with programming, but also those who wanted to study the business and finance side of things. Anybody within finance and business knows that Tesla is the hot thing, so I created a project about Tesla news articles. I scraped some news articles, and the students had to do sentiment analysis over them and then use the sentiment features as
predictors of the company's financial performance. Another thing that helped was providing them with starter code. I gave them skeleton code, because at first they were struggling: how do we even start? How do we read those files? How do we read the CSVs and get the sentiments?
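The kind of starter code described here, reading headlines from a CSV and attaching a sentiment feature to each row, might look like the sketch below. It is not the speaker's actual skeleton code: the column names, the sample rows, the tiny word lists and the function names `sentiment_score` and `load_features` are all assumptions made for illustration, and the scoring is a deliberately naive word count rather than a real sentiment model.

```python
import csv
import io
import re

# Hypothetical stand-in for a scraped headlines file; the columns and rows
# are invented for illustration.
SAMPLE_CSV = """date,headline
2022-01-03,Tesla posts record deliveries and strong growth
2022-01-04,Tesla recall raises safety concern
2022-01-05,Tesla opens new factory
"""

# Tiny hand-made lexicon (an assumption); a real project would use a proper
# sentiment library or model instead.
POSITIVE = {"record", "strong", "growth", "gain", "beat"}
NEGATIVE = {"recall", "concern", "loss", "drop", "lawsuit"}


def sentiment_score(text):
    """Return (positive words - negative words) / total words, in [-1, 1]."""
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        return 0.0
    positive = sum(word in POSITIVE for word in words)
    negative = sum(word in NEGATIVE for word in words)
    return (positive - negative) / len(words)


def load_features(csv_file):
    """Read headlines from an open CSV file and score each row."""
    return [
        {"date": row["date"], "sentiment": sentiment_score(row["headline"])}
        for row in csv.DictReader(csv_file)
    ]


features = load_features(io.StringIO(SAMPLE_CSV))
for row in features:
    print(row["date"], round(row["sentiment"], 3))
```

With a real file, `io.StringIO(SAMPLE_CSV)` would be replaced by `open(path, newline="")`, and the per-day scores could then be joined with price data as candidate predictors.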
If you go to my GitHub and you want to teach finance students this project, the whole set of instructions is up there. I'll also share it on Twitter later. This worked really well, and towards the end the students were very happy that they had solved a very nice programming project.
When I saw their reports, there was also a part where they had to do their own feature engineering, and some of the groups, though not all, put a lot of effort into that, which helped bring out their creative side.
So it was a very good experiment. The next project is part-funded by the Irish Research Council, and these are the other main funders. It's called Inclusive IRE; IRE is short for Ireland.
It involves the School of Business at TU Dublin and the School of Computer Science at UCD. The idea is that we run surveys and focus groups, gather quantitative data from that part, and apply those insights to
social media: we gather data from social media to develop a tool for detecting sentiment around migrants in Ireland. Over time, especially during COVID, it was noted that Ireland is getting less friendly towards migrants. This wasn't the case a few years ago, but it has increased, or maybe it was never new.
Some aspects that social scientists always knew, but that we as computer scientists don't, came out of this study, especially some interesting insights in relation to the various nationalities that live in Ireland. I'll present a graph.
This box plot shows something very interesting, and it came from a theory suggested by a postdoc in international business. My background is not business, so I didn't know that there is something called cultural distance:
Ireland's cultural distance with respect to Anglo-Saxons, who would be people from the US and UK; Confucian Asians, people from China, Hong Kong and Vietnam; Eastern Europeans, from Poland, Ukraine,
Slovakia and countries like that; Latin Americans, from Brazil, Chile and that part of the world; and the Middle Eastern region. We made these clusters of countries based on the cultural distance value
derived from international business. Now we see an interesting pattern here. If we had analyzed all the migrants together, maybe this would not have been so obvious. Some communities find it very hard to make friends who are native Irish. By native here we refer to those who were born Irish.
The survey question was, "Do you find it easy to make Irish friends?"
The answers were coded from one to five. Sorry, to correct what I said: strongly disagree was the higher number. So the Latin Americans and the Africans, whose means are the same, find it harder. But within the Latin American group you see that even the lower quartile has a lower number,
so they are struggling more. There is wide variation among the Africans: some Africans find it easy. But overall it is the Latin Americans, and that makes sense: if you read the news articles in Ireland, you will see that a lot of the delivery drivers who are facing a lot of racism are from Latin America. So it adds up with the data from social media and the web.
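The numbers a box plot like this draws (minimum, quartiles, median, maximum per cluster) can be computed with a few lines of standard-library Python. The sketch below uses invented responses for two of the clusters named in the talk; the values, and the helper name `five_number_summary`, are assumptions for illustration and do not reproduce the project's survey data.

```python
import statistics
from collections import defaultdict

# Hypothetical (cluster, Likert code) pairs on a 1-5 scale; invented values.
responses = [
    ("Anglo-Saxon", 1), ("Anglo-Saxon", 1), ("Anglo-Saxon", 2),
    ("Anglo-Saxon", 3), ("Anglo-Saxon", 4),
    ("Latin American", 2), ("Latin American", 3), ("Latin American", 4),
    ("Latin American", 5), ("Latin American", 5),
]


def five_number_summary(values):
    """Return the values a box plot draws: min, Q1, median, Q3, max."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": median,
            "q3": q3, "max": max(values)}


# Group the responses by cultural-distance cluster.
by_cluster = defaultdict(list)
for cluster, score in responses:
    by_cluster[cluster].append(score)

summaries = {cluster: five_number_summary(scores)
             for cluster, scores in by_cluster.items()}

for cluster, summary in summaries.items():
    print(cluster, summary)
```

Comparing the quartiles side by side per cluster is what makes differences visible that a single pooled summary of all migrants would hide.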
So that's a very interesting outcome of this project. The next part is ongoing. Here we gathered tweets of migrants and natives, manually tagged for who is a native and who is a migrant. The data set
will hopefully be made publicly available, because it is Twitter data and the Twitter terms of service allow it. But it's still a work in progress, and when the findings are published, hopefully the code will be released along with the data. What this graph shows is that I
used word embeddings on the tweets of migrants and natives, applied k-means clustering to those words, and then applied t-SNE reduction to visualize them. Two of these clusters are labeled: the one with label 0,
the purplish one, and the one with label 2, the greenish one. In the green one there are a lot of migrants, and that cluster, label 2, is about abortion. There is this whole controversy
around Ireland's National Maternity Hospital: it is being given to a church called St. Vincent's, and a lot of natives seem concerned about it, but the migrants not so much; they didn't tweet a lot about it. So this also shows that the topics migrants and natives talk about on social media can tell us
the level of integration into society. That's what we're trying to do with this project. Hopefully I will be able to share the code soon, because the data will be made available.
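The clustering step described above, grouping word vectors with k-means, can be sketched in plain Python. This is not the project's code: the two-dimensional vectors are hand-made stand-ins for real word embeddings (which have hundreds of dimensions and would need a reduction such as t-SNE before plotting), the word list is invented, and the deterministic choice of starting centroids is a simplification of the random or k-means++ initialisation a library would use.

```python
import math

# Hypothetical 2-D "embeddings" for a few words; invented for illustration.
word_vectors = {
    "hospital": (0.9, 0.8), "maternity": (1.0, 0.9), "church": (0.8, 1.0),
    "visa": (-0.9, -0.8), "permit": (-1.0, -0.9), "passport": (-0.8, -1.0),
}


def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda i: math.dist(point, centroids[i]))


def kmeans(points, k, iterations=10):
    """Plain k-means: assign points to the nearest centroid, then
    recompute each centroid as the mean of its members."""
    # Simplified deterministic start: k evenly spaced points from the input.
    centroids = [points[i * len(points) // k] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [nearest(p, centroids) for p in points]
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(axis) / len(members)
                                     for axis in zip(*members))
    return labels


words = list(word_vectors)
labels = kmeans([word_vectors[w] for w in words], k=2)
clusters = dict(zip(words, labels))
print(clusters)
```

Once words carry cluster labels, counting how often each group's tweets fall into each cluster gives the kind of topic comparison between migrants and natives that the graph visualizes.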
My whole point is that if you take problems from the humanities and social sciences and approach them with data science methods, and Python has very effective tools such as Gensim word embeddings and NLTK, you get very interesting insights and ideas. That is what is
motivating the arts, humanities and social science disciplines towards it. In fact, there is a whole new field called computational social science. So, thank you.