Empowering social scientists with web mining tools

Formal Metadata

Title
Empowering social scientists with web mining tools
Subtitle
Why and how to enable researchers to perform complex web mining tasks
Title of Series
Number of Parts
490
Author
License
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
Web mining, mostly practiced as scraping and crawling, is not a straightforward task and requires a variety of skills related to web technologies. Yet web mining can be incredibly useful to the social sciences, since it enables researchers to tap into a formidable source of information about society. Researchers may not be able to invest copious amounts of time in learning web technologies inside and out, so they usually rely on engineers to collect data from the web. The aim of this talk is to explain how Sciences Po's médialab designed and developed tools to empower researchers and enable them to perform web mining tasks to answer their research questions.

Here are examples of the issues we will tackle during this talk:

How life in a social sciences laboratory can be a very fruitful context for tool R&D regarding web mining
How to create performant and effective web mining tools that anyone can use (multithreading, parallelism, JS execution, complex spiders etc.)
How to re-localize data collection: researchers should be able to conduct their own collections without being dependent on external servers or resources
How to teach researchers the necessary skills: HTML, the DOM, CSS selection etc.

Examples will be taken mainly from the minet CLI tool and the artoo.js bookmarklet.

Speaker

Guillaume Plique is a research engineer working for Sciences Po's médialab. He assists social sciences researchers daily with their methods and maintains a variety of FOSS tools geared toward the social sciences community and also developers.