Quick and Dirty Usability: Leveraging Google Suggest to Instantly Know Your Users

Automated media analysis (beta)

Speech transcript
[Introduction, partly unclear] ... the work that he has been doing as a PhD student, and we're going to show you how you can use Google Suggest to quickly figure out what users are doing with your software. It's really cool technology. Later in the day we're going to show a new version of [unclear] as well. In both of these talks I encourage you to interrupt and ask questions, because to get the most out of a talk you have to stop and say, "Wait, what about this? What about that?" So again, please interrupt.
[Speaker] Right. So, as mentioned, this is about understanding how people are using software by looking at the search queries they perform. It's based on the observation that when people run into trouble with software or with interactive devices, their first line of defense is often Google. I'll start with a very quick motivating example.
Back in September I went to Google and typed in "Firefox how", and immediately, of course, Google returned a list of 10 suggestions for how to complete the query. The important thing is that these are actual queries that other people have performed in the past, and they're sorted approximately according to their popularity. Just looking at this list, we get a pretty good idea right away of some of the common activities that Firefox users engage in. There's an interest in privacy: clearing the cache, deleting cookies, and so on. But if we go further down the list, the one I've highlighted is a little bit curious: it's about how to get the menu bar back. So we can inspect the Firefox interface to try to figure out why this is so popular.
This is Firefox 3.6. Again, I did this in September, so Firefox 4 was still in beta. This is Firefox 3.6 on Windows XP, and I'm happy to report that this is a problem only on Windows. If you go to the toolbars menu and uncheck the menu bar, well, the menu bar disappears.
The problem is that the menu bar is now gone, and that's how you access those options to begin with. The first time I did this, I had no idea how to undo that action, no idea how to recover from this situation. And it turns out that I'm not alone, as you might have already suspected.
If you look exhaustively through Google Suggest, there is a whole range of different search suggestions all related to this missing menu bar. Using the techniques I'll talk about later, we've identified over 150 different suggestions in Google Suggest about this issue, and we estimate that a search from this set is performed about once every 32 minutes on average. What I think this example demonstrates, and I'm really appealing to your intuition here, is that search query logs are essentially repositories that catalog the day-to-day needs of the user community.
Stepping back a little and looking at this in a broader context, there has been some research at Microsoft Research on how to use query logs for things like medical research and social science research. A researcher named Matthew Richardson has this quote, which I really like: query logs act as if a survey were sent to millions of people, asking them every day to write down what they're interested in, thinking about, planning, and doing. So this is very rich data, and it's highly ecologically valid.
Some of you may have come across this in the past: Google produced an application called Google Flu Trends, where they look at health-seeking behavior, such as searches related to flu symptoms [unclear], to try to predict when somebody would go to the doctor and be diagnosed with the flu. The important thing is that their model is very close to the data released by the CDC, but they're able to produce these numbers in 24 hours, whereas the CDC had a 7-day lag because it has to wait for the doctors to report all these cases. So it's a very, very powerful technique. Taking this a bit closer to what we do: I looked at Google Insights for Ubuntu, and I think we can see a six-month release cycle here. So it's actually a pretty neat data source.
So the claim here is that query logs can reveal the tasks and the issues for any publicly available interactive system. I stress that it has to be publicly available, because people have to be performing searches, and you get better results when you have a larger community, a larger user base. The problem is that we don't have access to Google's query dataset, so the question is how we approximate this data. I think you already know the answer, but just to reiterate:
Here's the example from the beginning again. If I add one letter to it, I get another 10 suggestions. I can do a sort of breadth-first search here and get more suggestions, and in total there are about 74 or 75 thousand distinct suggestions if we do this. I'll actually just do this live, to give you a sense of how easy it is.
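The breadth-first expansion described above could be sketched roughly as follows. This is a minimal illustration, not the script used for the talk: `fetch` is a hypothetical caller-supplied function, stubbed here with a small in-memory list so that no automated queries are sent anywhere, and the rule of expanding only prefixes that return a full page of suggestions is an assumption added for the sketch.

```python
from collections import deque
import string


def enumerate_suggestions(seed, fetch, page_size=10, max_depth=2,
                          alphabet=string.ascii_lowercase):
    """Breadth-first expansion of a query prefix.

    fetch(prefix) is a placeholder that returns a list of at most
    page_size suggestion strings for the given prefix.
    """
    seen = set()
    queue = deque([(seed, 0)])
    while queue:
        prefix, depth = queue.popleft()
        results = fetch(prefix)
        seen.update(results)
        # Assumption: a full page hints that more suggestions are hidden,
        # so only those prefixes are extended by one more letter.
        if len(results) >= page_size and depth < max_depth:
            for letter in alphabet:
                queue.append((prefix + letter, depth + 1))
    return seen


# Hypothetical stand-in for a suggestion service (made-up data).
CORPUS = [
    "firefox how to clear cache",
    "firefox how to delete cookies",
    "firefox how to get menu bar back",
    "firefox how to import bookmarks",
]


def fake_fetch(prefix, limit=2):
    """Return at most `limit` corpus entries that start with the prefix."""
    return [q for q in CORPUS if q.startswith(prefix)][:limit]


found = enumerate_suggestions("firefox how to ", fake_fetch,
                              page_size=2, max_depth=1)
```

With the stub limited to two results per prefix, the seed alone surfaces only two queries; adding one letter at a time recovers the rest, which is the effect described in the talk.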
[Live demonstration] You get the idea: you can see the common questions very quickly. Some people don't know this, but [unclear: moving the cursor back within the query appears to make it fill in suggestions at other positions], so you can really quickly go through and enumerate these things.
I actually did that for a bunch of projects, and some of them are listed here. You can see that in many cases thousands, and in some cases hundreds of thousands, of query suggestions are returned. I have an asterisk for Blender here, just to indicate that when a project has a common name, you end up having to disambiguate between the Blender project and blenders for ice and things like that. But you can filter those out pretty well just by actually performing the searches and seeing what pages come back; you can then determine from the language of the pages whether or not a suggestion is relevant to a particular topic.
For GIMP, for example, we have about 15 thousand suggestions. Those are the suggestions, but they actually represent about 2.8 million search queries. There are actually more queries than that, but to preserve people's privacy Google does a sort of k-anonymization: they cut off the long tail, so only searches that are performed by many users are recorded in this dataset. So we've got data, very quickly, representing about 2.8 million searches, and that's fairly typical for this type of thing. And of course the popularity of searches falls off exponentially, so the popular stuff is really popular and then there is a long-tail distribution. All right, I'll just show you a couple of
examples, very briefly. This one I like quite a bit; I like to refer to it as "speak the user's language". Again, I adapted this presentation from a presentation I did earlier in the week. This is probably pretty obvious, but within the GIMP dataset we see a lot of searches from people asking how to convert an image to black and white. In most cases they're not talking about a binary image; they want something that looks like it was captured on black-and-white film. There are many ways of doing this in GIMP: you can use Grayscale, Desaturate, the Channel Mixer command, that kind of thing. The problem is that none of these commands use the words "black" or "white", and that's what people are searching for. So at the least, this suggests that some percentage of the audience may not be able to recognize that these commands are relevant. We identified over 90 distinct phrasings of the question of how to convert to black and white, and they are searched about once every 74 [unit unclear].
[Audience member] [unclear] It's just a random idea: have you thought about having, instead of just a set of static menus, a searchable set of menus?
[Speaker] Yes, in fact this factors quite prominently into our research. It's something I'm working on currently, and I can show a little bit about that at the end of the presentation. My colleague Ben is also going to show you that this afternoon, and search plays a big role there as well. I think that's a good point: you can't find one vocabulary that fits everybody. What you want is to have some of these aliases, so that depending on your background you can come in and still find what's relevant to the task you want to perform.
Another example, this time Inkscape. A lot of people are asking how to crop. They're not thinking about fitting the page to the selection or those types of things. Here we can see that these types of searches are performed about once every 3 hours [unclear]. And we can look at
another example here. The claim is that in GIMP, drawing primitive shapes is typically a multi-step process. To draw a circle, for instance, you make a selection and then you stroke the selection. As a result, we see many, many searches from people asking how to draw a circle; this is searched about once per hour. And that's not the only primitive shape people are looking for.
We see around 130 different ways of asking how to draw various types of lines, in particular straight lines, 40 different suggestions for rectangles, 24 for squares, and 14 for ellipses. So this data suggests that the multi-step process may be eluding some portion of the audience for the software, people who are actually using it from day to day.
Right, so those are a couple of examples of the things you can get from this at a fairly high level. But we do get a fair amount of data, tens to hundreds of thousands of queries, so we get a good chunk of that long tail and you can get pretty specific. I want to go into how we can start to identify the interesting parts of the suggestions to focus on. The problem is that when you get 140 thousand suggestions, not all of them are useful for understanding how people use the software. Maybe people are just trying to download the software, or maybe they want to read reviews, or something like that. So you want to pick out the ones that are useful. You've already had a taste of that, but I'm going to go into it in a bit more detail.
In our research we've identified about 6 different types of queries in this space. The ones we look at for understanding people's use are operating-instruction queries, where people ask how to perform a particular task, and troubleshooting queries.
What we did is look at how these queries are phrased. It turned out that if a query is phrased as a question ("how to", "how can", and things like that), it was typically people looking for operating instructions. I have some templates up here. You can also see imperative statements: any time you say something like "draw a circle", with the verb in the present tense, that also tended to indicate people looking for operating instructions. For troubleshooting, the queries tended to be statements of fact: this is the situation. The other thing that's often useful is just to look at queries that have certain keywords in them. I've listed the obvious ones here, just to give you an idea of how to filter the data, again in a quick and dirty way, to get at what you're interested in. And of course, once you've done that, there is a variety of visualizations and tools you can use to navigate all this data.
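A rough sketch of that kind of phrasing-based filter is shown below. The question patterns, verb list, and keyword list are illustrative assumptions, not the templates shown on the slide, and checking troubleshooting keywords first is a design choice made only for this example.

```python
import re

# Illustrative patterns only; the talk's actual templates are not reproduced.
QUESTION = re.compile(r"\b(how (to|do|can)|can i|where is)\b")
IMPERATIVE_VERBS = {"draw", "convert", "crop", "resize", "make",
                    "change", "remove", "add"}
TROUBLE_KEYWORDS = ("missing", "not working", "error", "crash",
                    "won't", "disappeared")


def classify(query):
    """Label a query as 'troubleshooting', 'instructions', or 'other'."""
    q = query.lower()
    # Statements of fact about a problem usually carry a telltale keyword.
    if any(keyword in q for keyword in TROUBLE_KEYWORDS):
        return "troubleshooting"
    # Queries phrased as questions tend to ask for operating instructions.
    if QUESTION.search(q):
        return "instructions"
    # Imperatives such as "gimp draw a circle": a present-tense verb right
    # at the start, optionally preceded by the application name.
    if any(word in IMPERATIVE_VERBS for word in q.split()[:2]):
        return "instructions"
    return "other"


labels = {q: classify(q) for q in [
    "gimp how to draw a circle",
    "gimp draw a circle",
    "firefox menu bar missing",
    "gimp download",
]}
```

Queries labelled "other" (downloads, reviews, and so on) can then be set aside, leaving the two types that say something about how the software is used.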
Here's a simple tag cloud, this one for Inkscape. [unclear: the speaker mentions possibly showing some of the data for various projects live.]
The other one I like is the term co-occurrence visualization. The way this works is that the column down here represents the most common words, and what goes across horizontally represents the words that co-occur in the same query as those words. So you can see right away what people want to draw: lines, rectangles, curves, circles, et cetera. And what do they want to do with color? Change color, [unclear] by color, and so on. It's just a way of summarizing the data and getting a sense and feel for what people are interested in doing.
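The table behind such a co-occurrence view could be computed along these lines. This is a minimal sketch with made-up queries; the stop-word list and the choice to count each word once per query are assumptions made for the example.

```python
from collections import Counter, defaultdict

# Small illustrative stop-word list (an assumption for this sketch).
STOPWORDS = {"a", "an", "the", "to", "in", "how", "of", "on", "with", "i", "do"}


def cooccurrence(queries, app_name, top_n=5):
    """Map each of the top_n most common terms to its most frequent partners."""
    term_counts = Counter()
    pair_counts = defaultdict(Counter)
    for query in queries:
        # Each term counts once per query; the application name is dropped.
        terms = {w for w in query.lower().split()
                 if w not in STOPWORDS and w != app_name}
        term_counts.update(terms)
        for term in terms:
            for other in terms:
                if other != term:
                    pair_counts[term][other] += 1
    return {term: pair_counts[term].most_common(top_n)
            for term, _ in term_counts.most_common(top_n)}


# Made-up sample queries, for illustration only.
sample = [
    "inkscape draw line",
    "inkscape draw circle",
    "inkscape draw rectangle",
    "inkscape draw straight line",
    "inkscape change color",
]
table = cooccurrence(sample, app_name="inkscape")
```

Each key of `table` corresponds to a row label in the visualization (a common word), and its list holds the words laid out horizontally next to it, most frequent first.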
Before getting into current work, I'll just show some of this. We have these interactive tag clouds for this. We've got GIMP, Inkscape, [unclear], and Blender. Unfortunately the resolution is off here, but you can sort of see what I'm talking about. Let me click on "draw" ... I promise it was working before.
You can get a sense, again with the tag cloud, of what people are trying to draw. This is filtered on the word "draw", and on the right here you can see all the queries that have to do with drawing in GIMP. We can talk about this dataset afterwards.
I'm going to conclude with some of the other ways we're looking at using this data currently, and this gets back to your question.
Up to now I've been advocating that we use this data to understand how people are using software in practice, to get a really quick sense of a large user community. But we can also make more active use of the data, and this is some of the work that I'm doing right now.
Basically, when somebody types a search query into Google, it retrieves documents: tutorials, forum postings, et cetera. Embedded in those pages are many references to commands in the software. We can produce a list of commands by instrumenting the software, or just by looking at its localization database, the string database. So all we do is perform the searches, retrieve the documents, and identify the commands that appear in those documents. When we take a query dataset from Google Suggest, we can create these fairly large graphs.
We used the GIMP dataset for this. Because ingimp was instrumented to record all command invocations, we have a list of all commands. But for other applications you don't have to instrument the application to get the names of the commands. I did it for Inkscape in a very quick way: again, I just looked at the translation files, the string translation databases, and now I have all the error messages, all the menu names, and so on. It's very quick in many cases.
When you take these very large datasets, you can create these associations, these graphs, with queries on one side and commands on the other. What you can do with that is build a smarter sort of command search. If you were to integrate search into the interface and I say "convert to black and white", it can come back, based just on the pages that Google returns, with Channel Mixer, Grayscale, and Desaturate. It's a search engine, so the rankings may not always be the best; maybe Desaturate should be above Channel Mixer. But we've got a pretty good first result there. Similarly, you can do command recommendations, if you will: if I ask what commands are used in a similar context to Stretch Contrast, you see that White Balance, Auto Levels, and Colors are also used in that context.
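As a minimal sketch of how such a query-to-command graph might be stored and used: the class, its method names, and the edge weights below are hypothetical, and the exact-match lookup is a simplification rather than the approach used in the research. Each weight stands for how many retrieved pages mentioned the command for that query.

```python
from collections import Counter, defaultdict


class QueryCommandGraph:
    """Bipartite graph: queries on one side, commands on the other."""

    def __init__(self):
        self.commands_by_query = defaultdict(Counter)
        self.queries_by_command = defaultdict(Counter)

    def add_edge(self, query, command, weight=1):
        """Record that `weight` retrieved pages link the query and command."""
        self.commands_by_query[query][command] += weight
        self.queries_by_command[command][query] += weight

    def search(self, query, top_n=3):
        """Command search: commands most strongly linked to a query."""
        ranked = self.commands_by_query.get(query, Counter())
        return [cmd for cmd, _ in ranked.most_common(top_n)]

    def related_queries(self, command, top_n=3):
        """Reverse direction: queries most strongly linked to a command."""
        ranked = self.queries_by_command.get(command, Counter())
        return [q for q, _ in ranked.most_common(top_n)]

    def similar_commands(self, command, top_n=3):
        """Recommend commands that share queries with the given command."""
        scores = Counter()
        for query, w in self.queries_by_command.get(command, Counter()).items():
            for other, w2 in self.commands_by_query[query].items():
                if other != command:
                    scores[other] += min(w, w2)
        return [cmd for cmd, _ in scores.most_common(top_n)]


# Hypothetical weights, for illustration only.
graph = QueryCommandGraph()
graph.add_edge("convert to black and white", "Desaturate", 5)
graph.add_edge("convert to black and white", "Grayscale", 4)
graph.add_edge("convert to black and white", "Channel Mixer", 2)
graph.add_edge("improve contrast", "Stretch Contrast", 3)
graph.add_edge("improve contrast", "White Balance", 2)
graph.add_edge("draw a circle", "Ellipse Select", 6)
```

Keeping both adjacency maps makes the two directions discussed in the talk equally cheap: `search` goes from a query to commands, and `related_queries` goes from a command back to the queries people type.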
The other thing we can do is go the other way. That was going from queries to commands, but because we have this graph built from all of these queries, we can go in the opposite direction: from a command, what queries are associated with it? Here's something that I'm working on right now.
This is just a mock-up, but suppose this is a tooltip for the Ellipse Select tool in GIMP. Based on the data we have, we can say that Ellipse Select is related to the following searches: "draw a circle", "draw ellipse", "text on circle", [unclear], and you can see what other commands are associated with it. This is generated by the user community, on the fly, so it sort of evolves as people's use of the software evolves. The nice thing about searches is this: think about what you're doing when you type in a query. You're trying to come up with a very concise, information-rich phrase to describe what you're doing or what you're looking for, your information need. So I think these queries are compact and give a pretty good indication of what people are after when they type in the search. This is one way that we're looking at integrating this data into the application to make it actionable. I think that concludes everything I wanted to say today, but I'm more than happy to
take questions.
[Audience member] Aside from all this looking very, very interesting, one factor I wasn't sure you had looked at: when you're phrasing things, and thinking as a developer about using these results to change the phrasing in your application so people can understand it better, have you looked at whether the reason you don't get searches on the existing phrasing is that it's a good phrasing to begin with? You're making most people happy and only an unhappy minority is going out to search, so you might make things worse rather than better by switching.
[Speaker] One thing I probably should have mentioned going into this is that I'd use this the way I would use a pilot study. Many talks about usability studies are about observing people use your software. What you want to do is put people into situations where a lot of people want to perform these tasks, situations where you really get some value from the user evaluation. So when you run this and get the queries back, it's a suggestion to watch somebody perform that task. If they're having trouble, you can try to get some insights from these follow-up observations. But it might turn out that it's not a usability problem, just a very popular task that people want to perform.
[Another speaker] Just to build on that, that's a good question, John. You can look at some of the numbers we have on how often people are querying some of these things, like an average of once every 30 minutes. Some of those are things that really are problems, or things that a fair number of people are searching for. So yes, we're not going to get the stuff where people know how to do it and therefore don't need to search. But as we mentioned earlier, there isn't necessarily a one-size-fits-all interface. Having the ability to search like he showed [unclear], where you have the ability to type in "this is what I want to do" and it comes back with the actual command in the interface, might be a way to bridge the gap between people who use the vocabulary of your application and those who are unfamiliar with it.
[Audience member] I'm just curious about the scraper you're using, whether that's publicly available. For instance, the Digital Methods Initiative [unclear].
[Speaker] That's actually a good question. What I use to grab these thousands of query suggestions is a Perl script that I run, and I don't make it publicly available because, quite frankly, it's in a gray area. We're using the same interface that Firefox and other web browsers use to do query suggestions in the search box in the corner, but I don't want to release it publicly because it is a gray area: we are sending automated queries. What I'm suggesting to you is that you can get a lot of bang for your buck just by typing in queries manually, which is perfectly fine. The way I showed you, you can get a fairly high-level view of how people are using your software by typing in these types of things. This is not to say you couldn't go and write the same script yourself. [unclear]
[Audience member] I have a question about this graph between queries and commands. Do you have a tool to build this relation, or does it have to be done manually?
[Speaker] I have a tool that does it. I don't want to get into the details too much, but basically this graph uses some techniques from the information retrieval literature on question answering. Basically, you build a question answering engine: instead of a search engine, you want to type in a question and get back the name of a command, or whatever it is, as the answer. So I built that engine, performed those queries, and stored the results in a database [unclear]. And it's just a bipartite graph.
[Audience member] [unclear] ... "convert to black and white" and the commands don't have any words in common, so how do you make the link?
[Speaker] Basically, the argument is that these pages act like a Rosetta Stone. A tutorial is telling people how to perform a task in the software, so it has to mention the commands in the document. But in order to be retrieved for that query, it also has to have some words in common with the query. So by performing the search, retrieving the relevant pages, and then extracting the commands from those pages, you link the two vocabularies. Any more questions? [remainder unclear]
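The "Rosetta Stone" linking step can be illustrated with a short sketch that scans retrieved page text for known command names. The command list and page snippets below are made up, and plain case-insensitive string matching is an assumption of this example; the question answering techniques mentioned in the answer are not reproduced here.

```python
import re
from collections import Counter


def build_command_pattern(command_names):
    """Compile one regex that matches any known command name."""
    # Longest names first, so a longer name wins over any shorter name
    # that happens to be its prefix.
    names = sorted(command_names, key=len, reverse=True)
    alternatives = "|".join(re.escape(name) for name in names)
    return re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)


def commands_in_pages(pages, command_names):
    """Count, per command, how many pages mention it at least once."""
    pattern = build_command_pattern(command_names)
    canonical = {name.lower(): name for name in command_names}
    counts = Counter()
    for text in pages:
        mentioned = {canonical[m.lower()] for m in pattern.findall(text)}
        counts.update(mentioned)
    return counts


# Hypothetical command list, as might be pulled from a translation file.
COMMANDS = ["Desaturate", "Grayscale", "Channel Mixer", "Ellipse Select"]

# Made-up snippets standing in for pages retrieved for one query,
# here "convert to black and white".
PAGES = [
    "To make a photo black and white, open Colors and choose Desaturate.",
    "Black and white in GIMP: use the channel mixer, or set the mode to Grayscale.",
    "Black and white tip: Desaturate is the quickest option.",
]

mentions = commands_in_pages(PAGES, COMMANDS)
```

The resulting counts are the kind of edge weights a query-to-command graph needs: the query shares words with the pages, the pages name the commands, and so the two vocabularies get linked even when they have no words in common.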

Metadata

Formal metadata

Title: Quick and Dirty Usability
Subtitle: Leveraging Google Suggest to Instantly Know Your Users
Series title: Libre Graphics Meeting (LGM) 2011
Part: 07
Number of parts: 39
Authors: Fourney, Adam; Terry, Michael; Mann, Richard
License: CC Attribution - ShareAlike 3.0 Unported: You may use, modify, and reproduce, distribute, and make publicly available the work or its content, in unchanged or modified form, for any legal and non-commercial purpose, provided that you credit the author/rights holder in the manner they specify and that you pass on the work or content, including in modified form, only under the terms of this license.
DOI: 10.5446/21716
Publisher: River Valley TV
Release year: 2011
Language: English
Production place: Montreal

Content metadata

Subject area: Computer science
Abstract: Every second of every day, people use Google to troubleshoot problems and to learn how to accomplish their goals. While Google doesn't make its search query logs publicly available, Google Suggest can be used to learn the most popular queries for any software. We systematically mined all of the query suggestions for GIMP, Inkscape, Blender, and Scribus to learn about the primary needs and problems encountered by users of these software applications. As examples, our technique collected ~15,000 common queries for GIMP and ~2500 queries for Inkscape. In this talk, we will present samples of the most common search queries for these applications, and what they suggest about the software user bases and their needs.
Keywords: Libre Graphics Meeting (LGM); Libre and Open Source graphics software
