Look mum no hands! Headless browsing with Google’s Puppeteer

Formal Metadata

Title
Look mum no hands! Headless browsing with Google’s Puppeteer
Title of Series
Number of Parts
53
Author
License
CC Attribution 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Transcript: English (auto-generated)
Hello everyone, welcome to my talk. Thank you for coming. I'm Agata, I'm based in London. I work at Geolytix, and before that I was a researcher at the Centre for Advanced Spatial Analysis at University College London. And I would like to tell you about headless browsing
and Google's Puppeteer package, which I've been using for a year now. And I'll just explain a few use cases that have been quite useful to me.
So under this link in the top left corner, you can find the Dropbox folder with all the scripts I'm about to show you. So you can just download them and view them on your device. Also, if you'd like to try them and you can run Node.js, you can install Puppeteer
with this line, npm install puppeteer, and then you can just easily execute them. So Puppeteer is a Node.js package. It's installed with npm, of course. And for those of you who don't know, Node.js is a JavaScript runtime which runs on the server.
It's open source, of course, and we have thousands, if not hundreds of thousands, maybe even more packages developed by the Node.js community that can do a lot of stuff for you. And Puppeteer is one of those packages. It's developed at Google.
And when it's installed with npm, it downloads its own Chromium browser. And of course, Chromium is built on Blink, a fork of WebKit. So you may be familiar with this page. It's, of course, Google's homepage and it's displayed with Puppeteer's Chromium.
You can see this funny bar saying, controlled by automated test software. And I wanted to start with explaining what a non-headless browser is. So I came up with this simple definition.
Non-headless browsing is viewing pages with styles, resources, and scripting rendered by an internet browser, just as the designers intended. In other words, regular internet. And headless browsing then is this.
And you can tell that I'm a fan of The Matrix. So the definition of headless browsing would be accessing raw HTML content along with interaction functions provided by an actual internet browser. So first I would like to explain how to get this thing.
Oh, before I continue: this is Google's homepage as raw content. So the first script, which you can try, starts with declaring the Puppeteer module. Of course, require('puppeteer').
And you can see comments on each line. First it declares the URL I want to visit, creates a browser, creates a page. The page is just a single tab. Then I'm setting the viewport size, so the size of the page as I expect it.
Then I go to my URL, that is, the Google homepage. Then I log the content and close. That's it. So it's pretty much line by line, like a scenario. And this page object is one of the basic ones.
It represents control over a single tab. It does all the interaction, so it can navigate, type, click, select options from a dropdown. It can execute a page function. And a page function is a function that just takes the page
as the first argument and can do something with it. Then I need to skip many things, because the time is limited, of course. But the most interesting thing about the page is that it can evaluate. And first, it was just something that I needed
to discover on my own, what evaluate is exactly. If you're a developer, it may ring a bell. You may know what evaluation is, more or less. However, this is a screenshot of Google results after typing in Guimarães.
So you will see the Wikipedia link, things to do, and this box that will summarize some basic information on the place. And to evaluate is to understand where certain elements are located on the page. So if I evaluate it with my human senses,
I know that the first link is Wikipedia. It's probably the most accurate since it's first. And there's this nice info box, as it's called, right? So in order to get to this page, I need to open the Google homepage,
find the search box or the Omnibox, type in the phrase I want, Guimarães, wait for results, select the first option, and then I'm redirected to this page. And Puppeteer cannot evaluate it with my senses.
I need to tell it what it is supposed to look for. Therefore, it's using CSS selectors to see where a certain element is located and what it looks like, and it knows about all its properties. So I can actually go to the developer tools in my browser and check what is actually
the unique identifier, as a CSS selector, of this info box. So you can just open developer tools and see what's there, plenty of stuff. And so this is the script that performs the test case that I just told you about.
So I give my selectors to Puppeteer to find the search box, to find the first result, to find the info box. And just as before, I create a browser, open a page, set the viewport. I may want to set the viewport because, in the age of responsive design,
CSS may look different depending on the size of your device screen. And then expose is just something I need to do in order to use console.log, because the page doesn't know about globals like, in JavaScript, window, navigator, console.
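The exposing step could look something like this sketch. page.exposeFunction is part of Puppeteer's API; the name logInfo, the "[page]" prefix, and both helper names are invented here for illustration.

```javascript
// Builds the Node-side callback that the page will be able to call.
// The '[page]' prefix is an arbitrary choice for this sketch.
function makePageLogger(sink = console.log) {
  return (...args) => sink('[page]', ...args);
}

// Registers the callback as window.logInfo inside the page.
// `page` is a Puppeteer Page; 'logInfo' is a hypothetical name.
async function exposeLogger(page, sink) {
  await page.exposeFunction('logInfo', makePageLogger(sink));
}

// After exposeLogger(page), code running in the tab can call it, e.g.:
//   await page.evaluate(() => window.logInfo(document.title));
```

The exposed function runs in Node.js, not in the tab, which is why its output ends up in the terminal rather than in the browser's own console.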
So I use this thing, exposeFunction, to make something available that originally the page cannot see. And then again, I go to the URL, find the search selector, type my search phrase, which is Guimarães, wait for the result, click the first result, wait for navigation,
and then the evaluation takes place. So the evaluate function doesn't have to take any arguments, but here there is a selector argument. So find this element, which is block, whatever, span, and get me the text content if you can find it.
Print it, like log info, print it to the console, and close. And if you can get the scripts, you have everything up and running. If you type in Google search, it should show up.
So if this script works as designed and expected, this is what you should see printed out. Guimarães, the city in northern Portugal, and so forth. So it's the text content of this single infobox element. And page can do plenty of stuff.
So if you're familiar with jQuery, if you've ever used jQuery selectors, it's exactly the same. Tag name, like div, table, or span, class with a dot, ID with a hash. The first set of functions, the dollar ones, $ and $$,
it's just selection. So select elements, and we can do something with them, either with one or with an array. $eval is like evaluate; however, it's the evaluate function just on this particular element, so it's a bit shorter. select selects from a dropdown, and a custom page function
acts in the context of the page. That is, we can interact with the DOM itself. And you can take screenshots, create PDFs, or here in the corner, it's this $eval example. And it's from real life.
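As a rough illustration of the selection helpers mentioned here, the sketch below calls each of them once. The methods ($, $$, $eval, $$eval, select) are Puppeteer's; every selector and the option value 'pt' are placeholders, and selectorTour is a made-up name.

```javascript
// Page functions: these run inside the tab and receive DOM elements.
const textContentOf = (el) => el.textContent.trim();
const hrefsOf = (anchors) => anchors.map((a) => a.href);

// `page` is a Puppeteer Page that has already navigated somewhere.
// $eval and select throw if their selector matches nothing.
async function selectorTour(page) {
  const heading = await page.$('h1');                   // first match as a handle, or null
  const rows = await page.$$('table tr');               // all matches as an array of handles
  const title = await page.$eval('h1', textContentOf);  // evaluate on the first match
  const links = await page.$$eval('a', hrefsOf);        // evaluate on all matches at once
  await page.select('select#country', 'pt');            // pick an <option> by its value
  return { hasHeading: heading !== null, rowCount: rows.length, title, links };
}
```

$eval saves one step compared with evaluate: the element lookup and the function call happen in a single round trip to the browser.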
So the third use case, the third exercise I would like to tell you about, is building the DOM with Puppeteer. DOM as in document object model. So this is the original use case I needed
in order to solve a certain problem with our web app, which we develop at Geolytix. You can take a look at this link, https://geolytics.xyz-open. It will be demonstrated at FOSS4G in Dar es Salaam, if you're going. And yeah, so going back to the app.
So basically, imagine you have your mapping app and you allow your users to select, pan the map, zoom, apply filters, apply styling, and you would like to provide them
a nice summary of the insight they just had. So in other words, put it in a nice PDF document, ready for print, ready for handing out, so that they can have an output of what they just did with your tool. So there are certain, I mean, there were certain challenges with this task.
So basically, it's custom content every time. They may select anything, click anywhere, set the map wherever they want. The report is requested on the client side, so the request is triggered by the user.
The PDF document is created on the server side and then sent to the browser when ready for download. And it has to retain layout and CSS styling and it has to always look good. So for us, Puppeteer was the way to go.
And I will show you a script that actually creates a PDF out of the custom content. So it's the third one, it's called Create Page. So PDF creation,
it only works in headless mode; screenshots work either way. So if you just use it with headless: false, then the PDF step will just crash. So just like before, line by line, I create the browser, create the page, and this page by default, it just has head and body tags, it's empty.
I'm setting the viewport. Then another thing I do is setting the HTML content. It takes a string as an argument, so it can be either a literal string or the output of any function. If you use, let's say, RenderJS like we do,
it just returns HTML as a string. The next thing is adding a style tag. So here it's just a CSS string, but it could be a URL to an actual style sheet.
And I can also add a script tag, so a function, what you want the page to do. Here it's just a string, but it could also be the URL to a proper script. Then, if it's headless mode, create the PDF named hello.
If it's not headless, just let it run. So, I think I have one slide missing. I do, no I don't, sorry. So yeah, this is the final thing I wanted to show you.
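A sketch of the PDF script along the lines just described. The Puppeteer calls (setContent, addStyleTag, addScriptTag, pdf) are real API methods; the markup, the CSS, the viewport size, and the function names are invented for illustration.

```javascript
// Builds the document body. In a real app this string could come from a
// template engine; note that nothing is HTML-escaped in this sketch.
function buildHtml(title, paragraphs) {
  const body = paragraphs.map((p) => `<p>${p}</p>`).join('');
  return `<h1>${title}</h1>${body}`;
}

async function createPdf(html, pdfPath, headless = true) {
  const puppeteer = require('puppeteer');
  const browser = await puppeteer.launch({ headless });
  const page = await browser.newPage();   // starts out as an empty document
  await page.setViewport({ width: 1240, height: 1754 });
  await page.setContent(html);
  // Both tags accept either { content: '...' } or { url: '...' }.
  await page.addStyleTag({ content: 'h1 { color: #336699; font-family: sans-serif; }' });
  await page.addScriptTag({ content: 'document.title = "hello";' });
  if (headless) {
    await page.pdf({ path: pdfPath, format: 'A4', printBackground: true });
    await browser.close();
  }
  // When not headless, the window is left open so the page can be inspected.
}

// Example call (writes hello.pdf to the working directory):
//   createPdf(buildHtml('Report', ['First insight', 'Second insight']), 'hello.pdf');
```

Running the same function with headless set to false skips the PDF step, which is handy for checking the layout by eye before generating the file.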
Of course, this code in our framework looks way more complex, because we're tracking what users are doing with URL hooks, so whenever they click somewhere or select something, we have it all added as parameters
to our URL, and then the script that we insert just takes the parameters from the URL hooks and applies everything to the document before it's created. And while the report is being created, we have a sort of request from the front end
asking: is it ready? No. Is it ready? No. Okay, is it ready? Okay, send it back. And then the user can download it, and we're pretty happy with that. So, I would like to end with a question. What would you use Puppeteer for?
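The two ideas in this passage, reading the user's state from URL parameters and repeatedly asking whether the report is ready, could be sketched as below. The talk does not show the real URL scheme, so the parameter names (lat, lng, zoom, layers), the defaults, and both function names are assumptions.

```javascript
// Reads map state from a URL's query string.
// Parameter names are hypothetical; a missing number becomes 0 here.
function parseMapState(url) {
  const params = new URL(url).searchParams;
  return {
    lat: Number(params.get('lat')),
    lng: Number(params.get('lng')),
    zoom: Number(params.get('zoom')),
    layers: (params.get('layers') || '').split(',').filter(Boolean),
  };
}

// Calls `check` repeatedly until it returns something truthy,
// pausing `intervalMs` between attempts.
async function pollUntilReady(check, intervalMs = 1000, maxAttempts = 30) {
  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    const result = await check();
    if (result) return result;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Not ready after ${maxAttempts} attempts`);
}
```

On the front end, check would typically be a small request to the server that answers with the download link once the PDF exists.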
Excuse me. Sorry, could you repeat the question?
Yeah, and did you answer, or? Ah, kiosk. Kiosk mode.
Like when you enter somewhere in a hallway of a company and then you see a display, and probably you can now automate changing web pages instead of building a web application that has to do it internally in the browser.
So basically you're manipulating the browser, right? From an external, okay. But I think she wanted to have like a use case of a real application. You are talking about how you do it technically, but what about the domain?
Which sort of kiosk, for what? For news, or for a news aggregator? Let's say for presenting projects. Okay, but I think you could use it, for instance, if there is a page that you want to track
whose content changes dynamically. For instance, let's say, weather data. Yes, I actually have some web scraping experience with Puppeteer, and I had to pull restaurants
from the Uber Eats website. Just to explain: on the Uber Eats website, the first thing you do is type in the delivery address where you want your food to be delivered,
and then the restaurants are listed. However, the viewport has limited size, and there are only I think about nine or 10 displayed at once. So as you scroll down, the previous ones disappear, and more appear, but you never have
more than 12 on the screen. And in order to overcome this, you just need to write a function that will focus on an element; focusing, like selecting, pretty much brings the element into view. So Puppeteer will just scroll down to this element,
then wait for others to load, check if there is load button available, click it. If it disappears, it means you're at the end. If it doesn't disappear, more are loaded. So you need to check, are there more? Okay, scroll to the last one, check for load more. If it's not there, go on.
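The scroll-and-load loop described here might look like the following sketch. It is not the speaker's scraper: both selectors, the one-second pauses, and the round limit are placeholders, and scrolling is done with scrollIntoView rather than focus.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Adds names to the set and reports how many were not seen before.
function mergeNew(seen, names) {
  let added = 0;
  for (const name of names) {
    if (!seen.has(name)) {
      seen.add(name);
      added += 1;
    }
  }
  return added;
}

// `page` is a Puppeteer Page already showing the listing.
// Earlier items vanish as you scroll, so names are accumulated in a Set.
async function collectAll(page, itemSel = '.restaurant-name', moreSel = 'button.load-more', maxRounds = 200) {
  const seen = new Set();
  for (let round = 0; round < maxRounds; round += 1) {
    const names = await page.$$eval(itemSel, (els) => els.map((el) => el.textContent.trim()));
    const added = mergeNew(seen, names);
    const items = await page.$$(itemSel);
    if (items.length > 0) {
      // Scroll the last visible item into view so the next batch loads.
      await page.evaluate((el) => el.scrollIntoView(), items[items.length - 1]);
    }
    await sleep(1000);
    const more = await page.$(moreSel);
    if (more) {
      await more.click();
      await sleep(1000);
    } else if (added === 0) {
      break; // no button and nothing new: the end of the list
    }
  }
  return [...seen];
}
```

The round limit is only a safety net against a button that never disappears; the normal exit is the break.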
So you do this programmatically in JavaScript? Yes, so all the stack is JavaScript, Node.js, so Puppeteer fits perfectly with us. And aside from the PDF thing, I have also used Puppeteer for creating custom datasets
from dynamic or interactive websites which require you to give a delivery address or some details so that something can be returned. So I have a question. So is this running on server side? Yes. It's running on server side. One more thing, I forgot.
So pretty much everything in Puppeteer returns a promise. So if you're familiar with JavaScript, you might know that sometimes it's asynchronous. I mean, Node.js pretty much all the time is asynchronous. And you need to wait for things to complete. You need to wait for them to resolve
because if you send a request for data, there might be nothing returned until you wait for this. So hence in my code, async/await everywhere, because pretty much everything has to resolve to complete. In other words, the rest of the script must wait
for the previous part to execute and resolve. And async/await is something relatively new to Node.js. Before that, we needed an extra package that would handle promises. Okay, any more questions for Agata?
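To make the point about waiting concrete, here is the same two-step sequence written once with promise chaining and once with async/await. The delay helper and the step names are stand-ins for real Puppeteer calls, chosen only for this sketch.

```javascript
// Resolves with `value` after `ms` milliseconds; stands in for any async call.
const delay = (ms, value) =>
  new Promise((resolve) => setTimeout(() => resolve(value), ms));

// Promise-chain style: each step is nested in the previous step's callback.
function chained() {
  return delay(20, 'open page').then((first) =>
    delay(10, 'read content').then((second) => [first, second])
  );
}

// async/await style: the second step does not start until the first resolves.
async function awaited() {
  const first = await delay(20, 'open page');
  const second = await delay(10, 'read content');
  return [first, second];
}
```

Both functions resolve with the steps in the order they were written, even though the second delay is shorter, because each step waits for the one before it.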
I have one question. This PDF generation, so you're generating a PDF with client quality, client DPI. Is it intended? Because perhaps you would need something like 300 DPI
or something like that. Something genuinely generated on the server, and not a server process that mimics the client cycle. You can set it all. You can set the resolution in DPI, the size in centimeters, whatever format you like, because if you have it on the server side,
you have control over this. You're not worried about your client using Chrome, Firefox or, I don't know, Internet Explorer, because it's just on your side. You can control it, yes.
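One way the size and resolution settings from this answer could be expressed is sketched below. page.pdf accepts width and height as strings with units, and setViewport accepts a deviceScaleFactor; the helper names and the mapping from DPI to scale factor (relative to the 96 dpi CSS reference) are assumptions of this sketch, not something stated in the talk.

```javascript
// Options for page.pdf() from a paper size given in centimetres.
// Puppeteer accepts width/height as strings with units (px, in, cm, mm).
function pdfOptionsCm(widthCm, heightCm, marginCm = 1) {
  const margin = `${marginCm}cm`;
  return {
    width: `${widthCm}cm`,
    height: `${heightCm}cm`,
    margin: { top: margin, right: margin, bottom: margin, left: margin },
    printBackground: true,
  };
}

// For raster output such as screenshots, pixel density comes from the viewport.
// Assumption: one CSS pixel is 1/96 inch, so 300 dpi is a factor of 300 / 96.
function viewportForDpi(widthPx, heightPx, dpi) {
  return { width: widthPx, height: heightPx, deviceScaleFactor: dpi / 96 };
}

// Possible use with an existing Puppeteer `page`:
//   await page.setViewport(viewportForDpi(800, 600, 300));
//   await page.pdf({ path: 'report.pdf', ...pdfOptionsCm(21, 29.7) });
```

Text and vector shapes in a PDF are resolution independent, so the scale factor mostly matters for screenshots and for raster content rendered into the page.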
Any more questions for Agata? Okay, so thank you, thank you very much.
We have a couple of minutes more for.