Finding Bad Needles in Worldwide Haystacks

Video in TIB AV-Portal: Finding Bad Needles in Worldwide Haystacks

Formal Metadata

Finding Bad Needles in Worldwide Haystacks
Experience of using Go for a large-scale web security scanner
Alternative Title
Go - Web Security Scanner
Title of Series
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Release Date
Production Year

Content Metadata

Hello, my name is [name unintelligible]. I work for Yahoo, in the Paranoids, the security team. I have been at Yahoo for a long time, since the beginning of the century, working first on the development of billing systems and infrastructure and, in the last years, on security. My programming biography follows the track of both the industry and Yahoo development: when I came we were still doing web applications in C++, then came a lot of Perl, PHP, Python and Node.js, and in the last months Go.

This talk is about web application security in Go. I have not seen a lot of information or shared experience about Go in the security space. There are some pockets, for example encryption, that use Go, but traditionally web application security people use a lot of Python and also JavaScript. I would like to help change this and get a little bit more Go, because I think it is a great language for security. I will talk about the components of the scanner: not the whole thing, because it is a work in progress, but some of the pieces that I was able to complete. The first is Webseclab, a kind of toolkit for creating test cases for the scanner. Then comes contextdetect, a piece of the scanner that I have been writing for the last months with good results. I will also talk about some of the lessons and the next plans.

The web application security domain, I think, has two parts. One is like a specialized QA: you do a lot of checks on the websites and products, and you have to look at them very carefully, with a lot of attention to detail, trying to imagine all the malicious and unexpected ways that your application can be used.
The other part is enabling: helping developers. You are both the police and a kind of sales, so you act as a consultant, training and explaining. For that, automation is paramount at scale. There are a few of us Paranoids and hundreds or thousands of developers, so with that ratio you need automation. You also have to build a lot of tools, and we found out that you really need custom tools. It is not enough to take something that is available on the market or in open source. You end up making your own, because things change and you need to adapt. So we write our own scanners. My specialty is to maintain and develop an XSS scanner, the part of the scanning system that is targeted at XSS and related issues. It takes a lot of time to create it and to be able to scan all the sites all the time. That is our ambition: rescanning all the production sites as well as all the sites that are coming from the CD pipeline.

So why Go? I really got to like Go for a lot of reasons, but especially for tool development. The advantages: it is easy to deploy, with static binaries, so you do not have dependency hell. You usually worry less about runtime errors such as undefined functions or undefined symbols. You have the whole tool chain: the compiler, dependencies, documentation. And the test facilities are right there: the regular file and the _test.go file are placed together, which stimulates people to write tests.
When you look at the code and see that there are no _test files, then there is something wrong. Especially for web security tools, the advantages are the great HTTP stack and the whole net library, batteries included. Of course concurrency and performance are important when you have to scan many thousands of sites and do it in a reasonable time. What I like for general tools is that it is easy to make a web interface. Usually you write a tool with a command line interface, but if you want to give people something nicer, it is almost trivial to put a web wrapper around the functionality. Especially for tools, Go provides a feature that I call the abstraction elevator: you can change the level of abstraction. You can work at a relatively high level, almost like in scripting languages, but if you need to go further down, you can go all the way to the packet level in order to sniff the raw packets and modify them, if that time comes.

I will not talk a lot about this, but the first thing I did in Go was a tools project, a dead link crawler. That was my learning project, a free time project, to find out whether Go would be appropriate for web security scanning eventually. I took a very simple task, just finding dead links across a lot of sites. It worked very well: it was able to cope with thousands of sites, and I ran it against a couple of domains. It found a couple of dozen bad links, some of them maybe two clicks from the home pages, and it is hard to imagine how they had not been found before. I reported them and they were fixed.
That means a better user experience, and some people told me it was a great job, which is good when you are trying to bootstrap a new language at a corporation. The crawler was running stably with about a hundred concurrent jobs for a few hours. Memory consumption peaked around a gigabyte and was slowly increasing.
Because it is a crawler, you have to keep a map of the visited links, and that map grows. But I was surprised that it was quite solid and stable, and it found good stuff. Some of the beginner lessons: you find out fast that you need to close the response body, otherwise you run out of file descriptors and the limit needs to be increased. I also found that there is a sweet spot for the number of parallel goroutines: a hundred ran fine, but when I tried about a thousand I was getting DNS errors, which may depend on how the local DNS is set up.

So let me talk about Webseclab. The topic came up when we were working on the security scanner. We needed to put together tests quickly and to have a test suite for the scanner with all the corner cases, because we cannot wait until an obscure bug comes up the next time; you have to have them all saved. It took several iterations, not just by myself. My teammates used some ad hoc PHP scripts, then there was an application that mimicked a small copy of Twitter but was insecure, and then there was a version similar to what I will show, but in Node. When I discovered Go, I said: that is the right thing, that is what I want. So I wrote a set of tests in Go. It is a site for testing web security scanners such as XSS scanners. It is similar to Google's open source Firing Range, which is a similar idea but in Java. I also think about it not just as a site that has a few pages, but as a toolkit for constructing new tests, so that it is easy to add a new test when you need to explore, show or replicate something.
Webseclab can be used to replicate real cases, for example from a bug bounty program or from staging and production. When we find something, I try to put a model of it, distilled to the minimal case, into Webseclab so that I can keep it for the scanner. The requirements: many cases are the standard one, where you get an input, the input is not filtered at all or is filtered incorrectly, and it goes into the output. So the input is unfiltered, which is the bad part, but the output has to be tightly controlled. We cannot just put unfiltered output everywhere, because then it would spill into all the other tests; for test isolation they have to be separated. Some of the tests are special: for example HTTP response splitting and header injection are not just bad input filtering.

The convention is that the input comes in the same CGI variable, so that it stays the same everywhere except in the special cases, and there are filters that convert the input value. The naming convention tells real issues, the bad news, apart from false positives, whose names end with an underscore suffix. You have to check the false positives too, so that the scanner is not noisy. If it produces false positives, false alarms, when you scan at scale, they will kill the output and drown the signal in noise, and users will just be put off.

When I was done, I realized that it is a kind of special-purpose web templating framework. It has a place where it picks up the templates. It uses Go templates, and of course it uses text templates rather than HTML templates, so that we can produce the bad output. It also has a library of filters with options, for example one that blocks tags.
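The remark about text templates can be illustrated with the two standard library packages. The sketch below renders the same template once with `text/template`, which passes the input through unchanged, and once with `html/template`, which escapes it for the HTML context. The template text, the `In` field and the `renderBoth` helper are illustrative assumptions, not Webseclab code.

```go
package main

import (
	"bytes"
	"fmt"
	htmltemplate "html/template"
	texttemplate "text/template"
)

// page is a deliberately naive template that echoes the input into HTML.
const page = `<p>Hello, {{.In}}</p>`

// renderBoth renders page with text/template (no escaping) and with
// html/template (contextual escaping) and returns both results.
func renderBoth(in string) (raw, escaped string, err error) {
	data := struct{ In string }{In: in}
	var buf bytes.Buffer

	tt, err := texttemplate.New("raw").Parse(page)
	if err != nil {
		return "", "", err
	}
	if err = tt.Execute(&buf, data); err != nil {
		return "", "", err
	}
	raw = buf.String()

	buf.Reset()
	ht, err := htmltemplate.New("escaped").Parse(page)
	if err != nil {
		return "", "", err
	}
	if err = ht.Execute(&buf, data); err != nil {
		return "", "", err
	}
	escaped = buf.String()
	return raw, escaped, nil
}

func main() {
	raw, escaped, err := renderBoth("<script>alert(1)</script>")
	if err != nil {
		fmt.Println("render error:", err)
		return
	}
	fmt.Println("text/template:", raw)
	fmt.Println("html/template:", escaped)
}
```

A deliberately vulnerable test site needs the first behaviour, which is why `html/template` would get in the way there even though it is the right default for production pages.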
Other filter options remove all tag characters, or double quotes, and so on, and decide what goes through and what does not. When I was first writing this, I tried to do it all in functions: for each combination of filters I wrote a separate function, so I had functions for double quotes off, spaces off, no tags, and so on. That is a combinatorial explosion, and there was a lot of code duplication. So I refactored it around a Transform function that combines the filters and applies them in sequence, so that you just specify the set of filters and get the corresponding output. Here is an example of a custom filter. It does not come up often, but there was a bug where an escaped backslash-x sequence would be converted to the normal character, so this function just covers that corner case.

This can be used for replicating bugs. If we find a bug in production, we can just paste the HTML into a template and have a new test case. It also lets us build regression tests: if we grab HTML from production that the scanner handled badly, then we have a test with the real HTML. For example, with issues from a bug bounty program, the developers are sometimes very fast to fix them, so they fix them before I get to them, before I wake up. I used to have to ask them: could you please put it back to the vulnerable state so that I can check it with my scanner? That was an awkward thing to ask, because developers have more important things to do. So now we do it this way: when our incident response team triages those issues, they save the HTML output of the page, and then even if the issue is already fixed, I grab this dump, convert it to a Webseclab test, and I can play with it as much as I need.
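The move from one function per filter combination to a single combining function can be sketched as below. Only the idea of a Transform function that applies filters in sequence comes from the talk; the `Filter` type, the filter names and the exact signature are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// Filter converts an input value into the value that will be echoed.
type Filter func(string) string

// Transform applies the given filters to in, one after another.
func Transform(in string, filters ...Filter) string {
	for _, f := range filters {
		in = f(in)
	}
	return in
}

// TagCharsOff removes the angle brackets that open and close tags.
func TagCharsOff(s string) string {
	return strings.NewReplacer("<", "", ">", "").Replace(s)
}

// QuotesOff removes double quotes.
func QuotesOff(s string) string {
	return strings.ReplaceAll(s, `"`, "")
}

var backslashX = regexp.MustCompile(`\\x([0-9a-fA-F]{2})`)

// UnescapeBackslashX models the corner case from the talk: sequences such
// as \x3c are converted back to the plain character.
func UnescapeBackslashX(s string) string {
	return backslashX.ReplaceAllStringFunc(s, func(m string) string {
		n, err := strconv.ParseUint(m[2:], 16, 8)
		if err != nil {
			return m
		}
		return string(rune(n))
	})
}

func main() {
	// Two simple filters combined: tags and quotes are stripped.
	fmt.Println(Transform(`<b>"hi"</b>`, TagCharsOff, QuotesOff))
	// Filtering before unescaping lets the tag characters slip through.
	fmt.Println(Transform(`\x3cb\x3e`, TagCharsOff, UnescapeBackslashX))
}
```

With this shape, a new test case only lists the filters it needs, and the order of the list is itself part of the test: the second call in `main` shows how filtering first and unescaping afterwards reintroduces the tag characters.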
So let's see a demo. Here is a bunch of the tests. The classic one is the case with no escaping at all, where the input goes straight into the page. There is also the POST case, which is special: in most cases GET is sufficient to check the output handling, but you need to check POST as well. Another case puts the input into an image source attribute, so the injection first needs to close the attribute. We can take a look at the source: it takes the value of the input and puts it into the page. Let's go to this one, JavaScript injection. We can see in the source that the input lands inside the JavaScript block, and the injection breaks out of it and injects the code.
OK, so Webseclab is now public, with a description and all the tests. I hope it will be useful for web security consultants and for people who not only develop scanners but also do pen tests; they can copy or make tests from Webseclab.

The next piece is contextdetect. In the XSS scanner it is the component responsible for the analysis of the output. Usually scanners send some probe strings to the site, the site returns its output, and then you need to analyze it. Previously we had just lots of regular expressions and custom scripting, which were difficult to understand and quite fragile, and that leads to lots of false positives and false negatives. Then we modified it to use Go and the HTML5 parser for proper analysis. The HTML5 parser has a tokenizer and a high-level parser interface, and both are used in the package. It identifies the context of the injections, and this way we can differentiate between injections: they may be innocuous, for example inside a properly quoted string, or they may be dangerous in a different context. We also use the JavaScript parser by Robert Krimen to check whether injections break the JavaScript or not, as in the previous example of an injection into JavaScript. Usually when you send the probes you are not sending just alphanumerics but some special characters, and if they get injected inside JavaScript they usually break the JavaScript block and make it unparseable. We use this for detection: I almost simulate the browser and see where the injection really lands.
If the injection breaks the JavaScript, then it is likely to be exploitable, because then you can put in your own stuff, your own script; you are basically breaking out of the sandbox. The normal rule is that the input should not leave the context it is in when it is output. Here is how it is done: there is a function that checks whether the JavaScript is escaped. It needs some massaging because of comments, so it removes them first, then calls ParseFile from the parser and sees whether the script parses. If the script was broken, that is the injection. It can also check the other way: if an innocuous string does not break the block but something with special characters does, then the injection must be bad.

Some of the results: the false-positive rate in this category went down to almost zero, and that was a big relief for us Paranoids, who have to go through the output, and also for the developers who use it. It still finds real issues, which I replicate with Webseclab. The output also improved, because we can take the relevant parts from the JavaScript and show a context larger than just the breakage. contextdetect is now part of the CD pipeline and scans all the sites that come from developers, so Go has a role to play in security at Yahoo. For this kind of system you also need a distributed system: there is a web services interface, and there are multiple redundant workers that actually do the scanning.
A Go web server provides the front end: it takes a request and multiplexes it to one of the workers. We also use go test. I found that it is convenient for functional tests as well, so we run go test and use it as a building block for the functional testing, and it works quite well.

Some of the lessons concern dependency management in a corporate environment, where you have to pin the dependencies and cannot just leave it to the outside world.
People who depend on my code want it not to have unmanaged dependencies; otherwise you have to trust that the external repositories will not go down. Go has this ideal that there should be no config files, but today, if you want to pin dependencies, you have to compromise on the no-config-files ideal. One of the things we explored is using the Android repo tool, a tool for managing a lot of repository dependencies. It requires a manifest, default.xml, that specifies the sources, the paths and the branches of the whole project, and it can be used to lay out the build space. We have used it for a few months, and there are some advantages. For example, default.xml gives at-a-glance documentation and control of all the dependencies in one place. It also integrates with the CI system: a nice feature is that if a dependency changes in default.xml, it triggers a rebuild. It was not made for Go, it was made for Android, but we use it and it turned out to be very useful. I like default.xml because I do not have to invent my own manifest format, and it is something that some developers know already. We are thinking about generating it automatically, so that you change only the versions you want to pin for certain libraries and the system does the rest.

The next steps: expand contextdetect and convert the scanner fully to Go, since currently there are some legacy pieces that interact with the Go code, and reach out to the web security community.
I would like to promote Go among security people, and I would also like to come back to the dead link crawler. In conclusion, Go is an excellent foundation for security tools, and it is also nice and stable for large-scale systems. Our scanning system has been running for weeks and it is surprisingly stable. Sometimes people say that Go is boring. I think operations is one area where Go being boring is great. There is a lot of room for innovation and potential for collaboration in the security Go community, so if you are interested in doing this, talk to me. Thank you.

Questions?

[Audience questions and the first answers are largely unintelligible in the recording.]
[Answer, partly unintelligible:] We just keep this as a standardized test. When checking that the JavaScript is escaped, I think you could even run it on a JavaScript engine such as Node to see whether the JavaScript is malformed. The HTML5 parser and these things are relatively standard, so I think it does not depend so much on the browser. Of course, for scanning there are two levels: one is the simple request and response, and the other is browser automation, and that is a different story. Thank you very much.