How SAP is using Python to test its database SAP HANA

Video in TIB AV-Portal: How SAP is using Python to test its database SAP HANA

Formal Metadata

How SAP is using Python to test its database SAP HANA
Title of Series
CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.
Release Date

Content Metadata

Subject Area
How SAP is using Python to test its database SAP HANA [EuroPython 2017 - Talk - 2017-07-11 - PyCharm Room] [Rimini, Italy] SAP operates one of the largest test infrastructures to test its in-memory database SAP HANA. The infrastructure provides different services like continuous integration, code coverage and code linting for a huge C++ project with Python test coding. These services are essential for the development teams and quality specialists. Without these services, developing and shipping new SAP HANA versions wouldn’t be possible. In 2010, we started with a single Jenkins master with ten nodes. But to keep our testing time acceptable for the growing number of developers we had to scale up, and that led to multiple different scaling challenges. The current test infrastructure is powered by more than a thousand physical servers. Scaling the infrastructure was only possible with custom optimizations like improved scheduling, expressive test configuration and robust tooling implemented in our favorite language, Python. With the flexibility and power of Python it’s possible for developers to implement complex test scenarios to verify features and mitigate regressions. On the infrastructure side, it has been easier to extend, optimize and adapt the infrastructure for new requirements like different CPU architectures and newer operating system versions. This talk provides insights and stories about how we scaled and improved our test infrastructure and how new technologies like Linux containers can improve automated testing and software quality assurance.
First of all, a word about me: I'm working in the QA department of our in-memory database SAP HANA. In this talk I will give you a very, very short introduction to HANA itself, because it's important for the rest of the talk, and then show how we are using Python to test it. So let's get started with SAP HANA.
SAP HANA is a relatively new product of SAP. It's an in-memory database, which means we have storage engines, a column store and a row store, which are really optimized to run in the main memory of a system. With the optimized column store it's much faster to perform queries, and it fits very well for online analytical processing, so you can do analytics, but you can also use it for transaction processing. A typical database is installed maybe on just one system, but we can also build a scale-out system, where you have multiple nodes and connect them into one big database, and you can distribute tables across the systems. For you as Python developers that's actually not so interesting, because your interface is most of the time just SQL, so you don't need the very detailed insights.

SAP HANA itself is written in C++, so not so interesting for Python, but we do have a lot of management tools and commands which are written entirely in Python, and one part of the HANA distribution itself is also a Python interpreter. As I said, HANA is an in-memory database, which means you need a lot of memory. You can start small: for example, the small Express Edition fits on a notebook with just 16 gigabytes of memory. But you can also scale out and have a system with something like 48 terabytes of memory, and there really are customers running such systems, which is still very impressive.

If you take a look at how you can use SAP HANA: you can connect your Python application to SAP HANA, which is very straightforward and very simple. We have had a Python client basically since the beginning of time, and with the next service package of HANA the Python client will be fully supported, with support for Python 2.7 and Python 3. You interact with it through the Python DB-API: you open a connection, get a cursor, run some SQL, and fetch the results. But most Python developers don't write SQL anymore, and personally I can totally understand this. So we also have some open source projects to integrate HANA. There is a dialect for SQLAlchemy, so you can run SQLAlchemy very easily with HANA, and it is a lot more fun than writing SQL by hand. There is also another open source project to use SAP HANA as a database backend.
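The connect, cursor, execute, fetch flow mentioned in the talk can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in `sqlite3` module as a stand-in so that it runs anywhere, and the `releases` table and its rows are made up. A HANA client that follows the DB-API would offer the same cursor calls, although its `connect()` arguments and its parameter placeholder style may differ.

```python
import sqlite3


def fetch_release_names(connection, minimum_major):
    """Run a parameterized query through a DB-API cursor and return one column."""
    cursor = connection.cursor()
    try:
        # sqlite3 uses "?" placeholders; other DB-API drivers may use another style.
        cursor.execute(
            "SELECT name FROM releases WHERE major >= ? ORDER BY name",
            (minimum_major,),
        )
        return [row[0] for row in cursor.fetchall()]
    finally:
        cursor.close()


# Stand-in database: an in-memory SQLite instance with a hypothetical table.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE releases (name TEXT, major INTEGER)")
connection.executemany(
    "INSERT INTO releases VALUES (?, ?)",
    [("alpha", 1), ("beta", 2), ("gamma", 3)],
)
connection.commit()

names = fetch_release_names(connection, minimum_major=2)
connection.close()
```

Because the function only relies on `cursor()`, `execute()` and `fetchall()`, the same code shape should carry over to any driver that implements the DB-API.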
Now a little bit about testing. In the end, testing a database is not so different from testing other software, because a database is just software. So we have a very typical test pyramid. At the bottom we have unit and component tests written in C++, because it's the main language of HANA and the main language of our developers. But if you start writing some kind of integration tests or very complex end-to-end tests, C++ doesn't fit very well anymore, and most of these tests are then written in Python. One disadvantage is that more complex tests, more integration tests, are a little bit slower and more expensive; unit tests are typically much faster and much, much cheaper.
Let's take a look at our development process and how we integrate testing and quality assurance into it. A developer pushes a change into our Gerrit. Gerrit is a very famous Git code review system, and the whole of HANA lives in one big Git repository. After the push from the developer we trigger some quality assurance processes before the commit even reaches any kind of branch, so no commit reaches a branch without being tested first. After the build and the test process are complete, along with some other quality assurance steps like static code analysis, style checks and sanitizers for the C++ coding, the change is reviewed by a dedicated reviewer, who looks at your test results and decides: OK, it's good enough, or: please try again and fix this failure. After the review your change gets merged into the repository. To build something like this is, I mean, a very straightforward continuous integration landscape, and it's also very common to do this.
So in 2010 it was a very common landscape. Developers push to Gerrit, Gerrit notifies Jenkins about the new change, Jenkins looks into its configuration for a matching job, and then it triggers this job and places it in a queue. If there is a node with available resources, that node takes the job from the queue and executes it. Let's take a deeper look at what such a job looks like. Such a job is basically divided into four parts: you check out the latest source code, build the database from the source, set up a complete database, and then run the tests. It's straightforward and looks like what basically everyone does in continuous integration.

One special thing was already included in the 2010 version: we have a central database, because we are the database department, and we store all of our test results in that central database. Developers can afterwards take a look at this data via a web UI, and we can still access this old data; I can still review test results from that time. It's actually not so interesting anymore, but it's possible.

If you read the description of this talk, it is about scaling and how we scaled our test infrastructure. So the main question is: why should I talk about scaling, when so far this looks like a very typical continuous integration setup? Let me try to prove that maybe we have some experience with scale.
Right now our system is serving around 600 developers, distributed across several locations. These developers push around 700 commits every day into the repository. We have around [unintelligible] million lines of Python test coding in our repository, and every day we perform around 36,000 hours of testing. That is basically around four years, so every day we run four years of tests. We do this on a landscape of around 1,300 Jenkins nodes, so it's actually not so small. And I'm not talking about small machines; these are bare-metal servers in most cases. We are using around 400 terabytes of memory for testing. That's just for testing; we have much bigger systems around, but we are still using around 400 terabytes for testing. So let's talk a little bit about how we did this scaling.
I picked four topics for scaling. One interesting part is test runtime and how we optimized it. Then there is test scheduling. Another thing is artifacts, which is also a very interesting area because you move so much data around. And you have to provide a very stable test environment, especially if you run tests in parallel. Let's look first at test runtime.
This slide shows the runtime of a Jenkins job of more than 8 hours, and you can see the job doesn't even fit on the slide anymore. So you push, and you wait more than 8 hours until your test results are available and you know whether everything works. That's not so great, actually. We started to optimize this by applying a very common pattern from computer science, divide and conquer; in our area it's basically divide and test.

The first thing we did was separate the test job from the build job, which means we can now run the build on a different machine, maybe one which is optimized for building our product, and run the tests on a machine which is optimized for testing. A very common example is that build machines have more CPUs and test machines have more memory. Actually this split at first increased the time, because now there is communication time: you have to transfer the artifacts across the network to the different hosts. But this was a prerequisite. Now we can also split the test block into smaller test blocks, which also has the benefit that if one test block fails, you can reschedule just that block. With this we can keep the time until review at around 7 to 8 hours, which is actually good enough for us.
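Splitting one long test run into several blocks of similar duration can be sketched with a greedy "longest test first" packing. This is an assumption about one reasonable way to do it, not a description of the scripts SAP uses; the test names and runtimes (in minutes) are invented.

```python
import heapq


def split_into_blocks(test_runtimes, block_count):
    """Pack tests into block_count blocks so the block totals stay balanced.

    Greedy heuristic: take tests from longest to shortest and always add the
    next test to the block that currently has the smallest total runtime.
    """
    if block_count < 1:
        raise ValueError("block_count must be at least 1")
    # Heap entries are (total runtime so far, block index).
    heap = [(0, index) for index in range(block_count)]
    heapq.heapify(heap)
    blocks = [[] for _ in range(block_count)]
    ordered = sorted(test_runtimes.items(), key=lambda item: (-item[1], item[0]))
    for name, runtime in ordered:
        total, index = heapq.heappop(heap)
        blocks[index].append(name)
        heapq.heappush(heap, (total + runtime, index))
    return blocks


# Hypothetical runtimes in minutes.
runtimes = {
    "test_backup": 120,
    "test_sql": 90,
    "test_scaleout": 60,
    "test_import": 45,
    "test_auth": 30,
    "test_log": 15,
}
blocks = split_into_blocks(runtimes, block_count=3)
totals = [sum(runtimes[name] for name in block) for block in blocks]
```

With these numbers every block ends up with the same total runtime, so three machines would finish at roughly the same time, and a failing block can be rescheduled on its own.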
Let's talk a bit more about test failures. In our case tests can fail, and actually that's the intention of tests. You wrote something new, you didn't think about a maybe unrelated component, and now a test is broken. In case a test is broken, we have a strategy to rerun this test to verify that this is really a regression, that you really broke something, and that it's not caused by some sporadic failure, network latency issues, or general infrastructure problems. After the rerun is complete, and in case the rerun also ends in a failed state, we know: OK, it's really a regression, and someone has to take a look at the traces and decide what the reason is.

So the main question is: who restarts failed tests? You can imagine a case where a developer pushes something and then goes home, and he doesn't want to come into the office the next morning and see: a test failed, I have to restart it, and I have to wait another hour, or another 8 hours, until the test results are available. This is the reason why we started thinking about more intelligent test scheduling.
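The rerun strategy can be illustrated with a small helper. This is a minimal sketch under assumptions: `run_test` is a hypothetical callable that returns `True` when the test passes, and the labels `"sporadic"` and `"regression"` are invented for the example.

```python
def classify_failure(run_test, reruns=1):
    """Rerun a test that just failed and classify the original failure.

    If any rerun passes, the first failure is treated as sporadic (for example
    caused by the infrastructure). If every rerun fails as well, it is treated
    as a real regression that a person has to look at.
    """
    for _ in range(reruns):
        if run_test():
            return "sporadic"
    return "regression"


# A test that passes on its rerun: the first failure was sporadic.
flaky_outcomes = iter([True])
flaky_result = classify_failure(lambda: next(flaky_outcomes))

# A test that keeps failing on every rerun: a real regression.
broken_result = classify_failure(lambda: False, reruns=2)
```

Automating this decision is what removes the need for a developer to restart failed jobs by hand the next morning.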
We thought more about test scheduling and found out that test scheduling basically has two parts. The first part is configuration: which tests should run now, which tests are interesting for this change? For example, we have different configurations for our multiple hundred Git branches with different components inside. If you push to your topic branch, something like a feature branch, then we run tests optimized for this particular feature. If you push against one of the integration branches, then we run a huge suite of tests, which also covers regressions in other components and so on. We also added a feature where tests run in stages, which means we first run the unit tests in our infrastructure, and only after the unit tests are successful do we run the really expensive regression tests, and then the really expensive end-to-end tests.

With a fairly large developer base it can happen that someone breaks a test, and in case you have a broken test you cannot stop integrating new changes into your integration branches; that doesn't work at such a scale anymore. This is the reason why we also have some features to handle such broken tests. You can move a test into quarantine and say: OK, we know this test is currently unstable, there are some bugs inside, and it is excluded from the test run.

The other big part of test scheduling is to observe the whole test. In case there is a failed test, it is rescheduled; in case the test is not complete, a rerun can be performed automatically. And the most important thing: after someone pushes something, he or she wants to know when it's complete. For this we started to implement a more intelligent test scheduler, obviously in Python.
The scheduler asks different systems about the configuration and the state of certain things. For example, in case someone pushes something new, it takes a look at the state of the build, and only in case the build is in a defined state does it proceed and start the test execution. After the scheduler has decided which tests should run, it schedules them in Jenkins and monitors them, and in case there is a failure or some test group is missing, it is rescheduled.
We also have to talk about queuing in scheduling. We have some requirements: for example, nightly tests should be completed by the morning; bug fixes are a little bit more important than new features, so their tests should run with a higher priority; and we should maybe finish the testing of commits whose tests have already started, in case they are not fully tested and we have some reruns. Jenkins currently only provides a first-in, first-out queue, and with first in, first out it's really hard to implement such requirements. Our solution for this was, again, to implement it in Python.
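Requirements like these map naturally onto a priority queue instead of a first-in, first-out queue. The following is a minimal sketch built on the standard library `heapq` module; the task kinds and their priority order in `PRIORITIES` are assumptions made for the example, not the actual rules of the SAP scheduler.

```python
import heapq
import itertools

# Hypothetical priority order: a lower number is served first.
PRIORITIES = {"rerun": 0, "bugfix": 1, "nightly": 2, "feature": 3}


class PrioritizedTestQueue:
    """Queue that hands out test tasks by priority, then by arrival order."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()

    def put(self, task_name, kind):
        # The arrival counter keeps tasks of equal priority in FIFO order.
        entry = (PRIORITIES[kind], next(self._arrival), task_name)
        heapq.heappush(self._heap, entry)

    def get(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


queue = PrioritizedTestQueue()
queue.put("feature-A", "feature")
queue.put("bugfix-B", "bugfix")
queue.put("feature-C", "feature")
queue.put("rerun-A", "rerun")

served = [queue.get() for _ in range(len(queue))]
```

Here the rerun and the bug fix overtake the feature tests that were queued earlier, while the two feature tests keep their original order.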
We built a prioritized test queue, which means the scheduler now puts tasks to run some tests into this queue, and the queue orders them based on the priority and the content of each test task. Another process then fetches items from the priority queue and distributes them across Jenkins masters. Distributing across Jenkins masters is actually also a much-needed feature, because we learned that a single Jenkins master does not scale beyond a certain number of nodes. So let's talk a little bit about artifacts.
One artifact is the installer of the database, which lives on an NFS share: after the build is completed, the installer is placed there, and the tests then install from this share. There is also test data of varying size. We were doing, and this is really extreme, 9 petabytes of data transfers, just transferring the installer and test data to the hosts which run the tests.

To optimize this we introduced some kind of caching. We placed a very simple Python script, around 800 lines of code, before the call to the installer, and it checks whether the installer is already locally available. In case there's a cache miss, it fetches the installer from the NFS share, places it locally in the cache, and then we can install right away from there. We can do the same thing for test data: in case a test requests some specific test data artifact, we intercept this call and fetch the artifact not from the NFS share but first from the local disk. With this we actually saved a lot of traffic, and the implementation is very straightforward and very easy. We saved 66 percent of the traffic and are now only transferring 3 petabytes of test artifacts every year. It's still much, but it's better than 9 petabytes.
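The "check the local cache first, fetch from the share only on a miss" idea can be sketched in a few lines. This is not the 800-line script from the talk, just a minimal illustration under assumptions: the function name `fetch_cached`, the directory layout and the file name are invented, and a temporary directory stands in for the NFS share.

```python
import shutil
import tempfile
from pathlib import Path


def fetch_cached(artifact_name, share_dir, cache_dir):
    """Return (local path, cache hit?) for an artifact.

    On a cache miss the artifact is copied from the share into the local
    cache; later requests for the same name are served from local disk.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    cached = cache_dir / artifact_name
    if cached.exists():
        return cached, True
    # Copy to a temporary name first so a half-copied file is never used.
    partial = cached.with_name(cached.name + ".partial")
    shutil.copyfile(Path(share_dir) / artifact_name, partial)
    partial.replace(cached)
    return cached, False


with tempfile.TemporaryDirectory() as root:
    share = Path(root) / "share"  # stand-in for the network share
    share.mkdir()
    (share / "installer.tar").write_bytes(b"installer payload")
    cache = Path(root) / "cache"

    first_path, first_hit = fetch_cached("installer.tar", share, cache)
    second_path, second_hit = fetch_cached("installer.tar", share, cache)
    cached_bytes = second_path.read_bytes()
```

A real cache would also need eviction and a way to detect that an artifact on the share has changed, for example by including a build identifier in the artifact name; both are left out here.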
The next very important thing, especially in such a test environment, is very high test stability. You have to make sure that all external dependencies are available. As mentioned, we have the NFS share with the artifacts and test data, but we are also testing distributed systems, so you need other hosts, not only the host which is currently running your test, and all of these things are external dependencies. But we also have to make sure that your local system is healthy. As we perform parallel testing on the same host, we have to make sure that there is no noisy neighbor around on the same host: a noisy test running on the same host which, for some reason, consumes all the available memory, so that your test fails because there is no memory left. To solve this, we implemented a health check which runs before and after a test run and checks all these dependencies: availability of external services, local memory usage, and so on. It, too, was implemented in Python.
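A health check of this kind can be sketched as a set of named probes that run before and after a test. The sketch below is an assumption about the general shape, not SAP's tool: the probe names are invented, and because the standard library has no portable call for free memory, a free-disk probe stands in for the memory check.

```python
import shutil
import socket


def service_reachable(host, port, timeout=2.0):
    """Probe an external dependency by opening a TCP connection to it."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def enough_free_disk(path, minimum_bytes):
    """Probe a local resource: is enough disk space left at this path?"""
    return shutil.disk_usage(path).free >= minimum_bytes


def run_health_check(checks):
    """Run all named probes and return the names of the ones that failed."""
    failed = []
    for name, check in checks.items():
        try:
            healthy = check()
        except Exception:
            healthy = False  # a crashing probe counts as unhealthy
        if not healthy:
            failed.append(name)
    return failed


checks = {
    "local_disk": lambda: enough_free_disk(".", minimum_bytes=0),
    "noisy_neighbor": lambda: False,  # simulated failing probe
}
failed = run_health_check(checks)
```

Running the same set of probes before and after a test makes it possible to tell a failure of the test itself apart from a failure of the host or of an external dependency such as the share (which `service_reachable` could probe).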
So this is what it looks like today. Today we still have Gerrit, but Gerrit doesn't trigger Jenkins directly anymore. We now have a dedicated infrastructure for building the database itself, and after the build is completed we get a notification, the scheduler then schedules the test tasks, and some processes distribute them to our Jenkins landscape. The nice part is that if you now take a look at what is currently running in our infrastructure, you see a lot of things written in Python: every blue box is now Python, and the build infrastructure is also heavily implemented in Python. We still have our central database for test results, with a web UI where you can view these results.
We are still heavy Python fans. It's still very great: the learning curve is so easy to get started that even non-developers can write tests for the database. We are also very big fans of the community; we rely heavily on open source tools, and we use Sentry and store a lot of exceptions in Sentry. We are big fans of development velocity. Performance is nice, but being able to deliver in the afternoon a feature you had in mind in the morning matters much more to us. And Python is platform independent, which matters because our test infrastructure also has to scale across different architectures: currently we are running tests on 3 different CPU architectures and on over 10 different operating system versions.

Let me give you a very short outlook on what we are currently doing and trying to achieve. We are currently thinking about how to scale past 3,000 nodes, and we have some proofs of concept running with Apache Mesos to move to a more resource-based scheduling approach. We are also playing around with Linux containers, Docker and maybe some others, which could also be interesting for limiting resources and ensuring that your test run has enough resources locally available. And we are currently starting to migrate to Python 3; a lot of the code base is still on Python 2, but some projects are already migrated to Python 3. So thank you very much.
And we are hiring, so in case you want to play around with a lot of memory, you should definitely talk to me; I think we have enough memory for everyone. Thank you.
Hello, thank you for the presentation. I'm curious how you test failure scenarios, like network failures in a distributed system.

We test this in Python as well. In case of a distributed system, it's possible for the developer to write test cases that say: OK, I would like to break the communication between these nodes, for example, and then you can test how the system behaves when the network between two nodes is blocked.

One question: you mentioned that you run the relevant tests first for topic branches. How do you do that? Do the developers have to define them, or are you able to determine that automatically?

Currently it's configured by hand, so the developer states: OK, I would like to run this test on my branch. But we are thinking about ways to map, for example, source code to test coding, so that we can decide automatically what should run, and we also have some proofs of concept running that gather data for such things.

Hello, how do you speed up your test suites so that they can run in 8 hours? Do you do it by hand, or do you use some interesting scheduling methods?

We are currently doing most of it by hand. Each of these test blocks contains multiple Python test files, and we pack them together to reach a runtime which is acceptable for us. We have some scripts which can generate these collections, but it's not so sophisticated.

More questions? Are performance tests also part of this?

Yes, we are also doing performance tests, with the same landscape. Many of the performance tests are written in Python as well, along with a lot of the reporting and evaluation on the side of performance testing.

Good. More questions?