Correlation analysis in automated testing
Formal Metadata

Title | Correlation analysis in automated testing
Number of Parts | 490
License | CC Attribution 2.0 Belgium: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/47509 (DOI)
Transcript: English (auto-generated)
00:05
Thank you very much for coming. My name is Lukas Fuchisław, and I'm here to talk about correlation analysis in automated testing, as you can read. At the beginning I want to introduce myself, tell you something about who I am and how the history of my education
00:24
led me to the topic of this presentation. Originally I thought about something slightly different, but it evolved into this. Then I will tell you what the problem was and what the purpose of this research was. I will tell you about some of the math that stands behind all
00:47
of this. And then I will go to the use cases and some conclusions, maybe an anecdote. So first of all, what was my main purpose? I am a former computational biologist, a bioinformatician.
01:10
I was a person who used very powerful supercomputers to simulate biological processes such as protein folding, for example. The most problematic issue with such research is that it is very
01:28
prone to any performance issue, because there are millions or billions of particles that can be connected to each other in many ways and can influence
01:46
one another. So performance was crucial. As I moved to informatics, to IT, I started working as a tester at 3mdeb in Poland. I started
02:09
to work with big test suites; not very big, but big enough to consider that the speed
02:24
of the tests, the time they take and the resources they consume, are important. So I thought about making this more reasonable, doing some optimization of these tests. And
03:02
I think that nowadays machine learning is a big theme. So why shouldn't the tests learn from each other? You know, from the history, one test would know that whenever some other test run at almost the same time failed, this test also failed.
03:29
Or if some other test ran for several iterations and could not reach its expected outcome, the other test would be either shortened or skipped. Then I realized it is better to address this
03:47
from the beginning, because if tests are correlated in such a way, the test suite is badly designed. It has some wrong assumptions and the test conditions are badly formulated.
04:09
So this is my assumption; I don't want to read it again. I want to make test suites
04:23
more elegant and save some time and resources, because for a perfect test suite there should be no correlation between the tests. If there are some correlations, some tests can be dropped because they do not need to be run. And then there might be a measurable value that can
04:50
even benchmark test suites: the correlation coefficient. Let's look at an example. There are tests in rows and software versions in columns. Green is passed and red is failed,
05:08
which seems obvious. I considered tests as Boolean functions, which are either true or false. Normally, when you compute a correlation coefficient, you would do a regression first, but because
05:29
it is a Boolean function, taking only the values zero and one, that wouldn't make much sense. So what I took was the historical probability of passing the test. That was the value
05:44
I measured against: whether the test outcome is below or above that probability.
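To make that step concrete, here is a minimal sketch in Python/NumPy. The speaker did all computations in a Google spreadsheet, so the array layout, the toy data and every name below are assumptions made purely for illustration.

```python
import numpy as np

# Toy pass/fail history: rows are tests, columns are software versions
# (1 = passed, 0 = failed). The values are made up for illustration.
results = np.array([
    [1, 1, 0, 1, 1, 0],   # test A
    [1, 1, 0, 1, 1, 0],   # test B -- fails together with A
    [1, 0, 1, 1, 0, 1],   # test C
    [1, 1, 1, 1, 1, 1],   # test D -- always passes, carries no information
])

# Historical probability of passing each test (its mean over all versions).
pass_prob = results.mean(axis=1)

# For every run, measure whether the outcome is below or above that
# probability, i.e. center each row around its own historical pass rate.
centered = results - pass_prob[:, None]
```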
06:06
Next, what I did: it's obvious that a test which passes 100% of the time doesn't really matter for us, just the same as a software version that fails all the tests. If we had a power issue at the moment of testing, it doesn't really tell us anything about the tests, right? So I kept only the meaningful tests, the tests that
06:29
have sometimes failed, and created a covariance matrix from them. This may look suspicious,
06:41
but it is really very easy math; it can be done in Google Spreadsheets. From this I created the covariance matrix, just as I said before, where the diagonal holds the variance of each meaningful test and the other fields are the covariances between the tests.
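Continuing the same hedged sketch, filtering out the uninformative rows and building the covariance matrix could look like this (np.cov treats each row as one variable, so the diagonal holds the per-test variances):

```python
# Keep only "meaningful" tests: those that have both passed and failed
# at least once in the recorded history.
meaningful = (pass_prob > 0.0) & (pass_prob < 1.0)

# Versions where everything failed (e.g. a power issue during the run)
# could be dropped the same way, column-wise, before this step.

# Covariance matrix of the meaningful tests: variances on the diagonal,
# covariances between pairs of tests everywhere else.
cov = np.cov(results[meaningful])
print(np.diag(cov))   # per-test variances
```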
07:05
Next, I computed the Pearson correlation coefficient. It may look scary, but it isn't, trust me. And from it I created a correlation matrix. What you can see above is just the shape of the test
07:25
outcomes that were our input from the beginning, and below is the matrix I computed. Only half of the matrix is shown, for clarity; the diagonal would hold
07:42
ones, because the correlation between a test and that same test is one. The Pearson coefficient is always between minus one and one. Minus one is a negative correlation: if one thing passes, the other fails;
08:02
one is a positive correlation: whenever one test fails or passes, the other always does the same. Zero means no correlation, and in a big test suite, or many test suites aggregated in a regression, the real values are near zero. Plus or minus, but near zero.
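Still in the same sketch, the Pearson coefficient for tests i and j is cov(i, j) / (s_i * s_j), so the correlation matrix is just the covariance matrix normalized by the standard deviations; NumPy's np.corrcoef does the whole thing in one call:

```python
# Normalize the covariance matrix by the standard deviations to get the
# Pearson correlation matrix: all entries in [-1, 1], ones on the diagonal.
std = np.sqrt(np.diag(cov))
corr = cov / np.outer(std, std)

# np.corrcoef gives the same matrix directly from the raw pass/fail history.
assert np.allclose(corr, np.corrcoef(results[meaningful]))

# The matrix is symmetric with a trivial diagonal, so only one triangle
# is interesting; the toy tests A and B come out perfectly correlated here.
print(np.triu(np.round(corr, 2)))
```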
08:23
If something is, like in this example (I don't consider exactly one, because that would obviously be an error), 0.5 or something like that, it is significant. We could use that just to validate the tests, but that alone wouldn't have much value for us. What can have value?
08:47
With each regression in continuous integration and continuous delivery, we can test each version and watch the dynamics of the changes of this coefficient. You know, if we see that
09:04
some values that shouldn't be connected seem to be correlated, correlated since some version in the past, and the correlation is getting bigger with each iteration, we should see a yellow light and check whether the conditions of the tests are prepared correctly.
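One possible reading of "watching the dynamics of the coefficient", again only as an illustrative sketch and not the speaker's implementation: recompute the pairwise correlation after every regression run and raise a yellow light when two supposedly unrelated tests keep drifting toward each other. The 0.5 threshold and the trend check below are arbitrary choices.

```python
def correlation_trend(history, i, j):
    """Pearson correlation of tests i and j, recomputed after each new
    software version using the whole history available up to that point.
    `history` is a 2D array: rows = tests, columns = versions."""
    trend = []
    for n in range(2, history.shape[1] + 1):
        a, b = history[i, :n], history[j, :n]
        # Until both tests have varied at least once, the coefficient is undefined.
        if a.std() == 0 or b.std() == 0:
            trend.append(0.0)
        else:
            trend.append(float(np.corrcoef(a, b)[0, 1]))
    return trend

# Yellow light: a supposedly unrelated pair whose correlation grew over the
# most recent regressions deserves a look at its test conditions.
trend = correlation_trend(results, 0, 2)
if len(trend) >= 2 and abs(trend[-1]) > 0.5 and abs(trend[-1]) > abs(trend[-2]):
    print("yellow light: tests 0 and 2 are becoming correlated:", trend[-3:])
else:
    print("correlation trend for tests 0 and 2:", [round(t, 2) for t in trend])
```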
09:28
This is a real test outcome. At 3mdeb we do maintenance of the firmware of the Swiss firewalls from PC Engines, the apu routers; maybe some of you have used them. These are the test results
09:45
of the firmware of one of the lines of these routers. We have been maintaining them for over two and a half years, but these tests are not the best example because our tests are evolving too. Some tests were created and, I don't know,
10:08
only started to be used half a year ago. So we do not yet have a large enough base of tests with a long enough history to make this really useful. But for large silicon vendors or hardware producers
10:26
I suppose it could be useful, just because there are very often cases where one person or one team is responsible for one suite in a regression. All in all, they might not be aware of correlations
10:44
between the regression suites. So, in conclusion, the dynamics can be useful in large sets with a long history, and this is a proof of concept. Do not consider this to be a white
11:02
paper or something like that. The idea is quite new, and we have started to work on it more seriously with our firmware. Here is an anecdote; some of you may know it or have heard of it: these four sets of data have the same
11:23
mean of X and Y, variance of X and Y, correlation and so on (Anscombe's quartet), although they are obviously different. So we have to actually look at the data. I don't consider these statistics
11:42
to be a red light, a failure, when something rapidly changes or rapidly grows, because that may happen for some other reason, but a yellow light should become visible when something like this occurs. This is the bibliography, and thank you for your attention. I didn't
12:52
use any special tools for the computations; all the computations were done in Google Spreadsheets. Thank you. [Inaudible question from the audience.] Yes, I have heard about it, but this is a new concept. We are just looking for ways that could improve
13:06
our test suites, and I think it should be considered. Thank you. I beg your pardon? I can't hear you.
13:48
People who are watching the stream have not heard that, so you should probably mention that there is a thing called mutation testing. Yes. Have I heard about mutation testing? Because it could be used here; it could simplify all the computations that we did here
14:09
with statistics. Yes, we have heard about it, but this is still in progress. We are looking for new ways to do it, and I think it will be considered. Thank you. Thank you very much for your attention.