Bridging Arrays and ADTs in Recursive Proofs
This is a modal window.
The media could not be loaded, either because the server or network failed or because the format is not supported.
Formal Metadata
Title |
| |
Title of Series | ||
Number of Parts | 36 | |
Author | ||
License | CC Attribution 3.0 Germany: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor. | |
Identifiers | 10.5446/54991 (DOI) | |
Publisher | ||
Release Date | ||
Language |
Content Metadata
Subject Area | |
Genre |
10
26
00:00
EmpennageTemplate (C++)Abstract data typeLinear mapMilitary operationLogic synthesisElement (mathematics)Bridging (networking)ImplementationSource codeVariable (mathematics)Revision controlValidity (statistics)BacktrackingProof theoryInductive reasoningPreconditionerTheory of relativityDependent and independent variablesDifferent (Kate Ryan album)Data structureElement (mathematics)CASE <Informatik>System callMathematical inductionState of matterQuicksortSubject indexingMultiplication signElectric generatorArithmetic progressionGroup actionRow (database)Type theoryLimit (category theory)Set (mathematics)Computer programmingMetropolitan area networkFunctional (mathematics)Link (knot theory)File formatFlow separationForcing (mathematics)Procedural programmingCondition numberGame controllerUniverse (mathematics)Parameter (computer programming)Personal digital assistantInstance (computer science)Direction (geometry)Speech synthesisAreaHypermediaInductive reasoningOperator (mathematics)Term (mathematics)Uniform resource locatorTheoryImpulse responseRule of inferenceSource codeResultantAutomationOcean currentConstraint (mathematics)VirtualizationCrash (computing)WordArithmetic meanForm (programming)Shared memoryVariable (mathematics)Interior (topology)Point (geometry)Price indexField (computer science)Rational numberOvalMathematical analysisECosSingle-precision floating-point formatMathematicsPhysical systemBit rateLevel (video gaming)Free variables and bound variablesEvoluteOrder (biology)Lie groupRecursionFunction (mathematics)ImplementationProof theoryInvariant (mathematics)Correspondence (mathematics)Array data structureBitLogic synthesisFormal verificationQuantifier eliminationAlgebraSlide ruleAbstract data typeCodierung <Programmierung>Disk read-and-write headComplex (psychology)AxiomMappingMultisetElectronic mailing listCharacteristic polynomialPredicate (grammar)outputMessage passingEqualiser (mathematics)Boolean algebraLogic gate2 (number)TheoremBenchmarkDecision theoryHydraulic jumpComputer fileTemplate (C++)MultiplicationInheritance (object-oriented programming)Equivalence relationBounded variationHeat transferProgrammschleifeLatent heatQuantification1 (number)Modeling languageControl flowLengthLemma (mathematics)Stack (abstract data type)Front and back endsPerformance appraisalInterpreter (computing)View (database)Computer animation
Transcript: English(auto-generated)
00:02
Hello, everyone. My name is Grigori Fedukovich. I'm an assistant professor at Florida State University. And this background is not virtual. I'm going to present a joint work with Gidon Ernst, who is an assistant professor at Ludwig Maximilian University in Germany. This work is about relational verification of programs
00:22
that have arrays and algebraic data types. Well, definitely, we need data structures. We definitely need to verify programs with data structures and two common ones are arrays and ADTs. However, reasoning about arrays or ADTs is extremely complicated.
00:41
Arrays need quantifiers, because we don't know the length of the arrays. ADTs are defined recursively, such as link lists, trees, stacks. And they also need to have recursion. If you want to prove something about that, you definitely need our decision procedures, some dissolvers, or theorem provers need to support that. The reasoning in terms of relations between arrays
01:04
and ADTs is definitely even harder. So and we need this type of reasoning. For instance, if you want to prove some kind of equivalence, if our program uses arrays, and our specification uses ADTs, well, definitely, we
01:20
need to meet these two data structures together in order to prove that nothing bad has ever happened and our safety specification is good. Solution that we can offer is relational invariant that are recursive and possibly quantified. It looks crazy, but we can do it with our procedure that I'm going to show you in a second.
01:41
Consider a motivating example, which is an implementation of a stack. So you will see two different implementations. One uses linked list, another uses array. So it has one data structure is new, and it has two constructors, and array data structure
02:00
is basically an array and an initial index. So that's three operations available, pushing and popping. And I believe that's self-explanatory. When push happens, we just take the argument that we've given here and const it to the existing list. In the array way, we just increment a counter
02:22
and we also add the initial argument to the array. Hoping is a little bit more complicated. In the array case, we just subtract in and we keep the existing array untouched. And in the ADT, we return the tail of the list.
02:44
Again, completely forgetting what was in the head. Relational invariance for that is defined recursively. So in the case of empty list, so basically relational invariance says that index should be zero. In the case of a complex list, if it's a const,
03:03
then we see the nested call to the recursive predicate and also some equality over data structures that I used in both programs. So while this is a very particular example, we can draw it in plate based on that.
03:22
And for instance, by replacing the particular predicate by a wildcard, we can actually see the most important ingredients is that the two cases and a nested call to R in the recursive T. We can generalize it even further and we can say that it's basically defined over two states.
03:41
And in the active case, it uses the essentially quantified next state variable that you're gonna eliminate using this common function. This is already played and intuitively recursive relational variant, well, it defines correspondence between programs in terms of variables that are visible to the user
04:02
in particular inputs and outputs. So strictly speaking, it should satisfy three types of constraints. So preconditions should be consistent with each other. The outputs should be equivalent and also initiation and consequences constraints similar to basically inductive and variance
04:20
and this also should be fulfilled. So we formally write these constraints in the form of constraint one clauses that you can see in the paper but I'll avoid that in just a second time. The ideas in our approach, first of all, it's template driven. We use the template that you see here in the slide and we instantiate holes
04:42
for all these placeholders gradually. So in order to figure out how to instantiate, we run some lightweight analysis based on looking at pairs of different methods and analyzing how they actually change the state. So there are three types of methods that we can distinguish producers
05:00
that intuitively make the ADT larger, consumers that make the ADT smaller and no opposite actually don't change anything within the data structure that returns something else. We also point to your improver to validate the candidate because it's recursive and the prover that works on our backend
05:21
should be able to conduct proofs by induction. Getting back to our example using stack. So we can consider the consumer operation a pop. So we have a special recipe for this type of benchmarks,
05:41
which is also applicable for cubes. So the important feature of the benchmark of this type is that they use index variables through which in different elements of arrays are returned. Based on the consumer operation, we can actually see that, well, first of all, it has a precondition
06:01
and that precondition should be matched against the precondition of the ADT operation that says that Xs is not empty. On the other hand, equivalence of outputs should be created by looking what these two functions actually return. After obtaining different relationships within variables
06:23
that come in from a slightly symbolic execution, we can figure out that basically out here corresponds to the head of the file of the list and AN here basically corresponds to the AN minus one, first element of the list because here N was decremented.
06:42
We also get different terms that stand for scoring terms and can be used as arguments in the nested call in the, of the relational invariant N minus one in this case. And surprisingly that is enough. So we get these ingredients together,
07:01
we instantiate the holes and we run an ADT INV solver that validates this on all other operations. And it works in a second. We can see also to conduct the same reasoning also based on producer operation, which I skip here for, but you can look at the paper
07:21
and it has some really nice ideas as well. Another recipe that works on more general scenario, where you use no off operation, which is for instance needed if you want to implement such programs over sets, multi-sets or maps. So it does not have an index necessarily,
07:42
but they use the notion of a characteristic function in order to implement a set. So in a few words, very, very intuitively, synthesis is based on quantifier elimination and simplification. So we get a symbolic encoding of both operations. We can join them. So we make sure the variables are different, but outputs, those are things that we require
08:03
to be equal. And then quantify and run quantifier elimination, it gets rid of these outputs. And that will give us some formula, which is already sort of relational. And by applying different axioms of the array theory, we can make it better and make it more suitable
08:23
to be used in the invariant. So without actually details, it works in a completely automated way. And after that's done, similar to the previous recipe, so we validate the invariant on all other place of operations, we use ADT, IND, our inductive prover.
08:42
And if everything is successful, we are done again very, very quickly. Particular example, that's another example that has implementations of a set. So as you can see here, the ADT implementation also has linked list and array operation has a list of booleans.
09:02
And initially it's false, when which represents the set is empty. So we'll have a consumer operation that erases an element. We have a producer operation that makes a new list by consuming the previous list with a given input.
09:22
And that basically works by getting a new value for the corresponding in the array. So it also has a no operation in form of pass element. So has element checks if there's an element and it employs a recursive function contains
09:42
that scans an array and searches for particular elements returns true if it's found or false, it's not found. The array condition is way simpler of course, because SMT arrays are represented as maps. So actually that is enough just to say, okay, I'm looking what's stored in that particular index
10:01
and return that. Invariant looks a bit more crazy than in the linear case because first of all, it has a universal quantifier in the base case. So it's a recursive solution, but it is also quantified solution. Isn't it crazy?
10:21
It is an ADA and it can prove it pretty quickly. What's more importantly as well, it uses the recursively defined function inside the inductive case. For instance, if this contains how to get it, you can definitely look at the paper. It's based on quantifier elimination and simplifications,
10:40
but what was important and right now that you can see that this corresponds to this contains in the no op operation of has element and another element, which is basically says that a y should be true. That's taken from the corresponding array operation of the no op.
11:03
View words about ADT-IND. It's a very, very recent solver proposed by Princeton people including myself. And it's somewhere between SMT solvers and automated theorem provers. So it takes basically the best of two worlds. So it conducts proofs by induction
11:21
in a very similar way how for instance Coq does proofs, but also it uses SMT solver in order to quickly check if some particular assumption can be used or some something can be written. So it backtracks if no assumption can be used and it also synthesizes new helper lemmas based on failed proof attempts.
11:42
And then it tries to prove this lemmas and if the lemmas cannot be proven, it's basically it may require a user to actually give a lemma. So this way is the work exactly like a theorem prover. So we built it on top of Z3. So we have a support for interpretive functions
12:01
and race in this linear arithmetic. So the evaluation of our synthesis procedure for recursive invariance uses that ADT-IND solver and our solver because it's heavily relies on ADT-IND is called ADT-CHC. Benchmarks are taken from different textbooks realizations
12:21
of super common and widely known data structures such as stacks, queues, sets, multi sets, maps and different variations for them. For instance, you can implement the set such that the operation removes all elements sorry, it keeps all elements with not duplicates.
12:44
So there are two possible and absolutely viable implementations and both can be proved. Interestingly, however, one is slightly requires slightly more time by our tour. So ADT-IND expectedly takes a lot of time and basically all the time, which is not very huge
13:02
but like 99% of the time is taken only by ADT-IND to discharge the proof of the gate jumps. So it's completely automated except one benchmarks where we have to introduce a user provided lemma. However, we already have some ideas how that can become automated probably is the slight increase of running time.
13:21
Maybe it won't be 30 seconds, but 40 seconds. I'm almost done. I've presented the first approach the best of our knowledge to synthesize relational invariance over implementation using arrays in ADTs. It is completely automated. It's based on recent advances in SMT solvers
13:40
and what's more importantly, it generates recursive invariance. We don't know about any other solver which has a CHD realization that provides recursive invariance. In the future work well, we would like to target technology transfer and try to migrate our ideas to some state of the art
14:03
software model checker and also support more complex data structures and programs with some different control flow such as nested loops. Thank you for your attention. Hope you enjoyed that and I'll be happy to take any questions.