5th HLF – Lecture: Curious Facts About Nested Canalyzing Functions
Formal Metadata

Title: Curious Facts About Nested Canalyzing Functions
Number of Parts: 49
License: No Open Access License: German copyright law applies. This film may be used for your own use but it may not be distributed via the internet or passed on to external parties.
Identifiers: 10.5446/40143 (DOI)
Transcript: English (auto-generated)
00:01
So, the stream of people is slowly going to zero, so let's start the next session. I'm happy to moderate this session with Richard Stearns and Madhu Sudan. My name is Reinhard Wilhelm. I'm the delegate of the Leibniz Center for Informatics at Schloss Dagstuhl.
00:27
In my previous life, I had a career as a compiler person, and one particular area of work was attribute grammars, for whose invention Don Knuth did not receive the Turing Award.
00:45
Attribute grammars have long since been given up. It was expected that they would serve as description mechanisms for the static semantics of programming
01:01
languages, and so, when I started working there, I was seeing references to Richard Stearns, and I was convinced that his first name was Lewis and his second name was Rosenkrantz, because there was a famous paper by Lewis, Rosenkrantz and Stearns on attributed translations,
01:23
which you had to read when you were working in the area. But Richard is neither Lewis nor Rosenkrantz, only Stearns, and he later turned to complexity theory, and with Juris Hartmanis laid the foundations for complexity theory and received the Turing Award for this.
01:45
Okay, so today he will speak about something that none of us knows, canalizing functions, whatever that is. Please. Thank you. Well, I'm going to talk about canalizing functions, not because they're
02:07
important, but because they're fun to play around with; so if you don't know what they are, that just goes to support my contention that they're not important. Also,
02:24
during the talk I'm going to emphasize the process whereby we perform our research, see what these functions are doing, and try to draw our conclusions,
02:42
without trying to throw at you all the algebra which finally supports those conclusions. Okay, now this work was done jointly with Dan Rosenkrantz and Ravi;
03:02
the three of us are retired faculty from the University at Albany, together with Madhav Marathe, who was a student of ours many years ago. He has since risen to be an important part of Virginia Tech. He brings in a lot of grant money with respect to
03:36
homeland security and threats, so the first curious fact is why this is at all interesting to
03:45
homeland security and threats. Okay, so these are the research methods I'll be discussing. The first thing is to get a mathematical understanding whereby we can
04:07
solve an (n+1)-parameter problem by first solving an n-parameter problem, and you'll see by the iterative nature of the definition why this is likely to be successful.
04:25
And after we understand those mathematics, we can program our solutions, and then it's easy to work out examples and experimentally look for patterns, and once we have a conjecture,
04:42
then we can prove it with the algebra suggested by the programs. So these functions are all binary-valued functions, zero-one valued as we're going to use during this talk.
05:01
Well, finally to the definition: first we'll define canalyzing and then nested canalyzing. So a function is called canalyzing if for one of its parameters it can be expressed as indicated on the slide, the first statement: if the
05:24
value of that parameter is equal to a, then the value of the function is b, and otherwise the value of the function is some function of the remaining parameters; and the function is called nested canalyzing if, when we say function of the
05:46
remaining parameters, that is also a canalyzing function, and so forth. So a canalyzing function then might be described thus, which says: well, if the value of parameter v1 is equal to zero,
06:07
then the value of the function is one; otherwise we perform a second test, and if that parameter has value one, then the value of the function is zero, and so forth,
06:22
all the way down to the end, and then finally, if you fail the final test, there's some default value, in this example one. Okay, well, we don't like the idea of the default value being the same as the output value for the last
06:46
variable, because in that case the function does not depend on the last variable at all, and we're just excluding those functions from the talk; not that we couldn't add them on,
07:01
but we choose not to. And the bottom line then is that for the final rule, the then and else parts must be different. Okay, so this enables a simpler notation, shown on the right here,
07:21
where we list the name of the parameter, the test condition, and the resulting output should that condition hold true; so the first rule then says, well, if v1 has value zero, then we'll output one, etc. Now because of the last slide we've ruled out the default
07:44
being the same as the output of the last rule, so we don't even have to say what the default is: you see the rule for parameter nine is one goes into zero, and therefore the default value must be one.
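To make the rule notation concrete, here is a minimal Python sketch of how such a rule list might be evaluated. The representation, tuples of (variable index, test value, output), is my own invention, not from the talk; per the convention just described, the default is the opposite of the last rule's output.

```python
# A minimal sketch (my own representation, not from the talk): each rule is a
# tuple (variable_index, test_value, output), tried in order. The default is
# the opposite of the last rule's output, as agreed above.

def eval_ncf(rules, x):
    """Evaluate a nested canalyzing function on the 0/1 input vector x."""
    for var, a, b in rules:
        if x[var] == a:      # canalyzing test succeeds:
            return b         # the output is fixed, remaining inputs ignored
    return 1 - rules[-1][2]  # all tests failed: the default value

# "if v1 = 0 then 1, elif v2 = 1 then 0, elif v3 = 1 then 0, else default 1"
rules = [(0, 0, 1), (1, 1, 0), (2, 1, 0)]
print(eval_ncf(rules, [1, 0, 0]))  # all tests fail, so the default: 1
print(eval_ncf(rules, [0, 1, 1]))  # the first test succeeds: 1
```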
08:13
Okay, well, we have here two descriptions which are equivalent; notice these two descriptions are entirely the same except for the last variable, and of course saying, well, if that parameter has value zero,
08:28
the output is zero, and saying, well, if it's one, the output is one — those are two ways of saying the same thing. So we come to an agreement that we'll always make the output of the last
08:41
rule match the preceding rule, so in this case it's the one on the left that we're using, and not the one on the right. Okay, now these functions do actually have an application to biology; even though they're fun to work with, this concept was
09:07
originally proposed by some biologists. The second curious fact is that originally they were spelling it with an i instead of a y, and then
09:23
later on they started spelling it with a y, so that's not a misprint, that's the way it is. Okay, now this next slide shows the concept of a layer; notice that I've grouped the rules into
09:45
three groups here: in the green group they all have output one, in the red group output zero, and finally the last group, in blue, again has output one. And so within a layer you have the flexibility of switching the rules around, so
10:08
the second column is the same as the first column; that is, it doesn't matter in which order we test the variables for parameters two, three, and one: if any one of them
10:23
succeeds, the output is one, whereas if we were to go to other arrangements, across layers, that would change the value of the function.
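This within-layer freedom is easy to check by brute force. Below is a sketch with a rule set I made up for illustration: it permutes the rules of one layer and verifies the function is unchanged on every input; reordering across layers would generally fail the same check.

```python
from itertools import permutations, product

def eval_ncf(rules, x):
    for var, a, b in rules:
        if x[var] == a:
            return b
    return 1 - rules[-1][2]

# A layer of three rules that all output 1, followed by the remaining rules.
layer = [(0, 1, 1), (1, 1, 1), (2, 0, 1)]
rest = [(3, 1, 0), (4, 1, 0)]
n = 5
inputs = [list(x) for x in product([0, 1], repeat=n)]
reference = [eval_ncf(layer + rest, x) for x in inputs]
for perm in permutations(layer):
    # any order of the tests within the layer gives the same function
    assert [eval_ncf(list(perm) + rest, x) for x in inputs] == reference
print("all", len(list(permutations(layer))), "orderings agree")
```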
10:48
Okay, now this shows that there is a certain simplicity when the layers have several rules in them, which suggests that if we were looking for extreme examples for some problem, we'd want as many layers as we could have. So if we're looking for
11:01
counterexamples for something, the obvious place to look is where you have as many layers as you have variables, minus one, because we've agreed that the last two rules belong to the same layer. Okay, so we're going to examine
11:28
curious facts in two areas. One is in the area of representing a nested canalyzing function with a threshold gate, and there the curious thing is that we see the Fibonacci numbers showing up as
11:43
threshold weights; and the second is an analysis of worst-case average sensitivity, where we'll see that suddenly an odd number of parameters and an even number of parameters are treated differently. All right, well, what is a
12:06
threshold function? Well, it was explained this morning: each parameter has a weight, weights may be negative, and there's a threshold value, and if the sum of the weights for the
12:24
parameters which have input one is greater than the threshold, then the value of the function is one; otherwise it's zero. Okay, and these are the things we want to show: first, that any NCF is also a threshold function;
12:45
second, that you can find these weights in polynomial time; and thirdly, that in the worst case the largest integer weight is exponential in the number of variables, which you'd expect if those weights are going to be Fibonacci numbers.
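As a reference point, the threshold gate itself is one line. This sketch follows the definition just given, with a strict inequality; the weight/threshold example is a small one I constructed by hand, not from a slide.

```python
# The threshold gate as just defined: output 1 iff the summed weights of the
# inputs that are 1 strictly exceed the threshold t.

def eval_threshold(weights, t, x):
    return 1 if sum(w for w, xi in zip(weights, x) if xi == 1) > t else 0

# A hand-made example (mine): weights (2, -1, 1) with threshold 0 realize the
# NCF "if v1 = 1 then 1, elif v2 = 1 then 0, else v3".
print(eval_threshold([2, -1, 1], 0, [0, 1, 1]))  # sum 0 is not > 0, output 0
print(eval_threshold([2, -1, 1], 0, [1, 1, 0]))  # sum 1 is > 0, output 1
```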
13:06
Okay, so here's how we're going to compute the weights inductively from a description, and we're going to start with the last parameter first,
13:20
and we want to compute a weight for each parameter based on the weights already picked; and in order to program this, we're going to have to keep three variables: a threshold, the sum of the positive weights, and the sum of the negative weights.
13:42
Okay, and then for each type of rule we're going to have to have a way of getting from the smaller number of parameters to the larger number; so what it says here is that for each type of rule we're going to have to have
14:07
a plan. I'm going to describe the inductive rule for rules of the form one goes into one, and I'm not going to present the other ones; if after the talk is over you beg me,
14:21
maybe I'll explain what the others are. Okay, well, first the mathematical facts: the weight for the parameter vi has to satisfy this condition. Well, why is that? The rule says that when the value of the parameter is one, the value of the function
14:46
has to be one, and that means that regardless of the other parameters, you have to exceed the threshold. But what is the worst case? Well, the worst case is that
15:00
the parameters with positive weights all have input zero, and those with negative weights have input one, in which case the sum of the weights is wi plus all those negative weights, and that has to be greater than the threshold;
15:22
so this is simply a mathematical fact. And of course, what is the minimum integer weight that will do that? One more than t minus the sum of the negative weights. So this is the mathematics, and now we program it: take the following actions,
15:46
we set the weight to be the minimum weight which we know mathematically it's supposed to be, and of course we've created a new positive weight, so we have to add it to the sum of the positive weights. Okay, so here's what we think ought to be the worst case,
16:05
namely where the outputs alternate so that each rule is in its own layer, and so we run through the calculations and we find these are the weights,
16:20
and you can see these are the Fibonacci numbers. So we've discovered this, and we'd be interested in proving why.
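Here is a sketch of the whole inductive construction in Python. Only the "one goes into one" update is described in the talk; the other three branches are my own reconstruction by symmetry, so treat them as an assumption rather than the authors' exact procedure. Running it on the alternating example produces weights whose absolute values are Fibonacci numbers, and a brute-force check confirms the gate computes the NCF.

```python
from itertools import product

def eval_ncf(rules, x):
    for var, a, b in rules:
        if x[var] == a:
            return b
    return 1 - rules[-1][2]

def eval_threshold(w, t, x):
    return 1 if sum(wi for wi, xi in zip(w, x) if xi) > t else 0

def ncf_to_threshold(rules):
    """Last rule first. Only the (1 -> 1) update is from the talk; the other
    three branches are my own reconstruction by symmetry."""
    t = -1 if rules[-1][2] == 0 else 0  # threshold realizing the constant default
    P = N = 0                           # running sums of positive/negative weights
    w = {}
    for var, a, b in reversed(rules):
        if (a, b) == (1, 1):
            w[var] = t - N + 1            # minimal integer with w + N > t
        elif (a, b) == (1, 0):
            w[var] = t - P                # largest integer with w + P <= t
        elif (a, b) == (0, 1):
            w[var], t = N - 1 - t, N - 1  # lower t so the rest alone exceeds it
        else:                             # (0, 0)
            w[var], t = P - t, P          # raise t so the rest alone cannot
        P, N = P + max(w[var], 0), N + min(w[var], 0)
    return [w[i] for i in sorted(w)], t

# Worst case: alternating outputs, with the last two rules sharing an output.
n = 6
outs = [1 - i % 2 for i in range(n - 1)] + [1 - (n - 2) % 2]
rules = [(i, 1, outs[i]) for i in range(n)]
weights, t = ncf_to_threshold(rules)
print(weights, t)  # [8, -5, 3, -2, 1, 1] 0 -- Fibonacci absolute values
for x in product([0, 1], repeat=n):
    assert eval_ncf(rules, list(x)) == eval_threshold(weights, t, list(x))
```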
16:41
Okay, well, this explains why this ought to be so; it's not exactly a proof, but an indication. Namely, if you set the variable vi equal to one and the next two variables to zero, you have to get the same result as if the variable were set to zero and the next two
17:01
were set to one, because the rules alternate in output; and for those to have the same effect, you want the weight for parameter i to match the weights of the other two combined, so one expects the Fibonacci numbers to come in. Okay, well, let's
17:30
change the problem slightly: for example, for the rule for v3, instead of having one goes into one, let's have zero goes into one,
17:43
and we've changed several of them, and this is the result, and you see that, yeah, the Fibonaccis are there again; it's just that it's the absolute values which fall into the Fibonacci series, and so the actual rule
18:05
is about the absolute values, and the threshold is changed. Now why are some of these negative and some of these positive? Well, the rules one goes into one and zero goes into zero both make the function increasing as a function of the corresponding variable,
18:27
and where you have one goes into zero or zero goes into one, the function is decreasing as
18:41
a function of those variables; and so where they decrease, the weights are negative, and where they increase, the weights are positive. And just briefly, I'm not going to explain all this, but what happens if it's not the worst case, what happens
19:00
if there are just a few layers? You can see that the weights in each layer have the same absolute value, and no surprise there, since we said you could perform those tests in any order; and there is a formula that explains that, which I will not
19:21
go into. And of course, to prove that the Fibonacci numbers are the worst case, you need to bring in some algebra to support that claim, which I'm not going to do. So let's see — this is the clock, this is a countdown clock. Okay,
19:44
okay, well, let's talk about the other topic, namely sensitivity. We'll explain what sensitivity is with an example. So we're supposing that the
20:00
value of the function shown in blue there inside the rectangle is, as shown, output one for those assignments to the inputs, and then we look at the effect of changing any one of the values. And so in this example, if you change the first bit, that doesn't change the output of the
20:24
function; if you change the second bit, it does, and so forth, and we observe that in four of the seven cases changing one input bit changes the function, and therefore the sensitivity of this
20:42
input assignment is four, by definition. Okay, well, we're interested in the average sensitivity, which you get simply by adding all the sensitivities and dividing by the number of possible input assignments.
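Computing this by brute force is straightforward, and this is exactly the kind of program the talk has in mind: work out examples and look for patterns. A sketch, reusing the rule representation from above; the alternating example for n = 10 reproduces the 1.33203 figure quoted a moment later.

```python
from itertools import product

def eval_ncf(rules, x):
    for var, a, b in rules:
        if x[var] == a:
            return b
    return 1 - rules[-1][2]

def sensitivity(rules, x):
    """How many single-bit flips of x change the function value."""
    base = eval_ncf(rules, x)
    return sum(eval_ncf(rules, x[:i] + [1 - x[i]] + x[i + 1:]) != base
               for i in range(len(x)))

def average_sensitivity(rules, n):
    """Sum the sensitivities over all 2^n inputs and divide by 2^n."""
    return sum(sensitivity(rules, list(x))
               for x in product([0, 1], repeat=n)) / 2 ** n

# The alternating candidate for n = 10 (last two rules share an output):
n = 10
outs = [1 - i % 2 for i in range(n - 1)] + [1 - (n - 2) % 2]
rules = [(i, 1, outs[i]) for i in range(n)]
print(average_sensitivity(rules, n))  # 1.33203125, close to 4/3
```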
21:10
Okay, so the question that arises, and was asked by these biologists, is: what is the worst-case average sensitivity for an NCF as a function of the number of variables? Well, we already know that we should conjecture that the worst case is where
21:25
the outputs alternate, and a second thing which we're going to be able to answer is to describe the set of all the functions which achieve the worst case. So
21:45
this is the candidate then for the worst case, and it's the one with the alternating outputs, except of course for the last two variables, which we require to be the same; and in this case, when we do the calculation by means we'll be discussing, we see that the average sensitivity
22:06
is 1.33203, and that's very close to four-thirds, so it was conjectured by these five authors
22:22
that the average sensitivity is never larger than four-thirds and that the alternating example is the worst case; and then the second three gentlemen, using Fourier analysis, found an exact formula for the alternating example
22:46
and were able to show it was the worst case. So that's another curious fact: using Fourier analysis instead of simple algebra, they were able to get the answer. All right, well,
23:06
so here is the worst-case example, and as explained, this is the average sensitivity; but we're going to make a change here. Notice that for the two green variables the outputs
23:22
alternate, and for the red they alternate, and the blue, and so forth; but across the boundaries they do not alternate. That is, going from the rule for input v2
23:43
to the rule for v3, they have the same output. So the alternating case is certainly not the only worst example. Well, suppose we do the alternation slightly differently:
24:08
here we've grouped two and three, and four and five, together, and now the average sensitivity is less. So this is the curious fact
24:23
that even-numbered parameters are treated differently from the odd-numbered parameters. Okay, well, let's get into how we actually compute these. First we need a little terminology:
24:49
I'm going to use a with subscript i to be the condition for input i, and b with subscript i to be the output, and then I'm going to write tilde-one to mean
25:06
zero, and tilde-zero to mean one. So in order to build up rules for actually computing the sensitivity, there are several things that we need to know.
25:30
We're going to talk about the set Ai, which is the set of input assignments where the earlier variables have been set so that their conditions fail, and of course we know the size of the set Ai,
25:47
namely it's just two to the number of remaining parameters. Then we're going to have Ai with subscript zero or one, and that's going to be the subset
26:01
of Ai where the value of the function is zero or one respectively, and of course this partitions the set Ai, so the fact shown in blue is that Ai is the union of those two sets.
26:21
Okay, and the third set we want to pay attention to, Si, is the set of input assignments where changing the value of parameter i changes the value of the function. And of course the average sensitivity
26:46
is going to come from the sum of the sizes of the Si, in which we're just counting the number of ways changing variable i will change the value; that's going to be the
27:05
total sensitivity, which we divide by the total number of assignments. Okay, and there are certain relationships we can establish between these things, and I'm going
27:22
to interpret the things in black, and if you don't believe me you can read the green, but I'm not recommending that you do that. So the first line says: well, if you want to know how many vectors in Ai there are whose output is the output of the rule,
27:47
well, that happens in two ways. One way is that the bit itself meets the condition, and that's where you get the two to the n minus i; and the other is where it
28:01
doesn't meet the condition, and then the value depends on the remaining variables, because that's the number of things in A of i plus one where the output matches the output of the rule. And the second thing says that, well, if you want to know how many things have a
28:26
different output than the output of the rule, that's just the set of things where the remaining parameters give you that output. Okay, then the final thing
28:46
says that if you want to know how many input vectors there are in which changing the value of the i-th bit changes the value of the function, why, it's
29:06
any vector where the value determined by the remaining bits is the opposite of the rule's output, since then changing the bit makes the condition satisfied
29:23
and thereby changes the value; and of course the opposite is true: you change the bit back and it switches back, and that's where the two comes from. Now these are enough to give you the average sensitivity for,
29:45
well, any example, because you take the results of the smaller case and you use these formulas; you compute all the
30:01
things on the left from the things on the right, and that's where the numbers we showed on the earlier slides came from.
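Those relations give a linear-time program, which is presumably how the earlier numbers were produced. The sketch below is my paraphrase of the spoken formulas: |Ai^(bi)| = 2^(n-i) + |A(i+1)^(bi)|, |Ai^(~bi)| = |A(i+1)^(~bi)|, and |Si| = 2·|A(i+1)^(~bi)|. Note that only the rule outputs matter for these counts, not the test values.

```python
def average_sensitivity_fast(outs):
    """outs[i] is the output b_{i+1} of rule i+1; only outputs matter here.
    cnt[v] tracks |A_{i+1}^v|: assignments to the remaining variables on
    which the rest of the function outputs v. Processes the last rule first."""
    n = len(outs)
    cnt = {1 - outs[-1]: 1, outs[-1]: 0}  # A_{n+1}: one empty assignment,
                                          # whose value is the default
    total = 0
    for i in range(n, 0, -1):
        b = outs[i - 1]
        total += 2 * cnt[1 - b]           # |S_i| = 2 |A_{i+1}^{not b_i}|
        cnt[b] += 2 ** (n - i)            # |A_i^{b_i}| = 2^{n-i} + |A_{i+1}^{b_i}|
    return total / 2 ** n

n = 10
outs = [1 - i % 2 for i in range(n - 1)] + [1 - (n - 2) % 2]
print(average_sensitivity_fast(outs))  # 1.33203125 again, now in O(n) counting
```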
30:31
Well, how is this going to help us with a worst-case example? You can try to find a closed form for the worst-case example simply by plugging these things in. Well, when we tried that,
30:42
a curious thing happened, which we'll describe next. So I'm going to ask the question: for two consecutive input bits, what is the number of cases where changing the
31:04
i-th bit changes the value of the function, plus the number of cases where changing the (i+1)-th bit changes it? And using the formulas from the previous slide, we see that it's two times
31:23
this sum. Okay, well, on the previous slide we did have a rule for this, so let's replace that with the rule we had from the previous slide,
31:43
and then we get this. Well, the next thing we notice is that, if we look at these two terms, and if I assume that we're alternating, in which case bi would be different from bi plus one, then under that assumption this is just a partition of A of i plus two;
32:13
combining the two parts of the condition, we simply have A of i plus two. So with that assumption we replace this, and then of course we know how large that is, and that's this amount,
32:29
and we add them together and we get this. Okay, well, what have we discovered?
32:43
This is what we've discovered: adding these two things together, under this assumption, the total number of things in these two sets is a function which is independent of any of the rules. So this is then going to be a big help in
33:10
getting a closed form for the average sensitivity in these worst cases.
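Reading the lemma concretely: whenever bi differs from bi+1, the two counts together come to 2^(n-i+1), regardless of all the other rules. The exact exponent here is my own reading of the slide, so here is a quick numerical check over random rule outputs using the counting relations above.

```python
import random

def sensitivity_counts(outs):
    """|S_i| for each rule (1-indexed), via the relations above."""
    n = len(outs)
    cnt = {1 - outs[-1]: 1, outs[-1]: 0}
    S = [0] * (n + 1)
    for i in range(n, 0, -1):
        S[i] = 2 * cnt[1 - outs[i - 1]]
        cnt[outs[i - 1]] += 2 ** (n - i)
    return S

# Whenever b_i != b_{i+1}, |S_i| + |S_{i+1}| = 2^(n-i+1), independent of the
# other rules (this closed form is my own reading of the slide's lemma).
n = 12
for _ in range(100):
    outs = [random.randint(0, 1) for _ in range(n - 1)]
    outs.append(outs[-1])                 # last two outputs agree, as required
    S = sensitivity_counts(outs)
    for i in range(1, n):
        if outs[i - 1] != outs[i]:
            assert S[i] + S[i + 1] == 2 ** (n - i + 1)
print("lemma checks out on 100 random rule sets")
```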
33:23
So this is where the alternations suddenly become a big help. Okay, so let's look at an example now where the number of rules is even,
33:43
and we look at the first two rules, the green ones; and according to the lemma from the previous slide, the number of cases where changing those bits changes the value of the function is two to the tenth, and for the red ones it says it's two to the eighth, and so forth, down to the
34:08
black, for which our lemma doesn't tell us what it is, but by inspection we see that it's two to the two. So we simply add these numbers up using high-school algebra, and we get the calculation
34:25
shown in black here, and when I put in n instead of the number 10, we get the thing shown in blue on the bottom. On the other hand, we wanted to look at what happens when
34:48
n is odd; and so, starting from the top, our lemma now tells us that there are two to the eleventh cases where changing that bit changes the value of the function,
35:02
and two to the ninth, and so forth, down to the very bottom, where by observation it's two to the one. So to get the average sensitivity in this case, again we do our high-school algebra, and what we get is shown in blue on the bottom when we put in an odd value of n.
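The blue formulas stay on the slide, so the closed forms below are my own algebra from summing the lemma's powers of two, not quoted from the talk: 4/3 · (1 - 2^(-n)) for even n and 4/3 · (1 - 2^(-(n+1))) for odd n, which matches the 1.33203 figure quoted earlier.

```python
from fractions import Fraction

def worst_case_avg_sensitivity(n):
    """Closed forms from summing the lemma's powers of two (my own algebra,
    not read off the slide). Even n: (2^n + 2^(n-2) + ... + 2^2) / 2^n;
    odd n: (2^n + 2^(n-2) + ... + 2^3 + 2) / 2^n."""
    if n % 2 == 0:
        return Fraction(4, 3) * (1 - Fraction(1, 2 ** n))
    return Fraction(4, 3) * (1 - Fraction(1, 2 ** (n + 1)))

for n in (10, 11):
    v = worst_case_avg_sensitivity(n)
    print(n, v, float(v))
# 10 -> 341/256   = 1.33203125    (the value quoted earlier in the talk)
# 11 -> 1365/1024 = 1.3330078125
```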
35:30
And so that shows — now, we haven't proved that this is the worst-case sensitivity, and I'm not going to prove it's the worst case; I'm just asserting that it
35:48
takes work to make that proof. But we have shown that when the output of each even-numbered rule is opposite to that of the preceding rule, then these are the worst-case sensitivities. So
36:00
not only have we shown it for the alternating example, but we've covered all the cases that achieve the worst case, the worst case coming about whenever the even-numbered rules are opposite to the preceding odd ones. And we can stop here, and thank you for listening.
36:38
Thank you very much. We have time for a few questions, so I have a question. Yes, the results:
36:46
how do they translate back to biology? Is there any implication for biology? Well, some biologists think so. Okay, that's...
37:04
I think the idea is that either one gene determines what happens, or the remaining genes do; I think it comes from that. Okay, and also the sensitivity and all this? And... yes, okay.
37:27
Okay, well, canalizing with an i means to make canals,
37:49
so to make canals; it's a legitimate word. For example, if you have a riverbed and you want to have ships go along it, you transform it so
38:11
that ships can travel on it, and I think it also means to dig separate canals. But why
38:20
biologists would use that term, I have no idea. Last question? No more questions, and thanks again.