5/6 Nilsequences
Formal Metadata
Number of Parts: 18
License: CC Attribution 3.0 Unported. You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose, as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers: 10.5446/16420 (DOI)
Transcript: English(auto-generated)
00:31
And today I want to talk about some things related to Szemerédi's theorem, mainly.
00:43
But before I get started on that, I would like to say a few words about the inverse theorem for the Gowers norms, which is something I've mentioned several times in
01:01
these lectures. Let me just remind you of the statement. So suppose that f is a bounded function, bounded pointwise by one. And suppose that its Gowers k-th norm is at least delta.
01:33
Then the conclusion is that there is a nilsequence chi of n, which is phi, an
01:46
automorphic function, evaluated at a polynomial sequence of class k minus 1, which correlates with f.
02:04
So the inner product of f with chi is at least some positive constant. And I spent, in fact, a whole hour going over the detailed formulation of this,
02:20
which requires you also to put some bounds on the complexity. So with bounds on the complexity and on the smoothness of phi. So I've mentioned this many times.
02:41
And it is a very difficult theorem, very long to prove. But I can at least give some ideas of how the proof goes. And there's one thing in particular that I have neglected to mention so far, which is that in the case k equals 2, this is actually very straightforward.
03:01
So I'm going to make a few remarks on the proof. So the case k equals 2 is classical Fourier analysis.
03:22
And when I did some classical Fourier analysis in my first lecture, I got the normalizations very wrong. And oh no, Sophie's left. So if I get them wrong again, there's nobody to correct me. And the point here anyway is that there's a formula,
03:42
the Gowers U2 norm of f, to the power 4. Well, it's a sum of f over parallelograms.
04:13
And actually, I normalized it like that.
04:21
And actually, there's also an additional constant there coming from the fact that I'm not working in a group. But it's basically this. And this can be expanded. So it's constant over n cubed times the integral from 0 to 1 of the average value of the exponential sum of f, to the power 4,
04:56
times n to the 4.
05:01
So this is a formula that's very similar to the formula I showed you for copies of x, y, and x plus y weighted by f. You can establish it by orthogonality relations, so proof by orthogonality relations.
05:27
And we also have to combine with that the Parseval identity. So Parseval identity.
05:43
So that will tell you that the L2 norm of f, so the sum over x of f of x squared, is the integral from 0 to 1 of this exponential sum
06:07
squared d theta times n squared. So if you compare these two facts, well, the result follows quite easily.
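Both identities just stated are easy to check numerically if one works honestly in a cyclic group Z/N, where the normalizations are cleanest. The following is a sketch, not from the lecture; the group size and the random test function are my own choices.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 16
# a random 1-bounded complex function on Z/N
f = rng.uniform(-1, 1, N) + 1j * rng.uniform(-1, 1, N)
f /= np.max(np.abs(f))

# U^2 norm to the 4th power: the average of f over parallelograms,
# (1/N^3) sum_{x,h1,h2} f(x) conj(f(x+h1)) conj(f(x+h2)) f(x+h1+h2)
u2_4 = sum(
    np.sum(f * np.conj(np.roll(f, -h1)) * np.conj(np.roll(f, -h2))
           * np.roll(f, -(h1 + h2)))
    for h1 in range(N) for h2 in range(N)
) / N**3

# Fourier side, with fhat(r) = (1/N) sum_x f(x) e(-xr/N)
fhat = np.fft.fft(f) / N

# ||f||_{U^2}^4 equals the fourth-power sum of the Fourier coefficients...
assert np.isclose(u2_4.real, np.sum(np.abs(fhat) ** 4))
# ...and Parseval: the mean of |f|^2 equals the square sum of coefficients
assert np.isclose(np.mean(np.abs(f) ** 2), np.sum(np.abs(fhat) ** 2))
```

Comparing the two displayed sums is exactly the comparison made in the lecture, just with the interval replaced by a cyclic group so that both identities are exact.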
06:21
So under the assumption that the Gowers norm of f is at least delta, so the U2 norm of f is at least delta, then that implies that the L4 norm of that Fourier transform integral, so that is bounded below by a constant
06:51
times delta to the 4 over n.
07:02
But on the other hand, this is at most the sup over theta of this exponential sum squared times the L2 norm.
07:33
And if you compare those two sides and use the Parseval identity, you get the statement that I claimed.
07:41
So using Parseval gives the claim. And in fact, and this is a slight quirk of the k equals 2 theory, it gives a very specific type of nilsequence, with chi of n equals e of n theta.
08:06
So this is a very specific type of class one nil sequence.
08:25
And what's happened here is, well, this is a nil sequence with a vertical frequency. I introduced the notion of vertical frequency before. And while the inverse theorem is true with the ostensibly stronger statement
08:42
that chi has a vertical frequency, and you just get that for free from the inverse theorem as I've stated it, plus a decomposition of chi into nil sequences with a vertical frequency. So somehow those two steps get concatenated together inside this proof.
09:00
Anyway, so the point is that this is a relatively straightforward fact. And then the general case is done by induction on k. For the general case, we'll work by induction on k
09:31
using the crucial relation, which I'm pretty sure I mentioned before. So using the fact that the uk norm,
09:42
or rather the U k plus one norm to the power two to the k plus one, is an average of U k norms of derivatives of f. So it's the average over h of the U k norm of the multiplicative derivative,
10:11
to the power two to the k, where here delta h f of x is f of x times the conjugate of f of x plus h.
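Again on a cyclic group, where the statement is exactly true, the recursion can be tested numerically. This is a sketch; the group size and the random unimodular test function are my own choices.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 12
f = np.exp(2j * np.pi * rng.uniform(0, 1, N))   # a random unimodular f on Z/N

def deriv(g, h):
    # multiplicative derivative: (Delta_h g)(x) = g(x) * conj(g(x+h))
    return g * np.conj(np.roll(g, -h))

def U2_4(g):
    # ||g||_{U^2}^4 = average over h1, h2, x of Delta_{h2} Delta_{h1} g (x)
    return np.mean([np.mean(deriv(deriv(g, h1), h2))
                    for h1 in range(N) for h2 in range(N)]).real

def U3_8(g):
    # ||g||_{U^3}^8, directly from the three-dimensional cube average
    return np.mean([np.mean(deriv(deriv(deriv(g, h1), h2), h3))
                    for h1 in range(N) for h2 in range(N) for h3 in range(N)]).real

# the recursion: ||f||_{U^3}^8 is the average over h of ||Delta_h f||_{U^2}^4
recursed = np.mean([U2_4(deriv(f, h)) for h in range(N)])
assert np.isclose(U3_8(f), recursed)
```

The check works for the same reason the recursion holds: the U^3 cube average is just the U^2 cube average applied to each derivative, and the multiplicative derivatives commute.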
10:23
There are slight technical inaccuracies in what I've written here to do with the fact that this range, the fact that the interval one up to n is not quite a group, but that statement would be literally true if I was working in the cyclic group
10:41
in which I've embedded this interval. So if you have that fact, then it follows fairly easily. Hence, if the Gowers U k plus one norm is at least delta, then that implies that many of the derivatives of f
11:01
have a large uk norm. So you'd get a statement of the form bigger than delta to some constant. I didn't work out what the constant is. No worse than two to the k,
11:20
but probably better than that for many values of h. And then, well, if you have a, we're working by induction, so you can conclude from this
11:42
that this derivative correlates with a nil sequence of class k minus one. So this implies that the average over x of f of x, f of x plus h bar, chi sub h of x is,
12:09
it's bounded below by some constant for these same values of h.
12:24
And then, well, so all of that's relatively straightforward. The difficult bit is then integrating this assertion about derivatives to get an assertion about the function itself. So to get some kind of feel for what that task might involve,
12:42
and we then need to integrate this statement. Well, let's consider a very simple example. So suppose that the function f I started with
13:02
was just a quadratic phase. So suppose that f of x is e to the two pi i theta x squared.
13:22
Then what could I take chi to be inside this statement? So I can take, I believe, chi sub h of x to be e of two theta h x. And then actually, the left hand side here would just be,
13:45
it would actually have absolute value one, so it's just a constant, depending on h. So you can see that in this example, the way that this varies with h is by no means arbitrary. This varies linearly in h.
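That calculation is easy to confirm numerically. In the sketch below, the conventions are my own: e(t) denotes e^{2 pi i t}, the derivative is taken as f(x) times the conjugate of f(x+h), and the resulting sign in the linear phase depends on that choice. The derivative of a quadratic phase then correlates perfectly with a phase that is linear in both h and x.

```python
import numpy as np

def e(t):
    return np.exp(2j * np.pi * t)

theta = np.sqrt(2) - 1          # an arbitrary irrational phase (my choice)
N = 1000
x = np.arange(1, N + 1)
f = e(theta * x**2)             # the quadratic phase from the lecture

for h in [1, 7, 42]:
    delta_h = f * np.conj(e(theta * (x + h) ** 2))   # multiplicative derivative
    chi_h = e(-2 * theta * h * x)                    # linear in both h and x
    corr = np.mean(delta_h * np.conj(chi_h))
    # delta_h * conj(chi_h) = e(-theta h^2), constant in x, so |corr| = 1
    assert np.isclose(abs(corr), 1.0)
```

The correlation has modulus exactly one because the product of the derivative with the conjugated linear phase is the constant e(-theta h^2), which is the "constant depending on h" mentioned above.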
14:10
And one would expect something like the same phenomenon to be true generally, at least for the top order terms, whatever that means, of this nil sequence chi sub h.
14:22
So this example suggests the idea of trying to establish
14:41
some kind of linear behavior in chi sub h. So we need to, we're going to have to do that. Well actually, this has another property,
15:03
which is that it's symmetric in h and x. And it turns out that you also have to establish some kind of version of that property as well. So also, this comes a bit later down the line, so symmetry in h and x.
15:24
So in other words, you've got to gather evidence that this chi sub h of x is behaving as it should if it really is a derivative of something that's of class one greater. So how is this done? Well, the argument is a beautiful argument due to Gowers
15:46
that gives you a handle on this. And actually, what I'm going to show you is a simplification of Gowers' argument, which reduces some of the magic of it, I think.
16:01
So let's take that assertion, let me give that a name. Actually, well, there should be absolute values around there, but let me just ignore that. And in fact, if you just twist the whole thing by an appropriate phase, you can ignore that.
16:22
So I'm going to ignore them. And from star, I just get that the average over h and x of f of x, f of x plus h bar, times chi sub h of x is bounded below.
16:45
And now make a change of variables. Let me just be doubly sure which one. Yeah, so I'm going to put x equals n and x plus h equals m. I suppose I didn't really need the first change of variables.
17:04
And then what you get, that implies that the average over n and m of f of n, f of m chi sub m minus n of n is bounded below.
17:27
And again, I've been a little bit vague about the ranges that these things go over, but that's a purely technical matter. So now we can apply the Cauchy-Schwarz inequality. In fact, the way in which Cauchy-Schwarz is applied
17:40
is very similar to the way it's applied in many other parts of the theory, but I'm not sure I ever gave any details in any particular case. And maybe I'm not going to give any details now, in fact, but I think it's sort of clear what you do. So you're going to apply Cauchy-Schwarz first in the n variable,
18:01
and that allows you to delete f of n, because it's bounded, and then square what remains. And then I apply it in the m variable plus a new dummy variable that came out of the squaring, and that allows me to delete these ones as well. So by two applications of Cauchy,
18:24
Cauchy–Schwarz times two, that implies that the average over m, m prime, n, n prime of chi sub m minus n of n,
18:47
times all the appropriate primed expressions with appropriate bars, is bounded below. And what you can see about this is that it's a statement only involving chi. Note that f has disappeared.
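Written out with expectations (suppressing ranges and normalizations, and writing c for positive constants that are allowed to change from line to line), the two Cauchy–Schwarz steps sketched above take roughly the following shape:

```latex
% Starting point: \mathbb{E}_{n,m}\, f(n)\,\overline{f(m)}\,\chi_{m-n}(n) \ge c, with |f| \le 1.
% Cauchy--Schwarz in n deletes f(n):
c^2 \;\le\; \mathbb{E}_n \Big|\, \mathbb{E}_m\, \overline{f(m)}\,\chi_{m-n}(n) \Big|^2
      \;=\; \mathbb{E}_{n,m,m'}\, \overline{f(m)}\,f(m')\,
            \chi_{m-n}(n)\,\overline{\chi_{m'-n}(n)} .
% Cauchy--Schwarz in (m,m') deletes f(m), f(m') and introduces a dummy n':
c^4 \;\le\; \mathbb{E}_{m,m'} \Big|\, \mathbb{E}_n\, \chi_{m-n}(n)\,\overline{\chi_{m'-n}(n)} \Big|^2
      \;=\; \mathbb{E}_{m,m',n,n'}\, \chi_{m-n}(n)\,\overline{\chi_{m'-n}(n)}\,
            \overline{\chi_{m-n'}(n')}\,\chi_{m'-n'}(n') .
```

The final line is the four-fold average involving only chi referred to in the lecture.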
19:15
And that's good because I want to prove some statements about how chi varies with h.
19:22
Unfortunately, it's not the most pleasant-looking statement to have to deal with, but I can make it look a bit more pleasant with another substitution. So let me remember what that is.
19:41
So if I set h one equals m minus n, h three equals m prime minus n, h four equals m minus n prime, and h two equals m prime minus n prime,
20:06
this expression then becomes that the average over h one, h two, h three, and h four of the average over n of chi sub h one of n,
20:23
times chi sub h three of n bar, times chi sub h two of n plus h one minus h four, times chi sub h four of n plus h one minus h four bar, is bounded below by a constant.
20:42
And actually, there's a constraint on these h one, h two, h three, and h four. They must satisfy the additive relation h one plus h two equals h three plus h four. So it may not actually look any better
21:01
than what I had before, but let me show you now what happens in a specific case. So suppose that we were in the case of the U3 norm. So suppose that k equals two. So when we did the induction, the derivatives would all correlate with a linear phase because that's the U2 norm.
21:23
And so in that case, chi sub h of n is e of theta sub h times n. And if you substitute this in here, well, you get a geometric series. So substituting in and summing the geometric series,
21:59
well, what you get is,
22:02
so the phase will be just theta sub h one, and actually, there should be some bars. So theta sub h one plus theta sub h two minus theta sub h three minus theta sub h four is the phase that you are averaging over. And when you sum that geometric progression, it's negligible unless that phase
22:21
is very, very close to zero. So if you do this calculation, what you get is many, so more than a constant times n cubed, quadruples h one, h two, h three, h four,
22:47
which are what we call additive quadruples. So with h one plus h two equals h three plus h four. So that's called an additive quadruple.
23:08
And with a very similar property for theta. So theta sub h one plus theta sub h two minus theta sub h three minus theta sub h four
23:27
is basically bounded by one over n. These symbols here contain dependence on the delta. And actually, I'm not entirely sure I've got my signs right. Let me, I'll stick with that for now.
23:46
So this now looks a little bit more suggestive. I hope you agree. It's sort of suggesting that theta, well, it really does look a little bit like a homomorphism. A genuine homomorphism would take a relation like this
24:00
and preserve it. It would send it to zero. So it suggests that the map h maps to theta of h is behaving like a homomorphism.
24:24
Well, that's good because I wanted to show precisely that the variation of theta h in h is linear. But unfortunately, it's not, well, actually, fortunately, because otherwise the whole theory would be wrong,
24:43
this does not imply that it's a genuine homomorphism. So there are, so examples of maps like this which satisfy these two properties,
25:01
but which is not a homomorphism would be a map of the form. So theta h is something like alpha brackets beta h where this bracket is the fractional part as usual.
25:26
So why would that satisfy that property? Well, restricted to the set of h for which this fractional part is less than one tenth, let's say, this will be a genuine homomorphism, because the fractional part is additive
25:41
for such small fractional parts. But it's not a globally additive function. So this is not a genuine homomorphism. Now, just having examples is not good enough. But it turns out that there's a converse to that example.
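The failure pattern of this example is easy to see by direct computation. In the small numerical sketch below (the values of alpha and beta are arbitrary choices of mine), the additivity defect theta(h1) + theta(h2) - theta(h1 + h2) is always either 0 or exactly alpha, according to whether the fractional parts wrap around:

```python
import math

alpha, beta = 0.37, math.sqrt(2)     # arbitrary illustrative parameters

def frac(t):
    return t - math.floor(t)

def theta(h):
    return alpha * frac(beta * h)    # the bracket map from the lecture

additive = failures = 0
for h1 in range(1, 200):
    for h2 in range(1, 200):
        defect = theta(h1) + theta(h2) - theta(h1 + h2)
        if abs(defect) < 1e-9:
            additive += 1            # genuinely additive on this pair
            continue
        # when additivity fails, it fails by exactly alpha (a wrap-around)
        if not math.isclose(abs(defect), alpha, rel_tol=0, abs_tol=1e-9):
            failures += 1

assert failures == 0                 # the defect is always 0 or alpha
assert additive > 0                  # additive on a positive fraction of pairs
# ...but theta is not a global homomorphism:
assert any(abs(theta(h1) + theta(h2) - theta(h1 + h2)) > 1e-9
           for h1 in range(1, 50) for h2 in range(1, 50))
```

So the map is a homomorphism on a large set of pairs, exactly as described, without being one globally.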
26:19
So actually, more or less, those are the only examples.
26:22
And that's the key, that's the heart of the proof actually.
26:54
So a relatively deep fact
27:00
that these are the only examples. These and related examples are the only ones. So to state a precise theorem: theta sub h would equal
27:27
a sum of a few of these bracket terms. And then you have to allow an error tolerance as well because the assumption has an error tolerance.
27:42
Where d here is bounded and this is for a large fraction of h.
28:04
So this fact, well morally is in the work of Tim Gowers on this topic. And it uses tools from additive combinatorics, which I don't want to go into in these lectures. So proof uses what's now called
28:21
the Balog–Szemerédi–Gowers theorem and Freiman's theorem and some facts from the geometry of numbers. Well, I guess those are inside the proof of Freiman's theorem.
28:43
It's quite a non-trivial fact. I have a sort of sketch proof of a more direct deduction of this form from this, which I haven't fully worked out the details of yet. But it seems as if this is such a natural statement, one shouldn't have to go through all of that machinery
29:01
to get there. But it's, anyway, this is quite deep. And well, something that may strike you if you've remembered the first couple of lectures is that this is not the first time we've seen these bracket objects. So the brackets should remind us
29:21
of something we've seen before. So the brackets remind us of the Heisenberg group. And indeed, if you have a derivative,
29:43
so if chi sub H of N varies in this bracket way, then it is in fact the derivative of something coming from the Heisenberg group. If you also establish a certain symmetry property. So it turns out we can construct nil sequences
30:09
coming from, in fact, you need products of Heisenberg nil sequences,
30:26
which have, so let me call this chi, such that chi sub H of N is equal to E of this bracket.
30:55
But as I say, you can't always do that. You need to establish a sort of symmetry statement as well.
31:09
So we need a certain symmetry of chi sub H of X in H and X. And it's actually not too easy to even state what that should mean,
31:25
even the statement of which is tricky. But that's a sketch of how the proof of the inverse theorem for the U3 norm would go.
31:42
And it's complicated, but the other thing I want to emphasize about it, and this is one reason I'm giving these lectures and writing the associated notes, is that this proof is just the wrong proof, in that the Heisenberg group, which is this very natural nilpotent group associated with the problem,
32:01
is constructed in a totally ad hoc manner by first of all finding some bracket expressions like these, and as I said earlier in these lectures, somehow one feels one should never really be working with bracketed expressions if you can help it. It's the nilpotent objects that are more natural.
32:22
And then you just construct by inspection the Heisenberg, the nil sequence on the Heisenberg group which gave rise to those. So it doesn't seem natural. And while I would like somebody to find a more natural argument, there's a different approach to all of this by Szegedy. So it's the two by two Heisenberg group?
32:42
This one, the three by three. Yeah. So you could make each one of these from a copy of the three by three Heisenberg. You can prove actually directly, that's in a sense no surprise, you could prove just by looking at
33:03
doing some simple computations with class two Lie algebras that you can make every example from a Heisenberg example up to some small errors. So the Heisenbergs, you know in advance that they're going to be enough. So you don't mean that the abelianization
33:20
has large dimensions, bigger than two? Well, the abelianization of a product of D Heisenbergs has dimension two D. I think you're happy taking products. You need to take products sometimes. Yes. You could also realize this as coming from a larger dimension class two group
33:42
that is not a product of Heisenbergs. So there's a certain amount of flexibility. It's not a unique representation in each case. The dimension here is related to the D or given D? Yes. So this D, we have to tolerate this D.
34:00
Roughly what happens is, so here's a one dimensional bracket homomorphism, and this will be a homomorphism maybe half the time, whenever the fractional part of this is between minus a quarter and a quarter. And if you have D brackets,
34:21
this should be a homomorphism a fraction two to the minus D of the time. So you certainly need to let D grow like log of one over delta. You have to let it grow. This is a big difference between the class two and the class one theory. In the class one theory,
34:40
you could always just use these objects on the circle R mod Z. And here you need to let the dimension go much higher. So I was going to say that there is another approach to these theorems by Szegedy and Camarena Szegedy, which is quite different. And in a sense is more natural
35:01
because you see an open group in a more conceptual way. But it's somehow equally difficult in the details to this. So it doesn't provide a, I think both arguments are not the right argument for the proof of the inverse theorem.
35:22
There are ways, so in the abelian case, what you can do is a Fourier expansion. Like decompose into a sum, a natural sum. So is there a way that something similar might take place, that you could decompose similarly?
35:44
Well, this is a kind of a dream of mine, that one could find an appropriate Fourier analysis of these class two objects. But it would have to be very exotic, because your G mod Gamma is not given to you to begin with. So you can't just decompose L two of that.
36:02
And in fact, you know that you are going to need to consider many of those G mod gammas. You have to allow arbitrarily large dimension. Yeah. So this already, I don't, I have no real ideas
36:22
of what such a theory could look like. But I do believe there's something to be said that's different to this inductive approach. This was just the U3 norm. So one respect in which the analysis here was very straightforward is that
36:41
I got to just sum a geometric progression in this part of the argument. And for the U4 norm, I won't get to do that. I'll now have an average over objects which are themselves class two. And to evaluate those, you need to use the theory that I mentioned last time
37:00
about equal distribution of nil sequences. And it gets, it's much more difficult. And then it gets still harder for the U5 norm and higher.
37:30
So Tim Gowers did prove a statement for all of the U k norms. And I want to state what he proved. So Gowers' local inverse theorem.
37:48
So this dates from about 1998. And this is also very difficult, but it is, it gives, I think it's easier, at least in the U3 case,
38:02
and also gives better bounds than the full inverse theorem. So the local inverse theorem says the same thing. So suppose that the U k norm of f is at least delta. Then, well, he doesn't get f correlating with an object
38:23
on all of the interval one up to n, but rather some local correlations. And in fact, if you're formulating things that way, the functions that you correlate with locally can just be the constant function. So we can partition one up to n
38:46
into progressions of roughly equal lengths. So I mean, all of the lengths are within a factor
39:02
of two of each other or something. And the lengths are all at least n to some power, some small power depending on delta. So n to the delta to some constant depending on k.
39:23
The dependence here is also quite reasonable, but I won't mention that. So they're much smaller than the original interval you started with, but they're still of size tending to infinity with n. This is definitely not the same d, by the way. Let me call that m instead, such that the averages of f over those progressions
39:53
are bounded away from zero.
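To see the flavour of this conclusion in the simplest possible case, take f(n) = e(theta n squared) with theta exactly rational, theta = a/q. This is a degenerate toy example of mine, not from the lecture; for irrational theta one gets an approximate version via Dirichlet approximation. Along any progression with common difference q, the phase changes by an integer from term to term, so f is literally constant there, and its average on each progression has modulus one:

```python
import numpy as np

def e(t):
    return np.exp(2j * np.pi * t)

# theta = a/q exactly rational: along a progression with difference q,
#   theta*(n + j*q)^2 - theta*n^2 = 2*a*n*j + a*q*j^2, an integer,
# so e(theta*n^2) is constant on the progression.
a, q = 3, 7
theta = a / q
L = 50                                    # progression length

for start in range(1, q + 1):             # one progression per starting residue
    n = start + q * np.arange(L)
    avg = np.mean(e(theta * n.astype(float) ** 2))
    assert np.isclose(abs(avg), 1.0)      # |average of f| = 1 on the progression
```

In other words, f correlates perfectly with the constant function on each of these sub-progressions, which is exactly the shape of conclusion in the local inverse theorem.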
40:10
So it's much less information. It's just very local correlations. Now it seems as if this is a stronger theorem in that you get local correlations
40:20
with just a constant function, but actually this can be really relatively easily deduced. So it's a relatively easy consequence of the global inverse theorem, which I've just been calling the inverse theorem.
40:45
And the reason for that is that any nil sequence is actually locally constant. So any nil sequence phi of p of n can be shown to be locally constant
41:06
on progressions of that length. Although I should say,
41:20
I mean I'm not claiming that Gowers' result is a trivial consequence of some result that has been established by Tao and Ziegler and myself. Rather, our result requires essentially all of the techniques that Gowers used, and then some more. And actually it also gives weaker bounds, so his theorem is preferable if you do care about bounds.
41:53
So, well for anybody who's done a little bit of analytic number theory, I'll leave it as an exercise for you
42:04
to understand why, say, E of theta n squared is locally constant on progressions.
42:23
Seeing as I'm setting it as an exercise, let me be precise about what I mean by that. So i.e., we have that one up to n can be partitioned as a union of progressions, each of length at least n to some small constant, such that on each of them,
42:41
the variation of e of theta n squared, the diameter over each progression of e of theta n squared, is less than n to the minus a constant. So that's the kind of statement
43:01
that's true more generally for nil sequences. But in this case, you could establish that by standard techniques involving Weyl's inequality and so on. It's not easy, but it's classical. Okay, so are there any questions on what I've said so far? So, Gowers' theory on this one?
43:23
Yes, so Gowers has two papers. There's the paper, which I feel that everybody should read, a masterful paper on four-term progressions, where he proves basically this theorem for the U3 norm. And actually, he only states it,
43:41
he doesn't state it quite like this, although it's in the paper. He has some additional quadratic phase here. But then if you apply this observation, you actually just get on some smaller progressions, a local constant-ness. And then he has a much longer paper. So that paper's only 23 pages, truly remarkable piece of work.
44:02
And then the longer paper of 129 pages is doing the general case. And he runs into difficulties that are somewhat parallel to the difficulties that we have with the U4 norm and higher.
44:20
His motivation, I'm going to tell you that now, yeah. So Gowers' motivation was Szemerédi's theorem.
44:50
So this is following Gowers. So I stated Szemerédi's theorem in the first lecture.
45:01
Let me just remind you: Szemerédi's theorem states that if A is a subset of one up to n of size alpha times n, then A contains a k-term arithmetic progression,
45:25
provided n is large enough, at least some n naught of k and alpha. So Gowers drew inspiration from an argument of Klaus Roth in the case k equals three.
45:44
And this is called the density increment strategy. So the idea of the density increment strategy is to prove a dichotomy. So you establish a dichotomy,
46:07
and the dichotomy is of the form either, well, two things happen. So either A contains many k-term progressions,
46:29
or you have a density increment. So you can pass to a sub-progression on which the density of A is increased. So there is a sub-progression P of the interval one up to n
46:50
whose size is bigger than n to some constant, n to the alpha to a constant,
47:00
such that the density of A on P is bigger than alpha by a substantial amount. So it's at least alpha plus alpha to some constant. So the constant will depend on k here.
47:22
So if you establish such a dichotomy, then you can do an iterative argument. And you start with A. If it contains many progressions, then you're finished. Otherwise, you pass to a sub-progression, and then you apply the same argument to that.
47:40
So either that contains many progressions, or there's a further sub-progression. And then you repeat. But you can only repeat a bounded number of times in terms of alpha before you end up with something ridiculous here, because the density is always at most one. So you apply this iteratively.
48:03
So an iterative application of this, as I just said, I won't write it all out, gives the conclusion. And actually, if you figure out what the dependence is,
48:22
the conclusion holds so long as N is bigger than a double exponential of one over alpha to some constant,
48:41
you may ask where a dependence on N could come in. And the answer is that there's actually a third, sort of slightly trivial option. You can't really have a third element of a dichotomy, so let's call it a trichotomy. So there's a sort of trivial option,
49:01
which is that n was too small, n is tiny. So that's how the loop might fail. So that's the strategy. And to establish this dichotomy, what you show is that if this fails,
49:24
that is, if A contains few k-term progressions, then essentially the balanced function of A has a large Gowers norm.
49:41
And I'm dealing with k-term progressions, so I should care about the U k minus one norm; it's at least alpha to some constant. And I will just mention why that's true in just a second. And then that implies that you have a density increment
50:01
by the local inverse theorem. So that's how the deduction goes. This bit is actually quite easy.
50:20
You can almost see this: the statement that comes out of the local inverse theorem is that the average of the characteristic function of A minus alpha is large on some progression, and that's precisely the same thing as saying that the density of A on that progression is large. And there's a trick to get rid of the absolute value signs, which you need to do.
50:45
Any questions on that?
51:18
So how does the first step that I mentioned there go?
51:23
It's a good job I'm only giving six lectures; I've more or less destroyed all of IHES's chalk to tiny bits. Yeah, that goes via what I call the generalized von Neumann theorem. So recall the generalized von Neumann theorem.
51:49
So if I write T sub k of f one up to f k for what I get by counting k-term progressions
52:02
with these functions, then I mentioned that this is bounded by the Gowers norm.
52:21
Sorry, the modulus of this is bounded by the Gowers norm of f i, for any chosen one of those functions f i.
52:48
And as I mentioned before, you can't use this immediately to count the number of progressions in a set, but you can if you're just slightly clever about it. So we'll apply this and split
53:01
the characteristic function of A as alpha plus f, with f being what we call the balanced function of A. And then just by multilinearity,
53:25
the number of progressions, T of one A up to one A, is going to be alpha to the k plus a bunch of terms, the final one of which will be the one with all f's in it,
53:43
but they'll all have at least one f inside them. And each of those terms is bounded by the Gowers norm of f. So this will equal alpha to the k plus big O of the Gowers norm of f.
54:03
And at this point, the dichotomy is clear, because the left-hand side is counting the number of progressions in A. If it's small, well, then this term here would have to cancel with this term here.
54:21
So alpha to the k is not small; it's not zero. So the only way you can have no progressions in A is if the Gowers norm term cancels the main term. Any questions on that? So this easily implies the dichotomy.
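The splitting 1_A = alpha + f can be sanity-checked in a finite model (my own illustration on the cyclic group Z/NZ, not part of the lecture): for a random set the balanced function is Gowers-uniform, so the normalized three-term progression count comes out close to alpha cubed.

```python
# A finite sanity check (on Z/NZ, my own illustration): splitting
# 1_A = alpha + f, the normalized count of three-term progressions in a
# random set is alpha**3 up to a small error, because the balanced
# function f of a random set is Gowers-uniform.
import random

def three_ap_density(A, N):
    """(1/N^2) * number of pairs (n, d) with n, n+d, n+2d all in A mod N."""
    return sum(1 for n in range(N) for d in range(N)
               if n in A and (n + d) % N in A and (n + 2 * d) % N in A) / N ** 2

random.seed(0)
N = 301
A = {n for n in range(N) if random.random() < 0.5}
alpha = len(A) / N
print(abs(three_ap_density(A, N) - alpha ** 3))  # small
```

For a structured set the error term need not be small, which is exactly the dichotomy: either the count is close to alpha cubed, or f has a large Gowers norm.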
54:56
So that's all I'm going to say about the Gowers proof of Szemerédi's theorem.
55:01
So what I want to do to finish today is talk about a more exotic variant of Szemerédi's theorem that I mentioned in the first lecture, one that really uses much more comprehensively all the theory of nilpotent groups.
55:30
So let me state the theorem again. So this is a theorem of myself and Tao from 2010.
55:46
It's an analogue of a theorem that had previously been established in ergodic theory by Bergelson, Host, and Kra, although the conclusion there is quantitatively weaker.
56:04
So it's a sort of strengthening of Szemerédi's theorem for three- and four-term progressions. So suppose that A is a subset of one up to N of size alpha N. Then there is a common difference D, not zero,
56:29
such that A has at least alpha to the k minus little o of one, as N tends to infinity, times N
56:42
k-term progressions of common difference D. And this is only true if k is three or four.
57:04
Well, it's also true if k is one or two, but I think I can leave that as an exercise. And an example of Ruzsa shows that it's false if k is five or higher.
57:20
So it's a strange kind of theorem. And this really uses, as I say, some structure of nilpotent groups. I can only really sketch how this goes, just to give you some idea.
57:45
So the first key ingredient to this is something that I have not had time to tell you anything about. And this is something that we have given a slightly strange name to.
58:00
It's called the arithmetic regularity lemma. And I'm not going to state it precisely, because it's not particularly straightforward to state.
58:20
But what it states is that an arbitrary function, so an arbitrary bounded function from one up to N to the complexes can be decomposed as,
58:50
well, we tend to write this as F sub nil plus F sub small plus F sub uniform.
59:03
But the basic idea is that F sub nil is a nilsequence of class k minus one.
59:22
Actually, I'll tell you what, I should be more disciplined about whether I use k or s. I mean, it's just a dummy variable, but k should be the length of the progression, in which case I'm interested in the Gowers U k minus one norm. So let's suppose this is a nilsequence of class s.
59:42
So fix s at least one. I can decompose an arbitrary function into a nilsequence of class s plus some things that really don't concern me too much in a typical application, although they can be a bit annoying. But this is small in little l two
01:00:00
of N, which for practical purposes means you can always ignore it. And this one here is really small in the Gowers U s plus one norm.
01:00:21
Now that doesn't mean that it's small, certainly doesn't mean it's small in L infinity. So a random plus or minus one function could be this bit here. So in fact, if F is a random plus minus one function, this decomposition would look like zero plus zero plus F.
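That point can be illustrated numerically (my own example on Z/NZ): the constant function has Gowers U² norm exactly one, while random plus-minus-one signs have U² norm of order N to the minus one quarter, which is why such a function lands entirely in the uniform part of the decomposition.

```python
# A numerical illustration (my own, on Z/NZ) of why a random sign
# function sits in the "uniform" part: the constant function has Gowers
# U^2 norm exactly 1, while random +-1 signs have U^2 norm of order
# N**(-1/4), i.e. much smaller.
import random

def u2_norm(f, N):
    """Gowers U^2 norm on Z/NZ, computed straight from the definition."""
    total = 0.0
    for n in range(N):
        for h1 in range(N):
            for h2 in range(N):
                total += (f[n] * f[(n + h1) % N]
                          * f[(n + h2) % N] * f[(n + h1 + h2) % N])
    return (total / N ** 3) ** 0.25

random.seed(1)
N = 64
rand_f = [random.choice([-1.0, 1.0]) for _ in range(N)]
print(u2_norm([1.0] * N, N))  # 1.0 for the constant function
print(u2_norm(rand_f, N))     # much smaller than 1
```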
01:00:46
But this bit can be ignored if you're doing things like counting progressions. So if I'm counting k-term progressions, I want the Gowers U k minus one norm.
01:01:02
So if s is k minus two, which for k equals four means two-step nilsequences, this means I can ignore this
01:01:22
when counting K term progressions. So one way to think of this, I mean, of course there is a precise statement, but you can think of it as, if all I care about is counting arithmetic progressions,
01:01:41
then I can assume that every function is a nilsequence of class k minus two. The proof of the regularity lemma involves an iterated application of the inverse theorem, inside what's called an energy increment argument. So you just keep applying the inverse theorem,
01:02:02
and eventually you spit out a decomposition like this, because something called the energy has gone up every time you do that. Much like in the density increment argument, that can only happen a bounded number of times.
01:02:20
Another way to think of this is, in the class one world of Fourier analysis, this is a bit like decomposing a function into its big Fourier coefficients and the rest of the Fourier coefficients.
01:02:52
So suppose I've applied that regularity lemma to my given set A.
01:03:06
or rather to one sub A. And actually I'm going to simplify quite a bit more than that. For the further discussion, let's just assume that one sub A of n
01:03:23
is equal to one sub S of P of n, the characteristic function of some nice set S evaluated at P of n, where S is a nice set, just like an open ball, an open set,
01:03:43
not a very complicated open set in the Heisenberg nilmanifold. So maybe I should say, I'm going to prove the theorem that I stated in the case k equals four.
01:04:02
And in the case k equals four, I apply this theorem with s equal to two. Because then this error term will be small in the Gowers U3 norm. And it's terms that are small in the U3 norm that can be ignored when counting four-term progressions.
01:04:24
So this is an oversimplification of the truth. I mean, whilst I will have a class two nilpotent group, it may not be the Heisenberg. And nobody told me that the function that I take on the Heisenberg is just the characteristic function of a set.
01:04:45
In fact, that's not even a bona fide nilsequence, because I defined those using smooth functions on G mod gamma. But for the discussion, let me consider that. And let's suppose that P of n is just a polynomial sequence.
01:05:03
I don't really mind what the top term is, just something. And suppose that alpha and beta are highly independent over Q. So one, alpha and beta are highly independent over Q,
01:05:24
which means, by what I said last time, that P of n is very equidistributed. Highly, I mean, it's not completely equidistributed because it's a finite set.
01:05:41
But it's pretty well equidistributed in G mod gamma. Then, as I stated last time, well, there are two steps. So the number of four-term progressions in A,
01:06:12
this is a weighted version of that quantity, but still. Well, by my assumption, well, actually this is equal to
01:06:25
the number of four-term progressions involving S.
01:06:40
But I said last time that the distribution of this tuple, P of n, P of n plus D, P of n plus 2D, P of n plus 3D, does not equidistribute over G to the four.
01:07:01
It's equidistributed inside something called the Hall–Petresco group. So it equidistributes in a subgroup of G to the four mod gamma to the four
01:07:24
called the Hall–Petresco group, HP four of G. And I didn't give the general definition of what that group is; I gave one example in the abelian case. But in this case, it is going to be
01:07:45
the set of X naught, X one, X two, X three, in G to the four, so that's in Heisenberg to the four,
01:08:00
which satisfy that their abelian parts are in four-term progression. So here, pi is the map, just the natural projection
01:08:30
to the abelian part, which is a two-dimensional R2. And the vertical components, Z of X naught,
01:08:43
z of x one, z of x two, z of x three. Let me be precise about this. So if g is the upper-triangular matrix with ones on the diagonal and entries u, v, w, then I'm writing pi of g equals u, v,
01:09:03
and z of g equals w. So the central parts are not in arithmetic progression, but they do satisfy a constraint. So the constraint is that z of x naught, minus three z of x one,
01:09:24
plus three z of x two, minus z of x three, equals zero. So the example I gave of a Hall–Petresco group last time, which was for the reals R
01:09:41
with a funny filtration, we only saw this relation here. But now in the Heisenberg group, there's a kind of abelian bit and a central bit: an abelian projection and a central subgroup.
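To make the coordinates concrete, here is the multiplication law in the (u, v, w) coordinates (a sketch assuming the upper-triangular convention just described; the helper name is mine): the abelian projection pi(g) = (u, v) adds coordinatewise, so it is a homomorphism, while the central coordinate z(g) = w picks up a cross term under multiplication.

```python
# The Heisenberg coordinates made concrete (a sketch assuming the
# upper-triangular convention g = [[1,u,w],[0,1,v],[0,0,1]]; the helper
# name is mine).  The abelian projection pi(g) = (u, v) adds
# coordinatewise, hence is a homomorphism onto R^2, while the central
# coordinate z(g) = w gains the cross term u1*v2 under multiplication.

def mul(g, h):
    """Multiply Heisenberg elements written as (u, v, w) triples."""
    u1, v1, w1 = g
    u2, v2, w2 = h
    return (u1 + u2, v1 + v2, w1 + w2 + u1 * v2)

print(mul((1, 2, 3), (4, 5, 6)))  # (5, 7, 14): pi adds, z gains u1*v2 = 5
```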
01:10:01
Now I think I'm going to draw, well, what did I have on R?
01:10:57
So, well, you could kind of depict the situation like this. This is grossly inaccurate in some ways.
01:11:04
So here's the abelian part of the Heisenberg nilmanifold G mod gamma. So this is supposed to be G mod the commutator subgroup, mod gamma, which is isomorphic to R mod Z squared.
01:11:25
And G mod gamma itself sits above there. And I certainly do not have the ability to draw this accurately. So let me just say it's a three-dimensional object that is circle-fibered above this.
01:11:41
So these are supposed to be somehow circle fibers above here. Now, a typical four term progression will look like the image of, sorry,
01:12:01
the image of a four-term progression under the polynomial map will look like a four-term progression down here, and then in the vertical fibers it will satisfy this additional relation, the relation here. So it won't look like a four-term progression in the fibers, that doesn't even quite make sense, but it will satisfy a constraint there.
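One concrete way to see that relation in the fibers (a sketch under the simplifying assumption that the vertical coordinate along the orbit is a quadratic polynomial in n, as it is for a polynomial sequence on a two-step group; the particular quadratic below is arbitrary): along n, n+d, n+2d, n+3d the combination z0 − 3 z1 + 3 z2 − z3 is a third finite difference, and third differences kill every quadratic.

```python
# Sketch: assume the vertical coordinate along the orbit is some
# quadratic polynomial z(n) (an arbitrary one, for illustration).  Along
# any 4-term progression n, n+d, n+2d, n+3d the alternating sum
# z0 - 3*z1 + 3*z2 - z3 is a third finite difference of a quadratic,
# hence zero -- the vertical Hall-Petresco constraint.

def z(n):
    return 7 * n * n + 3 * n + 5   # an arbitrary quadratic "vertical part"

def hall_petresco_defect(n, d):
    z0, z1, z2, z3 = (z(n + i * d) for i in range(4))
    return z0 - 3 * z1 + 3 * z2 - z3

print(all(hall_petresco_defect(n, d) == 0
          for n in range(-10, 10) for d in range(-10, 10)))  # True
```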
01:12:24
Now, what I haven't done yet is pick my common difference. The theorem that I'm trying to prove is about a specific D. So I'm going to choose my D and in fact, to make the whole argument work, I need to choose many D with this property,
01:12:44
and average over all such D, such that, well, D which don't move things too much in the horizontal direction. So alpha D and beta D are pretty much zero mod one.
01:13:04
So remember, alpha and beta are what occur in the polynomial map P. So a four term progression, if I take a four term progression with that common difference,
01:13:25
so if N, N plus D, N plus two D, and N plus three D is a four term progression with such a D as a common difference,
01:13:44
then it doesn't do anything horizontally down in this torus here, because alpha D and beta D are very, very small. So the four points down here will be very, very close together, and then
01:14:01
their images will be four points up here that are essentially in the same circle fiber. So P of n, P of n plus D, P of n plus two D and P of n plus three D are essentially constant horizontally.
01:14:27
They're all essentially the same under pi.
01:15:02
So if I take a point downstairs here, let me call it T, there'll be a circle above it. And remember, I'm working with a set S in the Heisenberg nilmanifold. So S will be some set,
01:15:24
and it will intersect that fiber in a slice S sub T. So S sub T is equal to S intersected with the circle fiber
01:15:42
above T. So the number of four-term progressions in A for which the common difference D is one of these special ones is essentially the number of solutions
01:16:01
to this equation in these vertical fibers. So let me try and write that down. So the average, the number of four term progressions
01:16:24
and this is only over those special D. So for D special, this is roughly equal to the average over T of the integral over all solutions to z naught minus three z one
01:16:44
plus three z two minus z three equals zero of the product of characteristic functions of this slice S T, at z naught, z one, z two, z three.
01:17:08
So that's just another way of, that's a sort of formalization of what I've been saying. If you have a progression that's essentially constant in the base, that's the same thing as a solution to a certain linear equation in the vertical fibers.
01:17:22
And I've just said take all the progressions which are roughly constant in the base and then it's equal to the second thing that I just said. So the point is that this has a special property, special positivity property.
01:17:43
So you can actually write this as the average over T of a certain convolution square: the integral of one S sub T convolved with one sub minus three S sub T, of x, squared, dx.
01:18:07
And by Cauchy–Schwarz that's at least the average over T of the integral of one S sub T convolved with
01:18:20
one sub minus three S sub T, of x, dx, all squared. But this is just precisely the measure of S T to the power four. So that's the average over T of the fourth power of the measure of the slice S T in the vertical fiber.
01:18:42
And by Hölder's inequality that's at least the measure of S to the power four. So these S T are just slices of the set S, and if I take their fourth powers and average them, I get at least the fourth power of the measure of S.
01:19:00
So that's Hölder. But A is just the set of values of n for which P of n lies in S, and P of n is equidistributed, so the measure of S is essentially alpha. So that's just alpha to the fourth.
01:19:23
So that's, well it's not an easy argument but that's an explanation of why this theorem that I wrote there is true. And the key thing that I want to emphasize is that there is a certain positivity property that is associated to this equation here.
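The positivity property can also be checked in a finite model (my own illustration on Z/NZ rather than the circle): by Fourier expansion, the number of quadruples from a set S satisfying z0 − 3 z1 + 3 z2 − z3 = 0 mod N is a sum of nonnegative terms whose zero-frequency term alone contributes |S|⁴/N, so the solution density among the N³ free choices is always at least delta to the four.

```python
# A finite model of the positivity property (my own check on Z/NZ
# rather than the circle): by Fourier expansion the number of quadruples
# from S with z0 - 3*z1 + 3*z2 - z3 = 0 (mod N) equals
# (1/N) * sum over xi of |S^(xi)|^2 |S^(3 xi)|^2, a sum of nonnegative
# terms whose xi = 0 term alone is |S|**4 / N.  So among the N**3 free
# choices, the solution density is always at least delta**4.
import random

def solution_count(S, N):
    return sum(1 for z0 in S for z1 in S for z2 in S
               if (z0 - 3 * z1 + 3 * z2) % N in S)

random.seed(2)
N = 40
S = set(random.sample(range(N), 12))
delta = len(S) / N
print(solution_count(S, N) / N ** 3 >= delta ** 4)  # True
```

For the five-point pattern discussed next, the analogous Fourier expansion is no longer a sum of squares, which is exactly why this argument breaks down for k equals five.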
01:19:42
So if you have a set of measure delta, the measure of solutions to that equation is at least delta to the power four. And if you try and do this for k equals five, everything looks similar, except downstairs you would now have, well, you'd have a two-step
01:20:05
nilmanifold. And the equation in the vertical fibers is no longer this one, but the one from the next row of Pascal's triangle. So it would be z naught minus four z one plus six z two minus four z three plus z four equals zero.
01:20:24
And that equation does not have the same positivity properties. So you certainly can't use this trick of writing it as a convolution square. And indeed there are examples of subsets of the circle of measure delta for which the measure of solutions to this equation is less than delta to the five. And it's really that
01:20:43
that's underlying Ruzsa's construction. So he first constructs a set with too few solutions to that equation, and then defines his A as the set of return times of a cubic to that set. Okay, so that will be it for today.
01:21:00
And then tomorrow I'm going to talk about how to apply these results to the prime numbers.
01:21:23
How does it fail? So in that case the count is not of order alpha to the five; it's smaller. It actually fails very dramatically, I think. Yeah, I don't think it's even alpha to the C
01:21:40
for any C. Yeah, it's a little bit like the Behrend construction for three-term progressions, essentially. So constructing sets with no solutions to this is quite a similar task to that. It's very related to the phenomenon that you can have sets of density alpha
01:22:03
with fewer than alpha to the 100 times N squared three-term progressions. Can you say some more about this decomposition? So this is, it's an ingredient for,
01:22:24
I think you will use tomorrow. I will not use this tomorrow. So for this theorem, I don't know if there are any people here who know about graph theory and that sort of combinatorics, but it is quite analogous to the Szemerédi regularity lemma,
01:22:41
which can be thought of as a way of decomposing an arbitrary graph into structured pieces plus a kind of pseudo-random part. Actually, more accurately, this is analogous to the hypergraph version of that statement. It comes with very, very bad bounds, that's the problem.
01:23:02
And the reason is that for it to be useful, you need to have a growth function in the statement. So the statement would be choose a growth function, which would typically be the exponential function or something, and then you can make this piece smaller than one over that growth function of the complexity of this piece.
01:23:21
If you don't do that, then you don't get to ignore this term when you're counting progressions. So that's why it's kind of complicated to even state it properly. It's like the traditional regularity lemma for graphs is about what are called epsilon regular pairs.
01:23:44
And actually, there's something called the strong regularity lemma, which this is equivalent to, and which is best seen as a statement more like this. So you can actually find, in various notes of Terence Tao, the regularity lemma stated precisely like this for graphs.
01:24:00
So the structured piece is what piece here? The structured piece in here. So you're saying this is deduced from the regularity lemma? It's not, it's proved by a similar argument. And both arguments can be seen as an effectivization
01:24:21
of projection in Hilbert space. So what you really want to do is decompose: you've got L2 of N, and sitting inside there you have something like L2 of G mod gamma, something like this, but a quantitative version of that, and you want to project. Well, you don't have L2 of G mod gamma,
01:24:41
you have the space of nilsequences. But if you're working with nilsequences of fixed complexity, this is not an algebra. But this is somehow a proxy for projection onto that factor. And actually, the proof is closely analogous to the proof of the projection lemma in Hilbert space, where you just keep decreasing the length.
01:25:04
It's the same proof, essentially.