W.T. Gowers

Udruženi sadržaj Gowers's Weblog
Mathematics related discussions
Ažurirano: prije 7 sati 38 minuta

EDP12 — representing diagonal maps

Sub, 2010-03-13 23:57

In this post I want to use my unfair advantage as host of this blog to push an approach that is the one that I am currently most interested in. So let me precede it with the qualification that although I shall present the approach as favourably as I can, it may well be that Moses’s SDP approach or Gil’s polynomial approach will in the end turn out to be more fruitful. I should also add that this new approach makes use of a crucial insight from the SDP approach, which is the idea that one can prove a result about all sequences , including our troublesome friend 1, -1, 0, 1, -1, 0, … if we bound the discrepancy below by an expression of the form .

Lewis’s theorem

Before I explain the new result, I’d like to discuss a result from the geometry of Banach spaces. (I don’t need to do this, but it is a very nice result and this is a good excuse to write about it. If you just skim this section and take note of the definition of “well spread out” below, that will be enough to follow the rest of the post.) The result is a special case of an important theorem of Dan Lewis, though the proof that I shall sketch is a little different from Lewis’s.

Let be a symmetric convex body in . The minimal volume ellipsoid of is, as its name suggests, the ellipsoid of minimal volume that contains . Now if is this ellipsoid, then obviously the boundary of will touch the boundary of , or else we could just shrink by a small amount and have a new ellipsoid with smaller volume that still contains . (For what it’s worth, this argument relies on compactness, which guarantees that if the boundaries of and are disjoint, then there is a positive distance between them.)

However, a moment’s thought suggests that we ought to be able to say more. Suppose, for instance, that is a sphere and that its boundary touches the boundary of only at two antipodal points. It feels as though we ought to be able to expand in the subspace orthogonal to those points and get a larger affine copy of inside . Note that the problem of finding the minimal volume ellipsoid that contains is equivalent to the problem of finding the maximum volume linear copy of contained in the unit sphere, where by “linear copy” I mean the image of under a linear map.

More generally, it ought to be the case that if we take the maximum volume linear copy of inside the unit sphere , then the contact points (that is, the points where the boundary of touches the boundary of ) should be “well spread about”. The intuition would be that if the contact points are not well spread about, then we could find some big area of the sphere that is nowhere in contact with and expand into that area (perhaps very slightly contracting in other places in such a way as to increase the volume in total).

To turn this intuition into a formal proof, we need to start with a formal definition. When do we count some unit vectors (the contact points) as being “well spread about”? There turns out to be a very nice answer to this: define the unit vectors to be well spread about if there are positive scalars such that the identity matrix . Here, I write for the matrix . One could alternatively think of the as column vectors and write for the sum instead. Either way, if is any vector, then . Another small remark is that saying that can be written as with positive is equivalent to saying that belongs to the convex hull of the , since the trace of is , and this implies that the have to add up to . A more geometrical way of expressing this condition is as follows. A well-known property of orthonormal bases is that each vector is equal to the sum of its projections on to the 1-dimensional subspaces generated by the basis vectors: that is, for every . The unit vectors are well spread about if we can find weights such that every is equal to the weighted sum of the “coordinate projections”: that is, we need positive constants such that every is equal to .

Lewis’s theorem is the statement that the contact points of and the maximal volume copy of inside are well spread about in this sense. (Or rather, Lewis’s theorem is a more general statement of which this is a special case.) Before I sketch the proof, let me point out that it implies Fritz John’s famous result that for any symmetric convex body there is an ellipsoid such that . To show this, let be the Euclidean norm and let be the norm with unit ball . Since we know that for every . If we can write as , where each is a contact point, then since each is a unit vector in both norms, we know that . By Cauchy-Schwarz, the square of this is at most the product of and . But the latter expression is equal to . Therefore, , so , which is equivalent to saying that . And this obviously implies John’s theorem.

And now for the proof of the assertion about the contact points. Let us suppose that is not in the convex hull of the matrices with a contact point. Then by the Hahn-Banach separation theorem there exists a linear functional that separates from all such matrices. We can phrase this in terms of the matrix inner product , which is the trace of . Then the linear functional can be thought of as a matrix with the property that , which is times the trace of , is 1, whereas there is some such that for every contact point . Now is equal to .

Now if is any matrix, then the determinant of is . It follows that if the trace of is , then the determinant of is .

Next, we use the fact that for every contact point . If is another unit vector, then

which is at most If , then this is at most . Thus, there is a positive constant , independent of , such that if is a contact point and then . From this it follows that

By compactness, there is some positive distance between the inner symmetric convex body and the set of all points on the surface of the sphere that are not within of a contact point. Therefore, if we perturb by a sufficiently small amount, we will not cause any point not within of a contact point to go inside . But the volume of is equal to the volume of up to and every point within of a contact point maps to a point of norm at most . Therefore, for sufficiently small , the body has volume greater than that of but still lies inside the unit sphere. The proof is complete.

What has this to do with EDP?

The Erdős discrepancy problem asks us to prove that if is any sequence of length , then there is some HAP with characteristic function such that . Now one way one might imagine trying to show this would be by showing that these functions are so spread about that every sequence of norm (that is, every sequence such that ) has inner product at least with one of them. And that one could prove if one had a representation with , since

Actually, there is an obvious problem with this idea, which is that if all the are positive, then unless all the are singletons then there will be positive elements off the diagonal and therefore no hope of representing the identity. However, one can rescue the situation by considering slightly more general vectors such as convex combinations of the set of all s and their negatives. If is such a combination and , then must have discrepancy at least in one of the HAPs that makes up . And the advantage of these vectors is that they are allowed to have negative coordinates.

I had this thought a few years ago when attempting solve EDP and dismissed it once I spotted that the troublesome 1,-1,0,1,-1,0,… example kills the idea. It simply isn’t true that if is a sequence of length and , then has unbounded discrepancy in some HAP, as the troublesome example (multiplied by ) demonstrates.

However, an important lesson from the SDP approach is that this is a less serious problem than it at first appears, because we are not forced to use the norm as a lower bound for discrepancy. Instead, we can use a weighted norm. In the context of this proof, this suggestion becomes the idea that instead of trying to represent the identity, we could go for a diagonal matrix with large trace instead.

To spell this out, let us define a test vector to be any vector of the form , where and the are characteristic functions of HAPs. Suppose that is a diagonal matrix with entries down the diagonal, and that we can write it as a convex combination , where the are test vectors. Then if is any sequence, we know that . But from the representation of we also know that

It follows that there exists such that , and hence, since is a test vector, that the discrepancy of in some HAP is at least as well.

Thus, we can prove EDP if we can find a convex combination , where the are test vectors, that equals a diagonal matrix with unbounded trace.

Moses Charikar points out a modification of this requirement that is somewhat simpler. If with , then , which is a convex combination of matrices of the form , where both and are characteristic functions of HAPs. Conversely, if we write a diagonal matrix as , where the and are HAPs and , then

From this it follows, as before, that if is a sequence then there is some HAP on which the discrepancy of is at least .

The non-symmetric vector-valued EDP.

Why should we believe that good representations of diagonal matrices exist?The best answer I can give to this is that it turns out to be equivalent to a strengthening of EDP that has (I think) no obvious counterexample. Let me give the strengthening and then prove the equivalence.

We have already met the vector-valued EDP: given any sequence of unit vectors in a Euclidean space and any constant there exists some HAP such that has norm at least . The problem that we shall now take is the “non-symmetric” vector-valued EDP: given any two sequences and of vectors in a Euclidean space satisfying the condition for every , there are HAPs and such that . If we insist that , then this reduces to the usual vector-valued EDP (for ).

The reason this looks as though it is probably true (or rather, that it and EDP are “equi-true”) is that if you try to make one of the sequences have small discrepancy by making it small on some HAP, then you have to make the other one large on that HAP. (This argument applies rather better to the real-valued case.) Also, the condition means that the directional bias of the is somehow similar to that of the , which helps the discrepancies of the two sequences to align with each other.

These are fairly feeble arguments, which reflects the fact that just at the moment I very much want to believe this statement and have not made a big effort to disprove it.

How about the equivalence? This again follows from the Hahn-Banach theorem. Let’s suppose that we cannot represent any diagonal matrix with trace greater than as a convex combination of s. Then the Hahn-Banach theorem implies that there is a separating functional, in this case a matrix such that whenever is a diagonal matrix with trace at least , and whenever and are characteristic functions of HAPs. Moreover, this is an equivalence: if such a matrix exists, then we trivially cannot represent as a convex combination of s.

The first condition (that whenever is a diagonal matrix with trace at least ) is easily seen to fail if is not constant on the diagonal, and if it is constant then we see that its value on the diagonal must be at least .

If such a matrix exists, then choose vectors and such that for every and . (For instance, could be the rows of and could be the standard basis of .) Then Therefore, the condition that is always at most 1 is telling us that is always at most 1. If we now rescale by multiplying the by then we have a counterexample to the non-symmetric vector-valued EDP.

Conversely, if we have such a counterexample, then we can set and we have a separating functional that proves that no diagonal matrix with trace greater than can be expressed as a convex combination of s.

What are the difficulties in carrying out this programme?

The main problem we now face is this. If we want to write a matrix with unbounded trace as a convex combination of s, then we need to use fairly large HAPs. Indeed, the trace of is (if I may mix up sets and characteristic functions), so if our convex combination is then we need to be unbounded. This means more than merely the statement that the weighted average size of is unbounded, since some of the are of necessity negative.

In particular, we need the sizes of the and to be unbounded. But if that is the case, then we are trying to write a diagonal matrix with a large trace as a convex combination of matrices that are very far from diagonal, which forces us to rely on clever cancellations.

Having said that, I would also like to point out that the task is by no means hopeless. For example, if and are the Walsh functions (with respect to some bijection between and ), then equals the identity, even though each individual is maximally far from being diagonal. Thus, there is nothing in principle to stop cancellations of the kind we want occurring. It is just a technical challenge to find them in the particular case of interest.

Note that if we are looking for clever cancellations, then we are more likely to be able to find them if we deal as much as possible in points that belong to several HAPs. This suggests, and the suggestion is borne out by the experimental evidence we have so far, that the diagonal matrix we produce will probably be bigger at points with many factors. But one of the nice aspects of this approach is that one can concentrate on getting rid of the off-diagonal parts and not worry too much about which exact diagonal matrix results in the end. Making sure that its trace is big is a far easier task than guessing what it should be and then trying to produce it.

I will end this post by reiterating what I said after Moses first suggested using SDP: it seems to me that we have now “softened up” EDP. That is, it no longer feels like a super-hard problem with major obstacles in the way of a solution. Obviously I can’t be sure, but I feel optimistic that we are going to get there in the end.


Kategorije: Matematički blogovi

EDP11 — the search continues

Ned, 2010-03-07 12:27

I do not have too much to say about the new programme to use SDP to solve EDP, other than that it is still being actively pursued. One remark I would make is that I have rather forgotten recently about keeping the wiki updated, and SDP is a major omission from it. It would be good to have a theoretical account of the method (which I suppose I could paste quite easily from one of my existing posts), and an experimental page devoted particularly to data connected with this approach, with links to all the matrices, sequences and plots that have been produced.

So far, the data has had some expected features (such as the sequence values tending to be larger when is smooth) and some puzzling ones (such as the fact that is much much larger than ). An initial hope for the method was that the experimental data would give rise to a very precise conjecture, but so far that has not happened. There are, however, various avenues that have not been fully explored, and I still have some hope that we will suddenly stumble on some data that we can fully understand.

One way of doing this is to introduce a smoothing. One idea I had turned out to go too far: in the search for a pretty formula I ended up with a version of EDP that was false. For the record, here is that version. One can think of the discrepancy of a sequence as its largest inner product with the characteristic function of any HAP. Those characteristic functions have sharp cutoffs (in that they go up to for some and then suddenly stop), so one might hope to make the problem nicer by smoothing the cutoffs. One natural way of doing this is to look at functions such as , which take the value at and 0 at non-multiples of . For this to be a sensible idea, we need a “smoothed EDP” to be true. That is, we need it to be the case that for every sequence there exist and such that has absolute value at least . However, this is false for the character-like function . The multiplicativity of implies that it will be enough to show this when , so let us fix some and calculate the sum .

We do this in the usual way. First, let us look at non-multiples of 3. That gives us the sum

which equals . More generally, the sum over multiples of that are not multiples of works out as . One can check that the function is increasing in the interval , so if we sum this over all then the alternating series test tells us that the total is at most 1/3, whatever the value of .

A similar argument works if we use a weight of instead, but the calculations are easier. By the alternating series test, we know that is positive and at most 1. Call this value . Then , which is at most .

At one point I observed that if you have a sequence of bounded discrepancy then the Dirichlet series is uniformly bounded for all positive real , and wondered if we could deduce anything useful from this. The fact that has this property shows that we certainly cannot derive a contradiction, though it leaves open the possibility of deducing that the sequence is forced to have character-like behaviour.

In the light of these observations, what hope is there for obtaining nicer formulae by smoothing? The answer is that even if we cannot smooth the HAPs, we can at least smooth the interval of integers we are thinking about as a whole. That is, instead of thinking of discrepancy as a function of (the length of the sequence ), we could take an infinite sequence , associate with a weight , and think of the discrepancy as a function of the decay rate (or equivalently the half-life, which is proportional to ). That is, we would like to prove that for each there must exist such that has absolute value at least , where as .

If we try to prove this using Moses’s SDP approach, then, as Moses points out, there is a nice formulation of the problem. We would like to find non-negative coefficients that sum to 1, and coefficients such that , such that

for every sequence . This is similar to what we were looking at before, but now we have attached some slowly decaying weights to the .

Note that we could think of the previous version of the problem as doing exactly the same, but with weights that equal 1 up to and thereafter. I am hoping that with these smoother weights, we will get nicer numbers coming out.

I would like to end this post by drawing attention to Gil’s polynomial method. This is a completely different approach to EDP, where one constructs a polynomial over a finite field that is identically zero if and only if EDP is true. Rather than explain the idea in detail, let me link to two comments in which Gil himself gives a nice explanation. He introduced the idea in this comment and an interesting variant of the idea in this one. It would be good to add this method too to the list of possible proof techniques on the wiki.


Kategorije: Matematički blogovi

EDP10 — a new and very promising approach

Uto, 2010-03-02 18:40

From time to time, there has been an input into this project that has given rise to a burst of optimism (on my part anyway). Perhaps the first was the rapid discovery of very long sequences with low discrepancy, and especially the fact that these sequences had interesting structure to them. (The length led me to hope that the conjecture might be false, or at the very least that it might be possible to construct sequences with extremely slow-growing discrepancy, and the structure led me to hope the opposite.) I’m probably forgetting a few things, but the next one I remember is Terence Tao’s amazing observation that we could restrict attention to multiplicative functions if we were prepared to change the problem very slightly. We then discovered (though we sort of knew it anyway) that multiplicative functions are not easy objects to understand …

Since I posted EDP9, there has been a development that has radically changed my perception of the problem, and I imagine that of anyone else who is following closely what is going on. It began with this comment of Moses Charikar.

Moses’s idea, which I shall partially explain in a moment (for about the fifth time) is based on the theory of semi-definite programming. The reason I find it so promising is that it offers a way round the difficulty that the sequence 1, -1, 0, 1, -1, 0, 1, -1, 0, … has bounded discrepancy. Recall that this fact, though extremely obvious, is also a serious problem when one is trying to prove EDP, since it rules out any approach that is just based on the size of the sequence (as measured by, say, the average of the squares of its terms). It seemed to be forcing us to classify sequences into ones that had some kind of periodicity and ones that did not, and treat the two cases differently. I do not rule out that such an approach might exist, but it looks likely to be hard.

Moses proposes (if you’ll excuse the accidental rhyme) the following method of proving that every sequence has unbounded discrepancy. I’ll state it in infinitary terms, but one can give finitary versions too. Suppose you can find non-negative coefficients (one for each pair of natural numbers ) that sum to 1, and non-negative coefficients summing to infinity, such that the quadratic form

is positive semi-definite. Then you are done. Why? Because if were a sequence of s with discrepancy at most , then the first term in the above sum would be at most , while the second would be , which contradicts the positivity in a rather big way.

Why does this deal with the troublesome sequences? Because it is perfectly possible (and necessary, if this method is to work) for the sum of the over all that are not multiples of 3 to be finite. So this method, unlike many previous proof ideas, would not accidentally be trying to prove something false.

Note that to prove that the quadratic form is positive semi-definite, it is sufficient to write it as a sum of squares. So EDP is reduced to an existence problem: can we find appropriate coefficients and a way of writing the resulting form as a sum of squares?

Now this idea, though very nice, would not be much use if there were absolutely no hope of finding such an identity. But there is a very clear programme for finding one, which Moses and Huy Nguyen have started. The idea is to begin by using semidefinite programming to find the optimal set of coefficients for large (that is, for a finite truncation of the infinite problem), which can be done on a computer, and which they have already done (see the comments following the one I linked to above for more details). Next, one stares very hard at the data and tries to guess a pattern. It is not necessary to use the very best possible set of coefficients, so at this point there may be a trade-off between how good the coefficients are and how easy they are to analyse. (This flexibility is another very nice aspect of the idea.) However, looking at very good sets of coefficients is likely to give one some idea about which choices have a chance of working and which don’t. Having made a choice, one then tries to prove the positive semidefiniteness.

As Moses points out, if such coefficients can be found, then they automatically solve the vector-valued problem as well, since we can look at the expression

instead, and the positivity will carry over. As he also points out, if you modify our low-discrepancy multiplicative examples such as by multiplying by the unit vector , where is the largest power of 3 that divides , then you get a sequence of discrepancy that grows like , which shows that this method cannot hope to do better than a bound. But I’d settle for that!

Finally, I want to draw attention to another comment of Moses, in which he introduces a further idea for getting a handle on the problem. I won’t explain in detail what the idea is because I haven’t fully digested it myself. However, it gives rise to some Taylor coefficients that take values that are all of the form for some integer . It is clear that they have a great deal of structure, but we have not yet got to the bottom of what that structure is. If we do, then it may lead to a concrete proposal for a matrix of coefficients that should be a good one.

My optimism may fade in due course, but at the time of writing it feels as though these new ideas have changed the problem from one that felt very hard into one that feels approachable.


Kategorije: Matematički blogovi

EDP9 — a change of focus

Sri, 2010-02-24 12:49

The discussion in the last thread has noticeably moved on to new topics. In particular, multiplicative functions have been much less in the spotlight. Some progress has been made on the question of whether the Fourier transform of a sequence of bounded discrepancy must be very large somewhere, though the question is far from answered, and it is not even clear that the answer is yes. (One might suggest that the answer is trivially yes if EDP is true, but that is to misunderstand the question. An advantage of this question is that there could in theory be a positive answer not just for -valued functions but also for -valued functions with norm at least , say.)

Another question that has been investigated, mostly by Sune, is the question about what happens if one takes another structure (consisting of “pseudointegers”) for which EDP makes sense. The motivation for this is either to find a more general statement that seems to be true or to find a more general statement that seems to be false. In the first case, one would see that certain features of were not crucial to the problem, which would decrease the size of the “proof space” in which one was searching (since now one would try to find proofs that did not use these incidental features of ). In the second case, one would see that certain features of were crucial to the problem (since without them the answer would be negative), which would again decrease the size of the proof space. Perhaps the least satisfactory outcome of these investigations would be an example of a system that was very similar to where it was possible to prove EDP. For example, perhaps one could find a system of real numbers that was closed under multiplication and had a counting function very similar to that of , but that was very far from closed under addition. That might mean that there were no troublesome additive examples, and one might even be able to prove a more general result (that applied, e.g., to -valued functions). This would be interesting, but the proof, if it worked, would be succeeding by getting rid of the difficulties rather than dealing with them. However, even this would have some bearing on EDP itself, I think, as it would be a strong indication that it was indeed necessary to prove EDP by showing that counterexamples had to have certain properties (such as additive periodicity) and then pressing on from there to a contradiction.

A question I have become interested in is understanding the behaviour of the quadratic form with matrix . The derivation of this matrix (as something to be interested in in connection with EDP) starts with this comment and is completed in this comment. I wondered what the positive eigenvector would look like, and Ian Martin obliged with some very nice plots of it. Here is a link to where these plots start. It seems to be a function with a number-theoretic formula (that is, with a value at that strongly depends on the prime factorization of — as one would of course expect), but we have not yet managed to guess what that formula is.

I now want to try to understand this quadratic form in Fourier space. That is, for any pair of real numbers I want to calculate , and I would then like to try to understand the shape of the kernel . Now looking back at this comment, one can see that

Since the bilinear form is determined by the quadratic form I’ll concentrate on the latter (which in any case is what interests me). So substituting into the above formula gives me

The infinite sum is a geometric progression, so this simplifies to

Note that for each the integrand is bounded unless is a multiple of , and more generally is small unless is close to a multiple of and is close to 0. So we do at least have the condition of being close to a rational with small denominator making an appearance here. (Why small denominator? Because then there will be more such that is a multiple of .)

I plan to investigate the sequence 1, -1, 0, 1, -1, 0, … from this perspective. It takes the value at . I shall attempt to understand from the Fourier side why this gives a sequence with such small discrepancy.

Before I finish this post, let me also mention a nice question of Alec’s, or perhaps it is better to call it a class of questions. It’s a little bit like the “entropy” question that I asked about EDP, but it’s about multiplicative functions. The question is this: you play a game with an adversary in which you take turns assigning values to primes. You want the resulting completely multiplicative function to have as small discrepancy as you can, whereas your adversary wants the discrepancy (that is, growth of partial sums) to be large. How well can you do? One can ask many variants, such as what happens if your adversary is forced to choose certain primes (for instance, every other prime), or if your adversary’s choices are revealed to you in advance (so now the question is what you can do if you are trying to make a low-discrepancy function but someone else has filled in half the values already and done so as badly as possible), or if you choose your values randomly, etc. etc. So far there don’t seem to be any concrete results, and yet it feels as though it ought to be possible to prove at least something non-trivial here.

One other question I’d like to highlight before I finish this post. It seems that we do not know whether EDP is true even if you insist that the HAPs have common differences that are either primes or powers of 2. The powers of 2 rule out all periodic sequences, but for a strange parity reason: for instance, if you have a sequence that’s periodic with period 72, then along the HAP with common difference 8 it is periodic with period 9, which means that the sum along each block of 9 is non-zero (because it is an odd number) and therefore the sums along that HAP grow linearly. Sune points out that the sequence is a simple counterexample over , but it’s not clear what message we can take from that, given that periodic sequences don’t work in the case. I like this question, because finding a counterexample should be easier if there is one, and if there isn’t, then proving the stronger theorem might be easier because HAPs with prime common differences are “independent” in a nice way.

Update: I intended, but forgot, to mention also some interesting ideas put forward by Gil in this comment. He has a proposal for trying to use probabilistic methods, and in particular methods that are suited to proving that rare events have non-zero probability when there is sufficient independence around, to show that there are many sequences with slow-growing discrepancy. It is not clear at this stage whether such an argument can be made to work, but it seems very likely that thinking about the problems that arise when one attempts to make it work will be fruitful.


Kategorije: Matematički blogovi

EDP8 — what next?

Pet, 2010-02-19 09:10

It’s taken noticeably longer than usual for the number of comments on the previous EDP post to reach 100, so this is perhaps a good moment to think strategically about what we should do. Individual researchers continually have a choice — whether to take a break from the problem and work on other, possibly more fruitful, projects or to tackle that blank page head on and push towards a new level of understanding — and I see no difference with a Polymath project.

I would be interested in the views of others, but my own feeling is that there is still plenty to think about here. There has been a certain amount of talk about Fourier analysis, and that still feels like an insufficiently explored avenue. A good preliminary question, it seems to me, is this. Suppose that is a quasirandom -valued function defined on for some large , in the sense that all its Fourier coefficients are small. Must there be some HAP along which the sum has absolute value at least ? If so, how quasirandom does need to be? What I like about this question is that I think it should be substantially easier than EDP itself. It could be that a simple calculation would solve it: my attempts so far have failed, but not catastrophically enough to rule out the possibility that they could succeed next time. It also seems a pertinent question, because the functions we know of with very low discrepancy have some very high Fourier coefficients. (I don’t really mean Fourier coefficients, so much as real numbers such that has very large absolute value.) Therefore, proving that low discrepancy implies a high Fourier coefficient would be a result in the direction of proving that these examples are essentially the only ones, which would solve the problem.

Even if we knew in advance that there was no hope of solving EDP, there are still some questions that deserve more attention. As far as I can tell, nobody has even attempted to explain why some of our greedy algorithms appear to give a growth rate of for the partial sums. I would love to see just an informal argument for this.

I am also interested in (without yet having thought much about) a class of examples introduced by Kevin O’Bryant. These are sequences of the form where Kevin also considers the case where is infinite but the tend to zero so that the sum is always finite. Sune has pointed out that every sequence is of this form, but also makes the point that this may still be a good way to think about the problem. At the time of writing, I don’t have a lower bound of even in the case when . If that is not just me being stupid, then it raises the possibility that with an infinite one might be able to get a growth rate of in an interestingly new way. (A first step would be to get with a constant that tends to zero as tends to infinity.)

Another line of enquiry that Sune and Gil have thought about is to come up with toy problems that may be easier than EDP. One of Sune’s suggestions is interesting in that he has variants for which there appear to be counterexamples. This sheds light on EDP itself, because it shows that any proof of EDP would have to distinguish between the real problem and the variants.

I think it might be a good idea if we made a conscious decision that we would all focus on one aspect of the problem. For instance, we could choose a sub-problem with a nice statement and temporarily think of that as the “real” problem that we’re working on. But it might also be a bad idea: perhaps what we’ve done up to now, just letting people comment and others follow up on the comments if they find them sufficiently interesting, is the most efficient process. Incidentally, if you think you had a pretty good idea that got forgotten about, it’s not a bad idea to repeat it from time to time (as various people have done already, both in this project and in the DHJ project).

Anyhow, if anybody has thoughts about how we should proceed, then I’d be very pleased to hear them. In the absence of any suggestions to the contrary, I’ll think about the Fourier question and about trying to come up with interesting new multiplicative functions with low discrepancy.

While I’m at it, here’s a thought about such functions, which could turn out to be a special case of Kevin’s thought. (It might even be the basic idea that led him to his thought.) Our best examples so far (not including finite ones found by brute force) have been character-like functions, which are based on some prime number . Could one have something that is a bit like a character-like function but based on an irrational number? The idea would be to take a Sturmian sequence of some kind but to throw in a few zeros. (One way of doing this would be to have two fairly large intervals mod 1 and one small one, and an irrational number , and make 1 if lies in one of the large intervals, -1 if it lies in the other, and 0 if it lies in the small interval. Then one could do something about the zeros that was somehow based on the original sequence. It might end up being quasimultiplicative rather than multiplicative. I haven’t even remotely checked whether that was a non-crazy idea, but perhaps something vaguely similar could work, and perhaps it might do better than our existing character-like examples. The experimental evidence (by which I mean the 1124 sequences) certainly suggests that there ought to be some theoretical construction that does better than . Perhaps a useful exercise would be to write a character-like function in O’Bryant form and then try to generalize from that.

Another little thought, this time in the direction of toy problems, is this. (It may be a special case of one of Sune’s suggestions — I haven’t checked.) Let us say that an infinite discrete set of positive real numbers is a set of pseudointegers if it is closed under multiplication, and if the growth rate is roughly . I don’t know how rough it would have to be. One could then ask EDP for : a HAP would be defined as a set of the form , where is the set of the first elements of and is some element of . It’s just conceivable that one might be able to find an for which EDP is false, in which case it would tell us that a proof of EDP in would have to distinguish between and .

I believe that something like this has been done for the Riemann hypothesis but I can’t remember the details. Terry, if you’re reading this, you were the one who told me about the RH example — it would be great if you or anyone else could remind me what the precise statement is and give me a reference so that I can add them to this post. [It has now been provided: see the first comment below.]


Kategorije: Matematički blogovi

DHJ latest

Sri, 2010-02-17 01:14

A quick post to say that earlier today I put a new version of the write-up of Polymath1’s proof of the density Hales-Jewett theorem on the arXiv. Very soon it will be submitted for publication. I will not say more at this stage (since I don’t want the journal to evaluate the paper in the public eye) but will report back when I know whether it has been accepted.

Update: it is now submitted.


Kategorije: Matematički blogovi

EDP7 — emergency post

Pon, 2010-02-08 22:31

I don’t feel particularly ready for a post at this point, but the previous one has got to 100 comments, so this is one of the quick summaries again — but this one is even shorter than usual.

We are still doing an experimental investigation of multiplicative functions, trying to understand how special they have to be if they have low discrepancy. Ian Martin has produced some beautiful plots of the graphs of partial sums of multiplicative functions generated by various greedy algorithms. See this comment and the ensuing discussion.

Terence Tao has some thoughts about how one might try to reduce to the character-like case.

I came up with a proof strategy that I thought looked promising until I realized that it made predictions that are false for character-like functions such as and . Even if the idea doesn’t solve the problem, I think it may be good for something, so I have written a wiki page about it. Gil has had thoughts of a somewhat similar, but not identical, kind. Here is a related comment of Gil’s, and here are some more amazing plots of Ian’s. (I think we should set up a page on the wiki devoted to these plots and the ideas that led to them.) Regardless of what happens with EDP itself, I think we have some fascinating problems to think about, which can be summed up as, “What is going on with these plots?”


Kategorije: Matematički blogovi

EDP6 — what are the chances of success?

Pet, 2010-02-05 14:29

I thought it might be good to have a post where I tried to take stock and assess the chances that Polymath5 will actually result in a solution to the Erdős discrepancy problem. But before I talk about that, I want to make the point, which I have made before, that solving the problem one sets out to solve is not a necessary condition for the attempt to count as worthwhile. There is easily enough material on the blog now for it to be possible to base a publishable paper on it, even if quite a lot of that paper ends up being a combination of a survey about the problem and a report on the interesting phenomena that have become apparent as a result of computations, some of which we understand reasonably well and some of which are still rather mysterious.

I am not saying this in order to hint that I think that it might be time to give up. At the moment it feels to me as though we still have momentum, so my enthusiasm is not flagging. What is mathematical momentum? I might define it as follows: an assault on an open problem has momentum if one’s understanding of the problem is continuing to develop. Right now I feel as though with each passing day I understand things about EDP that I did not understand before — and I presume that applies to everyone else too. And for as long as that is the case, one can never quite be sure that the problem won’t suddenly yield.

After Terry’s reduction of the problem to one about multiplicative functions, we have devoted almost all our attention to them. (Almost, but not quite: Klas and Alec are still trying to find a sequence of discrepancy 2 and length greater than 1124. It looks as though 1124 may be the upper bound though.) At this stage, I think it is fair to say that the main thing we are doing is playing around in multiplicative-function space, trying to get a feel for what it looks like. But this playing around has a serious purpose, as I shall now explain.

In my EDP4 post, I wrote about how there seem to be some very different kinds of multiplicative functions, which have unbounded partial sums for very different kinds of reasons. On the one hand you have functions like the Liouville function , which behaves fairly like a random function and therefore has partial sums that grow like (if you believe the Riemann hypothesis, and at least that fast unconditionally). On the other, there are functions like and , which have a great deal of additive structure (as well as their multiplicative structure) and give rise to logarithmic growth. Because of this, it felt as though one might be forced to classify multiplicative sequences to some extent. Some of them would be “similar to character-like functions” and would have unbounded partial sums for similar reasons to the reasons that apply to and , while others would be “not similar to character-like functions”, which would cause them to behave sufficiently randomly to have unbounded partial sums for that reason.

It is still conceivable that some kind of strategy like that could work, but certain difficulties with it have come to light. At one point, I had the following serious anxiety about the approach. The only proof I knew of that the partial sums of the Liouville function were unbounded went via the Riemann zeta function and some very special properties of . Therefore, in order to prove EDP it looked as though it would be necessary to find an entirely new, and highly generalizable, argument that applied to . And this looked as though it might be a known hard problem in number theory.

Fortunately, Terence Tao was able to supply a much simpler proof that these partial sums were unbounded. His proof relies on the fact that the Dirichlet convolution of with the constant function 1 is the characteristic function of the squares, which is a very special property of , so Terry’s proof does not look like the massively generalizable argument that one would need, but at least it is reassuringly elementary. It might still be worth trying to find new ways of understanding this fact, though.

However, I have another worry about the strategy of splitting multiplicative functions into those that are character-like and those that are not. It is that I now believe that there are probably multiplicative functions that have no obvious additive structure but that still have slow-growing partial sums. (For now let me say that the partial sums are slow-growing if they are bounded above by , but I think one might be able to get a small power of .) The evidence for this comes from thinking about algorithms for choosing the values at primes in a way that is designed to keep the partial sums small. The obvious algorithm is a greedy one: if the partial sum up to is positive then let , and if it is negative then let , and do what you like if it is zero. However, computations suggest that this fails spectacularly. I don’t know if anyone has a clear understanding of why this should be — I certainly don’t.

Equally mysterious is a fascinating discovery of Johan de Jong: that if you change your criterion slightly and choose if and only if the partial sum up to is less than -10, you do far better. I would love to have an explanation for this. I don’t care whether it is rigorous as long as it is convincing.

Those two links are part of a sub-conversation that starts here. I think that by this point is fairly safe to say that the experiments are telling us that multiplicative functions with slow-growing partial sums do not have to be all that structured. It is still possible that if you insist that the growth is at most logarithmic then your function is forced to be similar to a character-like function, but it also seems clear that we are not going to be able to prove big growth when the function is not similar to a character-like function.

To me this suggests that we may have to look for a unified proof after all. I do not feel 100% confident about this, but it is where my own energies are concentrated for the time being.


Kategorije: Matematički blogovi

EDP5 — another very brief summary

Uto, 2010-02-02 00:54

I wasn’t expecting to have to write another post quite this soon, so this is another one where I don’t have much to say. Here are a few bits and bobs from the last lot of comments, but there’s plenty more in the comments themselves, and actually I think that it’s not that hard to browse through them now that we have depth-1 threading and quite a lot of the comments are very short.

Johan de Jong came up with a very interesting variant of the problem, in which is replaced by the space of polynomials over . I confess to being a little sad when the problem was solved negatively soon afterwards, as it had looked as though it might be a rather good model problem. However, the solution was nice.

One of the general aims at the moment is to try to show that a multiplicative function of bounded discrepancy must have some kind of character-like behaviour. Terence Tao has come up with an intriguing argument that shows not that but at least something in the right ball park.

Work has continued on a human proof that completely multiplicative sequences must have discrepancy greater than 2. It looks as though the proof will be complete before too long, and not too painful. Some nice tricks of Jason Dyer have come in helpful in reducing the amount of case analysis that is needed.


Kategorije: Matematički blogovi

EDP4 — focusing on multiplicative functions

Sub, 2010-01-30 12:39

Just as I thought that Polymath5 was perhaps beginning to lose momentum, it has received a sudden boost, in the form of this comment of Terry’s, which has potentially completed one of the hoped-for steps of the proof strategy, by establishing that if there is a counterexample to EDP, then there is a “moral counterexample” that is completely multiplicative.

There are a few points that need clarifying here (which I am just repeating from Terry’s comment). The first is that the multiplicative counterexample is over the complex numbers: that is, it is a sequence with values in . The second is that instead of having bounded partial sums (and hence bounded discrepancy), it has mean-square partial sums with bounded lim inf. That is, there exists a constant such that infinitely often. Turning this round, in order to prove EDP it is sufficient to prove that for every completely multiplicative function we have the inequality for some function that tends to infinity.

Since this is asking for something substantially stronger than unbounded partial sums, it makes sense to ask whether this stronger conjecture is true. The answer is that it seems to be about as plausible as EDP itself: that is, the key examples we have seem to behave similarly for both measures of growth. In particular, the mean-square partial sums of the function (the multiplicative function that’s -1 at 3, 1 at numbers that are 1 mod 3 and -1 at numbers that are 2 mod 3) grow logarithmically (so the root mean square grows like the square root of log, with log being the maximum it ever reaches).

Now it seems reasonable to regard this problem as “morally the same” as the problem of proving that a completely multiplicative function has unbounded discrepancy. That is, from a practical point of view, it seems to make sense to focus on the multiplicative case of the original EDP and hope that any proof we manage to come up with will give similar results for complex numbers and mean square averages. In fact, using mean square averages, though it requires us to prove something stronger, is likely to be easier than proving something about the maximum partial sum, since the norm tends to be easier to work with than the norm. (A little example of that is this calculation, which shows that the mean square partial sum of grows logarithmically by taking the sequence of partial sums and splitting it up as an orthogonal sum.)

At the moment, quite a lot of attention is being given to algorithms of the kind I discussed in EDP2. These are algorithms where each time you choose a value at a prime, you follow through all its easy consequences (there are various ways that this notion can be formalized) before looking at the next prime. Some time ago, I used an algorithm like this to find a maximal-length multiplicative sequence of discrepancy 2. The interest now is in whether one can prove by hand that that length (which was 246) is maximal. To read more about this, I recommend looking at a page on the wiki that Sune created, in which he gave a rather clear and concise description of the kind of algorithm we are talking about. And there is also a lengthy discussion of the algorithm following this comment.

I have to admit that from a theoretical point of view, my enthusiasm for this algorithm has dwindled. I had hoped that we would be able to identify regions that were full of odd-smooth numbers, and show that the discrepancy and multiplicativity constraints from those regions were too numerous to be simultaneously satisfiable. If that were the case, then it might be possible to prove it by examining carefully how the algorithm performed, and showing that for each discrepancy it would always terminate with a proof that no infinite multiplicative sequence had discrepancy .

Before embarking on an attempt like this, it is worth thinking about when, in general, analysing an algorithm can be useful for proving a theorem. Here, for example, is a theorem where it does not seem to be useful. One can prove van der Waerden’s theorem for 2 colours and arithmetic progressions of length 3 as follows. At each stage I choose R if I can, and otherwise blue, and if I can’t choose R or blue I backtrack. The proof then goes like this: R, RR, RRB, RRBR, RRBRR, RRBRRB, RRBRRBB, RRBRB, RRBRBB, RRBB, RRBBR, RRBBRR, RRBBRRB, RRBBRRBB, RRBBRB, RRBBRBR, RB, RBR, RBRR, RBRRB, RBRRBR, RBRRBRB, RBRRBB, RBRB, RBRBB, RBRBBR, RBRBBRR, RBRBBRB, RBRBBRBR, RBB, RBBR, RBBRR, RBBRRB, RBBRRBB, RBBRRBBR, RBBRB, RBBRBR, B. At that point I can argue by symmetry: if there is no sequence beginning with R then there cannot be one beginning with B.

Now I actually have a soft spot for this proof, which is simply a depth-first search, and would be quite interested to know more about it. For example, if I run a depth-first search but looking for progressions of length 4, then how will the lengths of the sequences I look at at each stage behave? (The maximum length will be the van der Waerden number W(2,4).) How many sequences will I end up looking at? Can I cut this down a lot by exploiting self-similarities in the tree? (To put that another way, I could build up little lemmas that would tell me that certain subsequences would lead fairly quickly, but not instantly, to contradictions, and I would then not have to keep demonstrating those contradictions over and over again. That would happen every time I backtracked, and the further back the backtracking went, the more powerful the lemma would be.) However, the point I want to make is the negative one that it seems incredibly unlikely that a proof of van der Waerden’s theorem could come out of analysing this algorithm. I don’t know how to justify this assertion convincingly, but what goes on in my head is a thought like this: oh dear, if van der Waerden’s theorem is false, then this algorithm won’t ever terminate, so to prove that it terminates I’m going to need van der Waerden’s theorem. Perhaps another way of putting it is to say that the algorithm merely searches through all possible colourings: it doesn’t have an obvious theoretical advantage over literally writing out all colourings of and checking every single one, as runs through the positive integers, until you reach an such that there is no colouring without a monochromatic progression. (Once, a very long time ago, I hoped that the patterns one sees in the way the sequences expand and contract could be understood in great detail and proved to backtrack right to the beginning, but I soon came to think that the chances that that could be done were minuscule.)

How about an example where analysing an algorithm does help to prove a theorem? One such is a proof of Hall’s theorem. This can be done by describing an algorithm for creating a matching, and proving that the only reason it could have for getting stuck would be if Hall’s condition was violated. Similarly, Bézout’s theorem can be proved by using Euclid’s algorithm to define better and better integer combinations of the original two numbers. Somehow, in both these cases the algorithm is what one might call the algorithmic version of an already existing theoretical proof, whereas this is not the case for brute-force algorithms.

That was not a very successful attempt to distinguish between algorithms that help you prove things and algorithms that don’t. (One other point is that efficient algorithms seem to be more useful for proving theorems than inefficient ones. I’d be interested to know of a counterexample to this — a situation where one can obtain an interesting theorem by proving that a certain inefficient algorithm terminates.) However, in the particular case of searching for long multiplicative sequences of low discrepancy, there is another argument that I find particularly convincing, which I think of as Sune’s test. It can be summarized with the following slogan: your analysis has to apply to the function . And it seems to me that that function shows that the kinds of algorithms we have been looking at will not work unless a significant degree of backtracking is allowed, which is a polite way of saying that it is a brute-force algorithm with a few clever tricks added that do not make it efficient. (I visualize it as follows: the brute-force backtracking algorithm has a tree structure, and brute-force with tricks is a kind of “quotient” of this tree that doesn’t collapse enough for the tree structure to be lost.)

My reason for being so pessimistic is based on what the algorithm would do to the function . I claim that it would not be able to tell that up to cannot be extended to an infinite sequence of discrepancy without lots and lots of backtracking, and that none of the tricks that work so well for small sequences and discrepancy 2 would be of any help. A slightly more detailed argument can be found in this comment.

As a result, I think we are more or less forced to look for theoretical arguments. Here is one possible general proof strategy.

1. Develop a notion of “almost character-like” functions.

2. Prove that a multiplicative sequence that is not almost character-like must have unbounded discrepancy.

3. Prove that a multiplicative sequence that is almost character-like must have unbounded discrepancy.

I would expect 3 to be easy once one had a decent answer to 1. The real challenge is to answer 1 in a way that allows one to prove a good theorem that answers 2.

I have suggested a couple of very preliminary ideas in the direction of 1. I won’t repeat them here. However, I will mention that there is a notion of closeness for multiplicative sequences that has proved itself useful in several contexts. An interesting discussion can be found in this paper of Granville and Soundararajan, starting on page 207. The simplest notion of closeness they mention is this: given multiplicative functions and and a positive integer , we define to be . Note that if and are functions, then for this distance to be finite we need for all primes except in a fairly sparse set.

Granville and Soundararajan have various results that show that multiplicative functions with certain properties (such as partial sums that grow linearly) must be close to characters that very obviously have those properties. Two questions we might think about are these. Suppose we have a multiplicative function that is except on multiples of and 0 at multiples of . Suppose also that its partial sums are bounded. Must it be the Legendre character mod ? (Equivalently, must it be periodic with period ?) And if we make the weaker assumption that the partial sums grow slowly, can we show that the function is close to the Legendre character mod ?

More generally, if is a multiplicative function with partial sums that grow slowly (the meaning of slowly to be chosen in response to the proof), does it follow that is close to a character-like function? (Or as Granville and Soundararajan put it, does it follow that the function pretends to be character-like?)


Kategorije: Matematički blogovi

EDP3 — a very brief report on where we are

Uto, 2010-01-26 13:11

Right at the moment I’d say that Polymath5 has quietened down a little bit but is nevertheless continuing to make progress. There is quite a lot I could write about, but I thought it made more sense to put it on the wiki, so here I shall just briefly draw attention to some recent additions to the wiki (not just by me).

On the experimental side, I am losing track of how new everything is, but a list of the main data can be found at the start of the experimental page on the wiki.

On the theoretical side, Terence Tao has put proofs of some nice facts. One is a very clean proof of Mathias’s result that if the HAP sums of a sequence never go below -1 then they must be unbounded above. The cleanness of the argument comes partly from working with the positive rationals rather than the positive integers and partly from making use of results from topological dynamics. In particular, the latter allows one to talk about a “random rational number ” and give a sensible meaning to quantities such as the probability that .

Terry has also proved, by a fairly intricate combinatorial argument, that there must exist a HAP such that the difference between the maximum and minimum partial sums along that HAP is at least 3. (In the terminology we have been using, the drift is at least 3.)

I have created some pages about general proof strategies. These are linked to from a section of the main Polymath5 page. One is a proof strategy that I have already discussed in some detail in recent posts: showing that a counterexample must have multiplicative structure, tidying up that multiplicative structure as much as possible, and eventually showing that there cannot be any examples with multiplicative structure. A newer idea is to define a different parameter, a bit like drift but measured on a number of different distance scales and averaged. More details about this idea and the motivation for it can be found here.

Incidentally, it is easy to find out what the most recent changes to the wiki have been: on the left of each page is an internal link called Recent Changes.


Kategorije: Matematički blogovi

EDP2 — a few lessons from EDP1

Čet, 2010-01-21 23:18

I plan to continue my policy of writing a new Polymath5 post every time the number of comments on the current one threatens to go into three figures. Since I won’t always have a substantial post’s worth to say, I shall simply write whatever I do feel like saying and have time for. I can’t guarantee that it will always be a full and accurate summary of the discussion so far.

In this post I want to highlight a couple of ways in which my view of the broad proof strategy explained in the previous post has changed. Let me begin with the multiplicative case of the problem (item 1 of the proof strategy). My proposal for this was to attempt to find clusters of odd-smooth (there is now a suggestion that these should be called odd-friable) numbers, and argue that the constraints that these put on small primes cannot all be satisfied simultaneously. I would like to explain now why I do not think it likely that an argument along these lines can be developed.

The first reason is a remark of Sune’s. He points out that if such a proof exists, then we should see how it applies to the Walters-type examples (where is 1 if the last non-zero digit of base 3 is a 1 and -1 otherwise). These examples have so much alternation of signs that they demonstrate that you can have an infinite multiplicative sequence such that any subsequence of drift at least needs to be exponentially long in . Thus, it seems pretty hard to find “clusters” of a kind that would be helpful.

I find that an important argument, but a small part of me thinks there might conceivably be a way round it. However, that lingering hope is crushed by a second argument, which is derived from my experiences of building long low-discrepancy multiplicative sequences by hand.

To explain this second argument, I’d like to distinguish more explicitly between two kinds of constraint that a multiplicative sequence of low discrepancy must satisfy. At the risk of stating the embarrassingly obvious, it must be multiplicative, and it must have low discrepancy. Why did I bother to say that? Well, let’s suppose that, as happens when one tries to build up a sequence, we are given the values on a certain set of integers. The way my by-hand algorithm works, I then try to write down all other values that I can deduce from these ones by directly using either the multiplicativity condition or the low-discrepancy condition. By “directly”, I mean that I do not allow arguments of the following kind: if you set then you are forced by the following chain of implications to get five 1s in a row over here; therefore, . I would count that as an indirect implication.

So what are the direct implications? I can describe these by explaining how I would program a computer to work them out. I would start by calculating everything that follows from multiplicity. That is, if I have three numbers such that and I know the values at two of them, then I can deduce the value at the third. (In practice, the values I know have most often been those at and , but deducing from and is more interesting when you get the chance.)

There are of course infinitely many values that follow from multiplicity, so before I start I fix a large and only go up to that .

Once I have made all the multiplicity deductions I can, I then make discrepancy deductions. One useful one is the following lemma: if the discrepancy is at most 2 and , then the partial sum up to is 0. (The proof is that it must be even and for obvious reasons it can’t be either 2 or -2.) From this it follows, for example, that if , then . More generally, if you can work out what the partial sum up to a certain point must be, and if that point belongs to an interval of values that have been decided, then you can work out the partial sums everywhere in that interval — and it is often possible to do that even if you don’t know all the values before that interval.

So the second stage of my procedure is to make all discrepancy deductions of that kind. They then allow me to make more multiplicativity deductions, which in turn allow further discrepancy deductions, and so on.

The point I want to make is this: in practice, I found myself doing several alternations between multiplicative and discrepancy deductions before I had deduced everything I could. I might add that when I ran out of deductions, I would typically find that making a choice for just one more prime would unleash a surprisingly long chain of implications before I next ran out.

What has this to do with the proposal in my previous post? Well, the clusters argument in that proposal seems to be suggesting that for every assignment of values to all the primes up to some , there will be a cluster of -odd-friable numbers somewhere such that the values you have chosen will force a drift that is too large (at least , say). But the actual experience of searching for long multiplicative sequences of low discrepancy doesn’t feel like this: the contradiction is reached not after one step, when all you’ve done is explore the consequences of multiplicativity, but after several, when you’ve alternated quite a bit between using multiplicativity and using the low discrepancy. So it now seems to me that what I was asking for before was too strong.

There is a trivial sense in which it wasn’t too strong, which is that the numbers up to are themselves a cluster of -odd-friable numbers. But that is reducing the problem to itself, which does not count as a serious proof strategy.

So I now think that the right approach is to try to examine the following question: if you are given a certain set of values that a multiplicative sequence of discrepancy at most must take, how many other values do you expect to be able to deduce? Ultimately, one is hoping to be able to make two contradictory deductions: if is the set of all primes up to , can we make some kind of guess about how large needs to be in order to force us to make two contradictory deductions (whatever values we choose at those primes)?

That second question suggests another. Up to now, we have looked for the longest multiplicative sequence of discrepancy at most 2, and Alec has shown that the best possible bound is 246. However, in choosing such a sequence, we probably commit ourselves to a contradiction much earlier. So a more fundamental question (from the point of view of finding a proof that the partial sums of a multiplicative function must be unbounded) is this. What is the largest prime such that one can choose values for all primes up to , follow through all the implications of multiplicativity and discrepancy at most 2, and not reach a contradiction? This is still a computational question and I think it should be easy to program and quick to run.

At this point I’d briefly like to highlight an interesting post of Guy Srinivasan. He was interested in the question of when one would expect contradictory constraints to arise in the general problem, and devised a probabilistic model for it. (An example of contradictory constraints for him would be something like that and , making it impossible to choose without violating the discrepancy-2 condition.) His model appears to agree remarkably well with the actually observed data, so perhaps it can give us some real insights into what is going on. It would be nice if we could use such a model to guess what the right lower bound should be for the discrepancy of a sequence of length . (Even without a formal proof, it is good to have as precise an idea as possible of what we think ought to be true.) It would be good to try to develop heuristic arguments for the multiplicative case as well. Is there some probabilistic model that would allow us to predict how broad and long the chain of implications described above should be, given the set where the values have already been assigned?

Now let me move on to step 2 of the proposed proof strategy. Here I think I can claim that there has been some progress, though of a rather modest kind. In this comment and the subsequent one, I proposed an argument for showing that if there is a counterexample to EDP then there is a counterexample such that for every the sequence has a well-defined density. Terence Tao confirmed that this does indeed almost certainly follow from well-known dynamical-systems techniques. I haven’t thought about it yet, but I feel fairly confident that one should be able to get the same result simultaneously for all geometric progressions and not just those with common ratio 2. (It seems as though all it would need is a relatively standard generalization to several commuting operators — maybe Terry can confirm this too.)

Why does it help to have well-defined densities along GPs? Well, I don’t know for sure that it does. But I am hoping we could use averaging arguments of the kind introduced in this comment, to produce from a hypothetical counterexample a much nicer example, obtained from averaging along GPs. That brings me to my next point: a discussion of step 3 in my previous post.

Here I think we have something surprisingly simple: I said we should walk before we fly, but in fact it seems to be easy to prove that if you can walk then you can fly. That is, it seems to be easy to prove that if there is a quasimultiplicative example, then there is a multiplicative one. (So although the very long and mostly quasimultiplicative examples that have been generated experimentally are very interesting, they now appear to be a red herring from a theoretical point of view.) The argument is in the comment referred to in the previous paragraph.

I would also like to mention an entirely different proof strategy, which is to modify the problem by shifting the HAPs in order to destroy structured examples, so that all you have to do is analyse much more random-looking ones. A few experiments need to be done before we will know whether this is a promising idea — it seems that we will learn a lot from them whatever they turn out to yield. More details about the modified problem can be found in this comment. [Added slightly later: in the responses to the comment, Sune makes an observation that rather pours cold water on this idea. But the experiment may still be interesting.]

A quick thing to report on the numerical side: if you follow the responses to this comment, you will see that Alec has come up with a completely multiplicative sequence of length 1530, and you can get some idea of how he did it. This sequence holds the current record for the longest sequence yet to have been generated that is of significance to the project. The sequence itself is displayed here.

Finally, I would like to draw attention to a problem I like that ought to be easier than the EDP itself. I also think that if we can solve it then we will have a lemma that could be extremely useful when it comes to “cleaning up” hypothetical counterexamples. Rather than repeat myself, I refer you to this comment and the one after it (and the responses to them). Roughly speaking, a positive solution to this problem would show that there were “few” counterexamples to EDP, which would then tend to show that the operators had to have some kind of stronger compactness property than the trivial one (since after you iterate a few times, you’d have to get a sequence that was substantially more similar to your original sequence than the pigeonhole principle would guarantee). This, coupled with limiting arguments, could enable siginificantly greater tidying up to be done of the original sequence.


Kategorije: Matematički blogovi

EDP1 — the official start of Polymath5

Uto, 2010-01-19 17:09

After several hundred comments (including everything, we’re at 688 at the time of writing) about the Erdős discrepancy problem, it may seem odd to be talking about the project starting. However, the focus of the discussion so far has been on experimental results, and I have deliberately restrained myself from saying too much about the theoretical side of the problem — that is, proposing approaches, trying to prove partial results, etc. — and I think others have too. So from now on any theoretical comments are welcome. I have been trying to delay this moment as long as possible in order to have a chance of doing various other things I have to do, but I have a relatively free couple of weeks coming up, the experimental evidence is starting to lead to the possibility of making some precise conjectures, and Terence Tao recently came in with a theoretical comment: this conjunction of circumstances fatally weakens the argument for holding out any further.

I had planned, in this first theoretical post, to concentrate on approaches that don’t work. The idea behind this was that the more you can rule out, the more your moves are forced and the more likely you are to find whatever correct argument might remain. However, I am less inclined to do this now because the body of experimental evidence does the proof-constraining very effectively without it. As I have already commented, the fact that there is a sequence of length 1124 and discrepancy 2 probably rules out a simple inductive argument for a lower bound of the form (for the discrepancy of a sequence of length ). And the fact that this example has so much structure strongly suggests that we should attempt to prove that an infinite bounded-discrepancy sequence must have structure of this kind. (A key property seems to be quasi-multiplicativity. We basically know what a quasi-multiplicative sequence is, though we have not yet settled on a formal definition.)

When I first suggested the Erdős discrepancy problem as a possible Polymath project, one thing that worried me about it was that, unlike with the density Hales-Jewett theorem (for ) I did not have a suggestion for how one might go about solving it. However, that has changed, thanks to this experimental work. I now have a suggestion for a broad approach, and slightly more detailed suggestions for how one might go about proving the various steps. I haven’t forgotten that my initial suggestions for DHJ3 were eventually abandoned, and that could happen again here. But it is still somehow helpful to have the initial suggestions.

My view of the problem before the 1124 structure emerged was something like this. There was a natural special case of the problem, where you assume that your sequence is completely multiplicative, and not only was this probably already very hard, but it probably wouldn’t help all that much with the general problem. Now I feel a little bit more optimistic about the problem for multiplicative functions (after a small amount of thought about how to produce good ones, and why one might fail) and a lot more optimistic about their relevance to the general problem, at least if one generalizes to quasimultiplicative sequences. So I would suggest the following broad approach to the Erdős discrepancy problem.

1. Prove that completely multiplicative sequences cannot have bounded discrepancy.

2. Prove that every infinite sequence of bounded discrepancy can be “cleaned up” so that it becomes quasimultiplicative, in some yet to be determined, but pretty strong, sense.

3. Generalize 1 to quasimultiplicative sequences.

Let me now say a little bit more about 1-3 in an effort to explain why I don’t feel that carrying out those steps is necessarily hopeless.

A few thoughts about 1.

The best way to understand why I feel optimistic (but only moderately so, as I think there could be some serious difficulties with what I am about to propose) about 1 is to do a little exercise that I did a few days ago, namely generate by hand a completely multiplicative sequence of length 246, all of whose partial sums lie between -2 and 2. It turns out not to be that hard, provided that you are prepared to look ahead at all the consequences of your choices so far (up to 246 — the significance of which is that Alec and his computer have proved that that is as far as you can get). Once you have done that, you start to see that awkward constraints can arise if you have a longish interval with many numbers in it that I’ll call odd-smooth, until someone proposes a better, or more standard, name. By this I mean the product of a perfect square with a product of distinct small primes. Equivalently, I mean a number such that every prime that occurs an odd number of times in the prime factorization is small. The relevance of this is that if is your completely multiplicative function and you want to be 1 (respectively -1), and if are the primes that occur an odd number of times in the prime factorization of , then an even (respectively odd) number of the values must be -1. Thus, each for which you want to be -1 imposes a constraint on the values of the that can be thought of as a linear constraint over . The hope is that if you have too many of these constraints applying to too few primes , then you won’t be able to satisfy them all simultaneously.

The situation isn’t quite as simple as that, because you are rarely forced to take a particular value for . However, if you have several odd-smooth numbers in close succession, then you do at least have many constraints on the values they can take, and there seems to be at least some hope of demonstrating that they cannot all hold simultaneously.

This line of thought suggests a further division of 1 into two subproblems.

1a. Can we say anything about the distribution of odd-smooth numbers? How dense is the set of numbers such that the primes that occur an odd number of times in the prime factorization of are all at most ? (Obviously they will get sparser as you go on. A typical number around will have around prime factors, so most of these will be large, and typically a large prime factor will occur only once.) And can one show that the distribution is somewhat irregular, so that there are intervals that are particularly rich in odd-smooth numbers? (Because the odd-smooth numbers thin out, such an interval would have to be not too far out.) Can one at least argue heuristically that this ought to be the case?

This is the part of the proof strategy that I feel least confident about. It could be that these questions are known to be very hard. But I’m crossing my fingers and hoping for the best: are there any number theorists out there who can help? Would sieves be useful perhaps? If nobody comes forward then I’ll ask on mathoverflow whether anything is known about odd-smooth numbers.

1b. Can some kind of -linear algebra argument be developed that proves that the discrepancy of an infinite multiplicative sequence is unbounded, subject to reasonable assumptions about the distribution of odd-smooth numbers (and their factorizations)?

I admit that 1a looks pretty hard and 1b doesn’t obviously work. But I hope it’s at least a start.

A few thoughts about 2.

I feel much more hopeful that something can be done here, and I think that if we managed to show that the Erdős discrepancy problem was equivalent to the Erdős discrepancy problem for a highly structured class of sequences, then we could be pretty satisfied.

Very roughly, the programme would be this. Over at mathoverflow I asked a question about how much a sequence of 1s and -1s could be tidied up if you are allowed to take pointwise limits of sequences of translates. To see the relevance to us, consider the operation that we have talked about quite a bit. This is the map that takes a sequence to the sequence . If we look at the iterates of , then it is natural to partition the natural numbers into geometric progressions of common ratio 2, one for each odd number, since on each such GP behaves like a shift. The reason this is of some interest is that if has discrepancy C, then so does , and hence so does for every . Furthermore, every pointwise limit of sequences of discrepancy has discrepancy at most .

Let us call a property of sequences stable if it is closed under translates and pointwise limits. An example of a stable property is that of never having three 1s in a row. An example of an unstable property is that of having density 1 (meaning that the density of such that is 1). The reason that the latter is unstable is that one could have arbitrarily long blocks of -1s but far enough out that the density of the sequence was 1, and then one would be able to find a pointwise limit of translates that was zero everywhere. Periodicity and almost periodicity (by which I mean the type of behaviour you see in sequences such as ) are stable properties, and highly relevant to us. Let us call a property inevitable if for every sequence there is a pointwise limit of translates that has the property. Of particular interest are properties that are both stable and inevitable.

Let be a stable and inevitable property. If we keep applying to a sequence , we can ignore everything except the subsequence and pass to a new sequence where this subsequence has property . We can then repeat the exercise for the subsequence , and the stability means that we will not lose the property on the first GP. Continuing in this way, we can in fact get all GPs to have any stable and inevitable properties we like. So it seemed to make sense to find out what such properties were. There have been several helpful answers to this question on mathoverflow.

Unfortunately, but not at all surprisingly, for a general sequence the kind of properties we can get are far too weak to be useful for showing that our original sequence can be converted into one with multiplicative structure. (We can get quasiperiodicity, but the kind of thing we are looking for is much much stronger than that.) However, our sequences are very far from general: they are subsequences along geometric progressions of a sequence of bounded discrepancy. This raises all sorts of possibilities. For example, if we can prove that some rather weak property must hold, then perhaps we can pass to some limit and greatly strengthen the property. (An example of this, though not a very interesting one, is that if we have a sequence of density 1 we can turn it into the all-1s sequence.)

I have made a list of weak properties that bounded-discrepancy sequences ought to have, and for some of them I think there ought to be a good chance of proving that they really do have the properties. Rather than repeating the list here, I refer you to the interesting subquestions section on the first page of the Polymath5 wiki.

To summarize what I’ve been saying here, I think that, given that the discrepancy can be very low indeed for very long sequences, the best chance of proving the conjecture may well be to use infinitary methods. I hope that we may be able to prove some weakish facts about hypothetical counterexamples to the conjecture that suggest some kind of weak multiplicative structure, and then to argue that in the limit there is a much nicer multiplicative structure — some kind of quasimultiplicativity of the kind we have been observing.

Non-thoughts about 3.

I don’t have much to say about this: we should probably walk before we try to fly.

The other direction.

Of course, I don’t think we should abandon the attempt to find longer and longer sequences of low discrepancy: it has been extremely fruitful already, but the patterns we have found have a tantalizing way of breaking up when the numbers get large. At the moment it feels as though we won’t understand the breaking up properly unless we can get our hands on more data. And it’s probably a bit dangerous to assume that the conjecture is true, or at least it should be healthier to attempt to disprove it as well as attempting to prove it. There are many loose ends arising from the discussion so far, and there are likely to be many more. (See the wish-list section on the experimental results page of the wiki for a sample of these — others are scattered in the comments.) Even if the conjecture is true, it would still be interesting to have an efficient algorithm that could generate better sequences than we know how to describe theoretically. (It feels as though we are closing in on the 1124 sequence, or rather sequences, but it also feels as though the description of those would involve some slightly arbitrary elements that would not generalize straightforwardly to infinity.)

At one point I raised the question of whether it would be good to have separate threads for the theoretical and experimental aspects of this project. At the moment I don’t plan to do this: the two seem too closely intertwined, and the experience with DHJ was that having two parallel threads didn’t really work unless they were fairly disjoint. (That applied both to the material being discussed and, to a lesser extent, to the participants.) So what I hope will happen is that the experimental part of the project will continue apace, but that anyone who might have been holding back on discussing the theory will no longer do so — just as I, by posting this, have no longer done so.


Kategorije: Matematički blogovi

The Erdős discrepancy problem V

Sub, 2010-01-16 21:14

To a casual observer it may look as though the frequency of comments to Polymath5 is dauntingly fast — much faster than it was for Polymath1, say. However, I think appearances are misleading, because many of the comments are very short, and on the whole they are easy to understand because they are not too theoretical. I’ll give the usual roundup later in this post, but I thought I’d begin with a few remarks about the polymath process itself, because I think that Polymath5 has already demonstrated a few things that were not clear, or at least were less clear, before.

The most obvious, given the way the discussion has been going so far, is that the process is ideal for problems where there is the potential for an interplay between theory and experiment. When looking at DHJ, we looked at both theory and experiment, but the two were fairly disjoint. This was due to the nature of the problem: it is not computationally feasible to gather data about DHJ except in very low dimensions, and it is therefore difficult to draw any general conclusions from the results. But for the Erdős discrepancy problem the situation is quite different, and we have learned a huge amount from looking at long sequences that have been generated experimentally. (The main three things we have learned are that they can be surprisingly long, that if they are surprisingly long then they probably have to have some multiplicative structure, and that they have this structure in a approximate sense rather than an exact one.)

Why is this significant? Well, one of the points of polymath-style collaborations was supposed to be that it would greatly speed up the process of making and evaluating those little conjectures that you hope will shed light on a problem and, if you are lucky, turn into useful lemmas. The thought was that some people could throw out guesses, and a group of people each with slightly different expertise would be able to assess the plausibility of those guesses much more quickly than any single individual. So both the generation and evaluation of the guesses would happen much more quickly. But here we have a further speed-up, which is that the experimental evidence already alters the prior plausibility of guesses. For instance, one can conjecture with some confidence that an infinite sequence with bounded discrepancy must have a lot of multiplicative structure: before, that would have been much more speculative. (The fact that random sequences fail miserably suggests that some structure is needed, so I suppose I should add that the key word above is “multiplicative”.)

A second difference between Polymath5 and Polymath1 is that there is an additional tool available now, namely mathoverflow. This may sound like a minor addition, but I think it could turn out to be highly significant. Several people noted that the difficulty of keeping up with the discussion was a notable barrier to participation in Polymath1. Presumably this meant that even if some of the questions that were being asked were self-contained, the people who might know the answers to those questions would not necessarily have the time or inclination to search for them (especially as they might not expect to find them). But now, every time a self-contained question arises that seems as though it might be doable by somebody with the right knowledge, it can be posted on mathoverflow, where people can answer it without having to follow what’s going on in the wider discussion. For the purposes of giving credit where it is due, it might be good to have a page on the wiki that lists all questions related to the project that have been asked on mathoverflow and briefly describes their status. (One can also see what they are by using the polymath5 tag at mathoverflow itself.) Needless to say, any crucial help that we get from mathoverflow will be appropriately acknowledged in any paper that might result from the project.

I find it interesting that the wiki for Polymath5 has a rather different flavour from that of the wiki for Polymath1: it is more focused on the discussion itself and less on background knowledge. It will be interesting to see whether when the more theoretical discussion begins the character of the wiki will change.

A small difference between Polymath5 and Polymath1 is that we are (so far at least) not numbering the comments, and instead are making full use of the threading, which I am still allowing only up to depth 1. I think this has worked rather well, especially when people provide links to comments that they refer to. It means that some of the chronological feel of the discussion, which I think is important, is preserved, but it is also split to some extent into mini-discussions, so that when one looks back at the comments later they are better organized. If we can keep this going, and are also assiduous about putting useful information (experimental results, observations, lemmas, outlines of proof strategies, etc.) on the wiki, then I hope that the barrier to late entry will be much lower than it was for Polymath1.

Returning to the mathematics, in the last few hours it has emerged that there is a striking pattern in the original sequence of length 1124. I am so interested in it that I don’t really have the energy to write about anything else that has been discussed in the last 100 comments, but perhaps I’ll add a brief summary later. (Right now I want to get this post out fast because when I last looked there were 99 comments and probably there are 100 by now, or at any rate will be very soon.)

I now think it likely that we will come up with a formula that gives something very close to the 1124 sequence. If so, then I am keeping my fingers very firmly crossed that it will give an infinite sequence with sublogarithmic discrepancy. But we shall see: some computation is still needed before we will know what the formula is. There is also a chance that we will obtain a formula but with parameters that have to be chosen for each prime, and that we will not be able to see how to choose them in general. In that case, we may at least obtain a very efficient algorithm for finding extremely long sequences of low discrepancy.


Kategorije: Matematički blogovi

The Erdős discrepancy problem IV

Čet, 2010-01-14 12:20

Again, this is just a brief report on what is going on. Plenty of work is still being put into making sure that the main results, questions, ideas, etc. are appearing on the Polymath5 wiki and especially (for the moment) the experimental results page, so looking at that is still a good way to keep up.

Alec came up with a sequence of length 1112 that is fairly different from the sequences of length 1124. See this comment and succeeding ones for information about it. I am particularly intrigued by the diagram showing partial sums of pointwise products of different HAP-subsequences. There is data here that demands to be explained, but the focus of the discussion is elsewhere for now.

Kevin has suggested thinking about how the number of sequences of length n compares with the number of sequences of length m, where by “sequences” I mean sequences with relevant properties such as having discrepancy 2 and perhaps being multiplicative as well. There is some interesting data here too.

The logarithm of the number of multiplicative sequences of discrepancy 2 and length n behaves in a suggestive way. Yet more to chew over.

I myself have started thinking about an algorithm for generating long low-discrepancy multiplicative sequences in the hope of proving that the discrepancy of such sequences can grow more slowly than logarithmically.

Gil has suggested looking at randomized modifications of the problem with a view to obtaining insights about the problem itself.

Alec has looked at the behaviour of Dirichlet series built out of some of the good examples we have.

Finally, I am hoping for good news from Klas in the not too distant future.


Kategorije: Matematički blogovi

The Erdős discrepancy problem III

Pon, 2010-01-11 16:59

I’m having to write posts at a ridiculous rate now, but until the official launch (at which point I expect a significant theoretical component to add to the experimental one — one decision we should make is whether to have separate theoretical and experimental threads or whether it is better to have a unified discussion) I’ll keep them short and mainly aimed at summarizing what is going on for the benefit of people who want to keep up with the discussion without reading hundreds of comments.

Most of the comments in the last post have concerned sequences with low discrepancy and additional constraints. If we define the map to be the one that takes the sequence to the sequence , then it is notable that the long sequences that we have found satisfy, not quite all the time but certainly most of the time, constraints such as or more generally . This suggests searching for sequences that satisfy these constraints exactly. One reason for doing so is that there are many fewer such sequences, so searching for low-discrepancy examples should be much quicker, and we have some reason to expect them to exist. And indeed, although we have not matched the length 1124, we have got sequences of length well over 500 that satisfy additional constraints.

The best way to keep up with what has been done is probably to check out the experimental results page on the Polymath5 wiki. And the wiki itself is still growing fairly quickly, so if you read through that then you will have a digest of most of what has been discussed in the comments.

A phenomenon that has recently been observed is that sometimes pairs of HAP-subsequences of good sequences do not agree all that well to start with, but after a while “lock in” to each other. We have not yet properly thought about this, but it seems pretty interesting.


Kategorije: Matematički blogovi

Erdős discrepancy problem, continued

Sub, 2010-01-09 19:47

This is another post for the sake of not having too many comments on any single post. I actually wrote a completely different one yesterday, but it contained some theoretical thoughts that are probably better held back until the project starts in earnest. So I’m going to make this post very short indeed.

A quick report on what’s going on right at the moment. We have just been fed some more data: several more sequences of length 1124 and discrepancy 2, some sequences that have HAP-sums bounded between numbers like -1 and 3 instead of -2 and 2, and some nice two-dimensional examples. The Polymath5 wiki has some details about these and other aspects of the problem.

If anyone feels like doing an experiment that I’d be interested to know the result of, they could investigate the following sequence. I want to know whether it is good for anything. It’s the simplest example I can think of of a sequence that appears to exhibit multiplicative behaviour but keeps breaking the pattern. Let be the completely multiplicative function that takes every prime to , where . Then define to be 1 or -1 according to whether the imaginary part of is positive or negative. (Of course, .) At what kind of rate does the discrepancy of this sequence seem to grow?

Another experimental question: are all the HAP subsequences of the new 1124 examples significantly different from all the HAP subsequences of the old one? In other words, is it reasonable to say that they are completely different examples, or are they more like modifications of HAP subsequences of the original one?


Kategorije: Matematički blogovi

Erdős’s discrepancy problem as a forthcoming Polymath project

Sri, 2010-01-06 10:19

This is an emergency post, since the number of comments on the previous post about Erdős’s discrepancy problem has become unwieldy while I’ve been away enjoying a bit of sunshine in Luxor. My head is full of amazing details from the walls and columns of ancient Egyptian tombs and temples. The main thing I didn’t see, or at least not until I finally saw them sticking up through the mist from my plane window yesterday morning, was the pyramids, since those are near Cairo. However, I didn’t feel too bad about that as my guide book, the Lonely Planet guide to Egypt, assured me that the pyramids never fail to disappoint (which was just one of many little gems of bad writing in that book).

For the first time for ages, I was completely away from all email and internet for a week, but just before I left, the results of the polls, as they then stood, about the next Polymath project were already suggesting that the Erdős discrepancy problem was the clear favourite, so I couldn’t help thinking about it a certain amount while I was in Egypt. I said that the process of choosing the next problem was not going to be fully democratic, but the fact is that I love the Erdős discrepancy problem and am currently somewhat grabbed by it, so I see no reason to go against such a clear majority, especially as it also has a clear majority of people who say that they are ready to work on it.

Now that I’m back, I see that the longest sequence yet found with discrepancy at most 2 in any arithmetic progression of the form has gone up to 1124, and is exhibiting some clear structure. So yet another argument for choosing this particular project is that it has in a sense already begun. I’d like to regard what is happening now as a kind of preliminary stage before the true launch. The fact that such long sequences with low discrepancy can exist, and the fact that they appear to have to have some structure, are two extremely helpful pieces of information, since they place very definite constraints on what any proof could look like. For example, it no longer seems all that likely that the true bound for the best possible discrepancy of a sequence of length is logarithmic in , and even if it is, the proof will not be some easy induction if the bound is logarithmic to a very high base. And I don’t think one can rule out the possibility that there exists an infinite sequence of bounded discrepancy: at any rate, it seems foolish to stop trying to find longer and longer sequences, since there is almost certainly more that we can learn from them.

There are a couple of terminological decisions I’d like to make collectively before we start properly. Then main one is what we should call an arithmetic progression of the form . I’ve never managed to come up with a good answer to this, and the lack of a good phrase for it is annoying when writing about the Erdős discrepancy problem, so any suggestions would be very welcome. [While writing this post, I found a description of the problem that called them homogeneous arithmetic progressions. That seems a pretty good solution to me, so perhaps we should go for that unless someone comes up with a clear improvement.] A more minor decision is how to refer to the problem itself. I think I favour EDP, since three-letter acronyms seem somehow right. The only argument against that is that we talked about DHJ rather than DHJP (or DHJT), but I think that is outweighed by the catchiness of EDP: ED just doesn’t do it for me in the same way. A yet more minor decision is what number to give to this project. I’m pretty sure it should be polymath5 (following on from Terry Tao’s polymath4, which was the search for a deterministic algorithm for finding primes). But before giving it that tag I thought I’d quickly check in case anyone thinks I’m wrong. Another decision I’d like to make, since not having made it clearly yet is another source of slight annoyance when I try to write about the Erdős discrepancy problem, is what counts as a positive answer. I’m not sure what Erdős actually conjectured, but I’d like to assume that the conjecture is that all sequences have unbounded discrepancy. Therefore, if I refer to a positive answer, that is what I mean, and a counterexample is an infinite sequence with bounded discrepancy. Another practical question: is there some way of displaying good examples so that their structure can be more easily seen? At the very least it might be good to put them in tabular form, perhaps with each row of some highly composite length such as 24. If I just see a long line of pluses and minuses and I want to know what the subsequence is like, then it is extremely tedious to find out. I don’t know what WordPress’s support for tables is like, though. If anyone can help me to display the sequence of length 1124 (and others) in a nice way, I’d be extremely grateful.

At some point reasonably soon, I plan to write a post that will describe various approaches to the problem that do not work, in the hope of stimulating a discussion about what kind of approach could conceivably work — which at the moment is not at all clear to me (at least if the conjecture is true). That post, rather than this one, will be the “official launch” of the project, and it is at that point that I shall start working seriously on the problem.

For the benefit of anyone who does not want to wade through well over a hundred comments on the previous post, here is a very quick summary of what we know now that we did not know when I put up the post. The main thing is that it is possible to have extremely long sequences of discrepancy at most 2 (meaning that the sum over any homogeneous arithmetic progression has modulus at most 2). But there is more to say. A simple observation that I mentioned in the previous post is that multiplicative sequences, by which I mean sequences such that for every and , are good candidates for low-discrepancy sequences, since the discrepancy of such a sequence is bounded by the largest size of any of its partial sums, which sort of means that there is less to check.

Of course, that is not a very convincing argument, because the price we pay for having less to check is a huge constraint on the sequence: we are now free to choose its values only at primes and everything else is determined. And the initial experimental evidence quickly showed that multiplicativity was indeed too strong a constraint if one is looking for the longest possible sequence with a given discrepancy, since the longest sequence of discrepancy 2 is much longer than the longest multiplicative sequence of discrepancy 2. However, what the examples given in the comments on the previous post are suggesting is that a weaker form of multiplicativity may well still hold, perhaps in some approximate sense. We can characterize a multiplicative sequence as a sequence that starts with and has at most two distinct subsequences of the form (one of which will be minus the other — there will be exactly two unless the original sequence is 1 everywhere). The weaker property is simply that there should be a small number of distinct subsequences. The very long examples don’t quite exhibit this, but they do show something a bit like it: if you look at the subsequences of the given form, then there are six beginnings that appear a lot. This has led to a suggestion that another way of building good examples is to take an Abelian group , to choose a function such that for all and , and to compose that with a function from to . In the light of what we have seen, a good choice for appears to be . It turns out that if a sequence has only finitely many distinct subsequences , then it must have a subsequence of this form.

It doesn’t look as though the example of a sequence of length 1124 is of this form for the group , but it certainly does seem that we can get some understanding of the sequence (which we do not yet fully have) by looking at it with very much in mind. If the conjecture is false, then it seems possible that to produce an infinite sequence of bounded discrepancy one would start with a long finite sequence produce with the help of a small group, and one would then introduce complications that would eventually be explained by the presence of a much larger group, and one would iterate this process. (That is intentionally vague — I do not know how to make it less so.)

Incidentally, the account of the problem I alluded to above, by Josh Cooper, gives it as an open problem whether a multiplicative sequence can have bounded discrepancy. I have some dim memory of being told by a number theorist once that it couldn’t: does anyone know? The most obvious -valued multiplicative sequence, the Liouville function, defined to be -1 raised to the power the number of prime factors of n, has partial sums that grow at rate that is, I think, known to be at least in the rough region of and at most at this sort of rate if and only if the Riemann hypothesis is true. [Added later: I now know that both these statements are correct.] But what about an arbitrary multiplicative sequence? We know from the Walters base-3 sequence that the partial sums can grow quite slowly, so perhaps this question really is also unsolved, in which case it would make an excellent auxiliary project: a strictly easier question that now appears to be highly relevant to the main question.


Kategorije: Matematički blogovi

The next Polymath project on this blog: some polls

Pon, 2009-12-28 12:56

For reasons that I have already gone into, I do not think that the origin-of-life project is suitable as the next Polymath project on this blog, though there seems to be enough enthusiasm for it that I am quite serious about giving it a try at some point in the not too distant future. The complexity-related project also no longer seems such a great idea for now. That leaves three candidates from amongst the problems that I have posted about recently: the project related to the polynomial-DHJ problem, the project related to the Littlewood problem, and the project to solve Erdős’s discrepancy problem. My impression is that for each of these three projects there is already a small group of highly interested people, and certainly my level of enthusiasm for each of these three problems is enough for me to be ready to devote plenty of time to it. (A theory I want to test is that if I post regularly and am not completely stuck, then that will be enough to keep the project feeling active and attracting contributions from other people as well, even if a proof does not appear to be round the corner.)

To help me decide which of the above three problems to go for, here are four polls. The first is a straightforward choice between the three problems. However, there are drawbacks with such a poll, because the success of a project depends not just on the level of general interest, but also on finding a nucleus of people who are prepared to work seriously on the problem. So I have three more polls, designed to test, for each problem, whether there would be such a nucleus. I have tried to identify three categories of potential participation. An active and enthusiastic participant with relevant knowledge is somebody who expects to follow closely the progress of the project, to think about it hard, and to attempt, with a reasonable probability of success, to make posts that will constitute genuine advances. An occasional contributor is somebody who will not necessarily try all that hard to keep up with the discussion, but will certainly check on its progress from time to time and will make comments if they can think of useful ones without too much effort. (For example, such a contributor might happen to know the answer to a question that somebody has asked. If so, they would provide it.) An interested non-participant is somebody who would be enthusiastic about a project but would not expect to be able to make genuine mathematical contributions to it. If you feel no enthusiasm at all for a given project, then you can express that by not voting for any of the three categories. And you are of course welcome to express more detailed opinions by commenting on this post. Those who followed polymath1 will recall that there was quite a bit of discussion about the danger of steamrollering all over somebody’s existing work on a problem. I hope these problems will avoid that, but if for any reason you are anxious that a given problem should not be chosen and want to express your reservations in confidence to me, then drop me an email. If somebody has put a lot of work into a problem and appears to be close to a solution, then I will avoid treading on their toes. This may be a difficult judgment to make, but with luck I won’t in fact be called upon to make it.

One other thing is that I have not cast any votes myself: obviously I would be happy with any of the three projects and would be an active participant.

View This Poll
surveys View This Poll
polls View This Poll
opinion View This Poll
polls
Kategorije: Matematički blogovi

Wiles meets his match

Ned, 2009-12-20 01:08

A brief return to the theme of mathematics in literature: I can’t resist sharing what is, by a long way, the silliest piece of fictional mathematics I have ever come across. It comes in “The Girl Who Played With Fire,” by the late Stieg Larsson, translated (not very well) by someone called Reg Keeling. Here is a little piece of advice for any author who wants to incorporate mathematics into a novel. If you don’t want what you write to be risibly unrealistic, it is not enough to read popular science books: you must also run what you write past a mathematician.

And here is the passage in question.

Salander began her advance towards the house, moving in a circle through the woods. She had gone about a hundred and fifty metres when suddenly she stopped in mid-stride.

In the margin of his copy of Arithmetica, Pierre de Fermat had jotted the words I have a truly marvellous demonstration of this proposition which this margin is too narrow to contain.

The square had been converted to a cube, and mathematicians had spent centuries looking for the answer to Fermat’s riddle. By the time Andrew Wiles solved the puzzle in the 1990s, he had been at it for ten years using the world’s most advanced computer programme.

And all of a sudden she understood. The answer was so disarmingly simple. A game with numbers that lined up and then fell into place in a simple formula that was most similar to a rebus.

Fermat had no computer, of course, and Wiles’s solution was based on mathematics that had not been invented when Fermat formulated his theorem. Fermat would never have been able to produce the proof that Wiles had presented. Fermat’s solution was quite different.

She was so stunned that she had to sit down on a tree stump. She gazed straight ahead as she checked the equation.

So that’s what he meant. No wonder mathematicians were tearing out their hair.

Then she giggled.

A philosopher would have had a better chance of solving this riddle.

She wished she could have known Fermat.

He was a cocky devil.

After a while she stood up and continued her approach through the trees. She kept the barn between her and the house.

Needless to say Salander is entirely self-taught at mathematics. More surprisingly and cringeworthily, she has been seriously interested in it for only about a year. I was initially puzzled by what would have made the author think that Wiles used “the world’s most advanced computer programme” but I now have a theory that could perhaps explain this: he might have read a popular account of the proof that said that it used extremely advanced mathematical machinery, and, literary genius that he was, failed to spot that the word “machinery” was a metaphor. I can’t decide whether that is my favourite detail, or whether it is the superfluous brackets round the equation (not to mention the fact that that case has been known since Euler).

It is hard to believe that anyone can match this passage. However, it does seem to be quite common for novelists to make fools of themselves in this way, and past experience on this blog suggests that I am usually wrong when I make a universal claim about literary depictions of mathematics (which in this case is the claim that nothing is as silly as the passage that I have just quoted). I have to admit that my amusement is mixed with just a hint of annoyance — by writing a post about it I am getting it off my chest.


Kategorije: Matematički blogovi