What is the largest set avoiding a linear equation? This general question is a central one in additive combinatorics. Indeed the search of extremal structures gives it a combinatorial flavor, while the presence of an additive equation makes such a question ripe for additive techniques. Such questions include well-studied subjects such as sum-free sets, Sidon sets, and sets avoiding arithmetic progressions. In 1993, Ruzsa made an effort to put these examples into a general framework. In what follows, we focus on Sidon sets and their generalizations, as developed by Ruzsa in the same paper.

We fix and consider the linear equation

Following, Ruzsa, we say a solution is trivial if the are a permutation of the . We set to be the size of the largest set of with only trivial solutions to (1). We set to be the size of the largest set of with no solutions to (1) in with distinct. It follows from these definitions that

We will show below that , while it is not known It is not known if . Indeed one may feel that most of the solutions to (1) where the variables lie in a set come when the variables are distinct, since otherwise we have imposed an additional constraint. This idea does work well for sets with additive structure, but has shortcomings for general sets. Furthermore, it was shown by Timmons that .

For a set , we let be the number of nontrivial solutions to (1) in and let be the number of solutions to (1) with distinct variables.

We briefly remark that in Wooley’s efficient congruencing proof of Vinogradov’s mean value theorem, he estimates quantities similar to (where is the set of powers), as applications of Hensel’s lemma require the variables to be distinct (modulo a certain prime). Indeed there are some overlapping techniques (see Lemma 1 below).

To start, we first consider the case . A set with is known in the additive combinatorics literature as a Sidon set. Thus for . There is a nice construction due to Ruzsa (modifying a previous construction of Erdős), which is just one of several known constructions.

**Example 1 (Large Sidon Set):** Let be a prime. Let be a primitive root in . One can check the set of residues in the cyclic group of order , defined via the Chinese remainder theorem, by

forms a Sidon set in the cyclic group modulo and hence in .

On the other hand, we have the following upper bound.

**Theorem 1:** Let be a Sidon set. Then

**Proof:** We follow an argument of Green, which will rely on some basic Fourier analysis (we follow notation from Chapter 4 of Tao and Vu’s book on additive combinatorics). Let and consider as a subset of . Let . We have by Parseval,

in the last step isolating the term. Since is a Sidon set we have

for (note this is not true for other values of ). Thus

Combining (2) and (3) and choosing the optimal choice of gives the desired result.

Green remarks in his paper that above proof only assumes that for instead of the full Sidon set property. Combining Example 1 and Theorem 1, we find

It is a famous question of Erdős to improve these bounds (in particular the lower bound), which has been stuck for over fifty years (modulo a improvement by Cilleruelo to ). It seems likely that both the lower and upper bound are not optimal and that the truth lies somewhere between and for any and . One can see this blog post of Gowers and the comments within for interesting discussion of this problem.

We now consider general and start with a lower bound.

**Proposition 1 (Random sets):** For large enough,

**Proof:** We use the alteration method, which is discussed in Alon and Spencer’s book on the Probabilistic Method. Choose where each element is chosen independently at random with probability

The number of solutions to (1) in is at most , so the expected number of solutions to (1) is at most . By Markov,

On the other hand, by say Chebyshev,

for large enough. By the union bound, we may choose an such that and . By our choice of , we have . For each nontrivial solution of (1), we delete an element (the so-called alteration) of that renders it no longer a solution and call the resulting set . Thus and so , as desired.

Ruzsa remarks that the probabilistic construction above probably never gives the correct order of magnitude for , where (8) is replaced by a suitable linear equation. We already saw this above, as Example one gives a Sidon set of size while the random construction gives one of size about . Moreover, with an explicit construction, Bose and Chowla showed the following improvement to Proposition 1:

(see this survey of Kevin O’ Bryant for more information). Timmons showed that

by pasting together two sets from the aforementioned Bose-Chowla construction.

We now turn to upper bounds. We let be the number of solutions in to (1) For any set with , we have that and so by Cauchy Schwarz

It follows that

and thus

This argument fails to bound , as was observed by Ruzsa, who provided an alternative argument, which we recall below. The basic idea is to still use (4), and to show that is still small for sets with . We will need the following lemma.

**Lemma 1 (Hölder):** Let be a function on the torus. Then

To prove this, apply Hölder’s inequality to with powers and . Note that one can always interpolate an upper bound for in terms of and as long as .

**Proposition 2:** We have

**Proof:** Suppose that (though we will not make use of this until the end). By (4), it is enough to show

Note that

By Lemma 1 and Parseval, it is enough to show

The idea is that allows us to eliminate a variable at a cost of only , as opposed to the trivial bound of .

We let

and

Note we have

By the triangle inequality in , it is enough to show

Note the second term is the larger of the two for large.

We dispose of the first inequality. Note is the number of solutions to with some . There are at most choices for the indices and then after relabelling we find

Thus

by Hölder’s inequality.

Now we finally use that . Suppose we have a solution to (1) with distinct and distinct . Since , this implies for some . Thus

since we have choices for and choices for .

The best bound for (for large) is due to Green and is of the form

an improvement to the constant in (4). On the other hand, the best bound for is in a recent paper of Schoen and Shkredov:

adopting ideas from Timmons as well as Ruzsa’s original work. Schoen and Shkredov implicitly mention the following question.

**Question 1:** Let such that . Does there exist a large (say ) such that has no solutions to

An affirmative answer would allow one to apply (5) and then (4) to and eventually get improved bounds for comparable to what is known for (i.e. the same as (6) up to the absolute constant). Schoen and Shkredov are not able to solve Question 1, but are able to show that (8) holds for many and then apply a suitable version of Erdős-Ko-Rado to obtain (7). Underlying their argument is the simple observation that a solution to (1) with can be added to a solution with to create a new solution with .

We illustrate part of the argument in the case, which is significantly simpler than the general case and already handled by Timmons.

**Proposition 3:** One has

**Proof:** We let with . By the first part of (5) and Lemma 1, we have

By the second part of (5),

and so

Suppose first that has no solutions to (8) with . Then

Indeed there are at most solutions to with either or and also trivial solutions. It follows that

Combining this with (4) gives the desired result. Now suppose does have a solution to (8) with , say

Delete these four elements from to create . Now we claim does not have a solution to (8) with . Indeed if there were, say

then adding this to (9) yields

which contradicts . Thus we may apply the above argument to to obtain

and the Proposition follows from (4).