### The Green-Tao theorem and a relative Szemerédi theorem

David Conlon, Jacob Fox, and I just uploaded to the arXiv our second joint paper, titled “A relative Szemerédi theorem.”

Before describing our work, let me take a detour to reflect on some recent news in number theory.

The past couple of weeks were filled with exciting developments on prime numbers. Yitang Zhang, in a breakthrough that caught the mathematical community by complete surprise, proved that there exist infinitely many pairs of primes with bounded gaps (specifically, pairs differing by less than 70 million). This is a giant leap towards the famous twin primes conjecture, which claims that there exist infinitely many pairs of primes differing by two.

By an amazing coincidence, on the same day that Zhang’s news became public, Harald Helfgott posted on the arXiv his paper which claims to prove the odd Goldbach conjecture, that every odd integer greater than 5 can be written as a sum of three primes. Previously this was known to be true for sufficiently large odd numbers, specifically those with at least about a thousand digits. It is possible to check the conjecture by computer for “small” cases, but 10^{1000} is far from small. Helfgott’s work brought the “sufficiently large” down to a reasonable threshold, so that all the remaining small cases can indeed be verified by computer search.

Both results are exciting developments. However, we still seem far away from resolving two of the oldest problems about prime numbers. Zhang’s bound of 70 million probably could be brought down by more careful analysis, although getting it all the way down to 2 will probably require some really new ideas. As for Goldbach’s conjecture, there’s a significant obstacle in getting down to two primes (the best result in this direction is due to Jingrun Chen), as all existing methods have severe limitations. But then again, the same could have been said about bounded gaps between primes before Zhang’s surprising work. Perhaps one day, we’ll be shocked again with a breakthrough on Goldbach, maybe even using techniques right in front of our eyes.

~~~~~~

Back in 2004, the mathematical community received a surprise on another long-standing open problem on prime numbers: Ben Green and Terry Tao proved that the primes contain arbitrarily long arithmetic progressions.

An arithmetic progression is a sequence of equally spaced numbers. For example, 3, 5, 7 is a 3-term arithmetic progression of prime numbers. To find one with four terms, you need to look a bit further: 5, 11, 17, 23. At the time of Green and Tao’s proof, the longest such progression known was a sequence of 22 numbers, starting with 11410337850553 and with step size 4609098694200. This record has since been improved to 26. Here’s the description from Wikipedia:

As of April 2010, the longest known AP-*k* is an AP-26, found on April 12, 2010 by Benoãt Perichon on a PlayStation 3 with software by Jaroslaw Wroblewski and Geoff Reynolds, ported to the PlayStation 3 by Bryan Little, in a distributed PrimeGrid project.

There is a webpage with current records on primes in arithmetic progressions.
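If you would like to check progressions like these yourself, here is a minimal Python sketch. The function names `is_prime` and `is_prime_progression` are my own, made up for illustration. The primality test is Miller-Rabin with the first twelve primes as bases, which is known to be deterministic for numbers far larger than anything appearing above; this is just one convenient choice and not how the record searches are actually carried out.

```python
def is_prime(n: int) -> bool:
    """Miller-Rabin with fixed bases; deterministic for n below about 3.3e24."""
    if n < 2:
        return False
    bases = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)
    # Handle small primes and their multiples directly.
    for p in bases:
        if n % p == 0:
            return n == p
    # Write n - 1 = d * 2^s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for a in bases:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True


def is_prime_progression(start: int, step: int, length: int) -> bool:
    """Check whether start, start + step, ..., has `length` terms that are all prime."""
    return all(is_prime(start + i * step) for i in range(length))


if __name__ == "__main__":
    print(is_prime_progression(3, 2, 3))   # 3, 5, 7
    print(is_prime_progression(5, 6, 4))   # 5, 11, 17, 23
    # The 22-term progression quoted above.
    print(is_prime_progression(11410337850553, 4609098694200, 22))
```

Note that 3, 5, 7 cannot be extended, since the next term is 9, while 5, 11, 17, 23 can be extended by one more term, 29, before hitting 35.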

It was a long-standing open conjecture that the primes should contain arithmetic progressions of every length. This was a folklore conjecture, with the first recorded instance in history dating back to 1770 by Lagrange and Waring. Green and Tao proved this conjecture, and their breakthrough is considered one of the greatest mathematical achievements of the twenty-first century. This work was an important part of Tao’s 2006 Fields Medal award.

~~~~~~

I was still in high school when the Green-Tao theorem was announced. The news spread quickly online, and I soon heard of it through my math competition online community. I remember being quite excited about the news. It was a quintessential mathematical breakthrough. The statement of the problem could be easily understood, yet the solution involved very deep mathematics. I fell in love with the statement of the Green-Tao theorem. It became one of my favorite mathematical results, and I aspired to one day understand the mathematics behind it.

Now, nine years later, as a graduate student, I find it deeply personally fulfilling to be working on a problem central to the Green-Tao theorem. Our latest paper (joint with Conlon and Fox) strengthens the main technical result in the work of Green and Tao.

To describe this result, I need to first tell you about another important mathematical breakthrough by Endre Szemerédi dating back to the early 1970’s. Last year, Szemerédi received the Abel Prize, one of the highest lifetime achievement awards in mathematics. Szemerédi is perhaps most famous for his result, commonly referred to as **Szemerédi’s theorem**, which says that *every subset of integers with positive density contains arbitrarily long arithmetic progressions*. So, if *S* is a sufficiently large subset of the integers, “large” meaning that it contains, say, at least 0.1% of the first *N* positive integers for sufficiently large *N*, then *S* necessarily contains an arithmetic progression of every length.

Szemerédi’s theorem is a very deep result, albeit again with a seemingly very innocent statement. It was conjectured by Erdős and Turán in the 1930’s, and remained open for decades. Since Szemerédi’s breakthrough, several other proofs have been discovered. None of the proofs are easy, but all of them contain deep insights and have spawned very rich and active areas of mathematical research.

I was personally exposed to Szemerédi’s theorem while at Cambridge University. I spent a year there right after finishing undergrad and took part in what is colloquially known as Part III of the Maths Tripos. One of the components of the program was writing an extended expository essay, and I wrote mine about Szemerédi’s theorem under the supervision of Ben Green. Ben gave me a very high score for the essay. I’ve put the essay on my website, and several people have told me that they found it to be helpful (including one who was preparing a popular article on Szemerédi after his Abel Prize award). Though, to be honest, looking back, perhaps now I would be a bit embarrassed to read that essay again myself, since I wonder how much I actually understood while I was writing it up at the time.

While Szemerédi’s theorem is a powerful result, it is not enough to draw any conclusions about prime numbers. Szemerédi’s theorem only works for sets of integers with positive density. The primes, on the other hand, have density diminishing to zero. Indeed, the Prime Number Theorem tells us that between 1 and *N*, approximately a 1/ln *N* fraction of the numbers are prime. This ratio, 1/ln *N*, diminishes to zero as *N* grows to infinity. So Szemerédi’s theorem doesn’t work here.
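To see the density of the primes shrinking in practice, here is a small Python sketch that counts primes up to *N* with a sieve of Eratosthenes and compares the resulting fraction with 1/ln *N*. The helper name `prime_count` is my own, and the values of *N* are arbitrary choices for illustration.

```python
import math


def prime_count(limit: int) -> int:
    """Count the primes in [1, limit] with a sieve of Eratosthenes."""
    if limit < 2:
        return 0
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"  # 0 and 1 are not prime
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            # Cross out the multiples of p starting from p * p.
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return sum(sieve)


if __name__ == "__main__":
    for exponent in range(2, 7):
        n = 10 ** exponent
        density = prime_count(n) / n
        print(f"N = 10^{exponent}: density {density:.5f}, 1/ln N = {1 / math.log(n):.5f}")
```

The two columns track each other and both keep decreasing, which is exactly why a theorem about sets of positive density says nothing directly about the primes.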

The primary innovation of Green and Tao is that they came up with a relative version of Szemerédi’s theorem. To overcome the problem that the primes have zero density, they find a slightly larger set of integers where the primes can sit inside as a subset of positive relative density. This larger host set, roughly speaking, is the set of “almost primes,” which consists of numbers with few prime divisors. They found a way to transfer Szemerédi’s theorem to this relative setting, showing that if we start with a set with some random-like characteristics (in this case, the almost primes), then any subset of it with positive density relative to the host set must necessarily contain long arithmetic progressions.

The Green-Tao paper splits into two parts. In the first part, which is their main technical contribution, they establish a relative Szemerédi theorem for subsets of pseudorandom sets of integers. Their method was heavily influenced by the work of Fields Medalist Timothy Gowers, who revolutionized the field of additive combinatorics by inventing a far-reaching extension of Fourier analysis (also known in this context as the Hardy-Littlewood circle method) to give a novel proof of Szemerédi’s theorem. The second part of Green and Tao’s paper shows that the almost primes act as a suitable host set by verifying the required pseudorandomness conditions. Most of the number theoretic input to their work was credited to the work of Goldston and Yıldırım, which later led to a spectacular breakthrough in the problem of small gaps in primes by Goldston, Pintz, and Yıldırım (which relates back to Zhang’s twin primes breakthrough mentioned at the beginning of this blog post).

The pseudorandomness conditions required in the Green-Tao method are rather involved. The method assumes that the host pseudorandom set satisfies a “linear forms condition” as well as a “correlations condition.” Both conditions are essential to Green and Tao’s method of establishing their relative Szemerédi theorem. They show that the almost primes satisfy these pseudorandomness conditions, thereby proving the old conjecture about progressions in primes. However, the complicated pseudorandomness hypotheses, while adequate for this application, seem rather contrived and somewhat unsatisfying. This leads to the next question, which has been repeatedly asked since the Green-Tao work: does a relative Szemerédi theorem hold under more natural pseudorandomness hypotheses?

In our new paper, we show that the answer is yes! We prove a relative Szemerédi theorem under a very simple and natural linear forms condition. What we need to assume is a small subset of what Green and Tao assume in their work. This new result provides not only a brand new (and simpler, as we believe) alternative approach to the proof of the Green-Tao theorem, but also, more broadly speaking, a new method for understanding sparse pseudorandom structures. The Green-Tao method has had a large number of applications since its inception almost a decade ago. With our new perspective, perhaps one can go even further.