Reading group: on the average number of real roots of a random algebraic equation, M. Kac

At the moment I am in between research projects: I am “working” on a bunch of old projects, some of which are many years old, but I haven’t had any new ideas for any of them in a long time and, hence, I haven’t made any progress whatsoever. At the same time I am thinking about beginning work on some new projects. Most notably, I want to spend some time understanding quantum systems with static and annealed disorder, and the connections between these systems and computational complexity. Unfortunately the literature on disordered quantum systems is vast, to say the least. Hence, I am putting off learning it. So now I am procrastinating by learning about a whole bunch of new ideas in the hope of learning something that will make the entry into the disordered systems literature a little smoother.

Basically I am now going to play out my learning curve through this blog.

The type of problems I eventually hope to study will be of the following form. Take some family of computational problems {\mathcal{F}}, and consider a random instance {P} from this family. What is the running time, on average, of some quantum algorithm to solve the problem? At the same time I’d also like to consider families {\mathcal{Q}} of quantum problems (eg. a family of quantum systems) and understand the running time, on average, of either classical or quantum algorithms to calculate approximations to, eg., expectation values of local observables, of a random instance. In both cases there is obviously some quantum system (i.e., the quantum computer in the first case, and the quantum system studied in the second case), and there is obviously some disorder present. The hope, broadly speaking, is to exploit the ubiquitous phenomenon of Anderson localisation to understand what happens in each case.

However, except in some very special cases, the problems I want to study don’t reduce in any obvious way to systems studied in the disordered quantum systems literature (at least, not so far as I can see). So I’m now casting around looking for interesting stuff which might have some relevance…

At the most abstract and high level, all of the problems I want to eventually consider require that one understands the critical points of a random function (which is usually related to the running time). With a bit of luck one will be able to write this expression as a polynomial. Hence it’d be nice to understand, say, the roots of random polynomials.


1. Real roots of random polynomials


In this post I want to discuss an old paper of M. Kac which considers the following problem:

Consider the polynomial

\displaystyle  	p_{c_0, c_1, \ldots, c_{n-1}}(x) = c_0 + c_1x + c_2x^2 + \cdots + c_{n-1}x^{n-1}. \ \ \ \ \ (1)

In general such a polynomial will have a bunch of real roots and complex roots. Suppose that the numbers {c_j}, {j=0, 1, \ldots, n-1}, are chosen randomly according to the gaussian distribution, i.e., suppose that the probability density function for each {c_j} is

\displaystyle  	f(c) = \frac{e^{-c^2}}{\sqrt{\pi}}. \ \ \ \ \ (2)

Then the question we want to answer is: how many real roots, on average, does {p_{\mathbf{c}}(x)} have? Apparently this problem was first studied by Littlewood and Offord, who considered the case where the {c_j} are uniformly distributed in {(-1,1)}.

Here is an example of a random polynomial of degree 100 on the interval {[-1,1]}:




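One can generate such an example numerically. Here is a small sketch (using numpy, which is an assumption on my part) that draws the coefficients from the density {e^{-c^2}/\sqrt{\pi}} (i.e. mean 0, variance 1/2) and counts the real roots:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 101  # 101 coefficients, so the degree is 100

# Coefficients with density e^{-c^2}/sqrt(pi), i.e. mean 0, variance 1/2.
c = rng.normal(0.0, np.sqrt(0.5), size=n)

# numpy.roots wants the highest-degree coefficient first.
roots = np.roots(c[::-1])
real_roots = roots[np.abs(roots.imag) < 1e-8]
print(f"{len(real_roots)} real roots out of {len(roots)}")
```

Running this a few times, the count is typically only a small handful, in line with Theorem 1 below.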
Denote by {N_n = N(c_0, c_1, \ldots, c_{n-1})} the number of real roots of {p_{\mathbf{c}}(x)}. What M. Kac actually proved is the following

Theorem 1 Let {\mathbf{c}} be distributed as in (2). Then

\displaystyle  		\mathbb{E}[N_n] = \frac{4}{\pi}\int_0^1 \frac{\sqrt{1-n^2x^{2n-2}(1-x^2)^2/(1-x^{2n})^2}}{1-x^2}\,dx. 	\ \ \ \ \ (3)

Further, {N_n} behaves, asymptotically, as

\displaystyle  		\mathbb{E}[N_n] \sim \frac{2}{\pi}\log(n). 	\ \ \ \ \ (4)

Pretty impressive really. Note that this says there are not many real zeros: even if the degree is 1000 there will, on average, only be about 4 or 5 real zeros!
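The asymptotic (4) is easy to probe by Monte Carlo. Here is a sketch (again assuming numpy; the coefficients have variance 1/2, matching the density {e^{-c^2}/\sqrt{\pi}}):

```python
import numpy as np

rng = np.random.default_rng(1)

def count_real_roots(coeffs, tol=1e-8):
    """Count the real roots of c_0 + c_1 x + ... (coeffs ordered
    lowest degree first), via the companion-matrix root finder."""
    roots = np.roots(coeffs[::-1])
    return int(np.sum(np.abs(roots.imag) < tol))

n, trials = 64, 200
counts = [count_real_roots(rng.normal(0.0, np.sqrt(0.5), n))
          for _ in range(trials)]
mc_mean = float(np.mean(counts))
print(f"Monte Carlo E[N_n] ~ {mc_mean:.2f}; (2/pi) log n ~ {2/np.pi*np.log(n):.2f}")
```

The Monte Carlo mean sits a little above {(2/\pi)\log n}: the convergence is only logarithmic, and (if I recall correctly) the next-order correction is an additive constant close to 0.63.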

Now this result was the state of the art in 1948. Obviously mathematics has moved on since then: there is now a vast literature on roots of random polynomials, and I can’t hope to scratch the surface here. But what I want to achieve is to understand the basic techniques exploited in this paper (the Kac-Rice formula), which have since gone on to form the cornerstone of this body of theory.

The proof of this theorem is sufficiently elementary that I can actually discuss it in full here.


2. The Kac-Rice formula

The central tool exploited in Kac’s paper is what is now known as the Kac-Rice formula. This is a really neat trick and it is definitely worth explaining its derivation here. For this section I’m going to follow the heuristic derivation presented in the book of Adler.

Let {f(x)} be a continuous function in {(-\infty, \infty)} having continuous first derivative {f'(x)} and only a finite number of turning points in each finite interval. Then we have the following

Lemma 2 If neither {a} nor {b} is a zero of {f(x)} then, defining

\displaystyle  		N_u(a,b) = |\{t\in[a,b]\,|\, f(t) = u\}|, 	\ \ \ \ \ (5)

we have that

\displaystyle  		N_u(a,b) = \int_a^b \delta(f(t)-u) |f'(t)|\, dt, 	\ \ \ \ \ (6)

where {\delta(x)} is the Dirac delta function.

Proof: (this proof is at the level of “physical rigour”, i.e., it is heuristic. You can add in the excruciating detail required to make it rigorous yourself! We shall later, in the words of Adler, “commit serious crimes”, including exchanging orders of integration, etc. 😉 ) The key to the proof is to notice that

\displaystyle  	1 = \int_{\mathbb{R}} \delta(y-u)\,dy. \ \ \ \ \ (7)

Now suppose that {I} is a small interval centred on a point where {f(t) = u}. Then a simple change of variables gives us (beware: we’re treating {\delta(x)} as though it were a smooth function with compact support that tails off to zero either side!):

\displaystyle  	1 = \int_{\mathbb{R}} \delta(y-u)\, dy = \int_I\delta(f(t)-u) |f'(t)|\, dt. \ \ \ \ \ (8)

The next step is to concatenate all such intervals {I} and note that there is no contribution to the following integral outside of them:

\displaystyle  	N_u(a,b) = \int_a^b \delta(f(t)-u)|f'(t)|\, dt, \ \ \ \ \ (9)

and we are done! \Box

This lemma has since been generalised in a vast number of obvious and non-obvious ways. See, eg., Adler’s book for details.
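A quick numerical sanity check of Lemma 2: replace the delta function by a narrow gaussian (the same smoothing heuristic as in the proof) and do the integral by the midpoint rule. For {f(t) = \sin(t)} on {[-1,7]}, which contains the three zeros {0, \pi, 2\pi}, the answer should come out very close to 3:

```python
import math

def kac_rice_count(f, fprime, a, b, eps=1e-3, m=200_000):
    """Estimate N_0(a, b) from (6), with the Dirac delta replaced by
    a centred gaussian of width eps, via the midpoint rule."""
    h = (b - a) / m
    total = 0.0
    for i in range(m):
        t = a + (i + 0.5) * h
        delta = math.exp(-0.5 * (f(t) / eps) ** 2) / (eps * math.sqrt(2 * math.pi))
        total += delta * abs(fprime(t)) * h
    return total

est = kac_rice_count(math.sin, math.cos, -1.0, 7.0)
print(f"estimated number of zeros: {est:.4f}")
```

Near each simple zero the change of variables in (8) turns the smoothed delta into a unit-mass gaussian, so each zero contributes almost exactly 1.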


3. Applying the Kac-Rice formula to random polynomials

Lemma 3 If {p(x)} is a polynomial of degree {n-1}, then

\displaystyle  		\int_{\mathbb{R}} \delta(p(x))|p'(x)|\,dx \le n-1. 	\ \ \ \ \ (10)

Proof: A polynomial of degree {n-1} has at most {n-1} real roots, and by Lemma 2 the integral counts exactly these; hence the bound. \Box

Now we turn to the calculation of

\displaystyle  	\mathbb{E}_{\mathbf{c}}[N_n] = \mathbb{E}_{\mathbf{c}}\left[\int_{-\infty}^\infty \delta(p_{\mathbf{c}}(x))|p_{\mathbf{c}}'(x)|\,dx\right]. \ \ \ \ \ (11)

The trick here is to egregiously change the order of integration and consider

\displaystyle  	\mathbb{E}_{\mathbf{c}}[N_n] = \int_{-\infty}^\infty \mathbb{E}_{\mathbf{c}}\left[\delta(p_{\mathbf{c}}(x))|p_{\mathbf{c}}'(x)|\right]\,dx. \ \ \ \ \ (12)

Thus we are reduced to understanding

\displaystyle  	\mathbb{E}_{\mathbf{c}}\left[\delta(p_{\mathbf{c}}(x))|p_{\mathbf{c}}'(x)|\right] = \frac{1}{\pi^{n/2}}\int e^{-\|\mathbf{c}\|^2} \delta(p_{\mathbf{c}}(x))|p_{\mathbf{c}}'(x)| \, d\mathbf{c}. \ \ \ \ \ (13)

This expression involves the distribution of {p_{\mathbf{c}}(x)} and {p_{\mathbf{c}}'(x)}. To evaluate this we use the following lemma

Lemma 4 (Bernstein limit result) If {\alpha_j}, {\beta_j}, {j=0,1, \ldots, n-1}, are real numbers with {\sum_j \alpha_j^2 = \alpha}, {\sum_j\beta_j^2=\beta}, {\sum_{j}\alpha_j\beta_j = \gamma} and if {\Delta = \alpha\beta - \gamma^2}, then the joint probability density function of {\alpha_0 x_0 + \cdots + \alpha_{n-1}x_{n-1}} and {\beta_0x_0+\cdots + \beta_{n-1}x_{n-1}} is equal to

\displaystyle  		\frac{1}{\pi\sqrt{\Delta}}e^{-(\beta u^2 - 2\gamma uv + \alpha v^2)/\Delta}, 	\ \ \ \ \ (14)

where {x_j}, {j=0,1,\ldots, n-1} are distributed according to {\frac{e^{-u^2}}{\sqrt{\pi}}}.

Proof: This follows from a limit theorem of Bernstein (basically one just applies a central limit theorem-type argument to the characteristic function and keeps track of the mutual independence of {p_{\mathbf{c}}(x)} and {p_{\mathbf{c}}'(x)}…) BTW: does anyone know a more modern reference than this? \Box
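One can at least verify the covariance structure behind (14) empirically: with the {x_j} drawn from {e^{-u^2}/\sqrt{\pi}} (variance {1/2}), the two linear forms {U} and {V} should be jointly gaussian with {\mathrm{Var}(U)=\alpha/2}, {\mathrm{Var}(V)=\beta/2} and {\mathrm{Cov}(U,V)=\gamma/2}. A sketch (the particular {\alpha_j}, {\beta_j} below are arbitrary choices, and numpy is assumed):

```python
import numpy as np

rng = np.random.default_rng(2)
n, samples = 8, 200_000

alpha_vec = np.linspace(0.5, 2.0, n)  # arbitrary example coefficients
beta_vec = np.cos(np.arange(n))

# x_j with density e^{-u^2}/sqrt(pi): gaussian, mean 0, variance 1/2.
x = rng.normal(0.0, np.sqrt(0.5), size=(samples, n))
U, V = x @ alpha_vec, x @ beta_vec

alpha = float(alpha_vec @ alpha_vec)
beta = float(beta_vec @ beta_vec)
gamma = float(alpha_vec @ beta_vec)
print(np.var(U), alpha / 2)           # should agree
print(np.cov(U, V)[0, 1], gamma / 2)  # should agree
```

Writing the covariance matrix as {\tfrac{1}{2}\begin{pmatrix}\alpha & \gamma\\ \gamma & \beta\end{pmatrix}}, its determinant is {\Delta/4}, which reproduces the prefactor {1/(\pi\sqrt{\Delta})} in (14).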


4. Completing the proof

Now that we know the joint probability density function for {p_{\mathbf{c}}(x)} and {p_{\mathbf{c}}'(x)} we are almost done: Proof: (of main theorem). Use the Bernstein limit lemma to write

\displaystyle  		\mathbb{E}_{\mathbf{c}}\left[\delta(p_{\mathbf{c}}(x))|p_{\mathbf{c}}'(x)|\right] =

\displaystyle  \frac{1}{\pi\sqrt{\Delta}}\int_{-\infty}^\infty \int_{-\infty}^\infty \delta(u)|v|e^{-(\beta u^2 - 2\gamma uv + \alpha v^2)/\Delta}\,dudv, 	\ \ \ \ \ (15)

where now {\alpha = \sum_{j=0}^{n-1} x^{2j}}, {\beta = \sum_{j=0}^{n-1} j^2x^{2j-2}}, and {\gamma = \sum_{j=0}^{n-1} jx^{2j-1}}. This reduces to

\displaystyle  		\mathbb{E}_{\mathbf{c}}\left[\delta(p_{\mathbf{c}}(x))|p_{\mathbf{c}}'(x)|\right] = \frac{1}{\pi\sqrt{\Delta}}\int_{-\infty}^\infty |v|e^{-\alpha v^2/\Delta}dv. 	\ \ \ \ \ (16)

Carrying out this integral gives us

\displaystyle  		\mathbb{E}_{\mathbf{c}}\left[\delta(p_{\mathbf{c}}(x))|p_{\mathbf{c}}'(x)|\right] = \frac{\sqrt{\Delta}}{\alpha\pi}. 	\ \ \ \ \ (17)

Now, an elementary summation gives us that

\displaystyle  		\Delta = \frac{x^{4n} - n^2x^{2(n+1)}+2(n^2-1)x^{2n}-n^2x^{2(n-1)} + 1}{(x^2-1)^4}. 	\ \ \ \ \ (18)

Substituting this into our original integral (11) gives us

\displaystyle  		\mathbb{E}_{\mathbf{c}}[N_n] =

\displaystyle  \frac{1}{\pi}\int_{-\infty}^\infty \frac{\sqrt{x^{4n} - n^2x^{2(n+1)}+2(n^2-1)x^{2n}-n^2x^{2(n-1)} + 1}}{(x^2-1)^2(1+x^2+\cdots+x^{2n-2})}\, dx. 	\ \ \ \ \ (19)

The actual result follows from an “elementary” transformation… The asymptotic result is a bit more tedious, and I’ll leave that to you to chase up. \Box
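For what it’s worth, the exact expectation is easy to evaluate numerically; here is a midpoint-rule sketch in pure Python, with the integrand written in the form {\sqrt{1/(1-x^2)^2 - n^2x^{2n-2}/(1-x^{2n})^2}}, i.e. {\sqrt{\Delta}/\alpha}. A nice sanity check is {n=2}: a random linear polynomial always has exactly one real root, and the integral indeed gives 1.

```python
import math

def kac_expected_real_roots(n, m=200_000):
    """Evaluate Kac's exact integral for n gaussian coefficients
    (degree n - 1), i.e. (4/pi) * integral over (0,1) of
    sqrt(1/(1-x^2)^2 - n^2 x^(2n-2)/(1-x^(2n))^2), by the midpoint rule."""
    h = 1.0 / m
    total = 0.0
    for i in range(m):
        x = (i + 0.5) * h
        a = 1.0 / (1.0 - x * x)
        b = n * x ** (n - 1) / (1.0 - x ** (2 * n))
        # max(..., 0.0) guards against tiny negative values from
        # floating-point cancellation near x = 1.
        total += math.sqrt(max(a * a - b * b, 0.0)) * h
    return 4.0 * total / math.pi

print(kac_expected_real_roots(2))     # a random linear polynomial: exactly 1
print(kac_expected_real_roots(1001))  # degree 1000: the "4 or 5" quoted above
```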


5. Extensions and applications

One can imagine a zillion variations on this theme. I won’t explore them here. I just want to comment on what is actually needed by the above argument. Basically if one has a family {\mathcal{F}} of functions {f(x)} for which the joint probability density function for the tuple {(f(x), f'(x))} for any given {x} is known then one can get at the number of real solutions within some interval. This is rather interesting in the context of disordered systems: suppose we have the Anderson model

\displaystyle  	H = \sum_{j=1}^n v_j|j\rangle\langle j| - \sum_{j=1}^{n-1}\left(|j\rangle\langle j+1|+ |j+1\rangle\langle j|\right). \ \ \ \ \ (20)

Write {p_n(x)} for the characteristic polynomial. If we know the number of zeros of {p_n(x)} in an interval {[a,b]} then we can obtain the averaged eigenvalue density function. Applying the method of Kac leads us to the consideration of the joint probability density function of {(p_n(x), p_n'(x))}. Now, this looks tricky, but we do have the following recurrences, which might help:

\displaystyle  	p_n(x) = (v_n-x)p_{n-1}(x) - p_{n-2}(x) \ \ \ \ \ (21)


\displaystyle  	p_n'(x) = (v_n-x)p_{n-1}'(x) - p_{n-2}'(x)-p_{n-1}(x). \ \ \ \ \ (22)

To finish off the analysis we’d need to derive a Bernstein-like limit theorem for {(p_n(x), p_n'(x))} using these recurrences. I can’t see how to do this (if it is even possible!)…
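Even without a limit theorem, the recurrence (21) is already useful numerically: the sequence {p_0(x), p_1(x), \ldots, p_n(x)} is a Sturm sequence for the Jacobi matrix {H}, so the number of sign changes in it counts the eigenvalues below {x} (Sylvester’s law of inertia). A sketch (taking the {v_j} uniform on {(-1,1)} purely for illustration, and using numpy only for the cross-check):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50
v = rng.uniform(-1.0, 1.0, size=n)  # diagonal disorder v_j

def sturm_sequence(x):
    """p_k(x) = det((H - x)_k) via the three-term recurrence (21)."""
    p = [1.0, v[0] - x]
    for k in range(1, n):
        p.append((v[k] - x) * p[-1] - p[-2])
    return p

def eigenvalues_below(x):
    """Number of eigenvalues of H below x = number of sign changes
    in the Sturm sequence."""
    p = sturm_sequence(x)
    return sum(1 for s, t in zip(p, p[1:]) if s * t < 0)

# Cross-check against dense diagonalisation.
H = np.diag(v) - np.diag(np.ones(n - 1), 1) - np.diag(np.ones(n - 1), -1)
evals = np.linalg.eigvalsh(H)
a, b = -1.0, 1.0
count_sturm = eigenvalues_below(b) - eigenvalues_below(a)
count_dense = int(np.sum((evals > a) & (evals < b)))
print(count_sturm, count_dense)
```

So for any single realisation of the disorder the eigenvalue count in {[a,b]} is cheap to compute; the hard part is averaging it analytically.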

BTW: has the averaged eigenvalue density function for the Anderson model ever been calculated at the level of mathematical rigour? I haven’t found any reference for such a thing…


3 Responses to Reading group: on the average number of real roots of a random algebraic equation, M. Kac

  1. matt says:

    I’m pretty sure what you want to know about the 1d Anderson model has been done very carefully. Not just the averaged eigenvalue density, but also the localization length. Take a look maybe at the book “Schrodinger Operators”, by Cycon, H.L., Froese, R.G., Kirsch, W., Simon, B. on the section on point spectrum of the Anderson model. If it isn’t there, it’s definitely done elsewhere.

  2. tobiasosborne says:

    Dear Matt,

    Many thanks for the reference! I’ll check it out.

    I’ve had a hard time reconciling the mathematical literature with the physical: I’m certainly aware that in the mathematical literature the spectrum of many random Schrödinger operators with enough disorder has been proven to be pure point for almost all realisations of the disorder. In the physical literature the supersymmetric method is good at obtaining averaged quantities. But I’m far from familiar with the interface between these two bodies of theory, and any statements I make about this are purely based on ignorance…


  3. matt says:

    I remember seeing one paper (can’t remember the author, any details, anything) which was (in my very rough recollection) claiming to give a mathematical proof of localization in 1d using supersymmetry techniques. This required I think lots of use of tools of analysis to estimate errors in the physics calculation.

    However, in general I think you’re right: other than this one paper, the math and physics literatures seem to use very disjoint tools on this problem.
