RANDOM MATRICES AND FREE PROBABILITY

JACOB CAMPBELL

Contents

Contents
Notation
 1.  GUE and genus expansion
 1.1.  Moments of gaussians
 1.2.  Self-adjoint gaussian matrix
 1.3.  Exercises
 2.  Semicircle law and noncommutative probability spaces
 2.1.  Which pairings survive in the limit?
 2.2.  Moment problem
 2.3.  Noncommutative probability spaces
 2.4.  Exercises
 3.  Free independence
 3.1.  Motivation and definition
 3.2.  What does freeness actually do?
 3.3.  Free central limit theorem
 3.4.  Exercises
 4.  Asymptotic freeness of multiple GUEs
 4.1.  Free semicircular families
 4.2.  Asymptotic freeness of GUEs
 4.3.  Exercises
 5.  How to integrate over compact groups
 5.1.  Reminders about integration
 5.2.  Projection onto fixed points
 5.3.  Gram matrices and pseudoinverses
 5.4.  Exercises
 6.  Weingarten formula
 6.1.  The spanning set
 6.2.  Character expansion
 6.3.  Exercises
 7.  Randomly rotated matrices
 7.1.  Exact trace formula
 7.2.  Asymptotics of Weingarten function
 7.3.  Limit of trace formula
 8.  Free additive convolution
 8.1.  Free cumulants
 8.2.  Cauchy transform and Stieltjes inversion
 8.3.  R-transform
 8.4.  Exercises
 9.  Expected characteristic polynomials
 9.1.  Exercises
 10.  Finite free probability
 10.1.  Finite free cumulants
 10.2.  Exercises
 11.  Asymptotics of finite free convolution
 11.1.  Combinatorics of maps
 11.2.  Moment-cumulant formulas
References

Notation

For X , let M(X) be the set of probability measures on X, and let Mc(X) be the set of probability measures on X with compact support. Let + and be the upper and lower half-planes, respectively.

1. GUE and genus expansion

1.1. Moments of gaussians. Recall the following distribution:

Definition 1.1. A random variable X has a standard gaussian distribution if it has density

fX(x) = 1 2πex2 2

for x . This is the case of mean 0 and variance 1.

Actually, we care about all the moments:

Proposition 1.2. If X is standard gaussian, then

𝔼(X2k) = (2k 1)!! and 𝔼(X2k1) = 0

for k 1.

Proof. Exercise.

Definition 1.3 (Set partitions). A partition of the set [n] := {1,,n} is a collection {V 1,,V l} of blocks, which are disjoint subsets of [n], with [n] = i=1lV i. The set of partitions of [n] will be denoted by P(n). A pair partition is a partition whose blocks all have size 2; the set of pair partitions of [n] will be denoted by P2(n). Note that P2(n) = when n is odd.

Let ϕ : P(n) Sn be the map which turns blocks into cycles, with the usual order inherited from [n]. This is obviously injective, and the range of P2(n) is the set of permutations in Sn with order 2 and no fixed points. We will refer interchangeably to P2(n) and its image in Sn; notice that in the case of P2(n), the arbitrary choice of order in the definition of ϕ does not actually matter.

Example 1.4.

P2(4) = {{{1,2},{3,4}},{{1,3},{2,4}},{{1,4},{2,3}}}

Proposition 1.5. We have |P2(2k)| = (2k 1)!! for k 1.

Idea of proof. When you choose what gets paired with 1, you have 2k 1 choices. Next, you have 2k 3 choices. This goes on to give you (2k 1)!! choices in total. Exercise: formalize this idea.

Theorem 1.6 (Wick formula). Let X1,,Xn be gaussian with covariance matrix Σ. Then for 𝐢 : [k] [n], we have

𝔼(X𝐢(1)X𝐢(k)) = πP2(k) (r,s)π𝔼(X𝐢(r)X𝐢(s)).

Proof. Exercise.

Definition 1.7 (Complex gaussian). A standard complex gaussian variable Z is obtained by letting X and Y be independent standard real gaussians and letting Z = 1 2(X + 𝑖𝑌 ). This has mean 0 and variance 1:

𝔼(Z) = 1 2(𝔼(X) + 𝑖𝔼(Y )) = 0

and

𝔼(ZZ¯) = 1 2𝔼((X + 𝑖𝑌 )(X 𝑖𝑌 )) = 1 2(𝔼(X2) + 𝔼(Y 2)) = 1.

Proposition 1.8. Let Z be a standard complex gaussian variable. Then

𝔼(ZmZ¯n) = { m!if m = n 0 otherwise

for m,n 0.

Proof. Exercise.

Remark 1.9. By multilinearity, the structure of the Wick formula is retained when one replaces some of the real gaussians with complex gaussians and/or their adjoints. We will use this observation freely.

1.2. Self-adjoint gaussian matrix.

Definition 1.10 (Gaussian unitary ensemble). Let A = (a𝑖𝑗)1i,jN be a matrix where

The point of this definition is that A is self-adjoint with independent gaussian entries, except as required by the self-adjointness. The choice of variance 1 N is for normalization and will play an important role when we make N .

Definition 1.11. Let (Ω,F,P) be a probability space and for 1 i,j N, let a𝑖𝑗 : Ω be a random variable. Assume that A = (a𝑖𝑗)i,j is self-adjoint. This random matrix produces a random probability measure

νA := 1 N i=1Nδ λi

on , where λ1,,λN are the eigenvalues of A. This is called the empirical spectral distribution (or ESD) of A.

The average eigenvalue distribution of A, denoted by μA, is simply the mean of νA: define μA by

f(x)dμA(x) = 𝔼 (f(x)dνA(x))

for measurable f. This gives us a nice way to access the moments:

xmdμ A(x) = 𝔼 (xmdν A(x)) = 𝔼 ( 1 N i=1Nλ im) = 𝔼tr N(Am).

Theorem 1.12 (Wigner’s semicircle law). Let A be the random matrix from Definition 1.10. Then μA μ weakly as N , where 𝑑𝜇 = f𝑑𝑥 with

f(x) = { 1 2π4 x2if x [2,2] 0 otherwise .

Remark 1.13. The mode of convergence in Theorem 1.12 can be strengthened. Namely, one can show νA μ weakly in probability or weakly almost surely. This could be a good starting point for your project.

In this class and next, we will prove Theorem 1.12 using moments. The moments of the average eigenvalue distribution turn out to be encoded by a combinatorially rich sequence of polynomials – related to topological genus of certain surfaces – evaluated at 1 N. We will come back to the latter connection later on; for now, we will only be concerned with the leading order.

Notation 1.14. Let γn = (1,,n). We will use the notation #() for the number of disjoint cycles in a permutation (including singletons/fixed points).

Theorem 1.15 (Genus expansion). We have

𝔼(trN(Am)) = πP2(m) ( 1 N )m 2 +1#(γmπ)

for m 1. When m is odd, the formula above should be read as an empty sum, i.e. as 0.

Remark 1.16. For π P2(2k), there are at least two ways to interpret the exponent k + 1 #(γ2kπ). It is 2gπ, where gπ is defined in either of the following two ways:

  1. gπ is the genus of the surface obtained from a polygon with 2k sides by gluing the sides together in pairs according to π;
  2. gπ is the smallest possible genus for which the following can be done: put the elements of [2k] on a circle clockwise, make that circle the boundary of a surface with genus gπ, and draw π on the surface without crossings.

This is where the phrase “genus expansion” comes from.

Proof of Theorem 1.15. We have

𝔼(trN(Am)) = 1 N 𝐢:[m][N]𝔼(a𝐢(1)𝐢(2)a𝐢(m)𝐢(1)) = 1 N 𝐢:[m][N] πP2(m) (r,s)π𝔼(a𝐢(r)𝐢(r+1)a𝐢(s)𝐢(s+1)) = 1 N 𝐢:[m][N] πP2(m) (r,s)πδ𝐢(r)=𝐢(s+1)δ𝐢(r+1)=𝐢(s) 1 N = 1 N 𝐢:[m][N] πP2(m) 𝐢=𝐢γmπ ( 1 N )m 2 = ( 1 N )m 2 +1 πP2(m)|{𝐢 : [m] [N] : 𝐢 = 𝐢 γm π}| = ( 1 N )m 2 +1 πP2(m)N#(γmπ) = πP2(m) ( 1 N )k+1#(γmπ)

since a map 𝐢 : [m] [N] with 𝐢 = 𝐢 γm π amounts to a choice of label from [N] for each cycle in γmπ.

1.3. Exercises.

Exercise 1.17. In this exercise, you will show that the moments of a standard gaussian variable count pair partitions.

  1. Let X be a standard gaussian variable. Prove that

    𝔼(X2k) = (2k 1)!! and 𝔼(X2k1) = 0

    for all k 1. Use integration by parts to find a recursion.

  2. Prove that |P2(2k)| = (2k 1)!! by putting P2(2k) in bijection with a set of cardinality (2k 1)|P2(2k 2)|.

Exercise 1.18. In this exercise, you will prove the Wick formula: if X1,,Xn are gaussian with mean 0 and covariance matrix Σ, then

𝔼(X𝐢(1)X𝐢(k)) = πP2(k) (r,s)π𝔼(X𝐢(r)X𝐢(s))

for all 𝐢 : [k] [n].

  1. Assume Σ is diagonal and prove the claim.
  2. Prove the claim in general by diagonalizing Σ and using the multilinearity of the claimed formula.

Exercise 1.19 ([12, Exercise 1.6]). Let Z be a standard complex gaussian variable. Show that the moments are

𝔼(ZmZ¯n) = { m!if m = n 0 otherwise .

Hint: start by showing that

𝔼(ZmZ¯n) = 1 π2(t1 + it2)m(t 1 it2)ne(t12+t 22) dt1dt1

and then switch to polar coordinates to show that

𝔼(ZmZ¯n) = 1 π02π0rm+n+1e𝑖𝜃(mn)er2 𝑑𝑟𝑑𝜃.

Use this to prove the claim.

2. Semicircle law and noncommutative probability spaces

Recall the theorem from last class:

Theorem 1.15 (Genus expansion). We have

𝔼(trN(Am)) = πP2(m) ( 1 N )m 2 +1#(γmπ)

for m 1. When m is odd, the formula above should be read as an empty sum, i.e. as 0.

What happens when we send N ? Clearly, for the RHS of Theorem 1.15 to converge, the exponents have to be non-negative; if this is indeed the case, the only summands that will survive are the ones with m 2 + 1 #(𝛾𝜋) = 0.

Definition 2.1. A partition π P(n) is said to be noncrossing if the following situation never occurs: there are some blocks V,W π with V W, and some a,c V and b,d W with a /mo> b /mo> c /mo> d. The set of noncrossing partitions of [n] is denoted by 𝑁𝐶(n).

Theorem 2.2 ([3]). We have #(γ2kπ) k + 1 for all π P2(2k). Moreover, we have equality if and only if π is noncrossing.

We will come back to this in a moment. First, let’s use it to find the limiting moments of μA.

Notation 2.3. Write NC2(m) for the non-crossing pair partitions of [m]. When m is odd, NC2(m) is of course empty.

Proposition 2.4. We have

|NC2(2k)| = 1 k + 1( 2k k)  and |NC2(2k 1)| = 0

for all k 1.

Proof. Exercise.

Definition 2.5. The k-th Catalan number is Cat(k) := 1 k+1( 2k k) .

Corollary 2.6. We have

lim N𝔼(trN(A2k)) = Cat(k) and lim N𝔼(trN(A2k1)) = 0

for k 1.

2.1. Which pairings survive in the limit?. Recall the core combinatorial theorem:

Theorem 2.2 ([3]). We have #(γ2kπ) k + 1 for all π P2(2k). Moreover, we have equality if and only if π is noncrossing.

To see the inequality, we can make a simple observation about permutations:

Lemma 2.7. Let α Sn and let τ = (i,j) be a transposition. Then

#(𝛼𝜏) = { #(α) + 1if i and j are in the same cycle of α #(α) 1 if i and j are in different cycles of α .

Remark 2.8. The permutation γ2kπ can be thought of in terms of a building process: each pair in π is a transposition, which bumps the cycle count up or down by 1. We start with the one cycle of γ2k, and the maximal situation is that each of the k pairs in π increases the number of cycles. This makes #(γ2kπ) k + 1.

For example, let k = 3 and π = (1,4)(2,3)(5,6). Then we start at (1,2,3,4,5,6) with 1 cycle. Next, we multiply by (1,4):

(1,2,3,4,5,6)(1,4) = (1,5,6)(2,3,4)

which has two cycles. Next,

(1,2,3,4,5,6)(1,4)(2,3) = (1,5,6)(2,4)(3)

which has three cycles. Finally,

(1,2,3,4,5,6)(1,4)(2,3)(5,6) = (1,5)(2,4)(3)(6)

which has four cycles. This is the maximal situation: each pair split a cycle in two.

On the other hand, now consider π = (1,5)(2,3)(4,6). Then we again start at (1,2,3,4,5,6) with 1 cycle. Then

(1,2,3,4,5,6)(1,5) = (1,6)(2,3,4,5)

has two cycles, as before. Next,

(1,2,3,4,5,6)(1,5)(2,3) = (1,6)(2,4,5)(3)

which is another splitting step (notice we haven’t arrived at the crossing yet) giving us three cycles. The last step is different:

(1,2,3,4,5,6)(1,5)(2,3)(4,6) = (1,6,5,2,4)(3)

which is back down to two cycles. This is the generic situation: we can have both splits and merges.

Analyzing the case of equality can probably be done directly, but it is really more insightful to put all this in a proper algebraic framework. Let’s work with general partitions and permutations, with size n; the results we need for pairings will fall out as special cases.

Definition 2.9. For α Sn, let |α| be the length of α: the minimal number of transpositions needed to factor α. For α,β Sn, write d(α,β) := |α1β|.

Proposition 2.10. Notation as above. Then

  1. |α| = n #(α) for all α Sn;
  2. (Sn,d) is a metric space.

Proof. For (1), use Lemma 2.7. (2) is a very direct and straightforward verification of axioms.

Proposition 2.11 ([3]). Let ϕ : P(n) Sn be the injective map which turns blocks into cycles. Let

S𝑁𝐶(γn) := {α Sn : d(e,α) + d(α,γn) = d(e,γn)}.

Then S𝑁𝐶(γn) is the range of ϕ|𝑁𝐶(n).

Proof. Finicky induction but elementary. See [3, Theorem 1].

Proof of Theorem 2.2. In (S2k,d), for π P2(2k), the triangle inequality gives

d(e,γ2k) d(e,π) + d(π,γ2k).

Unpacking notation, this translates to

2k 1 (2k k) + (2k #(γ2kπ))

which simplifies to #(γ2kπ) k + 1. The case of equality, i.e.

d(e,γ2k) = d(e,π) + d(π,γ2k),

corresponds to π being noncrossing due to Proposition 2.11.

2.2. Moment problem.

Question 2.12. We just showed that the moments of the average eigenvalue distribution of a GUE converge to the Catalan numbers. How do we know this is a moment sequence, and how do we know this determines the weak limit of the average eigenvalue distribution?

The Catalan numbers appear as the moments of a particular distribution:

Proposition 2.13. Let 𝑑𝜇 = f(x)𝑑𝑥 where

f(x) = { 1 2π4 x2if x [2,2] 0 otherwise .

Then

x2k𝑑𝜇(x) = 1 k + 1( 2k k)  and x2k1𝑑𝜇(x) = 0

for all k 1.

Proof. Exercise.

Okay, so we can guess that maybe the average eigenvalue distribution of a GUE converges to the semicircle distribution. But we just found the moments match – a priori this could just be a funny coincidence. So we can refocus Question 2.12:

Question 2.12. If (μn)n1 is a sequence in M() and μ M() has

lim nxmdμ n(x) =xm𝑑𝜇(x)

for all m 1, then does μn converge to μ in some meaningful sense at the level of measures?

The answer to Question  is a qualified “yes”:

Theorem 2.14. Let μ be a probability measure on with compact support.

  1. If ν is another probability measure on with

    xm𝑑𝜇(x) =xm𝑑𝜈(x)

    for all m 1, then μ = ν.

  2. If (μn)n1 is a sequence of probability measures on with

    lim nxmdμ n(x) =x,𝑑𝜇(x)

    for all m 1, then μn μ weakly as n .

Proof. Big exercise. See [4, Theorems 30.1 &30.2].

Proof of Theorem 1.12. Let μN be the average eigenvalue distribution of an N × N GUE, and let μ be the semicircle distribution with radius 2. We have already proved that

lim Nxmdμ N(x) =xm𝑑𝜇(x)

for all m 1. Since μ has compact support, Theorem 2.14 shows that μA μ weakly as N .

2.3. Noncommutative probability spaces.

Definition 2.15. A -algebra is a unital associative algebra over with an operation 𝒜 𝒜 : aa which is conjugate-linear and has (a) = a and (𝑎𝑏) = ba for all a,b 𝒜. An element a 𝒜 is said to be positive if a = bb for some b 𝒜.

Definition 2.16. A -probability space is a pair (𝒜,φ) where 𝒜 is a -algebra and φ is a linear functional on 𝒜 with φ(aa) 0 for all a 𝒜, and φ(1) = 1.

Remark 2.17. The point of Definition 2.16 is to view the elements of 𝒜 as “random variables” and the linear functional φ as the “expectation”. This abstraction becomes useful when one needs to consider noncommutative objects as random variables.

Example 2.18 (Commutative case). Let (Ω,F,P) be a (classical) probability space, i.e. a measure space with total measure 1. Let 𝒜 := L(Ω,F,P) be the algebra of essentially bounded functions, and let φ be the linear functional defined by

φ(f) =Ωf(ω)𝑑𝑃(ω)

for f 𝒜. This is a -probability space, where the -operation is complex conjugation – in fact, it has a lot of analytic structure that we will put aside for now.

Example 2.19 (Finite moments). Major objection to Example 2.18: many random variables we care about are not bounded. We can set up a similar example that will actually be more relevant for us: let 𝒜 = L(Ω,F,P) where L(Ω,F,P) := p1Lp(Ω,F,P), i.e. the algebra of random variables with all moments finite. (Use Hölder’s inequality to show 𝒜 is a -algebra: once to show Lp Lq for p q, and once to show 𝑓𝑔 L1 for f,g 𝒜.) The expectation φ is defined in the same way, by integrating functions against P. This is again a -probability space.

Example 2.20 (Scalar matrices). Let 𝒜 = MN be the algebra of N × N matrices and let φ(A) = 1 NTr(A). This is a -probability space which is not commutative.

Example 2.21 (Random matrices). Let 𝒜 = MN(L(Ω,F,P)) be the -algebra of matrices with entries that are RVs with finite moments, and let φ be the expected trace: φ(A) = 𝔼tr(A). For example, our GUE random matrix lives here.

So far, we haven’t seen anything new. The real reason for making this abstraction is to include genuinely noncommutative situations – where there is no hope of identifying an underlying classical probability space – and view them as essentially probabilistic in nature. The best example comes from groups and their group rings.

Example 2.22 (Group algebra). Let G be a group and let 𝒜 := [G]. The expectation is defined by

φ ( gGagg) = ae.

This is a -probability space.

Next class, we will use group algebras and group-theoretical freeness to motivate Voiculescu’s highly influential notion of free independence. Then, we will use it to construct a concrete operator model for the asymptotics of multiple GUEs.

2.4. Exercises.

Exercise 2.23. In this exercise, you will enumerate the noncrossing pairings and show that they are counted by the moments of the semicircle distribution. (1) and (2) are classic textbook combinatorics. (3) is a somewhat involved calculus problem.

  1. Find a recursion for |NC2(2k)|. Think back to the enumeration of P2(2k).
  2. Show that Cat(k) := 1 k+1( 2k k) satisfy the same recursion that you found in (1). To do this, first show that with C(z) := k=0Cat(k)zk, we have C(z) = 114z 2z , and derive the functional equation C(z) = 1 + 𝑧𝐶(z)2. Recover the recursion from this functional equation.
  3. Let 𝑑𝜇 = f(x)𝑑𝑥 where

    f(x) = { 1 2π4 x2if x [2,2] 0 otherwise .

    Show that

    x2k𝑑𝜇(x) = Cat(k) and x2k1𝑑𝜇(x)

    for k 1. To do this, make the substitution x = 2cos𝜃. You can use the following identity without proof:

    0π cos2m𝜃𝑑𝜃 = (2m 1)!! (2m)!! π.

Exercise 2.24. In this exercise, you will fill in the details of why convergence in moments implies weak convergence in the case of the semicircular law. In both parts, let μ be a probability measure on with compact support. This is part genuine exercise, part “book report”. Both parts require some knowledge of measure theory.

  1. Let ν be a probability measure on with

    xm𝑑𝜇(x) =xm𝑑𝜈(x)

    for all m 1. Prove that μ = ν. Note: there is no assumption that ν has compact support! Be careful and think about how to get around this issue.

  2. Let (μn)n1 be a sequence of probability measures on with

    lim nxmdμ n(x) =xm𝑑𝜇(x)

    for all m 1. Prove that μn μ weakly. This is standard textbook material (Billingsley P& Theorems 30.1 and 30.2). What I’m looking for is a concise summary of the argument that addresses its various subtleties: for example, how exactly is (1) being used, how do you establish tightness, etc. You don’t need to reproduce proofs of basic classical theorems, like Markov inequality or Helly selection theorem/Prokhorov’s theorem.

3. Free independence

3.1. Motivation and definition.

Definition 3.1 (Group-theoretical freeness). Let G be a group and let {Gi : i I} be a family of subgroups. The subgroups are free if for all k 1, for all i1,,ik I with ijij+1, and for all gj Gij with gje, we have g1gke.

How does this look in terms of the -probability space ([G],φ)? Well, the condition that

gjeg1gke translates to φ(gj) = 0φ(g1gk) = 0.

Definition 3.2 (Freeness of subalgebras). Let (𝒜,φ) be a -probability space, and let {𝒜i : i I} be a family of unital -subalgebras. The family {𝒜i : i I} is said to be freely independent (or free) if for all k 1, for all i1,,ik I with ijij+1, and for all aj 𝒜ij with φ(aj) = 0, we have φ(a1ak) = 0.

Proposition 3.3. Let G be a group and let {Gi : i I} be a family of subgroups. Then the subgroups are free in the group-theoretical sense if and only if the subalgebras [Gi] of [G] are freely independent with respect to φ.

Proof. Exercise. I think this is an essential one.

Definition 3.4 (Freeness of variables). Let (𝒜,φ) be a -probability space and let {ai : i I} be a family in 𝒜. For i I, let 𝒜i be the -subalgebra of 𝒜 generated by ai. The family {ai : i I} is said to be freely independent (or free) if the family {𝒜i : i I} is freely independent.

3.2. What does freeness actually do?. To understand what freeness means, let’s look at some examples. This section is entirely based on [13, Lecture 5].

Example 3.5. Suppose that 𝒜 and B are freely independent subalgebras of some ambient algebra. Let’s look at some words with mixtures between 𝒜 and B.

Length two: for a 𝒜 and b B, freeness says

0 = φ((a φ(a))(b φ(b))) = φ(𝑎𝑏) φ(a)φ(b)

so φ(𝑎𝑏) = φ(a)φ(b).

Length three: for a1,a2 𝒜 and b B, freeness says

φ((a φ(a))(b φ(b))(a φ(a))) = 0.

Expanding and using the length two factorization, this says

0 = φ(a1ba2) φ(a2)φ(a1b) φ(b)φ(a1a2) + φ(a1)φ(a2)φ(b) φ(a1)φ(ba2) + φ(a1)φ(a2)φ(b) + φ(a1)φ(b)φ(a2) φ(a1)φ(a2)φ(b) = φ(a1ba2) φ(a2)φ(a1)φ(b) φ(b)φ(a1a2) + φ(a1)φ(a2)φ(b) φ(a1)φ(b)φ(a2) + φ(a1)φ(a2)φ(b) + φ(a1)φ(b)φ(a2) φ(a1)φ(a2)φ(b) = φ(a1ba2) φ(a1a2)φ(b)

so φ(a1ba2) = φ(a1a2)φ(b).

IMPORTANT: this factorization pattern does not continue. The same kind of calculation, with a1,a2 𝒜 and b1,b2 B, shows that

φ(a1b1a2b2) = φ(a1a2)φ(b1)φ(b2) + φ(a1)φ(a2)φ(b1b2) φ(a1)φ(b1)φ(a2)φ(b2).

The point of all this is that mixed moments only depend on mixed moments from the same subalgebra, via some complicated pattern.

Proposition 3.6. Let (𝒜,φ) be a -probability space and let {𝒜i : i I} be freely independent -subalgebras. Let B be the -subalgebra of 𝒜 generated by {𝒜i : i I}. Then φ|B is uniquely determined by {φ|𝒜i : i I}.

Proof. See [13, Lemma 5.13].

3.3. Free central limit theorem. Similar to the last section, this one is entirely based on [13, Lecture 8]. From now on, fix a -probability space (𝒜,φ) and a sequence (an)n1 in 𝒜 with an = an, φ(an) = 0, and φ(an2) = 1 for all n 1. Also assume all the an have the same moment sequence and they are freely independent.

Theorem 3.7 ([15]). We have

lim Nφ ((a1 + + aN N )m) = { Cat(k)if m = 2k 0 otherwise

for m 1.

The first step is to simply unpack the LHS:

φ ((a1 + + aN N )m) = 1 Nm2 𝐢:[m][N]φ(a𝐢(1)a𝐢(m)).

Notation 3.8. For 𝐢 : [m] , let ker(𝐢) be the partition of [m] defined by letting p and q be in the same block if and only if 𝐢(p) = 𝐢(q).

Lemma 3.9. If 𝐢,𝐣 : [m] have ker(𝐢) = ker(𝐣), then

φ(a𝐢(1)a𝐢(m)) = φ(a𝐣(1)a𝐣(m)).

Idea of proof. This follows from the assumption that the an all have the same distribution, plus the fact that free independence means mixed moments are determined by the individual distributions.

For example, suppose the common partition is {{1,3},{2,4}}, and 𝐢 = (1,2,1,2) and 𝐣 = (5,3,5,3). Then, using the computations from Example 3.5, we have

φ(a1a2a1a2) = φ(a12)φ(a 2)2 + φ(a 1)2φ(a 22) φ(a 1)2φ(a 2)2

and

φ(a5a3a5a3) = φ(a52)φ(a 3)2 + φ(a 5)2φ(a 32) φ(a 5)2φ(a 3)2.

Since we assume the an all have the same moment sequence, these are the same.

Notation 3.10. In light of Lemma 3.9, for π P(m), write φ(π) for the common value of φ(a𝐢(1)a𝐢(m)) where 𝐢 : [m] [N] has ker(𝐢) = π. Let AπN be the number of maps 𝐢 : [m] [N] with ker(𝐢) = π.

So we have

φ ((a1 + + aN N )m) = 1 Nm2 πP(m)AπNφ(π)

and the task is to see what happens to AπN as N . We can immediately dispense with any π that has a singleton block:

Lemma 3.11. If π P(m) and there is some V π with |V | = 1, then φ(π) = 0.

Proof. Suppose that 𝐢(j) = r and {j} π. Then

φ(π) = φ(a𝐢(1)a𝐢(j1)ara𝐢(j+1)a𝐢(m)) = φ(a𝐢(1)a𝐢(j1)a𝐢(j+1)a𝐢(m))φ(ar)

using freeness and one of the computations in Example 3.5. We have φ(ar) = 0, so φ(π) = 0.

Now, we need to understand the limit of AπN where π P(m) has |V | 2 for all V π. This is straightforward: a map 𝐢 : [m] [N] with π = ker(𝐢) amounts to a choice of label from [N] for each block of π, where we do not allow duplicate labels. There are

N(N 1)(N |π| + 1)

choices, so we have

φ ((a1 + + aN N )m) = πP(m) |V |2V π N(N 1)(N |π| + 1) Nm2 φ(π).

The condition on π makes |π| m2, and when N , the fraction involving N goes to 1 if |π| = m2, and goes to 0 otherwise. The only way we can have |π| = 1 is if π P2(m). So we have

lim Nφ ((a1 + + aN N )m) = πP2(m)φ(π).

Of course, if m is odd, this is 0.

Now let us compute φ(π) for π P2(2k). Pick 𝐢 : [m] with π = ker(𝐢). If 𝐢(j)𝐢(j + 1) for all 1 j m 1, then φ(π) = 0 by freeness; otherwise, there is some 1 j m 1 with 𝐢(j) = 𝐣(j + 1). Using the computations from Example 3.5, we have

φ(π) = φ(a𝐢(1)a𝐢(j1)(a𝐢(j)a𝐢(j+1))a𝐣(j+2)a𝐢(m)) = φ(a𝐢(1)a𝐢(j1)a𝐢(j+2)a𝐢(m))φ(a𝐢(j)a𝐢(j+1)) = φ(a𝐢(1)a𝐢(j1)a𝐢(j+2)a𝐢(m)).

The pairings where we can keep doing this k times are exactly the ones that are noncrossing, so we have

φ(π) = { 1if π NC2(2k) 0if π P2(2k) NC2(2k) .

This shows that

lim Nφ ((a1 + + aN N )m) = { Cat(k)if m = 2k 0 otherwise

which is the claim of Theorem 3.7.

3.4. Exercises.

Exercise 3.12. Let G be a group and let 𝒜 = [G]. Let φ be the linear functional on 𝒜 defined by

φ ( gGagg) = ae.

I would say (3) is essential for understanding the concept of free independence. I strongly suggest you do it, even if you don’t hand it in.

  1. Prove that φ is positive by directly using the definition above.
  2. Find a Hilbert space H, a -homomorphism π : 𝒜 B(H), and a vector ξ H such that φ(a) = π(a)ξ,ξ for all a 𝒜. Use the fact that such a thing exists to prove that φ is positive.
  3. Let {Gi : i I} be a family of subgroups of G, and for i I, let 𝒜i := [Gi]. Prove that {Gi : I} are free in the group-theoretical sense if and only if {𝒜i : i I} are freely independent.

4. Asymptotic freeness of multiple GUEs

4.1. Free semicircular families. The example of freeness in the last section will become relevant when we talk about random unitary matrices, but for now, we are looking for some noncommutative data that captures the asymptotics of multiple GUEs. The answer is that when N , they become freely independent families of semicircular variables.

Definition 4.1 (Semicircular family). Let (𝒜,φ) be a -probability space. A free semicircular family in (𝒜,φ) consists of some elements s1,,sr 𝒜 such that

Theorem 4.2. For each r 1, there is a -probability space (𝒜,φ) and some elements s1,,sr 𝒜 such that

  1. {s1,,sr} is a free semicircular family, and
  2. 𝒜 is generated by {s1,,sr}.

In this section, we will construct a concrete instance of a free semicircular family using creation and annihilation operators on the so-called full Fock space. This model will allow for very concrete calculations involving certain lattice paths which are in obvious bijection with NC2.

Definition 4.3 (Full Fock space). Fix a Hilbert space H. The full Fock space on H is the Hilbert space

F(H) := Ω n=1Hn.

Here, the inner product of Hn is given on elementary tensors by

ξ1 ξn,η1 ηn = ξ1,η1ξn,ηn

and the direct sum is the set of sequences (ξn)n0 where ξ0 Ω and ξn Hn for n 1, with n=1ξn2 /mo> . The inner product is given by

(ξn)n0,(ηn)n0 = n=0ξ n,ηn.

The -algebra B(F(H)) carries a canonical vacuum state:

φ(T) = TΩ,Ω.

Then (B(F(H))),φ) is a -probability space.

Proposition 4.4. For ξ H, there is an operator c(ξ) B(F(H)) defined by c(ξ)Ω = ξ and

c(ξ)ξ1 ξn = ξ ξ1 ξn.

The adjoint is given by

c(ξ)Ω = 0c(ξ)ξ 1 = ξ1,ξΩ,

and

c(ξ)ξ 1 ξn = ξ1,ξξ2 ξn

and has the following useful property:

c(ξ)c(η) = η,ξ 1.

Specifically, c(ξ) is a scalar multiple of an isometry, and c(ξ) = ξ.

Proof. Exercise.

Proposition 4.5 (Semicircular). Let H be a Hilbert space and let ξ H. Then c(ξ) + c(ξ)B(F(H)) is a semicircular element with radius 2ξ.

The combinatorics of noncrossing pairings are directly built into this model, but in a slightly different form.

Definition 4.6. A Dyck path with m steps is a path in 0 × 0 which

Note that this definition includes the requirement that the path never goes below the x-axis. Let D(m) be the set of Dyck paths with m steps; when m is odd, D(m) = .

Lemma 4.7. There is a canonical bijection NC2(m) D(m).

Proof of Proposition 4.5. To clean up notation, let a = c(ξ). We will compute moments:

φ((a + a)m) = 𝜖1,,𝜖m{1,}φ(a𝜖1 a𝜖m ).

The key observation is that φ(a𝜖1a𝜖m) is just an indicator of a certain combinatorial property of the sequence (𝜖1,,𝜖m): the sequence determines a lattice path with NE and SE steps, and the indicator detects whether or not it is a Dyck path.

To see this, for 𝜖1,,𝜖m {1,}, let

λj = { 1 if 𝜖mj+1 = 1 1if 𝜖mj+1 =

for 1 j m. Then, if we read the expression ξ0 as Ω, we have

a𝜖1 a𝜖m Ω = { ξ(λ1++λm)if λ 1 + + λk 01 k m 0 otherwise

and

φ(a𝜖1 a𝜖m ) = a𝜖1 a𝜖m Ω,Ω = { ξ(λ1++λm),Ωif λ 1 + + λk 01 k m 0 otherwise

The inner product in the last line is 1 if λ1 + + λm = 0, and 0 otherwise. The condition λ1 + + λk 0 for all 1 k m means the path doesn’t go below the x-axis, and the condition that λ1 + + λm = 0 means it does return to the x-axis, i.e. it’s a Dyck path. So φ((a + a)m) is the number of Dyck paths with m steps. By Lemma 4.7, this shows that a + a is semicircular.

Proposition 4.8 (Freeness). Let H be a Hilbert space and let H1,,Hr be mutually orthogonal subspaces of H. For 1 i r, let 𝒜i be the -subalgebra of B(F(H)) generated by {c(ξ) : ξ Hi}. Then 𝒜1,,𝒜r are freely independent.

Proof. Let 1 i1,,ik r with ijij+1, and pick Tj 𝒜ij with φ(Tj) = 0. Assume without loss of generality that

Tj = c(ξ1j)c(ξ mjj)c(η 1j)c(η njj)

with mj + nj 1 and all vectors in Hij. The reason we can do this, and ignore contributions from 1, is that any non-zero contribution from 1 will make φ non-zero.

If there is some 1 j k 1 with nj0 and mj+10, then the product TjTj+1 includes the product c(ηnjj)c(ξ1j+1) = ξ1j+1,ηnjj1. Since Hj Hj+1, this is 0 and then φ(T1Tk) = 0.

Otherwise, for all 1 j k 1, we have either nj = 0 or mj+1 = 0. Then

T1Tk = c(ξ1)c(ξm)c(η1)c(η n)

for some ξ1,,ξm,η1,,ηn H with m + n 1, and

φ(T1Tk) = c(ξ1)c(ξm)c(η1)c(η n)Ω,Ω = c(η1)c(η n)Ω,c(ξ m)c(ξ 1)Ω.

Since at least one of m or n is 1, one of the two arguments in the inner product is of the form c(ξ)Ω = 0, so φ(T1Tk) = 0.

Proof of Theorem 4.2. Let {e1,,er} be the standard orthonormal basis of r, and let sj = c(ej) B(F(r)) for 1 j r. Let 𝒜 be the -algebra in B(F(r)) generated by {s1,,sr}, and let φ(T) = TΩ,Ω for T 𝒜.

Remark 4.9. Theorem 4.2 is a bit artificially weak, because I don’t want to assume any background in operator algebras. If we let 𝒜 be the C-subalgebra of B(F(r)) generated by {c(e1) + c(e1),,c(er) + c(er)} and let φ(T) = TΩ,Ω for T 𝒜, then

  1. Ω is a cyclic vector for 𝒜, meaning that {TΩ : T 𝒜} is dense in F(H);
  2. φ is a faithful trace;
  3. for any C-algebra B with a faithful state ψ, if B is generated by a free semicircular family of size r, then there is a state-preserving -isomorphism 𝒜 B.

This is a very nice object and in light of (3), it is referred to as the semicircular C-algebra with r generators. See [13, Lecture 7] for details.

4.2. Asymptotic freeness of GUEs.

Theorem 4.10 ([16]). Let {A1,,Ar} be an independent family of N × N GUEs, and let {s1,,sr}be a free semicircular family. Then (A1,,Ar) converges in distribution to (s1,,sr), in the following sense: for all m 1 and 𝐢 : [m] [r], we have

lim N𝔼tr(A𝐢(1)A𝐢(m)) = φ(s𝐢(1)s𝐢(m)).

This theorem follows from an easy multivariate generalization of the genus expansion:

Theorem 1.15. Let m 1 and 𝐢 : [m] [r]. Then

𝔼(trN(A𝐢(1)A𝐢(m))) = πP2(m) πker(𝐢) ( 1 N )m 2 +1#(γmπ).

When m is odd, the formula should be read as an empty sum, i.e. as 0.

Proof of Theorem 4.10. When N in the formula above, the only summands that survive are the ones with m 2 + 1 #(γmπ) = 0. These are the ones with π noncrossing, so

𝔼(trN(A𝐢(1)A𝐢(m))) = |{π NC2(m) : π ker(𝐢)}| + O(N2)

and we need to show that

φ(s𝐢(1)s𝐢(m)) = |{π NC2(m) : π ker(𝐢)}|.

This is similar to Proposition 4.5: in the proof there, the words a𝜖1a𝜖m which contribute to the count |NC2(m)| correspond to Dyck paths. There is a similar constraint here, but we also need pairs (r,s) of up and down steps to have 𝐢(r) = 𝐢(s). In terms of NC2(m), this amounts to π ker(𝐢).

Here is an example of how this works: consider the path which makes two up steps and then two down steps, or in other words the noncrossing pairing {{1,4},{2,3}}. This corresponds to

a𝐢(1)a 𝐢(2)a 𝐢(3)a𝐢(4)Ω = a𝐢(1)a 𝐢(2)(e 𝐢(3) e𝐢(4)) = a𝐢(1)e 𝐢(3),e𝐢(2)e𝐢(4) = e𝐢(3),e𝐢(2)e𝐢(1),e𝐢(4)Ω

so we have the additional requirement that 𝐢(2) = 𝐢(3) and 𝐢(1) = 𝐢(4), i.e. the pairing is compatible with 𝐢.

4.3. Exercises.

Exercise 4.11. Let H be a Hilbert space, let F(H) be its full Fock space, and let F0(H) be the dense subspace of finite linear combinations of elementary tensors. In this exercise, you will fill in the details of the construction of creation and annihilation operators on F(H).

  1. For ξ H, define c(ξ) : F0(H) F(H) by letting

    c(ξ)(ξ1 ξn) = ξ ξ1 ξn

    and extending by linearity. Prove that c(ξ) is bounded and extends to an element c(ξ) B(H) with c(ξ) = ξ. Try to do it in a way that implies c(ξ) is a scalar multiple of an isometry.

  2. Prove that

    c(ξ)Ω = 0c(ξ)ξ 1 = ξ1,ξΩ,

    and

    c(ξ)(ξ 1 ξn) = ξ1,ξξ2 ξn

    for n 2.

  3. Prove that c(ξ)c(η) = η,ξ 1.

Exercise 4.12. A C-probability space is a -probability space (𝒜,φ) where 𝒜 is equipped with a norm, with respect to which it is complete, which has the properties

This might be familiar to you as a C-algebra equipped with a state. As usual in this course, we consider all algebras and related structures to be unital. This exercise requires some knowledge of functional analysis beyond the prerequisites.

  1. Let (𝒜,φ) be a C-probability space and let a 𝒜 be normal (meaning that aa = aa). Prove that there is a unique compactly supported measure on such that

    φ(apaq) =zpz¯q𝑑𝜇(z)

    for all p,q 0. This is called the -distribution, or simply the distribution, of a.

  2. Let (𝒜,φ) be a C-probability space and let {𝒜i : i I} be a freely independent family of -subalgebras of 𝒜. For i I, let Bi be the norm-closure of 𝒜i. Prove that {Bi : i I} are freely independent.

5. How to integrate over compact groups

In this class and the next, I want to show how to integrate functions on the unitary group UN. Specifically, C(UN) has a dense subalgebra generated by the functions u𝑖𝑗 : UN – which take a matrix U UN and pick out the (i,j)-th entry – and their conjugates (this is an easy exercise with the Stone-Weierstrass theorem). So here is the problem we will solve:

Problem 5.1. Let 𝐢,𝐣 : [k] [N] and 𝐢,𝐣 : [l] [N]. How to compute

UNu𝐢(1)𝐣(1)u𝐢(k)𝐣(k)u𝐢(1)𝐣(1)¯u𝐢(l)𝐣(l)¯𝑑𝑈?

This problem was solved comprehensively in [5] using the representation theory of Sk and Schur-Weyl duality. The ideas of that paper were refined in [7], and in my opinion the most attractive approach to the problem is the one outlined in the recent survey [6]. I claim no originality in my exposition of these works.

5.1. Reminders about integration. Let G be a compact group. Fundamental theorem: there is a unique left-invariant probability measure on G, called the Haar measure. In terms of integration, left invariance means that

Gf(y1x)𝑑𝑥 =Gf(x)𝑑𝑥

for all y G. Here are some important properties of this measure:

5.2. Projection onto fixed points. The key starting point is the following observation: we can assemble the integration problem into an operator

P := (UNu𝐢(1)𝐣(1)u𝐢(k)𝐣(k)u𝐢(1)𝐣(1)¯u𝐢(l)𝐣(l)¯𝑑𝑈 )𝐢,𝐢,𝐣,𝐣

on (N)k (N)l. This matrix is actually an orthogonal projection, and its range is easy to describe. I think it will clarify things if we work in a more general setting.

Proposition 5.2. Let G be a compact group and let ρ : G 𝒰(H) be a unitary representation of G on a Hilbert space H. Let

P :=Gρ(x)𝑑𝑥,

where the integral is defined in the weak sense by

𝑃𝜉,η =Gρ(x)ξ,η𝑑𝑥

for all ξ,η H. Then P is the orthogonal projection onto

FixG(ρ) := {ξ H : ρ(x)ξ = ξ for all x G}.

Remark 5.3. If you are not comfortable with the description of P given above, you can pretend that H is finite-dimensional with orthonormal basis {ei : i I} and

P = (Gρ(x)ej,ei𝑑𝑥)i,jI.

You will not lose anything – this is the situation we care about.

Proof of Proposition 5.2. Invariance gives

P2ξ,η =Gρ(x)𝑃𝜉,η𝑑𝑥 =G𝑃𝜉,ρ(x)η𝑑𝑥 =G (Gρ(y)ξ,ρ(x)η𝑑𝑦)𝑑𝑥 =Gρ(𝑥𝑦)ξ,η𝑑𝑦𝑑𝑥 =Gρ(x)ξ,η𝑑𝑥 = 𝑃𝜉,η

so P2 = P, and unimodularity gives

Pξ,η = ξ,𝑃𝜂 = 𝑃𝜂,ξ¯ = Gρ(x)η,ξ𝑑𝑥¯ =Gρ(x)η,ξ¯𝑑𝑥 =Gξ,ρ(x)η𝑑𝑥 =Gρ(x)ξ,η𝑑𝑥 =Gρ(x1)ξ,η𝑑𝑥 =Gρ(x)ξ,η𝑑𝑥 = 𝑃𝜉,η

so P = P. In other words, P is an orthogonal projection.

To compute the range, one inclusion is immediate: if ξ FixG(ρ), then

𝑃𝜉,η =Gρ(x)ξ,η𝑑𝑥 =Gξ,η𝑑𝑥 = ξ,η

for all η H so 𝑃𝜉 = ξ and ξ is in the range of P.

For the reverse inclusion, suppose that ξFixG(ρ), so there is some x G such that ρ(x)ξξ. If we can show that 𝑃𝜉 /mo> ξ, then we can conclude that ξ is not in the range of P, because if it was, we would have 𝑃𝜉 = ξ.

Claim: for any ξ H, we have

ξ2 𝑃𝜉2 = 1 2Gξ ρ(x)ξ2𝑑𝑥.

To see this, expand

ξ ρ(x)ξ2 = ξ ρ(x)ξ,ξ ρ(x)ξ = ξ,ξξ,ρ(x)ξρ(x)ξ,ξ + ρ(x)ξ,ρ(x)ξ = 2ξ2 (ρ(x)ξ,ξ + ρ(x)ξ,ξ¯) (ρ(x) is unitary)

so

Gξ ρ(x)ξ2 = 2ξ2 (Gρ(x)ξ,ξ𝑑𝑥 +Gρ(x)ξ,ξ¯𝑑𝑥) = 2ξ2 (Gρ(x)ξ,ξ𝑑𝑥 + Gρ(x)ξ,ξ𝑑𝑥¯) = 2ξ2 2ReGρ(x)ξ,ξ𝑑𝑥 = 2ξ2 2Re𝑃𝜉,ξ = 2ξ2 2ReP2ξ,ξ = 2ξ2 2Re𝑃𝜉,𝑃𝜉 = 2ξ2 2𝑃𝜉2

and dividing by 2 proves the claim.

The point of this claim is that since we have ξ with ρ(x)ξξ for some x G, i.e. ξ ρ(x)ξ /mo> 0 for some x G, the same is true on a set of positive measure, so Gξ ρ(x)ξ2𝑑𝑥 /mo> 0 and 𝑃𝜉2 /mo> ξ2, hence 𝑃𝜉 /mo> ξ and ξ cannot be in the range of P.

5.3. Gram matrices and pseudoinverses. The basic data we have in hand is the following: a finite-dimensional Hilbert space H, a subspace K, and the orthogonal projection P of H onto K. Suppose we have a finite spanning set {ξα : α A} for K, and let G be its Gram matrix:

G = (ξα,ξβ)α,βA.

We want to describe the entries of P.

Proposition 5.4 (Moore-Penrose pseudoinverse). Let A be an n × n matrix. There is a unique n × n matrix B such that

  1. 𝐴𝐵𝐴 = A,
  2. 𝐵𝐴𝐵 = B, and
  3. 𝐴𝐵 and 𝐵𝐴 are self-adjoint.

This matrix B is called the pseudoinverse of A and is denoted by A+. If A is invertible, then A+ = A1.

Remark 5.5. How to construct A: invert A|ker(A) : ker(A)range(A) as usual and send range(A) to 0. If A is self-adjoint – as it will be for us – then you can diagonalize and simply invert the nonzero eigenvalues while leaving any 0s alone.

Theorem 5.6. Let K be a finite-dimensional Hilbert space with a finite spanning set {ξα : α A} with Gram matrix G. Then for η,ζ K, we have

η,ζ = α,βAη,ξαζ,ξβ¯Gα,β+.

Proof. The RHS can be written as

βA ( αAη,ξαGα,β+) ξ β,ζ

so we want to show that with η = βAyβξβ, we have

yβ = αAη,ξαGα,β+.

Let z be the vector of values on the RHS above:

zβ := αAη,ξαGα,β+ i.e. z := (η,ξ α)αAG+

so 𝑧𝐺 = (η,ξα)αAG+G.

We have

(η,ξα)αA = (\langle βAyβξβ,ξα\rangle ) αA = ( βAyβξβ,ξα)αA = 𝑦𝐺

so

(\langle αAzαξα,ξβ\rangle ) βA = ( αAzαξα,ξβ)βA = (zα)αA(ξα,ξβ)α,βA = 𝑧𝐺 = 𝑦𝐺G+G = 𝑦𝐺 = (η,ξ β)βA.

Now, to show η = αAzαξα, it is enough to show that for any v K, we have

η,v = \langle αAzαξα,v\rangle .

But we know that v = βAvβξβ since {ξβ : β A} spans K, so

η,v = βAvβ¯η,ξβ = βAvβ¯ \langle αAzαξα,ξβ\rangle = \langle αAzαξα,v\rangle

and we are done.

Corollary 5.7. Let H be a finite-dimensional Hilbert space with orthonormal basis {ei : i I}. Let K be a subspace of H with a spanning set {ξα : α A}and Gram matrix G, and let P be the orthogonal projection of H onto K. Then

Pei,ej = α,βAei,ξαξβ,ejGα,β+

for i,j I.

Proof. The theorem says

Pei,ej = P2e i,ej = Pei,Pej = α,βAPei,ξαξβ,PejGα,β+ = α,βAei,Pξ αPξ β,ejGα,β+ = α,βAei,ξαξβ,ejGα,β+

as claimed.

5.4. Exercises.

Exercise 5.8. Prove that if kl, then for any 𝐢,𝐣 : [k] [N] and 𝐢,𝐣 : [l] [N], we have

UNu𝐢(1)𝐣(1)u𝐢(k)𝐣(k)u𝐢(1)𝐣(1)¯u𝐢(l)𝐣(l)¯𝑑𝑈 = 0.

Hint: this can be done directly using facts that have appeared before this point.

6. Weingarten formula

Recall what we did last time: we let P be the operator

(UNu𝐢(1)𝐣(1)u𝐢(k)𝐣(k)u𝐢(1)𝐣(1)¯u𝐢(k)𝐣(k)¯𝑑𝑈 )𝐢,𝐢,𝐣,𝐣

on (N)k (N)k, and we showed that P is the orthogonal projection onto the subspace

FixUN(Uk U¯k) := {ξ (N)k (N)k : (Uk U¯k)ξ = ξU U N}.

If {ξα : α A} is a spanning set of this range, then one can show (by elementary linear algebra) that the entries of P are

Pe𝐣 e𝐣,e𝐢 e𝐢 = α,βAξα,e𝐢 e𝐢ξβ,e𝐣 e𝐣Gα,β+

where e𝐢 := e𝐢(1) e𝐢(k), G := (ξα,ξβ)α,βA is the Gram matrix of the spanning set, and G+ is the pseudoinverse of G (or just the inverse if the spanning set is linearly independent).

So at least in principle, the solution to our problem of computing unitary matrix integrals comes down to finding a convenient spanning set for the space of fixed points.

6.1. The spanning set.

Notation 6.1. For α Sk, let

Eα := 𝐢:[k][N]e𝐢(α(1)) e𝐢(α(k)) e𝐢(1) e𝐢(k).

Theorem 6.2 (Schur-Weyl duality). We have

FixUN(Uk U¯l) = span{E α : α Sk}.

Remark 6.3. We do not have time in this course to prove this theorem, but the idea is easy to explain. We have two natural representations ρ and π, of UN and Sk respectively, on (N)k:

ρ(U)ξ1 ξk = Uξ1 Uξk and π(σ)ξ1 ξk = ξσ1(1) ξσ1(k).

You can immediately check that they commute. The basic principle of Schur-Weyl duality is that if 𝒜 is the subalgebra spanned by ρ and B is the subalgebra spanned by π, we actually have 𝒜 = B and B = 𝒜. The theorem above is a consequence of this more fundamental theorem.

Proposition 6.4. For α Sk and 𝐢,𝐢 : [k] [N], we have

Eα,e𝐢e𝐢 = { 1if 𝐢 = 𝐢 α 0otherwise .

Proof. We have

Eα,e𝐢 e𝐢 = 𝐣:[k][N]e𝐣(α(1)) e𝐣(α(k)) e𝐣(1) e𝐣(k), e𝐢(1) e𝐢(k) e𝐢(1) e𝐢(k) = e𝐢(α(1)) e𝐢(α(k)),e𝐢(1) e𝐢(k) (𝐣 = 𝐢 ) = { 1if 𝐢 = 𝐢 α 0otherwise

hence the claim.

So with W := G+, the integration formula now reads as

UNu𝐢(1)𝐣(1)u𝐢(k)𝐣(k)u𝐢(1)𝐣(1)¯u𝐢(k)𝐣(k)¯𝑑𝑈 = α,βSk 𝐢=𝐢α 𝐣=𝐣β Wα,β

and what remains is to understand something about the matrix W.

6.2. Character expansion.

Notation 6.5. Define GrN,k : Sk by

GrN,k(σ) = N#(σ)

for σ Sk. If no confusion will arise, we may refer to GrN,k as Gr.

Proposition 6.6. The function GrN,k appears in the following familiar ways:

  1. it is the character of the representation π of Sk on (N)k;
  2. we have Gα,β = GrN,k(α1β) for all α,β Sk.

Proof. For (1), compute

Tr(π(σ)) = 𝐢:[k][N]π(σ)e𝐢(1) e𝐢(k),e𝐢(1) e𝐢(k) = 𝐢:[k][N]e𝐢(σ1(1)) e𝐢(σ1(k)),e𝐢(1) e𝐢(k) = 𝐢:[k][N]δ𝐢=𝐢σ1 = #(𝐢 : [k] [N] : 𝐢 = 𝐢 σ1) = N#(σ).

The last line is a straightforward count: 𝐢 = 𝐢 σ1 just means 𝐢 is constant on the cycles of σ1, so we are picking a label from [N] for each cycle, and there are N#(σ1) = N#(σ) choices.

For (2), for α,β Sk, we have

Eα,Eβ = 𝐢,𝐣:[k][N]e𝐢(α(1)) e𝐢(α(k)) e𝐢(1) e𝐢(k), e𝐣(β(1)) e𝐣(β(k)) e𝐣(1) e𝐣(k) = 𝐢:[k][N]e𝐢(α(1)) e𝐢(α(k)),e𝐢(β(1)) e𝐢(β(k)) (the last k components force 𝐢 = 𝐣 ) = 𝐢:[k][N]e𝐢(1) e𝐢(k),e𝐢(α1β(1)) e𝐢(α1β(k)) (replace 𝐢  with 𝐢 α1 ) = 𝐢:[k][N]δ𝐢=𝐢(α1β) = N#(α1β)

so Eα,Eβ = GrN,k(α1β).

We have a function GrN,k on Sk, and it is clearly a class function – GrN,k(σ) only depends on σ through #(σ), which only depends on the conjugacy class of σ. Equivalently, this can be seen as a central element of [Sk]:

Lemma 6.7. Let f : Sk be a function. The element σSkf(σ)σ of [Sk] is central if and only if f is a class function.

Proof. First, suppose that f is a class function. Then for any τ Sk, we have

σSkf(σ)𝜎𝜏 = σSkf(𝜏𝜎τ1)𝜏𝜎τ1τ = σSkf(σ)𝜏𝜎 = τ σSkf(σ)σ

and passing to linear combinations of τs, we see that σSkf(σ)σ is central. Conversely, if σSkf(σ)σ is central, then for any τ Sk, we have

σSkf(𝜏𝜎τ1)σ = σSkf(σ)τ1𝜎𝜏 = τ1 ( σSkf(σ)σ)τ = τ1τ σSkf(σ)σ = σSkf(σ)σ

so for all σ Sk, we have f(𝜏𝜎τ1) = f(σ).

This means, in particular, that we have a central element of [Sk] corresponding to GrN,k:

Gr~N,k = σSkGrN,k(σ)σ.

To describe the inverse, we need some tools from algebra. From this point forward, we will assume N k for simplicity.

Notation 6.8. Let 𝕐k be the set of Young diagrams with k boxes: these are grids of boxes, justified to the left, where the length of rows is non-increasing. (These are in bijection with partitions of k.) The length of λ 𝕐k, denoted by (λ), is the number of rows in λ.

For λ 𝕐k, a tableau is a labeling of the boxes by elements of {1,,k}, and a tableau is said to be standard if the rows and columns are strictly increasing. The number of standard tableaux of shape λ will be denoted by dim(λ).

Theorem 6.9 (Young, Frobenius, etc?). There is a family {χλ : λ 𝕐k} of functions on Sk with the following properties:

  1. χλ is a class function, in the sense that it is constant on conjugacy classes;
  2. every class function on Sk is a linear combination of {χλ : λ 𝕐k};
  3. χλ,χμ = δλ=μ.

Let π be the representation of Sk on (N)k defined by

π(σ)ξ1 ξk = ξσ1(1) ξσ1(k).

Then

Tr(π(σ)) = λk𝗌λ(1N)χλ(σ)

for σ Sk, where

𝗌λ(1N) = dim(λ) k! (i,j)λ(N + j i).

Notation 6.10. Define WgN,k : Sk by

WgN,k(σ) = 1 k!2 λk dim(λ)2 𝗌λ(1N) χλ(σ)

for σ Sk. If no confusion will arise, we may refer to WgN,k by Wg. Note that for this definition to make sense, we are using the assumption N k – otherwise, 𝗌λ(1N) might be 0.

Theorem 6.11. Let Wg~ := σSkWg(σ)σ and assume N k. Then we have Wg~Gr~ = Gr~Wg~ = 1.

Proof. We have

Wg~Gr~ = 1 k!2 λ,μ𝕐k dim(λ)2𝗌μ(1N) 𝗌λ(1N) ( αSkχλ(α)α) ( βSkχμ(β)β) = λ,μ𝕐k dim(λ)𝗌μ(1N) dim(μ)𝗌λ(1N)CλCμ

where Cλ := dim(λ) k! αSkχλ(α)α and similarly for μ. It is an exercise in representation theory of finite groups to check that

CλCμ = { Cλif λ = μ 0 otherwise  and  λ𝕐kCλ = 1

so Wg~Gr~ = 1. Similarly Gr~Wg~ = 1.

Remark 6.12. The assumption that N k isn’t really necessary; if we drop it, we need to work in a smaller subalgebra that incorporates an extra condition on the irreducible components. Specifically, in the isomorphism

[Sk] λ𝕐kEnd(V λ)

from Maschke’s theorem, we can define

N[Sk] := λ𝕐k (λ)N End(V λ)

and invert GrN,k in there; this is done in [7]. We only assumed N k for simplicity, and because for our purposes this will always be the case.

Corollary 6.13. Continue to assume N k. We have

Wα,β = 1 k!2 λ𝕐k dim(λ)2 𝗌λ(1N) χλ(α1β)

for α,β Sk.

Proof. Let W~α,β := WgN,k(α1β). We will show W~ = W by checking that W~G = GW~ = I. From the theorem, which says that

1 = α,βSkWg(α)Gr(β)𝛼𝛽 = σSkσ αSkWg(α)Gr(α1σ),

we know that

αSkWg(α)Gr(α1σ) = { 1if σ = e 0otherwise .

So

[W~G]α,β = γSkW~α,γGγ,β = γSkWg(α1γ)Gr(γ1β) = γSkWg(α1𝛼𝛾)Gr((𝛼𝛾)1β) = γSkWg(γ)Gr(γ1α1β) = { 1if α1β = e 0otherwise

and [W~G]α,β = δα=β. Similarly we have [GW~]α,β = δα,β so W~ = W.

6.3. Exercises.

Exercise 6.14. Let {Eα : α Sk} be the spanning set described above.

  1. Prove that {Eα : α Sk} is linearly independent if and only if N k.
  2. Let {ξα : α A} be a family of vectors in a Hilbert space H. Prove that if {ξα : α A} is linearly independent, then the Gram matrix

    (ξα,ξβ)α,βA

    is invertible.

Exercise 6.15 ([8, Problem 4.5.2]). Let G be a finite group and let {χi : i I} be the irreducible characters of G. For i I, let ρi be the irreducible representation giving rise to χi, and let

Ci := dim(V i) |G| gGχi(g)g1 [G].

Recall that ρi extends by linearity to an algebra homomorphism [G] End(V i).

  1. Prove that

    ρj(Ci) = { Idif j = i 0 otherwise

    for i,j I.

  2. Prove that

    CiCj = { Ciif i = j 0 otherwise

    for i,j I.

  3. Prove that iICi = 1 in [G].

Hint: the important keywords are central, Schur’s lemma, Maschke’s theorem, and orthogonality relations.

7. Randomly rotated matrices

In this section we will show that conjugating by a random unitary matrix produces asymptotic freeness. This section is heavily based on [13, Lecture 23].

First, let us fix our terminology:

Definition 7.1. For N 1, let (𝒜N,φN) be a -probability space and let {ai(N) : i I} be a family of self-adjoint elements of 𝒜N. We say that this family is asymptotically free if for all k 1, for all i1,,ik I with i1ik, for all polynomials P1,,Pk with φN(Pj(aij(N))) 0 for 1 j k, we have

lim NφN(P1(ai1(N))P k(aik(N))) = 0.

Theorem 7.2 (Voiculescu). For N 1, let AN and BN be self-adjoint N × N matrices, let UN be a random N × N unitary matrix, and suppose that for all m 1, the sequences trN(ANm) and trN(BNm) converge as N . Then UNANUN and BN are asymptotically free.

Here is what we need to show: if P1,,Pk and Q1,,Qk are polynomials with

𝔼trN(Pj(𝑈𝐴U)) 0 and 𝔼tr N(Qj(B)) 0

then

𝔼trN(P1(𝑈𝐴U)Q 1(B)Pk(𝑈𝐴U)Q k(B)) 0.

Strictly speaking, there is another case to be handled where the Qk(B) is omitted; this can be done in the same way as the case mentioned above.

We can simplify this by bringing out the unitaries from inside the polynomials: we want to show that if P1,,Pk and Q1,,Qk are polynomials with

trN(Pj(A)) 0 and 𝔼trN(Qj(B)) 0

then

𝔼trN(UP1(A)UQ 1(B)UPk(A)UQ k(B)) 0.

So what we really need to do here is compute things like

𝔼trN(UA1UB 1UAkUB k)

where A1,,Ak and B1,,Bk are tuples of matrices.

7.1. Exact trace formula.

Notation 7.3. Let (𝒜,φ) be a tracial -probability space. For a cycle c = (i1,,ip), write

φc(a1,,an) := φ(ai1aip).

For α Sn with disjoint cycle decomposition α = c1cl, write

φα(a1,,an) := φc1(a1,,an)φcl(a1,,an)

Lemma 7.4. For N × N matrices A1,,Ak with Ar = (a𝑖𝑗(r))1i,jN for 1 r k, we have

trα(A1,,Ak) = 1 N#(α) 𝐢:[k][N]a𝐢(1)𝐢(α(1))(1)a 𝐢(k)𝐢(α(k))(k).

Proof. Direct verification from the definition of the trace.

Proposition 7.5. Let A1,,Ak and B1,,Bk be N × N matrices. Then

𝔼trN(UA1UB 1UAkUB k) = α,βSkWgN,k(α1β)N#(α)+#(β1γ)1 trα(A1,,Ak)trβ1γ(B1,,Bk).

Proof. We have

𝔼trN(UA1UB 1UAkUB k) = 1 N 𝐢,𝐢,𝐣,𝐣:[k][N]𝔼 (u𝐢(1)𝐣(1)a𝐣(1)𝐣(1)(1)u 𝐢(1)𝐣(1)¯b𝐢(1)𝐢(2)(1) u𝐢(k)𝐣(k)a𝐣(k)𝐣(k)(k)u 𝐢(k)𝐣(k)¯b𝐢(k)𝐢(1)(k)) = 1 N 𝐢,𝐢,𝐣,𝐣:[k][N]a𝐣(1)𝐣(1)(1)a 𝐣(k)𝐣(k)(k)b 𝐢(1)𝐢(2)(1)b 𝐢(k)𝐢(1)(k) 𝔼 (u𝐢(1)𝐣(1)u𝐢(k)𝐣(k)u𝐢(1)𝐣(1)¯u𝐢(k)𝐣(k)¯) = 1 N 𝐢,𝐢,𝐣,𝐣:[k][N]a𝐣(1)𝐣(1)(1)a 𝐣(k)𝐣(k)(k)b 𝐢(1)𝐢(2)(1)b 𝐢(k)𝐢(1)(k) α,βSkδ𝐢=𝐢αδ𝐣=𝐣βWgN,k(α1β) = 1 N α,βSkWgN,k(α1β) 𝐢,𝐣:[k][N]a𝐣(β(1))𝐣(1)(1)a 𝐣(β(k))𝐣(k)(k)b 𝐢(1)𝐢(α(2))(1)b 𝐢(k)𝐢(α(1))(k)

We can relabel things to make this formula easier to work with: swap α with β, 𝐣 with 𝐢, and drop the primes. Also, move the permutation in the as to the column index using the inverse. Then the above becomes

1 N α,βSkWgN,k(α1β) 𝐢,𝐣:[k][N]a𝐢(1)𝐢(α(1))(1)a 𝐢(k)𝐢(α(k))(k)b 𝐣(1)𝐣(β1γ(1))(1)b 𝐣(k)𝐣(β1γ(k))(k) = 1 N α,βSkWgN,k(α1β)Tr α(A1,,Ak)Trβ1γ(B1,,Bk) = α,βSkWgN,k(α1β)N#(α)+#(β1γ)1 trα(A1,,Ak)trβ1γ(B1,,Bk)

as claimed.

In order to take the limit N , we need to understand the asymptotics of Wgk,N as a function of N.

7.2. Asymptotics of Weingarten function. Again, assume N k. Recall our formula for WgN,k:

WgN,k(σ) = 1 k!2 λ𝕐k dim(λ)2 𝗌λ(1N) χλ(σ).

The only part which involves N is

1 𝗌λ(1N) = k! dim(λ) (i,j)λ 1 N + j i

and the highest power of N in the denominator is Nk. So we can expand as a Laurent series in N:

WgN,k(σ) = n=0a n(σ)Nkn.

There is a nice formula for the coefficients.

Theorem 7.6 ([5, Theorem 2.2]). Let an(σ) be the Laurent coefficients above. Then

  1. with n = 0, we have

    a0(σ) = { 1if σ = e 0 otherwise
  2. for n /mo> 0,

    an(σ) = m=1n(1)mb m,n(σ)

    where bm,n(σ) is the number of tuples (σ1,,σm) with

    • σje,
    • σσ1σm = e, and
    • j=1m|σj| = n.

Corollary 7.7. For all σ Sk, we have WgN,k(σ) = O(Nk|σ|).

Proof. The claim here is that an(σ) = 0 for all n /mo> |σ|. Suppose n /mo> |σ|, let 1 m n, and take σ1,,σm Sk with j=1m|σj| = n. By the triangle inequality, we have

|σ| = |σσ1σmσm1σ 11||σσ 1σm| + j=1m|σ j|

so |σσ1σm||σ| j=1m|σj| /mo> 0. This means σσ1σme.

Notation 7.8. Let μ(σ) be the leading coefficient of WgN,k(σ).

Remark 7.9. With some more work [57], one can get a much more detailed understanding of this expansion:

μ(σ) = cCyc(σ)(1)|c|1Cat(|c| 1)

and

Nk+|σ|Wg N,k(σ) = μ(σ) + O(N2).

This level of detail will not be necessary for our purposes.

7.3. Limit of trace formula. Recall the trace formula:

𝔼trN(UA1UB 1UAkUB k) = α,βSkWgN,k(α1β)N#(α)+#(β1γ)1 trα(A1,,Ak)trβ1γ(B1,,Bk) = α,βSkWgN,k(α1β)N2k1|α||β1γ| trα(A1,,Ak)trβ1γ(B1,,Bk)

We have

WgN,k(α1β) = μ(α1β)Nk|α1β| + O(Nk|α1β|1 )

so the leading order in N is

Nk1|α||α1β||β1γ| .

Lemma 7.10. We have

|α| + |α1β| + |β1γ| k 1.

If we have equality, then either α or β1γ has a fixed point.

Proof. The first claim is just triangle inequality:

k 1 = |γ| = |αα1ββ1γ||α| + |α1β| + |β1γ|.

For the second claim, notice that at least one of

|α| k 1 2  or |β1γ| k 1 2

is true. In the first case, α can be written as a product of k1 2 transpositions, so it can only move up to k 1 points, i.e. it has a least one fixed point. Similarly, in the second case, β1γ can only move up to k 1 points, so it has at least one fixed point.

Recall the original goal: we need to show that for all P1,,Pk and Q1,,Qk with trN(Pj(A)) 0 and trN(Qj(B)) 0, we have

𝔼trN(UP1(A)UQ 1(B)UPk(A)UQ k(B)) 0

Using the lemma and formula above, we have

lim N𝔼trN(UP1(A)UQ 1(B)UPk(A)UQ k(B)) = α,βSk |α|+|α1β|+|β1γ|=k1 μ(α1β) (lim Ntrα(P1(A), ,Pk(A))) (lim Ntrβ1γ(Q1(B), ,Qk(B)))

and according to the lemma, for each summand, either α or β1γ has a fixed point. This means we have a factor of lim NtrN(Pj(A)) or lim NtrN(Qj(B)). This makes each summand 0.

8. Free additive convolution

Here is a very concrete problem: let A be the 2N × 2N diagonal matrix with half its entries 1 and the other half 1. Rotate A randomly by conjugating with a Haar unitary, and keep another copy of A. These are two matrices with the same eigenvalues, but in generic/random position in relation to each other. What does A + 𝑈𝐴U look like?

The distribution is complicated in an exact sense, but in our context it is natural to ask this question when N . This question can be answered using asymptotic freeness and free probability. This section draws from [12, Chapter 3] and [13, Lecture 12], and is generally rather sketchy about technical details. Rigorous proofs of everything in this section can be found in one of the two aforementioned references.

Definition 8.1. Let μ,ν Mc() and let (𝒜,φ) be a -probability space with freely independent self-adjoint elements a,b 𝒜 such that

φ(am) =xm𝑑𝜇(x) and φ(bm) =xm𝑑𝜈(x)

for all m 1. Let μ ν be the distribution of a + b.

This construction can always be made using free products (see [13]). It produces a well-defined operation on probability measures by a basic property of freeness: the distribution of a + b only depends on the individual distributions of a and b.

According to asymptotic freeness, the limit of A + 𝑈𝐴U should be described by μ μ where μ = 1 2(δ1 + δ1). My goal for today is to compute μ μ.

8.1. Free cumulants.

Notation 8.2. Let 𝒜 be an algebra and let (φn)n1 be a sequence of multilinear functionals φn : 𝒜n . For a set V [n], say V = {i1,,ip}, write

φV (a1,,an) = φp(ai1,,aip).

For π P(n), write

φπ(a1,,an) := V πφV (a1,,an).

For example, if π = {{1,3,5},{2,4},{6}}, then

φπ(a1,a2,a3,a4,a5,a6) = φ3(a1,a3,a5)φ2(a2,a4)φ1(a6).

Definition 8.3. Let (𝒜,φ) be a -probability space. The free cumulants are the multilinear functionals κn : 𝒜n defined implicitly by

φ(a1an) = π𝑁𝐶(n)κπ(a1,,an).

Example 8.4. With n = 1, the moment-cumulant formula says

φ(a1) = κ1(a1).

With n = 2, there are two noncrossing partitions so

φ(a1a2) = κ2(a1,a2) + κ1(a1)κ1(a2) = κ2(a1,a2) + φ(a1)φ(a2)

hence κ2(a1,a2) = φ(a1a2) φ(a1)φ(a2).

With n = 3, there are 5 noncrossing partitions:

φ(a1a2a3) = κ3(a1,a2,a3) + κ2(a1,a2)κ1(a3) + κ2(a1,a3)κ1(a2) + κ1(a1)κ2(a2,a3) + κ1(a1)κ1(a2)κ1(a3)

and the only one we don’t already know is κ3(a1,a2,a3). This pattern continues to determine κn.

Remark 8.5. The proper way to formulate this definition involves the theory of Möbius inversion. This is a general procedure for functions on lattices which generalizes the observation we made in the previous example – that the relation

φ(a1an) = π𝑁𝐶(n)κπ(a1,,an)

implicitly determines κn(a1,,an).

Notation 8.6. If we have a single variable a, we will abbreviate

κn(a) := κn(a,,a).

This means that for a single variable, we have a new sequence (κn)n1 which is related to the moment sequence by mn = π𝑁𝐶(n)κπ.

Theorem 8.7. Let (𝒜,φ) be a -probability space and let 𝒜1,,𝒜r be -subalgebras of 𝒜. Then 𝒜1,,𝒜r are freely independent if and only if for any 1 i1,,in r with ijik for some 1 j,k r, for any choice of aj 𝒜ij for each 1 j n, we have κn(a1,,an) = 0.

Proof. See [13, Lecture 11].

Corollary 8.8. If a and b are free, then κn(a + b) = κn(a) + κn(b).

Proof. Suppose that a and b are free. Then

κn(a + b) = κn(a + b,,a + b) = κn(a,,a) + + κn(b,,b)

and the intermediate terms vanish, so the above reads as

κn(a + b) = κn(a) + κn(b)

as claimed.

Example 8.9. If a is standard semicircular, the moment-cumulant formula says

π𝑁𝐶(n)κπ = { Cat(m)if n = 2m 0 otherwise .

This relation is satisfied by

κn = { 1if n = 2 0 otherwise .

Then, if a and b are free semicirculars, we have

κn(a+b) = κn(a)+κn(b) = { 2if n = 2 0 otherwise

so

m2k(a + b) = π𝑁𝐶(2k)κπ(a + b) = πNC2(2k)2k = 2kCat(k)

which are the moments of the semicircle measure with variance 2. In general, the semicircle measure with variance t is given by the density

dμt(x) = { 1 2𝜋𝑡4t x2𝑑𝑥if x [2t,2t] 0 otherwise .

The observation above can be generalized as follows: we have a semigroup (μt)t1 with respect to free additive convolution, in the sense that μs μt = μs+t. This is a fundamental property of semicircular variables. In particular, if A and B are GUEs, then the asymptotic distribution of A + B is still semicircular, just with a bigger radius.

Remark 8.10. The semigroup (μt)t1 of semicircular measures mentioned above is a special case of an important general construction in free probability. Namely, for any μ Mc(), there is a semigroup (μt)t1 with μ1 = μ, μs+t = μs μt for all s,t 1, and tμt weakly continuous. The measures μt can be constructed using compressions by free projections in C-probability spaces, see [13, Lecture 14].

8.2. Cauchy transform and Stieltjes inversion.

Definition 8.11. For $\mu \in M(\mathbb{R})$, the Cauchy transform of $\mu$ is the function $G_\mu$ defined by

$$G_\mu(z) := \int_{\mathbb{R}} \frac{1}{z - t}\,d\mu(t)$$

for $z \in \mathbb{C}^+$.

Proposition 8.12. Let $\mu \in M_c(\mathbb{R})$.

  1. $G_\mu$ is a holomorphic map $\mathbb{C}^+ \to \mathbb{C}^-$.
  2. With $r := \sup_{t \in \mathrm{supp}(\mu)} |t|$, we have

     $$G_\mu(z) = \sum_{n=0}^{\infty} m_n z^{-n-1}$$

     for $|z| > r$.

  3. We have

     $$\lim_{y \to \infty} iy\, G_\mu(iy) = 1.$$

Theorem 8.13 (Stieltjes inversion). For $\mu \in M(\mathbb{R})$, we have

$$\mu((a,b)) + \tfrac{1}{2}\mu(\{a,b\}) = \lim_{\varepsilon \to 0^+} \left( -\frac{1}{\pi} \int_a^b \mathrm{Im}\, G_\mu(t + i\varepsilon)\,dt \right)$$

for $a < b$. In particular, if $\mu, \nu \in M(\mathbb{R})$ have $G_\mu = G_\nu$, then $\mu = \nu$.

Example 8.14 (Semicircle distribution). Recall that the standard semicircle distribution has the density

$$d\mu(t) = \begin{cases} \frac{1}{2\pi}\sqrt{4 - t^2} & \text{if } t \in [-2,2] \\ 0 & \text{otherwise.} \end{cases}$$

The moments are

$$\int_{\mathbb{R}} t^m\,d\mu(t) = \begin{cases} \mathrm{Cat}(k) & \text{if } m = 2k \\ 0 & \text{otherwise} \end{cases}$$

where $\mathrm{Cat}(k) := \frac{1}{k+1}\binom{2k}{k}$ are the Catalan numbers. The Catalan numbers satisfy the following recursion: $\mathrm{Cat}(0) = 1$ and

$$\mathrm{Cat}(k) = \sum_{r=1}^{k} \mathrm{Cat}(r-1)\,\mathrm{Cat}(k-r)$$

for $k \ge 1$.

for k 1. We can use this to compute the Cauchy transform:

$$\begin{aligned} G(z) &= \frac{1}{z} + \sum_{k=1}^{\infty} \mathrm{Cat}(k)\,\frac{1}{z^{2k+1}} = \frac{1}{z} + \sum_{k=1}^{\infty} \left( \sum_{r=1}^{k} \mathrm{Cat}(r-1)\,\mathrm{Cat}(k-r) \right) \frac{1}{z^{2k+1}} \\ &= \frac{1}{z} + \frac{1}{z} \sum_{k=1}^{\infty} \sum_{r=1}^{k} \frac{\mathrm{Cat}(r-1)}{z^{2r-1}} \cdot \frac{\mathrm{Cat}(k-r)}{z^{2(k-r)+1}} = \frac{1}{z} + \frac{1}{z} \sum_{r=1}^{\infty} \frac{\mathrm{Cat}(r-1)}{z^{2r-1}} \sum_{k=r}^{\infty} \frac{\mathrm{Cat}(k-r)}{z^{2(k-r)+1}} \\ &= \frac{1}{z} + \frac{1}{z} \sum_{r=1}^{\infty} \frac{\mathrm{Cat}(r-1)}{z^{2r-1}}\, G(z) = \frac{1}{z} + \frac{1}{z}\, G(z)^2 \end{aligned}$$

so $G(z)$ satisfies the quadratic equation $G(z)^2 - zG(z) + 1 = 0$. By the quadratic formula, we have

$$G(z) = \frac{z \pm \sqrt{z^2 - 4}}{2}.$$

Since $G(z)$ satisfies $iy\,G(iy) \to 1$, we need to pick the minus:

$$G(z) = \frac{z - \sqrt{z^2 - 4}}{2}.$$

What do we actually mean by $\sqrt{z^2-4}$? For $z \in \mathbb{C}^+$, let $\theta_1$ be the angle between the $x$-axis and the line from $2$ to $z$, and let $\theta_2$ be the angle between the $x$-axis and the line from $-2$ to $z$. Then $z - 2 = |z-2|e^{i\theta_1}$ and $z + 2 = |z+2|e^{i\theta_2}$, and we let $\sqrt{z^2-4} = |z^2-4|^{1/2} e^{i\frac{\theta_1+\theta_2}{2}}$.

Now let's see what our inversion procedure does: we have

$$\mathrm{Im}\,\sqrt{(t+i\varepsilon)^2 - 4} = |(t+i\varepsilon)^2 - 4|^{1/2} \sin\left( \frac{\theta_1 + \theta_2}{2} \right)$$

and when $\varepsilon \to 0^+$, we have two cases. Either $|t| > 2$, in which case $\theta_1$ and $\theta_2$ both go to $0$ or both go to $\pi$, so the sine goes to $\sin(0) = 0$ or $\sin(\pi) = 0$; or $|t| < 2$, in which case one of $\theta_1, \theta_2$ goes to $0$ while the other goes to $\pi$, so the sine goes to $\sin(\pi/2) = 1$.

So if $|t| > 2$, we have

$$\lim_{\varepsilon \to 0^+} \left( -\frac{1}{\pi}\, \mathrm{Im}\, G(t+i\varepsilon) \right) = -\frac{1}{\pi}\, \mathrm{Im}\left( \frac{t - \sqrt{t^2 - 4}}{2} \right) = 0,$$

since the quantity in parentheses is real. If $|t| < 2$, we have

$$\lim_{\varepsilon \to 0^+} \left( -\frac{1}{\pi}\, \mathrm{Im}\, G(t+i\varepsilon) \right) = -\frac{1}{2\pi} \lim_{\varepsilon \to 0^+} \mathrm{Im}\left( t + i\varepsilon - \sqrt{(t+i\varepsilon)^2 - 4} \right) = \frac{1}{2\pi} \lim_{\varepsilon \to 0^+} \left( |(t+i\varepsilon)^2 - 4|^{1/2} \sin\left( \frac{\theta_1+\theta_2}{2} \right) - \varepsilon \right) = \frac{1}{2\pi}\sqrt{4 - t^2}.$$

This recovers the familiar semicircular density.
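The whole inversion procedure is easy to replicate numerically. Here is a short Python sketch (numpy assumed, helper names mine) which realizes the branch of $\sqrt{z^2-4}$ described above as $\sqrt{z-2}\,\sqrt{z+2}$ with principal square roots, and watches $-\frac{1}{\pi}\mathrm{Im}\,G(t+i\varepsilon)$ approach the semicircle density as $\varepsilon \to 0^+$:

```python
import numpy as np

def sqrt_z2_minus_4(z):
    """The branch described above: for z in the upper half-plane, z - 2 and z + 2
    have arguments theta_1, theta_2 in (0, pi), and taking principal square roots
    of each factor realizes exp(i (theta_1 + theta_2) / 2)."""
    return np.sqrt(z - 2) * np.sqrt(z + 2)

def G(z):
    """Cauchy transform of the standard semicircle law."""
    return (z - sqrt_z2_minus_4(z)) / 2

t = np.linspace(-3, 3, 7)
for eps in (1e-1, 1e-2, 1e-4):
    print(eps, np.round(-G(t + 1j * eps).imag / np.pi, 4))
# the rows converge to sqrt(4 - t^2) / (2 pi) on [-2, 2] and to 0 outside
```

Note that the naive principal square root of $z^2 - 4$ would pick the wrong branch on the left half-plane; factoring as $\sqrt{z-2}\,\sqrt{z+2}$ is exactly the angle-averaging definition used in the example.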

8.3. R-transform.

Theorem 8.15 (Voiculescu). Let $\mu \in M(\mathbb{R})$ with $\mathrm{supp}(\mu) \subseteq [-R,R]$, and let $G$ be its Cauchy transform. Then there is an analytic function $R(z)$ on $\{|z| < \frac{1}{6R}\}$ with

  1. $G(R(z) + \frac{1}{z}) = z$ for $0 < |z| < \frac{1}{6R}$, and

  2. $R(z) = \sum_{n=0}^{\infty} \kappa_{n+1} z^n$ for $|z| < \frac{1}{6R}$.

Corollary 8.16. For $\mu,\nu \in M_c(\mathbb{R})$, we have $R_{\mu \boxplus \nu}(z) = R_\mu(z) + R_\nu(z)$.

Example 8.17. Let $\mu = \frac{1}{2}(\delta_{-1} + \delta_1)$. We will compute $\mu \boxplus \mu$ and solve the problem posed at the beginning of class; this is a classic textbook example of what free convolutions look like.

The first step is to find the Cauchy transform, and then in turn the R-transform. The Cauchy transform is easy:

$$G_\mu(z) = \int_{\mathbb{R}} \frac{1}{z-t}\,d\mu(t) = \frac{1}{2}\left( \frac{1}{z-1} + \frac{1}{z+1} \right) = \frac{z}{z^2 - 1}.$$

For the R-transform, we want to use the functional equation $G_\mu(R_\mu(z) + z^{-1}) = z$. Let $K_\mu(z) := R_\mu(z) + z^{-1}$, so

$$z = G_\mu(K_\mu(z)) = \frac{K_\mu(z)}{K_\mu(z)^2 - 1} \quad \Longrightarrow \quad K_\mu(z)^2 - z^{-1} K_\mu(z) - 1 = 0$$

and the quadratic formula says

$$K_\mu(z) = \frac{z^{-1} \pm \sqrt{z^{-2} + 4}}{2} = \frac{1 \pm \sqrt{1 + 4z^2}}{2z}.$$

To pick which one, observe that we are supposed to have $R_\mu(0) = 0$; the minus sign would give something non-zero divided by zero, while the plus sign would give $\frac{0}{0}$, so the plus sign is the only possibility. This makes

$$R_\mu(z) = \frac{\sqrt{1 + 4z^2} - 1}{2z}.$$

Now, we can use the linearization property of the R-transform:

$$R_{\mu\boxplus\mu}(z) = R_\mu(z) + R_\mu(z) = \frac{\sqrt{1+4z^2} - 1}{z} \quad \Longrightarrow \quad K_{\mu\boxplus\mu}(z) = \frac{\sqrt{1 + 4z^2}}{z}.$$

One can show (see the proof of [13, Theorem 12.7]) that $z = K_{\mu\boxplus\mu}(G_{\mu\boxplus\mu}(z))$. Then

$$z = K_{\mu\boxplus\mu}(G_{\mu\boxplus\mu}(z)) = \frac{\sqrt{1 + 4\,G_{\mu\boxplus\mu}(z)^2}}{G_{\mu\boxplus\mu}(z)} \quad \Longrightarrow \quad (z^2 - 4)\,G_{\mu\boxplus\mu}(z)^2 - 1 = 0$$

and $G_{\mu\boxplus\mu}(z) = \frac{1}{\sqrt{z^2 - 4}}$.

Finally, we can do Stieltjes inversion with this:

$$\lim_{\varepsilon \to 0^+} \left( -\frac{1}{\pi}\,\mathrm{Im}\, G_{\mu\boxplus\mu}(t+i\varepsilon) \right) = -\frac{1}{\pi} \lim_{\varepsilon \to 0^+} \mathrm{Im}\, \frac{1}{\sqrt{(t+i\varepsilon)^2 - 4}} = -\frac{1}{\pi}\, \mathrm{Im}\, \frac{1}{\sqrt{t^2 - 4}}.$$

So if $|t| < 2$, the above is

$$-\frac{1}{\pi}\, \mathrm{Im}\, \frac{1}{\sqrt{t^2 - 4}} = -\frac{1}{\pi}\, \mathrm{Im}\, \frac{1}{i\sqrt{4 - t^2}} = -\frac{1}{\pi}\, \mathrm{Im}\, \frac{-i}{\sqrt{4 - t^2}} = \frac{1}{\pi\sqrt{4 - t^2}},$$

and otherwise there is no imaginary part, so it's just $0$. In other words, $\mu \boxplus \mu$ is the arcsine distribution on $[-2,2]$.
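This can also be seen in simulation: the eigenvalues of $A + UAU^*$ for a Haar unitary $U$ and $A$ with eigenvalues $\pm 1$ should histogram to the arcsine density. A minimal Python sketch (numpy assumed; the helper name and the standard QR-with-phase-fix recipe for Haar sampling are mine):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 400
A = np.diag([1.0] * (N // 2) + [-1.0] * (N // 2))   # eigenvalues +-1, so rho(A) = mu

def haar_unitary(n):
    """Haar-distributed unitary via QR of a complex Ginibre matrix,
    with the phase correction that makes the factorization unique."""
    Z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    Q, R = np.linalg.qr(Z)
    d = np.diagonal(R)
    return Q * (d / np.abs(d))

U = haar_unitary(N)
eigs = np.linalg.eigvalsh(A + U @ A @ U.conj().T)
hist, edges = np.histogram(eigs, bins=12, range=(-2, 2), density=True)
mid = (edges[:-1] + edges[1:]) / 2
print(np.round(hist, 2))                               # empirical eigenvalue density
print(np.round(1 / (np.pi * np.sqrt(4 - mid**2)), 2))  # arcsine density for comparison
```

Already at $N = 400$ the two printed rows agree reasonably well away from the edges $\pm 2$.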

8.4. Exercises.

Exercise 8.18. Let μ = δ0. Compute the limit in the Stieltjes inversion formula and explain why you couldn’t have worked directly with the integrand. Note: the point of this exercise is to see how Stieltjes inversion works when you have atoms.

9. Expected characteristic polynomials

In this section, we will use familiar combinatorial methods to compute some expected characteristic polynomials. This is the starting point for a new theory called finite free probability, which we will spend the next few classes surveying.

Notation 9.1. For $A \in M_N(\mathbb{C})$, say $A = (a_{ij})_{1 \le i,j \le N}$, and $S, T \subseteq [N]$, write

$$A(S,T) := (a_{ij})_{i \in S,\, j \in T}.$$

Let $c_x(A) := \det(xI - A)$ be the characteristic polynomial of $A$.

Proposition 9.2. For $A \in M_N(\mathbb{C})$, we have

$$c_x(A) = \sum_{k=0}^{N} x^{N-k} (-1)^k\, \mathsf{e}_k(A) \quad \text{where} \quad \mathsf{e}_k(A) := \sum_{\substack{S \subseteq [N] \\ |S| = k}} \det(A(S,S)).$$

Proof. See [9, Section 1.2].
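Proposition 9.2 is easy to test numerically; the following Python sketch (numpy assumed, names mine) computes $c_x(A)$ by summing principal minors and compares against numpy's built-in characteristic polynomial:

```python
import numpy as np
from itertools import combinations

def char_poly_coeffs(A):
    """Coefficients of det(xI - A) via e_k(A) = sum of principal k x k minors.
    Returns [1, -e_1, e_2, ...] in numpy's descending-power ordering."""
    N = A.shape[0]
    coeffs = [1.0]
    for k in range(1, N + 1):
        ek = sum(np.linalg.det(A[np.ix_(S, S)]) for S in combinations(range(N), k))
        coeffs.append((-1) ** k * ek)
    return coeffs

A = np.array([[2.0, 1.0], [0.0, 3.0]])
print(char_poly_coeffs(A))   # x^2 - 5x + 6  ->  [1, -5, 6]
print(np.poly(A))            # numpy's characteristic polynomial, same answer
```

Of course the minor sum has exponential cost in $N$; it is a statement about coefficients, not an algorithm.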

The first example of an expected characteristic polynomial is straightforward:

Example 9.3 (GUE and Hermite polynomials). Let $A$ be an $N \times N$ GUE. Denoting by $\mathrm{Sym}(S)$ the set of permutations of $S \subseteq [N]$, we have

$$\mathbb{E}\,\mathsf{e}_m(A) = \sum_{|S|=m} \mathbb{E}\det(A(S,S)) = \sum_{|S|=m} \sum_{\sigma \in \mathrm{Sym}(S)} \mathrm{sgn}(\sigma)\, \mathbb{E}\left( \prod_{i \in S} a_{i\sigma(i)} \right) = \sum_{|S|=m} \sum_{\sigma \in \mathrm{Sym}(S)} \mathrm{sgn}(\sigma) \sum_{\pi \in P_2(S)} \prod_{(r,s)\in\pi} \mathbb{E}(a_{r\sigma(r)}\, a_{s\sigma(s)}).$$

If m is odd, then the inner sum above is empty and the whole expression is just 0. This means the expected characteristic polynomial is even in the sense that it is of the form

$$x^N + (\cdots)\,x^{N-2} + (\cdots)\,x^{N-4} + \cdots.$$

In other words, the roots come in plus-minus pairs.

Now suppose m = 2k. Then

$$\begin{aligned} \mathbb{E}\,\mathsf{e}_{2k}(A) &= \sum_{|S|=2k} \sum_{\sigma \in \mathrm{Sym}(S)} \mathrm{sgn}(\sigma) \sum_{\pi \in P_2(S)} \prod_{(r,s)\in\pi} \mathbb{E}(a_{r\sigma(r)}\, a_{s\sigma(s)}) = \sum_{|S|=2k} \sum_{\sigma \in \mathrm{Sym}(S)} \mathrm{sgn}(\sigma) \sum_{\pi \in P_2(S)} \frac{1}{N^k} \prod_{(r,s)\in\pi} \delta_{r=\sigma(s)}\,\delta_{s=\sigma(r)} \\ &= \frac{1}{N^k} \sum_{|S|=2k} \sum_{\pi \in P_2(S)} \mathrm{sgn}(\pi) = \frac{1}{N^k} \binom{N}{2k} (2k-1)!!\,(-1)^k = \frac{1}{N^k} \cdot \frac{N!}{(2k)!\,(N-2k)!} \cdot \frac{(2k)!}{2^k k!}\,(-1)^k = \frac{(N)_{2k}}{N^k} \cdot \frac{(-1)^k}{2^k k!}. \end{aligned}$$

These are precisely the coefficients of the very famous Hermite polynomial of degree $N$, suitably rescaled.
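One can check this formula by simulation, averaging characteristic polynomials of GUE samples against the coefficients $(-1)^k (N)_{2k}/(2^k N^k k!)$. A Python sketch (numpy assumed; the sampler normalizes the GUE so that $\mathbb{E}|a_{ij}|^2 = 1/N$, matching the $1/N^k$ in the computation above, and the names are mine):

```python
import numpy as np
from math import factorial

rng = np.random.default_rng(1)
N, trials = 4, 20000

def gue(n):
    """GUE normalized so that E|a_ij|^2 = 1/n."""
    Z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    return (Z + Z.conj().T) / np.sqrt(2 * n)

avg = np.zeros(N + 1)
for _ in range(trials):
    avg += np.real(np.poly(gue(N))) / trials     # Monte Carlo E c_x(A)

def falling(N, j):
    out = 1
    for i in range(j):
        out *= N - i
    return out

exact = np.zeros(N + 1)
for k in range(N // 2 + 1):
    exact[2 * k] = (-1) ** k * falling(N, 2 * k) / (2 ** k * N ** k * factorial(k))
print(np.round(avg, 3))     # simulated coefficients of E c_x(A)
print(exact)                 # [1, 0, -1.5, 0, 0.1875] for N = 4
```

The odd coefficients average to roughly zero, as predicted by the parity argument above.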

The central computation that gives rise to finite free probability involves the randomly rotated matrices we discussed previously. Next class, we will turn the following theorem into a definition of a “finite free convolution” operation on polynomials.

Theorem 9.4 ([11]). Let $A$ and $B$ be normal $N \times N$ matrices and let $U$ be a random $N \times N$ unitary matrix. Then

$$\mathbb{E}\, c_x(A + UBU^*) = \sum_{k=0}^{N} x^{N-k} (-1)^k \sum_{i+j=k} \frac{(N)_k}{(N)_i (N)_j}\, \mathsf{e}_i(A)\, \mathsf{e}_j(B)$$

and

$$\mathbb{E}\, c_x(A\,UBU^*) = \sum_{k=0}^{N} x^{N-k} (-1)^k\, \frac{1}{\binom{N}{k}}\, \mathsf{e}_k(A)\, \mathsf{e}_k(B).$$

Proof. First of all, we can assume without loss of generality that A and B are diagonal with

$$A = \mathrm{diag}(a_1,\ldots,a_N) \quad \text{and} \quad B = \mathrm{diag}(b_1,\ldots,b_N)$$

as outlined in the exercises. Let $X = A + UBU^*$; we need to compute $\mathbb{E}\det(X(S,S))$ for $|S| = k$. We have

$$\begin{aligned} \det(X(S,S)) &= \sum_{\sigma \in \mathrm{Sym}(S)} \mathrm{sgn}(\sigma) \prod_{i \in S} x_{i\sigma(i)} = \sum_{\sigma \in \mathrm{Sym}(S)} \mathrm{sgn}(\sigma) \prod_{i \in S} \left( a_i\,\delta_{i=\sigma(i)} + \sum_{p=1}^{N} u_{ip}\, b_p\, \overline{u_{\sigma(i)p}} \right) \\ &= \sum_{\sigma \in \mathrm{Sym}(S)} \mathrm{sgn}(\sigma) \sum_{R \subseteq S} \prod_{i \in R} a_i\,\delta_{i=\sigma(i)} \prod_{i \in S\setminus R} \sum_{p=1}^{N} u_{ip}\, b_p\, \overline{u_{\sigma(i)p}} \\ &= \sum_{R \subseteq S} \left( \prod_{i \in R} a_i \right) \sum_{\sigma \in \mathrm{Sym}(S\setminus R)} \mathrm{sgn}(\sigma) \sum_{\mathbf{p} : S\setminus R \to [N]} \prod_{i \in S\setminus R} u_{i\mathbf{p}(i)}\, b_{\mathbf{p}(i)}\, \overline{u_{\sigma(i)\mathbf{p}(i)}} \\ &= \sum_{R \subseteq S} \left( \prod_{i \in R} a_i \right) \sum_{\mathbf{p} : S\setminus R \to [N]} \left( \prod_{i \in S\setminus R} b_{\mathbf{p}(i)} \right) \sum_{\sigma \in \mathrm{Sym}(S\setminus R)} \mathrm{sgn}(\sigma) \prod_{i \in S\setminus R} u_{i\mathbf{p}(i)}\, \overline{u_{\sigma(i)\mathbf{p}(i)}} \end{aligned}$$

(in the third line, the deltas force $\sigma$ to fix $R$ pointwise, so $\sigma$ restricts to a permutation of $S \setminus R$ with the same sign),

so we need to compute

$$\sum_{\sigma \in \mathrm{Sym}(S\setminus R)} \mathrm{sgn}(\sigma)\, \mathbb{E}\left( \prod_{i \in S\setminus R} u_{i\mathbf{p}(i)}\, \overline{u_{\sigma(i)\mathbf{p}(i)}} \right)$$

for each $|S| = k$, $R \subseteq S$, and $\mathbf{p} : S\setminus R \to [N]$.

As outlined in the exercises, this can be reduced to the following problem: for $0 \le n \le N$ and $\mathbf{p} : [n] \to [N]$, compute

$$\sum_{\sigma \in S_n} \mathrm{sgn}(\sigma)\, \mathbb{E}\left( \prod_{i=1}^{n} u_{i\mathbf{p}(i)}\, \overline{u_{\sigma(i)\mathbf{p}(i)}} \right).$$

By Weingarten calculus, it's equal to

$$\sum_{\sigma \in S_n} \mathrm{sgn}(\sigma) \sum_{\substack{\alpha,\beta \in S_n \\ i = \sigma(\alpha(i)) \\ \mathbf{p}(i) = \mathbf{p}(\beta(i))}} \mathrm{Wg}_{N,n}(\alpha^{-1}\beta) = \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma) \sum_{\substack{\alpha \in S_n \\ \mathbf{p} = \mathbf{p}\circ\alpha}} \mathrm{Wg}_{N,n}(\sigma\alpha);$$

the reduction from a double sum to a single sum was made by observing that $i = \sigma(\alpha(i))$ for all $1 \le i \le n$ forces $\alpha = \sigma^{-1}$. If $\mathbf{p}$ is not injective, say $\mathbf{p}(i) = \mathbf{p}(j)$ with $i \ne j$, then one can use the transposition $(i,j)$ to set up cancellation of pairs in the outer sum, as outlined in the exercises. So in this case, the expression above is $0$.

Now, assume $\mathbf{p}$ is injective, so the constraint $\mathbf{p} = \mathbf{p} \circ \alpha$ makes $\alpha = e$. Using the character expansion, the above is equal to

$$\frac{1}{n!^2} \sum_{\lambda \vdash n} \frac{\dim(\lambda)^2}{\mathsf{s}_\lambda(1^N)} \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma)\, \chi^\lambda(\sigma)$$

and the inner sum can be immediately recognized as a scalar multiple of the inner product $\langle \chi^{1^n}, \chi^\lambda \rangle$. As such, the orthogonality relations give

$$\sum_{\sigma \in S_n} \mathrm{sgn}(\sigma)\, \chi^\lambda(\sigma) = \begin{cases} n! & \text{if } \lambda = 1^n \\ 0 & \text{otherwise} \end{cases}$$

so the above is

$$\frac{1}{n!} \cdot \frac{\dim(1^n)^2}{\mathsf{s}_{1^n}(1^N)} = \frac{1}{n!} \cdot \frac{1}{\binom{N}{n}} = \frac{(N-n)!}{N!},$$

using $\dim(1^n) = 1$ and $\mathsf{s}_{1^n}(1^N) = \mathsf{e}_n(1^N) = \binom{N}{n} = \frac{N(N-1)\cdots(N-n+1)}{n!}$.

In total, what we have shown is that

$$\sum_{\sigma \in S_n} \mathrm{sgn}(\sigma)\, \mathbb{E}\left( \prod_{i=1}^{n} u_{i\mathbf{p}(i)}\, \overline{u_{\sigma(i)\mathbf{p}(i)}} \right) = \begin{cases} \frac{(N-n)!}{N!} & \text{if } \mathbf{p} \text{ injective} \\ 0 & \text{otherwise.} \end{cases}$$

Putting this back into the big formula, we have

$$\begin{aligned} \mathbb{E}\,\mathsf{e}_k(A + UBU^*) &= \sum_{|S|=k} \mathbb{E}\det(X(S,S)) = \sum_{|S|=k} \sum_{R \subseteq S} \left( \prod_{i\in R} a_i \right) \sum_{\substack{\mathbf{p}: S\setminus R \to [N] \\ \text{injective}}} \left( \prod_{i \in S\setminus R} b_{\mathbf{p}(i)} \right) \frac{(N - |S\setminus R|)!}{N!} \\ &= \sum_{|S|=k} \sum_{R\subseteq S} \det(A(R,R))\; |S\setminus R|!\; \mathsf{e}_{|S\setminus R|}(B)\; \frac{(N - |S\setminus R|)!}{N!} \\ &= \sum_{|S|=k} \sum_{r=0}^{k} \frac{(k-r)!\,(N-(k-r))!}{N!}\; \mathsf{e}_{k-r}(B) \sum_{\substack{R \subseteq S \\ |R| = r}} \det(A(R,R)) \\ &= \sum_{r=0}^{k} \frac{(k-r)!\,(N-(k-r))!}{N!}\; \mathsf{e}_{k-r}(B) \sum_{\substack{R \subseteq [N] \\ |R| = r}} \det(A(R,R)) \sum_{\substack{R \subseteq S \subseteq [N] \\ |S| = k}} 1 \\ &= \sum_{r=0}^{k} \binom{N-r}{k-r} \frac{(k-r)!\,(N-(k-r))!}{N!}\; \mathsf{e}_r(A)\, \mathsf{e}_{k-r}(B) = \sum_{r=0}^{k} \frac{(N-r)!\,(N-(k-r))!}{N!\,(N-k)!}\; \mathsf{e}_r(A)\, \mathsf{e}_{k-r}(B), \end{aligned}$$

which is the first formula in the theorem, since $\frac{(N)_k}{(N)_i (N)_j} = \frac{(N-i)!\,(N-j)!}{N!\,(N-k)!}$ for $i + j = k$. The second formula can be proved by similar techniques; the character computation can be reused, and the only difference is in the final "reassembly" step.

9.1. Exercises.

Exercise 9.5. Let $A$ and $B$ be normal $N \times N$ matrices with eigenvalues $(a_1,\ldots,a_N)$ and $(b_1,\ldots,b_N)$ respectively. Let $U$ be a random $N \times N$ unitary matrix.

  1. Let

     $$D_A = \mathrm{diag}(a_1,\ldots,a_N) \quad \text{and} \quad D_B = \mathrm{diag}(b_1,\ldots,b_N).$$

     Prove that

     $$\mathbb{E}\, c_x(A + UBU^*) = \mathbb{E}\, c_x(D_A + U D_B U^*).$$

  2. When you need to compute

     $$\sum_{\sigma \in \mathrm{Sym}(S)} \mathrm{sgn}(\sigma)\, \mathbb{E}\left( \prod_{i \in S} u_{i\mathbf{p}(i)}\, \overline{u_{\sigma(i)\mathbf{p}(i)}} \right)$$

     for $S \subseteq [N]$ with $|S| = n$ and $\mathbf{p} : S \to [N]$ in the proof of Theorem 9.4, why can you safely assume $S = [n]$?

Exercise 9.6. Let $0 \le n \le N$, and let $\mathbf{p} : [n] \to [N]$ be a multi-index which is not injective. Prove that

$$\sum_{\sigma \in S_n} \mathrm{sgn}(\sigma)\, \mathbb{E}\left( \prod_{i=1}^{n} u_{i\mathbf{p}(i)}\, \overline{u_{\sigma(i)\mathbf{p}(i)}} \right) = 0.$$

10. Finite free probability

Notation 10.1. Let $p$ be a polynomial of degree $N$. We write the coefficients as follows:

$$p(x) = \sum_{k=0}^{N} x^{N-k} (-1)^k\, \mathsf{e}_k(p).$$

One can immediately show that

$$\mathsf{e}_k(p) = \sum_{1 \le i_1 < \cdots < i_k \le N} \lambda_{i_1} \cdots \lambda_{i_k}$$

where $\lambda_1,\ldots,\lambda_N$ are the roots of $p$. (Write $p$ as a product of linear factors and expand.) This is the so-called elementary symmetric polynomial, perhaps familiar to you from algebraic combinatorics or undergrad algebra.

Definition 10.2. For monic polynomials $p$ and $q$ with degree $N$, define

$$p \boxplus_N q := \sum_{k=0}^{N} x^{N-k} (-1)^k \sum_{i+j=k} \frac{(N)_k}{(N)_i (N)_j}\, \mathsf{e}_i(p)\, \mathsf{e}_j(q)$$

and

$$p \boxtimes_N q := \sum_{k=0}^{N} x^{N-k} (-1)^k\, \frac{1}{\binom{N}{k}}\, \mathsf{e}_k(p)\, \mathsf{e}_k(q).$$

It must be emphasized that this definition is merely a repackaging of Theorem 9.4.
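Because the definition is purely a formula on coefficients, it is a few lines of code. Here is a Python sketch of $\boxplus_N$ (names mine), acting on monic coefficient lists $[1, c_1, \ldots, c_N]$ in numpy's descending-power ordering:

```python
def falling(N, j):
    """Falling factorial (N)_j = N (N-1) ... (N-j+1)."""
    out = 1.0
    for i in range(j):
        out *= N - i
    return out

def e_coeffs(p):
    """Read off e_k(p) from coefficients: p(x) = sum_k x^{N-k} (-1)^k e_k(p)."""
    return [(-1) ** k * p[k] for k in range(len(p))]

def boxplus(p, q):
    """Finite free additive convolution of two monic degree-N coefficient lists."""
    N = len(p) - 1
    ep, eq = e_coeffs(p), e_coeffs(q)
    out = [1.0]
    for k in range(1, N + 1):
        ek = sum(falling(N, k) / (falling(N, i) * falling(N, k - i)) * ep[i] * eq[k - i]
                 for i in range(k + 1))
        out.append((-1) ** k * ek)
    return out

# (x^2 - 1) boxplus_2 (x^2 - 1) = x^2 - 2, i.e. roots +-sqrt(2)
print(boxplus([1, 0, -1], [1, 0, -1]))
```

The printed example matches the $\mu \boxplus \mu$ computation of Example 8.17 in miniature: already at $N = 2$ the roots spread from $\pm 1$ to $\pm\sqrt{2}$.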

Theorem 10.3 (Quadrature for finite free convolution). Let $p$ and $q$ be monic polynomials of degree $N$. Let $A$ and $B$ be diagonal $N \times N$ matrices with $c_x(A) = p(x)$ and $c_x(B) = q(x)$. Then

$$\mathbb{E}\, c_x(A + UBU^*) = p \boxplus_N q$$

and

$$\mathbb{E}\, c_x(A\,UBU^*) = p \boxtimes_N q$$

where $U$ is a random matrix from any of the following groups:

  1. $U_N$, the $N \times N$ unitary matrices;
  2. $O_N$, the $N \times N$ orthogonal matrices;
  3. $S_N$, the $N \times N$ permutation matrices;
  4. $\mathbb{Z}_r \wr S_N$, the $N \times N$ permutation matrices signed by $r$-th roots of unity, for $1 \le r \le \infty$, where the case $r = \infty$ is shorthand for the circle group.
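For the permutation group the expectation is a finite average, so case (3) can be tested exactly. The following Python sketch (numpy assumed; it reuses the `boxplus` helper from the sketch after Definition 10.2) averages $c_x(A + PBP^T)$ over all of $S_3$:

```python
from itertools import permutations
import numpy as np

N = 3
A = np.diag([1.0, 2.0, 3.0])
B = np.diag([-1.0, 0.0, 1.0])

avg = np.zeros(N + 1)
perms = list(permutations(range(N)))
for s in perms:
    P = np.eye(N)[list(s)]                    # permutation matrix for s
    avg += np.poly(A + P @ B @ P.T) / len(perms)

print(np.round(avg, 6))                        # exact average over S_3
print(np.round(boxplus(np.poly(A), np.poly(B)), 6))
```

Both lines print the coefficients of $x^3 - 6x^2 + 10x - 4$, as the theorem predicts.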

Theorem 10.4 ([17, 14]). Let $p$ and $q$ be monic polynomials of degree $N$. Suppose that $p$ and $q$ are real-rooted. Then $p \boxplus_N q$ is real-rooted as well.

Notation 10.5. For a polynomial $p$ of degree $N$, write

$$\rho(p) := \frac{1}{N} \sum_{i=1}^{N} \delta_{\lambda_i}$$

where $\lambda_1,\ldots,\lambda_N$ are the roots of $p$ with multiplicity.

Theorem 10.6 ([10, 2]). For $N \ge 1$, let $p_N$ and $q_N$ be monic real-rooted polynomials with degree $N$, and suppose that there are some $\mu, \nu \in M_c(\mathbb{R})$ with $\rho(p_N) \to \mu$ and $\rho(q_N) \to \nu$. Then

  1. $\rho(p_N \boxplus_N q_N) \to \mu \boxplus \nu$;
  2. if "real-rooted" is replaced by "non-negative-rooted" and $M_c(\mathbb{R})$ is replaced by $M_c(\mathbb{R}_{\ge 0})$, then $\rho(p_N \boxtimes_N q_N) \to \mu \boxtimes \nu$.

Remark 10.7. We have not discussed Voiculescu's multiplicative convolution $\boxtimes$, but you can find information about it in [13]. We will eventually prove (1); the proof of (2) is similar and can be found in [1].

10.1. Finite free cumulants. Recall that for a self-adjoint variable $a$, the free cumulants are a sequence $(\kappa_n)_{n \ge 1}$, related to the moment sequence by a sum over noncrossing partitions, in which $\kappa_1$ is the mean and $\kappa_2$ is the variance.

Definition 10.8 ([2]). The finite free cumulants are defined implicitly by

$$\mathsf{e}_n(p) = \frac{\binom{N}{n}}{N^n} \sum_{\pi \in P(n)} N^{|\pi|}\, \mu(\pi)\, \kappa_\pi^{(N)}(p)$$

where $\mu(\pi) := (-1)^{n-|\pi|} \prod_{i \ge 2} (i!)^{m_{i+1}(\pi)}$ and $m_j(\pi)$ is the number of blocks in $\pi$ with size $j$. (Equivalently, $\mu(\pi) = (-1)^{n-|\pi|} \prod_{V \in \pi} (|V|-1)!$, the Möbius function of the partition lattice.)

Remark 10.9. The finite free cumulants can also be defined by a moment-cumulant formula; we will do this next class.

Example 10.10. With $n = 1$, the relation above says that

$$\mathsf{e}_1(p) = N \kappa_1^{(N)}(p) \quad \Longrightarrow \quad \kappa_1^{(N)}(p) = \frac{1}{N} \sum_{i=1}^{N} \lambda_i$$

where $\lambda_1,\ldots,\lambda_N$ are the roots of $p$. This is just the mean of $\rho(p)$.

With $n = 2$, it gives

$$\kappa_2^{(N)}(p) = \frac{1}{N}\,\mathsf{e}_1(p)\,\mathsf{e}_1(p) - \frac{2}{N-1}\,\mathsf{e}_2(p)$$

which is not the variance. We will see later that

$$\kappa_2^{(N)}(p) = \frac{N}{N-1}\left( m_2 - m_1^2 \right),$$

where $m_1$ and $m_2$ are the first two moments of $\rho(p)$.
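Definition 10.8 is again triangular in $\kappa_n^{(N)}$, so it can be solved numerically by enumerating set partitions. A Python sketch (names mine) that does this and verifies the two formulas of this example on the polynomial with roots $1, 2, 3$:

```python
from math import comb, factorial

def set_partitions(xs):
    """All set partitions of the tuple xs, as lists of blocks (tuples)."""
    if not xs:
        yield []
        return
    first, rest = xs[0], xs[1:]
    for sub in set_partitions(rest):
        for i in range(len(sub)):              # put first into an existing block
            yield sub[:i] + [sub[i] + (first,)] + sub[i + 1:]
        yield sub + [(first,)]                 # or make it a singleton

def moebius(pi, n):
    """mu(pi) = (-1)^(n - |pi|) * prod over blocks (|V| - 1)!, as in Definition 10.8."""
    out = (-1) ** (n - len(pi))
    for V in pi:
        out *= factorial(len(V) - 1)
    return out

def finite_free_cumulants(e, N):
    """Solve Definition 10.8 for kappa_n^{(N)}; e = [e_0, e_1, ..., e_N].
    In the n-th relation the only unknown is the full-block term."""
    kappa = {}
    for n in range(1, len(e)):
        known = 0.0
        for pi in set_partitions(tuple(range(n))):
            if len(pi) == 1:
                continue                        # that's the kappa_n term itself
            term = N ** len(pi) * moebius(pi, n)
            for V in pi:
                term *= kappa[len(V)]
            known += term
        full = N * moebius([tuple(range(n))], n)
        kappa[n] = (e[n] * N**n / comb(N, n) - known) / full
    return kappa

# p(x) = (x-1)(x-2)(x-3): e_1 = 6, e_2 = 11, e_3 = 6
N = 3
kappa = finite_free_cumulants([1.0, 6.0, 11.0, 6.0], N)
m1, m2 = 2.0, (1 + 4 + 9) / 3                  # first two moments of rho(p)
print(kappa[1], m1)                             # kappa_1 is the mean: both 2.0
print(kappa[2], N / (N - 1) * (m2 - m1**2))     # both 1.0
```

The brute-force enumeration is exponential in $n$, but for checking small cases it is entirely adequate.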

Example 10.11. Recall the expected characteristic polynomial of GUE:

$$\mathbb{E}\, c_x(A) = \sum_{k=0}^{\lfloor N/2 \rfloor} x^{N-2k} (-1)^k\, \frac{(N)_{2k}}{2^k N^k k!}.$$

Claim: the finite free cumulants are

$$\kappa_n^{(N)} = \begin{cases} 1 & \text{if } n = 2 \\ 0 & \text{otherwise.} \end{cases}$$

To check this: putting these $\kappa$'s into the definition, we have

$$\frac{\binom{N}{n}}{N^n} \sum_{\pi \in P(n)} N^{|\pi|}\, \mu(\pi)\, \kappa_\pi^{(N)} = \frac{\binom{N}{n}}{N^n} \sum_{\pi \in P_2(n)} N^{|\pi|}\, \mu(\pi)$$

which is $0$ if $n$ is odd. If $n = 2k$, then the above is

$$\frac{\binom{N}{2k}}{N^{2k}} \sum_{\pi \in P_2(2k)} N^k (-1)^k = \frac{\binom{N}{2k}}{N^{2k}}\, (2k-1)!!\, N^k (-1)^k = \frac{1}{N^k} \cdot \frac{N!}{(2k)!\,(N-2k)!} \cdot \frac{(2k)!}{2^k k!}\, (-1)^k = (-1)^k\, \frac{(N)_{2k}}{2^k N^k k!},$$

which matches.

Theorem 10.12 ([2]). $\kappa_n^{(N)}(p \boxplus_N q) = \kappa_n^{(N)}(p) + \kappa_n^{(N)}(q)$.

Proof. We have

$$\begin{aligned} \frac{1}{(N)_k}\,\mathsf{e}_k(p \boxplus_N q) &= \sum_{i+j=k} \frac{1}{(N)_i}\,\mathsf{e}_i(p) \cdot \frac{1}{(N)_j}\,\mathsf{e}_j(q) && \text{(Definition 10.2)} \\ &= \sum_{i+j=k} \left( \frac{1}{N^i i!} \sum_{\pi_1 \in P(i)} N^{|\pi_1|}\mu(\pi_1)\,\kappa_{\pi_1}^{(N)}(p) \right) \left( \frac{1}{N^j j!} \sum_{\pi_2 \in P(j)} N^{|\pi_2|}\mu(\pi_2)\,\kappa_{\pi_2}^{(N)}(q) \right) && \text{(Definition 10.8)} \\ &= \frac{1}{N^k k!} \sum_{i+j=k} \binom{k}{i} \left( \sum_{\pi_1 \in P(i)} N^{|\pi_1|}\mu(\pi_1)\,\kappa_{\pi_1}^{(N)}(p) \right) \left( \sum_{\pi_2 \in P(j)} N^{|\pi_2|}\mu(\pi_2)\,\kappa_{\pi_2}^{(N)}(q) \right) \\ &= \frac{1}{N^k k!} \sum_{S \subseteq [k]} \left( \sum_{\pi_1 \in P(S)} N^{|\pi_1|}\mu(\pi_1)\,\kappa_{\pi_1}^{(N)}(p) \right) \left( \sum_{\pi_2 \in P([k]\setminus S)} N^{|\pi_2|}\mu(\pi_2)\,\kappa_{\pi_2}^{(N)}(q) \right) \\ &= \frac{1}{N^k k!} \sum_{\pi \in P(k)} \sum_{\pi = \pi_1 \sqcup \pi_2} N^{|\pi_1|+|\pi_2|}\, \mu(\pi_1)\,\mu(\pi_2)\, \kappa_{\pi_1}^{(N)}(p)\, \kappa_{\pi_2}^{(N)}(q) \\ &= \frac{1}{N^k k!} \sum_{\pi \in P(k)} N^{|\pi|}\mu(\pi) \sum_{\pi = \pi_1 \sqcup \pi_2} \kappa_{\pi_1}^{(N)}(p)\, \kappa_{\pi_2}^{(N)}(q) = \frac{1}{N^k k!} \sum_{\pi \in P(k)} N^{|\pi|}\mu(\pi)\, \big( \kappa^{(N)}(p) + \kappa^{(N)}(q) \big)_\pi \end{aligned}$$

so $\kappa_n^{(N)}(p \boxplus_N q) = \kappa_n^{(N)}(p) + \kappa_n^{(N)}(q)$.

10.2. Exercises.

Exercise 10.13. Prove (3) in Theorem 10.3. Hint: the only difference is in computing

$$\sum_{\sigma \in S_n} \mathrm{sgn}(\sigma)\, \mathbb{E}\left( \prod_{i=1}^{n} u_{i\mathbf{p}(i)}\, \overline{u_{\sigma(i)\mathbf{p}(i)}} \right)$$

where $U = (u_{ij})_{i,j}$ is a random permutation matrix. Show that the expression above is the same as it is for $U \in U_N$.

11. Asymptotics of finite free convolution

Reminder: the free moment-cumulant formula is

$$m_n = \sum_{\pi \in NC(n)} \kappa_\pi.$$

For example, $m_2 = \kappa_2 + \kappa_1^2$ and $m_3 = \kappa_3 + 3\kappa_1\kappa_2 + \kappa_1^3$.

Last class, we introduced a sequence of finite free cumulants associated to a degree $N$ polynomial $p$, denoted by $\kappa_n^{(N)}(p)$. In this class, we will develop a moment-cumulant relation and use it to prove that the finite free cumulants characterize convergence in root distribution.

11.1. Combinatorics of maps. Recall that the noncrossing partitions can be characterized as follows:

$$NC(n) \cong \{ \alpha \in S_n : \#(\alpha) + \#(\alpha^{-1}\gamma_n) = n+1 \}$$

where $\gamma_n := (1,\ldots,n)$ and $\#(\cdot)$ is the number of disjoint cycles. Furthermore, we have

$$\#(\alpha) + \#(\alpha^{-1}\gamma_n) \le n+1$$

for all $\alpha \in S_n$, and the LHS decreases in twos. We need to generalize this phenomenon:

Theorem 11.1. Let $\alpha, \gamma \in S_n$ and suppose that $\langle \alpha, \gamma \rangle$ is transitive. Then there is some $g \ge 0$ such that

$$\#(\alpha) + \#(\alpha^{-1}\gamma) = (n + 2 - \#(\gamma)) - 2g.$$

Idea of proof. For each $\alpha$, there is a surface built from discs, one for each cycle of $\gamma$, with bands attached according to $\alpha$. Shrinking the discs to points produces a closed surface with a graph drawn on it. The Euler characteristic formula says $V - E + F = 2 - 2g$; the number of vertices is $\#(\gamma)$, the number of edges is $n$, and the number of faces is $\#(\alpha) + \#(\alpha^{-1}\gamma)$. This makes

$$\#(\gamma) - n + \#(\alpha) + \#(\alpha^{-1}\gamma) = 2(1 - g).$$

Example 11.2. We have already seen an instance of this theorem, when $\gamma = (1,\ldots,n)$ and $\alpha$ is an involution with no fixed points. In this case, with $n = 2k$, the formula reads as

$$k + \#(\alpha^{-1}\gamma) = 2k + 2 - 1 - 2g \quad \Longrightarrow \quad \#(\alpha^{-1}\gamma) = k + 1 - 2g$$

and we recover the exponents from the GUE genus expansion.

Example 11.3. More generally, we saw earlier in the course that there is a bijection

$$NC(n) \cong \{ \alpha \in S_n : \#(\alpha) + \#(\alpha^{-1}\gamma_n) = n+1 \}$$

where $\gamma_n := (1,\ldots,n)$. This is the case of Theorem 11.1 where $\gamma = (1,\ldots,n)$ and $g = 0$.

Example 11.4 ([12]). Let $\gamma = (1,2,3)$, and $\alpha = (1,2,3)$ and $\beta = (1,3,2)$. Then

$$\alpha^{-1}\gamma = (1,3,2)(1,2,3) = (1)(2)(3) \quad \text{and} \quad \beta^{-1}\gamma = (1,2,3)(1,2,3) = (1,3,2)$$

and we can check the genus:

$$\#(\alpha) + \#(\alpha^{-1}\gamma) = 1 + 3 = 4 \quad \text{and} \quad n + 2 - \#(\gamma) = 3 + 2 - 1 = 4$$

so $g = 0$, while

$$\#(\beta) + \#(\beta^{-1}\gamma) = 1 + 1 = 2 \quad \text{and} \quad n + 2 - \#(\gamma) = 3 + 2 - 1 = 4$$

so $g = 1$. You should draw some pictures to verify that the results of these formal computations are sensible. The critical difference between $\alpha$ and $\beta$, when you do this, will be the orientation requirement.
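Theorem 11.1 can also be verified by brute force for small $n$. Here is a Python sketch (names mine) which checks, for every pair $\alpha, \gamma \in S_4$ generating a transitive subgroup, that the defect $(n + 2 - \#(\gamma)) - \#(\alpha) - \#(\alpha^{-1}\gamma)$ is a non-negative even integer $2g$:

```python
from itertools import permutations

def cycles(p):
    """Number of disjoint cycles of a permutation of {0,...,n-1} given as a tuple."""
    seen, c = set(), 0
    for i in range(len(p)):
        if i not in seen:
            c += 1
            while i not in seen:
                seen.add(i)
                i = p[i]
    return c

def compose(a, b):
    """(a b)(i) = a(b(i))."""
    return tuple(a[b[i]] for i in range(len(a)))

def inverse(a):
    inv = [0] * len(a)
    for i, j in enumerate(a):
        inv[j] = i
    return tuple(inv)

def transitive(a, b):
    """Orbit of 0 under forward images of a and b (enough for permutations)."""
    orbit, frontier = {0}, [0]
    while frontier:
        i = frontier.pop()
        for j in (a[i], b[i]):
            if j not in orbit:
                orbit.add(j)
                frontier.append(j)
    return len(orbit) == len(a)

n = 4
for gamma in permutations(range(n)):
    for alpha in permutations(range(n)):
        if not transitive(alpha, gamma):
            continue
        lhs = cycles(alpha) + cycles(compose(inverse(alpha), gamma))
        defect = (n + 2 - cycles(gamma)) - lhs
        assert defect >= 0 and defect % 2 == 0
print("checked: #(a) + #(a^-1 c) = n + 2 - #(c) - 2g with g >= 0 across S_4")
```

The forward-orbit check suffices for transitivity because every permutation's inverse is a positive power of itself.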

11.2. Moment-cumulant formulas.

Proposition 11.5 ([2, 1]). The finite free cumulants satisfy (and are defined implicitly by) the following moment-cumulant relation:

$$m_n = \frac{(-1)^{n-1}}{N^{n+1}(n-1)!} \sum_{\substack{\alpha,\beta \in S_n \\ \langle\alpha,\beta\rangle \text{ transitive}}} (-N)^{\#(\alpha)+\#(\beta)}\, \kappa_\alpha^{(N)} = \frac{(-1)^{n-1}}{N^{n+1}(n-1)!} \sum_{\gamma \in S_n} \sum_{\substack{\alpha \in S_n \\ \langle\alpha,\gamma\rangle \text{ transitive}}} (-N)^{\#(\alpha)+\#(\alpha^{-1}\gamma)}\, \kappa_\alpha^{(N)},$$

where $m_n$ is the $n$-th moment of $\rho(p)$ and $\kappa_\alpha^{(N)}$ denotes the product of $\kappa_\ell^{(N)}(p)$ over the cycles of $\alpha$, with $\ell$ the cycle lengths.

Example 11.6. With $n = 1$, the above says

$$m_1 = \frac{1}{N^2}\,(-N)^2\,\kappa_1^{(N)} = \kappa_1^{(N)}$$

so there is no difference yet. With $n = 2$, it says

$$m_2 = -\frac{1}{N^3} \left( (-N)^3 \kappa_1^{(N)}\kappa_1^{(N)} + (-N)^3 \kappa_2^{(N)} + (-N)^2 \kappa_2^{(N)} \right) = \kappa_2^{(N)} + \kappa_1^{(N)}\kappa_1^{(N)} - \frac{1}{N}\,\kappa_2^{(N)}$$

so

$$m_2 - m_1^2 = \left( 1 - \frac{1}{N} \right) \kappa_2^{(N)} \quad \Longrightarrow \quad \kappa_2^{(N)} = \frac{N}{N-1}\left( m_2 - m_1^2 \right).$$

When $N \to \infty$, the rational function in $N$ goes away and you recover the free case.

With $n = 3$, the pairs $(\alpha,\beta)$ which generate a transitive subgroup of $S_3$ are the following: $\alpha = e$ with $\beta$ a $3$-cycle ($2$ pairs); $\alpha$ a transposition with $\beta$ one of the other two transpositions ($6$ pairs) or a $3$-cycle ($6$ pairs); and $\alpha$ a $3$-cycle with $\beta$ arbitrary ($12$ pairs), for $26$ pairs in total. So the moment-cumulant formula says

$$m_3 = \frac{1}{2N^4} \Big( 2(-N)^{3+1}\kappa_1\kappa_1\kappa_1 + 6(-N)^{2+2}\kappa_2\kappa_1 + 2(-N)^{1+3}\kappa_3 + 6(-N)^{2+1}\kappa_2\kappa_1 + 3(-N)^{1+2}\kappa_3 + 2(-N)^{1+1}\kappa_3 + 3(-N)^{1+2}\kappa_3 + 2(-N)^{1+1}\kappa_3 \Big).$$

Notice, again, that the leading order is

$$m_3 = \kappa_1^3 + 3\kappa_1\kappa_2 + \kappa_3 + O(N^{-1})$$

which matches the moment-cumulant formula for free cumulants.
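The tally of transitive pairs behind the $n = 3$ computation can be reproduced by brute force; the following Python sketch (names mine) counts the pairs $(\alpha,\beta) \in S_3 \times S_3$ with $\langle\alpha,\beta\rangle$ transitive, grouped by cycle type:

```python
from itertools import permutations
from collections import Counter

def cycle_type(p):
    """Sorted cycle lengths of a permutation of {0,...,n-1} given as a tuple."""
    seen, lens = set(), []
    for i in range(len(p)):
        if i not in seen:
            L = 0
            while i not in seen:
                seen.add(i)
                i = p[i]
                L += 1
            lens.append(L)
    return tuple(sorted(lens))

def transitive(a, b):
    """Orbit of 0 under a and b covers everything (forward images suffice)."""
    orbit, frontier = {0}, [0]
    while frontier:
        i = frontier.pop()
        for j in (a[i], b[i]):
            if j not in orbit:
                orbit.add(j)
                frontier.append(j)
    return len(orbit) == len(a)

tally = Counter()
for a in permutations(range(3)):
    for b in permutations(range(3)):
        if transitive(a, b):
            tally[(cycle_type(a), cycle_type(b))] += 1
for key, count in sorted(tally.items()):
    print(key, count)
# expected: ((1,1,1),(3,)): 2,  ((1,2),(1,2)): 6,  ((1,2),(3,)): 6,
#           ((3,),(1,1,1)): 2,  ((3,),(1,2)): 6,   ((3,),(3,)): 4   -- 26 pairs
```

The six tallies correspond exactly to the eight terms in the display above (the $3 + 3$ and $2 + 2$ splittings come from the two choices of $3$-cycle $\alpha$).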

Theorem 11.7 ([2]). Let $\mu$ be a compactly supported measure, and for $N \ge 1$, let $p_N$ be a monic real-rooted polynomial with degree $N$. Then the following are equivalent:

  1. $\rho(p_N) \to \mu$ weakly;
  2. $\kappa_n^{(N)}(p_N) \to \kappa_n(\mu)$ for all $n \ge 1$.

Proof. First, assume that $\kappa_n^{(N)}(p_N) \to \kappa_n(\mu)$ for all $n \ge 1$. Then by the moment-cumulant formula, we have

$$\lim_{N\to\infty} m_n(p_N) = \lim_{N\to\infty} \frac{(-1)^{n-1}}{(n-1)!} \sum_{\substack{\alpha,\beta \in S_n \\ \langle\alpha,\beta\rangle \text{ transitive}}} (-1)^{\#(\alpha)+\#(\beta)} \left( \frac{1}{N} \right)^{n+1-\#(\alpha)-\#(\beta)} \kappa_\alpha^{(N)}(p_N) = \lim_{N\to\infty} \frac{(-1)^{n-1}}{(n-1)!} \sum_{\gamma \in S_n} \sum_{\substack{\alpha \in S_n \\ \langle\alpha,\gamma\rangle \text{ transitive}}} (-1)^{\#(\alpha)+\#(\alpha^{-1}\gamma)} \left( \frac{1}{N} \right)^{n+1-\#(\alpha)-\#(\alpha^{-1}\gamma)} \kappa_\alpha^{(N)}(p_N).$$

From Theorem 11.1, since $g \ge 0$ and $\#(\gamma) \ge 1$, we have

$$\#(\alpha) + \#(\alpha^{-1}\gamma) \le n + 2 - \#(\gamma) \le n + 1.$$

This means the expression above does not blow up, and the summands that survive are the ones where this inequality is an equality. So we can continue the above as

$$\lim_{N\to\infty} m_n(p_N) = \frac{(-1)^{n-1}}{(n-1)!} \sum_{\gamma \in S_n} \sum_{\substack{\alpha \in S_n,\ \langle\alpha,\gamma\rangle \text{ transitive} \\ \#(\alpha)+\#(\alpha^{-1}\gamma)=n+1}} (-1)^{n+1}\,\kappa_\alpha = \frac{1}{(n-1)!} \sum_{\gamma \in S_n} \sum_{\substack{\alpha \in S_n,\ \langle\alpha,\gamma\rangle \text{ transitive} \\ \#(\alpha)+\#(\alpha^{-1}\gamma)=n+1}} \kappa_\alpha,$$

where $\kappa_\alpha$ denotes the limiting value $\lim_N \kappa_\alpha^{(N)}(p_N)$.

Now, if $\gamma \in S_n$ has $\#(\gamma) > 1$, then we have

$$\#(\alpha) + \#(\alpha^{-1}\gamma) = n + 2 - \#(\gamma) - 2g < n + 2 - 1 - 2g = n + 1 - 2g \le n + 1,$$

so the inner sum is empty. One can also check that the inner sum only depends on $\gamma$ through its cycle type: if $\gamma' = \sigma\gamma\sigma^{-1}$, then

$$\sum_{\substack{\alpha \in S_n,\ \langle\alpha,\gamma\rangle \text{ transitive} \\ \#(\alpha)+\#(\alpha^{-1}\gamma)=n+1}} \kappa_\alpha = \sum_{\substack{\alpha \in S_n,\ \langle\sigma\alpha\sigma^{-1},\,\sigma\gamma\sigma^{-1}\rangle \text{ transitive} \\ \#(\sigma\alpha\sigma^{-1})+\#((\sigma\alpha\sigma^{-1})^{-1}(\sigma\gamma\sigma^{-1}))=n+1}} \kappa_{\sigma\alpha\sigma^{-1}} = \sum_{\substack{\alpha \in S_n,\ \langle\alpha,\gamma'\rangle \text{ transitive} \\ \#(\alpha)+\#(\alpha^{-1}\gamma')=n+1}} \kappa_\alpha.$$

So the only contributions come from the $(n-1)!$ full cycles $\gamma$, each contributing the same amount (and for a full cycle, $\langle\alpha,\gamma\rangle$ is automatically transitive). If $\#(\gamma_n) = 1$, the limiting expression reads as

$$\lim_{N\to\infty} m_n(p_N) = \frac{1}{(n-1)!} \cdot (n-1)! \sum_{\substack{\alpha \in S_n \\ \#(\alpha)+\#(\alpha^{-1}\gamma_n)=n+1}} \kappa_\alpha = \sum_{\pi \in NC(n)} \kappa_\pi = m_n(\mu),$$

which shows that $m_n(p_N) \to m_n(\mu)$ for every $n$; since $\mu$ is compactly supported and hence determined by its moments, this gives $\rho(p_N) \to \mu$ weakly.

Conversely, assume that $\rho(p_N) \to \mu$ weakly, and assume for the sake of simplicity that the measures $\rho(p_N)$ have uniformly bounded support (this isn't a serious restriction, see [1, Appendix B]), so $m_n(p_N) \to m_n(\mu)$ for all $n \ge 1$. We will prove by induction on $n$ that $\kappa_n^{(N)}(p_N) \to \kappa_n(\mu)$ for all $n \ge 1$.

For the base case $n = 1$, we have

$$\lim_{N\to\infty} \kappa_1^{(N)}(p_N) = \lim_{N\to\infty} m_1(p_N) = m_1(\mu) = \kappa_1(\mu).$$

Now suppose the claim is true up to $n - 1$. Then the moment-cumulant formula can be re-arranged: the $\alpha$ which produce $\kappa_n^{(N)}$ are exactly the ones with only one cycle, and for these, any $\beta$ will produce a transitive subgroup. So we have

$$\frac{(-1)^{n-1}}{N^{n+1}(n-1)!} \sum_{\substack{\alpha,\beta \in S_n \\ \#(\alpha)=1}} (-N)^{1+\#(\beta)}\, \kappa_n^{(N)}(p_N) = m_n(p_N) - \frac{(-1)^{n-1}}{N^{n+1}(n-1)!} \sum_{\substack{\alpha,\beta \in S_n,\ \langle\alpha,\beta\rangle \text{ transitive} \\ \#(\alpha) > 1}} (-N)^{\#(\alpha)+\#(\beta)}\, \kappa_\alpha^{(N)}(p_N).$$

On the LHS, summing over the $(n-1)!$ full cycles $\alpha$ cancels the $\frac{1}{(n-1)!}$, and we get

$$\sum_{\beta \in S_n} (-1)^{n+\#(\beta)} \left( \frac{1}{N} \right)^{n-\#(\beta)} \kappa_n^{(N)}(p_N) = \kappa_n^{(N)}(p_N) \sum_{\beta \in S_n} \left( -\frac{1}{N} \right)^{n-\#(\beta)},$$

and by (1) the induction assumption and (2) the moment convergence assumption, on the RHS we have

$$\begin{aligned} &\lim_{N\to\infty} \left( m_n(p_N) - \frac{(-1)^{n-1}}{(n-1)!} \sum_{\substack{\alpha,\gamma \in S_n,\ \langle\alpha,\gamma\rangle \text{ transitive} \\ \#(\alpha) > 1}} (-1)^{\#(\alpha)+\#(\alpha^{-1}\gamma)} \left( \frac{1}{N} \right)^{n+1-\#(\alpha)-\#(\alpha^{-1}\gamma)} \kappa_\alpha^{(N)}(p_N) \right) \\ &= \lim_{N\to\infty} m_n(p_N) - \frac{1}{(n-1)!} \sum_{\substack{\alpha,\gamma \in S_n,\ \langle\alpha,\gamma\rangle \text{ transitive} \\ \#(\alpha) > 1,\ \#(\alpha)+\#(\alpha^{-1}\gamma)=n+1}} \lim_{N\to\infty} \kappa_\alpha^{(N)}(p_N) = m_n(\mu) - \sum_{\substack{\pi \in NC(n) \\ |\pi| > 1}} \kappa_\pi(\mu). \end{aligned}$$

So the RHS has a limit, and the non-cumulant part of the LHS converges to $1$. This situation corresponds to the following: let $a_k$ and $b_k$ be sequences where $a_k \to 1$, $a_k \ne 0$, and $a_k b_k \to L$ for some $L \in \mathbb{C}$. Then we have

$$b_k = \frac{a_k b_k}{a_k} \to L$$

by the quotient rule for limits of sequences. So we can conclude that

$$\lim_{N\to\infty} \kappa_n^{(N)}(p_N) = m_n(\mu) - \sum_{\substack{\pi \in NC(n) \\ |\pi| > 1}} \kappa_\pi(\mu) = \kappa_n(\mu),$$

which proves the theorem.

Corollary 11.8. For $N \ge 1$, let $p_N$ and $q_N$ be monic real-rooted polynomials with degree $N$. Let $\mu$ and $\nu$ be compactly supported measures and suppose that $\rho(p_N) \to \mu$ and $\rho(q_N) \to \nu$. Then $\rho(p_N \boxplus_N q_N) \to \mu \boxplus \nu$.

Proof. Again, assume for the sake of simplicity that the supports of $\rho(p_N)$ and $\rho(q_N)$ are uniformly bounded. Then $m_n(p_N) \to m_n(\mu)$ and $m_n(q_N) \to m_n(\nu)$. By Theorem 11.7, we have $\kappa_n^{(N)}(p_N) \to \kappa_n(\mu)$ and $\kappa_n^{(N)}(q_N) \to \kappa_n(\nu)$ for all $n \ge 1$.

Now, we can use Theorem 10.4 to ensure that $p_N \boxplus_N q_N$ is real-rooted, and apply Theorem 10.12 and Corollary 8.8 to obtain

$$\kappa_n^{(N)}(p_N \boxplus_N q_N) = \kappa_n^{(N)}(p_N) + \kappa_n^{(N)}(q_N) \to \kappa_n(\mu) + \kappa_n(\nu) = \kappa_n(\mu \boxplus \nu)$$

for all $n \ge 1$. Applying Theorem 11.7 again, we can conclude that $\rho(p_N \boxplus_N q_N) \to \mu \boxplus \nu$.
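Corollary 11.8 can be watched in action: taking $p_N = (x^2 - 1)^{N/2}$, whose root distribution is exactly $\mu = \frac{1}{2}(\delta_{-1} + \delta_1)$ for every even $N$, the roots of $p_N \boxplus_N p_N$ should approach the arcsine law $\mu \boxplus \mu$ of Example 8.17. A Python sketch (numpy assumed; it reuses `boxplus` from the Section 10 sketch, and the degree is kept moderate so the coefficient arithmetic stays well-conditioned):

```python
import numpy as np

N = 40
p = np.poly([1.0] * (N // 2) + [-1.0] * (N // 2))   # (x^2 - 1)^(N/2)
r = np.roots(boxplus(p, p)).real   # roots are real by Theorem 10.4;
                                   # .real discards numerical noise
hist, edges = np.histogram(r, bins=8, range=(-2, 2), density=True)
mid = (edges[:-1] + edges[1:]) / 2
print(np.round(hist, 2))                               # empirical root density
print(np.round(1 / (np.pi * np.sqrt(4 - mid**2)), 2))  # arcsine density
```

Even at $N = 40$ the histogram visibly develops the arcsine law's characteristic peaks near $\pm 2$.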

References

1. Octavio Arizmendi, Jorge Garza-Vargas, and Daniel Perales, Finite free cumulants: multiplicative convolutions, genus expansion and infinitesimal distributions, Trans. Amer. Math. Soc. 376 (2023), no. 6, 4383–4420.
2. Octavio Arizmendi and Daniel Perales, Cumulants for finite free convolution, J. Combin. Theory Ser. A 155 (2018), 244–266.
3. Philippe Biane, Some properties of crossings and partitions, Discrete Math. 175 (1997), no. 1-3, 41–53.
4. Patrick Billingsley, Probability and measure, third ed., Wiley Series in Probability and Mathematical Statistics, John Wiley & Sons, Inc., New York, 1995.
5. Benoît Collins, Moments and cumulants of polynomial random variables on unitary groups, the Itzykson-Zuber integral, and free probability, Int. Math. Res. Not. (2003), no. 17, 953–982.
6. Benoît Collins, Moment methods on compact groups: Weingarten calculus and its applications, ICM—International Congress of Mathematicians. Vol. 4. Sections 5–8, EMS Press, Berlin, 2023, pp. 3142–3164.
7. Benoît Collins and Piotr Śniady, Integration with respect to the Haar measure on unitary, orthogonal and symplectic group, Comm. Math. Phys. 264 (2006), no. 3, 773–795.
8. Pavel Etingof, Oleg Golberg, Sebastian Hensel, Tiankai Liu, Alex Schwendner, Dmitry Vaintrob, and Elena Yudovina, Introduction to representation theory, Student Mathematical Library, vol. 59, American Mathematical Society, Providence, RI, 2011. With historical interludes by Slava Gerovitch.
9. Roger A. Horn and Charles R. Johnson, Matrix analysis, second ed., Cambridge University Press, Cambridge, 2013.
10. Adam W. Marcus, Polynomial convolutions and (finite) free probability, 2021, arXiv:2108.07054 [math.CO].
11. Adam W. Marcus, Daniel A. Spielman, and Nikhil Srivastava, Finite free convolutions of polynomials, Probab. Theory Related Fields 182 (2022), no. 3-4, 807–848.
12. James A. Mingo and Roland Speicher, Free probability and random matrices, Fields Institute Monographs, vol. 35, Springer, New York; Fields Institute for Research in Mathematical Sciences, Toronto, ON, 2017.
13. Alexandru Nica and Roland Speicher, Lectures on the combinatorics of free probability, London Mathematical Society Lecture Note Series, vol. 335, Cambridge University Press, Cambridge, 2006.
14. G. Szegö, Bemerkungen zu einem Satz von J. H. Grace über die Wurzeln algebraischer Gleichungen, Math. Z. 13 (1922), no. 1, 28–55.
15. Dan Voiculescu, Symmetries of some reduced free product C*-algebras, Operator algebras and their connections with topology and ergodic theory (Buşteni, 1983), Lecture Notes in Math., vol. 1132, Springer, Berlin, 1985, pp. 556–588.
16. Dan Voiculescu, Limit laws for random matrices and free products, Invent. Math. 104 (1991), no. 1, 201–220.
17. J. L. Walsh, On the location of the roots of certain types of polynomials, Trans. Amer. Math. Soc. 24 (1922), no. 3, 163–180.