RANDOM MATRICES AND FREE PROBABILITY
JACOB CAMPBELL
Notation
Let M(ℝ) be the set of probability measures on ℝ, and let M_c(ℝ) be the set of probability measures on ℝ with compact support. Let ℍ⁺ and ℍ⁻ be the upper and lower half-planes, respectively.
1. GUE and genus expansion
1.1. Moments of gaussians.
Recall the following distribution:
Definition 1.1. A random variable X has a standard gaussian distribution if it has density
\[ f(t) = \frac{1}{\sqrt{2\pi}}\, e^{-t^2/2} \]
for t ∈ ℝ. This is the case of mean 0 and variance 1.
Actually, we care about all the moments:
Proposition 1.2. If X is standard gaussian, then
\[ \mathbb{E}(X^{2k}) = (2k-1)!! \quad\text{and}\quad \mathbb{E}(X^{2k-1}) = 0 \]
for k ≥ 1.
Definition 1.3 (Set partitions). A partition of the set [n] := {1,…,n} is a collection π = {V₁,…,V_p} of blocks, which are disjoint non-empty subsets of [n], with V₁ ∪ ⋯ ∪ V_p = [n]. The set of partitions of [n] will be denoted by P(n). A pair partition is a partition whose blocks all have size 2; the set of pair partitions of [n] will be denoted by P₂(n). Note that P₂(n) = ∅ when n is odd.
Let ϕ : P(n) → S_n be the map which turns blocks into cycles, with the usual order inherited from [n]. This map is obviously injective, and the image of P₂(n) under ϕ is the set of permutations in S_n with order 2 and no fixed points. We will refer interchangeably to a pair partition π and its image ϕ(π) in S_n; notice that in the case of pair partitions, the arbitrary choice of order in the definition of ϕ does not actually matter.
Example 1.4.
\[ \phi(\{\{1,4\},\{2,5\},\{3\}\}) = (1,4)(2,5)(3) \in S_5. \]
Proposition 1.5. We have |P₂(2k)| = (2k−1)!! for k ≥ 1.
Idea of proof. When you choose what gets paired with 1, you have 2k−1 choices. Next, you have 2k−3 choices. This goes on to give you (2k−1)(2k−3)⋯3·1 = (2k−1)!! choices in total. Exercise: formalize this idea. □
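Here is a minimal computational sketch of Proposition 1.5 (the helper names are mine): it enumerates pair partitions by always pairing the smallest remaining element, which is exactly the counting argument above, and compares the count with (2k−1)!!.

```python
# Enumerate pair partitions of {1,...,2k} and check |P_2(2k)| = (2k-1)!!.
def pair_partitions(elements):
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for pairs in pair_partitions(remaining):
            yield [(first, partner)] + pairs

def double_factorial(n):
    return 1 if n <= 0 else n * double_factorial(n - 2)

for k in range(1, 6):
    count = sum(1 for _ in pair_partitions(list(range(1, 2 * k + 1))))
    assert count == double_factorial(2 * k - 1)
    print(k, count)
```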
Theorem 1.6 (Wick formula). Let X₁,…,X_n be gaussian with mean 0 and covariance matrix Σ. Then for 𝐢 : [k] → [n], we have
\[ \mathbb{E}(X_{\mathbf{i}(1)}\cdots X_{\mathbf{i}(k)}) = \sum_{\pi\in P_2(k)} \prod_{(r,s)\in\pi} \mathbb{E}(X_{\mathbf{i}(r)}X_{\mathbf{i}(s)}). \]
Definition 1.7 (Complex gaussian). A standard complex gaussian variable is obtained by letting X and Y be independent standard real gaussians and letting Z = (X + iY)/√2. This has mean 0 and variance 1:
\[ \mathbb{E}(Z) = 0 \quad\text{and}\quad \mathbb{E}(|Z|^2) = 1. \]
Proposition 1.8. Let Z be a standard complex gaussian variable. Then
\[ \mathbb{E}(Z^m \overline{Z}^n) = \delta_{m=n}\, m! \]
for m, n ≥ 0.
1.2. Self-adjoint gaussian matrix.
Definition 1.10 (Gaussian unitary ensemble). Let A = (a_{ij})_{i,j=1}^N be an N × N matrix where
- the diagonal entries a_{ii} are iid real gaussian with mean 0 and variance 1/N,
- the entries a_{ij} for 1 ≤ i < j ≤ N are iid complex gaussian with mean 0 and variance 1/N, independent from the diagonal,
- a_{ij} = \overline{a_{ji}} for 1 ≤ j < i ≤ N.
The point of this definition is that A is self-adjoint with independent gaussian entries, except as required by the self-adjointness. The choice of variance 1/N is for normalization and will play an important role when we send N → ∞.
Definition 1.11. Let (Ω, F, P) be a probability space and for 1 ≤ i, j ≤ N, let a_{ij} : Ω → ℂ be a random variable. Assume that A = (a_{ij})_{i,j} is self-adjoint. This random matrix produces a random probability measure
\[ \nu_A := \frac{1}{N}\sum_{i=1}^N \delta_{\lambda_i} \]
on ℝ, where λ₁,…,λ_N are the eigenvalues of A. This is called the empirical spectral distribution (or ESD) of A.
The average eigenvalue distribution of A, denoted by μ_A, is simply the mean of ν_A: define μ_A by
\[ \int_{\mathbb{R}} f(x)\,d\mu_A(x) = \mathbb{E}\left(\int_{\mathbb{R}} f(x)\,d\nu_A(x)\right) \]
for bounded measurable f. This gives us a nice way to access the moments:
\[ \int_{\mathbb{R}} x^m\,d\mu_A(x) = \mathbb{E}\left(\int_{\mathbb{R}} x^m\,d\nu_A(x)\right) = \mathbb{E}\left(\frac{1}{N}\sum_{i=1}^N \lambda_i^m\right) = \mathbb{E}\,\mathrm{tr}_N(A^m). \]
Theorem 1.12 (Wigner’s semicircle law). Let A be the random matrix from Definition 1.10. Then μ_A → μ weakly as N → ∞, where dμ = f dx with
\[ f(x) = \begin{cases} \frac{1}{2\pi}\sqrt{4-x^2} & \text{if } x \in [-2,2] \\ 0 & \text{otherwise.} \end{cases} \]
In this class and the next, we will prove Theorem 1.12 using moments. The moments of the average eigenvalue distribution turn out to be encoded by a combinatorially rich sequence of polynomials – related to the topological genus of certain surfaces – evaluated at 1/N. We will come back to the latter connection later on; for now, we will only be concerned with the leading order.
Notation 1.14. Let γn = (1,…,n).
We will use the notation #(⋅)
for the number of disjoint cycles in a permutation (including singletons/fixed points).
Theorem 1.15 (Genus expansion). We have
\[ \mathbb{E}(\mathrm{tr}_N(A^m)) = \sum_{\pi\in P_2(m)} \left(\frac{1}{N}\right)^{\frac{m}{2}+1-\#(\gamma_m\pi)} \]
for m ≥ 1. When m is odd, the formula above should be read as an empty sum, i.e. as 0.
The name comes from a topological interpretation of the exponent: writing m = 2k, one has k + 1 − #(γ_{2k}π) = 2g_π, where
- g_π is the genus of the surface obtained from a polygon with 2k sides by gluing the sides together in pairs according to π;
- equivalently, g_π is the smallest possible genus for which the following can be done: put the elements of [2k] on a circle clockwise, make that circle the boundary of a surface with genus g_π, and draw π on the surface without crossings.
This is where the phrase “genus expansion” comes from.
Proof of Theorem 1.15. We have
\begin{align*}
\mathbb{E}(\mathrm{tr}_N(A^m)) &= \frac{1}{N}\sum_{\mathbf{i}:[m]\to[N]} \mathbb{E}(a_{\mathbf{i}(1)\mathbf{i}(2)}\, a_{\mathbf{i}(2)\mathbf{i}(3)} \cdots a_{\mathbf{i}(m)\mathbf{i}(1)}) \\
&= \frac{1}{N}\sum_{\mathbf{i}:[m]\to[N]} \sum_{\pi\in P_2(m)} \prod_{(r,s)\in\pi} \mathbb{E}(a_{\mathbf{i}(r)\mathbf{i}(r+1)}\, a_{\mathbf{i}(s)\mathbf{i}(s+1)}) \\
&= \frac{1}{N}\sum_{\mathbf{i}:[m]\to[N]} \sum_{\pi\in P_2(m)} \prod_{(r,s)\in\pi} \delta_{\mathbf{i}(r)=\mathbf{i}(s+1)}\, \delta_{\mathbf{i}(r+1)=\mathbf{i}(s)}\, \frac{1}{N} \\
&= \frac{1}{N}\sum_{\pi\in P_2(m)} \sum_{\substack{\mathbf{i}:[m]\to[N] \\ \mathbf{i}=\mathbf{i}\circ\gamma_m\circ\pi}} \left(\frac{1}{N}\right)^{m/2} \\
&= \left(\frac{1}{N}\right)^{\frac{m}{2}+1} \sum_{\pi\in P_2(m)} |\{\mathbf{i}:[m]\to[N] : \mathbf{i}=\mathbf{i}\circ\gamma_m\circ\pi\}| \\
&= \left(\frac{1}{N}\right)^{\frac{m}{2}+1} \sum_{\pi\in P_2(m)} N^{\#(\gamma_m\pi)} \\
&= \sum_{\pi\in P_2(m)} \left(\frac{1}{N}\right)^{\frac{m}{2}+1-\#(\gamma_m\pi)}
\end{align*}
(indices read mod m), since a map 𝐢 : [m] → [N] with 𝐢 = 𝐢 ∘ γ_m ∘ π amounts to a choice of label from [N] for each cycle in γ_mπ. □
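As a sanity check, here is a minimal numerical sketch (all function names and the sampling normalization are mine): it evaluates the genus expansion by brute force over pair partitions and compares it with a Monte Carlo average of tr_N(A^m) over sampled GUE matrices.

```python
import numpy as np

# Brute-force genus expansion vs. Monte Carlo GUE moments (Definition 1.10).
def pair_partitions(elements):
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for i, partner in enumerate(rest):
        for pairs in pair_partitions(rest[:i] + rest[i + 1:]):
            yield [(first, partner)] + pairs

def cycle_count(perm):  # perm as a dict j -> perm(j)
    seen, count = set(), 0
    for j in perm:
        if j not in seen:
            count += 1
            while j not in seen:
                seen.add(j)
                j = perm[j]
    return count

def genus_expansion(m, N):
    if m % 2:
        return 0.0
    gamma = {j: j % m + 1 for j in range(1, m + 1)}      # gamma_m = (1,...,m)
    total = 0.0
    for rho in pair_partitions(list(range(1, m + 1))):
        pi = {r: s for (r, s) in rho}
        pi.update({s: r for (r, s) in rho})
        gp = {j: gamma[pi[j]] for j in pi}               # the permutation gamma_m pi
        total += (1 / N) ** (m / 2 + 1 - cycle_count(gp))
    return total

def sample_gue(N, rng):
    X = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
    return (X + X.conj().T) / (2 * np.sqrt(N))           # entry variance 1/N

rng = np.random.default_rng(0)
N, m = 6, 4
mc = np.mean([(np.trace(np.linalg.matrix_power(sample_gue(N, rng), m)) / N).real
              for _ in range(4000)])
print(genus_expansion(m, N), mc)  # should agree up to Monte Carlo error
```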
1.3. Exercises.
Exercise 1.17. In this exercise, you will show that the moments of a standard gaussian variable count pair partitions.
- Let X be a standard gaussian variable. Prove that
\[ \mathbb{E}(X^{2k}) = (2k-1)!! \quad\text{and}\quad \mathbb{E}(X^{2k-1}) = 0 \]
for all k ≥ 1. Use integration by parts to find a recursion.
- Prove that |P₂(2k)| = (2k−1)!! by putting P₂(2k) in bijection with a set of cardinality (2k−1)·|P₂(2k−2)|.
Exercise 1.18. In this exercise, you will prove the Wick formula: if X₁,…,X_n are gaussian with mean 0 and covariance matrix Σ, then
\[ \mathbb{E}(X_{\mathbf{i}(1)}\cdots X_{\mathbf{i}(k)}) = \sum_{\pi\in P_2(k)} \prod_{(r,s)\in\pi} \mathbb{E}(X_{\mathbf{i}(r)}X_{\mathbf{i}(s)}) \]
for all 𝐢 : [k] → [n].
- Assume Σ is diagonal and prove the claim.
- Prove the claim in general by diagonalizing Σ and using the multilinearity of the claimed formula.
Exercise 1.19 ([12, Exercise 1.6]). Let Z be a standard complex gaussian variable. Show that the moments are
\[ \mathbb{E}(Z^m \overline{Z}^n) = \begin{cases} m! & \text{if } m = n \\ 0 & \text{otherwise.} \end{cases} \]
Hint: start by showing that
\[ \mathbb{E}(Z^m \overline{Z}^n) = \frac{1}{\pi}\int_{\mathbb{R}^2} (t_1+it_2)^m (t_1-it_2)^n\, e^{-(t_1^2+t_2^2)}\,dt_1\,dt_2 \]
and then switch to polar coordinates to show that
\[ \mathbb{E}(Z^m \overline{Z}^n) = \frac{1}{\pi}\int_0^{2\pi}\!\!\int_0^\infty r^{m+n+1}\, e^{i\theta(m-n)}\, e^{-r^2}\,dr\,d\theta. \]
Use this to prove the claim.
2. Semicircle law and noncommutative probability spaces
Recall the theorem from last class:
Theorem 1.15 (Genus expansion). We have
\[ \mathbb{E}(\mathrm{tr}_N(A^m)) = \sum_{\pi\in P_2(m)} \left(\frac{1}{N}\right)^{\frac{m}{2}+1-\#(\gamma_m\pi)} \]
for m ≥ 1. When m is odd, the formula above should be read as an empty sum, i.e. as 0.
What happens when we send N → ∞? Clearly, for the RHS of Theorem 1.15 to converge, the exponents have to be non-negative; if this is indeed the case, the only summands that will survive are the ones with m/2 + 1 − #(γ_mπ) = 0.
Definition 2.1. A partition π ∈ P(n) is said to be noncrossing if the following situation never occurs: there are some blocks V, W ∈ π with V ≠ W, and some a, c ∈ V and b, d ∈ W with a < b < c < d. The set of noncrossing partitions of [n] is denoted by NC(n).
Theorem 2.2 ([3]). We have #(γ2kπ) ≤ k + 1
for all π ∈ P2(2k).
Moreover, we have equality if and only if π
is noncrossing.
We will come back to this in a moment. First, let’s use it to find the limiting moments
of μA.
Notation 2.3. Write NC2(m)
for the non-crossing pair partitions of [m].
When m
is odd, NC2(m)
is of course empty.
Proposition 2.4. We have
\[ |NC_2(2k)| = \frac{1}{k+1}\binom{2k}{k} \quad\text{and}\quad |NC_2(2k-1)| = 0 \]
for all k ≥ 1.
Definition 2.5. The k-th Catalan number is Cat(k) := \frac{1}{k+1}\binom{2k}{k}.
Corollary 2.6. We have
\[ \lim_{N\to\infty} \mathbb{E}(\mathrm{tr}_N(A^{2k})) = \mathrm{Cat}(k) \quad\text{and}\quad \lim_{N\to\infty} \mathbb{E}(\mathrm{tr}_N(A^{2k-1})) = 0 \]
for k ≥ 1.
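For small k, Proposition 2.4 can be checked by direct enumeration. A minimal sketch (the helper names are mine): a pairing has a crossing exactly when it contains pairs (a,c) and (b,d) with a < b < c < d.

```python
from math import comb

# Count noncrossing pairings of [2k] and check |NC_2(2k)| = Cat(k).
def pair_partitions(elements):
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for i, partner in enumerate(rest):
        for pairs in pair_partitions(rest[:i] + rest[i + 1:]):
            yield [(first, partner)] + pairs

def is_noncrossing(pairs):  # each pair (a, c) already satisfies a < c
    return not any(a < b < c < d for (a, c) in pairs for (b, d) in pairs)

for k in range(1, 7):
    nc = sum(is_noncrossing(rho)
             for rho in pair_partitions(list(range(1, 2 * k + 1))))
    assert nc == comb(2 * k, k) // (k + 1)
    print(k, nc)
```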
2.1. Which pairings survive in the limit?
Recall the core combinatorial theorem:
Theorem 2.2 ([3]). We have #(γ2kπ) ≤ k + 1
for all π ∈ P2(2k).
Moreover, we have equality if and only if π
is noncrossing.
To see the inequality, we can make a simple observation about permutations:
Lemma 2.7. Let α ∈ S_n and let τ = (i, j) be a transposition. Then
\[ \#(\alpha\tau) = \begin{cases} \#(\alpha)+1 & \text{if } i \text{ and } j \text{ are in the same cycle of } \alpha \\ \#(\alpha)-1 & \text{if } i \text{ and } j \text{ are in different cycles of } \alpha. \end{cases} \]
For example, let k = 3 and π = (1,4)(2,3)(5,6). Then we start at (1,2,3,4,5,6) with 1 cycle. Next, we multiply by (1,4):
\[ (1,2,3,4,5,6)(1,4) = (1,5,6)(2,3,4) \]
which has two cycles. Next,
\[ (1,2,3,4,5,6)(1,4)(2,3) = (1,5,6)(2,4)(3) \]
which has three cycles. Finally,
\[ (1,2,3,4,5,6)(1,4)(2,3)(5,6) = (1,5)(2,4)(3)(6) \]
which has four cycles. This is the maximal situation: each pair splits a cycle in two.
On the other hand, now consider π = (1,5)(2,3)(4,6). Then we again start at (1,2,3,4,5,6) with 1 cycle. Then
\[ (1,2,3,4,5,6)(1,5) = (1,6)(2,3,4,5) \]
has two cycles, as before. Next,
\[ (1,2,3,4,5,6)(1,5)(2,3) = (1,6)(2,4,5)(3) \]
which is another splitting step (notice we haven’t arrived at the crossing yet) giving us three cycles. The last step is different:
\[ (1,2,3,4,5,6)(1,5)(2,3)(4,6) = (1,6,5,2,4)(3) \]
which is back down to two cycles. This is the generic situation: we can have both splits and merges.
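These step-by-step multiplications are easy to reproduce by machine. A minimal sketch (conventions are mine: permutations as dicts, and στ means “apply τ first”):

```python
# Multiply gamma_6 by the pairs of pi one transposition at a time and
# watch the cycle count split or merge (Lemma 2.7).
def compose(sigma, tau):
    return {j: sigma[tau[j]] for j in tau}

def cycles(perm):
    seen, out = set(), []
    for j in sorted(perm):
        if j not in seen:
            cyc = []
            while j not in seen:
                seen.add(j)
                cyc.append(j)
                j = perm[j]
            out.append(tuple(cyc))
    return out

def transposition(i, j, n):
    t = {x: x for x in range(1, n + 1)}
    t[i], t[j] = j, i
    return t

n = 6
gamma = {j: j % n + 1 for j in range(1, n + 1)}
for pairing in ([(1, 4), (2, 3), (5, 6)], [(1, 5), (2, 3), (4, 6)]):
    perm = gamma
    print('pi =', pairing)
    for (i, j) in pairing:
        perm = compose(perm, transposition(i, j, n))
        print('  after', (i, j), '->', cycles(perm))
```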
Analyzing the case of equality can probably be done directly, but it is really more insightful to
put all this in a proper algebraic framework. Let’s work with general partitions and permutations,
with size n;
the results we need for pairings will fall out as special cases.
Definition 2.9. For α ∈ Sn,
let |α|
be the length of α:
the minimal number of transpositions needed to factor α.
For α,β ∈ Sn,
write d(α,β) := |α−1β|.
Proposition 2.10. Notation as above. Then
- |α| = n − #(α)
for all α ∈ Sn;
- (Sn,d)
is a metric space.
Proof. For (1), use Lemma 2.7. (2) is a very direct and straightforward verification
of axioms. □
Proposition 2.11 ([3]). Let ϕ : P(n) → S_n be the injective map which turns blocks into cycles. Let
\[ S_{NC}(\gamma_n) := \{\alpha \in S_n : d(e,\alpha) + d(\alpha,\gamma_n) = d(e,\gamma_n)\}. \]
Then S_{NC}(γ_n) is the range of ϕ|_{NC(n)}.
Proof. Finicky induction but elementary. See [3, Theorem 1]. □
Proof of Theorem 2.2. In (S_{2k}, d), for π ∈ P₂(2k), the triangle inequality gives
\[ d(e,\gamma_{2k}) \le d(e,\pi) + d(\pi,\gamma_{2k}). \]
Unpacking notation, this translates to
\[ 2k - 1 \le (2k - k) + (2k - \#(\gamma_{2k}\pi)) \]
which simplifies to #(γ_{2k}π) ≤ k + 1. The case of equality, i.e.
\[ d(e,\gamma_{2k}) = d(e,\pi) + d(\pi,\gamma_{2k}), \]
corresponds to π being noncrossing due to Proposition 2.11. □
2.2. Moment problem.
Question 2.12. We just showed that the moments of the average eigenvalue
distribution of a GUE converge to the Catalan numbers. How do we know this is
a moment sequence, and how do we know this determines the weak limit of the
average eigenvalue distribution?
The Catalan numbers appear as the moments of a particular distribution:
Proposition 2.13. Let dμ = f(x) dx where
\[ f(x) = \begin{cases} \frac{1}{2\pi}\sqrt{4-x^2} & \text{if } x \in [-2,2] \\ 0 & \text{otherwise.} \end{cases} \]
Then
\[ \int_{\mathbb{R}} x^{2k}\,d\mu(x) = \frac{1}{k+1}\binom{2k}{k} \quad\text{and}\quad \int_{\mathbb{R}} x^{2k-1}\,d\mu(x) = 0 \]
for all k ≥ 1.
Okay, so we can guess that maybe the average eigenvalue distribution of a
GUE converges to the semicircle distribution. But we just found the moments
match – a priori this could just be a funny coincidence. So we can refocus
Question 2.12:
Question 2.12′. If (μ_n)_{n≥1} is a sequence in M(ℝ) and μ ∈ M(ℝ) has
\[ \lim_{n\to\infty} \int_{\mathbb{R}} x^m\,d\mu_n(x) = \int_{\mathbb{R}} x^m\,d\mu(x) \]
for all m ≥ 1, then does μ_n converge to μ in some meaningful sense at the level of measures?
The answer to Question 2.12′ is a qualified “yes”:
Theorem 2.14. Let μ be a probability measure on ℝ with compact support.
- If ν is another probability measure on ℝ with
\[ \int_{\mathbb{R}} x^m\,d\nu(x) = \int_{\mathbb{R}} x^m\,d\mu(x) \]
for all m ≥ 1, then μ = ν.
- If (μ_n)_{n≥1} is a sequence of probability measures on ℝ with
\[ \lim_{n\to\infty} \int_{\mathbb{R}} x^m\,d\mu_n(x) = \int_{\mathbb{R}} x^m\,d\mu(x) \]
for all m ≥ 1, then μ_n → μ weakly as n → ∞.
Proof. Big exercise. See [4, Theorems 30.1 & 30.2]. □
Proof of Theorem 1.12. Let μ_N be the average eigenvalue distribution of an N × N GUE, and let μ be the semicircle distribution with radius 2. We have already proved that
\[ \lim_{N\to\infty} \int_{\mathbb{R}} x^m\,d\mu_N(x) = \int_{\mathbb{R}} x^m\,d\mu(x) \]
for all m ≥ 1. Since μ has compact support, Theorem 2.14 shows that μ_N → μ weakly as N → ∞. □
2.3. Noncommutative probability spaces.
Definition 2.15. A ∗-algebra
is a unital associative algebra over ℂ
with an operation 𝒜 →𝒜 : a↦a∗
which is conjugate-linear and has (a∗)∗ = a
and (𝑎𝑏)∗ = b∗a∗
for all a,b ∈𝒜.
An element a ∈𝒜
is said to be positive if a = b∗b
for some b ∈𝒜.
Definition 2.16. A ∗-probability
space is a pair (𝒜,φ)
where 𝒜
is a ∗-algebra
and φ
is a linear functional on 𝒜
with φ(a∗a) ≥ 0
for all a ∈𝒜,
and φ(1) = 1.
Example 2.18 (Commutative case). Let
(Ω,F,P)
be a (classical) probability space, i.e. a measure space with total measure
1. Let
𝒜 := L∞(Ω,F,P)
be the algebra of essentially bounded functions, and let
φ be the linear functional defined by
\[ \varphi(f) := \int_\Omega f\,dP = \mathbb{E}(f) \]
for f ∈ 𝒜.
This is a ∗-probability
space, where the ∗-operation
is complex conjugation – in fact, it has a lot of analytic structure that we will put
aside for now.
Example 2.19 (Finite moments). Major objection to Example 2.18: many random
variables we care about are not bounded. We can set up a similar example that will
actually be more relevant for us: let 𝒜 = L∞−(Ω,F,P)
where L∞−(Ω,F,P) := ⋂
p≥1Lp(Ω,F,P),
i.e. the algebra of random variables with all moments finite. (Use Hölder’s inequality
to show 𝒜
is a ∗-algebra:
once to show Lp ⊇ Lq
for p ≤ q,
and once to show 𝑓𝑔 ∈ L1
for f,g ∈𝒜.)
The expectation φ
is defined in the same way, by integrating functions against P.
This is again a ∗-probability
space.
Example 2.20 (Scalar matrices). Let 𝒜 = M_N be the algebra of N × N matrices and let φ(A) = \frac{1}{N}\mathrm{Tr}(A) = \mathrm{tr}_N(A). This is a ∗-probability space which is not commutative.
Example 2.21 (Random matrices). Let 𝒜 = M_N(L^{∞−}(Ω, F, P)) be the ∗-algebra of matrices with entries that are RVs with finite moments, and let φ be the expected trace: φ(A) = 𝔼 tr_N(A). For example, our GUE random matrix lives here.
So far, we haven’t seen anything new. The real reason for making this abstraction is
to include genuinely noncommutative situations – where there is no hope of
identifying an underlying classical probability space – and view them as essentially
probabilistic in nature. The best example comes from groups and their group
rings.
Example 2.22 (Group algebra). Let G be a group and let 𝒜 := ℂ[G]. The expectation is defined by
\[ \varphi\Big(\sum_{g\in G} c_g g\Big) := c_e, \]
i.e. φ(g) = δ_{g=e} for g ∈ G. This is a ∗-probability space.
Next class, we will use group algebras and group-theoretical freeness to motivate
Voiculescu’s highly influential notion of free independence. Then, we will use it to
construct a concrete operator model for the asymptotics of multiple GUEs.
2.4. Exercises.
Exercise 2.23. In this exercise, you will enumerate the noncrossing pairings and
show that they are counted by the moments of the semicircle distribution. (1)
and (2) are classic textbook combinatorics. (3) is a somewhat involved calculus
problem.
- Find a recursion for |NC₂(2k)|. Think back to the enumeration of P₂(2k).
- Show that the Catalan numbers Cat(k) := \frac{1}{k+1}\binom{2k}{k} satisfy the same recursion that you found in (1). To do this, first show that with C(z) := \sum_{k=0}^\infty \mathrm{Cat}(k)\, z^k, we have
\[ C(z) = \frac{1-\sqrt{1-4z}}{2z}, \]
and derive the functional equation C(z) = 1 + zC(z)². Recover the recursion from this functional equation.
- Let dμ = f(x) dx where
\[ f(x) = \begin{cases} \frac{1}{2\pi}\sqrt{4-x^2} & \text{if } x \in [-2,2] \\ 0 & \text{otherwise.} \end{cases} \]
Show that
\[ \int_{\mathbb{R}} x^{2k}\,d\mu(x) = \mathrm{Cat}(k) \quad\text{and}\quad \int_{\mathbb{R}} x^{2k-1}\,d\mu(x) = 0 \]
for k ≥ 1. To do this, make the substitution x = 2cos θ. You can use the following identity without proof:
\[ \int_0^\pi \cos^{2m}\theta\,d\theta = \frac{(2m-1)!!}{(2m)!!}\,\pi. \]
Exercise 2.24. In this exercise, you will fill in the details of why convergence in
moments implies weak convergence in the case of the semicircular law. In both parts, let
μ be a probability
measure on ℝ
with compact support. This is part genuine exercise, part “book report”. Both parts
require some knowledge of measure theory.
- Let ν be a probability measure on ℝ with
\[ \int_{\mathbb{R}} x^m\,d\nu(x) = \int_{\mathbb{R}} x^m\,d\mu(x) \]
for all m ≥ 1. Prove that μ = ν. Note: there is no assumption that ν has compact support! Be careful and think about how to get around this issue.
- Let (μ_n)_{n≥1} be a sequence of probability measures on ℝ with
\[ \lim_{n\to\infty} \int_{\mathbb{R}} x^m\,d\mu_n(x) = \int_{\mathbb{R}} x^m\,d\mu(x) \]
for all m ≥ 1. Prove that μ_n → μ weakly. This is standard textbook material ([4, Theorems 30.1 and 30.2]). What I’m looking for is a concise summary of the argument that addresses its various subtleties: for example, how exactly is (1) being used, how do you establish tightness, etc. You don’t need to reproduce proofs of basic classical theorems, like the Markov inequality or the Helly selection theorem/Prokhorov’s theorem.
3. Free independence
3.1. Motivation and definition.
Definition 3.1 (Group-theoretical freeness). Let G
be a group and let {Gi : i ∈ I}
be a family of subgroups. The subgroups are free if for all k ≥ 1,
for all i1,…,ik ∈ I
with ij≠ij+1,
and for all gj ∈ Gij
with gj≠e,
we have g1⋯gk≠e.
How does this look in terms of the ∗-probability space (ℂ[G], φ)? Well, the condition that g_j ≠ e for all j implies g₁⋯g_k ≠ e translates to: φ(g_j) = 0 for all j implies φ(g₁⋯g_k) = 0.
Definition 3.2 (Freeness of subalgebras). Let (𝒜,φ)
be a ∗-probability
space, and let {𝒜i : i ∈ I}
be a family of unital ∗-subalgebras.
The family {𝒜i : i ∈ I}
is said to be freely independent (or free) if for all k ≥ 1,
for all i1,…,ik ∈ I
with ij≠ij+1,
and for all aj ∈𝒜ij
with φ(aj) = 0,
we have φ(a1⋯ak) = 0.
Proposition 3.3. Let G
be a group and let {Gi : i ∈ I}
be a family of subgroups. Then the subgroups are free in the group-theoretical sense if
and only if the subalgebras ℂ[Gi]
of ℂ[G]
are freely independent with respect to φ.
Proof. Exercise. I think this is an essential one. □
Definition 3.4 (Freeness of variables). Let (𝒜,φ)
be a ∗-probability
space and let {ai : i ∈ I}
be a family in 𝒜.
For i ∈ I,
let 𝒜i
be the ∗-subalgebra
of 𝒜
generated by ai.
The family {ai : i ∈ I}
is said to be freely independent (or free) if the family {𝒜i : i ∈ I}
is freely independent.
3.2. What does freeness actually do?
To understand what freeness means, let’s look at some examples. This section is
entirely based on [13, Lecture 5].
Example 3.5. Suppose that 𝒜
and B
are freely independent subalgebras of some ambient algebra. Let’s look at some
words with mixtures between 𝒜
and B.
Length two: for a ∈𝒜
and b ∈B,
freeness says
|
0 = φ((a − φ(a))(b − φ(b))) = φ(𝑎𝑏) − φ(a)φ(b)
|
so φ(𝑎𝑏) = φ(a)φ(b).
Length three: for a₁, a₂ ∈ 𝒜 and b ∈ B, freeness says
\[ \varphi\big((a_1-\varphi(a_1))(b-\varphi(b))(a_2-\varphi(a_2))\big) = 0. \]
Expanding and using the length two factorization, this says
\begin{align*}
0 &= \varphi(a_1 b a_2) - \varphi(a_2)\varphi(a_1 b) - \varphi(a_1)\varphi(b a_2) - \varphi(b)\varphi(a_1 a_2) + 2\varphi(a_1)\varphi(a_2)\varphi(b) \\
&= \varphi(a_1 b a_2) - \varphi(a_1)\varphi(a_2)\varphi(b) - \varphi(a_1)\varphi(a_2)\varphi(b) - \varphi(b)\varphi(a_1 a_2) + 2\varphi(a_1)\varphi(a_2)\varphi(b) \\
&= \varphi(a_1 b a_2) - \varphi(a_1 a_2)\varphi(b)
\end{align*}
so φ(a₁ba₂) = φ(a₁a₂)φ(b).
IMPORTANT: this factorization pattern does not continue. The same kind of calculation, with a₁, a₂ ∈ 𝒜 and b₁, b₂ ∈ B, shows that
\[ \varphi(a_1 b_1 a_2 b_2) = \varphi(a_1 a_2)\varphi(b_1)\varphi(b_2) + \varphi(a_1)\varphi(a_2)\varphi(b_1 b_2) - \varphi(a_1)\varphi(a_2)\varphi(b_1)\varphi(b_2). \]
The point of all this is that mixed moments only depend on mixed moments from the
same subalgebra, via some complicated pattern.
Proposition 3.6. Let (𝒜,φ)
be a ∗-probability
space and let {𝒜i : i ∈ I}
be freely independent ∗-subalgebras.
Let B
be the ∗-subalgebra
of 𝒜
generated by {𝒜i : i ∈ I}.
Then φ|B
is uniquely determined by {φ|𝒜i : i ∈ I}.
Proof. See [13, Lemma 5.13]. □
3.3. Free central limit theorem.
Similar to the last section, this one is entirely based on [13, Lecture 8]. From now on, fix a
∗-probability
space (𝒜,φ) and
a sequence (an)n≥1
in 𝒜
with an∗ = an,
φ(an) = 0, and
φ(an2) = 1 for all
n ≥ 1. Also assume
all the an
have the same moment sequence and they are freely independent.
Theorem 3.7 ([15]). We have
\[ \lim_{N\to\infty} \varphi\left(\left(\frac{a_1+\cdots+a_N}{\sqrt{N}}\right)^m\right) = \begin{cases} \mathrm{Cat}(k) & \text{if } m = 2k \\ 0 & \text{otherwise} \end{cases} \]
for m ≥ 1.
The first step is to simply unpack the LHS:
\[ \varphi\left(\left(\frac{a_1+\cdots+a_N}{\sqrt{N}}\right)^m\right) = \frac{1}{N^{m/2}} \sum_{\mathbf{i}:[m]\to[N]} \varphi(a_{\mathbf{i}(1)}\cdots a_{\mathbf{i}(m)}). \]
Notation 3.8. For 𝐢 : [m] → ℕ,
let ker(𝐢)
be the partition of [m]
defined by letting p
and q
be in the same block if and only if 𝐢(p) = 𝐢(q).
Lemma 3.9. If 𝐢, 𝐣 : [m] → ℕ have ker(𝐢) = ker(𝐣), then
\[ \varphi(a_{\mathbf{i}(1)}\cdots a_{\mathbf{i}(m)}) = \varphi(a_{\mathbf{j}(1)}\cdots a_{\mathbf{j}(m)}). \]
|
Idea of proof. This follows from the assumption that the
an
all have the same distribution, plus the fact that free independence means mixed
moments are determined by the individual distributions.
For example, suppose the common partition is {{1,3},{2,4}}, and 𝐢 = (1,2,1,2) and 𝐣 = (5,3,5,3). Then, using the computations from Example 3.5, we have
\[ \varphi(a_1 a_2 a_1 a_2) = \varphi(a_1^2)\varphi(a_2)^2 + \varphi(a_1)^2\varphi(a_2^2) - \varphi(a_1)^2\varphi(a_2)^2 \]
and
\[ \varphi(a_5 a_3 a_5 a_3) = \varphi(a_5^2)\varphi(a_3)^2 + \varphi(a_5)^2\varphi(a_3^2) - \varphi(a_5)^2\varphi(a_3)^2. \]
Since we assume the
an
all have the same moment sequence, these are the same. □
Notation 3.10. In light of Lemma 3.9, for π ∈ P(m), write φ(π) for the common value of φ(a_{𝐢(1)}⋯a_{𝐢(m)}) where 𝐢 : [m] → [N] has ker(𝐢) = π. Let A_π^N be the number of maps 𝐢 : [m] → [N] with ker(𝐢) = π. So we have
\[ \varphi\left(\left(\frac{a_1+\cdots+a_N}{\sqrt{N}}\right)^m\right) = \frac{1}{N^{m/2}} \sum_{\pi\in P(m)} A_\pi^N\, \varphi(\pi) \]
and the task is to see what happens to A_π^N as N → ∞. We can immediately dispense with any π that has a singleton block:
Lemma 3.11. If π ∈ P(m)
and there is some V ∈ π
with |V | = 1,
then φ(π) = 0.
Proof. Suppose that 𝐢(j) = r and {j} ∈ π. Then
\begin{align*}
\varphi(\pi) &= \varphi(a_{\mathbf{i}(1)}\cdots a_{\mathbf{i}(j-1)}\, a_r\, a_{\mathbf{i}(j+1)}\cdots a_{\mathbf{i}(m)}) \\
&= \varphi(a_{\mathbf{i}(1)}\cdots a_{\mathbf{i}(j-1)}\, a_{\mathbf{i}(j+1)}\cdots a_{\mathbf{i}(m)})\,\varphi(a_r)
\end{align*}
using freeness and one of the computations in Example 3.5. We have φ(a_r) = 0, so φ(π) = 0. □
Now, we need to understand the limit of A_π^N where π ∈ P(m) has |V| ≥ 2 for all V ∈ π. This is straightforward: a map 𝐢 : [m] → [N] with π = ker(𝐢) amounts to a choice of label from [N] for each block of π, where we do not allow duplicate labels. There are N(N−1)⋯(N−|π|+1) choices, so we have
\[ \varphi\left(\left(\frac{a_1+\cdots+a_N}{\sqrt{N}}\right)^m\right) = \sum_{\substack{\pi\in P(m) \\ |V|\ge 2\ \forall\, V\in\pi}} \frac{N(N-1)\cdots(N-|\pi|+1)}{N^{m/2}}\, \varphi(\pi). \]
The condition on π makes |π| ≤ m/2, and when N → ∞, the fraction involving N goes to 1 if |π| = m/2, and goes to 0 otherwise. The only way we can have |π| = m/2 is if π ∈ P₂(m). So we have
\[ \lim_{N\to\infty} \varphi\left(\left(\frac{a_1+\cdots+a_N}{\sqrt{N}}\right)^m\right) = \sum_{\pi\in P_2(m)} \varphi(\pi). \]
Of course, if m is odd, this is 0.
Now let us compute φ(π) for π ∈ P₂(2k). Pick 𝐢 : [m] → ℕ with π = ker(𝐢). If 𝐢(j) ≠ 𝐢(j+1) for all 1 ≤ j ≤ m−1, then φ(π) = 0 by freeness; otherwise, there is some 1 ≤ j ≤ m−1 with 𝐢(j) = 𝐢(j+1). Using the computations from Example 3.5, we have
\begin{align*}
\varphi(\pi) &= \varphi\big(a_{\mathbf{i}(1)}\cdots a_{\mathbf{i}(j-1)}(a_{\mathbf{i}(j)}a_{\mathbf{i}(j+1)})a_{\mathbf{i}(j+2)}\cdots a_{\mathbf{i}(m)}\big) \\
&= \varphi(a_{\mathbf{i}(1)}\cdots a_{\mathbf{i}(j-1)}a_{\mathbf{i}(j+2)}\cdots a_{\mathbf{i}(m)})\,\varphi(a_{\mathbf{i}(j)}a_{\mathbf{i}(j+1)}) \\
&= \varphi(a_{\mathbf{i}(1)}\cdots a_{\mathbf{i}(j-1)}a_{\mathbf{i}(j+2)}\cdots a_{\mathbf{i}(m)}),
\end{align*}
since φ(a_{𝐢(j)}a_{𝐢(j+1)}) = φ(a_{𝐢(j)}²) = 1. The pairings where we can keep doing this k times are exactly the ones that are noncrossing, so we have
\[ \varphi(\pi) = \begin{cases} 1 & \text{if } \pi \in NC_2(2k) \\ 0 & \text{if } \pi \in P_2(2k)\setminus NC_2(2k). \end{cases} \]
This shows that
\[ \lim_{N\to\infty} \varphi\left(\left(\frac{a_1+\cdots+a_N}{\sqrt{N}}\right)^m\right) = \begin{cases} \mathrm{Cat}(k) & \text{if } m = 2k \\ 0 & \text{otherwise} \end{cases} \]
which is the claim of Theorem 3.7.
3.4. Exercises.
Exercise 3.12. Let G be a group and let 𝒜 = ℂ[G]. Let φ be the linear functional on 𝒜 defined by
\[ \varphi\Big(\sum_{g\in G} c_g g\Big) := c_e. \]
I would say (3) is essential for understanding the concept of free independence. I strongly suggest you do it, even if you don’t hand it in.
- Prove that φ
is positive by directly using the definition above.
- Find a Hilbert space H,
a ∗-homomorphism
π : 𝒜 →B(H),
and a vector ξ ∈ H
such that φ(a) = ⟨π(a)ξ,ξ⟩
for all a ∈𝒜.
Use the fact that such a thing exists to prove that φ
is positive.
- Let {G_i : i ∈ I} be a family of subgroups of G, and for i ∈ I, let 𝒜_i := ℂ[G_i]. Prove that {G_i : i ∈ I} are free in the group-theoretical sense if and only if {𝒜_i : i ∈ I} are freely independent.
4. Asymptotic freeness of multiple GUEs
4.1. Free semicircular families.
The example of freeness in the last section will become relevant when we talk about
random unitary matrices, but for now, we are looking for some noncommutative
data that captures the asymptotics of multiple GUEs. The answer is that when
N →∞, they
become freely independent families of semicircular variables.
Definition 4.1 (Semicircular family). Let
(𝒜,φ) be a
∗-probability space. A free
semicircular family in (𝒜,φ)
consists of some elements s1,…,sr ∈𝒜
such that
- sj
is self-adjoint and semicircular, and
- {s1,…,sr}
is freely independent.
Theorem 4.2. For each r ≥ 1,
there is a ∗-probability
space (𝒜,φ) and
some elements s1,…,sr ∈𝒜
such that
- {s1,…,sr}
is a free semicircular family, and
- 𝒜
is generated by {s1,…,sr}.
In this section, we will construct a concrete instance of a free semicircular family using
creation and annihilation operators on the so-called full Fock space. This model will allow
for very concrete calculations involving certain lattice paths which are in obvious bijection
with NC2.
Definition 4.3 (Full Fock space). Fix a Hilbert space H. The full Fock space on H is the Hilbert space
\[ \mathcal{F}(H) := \mathbb{C}\Omega \oplus \bigoplus_{n\ge 1} H^{\otimes n}, \]
where Ω is a distinguished unit vector called the vacuum. Here, the inner product of H^{⊗n} is given on elementary tensors by
\[ \langle \xi_1\otimes\cdots\otimes\xi_n,\ \eta_1\otimes\cdots\otimes\eta_n \rangle = \langle \xi_1,\eta_1\rangle \cdots \langle \xi_n,\eta_n\rangle \]
and the direct sum is the set of sequences (ξ_n)_{n≥0} where ξ₀ ∈ ℂΩ and ξ_n ∈ H^{⊗n} for n ≥ 1, with ∑_{n=0}^∞ ∥ξ_n∥² < ∞. The inner product is given by
\[ \langle (\xi_n)_{n\ge 0}, (\eta_n)_{n\ge 0} \rangle = \sum_{n=0}^\infty \langle \xi_n, \eta_n \rangle. \]
The ∗-algebra B(F(H)) carries a canonical vacuum state:
\[ \varphi(T) := \langle T\Omega, \Omega\rangle. \]
Then (B(F(H)), φ) is a ∗-probability space.
Proposition 4.4. For ξ ∈ H, there is an operator c(ξ) ∈ B(F(H)) (the creation operator) defined by c(ξ)Ω = ξ and
\[ c(\xi)\,\xi_1\otimes\cdots\otimes\xi_n = \xi\otimes\xi_1\otimes\cdots\otimes\xi_n. \]
The adjoint (the annihilation operator) is given by
\[ c(\xi)^*\Omega = 0, \qquad c(\xi)^*\xi_1 = \langle\xi_1,\xi\rangle\,\Omega, \]
and
\[ c(\xi)^*\,\xi_1\otimes\cdots\otimes\xi_n = \langle\xi_1,\xi\rangle\,\xi_2\otimes\cdots\otimes\xi_n, \]
and it has the following useful property:
\[ c(\xi)^* c(\eta) = \langle \eta, \xi \rangle\, 1 \qquad (\xi, \eta \in H). \]
Specifically, c(ξ) is a scalar multiple of an isometry, and ∥c(ξ)∥ = ∥ξ∥.
Proposition 4.5 (Semicircular). Let H
be a Hilbert space and let ξ ∈ H.
Then c(ξ) + c(ξ)∗∈B(F(H))
is a semicircular element with radius 2∥ξ∥.
The combinatorics of noncrossing pairings are directly built into this model, but in a
slightly different form.
Definition 4.6. A Dyck path with m
steps is a path in ℤ≥0 × ℤ≥0
which
- starts at (0,0),
- takes steps of (1,1)
or (1,−1),
and
- ends at (m,0).
Note that this definition includes the requirement that the path never goes below the
x-axis. Let
D(m) be the set of
Dyck paths with m
steps; when m
is odd, D(m) = ∅.
Lemma 4.7. There is a canonical bijection NC2(m) ≃ D(m).
Proof of Proposition 4.5. To clean up notation, let a = c(ξ), and assume ∥ξ∥ = 1 (the general case follows by scaling). We will compute moments:
\[ \varphi((a+a^*)^m) = \sum_{\epsilon_1,\dots,\epsilon_m\in\{1,*\}} \varphi(a^{\epsilon_1}\cdots a^{\epsilon_m}). \]
The key observation is that φ(a^{ε₁}⋯a^{ε_m}) is just an indicator of a certain combinatorial property of the sequence (ε₁,…,ε_m): the sequence determines a lattice path with NE and SE steps, and the indicator detects whether or not it is a Dyck path.
To see this, for ε₁,…,ε_m ∈ {1, ∗}, let
\[ \lambda_j = \begin{cases} 1 & \text{if } \epsilon_{m-j+1} = 1 \\ -1 & \text{if } \epsilon_{m-j+1} = * \end{cases} \]
for 1 ≤ j ≤ m. Then, if we read the expression ξ^{⊗0} as Ω, we have
\[ a^{\epsilon_1}\cdots a^{\epsilon_m}\,\Omega = \begin{cases} \xi^{\otimes(\lambda_1+\cdots+\lambda_m)} & \text{if } \lambda_1+\cdots+\lambda_k \ge 0\ \forall\, 1\le k\le m \\ 0 & \text{otherwise} \end{cases} \]
and
\[ \varphi(a^{\epsilon_1}\cdots a^{\epsilon_m}) = \langle a^{\epsilon_1}\cdots a^{\epsilon_m}\Omega,\ \Omega\rangle = \begin{cases} \langle \xi^{\otimes(\lambda_1+\cdots+\lambda_m)}, \Omega\rangle & \text{if } \lambda_1+\cdots+\lambda_k \ge 0\ \forall\, 1\le k\le m \\ 0 & \text{otherwise.} \end{cases} \]
The inner product in the last line is 1 if λ₁ + ⋯ + λ_m = 0, and 0 otherwise. The condition λ₁ + ⋯ + λ_k ≥ 0 for all 1 ≤ k ≤ m means the path doesn’t go below the x-axis, and the condition that λ₁ + ⋯ + λ_m = 0 means it does return to the x-axis, i.e. it’s a Dyck path. So φ((a + a*)^m) is the number of Dyck paths with m steps. By Lemma 4.7, this shows that a + a* is semicircular. □
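Here is a minimal sketch of this count (the helper names are mine): ε-sequences are encoded as +1 (creation) / −1 (annihilation), the word a^{ε₁}⋯a^{ε_m} is read right to left as a lattice path, and the total is checked against Cat(m/2).

```python
from itertools import product
from math import comb

# phi((a + a*)^m) counts Dyck paths with m steps, i.e. Cat(m/2).
def is_dyck(steps):
    height = 0
    for s in steps:
        height += s
        if height < 0:
            return False
    return height == 0

for m in range(1, 11):
    count = sum(is_dyck(eps[::-1]) for eps in product((1, -1), repeat=m))
    expected = comb(m, m // 2) // (m // 2 + 1) if m % 2 == 0 else 0
    assert count == expected
    print(m, count)
```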
Proposition 4.8 (Freeness). Let H
be a Hilbert space and let H1,…,Hr
be mutually orthogonal subspaces of H.
For 1 ≤ i ≤ r,
let 𝒜i
be the ∗-subalgebra
of B(F(H))
generated by {c(ξ) : ξ ∈ Hi}.
Then 𝒜1,…,𝒜r
are freely independent.
Proof. Let 1 ≤ i₁,…,i_k ≤ r with i_j ≠ i_{j+1}, and pick T_j ∈ 𝒜_{i_j} with φ(T_j) = 0. Assume without loss of generality that
\[ T_j = c(\xi_1^j)\cdots c(\xi_{m_j}^j)\, c(\eta_1^j)^*\cdots c(\eta_{n_j}^j)^* \]
with m_j + n_j ≥ 1 and all vectors in H_{i_j}. The reason we can do this, and ignore contributions from 1, is that any non-zero contribution from 1 would make φ(T_j) non-zero.
If there is some 1 ≤ j ≤ k−1 with n_j ≠ 0 and m_{j+1} ≠ 0, then the product T_j T_{j+1} includes the factor
\[ c(\eta_{n_j}^j)^*\, c(\xi_1^{j+1}) = \langle \xi_1^{j+1}, \eta_{n_j}^j \rangle\, 1. \]
Since H_{i_j} ⊥ H_{i_{j+1}}, this is 0 and then φ(T₁⋯T_k) = 0.
Otherwise, for all 1 ≤ j ≤ k−1, we have either n_j = 0 or m_{j+1} = 0. Then
\[ T_1\cdots T_k = c(\xi_1)\cdots c(\xi_m)\, c(\eta_1)^*\cdots c(\eta_n)^* \]
for some ξ₁,…,ξ_m, η₁,…,η_n ∈ H with m + n ≥ 1, and
\begin{align*}
\varphi(T_1\cdots T_k) &= \langle c(\xi_1)\cdots c(\xi_m)\, c(\eta_1)^*\cdots c(\eta_n)^*\,\Omega,\ \Omega\rangle \\
&= \langle c(\eta_1)^*\cdots c(\eta_n)^*\,\Omega,\ c(\xi_m)^*\cdots c(\xi_1)^*\,\Omega\rangle.
\end{align*}
Since at least one of m or n is ≥ 1, one of the two arguments in the inner product is of the form c(ξ)*Ω = 0, so φ(T₁⋯T_k) = 0. □
Proof of Theorem 4.2. Let {e₁,…,e_r} be the standard orthonormal basis of ℂ^r, and let s_j = c(e_j) + c(e_j)* ∈ B(F(ℂ^r)) for 1 ≤ j ≤ r. Let 𝒜 be the ∗-algebra in B(F(ℂ^r)) generated by {s₁,…,s_r}, and let φ(T) = ⟨TΩ, Ω⟩ for T ∈ 𝒜. By Proposition 4.5, each s_j is semicircular, and by Proposition 4.8 (applied to the mutually orthogonal subspaces ℂe_j), the family {s₁,…,s_r} is freely independent. □
One can say more about this construction:
- Ω is a cyclic vector for 𝒜, meaning that {TΩ : T ∈ 𝒜} is dense in F(ℂ^r);
- φ is a faithful trace;
- for any C*-algebra B with a faithful state ψ, if B is generated by a free semicircular family of size r, then there is a state-preserving ∗-isomorphism 𝒜 ≅ B.
This is a very nice object and in light of (3), it is referred to as the semicircular C*-algebra with r generators. See [13, Lecture 7] for details.
4.2. Asymptotic freeness of GUEs.
Theorem 4.10 ([16]). Let {A₁,…,A_r} be an independent family of N × N GUEs, and let {s₁,…,s_r} be a free semicircular family. Then (A₁,…,A_r) converges in distribution to (s₁,…,s_r), in the following sense: for all m ≥ 1 and 𝐢 : [m] → [r], we have
\[ \lim_{N\to\infty} \mathbb{E}\,\mathrm{tr}_N(A_{\mathbf{i}(1)}\cdots A_{\mathbf{i}(m)}) = \varphi(s_{\mathbf{i}(1)}\cdots s_{\mathbf{i}(m)}). \]
This theorem follows from an easy multivariate generalization of the genus
expansion:
Theorem 1.15′. Let m ≥ 1 and 𝐢 : [m] → [r]. Then
\[ \mathbb{E}(\mathrm{tr}_N(A_{\mathbf{i}(1)}\cdots A_{\mathbf{i}(m)})) = \sum_{\substack{\pi\in P_2(m) \\ \pi\le\ker(\mathbf{i})}} \left(\frac{1}{N}\right)^{\frac{m}{2}+1-\#(\gamma_m\pi)}. \]
When m is odd, the formula should be read as an empty sum, i.e. as 0.
Proof of Theorem 4.10. When N → ∞ in the formula above, the only summands that survive are the ones with m/2 + 1 − #(γ_mπ) = 0. These are the ones with π noncrossing, so
\[ \mathbb{E}(\mathrm{tr}_N(A_{\mathbf{i}(1)}\cdots A_{\mathbf{i}(m)})) = |\{\pi\in NC_2(m) : \pi\le\ker(\mathbf{i})\}| + O(N^{-2}) \]
and we need to show that
\[ \varphi(s_{\mathbf{i}(1)}\cdots s_{\mathbf{i}(m)}) = |\{\pi\in NC_2(m) : \pi\le\ker(\mathbf{i})\}|. \]
This is similar to Proposition 4.5: in the proof there, the words a^{ε₁}⋯a^{ε_m} which contribute to the count |NC₂(m)| correspond to Dyck paths. There is a similar constraint here, but we also need pairs (r, s) of up and down steps to have 𝐢(r) = 𝐢(s). In terms of NC₂(m), this amounts to π ≤ ker(𝐢).
Here is an example of how this works: consider the path which makes two up steps and then two down steps, or in other words the noncrossing pairing {{1,4},{2,3}}. Writing a_j = c(e_j), this corresponds to
\begin{align*}
a_{\mathbf{i}(1)}^* a_{\mathbf{i}(2)}^* a_{\mathbf{i}(3)} a_{\mathbf{i}(4)}\,\Omega &= a_{\mathbf{i}(1)}^* a_{\mathbf{i}(2)}^*\,(e_{\mathbf{i}(3)}\otimes e_{\mathbf{i}(4)}) \\
&= a_{\mathbf{i}(1)}^*\,\langle e_{\mathbf{i}(3)}, e_{\mathbf{i}(2)}\rangle\, e_{\mathbf{i}(4)} \\
&= \langle e_{\mathbf{i}(3)}, e_{\mathbf{i}(2)}\rangle\,\langle e_{\mathbf{i}(4)}, e_{\mathbf{i}(1)}\rangle\,\Omega,
\end{align*}
so we have the additional requirement that 𝐢(2) = 𝐢(3) and 𝐢(1) = 𝐢(4), i.e. the pairing is compatible with 𝐢. □
4.3. Exercises.
Exercise 4.11. Let H be
a Hilbert space, let F(H) be its
full Fock space, and let F0(H)
be the dense subspace of finite linear combinations of elementary tensors. In this exercise,
you will fill in the details of the construction of creation and annihilation operators on
F(H).
-
For ξ ∈ H,
define c(ξ) : F0(H) →F(H)
by letting
|
c(ξ)(ξ1 ⊗⋯ ⊗ ξn) = ξ ⊗ ξ1 ⊗⋯ ⊗ ξn
|
and extending by linearity. Prove that c(ξ) is bounded and extends to an element c(ξ) ∈ B(F(H)) with ∥c(ξ)∥ = ∥ξ∥. Try to do it in a way that implies c(ξ) is a scalar multiple of an isometry.
-
Prove that
|
c(ξ)∗Ω = 0, c(ξ)∗ξ
1 = ⟨ξ1,ξ⟩Ω,
|
and
|
c(ξ)∗(ξ
1 ⊗⋯ ⊗ ξn) = ⟨ξ1,ξ⟩ξ2 ⊗⋯ ⊗ ξn
|
for n ≥ 2.
- Prove that c(ξ)∗c(η) = ⟨η,ξ⟩⋅ 1.
Exercise 4.12. A C∗-probability
space is a ∗-probability
space (𝒜,φ)
where 𝒜
is equipped with a norm, with respect to which it is complete, which has the properties
- ∥𝑎𝑏∥≤∥a∥∥b∥
for all a,b ∈𝒜,
and
- ∥a∗a∥ = ∥a∥2
for all a ∈𝒜.
This might be familiar to you as a C∗-algebra
equipped with a state. As usual in this course, we consider all algebras and related
structures to be unital. This exercise requires some knowledge of functional analysis
beyond the prerequisites.
- Let (𝒜, φ) be a C*-probability space and let a ∈ 𝒜 be normal (meaning that a*a = aa*). Prove that there is a unique compactly supported probability measure μ_a on ℂ such that
\[ \varphi(a^p (a^*)^q) = \int_{\mathbb{C}} z^p \bar{z}^q\, d\mu_a(z) \]
for all p, q ≥ 0. This is called the ∗-distribution, or simply the distribution, of a.
- Let (𝒜,φ)
be a C∗-probability
space and let {𝒜i : i ∈ I}
be a freely independent family of ∗-subalgebras
of 𝒜.
For i ∈ I,
let Bi
be the norm-closure of 𝒜i.
Prove that {Bi : i ∈ I}
are freely independent.
5. How to integrate over compact groups
In this class and the next, I want to show how to integrate functions on the unitary group
UN. Specifically,
C(UN) has a dense subalgebra
generated by the functions u𝑖𝑗 : UN → ℂ
– which take a matrix U ∈ UN
and pick out the (i,j)-th
entry – and their conjugates (this is an easy exercise with the Stone-Weierstrass
theorem). So here is the problem we will solve:
Problem 5.1. Let 𝐢, 𝐣 : [k] → [N] and 𝐢′, 𝐣′ : [l] → [N]. How to compute
\[ \int_{U_N} u_{\mathbf{i}(1)\mathbf{j}(1)}\cdots u_{\mathbf{i}(k)\mathbf{j}(k)}\ \overline{u_{\mathbf{i}'(1)\mathbf{j}'(1)}}\cdots \overline{u_{\mathbf{i}'(l)\mathbf{j}'(l)}}\ dU\ ? \]
This problem was solved comprehensively in [5] using the representation theory of
Sk and
Schur-Weyl duality. The ideas of that paper were refined in [7], and in my opinion the
most attractive approach to the problem is the one outlined in the recent survey [6]. I
claim no originality in my exposition of these works.
5.1. Reminders about integration.
Let G be a compact group. Fundamental theorem: there is a unique left-invariant Borel probability measure on G, called the Haar measure. In terms of integration, left invariance means that
\[ \int_G f(yx)\,dx = \int_G f(x)\,dx \]
for all y ∈ G. Here are some important properties of this measure:
- it is also right-invariant, which in terms of integration means that
\[ \int_G f(xy)\,dx = \int_G f(x)\,dx \]
for all y ∈ G;
- it has the property of unimodularity:
\[ \int_G f(x^{-1})\,dx = \int_G f(x)\,dx; \]
- every non-empty open set has positive measure.
5.2. Projection onto fixed points.
The key starting point is the following observation: we can assemble the integration problem into an operator
\[ P := \left(\int_{U_N} u_{\mathbf{i}(1)\mathbf{j}(1)}\cdots u_{\mathbf{i}(k)\mathbf{j}(k)}\ \overline{u_{\mathbf{i}'(1)\mathbf{j}'(1)}}\cdots \overline{u_{\mathbf{i}'(l)\mathbf{j}'(l)}}\ dU\right)_{\mathbf{i},\mathbf{i}',\mathbf{j},\mathbf{j}'} \]
on (ℂ^N)^{⊗k} ⊗ (ℂ^N)^{⊗l}. This matrix is actually an orthogonal projection, and its range is easy to describe. I think it will clarify things if we work in a more general setting.
Proposition 5.2. Let G be a compact group and let ρ : G → 𝒰(H) be a unitary representation of G on a Hilbert space H. Let
\[ P := \int_G \rho(x)\,dx, \]
where the integral is defined in the weak sense by
\[ \langle P\xi, \eta\rangle = \int_G \langle \rho(x)\xi, \eta\rangle\,dx \]
for all ξ, η ∈ H. Then P is the orthogonal projection onto
\[ \mathrm{Fix}_G(\rho) := \{\xi \in H : \rho(x)\xi = \xi \text{ for all } x \in G\}. \]
In terms of an orthonormal basis {e_i : i ∈ I} of H, the entries of P are
\[ P = \left(\int_G \langle\rho(x)e_j, e_i\rangle\,dx\right)_{i,j\in I}. \]
You will not lose anything if you imagine that H is finite-dimensional – this is the situation we care about.
Proof of Proposition 5.2. Invariance gives
\begin{align*}
\langle P^2\xi, \eta\rangle &= \int_G \langle \rho(x)P\xi, \eta\rangle\,dx = \int_G \langle P\xi, \rho(x)^*\eta\rangle\,dx \\
&= \int_G\left(\int_G \langle \rho(y)\xi, \rho(x)^*\eta\rangle\,dy\right)dx = \iint_G \langle \rho(xy)\xi, \eta\rangle\,dy\,dx \\
&= \int_G \langle \rho(x)\xi, \eta\rangle\,dx = \langle P\xi, \eta\rangle
\end{align*}
so P² = P, and unimodularity gives
\begin{align*}
\langle P^*\xi, \eta\rangle &= \langle \xi, P\eta\rangle = \overline{\langle P\eta, \xi\rangle} = \overline{\int_G \langle \rho(x)\eta, \xi\rangle\,dx} = \int_G \overline{\langle \rho(x)\eta, \xi\rangle}\,dx \\
&= \int_G \langle \xi, \rho(x)\eta\rangle\,dx = \int_G \langle \rho(x)^*\xi, \eta\rangle\,dx = \int_G \langle \rho(x^{-1})\xi, \eta\rangle\,dx \\
&= \int_G \langle \rho(x)\xi, \eta\rangle\,dx = \langle P\xi, \eta\rangle
\end{align*}
so P* = P. In other words, P is an orthogonal projection.
To compute the range, one inclusion is immediate: if ξ ∈ Fix_G(ρ), then
\[ \langle P\xi, \eta\rangle = \int_G \langle \rho(x)\xi, \eta\rangle\,dx = \int_G \langle \xi, \eta\rangle\,dx = \langle \xi, \eta\rangle \]
for all η ∈ H, so Pξ = ξ and ξ is in the range of P.
For the reverse inclusion, suppose that ξ ∉ Fix_G(ρ), so there is some x ∈ G such that ρ(x)ξ ≠ ξ. If we can show that ∥Pξ∥ < ∥ξ∥, then we can conclude that ξ is not in the range of P, because if it was, we would have Pξ = ξ.
Claim: for any ξ ∈ H, we have
\[ \|\xi\|^2 - \|P\xi\|^2 = \frac{1}{2}\int_G \|\xi - \rho(x)\xi\|^2\,dx. \]
To see this, expand
\begin{align*}
\|\xi - \rho(x)\xi\|^2 &= \langle \xi - \rho(x)\xi,\ \xi - \rho(x)\xi\rangle \\
&= \langle\xi,\xi\rangle - \langle\xi,\rho(x)\xi\rangle - \langle\rho(x)\xi,\xi\rangle + \langle\rho(x)\xi,\rho(x)\xi\rangle \\
&= 2\|\xi\|^2 - \left(\langle\rho(x)\xi,\xi\rangle + \overline{\langle\rho(x)\xi,\xi\rangle}\right) \qquad (\rho(x) \text{ is unitary})
\end{align*}
so
\begin{align*}
\int_G \|\xi - \rho(x)\xi\|^2\,dx &= 2\|\xi\|^2 - \left(\int_G \langle\rho(x)\xi,\xi\rangle\,dx + \overline{\int_G \langle\rho(x)\xi,\xi\rangle\,dx}\right) \\
&= 2\|\xi\|^2 - 2\,\mathrm{Re}\int_G \langle\rho(x)\xi,\xi\rangle\,dx = 2\|\xi\|^2 - 2\,\mathrm{Re}\,\langle P\xi,\xi\rangle \\
&= 2\|\xi\|^2 - 2\,\mathrm{Re}\,\langle P^2\xi,\xi\rangle = 2\|\xi\|^2 - 2\,\mathrm{Re}\,\langle P\xi,P\xi\rangle = 2\|\xi\|^2 - 2\|P\xi\|^2
\end{align*}
and dividing by 2 proves the claim.
The point of this claim is that since we have ξ with ρ(x)ξ ≠ ξ for some x ∈ G, i.e. ∥ξ − ρ(x)ξ∥ > 0 for some x ∈ G, the same is true on a set of positive measure, so ∫_G ∥ξ − ρ(x)ξ∥² dx > 0 and ∥Pξ∥² < ∥ξ∥², hence ∥Pξ∥ < ∥ξ∥ and ξ cannot be in the range of P. □
5.3. Gram matrices and pseudoinverses.
The basic data we have in hand is the following: a finite-dimensional Hilbert space H, a subspace K, and the orthogonal projection P of H onto K. Suppose we have a finite spanning set {ξ_α : α ∈ A} for K, and let G be its Gram matrix:
\[ G := (\langle \xi_\alpha, \xi_\beta\rangle)_{\alpha,\beta\in A}. \]
We want to describe the entries of P.
Proposition 5.4 (Moore-Penrose pseudoinverse). Let A be an n × n matrix. There is a unique n × n matrix B such that
- ABA = A,
- BAB = B, and
- AB and BA are self-adjoint.
This matrix B is called the pseudoinverse of A and is denoted by A⁺. If A is invertible, then A⁺ = A⁻¹.
Theorem 5.6. Let K be a finite-dimensional Hilbert space with a finite spanning set {ξ_α : α ∈ A} with Gram matrix G. Then for η, ζ ∈ K, we have
\[ \langle \eta, \zeta\rangle = \sum_{\alpha,\beta\in A} \langle \eta, \xi_\alpha\rangle\, \overline{\langle \zeta, \xi_\beta\rangle}\, G^+_{\alpha,\beta}. \]
Proof. The RHS can be written as
\[ \sum_{\beta\in A} \left(\sum_{\alpha\in A} \langle \eta, \xi_\alpha\rangle\, G^+_{\alpha,\beta}\right) \langle \xi_\beta, \zeta\rangle \]
so we want to show that, writing η = ∑_{β∈A} y_β ξ_β, we have η = ∑_{β∈A} z_β ξ_β, where z is the vector of values on the RHS above:
\[ z_\beta := \sum_{\alpha\in A} \langle \eta, \xi_\alpha\rangle\, G^+_{\alpha,\beta}, \quad\text{i.e.}\quad z := (\langle \eta, \xi_\alpha\rangle)_{\alpha\in A}\, G^+, \]
so zG = (⟨η, ξ_α⟩)_{α∈A} G⁺G. We have
\[ (\langle \eta, \xi_\alpha\rangle)_{\alpha\in A} = \Big(\big\langle \textstyle\sum_{\beta\in A} y_\beta \xi_\beta,\ \xi_\alpha \big\rangle\Big)_{\alpha\in A} = \Big(\textstyle\sum_{\beta\in A} y_\beta \langle \xi_\beta, \xi_\alpha\rangle\Big)_{\alpha\in A} = yG \]
so
\[ \Big(\big\langle \textstyle\sum_{\alpha\in A} z_\alpha \xi_\alpha,\ \xi_\beta \big\rangle\Big)_{\beta\in A} = \Big(\textstyle\sum_{\alpha\in A} z_\alpha \langle \xi_\alpha, \xi_\beta\rangle\Big)_{\beta\in A} = zG = yGG^+G = yG = (\langle \eta, \xi_\beta\rangle)_{\beta\in A}. \]
Now, to show η = ∑_{α∈A} z_α ξ_α, it is enough to show that for any v ∈ K, we have
\[ \langle \eta, v\rangle = \big\langle \textstyle\sum_{\alpha\in A} z_\alpha \xi_\alpha,\ v \big\rangle. \]
But we know that v = ∑_{β∈A} v_β ξ_β since {ξ_β : β ∈ A} spans K, so
\[ \langle \eta, v\rangle = \sum_{\beta\in A} \overline{v_\beta}\,\langle \eta, \xi_\beta\rangle = \sum_{\beta\in A} \overline{v_\beta}\, \big\langle \textstyle\sum_{\alpha\in A} z_\alpha \xi_\alpha,\ \xi_\beta \big\rangle = \big\langle \textstyle\sum_{\alpha\in A} z_\alpha \xi_\alpha,\ v \big\rangle \]
and we are done. □
Corollary 5.7. Let H be a finite-dimensional Hilbert space with orthonormal basis {e_i : i ∈ I}. Let K be a subspace of H with a spanning set {ξ_α : α ∈ A} and Gram matrix G, and let P be the orthogonal projection of H onto K. Then
\[ \langle Pe_i, e_j\rangle = \sum_{\alpha,\beta\in A} \langle e_i, \xi_\alpha\rangle\, \langle \xi_\beta, e_j\rangle\, G^+_{\alpha,\beta} \]
for i, j ∈ I.
Proof. The theorem says
\begin{align*}
\langle Pe_i, e_j\rangle &= \langle P^2 e_i, e_j\rangle = \langle Pe_i, Pe_j\rangle = \sum_{\alpha,\beta\in A} \langle Pe_i, \xi_\alpha\rangle\, \langle \xi_\beta, Pe_j\rangle\, G^+_{\alpha,\beta} \\
&= \sum_{\alpha,\beta\in A} \langle e_i, P^*\xi_\alpha\rangle\, \langle P^*\xi_\beta, e_j\rangle\, G^+_{\alpha,\beta} = \sum_{\alpha,\beta\in A} \langle e_i, \xi_\alpha\rangle\, \langle \xi_\beta, e_j\rangle\, G^+_{\alpha,\beta}
\end{align*}
as claimed. □
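A minimal numerical sketch of Corollary 5.7 (the setup and names are mine): build a deliberately dependent spanning set, form the Gram matrix, and compare the pseudoinverse formula with the projection obtained from an orthonormal basis.

```python
import numpy as np

# Entries of the projection onto K = span{xi_a} via pinv of the Gram matrix.
rng = np.random.default_rng(1)
H_dim = 5
xi = rng.standard_normal((3, H_dim)) + 1j * rng.standard_normal((3, H_dim))
xi = np.vstack([xi, xi[0] + xi[1]])   # deliberately dependent spanning set

G = xi @ xi.conj().T                  # G_{ab} = <xi_a, xi_b>
Gplus = np.linalg.pinv(G)

# <P e_i, e_j> = sum_{a,b} <e_i, xi_a> <xi_b, e_j> Gplus_{ab}
P = np.einsum('ai,ab,bj->ji', xi.conj(), Gplus, xi)

# Reference: projection from an orthonormal basis of K (first 3 vectors span K).
Q, _ = np.linalg.qr(xi[:3].T)
print(np.allclose(P, Q @ Q.conj().T))
```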
5.4. Exercises.
Exercise 5.8. Prove that if k≠l,
then for any 𝐢,𝐣 : [k] → [N]
and 𝐢′,𝐣′ : [l] → [N],
we have
\[ \int_{U_N} u_{\mathbf{i}(1)\mathbf{j}(1)}\cdots u_{\mathbf{i}(k)\mathbf{j}(k)}\ \overline{u_{\mathbf{i}'(1)\mathbf{j}'(1)}}\cdots \overline{u_{\mathbf{i}'(l)\mathbf{j}'(l)}}\ dU = 0. \]
Hint: this can be done directly using facts that have appeared before this point.
Recall what we did last time: we let P be the operator
\[ \left(\int_{U_N} u_{\mathbf{i}(1)\mathbf{j}(1)}\cdots u_{\mathbf{i}(k)\mathbf{j}(k)}\ \overline{u_{\mathbf{i}'(1)\mathbf{j}'(1)}}\cdots \overline{u_{\mathbf{i}'(k)\mathbf{j}'(k)}}\ dU\right)_{\mathbf{i},\mathbf{i}',\mathbf{j},\mathbf{j}'} \]
on (ℂ^N)^{⊗k} ⊗ (ℂ^N)^{⊗k}, and we showed that P is the orthogonal projection onto the subspace
\[ \mathrm{Fix}_{U_N}(U^{\otimes k}\otimes \overline{U}^{\otimes k}) := \{\xi \in (\mathbb{C}^N)^{\otimes k}\otimes(\mathbb{C}^N)^{\otimes k} : (U^{\otimes k}\otimes \overline{U}^{\otimes k})\,\xi = \xi\ \forall\, U \in U_N\}. \]
If {ξ_α : α ∈ A} is a spanning set of this range, then one can show (by elementary linear algebra) that the entries of P are
\[ \langle P\, e_{\mathbf{j}}\otimes e_{\mathbf{j}'},\ e_{\mathbf{i}}\otimes e_{\mathbf{i}'}\rangle = \sum_{\alpha,\beta\in A} \langle \xi_\alpha,\ e_{\mathbf{i}}\otimes e_{\mathbf{i}'}\rangle\, \langle \xi_\beta,\ e_{\mathbf{j}}\otimes e_{\mathbf{j}'}\rangle\, G^+_{\alpha,\beta} \]
where e_𝐢 := e_{𝐢(1)} ⊗ ⋯ ⊗ e_{𝐢(k)}, G := (⟨ξ_α, ξ_β⟩)_{α,β∈A} is the Gram matrix of the spanning set, and G⁺ is the pseudoinverse of G (or just the inverse if the spanning set is linearly independent).
So at least in principle, the solution to our problem of computing unitary matrix integrals comes down to finding a convenient spanning set for the space of fixed points.
6.1. The spanning set.
Notation 6.1. For α ∈ S_k, let
\[ E_\alpha := \sum_{\mathbf{i}:[k]\to[N]} e_{\mathbf{i}(\alpha(1))}\otimes\cdots\otimes e_{\mathbf{i}(\alpha(k))}\otimes e_{\mathbf{i}(1)}\otimes\cdots\otimes e_{\mathbf{i}(k)}. \]
Theorem 6.2 (Schur-Weyl duality). We have
\[ \mathrm{Fix}_{U_N}(U^{\otimes k}\otimes \overline{U}^{\otimes k}) = \mathrm{span}\{E_\alpha : \alpha \in S_k\}. \]
Remark 6.3. Consider the representations ρ of U_N and π of S_k on (ℂ^N)^{⊗k} given by
\[ \rho(U)\,\xi_1\otimes\cdots\otimes\xi_k = U\xi_1\otimes\cdots\otimes U\xi_k \quad\text{and}\quad \pi(\sigma)\,\xi_1\otimes\cdots\otimes\xi_k = \xi_{\sigma^{-1}(1)}\otimes\cdots\otimes\xi_{\sigma^{-1}(k)}. \]
You can immediately check that they commute. The basic principle of Schur-Weyl duality is that if 𝒜 is the subalgebra generated by the image of ρ and B is the subalgebra generated by the image of π, we actually have 𝒜′ = B and B′ = 𝒜. The theorem above is a consequence of this more fundamental theorem.
Proposition 6.4. For α ∈ S_k and 𝐢, 𝐢′ : [k] → [N], we have
\[ \langle E_\alpha,\ e_{\mathbf{i}}\otimes e_{\mathbf{i}'}\rangle = \begin{cases} 1 & \text{if } \mathbf{i} = \mathbf{i}'\circ\alpha \\ 0 & \text{otherwise.} \end{cases} \]
Proof. We have
\begin{align*}
\langle E_\alpha,\ e_{\mathbf{i}}\otimes e_{\mathbf{i}'}\rangle &= \sum_{\mathbf{j}:[k]\to[N]} \langle e_{\mathbf{j}(\alpha(1))}\otimes\cdots\otimes e_{\mathbf{j}(\alpha(k))}\otimes e_{\mathbf{j}(1)}\otimes\cdots\otimes e_{\mathbf{j}(k)},\ e_{\mathbf{i}(1)}\otimes\cdots\otimes e_{\mathbf{i}(k)}\otimes e_{\mathbf{i}'(1)}\otimes\cdots\otimes e_{\mathbf{i}'(k)}\rangle \\
&= \langle e_{\mathbf{i}'(\alpha(1))}\otimes\cdots\otimes e_{\mathbf{i}'(\alpha(k))},\ e_{\mathbf{i}(1)}\otimes\cdots\otimes e_{\mathbf{i}(k)}\rangle \qquad (\text{the last } k \text{ components force } \mathbf{j} = \mathbf{i}') \\
&= \begin{cases} 1 & \text{if } \mathbf{i} = \mathbf{i}'\circ\alpha \\ 0 & \text{otherwise} \end{cases}
\end{align*}
hence the claim. □
So with W := G⁺, the integration formula now reads as
\[ \int_{U_N} u_{\mathbf{i}(1)\mathbf{j}(1)}\cdots u_{\mathbf{i}(k)\mathbf{j}(k)}\ \overline{u_{\mathbf{i}'(1)\mathbf{j}'(1)}}\cdots \overline{u_{\mathbf{i}'(k)\mathbf{j}'(k)}}\ dU = \sum_{\substack{\alpha,\beta\in S_k \\ \mathbf{i}=\mathbf{i}'\circ\alpha \\ \mathbf{j}=\mathbf{j}'\circ\beta}} W_{\alpha,\beta} \]
and what remains is to understand something about the matrix W.
6.2. Character expansion.
Notation 6.5. Define Gr_{N,k} : S_k → ℂ by
\[ \mathrm{Gr}_{N,k}(\sigma) := N^{\#(\sigma)} \]
for σ ∈ S_k. If no confusion will arise, we may refer to Gr_{N,k} as Gr.
Proposition 6.6. The function Gr_{N,k} appears in the following familiar ways:
- it is the character of the representation π of S_k on (ℂ^N)^{⊗k};
- we have G_{α,β} = Gr_{N,k}(α⁻¹β) for all α, β ∈ S_k.
Proof. For (1), compute
\begin{align*}
\mathrm{Tr}(\pi(\sigma)) &= \sum_{\mathbf{i}:[k]\to[N]} \langle \pi(\sigma)\, e_{\mathbf{i}(1)}\otimes\cdots\otimes e_{\mathbf{i}(k)},\ e_{\mathbf{i}(1)}\otimes\cdots\otimes e_{\mathbf{i}(k)}\rangle \\
&= \sum_{\mathbf{i}:[k]\to[N]} \langle e_{\mathbf{i}(\sigma^{-1}(1))}\otimes\cdots\otimes e_{\mathbf{i}(\sigma^{-1}(k))},\ e_{\mathbf{i}(1)}\otimes\cdots\otimes e_{\mathbf{i}(k)}\rangle \\
&= \sum_{\mathbf{i}:[k]\to[N]} \delta_{\mathbf{i}=\mathbf{i}\circ\sigma^{-1}} = |\{\mathbf{i}:[k]\to[N] : \mathbf{i}=\mathbf{i}\circ\sigma^{-1}\}| = N^{\#(\sigma)}.
\end{align*}
The last line is a straightforward count: 𝐢 = 𝐢 ∘ σ⁻¹ just means 𝐢 is constant on the cycles of σ⁻¹, so we are picking a label from [N] for each cycle, and there are N^{#(σ⁻¹)} = N^{#(σ)} choices.
For (2), for α, β ∈ S_k, we have
\begin{align*}
\langle E_\alpha, E_\beta\rangle &= \sum_{\mathbf{i},\mathbf{j}:[k]\to[N]} \langle e_{\mathbf{i}(\alpha(1))}\otimes\cdots\otimes e_{\mathbf{i}(\alpha(k))}\otimes e_{\mathbf{i}(1)}\otimes\cdots\otimes e_{\mathbf{i}(k)},\ e_{\mathbf{j}(\beta(1))}\otimes\cdots\otimes e_{\mathbf{j}(\beta(k))}\otimes e_{\mathbf{j}(1)}\otimes\cdots\otimes e_{\mathbf{j}(k)}\rangle \\
&= \sum_{\mathbf{i}:[k]\to[N]} \langle e_{\mathbf{i}(\alpha(1))}\otimes\cdots\otimes e_{\mathbf{i}(\alpha(k))},\ e_{\mathbf{i}(\beta(1))}\otimes\cdots\otimes e_{\mathbf{i}(\beta(k))}\rangle \qquad (\text{the last } k \text{ components force } \mathbf{i} = \mathbf{j}) \\
&= \sum_{\mathbf{i}:[k]\to[N]} \langle e_{\mathbf{i}(1)}\otimes\cdots\otimes e_{\mathbf{i}(k)},\ e_{\mathbf{i}(\alpha^{-1}\beta(1))}\otimes\cdots\otimes e_{\mathbf{i}(\alpha^{-1}\beta(k))}\rangle \qquad (\text{replace } \mathbf{i} \text{ with } \mathbf{i}\circ\alpha^{-1}) \\
&= \sum_{\mathbf{i}:[k]\to[N]} \delta_{\mathbf{i}=\mathbf{i}\circ(\alpha^{-1}\beta)} = N^{\#(\alpha^{-1}\beta)}
\end{align*}
so ⟨E_α, E_β⟩ = Gr_{N,k}(α⁻¹β). □
We have a function GrN,k
on Sk, and it is clearly a
class function – GrN,k(σ) only
depends on σ through
#(σ), which only depends on
the conjugacy class of σ.
Equivalently, this can be seen as a central element of
ℂ[Sk]:
Lemma 6.7. Let f : S_k → ℂ be a function. The element ∑_{σ∈S_k} f(σ)σ of ℂ[S_k] is central if and only if f is a class function.
Proof. First, suppose that f is a class function. Then for any τ ∈ S_k, we have
\[ \sum_{\sigma\in S_k} f(\sigma)\,\sigma\tau = \sum_{\sigma\in S_k} f(\tau\sigma\tau^{-1})\,\tau\sigma\tau^{-1}\tau = \sum_{\sigma\in S_k} f(\sigma)\,\tau\sigma = \tau \sum_{\sigma\in S_k} f(\sigma)\,\sigma \]
and passing to linear combinations of τs, we see that ∑_{σ∈S_k} f(σ)σ is central.
Conversely, if ∑_{σ∈S_k} f(σ)σ is central, then for any τ ∈ S_k, we have
\[ \sum_{\sigma\in S_k} f(\tau\sigma\tau^{-1})\,\sigma = \sum_{\sigma\in S_k} f(\sigma)\,\tau^{-1}\sigma\tau = \tau^{-1}\Big(\sum_{\sigma\in S_k} f(\sigma)\,\sigma\Big)\tau = \tau^{-1}\tau \sum_{\sigma\in S_k} f(\sigma)\,\sigma = \sum_{\sigma\in S_k} f(\sigma)\,\sigma \]
so for all σ ∈ S_k, we have f(τστ⁻¹) = f(σ). □
This means, in particular, that we have a central element of ℂ[S_k] corresponding to Gr_{N,k}:
\[ \widetilde{\mathrm{Gr}}_{N,k} := \sum_{\sigma\in S_k} \mathrm{Gr}_{N,k}(\sigma)\,\sigma. \]
To describe the inverse, we need some tools from algebra. From this point forward, we will assume N ≥ k for simplicity.
Notation 6.8. Let 𝕐k
be the set of Young diagrams with k
boxes: these are grids of boxes, justified to the left, where the length of rows is
non-increasing. (These are in bijection with partitions of k.)
The length of λ ∈ 𝕐k,
denoted by ℓ(λ),
is the number of rows in λ.
For λ ∈ 𝕐k,
a tableau is a labeling of the boxes by elements of {1,…,k},
and a tableau is said to be standard if the rows and columns are strictly increasing.
The number of standard tableaux of shape λ
will be denoted by dim(λ).
Theorem 6.9 (Young, Frobenius, etc?). There is a family
{χλ : λ ∈ 𝕐k} of functions
on Sk
with the following properties:
- χλ
is a class function, in the sense that it is constant on conjugacy classes;
- every class function on Sk
is a linear combination of {χλ : λ ∈ 𝕐k};
- ⟨χλ,χμ⟩ = δλ=μ.
Let π be the representation of S_k on (ℂ^N)^{⊗k} defined by
\[ \pi(\sigma)\,\xi_1\otimes\cdots\otimes\xi_k = \xi_{\sigma^{-1}(1)}\otimes\cdots\otimes\xi_{\sigma^{-1}(k)}. \]
Then
\[ \mathrm{Tr}(\pi(\sigma)) = \sum_{\lambda\vdash k} \mathsf{s}_\lambda(1^N)\,\chi_\lambda(\sigma) \]
for σ ∈ S_k, where
\[ \mathsf{s}_\lambda(1^N) = \frac{\dim(\lambda)}{k!} \prod_{(i,j)\in\lambda} (N + j - i). \]
Notation 6.10. Define Wg_{N,k} : S_k → ℂ by
\[ \mathrm{Wg}_{N,k}(\sigma) = \frac{1}{k!^2} \sum_{\lambda\vdash k} \frac{\dim(\lambda)^2}{\mathsf{s}_\lambda(1^N)}\, \chi_\lambda(\sigma) \]
for σ ∈ S_k. If no confusion will arise, we may refer to Wg_{N,k} by Wg. Note that for this definition to make sense, we are using the assumption N ≥ k – otherwise, 𝗌_λ(1^N) might be 0.
Theorem 6.11. Let \widetilde{\mathrm{Wg}} := \sum_{\sigma\in S_k} \mathrm{Wg}(\sigma)\,\sigma and assume N ≥ k. Then we have \widetilde{\mathrm{Wg}}\,\widetilde{\mathrm{Gr}} = \widetilde{\mathrm{Gr}}\,\widetilde{\mathrm{Wg}} = 1.
Proof. We have
\begin{align*}
\widetilde{\mathrm{Wg}}\,\widetilde{\mathrm{Gr}} &= \frac{1}{k!^2} \sum_{\lambda,\mu\in\mathbb{Y}_k} \frac{\dim(\lambda)^2\,\mathsf{s}_\mu(1^N)}{\mathsf{s}_\lambda(1^N)} \Big(\sum_{\alpha\in S_k}\chi_\lambda(\alpha)\,\alpha\Big)\Big(\sum_{\beta\in S_k}\chi_\mu(\beta)\,\beta\Big) \\
&= \sum_{\lambda,\mu\in\mathbb{Y}_k} \frac{\dim(\lambda)\,\mathsf{s}_\mu(1^N)}{\dim(\mu)\,\mathsf{s}_\lambda(1^N)}\, C_\lambda C_\mu
\end{align*}
where C_λ := \frac{\dim(\lambda)}{k!} \sum_{\alpha\in S_k} \chi_\lambda(\alpha)\,\alpha, and similarly for μ. It is an exercise in representation theory of finite groups to check that
\[ C_\lambda C_\mu = \begin{cases} C_\lambda & \text{if } \lambda = \mu \\ 0 & \text{otherwise} \end{cases} \qquad\text{and}\qquad \sum_{\lambda\in\mathbb{Y}_k} C_\lambda = 1, \]
so \widetilde{\mathrm{Wg}}\,\widetilde{\mathrm{Gr}} = 1. Similarly \widetilde{\mathrm{Gr}}\,\widetilde{\mathrm{Wg}} = 1. □
Remark 6.12. The assumption N ≥ k is not really necessary: using the decomposition of ℂ[S_k] into matrix blocks coming from Maschke’s theorem, we can define
\[ \mathbb{C}_N[S_k] := \bigoplus_{\substack{\lambda\in\mathbb{Y}_k \\ \ell(\lambda)\le N}} \mathrm{End}(V^\lambda) \]
and invert Gr_{N,k} in there; this is done in [7]. We only assumed N ≥ k for simplicity, and because for our purposes this will always be the case.
Corollary 6.13. Continue to assume N ≥ k. We have
\[ W_{\alpha,\beta} = \frac{1}{k!^2} \sum_{\lambda\in\mathbb{Y}_k} \frac{\dim(\lambda)^2}{\mathsf{s}_\lambda(1^N)}\, \chi_\lambda(\alpha^{-1}\beta) = \mathrm{Wg}_{N,k}(\alpha^{-1}\beta) \]
for α, β ∈ S_k.
Proof. Let \widetilde{W}_{\alpha,\beta} := \mathrm{Wg}_{N,k}(\alpha^{-1}\beta). We will show \widetilde{W} = W by checking that \widetilde{W}G = G\widetilde{W} = I. From the theorem, which says that
\[ 1 = \sum_{\alpha,\beta\in S_k} \mathrm{Wg}(\alpha)\,\mathrm{Gr}(\beta)\,\alpha\beta = \sum_{\sigma\in S_k} \sigma \sum_{\alpha\in S_k} \mathrm{Wg}(\alpha)\,\mathrm{Gr}(\alpha^{-1}\sigma), \]
we know that
\[ \sum_{\alpha\in S_k} \mathrm{Wg}(\alpha)\,\mathrm{Gr}(\alpha^{-1}\sigma) = \begin{cases} 1 & \text{if } \sigma = e \\ 0 & \text{otherwise.} \end{cases} \]
So
\begin{align*}
[\widetilde{W}G]_{\alpha,\beta} &= \sum_{\gamma\in S_k} \widetilde{W}_{\alpha,\gamma}\, G_{\gamma,\beta} = \sum_{\gamma\in S_k} \mathrm{Wg}(\alpha^{-1}\gamma)\,\mathrm{Gr}(\gamma^{-1}\beta) \\
&= \sum_{\gamma\in S_k} \mathrm{Wg}(\alpha^{-1}\alpha\gamma)\,\mathrm{Gr}((\alpha\gamma)^{-1}\beta) = \sum_{\gamma\in S_k} \mathrm{Wg}(\gamma)\,\mathrm{Gr}(\gamma^{-1}\alpha^{-1}\beta) \\
&= \begin{cases} 1 & \text{if } \alpha^{-1}\beta = e \\ 0 & \text{otherwise} \end{cases}
\end{align*}
and so [\widetilde{W}G]_{\alpha,\beta} = \delta_{\alpha=\beta}. Similarly we have [G\widetilde{W}]_{\alpha,\beta} = \delta_{\alpha=\beta}, so \widetilde{W} = W. □
6.3. Exercises.
Exercise 6.14. Let {Eα : α ∈ Sk}
be the spanning set described above.
- Prove that {Eα : α ∈ Sk}
is linearly independent if and only if N ≥ k.
- Let {ξ_α : α ∈ A} be a family of vectors in a Hilbert space H. Prove that if {ξ_α : α ∈ A} is linearly independent, then the Gram matrix G := (⟨ξ_α, ξ_β⟩)_{α,β∈A} is invertible.
Exercise 6.15 ([8, Problem 4.5.2]). Let
G be a finite group and
let {χi : i ∈ I} be the irreducible
characters of G.
For i ∈ I, let
ρi be the irreducible
representation giving rise to χi,
and let
\[ C_i := \frac{\dim(V_i)}{|G|} \sum_{g\in G} \chi_i(g)\, g^{-1} \in \mathbb{C}[G]. \]
Recall that ρ_i extends by linearity to an algebra homomorphism ℂ[G] → End(V_i).
- Prove that
\[ \rho_j(C_i) = \begin{cases} \mathrm{Id} & \text{if } j = i \\ 0 & \text{otherwise} \end{cases} \]
for i, j ∈ I.
- Prove that
\[ C_i C_j = \begin{cases} C_i & \text{if } i = j \\ 0 & \text{otherwise} \end{cases} \]
for i, j ∈ I.
- Prove that ∑_{i∈I} C_i = 1 in ℂ[G].
Hint: the important keywords are central, Schur’s lemma, Maschke’s theorem, and
orthogonality relations.
7. Randomly rotated matrices
In this section we will show that conjugating by a random unitary matrix produces
asymptotic freeness. This section is heavily based on [13, Lecture 23].
First, let us fix our terminology:
Definition 7.1. For N ≥ 1, let (𝒜_N, φ_N) be a ∗-probability space and let {a_i^{(N)} : i ∈ I} be a family of self-adjoint elements of 𝒜_N. We say that this family is asymptotically free if for all k ≥ 1, for all i₁,…,i_k ∈ I with i₁ ≠ i₂, i₂ ≠ i₃, …, i_{k−1} ≠ i_k, and for all polynomials P₁,…,P_k with φ_N(P_j(a_{i_j}^{(N)})) → 0 for 1 ≤ j ≤ k, we have
\[ \lim_{N\to\infty} \varphi_N\big(P_1(a_{i_1}^{(N)})\cdots P_k(a_{i_k}^{(N)})\big) = 0. \]
Theorem 7.2 (Voiculescu). For N ≥ 1, let A_N and B_N be self-adjoint N × N matrices, let U_N be a Haar-distributed random N × N unitary matrix, and suppose that for all m ≥ 1, the sequences tr_N(A_N^m) and tr_N(B_N^m) converge as N → ∞. Then U_N A_N U_N^* and B_N are asymptotically free.
Here is what we need to show: if P₁,…,P_k and Q₁,…,Q_k are polynomials with
\[ \mathbb{E}\,\mathrm{tr}_N(P_j(UAU^*)) \to 0 \quad\text{and}\quad \mathbb{E}\,\mathrm{tr}_N(Q_j(B)) \to 0 \]
then
\[ \mathbb{E}\,\mathrm{tr}_N\big(P_1(UAU^*)Q_1(B)\cdots P_k(UAU^*)Q_k(B)\big) \to 0. \]
Strictly speaking, there is another case to be handled where the Q_k(B) is omitted; this can be done in the same way as the case mentioned above.
We can simplify this by bringing out the unitaries from inside the polynomials: we want to show that if P₁,…,P_k and Q₁,…,Q_k are polynomials with
\[ \mathrm{tr}_N(P_j(A)) \to 0 \quad\text{and}\quad \mathrm{tr}_N(Q_j(B)) \to 0 \]
then
\[ \mathbb{E}\,\mathrm{tr}_N\big(UP_1(A)U^*Q_1(B)\cdots UP_k(A)U^*Q_k(B)\big) \to 0. \]
So what we really need to do here is compute things like
\[ \mathbb{E}\,\mathrm{tr}_N(UA_1U^*B_1\cdots UA_kU^*B_k), \]
where A₁,…,A_k and B₁,…,B_k are tuples of matrices.
7.1. Exact trace formula.
Notation 7.3. Let (𝒜, φ) be a tracial ∗-probability space. For a cycle c = (i₁,…,i_p), write
\[ \varphi_c(a_1,\dots,a_n) := \varphi(a_{i_1}\cdots a_{i_p}). \]
For α ∈ S_n with disjoint cycle decomposition α = c₁⋯c_l, write
\[ \varphi_\alpha(a_1,\dots,a_n) := \varphi_{c_1}(a_1,\dots,a_n)\cdots\varphi_{c_l}(a_1,\dots,a_n). \]
Lemma 7.4. For N × N matrices A₁,…,A_k with A_r = (a_{ij}^{(r)})_{1≤i,j≤N} for 1 ≤ r ≤ k, we have
\[ \mathrm{tr}_\alpha(A_1,\dots,A_k) = \frac{1}{N^{\#(\alpha)}} \sum_{\mathbf{i}:[k]\to[N]} a^{(1)}_{\mathbf{i}(1)\mathbf{i}(\alpha(1))}\cdots a^{(k)}_{\mathbf{i}(k)\mathbf{i}(\alpha(k))}. \]
Proof. Direct verification from the definition of the trace. □
Proposition 7.5. Let A₁,…,A_k and B₁,…,B_k be N × N matrices, and let γ = γ_k = (1,…,k). Then
\[ \mathbb{E}\,\mathrm{tr}_N(UA_1U^*B_1\cdots UA_kU^*B_k) = \sum_{\alpha,\beta\in S_k} \mathrm{Wg}_{N,k}(\alpha^{-1}\beta)\, N^{\#(\alpha)+\#(\beta^{-1}\gamma)-1}\, \mathrm{tr}_\alpha(A_1,\dots,A_k)\, \mathrm{tr}_{\beta^{-1}\gamma}(B_1,\dots,B_k). \]
Proof. We have
\begin{align*}
&\mathbb{E}\,\mathrm{tr}_N(UA_1U^*B_1\cdots UA_kU^*B_k) \\
&\quad= \frac{1}{N} \sum_{\mathbf{i},\mathbf{i}',\mathbf{j},\mathbf{j}':[k]\to[N]} \mathbb{E}\Big( u_{\mathbf{i}(1)\mathbf{j}(1)}\, a^{(1)}_{\mathbf{j}(1)\mathbf{j}'(1)}\, \overline{u_{\mathbf{i}'(1)\mathbf{j}'(1)}}\, b^{(1)}_{\mathbf{i}'(1)\mathbf{i}(2)} \cdots u_{\mathbf{i}(k)\mathbf{j}(k)}\, a^{(k)}_{\mathbf{j}(k)\mathbf{j}'(k)}\, \overline{u_{\mathbf{i}'(k)\mathbf{j}'(k)}}\, b^{(k)}_{\mathbf{i}'(k)\mathbf{i}(1)} \Big) \\
&\quad= \frac{1}{N} \sum_{\mathbf{i},\mathbf{i}',\mathbf{j},\mathbf{j}'} a^{(1)}_{\mathbf{j}(1)\mathbf{j}'(1)}\cdots a^{(k)}_{\mathbf{j}(k)\mathbf{j}'(k)}\, b^{(1)}_{\mathbf{i}'(1)\mathbf{i}(2)}\cdots b^{(k)}_{\mathbf{i}'(k)\mathbf{i}(1)}\ \mathbb{E}\big( u_{\mathbf{i}(1)\mathbf{j}(1)}\cdots u_{\mathbf{i}(k)\mathbf{j}(k)}\ \overline{u_{\mathbf{i}'(1)\mathbf{j}'(1)}}\cdots \overline{u_{\mathbf{i}'(k)\mathbf{j}'(k)}} \big) \\
&\quad= \frac{1}{N} \sum_{\mathbf{i},\mathbf{i}',\mathbf{j},\mathbf{j}'} a^{(1)}_{\mathbf{j}(1)\mathbf{j}'(1)}\cdots b^{(k)}_{\mathbf{i}'(k)\mathbf{i}(1)} \sum_{\alpha,\beta\in S_k} \delta_{\mathbf{i}=\mathbf{i}'\circ\alpha}\, \delta_{\mathbf{j}=\mathbf{j}'\circ\beta}\, \mathrm{Wg}_{N,k}(\alpha^{-1}\beta) \\
&\quad= \frac{1}{N} \sum_{\alpha,\beta\in S_k} \mathrm{Wg}_{N,k}(\alpha^{-1}\beta) \sum_{\mathbf{i}',\mathbf{j}':[k]\to[N]} a^{(1)}_{\mathbf{j}'(\beta(1))\mathbf{j}'(1)}\cdots a^{(k)}_{\mathbf{j}'(\beta(k))\mathbf{j}'(k)}\, b^{(1)}_{\mathbf{i}'(1)\mathbf{i}'(\alpha(2))}\cdots b^{(k)}_{\mathbf{i}'(k)\mathbf{i}'(\alpha(1))}.
\end{align*}
We can relabel things to make this formula easier to work with: swap α with β, 𝐣′ with 𝐢′, and drop the primes. Also, move the permutation in the a's to the column index using the inverse. Then the above becomes
\begin{align*}
&\frac{1}{N} \sum_{\alpha,\beta\in S_k} \mathrm{Wg}_{N,k}(\alpha^{-1}\beta) \sum_{\mathbf{i},\mathbf{j}:[k]\to[N]} a^{(1)}_{\mathbf{i}(1)\mathbf{i}(\alpha(1))}\cdots a^{(k)}_{\mathbf{i}(k)\mathbf{i}(\alpha(k))}\, b^{(1)}_{\mathbf{j}(1)\mathbf{j}(\beta^{-1}\gamma(1))}\cdots b^{(k)}_{\mathbf{j}(k)\mathbf{j}(\beta^{-1}\gamma(k))} \\
&\quad= \frac{1}{N} \sum_{\alpha,\beta\in S_k} \mathrm{Wg}_{N,k}(\alpha^{-1}\beta)\, \mathrm{Tr}_\alpha(A_1,\dots,A_k)\, \mathrm{Tr}_{\beta^{-1}\gamma}(B_1,\dots,B_k) \\
&\quad= \sum_{\alpha,\beta\in S_k} \mathrm{Wg}_{N,k}(\alpha^{-1}\beta)\, N^{\#(\alpha)+\#(\beta^{-1}\gamma)-1}\, \mathrm{tr}_\alpha(A_1,\dots,A_k)\, \mathrm{tr}_{\beta^{-1}\gamma}(B_1,\dots,B_k)
\end{align*}
as claimed. □
In order to take the limit N → ∞, we need to understand the asymptotics of Wg_{N,k} as a function of N.
7.2. Asymptotics of Weingarten function.
Again, assume N ≥ k. Recall our formula for Wg_{N,k}:
\[ \mathrm{Wg}_{N,k}(\sigma) = \frac{1}{k!^2} \sum_{\lambda\in\mathbb{Y}_k} \frac{\dim(\lambda)^2}{\mathsf{s}_\lambda(1^N)}\, \chi_\lambda(\sigma). \]
The only part which involves N is
\[ \frac{1}{\mathsf{s}_\lambda(1^N)} = \frac{k!}{\dim(\lambda)} \prod_{(i,j)\in\lambda} \frac{1}{N+j-i} \]
and the highest power of N in the denominator is N^k. So we can expand as a Laurent series in N:
\[ \mathrm{Wg}_{N,k}(\sigma) = \sum_{n=0}^\infty a_n(\sigma)\, N^{-k-n}. \]
There is a nice formula for the coefficients.
Theorem 7.6 ([5, Theorem 2.2]). Let a_n(σ) be the Laurent coefficients above. Then
- with n = 0, we have
\[ a_0(\sigma) = \begin{cases} 1 & \text{if } \sigma = e \\ 0 & \text{otherwise;} \end{cases} \]
- for n > 0,
\[ a_n(\sigma) = \sum_{m=1}^n (-1)^m\, b_{m,n}(\sigma) \]
where b_{m,n}(σ) is the number of tuples (σ₁,…,σ_m) with
- σ_j ≠ e,
- σσ₁⋯σ_m = e, and
- ∑_{j=1}^m |σ_j| = n.
Corollary 7.7. For all σ ∈ S_k, we have Wg_{N,k}(σ) = O(N^{−k−|σ|}).
Proof. The claim here is that a_n(σ) = 0 for all n < |σ|. Suppose n < |σ|, let 1 ≤ m ≤ n, and take σ₁,…,σ_m ∈ S_k with ∑_{j=1}^m |σ_j| = n. By the triangle inequality, we have
\[ |\sigma| = |\sigma\sigma_1\cdots\sigma_m\,\sigma_m^{-1}\cdots\sigma_1^{-1}| \le |\sigma\sigma_1\cdots\sigma_m| + \sum_{j=1}^m |\sigma_j| \]
so |σσ₁⋯σ_m| ≥ |σ| − ∑_{j=1}^m |σ_j| > 0. This means σσ₁⋯σ_m ≠ e, so b_{m,n}(σ) = 0 for all m and a_n(σ) = 0. □
Notation 7.8. Let μ(σ) be the leading coefficient of Wg_{N,k}(σ), i.e. μ(σ) := a_{|σ|}(σ). One can show that
\[ \mu(\sigma) = \prod_{c\in\mathrm{Cyc}(\sigma)} (-1)^{|c|-1}\, \mathrm{Cat}(|c|-1) \]
and
\[ N^{k+|\sigma|}\, \mathrm{Wg}_{N,k}(\sigma) = \mu(\sigma) + O(N^{-2}). \]
This level of detail will not be necessary for our purposes.
7.3. Limit of trace formula.
Recall the trace formula:
\begin{align*}
\mathbb{E}\,\mathrm{tr}_N(UA_1U^*B_1\cdots UA_kU^*B_k) &= \sum_{\alpha,\beta\in S_k} \mathrm{Wg}_{N,k}(\alpha^{-1}\beta)\, N^{\#(\alpha)+\#(\beta^{-1}\gamma)-1}\, \mathrm{tr}_\alpha(A_1,\dots,A_k)\, \mathrm{tr}_{\beta^{-1}\gamma}(B_1,\dots,B_k) \\
&= \sum_{\alpha,\beta\in S_k} \mathrm{Wg}_{N,k}(\alpha^{-1}\beta)\, N^{2k-1-|\alpha|-|\beta^{-1}\gamma|}\, \mathrm{tr}_\alpha(A_1,\dots,A_k)\, \mathrm{tr}_{\beta^{-1}\gamma}(B_1,\dots,B_k).
\end{align*}
We have
\[ \mathrm{Wg}_{N,k}(\alpha^{-1}\beta) = \mu(\alpha^{-1}\beta)\, N^{-k-|\alpha^{-1}\beta|} + O(N^{-k-|\alpha^{-1}\beta|-1}) \]
so the summand indexed by (α, β) has leading order
\[ N^{\,k-1-(|\alpha|+|\alpha^{-1}\beta|+|\beta^{-1}\gamma|)}. \]
Lemma 7.10. We have
\[ |\alpha| + |\alpha^{-1}\beta| + |\beta^{-1}\gamma| \ge k - 1. \]
If we have equality, then either α or β⁻¹γ has a fixed point.
Proof. The first claim is just the triangle inequality:
\[ k - 1 = |\gamma| = |\alpha\,\alpha^{-1}\beta\,\beta^{-1}\gamma| \le |\alpha| + |\alpha^{-1}\beta| + |\beta^{-1}\gamma|. \]
For the second claim, notice that at least one of
\[ |\alpha| \le \frac{k-1}{2} \quad\text{or}\quad |\beta^{-1}\gamma| \le \frac{k-1}{2} \]
is true. In the first case, α can be written as a product of ≤ (k−1)/2 transpositions, so it can only move up to k − 1 points, i.e. it has at least one fixed point. Similarly, in the second case, β⁻¹γ can only move up to k − 1 points, so it has at least one fixed point. □
Recall the original goal: we need to show that for all P₁,…,P_k and Q₁,…,Q_k with tr_N(P_j(A)) → 0 and tr_N(Q_j(B)) → 0, we have
\[ \mathbb{E}\,\mathrm{tr}_N\big(UP_1(A)U^*Q_1(B)\cdots UP_k(A)U^*Q_k(B)\big) \to 0. \]
Using the lemma and formula above, we have
\begin{align*}
&\lim_{N\to\infty} \mathbb{E}\,\mathrm{tr}_N\big(UP_1(A)U^*Q_1(B)\cdots UP_k(A)U^*Q_k(B)\big) \\
&\quad= \sum_{\substack{\alpha,\beta\in S_k \\ |\alpha|+|\alpha^{-1}\beta|+|\beta^{-1}\gamma|=k-1}} \mu(\alpha^{-1}\beta) \Big(\lim_{N\to\infty} \mathrm{tr}_\alpha(P_1(A),\dots,P_k(A))\Big) \Big(\lim_{N\to\infty} \mathrm{tr}_{\beta^{-1}\gamma}(Q_1(B),\dots,Q_k(B))\Big)
\end{align*}
and according to the lemma, for each summand, either α or β⁻¹γ has a fixed point. This means we have a factor of lim_{N→∞} tr_N(P_j(A)) or lim_{N→∞} tr_N(Q_j(B)). This makes each summand 0.
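A minimal numerical illustration of Theorem 7.2 (the matrix choices, sample counts, and Haar sampler are mine): take A and B deterministic, traceless, and self-adjoint; freeness predicts the alternating centered trace tr_N(UAU* B UAU* B) tends to 0 as N grows.

```python
import numpy as np

def haar_unitary(N, rng):
    Z = (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))) / np.sqrt(2)
    Q, R = np.linalg.qr(Z)
    return Q * (np.diagonal(R) / np.abs(np.diagonal(R)))

rng = np.random.default_rng(1)
for N in (10, 40, 160):
    A = np.diag(np.concatenate([np.ones(N // 2), -np.ones(N // 2)]))
    B = np.diag(np.linspace(-1.0, 1.0, N))   # traceless by symmetry
    vals = []
    for _ in range(200):
        U = haar_unitary(N, rng)
        X = U @ A @ U.conj().T
        vals.append((np.trace(X @ B @ X @ B) / N).real)
    print(N, np.mean(vals))   # should shrink toward 0
```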
8. Free additive convolution
Here is a very concrete problem: let A be the 2N × 2N diagonal matrix with half its entries 1 and the other half −1. Rotate A randomly by conjugating with a Haar unitary, and keep another copy of A. These are two matrices with the same eigenvalues, but in generic/random position in relation to each other. What does A + UAU* look like?
The distribution is complicated in an exact sense, but in our context it is natural to ask this
question when N →∞.
This question can be answered using asymptotic freeness and free probability. This
section draws from [12, Chapter 3] and [13, Lecture 12], and is generally rather sketchy
about technical details. Rigorous proofs of everything in this section can be found in one
of the two aforementioned references.
Definition 8.1. Let μ, ν ∈ M_c(ℝ) and let (𝒜, φ) be a ∗-probability space with freely independent self-adjoint elements a, b ∈ 𝒜 such that
\[ \varphi(a^m) = \int_{\mathbb{R}} x^m\,d\mu(x) \quad\text{and}\quad \varphi(b^m) = \int_{\mathbb{R}} x^m\,d\nu(x) \]
for all m ≥ 1. Let μ ⊞ ν be the distribution of a + b.
This construction can always be made using free products (see [13]). It produces a well-defined operation on probability measures by a basic property of freeness: the distribution of a + b only depends on the individual distributions of a and b.
According to asymptotic freeness, the limit of A + UAU* should be described by μ ⊞ μ where μ = ½(δ₁ + δ₋₁). My goal for today is to compute μ ⊞ μ.
8.1. Free cumulants.
Notation 8.2. Let 𝒜 be an algebra and let (φ_n)_{n≥1} be a sequence of multilinear functionals φ_n : 𝒜ⁿ → ℂ. For a set V ⊆ [n], say V = {i₁,…,i_p} with i₁ < ⋯ < i_p, write
\[ \varphi_V(a_1,\dots,a_n) = \varphi_p(a_{i_1},\dots,a_{i_p}). \]
For π ∈ P(n), write
\[ \varphi_\pi(a_1,\dots,a_n) := \prod_{V\in\pi} \varphi_V(a_1,\dots,a_n). \]
For example, if π = {{1,3,5},{2,4},{6}}, then
\[ \varphi_\pi(a_1,a_2,a_3,a_4,a_5,a_6) = \varphi_3(a_1,a_3,a_5)\,\varphi_2(a_2,a_4)\,\varphi_1(a_6). \]
Definition 8.3. Let (𝒜, φ) be a ∗-probability space. The free cumulants are the multilinear functionals κ_n : 𝒜ⁿ → ℂ defined implicitly by
\[ \varphi(a_1\cdots a_n) = \sum_{\pi\in NC(n)} \kappa_\pi(a_1,\dots,a_n). \]
Example 8.4. With n = 1, the moment-cumulant formula says φ(a₁) = κ₁(a₁). With n = 2, there are two noncrossing partitions, so
\[ \varphi(a_1 a_2) = \kappa_2(a_1,a_2) + \kappa_1(a_1)\kappa_1(a_2) = \kappa_2(a_1,a_2) + \varphi(a_1)\varphi(a_2), \]
hence κ₂(a₁,a₂) = φ(a₁a₂) − φ(a₁)φ(a₂). With n = 3, there are 5 noncrossing partitions:
\begin{align*}
\varphi(a_1a_2a_3) &= \kappa_3(a_1,a_2,a_3) + \kappa_2(a_1,a_2)\kappa_1(a_3) + \kappa_2(a_1,a_3)\kappa_1(a_2) \\
&\qquad + \kappa_1(a_1)\kappa_2(a_2,a_3) + \kappa_1(a_1)\kappa_1(a_2)\kappa_1(a_3)
\end{align*}
and the only term we don’t already know is κ₃(a₁,a₂,a₃). This pattern continues: at each level, κ_n(a₁,…,a_n) appears exactly once (for the partition with a single block), alongside products of lower-order cumulants, so the formula
\[ \varphi(a_1\cdots a_n) = \sum_{\pi\in NC(n)} \kappa_\pi(a_1,\dots,a_n) \]
implicitly determines κ_n(a₁,…,a_n).
Notation 8.6. If we have a single variable a, we will abbreviate κ_n(a) := κ_n(a,…,a) and m_n := φ(aⁿ). This means that for a single variable, we have a new sequence (κ_n)_{n≥1} which is related to the moment sequence by m_n = ∑_{π∈NC(n)} κ_π.
Theorem 8.7. Let (𝒜,φ)
be a ∗-probability
space and let 𝒜1,…,𝒜r
be ∗-subalgebras
of 𝒜.
Then 𝒜1,…,𝒜r
are freely independent if and only if for any 1 ≤ i1,…,in ≤ r
with ij≠ik
for some 1 ≤ j,k ≤ r,
for any choice of aj ∈𝒜ij
for each 1 ≤ j ≤ n,
we have κn(a1,…,an) = 0.
Proof. See [13, Lecture 11]. □
Corollary 8.8. If a
and b
are free, then κn(a + b) = κn(a) + κn(b).
Proof. Suppose that a and b are free. By multilinearity,
\[ \kappa_n(a+b) = \kappa_n(a+b,\dots,a+b) = \sum_{c_1,\dots,c_n\in\{a,b\}} \kappa_n(c_1,\dots,c_n), \]
and by Theorem 8.7 the mixed terms vanish, so the above reads as
\[ \kappa_n(a+b) = \kappa_n(a) + \kappa_n(b) \]
as claimed. □
Example 8.9. If a is standard semicircular, the moment-cumulant formula says
\[ \sum_{\pi\in NC(n)} \kappa_\pi = \begin{cases} \mathrm{Cat}(m) & \text{if } n = 2m \\ 0 & \text{otherwise.} \end{cases} \]
This relation is satisfied by
\[ \kappa_n = \begin{cases} 1 & \text{if } n = 2 \\ 0 & \text{otherwise.} \end{cases} \]
Then, if a and b are free standard semicirculars, we have
\[ \kappa_n(a+b) = \kappa_n(a) + \kappa_n(b) = \begin{cases} 2 & \text{if } n = 2 \\ 0 & \text{otherwise} \end{cases} \]
so
\[ m_{2k}(a+b) = \sum_{\pi\in NC(2k)} \kappa_\pi(a+b) = \sum_{\pi\in NC_2(2k)} 2^k = 2^k\,\mathrm{Cat}(k), \]
which are the moments of the semicircle measure with variance 2. In general, the semicircle measure with variance t is given by the density
\[ d\mu_t(x) = \begin{cases} \frac{1}{2\pi t}\sqrt{4t-x^2}\,dx & \text{if } x \in [-2\sqrt{t}, 2\sqrt{t}] \\ 0 & \text{otherwise.} \end{cases} \]
The observation above can be generalized as follows: we have a semigroup (μ_t)_{t>0} with respect to free additive convolution, in the sense that μ_s ⊞ μ_t = μ_{s+t}. This is a fundamental property of semicircular variables. In particular, if A and B are independent GUEs, then the asymptotic distribution of A + B is still semicircular, just with a bigger radius.
8.2. Cauchy transform and Stieltjes inversion.
Definition 8.11. For μ ∈ M(ℝ), the Cauchy transform of μ is the function G_μ defined by
\[ G_\mu(z) := \int_{\mathbb{R}} \frac{1}{z-t}\,d\mu(t) \]
for z ∈ ℍ⁺.
Proposition 8.12. Let μ ∈ M_c(ℝ).
- G_μ is a holomorphic map ℍ⁺ → ℍ⁻.
- With r := sup_{t∈supp(μ)} |t|, we have
\[ G_\mu(z) = \sum_{m=0}^\infty \frac{\int_{\mathbb{R}} t^m\,d\mu(t)}{z^{m+1}} \]
for |z| > r.
- We have iy·G_μ(iy) → 1 as y → ∞.
Theorem 8.13 (Stieltjes inversion). For μ ∈ M(ℝ), we have
\[ \mu((a,b)) + \tfrac{1}{2}\mu(\{a,b\}) = -\lim_{\varepsilon\to 0^+} \frac{1}{\pi}\int_a^b \mathrm{Im}\,G_\mu(t+i\varepsilon)\,dt \]
for a < b. If μ, ν ∈ M(ℝ) have G_μ = G_ν, then μ = ν.
Example 8.14 (Semicircle distribution). Recall that the standard semicircle distribution has the density
\[ d\mu(t) = \begin{cases} \frac{1}{2\pi}\sqrt{4-t^2}\,dt & \text{if } t \in [-2,2] \\ 0 & \text{otherwise.} \end{cases} \]
The moments are
\[ \int_{\mathbb{R}} t^m\,d\mu(t) = \begin{cases} \mathrm{Cat}(k) & \text{if } m = 2k \\ 0 & \text{otherwise} \end{cases} \]
where Cat(k) := \frac{1}{k+1}\binom{2k}{k} are the Catalan numbers. The Catalan numbers satisfy the following recursion: Cat(0) = 1 and
\[ \mathrm{Cat}(k) = \sum_{r=1}^k \mathrm{Cat}(r-1)\,\mathrm{Cat}(k-r) \]
for k ≥ 1.
We can use this to compute the Cauchy transform:
\begin{align*}
G(z) &= \frac{1}{z} + \sum_{k=1}^\infty \mathrm{Cat}(k)\,\frac{1}{z^{2k+1}} = \frac{1}{z} + \sum_{k=1}^\infty \Big(\sum_{r=1}^k \mathrm{Cat}(r-1)\,\mathrm{Cat}(k-r)\Big)\frac{1}{z^{2k+1}} \\
&= \frac{1}{z} + \frac{1}{z}\sum_{k=1}^\infty\sum_{r=1}^k \frac{\mathrm{Cat}(r-1)}{z^{2r-1}}\,\frac{\mathrm{Cat}(k-r)}{z^{2(k-r)+1}} = \frac{1}{z} + \frac{1}{z}\sum_{r=1}^\infty \frac{\mathrm{Cat}(r-1)}{z^{2r-1}} \sum_{k=r}^\infty \frac{\mathrm{Cat}(k-r)}{z^{2(k-r)+1}} \\
&= \frac{1}{z} + \frac{1}{z}\sum_{r=1}^\infty \frac{\mathrm{Cat}(r-1)}{z^{2r-1}}\,G(z) = \frac{1}{z} + \frac{1}{z}\,G(z)^2
\end{align*}
so G(z) satisfies the quadratic equation G(z)² − zG(z) + 1 = 0. By the quadratic formula, we have
\[ G(z) = \frac{z \pm \sqrt{z^2-4}}{2}. \]
Since G(z) satisfies iyG(iy) → 1, we need to pick the minus:
\[ G(z) = \frac{z - \sqrt{z^2-4}}{2}. \]
What do we actually mean by √(z²−4)? For z ∈ ℍ⁺, let θ₁ be the angle between the x-axis and the line from 2 to z, and let θ₂ be the angle between the x-axis and the line from −2 to z. Then z − 2 = |z−2|e^{iθ₁} and z + 2 = |z+2|e^{iθ₂}, and we let
\[ \sqrt{z^2-4} := |z^2-4|^{1/2}\, e^{i\frac{\theta_1+\theta_2}{2}}. \]
Now let’s see what our inversion procedure does: we have
\[ \mathrm{Im}\sqrt{(t+i\varepsilon)^2-4} = |(t+i\varepsilon)^2-4|^{1/2}\, \sin\Big(\frac{\theta_1+\theta_2}{2}\Big) \]
and when ε → 0⁺, we have two cases: either |t| > 2, in which case θ₁ and θ₂ both go to either 0 or π, so the sin goes to sin(0) = sin(π) = 0; or |t| ≤ 2 and one of θ₁, θ₂ goes to 0 while the other goes to π, so the sin goes to sin(π/2) = 1.
So if |t| > 2, we have
\[ \lim_{\varepsilon\to 0^+} \Big(-\frac{1}{\pi}\,\mathrm{Im}\,G(t+i\varepsilon)\Big) = 0, \]
since the limit of G(t+iε) is real. If |t| ≤ 2, we have
\begin{align*}
\lim_{\varepsilon\to 0^+} \Big(-\frac{1}{\pi}\,\mathrm{Im}\,G(t+i\varepsilon)\Big) &= -\frac{1}{2\pi}\lim_{\varepsilon\to 0^+} \mathrm{Im}\Big(t+i\varepsilon - \sqrt{(t+i\varepsilon)^2-4}\Big) \\
&= -\frac{1}{2\pi}\lim_{\varepsilon\to 0^+} \Big(\varepsilon - |(t+i\varepsilon)^2-4|^{1/2}\,\sin\Big(\frac{\theta_1+\theta_2}{2}\Big)\Big) \\
&= \frac{1}{2\pi}\sqrt{4-t^2}.
\end{align*}
This recovers the familiar semicircular density.
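Here is a minimal numerical sketch of this inversion (the function names are mine): the branch is implemented as √(z−2)·√(z+2), which realizes the angle convention above on ℍ⁺.

```python
import numpy as np

def sqrt_branch(z):
    # sqrt(z-2)*sqrt(z+2): both principal angles lie in (0, pi) on H+
    return np.sqrt(z - 2 + 0j) * np.sqrt(z + 2 + 0j)

def G(z):
    return (z - sqrt_branch(z)) / 2

ts = np.linspace(-3, 3, 7)
for eps in (1e-1, 1e-3, 1e-6):
    print(eps, np.round(-np.imag(G(ts + 1j * eps)) / np.pi, 4))
density = np.where(np.abs(ts) <= 2,
                   np.sqrt(np.maximum(4 - ts**2, 0)) / (2 * np.pi), 0.0)
print('density', np.round(density, 4))
```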
8.3. R-transform.
Theorem 8.15 (Voiculescu). Let μ ∈ M(ℝ) with supp(μ) ⊆ [−R, R], and let G be its Cauchy transform. Then there is an analytic function R(z) on |z| < \frac{1}{6R} with
- G(R(z) + 1/z) = z for 0 < |z| < \frac{1}{6R}, and
- R(z) = ∑_{n=0}^∞ κ_{n+1} z^n for |z| < \frac{1}{6R}.
Corollary 8.16. For μ,ν ∈Mc(ℝ),
we have Rμ⊞ν(z) = Rμ(z) + Rν(z).
Example 8.17. Let $\mu = \frac{1}{2}(\delta_1 + \delta_{-1})$. We will compute $\mu \boxplus \mu$ and solve the problem posed at the beginning of class; this is a classic textbook example of what free convolutions look like. The first step is to find the Cauchy transform, and then in turn the R-transform. The Cauchy transform is easy:
\[ G_\mu(z) = \int_{\mathbb{R}} \frac{1}{z-t}\,d\mu(t) = \frac{1}{2}\left( \frac{1}{z-1} + \frac{1}{z+1} \right) = \frac{z}{z^2-1}. \]
For the R-transform, we want to use the functional equation $G_\mu(R_\mu(z) + z^{-1}) = z$. Let $K_\mu(z) := R_\mu(z) + z^{-1}$, so
\[ z = G_\mu(K_\mu(z)) = \frac{K_\mu(z)}{K_\mu(z)^2 - 1} \implies K_\mu(z)^2 - z^{-1}K_\mu(z) - 1 = 0, \]
and the quadratic formula says
\[ K_\mu(z) = \frac{z^{-1} \pm \sqrt{z^{-2}+4}}{2} = \frac{1 \pm \sqrt{1+4z^2}}{2z}. \]
To pick which one, observe that we are supposed to have $R_\mu(0) = 0$: the minus sign would give something non-zero divided by zero, while the plus sign would give $0/0$, so the plus sign is the only possibility. This makes
\[ R_\mu(z) = K_\mu(z) - \frac{1}{z} = \frac{\sqrt{1+4z^2}-1}{2z}. \]
Now, we can use the linearization property of the R-transform:
\[ R_{\mu\boxplus\mu}(z) = R_\mu(z) + R_\mu(z) = \frac{\sqrt{1+4z^2}-1}{z} \implies K_{\mu\boxplus\mu}(z) = \frac{\sqrt{1+4z^2}}{z}. \]
One can show (see the proof of [13, Theorem 12.7]) that $z = K_{\mu\boxplus\mu}(G_{\mu\boxplus\mu}(z))$. Then
\[ z = K_{\mu\boxplus\mu}(G_{\mu\boxplus\mu}(z)) = \frac{\sqrt{1 + 4\,G_{\mu\boxplus\mu}(z)^2}}{G_{\mu\boxplus\mu}(z)} \implies (z^2-4)\,G_{\mu\boxplus\mu}(z)^2 - 1 = 0, \]
so $G_{\mu\boxplus\mu}(z) = \frac{1}{\sqrt{z^2-4}}$.
Finally, we can do Stieltjes inversion with this:
\[ \lim_{\varepsilon \to 0^+} \left( -\frac{1}{\pi}\,\mathrm{Im}\, G_{\mu\boxplus\mu}(t+i\varepsilon) \right) = -\frac{1}{\pi} \lim_{\varepsilon \to 0^+} \mathrm{Im}\, \frac{1}{\sqrt{(t+i\varepsilon)^2-4}} = -\frac{1}{\pi}\, \mathrm{Im}\, \frac{1}{\sqrt{t^2-4}}. \]
So if $|t| < 2$, the above is
\[ -\frac{1}{\pi}\, \mathrm{Im}\, \frac{1}{\sqrt{t^2-4}} = -\frac{1}{\pi}\, \mathrm{Im}\, \frac{1}{i\sqrt{4-t^2}} = \frac{1}{\pi}\, \mathrm{Im}\, \frac{i}{\sqrt{4-t^2}} = \frac{1}{\pi\sqrt{4-t^2}}, \]
and otherwise there is no imaginary part, so it's just $0$. In other words, $\mu \boxplus \mu$ is the arcsine distribution on $[-2,2]$.
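Since $\mu \boxplus \mu$ is the limiting ESD of $A + UBU^*$ for independent randomly rotated matrices whose spectra distribute like $\mu$, the arcsine answer can be tested with a quick experiment. Below is a sketch of ours (matrix size, seed, and the interval $[-1,1]$ are arbitrary choices); the mass the arcsine density puts on $[-1,1]$ is $\int_{-1}^{1} \frac{dt}{\pi\sqrt{4-t^2}} = \frac{1}{3}$.

import numpy as np

rng = np.random.default_rng(1)
N = 1000
A = np.diag(np.repeat([1.0, -1.0], N // 2))    # ESD = (delta_1 + delta_{-1})/2

# Haar-random unitary: QR of a complex Ginibre matrix, with the phases of
# R's diagonal divided out (the standard correction)
Z = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
Q, R = np.linalg.qr(Z)
U = Q * (np.diagonal(R) / np.abs(np.diagonal(R)))

eigs = np.linalg.eigvalsh(A + U @ A @ U.conj().T)
print(np.mean(np.abs(eigs) <= 1))    # close to 1/3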
8.4. Exercises.
Exercise 8.18. Let μ = δ0.
Compute the limit in the Stieltjes inversion formula and explain why you couldn’t
have worked directly with the integrand. Note: the point of this exercise is to see
how Stieltjes inversion works when you have atoms.
9. Expected characteristic polynomials
In this section, we will use familiar combinatorial methods to compute some expected
characteristic polynomials. This is the starting point for a new theory called finite free
probability, which we will spend the next few classes surveying.
Notation 9.1. For $A \in M_N(\mathbb{C})$, say $A = (a_{ij})_{1 \leq i,j \leq N}$, and $S, T \subseteq [N]$, write $A(S,T)$ for the submatrix $(a_{ij})_{i \in S,\, j \in T}$. Let $c_x(A) := \det(xI - A)$ be the characteristic polynomial of $A$.
Proposition 9.2. For $A \in M_N(\mathbb{C})$, we have
\[ c_x(A) = \sum_{k=0}^{N} x^{N-k}(-1)^k\, \mathsf{e}_k(A) \quad \text{where} \quad \mathsf{e}_k(A) := \sum_{\substack{S \subseteq [N] \\ |S|=k}} \det(A(S,S)). \]
Proof. See [9, Section 1.2]. □
The first example of an expected characteristic polynomial is straightforward:
Example 9.3 (GUE and Hermite polynomials). Let $A$ be an $N \times N$ GUE. Denoting by $\mathrm{Sym}(S)$ the set of permutations of $S \subseteq [N]$, we have
\begin{align*}
\mathbb{E}\,\mathsf{e}_m(A) &= \sum_{|S|=m} \mathbb{E}\det(A(S,S)) \\
&= \sum_{|S|=m} \sum_{\sigma \in \mathrm{Sym}(S)} \mathrm{sgn}(\sigma)\, \mathbb{E}\left( \prod_{i \in S} a_{i\sigma(i)} \right) \\
&= \sum_{|S|=m} \sum_{\sigma \in \mathrm{Sym}(S)} \mathrm{sgn}(\sigma) \sum_{\pi \in P_2(S)} \prod_{(r,s) \in \pi} \mathbb{E}(a_{r\sigma(r)} a_{s\sigma(s)}).
\end{align*}
If $m$ is odd, then the sum over $P_2(S)$ is empty and the whole expression is just $0$. This means the expected characteristic polynomial is even, in the sense that it is of the form
\[ x^N + (\cdots)x^{N-2} + (\cdots)x^{N-4} + \cdots. \]
In other words, the roots come in plus-minus pairs.
Now suppose $m = 2k$. Then
\begin{align*}
\mathbb{E}\,\mathsf{e}_{2k}(A) &= \sum_{|S|=2k} \sum_{\sigma \in \mathrm{Sym}(S)} \mathrm{sgn}(\sigma) \sum_{\pi \in P_2(S)} \prod_{(r,s) \in \pi} \mathbb{E}(a_{r\sigma(r)} a_{s\sigma(s)}) \\
&= \sum_{|S|=2k} \sum_{\sigma \in \mathrm{Sym}(S)} \mathrm{sgn}(\sigma) \sum_{\pi \in P_2(S)} \frac{1}{N^k}\, \delta_{r=\sigma(s),\ s=\sigma(r)\ \forall (r,s)\in\pi} \\
&= \frac{1}{N^k} \sum_{|S|=2k} \sum_{\pi \in P_2(S)} \mathrm{sgn}(\pi) \\
&= \frac{1}{N^k} \binom{N}{2k} (2k-1)!!\,(-1)^k \\
&= \frac{1}{N^k} \frac{N!}{(2k)!\,(N-2k)!} \cdot \frac{(2k)!}{2^k k!}\,(-1)^k \\
&= \frac{(N)_{2k}}{N^k} \cdot \frac{(-1)^k}{2^k k!},
\end{align*}
where in the third line the $\delta$ forces $\sigma$ to be the involution with cycles $\pi$, whose sign is $\mathrm{sgn}(\pi) = (-1)^k$. Assembling the coefficients, the expected characteristic polynomial $\mathbb{E}\,c_x(A)$ is (up to rescaling) the very famous Hermite polynomial of degree $N$.
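As a Monte Carlo aside (ours, not from the lecture): for $N = 4$ the formula gives $\mathbb{E}\,c_x(A) = x^4 - \frac{3}{2}x^2 + \frac{3}{16}$, which can be compared with an empirical average of characteristic polynomials.

import numpy as np

rng = np.random.default_rng(2)

def gue(N):
    G = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
    return (G + G.conj().T) / (2 * np.sqrt(N))

N, samples = 4, 50000
# np.poly returns the coefficients of det(xI - A) from x^N down to x^0
avg = sum(np.real(np.poly(gue(N))) for _ in range(samples)) / samples
print(avg)    # approximately [1, 0, -1.5, 0, 0.1875]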
The central computation that gives rise to finite free probability involves the
randomly rotated matrices we discussed previously. Next class, we will turn the
following theorem into a definition of a “finite free convolution” operation on
polynomials.
Theorem 9.4 ([11]). Let $A$ and $B$ be normal $N \times N$ matrices and let $U$ be a random $N \times N$ unitary matrix. Then
\[ \mathbb{E}\,c_x(A + UBU^*) = \sum_{k=0}^{N} x^{N-k}(-1)^k \sum_{i+j=k} \frac{(N)_k}{(N)_i(N)_j}\, \mathsf{e}_i(A)\,\mathsf{e}_j(B) \tag{1} \]
and
\[ \mathbb{E}\,c_x(AUBU^*) = \sum_{k=0}^{N} x^{N-k}(-1)^k \frac{1}{\binom{N}{k}}\, \mathsf{e}_k(A)\,\mathsf{e}_k(B). \tag{2} \]
Proof. First of all, we can assume without loss of generality that $A$ and $B$ are diagonal, with
\[ A = \mathrm{diag}(a_1,\ldots,a_N) \quad \text{and} \quad B = \mathrm{diag}(b_1,\ldots,b_N), \]
as outlined in the exercises. Let $X = A + UBU^*$; we need to compute $\mathbb{E}\det(X(S,S))$ for $|S| = k$. We have
\begin{align*}
\det(X(S,S)) &= \sum_{\sigma \in \mathrm{Sym}(S)} \mathrm{sgn}(\sigma) \prod_{i \in S} x_{i\sigma(i)} \\
&= \sum_{\sigma \in \mathrm{Sym}(S)} \mathrm{sgn}(\sigma) \prod_{i \in S} \left( a_i \delta_{i=\sigma(i)} + \sum_{p=1}^{N} u_{ip}\, b_p\, \overline{u_{\sigma(i)p}} \right) \\
&= \sum_{\sigma \in \mathrm{Sym}(S)} \mathrm{sgn}(\sigma) \sum_{R \subseteq S} \prod_{i \in R} a_i \delta_{i=\sigma(i)} \prod_{i \in S\setminus R} \sum_{p=1}^{N} u_{ip}\, b_p\, \overline{u_{\sigma(i)p}} \\
&= \sum_{R \subseteq S} \left( \prod_{i \in R} a_i \right) \sum_{\sigma \in \mathrm{Sym}(S\setminus R)} \mathrm{sgn}(\sigma) \sum_{\mathbf{p}: S\setminus R \to [N]} \prod_{i \in S\setminus R} u_{i\mathbf{p}(i)}\, b_{\mathbf{p}(i)}\, \overline{u_{\sigma(i)\mathbf{p}(i)}} \\
&= \sum_{R \subseteq S} \left( \prod_{i \in R} a_i \right) \sum_{\mathbf{p}: S\setminus R \to [N]} \left( \prod_{i \in S\setminus R} b_{\mathbf{p}(i)} \right) \sum_{\sigma \in \mathrm{Sym}(S\setminus R)} \mathrm{sgn}(\sigma) \prod_{i \in S\setminus R} u_{i\mathbf{p}(i)}\, \overline{u_{\sigma(i)\mathbf{p}(i)}},
\end{align*}
so we need to compute
\[ \sum_{\sigma \in \mathrm{Sym}(S\setminus R)} \mathrm{sgn}(\sigma)\, \mathbb{E}\left( \prod_{i \in S\setminus R} u_{i\mathbf{p}(i)}\, \overline{u_{\sigma(i)\mathbf{p}(i)}} \right) \]
for each $|S| = k$, $R \subseteq S$, and $\mathbf{p}: S\setminus R \to [N]$. As outlined in the exercises, this can be reduced to the following problem: for $0 \leq n \leq N$ and $\mathbf{p}: [n] \to [N]$, compute
\[ \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma)\, \mathbb{E}\left( \prod_{i=1}^{n} u_{i\mathbf{p}(i)}\, \overline{u_{\sigma(i)\mathbf{p}(i)}} \right). \]
By Weingarten calculus, this equals
\[ \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma) \sum_{\substack{\alpha,\beta \in S_n \\ i=\sigma(\alpha(i))\ \forall i \\ \mathbf{p}(i)=\mathbf{p}(\beta(i))\ \forall i}} \mathrm{Wg}_{N,n}(\alpha^{-1}\beta) = \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma) \sum_{\substack{\alpha \in S_n \\ \mathbf{p} = \mathbf{p}\circ\alpha}} \mathrm{Wg}_{N,n}(\sigma\alpha); \]
the reduction from a double sum to a single sum was made by observing that $i = \sigma(\alpha(i))$ for all $1 \leq i \leq n$ forces $\alpha = \sigma^{-1}$ (and then relabeling $\beta$ as $\alpha$). If $\mathbf{p}$ is not injective, say $\mathbf{p}(i) = \mathbf{p}(j)$ with $i \neq j$, then one can use the transposition $(i,j)$ to set up a sign-reversing pairing in the outer sum, as outlined in the exercises. So in this case, the expression above is $0$.
Now, assume $\mathbf{p}$ is injective, so the constraint $\mathbf{p} = \mathbf{p}\circ\alpha$ makes $\alpha = e$. Using the character expansion of the Weingarten function, the above is equal to
\[ \frac{1}{n!^2} \sum_{\lambda \vdash n} \frac{\dim(\lambda)^2}{\mathsf{s}_\lambda(1^N)} \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma)\, \chi_\lambda(\sigma), \]
and the inner sum can be immediately recognized as a scalar multiple of the inner product $\langle \chi_{1^n}, \chi_\lambda \rangle$. As such, the orthogonality relations give
\[ \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma)\, \chi_\lambda(\sigma) = \begin{cases} n! & \text{if } \lambda = 1^n, \\ 0 & \text{otherwise,} \end{cases} \]
so the above is
\[ \frac{1}{n!} \cdot \frac{\dim(1^n)^2}{\mathsf{s}_{1^n}(1^N)} = \frac{1}{n!} \cdot \frac{n!}{N(N-1)\cdots(N-n+1)} = \frac{(N-n)!}{N!}. \]
In total, what we have shown is that
\[ \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma)\, \mathbb{E}\left( \prod_{i=1}^{n} u_{i\mathbf{p}(i)}\, \overline{u_{\sigma(i)\mathbf{p}(i)}} \right) = \begin{cases} \frac{(N-n)!}{N!} & \text{if } \mathbf{p} \text{ is injective,} \\ 0 & \text{otherwise.} \end{cases} \]
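This identity can also be checked by simulation for small $N$ and $n$; here is a Monte Carlo sketch of ours (the choices $N = 4$, $n = 2$, the 0-based injective multi-index p, and the sample size are all arbitrary):

import numpy as np
from itertools import permutations
from math import factorial

rng = np.random.default_rng(3)
N, n = 4, 2
p = (0, 1)    # an injective multi-index [n] -> [N], written 0-based

def haar(N):
    # Haar unitary via QR of a complex Ginibre matrix, phases corrected
    Q, R = np.linalg.qr(rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N)))
    d = np.diagonal(R)
    return Q * (d / np.abs(d))

def sgn(sigma):
    # sign of a permutation from its cycle lengths
    s, seen = 1, set()
    for i in range(len(sigma)):
        if i in seen:
            continue
        length, j = 0, i
        while j not in seen:
            seen.add(j)
            j = sigma[j]
            length += 1
        s *= (-1) ** (length - 1)
    return s

total, samples = 0.0, 50000
for _ in range(samples):
    U = haar(N)
    for s in permutations(range(n)):
        term = np.prod([U[i, p[i]] * np.conj(U[s[i], p[i]]) for i in range(n)])
        total += sgn(s) * term.real
print(total / samples, factorial(N - n) / factorial(N))    # both about 1/12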
Putting this back into the big formula, we have
\begin{align*}
\mathbb{E}\,\mathsf{e}_k(A + UBU^*) &= \sum_{|S|=k} \mathbb{E}\det(X(S,S)) \\
&= \sum_{|S|=k} \sum_{R \subseteq S} \left( \prod_{i \in R} a_i \right) \sum_{\substack{\mathbf{p}: S\setminus R \to [N] \\ \text{injective}}} \left( \prod_{i \in S\setminus R} b_{\mathbf{p}(i)} \right) \frac{(N - |S\setminus R|)!}{N!} \\
&= \sum_{|S|=k} \sum_{R \subseteq S} \det(A(R,R))\, |S\setminus R|!\, \mathsf{e}_{|S\setminus R|}(B)\, \frac{(N - |S\setminus R|)!}{N!} \\
&= \sum_{|S|=k} \sum_{r=0}^{k} \frac{(k-r)!\,(N-(k-r))!}{N!}\, \mathsf{e}_{k-r}(B) \sum_{\substack{R \subseteq S \\ |R|=r}} \det(A(R,R)) \\
&= \sum_{r=0}^{k} \frac{(k-r)!\,(N-(k-r))!}{N!}\, \mathsf{e}_{k-r}(B) \sum_{\substack{R \subseteq [N] \\ |R|=r}} \det(A(R,R)) \sum_{\substack{R \subseteq S \subseteq [N] \\ |S|=k}} 1 \\
&= \sum_{r=0}^{k} \binom{N-r}{k-r} \frac{(k-r)!\,(N-(k-r))!}{N!}\, \mathsf{e}_r(A)\, \mathsf{e}_{k-r}(B) \\
&= \sum_{r=0}^{k} \frac{(N-r)!\,(N-(k-r))!}{N!\,(N-k)!}\, \mathsf{e}_r(A)\, \mathsf{e}_{k-r}(B),
\end{align*}
which is the claim of (1) in the theorem. The claim of (2) can be proved by similar techniques; the character computation can be reused, and the only difference is in the final "reassembly" step. □
9.1. Exercises.
Exercise 9.5. Let $A$ and $B$ be normal $N \times N$ matrices with eigenvalues $(a_1,\ldots,a_N)$ and $(b_1,\ldots,b_N)$ respectively. Let $U$ be a random $N \times N$ unitary matrix.
- Let
\[ D_A = \mathrm{diag}(a_1,\ldots,a_N) \quad \text{and} \quad D_B = \mathrm{diag}(b_1,\ldots,b_N). \]
Prove that
\[ \mathbb{E}\,c_x(A + UBU^*) = \mathbb{E}\,c_x(D_A + UD_BU^*). \]
- When you need to compute
\[ \sum_{\sigma \in \mathrm{Sym}(S)} \mathrm{sgn}(\sigma)\, \mathbb{E}\left( \prod_{i \in S} u_{i\mathbf{p}(i)}\, \overline{u_{\sigma(i)\mathbf{p}(i)}} \right) \]
for $S \subseteq [N]$ with $|S| = n$ and $\mathbf{p}: S \to [N]$ in the proof of Theorem 9.4, why can you safely assume $S = [n]$?
Exercise 9.6. Let $0 \leq n \leq N$, and let $\mathbf{p}: [n] \to [N]$ be a multi-index which is not injective. Prove that
\[ \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma)\, \mathbb{E}\left( \prod_{i=1}^{n} u_{i\mathbf{p}(i)}\, \overline{u_{\sigma(i)\mathbf{p}(i)}} \right) = 0. \]
10. Finite free probability
Notation 10.1. Let $p$ be a monic polynomial of degree $N$. We write the coefficients as follows:
\[ p(x) = \sum_{k=0}^{N} x^{N-k}(-1)^k\, \mathsf{e}_k(p). \]
One can immediately show that
\[ \mathsf{e}_k(p) = \sum_{1 \leq i_1 < \cdots < i_k \leq N} \lambda_{i_1} \cdots \lambda_{i_k}, \]
where $\lambda_1, \ldots, \lambda_N$ are the roots of $p$. (Write $p$ as a product of linear factors and expand.) This is the so-called elementary symmetric polynomial, perhaps familiar to you from algebraic combinatorics or undergrad algebra.
Definition 10.2. For monic polynomials $p$ and $q$ with degree $N$, define
\[ (p \boxplus_N q)(x) := \sum_{k=0}^{N} x^{N-k}(-1)^k \sum_{i+j=k} \frac{(N)_k}{(N)_i(N)_j}\, \mathsf{e}_i(p)\, \mathsf{e}_j(q) \]
and
\[ (p \boxtimes_N q)(x) := \sum_{k=0}^{N} x^{N-k}(-1)^k \frac{1}{\binom{N}{k}}\, \mathsf{e}_k(p)\, \mathsf{e}_k(q). \]
It must be emphasized that this definition is merely a repackaging of
Theorem 9.4.
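For concreteness, here is a minimal implementation of Definition 10.2 (a sketch of ours; a monic polynomial is represented by the vector [e_0(p), ..., e_N(p)] with e_0 = 1):

from math import comb

def falling(N, k):
    # falling factorial (N)_k = N(N-1)...(N-k+1)
    out = 1
    for i in range(k):
        out *= N - i
    return out

def boxplus(ep, eq, N):
    # e_k(p boxplus_N q) = sum_{i+j=k} (N)_k / ((N)_i (N)_j) e_i(p) e_j(q)
    return [sum(falling(N, k) / (falling(N, i) * falling(N, k - i)) * ep[i] * eq[k - i]
                for i in range(k + 1))
            for k in range(N + 1)]

def boxtimes(ep, eq, N):
    # e_k(p boxtimes_N q) = e_k(p) e_k(q) / binom(N, k)
    return [ep[k] * eq[k] / comb(N, k) for k in range(N + 1)]

def coeffs(e, N):
    # coefficients of x^N, x^{N-1}, ..., x^0 in sum_k x^{N-k} (-1)^k e_k
    return [(-1) ** k * e[k] for k in range(N + 1)]

# Example: p = q = x^2 - 1 has e-vector [1, 0, -1], and p boxplus_2 q = x^2 - 2.
print(coeffs(boxplus([1, 0, -1], [1, 0, -1], 2), 2))    # [1, 0.0, -2.0]

For $\mu = \frac{1}{2}(\delta_1 + \delta_{-1})$ as in Example 8.17, the $N = 2$ convolution already has its roots $\pm\sqrt{2}$ inside the support $[-2,2]$ of $\mu \boxplus \mu$.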
Theorem 10.3 (Quadrature for finite free convolution). Let $p$ and $q$ be monic polynomials of degree $N$. Let $A$ and $B$ be diagonal $N \times N$ matrices with $c_x(A) = p(x)$ and $c_x(B) = q(x)$. Then
\[ \mathbb{E}\,c_x(A + UBU^*) = (p \boxplus_N q)(x) \]
and
\[ \mathbb{E}\,c_x(AUBU^*) = (p \boxtimes_N q)(x), \]
where $U$ is a random matrix from any of the following groups:
- (1) $U_N$ – $N \times N$ unitary matrices;
- (2) $O_N$ – $N \times N$ orthogonal matrices;
- (3) $S_N$ – $N \times N$ permutation matrices;
- (4) $\mathbf{Z}_r \wr S_N$ – $N \times N$ permutation matrices signed by $r$-th roots of unity, for $1 \leq r \leq \infty$, where the case $r = \infty$ is shorthand for the circle group.
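For the permutation-matrix case the average over $U$ is a finite sum, so the theorem can be checked exactly for small $N$. A self-contained sketch of ours (the spectra are arbitrary choices):

import numpy as np
from itertools import permutations
from math import factorial

N = 3
A = np.diag([0.0, 1.0, 3.0])     # p(x) = x(x-1)(x-3)
B = np.diag([-1.0, 1.0, 2.0])    # q(x) = (x+1)(x-1)(x-2)

avg = np.zeros(N + 1)
for perm in permutations(range(N)):
    P = np.eye(N)[list(perm)]    # the permutation matrix of perm
    avg += np.poly(A + P @ B @ P.T)
avg /= factorial(N)
# Definition 10.2 with e(p) = [1, 4, 3, 0] and e(q) = [1, 2, -1, -2] gives
# p boxplus_3 q = x^3 - 6x^2 + (22/3)x + 4/3:
print(avg)    # [1, -6, 7.3333..., 1.3333...]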
Theorem 10.4 ([17, 14]). Let p
and q be monic
polynomials of degree N.
Suppose that p
and q
are real-rooted. Then
- p ⊞Nq
is real-rooted;
- if at least one of p
or q
is non-negative-rooted, then p ⊠Nq
is real-rooted;
- if both p
and q
are non-negative-rooted, then p ⊠Nq
is non-negative-rooted.
Notation 10.5. For a polynomial $p$ of degree $N$, write
\[ \rho(p) := \frac{1}{N} \sum_{i=1}^{N} \delta_{\lambda_i}, \]
where $\lambda_1, \ldots, \lambda_N$ are the roots of $p$ with multiplicity.
Theorem 10.6 ([10, 2, 1]). For N ≥ 1,
let pN and
qN be monic real-rooted
polynomials with degree N, and
suppose that there are some μ,ν ∈Mc(ℝ)
with ρ(pN) → μ
and ρ(qN) → ν.
Then
- ρ(pN ⊞NqN) → μ ⊞ ν;
- if “real-rooted” is replaced by “non-negative-rooted” and Mc(ℝ)
is replaced by Mc(ℝ≥0),
then ρ(pN ⊠NqN) → μ ⊠ ν.
10.1. Finite free cumulants.
Recall that for a self-adjoint variable
a, the free cumulants
are a sequence (κn)n≥1,
related to the moment sequence by a sum over noncrossing partitions, in which
κ1 is the
mean and κ2
is the variance.
Definition 10.8 ([2]). The finite free cumulants are defined implicitly by
\[ \mathsf{e}_n(p) = \frac{\binom{N}{n}}{N^n} \sum_{\pi \in P(n)} N^{|\pi|}\, \mu(\pi)\, \kappa_\pi^{(N)}(p), \]
where $\mu(\pi) := (-1)^{n-|\pi|} \prod_{i \geq 2} (i!)^{m_{i+1}(\pi)}$ is the Möbius function of the partition lattice and $m_j(\pi)$ is the number of blocks in $\pi$ with size $j$.
Example 10.10. With $n = 1$, the relation above says that
\[ \mathsf{e}_1(p) = N\kappa_1^{(N)}(p) \implies \kappa_1^{(N)}(p) = \frac{1}{N} \sum_{i=1}^{N} \lambda_i, \]
where $\lambda_1, \ldots, \lambda_N$ are the roots of $p$. This is just the mean of $\rho(p)$. With $n = 2$, it gives
\[ \mathsf{e}_2(p) = \frac{\binom{N}{2}}{N^2}\left( N^2 \kappa_1^{(N)}(p)^2 - N\kappa_2^{(N)}(p) \right), \]
so $\kappa_2^{(N)}(p)$ is not the variance of $\rho(p)$. We will see later (Example 11.6) that in fact $\kappa_2^{(N)}(p) = \frac{N}{N-1}\left( m_2(p) - m_1(p)^2 \right)$.
Example 10.11. Recall the expected characteristic polynomial of the GUE:
\[ \mathbb{E}\,c_x(A) = \sum_{k=0}^{\lfloor N/2 \rfloor} x^{N-2k}(-1)^k \frac{(N)_{2k}}{2^k N^k k!}. \]
Claim: the finite free cumulants are
\[ \kappa_n^{(N)} = \begin{cases} 1 & \text{if } n = 2, \\ 0 & \text{otherwise.} \end{cases} \]
To check this, put these $\kappa$'s into the definition:
\[ \frac{\binom{N}{n}}{N^n} \sum_{\pi \in P(n)} N^{|\pi|}\mu(\pi)\, \kappa_\pi^{(N)} = \frac{\binom{N}{n}}{N^n} \sum_{\pi \in P_2(n)} N^{|\pi|}\mu(\pi), \]
which is $0$ if $n$ is odd. If $n = 2k$, then the above is
\begin{align*}
\frac{\binom{N}{2k}}{N^{2k}} \sum_{\pi \in P_2(2k)} N^{k}(-1)^{2k-k}
&= \frac{\binom{N}{2k}}{N^{2k}}\,(2k-1)!!\, N^{k}(-1)^{k} \\
&= \frac{1}{N^k} \frac{N!}{(2k)!\,(N-2k)!} \cdot \frac{(2k)!}{2^k k!}\,(-1)^k \\
&= (-1)^k \frac{(N)_{2k}}{2^k N^k k!},
\end{align*}
which matches.
Theorem 10.12 ([2]). $\kappa_n^{(N)}(p \boxplus_N q) = \kappa_n^{(N)}(p) + \kappa_n^{(N)}(q)$.
Proof. We have
\begin{align*}
\frac{1}{(N)_k}\, \mathsf{e}_k(p \boxplus_N q)
&= \sum_{i+j=k} \frac{1}{(N)_i}\, \mathsf{e}_i(p) \cdot \frac{1}{(N)_j}\, \mathsf{e}_j(q) && \text{(Definition 10.2)} \\
&= \sum_{i+j=k} \left( \frac{1}{N^i i!} \sum_{\pi_1 \in P(i)} N^{|\pi_1|}\mu(\pi_1)\, \kappa_{\pi_1}^{(N)}(p) \right) \left( \frac{1}{N^j j!} \sum_{\pi_2 \in P(j)} N^{|\pi_2|}\mu(\pi_2)\, \kappa_{\pi_2}^{(N)}(q) \right) && \text{(Definition 10.8)} \\
&= \frac{1}{N^k k!} \sum_{i+j=k} \binom{k}{i} \left( \sum_{\pi_1 \in P(i)} N^{|\pi_1|}\mu(\pi_1)\, \kappa_{\pi_1}^{(N)}(p) \right) \left( \sum_{\pi_2 \in P(j)} N^{|\pi_2|}\mu(\pi_2)\, \kappa_{\pi_2}^{(N)}(q) \right) \\
&= \frac{1}{N^k k!} \sum_{S \subseteq [k]} \left( \sum_{\pi_1 \in P(S)} N^{|\pi_1|}\mu(\pi_1)\, \kappa_{\pi_1}^{(N)}(p) \right) \left( \sum_{\pi_2 \in P([k]\setminus S)} N^{|\pi_2|}\mu(\pi_2)\, \kappa_{\pi_2}^{(N)}(q) \right) \\
&= \frac{1}{N^k k!} \sum_{\pi \in P(k)} \sum_{\pi = \pi_1 \sqcup \pi_2} N^{|\pi_1|+|\pi_2|}\mu(\pi_1)\mu(\pi_2)\, \kappa_{\pi_1}^{(N)}(p)\, \kappa_{\pi_2}^{(N)}(q) \\
&= \frac{1}{N^k k!} \sum_{\pi \in P(k)} N^{|\pi|}\mu(\pi) \sum_{\pi = \pi_1 \sqcup \pi_2} \kappa_{\pi_1}^{(N)}(p)\, \kappa_{\pi_2}^{(N)}(q) \\
&= \frac{1}{N^k k!} \sum_{\pi \in P(k)} N^{|\pi|}\mu(\pi) \left( \kappa^{(N)}(p) + \kappa^{(N)}(q) \right)_\pi,
\end{align*}
where the inner sum runs over the ways of splitting the blocks of $\pi$ into two sub-partitions $\pi_1, \pi_2$, and $(\kappa^{(N)}(p) + \kappa^{(N)}(q))_\pi := \prod_{V \in \pi} (\kappa_{|V|}^{(N)}(p) + \kappa_{|V|}^{(N)}(q))$. Comparing with Definition 10.8 applied to $p \boxplus_N q$, we get $\kappa_n^{(N)}(p \boxplus_N q) = \kappa_n^{(N)}(p) + \kappa_n^{(N)}(q)$. □
10.2. Exercises.
Exercise 10.13. Prove (3) in Theorem 10.3. Hint: the only difference is in computing
\[ \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma)\, \mathbb{E}\left( \prod_{i=1}^{n} u_{i\mathbf{p}(i)}\, \overline{u_{\sigma(i)\mathbf{p}(i)}} \right) \]
where $U = (u_{ij})_{i,j}$ is a random permutation matrix. Show that the expression above is the same as it is for $U \in U_N$.
11. Asymptotics of finite free convolution
Reminder: the free moment-cumulant formula is
\[ m_n = \sum_{\pi \in NC(n)} \kappa_\pi. \]
For example,
- $m_1 = \kappa_1$
- $m_2 = \kappa_2 + \kappa_1^2$
- $m_3 = \kappa_3 + 3\kappa_2\kappa_1 + \kappa_1^3$
Last class, we introduced a sequence of finite free cumulants associated to a degree
N polynomial
p, denoted
by κn(N)(p). In
this class, we will develop a moment-cumulant relation and use it to prove that the finite
free cumulants characterize convergence in root distribution.
11.1. Combinatorics of maps.
Recall that the noncrossing partitions can be characterized as follows:
\[ NC(n) \simeq \{ \alpha \in S_n : \#(\alpha) + \#(\alpha^{-1}\gamma_n) = n + 1 \}, \]
where $\gamma_n := (1,\ldots,n)$ and $\#(\cdot)$ is the number of disjoint cycles. Furthermore, we have
\[ \#(\alpha) + \#(\alpha^{-1}\gamma_n) \leq n + 1 \]
for all $\alpha \in S_n$, and the LHS decreases in twos. We need to generalize this phenomenon:
Theorem 11.1. Let $\alpha, \gamma \in S_n$ and suppose that $\langle \alpha, \gamma \rangle$ is transitive. Then there is some $g \geq 0$ such that
\[ \#(\alpha) + \#(\alpha^{-1}\gamma) = (n + 2 - \#(\gamma)) - 2g. \]
Idea of proof. For each $\alpha$, there is a surface
- whose boundary consists of disjoint discs, each of which has a cycle of $\gamma$ drawn on it in order;
- which has the permutation $\alpha$ drawn on it, in such a way that the cycles do not cross and each cycle is homeomorphic to a circle oriented in such a way that the normal vector points outward.
Shrinking the discs to points produces a closed surface with a graph drawn on it. The Euler characteristic formula says $V - E + F = 2 - 2g$; the number of vertices is $\#(\gamma)$, the number of edges is $n$, and the number of faces is $\#(\alpha) + \#(\alpha^{-1}\gamma)$. This makes
\[ \#(\gamma) - n + \#(\alpha) + \#(\alpha^{-1}\gamma) = 2(1 - g). \]
□
Example 11.2. We have already seen an instance of this theorem, when $\gamma = (1,\ldots,n)$ and $\alpha$ is an involution with no fixed points. In this case, with $n = 2k$, the formula reads as
\[ k + \#(\alpha^{-1}\gamma) + 1 = 2k + 2 - 2g \iff \#(\alpha^{-1}\gamma) = k + 1 - 2g, \]
and we recover the exponents from the GUE genus expansion.
Example 11.3. More generally, we saw earlier in the course that there is a bijection
\[ NC(n) \simeq \{ \alpha \in S_n : \#(\alpha) + \#(\alpha^{-1}\gamma_n) = n + 1 \}, \]
where $\gamma_n := (1,\ldots,n)$. This is the case of Theorem 11.1 where $\gamma = (1,\ldots,n)$ and $g = 0$.
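This characterization is easy to explore by computer. The sketch below (ours, with permutations written 0-based as tuples of images) counts the $\alpha$ with $\#(\alpha) + \#(\alpha^{-1}\gamma_n) = n + 1$ and recovers $|NC(n)| = \mathrm{Cat}(n)$.

from itertools import permutations

def num_cycles(perm):
    # number of disjoint cycles of a permutation given as a tuple of images
    seen, count = set(), 0
    for i in range(len(perm)):
        if i not in seen:
            count += 1
            j = i
            while j not in seen:
                seen.add(j)
                j = perm[j]
    return count

def compose(a, b):
    # (a . b)(i) = a(b(i))
    return tuple(a[b[i]] for i in range(len(a)))

def inverse(a):
    inv = [0] * len(a)
    for i, img in enumerate(a):
        inv[img] = i
    return tuple(inv)

for n in range(1, 7):
    gamma = tuple(list(range(1, n)) + [0])    # the long cycle (1, ..., n)
    count = sum(1 for a in permutations(range(n))
                if num_cycles(a) + num_cycles(compose(inverse(a), gamma)) == n + 1)
    print(n, count)    # 1, 2, 5, 14, 42, 132: the Catalan numbers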
Example 11.4 ([12]). Let $\gamma = (1,2,3)$, $\alpha = (1,2,3)$, and $\beta = (1,3,2)$. Then
\[ \alpha^{-1}\gamma = (1,3,2)(1,2,3) = (1)(2)(3) \quad \text{and} \quad \beta^{-1}\gamma = (1,2,3)(1,2,3) = (1,3,2), \]
and we can check the genus:
\[ \#(\alpha) + \#(\alpha^{-1}\gamma) = 1 + 3 = 4 \quad \text{and} \quad n + 2 - \#(\gamma) = 3 + 2 - 1 = 4, \]
so $g = 0$, while
\[ \#(\beta) + \#(\beta^{-1}\gamma) = 1 + 1 = 2 \quad \text{and} \quad n + 2 - \#(\gamma) = 3 + 2 - 1 = 4, \]
so $g = 1$. You should draw some pictures to verify that the results of these formal computations are sensible. The critical difference between $\alpha$ and $\beta$, when you do this, will be the orientation requirement.
11.2. Moment-cumulant formulas.
Proposition 11.5 ([2, 1]). The finite free cumulants satisfy (and are defined implicitly by) the following moment-cumulant relation:
\begin{align*}
m_n &= \frac{(-1)^{n-1}}{N^{n+1}\,(n-1)!} \sum_{\substack{\alpha,\beta \in S_n \\ \langle\alpha,\beta\rangle \leq S_n \text{ transitive}}} (-N)^{\#(\alpha)+\#(\beta)}\, \kappa_\alpha^{(N)} \\
&= \frac{(-1)^{n-1}}{N^{n+1}\,(n-1)!} \sum_{\gamma \in S_n} \sum_{\substack{\alpha \in S_n \\ \langle\alpha,\gamma\rangle \leq S_n \text{ transitive}}} (-N)^{\#(\alpha)+\#(\alpha^{-1}\gamma)}\, \kappa_\alpha^{(N)},
\end{align*}
where $\kappa_\alpha^{(N)} := \kappa_\pi^{(N)}$ for $\pi$ the partition of $[n]$ into the cycles of $\alpha$.
Example 11.6. With $n = 1$, the above says
\[ m_1 = \frac{1}{N^2}(-N)^2\kappa_1^{(N)} = \kappa_1^{(N)}, \]
so there is no difference yet. With $n = 2$, it says
\begin{align*}
m_2 &= -\frac{1}{N^3}\left( (-N)^3\kappa_1^{(N)}\kappa_1^{(N)} + (-N)^3\kappa_2^{(N)} + (-N)^2\kappa_2^{(N)} \right) \\
&= \kappa_2^{(N)} + \kappa_1^{(N)}\kappa_1^{(N)} - \frac{1}{N}\kappa_2^{(N)},
\end{align*}
so
\[ m_2 - m_1^2 = \left( 1 - \frac{1}{N} \right)\kappa_2^{(N)} \implies \kappa_2^{(N)} = \frac{N}{N-1}\left( m_2 - m_1^2 \right). \]
When $N \to \infty$, the rational function in $N$ goes away and you recover the free case.
With n = 3,
the pairs α,β
which generate a transitive subgroup are the following:
- e,(1,2,3)
- e,(1,3,2)
- (1,2),(2,3)
- (1,2),(1,3)
- (1,3),(1,2)
- (1,3),(2,3)
- (2,3),(1,3)
- (2,3),(1,2)
- (1,2,3),e
- (1,3,2),e
- (1,2),(1,2,3)
- (1,2),(1,3,2)
- (1,3),(1,2,3)
- (1,3),(1,3,2)
- (2,3),(1,2,3)
- (2,3),(1,3,2)
- (1,2,3),(1,2)
- (1,2,3),(1,3)
- (1,2,3),(2,3)
- (1,2,3),(1,2,3)
- (1,2,3),(1,3,2)
- (1,3,2),(1,2)
- (1,3,2),(1,3)
- (1,3,2),(2,3)
- (1,3,2),(1,2,3)
- (1,3,2),(1,3,2)
So the moment-cumulant formula says
\begin{align*}
m_3 = \frac{1}{2N^4}\Big( 2(-N)^{3+1}\kappa_1\kappa_1\kappa_1 &+ 6(-N)^{2+2}\kappa_2\kappa_1 + 2(-N)^{3+1}\kappa_3 + 6(-N)^{2+1}\kappa_2\kappa_1 \\
&+ 3(-N)^{1+2}\kappa_3 + 2(-N)^{1+1}\kappa_3 + 3(-N)^{1+2}\kappa_3 + 2(-N)^{1+1}\kappa_3 \Big).
\end{align*}
Notice, again, that the leading order is
\[ m_3 = \kappa_1^3 + 3\kappa_1\kappa_2 + \kappa_3 + O(N^{-1}), \]
which matches the moment-cumulant formula for free cumulants.
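The list of 26 pairs above, and hence the coefficients in this formula, can be generated by brute force. A sketch of ours (0-based):

from itertools import permutations
from collections import Counter

n = 3

def num_cycles(perm):
    seen, count = set(), 0
    for i in range(len(perm)):
        if i not in seen:
            count += 1
            j = i
            while j not in seen:
                seen.add(j)
                j = perm[j]
    return count

def transitive(a, b):
    # orbit of 0 under repeated application of a and b
    orbit, frontier = {0}, [0]
    while frontier:
        i = frontier.pop()
        for img in (a[i], b[i]):
            if img not in orbit:
                orbit.add(img)
                frontier.append(img)
    return len(orbit) == n

pairs = [(a, b) for a in permutations(range(n)) for b in permutations(range(n))
         if transitive(a, b)]
print(len(pairs))    # 26
print(Counter((num_cycles(a), num_cycles(b)) for a, b in pairs))
# multiplicities of (#(alpha), #(beta)):
# (3,1): 2, (1,3): 2, (2,2): 6, (2,1): 6, (1,2): 6, (1,1): 4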
Theorem 11.7 ([2]). Let μ
be a compactly supported measure, and for
N ≥ 1, let
pN be a monic real-rooted
polynomial with degree N.
Then the following are equivalent:
- ρ(pN) → μ
weakly;
- κn(N)(pN) → κn(μ)
for all n ≥ 1.
Proof. First, assume that $\kappa_n^{(N)}(p_N) \to \kappa_n(\mu)$ for all $n \geq 1$. Then by the moment-cumulant formula, we have
\begin{align*}
\lim_{N\to\infty} m_n(p_N)
&= \lim_{N\to\infty} \frac{(-1)^{n-1}}{(n-1)!} \sum_{\substack{\alpha,\beta \in S_n \\ \langle\alpha,\beta\rangle \text{ transitive}}} (-1)^{\#(\alpha)+\#(\beta)} \left( \frac{1}{N} \right)^{n+1-\#(\alpha)-\#(\beta)} \kappa_\alpha^{(N)}(p_N) \\
&= \lim_{N\to\infty} \frac{(-1)^{n-1}}{(n-1)!} \sum_{\gamma \in S_n} \sum_{\substack{\alpha \in S_n \\ \langle\alpha,\gamma\rangle \text{ transitive}}} (-1)^{\#(\alpha)+\#(\alpha^{-1}\gamma)} \left( \frac{1}{N} \right)^{n+1-\#(\alpha)-\#(\alpha^{-1}\gamma)} \kappa_\alpha^{(N)}(p_N).
\end{align*}
From Theorem 11.1, since $g \geq 0$ and $\#(\gamma) \geq 1$, we have
\[ \#(\alpha) + \#(\alpha^{-1}\gamma) \leq n + 2 - \#(\gamma) \leq n + 1. \]
This means the expression above does not blow up, and the summands that survive in the limit are exactly those for which this inequality is an equality. So we can continue the above as
\begin{align*}
\lim_{N\to\infty} m_n(p_N)
&= \frac{(-1)^{n-1}}{(n-1)!} \sum_{\gamma \in S_n} \sum_{\substack{\alpha \in S_n \\ \langle\alpha,\gamma\rangle \text{ transitive} \\ \#(\alpha)+\#(\alpha^{-1}\gamma)=n+1}} (-1)^{n+1}\, \kappa_\alpha \\
&= \frac{1}{(n-1)!} \sum_{\gamma \in S_n} \sum_{\substack{\alpha \in S_n \\ \langle\alpha,\gamma\rangle \text{ transitive} \\ \#(\alpha)+\#(\alpha^{-1}\gamma)=n+1}} \kappa_\alpha,
\end{align*}
where $\kappa_\alpha := \kappa_\alpha(\mu)$ denotes the limit of $\kappa_\alpha^{(N)}(p_N)$.
Now, if $\gamma \in S_n$ has $\#(\gamma) > 1$, then we have
\[ \#(\alpha) + \#(\alpha^{-1}\gamma) = n + 2 - \#(\gamma) - 2g < n + 2 - 1 - 2g = n + 1 - 2g \leq n + 1, \]
so the inner sum is empty. One can check that the inner sum also only depends on $\gamma$ through the cycle type: if $\gamma' = \sigma\gamma\sigma^{-1}$, then
\begin{align*}
\sum_{\substack{\alpha \in S_n \\ \langle\alpha,\gamma'\rangle \text{ transitive} \\ \#(\alpha)+\#(\alpha^{-1}\gamma')=n+1}} \kappa_\alpha
&= \sum_{\substack{\alpha \in S_n \\ \langle\sigma\alpha\sigma^{-1},\,\sigma\gamma\sigma^{-1}\rangle \text{ transitive} \\ \#(\sigma\alpha\sigma^{-1})+\#((\sigma\alpha\sigma^{-1})^{-1}(\sigma\gamma\sigma^{-1}))=n+1}} \kappa_{\sigma\alpha\sigma^{-1}} \\
&= \sum_{\substack{\alpha \in S_n \\ \langle\alpha,\gamma\rangle \text{ transitive} \\ \#(\alpha)+\#(\alpha^{-1}\gamma)=n+1}} \kappa_\alpha.
\end{align*}
So only $\gamma$ with $\#(\gamma) = 1$ contribute, and there are $(n-1)!$ of them, each contributing the same inner sum as $\gamma_n$ (transitivity is automatic since $\gamma_n$ is an $n$-cycle); the prefactor $\frac{1}{(n-1)!}$ cancels, and the limiting expression reads as
\[ \sum_{\substack{\alpha \in S_n \\ \#(\alpha)+\#(\alpha^{-1}\gamma_n)=n+1}} \kappa_\alpha = \sum_{\pi \in NC(n)} \kappa_\pi = m_n(\mu), \]
which shows that $\lim_{N\to\infty} m_n(p_N) = m_n(\mu)$.
Conversely, assume that ρ(pN) → μ
weakly, and assume for the sake of simplicity that the measures
ρ(pN) have
uniformly bounded support (this isn’t a serious restriction, see [1, Appendix B]), so
mn(pN) → mn(μ) for all
n ≥ 1. We will prove
by induction on n
that κn(N)(pN) → κn(μ) for
all n ≥ 1.
For the base case $n = 1$, we have
\[ \lim_{N\to\infty} \kappa_1^{(N)}(p_N) = \lim_{N\to\infty} m_1(p_N) = m_1(\mu) = \kappa_1(\mu). \]
Now suppose the claim is true up to $n - 1$. Then the moment-cumulant formula can be rearranged: the $\alpha$ which produce $\kappa_n^{(N)}$ are exactly the ones with only one cycle, and for these, any $\beta$ will produce a transitive subgroup. So we have
\begin{align*}
&\frac{(-1)^{n-1}}{N^{n+1}\,(n-1)!} \sum_{\substack{\alpha,\beta \in S_n \\ \#(\alpha)=1}} (-N)^{1+\#(\beta)}\, \kappa_n^{(N)}(p_N) \\
&\qquad = m_n(p_N) - \frac{(-1)^{n-1}}{N^{n+1}\,(n-1)!} \sum_{\substack{\alpha,\beta \in S_n \\ \langle\alpha,\beta\rangle \text{ transitive} \\ \#(\alpha)>1}} (-N)^{\#(\alpha)+\#(\beta)}\, \kappa_\alpha^{(N)}(p_N).
\end{align*}
On the LHS, the sum over $\alpha$ contributes a factor of $(n-1)!$ (the number of full cycles), which cancels the $\frac{1}{(n-1)!}$ and leaves
\[ \kappa_n^{(N)}(p_N) \sum_{\beta \in S_n} \left( \frac{-1}{N} \right)^{n-\#(\beta)} = \kappa_n^{(N)}(p_N)\, \frac{(N)_n}{N^n}, \]
using the identity $\sum_{\beta \in S_n} x^{\#(\beta)} = x(x+1)\cdots(x+n-1)$ at $x = -N$.
By (1) the induction assumption and (2) the moment convergence assumption, on the RHS we have
\begin{align*}
&\lim_{N\to\infty} \left( m_n(p_N) - \frac{(-1)^{n-1}}{(n-1)!} \sum_{\substack{\alpha,\gamma \in S_n \\ \langle\alpha,\gamma\rangle \text{ transitive} \\ \#(\alpha)>1}} (-1)^{\#(\alpha)+\#(\alpha^{-1}\gamma)} \left( \frac{1}{N} \right)^{n+1-\#(\alpha)-\#(\alpha^{-1}\gamma)} \kappa_\alpha^{(N)}(p_N) \right) \\
&\qquad = \lim_{N\to\infty} m_n(p_N) - \frac{1}{(n-1)!} \sum_{\substack{\alpha,\gamma \in S_n \\ \langle\alpha,\gamma\rangle \text{ transitive} \\ \#(\alpha)>1 \\ \#(\alpha)+\#(\alpha^{-1}\gamma)=n+1}} \lim_{N\to\infty} \kappa_\alpha^{(N)}(p_N) \\
&\qquad = m_n(\mu) - \sum_{\substack{\pi \in NC(n) \\ |\pi|>1}} \kappa_\pi(\mu);
\end{align*}
the induction hypothesis applies here because every $\alpha$ with $\#(\alpha) > 1$ has all cycles of length at most $n-1$, so $\kappa_\alpha^{(N)}$ involves only cumulants of order at most $n-1$.
So the RHS has a limit, and the non-cumulant part of the LHS converges to $1$. This situation corresponds to the following elementary fact: if $a_k$ and $b_k$ are sequences with $a_k \to 1$, $a_k \neq 0$, and $a_k b_k \to L$ for some $L \in \mathbb{R}$, then
\[ b_k = \frac{a_k b_k}{a_k} \to \frac{L}{1} = L \]
by the quotient rule for limits of sequences. So we can conclude that
\[ \lim_{N\to\infty} \kappa_n^{(N)}(p_N) = m_n(\mu) - \sum_{\substack{\pi \in NC(n) \\ |\pi|>1}} \kappa_\pi(\mu) = \kappa_n(\mu), \]
which proves the theorem. □
Corollary 11.8. For N ≥ 1,
let pN
and qN
be monic real-rooted polynomials with degree N.
Let μ
and ν
be compactly supported measures and suppose that ρ(pN) → μ
and ρ(qN) → ν.
Then ρ(pN ⊞NqN) → μ ⊞ ν.
Proof. Again, assume for the sake of simplicity that the supports of ρ(pN)
and ρ(qN)
are uniformly bounded. Then mn(pN) → mn(μ)
and mn(qN) → mn(ν).
By Theorem 11.7, we have κn(N)(pN) → κn(μ)
and κn(N)(qN) → κn(ν)
for all n ≥ 1.
Now, we can use Theorem 10.4 to ensure that
pN ⊞NqN is
real-rooted, and apply Theorem 10.12 and Corollary 8.8 to obtain
\[ \kappa_n^{(N)}(p_N \boxplus_N q_N) = \kappa_n^{(N)}(p_N) + \kappa_n^{(N)}(q_N) \to \kappa_n(\mu) + \kappa_n(\nu) = \kappa_n(\mu \boxplus \nu) \]
for all $n \geq 1$.
Applying Theorem 11.7 again, we can conclude that ρ(pN ⊞NqN) → μ ⊞ ν.
□
References
1. Octavio Arizmendi, Jorge Garza-Vargas, and Daniel Perales, Finite free cumulants: multiplicative convolutions, genus expansion and infinitesimal distributions, Trans. Amer. Math. Soc. 376 (2023), no. 6, 4383–4420.
2. Octavio Arizmendi and Daniel Perales, Cumulants for finite free convolution, J. Combin. Theory Ser. A 155 (2018), 244–266.
3. Philippe Biane, Some properties of crossings and partitions, Discrete Math. 175 (1997), no. 1-3, 41–53.
4. Patrick Billingsley, Probability and measure, third ed., Wiley Series in Probability and Mathematical Statistics, John Wiley & Sons, Inc., New York, 1995.
5. Benoît Collins, Moments and cumulants of polynomial random variables on unitary groups, the Itzykson-Zuber integral, and free probability, Int. Math. Res. Not. (2003), no. 17, 953–982.
6. Benoît Collins, Moment methods on compact groups: Weingarten calculus and its applications, ICM—International Congress of Mathematicians. Vol. 4. Sections 5–8, EMS Press, Berlin, 2023, pp. 3142–3164.
7. Benoît Collins and Piotr Śniady, Integration with respect to the Haar measure on unitary, orthogonal and symplectic group, Comm. Math. Phys. 264 (2006), no. 3, 773–795.
8. Pavel Etingof, Oleg Golberg, Sebastian Hensel, Tiankai Liu, Alex Schwendner, Dmitry Vaintrob, and Elena Yudovina, Introduction to representation theory, Student Mathematical Library, vol. 59, American Mathematical Society, Providence, RI, 2011. With historical interludes by Slava Gerovitch.
9. Roger A. Horn and Charles R. Johnson, Matrix analysis, second ed., Cambridge University Press, Cambridge, 2013.
10. Adam W. Marcus, Polynomial convolutions and (finite) free probability, 2021, arXiv:2108.07054 [math.CO].
11. Adam W. Marcus, Daniel A. Spielman, and Nikhil Srivastava, Finite free convolutions of polynomials, Probab. Theory Related Fields 182 (2022), no. 3-4, 807–848.
12. James A. Mingo and Roland Speicher, Free probability and random matrices, Fields Institute Monographs, vol. 35, Springer, New York; Fields Institute for Research in Mathematical Sciences, Toronto, ON, 2017.
13. Alexandru Nica and Roland Speicher, Lectures on the combinatorics of free probability, London Mathematical Society Lecture Note Series, vol. 335, Cambridge University Press, Cambridge, 2006.
14. G. Szegö, Bemerkungen zu einem Satz von J. H. Grace über die Wurzeln algebraischer Gleichungen, Math. Z. 13 (1922), no. 1, 28–55.
15. Dan Voiculescu, Symmetries of some reduced free product C*-algebras, Operator algebras and their connections with topology and ergodic theory (Buşteni, 1983), Lecture Notes in Math., vol. 1132, Springer, Berlin, 1985, pp. 556–588.
16. Dan Voiculescu, Limit laws for random matrices and free products, Invent. Math. 104 (1991), no. 1, 201–220.
17. J. L. Walsh, On the location of the roots of certain types of polynomials, Trans. Amer. Math. Soc. 24 (1922), no. 3, 163–180.