Author Archives: rolandspeicher

On the origin of moment-cumulant formulas

When I gave a class on free probability theory a few years ago, I thought it would be a good idea to locate evidence for my usual statement that in the classical context the idea of viewing moment-cumulant formulas in terms of (multiplicative functions on) set partitions, as well as the vanishing of mixed cumulants in independent random variables, goes back to Rota; the main reference on this seemed to be the Foundations of Combinatorial Theory papers, part I and part VI. This is at least what I said in my old papers, like here or here, and what Jonathan Novak, for example, also reiterates in his nice Three Lectures on Free Probability. But when I tried to find any mention of cumulants in those papers of Rota I could not find anything. Also in the paper of Rota with Shen, On the Combinatorics of Cumulants, there is no clear mention of the vanishing of mixed cumulants. I am still quite sure that I learned a lot from, and was inspired by, the papers of Rota, but maybe this was more about multiplicative functions, and cumulants did not show up there explicitly. At this point I decided to ask Jonathan whether he had a clearer idea about the origin of the classical moment-cumulant formulas. Here is what he said:

About your question, I remember also having a difficult time tracking down a proof of the equivalence of independence and mixed cumulants vanishing in the literature.  I actually think that the earliest paper where this statement is explicitly made, with a complete proof given, is “Cumulants and partition lattices,” by T.P. Speed, Australian Journal of Statistics 25 (1983), 378-388. An annotated version of this paper appears in the collected works of Speed, edited by P. McCullagh (Chapter 6 of the volume). I hope this helps; I don’t know an earlier reference.

I was happy with this and more or less forgot about it. But a few days ago the same issue came up after a talk of Philippe Biane in the online seminar on Algebraic and Combinatorial Perspectives in the Mathematical Sciences. It seems a couple of people are interested in this and could also provide a bit more information on aspects of the origin of moment-cumulant formulas, and maybe cumulants in general. So I thought it might be a good idea to collect this information here and invite others to add some more remarks on this history.

Franz Lehner offered the following insightful remarks:

Here are some considerations concerning “Rota’s approach to cumulants”.

Both in his posthumous paper Rota/Shen: On the combinatorics of cumulants, J. Combin. Theory Ser. A 91 (2000), and in his Fubini lectures Twelve problems in probability no one likes to bring up he talks about the “Rota approach” but always with reference to Speed. So apparently he never published it himself explicitly, although he certainly knew it for a long time. Speed did not prove any new results, but gave elegant lattice theoretic proofs of known results (his notation is a bit messy though).

On the other hand it must be said that Rota did not invent the Möbius function either as he repeatedly mentions in his 1964 paper, but he clearly saw its fundamental importance (and proved some important results). Rota was a bird in the sense of Dyson and without his efforts to systematize and popularize it, the Möbius function would have remained in its oblivious state for yet another generation.

According to Rota, the Möbius function was invented by Weisner in the thirties. Multiplicative functions and the reduced incidence algebra were introduced in Doubilet/Rota/Stanley Foundations of Combinatorial Theory VI: the idea of generating function, 6th Berkeley Symposium on Probability, 1970/71. Cumulants are not mentioned there, but maybe not without reason this paper appeared in a Symposium on Probability. Similarly his 1964 paper on Möbius functions was not probabilistic, yet it appeared in Probability Theory and Related Fields. Again without mentioning cumulants explicitly, probably because “to prevent the length of this paper from growing beyond bounds, we have omitted applications of the theory”.

He just mentions in passing on p.359 that Schützenberger computed the Möbius function of the partition lattice (independently of Frucht and himself). Indeed in the cited paper Contribution aux applications statistiques de la théorie de l’information (Publ. Inst. Statist. Univ. Paris, 3(1-2):3–117, 1954, Thèse d’État) Schützenberger states as a remark on p.24 the Möbius formula for cumulants. To my knowledge this is the earliest occurrence; Leonov and Shiryaev also use the partition formalism in their 1959 paper, but apparently don’t know the concept of Möbius inversion.

In the statistics literature these developments went largely unnoticed for a long time and the graph theoretic calculation rules of Fisher, Kendall, James etc, remained the tool of choice, see the foreword by McCullagh to Speed’s collected works.

Joachim Kock added the following:

In Kendall’s ‘Advanced Theory of Statistics’ from 1945, there is already an explicit formula for cumulants in terms of moments, and the Möbius function (-1)^{n-1} (n-1)! appears explicitly in the formula! But of course, he doesn’t know that this combinatorial factor is the Möbius function.

In the notes he attributes various moment-cumulant relations to Frisch’s PhD thesis (Oslo 1926), but I don’t know if this particular formula is in there.

Regarding the Möbius function for posets, Weisner’s paper is from 1935 but it only deals with complete lattices, whereas Hall (independently) has the Möbius function for finite posets in his 1936 paper. In both cases, their proofs actually work the same for locally finite posets, which is Rota’s level of generality. (Stretching it a little bit, it is actually the same arguments that work for Möbius categories, and for certain abstract coalgebras.)

In case you are in a historical mood, allow me to advertise my paper From Möbius inversion to renormalisation. (It has no cumulants, though.)

Referring to the combinatorial factor, Franz could add some more insights:

Yes, this expression is already in Thiele’s 1899 paper (reprinted in Anders Hald The Early History of the Cumulants and the Gram-Charlier Series, International Statistical Review 68 (2000) 137-153, in English), but of course not realized as Möbius function, because that one was not known before Schützenberger.

It remains to clarify who was the first to explicitly write moments as a sum over set partitions. Leonov & Shiryaev just infer it from the factorial formula without comment and Schützenberger simply says “nous supposerons connu le fait”.
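For readers who want to see the two formulas discussed above side by side, here is a minimal Python sketch. It is my own illustration and is not taken from any of the cited papers; the function names are hypothetical. It writes the moments as a sum over all set partitions of products of cumulants, and inverts this with the Möbius factor (-1)^{|π|-1} (|π|-1)!, which for the partition into singletons is the (-1)^{n-1} (n-1)! mentioned above.

```python
from math import factorial, prod


def set_partitions(elements):
    """Yield every set partition of a list, each as a list of blocks."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for partition in set_partitions(rest):
        # `first` forms a block of its own ...
        yield [[first]] + partition
        # ... or joins one of the existing blocks.
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]


def moment_from_cumulants(n, kappa):
    """m_n = sum over partitions pi of {1..n} of prod_{B in pi} kappa[|B|]."""
    return sum(
        prod(kappa[len(block)] for block in pi)
        for pi in set_partitions(list(range(n)))
    )


def cumulant_from_moments(n, m):
    """kappa_n = sum over pi of (-1)^(|pi|-1) (|pi|-1)! prod_{B in pi} m[|B|]."""
    return sum(
        (-1) ** (len(pi) - 1) * factorial(len(pi) - 1)
        * prod(m[len(block)] for block in pi)
        for pi in set_partitions(list(range(n)))
    )


# Standard Gaussian: only the second cumulant is non-zero.
kappa = {k: 0 for k in range(1, 7)}
kappa[2] = 1
m = {n: moment_from_cumulants(n, kappa) for n in range(1, 7)}
print([m[n] for n in range(1, 7)])                         # [0, 1, 0, 3, 0, 15]
print([cumulant_from_moments(n, m) for n in range(1, 7)])  # [0, 1, 0, 0, 0, 0]
```

The dictionaries `kappa` and `m` are indexed by block size, so the same code works for any single random variable with finitely many prescribed moments or cumulants; enumerating all partitions is of course only feasible for small n.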

Thanks to Franz and Joachim for their remarks. Any more comments are welcome …

Another blog on “Free Probability” by Teo Banica

Teo Banica got a bit bored by the lockdown and started to write a series of blogs on various topics close to his heart and his knowledge; one of them is on free probability. Check it out here. It’s written in Teo’s personal style, which might seem annoying or provocative to some, but in any case it’s interesting …

Correction on my lecture notes on random matrices (weak convergence versus convergence of moments)

I just noticed that I have a stupid mistake in my random matrix lecture notes (and also in the recording of the corresponding lecture). I am replacing the notes with a new version which corrects this.

In Theorem 4.16, I was claiming that weak convergence is equivalent to convergence of moments, in a setting where all moments exist and the limit is determined by its moments. Of course, this is too optimistic a statement. What is true is the direction that convergence of moments implies weak convergence. That’s the important direction. The other direction would be more relevant for combinatorial aficionados like me, as it would allow me to claim that the combinatorial and the analytical approach in such a setting are equivalent. However, the other direction is clearly wrong without some additional assumptions; and thus there are nice-looking situations where one cannot prove weak convergence by dealing with moments.
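A standard textbook counterexample makes the failure of the other direction concrete. It is not the example from the lecture notes, just a minimal sketch with hypothetical function names. Take the measures mu_n = (1 - 1/n) δ_0 + (1/n) δ_n. They have all moments and converge weakly to δ_0, which is determined by its moments, but the k-th moment of mu_n is n^{k-1}, so the first moment stays at 1 instead of tending to 0 and the higher ones blow up.

```python
import math
from fractions import Fraction


def moment(n, k):
    """Exact k-th moment (k >= 1) of mu_n = (1 - 1/n) delta_0 + (1/n) delta_n."""
    return Fraction(1, n) * n ** k


def integrate(f, n):
    """Integral of a function f against mu_n (floating point)."""
    return (1 - 1 / n) * f(0.0) + (1 / n) * f(float(n))


for n in (10, 1000, 100000):
    # First and second moments do not converge to the moments of delta_0,
    # whereas the integral of the bounded continuous function cos
    # approaches cos(0) = 1.
    print(n, moment(n, 1), moment(n, 2), integrate(math.cos, n))
```

Checking one test function like cos is only an illustration, of course; the general argument is that for any bounded continuous f the term f(n)/n is at most sup|f|/n in absolute value and hence vanishes in the limit.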

Of course this is not a new insight. In the context of proving the convergence to the semicircle for Wigner matrices with non-zero mean for the entries we know that we cannot do this with moments (see for example, Remark 11 in Chapter 4 of my book with Jamie).

To get a kind of positive spin out of this annoying mistake, I started to think about what kind of convergence we actually want in our theorems in free probability. Usually our convergence is in distribution, i.e., we are looking at moments, which seems to be the natural thing to do in the multivariate case of several non-commuting operators. However, we can also project things down to the classical world of one variable by taking functions in our operators and asking for the convergence of all such functions. And then there might be a difference whether we ask for weak convergence or for convergence in distribution (i.e., convergence of all moments).

This might become kind of relevant in the context of rational functions. Sheng Yin showed in Non-commutative rational functions in strongly convergent random variables that convergence in distribution carries over from polynomials to rational functions (in the case where we assume that the rational function in the limit is a bounded operator) if we assume strong convergence on the level of polynomials (i.e., also convergence of the operator norms). Without the assumption of strong convergence it is easy to see that there are examples (see page 12 of the paper of Sheng) where one has convergence in distribution for the polynomials, but not for the rational functions. However, though one does not have convergence of the moments of the rational function, it is still true in this example that one has weak convergence of the (selfadjoint) rational function. So maybe it could still be the case that, even without strong convergence assumptions, convergence in distribution for polynomials (or maybe weak convergence for polynomials) implies weak convergence for rational functions. At least at the moment we do not know a counterexample to this.

Another online seminar: Wales MPPM Zoom Seminar

At the moment there are many online activities going on … and here is another one: the Wales Mathematical Physics Zoom Seminar, organized by Edwin Beggs, David Evans, Gwion Evans, Rolf Gohm, and Tim Porter.

Why do I mention this one in particular? There are at least two reasons. Today there is a talk by Mikael Rordam around the Connes embedding problem, and next week I will give a talk on my joint work with Tobias Mai and Sheng Yin of the last years around rational functions of random matrices and operators.

If you are interested in any of this, here is the website of the seminar, where you can find more information.

Update: The talks are usually recorded and posted on a YouTube channel. There you can find my talk on “Random Matrices and Their Limits”.

The saga ends …

I have now finished my class on random matrices. The last lecture motivated the notion of (asymptotic) freeness from the point of view of looking at independent GUE random matrices. So you might think that there should now be continuations on free probability and the like coming soon. But actually this part of the story was already written and recorded, and if you don’t want to spoil the tension you should watch the series not in its historical but in its logical order:

  1. Random Matrices (videos, homepage of class)
  2. Free Probability Theory (videos, homepage of class)
  3. Non-commutative Distributions (and Operator-Valued free Probability Theory) (videos, homepage of class)

More information, in particular the underlying script (sometimes in a handwritten version, sometimes in a more polished texed version), can be found on the corresponding home page of the lecture series.

May freeness be with you …

Is there an impact of a negative solution to Connes’ embedding problem on free probability?

There is an exciting new development on Connes’ embedding problem. The recent preprint MIP*=RE by Ji, Natarajan, Vidick, Wright, Yuen claims to have solved the problem in the negative, via a negative answer to Tsirelson’s problem, which in turn comes from the relation to decision problems on the class MIP* of languages that can be decided by a classical verifier interacting with multiple all-powerful quantum provers. I have to say that I don’t really understand what all this is about, but in any case there is quite some excitement about this and there seems to be a good chance that Connes’ problem might have a negative solution. To get some idea about the excitement around this, you might have a look at the blogs of Scott Aaronson or of Gil Kalai. On the operator front I have not yet seen much discussion, but it might be that we still have to get over our bafflement.

Anyhow, there is now a realistic chance that there are type II_1 factors which are not embeddable and this raises the question (among many others) what this means for free probability. I was asked this by a couple of people and as I did not have a really satisfying answer I want to think a bit more seriously about this. At the moment my answer is just: Okay, we have our two different approaches to free entropy and a negative solution to Connes’ embedding problem means that they cannot always agree. This is because we always have for the non-microstates free entropy \chi^* that \chi^*(x_1+\sqrt\epsilon s_1,\dots,x_n+\sqrt\epsilon s_n)>-\infty, if s_1,\dots,s_n are free semicircular variables which are free from x_1,\dots,x_n. The same property for the microstates free entropy \chi, however, would imply that x_1,\dots,x_n have microstates, i.e., the von Neumann algebra generated by x_1,\dots,x_n is embeddable; see these notes of Shlyakhtenko.

But does this mean more than just saying that there are some von Neumann algebras for which we don’t have microstates but for which the non-microstates approach gives some more interesting information, or is there more to it? I don’t know, but hopefully I will come back with more thoughts on this soon.

Of course, everybody is invited to share more information or thoughts on this!

Welcome to the Non-Commutative World!

About two weeks ago I posted with Tobias Mai on the archive the preprint “A Note on the Free and Cyclic Differential Calculus”. Here is what we say in the abstract:

In 2000, Voiculescu proved an algebraic characterization of cyclic gradients of noncommutative polynomials. We extend this remarkable result in two different directions: first, we obtain an analogous characterization of free gradients; second, we lift both of these results to Voiculescu’s fundamental framework of multivariable generalized difference quotient rings. For that purpose, we develop the concept of divergence operators, for both free and cyclic gradients, and study the associated (weak) grading and cyclic symmetrization operators, respectively. On the one hand, this puts a new complexion on the initial polynomial case, and on the other hand, it provides a uniform framework within which also other examples – such as a discrete version of the Ito stochastic integral – can be treated.

At the moment I am not in the mood to say anything more specific about this preprint (maybe Tobias or I will do so later), but I want to take the opportunity, in particular as the first anniversary of this blog is also coming closer, to put this in a bigger context and mumble a bit about the bigger picture and our dreams … so actually about what this blog should be all about.

Free probability theory has come a long way. Although it was born in the subject of operator algebras, the realization that it also has quite a bit to say about random matrices paved the way to its use in many (and, in particular, also applied) subjects. Hence there are now also papers in statistics, like this one, or in deep learning, like this one or this one, which use tools from free probability for their problems. The last word on how far the use of free probability goes in those subjects has surely not yet been spoken, but I am looking forward to seeing more on this.

This is of course all great and nice for our subject, but on the other hand there is also a bigger picture in the background, where I would hope for some more fundamental uses of free probability.

This goes roughly like this. There is the classical world, where we are dealing with numbers and functions and everything commutes; then there is our non-commutative world, where we are dealing with operators and limits of random matrices and where on the basic level nothing commutes. That’s where quite a bit of maximally non-commutative mathematics has been (and is still being) developed from various points of view:

  • free probability deals with a non-commutative notion of independence for non-commuting random variables;
  • there is a version of a non-commutative differential calculus which allows one to talk about derivatives in non-commutative variables; my paper with Tobias mentioned above is in this context and tries to formalize all this and push it a bit further;
  • free analysis (or free/non-commutative function theory) aims at a non-commutative version of classical complex analysis, i.e., a theory of analytic functions in non-commuting variables;
  • free quantum groups provide the right kind of symmetries for such non-commuting variables.

The nice point is that all those subjects have their own source of motivation but it turns out that there are often relations between them which are non-commutative analogues of classical results.

So, again this is all great and nice, BUT apart from the commutative and our maximally non-commutative world there is actually the, maybe most important, quantum world. This is of course also non-commutative, but only up to some point. There operators don’t commute in general, but commutativity is replaced by some other relations, like the canonical commutation relations, and there are actually still a lot of operators which commute (for example, measurements which are at space-like positions are usually modeled by commuting operators). Because of this commutativity, basic concepts of free probability do not have a direct application there.

Here is a bit more concretely what I mean by that. In free probability we have free analogues of such basic concepts as entropy or Fisher information. There are a lot of nice statements and uses of those concepts, and via random matrices they can also be seen as arising as a kind of large N limit of the corresponding classical concepts. However, in the classical world those concepts usually also have a kind of operational meaning by being the answer to fundamental questions. For example, the classical Shannon entropy is the answer to the question how much information one can transmit over classical channels. Now there are quantum channels and one can ask how much information one can transmit over them; again there are answers in terms of an entropy, but this is unfortunately not free entropy, but von Neumann entropy, a more commutative non-commutative cousin of classical entropy. There are just too many tensor products showing up in the quantum world which prevent a direct use of basic free probability concepts. But still, I am dreaming of finding some day operational meanings of free entropy and similar quantities.

Anyhow, I hope to continue to explain in this blog more of the concrete results and problems which we have in free probability and related subjects; but I just wanted to point out that there are also some bigger dreams in the background.

Class on “Random Matrices”, Winter Term 2019/20

Our winter term has just started, running from mid October 2019 to mid February 2020, with a two-week break around Christmas. This term I am giving an introduction to random matrices. Again, the lectures will be recorded and put online. The lectures can be found on our video platform; more info on the lectures is also on the website of the class.

The lectures will roughly follow the material from the same class of summer term 2018, for which there also exist texed lecture notes. There will be a few reorganizations and shifts in the material, so a new version of the lecture notes might also emerge sometime in the future …