Author Archives: rolandspeicher

PhD position for a project on “Free Probability Aspects of Neural Networks” at Saarland University

I have funding from the German Science Foundation DFG for a PhD position on the interrelations between free probability, random matrices, and neural networks. Below are more details. See also here for the announcement as a pdf.

Project: Neural networks are, roughly speaking, functions of many parameters in a high-dimensional space and the training of such a network consists in finding the parameters such that the function does what it is supposed to do on the “training inputs”, but also generalizing this in a meaningful way to “real test inputs”. Random matrix and free probability theory are mathematical theories which deal with typical behaviours of functions (which have an underlying matrix structure) in high dimensions and in the limit of large matrix size. Thus it is not surprising that those theories should have something to offer for describing and dealing with neural networks. Accordingly, there have been approaches relying on random matrix and/or free probability theory to investigate questions around deep learning. This interaction between free probability and neural networks is hoped to be bi-directional in the long run. However, the project will not address practical purposes of deep learning; instead we want to take the deep learning challenges as new questions around random matrices and free probability and we aim to develop those theories further on a mathematical level.

Prerequisites: Applicants should have an equivalent of a Master’s degree and a background in at least one of the subjects

  • free probability
  • random matrices
  • neural networks

and an interest in learning the remaining ones and, in particular, in working on their interrelations.

Application: Inquiries and applications should be addressed to Roland Speicher. Your application, in German or in English, should arrive before June 20, 2021. It should contain your curriculum vitae and an abstract of your Master’s thesis. Arrange also for at least one recommendation letter to be sent directly to Roland Speicher, preferably by email. State in your application the name of those you asked for such a letter .

Contact:
Prof. Dr. Roland Speicher
Saarland University
Department of Mathematics
Postfach 15 11 50
66041 Saarbrücken
Germany
speicher@math.uni-sb.de
https://www.uni-saarland.de/lehrstuhl/speicher/

Announcement of two talks on free probability at the Technion … and of some more talks

update (from Jan 27): the recordings of the talks of Tobias and mine have been uploaded to youtube, here are the direct links:

I will give a colloquium talk at the Math Department of the Technion, Israel on next Monday, January 25 – online, of course. They have the nice option of a pre-colloquium talk, which provides students with some background for the material appearing in the colloquium talk. Tobias agreed to give such a preparation for my talk. So he will give tomorrow (on Thursday, January 21) an introduction to free probability and its relation with random matrices. Surely a great opportunity for everyone to learn (more) about the subject.

My talk will, of course, have such material in the background, but I tried to prepare it in such a way that even without knowing about free probability one should be able to get the main ideas. So, it might help to know what free semi-circulars are, but it is not necessary (and not assumed) for my talk.

Below are the titles and abstracts of our talks; and here is a link to the Technion page with access information:

Colloquium

update: actually, next week seems to be a busy week for talks around free probability; don’t forget that the UC Berkeley Probabilistic Operator Algebra Seminar will be starting again, on Monday, January 25, with a talk of Friedrich Goetze; and then there will also be a talk by Serban Belinschi on “The Christoffel-Darboux kernel in noncommutative probability” at the Probability Seminar at  Warsaw University of Technology, on Tuesday, January 26.

Tobias Mai: What actually is free probability theory? (Thursday, January 21, 2021)

In my talk, I want to answer this question by giving an introduction to the underlying ideas, basic concepts, and fundamental results of free probability theory. In particular, I will highlight the deep connections of this field with random matrix theory.

Roland Speicher: Singularity of matrices in non-commuting variables and free probability (Monday, January 25, 2021)

The Edmonds’ problem asks to decide about the singularity of a given matrix with linear polynomials in commuting variables as entries, or more general to compute the rank of such a matrix over the field of rational functions. This problem has no known deterministic polynomial time algorithm and it relates to fundamental questions in complexity theory.

Recently, there has been much interest in analyzing a non-commutative variant of the Edmonds’ problem, where the entries are linear polynomials in non-commuting variables and the rank is over the field of non-commutative rational functions (aka free skew field). Garg, Gurvits, Oliveira, and Wigderson showed that for this non-commutative Edmonds’ problem there exists a deterministic polynomial time algorithm. This problem has a remarkable number of diverse origins and motivations and I will present in my talk another such manifestation of the problem, arising from the relation with free probability and random matrix theory. In particular, this approach results also in another, quite analytic, algorithm for calculating the non-commutative rank.

This talk is based on joint work with Johannes Hoffmann, Tobias Mai, and Sheng Yin.

Nothing New on Connes’ Embedding

It’s now almost a year that we have been told that Connes’ embedding conjecture is not a conjecture anymore, but that it’s actually false. In principle, this is great news as it should open totally new playgrounds, with von Neumann algebras never seen before. The only problem is that we still have not seen them. I am sure that many are looking for them but as far as I am aware nobody outside the quantum information community was able to shed more light on the refutation of Connes’ embedding.

As a believer in the power of non-commutative distributions I tried all my arsenal of moments, cumulants, or Cauchy transforms to get a grasp on how such a non-embeddable von Neumann algebra could look like — of course, without any success. But let me say a few more words an some of my thoughts – if only to come up with a bit longer post for the end of the year.

In our non-commutative distribution language, the refutal of Connes’ embedding says that there are operators in a tracial von Neumann algebra whose mixed moments cannot be approximated by moments of matrices with respect to the trace. We have quite a few of distributions in free probability theory, but the main problem in the present context is that all of them usually can be approximated by matrices, and also all available constructions (like taking free products) preserve such approximations (in particular, since we can model free independence via conjugation by unitary random matrices). Very roughly: our constructions of distributions take some input and then produce some distribution — however, if the input is embeddable, then the output will be so, too. Thus I cannot use those constructions directly to make the leap from our known universe to the new ones which should be out there. The only way I see to overcome this obstruction is to look for distributions which create themselves “out of nothing” via such constructions, i.e., for fixed point distributions of those constructions. For such fixed point distributions I see at least no apriori reason to be embeddable.

But is there any way to make this concrete? My naive attempt is to use the transition from moments to cumulants (or, more analyticially, from Cauchy transforms to R-transforms) for this. We know that infinitely divisible distributions (in particular, compound Poisson ones) are given in the form that their free cumulants are essentially the moments of some other distribution. So I am trying to find reasonable fixed points of this mapping, i.e., I am looking for distributions whose cumulants are (up to scaling or shift) the same as their moments. Unfortunately, all concrete such distributions seem to arise via solving the fixed point equation in an iterative way – which is also bad from our embedding point of view, since those iterations also seem to preserve embeddability. So I have to admit complete and utter failure.

Anyhow, if the big dreams are not coming true, one should scale down a bit and see whether anything interesting is left … so let me finally come to something concrete, which might, or might not, have some relevance …

In the case of one variable we are looking on probability measures, and as those can be approximated by discrete measures with uniform weights on the atoms (thus by the distribution of matrices), this situation is not relevant for Connes’ embedding question. However, I wonder whether a fixed point of the moments-to-cumulants mapping in this simple situation has any relevance. The only meaningful mapping in this case seems to be that I take a moment sequence, shift it by 2 and then declare it as a cumulant sequence — necessarily of an infinitely divisible distribution. Working out the fixed point of this mapping gives the following sequence of even moment/cumulants: 1, 1, 3, 14, 84, 596. The Online Encyclopedia of Integer Sequences labels this as A088717 — which gives, though, not much more information than the fixed point equation for the generating power series.

The above moment-cumulant mapping was of course using free cumulants. Doing the same with classical cumulants gives by a not too careful quick calculation the sequence 1, 1, 4, 34, 496, which seems to be https://oeis.org/A002105 — which goes under the name ”reduced tangent numbers”. There are also a couple of links to various papers, which I still have to check …

Okay, I suppose that’s it for now. Any comment on the relevance or meaning of the above numbers, or their probability distributions, would be very welcome – as well, as any news on Connes’ conjecture.

Lecture Notes on Non-Commutative Distributions

Waiting has come to an end … finally the pdf edition of the Lecture Notes on Non-Commutative Distributions has arrived. As a bonus for loyal followers I have added, compared to the actual content of the lecture series, two small sections at the end on what our machinery has to say about Connes embedding problem and the q-Gaussian distribution. Though, don’t expect too much there …

On the origin of moment-cumulant formulas

When I gave a class on free probability theory a few years ago, I thought it would be a good idea to localize evidence for my usual statement that in the classical context the idea of viewing moment-cumulant formulas in terms of (multiplicative functions on) set partitions, as well as the vanishing of mixed cumulants in independent random variables, goes back to Rota; the main reference on this seemed to be the Foundations of Combinatorial Theory papers, part I and part VI. This is at least what I said in my old papers, like here or here, and what Jonathan Novak, for example, also iterates in his nice Three Lectures on Free Probability. But when I tried to find any mentioning of cumulants in those papers of Rota I could not localize anything. Also in the paper of Rota with Shen, On the Combinatorics of Cumulants, there is no clear mentioning of vanishing of mixed cumulants. I am still quite sure that I learned a lot and was inspired by the papers of Rota, but maybe this was more about multiplicative functions, and cumulants did not show up there explicitly. At this point I decided to ask Jonathan whether he has some clearer idea about the origin of the classical moment-cumulant formulas. Here is what he said:

About your question, I remember also having a difficult time tracking down a proof of the equivalence of independence and mixed cumulants vanishing in the literature.  I actually think that the earliest paper where this statement is explicitly made, with a complete proof given, is “Cumulants and partition lattices,” by T.P. Speed, Australian Journal of Statistics 25 (1983), 378-388. An annotated version of this paper appears in the collected works of Speed, edited by P. McCullagh (Chapter 6 of the volume). I hope this helps; I don’t know an earlier reference.

I was happy with this and more or less forgot about it. But a few days ago the same issue came up after a talk of Philippe Biane in the online seminar on Algebraic and Combinatoiral Perspectives in the Mathematical Sciences. It seems a couple of people are interested in this and could also provide a bit more information on aspects of the origin of moment-cumulant formulas, and maybe cumulants in general. So I thought it might be a good idea to collect here this information and invite others to add possibly some more remarks on this history.

Franz Lehner offered the following insightful remarks:

Here are some considerations concerning “Rota’s approach to cumulants”.

Both in his posthumous paper Rota/Shen: On the combinatorics of cumulants, J. Combin. Theory Ser. A 91 (2000), and in his Fubini lectures Twelve problems in probability no one likes to bring up he talks about the “Rota approach” but always with reference to Speed. So apparently he never published it himself explicitly, although he certainly knew it for a long time. Speed did not prove any new results, but gave elegant lattice theoretic proofs of known results (his notation is a bit messy though).

On the other hand it must be said that Rota did not invent the Möbius function either as he repeatedly mentions in his 1964 paper, but he clearly saw its fundamental importance (and proved some important results). Rota was a bird in the sense of Dyson and without his efforts to systematize and popularize it, the Möbius function would have remained in its oblivious state for yet another generation.

According to Rota, the Möbius function was invented by Weisner in the thirties. Multiplicative functions and the reduced incidence algebra were introduced in Doubilet/Rota/Stanley Foundations of Combinatorial Theory VI: the idea of generating function, 6th Berkeley Symposium on Probability, 1970/71. Cumulants are not mentioned there, but maybe not without reason this paper appeared in a Symposium on Probability. Similarly his 1964 paper on Möbius functions was not probabilistic, yet it appeared in Probability Theory and Related Fields. Again without mentioning cumulants explicitly, probably because “to prevent the length of this paper from growing beyond bounds, we have omitted applications of the theory”.

He just mentions in passing on p.359 that Schützenberger computed the Möbius function of the partition lattice (independently of Frucht and himself). Indeed in the cited paper Contribution aux applications statistiques de la théorie de l’information (Publ. Inst. Statist. Univ. Paris, 3(1-2):3–117, 1954, Thèse d’État) Schützenberger states as a remark on p.24 the Möbius formula for cumulants. To my knowledge this is the earliest occurrence; Leonov and Shiryaev also use the partition formalism in their 1959 paper, but apparently don’t know the concept of Möbius inversion.

In the statistics literature these developments went largely unnoticed for a long time and the graph theoretic calculation rules of Fisher, Kendall, James etc, remained the tool of choice, see the foreword by McCullagh to Speed’s collected works.

Joachim Kock added the following:

In Kendall’s ‘Advanced Theory of Statistics’ from 1945, there is already an explicit formula for cumulants in terms of moments, and the Möbius function (-1)^{n-1} (n-1)! appears explicitly in the formula! But of course, he doesn’t know that this combinatorial factor is the Möbius function.

In the notes he attributes various moment-cumulant relations to Frisch’s PhD thesis (Oslo 1926), but I don’t know if this particular formula is in there.

Regarding the Möbius function for posets, Weisner’s paper is from 1935 but it only deals with complete lattices, whereas Hall (independently) has the Möbius function for finite posets in his 1936 paper. In both cases, their proofs actually work the same for locally finite posets, which is Rota’s level of generality. (Stretching it a little bit, it is actually the same arguments that work for Möbius categories, and for certain abstract coalgebras.)

In case you are in a historical mood, allow me to advertise my paper From Möbius inversion to renormalisation. (It has no cumulants, though.)

Referring to the combinatorial factor, Franz could add some more insights:

Yes, this expression is already in Thiele’s 1899 paper (reprinted in Anders Hald The Early History of the Cumulants and the Gram-Charlier Series, International Statistical Review 68 (2000) 137-153, in English), but of course not realized as Möbius function, because that one was not known before Schützenberger.

It remains to clarify who was the first to explicitly write moments as a sum over set partitions. Leonov & Shiryaev just infer it from the factorial formula without comment and Schützenberger simply says “nous supposerons connu le fait”.

Thanks to Franz and Joachim for their remarks. Any more comments are welcome …

Another blog on “Free Probability” by Teo Banica

Teo Banica got a bit bored by the lockdown and started to write a series of blogs on various topics, close to his heart and his knowledge – one of them is also one free probability. Check it out here. It’s written in Teo’s personal style, which might seem annoying or provocative to some, but in any case it’s interesting …

Update (September 2020): It seems that Teo got also bored or annoyed of his own blog, so the link above does not work any more … but much of the material has actually been moved to lecture notes and videos. In particular, Teo has the goal of trying to reorganize the quantum group basics, via a series of books. Probably the best to stay updated on this is to check his website or his YouTube channel

Correction on my lecture notes on random matrices (weak convergence versus convergence of moments)

I just noticed that I have a stupid mistake in my random matrix lecture notes (and also in the recording of the corresponding lecture). I am replacing the notes with a new version which corrects this.

In Theorem 4.16, I was claiming that weak convergence is equivalent to convergence of moments, in a setting where all moments exist and the limit is determined by its moments. Of course, this is a too optimistic statement. What is true is the direction that convergence of moments implies weak convergence. That’s the important direction. The other direction would be more of a relevance for combinatorial aficionados like me, as it would allow me to claim that the combinatorial and the analytical approach in such a setting are equivalent. However, the other direction is clearly wrong without some additional assumptions; and thus there are nice-looking situations where one cannot prove weak convergence by dealing with moments.

Of course this is not a new insight. In the context of proving the convergence to the semicircle for Wigner matrices with non-zero mean for the entries we know that we cannot do this with moments (see for example, Remark 11 in Chapter 4 of my book with Jamie).

To get a kind of positive spin out of this annoying mistake, I started to think about what kind of convergence we actually want in our theorems in free probability. Usually our convergence is in distribution, i.e., we are looking on moments – which seems to be the natural thing to do in the multivariate case of several non-commuting operators. However, we can also project things down to the classical world of one variable by taking functions in our operators and ask for the convergence of all such functions. And then there might be a difference whether we ask for weak convergence or for convergence in distribution (i.e., convergence of all moments).

This might become kind of relevant in the context of rational functions. Sheng Yin showed in Non-commutative rational functions in strong convergent random variables that convergence in distribution goes over from polynomials to rational functions (in the case where we assume that the rational function in the limit is a bounded operator) if we assume strong convergence on the level of polynomials (i.e., also convergence of the operator norms). Without the assumption of strong convergence it is easy to see that there are examples (see page 12 of the paper of Sheng) where one has convergence in distribution for the polynomials, but not for the rational functions. However, though one does not have convergence of the moments of the rational function, it is still true in this example that one has weak convergence of the (selfadjoint) rational function. So maybe it could still be the case that, even without strong convergence assumptions, convergence in distribution for polynomials (or maybe weak convergence for polynomials) implies weak convergence for rational functions. At least at the moment we do not know a counter example to this.

Another online seminar: Wales MPPM Zoom Seminar

At the moment there are many online activities going on …. and here is another one: the Wales Mathematical Physics Zoom Seminar, organized by Edwin BeggsDavid EvansGwion Evans,Rolf GohmTim Porter.

Why do I mention in particular this one; there are at least two reasons. Today there is a talk by Mikael Rordam around the Connes embedding problem, and next week I will give a talk, on my joint work with Tobias Mai and Sheng Yin of the last years around rational functions of random matrices and operators.

If you are interested in any of this, here is the website of the seminar, where you can find more information.

Update: The talks are usually recorded and posted on a youtube channel. There you can find my talk on “Random Matrices and Their Limits”.

The saga ends …

I have now finished my class on random matrices. The last lecture motivated the notion of (asymptotic) freeness from the point of view of looking on independent GUE random matrices. So you might think that there should now be continuations on free probability and alike coming soon. But actually this part of the story was already written and recorded and if you don’t want to spoil the tension you should watch the series not in its historical but in its logical order:

  1. Random Matrices (videos, homepage of class)
  2. Free Probability Theory (videos, homepage of class)
  3. Non-commutative Distributions (and Operator-Valued free Probability Theory) (videos, homepage of class)

More information, in particular the underlying script (sometimes in a handwritten version, sometimes in a more polished texed version), can be found on the corresponding home page of the lecture series.

May freeness be with you …