Boltzmann Machines beyond the "independent identically distributed" Paradigm: a Mathematical Physics Approach

PRIN 2022 B5LF52 - CUP J53D23003690006 - PI: Pierluigi Contucci

Funded by the European Union - NextGenerationEU under the National Recovery and Resilience Plan (PNRR) - Mission 4 Education and research - Component 2 From research to business - Investment 1.1 Notice Prin 2022 - DD N. 104 del 2/2/2022

The project aims to go beyond the paradigm of independent, identically distributed (i.i.d.) disorder that the theory of complex systems has followed so far. From its birth in the mid-1970s, through the Parisi theory of the spin glass, up to the recent rigorous mathematical results, this theory has mostly been developed under the hypothesis that the random variables appearing in the Hamiltonian function are i.i.d.

While this framework is a natural and physically motivated starting point for investigating magnetic alloys, today's most striking applications of disordered systems, such as those coming from machine learning and inference, require going beyond it. Indeed machine learning, or more generally any high-dimensional statistical inference task, can be interpreted as solving a statistical mechanics system of interacting particles with random parameters, usually called a Boltzmann Machine.

The data, represented by a suitable distribution, are in fact mapped into a distribution of those parameters, which in general have no reason to be either independent or identically distributed.

More precisely, our plan is to proceed along three different and somewhat complementary themes.

Multispecies structure

Beyond the identical distribution assumption: instead of relying on the full permutation symmetry among particles, we propose to break it by dividing the particles into different groups. This leads to the multispecies structure where, in particular, we will focus on the non-convex setting, which contains the case of deep architectures.
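As an illustration only, the multispecies idea can be sketched numerically: the particles are split into groups, and the variance of each random coupling depends only on the groups of the two particles it connects. Everything below is an assumption made for the example, not the project's own model: the names `multispecies_couplings` and `energy`, the Gaussian couplings, the 1/N variance scaling, and the three-layer variance matrix `delta`, in which only adjacent layers interact. That matrix is indefinite, which is the sense in which a layered, deep-architecture-like model is taken here to fall in the non-convex setting.

```python
import numpy as np

def multispecies_couplings(sizes, delta, rng):
    """Gaussian couplings whose variance depends only on the species pair.

    sizes: number of particles in each species (hypothetical example input)
    delta: symmetric matrix of variances between species (assumed form)
    """
    labels = np.repeat(np.arange(len(sizes)), sizes)  # species of each particle
    n = labels.size
    # Look up the variance for every pair (i, j) from the species of i and j.
    variance = delta[labels[:, None], labels[None, :]]
    j = np.sqrt(variance / n) * rng.standard_normal((n, n))
    return j, labels

def energy(j, sigma):
    # Assumed convention: H(sigma) = -sum_{i,j} J_ij sigma_i sigma_j.
    return -sigma @ j @ sigma

rng = np.random.default_rng(0)
# Three layers; only adjacent layers interact, as in a layered architecture.
delta = np.array([[0.0, 1.0, 0.0],
                  [1.0, 0.0, 1.0],
                  [0.0, 1.0, 0.0]])
j, labels = multispecies_couplings([4, 5, 3], delta, rng)
sigma = rng.choice([-1.0, 1.0], size=labels.size)  # a random spin configuration
h = energy(j, sigma)
```

Within this sketch, setting every entry of `delta` to the same value restores identically distributed couplings, so the species structure is the only departure from that case.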

Multiscale measure

Beyond the independent distribution assumption: we plan to investigate a correlated noise structure called multiscale measure, introduced within the rigorous approaches to the Parisi solution. We aim to analyse multispecies models endowed with a multiscale structure from both the static and the dynamic point of view. In particular, we are interested in out-of-equilibrium properties such as aging and violation of the Fluctuation-Dissipation relation, in connection with machine learning dynamics.

Orthogonally invariant noise

Again beyond the independent distribution assumption: we address spin-glass models with orthogonally invariant noise drawn from Orthogonal Random Matrix Ensembles, with particular emphasis on the inference task of retrieving a finite-rank matrix corrupted by orthogonally invariant noise, using methods borrowed from the Statistical Mechanics of Disordered Systems.
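To make the inference task concrete, here is a minimal sketch of a rank-one signal corrupted by orthogonally invariant noise. The noise is built as O D O^T, with O a Haar-distributed orthogonal matrix and D a diagonal matrix of eigenvalues. The function names, the uniform spectrum on [-1, 1], the signal strength, and the use of the top eigenvector as an estimator are all assumptions chosen for illustration; the spectral estimate is a simple baseline, not the statistical mechanics methods the project refers to.

```python
import numpy as np

def haar_orthogonal(n, rng):
    # QR of a Gaussian matrix; fixing the signs of R's diagonal
    # makes Q Haar-distributed on the orthogonal group.
    q, r = np.linalg.qr(rng.standard_normal((n, n)))
    return q * np.sign(np.diag(r))

def spiked_observation(n, strength, rng):
    """Rank-one signal plus orthogonally invariant noise (illustrative)."""
    x = rng.choice([-1.0, 1.0], size=n) / np.sqrt(n)  # unit-norm signal
    o = haar_orthogonal(n, rng)
    spectrum = rng.uniform(-1.0, 1.0, size=n)  # assumed noise eigenvalues
    noise = (o * spectrum) @ o.T               # O diag(spectrum) O^T
    return strength * np.outer(x, x) + noise, x

rng = np.random.default_rng(0)
y, x = spiked_observation(200, 3.0, rng)

# Baseline estimator: eigenvector of the largest eigenvalue of the observation.
eigvals, eigvecs = np.linalg.eigh(y)
estimate = eigvecs[:, -1]
overlap = abs(estimate @ x)  # 1 means perfect recovery up to a global sign
```

Replacing `spectrum` with a different eigenvalue distribution changes the noise ensemble while keeping it orthogonally invariant, which is the freedom that independent Gaussian entries do not offer.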

The above generalizations of the standard Boltzmann Machines, and the investigation of their behavior in relation to data structure, network architecture and weight regularization, will provide a closer look at learning and inference regimes through the detection of phase transitions.

We believe that the proposed research project represents a fundamental step toward identifying the theoretical foundation that artificial intelligence still lacks.

This, in turn, will in the long term foster awareness and responsibility in its use in society.