Statistical Mechanics of Learning Machines: from algorithmic and information-theoretical limits to new biologically inspired paradigms

PRIN 2022 9T9EAT - CUP J53D23003640001 - PI: Daniele Tantari

Areas: Mathematical physics

Funded by the European Union - NextGenerationEU under the National Recovery and Resilience Plan (PNRR) - Mission 4 Education and research - Component 2 From research to business - Investment 1.1 Notice PRIN 2022 - DD N. 104 of 2/2/2022

The increasing availability of massive data sets, with ever larger volume, variety and velocity, has set the direction of recent technological progress, in which machine learning (ML) has emerged as the key paradigm of modern artificial intelligence systems.

On the one hand, despite many impressive achievements, there are serious gaps in our theoretical understanding of learning systems. Deep neural networks (DNNs) are often used as big black boxes: common deep learning (DL) practices (architectural choices, parameter fine-tuning) can mostly be viewed as alchemy, since nobody really understands what makes them work.

On the other hand, precisely because of those impressive achievements, ranging from image and speech recognition, language and text translation to computer vision and prediction of protein structures, the scientific community's confidence in the power of these systems is growing unchecked and is pushing expectations even further: from systems that succeed in specific tasks to more intelligent generalist models that can learn a sort of common sense of the world and perform multiple tasks. However, this ambition is running up against the long-standing bottleneck of data scarcity: it is impossible to label everything in the world.

A deeper understanding of current learning systems is thus becoming essential, not only to satisfy a noble thirst for mathematical knowledge but also to avoid a new dark age of machine learning and to foster a new generation of systems able to acquire new skills without such a massive amount of labeled data, with a stronger capacity for abstraction, and thus closer to biological and human-level intelligence. To this end it is important, as mathematicians, to:

assess the information-theoretical limits of learning: given certain models for the learning machine and the environment, how much information (in terms of amount of data) is necessary for the machine to produce a faithful representation of, and correct predictions about, the environment, regardless of any computational barrier?
assess the actual limits of algorithms: whatever information is encoded in the dataset has to be retrieved efficiently. It is important to understand whether existing algorithms are able to extract this information and how they use it to build a representation of the environment with the machine parameters.
propose new learning approaches inspired by biological and human intelligence: new learning machines, new learning dynamics and algorithms, and new paradigms for data-driven artificial intelligence that are less task-specific, more efficient and more parsimonious with data.

Following this route, the project intends to develop a mathematically grounded understanding of ML by merging ideas from statistical inference, statistical mechanics and information theory. At the same time, it aims to propose new learning paradigms, models and algorithms inspired by biological and human intelligence.