## SIMPAS

### SIMPAS - Signal IMage Probabilités numériques Apprentissage Statistique

### SIMPAS Team (Signal IMage numerical Probability leArning Statistics)

**Scientific coordinators:** E. Gobet and E. Le Pennec

This research team brings together CMAP researchers working on randomness and stochastics (in the broad sense), whose work centers on the numerical processing of data or the simulation of random models, ranging from the theoretical foundations of algorithms and methods to their practical software implementation.

**Confirmed permanent researchers:**

Stéphanie Allassonnière (Professeur Chargé de Cours Ecole Polytechnique)

Emmanuel Bacry (Chargé de Recherches CNRS and Professeur Associé Ecole Polytechnique)

Antonin Chambolle (Directeur de Recherches CNRS and Professeur Chargé de Cours Ecole Polytechnique)

Stefano De Marco (Professeur Chargé de Cours Ecole Polytechnique)

Gersende Fort (Directeur de Recherches CNRS and Professeur Chargé de Cours Ecole Polytechnique)

Stéphane Gaiffas (Professeur Chargé de Cours Ecole Polytechnique)

Emmanuel Gobet (Professeur Ecole Polytechnique)

Erwan Le Pennec (Professeur Associé Ecole Polytechnique)

Guillaume Lecué (2012-2015, Chargé de Recherches CNRS and Professeur Chargé de Cours Ecole Polytechnique)

Eric Moulines (Professeur Ecole Polytechnique)

**Associated researchers:**

Agathe Guilloux (Maître de Conférences Université Pierre et Marie Curie)

Sophie Laruelle (Maître de Conférences Université Paris-Est Créteil)

Marc Lavielle (Directeur de Recherches INRIA)

Marc Lelarge (Chargé de Recherches INRIA)

**Post-doctoral researchers and engineers:**

Christos Giatsidis (2015- ) : Data Science initiative

Jacopo Mastromatteo (2014-2015) : statistics of order books

Maryan Morel (2015- ) : project CNAM

Roque Porchetto (2015- ) : project TEMPO

Plamen Turkedjiev (2013-2015) : simulation of non-linear processes

Samuel Vaiter (2014- ) : variational regularization in signal and image processing

**PhD students supervised in CMAP:**

Massil Achab (2014- ), supervised by E. Bacry and S. Gaiffas

Martin Bompaire (2015- ), supervised by E. Bacry and S. Gaiffas

Romain Bompis (2010-2013), supervised by E. Gobet : Asymptotic expansions for the approximations of diffusion processes

Etienne Corman (2013- ), supervised by A. Chambolle and M. Ovsjanikov (LIX) : Matching of shapes

Raphael Deswartes (2014- ), supervised by G. Lecué

Loïc Devilliers (2015- ), supervised by S. Allassonnière

Adrian Iuga (2010-2013), supervised by E. Bacry and M. Hoffmann (Univ. Paris-Dauphine) : Modeling and statistical analysis of the formation of prices through different scales

Thibault Jaisson (2012-2015), supervised by E. Bacry and M. Rosenbaum (UPMC) : studies of statistical problems from market microstructure

Gang Liu (2013- ), supervised by E. Gobet and P. Del Moral (INRIA Alea) : Simulation of rare events

Gustaw Matulewicz (2014- ), supervised by S. Gaiffas, E. Gobet and M. Vazirgiannis

Isaque Pimentel (2015- ), supervised by E. Gobet and X. Warin (EDF)

Jean-Baptiste Schiratti (2014- ), supervised by S. Allassonnière

Qihao She (2013- ), supervised by E. Gobet and N. Privault (NTU, Singapore)

Pauline Tan (2013- ), supervised by A. Chambolle and P. Monasse (CERTIS, ENPC) : stereo vision

Alain Virouleau (2015- ), supervised by E. Bacry and S. Gaiffas

Hao Xu (2011-2014), supervised by S. Allassonnière and B. Thirion (INRIA Parietal)

**PhD students supervised outside CMAP:**

Islem Rekik (2010-2013), supervised by S. Allassonnière and J. Wardlaw (neuroradiologist, Univ. of Edinburgh)

Mokhtar Alaya (Univ Paris 6), supervised by S. Gaiffas

Benoit Baylin (2015- , Telecom Paris), supervised by G. Fort

Hajer Braham (2012-2015, Telecom Paris), supervised by G. Fort

Alain Durmus (2014- ), supervised by G. Fort and E. Moulines

Lucie Montuelle (2011-2014, Univ. Paris Sud), supervised by E. Le Pennec

Solenne Thivin (2012-2015, Univ. Paris Sud), supervised by E. Le Pennec

**Main industrial and institutional partners:**

Chaire Axa Data Science for Insurance Sector (2015- )

Chaire Keyrus-Orange-Thales (2014- )

Chaire Havas, Economy of new data (2013- )

CNAM (2015- )

Data Science Initiative

EDF

INRIA Select, INRIA Parietal

Partnership with high-frequency data provider QUANTHOUSE

Research Initiative "Numerical methods for stochastic control equations" of FiME laboratory

Thales

**Main funding:**

ANR CAESARS "Control and simulAtion of Electrical Systems, interAction and RobustnesS", E. Gobet (2015-2019)

ANR Blanc international EANOI "Efficient Algorithms for Nonsmooth Optimization in Imaging", A. Chambolle, with Thomas Pock, TU Graz (2012-2015)

Digiteo project MMoVNI "Modélisation Mathématique de la Variabilité inter-sujets en Neuro-Imagerie" (mathematical modeling of inter-subject variability in neuroimaging), S. Allassonnière (2010-2014)

### Research topics

**Machine learning:** web mining, big data, high dimension, unsupervised or weakly supervised learning

We study learning from high-dimensional data. This presupposes an underlying low-dimensional structure: for example, large matrices of low rank, functions defined on a high-dimensional space but depending only on a small number of variables, or large graphs organized in small communities. Applications include social networks, textual and semantic analysis, forecasting via aggregation of experts, web data mining and big data. Identifying each structure from (sometimes noisy) data requires developing specific statistical procedures. Penalization with a sparsity-inducing criterion is a typical high-dimensional learning technique. For large matrices of low rank, we have proposed procedures with penalties favouring certain structures, or with penalization of classical estimators. The performance of these procedures depends only on the intrinsic dimension of the problem and not on the dimension of the ambient space. We are interested in applications to collaborative filtering.
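As a rough illustration of penalization with a sparsity-inducing criterion, the sketch below applies soft-thresholding, which is the closed-form minimizer of a squared error plus an l1 penalty in one dimension. It is a generic textbook example and not the team's procedures; the names `soft_threshold` and `sparse_estimate`, the data and the penalty level are all hypothetical.

```python
def soft_threshold(y, lam):
    """Minimizer over x of 0.5 * (x - y)**2 + lam * abs(x)."""
    if y > lam:
        return y - lam
    if y < -lam:
        return y + lam
    return 0.0  # small observations are set exactly to zero


def sparse_estimate(observations, lam):
    """Apply the l1-penalized estimator coordinate by coordinate."""
    return [soft_threshold(y, lam) for y in observations]


# Hypothetical noisy observations of a vector with two large entries.
noisy = [3.0, 0.2, -0.1, -2.5, 0.05]
print(sparse_estimate(noisy, lam=0.5))
```

Coordinates whose magnitude is below the penalty level are set exactly to zero, which is how the penalty enforces sparsity; the surviving coordinates are shrunk toward zero by the same amount.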

**Stochastic approximations:** asymptotic methods, optimization and stochastic algorithms, MCMC and Bayesian inference

We study approximations of the law of diffusion processes (approximation of the marginal density and of expectations of path functionals) in the form of Gaussian perturbations (proxy method and Malliavin calculus), under given regularity assumptions. Large deviations techniques also make it possible to capture short-time behavior. All these techniques lead to new explicit representations, either as analytical formulas or as Gaussian simulations with corrective terms. Extensions to non-linear processes (backward SDEs, McKean SDEs) and to non-Brownian noise are ongoing research. These results form a wide set of approximations that are useful in many applications and in other numerical stochastic methods. In addition, we develop efficient stochastic algorithms for Bayesian estimation and classification; the applications, in particular in medical imaging, are very important and require very powerful methods for real-time imaging. Other applications concern parametric and semi-parametric inference for hierarchical models. Regarding the theory of MCMC methods, we analyze Markov chains on general state spaces that have subgeometric convergence rates, with a focus on the mixing property in high dimension and on mean-field problems. We also study adaptive MCMC methods.

**Monte Carlo methods:** empirical regression in high dimension and non-linear stochastic processes, particle methods, rare events, large deviations

We study the effective resolution, by Monte Carlo simulation and empirical regression, of the dynamic programming equations arising in stochastic control, or of backward SDEs, possibly with interaction. We seek to handle increasingly general non-linearities for which no numerical methods exist yet. This requires developing tools that account for the effects of dimension, the unboundedness of the approximating functions, general distribution measures (sometimes in a feedback loop), the sparsity of the representations, etc. Mean-field games constitute an ambitious framework. In addition, we couple large deviations techniques in continuous time with particle methods at discrete times in order to simulate more efficiently the large deviations of continuous-time processes. We also develop parallel versions of particle methods (particle islands).

**Mathematical statistics:** non-parametric estimation, model selection, classification, dimension reduction

We study theoretical problems of selection or aggregation of estimators in a high-dimensional context. For the aggregation of estimators, we have built two optimal procedures, even though this framework is difficult because the classical statistical procedures are sub-optimal here. Other results have been obtained on the cross-validation method, on aggregation with exponential weights for the convex aggregation problem, and on the single-index model. We have also obtained results on model selection for conditional density estimation, with applications to image segmentation.
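To make aggregation with exponential weights concrete, here is a minimal sketch in which each candidate estimator receives a weight proportional to the exponential of minus its empirical risk divided by a temperature. It shows the generic scheme only, not the optimal procedures mentioned above; the function names, the risks and the temperature are hypothetical.

```python
import math


def exponential_weights(risks, temperature):
    """Weights proportional to exp(-risk / temperature), normalized to sum to 1."""
    shift = min(risks)  # subtracting the minimum avoids underflow, weights unchanged
    raw = [math.exp(-(r - shift) / temperature) for r in risks]
    total = sum(raw)
    return [v / total for v in raw]


def aggregate(predictions, weights):
    """Convex combination of the candidate predictions."""
    return sum(w * p for w, p in zip(weights, predictions))


# Hypothetical empirical risks of three estimators and their predictions at one point.
risks = [0.10, 0.50, 0.90]
weights = exponential_weights(risks, temperature=0.2)
print(weights, aggregate([1.0, 1.4, 2.0], weights))
```

A small temperature concentrates the weights on the estimator with the lowest empirical risk, while a large temperature approaches a uniform average.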

**Statistics for stochastic processes:** Scale-invariance phenomena are commonly observed in finance and turbulence. We devote part of our work to them, in particular through multi-fractal approaches (cf. Bacry's multi-fractal random walk, which is nowadays a reference in the field). We are particularly interested in estimation problems within this multi-fractal framework. In addition, we study the statistics of point processes from the probabilistic point of view (in particular the link with scale invariance at the diffusive level), from the statistical point of view, and from the point of view of applications. We are also working on Hawkes processes in high dimension in order to better understand the dynamics of information diffusion on a network (with applications to the social network Twitter and to systemic risk on financial markets).
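For readers unfamiliar with Hawkes processes, the sketch below simulates a one-dimensional Hawkes process with an exponential kernel using Ogata's thinning algorithm. It is a simplified, assumption-laden illustration (univariate, intensity `mu + sum(alpha * exp(-beta * (t - s)))` over past events `s`, with `alpha / beta < 1` for stability), not the high-dimensional models studied by the team; all names and parameter values are hypothetical.

```python
import math
import random


def intensity(t, events, mu, alpha, beta):
    """Conditional intensity at time t given the past event times."""
    return mu + sum(alpha * math.exp(-beta * (t - s)) for s in events if s < t)


def simulate_hawkes(mu, alpha, beta, horizon, seed=0):
    """Simulate event times on [0, horizon) by Ogata's thinning."""
    rng = random.Random(seed)
    events = []
    t = 0.0
    while True:
        # The intensity only decays between events, so its value just after t
        # (counting an event at t, if any) bounds it until the next event.
        lam_bar = mu + sum(alpha * math.exp(-beta * (t - s)) for s in events)
        t += rng.expovariate(lam_bar)  # candidate time from the dominating process
        if t >= horizon:
            break
        # Accept the candidate with probability intensity(t) / lam_bar.
        if rng.random() * lam_bar <= intensity(t, events, mu, alpha, beta):
            events.append(t)
    return events


events = simulate_hawkes(mu=1.0, alpha=0.5, beta=1.0, horizon=20.0, seed=42)
print(len(events), events[:5])
```

Each accepted event raises the intensity by `alpha`, so events tend to cluster, which is the self-exciting behavior that makes these processes natural for modeling cascades on a network. The loop recomputes the full sum at every step for readability; a recursive update of the intensity would be preferable for long simulations.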

**Signal and image processing:** adaptive methods, adapted representations, multi-scale analysis, multi-fractal analysis, compressed sensing, variational methods, unsupervised classification using mixtures and model selection

Part of the group's research concerns the statistical analysis of images, with applications to medical imaging. Typically, one wishes to estimate a full atlas of the human brain from a population of images, with theoretical guarantees on the estimation. This involves (1) modeling large varieties of multimodal images, (2) numerically estimating the atlas from these models (statistical learning), and (3) proving the statistical relevance of the estimator, so as to provide doctors with an atlas whose confidence can be quantified. Theoretical results on density estimation have also been used for the segmentation of hyperspectral images by mixture methods. Variational methods in image processing are also investigated in the team.