The Paris-Saclay Center for Data Science: an interdisciplinary project around data
The subject of data science is the design of automated methods to analyze massive and complex data in order to extract useful information. Data science projects require expertise from a vast spectrum of scientific fields, ranging from research on methods (statistics, signal processing, machine learning, data mining, data visualization) through software building and maintenance to the mastery of the scientific domain in which the data originate.
The objectives of the CDS
The goal of this initiative is to establish an institutionalized agora in which these scientists can find each other, exchange ideas, initiate and nurture interdisciplinary projects, and share their experience of past data science projects. To foster synergy between data analysts and data producers, we propose to provide initial resources to help collaborations get off the ground, to mitigate the non-negligible risk taken by researchers venturing into interdisciplinary data science projects, and to encourage the use of unconventional forms of information transmission and dissemination, which are essential in this communication-intensive research area. The CDS fits well within the recent surge of similar initiatives at both the international and the national level, and it has the potential to make the University one of the international forerunners of data science.
Data science in human, natural, and engineering sciences
More than 250 permanent researchers in 35 laboratories participate in the CDS. On the mathematics/computer science side, the major research themes are
- machine learning,
- signal processing,
- data visualization,
- and databases.
At the same time, we focus on data coming from
- Biology and medicine
- Astrophysics, cosmology, and astrostatistics
- Natural language, text and music processing
- Particle and astroparticle physics
- Environment, atmospheric sciences, and oceanology
- Economy, finance, and insurance
- Social sciences and networks
- Engineering sciences
Balázs Kégl, CNRS, Laboratoire de l’Accélérateur Linéaire (LAL), firstname.lastname@example.org
Arnak Dalalyan, ENSAE, Laboratoire de Statistique (LS), ENSAE-CREST, email@example.com