The Data IT platform was created with three main functionalities:
- Mapping existing data repositories operated by CDS partners that are relevant for the interdisciplinary research of CDS, and their relations to wider scientific networks.
- A collaborative interface. Scientists should be able to do more than retrieve information: they should be able to interact and collaborate as creators of user-generated content in a virtual community. To make the data actually usable by a human researcher, information should be richer, easier to find and more thoroughly categorized than in the usual static “portfolios”.
- Machine-generated information. Within the Semantic Web framework, we proposed to explore how related data repositories operated by CDS partners could be queried seamlessly and meaningfully, with the specific objectives of Data Science.
The target audience of the platform was the CDS community. This goal later evolved towards a potential extension to Paris-Saclay and transferability to any analogous context. As such, the platform was not intended as an experiment, but was designed as a production-quality product.
We addressed these challenges through the creation of the io.datascience platform.
With io.datascience, datasets can be declared, qualified, documented and linked, with configurable access rights. Currently, some 35 public datasets and 17 datasets restricted to Paris-Saclay are available.
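
As a rough illustration of what declaring a dataset with configurable access rights might look like, the following minimal Python sketch models a catalog entry and a visibility check. It is not the io.datascience data model: the class `DatasetRecord`, the `Access` levels, the helper `visible_datasets` and all field names and identifiers are hypothetical, chosen only to mirror the declare, qualify, document and link steps described above.

```python
from dataclasses import dataclass, field
from enum import Enum


class Access(Enum):
    """Hypothetical access levels mirroring the public/restricted split."""
    PUBLIC = "public"
    PARIS_SACLAY = "paris-saclay"


@dataclass
class DatasetRecord:
    """Hypothetical dataset declaration; not the platform's actual schema."""
    identifier: str                                      # declaration
    title: str
    description: str = ""                                # documentation
    keywords: list[str] = field(default_factory=list)    # qualification
    linked_to: list[str] = field(default_factory=list)   # links to other datasets
    access: Access = Access.PUBLIC                       # configurable access right

    def visible_to(self, user_groups: set[str]) -> bool:
        """Public records are visible to everyone; restricted ones need membership."""
        return self.access is Access.PUBLIC or self.access.value in user_groups


def visible_datasets(catalog: list[DatasetRecord], user_groups: set[str]) -> list[DatasetRecord]:
    """Return the records a user with the given group memberships may see."""
    return [d for d in catalog if d.visible_to(user_groups)]


# Illustrative catalog with made-up entries.
catalog = [
    DatasetRecord("ds-001", "Example public dataset", keywords=["example"]),
    DatasetRecord("ds-002", "Example restricted dataset",
                  linked_to=["ds-001"], access=Access.PARIS_SACLAY),
]
```

In this sketch an anonymous user (`visible_datasets(catalog, set())`) would see only the public record, while a user in the hypothetical `"paris-saclay"` group would see both.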