Resources for Innovation

Datasets and software

Datasets

Data Terra

Data Terra provides open, interoperable Earth system datasets, covering atmosphere, ocean, land, and solid Earth, to advance environmental and climate research.

datagouv

Data.gouv.fr is France’s open data platform, offering free access to thousands of public datasets for transparency, innovation, and civic reuse.

Insee

Insee’s data catalog provides official, high-quality statistical datasets on France’s economy, population, and society for research, policy, and decision-making.

European data

Data.europa.eu is the EU’s official open data portal, offering free access to datasets from European institutions, countries, and research projects for transparency, innovation, and reuse.

AIoD

The AIoD platform is a collaborative, community-driven digital space that supports European research and innovation in AI.

Software

Mapie

Mapie is a Python library that enables the quantification of uncertainties associated with machine learning model predictions. It implements conformal prediction methods to compute prediction intervals or prediction sets with statistical guarantees, covering regression, classification, and time series tasks.

Skrub

Designed to process complex data stored in one or more dataframes, skrub helps perform the necessary data cleaning operations to produce a dataset ready for use in a machine learning model. skrub provides essential building blocks for machine learning pipelines, while ensuring that transformations can be reliably applied to new data.

AEON

AEON is a Python library dedicated to time series analysis, offering a modular architecture for a wide variety of transformation and learning tasks. It incorporates state-of-the-art algorithms, including deep learning techniques tailored for time series, and provides tools for comparing methods and reproducing benchmarks.

tslearn

tslearn is a Python library dedicated to machine learning on univariate or multivariate time series, offering standard algorithms adapted to variable-length series. It is compatible with the scikit-learn API and optimized for fast manipulation.

Corese

Corese is a software platform that implements and extends Semantic Web standards. It enables users to create, manipulate, parse, serialize, query, reason about, and validate RDF data, thereby providing powerful tools for managing and leveraging knowledge graphs in complex environments.

Shapash

Shapash is a Python library dedicated to the interpretability of machine learning models. It provides tools for explaining and visualizing model predictions, both at the global and local levels, to help users better understand, trust, and communicate about model behavior.

Scikit-learn

Scikit-learn is a leading Python library for machine learning. It offers simple and effective tools for predictive data analysis, is accessible to everyone, and can be reused in a variety of application contexts.

Supported by