An organization structured into 8 work packages
- WP1 – Data collection and governance: Structuring, accessing and securing health data.
- WP2 – Data preprocessing and harmonization: Standardizing and preparing datasets for model training.
- WP3 – Foundation model development: Designing and training large-scale models for health data.
- WP4 – Use cases in oncology: Identification of tumor biomarkers and clinical applications.
- WP5 – Use cases in infectious diseases: Automatic detection and support for combating antibiotic resistance.
- WP6 – Evaluation and benchmarking: Assessing model performance, robustness and reproducibility.
- WP7 – Deployment and infrastructure: Integration into secure environments and operationalization.
- WP8 – Dissemination and collaboration: Promoting open science, sharing resources and engaging stakeholders.
The first work package is dedicated to the overall coordination of the PARTAGES project, as well as the dissemination and promotion of its results. It ensures sound project governance, consistency among the various work packages, and adherence to the timeline, scientific objectives, and regulatory requirements.
The objective of the second work package is to establish a rigorous methodology for the production and use of project data (primarily fictitious medical records) and to ensure the quality control of all raw datasets used to train the project's models. It also supports data quality control at the healthcare facility level for model evaluation.
Work Package 3 is responsible for developing and implementing a common evaluation methodology for foundation models and all use cases, as well as for analyzing the evaluation results.
The objective of the fourth work package is to develop all the foundation models that will be used by the use cases. This includes fine-tuning the general-purpose generative LLM for the French-language medical domain and developing BERT-style encoder models (BERT stands for Bidirectional Encoder Representations from Transformers).
Work Package 5 focuses on establishing the necessary technical infrastructure, including the creation, adaptation, and documentation of the federated validation platform, in which each partner healthcare facility serves as a node.
The goal of the sixth work package is to manage the recruitment and supervision of healthcare professionals (senior residents and junior physicians) for the creation of a corpus of 5,000 fictitious patient records and for annotation tasks.
Work Package 7 oversees the project's legal matters, including the implementation of the contractual framework and the monitoring of work related to the local use of healthcare facility reports.
The eighth and final work package covers the development of models for the specific use cases identified above.