The Carnot CALYM Institute’s data hub provides clinicians and researchers with secure and confidential access to bioclinical data. The architecture for this offering has been entrusted to TeamWork’s French teams.
Carnot CALYM Institute
The Carnot CALYM Institute is a consortium of 20 entities, including public research laboratories and cooperative groups. Its goal is to accelerate innovation in the research, characterization, and treatment of lymphoid hematopathies through public/private partnerships to improve patient care. This particularly concerns lymphomas, a major public health issue as the leading blood cancer, affecting nearly two million people worldwide.
Members of the Carnot CALYM Institute conduct significant R&D efforts, which in 2023 alone resulted in nearly 250 industrial contracts and 300 scientific publications in peer-reviewed international journals. The Carnot CALYM Institute operates with a dedicated team of about 20 people.
Within this structure, Guillaume Codet de Boisse, Director of the Technological Innovations Department, leads his team, comprising a data engineer and two project managers. “Our main mission is to support consortium members in their projects by offering technological solutions that can accelerate their workflows,” he summarizes.
Enabling Better Data Utilization
“Collecting data as part of clinical research projects is a well-established activity for the consortium. However, reusing this data for secondary purposes in other projects remained complex,” continues Guillaume Codet de Boisse. “We then had the idea of creating a data hub that would provide researchers—whether academic or industrial—with workspaces to enhance the value of health data.”
In 2021, a first version of the Lymphoma Data Hub (LDH) was launched. This offering was based on Microsoft Azure cloud technologies. “The LDH allows researchers to securely and confidentially centralize the data of interest they have identified. The LDH then provides them with the tools needed to exploit the data: bioinformatics, AI, image processing…” However, this first version remained a “handcrafted” platform, built and administered largely by hand according to each project’s needs.
In 2022, CALYM’s Technological Innovations Department began work on LDH v2, an industrial-grade solution with greater automation and improved data governance. “We chose to continue working with Microsoft, which is engaged in a certification process with health authorities.” A managed services provider holding HDS certification (Hébergeur de Données de Santé, the French certification for health data hosting) will handle the daily administration of LDH v2, while its architecture has been entrusted to TeamWork.
A Powerful Offering…
TeamWork relied on a specification built around four typical scientific projects, aiming to propose a generic platform capable of covering these use cases while offering the necessary flexibility to adapt to other needs.
Work began in September 2023. The solution was implemented for one of the identified projects in the first quarter of 2024, and the new platform was then scaled up rapidly. “By early summer, all our projects had been migrated to LDH v2, with the last components of v1 now fully decommissioned.”
The platform uses standard Microsoft Azure cloud solutions: SFTP for data transfers, data repositories for storage, virtual machines for processing, and more advanced tools like Azure Synapse for data flow management and Azure Machine Learning for artificial intelligence needs.
…but Also Flexible
“The strength of our platform is its flexibility. The backbone of our data hub remains common to all projects, but the methods of structuring, enriching, and exploiting data are defined project by project.” CALYM’s technical teams rely on templates defined by TeamWork, covering a wide range of needs. When a use case falls outside this framework, TeamWork steps in again to propose an appropriate solution.
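The idea of a common backbone extended by per-project templates can be illustrated with a minimal Python sketch. This is not TeamWork’s or CALYM’s actual implementation: the names `BACKBONE`, `ProjectTemplate`, `Workspace`, and `build_workspace`, as well as the component labels, are hypothetical and simply borrow the service categories mentioned in this article.

```python
from dataclasses import dataclass, field

# Hypothetical labels for the components shared by every project (assumption).
BACKBONE = ("sftp_ingest", "data_repository", "virtual_machines")


@dataclass(frozen=True)
class ProjectTemplate:
    """A reusable template: a name plus the optional components it adds."""
    name: str
    optional_components: tuple = ()


@dataclass
class Workspace:
    """The set of components assembled for one project."""
    project: str
    components: list = field(default_factory=list)


def build_workspace(project: str, template: ProjectTemplate) -> Workspace:
    # Start from the shared backbone, then add the template's extras
    # in order, skipping anything already present.
    components = list(BACKBONE)
    for component in template.optional_components:
        if component not in components:
            components.append(component)
    return Workspace(project=project, components=components)


# Hypothetical template for a project that needs data flows and AI tooling.
AI_TEMPLATE = ProjectTemplate("ai_project", ("synapse_pipelines", "machine_learning"))
workspace = build_workspace("study_a", AI_TEMPLATE)
print(workspace.components)
```

In this sketch, every workspace starts from the same backbone, and only the template decides what is added on top, which mirrors the split between shared infrastructure and project-specific choices described above.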
Assembling the components in this way yields a trustworthy, compliant, and efficient architecture in which no data enters or leaves without control. CALYM drew inspiration from the Trusted Research Environments popularized by the UK’s NHS.
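The principle that no data moves in or out without control can be sketched as a simple approval gate with an audit trail, a pattern often called an “airlock” in Trusted Research Environments. The Python below is an illustrative assumption, not the LDH’s actual mechanism: `Airlock`, `TransferRequest`, and the approval rule are hypothetical names and logic.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Direction(Enum):
    INGRESS = "ingress"
    EGRESS = "egress"


@dataclass
class TransferRequest:
    """A request to move one file into or out of a project workspace."""
    project: str
    filename: str
    direction: Direction
    approved_by: Optional[str] = None  # hypothetical reviewer identifier


class Airlock:
    """Allows a transfer only if it was approved, and logs every attempt."""

    def __init__(self) -> None:
        self.audit_log = []

    def release(self, request: TransferRequest) -> bool:
        allowed = request.approved_by is not None
        # Record the decision whether or not the transfer is allowed.
        self.audit_log.append(
            (request.project, request.filename, request.direction.value, allowed)
        )
        return allowed


airlock = Airlock()
blocked = airlock.release(TransferRequest("study_a", "export.csv", Direction.EGRESS))
passed = airlock.release(
    TransferRequest("study_a", "export.csv", Direction.EGRESS, approved_by="data_steward")
)
print(blocked, passed, len(airlock.audit_log))
```

The design choice illustrated here is that refused transfers are logged as well as accepted ones, so the audit trail reflects every attempt to cross the boundary.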
The teams are already working on further improvements to the platform. “We want to offer a solution closer to a SaaS service, with a portal that would make it easier to manage projects without worrying about the Microsoft Azure components used. We’re also working on new data enrichment solutions, such as using NLP (Natural Language Processing) coupled with the TA4H (Text Analytics for Health) solution for text analysis,” concludes Guillaume Codet de Boisse.