EPFL scientists, in collaboration with MIT and Yale, have introduced a groundbreaking method called Secure Federated Genome-Wide Association Studies (SF-GWAS). This new method combines secure computation and distributed algorithms, enabling efficient and accurate genetic research while keeping data confidential.
Sharing data for genome-wide association studies (GWAS) is essential for finding genetic links to health and disease. However, current rules limit data-sharing across institutions.
Cryptographic tools can enable secure and private collaborative analysis, but existing approaches are either too computationally demanding to scale or do not support current GWAS practice. SF-GWAS addresses both problems, making data collaborations in medical research more practical.
SF-GWAS has been tested at scale on private data held by multiple entities and is now being deployed across Europe. In an evaluation on five datasets, including a UK Biobank cohort of 410,000 individuals, it showed a significant improvement in runtime compared to previous methods.
Key Features of SF-GWAS
Federated Approach: Each dataset stays at its original site, reducing costs by avoiding large data transfers and using cryptographic operations to protect results.
Efficient Algorithms: SF-GWAS introduces efficient algorithms that support federated execution of different GWAS pipelines.
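To make the federated idea concrete, here is a toy sketch of how several sites could jointly estimate the effect of a single genetic variant without any site revealing its own numbers. It is not the SF-GWAS protocol: it assumes a plain linear regression of phenotype on genotype and uses simple additive secret sharing to hide each site's local sums. The function names, the modulus `PRIME`, and the fixed-point factor `SCALE` are illustrative choices, not taken from the paper.

```python
import secrets

PRIME = 2**61 - 1  # modulus for additive secret sharing (illustrative choice)
SCALE = 1000       # fixed-point factor so real-valued phenotypes become integers


def local_sums(genotypes, phenotypes):
    """Per-site sufficient statistics for the regression y ~ a + b*g."""
    y_int = [round(y * SCALE) for y in phenotypes]
    return [
        len(genotypes),
        sum(genotypes),
        sum(y_int),
        sum(g * g for g in genotypes),
        sum(g * y for g, y in zip(genotypes, y_int)),
    ]


def share(value, n_parties):
    """Split an integer into n additive shares modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares


def secure_sum(per_site_values):
    """Sum one statistic across sites; only the total is reconstructed."""
    n = len(per_site_values)
    all_shares = [share(v, n) for v in per_site_values]
    # Party j adds up the j-th share it received from every site.
    partials = [sum(s[j] for s in all_shares) % PRIME for j in range(n)]
    total = sum(partials) % PRIME
    # Map the field element back to a signed integer.
    return total - PRIME if total > PRIME // 2 else total


def federated_slope(sites):
    """Estimate the regression slope from securely aggregated sums."""
    stats = [local_sums(g, y) for g, y in sites]
    n, sg, sy, sgg, sgy = (secure_sum([s[k] for s in stats]) for k in range(5))
    return (n * sgy - sg * sy) / (n * sgg - sg * sg) / SCALE


# Two hypothetical sites; genotypes are allele counts (0, 1 or 2).
sites = [
    ([0, 1, 2], [1.0, 2.0, 3.0]),
    ([2, 0, 1, 1], [3.0, 1.0, 2.0, 2.0]),
]
print(federated_slope(sites))
```

Each site's raw genotypes and phenotypes never leave the site. Only random-looking shares are exchanged, and only the pooled totals are reconstructed. A real deployment would also need to handle many variants, covariates, population structure and quality control, which this sketch leaves out.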
Jean-Pierre Hubaux, Academic Director at EPFL’s Center for Digital Trust (C4DT), explained, “In many cases, it’s impossible to centralize data for practical or legal reasons or just because people aren’t willing to share it. So, the goal is to extract information without sharing the data.”
Hubaux added, “We developed a prototype several years ago, but what was missing was the demonstration that it works at scale with real-world size datasets. This has now been done in collaboration with MIT and Yale, showing that extracting information from datasets that remain geographically distributed without significant precision loss is possible.”
SF-GWAS has been installed in five Swiss university hospitals and is being rolled out in several Italian hospitals and European cancer networks by Tune Insight, the EPFL spin-off leading this work. The company is also in talks with medical institutions in other countries.
Hubaux believes SF-GWAS will help optimize public healthcare policy by unlocking large-scale medical research, which is currently hindered by data silos. He described the current system as “prehistoric,” with datasets scattered worldwide on hard disks and tapes, which makes data transfer complicated and leaves the data underutilized.
“We are setting up a value system to ensure that future data is going to be interoperable, recorded consistently place to place,” Hubaux said. “It’s costly, and the transition will take time, but we have developed the tools to facilitate it, and an evolution is underway.”
Hubaux emphasized that the willingness to work at scale is a cultural shift, one that encourages rigorous data storage and structure to ensure interoperability. This, he said, will improve the overall quality of health and medical data.
Journal Reference
Cho, H., Froelicher, D., Chen, J., Edupalli, M., Pyrgelis, A., Troncoso-Pastoriza, J. R., Hubaux, J.-P., & Berger, B. (2025). Secure and federated genome-wide association studies for biobank-scale datasets. Nature Genetics, 1-6. DOI: 10.1038/s41588-025-02109-1