Realistic microscopy image of breast cancer cells forming a cluster with irregular shapes and high mitotic activity. (AI generated image.)
Doctors and scientists study tissues in remarkable detail. They can measure not only which genes are active, but also where that activity is happening inside a given sample, creating a molecular map that shows how cells are arranged, how they behave, and how they can change because of development or disease.
But as these maps grow larger and more complex, it becomes harder to integrate them across experiments, technologies, and conditions.
A study from Yale School of Public Health (YSPH) researchers introduces a computational method that could help scientists integrate many such maps at once. The study, published in Nature Genetics, may help researchers better understand cancer, organ development, wound healing, and the three-dimensional structure of tissues.
The tool, called INSPIRE, tackles one of the central challenges in spatial transcriptomics, a rapidly growing area of biomedical research. Different labs, technologies, and patient samples produce datasets that don’t easily integrate. Even when they are combined, multiple biological signals often remain intertwined, making it difficult to disentangle distinct cellular programs and limiting interpretability.
INSPIRE was built to solve that problem. By combining deep learning with statistical methods, the tool identifies interpretable patterns of gene activity tied to tissue structure, cell states, and biological pathways — what the researchers call “spatial factors” — while using adversarial learning to remove technical variation that otherwise distorts cross-platform integration.
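For readers who want a concrete picture of what an interpretable "factor" can look like, here is a minimal sketch in Python. It is not INSPIRE's method: the article does not describe the tool's internals, so the sketch swaps in plain non-negative matrix factorization, a standard textbook technique, and leaves out the spatial, deep learning, and adversarial components entirely. The function names and the toy spot-by-gene matrix are hypothetical and chosen only for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def nmf(x, n_factors, n_iter=500, seed=0, eps=1e-9):
    """Approximate non-negative x (spots x genes) as w @ h.

    w (spots x factors) says how strongly each factor is present in each spot;
    h (factors x genes) says which genes each factor loads on.
    """
    rng = random.Random(seed)
    n_spots, n_genes = len(x), len(x[0])
    w = [[rng.random() + 0.1 for _ in range(n_factors)] for _ in range(n_spots)]
    h = [[rng.random() + 0.1 for _ in range(n_genes)] for _ in range(n_factors)]
    for _ in range(n_iter):
        # Multiplicative updates (Lee and Seung) keep every entry non-negative.
        wt = transpose(w)
        num, den = matmul(wt, x), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_genes)]
             for k in range(n_factors)]
        ht = transpose(h)
        num, den = matmul(x, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(n_factors)]
             for i in range(n_spots)]
    return w, h


def reconstruction_error(x, w, h):
    """Sum of squared differences between x and its approximation w @ h."""
    approx = matmul(w, h)
    return sum((a - b) ** 2 for ra, rb in zip(x, approx) for a, b in zip(ra, rb))


# Hypothetical toy data: 4 spots x 4 genes built from two made-up gene programs.
toy = [
    [4.0, 2.0, 0.0, 0.0],  # program A only
    [8.0, 4.0, 0.0, 0.0],  # program A, twice as strong
    [0.0, 0.0, 3.0, 1.0],  # program B only
    [2.0, 1.0, 3.0, 1.0],  # a mixture of A and B
]

w, h = nmf(toy, n_factors=2)
print("factor weights per spot:", [[round(v, 2) for v in row] for row in w])
print("gene loadings per factor:", [[round(v, 2) for v in row] for row in h])
print("reconstruction error:", reconstruction_error(toy, w, h))
```

Because every weight and loading is non-negative, each factor can be read as an additive gene program that is present to some degree in each spot. That additive reading is the general sense in which such factors are called interpretable.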
The result is a relatively lightweight tool for cross-tissue analysis, said Jia Zhao, PhD, a YSPH postdoctoral associate and first author of the study.
“Different samples have different contexts, and our method is flexible enough to capture all of those different scenarios, from mouse to human, from developmental stages to tumor environments,” she said.
The team tested INSPIRE across a wide range of biological problems. The most striking results came from human breast cancer. When it was employed to analyze a 280,000-cell tumor sample, INSPIRE identified rare cell populations concentrated near the most aggressive areas of the tumor, and it also found specific biological pathways associated with the cancer.
“There is important biological insight hidden within the high-dimensional gene expression data,” Zhao said. “With the higher-resolution results from INSPIRE, we can better characterize the distributions of the cell states, understand how these cell states activate different gene programs, and uncover how cells communicate with each other to coordinate biological pathways.”
That level of insight into tumor biology could help researchers better understand why some cancers progress more quickly than others. INSPIRE’s ability to integrate data across multiple patients and platforms may open the door to more systematic comparisons of tumor environments, which could help develop more targeted, personalized approaches to cancer care, the researchers said.
INSPIRE is not yet able to use image data from tissues, which could carry additional biological context beyond gene expression alone.
In the future, the team hopes to embed INSPIRE into emerging AI agent-based research systems, widening the tool’s reach across the scientific community.
“We believe the broad applicability of INSPIRE across many different biological settings gives it the potential to make a substantial impact,” Zhao added.
The study was led by researchers in the Department of Biostatistics at YSPH, with contributions from Yale’s Interdepartmental Program in Computational Biology and Bioinformatics, and the departments of Neuroscience and Cellular and Molecular Physiology at the Yale School of Medicine. Hongyu Zhao, PhD, Ira V. Hiscock Professor of Biostatistics, served as the study’s senior author.