Accelerating Breast Cancer Research: Komen Tissue Bank Partners with Manifold to Advance Phenotypic Insights

Understanding what distinguishes healthy breast tissue from malignant tissue remains one of the most fundamental questions in cancer biology. Yet for decades, researchers have faced a critical constraint: comprehensive, well-characterized normal breast tissue has been exceptionally difficult to access at scale. This gap has limited our ability to establish biological baselines, identify early molecular changes, and understand the tissue microenvironment that either protects against or promotes carcinogenesis.
The Komen Tissue Bank at Indiana University Melvin and Bren Simon Comprehensive Cancer Center (Komen Tissue Bank) addresses this challenge head-on. As the world's only repository of normal breast tissue from healthy volunteers, it has already enabled groundbreaking discoveries in breast cancer research. Now, through a partnership with Manifold, this invaluable resource will become dramatically more accessible and actionable as a breast cancer research commons for the global research community.
The Scientific Foundation: What Makes Normal Tissue Essential
The Komen Tissue Bank's collection represents a unique scientific asset precisely because it captures what is typically absent from research: the biological baseline. While tumor tissue banks are relatively common, normal breast tissue from healthy donors is rare, despite being essential for understanding disease etiology and progression.
Published research using the Komen Tissue Bank has already demonstrated the power of this resource across multiple domains: characterizing normal methylation patterns and gene expression profiles critical for identifying early molecular aberrations; mapping the breast tissue microenvironment including immune cell composition and stromal architecture to understand cancer susceptibility; validating biomarkers and risk models by demonstrating specificity against normal tissue; and advancing precision medicine through studies of population-specific variation in breast biology that help explain disparities in cancer incidence and outcomes.
The Multimodal Advantage: Integrating Tissue, Imaging, and Clinical Data
What distinguishes the Komen Tissue Bank from other repositories is not just the tissue itself, but the richness of the multimodal data associated with each sample. Researchers gain access to:
- High-quality tissue specimens: serial samples from ~1000 donors; men (~60); mutation carriers; pregnant/breastfeeding women. Specimens are suitable for genomics, transcriptomics, proteomics, and metabolomics
- Histopathological imaging including H&E-stained slides for tissue architecture analysis
- Comprehensive clinical and demographic data including detailed reproductive history, lifestyle factors, and family history
- Longitudinal follow-up for a subset of participants, enabling temporal analyses
This integration of tissue, imaging, and phenotypic data creates research opportunities that single-modality resources cannot support. A researcher studying the relationship between breast density and stromal composition, for example, can correlate imaging-derived measurements with molecular profiles from the same individual—a capability that has historically required assembling data from multiple disconnected sources.
The multimodal nature of the data also enables machine learning and AI-driven approaches that require paired datasets for training and validation. Algorithms designed to predict tissue composition from imaging, identify subtle architectural patterns associated with risk, or integrate multi-omic profiles for risk stratification all depend on the type of integrated, well-annotated data that the Komen Tissue Bank provides.
From Data Asset to Research Commons: The Manifold Transformation
Despite the scientific value of the Komen Tissue Bank, accessing and using this resource has historically involved significant friction. Scientists formulate hypotheses in the semantic space of biology, asking questions about tissue phenotypes, molecular pathways, and population characteristics, but must then translate those questions into the technical language of databases and data structures. Bridging this translation gap has required either deep technical expertise or heavy reliance on data coordinators who serve as intermediaries between scientific intent and data reality.
Manifold's platform bridges this gap by enabling researchers to work in their native scientific language while the platform handles the complexity of data navigation. Rather than learning database schemas or relying on email-driven workflows, researchers gain immediate, interactive access to explore the full scope of available data before committing resources to a study. Together, the Komen Tissue Bank and Manifold form a research commons for breast cancer — a shared, disease-focused environment where multimodal biological data can be explored, cohorts can be defined, and insights can be generated by a broad research community without bespoke data work.
Intelligent Cohort Discovery for Hypothesis Generation
The Cohort Explorer provides real-time visualization of how different inclusion and exclusion criteria affect available sample sizes, data completeness, and cohort characteristics. Rather than submitting multiple rounds of queries via email and waiting for responses, researchers can:
- Dynamically filter across clinical variables, demographic characteristics, and available data types
- Visualize the distribution of key variables within potential cohorts
- Identify data completeness patterns that might affect analytical approaches
- Compare alternative cohort definitions to optimize statistical power
For a researcher designing a study of age-related changes in breast tissue composition, this means being able to instantly see how many samples are available across different age strata, what molecular data types exist for each stratum, and whether sufficient diversity exists for meaningful subgroup analyses.
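As a rough illustration of the kind of feasibility check described above, the following Python sketch counts donors per age stratum, optionally restricted to those with a required set of data types. It is not Manifold's implementation or API: the record layout, the field names (`age`, `data_types`), the data-type labels, and the helper functions are all hypothetical, and the donor records are toy values invented for the example.

```python
from collections import Counter

# Toy records invented for illustration only; not real Komen Tissue Bank data.
# The field names and data-type labels are assumptions, not the actual schema.
donors = [
    {"id": "D1", "age": 28, "data_types": {"rna_seq", "he_image"}},
    {"id": "D2", "age": 34, "data_types": {"he_image"}},
    {"id": "D3", "age": 47, "data_types": {"rna_seq", "he_image", "metabolomics"}},
    {"id": "D4", "age": 52, "data_types": {"rna_seq"}},
    {"id": "D5", "age": 66, "data_types": {"he_image", "metabolomics"}},
]


def age_stratum(age, width=20):
    """Return a label such as '20-39' for the stratum containing `age`."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def counts_by_stratum(records, required=frozenset()):
    """Count donors per age stratum, keeping only donors that have
    every data type listed in `required`."""
    counts = Counter()
    for record in records:
        if set(required) <= record["data_types"]:
            counts[age_stratum(record["age"])] += 1
    return dict(counts)


# All donors, then only donors with RNA-seq data available.
all_counts = counts_by_stratum(donors)
rna_counts = counts_by_stratum(donors, {"rna_seq"})
print(all_counts)
print(rna_counts)
```

Comparing the two results shows how adding a single inclusion criterion can empty a stratum entirely, which is the sort of effect an interactive explorer is meant to surface before a study is designed.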
Unlike general-purpose data platforms that focus on data ingestion, transformation, and analytics, Manifold is purpose-built for biological cohort discovery. The platform assumes that complex, multimodal life sciences data already exists, and instead focuses on enabling scientific and translational teams to rapidly identify, refine, and validate cohorts — the unit of value across research, translational, and commercial life sciences use cases. This distinction is critical in domains like oncology, where the ability to quickly test cohort feasibility often determines whether a study, collaboration, or commercial initiative moves forward at all.
Conversational AI for Complex Data Navigation
Layered on top of the visual exploration tools is Manifold's agentic AI capability, which allows researchers to ask natural language questions about data availability and structure. This capability directly addresses the translation gap: scientists can now pose questions exactly as they think about them scientifically, without needing to understand the underlying data model or technical schema.
This AI layer is what allows rich biological data assets like the Komen Tissue Bank to function as a true research commons at scale. By translating scientific intent directly into cohort-ready outputs, AI bridges the gap between raw data and usable populations, making it possible to reuse the same data asset across many studies and downstream translational or commercial programs without repeated manual effort.
The AI interprets queries in the semantic space of breast cancer biology and automatically navigates the technical complexity of the data structure. Instead of manually parsing data dictionaries or relying on a data coordinator, researchers can ask:
- "Which participants have both RNA-seq data and H&E imaging, and donated tissue both pre- and post-menopause?"
- "Are there samples from nulliparous women under 35 with high breast density measurements?"
- "Show me the overlap between participants with family history of breast cancer and those with available metabolomic profiles"
The AI interprets these queries, navigates the underlying data structure, and returns precise answers with visualizations, compressing what might have been weeks of iterative communication into minutes. The platform acts as a bilingual translator, fluent in both the language of science and the language of data.
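To make the idea of translation concrete, here is a minimal Python sketch of the structured operation that the third example question (overlap between family history and metabolomic profiles) ultimately corresponds to: a set intersection over participant identifiers. This only shows the target of the translation, not how Manifold's AI performs it. The field names (`family_history`, `data_types`), the `overlap` helper, and the participant records are hypothetical and invented for illustration.

```python
# Toy records invented for illustration only; not real Komen Tissue Bank data.
# Field names are assumptions, not the actual data model.
participants = [
    {"id": "P1", "family_history": True, "data_types": {"metabolomics", "rna_seq"}},
    {"id": "P2", "family_history": True, "data_types": {"he_image"}},
    {"id": "P3", "family_history": False, "data_types": {"metabolomics"}},
    {"id": "P4", "family_history": True, "data_types": {"metabolomics"}},
]


def overlap(records):
    """Split participants into three groups: family history only,
    metabolomic profile only, and both."""
    with_history = {r["id"] for r in records if r["family_history"]}
    with_metabolomics = {
        r["id"] for r in records if "metabolomics" in r["data_types"]
    }
    return {
        "family_history_only": sorted(with_history - with_metabolomics),
        "metabolomics_only": sorted(with_metabolomics - with_history),
        "both": sorted(with_history & with_metabolomics),
    }


result = overlap(participants)
print(result)
```

The three groups are the regions of a two-set Venn diagram, which is one natural way such an answer could be visualized.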
For scientific leaders, the practical impact is clear: research teams can move from initial hypothesis to validated, fundable study design in a fraction of the traditional time. Projects that might have required months of preliminary feasibility work can now be evaluated in days, enabling faster grant preparation, more ambitious study designs, reduced risk in study planning, and increased innovation as researchers explore novel hypotheses they might otherwise have dismissed as too complex to assess.
Strategic Advantages for Research Leaders
For research leaders, the Komen Tissue Bank on Manifold offers: access to irreplaceable biological baseline data that enables research questions impossible to address with tumor-only datasets; reduced barriers to entry for investigators at institutions without established tissue collection infrastructure; and infrastructure that scales from targeted single-variable studies to complex multi-omic integrative analyses. This partnership provides a model for how disease-focused biorepositories can recognize that data value derives not just from collection but from accessibility, usability, and the infrastructure that connects researchers to data.
While this partnership focuses on breast cancer research, the underlying capability is broadly applicable across life sciences. The same cohort discovery workflow supports common pharma and biotech use cases such as biomarker validation, translational research planning, and data commercialization, where well-defined, reusable cohorts are the primary unit of value. In each case, speed and precision in cohort feasibility directly impact timelines, cost, and commercial outcomes.
"The Komen Tissue Bank was founded to make high-quality breast tissue data accessible to the global research community, and that mission depends on reducing friction between researchers and the data itself. Working with Manifold allows us to modernize how scientists explore, understand, and use this resource. By pairing our unique dataset with Manifold’s platform and emerging AI capabilities, we are building a research commons that makes collaboration easier and accelerates the pace of discovery for everyone."
Michele L. Coté, PhD, MPH, Director, Komen Tissue Bank at Indiana University Simon Comprehensive Cancer Center
From Infrastructure to Impact
The scientific value of the Komen Tissue Bank has been proven through published research advancing our understanding of breast biology, cancer risk, and early detection. The partnership with Manifold ensures that this value can be realized more broadly, more quickly, and by more researchers than ever before.
For translational and data strategy leaders, this means faster feasibility assessment, reduced risk in cohort-based programs, and the ability to reuse the same biological assets across multiple research and commercial initiatives — without rebuilding pipelines or reinterpreting data each time.
This is how data infrastructure should serve science: not as a technical backend, but as an enabler of discovery that gets out of the way and lets researchers focus on the questions that matter.
We invite the breast cancer research community to explore what's now possible with this essential resource.
To follow updates on accessing the Komen Tissue Bank through Manifold and to discuss how this resource could support your research, please visit https://cancer.iu.edu/ktb/index.html or contact Jill Henry (jihenry@iu.edu).