
HISTEX

Introduction

The groundbreaking development of spatial transcriptomics (ST) enables researchers to map gene expression across tissues with spatial precision. However, current next-generation sequencing methods, which theoretically cover the entire transcriptome, face limitations in resolving spatial gene expression at high resolution. The recently introduced Visium HD technology offers a balance between sequencing depth and spatial resolution, but its complex sample preparation and high cost limit its widespread adoption. To address these challenges, we introduce HISTEX, a multimodal fusion approach that leverages a bidirectional cross-attention mechanism and a general-purpose foundation model. HISTEX integrates spot-based ST data with histology images to predict super-resolution (SR) spatial gene expression. Experimental evaluations demonstrate that HISTEX outperforms state-of-the-art methods in accurately predicting SR gene expression across diverse datasets from multiple platforms. Moreover, experimental validation underscores HISTEX’s potential to generate new biological insights. It enhances spatial patterns, enriches biologically significant pathways, and facilitates the SR annotation of tissue structures. These findings highlight HISTEX as a powerful tool for advancing ST research.

Overview of the HISTEX framework (Overview.png).
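At the core of HISTEX, histology features and spot-based ST features exchange information through bidirectional cross-attention before being fused. The minimal PyTorch sketch below illustrates the general idea only; the module, dimensions, and pooling are hypothetical and do not reproduce the HISTEX implementation.

import torch
import torch.nn as nn

class BidirectionalCrossAttention(nn.Module):
    """Illustrative bidirectional cross-attention (not the HISTEX source code)."""
    def __init__(self, dim=256, num_heads=8):
        super().__init__()
        # Histology tokens attend to ST tokens, and vice versa.
        self.hist_to_st = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.st_to_hist = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.fuse = nn.Linear(2 * dim, dim)

    def forward(self, hist_tokens, st_tokens):
        # hist_tokens: (B, N_patches, dim) image features from the foundation model
        # st_tokens:   (B, N_spots, dim) embedded spot-level gene expression
        hist_attn, _ = self.hist_to_st(hist_tokens, st_tokens, st_tokens)
        st_attn, _ = self.st_to_hist(st_tokens, hist_tokens, hist_tokens)
        # Pool each attended stream and fuse into a joint representation.
        fused = torch.cat([hist_attn.mean(dim=1), st_attn.mean(dim=1)], dim=-1)
        return self.fuse(fused)

module = BidirectionalCrossAttention()
out = module(torch.randn(2, 196, 256), torch.randn(2, 50, 256))
print(out.shape)  # torch.Size([2, 256])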

Requirements

All experiments were conducted on an NVIDIA RTX 3090 GPU. Before running HISTEX, you need to create a conda environment and install the required packages:

conda create -n HISTEX python==3.11.5
conda activate HISTEX
pip install -r requirements.txt
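To confirm that the environment can see the GPU before training (assuming PyTorch is installed via requirements.txt), a quick check:

import torch
print(torch.__version__)          # PyTorch version installed from requirements.txt (assumed)
print(torch.cuda.is_available())  # True if a CUDA GPU such as the RTX 3090 is visible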

Data

The Xenium human breast cancer datasets: https://www.10xgenomics.com/products/xenium-in-situ/preview-dataset-human-breast.

The Visium HD human breast cancer dataset: https://www.10xgenomics.com/datasets/visium-hd-cytassist-gene-expression-human-breast-cancer-fresh-frozen.

The Visium HD mouse brain dataset: https://www.10xgenomics.com/datasets/visium-hd-cytassist-gene-expression-mouse-brain-fresh-frozen.

The HER2-positive breast cancer datasets: https://github.com/almaan/her2st.
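For a quick sanity check of a downloaded spot-based sample, the data can be inspected with scanpy (used here only for illustration; HISTEX's own scripts handle loading internally). For example, a Visium sample directory can be read as an AnnData object:

import scanpy as sc

# Hypothetical path to an unpacked Visium sample containing the filtered
# feature-barcode matrix and the spatial/ subfolder.
adata = sc.read_visium("path/to/visium_sample")
adata.var_names_make_unique()
print(adata)                      # spots x genes with spatial metadata
print(adata.obsm["spatial"][:5])  # pixel coordinates of the first five spots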

Pre-trained general-purpose foundation model

Given the outstanding performance of large pre-trained general-purpose foundation models on clinical tasks, we use UNI as the backbone feature extractor. Before using HISTEX, you need to request access to the UNI model weights: https://huggingface.co/mahmoodlab/UNI.
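Once access is granted, the weights can be loaded through timm with a Hugging Face token. The sketch below follows the publicly documented UNI loading recipe; verify the exact arguments against the model card, and note that the token string is a placeholder.

import timm
import torch
from huggingface_hub import login

login(token="hf_...")  # your Hugging Face token with UNI access (placeholder)

# Load UNI (ViT-L/16) as a feature extractor; arguments follow the UNI model card.
model = timm.create_model(
    "hf-hub:MahmoodLab/uni",
    pretrained=True,
    init_values=1e-5,
    dynamic_img_size=True,
)
model.eval()

with torch.inference_mode():
    features = model(torch.randn(1, 3, 224, 224))
print(features.shape)  # expected (1, 1024) embedding for ViT-L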

Training and Inference

  • First, the histology image and spot location information are normalized by running image_calibration.py.
  • Second, high-density spot-based ST data and histological features are obtained by running get_HR.py and Histology_Extractor.py.
  • Third, a mask file is generated to filter out non-tissue areas by running masking_non_tissue.py.
  • Finally, super-resolution gene expression profiles are generated by running HISTEX.py.
python image_calibration.py --directory dataset/
python get_HR.py --directory dataset/
python Histology_Extractor.py --directory dataset/ --login ***
python masking_non_tissue.py --directory dataset/
python HISTEX.py --directory dataset/ --epochs 500 --n-states 5

--directory specifies the path to your dataset directory, and --login is your access token for the UNI model weights.
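The five steps can also be chained from a small driver script. This is only a convenience sketch around the commands above; the dataset path and token are placeholders.

import subprocess

DATASET = "dataset/"   # --directory: path to your dataset
UNI_TOKEN = "***"      # --login: your UNI access token (placeholder)

steps = [
    ["python", "image_calibration.py", "--directory", DATASET],
    ["python", "get_HR.py", "--directory", DATASET],
    ["python", "Histology_Extractor.py", "--directory", DATASET, "--login", UNI_TOKEN],
    ["python", "masking_non_tissue.py", "--directory", DATASET],
    ["python", "HISTEX.py", "--directory", DATASET, "--epochs", "500", "--n-states", "5"],
]

for cmd in steps:
    print("Running:", " ".join(cmd))
    subprocess.run(cmd, check=True)  # stop immediately if a stage fails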

Contact details

If you have any questions, please contact [email protected]. (The paper is currently under double-blind review at MICCAI 2025.)
