Add files via upload
huawenbo authored Mar 11, 2024
1 parent e0a2a19 commit 12c2702
Showing 1 changed file with 87 additions and 18 deletions.
105 changes: 87 additions & 18 deletions README.md
@@ -1,38 +1,107 @@
# GGOT: An Optimal Transport method for disease critical point detection
# GGOT: Detecting Disease Critical Transitions Using Gaussian Graphical Optimal Transport Embedded With Protein Interaction Networks.

This repository contains our proposed method, *Gaussian Graphical Optimal Transport* (GGOT), for detecting critical transitions and identifying trigger molecules of diseases, the corresponding pre-processed disease datasets from the TCGA and GEO databases, and pre-processed Protein-Protein Interaction (PPI) networks of the human and mouse genomes.

Code for the paper:

# Overview
> Wenbo Hua, Ruixia Cui, Heran Yang, Jingyao Zhang, Chang Liu, Jian Sun. "Detecting Disease Critical Transitions Using Gaussian Graphical Optimal Transport Embedded With Protein Interaction Networks"
<img src="assets/Overview.png" alt="Overview" style="zoom:30%;">
<!-- [[arxiv]](https://arxiv.org/abs/1907.03907) -->

# Dataset
## Overview

| Dataset | Description | Species | stages | Type of disease |
| ------------------------------------------------------------ | ------------ | ------------ | ------ | ------------------------------------- |
| [GSE48452](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE48452) | | Homo sapiens | 3 | Chronic progressive benign disease |
| [LUAD](https://portal.gdc.cancer.gov/projects/TCGA-LUAD) | | Homo sapiens | 7 | Chronic progressive malignant disease |
| [COAD](https://portal.gdc.cancer.gov/projects/TCGA-COAD) | | Homo sapiens | 7 | Chronic progressive malignant disease |
| [GSE2565](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE2565) | lung injury | Mus musculus | 9 | Acute progressive noncritical illness |
| [GSE154918](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE154918) | Septic shock | Homo sapiens | 3 | Acute progressive critical illness |
| XJTUSepsis | XJTUSepsis | Homo sapiens | 7 | Acute progressive critical illness |
GGOT uses a Gaussian graphical model that incorporates the gene interaction network to model the data distribution at each disease stage. We then use population-level optimal transport to compute the Wasserstein distance and the transport maps between stages, which enables us to detect critical transitions. By analyzing the per-molecule transport map, we quantify the importance of each molecule and identify the trigger molecules. Moreover, GGOT predicts the occurrence of critical transitions in unseen samples and visualizes the disease progression process.
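
To make the idea of comparing stage distributions concrete, here is a minimal sketch, not the GGOT implementation. It assumes every gene is an independent one-dimensional Gaussian, so it ignores the PPI network and the graphical model entirely. Under that assumption the squared 2-Wasserstein distance has the closed form `(m_a - m_b)^2 + (s_a - s_b)^2` per gene. The function names, the toy data, and the "largest jump between adjacent stages" heuristic are all hypothetical choices made for illustration.

```python
import math
from statistics import mean, pstdev


def per_gene_w2_squared(stage_a, stage_b):
    """Squared 2-Wasserstein distance per gene between two stages.

    Each stage maps gene name -> list of expression values.
    ASSUMPTION: genes are independent 1-D Gaussians (not GGOT's model).
    """
    contributions = {}
    for gene in stage_a:
        m_a, s_a = mean(stage_a[gene]), pstdev(stage_a[gene])
        m_b, s_b = mean(stage_b[gene]), pstdev(stage_b[gene])
        contributions[gene] = (m_a - m_b) ** 2 + (s_a - s_b) ** 2
    return contributions


def stage_distance(stage_a, stage_b):
    """2-Wasserstein distance between two stages under the assumption above."""
    return math.sqrt(sum(per_gene_w2_squared(stage_a, stage_b).values()))


def largest_jump(stages):
    """Return (index, distances); index i marks the transition stage i -> i+1.

    ASSUMPTION: the largest adjacent-stage distance is used as a simple
    stand-in for a critical-transition criterion.
    """
    distances = [stage_distance(a, b) for a, b in zip(stages, stages[1:])]
    index = max(range(len(distances)), key=distances.__getitem__)
    return index, distances


# Toy data (made up): three stages, two genes, three samples per stage.
toy_stages = [
    {"g1": [1.0, 1.2, 0.8], "g2": [2.0, 2.1, 1.9]},
    {"g1": [1.1, 1.3, 0.9], "g2": [2.0, 2.2, 1.8]},
    {"g1": [3.0, 3.5, 2.5], "g2": [2.1, 2.2, 2.0]},
]

jump_index, jump_distances = largest_jump(toy_stages)
ranking = per_gene_w2_squared(toy_stages[jump_index], toy_stages[jump_index + 1])
top_gene = max(ranking, key=ranking.get)
print(jump_index, top_gene)
```

In this sketch the per-gene contributions play the role of a crude molecule ranking. The actual method derives importance from the transport map of the network-informed model.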

# Constructing the model
<img src="assets/Overview.png" alt="Overview" style="zoom: 25%;" />

## Install required packages
## Datasets

Clicking a dataset name opens the website from which that dataset can be downloaded. To obtain access to the **XJTUSepsis** dataset, contact [email protected].

| Dataset | Description | Species | Stages | Type of disease |
| ---------------------------------------------------------------------- | ------------------------- | ------------ | ------ | ------------------------------------- |
| [GSE48452](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE48452) | non-alcoholic fatty liver | Homo sapiens | 4 | Chronic progressive benign disease |
| [GSE2565](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE2565) | lung injury | Mus musculus | 10 | Acute progressive noncritical disease |
| [LUAD](https://portal.gdc.cancer.gov/projects/TCGA-LUAD) | lung adenocarcinoma | Homo sapiens | 8 | Chronic progressive malignant disease |
| [COAD](https://portal.gdc.cancer.gov/projects/TCGA-COAD) | colon adenocarcinoma | Homo sapiens | 8 | Chronic progressive malignant disease |
| XJTUSepsis | sepsis | Homo sapiens | 8 | Acute progressive critical disease |
| [GSE154918](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE154918) | sepsis | Homo sapiens | 4 | Acute progressive critical disease |

## Prerequisites

1 Install required packages

```bash
pip install -r requirements.txt
```

## Running the model for different dataset
2 Download the pre-processed Protein-Protein Interaction (PPI) networks of the human and mouse genomes into "`PPI/`". We construct the PPI networks using the [STRING](https://string-db.org) database.

PPI networks of the two species: [[GGOT_PPI](https://drive.google.com/file/d/1-DyWv20mInYLvRTWN3iGT1QjaY6683Ro/view?usp=sharing)]

3 Download the RNA-seq datasets of the different diseases into "`data/source/`".

Raw datasets: [[GGOT_datasets](https://drive.google.com/file/d/1c0SqU3dq22lE5qNlW7wbKG5UqoHKS83_/view?usp=sharing)]

**Make sure your data file tree is similar to the following.**

```
GGOT
├──run_model.py
├──data/
│ ├──source
│ │ ├──GSE48452
│ │ │ ├──GSE48452.csv (RNA-seq expression)
│ │ │ ├──group.csv (stage information)
│ │ ├──GSE2565
│ │ │ ├──GSE2565.csv
│ │ │ ├──group.csv
│ │ ├──LUAD
│ │ │ ├──LUAD.csv
│ │ │ ├──group.csv
│ │ ├──......
├──PPI
│ ├──Human
│ │ ├──ppi_database.npy
│ │ ├──...
│ ├──Mus
│ │ ├──ppi_database.npy
│ │ ├──...
├──......
```
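
Before running the model, it can help to confirm that the layout matches the tree. The helper below is a hypothetical sketch and not part of the repository. It assumes the `<dataset>/<dataset>.csv` naming seen for GSE48452 and LUAD, a `group.csv` in every dataset folder, and a `ppi_database.npy` in every species folder. It does not check the files elided with `...` in the tree.

```python
from pathlib import Path


def missing_files(root, datasets, species):
    """List expected files (relative to root) that are not present.

    ASSUMPTION: each dataset folder holds <dataset>.csv and group.csv,
    and each PPI species folder holds ppi_database.npy.
    """
    root = Path(root)
    expected = [root / "run_model.py"]
    for name in datasets:
        folder = root / "data" / "source" / name
        expected += [folder / f"{name}.csv", folder / "group.csv"]
    for sp in species:
        expected.append(root / "PPI" / sp / "ppi_database.npy")
    return [p.relative_to(root) for p in expected if not p.is_file()]


if __name__ == "__main__":
    # Run from the repository root; prints nothing when the layout is complete.
    for path in missing_files(".", ["GSE48452", "LUAD"], ["Human", "Mus"]):
        print("missing:", path)
```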

## Running GGOT on different types of datasets

- Run the model for a chronic progressive non-critical disease, GSE48452

```bash
python run_model.py -d GSE48452 -s Human
```

- Run the model for an acute progressive non-critical disease, GSE2565

```bash
python run_model.py -d GSE2565 -s Mus
```

- Run the model for chronic progressive critical diseases, LUAD and COAD

```bash
python run_model.py -d LUAD -s Human
python run_model.py -d COAD -s Human
```

- Run the model for acute progressive critical diseases, XJTUSepsis and GSE154918

```bash
python run_model.py -d Sepsis -s Human
python run_model.py -d XJTUSepsis -s Human
python run_model.py -d GSE154918 -s Human
```
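
To run several datasets in one go, a small wrapper can build each command from the same `-d` and `-s` flags. This is a hypothetical convenience script, not part of the repository. The dataset and species pairs follow the datasets table, and `dry_run=True` only prints the commands without executing them.

```python
import subprocess
import sys

# (dataset, species) pairs taken from the datasets table.
RUNS = [
    ("GSE48452", "Human"),
    ("GSE2565", "Mus"),
    ("LUAD", "Human"),
    ("COAD", "Human"),
    ("XJTUSepsis", "Human"),
    ("GSE154918", "Human"),
]


def build_command(dataset, species, script="run_model.py"):
    """Build the argument list for one run of the model script."""
    return [sys.executable, script, "-d", dataset, "-s", species]


def run_all(runs=RUNS, dry_run=True):
    """Print every command and execute it unless dry_run is set."""
    commands = [build_command(d, s) for d, s in runs]
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the batch at the first failing run.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    run_all(dry_run=True)
```

Pass `dry_run=False` once the printed commands look right.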

## Making the visualization

`` python visualization.py -d GSE2565 ``

# Result
## Results (example on simulation dataset)

<img src="assets/Numsim.png" alt="Numsim" style="zoom:30%;">
<img src="assets/Numsim.png" alt="Numsim" style="zoom:25%;">
