Gene-compression-and-Cancer-type-classification

Fall 2021 Final Project for CM226 - Machine Learning in Bioinformatics

Dataset

TCGA (The Cancer Genome Atlas) PanCanAtlas RNA-seq data from the National Cancer Institute Genomic Data Commons.
The raw data consisted of 11,069 samples with 20,531 measured genes.

Preprocessing:
Tumors that were measured from multiple sites were removed.
The data was normalised.

This resulted in a final TCGA PanCanAtlas gene expression matrix with 11,060 samples, covering 33 different cancer types, and 16,148 genes.
The data was split into 90% training and 10% testing partitions, such that each split contained relatively equal representation of each cancer type.
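
The 90/10 split with balanced cancer-type representation described above can be sketched with scikit-learn's `train_test_split` and its `stratify` argument. This is a minimal illustration on synthetic data, not the code in `preprocess.py`; the matrix size, the three label names, and the random seed are assumptions made for the example.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for the expression matrix: 300 samples x 50 genes.
# Three hypothetical cancer-type labels are used here; the real data has 33.
X = rng.random((300, 50))
y = np.repeat(["BRCA", "LUAD", "KIRC"], [150, 100, 50])

# stratify=y keeps the label proportions (roughly) the same in both partitions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.10, stratify=y, random_state=0
)

def class_fractions(labels):
    """Return the fraction of samples belonging to each label."""
    values, counts = np.unique(labels, return_counts=True)
    return dict(zip(values, counts / counts.sum()))

train_fractions = class_fractions(y_train)
test_fractions = class_fractions(y_test)
print(train_fractions)
print(test_fractions)
```

Comparing the two printed dictionaries is a quick check that no cancer type is over- or under-represented in the test partition.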

Proposed Model

Implementation of PCA, NMF, ICA

These models are built using the scikit-learn (sklearn) library.
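
As a rough illustration of how such a comparison might be set up, the sketch below compresses features with PCA, NMF, and ICA from scikit-learn and feeds each compressed representation to a logistic regression classifier. It runs on synthetic data; the use of logistic regression (suggested by the `_LR` notebook names), the number of components, the min-max scaling step, and all hyperparameters are assumptions, not the settings used in the notebooks.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA, NMF, FastICA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the gene expression matrix and cancer-type labels.
X, y = make_classification(
    n_samples=400, n_features=60, n_informative=10, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.10, stratify=y, random_state=0
)

n_components = 10  # assumed size of the compressed representation
compressors = {
    "PCA": PCA(n_components=n_components, random_state=0),
    "NMF": NMF(n_components=n_components, init="nndsvda", max_iter=500, random_state=0),
    "ICA": FastICA(n_components=n_components, max_iter=1000, random_state=0),
}

scores = {}
for name, compressor in compressors.items():
    # Scaling to [0, 1] with clipping keeps the input non-negative, which NMF requires.
    model = make_pipeline(
        MinMaxScaler(clip=True), compressor, LogisticRegression(max_iter=1000)
    )
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # accuracy on the held-out 10%

for name, accuracy in scores.items():
    print(f"{name}: test accuracy = {accuracy:.3f}")
```

Wrapping each compressor in a pipeline means the scaler and the decomposition are fitted on the training partition only, so the test partition does not leak into the compressed features.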

Implementation of VAE Model

This VAE model is inspired by the Tybalt implementation.
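
For readers unfamiliar with VAEs, the following NumPy sketch shows a single forward pass: a linear encoder producing a mean and log-variance, the reparameterisation step, a sigmoid decoder, and the loss as reconstruction error plus KL divergence. It is not the repository's model or Tybalt's code; the layer sizes, the single-layer architecture, the random weights, and the binary cross-entropy reconstruction term are all assumptions chosen for illustration, and no training is performed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 8 samples, 20 genes, 4 latent features.
batch, n_genes, latent_dim = 8, 20, 4
x = rng.random((batch, n_genes))  # expression assumed scaled to [0, 1]

# Randomly initialised weights; a real model would learn these by gradient descent.
W_mu = rng.normal(0.0, 0.1, (n_genes, latent_dim))
W_logvar = rng.normal(0.0, 0.1, (n_genes, latent_dim))
W_dec = rng.normal(0.0, 0.1, (latent_dim, n_genes))

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# Encoder: map each sample to the mean and log-variance of a Gaussian over z.
mu = x @ W_mu
logvar = x @ W_logvar

# Reparameterisation: z = mu + sigma * eps, with eps drawn from N(0, I).
eps = rng.standard_normal(mu.shape)
z = mu + np.exp(0.5 * logvar) * eps

# Decoder: reconstruct expression values in (0, 1) from the latent code.
x_hat = sigmoid(z @ W_dec)

# Reconstruction term: binary cross-entropy summed over genes, per sample.
tiny = 1e-7  # avoids log(0)
recon = -np.sum(x * np.log(x_hat + tiny) + (1 - x) * np.log(1 - x_hat + tiny), axis=1)

# KL divergence between N(mu, sigma^2) and the standard normal prior, per sample.
kl = -0.5 * np.sum(1 + logvar - mu**2 - np.exp(logvar), axis=1)

loss = np.mean(recon + kl)
print(f"mean loss over the batch: {loss:.3f}")
```

After training, the encoder mean `mu` would serve as the compressed representation passed to a downstream classifier, analogous to the PCA, NMF, and ICA components.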

Authors

Rushi Bhatt
Ronak Kaoshik
Shruti Mohanty

See also the list of contributors who participated in this project.

License

This project is licensed under the MIT License. See the LICENSE file for details.


RushiBhatt007/Gene-compression-and-Cancer-type-classification
