
SymNMF algorithm implemented in C and provided as a Python module.


Koren-Ben-Ezra/symNMF-algorithm


📚 Academic Course Project

As part of an academic course, I implemented Symmetric Non-negative Matrix Factorization (symNMF), a clustering algorithm. symNMF clusters data by factorizing a symmetric, non-negative similarity matrix built from the data points.

🚀 Practical Applications of symNMF Algorithm

symNMF is an alternative to traditional clustering methods like K-means. Because it operates on a symmetric non-negative similarity matrix, it can be applied wherever pairwise similarity between items is meaningful. In healthcare, for example, symNMF could be used to group patients with similar symptoms or disease progression, supporting personalized treatment plans and medical research. In finance, it could help uncover patterns in market trends to inform investment portfolio decisions.

🛠️ How to Use the Project

To use the symNMF module for your clustering tasks, follow these steps:

1. Downloading the Project

Clone or download the symNMF project repository from GitHub to your local machine.

2. Setting Up the Environment

Ensure you have Python installed on your system, along with necessary libraries such as NumPy and SciPy.

4. Integration into Your Project

Import the symnmf.py module into your Python project to access the symNMF functionalities.

5. Specifying Parameters

Customize the symNMF algorithm by specifying parameters such as the number of clusters (k) and the desired goal (goal). The available goals include:

  • symnmf: Perform the full symNMF clustering algorithm and output the resulting matrix H.
  • sym: Calculate and output the similarity matrix A.
  • ddg: Calculate and output the Diagonal Degree Matrix D.
  • norm: Calculate and output the normalized similarity matrix W.

6. Calling Functions

Call the relevant functions from symnmf.py based on your specific clustering objectives:

  • To perform symNMF clustering: symnmf(k, 'symnmf', file_name)
  • To calculate the similarity matrix: symnmf(k, 'sym', file_name)
  • To compute the Diagonal Degree Matrix: symnmf(k, 'ddg', file_name)
  • To derive the normalized similarity matrix: symnmf(k, 'norm', file_name)

7. Analyzing Results

Analyze the output matrices generated by symNMF to gain insight into your dataset's underlying structure. Compare the clustering quality against a baseline algorithm such as K-means, using a metric such as the silhouette score.
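One common way to read the matrix H is to assign each data point to the cluster whose column holds the largest value in that point's row. The sketch below assumes this convention; the project itself does not state how labels are derived, and `cluster_labels` is a hypothetical helper, not part of symnmf.py.

```python
def cluster_labels(H):
    """Assign each row (data point) to the column (cluster) with the largest entry.

    Assumption: hard labels are taken as the arg max of each row of H.
    """
    return [max(range(len(row)), key=row.__getitem__) for row in H]


# Illustrative H for 3 points and k = 2 clusters (made-up values).
H = [[0.9, 0.1],
     [0.8, 0.0],
     [0.05, 0.7]]
labels = cluster_labels(H)  # → [0, 0, 1]
```

The resulting labels can then be passed to any clustering metric, such as the silhouette score, alongside labels produced by K-means.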







📝 Appendix: Special Mathematical Matrices

1. Similarity Matrix (A)

The similarity matrix $A \in \mathbb{R}^{n \times n}$ is defined as:

$$a_{ij} = \begin{cases} \exp\left(-\frac{\|x_i - x_j\|^2}{2}\right) & \text{if } i \neq j \\ 0 & \text{otherwise} \end{cases}$$
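As an illustration of this definition, here is a minimal pure-Python sketch that uses only the standard library. The function name `similarity_matrix` and the sample points are assumptions made for the example; the project's C implementation may be organized differently.

```python
import math


def similarity_matrix(points):
    """Return A with a_ij = exp(-||x_i - x_j||^2 / 2) for i != j and 0 on the diagonal."""
    n = len(points)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Squared Euclidean distance between point i and point j.
            sq_dist = sum((p - q) ** 2 for p, q in zip(points[i], points[j]))
            # A is symmetric, so fill both entries at once.
            A[i][j] = A[j][i] = math.exp(-sq_dist / 2)
    return A


# Three made-up 2-D points.
points = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
A = similarity_matrix(points)
```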

2. Diagonal Degree Matrix (D)

The degree matrix $D$ is defined as the diagonal matrix with degrees $d_1, \dots, d_n$ on the diagonal and zero elsewhere. The degree of a vertex $x_i \in X$ is defined as:

$$d_i = \sum_{j=1}^n a_{ij}$$
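The degree matrix follows directly from the row sums of A. The sketch below is a hedged illustration with a hypothetical helper name and a made-up input matrix, not the project's own code.

```python
def degree_matrix(A):
    """Return the diagonal matrix D whose i-th diagonal entry is the sum of row i of A."""
    n = len(A)
    return [[sum(A[i]) if i == j else 0.0 for j in range(n)] for i in range(n)]


# Made-up symmetric similarity matrix with a zero diagonal.
A = [[0.0, 0.5, 0.25],
     [0.5, 0.0, 0.125],
     [0.25, 0.125, 0.0]]
D = degree_matrix(A)  # diagonal entries: 0.75, 0.625, 0.375
```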

3. Normalized Similarity Matrix (W)

The normalized similarity matrix $W \in \mathbb{R}^{n \times n}$ is closely related to the normalized graph Laplacian, which equals $I - W$. It is defined as:

$$W = D^{-1/2}AD^{-1/2}$$
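Because D is diagonal, the product reduces to scaling each entry of A: w_ij = a_ij / sqrt(d_i · d_j). The sketch below assumes every degree is strictly positive (otherwise the inverse square root is undefined); the helper name and input are illustrative only.

```python
import math


def normalized_similarity(A):
    """Return W = D^(-1/2) A D^(-1/2), assuming every row of A has a positive sum."""
    n = len(A)
    # Entry i holds d_i^(-1/2), the i-th diagonal entry of D^(-1/2).
    inv_sqrt_deg = [1.0 / math.sqrt(sum(row)) for row in A]
    return [[inv_sqrt_deg[i] * A[i][j] * inv_sqrt_deg[j] for j in range(n)]
            for i in range(n)]


# Made-up similarity matrix in which every vertex has degree 2.
A = [[0.0, 1.0, 1.0],
     [1.0, 0.0, 1.0],
     [1.0, 1.0, 0.0]]
W = normalized_similarity(A)  # off-diagonal entries are approximately 0.5
```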

4. Decomposition Matrix (H)

$$H = \underset{H \in \mathbb{R}^{n \times k},\ H \geq 0}{\arg\min}\ \|W - HH^T\|_F^2$$
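This README does not state how the minimization is carried out. One widely used approach is a damped multiplicative update, sketched below in pure Python. The update rule, the damping factor `beta`, the random initialization range, the stopping threshold `eps`, and the iteration cap are all assumptions chosen for illustration, and `symnmf_sketch` is a hypothetical name that may not match the repository's C code.

```python
import math
import random


def matmul(X, Y):
    """Multiply two matrices given as lists (or tuples) of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def symnmf_sketch(W, k, beta=0.5, eps=1e-4, max_iter=300, seed=0):
    """Approximate arg min_{H >= 0} ||W - H H^T||_F^2 by multiplicative updates.

    Assumed update: H_ij <- H_ij * (1 - beta + beta * (W H)_ij / (H H^T H)_ij).
    """
    n = len(W)
    rng = random.Random(seed)
    # Assumed initialization: uniform on [0, 2 * sqrt(mean(W) / k)].
    mean_w = sum(map(sum, W)) / (n * n)
    upper = 2 * math.sqrt(mean_w / k)
    H = [[rng.uniform(0, upper) for _ in range(k)] for _ in range(n)]
    for _ in range(max_iter):
        WH = matmul(W, H)                    # n x k
        HtH = matmul(list(zip(*H)), H)       # k x k
        HHtH = matmul(H, HtH)                # n x k, equals (H H^T) H
        # The small constant guards against division by zero.
        new_H = [[H[i][j] * (1 - beta + beta * WH[i][j] / (HHtH[i][j] + 1e-12))
                  for j in range(k)] for i in range(n)]
        # Squared Frobenius norm of the change between iterations.
        change = sum((new_H[i][j] - H[i][j]) ** 2
                     for i in range(n) for j in range(k))
        H = new_H
        if change < eps:
            break
    return H


# Made-up normalized similarity matrix: two disconnected pairs of points.
W = [[0.0, 1.0, 0.0, 0.0],
     [1.0, 0.0, 0.0, 0.0],
     [0.0, 0.0, 0.0, 1.0],
     [0.0, 0.0, 1.0, 0.0]]
H = symnmf_sketch(W, k=2)
```

Since W and the initial H are non-negative and the update multiplies each entry by a factor of at least `1 - beta`, every iterate stays non-negative, so the constraint H ≥ 0 holds without explicit projection.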
