Commit

Minor fixes
anton-bushuiev committed Jan 23, 2024
1 parent d158fd4 commit 0f26515
Showing 3 changed files with 195 additions and 170 deletions.
14 changes: 7 additions & 7 deletions README.md
@@ -12,7 +12,7 @@ PPIRef is a Python package for processing and analysing 3D structures of protein

# PPIRef dataset

-PPIRef is a complete* and non-redundant dataset of protein-protein interactions (PPIs). It was constructed in three steps: (i) exhaustevily extract all putative protein dimers from PDB based on heavy atom contacts, (ii) filter out not proper PPIs based on buried surface area and other criteria, (iii) remove near-duplicate PPIs with iDist - a fast algorithm accurately approximating PPI alignment-based algorithms such as iAlign. See [our paper](https://arxiv.org/abs/2310.18515) for details.
+PPIRef is a complete* and non-redundant dataset of protein-protein interactions (PPIs). It was constructed in three steps: (i) exhaustevily extract all putative protein dimers from PDB based on heavy atom contacts, (ii) filter out not proper PPIs based on buried surface and quality criteria, (iii) remove near-duplicate PPIs with iDist - a fast algorithm accurately approximating PPI alignment-based algorithms such as iAlign. See [our paper](https://arxiv.org/abs/2310.18515) for details.
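
Step (i) of the description above can be illustrated with a minimal sketch. It is not the package's implementation: it only shows the idea of calling two chains a putative dimer when any pair of heavy atoms, one from each chain, lies within a cutoff distance. The function name `chains_in_contact`, the plain coordinate lists, and the 6 Å default (borrowed from the `radius=6.` argument in the README examples) are assumptions made for illustration.

```python
import math

def chains_in_contact(coords_a, coords_b, cutoff=6.0):
    """Return True if any atom in coords_a is within `cutoff` of any atom in coords_b.

    coords_a, coords_b: sequences of (x, y, z) heavy-atom coordinates in angstroms.
    Hypothetical helper; the cutoff value is an assumption, not a package default.
    """
    for atom_a in coords_a:
        for atom_b in coords_b:
            # Euclidean distance between the two atoms
            if math.dist(atom_a, atom_b) <= cutoff:
                return True
    return False

# Toy coordinates: the closest A-B pair is 3.5 angstroms apart, A-C pairs are far away
chain_a = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
chain_b = [(5.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
chain_c = [(30.0, 0.0, 0.0)]

print(chains_in_contact(chain_a, chain_b))
print(chains_in_contact(chain_a, chain_c))
```

The double loop is quadratic in the number of atoms; for whole PDB entries a spatial index or vectorised distance computation would likely be preferable.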

<p align="center">
<img align="right" width="350" src="https://github.com/anton-bushuiev/PPIRef/assets/67932762/f5d12ffb-8d1b-40b8-bab1-23d33a091f05"/>
@@ -26,7 +26,7 @@ The PPIRef dataset can be downloaded from Zenodo (TBD soon). Alternatively, the

## Installation

-To install the package, clone the repository and run `pip install .` in the root directory. The package was tested with Python 3.9.
+To install the package, clone the repository (`git clone git@github.com:anton-bushuiev/PPIRef.git; cd PPIRef`) and run `pip install .` in the root directory. The package was tested with Python 3.9.

Please see the `external/README.md` directory for the details on how to install the external software for comparing PPIs and calculating buried surface area (BSA).

@@ -53,15 +53,15 @@ extractor.extract(pdb_file, partners=['A', 'C'])
# Extract a contact-based PPI between three specified chains (trimer)
extractor.extract(pdb_file, partners=['A', 'B', 'C'])

-# Extract a complete complex by setting high expansion radius around interface
+# Extract a complete dimer complex by setting high expansion radius around interface
ppi_complexes_dir = PPIREF_TEST_DATA_DIR / 'ppi_dir_complexes'
extractor = PPIExtractor(out_dir=ppi_complexes_dir, kind='heavy', radius=6., bsa=False, expansion_radius=1_000_000.)
extractor.extract(pdb_file, partners=['A', 'C'])
```

## Analysing PPIs

-After the extraction, one can visualize the PPIs via a PyMOL wrapper and get their statistics.
+After the extraction, one can visualize the PPIs via the PyMOL wrapper and get their statistics.

```python
from ppiref.visualization import PyMOL
@@ -110,7 +110,7 @@ from ppiref.comparison import IAlign, USalign, IDist
extractor = PPIExtractor(out_dir=ppi_dir, kind='heavy', radius=6., bsa=False)
extractor.extract(PPIREF_TEST_DATA_DIR / '1p7z.pdb', partners=['A', 'C'])
extractor.extract(PPIREF_TEST_DATA_DIR / '3p9r.pdb', partners=['B', 'D'])
-ppis = [PPIREF_TEST_DATA_DIR / 'p7/1p7z_A_C.pdb', PPIREF_TEST_DATA_DIR / 'p9/3p9r_B_D.pdb']
+ppis = [ppi_dir / 'p7/1p7z_A_C.pdb', ppi_dir / 'p9/3p9r_B_D.pdb']

# Compare with iAlign (see external/README.md for installation)
ialign = IAlign()
@@ -157,7 +157,7 @@ idist.embeddings

# And then query for near-duplicates
idist.query(idist.embeddings['1p7z_A_C'])
-> (array([2.63417803e-09, 2.66141793e-03]),
+> (array([0. , 0.00346618]),
> array(['1p7z_A_C', '3p9r_B_D'], dtype=object))

# Or directly deduplicate them
@@ -187,7 +187,7 @@ split = read_split('skempi2_iclr24_split')
split['test'][:3]
> ['1B3S_A_D', '1B2U_A_D', '1BRS_A_D']

-# DIPS set used to train EquiDock
+# DIPS set used to train EquiDock and DiffDock-PP
split = read_split('dips_equidock')
split['train'][:3]
> ['1v6j_A_D', '2v6a_L_A', '2v6a_B_O']
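
The paths in the comparison example (`ppi_dir / 'p7/1p7z_A_C.pdb'`, `ppi_dir / 'p9/3p9r_B_D.pdb'`) suggest that extracted PPIs are stored in subdirectories named after the two middle characters of the PDB ID, in files named `<pdb_id>_<chain>_<chain>.pdb`. A minimal sketch of mapping a PPI ID to such a path, assuming this layout holds in general; the helper `ppi_id_to_path` is hypothetical and not part of the package:

```python
from pathlib import Path

def ppi_id_to_path(ppi_dir, ppi_id):
    """Map a PPI ID such as '1p7z_A_C' to '<ppi_dir>/p7/1p7z_A_C.pdb'.

    Hypothetical helper: the directory layout is inferred from the README
    examples and is an assumption, not a documented guarantee.
    """
    pdb_id = ppi_id.split('_')[0]
    if len(pdb_id) != 4:
        raise ValueError(f'Expected a 4-character PDB ID, got {pdb_id!r}')
    # Subdirectory is the two middle characters of the PDB ID
    return Path(ppi_dir) / pdb_id[1:3] / f'{ppi_id}.pdb'

print(ppi_id_to_path('ppi_dir', '1p7z_A_C').as_posix())
print(ppi_id_to_path('ppi_dir', '3p9r_B_D').as_posix())
```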
350 changes: 188 additions & 162 deletions notebooks/demo.ipynb

Large diffs are not rendered by default.

1 change: 0 additions & 1 deletion requirements.txt
@@ -112,7 +112,6 @@ pytest==7.3.1
python-dateutil==2.8.2
python-json-logger==2.0.7
pytz==2023.3
-PyYAML==5.4.1
pyzmq==25.0.2
qtconsole==5.4.3
QtPy==2.3.1