dataset-joined pdb_residues file doesn't match with fasta sequence #57

Open
ProkopDivin opened this issue Jul 3, 2022 · 0 comments

@ProkopDivin

I ran these commands, where joined.ds is from https://github.com/rdk/p2rank-datasets:

./prank.sh analyze residues joined.ds
./prank analyze fasta-masked joined.ds

But several of the residue files don't match the FASTA sequences.
All the files are here:
files.zip

In these files, the sequence lengths of chains I and L are OK, but according to the CSV file the sequence of chain H should be longer.

1hxf.pdb_residues.csv

1hxf_H.fasta
1hxf_I.fasta
1hxf_L.fasta

In these files, the length of chain A is 66 and the length of chain B is 65 (131 in total), but there are 232 rows in 1pts.pbd_residues.csv, and I'm not getting any other FASTA files.

1pts.pbd_residues

1pts_A.fasta
1pts_B.fasta

For the following structures, I always get one FASTA file for each residues CSV file, and the sequence is shorter than the number of rows in the CSV.

1bbs.pdb_residues.csv
1bb_A.fasta

1chg.pdb_residues.csv
1chg_A.fasta

1djb.pdb_residues.csv
1djb_A.fasta

2cba.pdb_residues.csv
2cba_A.fasta

2fbp.pdb_residues.csv
2fbp_A.fasta

2tga.pdb_residues.csv
2tga_A.fasta

3lck.pdb_residues.csv
3lck_A.fasta

3p2p.pdb_residues.csv
3p2p_A.fasta

3ptn.pdb_residues.csv
3ptn_A.fasta

4ca2.pdb_residues.csv
4ca2_A.fasta

5dfr.pdb_residues.csv
5dfr_A.fasta
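
The comparison described above (rows in a residues CSV versus the length of each chain's FASTA sequence) can be reproduced with a small script. This is only a minimal sketch, not part of P2Rank: the helper names are made up for illustration, and it assumes the CSV has a header row with a chain column called `chain` (the real column name may differ, so it is a parameter) and that each CSV row corresponds to one residue.

```python
import csv
from collections import Counter


def fasta_length(path):
    """Return the number of sequence characters in a FASTA file (header lines skipped)."""
    total = 0
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line and not line.startswith(">"):
                total += len(line)
    return total


def rows_per_chain(csv_path, chain_column="chain"):
    """Count data rows per chain in a residues CSV.

    Assumption: the file has a header row containing `chain_column`.
    Header names and chain values are stripped of surrounding whitespace.
    """
    counts = Counter()
    with open(csv_path, newline="") as handle:
        reader = csv.reader(handle)
        header = [name.strip() for name in next(reader)]
        chain_index = header.index(chain_column)
        for row in reader:
            if row:  # skip blank lines
                counts[row[chain_index].strip()] += 1
    return counts


def compare(csv_path, fasta_paths, chain_column="chain"):
    """Return {chain: (csv_rows, fasta_length)} for chains whose counts differ.

    `fasta_paths` maps a chain id to its FASTA file path (hypothetical layout).
    A chain missing on either side is reported with a count of 0 for that side.
    """
    counts = rows_per_chain(csv_path, chain_column)
    mismatches = {}
    for chain in sorted(set(counts) | set(fasta_paths)):
        n_rows = counts.get(chain, 0)
        n_seq = fasta_length(fasta_paths[chain]) if chain in fasta_paths else 0
        if n_rows != n_seq:
            mismatches[chain] = (n_rows, n_seq)
    return mismatches
```

For example, `compare("1hxf.pdb_residues.csv", {"H": "1hxf_H.fasta", "I": "1hxf_I.fasta", "L": "1hxf_L.fasta"})` would list only the chains whose row count and sequence length disagree, which makes it easier to see whether the extra rows belong to a chain that has no FASTA file at all.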

@rdk rdk added the bug label Oct 12, 2022