No hits from DIAMOND #170

JosieMainwaring · 2024-03-15T19:56:20Z

Hi all,

I have annotated the example E. coli K12 genome and my genome of interest using run_dbcan on a VirtualBox Linux system. The code ran without errors and the output files were produced as expected. However, in both cases there are zero hits in the DIAMOND column (every entry is '-', so no gene is hit by all 3 tools), which is unexpected for both genomes.

Does anyone know what might be causing this?

For reference, the diamond version I'm running is 2.0.11

Any help appreciated!

JosieMainwaring · 2024-03-17T17:54:46Z

Hi, I'm still having this issue.
I've tried building the databases using
dbcan_build --cpus 8 --db-dir db --clean
or by the Database Installation Command, and the problem persists, even though it seems like diamond has been installed.
The diamond.out files are not populated.
Any help please?
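A quick sanity check on the database directory can rule out an empty or missing DIAMOND index before rerunning anything. The sketch below is only an illustration: the function name `check_dmnd_files` is made up here, and it assumes the DIAMOND database files in the db directory end in `.dmnd`.

```shell
# List DIAMOND database files (*.dmnd) in a db directory and flag empty ones.
# ASSUMPTION: the databases built for run_dbcan live in one directory
# (default "db") and carry the .dmnd extension.
check_dmnd_files() {
    local dir="${1:-db}"
    local found=0
    for f in "$dir"/*.dmnd; do
        # If the glob matched nothing, the literal pattern is returned; skip it.
        [ -e "$f" ] || continue
        found=1
        if [ -s "$f" ]; then
            echo "ok: $f"
        else
            echo "EMPTY: $f"
        fi
    done
    [ "$found" -eq 1 ] || { echo "no .dmnd files in $dir" >&2; return 1; }
}
```

Calling `check_dmnd_files db` after the build should print one `ok:` line per index file. For a closer look at a single file, `diamond dbinfo -d <file>` prints information about a DIAMOND database.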

linnabrown · 2024-04-18T23:50:01Z

The Diamond version here is 2.1.9. I just created a new environment and installed dbCAN following our documentation.
It is very strange that there are no hits for diamond on your end.
I tried this command on the example E. coli genome. It selects only diamond, so there are no results for EC number, HMMER, or dbCAN_sub:

run_dbcan EscheriaColiK12MG1655.faa protein --out_dir output_233 -t diamond

The overview result is as follows:

Gene ID EC#     HMMER   dbCAN_sub       DIAMOND #ofTools
NP_414562.1     -       -       -       GT77    1
NP_414631.1     -       -       -       GT28    1
NP_414632.1     -       -       -       GT28    1
NP_414638.1     -       -       -       CE11    1
NP_414654.1     -       -       -       GH13_3  1
NP_414672.1     -       -       -       CE4     1
NP_414691.1     -       -       -       GT51    1
NP_414724.1     -       -       -       GT19    1
NP_414726.1     -       -       -       GH13_30 1
NP_414736.1     -       -       -       CBM50+GH25      1
NP_414747.1     -       -       -       CBM50+GH23      1
NP_414805.1     -       -       -       GH43_11 1
NP_414845.1     -       -       -       AA3_2   1
NP_414869.1     -       -       -       GH1     1
NP_414877.1     -       -       -       GH36    1
NP_414878.1     -       -       -       GH2     1
NP_414879.3     -       -       -       GH2     1
NP_414897.1     -       -       -       GT2     1
NP_414936.1     -       -       -       GH13_3  1
NP_414937.2     -       -       -       CBM34+GH13_21   1
NP_415006.1     -       -       -       GH152   1
NP_415017.1     -       -       -       CBM50   1
NP_415059.1     -       -       -       GH27    1
NP_415087.1     -       -       -       GH24    1
NP_415101.1     -       -       -       GT0     1
NP_415108.1     -       -       -       GH13_3  1
NP_415118.1     -       -       -       GT2     1
NP_415167.1     -       -       -       GH103   1
NP_415168.1     -       -       -       GH103   1
NP_415175.1     -       -       -       GH13_26 1
NP_415188.1     -       -       -       GT4     1
NP_415203.1     -       -       -       CE9     1
NP_415206.1     -       -       -       CE8     1
NP_415214.1     -       -       -       CBM48+GH13_9    1
NP_415252.1     -       -       -       GT51    1
NP_415254.1     -       -       -       GT2     1
NP_415255.1     -       -       -       GT2     1
NP_415256.1     -       -       -       GT2     1
NP_415257.1     -       -       -       GT22    1
NP_415260.1     -       -       -       GH38    1
NP_415279.1     -       -       -       GH3     1
NP_415293.1     -       -       -       CE8     1
NP_415296.1     -       -       -       AA5_1   1
NP_415403.1     -       -       -       GT4     1
NP_415410.1     -       -       -       GT2     1
NP_415541.1     -       -       -       GT2     1
NP_415542.1     -       -       -       CE4+GH153       1
NP_415543.1     -       -       -       CE4+GH153       1
NP_415567.1     -       -       -       GT2     1

Do you get the same overview result as mine?
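To compare two runs without reading the table by eye, one option is to count the rows that have a DIAMOND assignment. This is a minimal sketch, assuming the overview file is tab-separated, has a header column named exactly `DIAMOND`, and uses `-` for "no hit" as in the listing above. The function name `count_diamond_hits` is hypothetical.

```shell
# Count rows of a run_dbcan overview table that carry a DIAMOND assignment.
# ASSUMPTION: tab-separated file, header row contains a "DIAMOND" column,
# and "-" marks a missing hit.
count_diamond_hits() {
    awk -F'\t' '
        NR == 1 {
            # Locate the DIAMOND column by name rather than by position.
            for (i = 1; i <= NF; i++) if ($i == "DIAMOND") col = i
            if (!col) { print "no DIAMOND column in header" > "/dev/stderr"; exit 1 }
            next
        }
        $col != "-" && $col != "" { n++ }
        END { if (col) print n + 0 }
    ' "$1"
}
```

Running `count_diamond_hits output_233/overview.txt` prints a single number. A result of 0 on the example E. coli proteins would point at the DIAMOND step rather than at the input data.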

JosieMainwaring · 2024-05-22T01:20:10Z

Thanks for your reply! I updated my Diamond version to 2.1.9 and tried the above, and I'm still having the same problem! It runs as expected with no errors, but the diamond.out files and the DIAMOND column of the overview.txt file are still empty. Any other thoughts?

JosieMainwaring · 2024-05-22T03:20:41Z

I've just tried from scratch again, setting up a new environment and installing everything again from scratch and still having the same issue :( looking like this data will just be missing from my dissertation! (Which is due next week)

JosieMainwaring · 2024-05-22T03:28:19Z

I need the run_dbcan version too (I can't use the online version) because it's a fungal genome.

linnabrown · 2024-05-22T03:50:39Z

Can you provide the data you are using? It does not make sense that diamond returns no hits.

JosieMainwaring · 2024-05-22T04:38:43Z

Thanks for replying. I've been using the example data to try to get it to work. I have tried both nucleotide and amino acid sequences, using
"run_dbcan EscheriaColiK12MG1655.fna prok --out_dir output_EscheriaColiK12MG1655"
as well as the code you provided above:
"run_dbcan EscheriaColiK12MG1655.faa protein --out_dir output_233 -t diamond"

And running my query data gave the same issue

yinlabniu · 2024-05-22T04:49:09Z

Sounds like your diamond might have not actually worked. If you run "diamond help", do you see the help information? If you do, you can try to run diamond on your protein file directly on the command line (i.e., not using run_dbcan), like "diamond blastp -d {cazy_indexfile} -e {dia_eval} -q {yourfaafile} -k 1 -o diamond.out -f 6". Let me know if you see any output in diamond.out. Yanbin
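A minimal sketch of running that command with the placeholders filled in by arguments. The function name `diamond_direct` is made up for illustration. The default E-value of `1e-102` and a database prefix such as `db/CAZy` are assumptions about a typical run_dbcan setup, not values confirmed in this thread, so check your own db directory and the run_dbcan help output before relying on them.

```shell
# Run DIAMOND directly (bypassing run_dbcan) and report whether hits were written.
# Usage: diamond_direct <db_prefix> <query.faa> [evalue] [outfile]
# ASSUMPTION: the default E-value 1e-102 is a guess at the run_dbcan setting.
diamond_direct() {
    local db="$1"
    local query="$2"
    local evalue="${3:-1e-102}"
    local out="${4:-diamond.out}"

    # Fail early with a clear message if DIAMOND or the query file is missing.
    command -v diamond >/dev/null 2>&1 || { echo "diamond not found on PATH" >&2; return 2; }
    [ -s "$query" ] || { echo "query file missing or empty: $query" >&2; return 2; }

    # Same flags as the command quoted above: best hit only (-k 1), tabular output (-f 6).
    diamond blastp -d "$db" -e "$evalue" -q "$query" -k 1 -o "$out" -f 6 || return 1

    if [ -s "$out" ]; then
        echo "hits: $(( $(wc -l < "$out") ))"
    else
        echo "no hits written to $out"
        return 1
    fi
}
```

For example, `diamond_direct db/CAZy EscheriaColiK12MG1655.faa` would write diamond.out in the current directory. If this prints a hit count while run_dbcan still leaves diamond.out empty, the problem is more likely in how run_dbcan locates its database than in DIAMOND itself.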

JosieMainwaring · 2024-05-22T05:02:20Z

Yes, I see the following:
"
(dbcan3) tup@Tuptop-VirtualBox:~$ diamond help
diamond v2.1.9.163 (C) Max Planck Society for the Advancement of Science, Benjamin Buchfink, University of Tuebingen
Documentation, support and updates available at http://www.diamondsearch.org
Please cite: http://dx.doi.org/10.1038/s41592-021-01101-x Nature Methods (2021)

Syntax: diamond COMMAND [OPTIONS]

Commands:
makedb Build DIAMOND database from a FASTA file
prepdb Prepare BLAST database for use with Diamond
blastp Align amino acid query sequences against a protein reference database
blastx Align DNA query sequences against a protein reference database
cluster Cluster protein sequences
linclust Cluster protein sequences in linear time
realign Realign clustered sequences against their centroids
recluster Recompute clustering to fix errors
reassign Reassign clustered sequences to the closest centroid
view View DIAMOND alignment archive (DAA) formatted file
merge-daa Merge DAA files
help Produce help message
version Display version information
getseq Retrieve sequences from a DIAMOND database file
dbinfo Print information about a DIAMOND database file
test Run regression tests
makeidx Make database index
greedy-vertex-cover Compute greedy vertex cover

Possible [OPTIONS] for COMMAND can be seen with syntax: diamond COMMAND

Online documentation at http://www.diamondsearch.org
"
I'll try this for the example data, but for my query sequence I don't have an amino acid file unfortunately!

JosieMainwaring · 2024-05-22T05:04:09Z

What do I input for cazy_indexfile and dia_eval ?

linnabrown · 2024-05-22T05:46:34Z

Can you install the docker version? This is the fastest way.

JosieMainwaring · 2024-05-22T05:51:18Z

I haven't tried the docker version yet - not familiar with Docker at all. But I'll give it a try

Edit: Will it be fastest for a noob who doesn't yet have Docker installed?

JosieMainwaring · 2024-05-22T07:27:41Z

I don't have space on my computer to pull the haidyi/run_dbcan image for Docker setup - I'll have to try through my university HPC tomorrow! Thanks for help so far guys. It's the last piece of data I need - all just to write a couple of numbers into a table! Will be back tomorrow

yinlabniu · 2024-05-22T15:19:14Z

So you do have a working diamond installed, but you don't have a protein FASTA file. This is likely why you have no results in diamond.out (I also doubt that you will have meaningful results in hmmer.out). For eukaryotic genomes, we suggest using protein rather than nucleotide input; this is also stated on the help page of the dbCAN web server. Within run_dbcan, we call prodigal to predict protein-coding genes if users input a genome nucleotide FASTA. But prodigal is designed for prokaryote/phage genomes, not for eukaryotes, so we do not recommend that users give run_dbcan nucleotide input. Instead, you should predict proteins outside of run_dbcan. Yanbin
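Before choosing between the `protein` and `prok` input modes, it can help to confirm what kind of sequences a FASTA file actually holds. The sketch below is a rough heuristic, not part of run_dbcan: the function name `fasta_type` and the 90% threshold are assumptions chosen for illustration.

```shell
# Guess whether a FASTA file holds nucleotide or protein sequences.
# ASSUMPTION: a file is called "nucleotide" when more than 90% of its
# sequence letters are A, C, G, T, U or N; the threshold is arbitrary.
fasta_type() {
    awk '
        /^>/ { next }                      # skip header lines
        {
            s = toupper($0)
            gsub(/[^A-Z]/, "", s)          # keep letters only
            total += length(s)
            nt = s
            total_nt += gsub(/[ACGTUN]/, "", nt)   # gsub returns the match count
        }
        END {
            if (total == 0) { print "empty"; exit }
            if (total_nt / total > 0.9) print "nucleotide"
            else print "protein"
        }
    ' "$1"
}
```

If `fasta_type mygenome.fasta` reports `nucleotide` for a fungal genome, proteins would need to be predicted with a eukaryotic gene finder first and the resulting file passed to run_dbcan in `protein` mode.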

JosieMainwaring · 2024-05-22T19:58:25Z

That makes sense for the query sequence, but why would the example E. coli data not work either? Including with the amino acid file? If I can get the example data working, then I still have hope for my query sequence. I'd just have to translate it to .faa by other means, right?

yinlabniu · 2024-05-22T20:22:07Z

E. coli data should work. Yes, you can check whether the E. coli protein file works. Commands are here: https://dbcan.readthedocs.io/en/latest/user_guide/index.html. Example files are here: https://bcb.unl.edu/dbCAN2/download/Samples/.

JosieMainwaring · 2024-05-22T22:57:25Z

Thanks everyone for your help. I got everything (example data & query) working just by running all the same steps on my HPC. For whatever reason Diamond was just determined to be broken on my linux. So, not solved but worked around.

linnabrown · 2024-05-22T22:59:39Z

Again, I highly recommend using the Docker image if you run into this issue again. Each person may change their system configuration in ways that break the installation of other software. Docker won't affect your Linux system because it creates its own Linux environment. @JosieMainwaring

HaidYi assigned linnabrown May 8, 2024
