tBLASTn parameters and error with the PredictGenes program #35
Comments
Hi preteven, Thanks for opening the issue. Cheers, Pablo
Hi Pablo, I downloaded the BG7.jar from https://github.com/bg7/bg7 by clicking "Download this repository as a zip file" on 20.2.2013. Thanx! Primoz
OK, another question then: when you say you ran the program FixFastaHeaders, you did that before generating your BLAST XML files, right? Perhaps this point is not clearly explained in the documentation, but you are supposed to run this program (when needed) as a preprocessing tool for the input FASTA files at the very beginning of the process, that is to say, even before launching any BLAST ( @rtobes , @marina-manrique please correct me if I'm wrong about this)
Yes, the program FixFastaHeaders has to be run before launching BLAST.
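One way to catch this ordering problem before running PredictGenes is to check that every hit definition in the BLAST XML corresponds to a header in the FASTA file passed as input. The sketch below is only an illustration: it assumes the BLAST XML stores the full FASTA header in `Hit_def` and that a header mismatch is what triggers the error, which the thread suggests but does not confirm. The function names are hypothetical and are not part of BG7.

```python
import xml.etree.ElementTree as ET


def fasta_headers(fasta_text):
    """Return the set of header lines (without the leading '>') in FASTA text."""
    return {
        line[1:].strip()
        for line in fasta_text.splitlines()
        if line.startswith(">")
    }


def unmatched_hits(blast_xml_text, fasta_text):
    """Return the Hit_def values in the BLAST XML that match no FASTA header.

    Assumption: Hit_def holds the complete header of the subject sequence.
    """
    headers = fasta_headers(fasta_text)
    root = ET.fromstring(blast_xml_text)
    hit_defs = {el.text.strip() for el in root.iter("Hit_def") if el.text}
    return sorted(hit_defs - headers)
```

A non-empty result would indicate that the BLAST XML was generated from a FASTA file whose headers differ from the one now being used, in which case BLAST should be rerun on the header-fixed file.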
OK, now it works, thanx! Great, one mystery is solved, two more to go ;) Primoz
No problem ;) Regarding your other two questions: "What are the usual parameters you use for the tBLASTn comparison (evalue, word size, penalties, …)?" @rtobes , @marina-manrique could you chime in on this? Cheers, Pablo
It depends on the goals of your annotation, but you can try an e-value of 10E-20 and default values for the rest of the tBLASTn parameters. You can test the system with a fragment of an available, well-annotated genome and analyze the results with different values for Extension_threshold and Overlapping_threshold. Raquel
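As a rough illustration of that suggestion, the sketch below builds a tBLASTn command line with only the e-value set and everything else left at its defaults. It assumes the NCBI BLAST+ `tblastn` executable (the thread does not say which BLAST distribution is used), reads "10E-20" as 1e-20, and uses XML output (`-outfmt 5`) because the thread refers to BLAST XML files. The file and database names are placeholders.

```python
def tblastn_command(query_fasta, db_name, out_xml, evalue="1e-20"):
    """Build an argument list for BLAST+ tblastn with a custom e-value.

    All other tBLASTn parameters are left at their defaults.
    """
    return [
        "tblastn",
        "-query", query_fasta,   # reference protein sequences
        "-db", db_name,          # nucleotide database built from the contigs
        "-evalue", str(evalue),  # assumed reading of "10E-20"
        "-outfmt", "5",          # BLAST XML output
        "-out", out_xml,
    ]


# Placeholder names; pass the list to subprocess.run(cmd, check=True) to execute.
cmd = tblastn_command("reference_proteins.fasta", "contigs_db", "blast_result.xml")
print(" ".join(cmd))
```

Building the command as a list makes it easy to rerun the same search with a different e-value when comparing results on a test fragment.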
Thanx for all the information! Now I just have to do some annotation :) Bye! Primoz
Hi,
I must say you did great work with BG7. I am trying to use the BG7 pipeline to annotate our Lactobacillus gasseri strain (454 sequencing + Sanger gap closures, in large contigs). First, I have a question about tBLASTn parameters. What are the usual parameters you use for the tBLASTn comparison (evalue, word size, penalties, …)?
Additionally, what Extension_threshold and Overlapping_threshold do you recommend starting with for this organism?
Second, I encountered a strange error with the PredictGenes program.
Here are the last few lines of the output:
hit = contig00004 length=117373 numreads=6125
Analyzing hsps hit, there are 1
Entering while... hsps.size()=1
Iteration Q047C0 has 0hits
Iteration Q047E4 has 1hits
hit = contig00009 length=78808 numreads=7139
Analyzing hsps hit, there are 1
Entering while... hsps.size()=1
Iterations finished !!! :)
java.lang.NullPointerException
at com.era7.bioinfo.annotation.PredictGenes.main(PredictGenes.java:446)
C:\Users\PTreven\Downloads\BG7\BG7-master\jars>
Interestingly, this error occurred only when I used XX_sequences_header_fixed.fna as the input. If I used the original XX_sequences.fna, the program finished successfully. FixFastaHeaderQC.jar reported no problems.
By the way, I am using Windows 7 on an Intel Core i7 at 3.4 GHz with 8 GB of RAM.
Thank you in advance for all the answers!
PTreven