I actually needed to add .\+ since my COGs were not found directly next to the locus_tag:
egrep "COG[0-9]{4}" ./Output/Standard.gff | cut -f9 | sed 's/.\+COG\([0-9]\+\).\+;locus_tag=\(GANJLKBE_[0-9]\+\);.\+/\2\tCOG\1/g' > Standard.cog
I have noticed that some of my COGs do not have a corresponding eC_number, yet the code provided in the workshop tutorial extracts all COGs. Why is that?
For example:
1. ID=AELJIOAN_00031;eC_number=2.6.1.83;Name=dapL_1;dbxref=COG:COG0436;gene=dapL_1;inference=ab initio prediction:Prodigal:2.6,similar to AA sequence:UniProtKB:A0LEA5;locus_tag=AELJIOAN_00031;product=LL-diaminopimelate aminotransferase
2. ID=AELJIOAN_00034;Name=fliS_1;dbxref=COG:COG1516;gene=fliS_1;inference=ab initio prediction:Prodigal:2.6,similar to AA sequence:UniProtKB:P39739;locus_tag=AELJIOAN_00034;product=Flagellar secretion chaperone FliS
egrep "COG[0-9]{4}" PROKKA_${date}.gff | cut -f9 | cut -f1,5 -d ';'| sed 's/ID=//g'| sed 's/;dbxref=COG:/\t/g' | grep COG
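Both commands above depend on where the attributes sit in column 9 (a fixed `cut` field, or a hard-coded locus_tag prefix), and the two example records show that this position shifts when eC_number is absent. A possible alternative, not the tutorial's method, is to look attributes up by key and print the EC number next to each COG, so records without one are easy to spot. This is only a sketch: the function name `cog_ec_table`, the file `sample.gff`, and the `-` placeholder are hypothetical, and the sample lines are shortened versions of the two records quoted above.

```shell
# Hypothetical sample: two GFF lines modelled on the records quoted above.
printf 'contig1\tProdigal:2.6\tCDS\t100\t900\t.\t+\t0\t%s\n' \
  'ID=AELJIOAN_00031;eC_number=2.6.1.83;Name=dapL_1;dbxref=COG:COG0436;gene=dapL_1;locus_tag=AELJIOAN_00031;product=LL-diaminopimelate aminotransferase' \
  'ID=AELJIOAN_00034;Name=fliS_1;dbxref=COG:COG1516;gene=fliS_1;locus_tag=AELJIOAN_00034;product=Flagellar secretion chaperone FliS' \
  > sample.gff

# Print locus_tag, COG, and EC number (or "-" when missing) for every
# feature that carries a COG, regardless of attribute order.
cog_ec_table() {
  awk -F'\t' '
    !/^#/ && NF >= 9 {
      tag = ""; cog = ""; ec = ""
      n = split($9, attrs, ";")
      for (i = 1; i <= n; i++) {
        key = attrs[i]; sub(/=.*/, "", key)      # text before the first "="
        val = attrs[i]; sub(/^[^=]*=/, "", val)  # text after the first "="
        key = tolower(key)
        if (key == "locus_tag") tag = val
        else if (key == "ec_number") ec = val
        else if (key == "dbxref" && match(val, /COG[0-9]+/))
          cog = substr(val, RSTART, RLENGTH)
      }
      if (cog != "") print tag "\t" cog "\t" (ec == "" ? "-" : ec)
    }' "$1"
}

cog_ec_table sample.gff
```

Filtering the third column for `-` (for example with `awk -F'\t' '$3 == "-"'`) would then list the COGs that have no EC number attached.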