Geptop: a gene essentiality prediction tool for COMPLETE GENOMES based on orthology and phylogeny

Essential genes are absolutely required for the survival of an organism and are therefore considered the foundation of life. Geptop is a webserver that provides the first platform for detecting essential gene sets across bacterial species, by comparing the orthology and phylogeny of query protein sets with experimentally determined essential gene datasets (from the DEG database).

For each submitted genome, the webserver reports an essentiality score for each gene, the number of predicted essential genes, and the estimated accuracy of the prediction.

How to cite Geptop: Wei W, Ning LW, Ye YN, Guo FB*. (2013) Geptop: a gene essentiality prediction tool for sequenced bacterial genomes based on orthology and phylogeny. PLoS One. 8(8): e72343.

The new Geptop is coming soon!

 

Submission form fields: essentiality score cutoff, email address, FASTA sequence file (uploads limited to 3MB), and FASTA sequence type.



 

 

1 Accessing Geptop

1.1 Webserver

Geptop is maintained by the CEFG group and is currently available online at http://cefg.uestc.edu.cn/geptop/.

1.2 Standalone package

Geptop is also available as a standalone Python script that runs in a Linux or Windows environment. The file is available from the Download page. The modules it depends on MUST be installed BEFORE Geptop is used.

2 Submitting a Sequence on the Webserver

Sequences can be submitted at http://cefg.uestc.edu.cn/geptop/. The “Upload from file” option can be used to analyze a local file. Geptop requires the whole-genome PROTEIN sequences in FASTA format and an essentiality score cutoff (range: 0 to 1, default: 0.15). Each FASTA record consists of the following parts:

  • A title line that begins with a “>” symbol, followed by a description
  • The sequence itself, starting on a new line

An example of FASTA format:


>gi|16128009|ref|NP_414556.1| chaperone Hsp40, co-chaperone with DnaK [Escherichia coli str. K-12 substr. MG1655]
MAKQDYYEILGVSKTAEEREIRKAYKRLAMKYHPDRNQGDKEAEAKFKEIKEAYEVLTDSQKRAAYDQYG
HAAFEQGGMGGGGFGGGADFSDIFGDVFGDIFGGGRGRQRAARGADLRYNMELTLEEAVRGVTKEIRIPT
LEECDVCHGSGAKPGTQPQTCPTCHGSGQVQMRQGFFAVQQTCPHCQGRGTLIKDPCNKCHGHGRVERSK
TLSVKIPAGVDTGDRIRLAGEGEAGEHGAPAGDLYVQVQVKQHPIFEREGNNLYCEVPINFAMAALGGEI
EVPTLDGRVKLKVPGETQTGKLFRMRGKGVKSVRGGAQGDLLCRVVVETPVGLNERQKQLLQELQESFGG
PTGEHNSPRSKSFFDGVKKFFDDLTR
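
Before uploading, it can help to check that a file really follows the two-part layout described above. The following is a minimal sketch of such a check, not part of Geptop itself; the function name read_fasta is hypothetical, and the example record is shortened to the first two sequence lines shown above.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (title, sequence) tuples.

    Hypothetical helper for illustration; it is not part of Geptop.
    """
    records = []
    title, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new title line closes the previous record, if any.
            if title is not None:
                records.append((title, "".join(chunks)))
            title, chunks = line[1:], []
        elif title is None:
            raise ValueError("sequence data found before the first '>' title line")
        else:
            chunks.append(line)  # sequence lines are joined without newlines
    if title is not None:
        records.append((title, "".join(chunks)))
    return records


example = """>gi|16128009|ref|NP_414556.1| chaperone Hsp40, co-chaperone with DnaK [Escherichia coli str. K-12 substr. MG1655]
MAKQDYYEILGVSKTAEEREIRKAYKRLAMKYHPDRNQGDKEAEAKFKEIKEAYEVLTDSQKRAAYDQYG
HAAFEQGGMGGGGFGGGADFSDIFGDVFGDIFGGGRGRQRAARGADLRYNMELTLEEAVRGVTKEIRIPT
"""

records = read_fasta(example)
for title, sequence in records:
    print(title.split()[0], len(sequence))
```

A file that starts with sequence data instead of a “>” title line raises a ValueError, which is usually a sign that the file is not in FASTA format.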

3 Submitting a Sequence on the Standalone Script

Whole-genome PROTEIN sequences in FASTA format can also be submitted to the standalone Geptop. The command-line usage and optional parameters are described below.

Usage: python geptop.py -i protein.faa [Optional parameters]

Optional parameters:
  -p  path of BLAST executable
  -s  essentiality score cutoff
  -o  output file
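
When Geptop is run on many genomes, the command line can be assembled from a script. The sketch below only builds the argument list from the -i, -p, -s and -o options in the usage line above; the helper name build_geptop_command, the BLAST path "/path/to/blast" and the output name "result.txt" are placeholders chosen for illustration, and the range check simply mirrors the 0 to 1 cutoff range stated for the webserver.

```python
def build_geptop_command(protein_faa, blast_path=None, cutoff=None,
                         output=None, script="geptop.py"):
    """Assemble a Geptop command line as a list of arguments.

    Hypothetical helper; only the flags shown in the usage line are used.
    """
    cmd = ["python", script, "-i", protein_faa]
    if blast_path is not None:
        cmd += ["-p", blast_path]          # path of BLAST executable
    if cutoff is not None:
        if not 0 <= cutoff <= 1:
            raise ValueError("essentiality score cutoff must be between 0 and 1")
        cmd += ["-s", str(cutoff)]         # essentiality score cutoff
    if output is not None:
        cmd += ["-o", output]              # output file
    return cmd


# Placeholder values; replace them with real paths on your system.
cmd = build_geptop_command("protein.faa", blast_path="/path/to/blast",
                           cutoff=0.15, output="result.txt")
print(" ".join(cmd))
```

The returned list can be passed to subprocess.run(cmd, check=True) from the Python standard library once geptop.py and its dependencies are installed.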

4 Output

Output sample:

#Total 14 genes are submitted
#3 of them are predicted as essential genes
#Estimated accuracy is 0.84
Class(essential gene:1,others:0)            Essentiality Score        Protein
0          0.1358                 gi|78059652|ref|YP_3
1          0.1966                 gi|78059654|ref|YP_3
0          0.0704                 gi|78059649|ref|YP_3
0          0.0704                 gi|78059647|ref|YP_3
0          0.1037                 gi|78059657|ref|YP_3
0          0.0704                 gi|78059650|ref|YP_3
0          0.0358                 gi|78059644|ref|YP_3
1          0.2532                 gi|78059655|ref|YP_3
1          0.1966                 gi|78059656|ref|YP_3

... ...
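
For downstream analysis, the output can be read back into a script. The following is a minimal sketch that assumes the layout of the sample above: summary lines starting with “#”, one header line, then whitespace-separated rows of class, score and protein identifier. The function name parse_geptop_output is hypothetical, and the embedded text repeats only the first two rows of the sample.

```python
def parse_geptop_output(text):
    """Split Geptop output into summary lines and (class, score, protein) rows.

    Hypothetical helper; the layout is assumed from the output sample.
    """
    summary, rows = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            summary.append(line[1:])
            continue
        fields = line.split()
        # Data rows have exactly three fields and a class of 0 or 1;
        # the header line and any elision marker are skipped.
        if len(fields) != 3 or fields[0] not in ("0", "1"):
            continue
        try:
            score = float(fields[1])
        except ValueError:
            continue
        rows.append((int(fields[0]), score, fields[2]))
    return summary, rows


sample = """#Total 14 genes are submitted
#3 of them are predicted as essential genes
#Estimated accuracy is 0.84
Class(essential gene:1,others:0)            Essentiality Score        Protein
0          0.1358                 gi|78059652|ref|YP_3
1          0.1966                 gi|78059654|ref|YP_3
"""

summary, rows = parse_geptop_output(sample)
essential = [protein for cls, score, protein in rows if cls == 1]
print(summary)
print(essential)
```

In the sample, the rows marked 1 are those whose score lies above the default cutoff of 0.15, so filtering on the class column and filtering on the score give the same genes there.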

 
