Manual

Introduction

The EnzymeDetector database offers a comparative and integrative approach to finding enzyme annotations. It integrates the most comprehensive databases for protein and genome annotation, namely manually annotated data and text mining data from BRENDA, UniProt, KEGG, PATRIC, and NCBI's RefSeq, to give a broad view of the organism of interest. These data are complemented with annotations computed by EnzymeDetector itself, i.e. a BLAST search against all enzyme annotations from Swiss-Prot and BrEPS enzyme pattern recognition.

Start using EnzymeDetector

On the main page of EnzymeDetector, you can select either an organism or an enzyme. Start typing in the respective search field to get suggestions, or click on Show all to get an overview. The organism page gives you the organism-specific, integrative overview described above. The enzyme page shows all results for the selected enzyme across all available species, together with the maximum score in each organism.

Confidence Score

One of the outstanding features of EnzymeDetector is the confidence score. It indicates the quality of an enzyme annotation and is calculated as the sum of the weighted, domain-specific reliabilities of the sources. The reliability of each source, based on a comparison with manually annotated data, has been evaluated as follows:

Source           Bacteria  Archaea  Eukaryota
BRENDA           1.00      1.00     1.00
Swiss-Prot       0.91      0.84     0.88
TrEMBL           0.76      0.69     0.76
BLAST < 1e-120   0.94      0.83     0.94
BLAST < 1e-80    0.85      0.79     0.86
BLAST < 1e-50    0.78      0.74     0.81
BLAST < 1e-20    0.73      0.69     0.76
BLAST > 1e-20    0.64      0.66     0.72
BrEPS            0.95      0.94     0.95
PATRIC           0.82      0.71     -
NCBI's RefSeq    0.67      0.78     0.80
KEGG             0.87      0.80     0.88
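
As an illustration of how such a score could be composed, here is a minimal Python sketch. It assumes that the score is the plain sum of the table values for every source that supports an annotation; the manual does not state the exact weighting, so this is an assumption and not the actual EnzymeDetector implementation. The names RELIABILITY, DOMAINS, and confidence_score are hypothetical, and only a few rows of the table are included.

```python
# Reliability values copied from the table above, in the order
# (Bacteria, Archaea, Eukaryota). None marks a missing value ("-").
RELIABILITY = {
    "BRENDA": (1.00, 1.00, 1.00),
    "Swiss-Prot": (0.91, 0.84, 0.88),
    "TrEMBL": (0.76, 0.69, 0.76),
    "BrEPS": (0.95, 0.94, 0.95),
    "PATRIC": (0.82, 0.71, None),
    "KEGG": (0.87, 0.80, 0.88),
}
DOMAINS = ("Bacteria", "Archaea", "Eukaryota")


def confidence_score(sources, domain):
    """Hypothetical helper: sum the reliabilities of the supporting sources.

    Assumption: every supporting source contributes its table value once;
    sources without a value for the domain contribute nothing.
    """
    column = DOMAINS.index(domain)
    total = 0.0
    for source in sources:
        value = RELIABILITY[source][column]
        if value is not None:
            total += value
    return round(total, 2)


# An annotation in a bacterium supported by BRENDA, Swiss-Prot, and KEGG:
print(confidence_score(["BRENDA", "Swiss-Prot", "KEGG"], "Bacteria"))  # 2.78
```

Under this assumption, an annotation confirmed by several independent sources scores higher than one found in a single source.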

The web interface

Filtering of data columns

Every column with a filter field below the column name can be filtered. The KEGG and BrEPS columns can be filtered by their identifiers, the BRENDA column by a "B", and the UniProt column by an "S" for Swiss-Prot or a "T" for TrEMBL. Filtering is always exact, with * allowed as a wildcard, with two exceptions:

  1. Reference scores are always filtered using a lower cutoff, so for a value of 2, only entries with a score >= 2 are shown
  2. BLAST e-values are always filtered using an upper cutoff, so for a value of 1e-100, only entries with an e-value lower than this are shown
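
The three filter behaviours described above can be sketched in Python as follows. This is only an illustration of the rules as stated in this manual, not the code behind the web interface. The function names and example values are hypothetical, and case-sensitive matching is an assumption.

```python
import re


def matches_text(value, query):
    """Exact match in which * in the query stands for any run of characters."""
    # Escape each literal part, then join the parts with ".*" for the wildcard.
    pattern = ".*".join(re.escape(part) for part in query.split("*"))
    return re.fullmatch(pattern, value) is not None


def passes_score(score, cutoff):
    """Lower cutoff: keep entries with a score >= cutoff."""
    return score >= cutoff


def passes_evalue(evalue, cutoff):
    """Upper cutoff: keep entries with an e-value lower than the cutoff."""
    return evalue < cutoff


print(matches_text("1.1.1.1", "1.1.1.*"))  # True, the wildcard matches the last part
print(matches_text("1.1.1.1", "1.1.1"))    # False, exact filtering needs a full match
print(passes_score(2.0, 2))                # True
print(passes_evalue(1e-120, 1e-100))       # True
```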

The histogram

The histogram in the upper part of the results page shows the distribution of the confidence scores for all predictions. This makes it easier to choose a suitable cutoff for filtering out unreliable annotations. The percentage of proteins that are assigned an enzymatic function at this threshold is indicated below the cutoff input field.
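
The relationship between the cutoff and the reported percentage can be sketched as below. This is a rough Python illustration under two assumptions: the input holds one best confidence score per protein (0 for proteins without any prediction), and the cutoff is inclusive. The function names, the bin width, and the example scores are hypothetical.

```python
def enzyme_percentage(scores, cutoff):
    """Percentage of proteins whose score reaches the cutoff (inclusive)."""
    if not scores:
        return 0.0
    kept = sum(1 for score in scores if score >= cutoff)
    return 100.0 * kept / len(scores)


def histogram(scores, bin_width=1.0):
    """Count scores per bin; each key is the lower edge of its bin."""
    counts = {}
    for score in scores:
        index = int(score // bin_width)
        counts[index] = counts.get(index, 0) + 1
    return {round(index * bin_width, 10): counts[index] for index in sorted(counts)}


scores = [0.0, 0.64, 1.58, 2.78, 3.5]  # made-up example values
print(histogram(scores))               # {0.0: 2, 1.0: 1, 2.0: 1, 3.0: 1}
print(enzyme_percentage(scores, 1.5))  # 60.0
```

Raising the cutoff moves the threshold to the right in the histogram and lowers the percentage.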

Diverse enzymatic annotations

Differing enzymatic annotations for the same sequence can be recognized easily by the circular indicator next to the UniProt accession. Clicking on this indicator displays all available annotations for the sequence.

Publications