Documentation

Welcome to the EnzymeDetector Documentation!
This site will explain the features of EnzymeDetector to help you get the best results.

About Getting started.
About Selecting organisms.
About Selecting enzymes.

Getting started

Upon opening the EnzymeDetector website for the first time, you are welcomed by the screen above with background information, a description
and conclusions. To start working, hover over the “Select” button. You can then decide whether you want to search for Enzymes or Organisms:


02_SelectSearchtype

Select: Organism

This button displays a list of all organisms for which EnzymeDetector has information. You can filter the list by initial letter using the buttons at the top. Hovering over an initial shows the number of entries it contains; for example, the database contains 118 organisms beginning with “E”. Select the organism of interest to get to the result page explained below.

Note: Only organisms with an NCBI entry marked as “completed genome project” are listed here.
03_Organism

Select Enzyme

Similar to the organism list, it is also possible to look for specific enzymes by EC number.

04_Enzyme

Result view

The result view is the core of EnzymeDetector. It displays the information of the selected entry (Enzyme or Organism) and provides several tools to further analyze the data.

05_ResultOrganism

Table view: Functions

The core information is displayed in a table. Its functions are introduced below to help you work with it efficiently.

General constraints
06_OrganismConstraints

You can define constraints above the table to control which data it contains. The individual constraints are explained below.

Note: After changing one or more constraints, the table has to be reloaded by clicking “Update”.

Show entries
07_OrganismEntries
The number of elements displayed on each page is set here. You can choose between “20”, “100”, “1000” and “all”.

Search
08_OrganismSearch
The search functionality searches the table for the search phrase. All rows that contain the search phrase anywhere are displayed.

Show/ hide columns
09_OrganismHideShow
Columns can be hidden by deselecting them via this drop down menu. They can be shown again by reselecting them.

Select
10_OrganismSelect
Left-clicking a row in the results selects it. To select more than one row, use [Shift] and [Ctrl] (as in Windows and many other programs): [Shift] + Click selects all entries between your previously selected entry and the row you just clicked; [Ctrl] + Click adds a single new entry to the existing selection.

If desired, one can use selections to save certain rows of interest. For details regarding saving, see the save feature below.
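
The selection rules above can be summarized in a few lines of Python. This is only an illustrative sketch of the described behaviour, not EnzymeDetector's actual implementation. The function name `click` and the use of row indices are assumptions, as is the rule that a plain click replaces the existing selection.

```python
def click(selected, anchor, row, shift=False, ctrl=False):
    """Return (new_selection, new_anchor) after clicking `row`.

    selected: set of currently selected row indices
    anchor:   index of the previously selected row (or None)
    """
    if shift and anchor is not None:
        lo, hi = sorted((anchor, row))
        # Shift + click: select everything between the previous row and this one.
        return selected | set(range(lo, hi + 1)), row
    if ctrl:
        # Ctrl + click: add a single row to the existing selection.
        return selected | {row}, row
    # Plain click: select only this row (assumed to replace the selection).
    return {row}, row


sel, anchor = click(set(), None, 2)              # select row 2
sel, anchor = click(sel, anchor, 5, shift=True)  # rows 2 to 5
sel, anchor = click(sel, anchor, 9, ctrl=True)   # add row 9
print(sorted(sel))  # [2, 3, 4, 5, 9]
```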

Deselect all
11_OrganismDeselect
This button simply removes your selections.

Save
12_OrganismSave
Clicking the save button opens a drop-down menu. Here you can choose how you want to save the table.

  • “Print” switches to the print view of the page. You can leave it with [Esc].
  • “Copy” copies the selection to your clipboard, ready to be pasted elsewhere.
  • “CSV” generates a CSV file (Comma Separated Values) to save on your computer.
  • “PDF” generates a PDF file (Portable Document Format) to save on your computer.

If any rows are selected, only the selected ones are saved. Exception: The print view shows the entire table.

Filter
13_OrganismFilter
If you do not want to see all entries found, you can use the filtering options to exclude non-matching results. You can find them below the table. Each column has its own filter method. There are three possible filter mechanics:

  • A drop down menu lets you choose which of the distinct possible values of the column you want to see.
  • A text field allows you to search all the entries for your input in the corresponding column. All entries containing your search phrase are displayed.
  • Two text fields allow you to define a range of values. All entries having a value between the user-defined borders are displayed. Make sure to use numbers only, otherwise you won’t get any result.

It is possible to combine several filters at the same time. All filters are applied instantly.
Note: This is different from the settings in “General constraints” explained above, although both appear to have the same function. The “General constraints” determine what your browser loads from the server, whereas the filtering options define what is shown in the browser window. You can change a filter without reloading the page (and without contacting the server again), but you can only see what was previously loaded from the server.
Example: If you do not allow Relevance values below 45 in the general constraints, you won’t get any entries in the table by setting the relevance filter to “0-45”.
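
The difference between general constraints and filters can be illustrated with a small Python sketch. The rows, field names and function names below are made up for illustration and do not reflect EnzymeDetector's internals. Case-insensitive text matching and inclusive range borders are assumptions as well.

```python
# Made-up rows standing in for the server's data; field names are assumptions.
ALL_ROWS = [
    {"locus": "L1", "ec": "1.1.1.1", "source": "BRENDA", "relevance": 30},
    {"locus": "L2", "ec": "2.7.1.1", "source": "KEGG", "relevance": 50},
    {"locus": "L3", "ec": "2.7.1.2", "source": "BRENDA", "relevance": 80},
]

def load_from_server(min_relevance):
    """General constraint: decides which rows reach the browser at all."""
    return [r for r in ALL_ROWS if r["relevance"] >= min_relevance]

def by_value(rows, column, value):
    """Drop-down filter: keep rows whose column equals the chosen value."""
    return [r for r in rows if r[column] == value]

def by_text(rows, column, phrase):
    """Text filter: keep rows whose column contains the phrase."""
    return [r for r in rows if phrase.lower() in str(r[column]).lower()]

def by_range(rows, column, low, high):
    """Range filter: keep rows with low <= value <= high."""
    return [r for r in rows if low <= r[column] <= high]

loaded = load_from_server(min_relevance=45)

# Nothing below 45 was loaded, so a "0-45" filter finds nothing here.
print(by_range(loaded, "relevance", 0, 45))  # []

# Filters can be combined; each one narrows the already loaded rows.
combined = by_text(by_value(loaded, "source", "BRENDA"), "ec", "2.7")
print([r["locus"] for r in combined])  # ['L3']
```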

Sort
14_OrganismSort
You can sort every column in ascending or descending order by clicking on the column heading. A small arrow indicates the current sorting:

  • A double arrow pointing in both directions indicates that the column can be sorted but is currently unsorted.
  • An up arrow indicates ascending order.
  • A down arrow indicates descending order.

Note: When applied one after the other, the sorting of different columns can be combined, but the table cannot be restored to the default sorting. If you want to reset it, you have to reload the table.
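
One way to picture combined sorting is as a sequence of stable sorts, where each new sort keeps the previous order among rows with equal values. Whether the table behaves exactly like this is an assumption; the sketch below only demonstrates the idea with Python's built-in `sorted`, which is stable, on made-up rows.

```python
rows = [
    {"ec": "2.7.1.1", "relevance": 50},
    {"ec": "1.1.1.1", "relevance": 80},
    {"ec": "2.7.1.1", "relevance": 80},
    {"ec": "1.1.1.1", "relevance": 30},
]

# First click: sort by relevance, descending.
rows = sorted(rows, key=lambda r: r["relevance"], reverse=True)

# Second click: sort by EC number, ascending. Rows with the same EC number
# keep the relevance order from the first step because the sort is stable.
rows = sorted(rows, key=lambda r: r["ec"])

for r in rows:
    print(r["ec"], r["relevance"])
# 1.1.1.1 80
# 1.1.1.1 30
# 2.7.1.1 80
# 2.7.1.1 50
```

Note that the EC numbers are compared as plain strings here, which is sufficient for this small example but does not order multi-digit parts numerically.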

Link
The results contain several links to other databases with additional information about the subject. They are indicated by the blue font color and can be opened by a left click.

Organism
If you search for an organism, EnzymeDetector will generate a list with all gene loci and the corresponding enzyme function annotations it has found. Different methods are used:

  • Annotations of different databases (BRENDA, KEGG, NCBI, PATRIC, SwissProt).
  • BLAST Search.
  • BrEPS Alignment.
  • AMENDA Text mining.

All results of these methods are assigned relevance values that can be defined by the user. Combining the results of all methods provides useful information regarding the reliability of annotations.

General
15_OrganismResultAdv
The overview of the Organism search result contains the name of the organism you searched for, the name of the chromosome and names of plasmids (if existing). Moreover you can find the total number of genes the organism has and the number of enzyme coding genes found by the EnzymeDetector. The additional options are explained below.

Relevance-Cutoff
24_OrganismRelevance
In EnzymeDetector, every hit has a relevance score that represents its reliability. The Relevance-Cutoff lets you ignore hits with a score lower than the input value. After changing the value, the table has to be reloaded (“Update”).

The default value is calculated so that about 30% of the genes are annotated as enzymes.

Relevances
17_OrganismRelevances
All possible annotation sources have predefined relevance scores determined by statistical analysis of the sources’ qualities. For details see this paper.
You can change them here if desired. Important: After changing the values, a reload via “Update” is necessary. For information regarding the individual sources, see “Table columns” below.

Advanced options
18_OrganismMore
EnzymeDetector offers additional options to analyze your result data. You can show statistics, get pathway information or compare organisms of interest. All three possibilities are explained below.

Create Statistics
The statistics feature presents statistical information regarding the current search separated into the categories Genes, EC numbers and source entries.
Note: The statistics are based upon the relevance options of your search. Therefore, changing them at the top of the statistics window will influence the statistical information.

Genes
The gene statistics count the total number of annotated genes, as well as the number of genes with one and with more than one predicted EC number. The relevance groups give information about the overall distribution of annotation quality. Group 0 counts all hits between a relevance of 6 and the user-defined cutoff (i.e. all excluded hits), whereas groups 1 to 3 split the annotations into groups with fixed relevance score ranges. Each group’s score range is given in brackets.
Note: Hits that occur in relevance group 0 do not occur in other groups, even if the relevance fits; each hit is counted in exactly one group.

EC numbers
The EC number category displays the number of distinct EC numbers annotated by your search. You can also see how many EC numbers have 1, 2-4, 5-10 or more than 10 annotations. The number of distinct EC numbers annotated by non-sequence-based methods (e.g. via AMENDA) is also mentioned.
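
The binning of EC numbers by annotation count can be sketched as follows. The function name and the input format (one EC number per hit) are assumptions made for this illustration, and the hits are made up.

```python
from collections import Counter

def ec_annotation_bins(ec_per_hit):
    """Count distinct EC numbers by how many annotations each one has."""
    per_ec = Counter(ec_per_hit)  # EC number -> number of annotations
    bins = {"1": 0, "2-4": 0, "5-10": 0, ">10": 0}
    for n in per_ec.values():
        if n == 1:
            bins["1"] += 1
        elif n <= 4:
            bins["2-4"] += 1
        elif n <= 10:
            bins["5-10"] += 1
        else:
            bins[">10"] += 1
    return len(per_ec), bins

# Made-up hits: three EC numbers with 3, 1 and 12 annotations.
hits = ["1.1.1.1"] * 3 + ["2.7.1.1"] + ["3.1.3.1"] * 12
distinct, bins = ec_annotation_bins(hits)
print(distinct, bins)  # 3 {'1': 1, '2-4': 1, '5-10': 0, '>10': 1}
```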

Source entries
This category lists how many hits are found in each source.

Pathway Coverage
19_OrganismPathway
The pathway feature is a useful tool to find incomplete pathways. To this end, it generates a table containing columns for:

  • the name of the pathway, linked to the pathway information of the biochemical reaction database BKM-react.
  • the number of pathway enzymes found in the organism out of the total number of enzymes the pathway contains.
  • the percentage of found enzymes.
  • the list of the EC numbers contained in the pathway and not found in your organism, linked to their BRENDA pages.

Note: The pathway view is based upon the relevance options of your search. Therefore, changing them at the top of the pathway window will influence the pathway information. Moreover, you can select a different pathway data source via radio buttons.
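
The values in the pathway table can be derived from two sets of EC numbers, as the following sketch shows. The function name and the example pathway are assumptions for illustration only; the real pathway definitions come from the selected pathway data source.

```python
def pathway_coverage(pathway_ecs, organism_ecs):
    """Return (found, total, percent, missing) for one pathway."""
    pathway_ecs = set(pathway_ecs)
    found = pathway_ecs & set(organism_ecs)
    missing = sorted(pathway_ecs - found)
    total = len(pathway_ecs)
    percent = 100.0 * len(found) / total if total else 0.0
    return len(found), total, percent, missing

# Made-up pathway with four enzymes, three of which are found in the organism.
pathway = ["2.7.1.1", "5.3.1.9", "2.7.1.11", "4.1.2.13"]
organism = {"2.7.1.1", "5.3.1.9", "4.1.2.13", "1.1.1.1"}
print(pathway_coverage(pathway, organism))
# (3, 4, 75.0, ['2.7.1.11'])
```

Raising the relevance cutoff shrinks the set of EC numbers counted as found in the organism, which lowers the coverage and lengthens the list of missing enzymes.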

Compare Organisms
20_OrganismCompare
With the compare function you can choose a second organism and compare it with your first one. The table lists all annotated EC numbers and contrasts the best relevance score of each EC number in the query organism with its best relevance score in the subject organism. The e-Value given for each organism is the BLAST e-Value of the respective annotation.
Note: The comparison is based upon the relevance options of your search. Therefore, changing them at the top of the comparison window will influence the comparison.
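
The comparison can be thought of as taking the best relevance score per EC number in each organism and placing the two values side by side. The sketch below uses made-up annotations and hypothetical function names. Showing EC numbers that occur in only one of the two organisms (with `None` for the other) is an assumption.

```python
def best_relevance(annotations):
    """Map each EC number to its highest relevance score."""
    best = {}
    for ec, relevance in annotations:
        best[ec] = max(relevance, best.get(ec, relevance))
    return best

def compare(query, subject):
    """One row per EC number: (ec, best in query, best in subject)."""
    q, s = best_relevance(query), best_relevance(subject)
    return [(ec, q.get(ec), s.get(ec)) for ec in sorted(set(q) | set(s))]

# Made-up (EC number, relevance) annotations for two organisms.
query = [("1.1.1.1", 40), ("1.1.1.1", 90), ("2.7.1.1", 55)]
subject = [("1.1.1.1", 70), ("3.1.3.1", 60)]
for row in compare(query, subject):
    print(row)
# ('1.1.1.1', 90, 70)
# ('2.7.1.1', 55, None)
# ('3.1.3.1', None, 60)
```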

Select subject organism
21_OrganismCompareSelect
Select the organism you want to compare your results with from the drop-down menu. To activate the comparison, click on the “Select” button. You can change the subject organism later in the same manner.
Tip: Type the first letters of the target organism to find it faster.

Table columns
The result table contains several columns, which are briefly explained here.
Chr/P
States whether the enzyme found is located on the chromosome or on a plasmid.
Locus
Gives the gene’s unique locus tag.
GI
Gene identification number, linked to the NCBI entry.
UniProt ID
UniProt identifier, linked to the UniProt entry.
EC-Number
Enzyme Commission number for enzyme classification, linked to the corresponding BRENDA overview page.
Relevance
Lists the relevance of the entry, calculated as the sum over all hits in the databases. The relevance scores of the individual databases are taken from the predefined values (see “Relevances” above).
Recommended Name
Name of enzymes with the corresponding function as suggested by the IUBMB.
Gene-Start
Base pair position in chromosome or plasmid at which the gene starts.
Gene-Stop
Base pair position in chromosome or plasmid at which the gene ends.
Gene-Direction
The direction in which the gene is transcribed. “+” means forward, “-” backward.
BLAST-eValue
e-Value of the BLAST result. It gives the number of hits with a score at least as good as this one that the database is expected to contain by pure chance. The lower the value, the better the hit.
BLAST-Identity
Sequence identity of the Alignment. 100% means identical sequence.
BLAST-Start
Position in the corresponding gene where the BLAST alignment starts.
BLAST-Stop
Position in the corresponding gene where the BLAST alignment ends.
PFAM-Acc
PFAM family ID (accession), linked to the pfam webpage with detailed information about it.
PFAM-Start
Position in the corresponding gene where the PFAM alignment starts.
PFAM-Stop
Position in the corresponding gene where the PFAM alignment ends.
PFAM-eValue
e-Value of the PFAM result. It gives the number of hits with a score at least as good as this one that the database is expected to contain by pure chance. The lower the value, the better the hit.
BRENDA
If the annotation was found in BRENDA, the corresponding relevance value (can be defined by the user) is stated here.
AMENDA
If the annotation was found via AMENDA text mining in PubMed publications, the corresponding relevance value (can be defined by the user) is stated here.
SwissProt
If the annotation was found in SwissProt, the corresponding relevance value (can be defined by the user) is stated here.
BREPS
If the annotation was found via BrEPS alignment, the corresponding relevance value (can be defined by the user) is stated here.
PATRIC
If the annotation was found in PATRIC, the corresponding relevance value (can be defined by the user) is stated here.
KEGG
If the annotation was found in KEGG, the corresponding relevance value (can be defined by the user) is stated here.
KEGG[Orthology]
If the annotation was found via KEGG Orthology, the corresponding relevance value (can be defined by the user) is stated here.
NCBI
If the annotation was found in NCBI, the corresponding relevance value (can be defined by the user) is stated here.
BLAST
If the annotation was found via BLAST search, the corresponding relevance value (can be defined by the user) is stated here. The relevance depends also on the quality of the BLAST hit.
PFAM
If the annotation was found via PFAM, the corresponding relevance value (can be defined by the user) is stated here. The relevance depends also on the quality of the PFAM hit (e-value).
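
How the source columns add up to the Relevance column, and how the Relevance-Cutoff then acts on it, can be sketched as follows. The relevance values in `SOURCE_RELEVANCE` are hypothetical placeholders, not the predefined defaults, and the function names are assumptions. The sketch also treats the BLAST contribution as a fixed number, although the text above states that it depends on the quality of the hit.

```python
# Hypothetical relevance values per source, for illustration only.
# The real values are shown, and can be changed, under "Relevances".
SOURCE_RELEVANCE = {"BRENDA": 20, "SwissProt": 15, "KEGG": 10, "BLAST": 8}

def relevance(sources_with_hit):
    """Sum the relevance values of all sources that support an annotation."""
    return sum(SOURCE_RELEVANCE[s] for s in sources_with_hit)

def apply_cutoff(annotations, cutoff):
    """Keep annotations whose relevance is not lower than the cutoff."""
    return {k: v for k, v in annotations.items() if relevance(v) >= cutoff}

# Made-up annotations: (locus, EC number) -> sources that reported it.
annotations = {
    ("locus_1", "1.1.1.1"): ["BRENDA", "SwissProt", "BLAST"],  # 20 + 15 + 8 = 43
    ("locus_2", "2.7.1.1"): ["KEGG"],                          # 10
}
kept = apply_cutoff(annotations, cutoff=25)
print(list(kept))  # [('locus_1', '1.1.1.1')]
```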


Select: Enzyme

22_EnzymeResult
If you search for an enzyme, EnzymeDetector generates a list of all known microorganisms where the enzyme is annotated. The methods used for annotations are the same as for organism search.
General
23_EnzymeResultAdv
The overview of the Enzyme search result contains the EC number you searched for, the recommended name of the enzyme, and the total number of organisms possessing the selected enzyme. The options are explained below.

Relevance-Cutoff
16_EnzymeResultRelevance
In EnzymeDetector, every hit has a relevance score that represents its reliability. The Relevance-Cutoff lets you ignore hits with a score lower than the input value. After changing the value, the table has to be reloaded (“Update”). The default value is calculated so that about 30% of the genes are annotated as enzymes.

Include not sequence based annotations
25_EnzymeResultnonbased
By default, the enzyme search only includes sequence-based annotations such as BLAST search or database entries. If this option is activated, non-sequence-based methods such as AMENDA text mining are added. In addition, some sequence-based annotations without gene locus information (such as BRENDA hits without strand information) are included, too.

Table columns
The result table contains several columns which are explained below.
Organism
Name of the organism the enzyme was found in, linked to the organism search of EnzymeDetector.
Domain
Domain of the found organism, usually Bacteria or Archaea.
TaxID
Taxonomic ID of the organism, linked to the NCBI taxonomy database.
Best Relevance
Lists the relevance of the annotations in the corresponding organism calculated by the sum of all hits in the databases. The relevance scores of the particular sources are taken from the predefined values (see “Relevances” above). For each organism only the best annotation is shown. Exception: If the organism has plasmids, each plasmid is allowed to have one additional annotation of the enzyme.
GI
Gene identification number, linked to the NCBI entry.
Chr/P
States whether the enzyme found is located on the chromosome or on a plasmid.
Accessions
Unique genome ID, linked to the NCBI genome entry.
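
The “Best Relevance” rule can be read as keeping the highest-scoring annotation per organism and replicon, so that each plasmid may contribute one additional row. This is one possible interpretation of the description above, not a confirmed account of how EnzymeDetector selects rows. The function name, the tuple layout and the hits are made up for illustration.

```python
def best_per_replicon(hits):
    """Keep the highest relevance per (organism, replicon) pair.

    hits: iterable of (organism, replicon, relevance) tuples, where
    replicon is "chromosome" or a plasmid name.
    """
    best = {}
    for organism, replicon, relevance in hits:
        key = (organism, replicon)
        if key not in best or relevance > best[key]:
            best[key] = relevance
    return best

# Made-up hits for one enzyme in one organism with a plasmid.
hits = [
    ("Organism A", "chromosome", 40),
    ("Organism A", "chromosome", 70),
    ("Organism A", "plasmid p1", 30),
]
result = best_per_replicon(hits)
print(result)
# {('Organism A', 'chromosome'): 70, ('Organism A', 'plasmid p1'): 30}
```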