Proteasix Help
  • Protease browsing
    Proteasix provides browsing and search functionalities for proteases, i.e. proteolytic enzymes that hydrolyze peptide bonds. Proteases specifically cleave protein substrates either from the N or C termini (exopeptidases known respectively as aminopeptidases and carboxypeptidases) and/or in the middle of the molecule (endopeptidases).

    Protease browsing can be performed through the dedicated "Proteases" page listing alphabetically the proteases contained in the Proteasix Knowledge Base.

  • Substrate browsing
    Proteasix provides browsing and search functionalities for substrates, i.e. the proteins on which proteases act.

    Substrate browsing can be performed through the dedicated "Substrates" page listing alphabetically the substrates contained in the Proteasix Knowledge Base.

  • Search Proteasix Knowledge Base
    The Proteasix Knowledge Base contains Swiss-Prot and TrEMBL protein records from the UniProtKB. Its coverage of proteolytic events is more limited: the literature contains far more knowledge than Proteasix does. Thus, the absence of confirmatory information on a protease/protein interaction in Proteasix does not mean that the interaction cannot occur. Read more about the Proteasix Knowledge Base.

    Search Protease The proteases contained in the Proteasix Knowledge Base can be searched for using the "Search" page by:
           – Gene symbol: MMP2
           – UniProt accession (UniProt AC): P08253
           – UniProt identifier (UniProt ID): MMP2_HUMAN
           – Protein name: 72 kDa type IV collagenase

    Search Substrate The substrates contained in the Proteasix Knowledge Base can be searched for using the "Search" page by:
           – Gene symbol: ALB
           – UniProt accession (UniProt AC): P02768
           – UniProt identifier (UniProt ID): ALBU_HUMAN
           – Protein name: Serum albumin

    Search for Cleavage Site Cleavage Site searches allow users to search for amino acid sequences (8 amino acids) and retrieve the protease/cleavage site associations contained in the Proteasix Knowledge Base. Cleavage occurs at the scissile bond between the P1 and P1' residues (see FAQs section "What does P4, P3, P2, P1, P1', P2', P3', P4' mean?").
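
    To make the position naming concrete, the following minimal Python sketch maps the 8 residues of a cleavage site to the labels P4 to P4'. The function name and the example sequence are hypothetical and are not taken from the Proteasix Knowledge Base; the only assumption is that a site is written from P4 to P4', with the scissile bond between the fourth and fifth residues.

```python
# Position labels for an 8-residue cleavage site, written from P4 to P4'.
# The scissile bond lies between P1 and P1'.
SITE_LABELS = ["P4", "P3", "P2", "P1", "P1'", "P2'", "P3'", "P4'"]


def label_cleavage_site(site: str) -> dict[str, str]:
    """Map each residue of an 8-residue cleavage site to its position label."""
    if len(site) != len(SITE_LABELS):
        raise ValueError(f"expected 8 residues, got {len(site)}")
    return dict(zip(SITE_LABELS, site.upper()))


# Hypothetical example sequence, used for illustration only.
labels = label_cleavage_site("GPAGLAGQ")
print(labels["P1"], "|", labels["P1'"])  # residues flanking the scissile bond
```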

    Search Protein The Proteasix Knowledge Base contains Swiss-Prot and TrEMBL protein records from the UniProtKB. Some of these proteins are known to participate in proteolytic events as proteases and/or substrates. Proteins can be searched for using the "Search" page by:
           – Gene symbol: HIF1A
           – UniProt accession (UniProt AC): Q16665
           – UniProt identifier (UniProt ID): HIF1A_HUMAN
           – Protein name: Hypoxia-inducible factor 1-alpha

  • Prediction tool
    Starting from an input peptide list, the Proteasix peptide-centric prediction tool automatically reconstructs the N- and C-terminal cleavage sites and identifies the observed and predicted proteases involved in the proteolysis of these cleavage sites.

    The Proteasix Knowledge Base supports the two operational modes of the Prediction tool: Observed, which matches cleavage sites against protease/cleavage site associations collected from the literature; and Predicted, which calculates the probability of cleavage by a protease based on MEROPS specificity matrices (see FAQs section "What are the MEROPS specificity matrices?").

    User input The prediction tool has been designed to specifically allow single or batch peptide-centric searches. Each peptide is described by:
           – Peptide identifier: peptide identifiers are mandatory for the search but can be in any kind of format.
           – Parent protein: UniProt identifier (ID) or UniProt accession (AC).
           – A number that indicates the Start amino acid position in the Parent protein sequence.
           – A number that indicates the Stop amino acid position in the Parent protein sequence.

    The input list should be pasted in tab-delimited format. Example inputs (with 38 or 107 peptides) are available and can be used to test the prediction tool. Note that peptide identifiers (e.g. 1, 2, 3, etc. or Peptide1, Peptide2, Peptide3, etc.) are useful when working with large lists of peptides, as they help to provide an audit trail for the data.
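
    As an illustration, a tab-delimited input list could be prepared with a short Python script such as the sketch below. The column order (identifier, parent protein, start, stop) is assumed from the order of the fields listed above, the helper name is hypothetical, and the positions are invented for the example; only the accession P02768 comes from this page.

```python
import csv
import io

# Hypothetical peptides: (identifier, parent protein AC or ID, start, stop).
# P02768 is the serum albumin accession used on the "Search" page; the
# positions are made up for illustration.
peptides = [
    ("Peptide1", "P02768", 25, 40),
    ("Peptide2", "P02768", 100, 118),
]


def to_tab_delimited(rows) -> str:
    """Serialise peptide rows as tab-delimited text, one peptide per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    for peptide_id, parent, start, stop in rows:
        if not peptide_id:
            raise ValueError("peptide identifier is mandatory")
        if not (1 <= start <= stop):
            raise ValueError(f"{peptide_id}: invalid range {start}-{stop}")
        writer.writerow([peptide_id, parent, start, stop])
    return buffer.getvalue()


input_text = to_tab_delimited(peptides)
print(input_text, end="")  # text ready to paste into the prediction tool
```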

    See also FAQs sections:
           – "Why peptide sequences can not be pasted directly in the Prediction tool?".
           – "How many input peptides are the optimal number for the performance of the Proteasix Prediction tool?".
           – "How confidential is the input peptide list?".

    Automatic reconstruction of N- and C-terminal cleavage sites Proteasix automatically retrieves the full-length parent protein amino acid sequence from the UniProtKB (Swiss-Prot and TrEMBL) and reconstructs the N- and C-terminal cleavage sites in their octapeptide form.
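
    The reconstruction step can be sketched in Python as follows. This is not the Proteasix implementation: it assumes 1-based, inclusive Start and Stop positions, an octapeptide spanning P4 to P4' around the scissile bond, and no site being reported when the window runs past either end of the protein. The function names and the toy parent sequence are hypothetical.

```python
from typing import Optional


def octapeptide(sequence: str, p1_position: int) -> Optional[str]:
    """Return the P4..P4' window around the bond after `p1_position` (1-based).

    Returns None when the window would run past either end of the protein.
    """
    first = p1_position - 3  # 1-based position of P4
    last = p1_position + 4   # 1-based position of P4'
    if first < 1 or last > len(sequence):
        return None
    return sequence[first - 1:last]


def cleavage_sites(sequence: str, start: int, stop: int):
    """N- and C-terminal cleavage sites for a peptide spanning start..stop."""
    n_site = octapeptide(sequence, start - 1)  # bond just before the peptide
    c_site = octapeptide(sequence, stop)       # bond just after the peptide
    return n_site, c_site


# Toy parent sequence (hypothetical), peptide covering residues 6 to 12.
parent = "ACDEFGHIKLMNPQRSTVWY"
n_site, c_site = cleavage_sites(parent, start=6, stop=12)
print(n_site, c_site)
```

    With this toy sequence the peptide is GHIKLMN, so the N-terminal site has F at P1 and G at P1', and the C-terminal site has N at P1 and P at P1'.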

    Protease/cleavage site association The Proteasix prediction tool aims to identify observed and predicted proteolysis associated with peptide cleavage sites. Using the PxO ontology, Proteasix retrieves information about previously observed protease/cleavage site associations that are stored in the Proteasix Knowledge Base. Moreover, for each cleavage site generated from the input peptide list, a prediction confidence score is calculated (see FAQs section "How is the probability for cleavage site prediction calculated?") in order to predict potential protease/cleavage site associations.

    Exopeptidase activity For peptides measured in body fluids, there is a high probability that they result from specific endoprotease activity followed by unspecific exoprotease activity. To assess this possibility, in addition to reconstructing the N- and C-terminal cleavage sites from the input peptides, the Proteasix prediction tool in the Observed operational mode also reconstructs cleavage sites that take into account the potential removal of 1 amino acid (input+1AA), 2 amino acids (input+2AA), or 3 amino acids (input+3AA).
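
    One possible reading of the input+1AA to input+3AA variants is sketched below in Python. It assumes that, when k residues have been trimmed by exopeptidases, the original endoprotease cut lies k residues outside the observed peptide terminus; this interpretation, the function names, and the toy sequence are assumptions made for illustration and do not describe the actual Proteasix code.

```python
def window(sequence: str, p1_position: int):
    """P4..P4' octapeptide around the bond after `p1_position` (1-based), or None."""
    first, last = p1_position - 3, p1_position + 4
    if first < 1 or last > len(sequence):
        return None
    return sequence[first - 1:last]


def candidate_sites(sequence: str, start: int, stop: int, max_trim: int = 3):
    """Yield (label, terminus, site) for the input peptide and input+kAA variants.

    Assumption: k residues were trimmed by exopeptidases, so the original
    endoprotease cut lies k residues outside the observed terminus.
    """
    for k in range(max_trim + 1):
        label = "input" if k == 0 else f"input+{k}AA"
        yield label, "N", window(sequence, start - 1 - k)
        yield label, "C", window(sequence, stop + k)


parent = "ACDEFGHIKLMNPQRSTVWY"  # toy sequence, hypothetical
sites = list(candidate_sites(parent, start=8, stop=12))
for label, terminus, site in sites:
    print(label, terminus, site)
```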

    Results format for the Observed finding mode The outcome of the prediction tool for the Observed finding mode is a set of lines/rows displayed in a simple text area. Each line/row has tab-separated values for the following columns:
           – Peptide identifier (Peptide ID): this is the peptide identifier (as provided by the user).
           – Parent Protein AC from input: UniProt accession for the parent protein.
           – Start amino acid: a number that indicates the Start amino acid position in the Parent protein sequence (as provided by the user).
           – End amino acid: a number that indicates the Stop amino acid position in the Parent protein sequence (as provided by the user).
           – N or C-terminus: indicates whether this is an N- or a C-terminal cleavage site.
           – Protease Human/Mouse/Rat: gene symbol (UniProt accession; UniProt identifier) for the endopeptidase.
           – Cleavage site: amino acid sequence at cleavage site.
           – Source: traces back the evidence for the protease/cleavage site association. Typically it shows the source (e.g. PubMed-ID or CutDB-ID)
                together with a substrate: gene symbol (UniProt accession; UniProt identifier) for the parent protein. This substrate can be the same as
                the Parent Protein AC from input (same substrate) or different (different substrate) if no match with the parent protein from input was found.
           – Plausible proteolysis: it mainly states if protease/cleavage site associations could result from a combination of specific endoprotease activity
                followed by unspecific exoprotease activity.

    The results for the Observed finding mode can be grouped considering Plausible proteolysis into:
           – Endopeptidase (input): observed protease/cleavage site associations considering only endoprotease activity for the given input peptide;
           – Endopeptidase followed by exopeptidase (input+1AA): observed protease/cleavage site associations considering the potential removal of
                1 amino acid (input+1AA);
           – Endopeptidase followed by exopeptidase (input+2AA): observed protease/cleavage site associations considering the potential removal of
                2 amino acids (input+2AA);
           – Endopeptidase followed by exopeptidase (input+3AA): observed protease/cleavage site associations considering the potential removal of
                3 amino acids (input+3AA).
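
    Once the result lines have been copied from the text area, the grouping described above can be reproduced with a short Python sketch such as the one below. The internal column names, the assumption that the output has no header row, and the two result lines are hypothetical; the values are placeholders and do not represent real protease/cleavage site associations.

```python
import csv
import io
from collections import defaultdict

# Column names follow the order of the columns listed above; the names
# themselves and the absence of a header row are assumptions.
COLUMNS = [
    "peptide_id", "parent_ac", "start", "end", "terminus",
    "protease", "cleavage_site", "source", "plausible_proteolysis",
]

# Hypothetical result lines with placeholder values (not real associations).
results_text = (
    "Peptide1\tP02768\t25\t40\tN\tMMP2 (P08253; MMP2_HUMAN)\tAAAALLLL\t"
    "PubMed-ID:0000000 ALB (P02768; ALBU_HUMAN)\tEndopeptidase (input)\n"
    "Peptide1\tP02768\t25\t40\tC\tMMP2 (P08253; MMP2_HUMAN)\tGGGGVVVV\t"
    "PubMed-ID:0000000 ALB (P02768; ALBU_HUMAN)\t"
    "Endopeptidase followed by exopeptidase (input+1AA)\n"
)


def group_by_plausible_proteolysis(text: str):
    """Group result rows by the value of the 'Plausible proteolysis' column."""
    reader = csv.DictReader(io.StringIO(text), fieldnames=COLUMNS, delimiter="\t")
    groups = defaultdict(list)
    for row in reader:
        groups[row["plausible_proteolysis"]].append(row)
    return groups


groups = group_by_plausible_proteolysis(results_text)
for category, rows in groups.items():
    print(category, len(rows))
```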

    Note: when the simple text area appears, the "New prediction" button also appears. This button returns to the first step of the prediction tool, where the user input is required.

    Results format for the Predicted finding mode The outcome of the prediction tool for the Predicted finding mode is a set of lines/rows displayed in a simple text area. Each line/row has tab-separated values for the following columns:
           – Peptide identifier (Peptide ID): this is the peptide identifier (as provided by the user).
           – Parent Protein AC from input: UniProt accession for the parent protein.
           – Start amino acid: a number that indicates the Start amino acid position in the Parent protein sequence (as provided by the user).
           – End amino acid: a number that indicates the Stop amino acid position in the Parent protein sequence (as provided by the user).
           – N or C-terminus: indicates whether this is an N- or a C-terminal cleavage site.
           – Protease Human/Mouse/Rat: gene symbol (UniProt accession; UniProt identifier) for the endopeptidase.
           – Cleavage site: amino acid sequence at cleavage site.
           – Source: shows the MEROPS ID of the protease, for which a MEROPS specificity matrix is available.
           – Plausible proteolysis: only endoprotease activity for the given input peptide is considered. This is a difference with respect to the Observed
                finding mode.
           – Probability Calculated: confidence probability for the cleavage site calculated using a log-likelihood based on the protease MEROPS
                specificity weight matrix.
           – Threshold applied (value): either MEROPS score threshold or 99th percentile and its value in brackets.
           – Sensitivity: percentage of cleavage sites correctly predicted to be cleaved by the protease.
           – Specificity: percentage of cleavage sites correctly predicted not to be cleaved by the protease.
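
    The Sensitivity and Specificity columns follow the usual definitions of these measures, which can be written as the short Python sketch below. The counts are hypothetical, and the sketch makes no claim about which reference set of cleavage sites Proteasix uses to derive its values.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Percentage of truly cleaved sites that were predicted as cleaved."""
    return 100.0 * true_positives / (true_positives + false_negatives)


def specificity(true_negatives: int, false_positives: int) -> float:
    """Percentage of non-cleaved sites that were predicted as not cleaved."""
    return 100.0 * true_negatives / (true_negatives + false_positives)


# Hypothetical counts for one protease at one threshold.
sens = sensitivity(true_positives=45, false_negatives=5)
spec = specificity(true_negatives=180, false_positives=20)
print(f"sensitivity={sens:.1f}%  specificity={spec:.1f}%")
```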

    See also FAQs section "How is the probability for cleavage site prediction calculated?".

    Note: when the simple text area appears, the "New prediction" button also appears. This button returns to the first step of the prediction tool, where the user input is required.

    Export results When the text area is displayed, its content can be copied and pasted into any text editor or Excel. The "Download" button also allows the results shown in the text area to be downloaded as a text file. Please note that the download behaviour may change slightly depending on the Web browser (e.g. Safari, Firefox, Opera, Chrome, or Internet Explorer).