
Humboldt-Universität zu Berlin - Mathematisch-Naturwissenschaftliche Fakultät II - Wissensmanagement in der Bioinformatik

Molecular databases

Database catalog


Literature search

Entrez PubMed PubMed, a service of the National Library of Medicine, provides access to over 12 million MEDLINE citations dating back to the mid-1960s, as well as additional life science journals. PubMed includes links to many sites providing full-text articles and other related resources.
MeSH - Medical Subject Headings The MeSH browser is designed to help quickly locate descriptors of possible interest and to show the hierarchy in which those descriptors appear.
OMIM - Online Mendelian Inheritance in Man This database is a catalog of human genes and genetic disorders.
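
A literature search of this kind can also be scripted. The following is a minimal sketch, assuming the NCBI E-utilities ESearch endpoint, which this catalog does not mention; the helper name build_pubmed_search_url and the example query are hypothetical. It only builds the query URL and does not contact the server.

```python
from urllib.parse import urlencode

# Assumption: address of the NCBI E-utilities ESearch endpoint (not named in this catalog).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_search_url(term, max_results=20):
    """Return an ESearch URL asking PubMed for citation IDs that match `term`.

    Hypothetical helper for illustration; it builds the URL only.
    """
    params = {
        "db": "pubmed",          # search the PubMed database
        "term": term,            # the query, in PubMed search syntax
        "retmax": max_results,   # upper bound on the number of IDs returned
        "retmode": "json",       # ask for a JSON response
    }
    return ESEARCH_URL + "?" + urlencode(params)


url = build_pubmed_search_url("protein motif[Title]")
print(url)
```

The resulting URL can be opened in a browser or fetched with any HTTP client; spaces and brackets in the query are percent-encoded by urlencode.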

DNA sequence databases

GenBank GenBank® is the NIH genetic sequence database, an annotated collection of all publicly available DNA sequences.
EMBL via SRS@EBI The EMBL Nucleotide Sequence Database constitutes Europe's primary nucleotide sequence resource. The main sources of DNA and RNA sequences are direct submissions from individual researchers, genome sequencing projects and patent applications.

Protein sequence databases

SWISS-PROT The Swiss-Prot Protein Sequence Database is a database of protein sequences. It contains high-quality annotation, is non-redundant, and is cross-referenced to several other databases, notably the EMBL nucleotide sequence database, the PROSITE pattern database and PDB.
PIR The Protein Identification Resource consists of an integrated computer system composed of a number of protein and nucleic acid sequence databases and software designed for the identification and analysis of protein sequences and their corresponding coding sequences.

Databases of Protein Motifs, Domains

BLOCKS Blocks are multiply aligned ungapped segments corresponding to the most highly conserved regions of proteins. Block Searcher, Get Blocks and Block Maker are aids to detection and verification of protein sequence homology. They compare a protein or DNA sequence to a database of protein blocks (current version), retrieve blocks, and create new blocks, respectively.
PROSITE PROSITE is a database of protein families and domains. It consists of biologically significant sites, patterns and profiles that help to reliably identify to which known protein family (if any) a new sequence belongs.
PRINTS PRINTS is a compendium of protein fingerprints. A fingerprint is a group of conserved motifs used to characterise a protein family; its diagnostic power is refined by iterative scanning of a SWISS-PROT/TrEMBL composite.
PFAM Pfam is a large collection of multiple sequence alignments and hidden Markov models covering many common protein families.
CATH CATH is a novel hierarchical classification of protein domain structures, which clusters proteins at four major levels: Class (C), Architecture (A), Topology (T) and Homologous superfamily (H).
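
To illustrate how a PROSITE-style pattern can be used to test whether a new sequence belongs to a family, here is a minimal sketch that translates the pattern notation into a Python regular expression. The function name prosite_to_regex, the example pattern and the example sequence are illustrative assumptions, and the translation covers only the basic syntax (x, [..], {..}, repeat counts, and the < and > anchors outside brackets).

```python
import re


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regular expression.

    Hypothetical helper; handles the basic notation only.
    """
    pattern = pattern.rstrip(".")                 # patterns conventionally end with a period
    parts = []
    for element in pattern.split("-"):            # elements are separated by hyphens
        element = element.replace("<", "^").replace(">", "$")   # N- and C-terminal anchors
        element = element.replace("x", ".")                      # x matches any residue
        element = element.replace("{", "[^").replace("}", "]")   # {AB} means any residue except A or B
        element = element.replace("(", "{").replace(")", "}")    # (n) or (n,m) is a repeat count
        parts.append(element)
    return "".join(parts)


# Illustrative pattern written in PROSITE syntax.
regex = prosite_to_regex("C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H.")
print(regex)

# Made-up sequence that contains one occurrence of the pattern.
sequence = "MKCAACAAALAAAAAAAAHAAAHGG"
match = re.search(regex, sequence)
print(match.group(0) if match else "no match")
```

A pattern match is only a first indication of family membership; profiles and the other methods listed above are more sensitive for diverged sequences.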

Protein Structure databases

PDB - Protein Data Bank The Protein Data Bank (PDB) is an archive of experimentally determined three-dimensional structures of biological macromolecules, serving a global community of researchers, educators, and students. The archives contain atomic coordinates, bibliographic citations, primary and secondary structure information, as well as crystallographic structure factors and NMR experimental data.

Enzymes, reactions and metabolic pathway databases

KEGG KEGG is a suite of databases and associated software, integrating our current knowledge on molecular interaction networks in biological processes (PATHWAY database), the information about the universe of genes and proteins (GENES/SSDB/KO databases), and the information about the universe of chemical compounds and reactions (COMPOUND/REACTION databases).
ENZYME The ENZYME data bank contains the following data for each type of characterized enzyme for which an EC number has been provided: EC number, recommended name, alternative names, catalytic activity, cofactors, pointers to the Swiss-Prot entry or entries that correspond to the enzyme, and pointers to any diseases associated with a deficiency of the enzyme.
Roche Applied Science "Biochemical Pathways" This page gives access to the digitized version of the Roche Applied Science "Biochemical Pathways" wall chart.

Gene ontology resources

QuickGO The objective of the Gene Ontology (GO) is to provide controlled vocabularies for the description of the molecular function, biological process and cellular component of gene products. These terms are to be used as attributes of gene products by collaborating databases, facilitating uniform queries across them. The controlled vocabularies of terms are structured to allow both attribution and querying at different levels of granularity.

Bioinformatics Tools

FASTA Provides sequence similarity and homology searching against nucleotide and protein databases using the Fasta programs. Fasta can be very specific when identifying long regions of low similarity, especially for highly diverged sequences. You can also conduct sequence similarity and homology searching against complete proteome or genome databases using the Fasta programs.
BLAST BLAST® (Basic Local Alignment Search Tool) is a set of similarity search programs designed to explore all of the available sequence databases regardless of whether the query is protein or DNA. The BLAST programs have been designed for speed, with a minimal sacrifice of sensitivity to distant sequence relationships.
ClustalW Clustal W is a general-purpose multiple sequence alignment program for DNA or proteins. It produces biologically meaningful multiple sequence alignments of divergent sequences. It calculates the best match for the selected sequences and lines them up so that the identities, similarities and differences can be seen.
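
To show what "lining up" sequences means in practice, here is a minimal sketch of a pairwise global alignment using Needleman-Wunsch dynamic programming. This is not the method Clustal W uses (Clustal W builds multiple alignments progressively), and it is far simpler than the heuristics in FASTA or BLAST; the function name global_align and the scoring values are assumptions chosen for illustration.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Align two sequences end to end; return (aligned_a, aligned_b, score).

    Hypothetical helper with an arbitrary, simple scoring scheme.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against the empty string costs one gap per residue.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    # Fill the table: each cell holds the best score for the two prefixes.
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + pair,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    top, bottom = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        pair = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return "".join(reversed(top)), "".join(reversed(bottom)), score[len(a)][len(b)]


aligned_a, aligned_b, total = global_align("GATTACA", "GATACA")
print(aligned_a)
print(aligned_b)
print(total)
```

Printing the two aligned strings one above the other makes identities and gaps visible column by column; real tools add substitution matrices and affine gap penalties on top of this basic scheme.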