Bioinformatics. Group of authors

not. As the major public sequence databases need to be able to store data in a fairly generalized fashion, these databases often do not contain more specialized types of information that would be of interest to specific segments of the biological community. To address this, many smaller, specialized databases have emerged and have been developed and curated by biologists “in the trenches” to fulfill specific needs. These databases, which contain information ranging from strain crosses to gene expression data, provide a valuable adjunct to the more visible public sequence databases, and users are encouraged to make intelligent use of both types of databases. An annotated list of such databases can be found in the yearly Database issue of Nucleic Acids Research (Rigden and Fernández 2018).

      The position of this chapter at the beginning of this book reflects the belief that an understanding of biological databases is the first step toward being able to perform robust and accurate bioinformatic analyses. The reader is very strongly encouraged to take the time to understand the structure of the data found within these databases, as the basis for finding sequence data of interest and performing the more advanced analyses described in the chapters that follow.

      The author thanks Rolf Apweiler for the use of material from the third edition of this book.

DDBJ Database Divisions www.ddbj.nig.ac.jp/ddbj/data-categories-e.html
DNA Database of Japan (DDBJ) www.ddbj.nig.ac.jp
EMBL Nucleotide Sequence Database www.embl.org
ENA Data Formats www.ebi.ac.uk/ena/submit/data-formats
European Bioinformatics Institute www.ebi.ac.uk
GenBank www.ncbi.nlm.nih.gov
GenBank Database Divisions www.ncbi.nlm.nih.gov/genbank/htgs/divisions
Gene Ontology Consortium geneontology.org
INSDC Feature Table Definition insdc.org/documents/feature_table.html
International Society for Biocuration biocuration.org
NCBI Data Model www.ncbi.nlm.nih.gov/IEB/ToolBox/SDKDOCS/DATAMODL.HTML
NCBI Protein Database www.ncbi.nlm.nih.gov/protein
Nucleic Acids Research Database issue academic.oup.com/nar
Protein Data Bank (PDB) www.rcsb.org
Protein Information Resource (PIR) pir.georgetown.edu
Protein Research Foundation www.proteinresearch.net
RefSeq www.ncbi.nlm.nih.gov/refseq
Swiss-Prot (EBI) www.ebi.ac.uk/uniprot
Swiss-Prot (ExPASy) web.expasy.org/docs/swiss-prot_guideline.html
UniProt Consortium www.uniprot.org

      1 Bairoch, A. (2000). Serendipity in bioinformatics: the tribulations of a Swiss bioinformatician through exciting times! Bioinformatics. 16: 48–64. A personal narrative conveying the early history of the development of sequence databases and related software tools, events that set the groundwork for the modern bioinformatics landscape.

      2 Green, E.D., Rubin, E.M., and Olson, M.V. (2017). The future of DNA sequencing. Nature. 550: 179–181. An insightful perspective regarding the next several decades of the application of DNA sequencing methodologies in novel contexts and the implications of those applications to issues of data storage and data sharing.

      3 Rigden, D.J. and Fernández, X.M. (2018). The 2018 Nucleic Acids Research database issue and the online molecular biology database collection. Nucleic Acids Res. 46: D1–D7. The 25th overview of the annual database issue published by Nucleic Acids Research, capturing the wide variety of publicly available bioinformatic databases available to the community. This overview is updated yearly, and the individual papers describing these database resources are freely available through the Nucleic Acids Research web site.


