Bioinformatics. Group of Authors
Society (FGED), he was a developer of the Minimal Information About a Microarray Experiment (MIAME) and other data-reporting standards. Dr. Quackenbush was honored by President Barack Obama in 2013 as a White House Open Science Champion of Change.
Jonas Reeb, MSc is a PhD student in the laboratory of Burkhard Rost at the Technical University of Munich, Germany (TUM). During his studies at TUM, he has worked on predictive methods for the analysis and evaluation of transmembrane proteins; he has also worked on the NYCOMPS structural genomics pipeline. His doctoral thesis focuses on the effect of sequence variants and their prediction.
Burkhard Rost, PhD is a professor and Alexander von Humboldt Award recipient at the Technical University of Munich, Germany (TUM). He was the first to combine machine learning with evolutionary information, using this combination to accurately predict secondary structure. Since that time, his group has repeated this success in developing many other tools that are actively used to predict and understand aspects of protein structure and function. All tools developed by his research group are available through the first internet server in the field of protein structure prediction (PredictProtein), a resource that has been online for over 25 years. Over the last several years, his research group has been shifting its focus to the development of methods that predict and annotate the effect of sequence variation and their implications for precision medicine and personalized health.
Fabian Sievers, PhD is currently a postdoctoral research fellow in the laboratory of Des Higgins at University College Dublin, Ireland. He works on multiple sequence alignment algorithms and, in particular, on the development of Clustal Omega. He received his PhD in mathematics from Trinity College, Dublin and has worked in industry in the fields of algorithm development and high-performance computing.
Michael F. Sloma, PhD is a data scientist at Xometry, Gaithersburg, MD, USA. He received his BA degree in Chemistry from Wells College. He earned his doctoral degree in Biochemistry in the laboratory of David Mathews at the University of Rochester, where his research focused on computational methods to predict RNA structure from sequence.
W. Scott Watkins, MS is a researcher and laboratory manager in the Department of Human Genetics at the University of Utah, Salt Lake City, UT, USA. He has a long-standing interest in human population genetics and evolution. His current interests include the development and application of high-throughput computational methods to mobile element biology, congenital heart disease, and personalized medicine.
David S. Wishart, PhD is a Distinguished University Professor in the Departments of Biological Sciences and Computing Science at the University of Alberta, Edmonton, Alberta, Canada. Dr. Wishart has been developing bioinformatics programs and databases since the early 1980s and has made bioinformatics an integral part of his research program for nearly four decades. His interest in bioinformatics led to the development of a number of widely used bioinformatics tools for structural biology, bacterial genomics, pharmaceutical research, and metabolomics. Some of Dr. Wishart's most widely known bioinformatics contributions include the Chemical Shift Index (CSI) for protein secondary structure identification by nuclear magnetic resonance spectroscopy, PHAST for bacterial genome annotation, the DrugBank database for drug research, and MetaboAnalyst for metabolomic data analysis. Over the course of his academic career, Dr. Wishart has published more than 400 research papers, with many being in the field of bioinformatics. In addition to his long-standing interest in bioinformatics research, Dr. Wishart has been a passionate advocate for bioinformatics education and outreach. He is one of the founding members of the Canadian Bioinformatics Workshops (CBW) – a national bioinformatics training program that has taught more than 3000 students over the past two decades. In 2002 he established Canada's first undergraduate bioinformatics degree program at the University of Alberta and has personally mentored nearly 130 undergraduate and graduate students, many of whom have gone on to establish successful careers in bioinformatics.
Tyra G. Wolfsberg, PhD is the Associate Director of the Bioinformatics and Scientific Programming Core at the National Human Genome Research Institute (NHGRI), National Institutes of Health (NIH), Bethesda, MD, USA. Her research program focuses on developing methodologies to integrate sequence, annotation, and experimentally generated data so that bench biologists can quickly and easily obtain results for their large-scale experiments. She maintains a long-standing commitment to bioinformatics education and outreach. She has authored a chapter on genomic databases for previous editions of this textbook, as well as a chapter on the NCBI MapViewer for Current Protocols in Bioinformatics and Current Protocols in Human Genetics. She serves as the co-chair of the NIH lecture series Current Topics in Genome Analysis; these lectures are archived online and have been viewed over 1 million times to date. In addition to teaching bioinformatics courses at NHGRI, she served for 13 years as a faculty member in bioinformatics at the annual AACR Workshop on Molecular Biology in Clinical Oncology.
Michael Zuker, PhD retired as a Professor of Mathematical Sciences at Rensselaer Polytechnic Institute, Troy, NY, USA, in 2016. He was an Adjunct Professor in the RNA Institute at the University of Albany and remains affiliated with the RNA Institute. He works on the development of algorithms to predict folding, hybridization, and melting profiles in nucleic acids. His nucleic acid folding and hybridization web servers have been running at the University of Albany since 2010. His educational activities include developing and teaching his own bioinformatics course at Rensselaer and participating in both a Chautauqua short course in bioinformatics for college teachers and an intensive bioinformatics course at the University of Michigan. He currently serves on the Scientific Advisory Board of Expansion Therapeutics, Inc. at the Scripps Research Institute in Jupiter, Florida.
About the Companion Website
This book is accompanied by a companion website:
www.wiley.com/go/baxevanis/Bioinformatics_4e
The website includes:
Test Samples
Word Samples
1 Biological Sequence Databases
Andreas D. Baxevanis
Introduction
Over the past several decades, there has been a feverish push to understand, at the most elementary of levels, what constitutes the basic “book of life.” Biologists (and scientists in general) are driven to understand how the millions or billions of bases in an organism's genome contain all of the information needed for the cell to conduct the myriad metabolic processes necessary for the organism's survival – information that is propagated from generation to generation. To have a basic understanding of how the collection of individual nucleotide bases drives the engine of life, large amounts of sequence data must be collected and stored in a way that allows these data to be searched and analyzed easily. To this end, much effort has gone into the design and maintenance of biological sequence databases. These databases have had a significant impact on the advancement of our understanding of biology not just from a computational standpoint but also through their integrated use alongside studies being performed at the bench.
The history of sequence databases began in the early 1960s, when Margaret Dayhoff and colleagues (1965) at the National Biomedical Research Foundation (NBRF) collected all of the protein sequences known at that time – all 65 of them – and published them in a book called the Atlas of Protein Sequence and Structure. It is important to remember that, at this point in the history of biology, the focus was on sequencing proteins through traditional techniques such as the Edman degradation rather