The Ultimate Frontier of Knowledge: the Mysterious Genomes and the Gene Expression and Regulation in the Upstream of Biological Information Flow
Page: 4-16 (13)
Author: Kuo Ping Chiu
DOI: 10.2174/9781681080925115010005
Abstract
DNA sequencing is a tool for studying biological phenomena. Before discussing how to use the tool, it is useful to review some biological background, which helps clarify which subjects in biology DNA sequencing can be applied to and how sequencing technologies are applied in a study. We therefore begin with a review of molecular biology, starting from the structure and organization of information molecules at the molecular level.
Evolution of DNA Sequencing Technologies
Page: 17-24 (8)
Author: Kuo Ping Chiu
DOI: 10.2174/9781681080925115010006
Abstract
Current nucleotide sequencing focuses on DNA sequencing. As you will see later in this chapter, direct RNA sequencing was de-selected by evolution; instead, RNA molecules are converted to cDNA and subjected to DNA sequencing. Moreover, DNA sequencing can be conducted with a number of sequencing technologies, each of which uses a company- or inventor-defined procedure and sequencing mechanism. By nature, both living organisms and non-living objects are constantly challenged by evolution, and DNA sequencing technologies are no exception. For living organisms, phenotypic variations resulting from genetic alterations are constantly tested by the surrounding environment, which allows the fittest to propagate more efficiently than the others. Similarly, each of the DNA sequencing technologies discussed later in this chapter has its pros and cons relative to the others, and these advantages and disadvantages co-evolve with, and depend on, the environment in which the technology is used. Here, we review the evolution of DNA sequencing technologies to appreciate the evolutionary process that eventually led to the development of Next-Generation Sequencing technologies.
Mechanisms of Next-Generation Sequencing (NGS)
Page: 25-37 (13)
Author: Kuo Ping Chiu
DOI: 10.2174/9781681080925115010007
Abstract
DNA sequencing comprises a number of methodologies, each adopting its own sequencing mechanism. During the 1970s, Sanger sequencing survived the competition against other approaches and went on to dominate DNA sequencing for several decades. Stimulated by the urgent demand for high-throughput sequencing from the Human Genome Project and the genome projects that followed, Next-Generation Sequencing (NGS) evolved to replace Sanger sequencing as the main sequencing approach. NGS consists of three major sequencing platforms (i.e., 454/Roche, Solexa/Illumina and SOLiD/Life Technologies), each with its own sequencing mechanism. These mechanisms have experienced severe competition, leading the sequencing market to select the Illumina system as the mainstream sequencing platform. To appreciate the reasons behind the success of the Illumina system, we analyze and discuss the sequencing mechanisms adopted by these NGS platforms.
Genome Assembly, the Genomic Era and the Rise of the Omics Era
Page: 38-48 (11)
Author: Kuo Ping Chiu
DOI: 10.2174/9781681080925115010008
Abstract
Genome assembly, or genome sequencing, refers to the process, or the end product, of sequencing genomic fragments of an organism and then piecing together (‘assembling’, in scientific terms) those fragments in sequential order to reveal the original genome sequence. To make a genome assembly usable as a reference, annotation (i.e., assigning locations for genes along the chromosome) is also required. The end product of genome sequencing is a complete set of nucleotide sequences of a genome, which may be linear or circular and made of DNA or RNA, depending on the species. It represents the complete genetic makeup determining all molecular potential, entities and activities of that organism. Much as road maps guide traffic and help people find a person living at a specific address, genome assemblies act as genomic maps (references) that guide us to genes (the equivalent of persons), regulatory elements, or mutations at specific locations in the genome. Genome assembly aims to generate a genomic map for future studies of that organism and related organisms.
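The idea of piecing overlapping fragments together in sequential order can be illustrated with a toy sketch. The greedy overlap-merge strategy below is an assumption chosen for brevity, not the method described in the chapter; the function names `overlap` and `greedy_assemble` and the `min_len` parameter are hypothetical, and the sketch ignores sequencing errors, repeats and reverse-complement reads that real assemblers must handle.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Overlaps shorter than `min_len` are ignored and reported as 0.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the two fragments that share the largest overlap."""
    contigs = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_len, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:
            break  # nothing overlaps any more; the contigs stay separate
        # Join the pair, writing the shared region only once.
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


if __name__ == "__main__":
    reads = ["ATGGCGTA", "GCGTACGTT", "ACGTTAGC"]  # made-up fragments
    print(greedy_assemble(reads))
```

When no further overlaps are found, the function returns several contigs rather than a single sequence, which mirrors the gaps that remain in an unfinished assembly.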
Laboratory Setup and Fundamental Works
Page: 50-56 (7)
Author: Kuo Ping Chiu
DOI: 10.2174/9781681080925115010009
Abstract
This chapter shares some previous experience in laboratory setup and bioinformatics exercises, in the hope of reducing the problems readers may encounter, especially readers who have not yet had a chance to use sequencers in their own studies and those who have just begun sequencing and/or genomics studies. Many large laboratories and sequencing centers can also offer useful advice; please consult these resources if possible.
Sequencing Libraries and Basic Procedure for Sequencing Library Construction
Page: 57-61 (5)
Author: Kuo Ping Chiu
DOI: 10.2174/9781681080925115010010
Abstract
A sequencing library is a collection of DNA molecules prepared to be loaded onto a sequencer for sequencing. As such, sequencing library construction is essential for sequencing. Because sequencing of unknown target DNA molecules has to start from a known and well-defined region, and every NGS sequencer manufacturer uses unique sequencing primers for its own machines, ligating target DNA molecules to a pair of adaptors is both essential and sequencer-dependent when making a sequencing library. This chapter clarifies the various types of sequencing libraries and the general procedure for their construction.
Paired-End (PE), Mate Pair (MP) and Paired-End Ditag (PED) Sequencing
Page: 62-76 (15)
Author: Kuo Ping Chiu
DOI: 10.2174/9781681080925115010011
Abstract
Information about the distance between paired reads enhances the accuracy of genome assembly and sequence-to-genome mapping, making paired-end strategies indispensable for DNA sequencing. The most commonly used paired-end sequencing strategies are Paired-End (PE) sequencing and Paired-End Ditag (PED) sequencing. The similarity of these terms frequently causes confusion. This chapter sets out to clarify the terminology and then, using PED as an example, to illustrate how a biotechnology can be developed step by step.
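One way the distance between paired reads can support mapping is as a consistency check: a pair whose mapped span is far from the expected fragment size is suspect. The sketch below is only an illustration under stated assumptions; the `MappedPair` type, the 0-based coordinates, and the `expected_span` and `tolerance` values are hypothetical and are not taken from the chapter or from any particular aligner.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MappedPair:
    """Reference coordinates of one read pair (hypothetical, 0-based)."""
    left_start: int  # start of the leftmost read
    right_end: int   # exclusive end of the rightmost read


def fragment_span(pair: MappedPair) -> int:
    """Distance covered on the reference from the first base to the last."""
    return pair.right_end - pair.left_start


def is_concordant(pair: MappedPair, expected_span: int, tolerance: int) -> bool:
    """True when the mapped span is within `tolerance` of the expected size."""
    return abs(fragment_span(pair) - expected_span) <= tolerance


if __name__ == "__main__":
    pairs = [MappedPair(1000, 1310), MappedPair(1000, 5000)]
    for p in pairs:
        # Assumed library: ~300 bp fragments, accepted within +/- 50 bp.
        print(p, is_concordant(p, expected_span=300, tolerance=50))
```

A pair that fails the check may indicate a wrongly placed read, or a genuine structural difference between the sample and the reference.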
Genome Sequencing and Assembly
Page: 77-83 (7)
Author: Kuo Ping Chiu
DOI: 10.2174/9781681080925115010012
Abstract
In Chapter 4, we reviewed the background and history of genome sequencing and assembly. In this chapter, we focus more on the technical and experimental issues. In science, the term “whole genome sequencing and assembly” is often used interchangeably with “genome sequencing and assembly”, “genome sequencing” and “genome assembly”, because genome assembly normally refers to the assembly of an entire genome, and genome sequencing is normally followed by assembling sequence reads to produce a complete set of chromosomal sequences for the genome of interest. For convenience, “genome assembly” and “genome sequencing” are preferred and will be used more often than the others throughout our discussion. Since whole genome sequencing and assembly is a complicated process involving multiple alternatives and methodologies, it is not possible to cover every detail. We will go through some concepts and NGS-associated procedures so that readers can get an idea of how genome assembly is achieved. Serious readers are encouraged to consult reports published by sequencing laboratories.
Exome Sequencing: Genome Sequencing Focusing on Exonic Regions
Page: 84-87 (4)
Author: Kuo Ping Chiu
DOI: 10.2174/9781681080925115010013
Abstract
While whole genome sequencing (WGS) remains costly and requires intensive labor and elaborate analytical tools for assembly, whole exome sequencing (WES) is cheaper and easier. Compared with WGS, WES is an efficient approach when the protein-coding regions are the only concern, because it focuses on the exonic regions and the desired sequencing depth can be reached easily. WES is frequently confused with transcriptome analysis because both types of libraries contain solely exonic sequences. However, the former is generated from genomic DNA fragments, while the latter is generated from expressed mRNA molecules. Readers should keep the difference between these two libraries in mind from the outset.
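Why the desired depth is easier to reach for an exome can be seen from the usual back-of-the-envelope estimate, mean depth = (number of reads × read length) / target size. The sketch below applies that estimate; the function name `mean_depth`, the `on_target_fraction` parameter, and every number in the example (a 3 Gb genome, a 30 Mb exome target, 50% of reads on target) are illustrative assumptions rather than figures from the chapter.

```python
def mean_depth(num_reads: int, read_length: int, target_size: int,
               on_target_fraction: float = 1.0) -> float:
    """Approximate mean sequencing depth over a target region.

    depth = (reads * read_length * fraction of reads on target) / target size
    """
    if target_size <= 0:
        raise ValueError("target_size must be positive")
    return num_reads * read_length * on_target_fraction / target_size


if __name__ == "__main__":
    reads, length = 300_000_000, 100   # assumed sequencing run
    genome = 3_000_000_000             # assumed genome size (round number)
    exome = 30_000_000                 # assumed exome target (round number)

    print("WGS depth:", mean_depth(reads, length, genome))
    # Exome capture is imperfect, so only part of the reads land on target.
    print("WES depth:", mean_depth(reads, length, exome, on_target_fraction=0.5))
```

Under these assumed numbers the same run covers the exome far more deeply than the whole genome, which is the sense in which the desired depth is reached more easily.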
Transcriptome Analysis
Page: 88-96 (9)
Author: Kuo Ping Chiu
DOI: 10.2174/9781681080925115010014
Abstract
Transcriptome analysis, or transcriptome sequencing, concerns the transcript sequences transcribed from the genome of a specific cell type at a specific time and under specific growth conditions. Previous studies have clearly demonstrated that, besides messenger RNA (mRNA), the transcribed RNA sequences also contain large amounts of ribosomal RNA (rRNA), transfer RNA (tRNA) and small non-coding RNA (ncRNA). When gene expression and regulation are the major concern, transcriptome analysis focuses mainly on mRNA and, sometimes, on certain ncRNA species that may be co-isolated with mRNA. Several kinds of biological information can be retrieved from transcriptome sequencing, including gene expression levels, the transcriptome landscape across the entire genome, Gene Ontology, pathways, etc. Note that transcriptome analysis normally refers to whole transcriptome analysis of a cell population. The result is in fact a combination of millions of potentially diverse single-cell transcriptomes.
Single Cell Sequencing (SCS) and Single Cell Transcriptome (SCT) Sequencing
Page: 97-104 (8)
Author: Kuo Ping Chiu
DOI: 10.2174/9781681080925115010015
Abstract
Sometimes genomic and transcriptomic information from single cells, rather than from cell populations, is desired. Obtaining such information relies on single cell sequencing (SCS) and single cell transcriptome (SCT) sequencing.
Although SCT sequencing is in fact part of SCS, they are readily distinguishable not only in research objective, but also in experimental procedure and bioinformatic approach. We will first review the history and achievements that have been made in these fields, and then discuss an experimental procedure of SCT sequencing to gain more insight into the subject.
ChIP-TFBS Analysis
Page: 105-111 (7)
Author: Kuo Ping Chiu
DOI: 10.2174/9781681080925115010016
Abstract
Eukaryotic gene expression is tightly controlled by a cascade of regulatory mechanisms. At the sequence level, gene expression is regulated by cis-acting DNA motifs that recruit trans-acting transcription factors (TFs) for positive or negative regulation of local gene expression. Genome-wide mapping of transcription factor binding sites (TFBS) has therefore become a crucial strategy for studying the regulation of gene expression. In this chapter, we discuss the preparation of ChIP-TFBS sequencing libraries and the analysis of ChIP-TFBS sequence data.
ChIP-EM Libraries
Page: 112-116 (5)
Author: Kuo Ping Chiu
DOI: 10.2174/9781681080925115010017
Abstract
Epigenetic modifications (EMs) refer to the external modifications on DNA that do not alter coding specificity. EMs include DNA methylations (DMs) and histone modifications (HMs). This chapter focuses on HMs. We discuss how ChIP-EM libraries can be made and what can be expected from the sequence data analysis. Many laboratories work in this field and many reports have been published. Readers are encouraged to consult these reports for further understanding of the subject.
MicroRNA Analysis
Page: 117-120 (4)
Author: Kuo Ping Chiu
DOI: 10.2174/9781681080925115010018
Abstract
MicroRNAs (miRNAs) negatively regulate mRNA species by binding to the 3' untranslated region (3' UTR) of the mRNA through nucleotide complementarity, which tolerates a limited number of nucleotide mismatches that fine-tune the target specificity and the degree of repression. Although miRNAs have been intensively studied for decades, most of their targets and functions remain unknown. Furthermore, many of the miRNAs that have been studied are known to target multiple mRNAs. These properties seriously impede the progress of miRNA analysis. Analysis of miRNAs normally relies on commercial kits for miRNA isolation and sequencing library preparation. This chapter serves as a general introduction to miRNA analysis. Most of the experimental procedures and sequence data analysis discussed in this chapter can also be found in the paper entitled “global assessment of Antrodia cinnamomea-induced microRNA alterations in hepatocarcinoma cells” published in 2013.
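The notion of complementarity that tolerates a limited number of mismatches can be made concrete with a toy scan. The sketch below slides the reverse complement of a short miRNA along a 3' UTR and reports windows with at most `max_mismatches` differences. It is an assumption-laden illustration, not the analysis used in the chapter: the function names, the sequences and the threshold are hypothetical, only Watson-Crick pairs are counted (no G:U wobble), and real target prediction weighs additional features.

```python
# Watson-Crick complement for the RNA alphabet.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Sequence that would pair perfectly with `rna`, read 5' to 3'."""
    return rna.translate(COMPLEMENT)[::-1]


def find_sites(mirna: str, utr: str, max_mismatches: int = 1) -> list[tuple[int, int]]:
    """Return (start, mismatches) for every UTR window that can pair with
    `mirna` while differing from perfect complementarity in at most
    `max_mismatches` positions."""
    target = reverse_complement(mirna)
    hits = []
    for start in range(len(utr) - len(target) + 1):
        window = utr[start:start + len(target)]
        mismatches = sum(1 for x, y in zip(window, target) if x != y)
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits


if __name__ == "__main__":
    mirna = "UGAGGUAG"  # made-up 8-nt sequence for illustration
    # Made-up UTR with one perfect site and one site carrying a mismatch.
    utr = "AAA" + "CUACCUCA" + "GGG" + "CUACGUCA" + "AAA"
    print(find_sites(mirna, utr, max_mismatches=1))
```

Raising `max_mismatches` admits more candidate sites at the cost of specificity, which is the trade-off the abstract alludes to.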
Application of NGS in the Study of Sequence Diversity in Immune Repertoire
Page: 121-126 (6)
Author: Kuo Ping Chiu
DOI: 10.2174/9781681080925115010019
Abstract
During evolution, the immune system evolved as a defense mechanism to protect organisms against pathogens. Since pathogens in the environment are extremely diverse and unpredictable, the strategies taken by the immune system have to be highly diversified in order to mount an effective response. At the molecular level, the sequence diversity present in the variable regions of antibody-coding and TCR-coding genomic sequences is eventually reflected in the amino acid sequences of their encoded proteins, as seen in the circulatory system and on the surface of immune cells. At the cellular level, B cells, T cells, dendritic cells, and many other immune cells have to interact with one another in a coordinated manner to foster the maturation of an immune response (e.g., affinity maturation) against pathogenic attack. With the advent of NGS technologies, the complexity of the immune system can now be studied in greater detail.
Galaxy Pipeline for Transcriptome Library Analysis
Page: 128-146 (19)
Author: Kuo Ping Chiu
DOI: 10.2174/9781681080925115010020
Abstract
Next-Generation Sequencing (NGS) provides researchers with an unprecedented opportunity to produce a large volume of DNA sequences quickly, and is one of the fundamental methods for high-throughput genomic studies. Currently, the most widely used NGS platforms are Illumina, Roche 454 and AB SOLiD. These platforms differ in the chemistry used in the sequencing process and in the length of the sequencing reads generated. Each platform has its own strengths and weaknesses. In particular, the required read length plays an important role when designing an experiment. For example, longer reads are needed for the assembly of a novel genome, while throughput-maximizing PED-based techniques are better suited when shorter reads suffice.
Introduction
Nucleic acid sequencing techniques have enabled researchers to determine the exact order of base pairs, and by extension the information present, in the genomes of living organisms. Consequently, our understanding of this information and its link to genetic expression at the molecular and cellular levels has led to rapid advances in biology, genetics, biotechnology and medicine. Next-Generation Sequencing and Sequence Data Analysis is a brief primer on DNA sequencing techniques and the methods used to analyze sequence data. Readers will learn about recent concepts and methods in genomics such as sequence library preparation, cluster generation for PCR technologies, PED sequencing, genome assembly, exome sequencing, transcriptomics and more. This book serves as a textbook for students taking courses in bioinformatics and laboratory methods in applied biology. General readers interested in learning about DNA sequencing techniques may also benefit from the simple format in which the information is presented.