Zanmbeel

Sunday, February 3, 2008

Polymerase chain reaction

INTRODUCTION — The two most important principles underlying the polymerase chain reaction (PCR) are:
Complementarity-driven binding of DNA to form a duplex
Template-driven, semi-conservative synthesis of new DNA by DNA polymerases
These two principles are discussed separately in detail. (See "Overview of molecular biology").
Two further topics complete the theoretical background needed to understand PCR: the requirements that specify a unique sequence in the genome, and the properties of thermostable DNA polymerases. These features, as well as some research applications of PCR, are presented here. An introduction to the clinical applications of PCR can be found elsewhere within UpToDate. (See "Cytogenetic and molecular genetic diagnostic tools").
UNIQUE GENOMIC SEQUENCES — Relatively short DNA sequences suffice to specify a unique sequence in the genome. As an example, assume (although it is not strictly true) that all four bases (adenine [A], thymine [T], guanine [G], and cytosine [C]) are equally frequent in the approximately $3 \times 10^{9}$ residues of the human genome. It is possible to calculate the expected frequency of a sequence of N bases having a specified sequence using simple probability theory; that frequency is given by the following formula:
$$(3 \times 10^{9}) \times (1/4)^{N}$$
The term $(1/4)^{N}$ expresses the chance that the sequence of an N-residue oligonucleotide matches a given sequence. The constant $3 \times 10^{9}$ is simply the size of the human genome in bases. The following are examples of this calculation for four different oligonucleotide lengths: A specific oligonucleotide that is 10 nucleotides long occurs approximately 3000 times in the human genome. An oligonucleotide that is 15 nucleotides long occurs approximately 3 times. Specific oligonucleotides that are 20 and 25 nucleotides in length occur at an approximate frequency of 0.003 and 0.000003, respectively.
Although the precise frequencies vary from these examples because of the simplified assumptions regarding base abundance and the independence of sequence at a specific residue from sequence at neighboring residues, these rough estimates demonstrate that a specific 20-residue sequence is very likely to be unique in the human genome.
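The calculation above can be reproduced with a few lines of Python. This is a minimal sketch under the same simplifying assumptions as the text (equal base frequencies, independence between neighboring residues); the function and constant names are illustrative only.

```python
GENOME_SIZE = 3e9  # approximate size of the human genome in bases (value from the text)


def expected_occurrences(n, genome_size=GENOME_SIZE):
    """Expected number of times one specific n-base sequence occurs,
    assuming the four bases are equally frequent and independent."""
    return genome_size * (1 / 4) ** n


# Oligonucleotide lengths used as examples in the text
for n in (10, 15, 20, 25):
    print(f"{n}-mer: about {expected_occurrences(n):.2g} expected occurrences")
```

The printed values should fall close to the rounded figures quoted above (roughly 3000, 3, 0.003, and 0.000003).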
THERMOSTABLE DNA POLYMERASES AND SYNTHETIC OLIGONUCLEOTIDES — PCR depends on thermostable DNA polymerases, which are purified or cloned from microorganisms living in hot springs. These polymerases can withstand heating to 95 degrees C with minimal loss of activity and function optimally near 70 degrees C. The ability to synthesize specific oligonucleotides of 20 to 30 residues readily and inexpensively is the other technical underpinning of PCR.
PROCESS OF PCR — A reaction mixture containing a large excess of a pair of oligonucleotide primers designed to match either end of the target sequence, a substrate DNA, free deoxynucleotide triphosphates, and a thermostable DNA polymerase is assembled. The mixture is heated to 95 degrees C to allow the double stranded substrate DNA to denature into single strands. The mixture is then cooled to a temperature just below the predicted denaturation temperature of the primers, which will then anneal to the single-stranded substrate DNA and prime new DNA synthesis by the included DNA polymerase. The temperature is then raised to the optimal temperature for the polymerase to allow chain elongation to proceed long enough for synthesis to extend past the opposite primer's complementary sequence. The mixture is then heated to 95 degrees C again in order to once again separate all the DNA to single strands, and the entire sequence of temperature cycling is repeated. Now, the number of potential targets for primer annealing has been doubled, as both the original substrate DNA and the newly synthesized strands are available. In general, between 25 and 40 such cycles of denaturation, annealing, and elongation are performed, resulting in exponential amplification of the target sequence.
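The exponential character of the amplification can be illustrated with a short calculation. The sketch below assumes an idealized reaction in which every template is copied in every cycle; the `efficiency` parameter is a hypothetical addition (not from the text) that allows for less-than-perfect doubling.

```python
def copies_after(cycles, initial_copies=1, efficiency=1.0):
    """Idealized number of target copies after a given number of cycles.

    efficiency = 1.0 means perfect doubling each cycle (the case described
    in the text); lower values are a hypothetical way to model imperfect cycles.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1 + efficiency) ** cycles


# The text cites 25 to 40 cycles as typical
for cycles in (25, 30, 40):
    print(f"{cycles} cycles: about {copies_after(cycles):.3g} copies per starting template")
```

This simple model ignores the fact that products of defined length only appear after the first few cycles, and it does not account for reagents becoming limiting in late cycles.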
Early experiments were performed by physically moving reactions among three water baths preset to the desired temperatures. Currently, a variety of automated thermal cyclers regulate the temperature with minimal operator hands-on time. All the components of the reactions are readily available commercially, and computer programs that facilitate primer design and calculate annealing temperatures are freely distributed online [1-3].
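As a rough illustration of the kind of calculation such programs perform, the sketch below estimates a primer's melting temperature with the Wallace rule of thumb (2 degrees C for each A or T, 4 degrees C for each G or C). The rule, the example primer sequence, and the 5-degree offset used for the annealing temperature are assumptions chosen for illustration; they are not taken from the cited programs, which use more refined thermodynamic models.

```python
def wallace_tm(primer):
    """Approximate melting temperature (degrees C) of a short oligonucleotide
    using the Wallace rule: 2 * (A + T) + 4 * (G + C)."""
    seq = primer.upper()
    if set(seq) - set("ACGT"):
        raise ValueError("primer may contain only A, C, G, and T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def annealing_temperature(primer, offset=5):
    """Annealing temperature set a few degrees below the estimated melting
    temperature; the default offset of 5 is a hypothetical choice."""
    return wallace_tm(primer) - offset


primer = "AGCTTGACCGTAAGCTTGCA"  # hypothetical 20-residue primer
print(wallace_tm(primer), annealing_temperature(primer))
```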
OVERVIEW OF RESEARCH APPLICATIONS — PCR's impact on biomedical research has been immense. This technology allows large quantities of rare sequences to be synthesized, cloned, and analyzed with high reliability and minimal effort. The award of the 1993 Nobel Prize in Chemistry to Kary B. Mullis for inventing the technique recognized the importance of PCR-based methods. A few examples of common research applications of PCR are briefly described below.
PCR is a central tool in genomics and genetics. Relatively early in the human genome project, it was recognized that PCR technology permits more extensive and easier sharing of reagents than had been possible previously. To share a clone, for example, investigators need only specify a pair of primer sequences, the size of the expected product, and buffer conditions for its successful amplification. Any laboratory receiving this information subsequently has the capability of amplifying the sequence from genomic DNA and cloning it into a suitable vector [4]. This is obviously much easier than exchanging actual specimens or cultures.
Amplification of genomic DNA — Two examples of amplification of genomic DNA by PCR will briefly be presented. These are genotyping at microsatellite markers and detection of rare sequences. Both of these examples are fairly common research applications and are likely to be adapted to the clinical setting in the near future.
Microsatellite genotyping — To provide genetic markers, primer pairs flanking short, repetitive sequences called microsatellites were chosen. These were found to be present in variable numbers of copies at a population level, but to display Mendelian inheritance within families. Copy number of the repetitive sequence within the amplified product can therefore define various alleles, while the unique primers define a specific genomic location. This is discussed more extensively in the topic review on repetitive DNA sequences. (See "Repetitive DNA").
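The relationship between product size and allele can be made concrete with a small calculation. In this sketch, the flanking length (the part of the amplified product that is not repeat, including both primers), the repeat unit length, and the product sizes are all hypothetical values chosen for illustration.

```python
def repeat_count(product_length, flank_length, unit_length):
    """Number of repeat units in an amplified microsatellite product.

    product_length: size of the PCR product in base pairs
    flank_length:   total non-repeat sequence in the product, primers included
    unit_length:    length of one repeat unit (eg, 2 for a CA repeat)
    """
    repeat_region = product_length - flank_length
    if repeat_region < 0 or repeat_region % unit_length != 0:
        raise ValueError("product size is not consistent with a whole number of repeats")
    return repeat_region // unit_length


# Hypothetical individual with two alleles at a CA-repeat marker
for size in (140, 148):
    print(f"{size} bp product: {repeat_count(size, flank_length=100, unit_length=2)} repeats")
```

Because the primers are unique, both products map to the same genomic location, and the difference in size reflects only the difference in repeat copy number.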
Detection of rare sequences — PCR technology has also allowed detection of rare DNA sequences in a population of DNA molecules. This application is particularly prominent in searching for DNA rearrangements in the setting of neoplasia. An example is the discovery that a herpesvirus (human herpesvirus 8) is involved in the pathogenesis of Kaposi's sarcoma [5-7]. Representation difference analysis, a PCR-based method for preferentially amplifying sequences present in only one of a pair of sources of substrate DNA [8], allowed investigators to determine that tumor tissue (but not surrounding normal tissue from the same individuals) harbored integrated viral DNA. Thereafter, primers were designed to allow direct amplification of the viral sequences. (See "Cytogenetic and molecular genetic diagnostic tools" and see "Diagnosis and antiviral therapy of human herpesvirus 8 infection").
Amplification of RNA — PCR can be applied to messenger RNA (mRNA) by addition of a reverse transcription (RT) step prior to amplification. RT-PCR allows detection of rare messages whose abundance is below the detection threshold for Northern blot analysis. Moreover, it is easier and faster to perform. Two examples are expression profiles and RNA virus infection.
Expression profiles — A vast area of current research assesses changes in patterns of gene expression in response to various perturbations. RT-PCR allows qualitative, semi-quantitative, or quantitative measurement of mRNA levels. Qualitative and semi-quantitative assays can now be performed on microarrays, permitting thousands of genes to be studied simultaneously (reviewed in [9-12]).
RNA virus infection — PCR of RNA isolated from blood has become a standard tool for monitoring the viral load in HIV-infected patients (reviewed in [13,14]). This strategy is being extended to other RNA viruses causing chronic infection, such as hepatitis C [15,16]. (See "Techniques and interpretation of HIV-1 RNA quantitation" and see "Diagnostic approach to hepatitis C virus infection").
CRITICAL EVALUATION OF DATA — The ease of performing PCR has led to wide dissemination of the methodology. The rigor with which work is done varies enormously. When reading medical literature including PCR data, one must evaluate the work critically since publication is not a guarantee of high-quality work.
Every PCR experiment requires a minimum of two technical controls, in addition to biological controls specific to the research question being addressed. The technical controls should demonstrate that amplification occurs when it should (positive control) and that it does not occur when it should not (negative control).
In RT-PCR, it is also important to distinguish semi-quantitative from truly quantitative designs. Because substrate concentrations may become limiting in later rounds of amplification, comparing intensity of electrophoretic bands in most circumstances is only semi-quantitative [17,18]. True quantitation generally entails either inclusion of an internal standard that is coamplified competitively with the target or use of special methodology, such as real time PCR [19-22]. (See "Cytogenetic and molecular genetic diagnostic tools")
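A toy model may help show why band intensity at the end of the reaction is only semi-quantitative. In the sketch below, amplification efficiency falls as product accumulates toward a ceiling; the ceiling value and the linear form of the fall-off are assumptions made purely for illustration and are not drawn from the cited studies.

```python
def amplify(initial_copies, cycles, plateau=1e12):
    """Toy model: per-cycle efficiency declines linearly as the product
    approaches a hypothetical plateau set by limiting reagents."""
    copies = float(initial_copies)
    for _ in range(cycles):
        efficiency = max(0.0, 1.0 - copies / plateau)
        copies += copies * efficiency
    return copies


low, high = 1e3, 1e4  # two samples differing 10-fold in starting template
for cycles in (15, 40):
    ratio = amplify(high, cycles) / amplify(low, cycles)
    print(f"after {cycles} cycles, product ratio is about {ratio:.2f}")
```

In this model the 10-fold difference in starting template is preserved while the reaction is still exponential, but it disappears once both reactions reach the plateau, which is the situation that internal standards and real time methods are designed to avoid.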

Peptide hormone signal transduction and regulation

INTRODUCTION — Advances in molecular biology over the past 15 years have expanded our understanding of the processes of peptide hormone receptor binding and signal transduction that were previously impossible to study. The DNA sequences for hundreds of receptors and many signaling molecules involved in their regulation have been analyzed.
Signal transduction is a process in which a peptide hormone transfers specific information from the outside of the target cell to exert a cellular response. For this to occur, the hormone (eg, gastrin) exerts a signal through a specific receptor that transmits information from the extracellular compartment (blood) into the cell (acid-secreting cells of the stomach). This message is tightly controlled, especially in settings that are vital for cellular homeostasis.
The normal function of a cell depends upon an intact signal regulation/termination system. If this system malfunctions, the host may experience pathophysiological consequences such as abnormal secretion, motility, growth, or even the development of cancer [1,2].
The major physiological principles of cell signaling systems will be reviewed here. Discussions of individual peptide hormones are presented separately. (See appropriate topic reviews.)
RECEPTOR STIMULATION — Despite the vast array of information communicated to a cell, the basic components of the signaling system are relatively simple (show figure 1). A peptide hormone binds to a cell surface receptor and stimulates activation of an effector system. Cell surface receptors are capable of interacting with only certain chemical messages. The specificity of the hormone-receptor interaction is responsible for the unique cellular response.
The peptide hormone must initiate a change in the receptor such that the hormone-receptor complex activates an intracellular effector molecule such as a specific guanine nucleotide-binding protein (G-protein) (show figure 2). Most peptide hormone receptors act through G-proteins; as a result, these receptors are called G protein-coupled receptors (GPCRs).
G proteins — G-proteins are molecular intermediaries that initiate the intracellular communication process (show figure 2) [3,4]. After the hormone binds to its receptor, a G-protein is stimulated. Stimulation begins the intracellular process of signal transduction.
G-proteins are composed of three subunits (alpha, beta, and gamma) and are classified according to their alpha subunit. G-proteins that stimulate adenylyl cyclase are classified as the Gs type; those that inhibit adenylyl cyclase are called Gi. To date, 20 different G-protein alpha subunits have been identified [4].
Shortly after receptor stimulation, a series of events is initiated that ultimately turns off signaling. The principal events in this process are receptor desensitization and internalization, which reestablish cell responsiveness. (See "Desensitization" below and see "Internalization" below).
G protein-coupled receptors — G protein-coupled receptors are heptahelical proteins, with seven membrane spanning domains [5]. They contain an extracellular amino terminus and an intracellular carboxy terminus (show figure 3). When stimulated by the appropriate chemical messenger, the GPCR undergoes a conformational change that causes coupling to a specific G protein.
GPCRs are classified by their structure into three groups (show table 1). Group I, the largest group, contains the receptors for catecholamines, many peptide hormones, neuropeptides, and glycoproteins. Group II contains the secretin/glucagon/vasoactive intestinal peptide receptor family. Group III contains the metabotropic receptors (eg, calcium-sensing and glutamate receptors).
Effector systems — Following receptor occupation, G-protein subunits cause activation of enzymes or other proteins, ultimately resulting in a variety of cellular responses (show figure 4). Enzymes, such as adenylyl cyclase or phospholipase C, generate specific second messengers; examples include cyclic adenosine monophosphate (cAMP), inositol 1,4,5-trisphosphate (IP3), and diacylglycerol. Some G-proteins couple directly with specific ion channels, such as potassium or calcium channels, and initiate changes in ion permeability (show figure 4). The effector systems are not understood for some receptors, such as those involved in cell growth and differentiation (show table 2).
Adenylyl cyclase — One of the most studied effector systems of receptor activation is the production of cAMP. As discussed above, Gs-coupled receptors stimulate adenylyl cyclase to produce cAMP. A conformational change occurs as the hormone binds to its receptor, allowing the receptor to associate with Gs. Under basal (unstimulated) conditions Gs is bound to GDP. However, GDP is released during hormone binding and is replaced with GTP. The Gs-GTP complex then activates adenylyl cyclase, resulting in the formation of cAMP from ATP within the cytoplasm of the cell. cAMP is then capable of producing other effects within the cell, ultimately leading to responses such as secretion, motility, or growth.
The G alpha-GTP complex is gradually inactivated by hydrolysis of GTP to GDP. This conversion is carried out by the G-protein itself, which has intrinsic GTPase activity. Once GTP has been converted to GDP, the G-protein can no longer stimulate adenylyl cyclase; this is one way by which the hormone signal is terminated and the basal condition is restored.
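The Gs activation cycle described in the two preceding paragraphs can be summarized as a two-state model. The sketch below is a deliberately simplified illustration; the class and method names are hypothetical, and it ignores subunit dissociation, kinetics, and every other regulatory input.

```python
from enum import Enum


class GsState(Enum):
    GDP_BOUND = "basal, GDP bound"
    GTP_BOUND = "active, GTP bound"


class GsProtein:
    """Minimal two-state model of the Gs cycle described in the text."""

    def __init__(self):
        self.state = GsState.GDP_BOUND  # basal (unstimulated) condition

    def hormone_binds_receptor(self):
        # GDP is released during hormone binding and replaced with GTP
        self.state = GsState.GTP_BOUND

    def hydrolyze_gtp(self):
        # Intrinsic GTPase activity converts GTP to GDP, ending the signal
        self.state = GsState.GDP_BOUND

    @property
    def stimulates_adenylyl_cyclase(self):
        return self.state is GsState.GTP_BOUND


gs = GsProtein()
gs.hormone_binds_receptor()
print(gs.state.value, gs.stimulates_adenylyl_cyclase)
gs.hydrolyze_gtp()
print(gs.state.value, gs.stimulates_adenylyl_cyclase)
```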
Phospholipase C — Other G-proteins, such as Go, activate the phosphoinositide system when the receptor is bound to hormone. Phospholipase C (PLC) acts on inositol phospholipids found in the cell membrane. As an example, PLC can cause the hydrolysis of phosphatidylinositol 4,5-bisphosphate (PIP2) to 1,2-diacylglycerol and inositol 1,4,5-trisphosphate (IP3). Diacylglycerol and IP3 can then act as regulators of cell metabolism. This pathway can alter cell function by increasing intracellular calcium levels.
SIGNAL REGULATION AND TERMINATION — Even while signal transduction is occurring, processes begin that will terminate receptor responsiveness.
Desensitization — For the cell to respond to future stimuli, signaling must be terminated completely and in a timely fashion, a process known as desensitization. Desensitization begins within seconds to minutes of hormone binding and eventually results in signal termination [6].
Desensitization is the primary regulatory step that assures appropriate cell function. It involves the termination of receptor activation by receptor phosphorylation, which is initiated by specific G protein-coupled receptor kinases (GRKs) or second messenger-dependent kinases (eg, protein kinase A and protein kinase C).
Phosphorylation of receptors requires the recruitment of proteins to the hormone-receptor complex, which participate in regulating signaling. One of these is beta-arrestin, which is located in the cytoplasm of unstimulated cells [6]. Upon hormone receptor stimulation, beta-arrestin is translocated from the cytoplasm to the cell membrane and assists in signal termination and subsequent hormone-receptor internalization [6-8].
Internalization — Once the receptor is adequately phosphorylated, the hormone-receptor complex moves from the cell membrane to the inside of the cell, a process known as "internalization." Internalization, which may also involve beta-arrestins [8], permits receptor processing to occur, which will most likely result in receptor dephosphorylation, removal/degradation of the peptide hormone, and receptor degradation or recycling. Regardless of the eventual fate of the hormone-receptor complex, the goal is to reestablish cell responsiveness, so that the next hormone stimulus is capable of sending the necessary information into the cell.
Beta-arrestin — Arrestins are cytosolic proteins that are recruited to hormone-bound receptors and bind to cytoplasmic regions of the receptor [9]. Once bound to beta-arrestin, the hormone-receptor complex is "targeted" to a specific endocytic pathway that turns off the signaling process. Endocytosis is the process by which the hormone-occupied receptor is brought from the plasma membrane into the cell. The eventual fate of the receptor depends in part upon the receptor type. Some receptors are rapidly internalized and recycled back to the cell membrane, while others are destroyed and only newly produced receptors are expressed on the cell surface.
Non-G protein-coupled receptors
Receptor tyrosine kinases — Some peptides signal through receptors that are not linked to G proteins. One particular class of receptors possesses intrinsic protein tyrosine kinase activity. These receptors are composed of an extracellular domain that is usually glycosylated, a single transmembrane domain, and a cytoplasmic domain that contains a protein tyrosine kinase region and a region that is a substrate for peptide ligand-activated phosphorylation.
With peptide binding, these receptors either phosphorylate themselves or are phosphorylated by other protein kinases [10]. After activation, these receptors initiate other intracellular signal transduction pathways, including Ras, which activates MAP kinase. MAP kinase, in turn, modulates other cellular proteins, particularly transcription factors. Specific phosphorylated tyrosine residues are also binding sites for Src homology regions 2 and 3 (SH2 and SH3 domains) that can activate various signaling pathways [11].
Examples of the receptor tyrosine kinase family include receptors for epidermal growth factor, insulin, insulin-like growth factor, fibroblast growth factor, vascular endothelial growth factor, platelet-derived growth factor, nerve growth factor, and macrophage colony stimulating factor [12].
Receptor serine/threonine kinases — Receptor serine/threonine kinases such as TGF-beta receptors contain a single transmembrane domain. Stimulation of these receptors activates endogenous serine/threonine kinase activity, which modulates cellular protein function [13].
PATHOPHYSIOLOGIC RELEVANCE — Dysfunction of the control mechanisms of cellular signaling may lead to a number of pathophysiologic consequences [14]. Numerous receptor mutations have been identified that result in unregulated stimulation in the absence of hormone (constitutive activity). As examples, a constitutively active receptor has been found in thyroid adenomas producing clinical hyperthyroidism [15] and in precocious puberty secondary to a mutation in the luteinizing hormone receptor [16]. On the other hand, the McCune-Albright syndrome is due to postzygotic activating mutations in the gene encoding the G alpha s protein, resulting in activation of the signal-transduction pathway generating cyclic AMP [17-19]. The clinical manifestations include polyostotic fibrous dysplasia, cafe au lait spots, and hyperfunction of multiple glands that can lead to sexual precocity, Cushing's syndrome, acromegaly, hyperthyroidism, or hyperparathyroidism.

Overview of transcription factors

INTRODUCTION — Transcription is the process whereby the information in genomic DNA is transferred to RNA. The transcription of all genes requires the activity of critical core components that initiate the construction and elongation of RNA. Elements of this basic machinery include the transcription initiation complex and various transcription factors.
The transcription initiation complex consists of multiple molecules, including RNA polymerase II and the TATA binding factor. Most, but not all, genes have a TATA box located approximately 20 to 30 base pairs upstream of the transcription initiation site. This element helps specify the precise site at which transcription is initiated by binding the TATA binding factor. The exact sequence of the TATA box is variable. A number of related thymine and adenine rich sequences all confer TATA box function.
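A simple pattern search illustrates how a TATA-like element might be located relative to an initiation site. This is only a sketch: the pattern used here (TATA followed by A or T, then A, then A or T) is one commonly quoted consensus and is an assumption, as are the function name, the test sequence, and the convention of measuring distance from the first base of the match.

```python
import re

# Assumed consensus for illustration; real TATA boxes are more variable
TATA_LIKE = re.compile(r"TATA[AT]A[AT]")


def find_tata_boxes(sequence, tss_index, min_upstream=20, max_upstream=30):
    """Return (start, matched sequence, distance upstream) for TATA-like
    elements whose first base lies min_upstream to max_upstream bases
    before the transcription initiation site at tss_index (0-based)."""
    hits = []
    upstream = sequence.upper()[:tss_index]
    for match in TATA_LIKE.finditer(upstream):
        distance = tss_index - match.start()
        if min_upstream <= distance <= max_upstream:
            hits.append((match.start(), match.group(), distance))
    return hits


# Hypothetical sequence with a TATA-like element 25 bases before the start site
promoter = "G" * 50 + "TATAAAA" + "C" * 18 + "ATGGCC"
print(find_tata_boxes(promoter, tss_index=75))
```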
Transcription also requires various additional proteins, named transcription factors, that bind to specific recognition sequences close to the transcription initiation sites. This binding provides a mechanism for tissue- and stimulus-specific gene expression:
Tissue-specificity is determined in part by the profile of transcription factors expressed in a given cell type.
Stimulus-specificity is partly based upon the occupancy and activation of particular receptors, which leads to transcription factor-mediated alterations in gene expression.
Hundreds of transcription factors have been identified. These factors and their recognition elements are listed on several websites, including TRANSFAC, JASPAR, and TELIS [1-4]. These websites organize transcription factors based upon the presence of specific motifs, such as the following:
Leucine zippers (eg, a sequence consisting of a leucine residue at every seventh position)
Zinc fingers (eg, the presence of a number of residues, usually four cysteines, that coordinate one zinc ion)
Helix-loop-helix (eg, two potential alpha helices connected by a loop of variable length)
Alternatively, the proteins can be characterized based on DNA binding motifs and some databases allow searches based on user-specified features.
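The leucine zipper definition given above (a leucine at every seventh position) lends itself to a simple motif search. The sketch below is illustrative only; the function name, the minimum of four consecutive heptad leucines, and the test sequences are assumptions, and the databases named above use their own, more sophisticated criteria.

```python
def has_leucine_heptad(protein, min_repeats=4):
    """Return True if the one-letter protein sequence contains leucine (L)
    at every seventh position for at least min_repeats consecutive heptads."""
    seq = protein.upper()
    for start in range(len(seq)):
        count = 0
        position = start
        # Step through the sequence seven residues at a time
        while position < len(seq) and seq[position] == "L":
            count += 1
            position += 7
        if count >= min_repeats:
            return True
    return False


print(has_leucine_heptad("LAAAAAA" * 4))  # leucines spaced 7 apart
print(has_leucine_heptad("LAAAAA" * 4))   # leucines spaced 6 apart
```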
An encyclopedic compendium of transcription factors is beyond the scope of this topic review. Instead, the general properties of transcription factors will be presented, followed by a brief review of the specific characteristics of a few, and the consequences of mutations in transcription factors. A review of the basics of molecular biology is presented separately. (See "Overview of molecular biology").
OVERVIEW OF TRANSCRIPTION FACTOR FUNCTION
Properties — Certain characteristics are shared by nearly all transcription factors:
All transcription factors bind to short recognition sequences within DNA and interact with the proteins of the transcription machinery.
All transcription factors modulate the rate at which their target genes are transcribed.
Recent work has demonstrated that the genome contains fewer genes than previously believed and that many genes encode multiple alternative transcripts. Parallel work in stem cell biology and tissue regeneration has revealed the existence of molecular switches that serve to drive cells along various developmental trajectories. These discoveries highlight the importance of transcription factor function in determining cell fate and establishing differentiated expression patterns.
Transcription factors may be present in an inactive or active form. In the case of the nuclear hormone receptors, activation occurs after ligand binding. With other transcription factors, such as the signal transducers and activators of transcription (STATs), protein phosphorylation causes activation. Other transcription factors are constitutively active. As an example, Pit-1 is always active, although it is synthesized only in the pituitary gland, presumably as a result of control by other transcription factors [2].
Functions — A cell's developmental fate is a direct consequence of the specific receptors and transcription factors present within the cell when it is exposed to biologically active ligands. An enormous complexity of gene expression results from the combined actions of multiple ligands and receptors triggering downstream signaling via a large array of transcription factors.
The precise mechanisms by which transcription factors help regulate gene expression are not entirely known. A simplified view of this process is shown in the figure (show figure 1):
The transcription factor binds to specific sites in the genome via its DNA binding domain.
The transcription factor interacts (through additional domains) with other proteins that comprise the transcription initiation complex.
The protein-protein interaction between the transcription factor and the initiation complex changes the activity of the initiation complex, resulting in a modification of the rate of transcription.
Assays of function — A commonly used in vitro method to detect the binding of transcription factors to their DNA recognition sites is the "electrophoretic mobility shift assay" or "electrophoretic gel retardation assay" (show figure 2). This assay exploits the property that a DNA-transcription factor protein complex and DNA alone migrate differently on a separating gel matrix, thereby causing a shift or retardation in movement.
The electrophoretic mobility shift assay is used to understand the following properties of transcription factors:
What sequences are essential for transcription factor binding
Where transcription factor recognition sites are located within specific genes
What transcription factors bind to specific recognition sites
TRANSCRIPTION FACTOR MUTATIONS IN DISEASE — Mutations in specific transcription factors can lead to human disease and provide insight into the multi-faceted impact of these proteins. Examples include the transcription factors Runx2 and peroxisome proliferator-activated receptor gamma (PPARg).
Osteoblasts and adipocytes arise from a common precursor found in the bone marrow [5]. To illustrate the role of transcription factors in driving differentiation, this section will briefly review the roles of Runx2 and PPARg in driving the osteoblastic and adipocytic programs, respectively.
Cleidocranial dysplasia — Cleidocranial dysplasia was recognized as a clinical syndrome long before RUNX2 (also called OSF2 and CBFA1) was recognized as the gene mutated in the disorder. The cardinal clinical features of this disorder include delayed closure of cranial sutures, delayed tooth eruption, hypoplastic clavicles, short stature, scoliosis, and multiple additional skeletal abnormalities. Cleidocranial dysplasia is caused by a mutation in RUNX2, a member of the Runx family of transcription factors, located on chromosome 6p21. This single gene is responsible for the initial differentiation of osteoblasts to form skeletal structures [6,7]. (See "Normal skeletal development and regulation of bone formation and resorption").
Linkage mapping revealed heterozygous deletions including the RUNX2 locus in affected families, and sequencing revealed various loss-of-function mutations in additional disease kindreds [7]. A mouse Runx2 knockout leads to features reminiscent of human cleidocranial dysplasia when heterozygous, and lethality at birth when homozygous, with global ossification failure [8]. Both intramembranous and endochondral ossification are disrupted, and mature osteoblastic protein products are not present in the matrix. These findings show that Runx2 serves as a key developmental switch in the osteoblastic lineage.
Runx2 also impacts maturation of the osteoclast and chondrocyte lineages. Osteoclast development requires signals provided by both macrophage colony stimulating factor (M-CSF) and receptor activator of NFkappaB ligand (RANKL), the latter produced by osteoblasts. Osteoclast development is impaired in the Runx2 knockout, while in vitro osteoblast-osteoclast co-culture experiments with Runx2 overexpressing cells show increased RANKL expression and enhanced osteoclastogenesis [8,9]. The effects on osteoclastogenesis show that RANKL is expressed early in the differentiation of the osteoblastic lineage, upstream of another "master" osteoblastic developmental switch, the transcription factor Sp7 (also known as Osterix). Sp7 knockout mice express Runx2 and display global failure of skeletal development, but without impairment of osteoclast development [10]. Thus, while Runx2 gene expression is needed for osteoclastogenesis, Sp7 expression is not.
Indian hedgehog (IHH), a paracrine factor that regulates chondrocyte maturation in the growth plate, is also dependent on Runx2. In the absence of Runx2, IHH signaling is diminished, resulting in more rapid chondrocyte maturation and correspondingly decreased chondrocyte proliferation [11]. The cross-talk between Runx2 and IHH accounts for the shortened limb phenotype encountered in cleidocranial dysplasia.
Mutations in the PPAR gamma gene — PPARg is a member of the steroid hormone receptor superfamily and is active as a heterodimer with RXR. This receptor modulates a variety of interrelated processes, including adipogenesis (Tontonoz, Cell 1994), insulin sensitivity, lipid peroxidation, lipoprotein transport, and inflammatory cytokine release. It is the target of the currently used thiazolidinedione drugs rosiglitazone and pioglitazone, which act as PPARg agonists [12].
Two isoforms of PPARg are known, PPARg1 and PPARg2. PPARg2 includes an additional 28 N-terminal amino acid residues and is restricted in its expression to the adipocyte lineage [13]. Moreover, this isoform causes increased ligand-independent transcriptional activation [14]. In spite of a decade of intensive study, the natural ligands for PPARg remain incompletely known. Among the potential physiological ligands are 15-deoxy-delta12,14-prostaglandin J2 (the first identified ligand), an unidentified ligand induced during adipogenesis, and dietary lipids [15-18]. Importantly, PPARg can modulate transcription both in the presence and absence of bound ligand via its interactions with associated coactivator and corepressor proteins. In adipose tissue, PPARg leads to lipid trapping and promotion of the adipocytic differentiation program [19,20]. In liver, macrophages, and other tissues, PPARg activation leads to lipid oxidation.
Thiazolidinedione drugs are potent PPARg agonists [21]. Their therapeutic actions include increasing peripheral glucose disposal, leading to marked improvement in insulin sensitivity [22]. The improvement in insulin sensitivity is accompanied by an increase in the mass of adipose tissue and, in a sizable fraction of patients, by fluid retention. As noted above, both adipocytes and osteoblasts arise from a common precursor cell. Given the role of thiazolidinediones in promoting the adipocytic developmental program, it is not surprising that reduced bone formation has been observed in response to the drugs' administration in animal models [23-25]. However, whether similar adverse effects on bone mass occur in humans remains an open question [26,27].
Rare individuals have been reported with dominant-negative PPARg gene mutations. These patients manifest a syndrome that combines lipodystrophy with features of the metabolic syndrome, including insulin resistance, type 2 diabetes, hepatic steatosis, dyslipidemia, hypertension, and polycystic ovary syndrome in women [28-30]. A common P12A polymorphism is associated with type 2 diabetes risk, with the proline allele conferring a relative risk of 1.25 compared to the alanine allele [31]. (See "The metabolic syndrome (insulin resistance syndrome or syndrome X)").
SUMMARY — Transcription factors serve as molecular switches that allow modulation of gene expression and enable a multiplicity of cellular phenotypes to be generated from a limited number of genes. They are critical to both cellular development and responsiveness to physiological and pathological stimuli. Mutations in transcription factors have been identified as causes of syndromic human diseases.

Overview of transcription factors

Overview of molecular biology

INTRODUCTION — Every human began life as a single fertilized egg. This single cell contained all the necessary information to direct development of the various organs and tissues of the body, including germ cells. Understanding the basis of tissue diversity is an ongoing theme of biomedical research. Although many details of this process are unclear, the basic scheme of how tissue-specific functions are established is known.
The basics of molecular biology are reviewed here; this includes the relationship among DNA, RNA, and proteins, as well as the language used in describing molecular and cellular processes. The summarized material is essential to properly understand topics addressed elsewhere within UpToDate. (See "Polymerase chain reaction" and see "Repetitive DNA"). Readers who desire additional coverage of any topics discussed here may consult the cited literature, which was chosen to be both sufficiently detailed and written at an appropriate level for physicians.
CELLULAR DIVERSITY AND GENOMIC STABILITY — In general, all cells of an individual possess exactly the same genetic information; this is largely contained in nuclear DNA that is located within 46 discrete chromosomes (22 pairs of autosomes and 1 pair of sex chromosomes).
In contrast to the general constancy of DNA across tissues, the structure and function of different tissues are highly variable. As an example, cardiac muscle clearly differs from skin or liver. These differences in function are achieved by selective activation of genes in each cell type. The mechanisms by which this is accomplished are understood at the general level, but differ in their details in various tissues.
Some exceptions to the constancy of genetic information exist. As examples: Immunoglobulin and T cell receptor gene rearrangements occur in normal B and T cells, respectively. The degradation of genetic information, including mutations, chromosomal duplication or loss, and/or rearrangements, is a hallmark of neoplastic disease.
CENTRAL DOGMA OF MOLECULAR BIOLOGY — The discovery of the structure of DNA, RNA, and proteins, and of the genetic code provides the conceptual framework by which genetic stability and functional diversity are currently understood.
DNA is the information-storing molecule; each molecule is duplicated during the process of replication that accompanies each cell generation [1]. The process by which this information is transferred into cellular function begins when the genetic information residing in DNA is transcribed into messenger RNA (mRNA).
Messenger RNA is then used to direct the translation of genetic information to physiologically active proteins. This is performed by utilizing the information contained within the mRNA sequence to construct a unique polypeptide, which is defined by a linear chain of amino acids. Via the use of molecular machinery, the specific amino acids are placed within the polypeptide as directed by a template defined by unique triplets of bases (codons) found within the mRNA molecule. This process establishes the correspondence between DNA encoded information and its expression in protein via the universal genetic code [1].
Not all RNA molecules function as mRNA. Some act as components of ribosomes, others are involved in RNA splicing, and still others serve as transfer RNA. Finally, some double-stranded RNAs direct targeted degradation of homologous mRNA molecules, inhibiting their translation into proteins [2].
Exceptions to the central dogma of molecular biology are essential to understanding the biology of viruses, organelles, and bacterial second-site suppression: Many viruses store their genetic information as RNA rather than as DNA. Several different mechanisms have been incorporated into viral life cycles to accommodate this difference from the biology of their host cells. Retroviruses utilize reverse transcription to integrate into the host genome following infection [3]. The mitochondrial genetic code differs from the universal genetic code in that UGA (see below) is used as a tryptophan codon and not as a termination codon [4,5]. Alteration of the genetic code in bacteria is the mechanism by which suppression of nonsense mutants is achieved [6].
Essential elements of the central dogma are discussed in each of the following sections. Mechanistic details, however, are not provided.
STRUCTURE OF DNA AND TEMPLATE-DIRECTED NUCLEIC ACID SYNTHESIS — DNA is normally present as an antiparallel polymeric double helix composed of four nucleotide subunits. The nucleotide subunits contain the following bases: adenine (A), guanine (G), thymine (T), and cytosine (C).
The two strands of the double helix are held together by specific hydrogen bonds that form between A and T (2 hydrogen bonds) or between G and C (3 bonds).
A and G, the larger bases, are purines, while T and C, the smaller bases, are pyrimidines. Double-stranded DNA contains equimolar amounts of purine and pyrimidine. In addition, A and T are present in equimolar amounts, as are G and C [7]. The backbone of the DNA molecule is an alternating copolymer of deoxyribose (a 5-carbon sugar) and phosphate groups, with phosphodiester bonds linking the 5' and 3' carbons of adjacent deoxyribose units.
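The equimolar relationships described above follow directly from the pairing rules, and a short sketch can make this concrete. The Python below is illustrative only: the helper name `duplex_base_counts` and the example sequence are assumptions introduced here, not part of the cited literature.

```python
from collections import Counter

# Watson-Crick pairing rules: A pairs with T, G pairs with C.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def duplex_base_counts(strand: str) -> Counter:
    """Count bases across both strands of the duplex formed by `strand`
    and its base-paired partner (hypothetical helper for illustration)."""
    partner = "".join(PAIR[base] for base in strand)
    return Counter(strand + partner)

counts = duplex_base_counts("ATGCGGATTA")  # arbitrary example sequence
purines = counts["A"] + counts["G"]
pyrimidines = counts["T"] + counts["C"]

# Each comparison holds for any input, because every base on one strand
# is matched by its partner on the other strand.
print(counts["A"] == counts["T"], counts["G"] == counts["C"], purines == pyrimidines)
```

The equalities hold only for the duplex as a whole; a single strand taken alone need not contain equal amounts of A and T or of G and C.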
The hydrogen bonding between the complementary base pairs A and T or G and C provides the chemical basis for DNA's function as the storage medium for genetic information. The genetic information is encoded as the sequence of bases along a DNA strand and is read in the 5' to 3' direction. Since base pairing is specific, it follows that the opposite strands of the molecule carry redundant information, although their sequences are not identical. As a result, given the sequence of a single DNA strand, it is a simple exercise to write down the sequence of its complementary strand.
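The exercise of writing down a complementary strand can be sketched in a few lines of Python. This is a minimal illustration, assuming an input containing only the letters A, T, G, and C; the function name `reverse_complement` is chosen here for convenience. Because the strands are antiparallel, the complement is reversed so that the result is also written 5' to 3'.

```python
# Watson-Crick pairing rules: A pairs with T, G pairs with C.
COMPLEMENT = str.maketrans("ATGC", "TACG")

def reverse_complement(strand: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    # Complement each base, then reverse to account for antiparallel strands.
    return strand.upper().translate(COMPLEMENT)[::-1]

print(reverse_complement("GATTACA"))  # TGTAATC
```

Applying the function twice returns the original sequence, which is one way of seeing that the two strands carry redundant information.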
The weakness of hydrogen bonds, each of which has a strength of approximately 2 kcal/mol, is an important feature with regard to nucleic acid function. This weakness allows denaturation, or separation of the DNA strands, to occur at physiologic temperatures. Separation of the complementary DNA strands, followed by synthesis of new strands whose sequences are directed by the original strands acting as templates, allows for accurate copying of sequence information.
The website www.accessexcellence.org/RC/VL/GG/dna_replicating.html contains a picture of DNA replicating itself. In general, DNA replication is semi-conservative, in that each daughter molecule contains one old and one newly synthesized strand.
During the S phase of each cell cycle, DNA is replicated by DNA polymerases to provide each daughter cell with a complete genome. The genome is the total genetic complement of an organism. Regulation of the cell cycle and consequences of improper cell cycle regulation are reviewed separately.
There are several important structural differences between DNA and RNA. In general, DNA is double-stranded and RNA is single-stranded. In DNA, the sugar is deoxyribose, while it is ribose in RNA. In DNA, thymine is the pyrimidine complementary to adenine, but uracil replaces thymine in RNA.
Transcription — Template-directed synthesis, in which one strand of DNA provides sequence information, is used in both DNA replication and transcription of DNA to form mRNA. However, some DNA sequences do not encode protein. Some DNA sequence elements provide control information; they specify the location of an active gene or allow the binding of transcription factors that modulate the rate at which a gene is transcribed. (See "Overview of transcription factors"). Coding regions of a gene's DNA sequence are characteristically interrupted by introns, which are noncoding intervening sequences that are spliced out of (removed from) the primary transcript and are therefore absent from mature mRNA. Intron boundaries are marked by splice donor and splice acceptor sites, which provide sequence recognition sites for the spliceosomes; spliceosomes are the enzymatic ribonucleoprotein complexes that remove introns from the primary transcript to produce mature mRNA. The 3' end of an mRNA molecule is a tail of adenines that are not present in the DNA, but are added when the transcriptional machinery recognizes a polyadenylation site.
The transcription initiation complex and various transcription factors recognize sequence signals present in the DNA to identify the presence of an active gene. Local denaturation of the DNA allows RNA polymerase to synthesize an mRNA molecule using one strand of the DNA (the template strand) as a template; the resulting RNA is complementary to the template strand and therefore matches the sequence of the opposite (coding) strand, with uracil in place of thymine. The primary transcript synthesized in this step is spliced and polyadenylated to yield a mature mRNA molecule.
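Template-directed RNA synthesis can be illustrated with a short sketch. The Python below assumes the usual convention that the transcript is complementary and antiparallel to the DNA strand that is copied, so that it matches the opposite (coding) strand with U in place of T. The function name `transcribe` and the example sequences are hypothetical, and splicing and polyadenylation are deliberately left out.

```python
# DNA template base -> RNA base paired with it (uracil replaces thymine in RNA).
DNA_TO_RNA = str.maketrans("ATGC", "UACG")

def transcribe(template_5to3: str) -> str:
    """Return the RNA (5' to 3') copied from a DNA template strand given 5' to 3'."""
    # The RNA is antiparallel to its template, so complement and then reverse.
    return template_5to3.upper().translate(DNA_TO_RNA)[::-1]

coding = "ATGGCCTAA"    # hypothetical coding (sense) strand, 5' to 3'
template = "TTAGGCCAT"  # its complementary template strand, 5' to 3'

mrna = transcribe(template)
print(mrna)                               # AUGGCCUAA
print(mrna == coding.replace("T", "U"))   # True
```

The second line of output shows why gene sequences are conventionally reported as the coding strand: it can be read directly as the mRNA once T is replaced by U.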
There are three major RNA polymerases present in mammalian cells. The outline of transcription given above is applicable to RNA polymerase 2, the polymerase that is responsible for the expression of most genes. RNA polymerase 1 functions primarily to transcribe ribosomal RNA. RNA polymerase 3 functions primarily to transcribe a variety of small RNAs, such as tRNAs (transfer RNAs) and the RNA components of the spliceosomes.
GENETIC CODE AND TRANSLATION — Mature mRNA leaves the nucleus and reaches the ribosomes, where its sequence is recognized and used to direct the synthesis of a polypeptide chain. The ribosomes are complex ribonucleoprotein structures that include the enzymatic machinery for protein synthesis. Protein synthesis is template directed, with the mRNA's sequence information being used to specify the protein's amino acid sequence.
There is a complication concerning the relationship between mRNA and proteins: RNA contains 4 bases while proteins may contain up to 20 different amino acids (if amino acid modification is excluded). To overcome this numerical difference, the genetic code establishes a correspondence between specific triplets of bases (codons) and specific amino acids. However, since there are 64 ways to combine three bases, the code is redundant: some amino acids are encoded by more than one codon. In addition, there are three codons (UAG, UGA, UAA) that do not encode amino acids; instead, they specify the end of a polypeptide chain.
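The arithmetic of the code can be checked with a short sketch. The Python below builds the standard (universal) genetic code from a compact 64-letter string of one-letter amino acid abbreviations listed in UCAG order, with `*` marking a termination codon. The variable names are chosen here for illustration, and the table does not include the mitochondrial variant mentioned earlier.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order; "*" marks a termination codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Pair every triplet (UUU, UUC, UUA, ...) with its one-letter amino acid.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
codons_per_amino_acid = Counter(aa for aa in CODON_TABLE.values() if aa != "*")

print(len(CODON_TABLE))            # 64 possible triplets (4 x 4 x 4)
print(stop_codons)                 # ['UAA', 'UAG', 'UGA']
print(len(codons_per_amino_acid))  # 20 amino acids
print(codons_per_amino_acid["L"], codons_per_amino_acid["W"])  # 6 1
```

The last line illustrates the redundancy: leucine (L) is specified by six codons, whereas tryptophan (W) is specified by only one.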
By convention, the codons are given as the sequence of mRNA, not the sequence of the complementary DNA strand. The nucleic acid sequence is given 5' to 3' and the protein sequence is given N-terminal to C-terminal. These directions correspond to the direction of synthesis.
To synthesize proteins, the following processes must occur sequentially. Beginning with the start codon, each codon is held at the synthetic site in the ribosome; at this site, an amino-acid-charged transfer RNA (tRNA) molecule containing a complementary anticodon base pairs with the mRNA. The carried amino acid is subsequently added to the nascent polypeptide chain. Once the amino acid is added, the ribosome moves processively codon by codon along the mRNA strand, adding an additional amino acid to the polypeptide at each step as dictated by the unique codon. When the ribosome encounters a chain termination, or nonsense, codon it releases both the mRNA and the newly synthesized protein.
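The codon-by-codon reading described above can be sketched as a simple loop. The Python below is a deliberately simplified model, assuming that translation starts at the first AUG in the sequence and stops at the first in-frame termination codon; it ignores tRNA charging, ribosome assembly, and all regulatory detail. The function name `translate` and the example mRNA are hypothetical.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order; "*" marks a termination codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(mrna: str) -> str:
    """Translate from the first AUG to the first in-frame termination codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is synthesized in this model
    protein = []
    # Step along the mRNA one codon (three bases) at a time.
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break  # termination codon: release the polypeptide
        protein.append(amino_acid)
    return "".join(protein)

print(translate("GGAUGGCCUGGUAAGC"))  # MAW
```

In the example, the codons AUG, GCC, and UGG are read in turn (methionine, alanine, tryptophan) before UAA ends the chain; the bases on either side of this reading frame are not translated.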
The processive movement of the ribosome along the mRNA molecule allows multiple ribosomes to simultaneously synthesize multiple copies of a protein from a single mRNA molecule. This is observed microscopically by the presence of polyribosomes or polysomes, which are tight spatial arrays of ribosomes translating a single mRNA molecule.
IMPLICATIONS OF THE CENTRAL DOGMA TO MEDICINE — The importance of specific base pairing for the development of a normal organism and/or the maintenance of health cannot be overstated. The mechanisms of template-directed replication and transcription allow preservation of genetic information and its use to encode functional proteins.
Errors in these processes as well as intrinsic properties of this molecular machinery have direct implications for the practice of medicine. As examples: Errors in replication account for mutations that cause a wide array of diseases, including inherited disorders and malignancies. Divergence between humans and bacteria in the enzymatic machinery that carries out the functions of transcription and translation provides the molecular targets for an array of antibiotics that are lethal to bacteria but relatively harmless to humans.
Specific base pairing is also central to many routine laboratory methods. Hybridization of DNA or RNA to labeled probes is accomplished by denaturing a specimen and allowing the probe to base pair with the resulting single-stranded nucleic acid. The polymerase chain reaction (PCR) uses cycles of hybridization followed by template-directed DNA synthesis to produce multiple copies of a defined sequence. Recently introduced chip [8-12] and chromosome painting technologies [13,14] are simply refinements of the basic chemistry of hybridization. As these techniques become integrated into clinical laboratory practice, it is useful for practicing physicians to understand the principles that underlie them.
OTHER LITERATURE AND INFORMATION SOURCES — This topic review provides only a superficial overview of a vast literature. More complete accounts of this material are available from the following books and web sites: Molecular Cell Biology, 5th edition, by Matthew P Scott, Paul Matsudaira, Harvey Lodish, James Darnell, Lawrence Zipursky, Chris A Kaiser, Arnold Berk, Monty Krieger. W.H. Freeman and Company, 2003, ISBN 0716743663. Molecular Biology of the Cell, 4th edition, by Bruce Alberts, Alexander Johnson, Julian Lewis, Martin Raff, Keith Roberts, Peter Walter. Garland Publishing, 2002, ISBN 0815332181. A Genetic Switch: Phage Lambda Revisited, 3rd edition, by Mark Ptashne. Cold Spring Harbor Laboratory Press, 2004, ISBN 0879697164. The Massachusetts Institute of Technology's Experimental Study Group has an on-line Biology Hypertextbook, which includes outstanding text and images. It is located at: web.mit.edu/esgbio/www. The National Health Museum hosts the Access Excellence program, which was initiated by Genentech to provide educational tools for biology and genetic engineering. Its Graphics Gallery includes multiple high-quality images and can be accessed at: www.accessexcellence.org/AB/GG

Sunday, February 3, 2008