Of the 4,000 or so genes in the typical bacterial genome, or the 20,000 genes in the human genome, only a fraction are expressed in a cell at any given time. Some gene products are present in very large amounts: the elongation factors required for protein synthesis, for example, are among the most abundant proteins in bacteria, and ribulose 1,5-bisphosphate carboxylase/oxygenase (rubisco) of plants and photosynthetic bacteria is one of the most abundant enzymes in the biosphere. Other gene products occur in much smaller amounts; for instance, a cell may contain only a few molecules of the enzymes that repair rare DNA lesions.

Requirements for some gene products change over time. The need for enzymes in certain metabolic pathways may wax and wane as food sources change or are depleted.

During development of a multicellular organism, some proteins that influence cellular differentiation are present for just a brief time in only a few cells. Specialization of cellular function can greatly affect the need for various gene products; an example is the uniquely high concentration of a single protein—hemoglobin—in erythrocytes. Given the high cost of protein synthesis, regulation of gene expression is essential to making optimal use of available energy.

The cellular concentration of a protein is determined by a delicate balance of at least seven processes, each having several potential points of regulation:

  1. Synthesis of the primary RNA transcript (transcription)
  2. Posttranscriptional modification of mRNA
  3. Degradation of mRNA
  4. Protein synthesis (translation)
  5. Posttranslational modification of proteins
  6. Protein targeting and transport
  7. Degradation of protein
FIGURE DEPICTING :- Various stages for regulation of protein synthesis
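To make the idea of a "balance" concrete, here is a minimal sketch of a two-stage steady-state model. It covers only four of the seven processes listed above (transcription, mRNA degradation, translation, and protein degradation) and treats each as a single first-order rate. The function name and all rate constants are hypothetical values chosen for illustration, not measurements.

```python
def steady_state_levels(k_transcription, k_mrna_decay, k_translation, k_protein_decay):
    """Return (mRNA, protein) steady-state copy numbers for a simple two-stage model.

    Assumed model (illustrative only):
        d[mRNA]/dt    = k_transcription - k_mrna_decay * [mRNA]
        d[protein]/dt = k_translation * [mRNA] - k_protein_decay * [protein]
    Setting both derivatives to zero gives the expressions below.
    """
    mrna = k_transcription / k_mrna_decay
    protein = k_translation * mrna / k_protein_decay
    return mrna, protein


# Hypothetical rate constants (per minute), for illustration only.
baseline = steady_state_levels(2.0, 0.2, 5.0, 0.05)
more_transcription = steady_state_levels(4.0, 0.2, 5.0, 0.05)
slower_protein_decay = steady_state_levels(2.0, 0.2, 5.0, 0.025)

print("baseline (mRNA, protein):       ", baseline)
print("doubled transcription:          ", more_transcription)
print("halved protein degradation rate:", slower_protein_decay)
```

In this simplified picture, doubling the transcription rate and halving the protein degradation rate raise the protein level by the same factor, which is one way to see why several different steps can serve as points of regulation.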


Specificity factors alter the specificity of RNA polymerase for a given promoter or set of promoters. In E. coli, for example, heat stress leads to replacement of the usual σ70 subunit of RNA polymerase by σ32. When bound to σ32, RNA polymerase is directed to a specialized set of promoters with a different consensus sequence. These promoters control the expression of a set of genes that encode proteins, including some protein chaperones, that are part of a stress-induced system called the heat shock response. Thus, through changes in the binding affinity of the polymerase that direct the enzyme to different promoters, a set of genes involved in related processes is coordinately regulated. In eukaryotic cells, some of the general transcription factors, in particular the TATA binding protein, may be considered specificity factors.

Repressors bind to specific sites on the DNA. In bacterial cells, such binding sites, called operators, are generally near a promoter.

RNA polymerase binding, or its movement along the DNA after binding, is blocked when the repressor is present. Regulation by means of a repressor protein that blocks transcription is referred to as negative regulation. Repressor binding to DNA is regulated by a molecular signal, or effector, usually a small molecule or a protein that binds to the repressor and causes a conformational change. The interaction between repressor and signal molecule either increases or decreases transcription. In some cases, the conformational change results in dissociation of a DNA-bound repressor from the operator. Transcription initiation can then proceed unhindered. In other cases, interaction between an inactive repressor and the signal molecule causes the repressor to bind to the operator.

Activators provide a molecular counterpoint to repressors; they bind to DNA and enhance the activity of RNA polymerase at a promoter; this is positive regulation. In bacteria, activator-binding sites are often adjacent to promoters that are bound weakly or not at all by RNA polymerase alone, such that little transcription occurs in the absence of the activator. Some activators are usually bound to DNA, enhancing transcription until dissociation of the activator is triggered by the binding of a signal molecule. In other cases the activator binds to DNA only after interaction with a signal molecule. Signal molecules can therefore increase or decrease transcription, depending on how they affect the activator.
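The four combinations described above (repressor or activator, with a signal that either releases the regulator from DNA or is required for it to bind) can be summarized as a small truth table. The sketch below is a deliberately simplified on/off model; the function name and string labels are hypothetical and chosen only for this illustration.

```python
def transcription_active(regulator: str, signal_causes: str, signal_present: bool) -> bool:
    """Return True if transcription is expected to be on in this on/off model.

    regulator:     "repressor" (negative regulation) or "activator" (positive regulation)
    signal_causes: "dissociation" if the signal releases the regulator from DNA,
                   "binding" if the signal is needed for the regulator to bind DNA
    """
    if regulator not in ("repressor", "activator"):
        raise ValueError(f"unknown regulator: {regulator!r}")
    if signal_causes not in ("dissociation", "binding"):
        raise ValueError(f"unknown signal effect: {signal_causes!r}")

    # First decide whether the regulator is sitting on the DNA.
    if signal_causes == "dissociation":
        regulator_on_dna = not signal_present
    else:
        regulator_on_dna = signal_present

    # A bound repressor blocks RNA polymerase; a bound activator assists it.
    if regulator == "activator":
        return regulator_on_dna
    return not regulator_on_dna


for regulator in ("repressor", "activator"):
    for signal_causes in ("dissociation", "binding"):
        for signal_present in (False, True):
            on = transcription_active(regulator, signal_causes, signal_present)
            print(f"{regulator:9} | signal causes {signal_causes:12} | "
                  f"signal present: {signal_present!s:5} | transcription {'on' if on else 'off'}")
```

Real promoters are not strictly binary, so this table is best read as a memory aid for the logic rather than a quantitative model.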



Bacteria have a simple general mechanism for coordinating the regulation of multiple genes: these genes are clustered on the chromosome and are transcribed together. Many bacterial mRNAs are polycistronic—multiple genes on a single transcript—and the single promoter that initiates transcription of the cluster is the site of regulation for expression of all the genes in the cluster. The gene cluster and promoter, plus additional sequences that function together in regulation, are called an operon.


The lactose (lac) operon includes the genes for β-galactosidase (Z), galactoside permease (Y), and thiogalactoside transacetylase (A). The last of these enzymes seems to modify toxic galactosides to facilitate their removal from the cell. Each of the three genes is preceded by a ribosome-binding site.

When cells are provided with lactose, the lac operon is induced. An inducer (signal) molecule binds to a specific site on the Lac repressor, causing a conformational change that results in dissociation of the repressor from the operator. The inducer in the lac operon system is not lactose itself but allolactose, an isomer of lactose. After entry into the E. coli cell (via the few existing molecules of lactose permease), lactose is converted to allolactose by one of the few existing β-galactosidase molecules. Release of the operator by Lac repressor, triggered as the repressor binds to allolactose, allows expression of the lac operon genes and leads to a 10³-fold increase in the concentration of β-galactosidase.

The lac operator is a short region of DNA that interacts with a regulatory protein, the Lac repressor, which negatively controls transcription of the operon.

A structural gene is simply any gene that codes for a protein. Lactose uptake and degradation are mediated by the products of three structural genes. The roles of the three structural genes are as follows:

lacZ codes for β-galactosidase, which cleaves lactose to yield glucose and galactose.

lacY codes for β-galactoside permease, which transports lactose into the cell.

lacA codes for β-galactoside acetyltransferase (thiogalactoside transacetylase). This enzyme is not essential for lactose metabolism but appears to play a role in the detoxification of compounds by transferring an acetyl group to them.

A regulatory mechanism known as catabolite repression restricts expression of the genes required for catabolism of lactose, arabinose, and other sugars in the presence of glucose, even when these secondary sugars are also present. The effect of glucose is mediated by cAMP, as a coactivator, and an activator protein known as cAMP receptor protein, or CRP (the protein is sometimes called CAP, for catabolite gene activator protein).

CRP binds to DNA through a helix-turn-helix motif in its DNA-binding domain. When glucose is absent, CRP-cAMP binds to a site near the lac promoter and stimulates RNA transcription 50-fold. CRP-cAMP is therefore a positive regulatory element responsive to glucose levels, whereas the Lac repressor is a negative regulatory element responsive to lactose. The two act in concert. CRP-cAMP has little effect on the lac operon when the Lac repressor is blocking transcription, and dissociation of the repressor from the lac operator has little effect on transcription of the lac operon unless CRP-cAMP is present to facilitate transcription; when CRP is not bound, the wild-type lac promoter is a relatively weak promoter.

CRP binds to DNA most avidly when cAMP concentrations are high. In the presence of glucose, the synthesis of cAMP is inhibited and efflux of cAMP from the cell is stimulated. As [cAMP] declines, CRP binding to DNA declines, thereby decreasing the expression of the lac operon. Strong induction of the lac operon therefore requires both lactose (to inactivate the lac repressor) and a lowered concentration of glucose (to trigger an increase in [cAMP] and increased binding of cAMP to CRP).
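The combined effect of the two regulators can be written as a short decision function. This is a qualitative sketch only: the function name and the three output labels ("off", "low", "high") are hypothetical, and the model ignores intermediate sugar concentrations.

```python
def lac_operon_expression(lactose_present: bool, glucose_present: bool) -> str:
    """Qualitative expression level of the lac operon ("off", "low", or "high")."""
    # Allolactose (made from lactose) releases the Lac repressor from the operator.
    repressor_on_operator = not lactose_present
    # Glucose lowers [cAMP], so CRP-cAMP occupies its site only when glucose is absent.
    crp_camp_bound = not glucose_present

    if repressor_on_operator:
        # Negative regulation dominates: CRP-cAMP has little effect here.
        return "off"
    if crp_camp_bound:
        # Repressor released and activator bound: strong induction.
        return "high"
    # Repressor released but no activator: the weak promoter gives little transcription.
    return "low"


for lactose in (False, True):
    for glucose in (False, True):
        level = lac_operon_expression(lactose, glucose)
        print(f"lactose={lactose!s:5} glucose={glucose!s:5} -> {level}")
```

Only one of the four sugar combinations (lactose present, glucose absent) gives strong induction, which matches the requirement stated above.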

FIGURE DEPICTING Positive regulation by CRP


The 20 common amino acids are required in large amounts for protein synthesis, and E. coli can synthesize all of them. The genes for the enzymes needed to synthesize a given amino acid are generally clustered in an operon and are expressed whenever existing supplies of that amino acid are inadequate for cellular requirements. When the amino acid is abundant, the biosynthetic enzymes are not needed and the operon is repressed.

The E. coli tryptophan (trp) operon includes five genes for the enzymes required to convert chorismate to tryptophan. When tryptophan is abundant, it binds to the Trp repressor, causing a conformational change that permits the repressor to bind to the trp operator and inhibit expression of the trp operon. The trp operator site overlaps the promoter, so binding of the repressor blocks binding of RNA polymerase.

Once again, this simple on/off circuit mediated by a repressor is not the entire regulatory story. Once repression is lifted and transcription begins, the rate of transcription is fine-tuned to cellular tryptophan requirements by a second regulatory process, called transcription attenuation, in which transcription is initiated normally but is abruptly halted before the operon genes are transcribed.

The frequency with which transcription is attenuated is regulated by the availability of tryptophan and relies on the very close coupling of transcription and translation in bacteria. The trp operon attenuation mechanism uses signals encoded in four sequences within a 162-nucleotide leader region at the 5′ end of the mRNA, preceding the initiation codon of the first gene.

 The leader contains a region known as the attenuator, made up of sequences 3 and 4. These sequences base-pair to form a G≡C-rich stem-and-loop structure closely followed by a series of U residues. The attenuator structure acts as a transcription terminator.

 Sequence 2 is an alternative complement for sequence 3. If sequences 2 and 3 base-pair, the attenuator structure cannot form and transcription continues into the trp biosynthetic genes; the loop formed by the pairing of sequences 2 and 3 does not obstruct transcription. Regulatory sequence 1 is crucial for a tryptophan-sensitive mechanism that determines whether sequence 3 pairs with sequence 2 (allowing transcription to continue) or with sequence 4 (attenuating transcription).

Formation of the attenuator stem-and-loop structure depends on events that occur during translation of regulatory sequence 1, which encodes a leader peptide (so called because it is encoded by the leader region of the mRNA) of 14 amino acids, two of which are Trp residues. The leader peptide has no other known cellular function; its synthesis is simply an operon regulatory device. This peptide is translated immediately after it is transcribed, by a ribosome that follows closely behind RNA polymerase as transcription proceeds.

When tryptophan concentrations are high, concentrations of charged tryptophan tRNA (Trp-tRNA^Trp) are also high. This allows translation to proceed rapidly past the two Trp codons of sequence 1 and into sequence 2, before sequence 3 is synthesized by RNA polymerase. In this situation, sequence 2 is covered by the ribosome and unavailable for pairing to sequence 3 when sequence 3 is synthesized; the attenuator structure (sequences 3 and 4) forms and transcription halts.

When tryptophan concentrations are low, however, the ribosome stalls at the two Trp codons in sequence 1, because charged tRNA^Trp is less available. Sequence 2 remains free while sequence 3 is synthesized, allowing these two sequences to base-pair and permitting transcription to proceed. In this way, the proportion of transcripts that are attenuated declines as tryptophan concentration declines.
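The two outcomes for a single leader transcript can be laid out as a small lookup. The sketch below is a simplification: it treats attenuation as all-or-nothing for one transcript, whereas in the cell it is the proportion of attenuated transcripts that shifts with tryptophan supply. The class and function names are hypothetical.

```python
from typing import NamedTuple


class LeaderOutcome(NamedTuple):
    ribosome_position: str
    hairpin_formed: str
    transcription_continues: bool


def trp_attenuation(charged_trp_trna_abundant: bool) -> LeaderOutcome:
    """Outcome for one trp leader transcript in a simplified all-or-nothing model."""
    if charged_trp_trna_abundant:
        # The ribosome moves quickly past the Trp codons and covers sequence 2,
        # so sequence 3 can only pair with sequence 4 (the attenuator).
        return LeaderOutcome(
            ribosome_position="covering sequence 2",
            hairpin_formed="3:4 (attenuator)",
            transcription_continues=False,
        )
    # The ribosome stalls in sequence 1, leaving sequence 2 free to pair with 3.
    return LeaderOutcome(
        ribosome_position="stalled at Trp codons in sequence 1",
        hairpin_formed="2:3",
        transcription_continues=True,
    )


for abundant in (True, False):
    print(f"charged tRNA(Trp) abundant={abundant}: {trp_attenuation(abundant)}")
```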

FIGURE DEPICTING The tryptophan operon


Regulatory proteins generally bind to specific DNA sequences. Their affinity for these target sequences is roughly 10⁴ to 10⁶ times higher than their affinity for any other DNA sequence. Most regulatory proteins have discrete DNA-binding domains containing substructures that interact closely and specifically with the DNA. These binding domains usually include one or more of a relatively small group of recognizable and characteristic structural motifs. To bind specifically to DNA sequences, regulatory proteins must recognize surface features on the DNA.
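One way to get a feel for an affinity ratio of 10^4 to 10^6 is to convert it into a difference in binding free energy using the standard relation ΔΔG = RT ln(ratio). The short calculation below assumes a temperature of 25 °C (298 K); that temperature and the function name are assumptions made for this illustration.

```python
import math

R = 8.314   # gas constant, J/(mol*K)
T = 298.0   # assumed temperature in kelvin (25 °C)


def extra_binding_energy_kj(affinity_ratio: float, temperature: float = T) -> float:
    """Free-energy difference (kJ/mol) corresponding to a ratio of binding constants."""
    return R * temperature * math.log(affinity_ratio) / 1000.0


for ratio in (1e4, 1e6):
    energy = extra_binding_energy_kj(ratio)
    print(f"{ratio:.0e}-fold higher affinity corresponds to about {energy:.1f} kJ/mol")
```

Under these assumptions the specificity amounts to roughly 23 to 34 kJ/mol of additional binding energy at the target site.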

To interact with bases in the major groove of DNA, a protein requires a relatively small substructure that can stably protrude from the protein surface. The DNA-binding domains of regulatory proteins tend to be small (60 to 90 amino acid residues), and the structural motifs within these domains that are actually in contact with the DNA are smaller still. Many small proteins are unstable because of their limited capacity to form layers of structure to bury hydrophobic groups.

The DNA-binding motifs provide either a very compact stable structure or a way of allowing a segment of protein to protrude from the protein surface. The DNA-binding sites for regulatory proteins are often inverted repeats of a short DNA sequence (a palindrome) at which multiple (usually two) subunits of a regulatory protein bind cooperatively. The Lac repressor is unusual in that it functions as a tetramer, with two dimers tethered together at the end distant from the DNA-binding sites.
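An inverted repeat in double-stranded DNA reads the same 5′ to 3′ on both strands, so a sequence is palindromic in this sense when it equals its own reverse complement. The sketch below checks that property. The example sequences are made up for illustration and are not taken from any particular operator; the function names are hypothetical.

```python
# Translation table for complementary bases; other characters (such as N) pass through unchanged.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of a DNA sequence (5' to 3')."""
    return sequence.upper().translate(COMPLEMENT)[::-1]


def is_inverted_repeat(sequence: str) -> bool:
    """True if the sequence equals its own reverse complement (a perfect palindrome)."""
    return sequence.upper() == reverse_complement(sequence)


examples = [
    "GAATTC",            # perfect six-base palindrome
    "GAATTA",            # one base changed, no longer palindromic
    "TGTGANNNNNNTCACA",  # two half-sites separated by an unspecified spacer (N = any base)
]
for seq in examples:
    print(f"{seq:18} reverse complement {reverse_complement(seq):18} "
          f"inverted repeat: {is_inverted_repeat(seq)}")
```

Each half of such a site can be contacted by one subunit of a dimeric regulatory protein. Natural binding sites are often imperfect palindromes, which a strict equality test like this one would reject.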

Several DNA-binding motifs have been described, but here we focus on a few that play prominent roles in the binding of DNA by regulatory proteins from all domains of life.

Helix-Turn-Helix. The helix-turn-helix motif is crucial to the interaction of many regulatory proteins with DNA in bacteria, and similar motifs occur in some eukaryotic regulatory proteins. The helix-turn-helix comprises about 20 amino acid residues in two short α-helical segments, each 7 to 9 residues long, separated by a β turn. This structure generally is not stable by itself; it is simply the reactive portion of a somewhat larger DNA-binding domain. One of the two α-helical segments is called the recognition helix, because it usually contains many of the amino acids that interact with DNA in a sequence-specific way.

This α helix is stacked on other segments of the protein structure so that it protrudes from the protein surface. When bound to DNA, the recognition helix is positioned in or nearly in the major groove. The Lac repressor has this DNA-binding motif.

Zinc Finger. In a zinc finger, about 30 amino acid residues form an elongated loop held together at the base by a single Zn²⁺ ion, which is coordinated to four of the residues (four Cys, or two Cys and two His). The zinc does not itself interact with DNA; rather, the coordination of zinc with the amino acid residues stabilizes this small structural motif. Several hydrophobic side chains in the core of the structure also lend stability. One well-characterized example is the interaction between DNA and three zinc fingers of a single polypeptide from the mouse regulatory protein Zif268.

Homeodomain. Another type of DNA-binding domain has been identified in some proteins that function as transcriptional regulators, especially during eukaryotic development. This domain of 60 amino acid residues is called the homeodomain, because it was discovered in homeotic genes (genes that regulate the development of body patterns). It is highly conserved and has now been identified in proteins from a wide variety of organisms, including humans.

The DNA-binding segment of the domain is related to the helix-turn-helix motif. The DNA sequence that encodes this domain is known as the homeobox.

RNA Recognition Motif. An RNA-binding domain is not out of place in this discussion. RNA recognition motifs (RRMs) are found in some eukaryotic gene activators, where they may do double duty in binding DNA and RNA. When bound to specific binding sites in DNA, these activators induce transcription. The same activators are sometimes regulated in part by specific lncRNAs that compete with DNA binding and decrease gene transcription.

Other proteins with RRM motifs bind to mRNA, rRNA, or any of a range of other smaller, noncoding RNAs.

This motif may be present as part of DNA-binding regulatory proteins that also have other DNA-binding motifs, or may occur in proteins that bind uniquely to RNA.

Leucine Zipper. The leucine zipper is an amphipathic α helix with a series of hydrophobic amino acid residues concentrated on one side, with the hydrophobic surface forming the area of contact between the two polypeptides of a dimer. A striking feature of these α helices is the occurrence of Leu residues at every seventh position, forming a straight line along the hydrophobic surface. Although researchers initially thought the Leu residues interdigitated (hence the name “zipper”), we now know that they line up side by side as the interacting α helices coil around each other.

Regulatory proteins with leucine zippers often have a separate DNA-binding domain with a high concentration of basic (Lys or Arg) residues that can interact with the negatively charged phosphates of the DNA backbone. Leucine zippers have been found in many eukaryotic and a few bacterial proteins.
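The "Leu at every seventh position" pattern is simple enough to scan for in a protein sequence written in one-letter code. The sketch below is a naive pattern check, not a structure predictor: the function name, the threshold of four consecutive heptad leucines, and both test sequences are hypothetical choices made for illustration.

```python
def has_leucine_heptad(sequence: str, min_repeats: int = 4) -> bool:
    """True if Leu (L) occurs at every 7th position for at least min_repeats in a row.

    All seven possible starting offsets are tried, because the repeat can begin
    anywhere in the sequence.
    """
    seq = sequence.upper()
    for offset in range(7):
        run = 0
        for index in range(offset, len(seq), 7):
            if seq[index] == "L":
                run += 1
                if run >= min_repeats:
                    return True
            else:
                run = 0  # the heptad pattern was interrupted
    return False


# Made-up sequences: Leu every 7 residues versus Leu every 8 residues.
heptad_like = "LAEKVRQ" * 4
out_of_register = "LAEKVRQA" * 4

print(heptad_like, has_leucine_heptad(heptad_like))
print(out_of_register, has_leucine_heptad(out_of_register))
```

A positive result here only flags a candidate; many sequences with this spacing do not form leucine zippers, and confirming one requires structural or experimental evidence.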

Basic Helix-Loop-Helix. Another common structural motif, the basic helix-loop-helix, occurs in some eukaryotic regulatory proteins implicated in the control of gene expression during development of multicellular organisms. These proteins share a conserved region of about 50 amino acid residues important in both DNA binding and protein dimerization. This region can form two short amphipathic α helices linked by a loop of variable length, the helix-loop-helix (distinct from the helix-turn-helix motif associated with DNA binding).

The helix-loop-helix motifs of two polypeptides interact to form dimers. In these proteins, DNA binding is mediated by an adjacent short amino acid sequence rich in basic residues, similar to the separate DNA-binding region in proteins containing leucine zippers.


Regulation in cis involves a class of RNA structures known as riboswitches. Aptamers are RNA molecules, generated in vitro, that are capable of specific binding to a particular ligand. As one might expect, such ligand-binding RNA domains are also present in nature, in riboswitches, in a significant number of bacterial mRNAs (and even in some eukaryotic mRNAs). These natural aptamers are structured domains found in untranslated regions at the 5′ ends of certain bacterial mRNAs.

Some riboswitches also regulate the transcription of certain noncoding RNAs. Binding of an mRNA’s riboswitch to its appropriate ligand results in a conformational change in the mRNA, and transcription is inhibited by stabilization of a premature transcription termination structure, or translation is inhibited (in cis) by occlusion of the ribosome-binding site. In most cases, the riboswitch acts in a kind of feedback loop.

Most genes regulated in this way are involved in the synthesis or transport of the ligand that is bound by the riboswitch; thus, when the ligand is present in high concentrations, the riboswitch inhibits expression of the genes needed to replenish this ligand. Each riboswitch binds only one ligand. Distinct riboswitches have been detected that respond to more than a dozen different ligands, including thiamine pyrophosphate (TPP, vitamin B₁), cobalamin (vitamin B₁₂), flavin mononucleotide, lysine, S-adenosylmethionine (adoMet), purines, N-acetylglucosamine 6-phosphate, glycine, and some metal cations such as Mn²⁺.

It is likely that many more remain to be discovered. The riboswitch that responds to TPP seems to be the most widespread; it is found in many bacteria, fungi, and some plants. The bacterial TPP riboswitch inhibits translation in some species and induces premature transcription termination in others. The eukaryotic TPP riboswitch is found in the introns of certain genes and modulates the alternative splicing of those genes.
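The feedback loop formed by a riboswitch can be illustrated with a toy simulation in which the ligand shuts down its own synthesis genes. Everything quantitative here is an assumption made for illustration: the simple binding curve used for the fraction of ligand-free (active) transcripts, the parameter values, and the function name.

```python
def simulate_ligand(feedback: bool, steps: int = 200, max_synthesis: float = 10.0,
                    half_point: float = 50.0, use_rate: float = 0.1) -> float:
    """Return the ligand level after `steps` time steps of a toy model.

    Each step, ligand is made at max_synthesis times the fraction of transcripts
    whose riboswitch is ligand-free, and consumed at use_rate times the current level.
    """
    ligand = 0.0
    for _ in range(steps):
        if feedback:
            # Assumed simple binding curve: half the riboswitches are bound at half_point.
            active_fraction = 1.0 / (1.0 + ligand / half_point)
        else:
            active_fraction = 1.0  # no riboswitch: genes always fully expressed
        ligand += max_synthesis * active_fraction - use_rate * ligand
    return ligand


print("final ligand level with riboswitch feedback:", round(simulate_ligand(True), 2))
print("final ligand level without feedback:        ", round(simulate_ligand(False), 2))
```

With these hypothetical parameters the ligand settles at a lower level when the riboswitch is present than when the genes are expressed constitutively, which is the behavior expected of negative feedback.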

FIGURE DEPICTING Regulation by riboswitch


  • The expression of genes is regulated by processes that affect the rates at which gene products are synthesized and degraded. Much of this regulation occurs at the level of transcription initiation, mediated by regulatory proteins that either repress transcription (negative regulation) or activate transcription (positive regulation) at specific promoters.
  • In bacteria, genes that encode products with interdependent functions are often clustered in an operon, a single transcriptional unit. Transcription of the genes is generally blocked by binding of a specific repressor protein at a DNA site called an operator. Dissociation of the repressor from the operator is mediated by a specific small molecule, an inducer. These principles were first elucidated in studies of the lactose (lac) operon. The Lac repressor dissociates from the lac operator when the repressor binds to its inducer, allolactose.
  • Regulatory proteins are DNA-binding proteins that recognize specific DNA sequences; most have distinct DNA-binding domains. Within these domains, common structural motifs that bind DNA (and/or RNA) are the helix-turn-helix, zinc finger, homeodomain, and RNA recognition motif.
  • Regulatory proteins also contain domains for protein-protein interactions, including the leucine zipper and helix-loop-helix, which are involved in dimerization, and other motifs required for activation of transcription. Mixing and matching of protein family variants in dimeric transcription factors provides for more efficient and responsive regulation through combinatorial control.



