INTRODUCTION – RNA SPLICING
Many of the RNA molecules in bacteria and virtually all RNA molecules in eukaryotes are processed to some degree after synthesis. Some of the most interesting molecular events in RNA metabolism occur during this postsynthetic processing. Intriguingly, several of the enzymes that catalyze these reactions consist of RNA rather than protein. The discovery of these catalytic RNAs, or ribozymes, has brought a revolution in thinking about RNA function and about the origin of life.
A newly synthesized RNA molecule is called a primary transcript. Perhaps the most extensive processing of primary transcripts occurs in eukaryotic mRNAs and in the tRNAs of both bacteria and eukaryotes. Special-function RNAs are also processed.
The primary transcript for a eukaryotic mRNA typically contains sequences encompassing one gene, although the sequences encoding the polypeptide may not be contiguous. Noncoding tracts that break up the coding region of the transcript are called introns, and the coding segments are called exons. In a process called RNA splicing, the introns are removed from the primary transcript, and the exons are joined to form a continuous sequence that specifies a functional polypeptide.
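The outcome of splicing described above can be sketched in a few lines of Python. This is only an illustration of the bookkeeping (remove introns, join exons in order), not a model of the chemistry. The sequences, the `splice` function, and the coordinate convention (0-based, end-exclusive) are assumptions made for the example.

```python
def splice(primary_transcript, exons):
    """Join the exon segments of a transcript in order.

    `exons` is a list of (start, end) pairs, 0-based and end-exclusive.
    Everything outside these segments is treated as intron and dropped.
    """
    return "".join(primary_transcript[start:end] for start, end in exons)


# Hypothetical transcript: exons in upper case, intron in lower case.
exon1, intron, exon2 = "AUGGCC", "guaaguuuuacuaacuuuag", "GAUUAA"
pre_mrna = exon1 + intron + exon2

# Exon coordinates are derived from the pieces rather than typed by hand.
exons = [(0, len(exon1)), (len(exon1) + len(intron), len(pre_mrna))]

mature = splice(pre_mrna, exons)
print(mature)
```

The exons end up contiguous even though they were separated by the intron in the primary transcript.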
Eukaryotic mRNAs are also modified at each end. A modified residue called a 5′ cap is added at the 5′ end. The 3′ end is cleaved, and 80 to 250 A residues are added to create a poly(A) “tail.” The sometimes elaborate protein complexes that carry out each of these three mRNA-processing reactions do not operate independently. They seem to be organized in association with each other and with the phosphorylated CTD of Pol II; each complex affects the function of the others.
Proteins involved in mRNA transport to the cytoplasm are also associated with the mRNA in the nucleus, and the processing of the transcript is coupled to its transport. In effect, a eukaryotic mRNA, as it is synthesized, is ensconced in an elaborate complex comprising dozens of proteins. The composition of the complex changes as the primary transcript is processed, transported to the cytoplasm, and delivered to the ribosome for translation. The associated proteins modulate all aspects of the function and fate of the mRNA.
PROCESS OF RNA SPLICING
There are four classes of introns. The first two, the group I and group II introns, differ in the details of their splicing mechanisms but share one surprising characteristic: they are self-splicing—no protein enzymes are involved. Group I introns are found in some nuclear, mitochondrial, and chloroplast genes that code for rRNAs, mRNAs, and tRNAs. Group II introns are generally found in the primary transcripts of mitochondrial or chloroplast mRNAs in fungi, algae, and plants. Group I and group II introns are also found among the rare examples of introns in bacteria. Neither class requires a high-energy cofactor (such as ATP) for splicing.
The splicing mechanisms in both groups involve two transesterification reaction steps, in which a ribose 2′- or 3′-hydroxyl group makes a nucleophilic attack on a phosphorus, and a new phosphodiester bond is formed at the expense of the old, maintaining the balance of energy. These reactions are very similar to the DNA breaking and rejoining reactions promoted by topoisomerases and site-specific recombinases. The group I splicing reaction requires a guanine nucleoside or nucleotide cofactor, but the cofactor is not used as a source of energy; instead, the 3′-hydroxyl group of guanosine is used as a nucleophile in the first step of the splicing pathway.
The guanosine 3′-hydroxyl group forms a normal 3′,5′-phosphodiester bond with the 5′ end of the intron. The 3′ hydroxyl of the exon that is displaced in this step then acts as a nucleophile in a similar reaction at the 3′ end of the intron. The result is precise excision of the intron and ligation of the exons. In group II introns the reaction pattern is similar, except for the nucleophile in the first step, which in this case is the 2′-hydroxyl group of an A residue within the intron. A branched lariat structure is formed as an intermediate.
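The two transesterification steps of group I splicing can be traced with a small Python sketch. It treats each RNA as a plain string and counts phosphodiester bonds as (length - 1) per linear molecule, which is a deliberate simplification. The function names and sequences are hypothetical, and the group II lariat intermediate is not modeled.

```python
def phosphodiester_bonds(*molecules):
    """Count backbone bonds in linear molecules: n residues have n - 1 bonds."""
    return sum(len(m) - 1 for m in molecules if m)


def group_i_splice(exon5, intron, exon3, cofactor="G"):
    """Trace the two transesterification steps of group I splicing."""
    # Step 1: the 3'-OH of the guanosine cofactor attacks the 5' splice site.
    # The G becomes joined to the 5' end of the intron, and the 5' exon is
    # released with a free 3'-OH.
    free_exon5 = exon5
    g_intron_exon3 = cofactor + intron + exon3

    # Step 2: the 3'-OH of the released 5' exon attacks the 3' splice site.
    # The exons are joined and the intron (still carrying the G) is excised.
    spliced = free_exon5 + g_intron_exon3[len(cofactor) + len(intron):]
    excised = g_intron_exon3[:len(cofactor) + len(intron)]
    return spliced, excised


# Hypothetical sequences: exons in upper case, intron in lower case.
exon5, intron, exon3 = "AUGGCC", "aaauuugg", "GAUUAA"
spliced, excised = group_i_splice(exon5, intron, exon3)

bonds_before = phosphodiester_bonds(exon5 + intron + exon3, "G")
bonds_after = phosphodiester_bonds(spliced, excised)
print(spliced, excised, bonds_before, bonds_after)
```

In this toy accounting the number of phosphodiester bonds is the same before and after, which mirrors the point that each new bond forms at the expense of an old one and no ATP is consumed.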
- Self-splicing of introns was first revealed in 1982 in studies of the splicing mechanism of the group I rRNA intron from the ciliated protozoan Tetrahymena thermophila, conducted by Thomas Cech and colleagues. These workers transcribed isolated Tetrahymena DNA (including the intron) in vitro, using purified bacterial RNA polymerase.
- In eukaryotes, most introns undergo splicing by the same lariat-forming mechanism as the group II introns. However, the intron splicing takes place within a large protein complex called a spliceosome, and these introns, the spliceosomal introns, are not assigned a group number. A spliceosome is made up of multiple specialized RNA-protein complexes called small nuclear ribonucleoproteins (snRNPs, often pronounced snurps).
- Each snRNP contains one of a class of eukaryotic RNAs, 100 to 200 nucleotides long, known as small nuclear RNAs (snRNAs). Five snRNAs (U1, U2, U4, U5, U6) involved in splicing reactions are generally found in abundance in eukaryotic nuclei. In yeast, the various snRNPs include about 100 different proteins, most of which have close homologs in all other eukaryotes.
- In humans, these conserved protein components are augmented by more than 200 additional proteins. Spliceosomes are thus among the most complex macromolecular machines in any eukaryotic cell. The RNA components of a spliceosome are the catalysts of the various splicing steps. The overall complex can be considered a highly flexible nucleoprotein chaperone that can adapt to the great diversity in size and sequence of nuclear mRNAs.
- Spliceosomal introns generally have the dinucleotide sequence GU at the 5′ end and AG at the 3′ end, and these sequences mark the sites where splicing occurs.
- The U1 snRNA contains a sequence complementary to sequences near the 5′ splice site of nuclear mRNA introns, and the U1 snRNP binds to this region in the primary transcript.
- U2 snRNP then binds to the branch site, aided by U2AF and displacing BBP (SF1). This arrangement is called the A complex. The base pairing between the U2 snRNA and the branch site is such that the branch site A residue is extruded from the resulting stretch of double-helical RNA as a single-nucleotide bulge.
- The next step is a rearrangement of the A complex to bring together all three splice sites. This is achieved as follows: the U4 and U6 snRNPs, along with the U5 snRNP, join the complex. Together, these three snRNPs are called the tri-snRNP particle, within which the U4 and U6 snRNPs are held together by complementary base pairing between their RNA components, and the U5 snRNP is more loosely associated through protein–protein interactions. With the entry of the tri-snRNP, the A complex is converted into the B complex.
- In the next step, U1 leaves the complex, and U6 replaces it at the 5′ splice site. This requires that the base pairing between the U1 snRNA and the pre-mRNA be broken, allowing the U6 RNA to anneal with the same region.
- U4 is released from the complex, allowing U6 to interact with U2.
- ATP is required for assembly of the spliceosome, but the RNA cleavage-ligation reactions do not require ATP.
- Many eukaryotic genes are thus mosaics, consisting of blocks of coding sequences separated from each other by blocks of non-coding sequences. The coding sequences are called exons and the intervening sequences are called introns.
- Once transcribed into an RNA transcript, the introns must be removed and the exons joined together to create the mRNA for that gene. In fact, technically, the term exon applies to any region retained in a mature RNA, whether or not it is coding. Non-coding exons include the 5′ and 3′ untranslated regions of an mRNA; all portions of spliced.
- The primary transcripts of intron-containing genes must have their introns removed before they can be translated into proteins. The process of intron removal, called RNA splicing, converts the pre-mRNA into mature mRNA and must occur with great precision to avoid the loss, or addition, of even a single nucleotide at the sites at which the exons are joined.
- The triplet-nucleotide codons of mRNA are translated in a fixed reading frame that is set by the first codon in the protein-coding sequence.
- Many primary mRNA transcripts contain introns (noncoding regions), which are removed by splicing. Excision of the group I introns found in some rRNAs requires a guanosine cofactor. Some group I and group II introns are capable of self-splicing; no protein enzymes are required.
- Nuclear mRNA precursors have a third (the largest) class of introns, which are spliced with the aid of RNA-protein complexes called snRNPs that are assembled in spliceosomes. A fourth class of introns, found in some tRNAs, consists of the only introns known to be spliced by protein enzymes.
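Two points from the notes above lend themselves to a short Python sketch: the GU...AG boundary rule for spliceosomal introns, and the need for single-nucleotide precision so that the reading frame is preserved. The function names and sequences are hypothetical, and the boundary check is only the minimal dinucleotide test stated in the text, not a splice-site predictor.

```python
def has_canonical_ends(intron):
    """Return True if the intron starts with GU and ends with AG."""
    intron = intron.upper()
    return len(intron) >= 4 and intron.startswith("GU") and intron.endswith("AG")


def codons(mrna):
    """Split an mRNA into complete triplets, starting at the first base."""
    usable = len(mrna) - len(mrna) % 3  # ignore a trailing partial codon
    return [mrna[i:i + 3] for i in range(0, usable, 3)]


# Hypothetical sequences for illustration only.
exon1, intron, exon2 = "AUGGCU", "GUAAGUUUUACUAACUUUAG", "GAUAAAUAA"

print(has_canonical_ends(intron))

# Precise splicing: the exons are joined exactly at the splice sites.
precise = exon1 + exon2
print(codons(precise))

# Imprecise splicing: one intron nucleotide (the final G) is left behind.
# Every codon downstream of the junction is read in a shifted frame.
imprecise = exon1 + intron[-1] + exon2
print(codons(imprecise))
```

In this example the single retained nucleotide shifts the frame so that a UAA stop codon appears earlier than in the precisely spliced message.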