Pre-mRNA splicing is an essential, precisely regulated process that occurs after gene transcription and before mRNA translation. It begins with the ordered assembly and coordinated action of the U1, U2, U4, U5 and U6 snRNPs (small nuclear ribonucleoprotein particles) and non-snRNP proteins on the pre-mRNA. Each snRNP contains a small nuclear RNA molecule (snRNA) and several proteins. The complex of snRNPs and non-snRNP proteins is called the spliceosome. The process of pre-mRNA splicing can be divided into three stages (see the Molecular Cell Biology book for more details, or the animated movie from The Essential Cell Biology):
The precise recognition of intron-exon junctions (splice sites) and the correct pairing of the 5’ splice site with its cognate 3’ splice site are critical for splice site selection. It is during the formation of the commitment complex that splice sites are first recognized by spliceosomal components, with the aid of certain non-spliceosomal proteins.
A number of dynamic interactions, including snRNA-pre-mRNA, pre-mRNA-protein and protein-protein interactions, bring the reactive sites on the pre-mRNA together and create the catalytic sites for the trans-esterification reactions (see the Molecular Biology Web Book for more details).
Intron removal and exon ligation proceed via two trans-esterification reactions. In the first reaction, the 5’ exon is cleaved and the 5’ end of the intron is joined to the branch point, creating the intron lariat structure. In the second reaction, the free 3’ end of the 5’ exon is joined to the downstream exon, resulting in exon ligation and release of the intron sequence. This mechanism is similar to that used by a class of self-splicing introns called Group II introns. The Group II introns are found in certain ribosomal RNAs (rRNA) and transfer RNAs (tRNA), and they can undergo self-splicing in the absence of any proteins. The similarity between the mechanisms of spliceosome-mediated splicing and Group II intron self-splicing has led to the hypothesis that the catalytic core of the spliceosome functions as an RNA enzyme (ribozyme).
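The net effect of the two reactions on the sequence can be sketched on a plain string. This is only a toy illustration, not a model of the chemistry: the function name and the donor and acceptor index parameters are hypothetical, and the branch point and the lariat geometry are not represented.

```python
def splice_once(pre_mrna, donor, acceptor):
    """Toy model of the outcome of the two trans-esterification steps.

    donor    -- index of the first intron base (hypothetical parameter)
    acceptor -- index of the first base of the downstream exon (hypothetical)
    Returns (spliced_sequence, released_intron).
    """
    if not 0 < donor < acceptor < len(pre_mrna):
        raise ValueError("expected 0 < donor < acceptor < len(pre_mrna)")

    # Step 1: the 5' exon is cleaved off; the intron stays attached to the
    # downstream exon (the lariat intermediate, shown here as a linear string).
    exon1 = pre_mrna[:donor]
    intermediate = pre_mrna[donor:]

    # Step 2: the 5' exon is joined to the downstream exon and the intron
    # is released.
    intron_length = acceptor - donor
    intron = intermediate[:intron_length]
    exon2 = intermediate[intron_length:]
    return exon1 + exon2, intron


# Made-up example sequence: two short exons around one short intron.
exon_a, intron_seq, exon_b = "AAACAG", "GTAAGTCTAACTTTTTCAG", "GGGTTT"
mrna, released = splice_once(exon_a + intron_seq + exon_b,
                             donor=len(exon_a),
                             acceptor=len(exon_a) + len(intron_seq))
print(mrna)      # the two exons joined
print(released)  # the removed intron
```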
The nature and function of the components of the splicing machinery and the biochemistry of splicing are well known. However, how splice sites are selected in vivo remains somewhat unclear. The answer is likely to be complex, given the short, moderately conserved sequences that define exon-intron junctions (see the figure below). Consensus splice signals that are not normally used as splice sites (known as “cryptic” splice sites) occur frequently in a given pre-mRNA. Furthermore, non-consensus splice site sequences that contain mismatches at highly conserved positions are sometimes used as splice sites. These observations have led to the discovery of auxiliary cis-acting sequences that can influence splice site recognition. These sequences are known as splicing enhancer and inhibitor sequences, and they now appear to help regulate alternative splicing.
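To see why short signals alone cannot define splice sites, one can count how often the minimal dinucleotides appear in a sequence. The sketch below assumes the common GT...AG intron boundaries (GU...AG in RNA), which the text above does not spell out; the function name is hypothetical, and real prediction tools score longer motifs rather than bare dinucleotides.

```python
def find_candidate_sites(seq):
    """List positions of GT (candidate 5' sites) and AG (candidate 3' sites).

    Assumption: introns start with GT and end with AG. Positions are
    0-based indices of the first base of each dinucleotide.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA input
    donors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "GT"]
    acceptors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "AG"]
    return donors, acceptors


# Made-up sequence: even 19 bases contain several candidate sites.
donors, acceptors = find_candidate_sites("CAGGTAAGTCTTTTCAGGT")
print(donors)     # [3, 7, 17]
print(acceptors)  # [1, 6, 15]
```

Most of the matches found this way in a real pre-mRNA are never used, which is the point made above about cryptic sites and the need for auxiliary sequences.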
The average exon is small, approximately 150 bp, with 99% of known exons shorter than 400 nucleotides, although exons of more than 5000 nucleotides are known. By contrast, introns vary greatly in length: they average around 1500 nucleotides but can reach a hundred thousand nucleotides. The narrow range of exon size has led to the exon definition hypothesis, whereby splicing factors bound to a 3’ splice site interact with factors at the next downstream 5’ splice site, across the exon, thereby defining the position of the exon, before switching to interact with factors at the upstream 5’ splice site, across the intron, to allow splicing of the introns. In cases where the intron is short and the exon is longer than 300 nucleotides, the “intron bridging” model, which proposes interactions across the intron, is invoked to explain pairing of the 5’ splice site with the correct 3’ splice site (Berget 1995; Talerico et al., 1994; Robberson et al., 1990).
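The length rule in this paragraph can be written as a small decision function. This is a rough sketch under stated assumptions: the 300-nucleotide exon threshold comes from the text, but the text does not say how short a "short" intron is, so the 250-nucleotide default below is an arbitrary placeholder, and the function and parameter names are hypothetical.

```python
def pairing_model(exon_len, intron_len, long_exon=300, short_intron=250):
    """Suggest which model the text would invoke for an exon/intron pair.

    long_exon    -- exon length threshold in nucleotides (from the text)
    short_intron -- intron length threshold in nucleotides (placeholder
                    value; the text gives no number)
    """
    if exon_len > long_exon and intron_len <= short_intron:
        return "intron bridging"
    if exon_len <= long_exon:
        return "exon definition"
    # Long exon next to a long intron: the text does not address this case.
    return "not covered"


print(pairing_model(150, 1500))  # typical exon and intron sizes
print(pairing_model(450, 100))   # long exon next to a short intron
```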
Many publications deal with splicing. We suggest reading the following:
To learn more about how splicing software works, please read: FO Desmet, D Hamroun, G Collod-Béroud, M Claustres and C Béroud: Bioinformatics identification of splice site signals and prediction of mutation effects.