With the completion of the Human Genome Project, our view of human genetic diseases has changed. Thousands of mutations are identified in diagnostic and research laboratories every year. Knowledge of these mutations, together with the associated clinical and biological data, is essential for clinicians, geneticists and researchers.

To better understand intronic and exonic mutations leading to splicing defects, we decided to create the Human Splicing Finder website. This tool is intended to help the study of pre-mRNA splicing (see the background on splicing below).

To calculate the consensus values of potential splice sites and search for branch points, new algorithms were developed. Furthermore, we have integrated all available matrices to identify exonic and intronic motifs, as well as new matrices to identify hnRNP A1, Tra2-β and 9G8.
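As an illustration of what a consensus value for a potential splice site can look like, the minimal sketch below scores a 9-nucleotide candidate 5’ splice site against a position weight matrix and rescales the result to the range 0-100. It is not the HSF algorithm: the function name, the rescaling and every weight in TOY_MATRIX are hypothetical placeholders chosen for illustration, not the matrices used by the website.

```python
# Toy position weight matrix for a 9-nt donor-site window
# (3 exonic positions followed by 6 intronic positions).
# All weights are HYPOTHETICAL placeholders, not HSF's real matrix.
TOY_MATRIX = [
    {"A": 30, "C": 40, "G": 20, "T": 10},
    {"A": 60, "C": 10, "G": 15, "T": 15},
    {"A": 10, "C": 5, "G": 80, "T": 5},
    {"A": 0, "C": 0, "G": 100, "T": 0},   # first intron base, almost always G
    {"A": 0, "C": 0, "G": 0, "T": 100},   # second intron base, almost always T
    {"A": 60, "C": 5, "G": 30, "T": 5},
    {"A": 70, "C": 10, "G": 10, "T": 10},
    {"A": 10, "C": 5, "G": 80, "T": 5},
    {"A": 15, "C": 15, "G": 20, "T": 50},
]


def consensus_value(site, matrix=TOY_MATRIX):
    """Return a 0-100 score for how well `site` matches `matrix`."""
    site = site.upper().replace("U", "T")  # accept RNA or DNA letters
    if len(site) != len(matrix):
        raise ValueError("site length must equal the matrix length")
    if set(site) - set("ACGT"):
        raise ValueError("site may only contain A, C, G, T or U")
    # Sum the weight of the observed base at each position.
    score = sum(column[base] for column, base in zip(matrix, site))
    # Rescale between the worst and best achievable sums.
    best = sum(max(column.values()) for column in matrix)
    worst = sum(min(column.values()) for column in matrix)
    return 100.0 * (score - worst) / (best - worst)
```

With this toy matrix, the best-matching sequence CAGGTAAGT scores 100, and any sequence that loses the GT dinucleotide scores markedly lower.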

We hope that this tool will be useful for your research. To help us improve it, please send us your comments and any new matrices for identifying specific sequences involved in splicing.

HSF (Human Splicing Finder) is freely available to non-commercial users. However, copying all or part of the database content without specific authorisation from us is not allowed. If you are a commercial user, please contact us to obtain a dedicated license.
For more information, please contact Prof. Christophe Béroud or Dr. David Salgado.

Pre-mRNA splicing is an essential, precisely regulated process that occurs after gene transcription and prior to mRNA translation. It begins with the ordered assembly and coordinated action of the U1, U2, U4, U5 and U6 snRNPs (small nuclear ribonucleoprotein particles) and non-snRNP proteins on the pre-mRNA. Each snRNP contains a small nuclear RNA molecule (snRNA) and several proteins. The complex of snRNPs and non-snRNP proteins is called the spliceosome. The process of pre-mRNA splicing can be divided into three stages (see the Molecular Cell Biology book for more details, or the animated movie from The Essential Cell Biology):

Formation of the commitment complex

The precise recognition of intron-exon junctions (splice sites) and the correct pairing of the 5’ splice site with its cognate 3’ splice site are critical for splice site selection. It is during the formation of the commitment complex that splice sites are first recognized by spliceosomal components, with the aid of certain non-spliceosomal proteins.

Creation of catalytic sites

A number of dynamic interactions, including interactions between snRNAs and the pre-mRNA as well as pre-mRNA-protein and protein-protein interactions, bring the reactive sites on the pre-mRNA together and create the catalytic sites for the trans-esterification reactions (see the Molecular Biology Web Book for more details).

The trans-esterification reactions

The cleavage and ligation required for intron removal and exon ligation proceed via two trans-esterification reactions. In the first reaction, the 5’ exon is cleaved and the 5’ end of the intron is joined to the branch point, creating the intron lariat structure. In the second reaction, the free 3’ end of the 5’ exon is joined to the downstream exon, resulting in exon ligation and release of the intron sequence. This mechanism is similar to that used by a class of self-splicing introns called Group II introns. Group II introns are found in certain ribosomal RNAs (rRNA) and transfer RNAs (tRNA), and they can undergo self-splicing in the absence of any proteins. The similarity between spliceosome-mediated splicing and Group II intron self-splicing has led to the hypothesis that the catalytic core of the spliceosome functions as an RNA enzyme (ribozyme).
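The outcome of the two reactions can be followed on a sequence string. The sketch below is only a bookkeeping model under simple assumptions: the pre-mRNA is a plain string with a single intron, and the three positions are supplied by the caller. The function name `splice` and its return format are hypothetical, and no chemistry or site recognition is modelled.

```python
def splice(pre_mrna, donor, branch, acceptor):
    """Model intron removal on a pre-mRNA string (0-based indices).

    donor    -- index of the first intron nucleotide (5' splice site)
    branch   -- index of the branch point nucleotide inside the intron
    acceptor -- index of the first nucleotide of the downstream exon
    """
    if not 0 < donor < branch < acceptor < len(pre_mrna):
        raise ValueError("expected 0 < donor < branch < acceptor < length")

    # Reaction 1: the 5' exon is cleaved off and the 5' end of the intron
    # is joined to the branch point, which closes the lariat loop.
    five_prime_exon = pre_mrna[:donor]
    lariat_loop = pre_mrna[donor:branch + 1]

    # Reaction 2: the free 3' end of the 5' exon is joined to the
    # downstream exon, and the intron is released as a lariat.
    lariat_tail = pre_mrna[branch + 1:acceptor]
    downstream_exon = pre_mrna[acceptor:]

    return {
        "mrna": five_prime_exon + downstream_exon,
        "intron": lariat_loop + lariat_tail,
        "lariat_loop_length": len(lariat_loop),
        "lariat_tail_length": len(lariat_tail),
    }


# Made-up example sequence: exon, intron (GT...AG with a CTAAC branch
# site), exon.
exon1, intron, exon2 = "AAACAG", "GTAAGTCCTAACTTTTTCAG", "GGGTTT"
donor = len(exon1)
acceptor = donor + len(intron)
branch = donor + intron.index("CTAAC") + 3  # the second A of CTAAC
result = splice(exon1 + intron + exon2, donor, branch, acceptor)
```

Here `result["mrna"]` is the two exons joined together, and `result["intron"]` is the released intron sequence.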

The nature and function of the components of the splicing machinery and the biochemistry of splicing are well known. However, how splice sites are selected in vivo remains somewhat unclear. The answer is likely to be complex, given the short, moderately conserved sequences that define exon-intron junctions (see the figure below). Consensus splice signals that are not normally used as splice sites (known as “cryptic” splice sites) occur frequently in a given pre-mRNA. Furthermore, non-consensus splice site sequences that contain mismatches at highly conserved positions are sometimes used as splice sites. These observations have led to the discovery of auxiliary cis-acting sequences that can influence splice site recognition. These sequences are known as splicing enhancer and inhibitor sequences, and they now appear to help regulate alternative splicing.
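To see why short signals make splice site selection difficult, one can simply list every GT and AG dinucleotide in a sequence, since these are the nearly invariant bases at the 5’ and 3’ ends of most introns. The sketch below does only that; the function names are hypothetical, and a real predictor would also score the surrounding context rather than rely on the dinucleotides alone.

```python
def find_dinucleotide(seq, motif):
    """Return the 0-based start positions of a 2-nt motif in seq."""
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA letters
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == motif]


def candidate_sites(seq):
    """List every GT (possible donor) and AG (possible acceptor)."""
    return {
        "donor_GT": find_dinucleotide(seq, "GT"),
        "acceptor_AG": find_dinucleotide(seq, "AG"),
    }


# Made-up 8-nt sequence used only to show the output format.
sites = candidate_sites("AGGTAAGT")
```

Even in a sequence this short, several candidate positions are reported, which illustrates why the dinucleotides alone cannot identify the sites actually used.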

The average exon is small, approximately 150 bp, and 99% of known exons are shorter than 400 nucleotides, although exons of more than 5000 nucleotides are known. By contrast, introns vary greatly in length: they average around 1500 nucleotides but can reach a hundred thousand nucleotides. The narrow range of exon sizes has led to the exon definition hypothesis, whereby splicing factors bound to a 3’ splice site first interact across the exon with factors at the next downstream 5’ splice site, thereby defining the position of the exon, before switching to interact across the intron with factors at the upstream 5’ splice site to allow splicing of the intron. In cases where the intron is short and the exon is longer than 300 nucleotides, the “intron bridging” model, which proposes interactions across the intron, is invoked to explain pairing of the 5’ splice site with the correct 3’ splice site (Berget 1995; Talerico et al., 1994; Robberson et al., 1990).

Further reading

Many publications deal with splicing. We suggest the following:


Marseille Medical Genetics (MMG) - UMR 1251
Director: Nicolas LEVY

Bioinformatics & Genetics Team
Director: Christophe BEROUD