In most eukaryotes, two types of introns co-exist: major introns and minor introns, so called because the first type largely outnumbers the second. The two classes have distinct consensus splice site and branch point sequences and are removed by different spliceosomes, designated "U2-type" for the major spliceosome and "U12-type" for the minor one. Major and minor introns are therefore also called U2-type and U12-type introns.
Minor intron splicing plays a central role in human embryonic development and survival. Indeed, biallelic mutations in RNU4ATAC, the gene transcribed into the minor spliceosomal U4atac snRNA, are responsible for a rare autosomal recessive disorder named microcephalic osteodysplastic primordial dwarfism type 1 (MOPD1) or Taybi-Linder syndrome (TALS). RNU4ATAC is also involved in two other, less severe developmental disorders: Roifman (RFMN) and Lowry-Wood (LWS) syndromes.
To gain knowledge on gene expression and splicing during development, taking into account whether genes contain a minor intron or not, we used the publicly available E-ERAD-475 study (ArrayExpress collection), which contains 360 RNA-seq datasets from whole zebrafish embryo mRNA at 18 developmental stages, from 1 cell (0 hpf) to 5 days. The associated publication can be found here.
For each stage, we pooled the four technical replicates but kept the five biological replicates separate. We analysed these datasets with a special focus on minor introns and on the genes that contain them. However, this Shiny App can be used to explore the expression/splicing profile of any transcript!
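As a quick check on the experimental design, the sketch below counts the datasets before and after pooling. The run identifiers are hypothetical tuples used only for illustration (they are not the accession names of E-ERAD-475), and the way the reads were physically pooled is not specified here.

```python
from collections import defaultdict
from itertools import product

N_STAGES, N_BIO, N_TECH = 18, 5, 4  # figures taken from the description above

# Hypothetical run identifiers: (stage, biological replicate, technical replicate).
runs = list(product(range(N_STAGES), range(N_BIO), range(N_TECH)))

# Pool technical replicates by grouping runs on (stage, biological replicate).
pooled = defaultdict(list)
for stage, bio, tech in runs:
    pooled[(stage, bio)].append(tech)

print(len(runs))    # 360 original datasets
print(len(pooled))  # 90 samples after pooling (18 stages x 5 biological replicates)
```

Each pooled sample therefore combines four technical replicates, and every stage keeps five independent biological replicates.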
Reads from all stages were aligned with STAR to the GRCz11 genome assembly, using the Ensembl release 95 annotation. The same reads were also assembled using KisSplice.
Each section of this app focuses on one type of analysis: Expression, Intron Retention (IR) and Alternative Splicing (AS). Each section displays a dynamic table containing the raw results and gives access to several methods (PCA plot, differential analysis and track plot) that make it possible to follow one or more genes, introns or splicing events throughout the different stages. Below is a quick description of each of these sections and of each of the methods.
This section gives results related to gene expression measurement. The metric used is the TPM, Transcripts Per Million, as computed by RSEM.
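For readers unfamiliar with the metric, here is a minimal sketch of the TPM normalisation step. It is not RSEM's implementation: RSEM estimates expected read counts with a statistical model and uses effective transcript lengths, whereas this sketch assumes the counts and lengths are already given. The function and variable names are illustrative.

```python
def tpm(counts, lengths):
    """Convert read counts to TPM given transcript lengths (in bases).

    Simplified sketch: assumes counts and lengths are already estimated.
    """
    # Reads per base for each transcript.
    rates = [c / l for c, l in zip(counts, lengths)]
    total = sum(rates)
    # Rescale so that the values sum to one million.
    return [r / total * 1e6 for r in rates]


# Toy example: three transcripts with the same read density.
values = tpm(counts=[10, 20, 30], lengths=[1000, 2000, 3000])
print(values)
```

Because TPM values always sum to one million within a sample, transcripts with the same read density get the same TPM regardless of their length.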
One line per gene. The columns give the TPM value of each gene at each developmental stage for each biological replicate (replicates can be merged). The last column indicates whether the gene contains a minor intron (for those genes, the line is highlighted in yellow).
This section gives results related to intron retention detection and measurement. The metric used is the PSI, Percent Spliced In, as computed by KisSplice2refgenome and kissDE. PSI ranges from 0% to 100%; the higher the value, the higher the retention.
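As an illustration of the metric, a PSI value can be thought of as the share of reads supporting the retained-intron form among all reads covering the event. The sketch below is a simplified ratio under that assumption; kissDE applies its own counting and correction rules, which are not reproduced here, and the function name and arguments are hypothetical.

```python
def psi(retained_reads, spliced_reads):
    """Percent Spliced In as a simple read ratio (simplified sketch).

    Returns None when no read covers the event, since PSI is undefined.
    """
    total = retained_reads + spliced_reads
    if total == 0:
        return None
    return 100.0 * retained_reads / total


print(psi(30, 70))  # 30 reads support retention, 70 support splicing
print(psi(0, 0))    # no coverage: PSI cannot be computed
```

Returning `None` for uncovered events is a design choice of this sketch: it keeps "not measurable" distinct from a genuine PSI of 0%.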
One line per intron. The columns give the PSI value of each intron at each developmental stage for each biological replicate (replicates can be merged). The last columns indicate whether the gene contains a minor intron and whether the intron is a minor one (for those introns, the line is highlighted in yellow).
This section shows results related to alternative splicing event detection and measurement. The metric used is the PSI, Percent Spliced In, as computed by KisSplice2refgenome and kissDE. PSI ranges from 0% to 100%; the higher the value, the higher the abundance of the longer isoform.
One line per alternative splicing event. The columns give the PSI value of each event at each developmental stage for each biological replicate (replicates can be merged). The remaining columns indicate the type of event, whether the event involves a U12-type donor or acceptor splice site or a minor intron, and whether it concerns a gene that contains a minor intron (for those events, the line is highlighted in yellow).
Lets users select one or more genes/introns/splicing events, or the genes involved in a given biological process, and visualize their expression or splicing profile throughout the selected stages.
Runs a PCA on the genes/introns/events of the active table. Users can select all of them or choose the number of genes/introns/events to use. The PCs to plot on the 2D graph can also be selected.
Concerning gene expression, either all genes or the n most variable genes can be used. If all genes are selected, the TPM distribution is rescaled to have a mean of 0 and a variance of 1. Otherwise, the n most variable genes are selected, which biases the PCA toward highly expressed genes, as variability increases with expression.
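The two options can be illustrated with a small sketch. It assumes that rescaling is applied per gene and uses the population variance; the app's exact procedure may differ, and the gene names and helper functions are hypothetical.

```python
from statistics import mean, pvariance


def standardize(values):
    """Rescale one gene's TPM values to mean 0 and variance 1."""
    m = mean(values)
    sd = pvariance(values) ** 0.5
    if sd == 0:
        return [0.0] * len(values)  # constant gene: no variation to scale
    return [(v - m) / sd for v in values]


def top_variable(expr, n):
    """Return the names of the n genes with the highest raw variance."""
    return sorted(expr, key=lambda g: pvariance(expr[g]), reverse=True)[:n]


# Toy data: geneA and geneB follow the same profile at different scales.
expr = {"geneA": [1, 2, 3], "geneB": [100, 200, 300], "geneC": [5, 5, 5]}

z_a = standardize(expr["geneA"])
z_b = standardize(expr["geneB"])
print(z_a, z_b)               # same profile once rescaled
print(top_variable(expr, 1))  # raw variance favours the highly expressed gene
```

In this toy example, the two genes with identical dynamics become indistinguishable after rescaling, whereas selection on raw variance keeps only the highly expressed one.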
In addition to the PCA plot, the expression profiles of the genes contributing most to each PC are also shown, to give a visual overview of the main dynamics across the developmental stages.
Runs differential analyses between any two groups of selected stages. These analyses are time-consuming for IR and AS.
In addition to a comprehensive summary table, an interactive summary plot is shown (MA-plot for expression level, PSI-plot for IR and AS).
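For reference, an MA-plot places each gene at coordinates (A, M), where M is the log2 fold change between two conditions and A is the average log2 expression. The sketch below computes these coordinates for one gene; the pseudocount of 1 and the function name are assumptions made for illustration, not the app's actual settings.

```python
from math import log2


def ma_point(expr_a, expr_b, pseudocount=1.0):
    """Return (A, M) for one gene compared between two conditions.

    The pseudocount avoids taking log2(0); its value is an assumption.
    """
    la = log2(expr_a + pseudocount)
    lb = log2(expr_b + pseudocount)
    m = lb - la            # log2 fold change (y axis)
    a = 0.5 * (la + lb)    # mean log2 expression (x axis)
    return a, m


print(ma_point(3, 15))  # log2(4) = 2 and log2(16) = 4
```

Genes with M close to 0 are unchanged between the two groups, while points far from the horizontal axis are candidates for differential expression.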
Runs a GO enrichment analysis on the differentially expressed/spliced genes using the genome-wide annotation for zebrafish, org.Dr.eg.db version 3.13.0. Depending on the number of differential elements analysed, this step can be time-consuming.
The genes associated with a GO term of interest can then be retrieved using its GOID (found in the TopGO result table in the Shiny App). Further information on these genes is available in a table and a plot.
Users can also create a bar plot summarizing the GO enrichment results.
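To give an idea of what a GO enrichment test computes, the sketch below implements a classic one-sided hypergeometric test (equivalent to a one-sided Fisher's exact test). Which algorithm and statistic the app actually uses is not stated here, so this should be read as a generic illustration; the function name and the example numbers are hypothetical.

```python
from math import comb


def enrichment_pvalue(k, n, K, N):
    """Probability of seeing at least k annotated genes by chance.

    k: differential genes annotated with the GO term
    n: total number of differential genes
    K: genes annotated with the GO term in the background
    N: total number of background genes
    """
    upper = min(n, K)
    # Sum the hypergeometric probabilities for k, k+1, ..., upper.
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


# Toy example: 8 of 50 differential genes carry a term found in 100 of 10000 genes.
print(enrichment_pvalue(k=8, n=50, K=100, N=10000))
```

A small p-value means the GO term is found among the differential genes more often than expected from its frequency in the background.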