Feature Counts
Biotype Counts
The featureCounts tool quantifies aligned reads against the features (for example genes or exons) annotated in a GFF or GTF file. For each sample, it reports how many reads were assigned to those features, which gives an overview of how the reads are distributed across the annotated reference genome.
Figure 1: Bar chart of the read assignment categories for each sample, based on the reference genome.
Source: MultiQC example RNAseq
Legend: Assigned is the percentage of reads that were assigned to a single annotated feature. Ambiguity means that a read overlaps two or more features and could not be attributed to just one of them. Multimapping means that a read aligns to two or more locations in the reference genome. No Features means that the read does not overlap any annotated feature, which suggests that it originates from an intron or from an intergenic region.
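To make the percentages in the legend concrete, here is a minimal sketch of how they could be recomputed from a featureCounts summary file. It assumes the usual tab-separated layout (a header line starting with Status, followed by one row per category and one column per sample). The function name, the sample name and the read counts are made up for illustration.

```python
import csv
import io


def assignment_percentages(summary_text):
    """Return {sample: {status: percent}} from featureCounts summary text.

    Assumes a tab-separated table whose first column is the status
    (Assigned, Unassigned_Ambiguity, ...) and whose remaining columns
    hold one read count per sample.
    """
    reader = csv.reader(io.StringIO(summary_text), delimiter="\t")
    header = next(reader)
    samples = header[1:]
    counts = {sample: {} for sample in samples}
    for row in reader:
        if not row:
            continue  # skip blank lines
        status = row[0]
        for sample, value in zip(samples, row[1:]):
            counts[sample][status] = int(value)

    percentages = {}
    for sample, by_status in counts.items():
        total = sum(by_status.values())
        percentages[sample] = {
            status: (100.0 * n / total if total else 0.0)
            for status, n in by_status.items()
        }
    return percentages


# Hypothetical summary for a single sample (values are illustrative only).
example = (
    "Status\tsampleA.bam\n"
    "Assigned\t800\n"
    "Unassigned_Ambiguity\t50\n"
    "Unassigned_MultiMapping\t100\n"
    "Unassigned_NoFeatures\t50\n"
)
result = assignment_percentages(example)
print(result["sampleA.bam"]["Assigned"])  # 80.0
```

Each percentage is taken over the sum of all categories for that sample, so the values of one bar add up to 100.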
Figure 2: Bar chart of the biotype assignments for the reads of each sample.
Source: nf-core RNAseq MultiQC
Depending on the reference genome and the GTF/GFF file used as input, a large amount of biological information can be attached to the mapped reads. For RNAseq data, this chart shows which types of transcripts were being expressed at the time of capture. The chart above is typical of what is expected for RNAseq data: the majority of the reads in each sample are attributed to Protein_coding, which is precisely what we are trying to measure in RNAseq.
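The same check can be done numerically instead of by eye. The following is a minimal sketch that ranks biotypes by their share of the reads in one sample. The function name, the biotype names and the counts are hypothetical and are not taken from the chart above.

```python
def biotype_shares(biotype_counts):
    """Return (biotype, percent) pairs sorted from largest to smallest share."""
    total = sum(biotype_counts.values())
    if total == 0:
        return []
    shares = [
        (biotype, 100.0 * n / total) for biotype, n in biotype_counts.items()
    ]
    return sorted(shares, key=lambda item: item[1], reverse=True)


# Hypothetical read counts per biotype for one sample (illustrative only).
example_counts = {"lncRNA": 60, "protein_coding": 900, "rRNA": 40}
shares = biotype_shares(example_counts)
for biotype, percent in shares:
    print(f"{biotype}: {percent:.1f}%")
```

With these made-up counts, protein_coding comes first with 90% of the reads, which matches the kind of profile described above.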
Warnings:
The last chart has many legend entries, and because the number of available colors is limited, some categories can be hard to tell apart. To be sure of your analysis, do not hesitate to double-click on the legend entries to make them appear or disappear. Hovering the cursor over a sample's bar also displays the detailed values for that sample.