Figure 3 | Biology Direct

From: RNaseIII and T4 Polynucleotide Kinase sequence biases and solutions during RNA-seq library construction

Sequence logo and entropy analysis of mapped reads. We analyzed mapped reads for deviations from randomness using sequence logos and entropy. (Note that the height of letters in the sequence logo is given by the reduction in entropy from random expectation. Thus, large letters in the sequence logo correspond to depressions in the entropy values.) (a) Sequence logos for the whole transcriptome library (WTL) and the gene specific library (GSL). (b) Sequence logos for their computational controls (Com. ctl). (c) Entropy data for the whole transcriptome library and the gene specific library, including their computational controls. The entropies of the computational controls have black error bars from 1000 replicates. The error bars represent the standard deviation and are smaller than the symbols for the whole transcriptome and gene specific libraries on the graph. The computational controls were obtained for each library from sequences randomly generated from the human exome. Based on mapped read locations and splice junction information, we predicted the 200 bp cDNA sequences present in the RNA-seq library; 50 bp partial sequences were then randomly selected from these predicted sequences. In all cases, we analyzed a window of 50 bp starting at the 5’ ends of the sequence fragments. The gene specific library showed almost no deviation from randomness. By contrast, the whole transcriptome library had a strong bias at the first 2 bases at the 5’ end.
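The relationship between per-position entropy and logo letter height described in the legend can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the four-letter DNA alphabet, and the uniform 2-bit baseline for "random expectation" are assumptions made for illustration (the figure itself compares against exome-derived computational controls rather than a uniform baseline).

```python
import math
from collections import Counter

ALPHABET = "ACGT"                        # assumed alphabet; other symbols (e.g. N) are skipped
MAX_ENTROPY = math.log2(len(ALPHABET))   # 2 bits: entropy of uniform base usage (assumed baseline)


def positional_entropy(reads, window=50):
    """Return the Shannon entropy (in bits) at each of the first `window` positions."""
    entropies = []
    for pos in range(window):
        # Count the bases observed at this position across all reads long enough to cover it.
        counts = Counter(
            read[pos] for read in reads if len(read) > pos and read[pos] in ALPHABET
        )
        total = sum(counts.values())
        if total == 0:
            entropies.append(float("nan"))  # no data at this position
            continue
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        entropies.append(entropy)
    return entropies


def information_content(entropies):
    """Reduction in entropy relative to the uniform baseline (logo stack height, in bits)."""
    return [MAX_ENTROPY - h for h in entropies]


if __name__ == "__main__":
    # Toy reads: position 0 is fully biased, position 1 is uniform.
    toy_reads = ["ACGT", "AGGA", "ATCC", "AAAG"]
    h = positional_entropy(toy_reads, window=4)
    for pos, (ent, info) in enumerate(zip(h, information_content(h)), start=1):
        print(f"position {pos}: entropy={ent:.2f} bits, information={info:.2f} bits")
```

In this toy input the first position carries the same base in every read, so its entropy drops to zero and its information content reaches the 2-bit maximum, which is the kind of depression in entropy that the legend associates with large letters in the logo.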

