B Fraction of genes with a negative among-cell-line Spearman's correlation between the Simpson or Shannon index of TSS diversity and expression level. C Fraction of genes with a positive among-cell-line Spearman's correlation between gene expression level and the fractional use of a ranked TSS. The area of a circle is proportional to the indicated number of genes in the circle. E Only in a minority of human genes is the number N of observed major TSSs significantly greater than the number n expected under no differential use of TSSs among the five human cell lines. F Probability densities of expression level for genes with N larger than n (not necessarily significantly; red) and for the rest of the genes (black). In this panel, N and n have been re-estimated using down-sampled data to equalize the sampling error among genes.

The error hypothesis further predicts that, when the expression level of a gene varies among cell types, the TSS diversity of the gene in a cell type decreases as its expression level in that cell type rises.

To verify this prediction, we calculated for each gene the correlation between its expression level and TSS diversity across the five human cell lines.

We focused on down-sampled data to guard against the influence of unequal sequencing depths of a gene across cell lines. Indeed, significantly more genes exhibit negative correlations than expected by chance, regardless of whether we used the Simpson or Shannon index of TSS diversity (Fig 3B).
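The per-gene calculation described above can be illustrated with a minimal sketch. It assumes the common forms of the two indices (Simpson as 1 minus the sum of squared TSS fractions, Shannon as the entropy of the TSS fractions); the source text does not spell out the exact definitions. All function names and the toy read counts are hypothetical, and the counts are taken to be already down-sampled to equal depth.

```python
import math

def simpson_index(counts):
    """Simpson diversity, assumed here to be 1 - sum(p_i^2) over TSS fractions."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

def shannon_index(counts):
    """Shannon diversity, assumed here to be -sum(p_i * ln p_i)."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's correlation: Pearson's correlation computed on ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return float("nan")  # undefined when one variable is constant
    return cov / math.sqrt(vx * vy)

# Toy gene: expression level and per-TSS read counts in five cell lines.
expression = [10.0, 20.0, 30.0, 40.0, 50.0]
tss_counts = [[50, 50], [60, 40], [70, 30], [80, 20], [90, 10]]

rho_simpson = spearman_rho(expression, [simpson_index(c) for c in tss_counts])
rho_shannon = spearman_rho(expression, [shannon_index(c) for c in tss_counts])
print(rho_simpson, rho_shannon)
```

In this toy gene, TSS usage becomes more concentrated as expression rises, so both correlations are negative, which is the direction the error hypothesis predicts.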

The error hypothesis further predicts a negative correlation across cell types between the expression level of a gene and the fractional use of each minor TSS. To verify this prediction, for each gene, we defined the major and minor TSSs in each cell line separately and then computed the across-cell-line rank correlation between the expression level of a gene in a cell line and the fractional use of a TSS of a certain rank in the cell line based on down-sampled data. Although our evidence so far suggests that, for most genes, each cell type has only one preferred TSS, it remains possible that the optimal TSS varies among cell types such that the ATI variation among cell types is adaptive.


To assess this possibility, we first repeated the analysis in Fig 3C by defining the global major and minor TSSs for each gene using the combined CAGE-seq reads from all five cell lines. Interestingly, the patterns observed are similar to those in Fig 3C.
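The difference between ranking TSSs in each cell line separately and ranking them globally from pooled reads can be made concrete with a small sketch. The function names, the tie-breaking rule, and the toy counts are assumptions made for illustration only; the resulting usage series would then be rank-correlated with expression level across cell lines.

```python
def ranked_tss(counts):
    """TSS labels from most to least used; ties broken by label (an assumption)."""
    return sorted(counts, key=lambda t: (-counts[t], t))

def usage_by_local_rank(counts_by_line, rank):
    """Fractional use of the rank-th TSS, with ranks defined per cell line."""
    usage = []
    for counts in counts_by_line:
        order = ranked_tss(counts)
        total = sum(counts.values())
        if rank <= len(order):
            usage.append(counts[order[rank - 1]] / total)
        else:
            usage.append(0.0)  # assumption: a missing rank counts as zero use
    return usage

def usage_by_global_rank(counts_by_line, rank):
    """Fractional use of the rank-th TSS, with ranks defined from pooled reads."""
    pooled = {}
    for counts in counts_by_line:
        for tss, c in counts.items():
            pooled[tss] = pooled.get(tss, 0) + c
    order = ranked_tss(pooled)
    if rank > len(order):
        return [0.0] * len(counts_by_line)
    tss = order[rank - 1]
    return [counts.get(tss, 0) / sum(counts.values()) for counts in counts_by_line]

# Toy gene in two cell lines: TSS "A" dominates the first, "B" the second.
toy = [{"A": 80, "B": 20}, {"A": 30, "B": 70}]
local_major = usage_by_local_rank(toy, 1)    # major TSS differs between lines
global_major = usage_by_global_rank(toy, 1)  # "A" is the pooled major TSS
print(local_major, global_major)
```

With per-line ranks, the "major TSS" series follows whichever TSS is most used in each line; with global ranks, it follows one fixed TSS across all lines.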


Along with the observations in Fig 3C, these results suggest that only a small fraction of genes may have different optimal TSSs in different cell types. To estimate this fraction, we first counted the number of different major TSSs observed in each gene across the five cell lines (N; Fig 3D), because if all five cell lines share the same major TSS, it is most likely that they all share the same optimal TSS. We found that, of 7, genes examined, 6, or We also examined the maximum possible number of different major TSSs in the five cell lines (M) for each gene, which is the smaller of 5 and the total number of TSSs observed in the five cell lines for the gene (Fig 3D). Even when different cell lines show different major TSSs, the optimal TSS could still be the same in these cell lines, because the observation could be due to sampling error caused by limited sequencing depths.
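As an illustration of how N and M are defined, the following sketch computes both for a single toy gene. The data layout (one dictionary of TSS read counts per cell line), the function names, and the tie-breaking rule for the major TSS are assumptions, not details taken from the source.

```python
def major_tss(counts):
    """Most-used TSS in one cell line; ties broken by label (an assumption)."""
    top = max(counts.values())
    return min(t for t, c in counts.items() if c == top)

def observed_and_max_major(counts_by_line):
    """Return (N, M) for one gene.

    N: number of different major TSSs observed across the cell lines.
    M: the smaller of the number of cell lines and the number of TSSs seen.
    """
    n_observed = len({major_tss(counts) for counts in counts_by_line})
    all_tss = {t for counts in counts_by_line for t, c in counts.items() if c > 0}
    m_max = min(len(counts_by_line), len(all_tss))
    return n_observed, m_max

# Toy gene: four cell lines favor TSS "A", one favors "B"; three TSSs in total.
toy_gene = [
    {"A": 8, "B": 2},
    {"A": 7, "B": 3},
    {"A": 9, "C": 1},
    {"A": 4, "B": 6},
    {"A": 10},
]
N, M = observed_and_max_major(toy_gene)
print(N, M)
```

Here N is 2 and M is 3: two distinct major TSSs are observed, while at most three could have been, because only three TSSs appear in the data.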

To examine this possibility, for each gene, we randomly shuffled its CAGE-seq reads among the five cell lines without altering the number of reads in each cell line and then used the shuffled data to count the number of different major TSSs in the five cell lines. We repeated this process 10, times and estimated the mean number of major TSSs for the gene in the shuffled data (n; Fig 3E) and the fraction of times (f) when the number of major TSSs observed in the shuffled data equals or exceeds that in the actual data. Here, f is an estimate of the one-tailed P-value in testing the null hypothesis that all cell lines share the same major TSS. Thus, approximately 4. The above analysis assumed that the major TSS of a gene in a cell line is the optimal TSS in the cell line, but this may not always be the case: using the optimal TSS more than any other TSS in every cell type for every gene could be difficult given the limited power of transcription start regulation.
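The shuffling test described above can be sketched as follows. This is an illustrative reading of the procedure, not the authors' code: each read is represented by the label of its TSS, reads are pooled and redistributed so that every cell line keeps its original read count, and the function names, tie-breaking rule, and seed are hypothetical. The default of 1,000 shuffles is chosen only to keep the example fast and is not the number used in the study.

```python
import random
from collections import Counter

def major_tss(tss_reads):
    """Most-used TSS among a cell line's reads; ties broken by label (an assumption)."""
    counts = Counter(tss_reads)
    top = max(counts.values())
    return min(t for t, c in counts.items() if c == top)

def count_major_tss(reads_by_line):
    """Number of different major TSSs across cell lines."""
    return len({major_tss(reads) for reads in reads_by_line})

def shuffle_test(reads_by_line, n_shuffles=1000, seed=0):
    """Return (N, n, f) for one gene.

    N: observed number of different major TSSs.
    n: mean number of different major TSSs in the shuffled data.
    f: fraction of shuffles whose count equals or exceeds N.
    """
    rng = random.Random(seed)
    observed = count_major_tss(reads_by_line)
    depths = [len(reads) for reads in reads_by_line]
    pooled = [tss for reads in reads_by_line for tss in reads]
    total, at_least = 0, 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        # Deal the shuffled reads back out, preserving each line's depth.
        shuffled, start = [], 0
        for depth in depths:
            shuffled.append(pooled[start:start + depth])
            start += depth
        k = count_major_tss(shuffled)
        total += k
        at_least += k >= observed
    return observed, total / n_shuffles, at_least / n_shuffles

# Toy gene: four cell lines use only TSS "A"; the fifth uses only TSS "B".
toy_reads = [["A"] * 100 for _ in range(4)] + [["B"] * 100]
N_obs, n_expected, f = shuffle_test(toy_reads, n_shuffles=200)
print(N_obs, n_expected, f)
```

In this toy gene the observed N is 2, whereas shuffled data almost always yield a single major TSS, so f is close to zero and the null hypothesis of a shared major TSS would be rejected. A gene whose cell lines differ only through sampling noise would instead give an f that is not small.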

