Last updated: 2020-10-25
Checks: 6 1
Knit directory: Comparative_eQTL/analysis/
This reproducible R Markdown analysis was created with workflowr (version 1.5.0). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.
The R Markdown file has unstaged changes. To know which version of the R Markdown file created these results, you’ll want to first commit it to the Git repo. If you’re still working on the analysis, you can ignore this warning. When you’re finished, you can run wflow_publish
to commit the R Markdown file and build the HTML.
Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.
The command set.seed(20190319)
was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
Great job! Recording the operating system, R version, and package versions is critical for reproducibility.
Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.
Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.
Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility. The version displayed above was the version of the Git repository at the time these results were generated.
Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish
or wflow_git_commit
). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:
Ignored files:
Ignored: .DS_Store
Ignored: .RData
Ignored: .Rhistory
Ignored: .Rproj.user/
Ignored: WorkingManuscript.zip
Ignored: WorkingManuscript/
Ignored: analysis/.DS_Store
Ignored: analysis/.Rhistory
Ignored: analysis_temp/.DS_Store
Ignored: big_data/
Ignored: code/.DS_Store
Ignored: code/snakemake_workflow/.DS_Store
Ignored: code/snakemake_workflow/.Rhistory
Ignored: data/.DS_Store
Ignored: data/PastAnalysesDataToKeep/.DS_Store
Ignored: figures/
Ignored: output/.DS_Store
Untracked files:
Untracked: analysis/.index.Rmd.swp
Untracked: analysis/20200907_Response_Point_02.Rmd
Untracked: analysis/20200907_Response_Point_04.Rmd
Untracked: analysis/20200907_Response_Point_07.Rmd
Untracked: analysis/20200907_Response_Point_08.Rmd
Untracked: analysis/20200907_Response_Point_11_2.Rmd
Untracked: analysis/20201012_MakeSourceData.Rmd
Untracked: data/FinalSourceData/
Untracked: data/c5.all.v7.1.symbols.gmt
Untracked: data/c5.all.v7.1.symbols.gmt.categories.tsv.gz
Untracked: data/h.all.v7.1.symbols.gmt
Unstaged changes:
Modified: README.md
Modified: analysis/20200907_Response_OriginalComments.Rmd
Modified: analysis/20200907_Response_Point_06.Rmd
Modified: analysis/20200907_Response_Point_09-2.Rmd
Modified: analysis/20200907_Response_Point_09.Rmd
Modified: analysis/20200907_Response_Point_11.Rmd
Modified: analysis/Final_2_DispersionPlots.Rmd
Modified: analysis/about.Rmd
Modified: analysis/index.Rmd
Modified: analysis_temp/TabulaMuris_analysis2.Rmd
Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.
These are the previous versions of the R Markdown and HTML files. If you’ve configured a remote Git repository (see ?wflow_git_remote
), click on the hyperlinks in the table below to view them.
File | Version | Author | Date | Message |
---|---|---|---|---|
Rmd | e3ed68f | Benjmain Fair | 2020-09-09 | update site, address reviewers |
html | e3ed68f | Benjmain Fair | 2020-09-09 | update site, address reviewers |
Original reviewer point:
Have the authors tried estimating dispersion on top of what is expected based on differences in cell type? There are several strategies that might work for this: There are new strategies for estimating a posterior of cell type specific expression from a bulk sample, conditional on scRNA-seq data as prior information (Chu and Danko, bioRxiv, 2020). These cell type specific expression estimates could then be analyzed for dispersion. Alternatively, it may also work to regress the estimated proportion of each cell type out of the dispersion estimates. While there are certainly a lot of pitfalls with using these strategies, especially in the setting shown here (all of this would work better if there were species matched reference data), they might provide an avenue for depleting the contribution of cell type differences from dispersion estimates.
Here I will use the Chu and Danko methodology (“TED”, or “BayesPrism”, see https://github.com/Danko-Lab/TED) to estimate cell type specific expression estimates for each individual, and then estimate a cell type specific dispersion estimate. As the reviewer notes, there a lot of pitfalls with this strategy. Really, in the absence of scRNA-seq data (and a human scRNA-seq reference), there is only so much we can gain I think. But still worth a try.
library(TED)
library(tidyverse)
library(biomaRt)
library(Seurat)
library(gplots)
source("../code/CustomFunctions.R")
library(knitr)
library(scales)
library(latex2exp)
This block references a file that I keep on my local computer, that is not reproducible from the snakemake. This rds file references a seurat object I created from publicaly available mouse scRNA-seq data (Tabula Muris) using this Rscript.
Heart.seur = readRDS("../big_data/TabMuris_heart_seurat.rds")
Some other data to read in… This includes the bulk RNA-seq count data from this study (39 human, 39 chimp), and some other necessary data to convert mouse gene names to human, etc.
#read in table of overdispersion, dispersion, mean expresion estimates
Dispersion <- read.delim('../output/OverdispersionEstimatesFromChimp.txt')
Dispersion$MeanDispersion <- (Dispersion$Chimp.Residual + Dispersion$Human.Residual)/2
CountTableChimpFile <- '../output/PowerAnalysisFullCountTable.Chimp.subread.txt.gz'
CountTableHumanFile <- '../output/PowerAnalysisFullCountTable.Human.subread.txt.gz'
OutputDE <- '../output/Final/TableS2.tab'
DropFileName <- '../data/DE_SamplesToDrop.txt'
DropFile <- read.delim(DropFileName, sep='\t', col.names = c("Sample", "Species"), stringsAsFactors = F)
HumanSamplesToDrop <- DropFile %>% filter(Species=="Human") %>% pull(Sample)
ChimpSamplesToDrop <- DropFile %>% filter(Species=="Chimp") %>% pull(Sample)
DE.results <- read.delim(OutputDE, sep='\t', stringsAsFactors = F)
GeneListForOverdispersionCalculation <- DE.results$Ensembl_geneID
CountTables <- GetCountTables(CountTableChimpFile,
CountTableHumanFile,
0, GeneListForOverdispersionCalculation, ChimpSampleDrop=ChimpSamplesToDrop, HumanSampleDrop = HumanSamplesToDrop)
human = useMart("ensembl", dataset = "hsapiens_gene_ensembl")
mouse = useMart("ensembl", dataset = "mmusculus_gene_ensembl")
MouseGenes = rownames(Heart.seur)
genes = getLDS(attributes = c("mgi_symbol"), filters = "mgi_symbol", values = MouseGenes ,mart = mouse, attributesL = c("ensembl_gene_id", "chromosome_name", "start_position", "mmusculus_homolog_orthology_type"), martL = human, uniqueRows=T)
# Get list of one to one orthologs in both tabula muris and our dataset for further analysis
one2one_HumanMouseOrthologs <- genes %>%
filter(Mouse.homology.type=="ortholog_one2one") %>%
filter(Gene.stable.ID %in% Dispersion$gene) %>%
dplyr::select(MGI.symbol, Gene.stable.ID) %>%
distinct(MGI.symbol, .keep_all = T)
Now prepare input data from run.Ted
function:
CellTypes <- data.frame(Cell.ID=colnames(Heart.seur@assays$RNA), CellType=Heart.seur@meta.data$Cell.ontology.class)
scCountTable.Filtered <- Heart.seur@assays$RNA[one2one_HumanMouseOrthologs$MGI.symbol,] %>%
as.matrix() %>%
as.data.frame() %>%
rownames_to_column() %>%
inner_join(one2one_HumanMouseOrthologs, by=c("rowname"="MGI.symbol")) %>%
dplyr::select(-rowname) %>%
column_to_rownames("Gene.stable.ID") %>% t() %>%
as.data.frame() %>%
rownames_to_column("Cell.ID") %>%
inner_join(CellTypes, by="Cell.ID") %>%
filter(!CellType == "unassigned cell type") %>%
dplyr::select(-CellType) %>%
column_to_rownames("Cell.ID") %>% t()
scCountTable.Filtered[1:10,1:10] %>% kable()
A1.B000412.3_56_F.1.1 | A1.B000633.3_56_F.1.1 | A1.B002423.3_39_F.1.1 | A1.B002427.3_39_F.1.1 | A1.B002428.3_38_F.1.1 | A1.MAA000398.3_9_M.1.1 | A1.MAA000400.3_8_M.1.1 | A1.MAA000898.3_11_M.1.1 | A1.MAA000899.3_10_M.1.1 | A1.MAA000903.3_11_M.1.1 | |
---|---|---|---|---|---|---|---|---|---|---|
ENSG00000131779 | 0 | 156 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
ENSG00000131788 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 13 | 14 | 0 |
ENSG00000265491 | 295 | 0 | 516 | 0 | 0 | 0 | 59 | 0 | 135 | 0 |
ENSG00000186141 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 305 |
ENSG00000186364 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
ENSG00000198483 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 137 | 0 | 0 |
ENSG00000265972 | 676 | 555 | 0 | 1 | 82 | 414 | 0 | 1068 | 299 | 467 |
ENSG00000134250 | 0 | 48 | 181 | 1 | 36 | 12 | 269 | 583 | 12 | 828 |
ENSG00000271601 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
ENSG00000143127 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
CellTypeVector <- CellTypes %>%
filter(!CellType == "unassigned cell type") %>% pull(CellType) %>% as.character()
head(CellTypeVector)
[1] "endothelial cell" "leukocyte" "fibroblast"
[4] "myofibroblast cell" "endocardial cell" "fibroblast"
Bulk.CountTable <- cbind(CountTables$Chimp$Counts, CountTables$Human$Counts) %>%
rownames_to_column() %>%
inner_join(one2one_HumanMouseOrthologs, by=c("rowname"="Gene.stable.ID")) %>%
dplyr::select(-MGI.symbol) %>%
column_to_rownames()
Bulk.CountTable[1:10,1:10] %>% kable()
C.4x0519 | C.4x373 | C.476 | C.4x0025 | C.438 | C.462 | C.4X0354 | C.503 | C.317 | C.623 | |
---|---|---|---|---|---|---|---|---|---|---|
ENSG00000186891 | 46 | 73 | 20 | 28 | 4 | 11 | 98 | 13 | 11 | 3 |
ENSG00000186827 | 274 | 393 | 313 | 397 | 173 | 128 | 975 | 236 | 307 | 84 |
ENSG00000078808 | 5967 | 1931 | 7637 | 3892 | 2957 | 3745 | 1189 | 5857 | 4306 | 3704 |
ENSG00000176022 | 1097 | 402 | 1574 | 1333 | 955 | 1235 | 416 | 1087 | 745 | 640 |
ENSG00000160087 | 761 | 283 | 1153 | 776 | 471 | 610 | 275 | 920 | 547 | 604 |
ENSG00000131584 | 2587 | 1537 | 3646 | 2546 | 1755 | 1323 | 4585 | 2250 | 1484 | 1872 |
ENSG00000169972 | 282 | 78 | 396 | 257 | 209 | 282 | 111 | 243 | 141 | 252 |
ENSG00000169962 | 320 | 111 | 366 | 80 | 178 | 145 | 269 | 416 | 237 | 107 |
ENSG00000107404 | 7019 | 3600 | 11211 | 7397 | 5731 | 4670 | 3942 | 9779 | 5566 | 6768 |
ENSG00000162576 | 3701 | 1059 | 1384 | 2114 | 1426 | 2418 | 1242 | 1360 | 2221 | 1948 |
Now, use run.Ted
function to estimate cell type porportions and also per individual expression within each cell type. In this code block, which I have run before knitting the Rmarkdown, I will also save the results to an RDS file that I will read in the next code block, so I don’t have to re-run this time consuming function when I want to knit the Rmarkdown.
Ted.results <- run.Ted(t(scCountTable.Filtered), t(Bulk.CountTable), pheno.labels = CellTypeVector, input.type = "scRNA", n.cores = 1)
saveRDS(Ted.results, file = "../big_data/TedResults.rds")
Now read in the ted results.
Ted.results <- readRDS("../big_data/TedResults.rds")
Ted.Expression.Per.CellType <- Ted.results$res$first.gibbs.res$Znkg #mean reads in each cell type
Ted.CellTypePorportions <- Ted.results$res$first.gibbs.res$theta.merged #cell fraction
Ok, now let’s compare the cell type porportions to CIBERSORT results…
#Ted cell type porportions table
Ted.CellTypePorportions %>% head() %>% kable()
endothelial cell | leukocyte | fibroblast | myofibroblast cell | endocardial cell | cardiac muscle cell | professional antigen presenting cell | erythrocyte | smooth muscle cell | |
---|---|---|---|---|---|---|---|---|---|
C.4x0519 | 0.0655604 | 0.0015653 | 0.0279680 | 0.0863701 | 0.0335372 | 0.5969150 | 0.0281230 | 0.1059831 | 0.0539779 |
C.4x373 | 0.0729209 | 0.0000807 | 0.0353951 | 0.0581507 | 0.0102430 | 0.5956729 | 0.0252973 | 0.0969409 | 0.1052985 |
C.476 | 0.0443883 | 0.0000164 | 0.0007314 | 0.0427403 | 0.0154651 | 0.7444661 | 0.0230774 | 0.0774483 | 0.0516668 |
C.4x0025 | 0.0840676 | 0.0002683 | 0.0140482 | 0.1144108 | 0.0322334 | 0.5566394 | 0.0434807 | 0.1087926 | 0.0460590 |
C.438 | 0.0438903 | 0.0000171 | 0.0061195 | 0.0420949 | 0.0162850 | 0.7467382 | 0.0235200 | 0.0658731 | 0.0554619 |
C.462 | 0.0661200 | 0.0005401 | 0.0194387 | 0.0645938 | 0.0215091 | 0.6691649 | 0.0268254 | 0.0845318 | 0.0472763 |
Ted.CellTypePorportions.df <- Ted.CellTypePorportions %>% as.data.frame() %>%
rownames_to_column("Input.Sample") %>%
select_all(~gsub("\\s+", ".", .)) %>%
arrange(cardiac.muscle.cell) %>%
gather(key="cell.type", value="percent", -Input.Sample) %>%
mutate(Species = case_when(str_detect(Input.Sample, "C.") ~ "Chimp",
str_detect(Input.Sample, "H.") ~ "Human"))
#Ted cell type porportions as plot
TedProp <- ggplot(Ted.CellTypePorportions.df, aes(x=reorder(Input.Sample, percent, FUN=max), y=percent, fill=cell.type)) +
geom_bar(stat="identity") +
scale_y_continuous(limits = c(-0.001,1.001), expand = c(0, 0)) +
facet_grid(~Species, scales="free_x", space="free_x") +
theme_bw() +
theme(axis.title.x=element_blank(),
axis.text.x=element_blank(),
axis.ticks.x=element_blank())
TedProp
Version | Author | Date |
---|---|---|
e3ed68f | Benjmain Fair | 2020-09-09 |
#CIBERSORT cell type porportions
CIBERSORT <- read.table("../data/CIBERSORT.Output_Job2.csv", sep=',', header=T) %>%
arrange(cardiac.muscle.cell) %>%
gather(key="cell.type", value="percent", -Input.Sample, -P.value, -Pearson.Correlation, -RMSE) %>%
mutate(Species = case_when(str_detect(Input.Sample, "C.") ~ "Chimp",
str_detect(Input.Sample, "H.") ~ "Human"))
#As a plot...
ggplot(CIBERSORT, aes(x=reorder(Input.Sample, percent, FUN=max), y=percent, fill=cell.type)) +
geom_bar(stat="identity") +
scale_y_continuous(limits = c(-0.001,1.001), expand = c(0, 0)) +
facet_grid(~Species, scales="free_x", space="free_x") +
theme_bw() +
theme(axis.title.x=element_blank(),
axis.text.x=element_blank(),
axis.ticks.x=element_blank())
Version | Author | Date |
---|---|---|
e3ed68f | Benjmain Fair | 2020-09-09 |
#Compare CIBERSORT and TED cell porportions
inner_join(CIBERSORT, Ted.CellTypePorportions.df, by=c("Input.Sample", "cell.type"), suffix=c(".CIBERSORT", ".TED")) %>%
ggplot(aes(x=percent.TED, y=percent.CIBERSORT, color=cell.type)) +
geom_point(aes(shape=Species.TED)) +
theme_bw()
Version | Author | Date |
---|---|---|
e3ed68f | Benjmain Fair | 2020-09-09 |
inner_join(CIBERSORT, Ted.CellTypePorportions.df, by=c("Input.Sample", "cell.type"), suffix=c(".CIBERSORT", ".TED")) %>% dplyr::select(percent.CIBERSORT, percent.TED) %>% cor()
percent.CIBERSORT percent.TED
percent.CIBERSORT 1.000000 0.936489
percent.TED 0.936489 1.000000
inner_join(CIBERSORT, Ted.CellTypePorportions.df, by=c("Input.Sample", "cell.type"), suffix=c(".CIBERSORT", ".TED")) %>%
ggplot(aes(x=percent.TED, y=percent.CIBERSORT, color=cell.type)) +
geom_point(aes(shape=Species.TED)) +
facet_wrap(~cell.type, scales = "free") +
theme_bw()
Version | Author | Date |
---|---|---|
e3ed68f | Benjmain Fair | 2020-09-09 |
Ok, some reasonable agreement between the cell type porportions for some cell types, but not for others (particularly the less prevalent cell types).
Ok, now let’s explore TED’s output for expression per cell type per individual, and consider estimating dispersion in a cell type specific fashion in a way similar to the bulk dispersion estimates.
Ted.Expression.Per.CellType %>% dim()
[1] 79 9 10001
dimnames(Ted.Expression.Per.CellType)[[1]] <- rownames(Ted.results$res$first.gibbs.res$theta.merged)
dimnames(Ted.Expression.Per.CellType)[[2]] <- colnames(Ted.results$res$first.gibbs.res$theta.merged)
dimnames(Ted.Expression.Per.CellType)[[3]] <- colnames(Ted.results$res$first.gibbs.res$Zkg)
Ted.Expression.Per.CellType.df <- Ted.Expression.Per.CellType %>%
as.data.frame() %>% t() %>% as.data.frame() %>%
rownames_to_column() %>%
separate(rowname, into=c("CellType", "gene"), sep="\\.")
Ted.Expression.Per.CellType.df %>% head() %>% kable()
CellType | gene | C.4x0519 | C.4x373 | C.476 | C.4x0025 | C.438 | C.462 | C.4X0354 | C.503 | C.317 | C.623 | C.554 | C.4X0095 | C.549 | C.4X0267 | C.4X0339 | C.529 | C.495 | C.338 | C.4X0550 | C.558 | C.4x523 | C.389 | C.4x0043 | C.4X0357 | C.88A020 | C.554_2 | C.Little_R | C.676 | C.456 | C.537 | C.MD_And | C.724 | C.4x0430 | C.95A014 | C.4X0333 | C.522 | C.295 | C.4X0212 | C.570 | H.SRR1477033 | H.SRR598148 | H.SRR601645 | H.SRR599086 | H.SRR604230 | H.SRR1478189 | H.SRR613186 | H.59263 | H.62905 | H.63060 | H.SRR600852 | H.62606 | H.SRR604174 | H.SRR608096 | H.SRR1489693 | H.62765 | H.61317 | H.59167 | H.SRR602106 | H.SRR601868 | H.SRR612335 | H.SRR598589 | H.SRR600474 | H.SRR603449 | H.SRR606939 | H.SRR1481012 | H.SRR599249 | H.59365 | H.SRR607313 | H.63145 | H.SRR1477569 | H.SRR601239 | H.SRR607252 | H.SRR612875 | H.SRR601613 | H.SRR614683 | H.SRR604122 | H.59511 | H.62850 | H.SRR599380 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
endothelial cell | ENSG00000271601 | 33.62 | 14.13 | 42.24 | 26.24 | 13.91 | 20.09 | 9.18 | 20.97 | 21.35 | 18.85 | 19.36 | 24.77 | 20.95 | 26.77 | 5.61 | 11.77 | 27.47 | 34.49 | 21.10 | 12.77 | 17.93 | 21.59 | 16.31 | 10.93 | 13.42 | 20.70 | 15.22 | 20.33 | 13.27 | 13.56 | 15.65 | 37.65 | 23.02 | 19.14 | 31.34 | 10.73 | 45.42 | 22.09 | 24.04 | 11.76 | 19.25 | 19.48 | 19.62 | 47.66 | 9.97 | 3.02 | 31.37 | 12.16 | 13.86 | 10.79 | 6.76 | 26.20 | 12.44 | 25.66 | 11.75 | 7.45 | 9.67 | 13.84 | 12.64 | 22.52 | 30.10 | 13.48 | 9.83 | 31.99 | 13.10 | 35.29 | 21.13 | 30.34 | 17.91 | 13.35 | 18.78 | 11.34 | 16.69 | 16.05 | 14.90 | 46.83 | 10.28 | 20.34 | 34.97 |
leukocyte | ENSG00000271601 | 0.21 | 0.00 | 0.00 | 0.03 | 0.00 | 0.07 | 0.00 | 0.00 | 0.00 | 0.17 | 0.00 | 0.04 | 0.04 | 0.02 | 0.00 | 0.11 | 0.24 | 1.42 | 0.00 | 0.02 | 0.60 | 0.01 | 0.01 | 0.00 | 0.11 | 0.00 | 0.00 | 0.09 | 0.04 | 0.02 | 0.01 | 0.48 | 1.04 | 0.01 | 0.00 | 0.00 | 0.01 | 0.13 | 0.07 | 0.07 | 0.54 | 0.17 | 0.12 | 1.00 | 0.03 | 0.00 | 1.90 | 0.14 | 0.16 | 0.08 | 0.09 | 0.19 | 0.09 | 0.01 | 0.20 | 0.01 | 0.05 | 0.04 | 0.03 | 1.03 | 0.66 | 0.02 | 0.01 | 0.25 | 0.02 | 0.09 | 0.17 | 0.37 | 0.05 | 0.00 | 0.19 | 0.02 | 0.11 | 0.07 | 0.07 | 0.78 | 0.00 | 0.27 | 0.24 |
fibroblast | ENSG00000271601 | 50.30 | 23.94 | 2.19 | 16.15 | 7.13 | 20.59 | 0.45 | 1.35 | 30.27 | 15.82 | 0.76 | 6.19 | 78.02 | 24.31 | 1.84 | 24.80 | 29.23 | 21.01 | 12.99 | 16.75 | 55.92 | 17.19 | 21.82 | 2.27 | 3.12 | 9.69 | 27.92 | 6.36 | 7.51 | 43.36 | 5.70 | 6.14 | 7.56 | 6.27 | 31.85 | 13.02 | 23.51 | 8.98 | 12.03 | 4.70 | 21.01 | 6.99 | 10.55 | 13.38 | 1.20 | 6.57 | 65.89 | 35.78 | 16.15 | 2.48 | 17.04 | 35.84 | 53.90 | 16.39 | 4.36 | 20.49 | 21.04 | 4.73 | 4.74 | 48.90 | 13.27 | 9.56 | 2.13 | 4.22 | 7.88 | 13.63 | 41.32 | 14.15 | 13.29 | 6.61 | 0.58 | 12.23 | 23.51 | 19.30 | 2.60 | 14.30 | 15.60 | 11.97 | 2.09 |
myofibroblast cell | ENSG00000271601 | 116.55 | 30.33 | 108.13 | 97.52 | 36.27 | 53.13 | 13.36 | 50.35 | 63.34 | 49.61 | 53.00 | 59.96 | 58.25 | 64.29 | 9.13 | 27.65 | 52.42 | 78.03 | 72.87 | 24.29 | 73.06 | 63.32 | 43.24 | 21.50 | 44.20 | 55.98 | 15.05 | 64.04 | 45.34 | 26.86 | 31.00 | 76.45 | 61.39 | 53.24 | 91.24 | 35.34 | 92.34 | 47.52 | 71.77 | 42.96 | 51.82 | 56.50 | 45.18 | 165.63 | 29.82 | 6.36 | 146.65 | 32.01 | 44.51 | 36.22 | 27.54 | 142.43 | 67.65 | 46.80 | 73.26 | 21.68 | 21.12 | 38.41 | 41.20 | 127.33 | 112.89 | 43.43 | 33.31 | 92.20 | 62.67 | 89.46 | 101.15 | 82.89 | 50.59 | 38.47 | 55.65 | 73.29 | 55.28 | 65.74 | 45.63 | 137.46 | 26.61 | 76.28 | 93.03 |
endocardial cell | ENSG00000271601 | 31.16 | 3.92 | 26.70 | 19.68 | 9.65 | 11.95 | 5.28 | 12.36 | 14.82 | 16.35 | 13.50 | 17.46 | 25.39 | 12.91 | 2.49 | 3.31 | 15.32 | 26.75 | 19.81 | 6.29 | 19.80 | 17.07 | 17.65 | 6.72 | 15.93 | 10.08 | 4.63 | 18.06 | 16.22 | 8.29 | 18.26 | 26.58 | 18.12 | 16.24 | 18.83 | 10.58 | 23.71 | 18.37 | 16.25 | 10.09 | 12.83 | 18.87 | 9.55 | 39.83 | 6.53 | 1.55 | 34.51 | 6.55 | 10.47 | 6.08 | 4.97 | 21.64 | 6.53 | 13.75 | 12.44 | 5.57 | 7.05 | 7.91 | 9.45 | 25.48 | 22.62 | 9.17 | 5.91 | 24.50 | 12.93 | 12.54 | 27.18 | 23.70 | 14.48 | 8.45 | 15.78 | 11.76 | 11.34 | 14.87 | 8.90 | 33.58 | 7.22 | 17.13 | 22.41 |
cardiac muscle cell | ENSG00000271601 | 87.27 | 32.39 | 202.95 | 51.15 | 67.67 | 58.45 | 6.95 | 115.05 | 55.18 | 58.21 | 53.82 | 68.82 | 58.48 | 89.59 | 21.19 | 10.95 | 13.14 | 71.15 | 88.51 | 9.18 | 55.66 | 111.10 | 53.40 | 36.71 | 70.14 | 59.25 | 1.78 | 43.86 | 77.88 | 3.11 | 51.23 | 91.64 | 61.93 | 52.74 | 66.59 | 63.03 | 157.06 | 72.20 | 39.53 | 53.53 | 23.89 | 57.67 | 29.88 | 92.89 | 36.59 | 1.37 | 47.74 | 42.10 | 40.69 | 35.90 | 28.13 | 66.33 | 22.43 | 13.40 | 59.27 | 10.76 | 4.75 | 26.57 | 53.95 | 40.19 | 83.37 | 40.37 | 31.79 | 96.59 | 59.24 | 87.82 | 58.41 | 83.78 | 43.75 | 41.68 | 57.98 | 43.50 | 37.62 | 48.77 | 39.06 | 79.05 | 20.32 | 79.57 | 84.64 |
Are these expression values? Are they on a log scale? I should read some documentation, but let’s also just look at a histogram. I expect gene expression across different genes in an individual to appear roughly bell shaped on a log-scale.
hist(Ted.Expression.Per.CellType.df$C.4x0519)
Version | Author | Date |
---|---|---|
e3ed68f | Benjmain Fair | 2020-09-09 |
hist(log10(Ted.Expression.Per.CellType.df$C.4x0519))
Version | Author | Date |
---|---|---|
e3ed68f | Benjmain Fair | 2020-09-09 |
Yeah, so I think these are un-log transformed expression values. Are they normalized for sequencing depth per individual per cell type? Let’s check the median gene expression level for each individual x cell type combination… If the expression values are already normalized, I expect the median gene expression level to be roughly similar (at least, more similar than the >4 fold range in sequencing depth across samples)
#What is the median gene's expression level across each sample across each cell type
Ted.Expression.Per.CellType.df %>%
dplyr:: select(CellType, gene, starts_with("C.")) %>%
gather("Ind", "Expression", -CellType, -gene) %>%
group_by(Ind, CellType) %>%
summarise(Med=median(Expression, na.rm=T)) %>%
ggplot(aes(y=Med, x=Ind, fill=CellType)) +
geom_col() +
facet_wrap(~CellType) +
theme_bw() +
theme(
axis.text.x=element_blank(),
axis.ticks.x=element_blank())
Version | Author | Date |
---|---|---|
e3ed68f | Benjmain Fair | 2020-09-09 |
Ok, things may not be normalized across cell types, but if I want to estimate dispersion on a per-cell type basis that may not matter… Let’s see if the median counts are related to total sequencing depth…
#Total gene mapping read counts from bulk count table
TotalBulkCount.df <- data.frame(TotalBulkCount = cbind(CountTables$Chimp$Counts, CountTables$Human$Counts) %>% colSums()) %>%
rownames_to_column("Ind")
#Plot median gene expression versus bulk library size (colSums)
Ted.Expression.Per.CellType.df %>%
dplyr:: select(CellType, gene, starts_with("C.")) %>%
gather("Ind", "Expression", -CellType, -gene) %>%
group_by(Ind, CellType) %>%
summarise(Med=median(Expression, na.rm=T)) %>%
left_join(TotalBulkCount.df, by="Ind") %>%
ggplot(aes(x=Med, y=TotalBulkCount, color=CellType)) +
geom_point() +
facet_wrap(~CellType, scales = "free") +
theme_bw()
Version | Author | Date |
---|---|---|
e3ed68f | Benjmain Fair | 2020-09-09 |
Ok, since total counts (columnSums) from the bulk matrix strongly correlates with the median gene expression measurements output by TED for most cell types, I conclude that this TED output is estimates of the original count table, (though, obviously not integer counts).
First, let’s quickly write out the TED results and write out some other things for response to review.
write_delim(Ted.Expression.Per.CellType.df, "../figures/OriginalArt/TED.expressionPerCelltype.tsv.gz", delim = '\t')
write_delim(Ted.CellTypePorportions.df, "../figures/OriginalArt/TED.CelltypeProportions.source.tsv", delim='\t')
ggsave(TedProp, filename = "../figures/OriginalArt/TED.CelltypeProportions.pdf", height=3, width=6)
#And do PCA with TED cell proportions and establish a correlation with first few PCs
library(psych)
GeneExpMatrix<-read.table("../output/log10TPM.StandardizedAndNormalized.txt", header=T, check.names = F) %>% column_to_rownames("IID")
PCA.results <- prcomp(t(GeneExpMatrix))
PCs <- PCA.results$x %>% as.data.frame() %>%
dplyr::select(PC1:PC4) %>%
rownames_to_column() %>%
arrange(rowname) %>%
column_to_rownames()
CellProportions.wide <- Ted.CellTypePorportions.df %>%
mutate(Input.Sample = str_remove(Input.Sample, "^C.")) %>%
filter(Input.Sample %in% rownames(PCs)) %>%
dplyr::select(-Species) %>%
spread(key="cell.type", value="percent") %>%
arrange(Input.Sample) %>%
column_to_rownames("Input.Sample") %>%
as.matrix() %>% scale(scale = F)
MyCorMatrix <- corr.test(x=PCs,
y=CellProportions.wide,
method="pearson",
adjust="BH")
inner_join(
MyCorMatrix$r %>% as.data.frame() %>% rownames_to_column() %>% gather(key="celltype", value="R", -rowname),
MyCorMatrix$p %>% as.data.frame() %>% rownames_to_column() %>% gather(key="celltype", value="P.adjust", -rowname),
by=c("rowname", "celltype")) %>%
mutate(Ptext=signif(P.adjust, 2)) %>%
ggplot(aes(x=rowname, y=celltype)) +
scale_fill_gradient2() +
scale_color_viridis_c(trans="log10", direction=-1, limits=c(1E-9, 1)) +
geom_tile(aes(fill=R)) +
# geom_point(aes(color=P.adjust), size=3) +
geom_text(angle=45, aes(label=Ptext)) +
coord_flip() +
ylab("") +
xlab("") +
theme_classic() +
theme(axis.text.x = element_text(angle = 45,hjust = 1))
# theme(legend.position = "bottom")
ggsave("../figures/OriginalArt/TED.CelltypeProportions.vs.PCs.pdf", height=5, width=6)
Perhaps the most reasonable way to estimate dispersion per cell type from this TED output is to first transform the TED expression matrices to log(CPM), and, similarly to the dispersion estimation procedure in the original manuscript, fit a loess curve to the trend between the population mean expression and log(variance).
Though I have an intuition that \(log(variance(logCPM))\) is analogous to the \(log(\phi)\) that I used to estimate dispersion, I can’t derive mathematically why \(log(variance(logCPM))\)) is the appropriate metric (as opposed to \(sd(logCPM)\) for example). But, I can show empirically show that \(log(variance(logCPM))\) is basically equivalent to the \(log(\phi)\) parameter I estimated from fitting a negative binomial to the bulk raw count data:
#Compare the overdispersion (phi) estimate to the log(variance(logCPM)).
CountTables$Chimp$Counts %>%
rownames_to_column("gene") %>%
gather("Ind", "Expression", -gene) %>%
left_join(TotalBulkCount.df, by="Ind") %>%
mutate(Log.CPM.Expression=log(Expression/TotalBulkCount*1E6)) %>%
group_by(gene) %>%
summarise(mu=mean(Log.CPM.Expression), log.var.logcpm=log(var(Log.CPM.Expression))) %>%
left_join(Dispersion, by="gene") %>%
# head() %>%
ggplot(aes(x=log.var.logcpm, y=log(Chimp.Overdispersion))) +
geom_point()
Version | Author | Date |
---|---|---|
e3ed68f | Benjmain Fair | 2020-09-09 |
#in contrast to the sd(logCPM)...
CountTables$Chimp$Counts %>%
rownames_to_column("gene") %>%
gather("Ind", "Expression", -gene) %>%
left_join(TotalBulkCount.df, by="Ind") %>%
mutate(Log.CPM.Expression=log(Expression/TotalBulkCount*1E6)) %>%
group_by(gene) %>%
summarise(mu=mean(Log.CPM.Expression), sd.logcpm=sd(Log.CPM.Expression)) %>%
left_join(Dispersion, by="gene") %>%
# head() %>%
ggplot(aes(x=sd.logcpm, y=log(Chimp.Overdispersion))) +
geom_point()
Version | Author | Date |
---|---|---|
e3ed68f | Benjmain Fair | 2020-09-09 |
Ok, now let’s look at the loess plots and think about estimating dispersion on a per cell type (and per species) basis.
#loess log(var) vs mean for chimp by cell type
Normalized.Expression.Per.CellType <- Ted.Expression.Per.CellType.df %>%
gather("Ind", "Expression", -CellType, -gene) %>%
left_join(TotalBulkCount.df, by="Ind") %>%
mutate(Log.CPM.Expression=log(Expression/TotalBulkCount*1E6))
#loess fit, just for chimp population
Normalized.Expression.Per.CellType %>%
filter(startsWith(Ind, "C")) %>%
group_by(gene, CellType) %>%
summarise(mu=mean(Log.CPM.Expression), log.var=log(var(Log.CPM.Expression))) %>%
ggplot(aes(x=mu, y=log.var, color=CellType)) +
geom_point(alpha=0.05) +
geom_smooth(method="loess", method.args=list(degree=1)) +
theme_bw()
Version | Author | Date |
---|---|---|
e3ed68f | Benjmain Fair | 2020-09-09 |
#...for both chimp and human
LoessFits <- Normalized.Expression.Per.CellType %>%
mutate(Species = case_when(str_detect(Ind, "C.") ~ "Chimp",
str_detect(Ind, "H.") ~ "Human")) %>%
group_by(gene, CellType, Species) %>%
summarise(mu=mean(Log.CPM.Expression), log.var=log(var(Log.CPM.Expression))) %>%
ggplot(aes(x=mu, y=log.var, color=CellType)) +
geom_point(alpha=0.03) +
xlab(TeX("Mean Expression, $\\hat{\\mu}$")) +
ylab(TeX("Overdispersion, log($\\hat{\\phi})$")) +
geom_smooth(method="loess", method.args=list(degree=1), aes(linetype=Species), color="black") +
facet_wrap(~CellType, labeller = labeller(CellType = label_wrap_gen(width = 25))) +
guides(color = guide_legend(override.aes = list(alpha = 1) ) ) +
theme_bw()
LoessFits
Save this plot. also write code chunk for an alternate visualization of the same thing…
ggsave("../figures/OriginalArt/ResponseToReviewers.CellType.loess.png", LoessFits, height=6,width=8)
Normalized.Expression.Per.CellType %>%
mutate(Species = case_when(str_detect(Ind, "C.") ~ "Chimp",
str_detect(Ind, "H.") ~ "Human")) %>%
group_by(gene, CellType, Species) %>%
summarise(mu=mean(Log.CPM.Expression), log.var=log(var(Log.CPM.Expression))) %>%
ggplot(aes(x=mu, y=log.var)) +
stat_binhex(bins=100) +
scale_fill_viridis_c(option="C") +
xlab(TeX("Mean Expression, $\\hat{\\mu}$")) +
ylab(TeX("Overdispersion, log($\\hat{\\phi})$")) +
geom_smooth(method="loess", method.args=list(degree=1), aes(linetype=Species), color="black") +
facet_wrap(~CellType+Species) +
guides(color = guide_legend(override.aes = list(alpha = 1) ) ) +
theme_bw()
Normalized.Expression.Per.CellType %>%
mutate(Species = case_when(str_detect(Ind, "C.") ~ "Chimp",
str_detect(Ind, "H.") ~ "Human")) %>%
saveRDS(file = "../big_data/NormalizedExpressionPerCellType.rds")
It seems most trends are similar between species for different cell types, except leukocytes which I may drop from further analysis. The fact thiat this leukocyte cell type is an outlier may be biologically meaningful, especially given that ~4 of the chimp samples have been challenged with virus. But it might also be technical, I would rather just avoid that cell type.
So now I will calculate the residual from each loess fit as the cell type specific dispersion. Similar to the bulk case, I may use bootstrapping to estimate standard error and perform inference to compare the human vs chimp dispersion estimates.
Let’s start first just by getting dispersion estimates, and worry about bootstrapping and inference later.
DataToEstimateCellTypeDispersion <- Normalized.Expression.Per.CellType %>%
mutate(Species = case_when(str_detect(Ind, "C.") ~ "Chimp",
str_detect(Ind, "H.") ~ "Human")) %>%
group_by(gene, CellType, Species) %>%
summarise(mu=mean(Log.CPM.Expression), log.var=log(var(Log.CPM.Expression)))
#Estimate dispersion (residual from the loess fit)
CellTypeDispersion <- DataToEstimateCellTypeDispersion %>%
group_by(Species, CellType) %>%
do(data.frame(., resid = residuals(loess(log.var ~ mu, data=., degree=1, na.action="na.exclude"))))
#Plot kernel density estimate of dispersion... One quick way to verify that I correctly obtained the residual of per-species and per-cell-type loess fits, is to look at the leukoctye in chimp and humans. If I did this correctly, I expect the dispersion estimates for every species x cell type combination to be roughly centered around 0, even the leukocytes.
CellTypeDispersion %>%
ggplot(aes(x=resid, color=Species)) +
geom_density() +
facet_wrap(~CellType) +
theme_bw()
Version | Author | Date |
---|---|---|
e3ed68f | Benjmain Fair | 2020-09-09 |
Ok, it seems I have dispersion estimates for each cell type. How to present this in an interpretable way. How about I start by plotting a correlation matrix with hiearchal clustering dendrogram, comparing the dispersion estimates from bulk to those for each cell type, for both species. As an alternative visualization, plot celltype:species pairs in PC space.
Dispersion.Matrix <- Dispersion %>%
dplyr::select(gene, Chimp_Bulk=Chimp.Dispersion, Human_Bulk=Human.Dispersion) %>%
gather(key="Species.CellType", value="dispersion", -gene) %>%
bind_rows(
unite(CellTypeDispersion, Species.CellType, Species, CellType) %>%
dplyr::select(gene, Species.CellType, dispersion=resid)) %>%
dplyr::select(gene, Species.CellType, dispersion) %>%
filter(gene %in% one2one_HumanMouseOrthologs$Gene.stable.ID) %>%
pivot_wider(names_from=Species.CellType, values_from = dispersion) %>%
column_to_rownames("gene") %>%
as.matrix()
Dispersion.Cor.Matrix <- Dispersion.Matrix %>% as.data.frame() %>%
# dplyr::select(-Chimp_leukocyte, -Human_leukocyte) %>%
cor(use="pairwise.complete.obs")
SpeciesFactorCol <- colnames(Dispersion.Cor.Matrix) %>% substr(1,1) %>% factor() %>% unclass() %>% as.character() %>% recode("1"="#F8766D", "2"="#00BFC4")
TissueFactorCol <- data.frame(ColNames=colnames(Dispersion.Cor.Matrix)) %>%
mutate(Tissue=str_remove(ColNames, "^.+?_"))
TissueColors.df <- data.frame(colors=c("black", hue_pal()(nrow(Dispersion.Cor.Matrix)/2-1)), Tissue=unique(sort(TissueFactorCol$Tissue)))
TissueFactorCol.vector <- TissueFactorCol %>%
left_join(TissueColors.df, by="Tissue") %>% pull(colors) %>% as.character()
heatmap.2(Dispersion.Cor.Matrix, trace="none", ColSideColors=SpeciesFactorCol, RowSideColors = TissueFactorCol.vector)
Dispersion.PCA <- Dispersion.Matrix %>% as.data.frame() %>% dplyr::select(-Chimp_leukocyte, -Human_leukocyte) %>%
drop_na() %>%
t() %>%
prcomp()
Dispersion.PCA$x %>% as.data.frame() %>%
rownames_to_column() %>%
separate(rowname, into=c("Species", "CellType"), sep = "_") %>%
ggplot(aes(x=PC1, y=PC2, color=CellType)) +
# geom_point(aes(fill=Species), pch=21, size=5, stroke=3) +
geom_point(aes(shape=Species)) +
theme_bw()
Let’s contrast this with the correlation matrix for cell type expression estimates:
Expression.Matrix <- Dispersion %>%
dplyr::select(gene, Chimp_Bulk=Chimp.Mean.Expression, Human_Bulk=Human.Mean.Expression) %>%
gather(key="Species.CellType", value="expression", -gene) %>%
bind_rows(
unite(CellTypeDispersion, Species.CellType, Species, CellType) %>%
dplyr::select(gene, Species.CellType, expression=mu)) %>%
dplyr::select(gene, Species.CellType, expression) %>%
filter(gene %in% one2one_HumanMouseOrthologs$Gene.stable.ID) %>%
pivot_wider(names_from=Species.CellType, values_from = expression) %>%
column_to_rownames("gene") %>%
as.matrix()
Expression.Matrix[mapply(is.infinite, Expression.Matrix)] <- NA
Expression.Cor.Matrix <- cor(Expression.Matrix, use="pairwise.complete.obs")
heatmap.2(Expression.Cor.Matrix, trace="none", ColSideColors=SpeciesFactorCol, RowSideColors = TissueFactorCol.vector)
Expression.PCA <- Expression.Matrix %>% as.data.frame() %>% dplyr::select(-Chimp_leukocyte, -Human_leukocyte) %>%
drop_na() %>%
t() %>%
prcomp()
Expression.PCA$x %>% as.data.frame() %>%
rownames_to_column() %>%
separate(rowname, into=c("Species", "CellType"), sep = "_") %>%
ggplot(aes(x=PC1, y=PC2, color=CellType)) +
# geom_point(aes(fill=Species), pch=21, size=5, stroke=3) +
geom_point(aes(shape=Species)) +
theme_bw()
Save plots…
pdf("../figures/OriginalArt/ResponseToReviewers.CorMatPlot.Dispersion.pdf")
heatmap.2(Dispersion.Cor.Matrix, trace="none", ColSideColors=SpeciesFactorCol, RowSideColors = TissueFactorCol.vector, offsetRow=0, offsetCol=0, cexRow = 0.5, cexCol = 0.5, margins=c(10,10))
dev.off()
pdf("../figures/OriginalArt/ResponseToReviewers.CorMatPlot.Expression.pdf")
heatmap.2(Expression.Cor.Matrix, trace="none", ColSideColors=SpeciesFactorCol, RowSideColors = TissueFactorCol.vector, offsetRow=0, offsetCol=0, cexRow = 0.5, cexCol = 0.5, margins=c(10,10))
dev.off()
sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Catalina 10.15.5
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] latex2exp_0.4.0 scales_1.1.1 knitr_1.26 cowplot_1.0.0
[5] gridExtra_2.3 edgeR_3.26.8 limma_3.40.6 MASS_7.3-53
[9] gplots_3.0.1.1 Seurat_3.1.1 biomaRt_2.40.5 forcats_0.4.0
[13] stringr_1.4.0 dplyr_1.0.2 purrr_0.3.3 readr_1.3.1
[17] tidyr_1.0.0 tibble_3.0.3 ggplot2_3.3.2 tidyverse_1.3.0
[21] TED_1.0
loaded via a namespace (and not attached):
[1] reticulate_1.13 R.utils_2.9.0
[3] tidyselect_1.1.0 RSQLite_2.1.4
[5] AnnotationDbi_1.46.1 htmlwidgets_1.5.1
[7] grid_3.6.1 BiocParallel_1.18.1
[9] Rtsne_0.15 munsell_0.5.0
[11] codetools_0.2-16 mutoss_0.1-12
[13] ica_1.0-2 future_1.15.1
[15] withr_2.1.2 colorspace_1.4-1
[17] Biobase_2.44.0 highr_0.8
[19] rstudioapi_0.10 stats4_3.6.1
[21] ROCR_1.0-7 gbRd_0.4-11
[23] listenv_0.8.0 labeling_0.3
[25] Rdpack_0.11-0 git2r_0.26.1
[27] GenomeInfoDbData_1.2.1 mnormt_1.5-5
[29] MCMCpack_1.4-9 farver_2.0.3
[31] bit64_0.9-7 rprojroot_1.3-2
[33] TH.data_1.0-10 coda_0.19-3
[35] vctrs_0.3.4 generics_0.0.2
[37] xfun_0.11 R6_2.4.1
[39] GenomeInfoDb_1.20.0 rsvd_1.0.2
[41] locfit_1.5-9.1 bitops_1.0-6
[43] DelayedArray_0.10.0 assertthat_0.2.1
[45] promises_1.1.0 SDMTools_1.1-221.2
[47] multcomp_1.4-11 nnet_7.3-12
[49] gtable_0.3.0 npsurv_0.4-0
[51] globals_0.12.5 conquer_1.0.2
[53] mcmc_0.9-7 sandwich_2.5-1
[55] workflowr_1.5.0 rlang_0.4.7
[57] MatrixModels_0.4-1 genefilter_1.66.0
[59] splines_3.6.1 lazyeval_0.2.2
[61] broom_0.7.0 checkmate_2.0.0
[63] reshape2_1.4.3 yaml_2.2.0
[65] modelr_0.1.5 backports_1.1.5
[67] httpuv_1.5.2 Hmisc_4.4-1
[69] tools_3.6.1 ellipsis_0.3.0
[71] RColorBrewer_1.1-2 BiocGenerics_0.30.0
[73] ggridges_0.5.1 TFisher_0.2.0
[75] Rcpp_1.0.5 plyr_1.8.6
[77] base64enc_0.1-3 progress_1.2.2
[79] zlibbioc_1.30.0 RCurl_1.95-4.12
[81] prettyunits_1.0.2 rpart_4.1-15
[83] pbapply_1.4-3 S4Vectors_0.22.1
[85] zoo_1.8-8 SummarizedExperiment_1.14.1
[87] haven_2.2.0 ggrepel_0.8.1
[89] cluster_2.1.0 fs_1.3.1
[91] magrittr_1.5 data.table_1.12.8
[93] SparseM_1.78 lmtest_0.9-37
[95] reprex_0.3.0 RANN_2.6.1
[97] mvtnorm_1.0-11 whisker_0.4
[99] fitdistrplus_1.0-14 matrixStats_0.56.0
[101] hms_0.5.2 lsei_1.2-0
[103] evaluate_0.14 xtable_1.8-4
[105] XML_3.98-1.20 jpeg_0.1-8.1
[107] readxl_1.3.1 IRanges_2.18.3
[109] compiler_3.6.1 KernSmooth_2.23-16
[111] crayon_1.3.4 R.oo_1.23.0
[113] htmltools_0.4.0 mgcv_1.8-31
[115] later_1.0.0 Formula_1.2-3
[117] geneplotter_1.62.0 RcppParallel_4.4.4
[119] lubridate_1.7.9 DBI_1.0.0
[121] dbplyr_1.4.2 Matrix_1.2-18
[123] cli_2.0.0 R.methodsS3_1.7.1
[125] gdata_2.18.0 parallel_3.6.1
[127] metap_1.2 igraph_1.2.4.2
[129] GenomicRanges_1.36.1 pkgconfig_2.0.3
[131] sn_1.5-4 numDeriv_2016.8-1.1
[133] foreign_0.8-72 plotly_4.9.1
[135] xml2_1.2.2 annotate_1.62.0
[137] multtest_2.40.0 XVector_0.24.0
[139] bibtex_0.4.2 rvest_0.3.5
[141] digest_0.6.23 tsne_0.1-3
[143] sctransform_0.2.0 RcppAnnoy_0.0.14
[145] rmarkdown_1.18 cellranger_1.1.0
[147] leiden_0.3.1 htmlTable_2.0.1
[149] uwot_0.1.5 curl_4.3
[151] gtools_3.8.1 quantreg_5.61
[153] lifecycle_0.2.0 nlme_3.1-143
[155] jsonlite_1.6 viridisLite_0.3.0
[157] fansi_0.4.0 pillar_1.4.6
[159] lattice_0.20-38 plotrix_3.7-7
[161] httr_1.4.1 survival_3.1-8
[163] glue_1.4.2 png_0.1-7
[165] bit_1.1-14 stringi_1.4.3
[167] blob_1.2.0 DESeq2_1.24.0
[169] latticeExtra_0.6-29 caTools_1.17.1.3
[171] memoise_1.1.0 irlba_2.3.3
[173] future.apply_1.3.0 ape_5.3