I'm interested in cells that are positive for two (or in some cases more) genes. I can see from the FeaturePlot of those genes that some cells are double positive, but now I'm trying to figure out how many are double positive, and what percentage of each single-positive population they make up. Example:

50 cells are positive for Myh11
100 cells are positive for Acta2
30 cells are double positive
60% of Myh11+ cells are Acta2+
30% of Acta2+ cells are Myh11+

Is the data used to generate the FeaturePlot the scale.data? If so, I could just make a new dataset out of that, but I'm not sure whether that is the case.

Jason Aller

1 Answer

The counts stored in the Seurat object are: raw counts (seuratobject@raw.data), the log + normalized counts (seuratobject@data), and the scaled counts (seuratobject@scale.data). FeaturePlot() plots the log + normalized counts.

In order to identify double-positive cells, you need to identify cells that express a gene (i.e. positive for a gene) and that is not trivial: you need a cutoff value for each gene, and also to distinguish between true and dropout zero counts.

Provided you solve these, you can retrieve your cells by:

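# Placeholder cutoffs on the log-normalized scale; choose one per gene
MYH11.cutoff <- 1
ACTA2.cutoff <- 1
# Gene names must match the row names of the object exactly (e.g. 'Myh11' in mouse data)
# Number of MYH11+ cells
length(which(FetchData(seuratobject, vars.all = 'MYH11') > MYH11.cutoff))
# Number of ACTA2+ cells
length(which(FetchData(seuratobject, vars.all = 'ACTA2') > ACTA2.cutoff))
# Number of double-positive cells
length(which(FetchData(seuratobject, vars.all = 'MYH11') > MYH11.cutoff & FetchData(seuratobject, vars.all = 'ACTA2') > ACTA2.cutoff))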
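To get the percentages asked for in the question, divide the double-positive count by each single-positive count. The following is only a minimal base-R sketch on made-up values, built to reproduce the numbers in the question's example. The vectors myh11 and acta2 are hypothetical stand-ins for the per-cell log-normalized values you would fetch from your own object, and the cutoff of 1 is an arbitrary placeholder.

```r
# Toy log-normalized expression values (hypothetical, not real data):
# 120 cells, constructed to match the example in the question.
n_cells <- 120
myh11 <- numeric(n_cells)
acta2 <- numeric(n_cells)
myh11[1:50] <- 2     # cells 1-50 are Myh11+
acta2[21:120] <- 2   # cells 21-120 are Acta2+, so cells 21-50 are double positive

# Placeholder cutoffs (assumption); choose one per gene for real data
myh11_cutoff <- 1
acta2_cutoff <- 1

# Logical vectors: TRUE where a cell is above the cutoff
myh11_pos <- myh11 > myh11_cutoff
acta2_pos <- acta2 > acta2_cutoff
double_pos <- myh11_pos & acta2_pos

# sum() counts TRUE values
n_myh11 <- sum(myh11_pos)
n_acta2 <- sum(acta2_pos)
n_double <- sum(double_pos)

# Percentage of each single-positive population that is double positive
pct_myh11_also_acta2 <- 100 * n_double / n_myh11
pct_acta2_also_myh11 <- 100 * n_double / n_acta2

# Cross-tabulation of the four positive/negative combinations
print(table(Myh11 = myh11_pos, Acta2 = acta2_pos))
```

Keeping the logical vectors around, rather than only the counts, also makes it straightforward to extend this to three or more genes by combining them with further & operators.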
Peter
  • You can replace length(which(...)) with sum(...): TRUE is evaluated as 1, so you end up with the same result (but a bit faster). – llrs Jun 20 '18 at 14:01
  • Sorry, what would the code be if we have 3 clusters of cells and we want to show the cluster in which the ratio of the gene named DDB_G0267412 is two times higher than DDB_G0277853? – Zizogolu Jul 23 '18 at 20:23