We will load clustermole along with dplyr to help with summarizing the data.
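A minimal sketch of that loading step, assuming both packages are already installed:

```r
# Attach clustermole for the marker database and dplyr for summarizing it.
library(clustermole)
library(dplyr)
```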
You can use clustermole as a simple database and get a table of all cell type markers.
markers <- clustermole_markers(species = "hs")
markers
#> # A tibble: 422,292 × 8
#> celltype_full db species organ celltype n_genes gene_original gene
#> <chr> <chr> <chr> <chr> <chr> <int> <chr> <chr>
#> 1 1-cell stage cell (… Cell… Human Embr… 1-cell … 32 ACCSL ACCSL
#> 2 1-cell stage cell (… Cell… Human Embr… 1-cell … 32 ACVR1B ACVR…
#> 3 1-cell stage cell (… Cell… Human Embr… 1-cell … 32 ASF1B ASF1B
#> 4 1-cell stage cell (… Cell… Human Embr… 1-cell … 32 BCL2L10 BCL2…
#> 5 1-cell stage cell (… Cell… Human Embr… 1-cell … 32 BLCAP BLCAP
#> 6 1-cell stage cell (… Cell… Human Embr… 1-cell … 32 CASC3 CASC3
#> 7 1-cell stage cell (… Cell… Human Embr… 1-cell … 32 CLEC10A CLEC…
#> 8 1-cell stage cell (… Cell… Human Embr… 1-cell … 32 CNOT11 CNOT…
#> 9 1-cell stage cell (… Cell… Human Embr… 1-cell … 32 DCLK2 DCLK2
#> 10 1-cell stage cell (… Cell… Human Embr… 1-cell … 32 DHCR7 DHCR7
#> # ℹ 422,282 more rows
Each row contains a gene and a cell type associated with it. The gene column is the gene symbol (human or mouse, depending on the species argument) and the celltype_full column contains the detailed cell type string, including the species and the original database.
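Because the table has one row per gene and cell type pair, ordinary dplyr verbs are enough to query it. The sketch below lists the cell types whose marker sets include a given gene. The gene CD4 is only an illustrative choice, and `cd4_celltypes` is a hypothetical name.

```r
library(clustermole)
library(dplyr)

markers <- clustermole_markers(species = "hs")

# Keep the rows for one gene (CD4 is an arbitrary example), then reduce
# to one row per cell type so that each marker set is listed once.
cd4_celltypes <- markers |>
  filter(gene == "CD4") |>
  distinct(celltype_full, db, organ)
cd4_celltypes
```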
Number of cell types by source database
Check the source databases and the number of cell types from each.
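One possible way to compute this mirrors the per-organ summary below, assuming the `db` column identifies the source database. Calling distinct() first collapses the table to one row per cell type, so that count() tallies cell types rather than genes. The name `celltypes_by_db` is hypothetical.

```r
library(clustermole)
library(dplyr)

markers <- clustermole_markers(species = "hs")

# One row per cell type and database, then the number of cell types per
# database, largest first.
celltypes_by_db <- distinct(markers, celltype_full, db) |>
  count(db, sort = TRUE)
celltypes_by_db
```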
Number of cell types by species
Check the number of cell types per species (not available for all cell types).
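The same pattern can be sketched for the `species` column. This is an assumption based on the column names in the table printed above, and `celltypes_by_species` is a hypothetical name. Cell types without species information would be expected to show up under an empty or missing value.

```r
library(clustermole)
library(dplyr)

markers <- clustermole_markers(species = "hs")

# One row per cell type and species, then the number of cell types per
# species, largest first.
celltypes_by_species <- distinct(markers, celltype_full, species) |>
  count(species, sort = TRUE)
celltypes_by_species
```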
Number of cell types by organ
Check the number of available cell types per organ (not available for all cell types).
distinct(markers, celltype_full, organ) |> count(organ, sort = TRUE)
#> # A tibble: 93 × 2
#> organ n
#> <chr> <int>
#> 1 "" 2160
#> 2 "Brain" 122
#> 3 "Immune system" 50
#> 4 "Lung" 47
#> 5 "Kidney" 43
#> 6 "Bone marrow" 42
#> 7 "Liver" 38
#> 8 "Blood" 33
#> 9 "Embryo" 30
#> 10 "Peripheral blood" 29
#> # ℹ 83 more rows
Package version
Check the package version since the database contents may change.
packageVersion("clustermole")
#> [1] '1.1.1.9000'