Retrieve a data frame of gene sets and their member genes. The original human genes can be converted to their orthologs in various model organisms, including mouse, rat, pig, zebrafish, fly, and yeast. The output includes gene symbols along with NCBI and Ensembl IDs.

Usage

msigdbr(
  db_species = "HS",
  species = "human",
  collection = NULL,
  subcollection = NULL,
  category = deprecated(),
  subcategory = deprecated()
)

Arguments

db_species

Abbreviation of the MSigDB database to query: "HS" for the human database or "MM" for the mouse database.

species

Species name for output genes, such as "Homo sapiens" or "Mus musculus". Both scientific and common names are acceptable. Use msigdbr_species() for the available options.

collection

Collection abbreviation, such as "H" or "C1". Use msigdbr_collections() for the available options.

subcollection

Sub-collection abbreviation, such as "CGP" or "BP". Use msigdbr_collections() for the available options.

category

[Deprecated] Use the collection argument instead.

subcategory

[Deprecated] Use the subcollection argument instead.
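
As an illustration of how these arguments combine, the following is a minimal sketch rather than an official example. It assumes that msigdbr and the gene set data it depends on are installed; the object names (h_human, cgp_mouse) are chosen here for illustration only.

```r
library(msigdbr)

# List the values accepted by the species and collection arguments.
msigdbr_species()
msigdbr_collections()

# Hallmark gene sets from the human database, reported as human genes
# (db_species = "HS" and species = "human" are the defaults).
h_human <- msigdbr(collection = "H")

# C2/CGP gene sets from the human database, with member genes
# converted to mouse. The scientific name is used here; the common
# name "mouse" should be equivalent.
cgp_mouse <- msigdbr(
  species = "Mus musculus",
  collection = "C2",
  subcollection = "CGP"
)
```

Omitting collection and subcollection (both NULL by default) applies no collection filter, so the result can be large.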

Value

A tibble (a data frame with class tibble::tbl_df) of gene sets with one gene per row.
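
Because the result has one gene per row, many downstream tools need it reshaped into a named list with one vector of genes per gene set. The sketch below shows one way to do that in base R. It uses a small hand-made data frame as a stand-in for real msigdbr() output, so the set names and memberships are hypothetical; only the column names gs_name and gene_symbol are meant to mirror the real output.

```r
# Toy stand-in for msigdbr() output: one gene per row.
# The gene set names and memberships are made up for illustration.
gene_sets_long <- data.frame(
  gs_name = c("SET_A", "SET_A", "SET_B"),
  gene_symbol = c("TP53", "MYC", "EGFR"),
  stringsAsFactors = FALSE
)

# Collapse to a named list: one character vector of genes per gene set.
gene_sets_list <- split(gene_sets_long$gene_symbol, gene_sets_long$gs_name)

gene_sets_list
```

The same split() call should work on an actual msigdbr() result, substituting another identifier column for gene_symbol if NCBI or Ensembl IDs are preferred.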

Details

Historically, the MSigDB resource has been tailored to the analysis of human-specific datasets, with gene sets exclusively aligned to the human genome. Starting with release 2022.1, MSigDB incorporated a database of mouse-native gene sets and was split into human and mouse divisions ("Hs" and "Mm"). Each division reports its gene sets using the approved gene symbols of its own species.

Mouse MSigDB includes gene sets curated from mouse-centric datasets and specified in native mouse gene identifiers, eliminating the need for ortholog mapping.
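
To work with the mouse-native gene sets directly, the db_species argument can be pointed at the mouse database. This is a minimal sketch under the same installation assumptions as any msigdbr() call; the object name mm_sets is illustrative, and no collection is specified because the mouse database uses its own collection abbreviations.

```r
library(msigdbr)

# Query the mouse database and report mouse genes, so no ortholog
# mapping from human is involved.
mm_sets <- msigdbr(db_species = "MM", species = "mouse")

# Inspect the first rows to see the available columns.
head(mm_sets)
```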