See original documentation at test_genesets

run_geneset_enrichment(
  genesets,
  genelist,
  method = "goat",
  score_type = "effectsize",
  padj_method = "BH",
  padj_sources = TRUE,
  padj_cutoff = 0.01,
  padj_min_signifgenes = 0L,
  ...
)

Arguments

genesets

tibble with genesets, must contain columns 'source', 'source_version', 'id', 'name', 'genes', 'ngenes', 'ngenes_signif'

genelist

tibble with genes, must contain columns 'gene' and 'test'. gene = character column whose values are matched against the list column 'genes' in the genesets tibble. test = boolean column (you can set all values to FALSE if you are not performing a Fisher-exact or hypergeometric test downstream)

method

method for overrepresentation analysis. Options: "goat", "hypergeometric", "fisherexact", "fisherexact_ease", "gsea", "idea"

score_type

string, default: "effectsize", alternatively set to "pvalue", "effectsize_up", "effectsize_down", "effectsize_abs"

padj_method

first step of multiple testing correction; method for p-value adjustment, passed to stats::p.adjust() via padjust_genesets(), e.g. set "BH" to compute FDR adjusted p-values (default) or "bonferroni" for a more stringent procedure

padj_sources

second step of multiple testing correction; apply Bonferroni adjustment to all p-values according to the number of geneset sources that were tested. Boolean parameter, set TRUE to enable (default) or FALSE to disable

padj_cutoff

cutoff for the adjusted p-value; the signif column is set to TRUE for every geneset whose adjusted p-value is less than or equal to this cutoff

padj_min_signifgenes

if a value larger than zero is provided, additional post-hoc filtering is performed: after p-value adjustment, pvalue_adjust is set to NA and signif to FALSE for all genesets with fewer than padj_min_signifgenes significant input genes (the ngenes_signif column in the genesets table). Because this filter is applied only after p-values have been computed and adjusted, it does not affect the accuracy of the estimated p-values, in contrast to prefiltering genesets before p-value computation or adjustment

...

further parameters are passed to the respective stats method
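
The interplay of padj_method, padj_sources, padj_cutoff and padj_min_signifgenes described above can be sketched in base R. This is only an illustration on a hypothetical toy table, not the package's implementation; in particular, applying the first adjustment step separately within each source is an assumption, and the source names and values are made up.

```r
# Hypothetical toy results: two geneset sources with raw p-values.
res <- data.frame(
  source        = c("GO_BP", "GO_BP", "GO_BP", "GO_CC", "GO_CC"),
  id            = c("a", "b", "c", "d", "e"),
  ngenes_signif = c(6L, 1L, 4L, 0L, 5L),
  pvalue        = c(0.0001, 0.0002, 0.03, 0.0004, 0.2),
  stringsAsFactors = FALSE
)

# Parameter values chosen for illustration only.
padj_method          <- "BH"
padj_sources         <- TRUE
padj_cutoff          <- 0.01
padj_min_signifgenes <- 2L

# Step 1: adjust p-values with stats::p.adjust().
# Assumption: the adjustment is applied within each source.
res$pvalue_adjust <- ave(
  res$pvalue, res$source,
  FUN = function(p) stats::p.adjust(p, method = padj_method)
)

# Step 2: Bonferroni adjustment for the number of sources tested, capped at 1.
if (padj_sources) {
  n_sources <- length(unique(res$source))
  res$pvalue_adjust <- pmin(1, res$pvalue_adjust * n_sources)
}

# Post-hoc filter: blank out genesets with too few significant input genes.
if (padj_min_signifgenes > 0) {
  too_few <- res$ngenes_signif < padj_min_signifgenes
  res$pvalue_adjust[too_few] <- NA_real_
}

# Flag genesets at or below the cutoff; NA adjusted p-values become FALSE.
res$signif <- !is.na(res$pvalue_adjust) & res$pvalue_adjust <= padj_cutoff

print(res)
```

In this toy table only geneset "a" ends up flagged: "b" and "d" are removed by the post-hoc filter, while "c" and "e" exceed the cutoff after both adjustment steps.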

Value

the input genesets, with results stored in columns 'pvalue', 'pvalue_adjust', 'signif' and 'zscore'
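
As a rough sketch of how the result columns might be used downstream, the snippet below keeps the significant genesets and orders them by adjusted p-value. The data frame is a hypothetical stand-in for the returned table (the real result is a tibble with more columns), and all values are invented for illustration.

```r
# Hypothetical stand-in for the returned table; values are made up.
res <- data.frame(
  id            = c("a", "b", "c", "d"),
  pvalue        = c(0.0001, 0.0002, 0.03, 0.0005),
  pvalue_adjust = c(0.0006, NA, 0.06, 0.002),
  signif        = c(TRUE, FALSE, FALSE, TRUE),
  zscore        = c(3.1, NA, -1.2, -2.8),
  stringsAsFactors = FALSE
)

# Keep rows flagged as significant; %in% TRUE also guards against NA values.
hits <- res[res$signif %in% TRUE, ]

# Order by adjusted p-value, smallest first.
hits <- hits[order(hits$pvalue_adjust), ]

print(hits)
```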

Examples

run_geneset_enrichment(
  get(load(system.file("extdata", "example_genesets.rda", package = "goatea"))),
  get(load(system.file("extdata", "example_genelist.rda", package = "goatea")))
)
#> # A tibble: 10 × 17
#>    source source_version id    name  parent_id ngenes_input ngenes ngenes_signif
#>    <chr>  <chr>          <chr> <chr> <list>           <int>  <int>         <int>
#>  1 origin org.Xx.eg.db   DB.0… gene… <chr [1]>           13     13             6
#>  2 origin org.Xx.eg.db   DB.0… gene… <chr [1]>           18     18            11
#>  3 origin org.Xx.eg.db   DB.0… gene… <chr [1]>           20     20            10
#>  4 origin org.Xx.eg.db   DB.0… gene… <chr [1]>           11     11             4
#>  5 origin org.Xx.eg.db   DB.0… gene… <chr [1]>           10     10             4
#>  6 origin org.Xx.eg.db   DB.0… gene… <chr [1]>           12     12             6
#>  7 origin org.Xx.eg.db   DB.0… gene… <chr [1]>           19     19             6
#>  8 origin org.Xx.eg.db   DB.0… gene… <chr [1]>           20     20            10
#>  9 origin org.Xx.eg.db   DB.0… gene… <chr [1]>           10     10             4
#> 10 origin org.Xx.eg.db   DB.0… gene… <chr [1]>           14     14             6
#> # ℹ 9 more variables: genes <list<int>>, genes_signif <list>, score_type <chr>,
#> #   pvalue <dbl>, zscore <dbl>, pvalue_adjust <dbl>, signif <lgl>,
#> #   score_oddsratio <dbl>, symbol <list>