The Gene Expression Omnibus (GEO) contains millions of samples from thousands of studies. Although users of GEO can search the metadata describing these studies, methods for searching GEO at the data level are still needed. RummaGEO is a gene expression signature search engine for human and mouse RNA sequencing perturbation studies extracted from GEO. To develop RummaGEO, we automatically identified groups of samples within each study and computed differential expression to extract gene sets. The contents of RummaGEO are served for gene set, PubMed, and metadata search. We also analyzed the contents of RummaGEO to identify patterns and perform global analyses. Overall, RummaGEO is a resource that enables users to identify relevant GEO studies based on their own gene expression results. Users can incorporate RummaGEO into their analysis workflows for integrative analyses and hypothesis generation.
Keywords: ARCHS4; CFDE; LINCS; RNA sequencing; data integration; data mining; gene expression; gene set enrichment analysis; signature search; transcriptomics.
© 2024 The Author(s).