PEPhub: a database, web interface, and API for editing, sharing, and validating biological sample metadata

bioRxiv [Preprint]. 2024 May 11:2023.08.15.551388. doi: 10.1101/2023.08.15.551388.

Abstract

Background: As the volume of biological data grows, we need additional infrastructure to share it and promote interoperability. While major effort has gone into sharing data, relatively little emphasis has been placed on sharing metadata. Yet sharing metadata is also important, and in some ways has a wider scope than sharing data itself.

Results: Here, we present PEPhub, an approach to improve sharing and interoperability of biological metadata. PEPhub provides an API, natural language search, and user-friendly web-based sharing and editing of sample metadata tables. We used PEPhub to process more than 100,000 published biological research projects and index them with fast semantic natural language search. PEPhub thus provides a fast and user-friendly way to find existing biological research data or to share new data.

Availability: https://pephub.databio.org.

Publication types

  • Preprint