The number of natural product (NP) databases has increased substantially. Latin America is extraordinarily rich in biodiversity, which enables the identification of novel NPs and has encouraged both the creation of new databases and the continued development of existing ones. In a collective effort by several Latin American countries, we introduce the first version of the Latin American Natural Products Database (LANaPDB), a public compound collection that gathers the chemical information of NPs contained in diverse databases from this geographical region. The current version of LANaPDB unifies information from six countries and contains 12,959 chemical structures. Structural classification showed that the most abundant compounds are terpenoids (63.2%), phenylpropanoids (18%) and alkaloids (11.8%). Analysis of the distribution of properties of pharmaceutical interest showed that many LANaPDB compounds satisfy some drug-like rules of thumb for physicochemical properties. The concept of the chemical multiverse was employed to generate multiple chemical spaces from two different fingerprints and two dimensionality reduction techniques. A comparison of LANaPDB with FDA-approved drugs and with COCONUT, the major open-access repository of NPs, showed that the chemical space covered by LANaPDB completely overlaps with that of COCONUT and, in some regions, with that of FDA-approved drugs. LANaPDB will be updated with more compounds from each source database and with databases from other Latin American countries.
Keywords: Latin America; chemical multiverse; chemical space; chemoinformatics; databases; diversity; drug discovery; natural products; virtual screening.
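As an illustration of what checking compounds against a drug-like rule of thumb can look like, the following minimal sketch counts violations of Lipinski's rule of five on precomputed descriptors. The abstract does not state which rules were applied, so the choice of Lipinski's rule is an assumption; the `Descriptors` class, the function names, and the descriptor values are hypothetical and are not taken from LANaPDB.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed physicochemical descriptors for one compound (hypothetical container)."""
    mol_weight: float       # molecular weight in Da
    logp: float             # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def rule_of_five_violations(d: Descriptors) -> int:
    """Count how many of Lipinski's four thresholds the compound exceeds."""
    checks = (
        d.mol_weight > 500,
        d.logp > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    )
    return sum(checks)


def fraction_compliant(compounds, max_violations: int = 1) -> float:
    """Fraction of compounds with at most `max_violations` violations."""
    if not compounds:
        return 0.0
    ok = sum(1 for d in compounds if rule_of_five_violations(d) <= max_violations)
    return ok / len(compounds)


# Made-up descriptor values for illustration only (not LANaPDB data).
examples = [
    Descriptors(mol_weight=300.0, logp=2.5, h_bond_donors=2, h_bond_acceptors=5),
    Descriptors(mol_weight=650.0, logp=6.2, h_bond_donors=6, h_bond_acceptors=12),
    Descriptors(mol_weight=520.0, logp=4.0, h_bond_donors=3, h_bond_acceptors=8),
]

if __name__ == "__main__":
    for d in examples:
        print(d, "violations:", rule_of_five_violations(d))
    print("fraction compliant:", fraction_compliant(examples))
```

In practice the descriptors would be computed from the chemical structures with a cheminformatics toolkit; the sketch takes them as given so that it stays self-contained.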