Integrating Prior Knowledge Using Transformer for Gene Regulatory Network Inference

Guangzheng Weng; Patrick Martin; Hyobin Kim; Kyoung Jae Won

doi:10.1002/advs.202409990

Adv Sci (Weinh). 2024 Nov 28:e2409990. doi: 10.1002/advs.202409990. Online ahead of print.

Authors

Guangzheng Weng¹, Patrick Martin², Hyobin Kim², Kyoung Jae Won²

Affiliations

¹ Biotech Research and Innovation Centre (BRIC), University of Copenhagen, Ole Maaløes Vej 5, Copenhagen, 2200, Denmark.
² Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA, 90069, USA.

PMID: 39605181

Abstract

Gene regulatory network (GRN) inference, the process of reconstructing gene regulatory rules from experimental data, has the potential to discover new regulatory rules. However, existing methods often struggle to generalize across diverse cell types and to account for unseen regulators. This work presents GRNPT, a novel Transformer-based framework that integrates large language model (LLM) embeddings derived from publicly accessible biological data with a temporal convolutional network (TCN) autoencoder to capture regulatory patterns from single-cell RNA sequencing (scRNA-seq) trajectories. GRNPT significantly outperforms both supervised and unsupervised methods in inferring GRNs, particularly when training data are limited. Notably, GRNPT exhibits exceptional generalizability, accurately predicting regulatory relationships in previously unseen cell types and even for previously unseen regulators. By combining the ability of LLMs to distill biological knowledge from text with deep learning methods that capture complex patterns in gene expression data, GRNPT overcomes limitations of traditional GRN inference methods and enables a more accurate and comprehensive understanding of gene regulatory dynamics.
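
For readers unfamiliar with temporal convolutional networks, the sketch below illustrates the dilated causal convolution that TCNs are generally built from. It is not GRNPT's implementation: the function names, the averaging kernel, and the expression values are hypothetical and chosen only for illustration, and a real TCN autoencoder would learn its kernels and add nonlinearities and a decoder.

```python
def causal_dilated_conv1d(series, kernel, dilation=1):
    """Compute y[t] = sum_k kernel[k] * series[t - k * dilation].

    Positions before the start of the series are treated as zero, so each
    output depends only on the current and earlier time points (causality).
    """
    out = []
    for t in range(len(series)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:
                acc += weight * series[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Number of input time points that can influence one output after stacking layers."""
    return 1 + (kernel_size - 1) * sum(dilations)


# Hypothetical pseudotime-ordered expression values for a single gene.
expression = [0.0, 1.0, 2.0, 4.0, 3.0, 1.0]

# Two stacked layers with growing dilation, as is typical for a TCN.
# The fixed averaging kernel stands in for learned weights.
layer1 = causal_dilated_conv1d(expression, kernel=[0.5, 0.5], dilation=1)
layer2 = causal_dilated_conv1d(layer1, kernel=[0.5, 0.5], dilation=2)

print(layer1)
print(layer2)
print(receptive_field(kernel_size=2, dilations=[1, 2]))
```

Increasing the dilation from layer to layer widens the span of the trajectory that each output summarizes without adding parameters, which is the usual motivation for using a TCN on ordered data such as pseudotime trajectories.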

Keywords: deep learning; gene regulatory networks; inference; large language model; temporal convolutional network; transformer.