Genome-wide association studies (GWAS) are powerful statistical methods that detect associations between genotype and phenotype at genome scale. Despite their power, GWAS frequently fail to pinpoint the causal variant or the gene controlling a given trait in crop species. Assessing genetic variants other than single-nucleotide polymorphisms (SNPs) could alleviate this problem. In this study, we tested the potential of structural variant (SV)- and k-mer-based GWAS in soybean by applying these methods as well as conventional SNP/indel-based GWAS to 13 traits. We assessed the performance of each GWAS approach based on loci for which the causal genes or variants were known from previous genetic studies. We found that k-mer-based GWAS was the most versatile approach and the best at pinpointing causal variants or candidate genes. Moreover, k-mer-based analyses identified promising candidate genes for loci related to pod color, pubescence form, and resistance to Phytophthora sojae. In our dataset, SV-based GWAS did not add value compared to k-mer-based GWAS and may not be worth the time and computational resources invested. Despite promising results, significant challenges remain regarding the downstream analysis of k-mer-based GWAS. Notably, better methods are needed to associate significant k-mers with sequence variation. Our results suggest that coupling k-mer- and SNP/indel-based GWAS is a powerful approach for discovering candidate genes in crop species.
© 2023 The Authors. The Plant Genome published by Wiley Periodicals LLC on behalf of Crop Science Society of America.