Evaluating language models for mathematics through interactions.
Collins KM, Jiang AQ, Frieder S, Wong L, Zilka M, Bhatt U, Lukasiewicz T, Wu Y, Tenenbaum JB, Hart W, Gowers T, Li W, Weller A, Jamnik M.
Proc Natl Acad Sci U S A. 2024 Jun 11;121(24):e2318124121. doi: 10.1073/pnas.2318124121. Epub 2024 Jun 3.
PMID: 38830100
Free PMC article.