Streptococcus mutans is one of several members of the oral indigenous biota linked to severe early childhood caries (S-ECC). Because most humans harbor S. mutans but not all manifest disease, it has been proposed that the strains of S. mutans associated with S-ECC are genetically distinct from those found in caries-free (CF) children. The objective of this study was to identify DNA fragments common to S. mutans strains from children with S-ECC but absent from strains from CF children. Using suppressive subtractive hybridization, we found a number of DNA fragments (biomarkers) present in 88 to 95% of the S-ECC S. mutans strains but not in CF S. mutans strains. We then applied machine learning techniques, including support vector machines and neural networks, to identify the biomarkers with the most predictive power for disease status, achieving 92% accuracy in classifying the strains as either S-ECC or CF associated. The presence of these gene fragments in 90 to 100% of the 26 S-ECC isolates tested suggested a possible functional role for them in the pathogenesis of S. mutans associated with dental caries.
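To make the idea of ranking biomarkers by predictive power concrete, the following is a minimal sketch under stated assumptions. It is not the support vector machine or neural network analysis used in the study. It substitutes a much simpler per-marker score: the balanced accuracy of the rule "fragment present implies S-ECC". The marker names (`frag_A`, `frag_B`, `frag_C`) and the presence/absence values are hypothetical toy data invented for illustration and do not come from the study.

```python
from typing import List, Tuple

Strain = Tuple[List[int], str]  # (presence/absence vector, disease label)

# Hypothetical toy data: 1 = fragment detected in the strain, 0 = not detected.
# Names and values are invented for illustration only.
MARKERS = ["frag_A", "frag_B", "frag_C"]
STRAINS: List[Strain] = [
    ([1, 1, 0], "S-ECC"),
    ([1, 0, 1], "S-ECC"),
    ([1, 1, 1], "S-ECC"),
    ([1, 1, 0], "S-ECC"),
    ([0, 0, 1], "CF"),
    ([0, 1, 0], "CF"),
    ([0, 0, 0], "CF"),
    ([0, 0, 1], "CF"),
]


def balanced_accuracy(strains: List[Strain], column: int) -> float:
    """Score the rule 'marker present => S-ECC' for one marker.

    Returns the mean of sensitivity (S-ECC strains carrying the marker)
    and specificity (CF strains lacking it).
    """
    n_pos = sum(1 for _, label in strains if label == "S-ECC")
    n_neg = len(strains) - n_pos
    true_pos = sum(1 for x, label in strains if label == "S-ECC" and x[column] == 1)
    true_neg = sum(1 for x, label in strains if label == "CF" and x[column] == 0)
    return 0.5 * (true_pos / n_pos + true_neg / n_neg)


def rank_markers(strains: List[Strain], markers: List[str]) -> List[Tuple[str, float]]:
    """Return (marker, score) pairs sorted from most to least predictive."""
    scores = [(name, balanced_accuracy(strains, i)) for i, name in enumerate(markers)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for name, score in rank_markers(STRAINS, MARKERS):
        print(f"{name}: {score:.2f}")
```

A single-marker score like this ignores combinations of markers, which is one reason a multivariate classifier such as a support vector machine would typically be preferred when several fragments are evaluated together.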