Progression of hepatocellular carcinoma (HCC) is a stepwise process that proceeds from pre-neoplastic lesions, including low-grade dysplastic nodules (LGDNs) and high-grade dysplastic nodules (HGDNs), to advanced HCC. However, the molecular changes associated with this progression are unclear, and the morphological cues thought to distinguish pre-neoplastic lesions from well-differentiated HCC are not universally accepted. To understand the multistep process of hepatocarcinogenesis at the molecular level, we used oligonucleotide microarrays to investigate the transcription profiles of 50 hepatocellular nodular lesions ranging from LGDNs to primary HCC (Edmondson grades 1-3). Unsupervised hierarchical clustering with 10,376 genes showed that gene expression profiles can discriminate not only between dysplastic nodules and overt carcinoma but also between different histological grades of HCC. Using one-way ANOVA and a one-versus-all unpooled t-test, we identified 3,084 grade-associated genes whose expression correlated with tumor progression. Functional assignment of these genes revealed discrete expression clusters representing grade-dependent biological properties of HCC. Using both diagonal linear discriminant analysis and support vector machines, we identified 240 genes that could accurately classify tumors according to histological grade, particularly when discriminating among LGDNs, HGDNs, and grade 1 HCC. In conclusion, a clear molecular demarcation exists between dysplastic nodules and overt HCC, and the progression from grade 1 through grade 3 HCC is associated with changes in gene expression consistent with plausible functional consequences.
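As an illustration only, the one-versus-all unpooled comparison named above can be sketched for a single gene as follows. This is a minimal sketch, not the pipeline used in the study: the function names `welch_t` and `one_versus_all`, the grade labels, and the expression values are all hypothetical, and the unpooled t statistic is assumed to be the standard Welch form (separate variance estimates per group).

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Unpooled (Welch) t statistic: each sample keeps its own variance."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def one_versus_all(expr_by_grade):
    """For one gene, compare each grade against all other grades combined.

    expr_by_grade maps a grade label to a list of expression values.
    Returns a dict mapping each grade label to its Welch t statistic.
    """
    scores = {}
    for grade, values in expr_by_grade.items():
        # Pool the samples of every other grade into a single "rest" group.
        rest = [v for g, vs in expr_by_grade.items() if g != grade for v in vs]
        scores[grade] = welch_t(values, rest)
    return scores


# Made-up expression values for a single hypothetical gene (assumption).
gene = {
    "LGDN": [1.0, 1.2, 0.9],
    "HGDN": [1.1, 1.3, 1.0],
    "G1": [2.0, 2.2, 1.9],
    "G3": [3.1, 2.9, 3.3],
}
scores = one_versus_all(gene)
```

In this toy input, a gene whose expression rises with grade yields a negative statistic for the lowest grade and a positive one for the highest. A real analysis would repeat the test across all genes and correct for multiple testing.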