Early-stage non-small cell lung cancer (NSCLC) can be cured by surgical resection, but a substantial fraction of patients ultimately die of distant metastasis. In this study, we used subtractive hybridization to identify gene expression differences between stage I NSCLC tumors that either did or did not metastasize in the course of disease. Individual clones (n=225) were sequenced, and quantitative RT-PCR verified overexpression in metastasizing samples. Several of the identified genes (eIF4A1, thymosin beta4 and a novel transcript named MALAT-1) were demonstrated to be significantly associated with metastasis in NSCLC patients (n=70). The genes' association with metastasis was stage- and histology-specific. Kaplan-Meier analyses identified MALAT-1 and thymosin beta4 as prognostic parameters for patient survival in stage I NSCLC. The novel MALAT-1 transcript is a noncoding RNA of more than 8000 nt expressed from chromosome 11q13. It is highly expressed in lung, pancreas and other healthy organs as well as in NSCLC. MALAT-1 expressed sequences are conserved across several species, indicating a potentially important function. Taken together, these data contribute to the identification of early-stage NSCLC patients who are at high risk of developing metastasis. The identification of MALAT-1 emphasizes the potential role of noncoding RNAs in human cancer.