Whole-genome bisulfite sequencing (WGBS) is a popular method for characterizing cytosine methylation because it is fully quantitative and has base-pair resolution. While WGBS is prohibitively expensive for experiments involving many samples, low-coverage WGBS can accurately determine global methylation and erasure at a cost similar to that of high-performance liquid chromatography (HPLC) or enzyme-linked immunosorbent assays (ELISA). Moreover, low-coverage WGBS can distinguish between methylation in different cytosine contexts (e.g., CG, CHH, and CHG), can tolerate low-input material (<100 cells), and can detect overrepresented DNA originating from mitochondria or amplified ribosomal DNA. In addition to describing a WGBS library construction and quantitation approach, here we detail computational methods to predict the accuracy of low-coverage WGBS using empirical bootstrap samplers and theoretical estimators similar to those used in election polling. Using examples, we further demonstrate how non-independent sampling of cytosines can alter the precision of error calculation and provide methods to improve it.
Keywords: Asymptotic estimator; Bootstrap sampling; DNA methylation; Epigenetics; Low-coverage bisulfite sequencing; Methylation erasure; Post-bisulfite adaptor tagging; Whole-genome bisulfite sequencing.
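
To make the two error-estimation ideas named above concrete, the following is a minimal Python sketch, not the authors' implementation. It compares a polling-style asymptotic margin of error, which treats every cytosine call as an independent draw, with an empirical bootstrap that resamples whole reads so that cytosines on the same read stay together. The function names (`simulate_reads`, `asymptotic_margin`, `bootstrap_interval`), the simulated methylation probabilities, and the choice of read-level resampling as a way to handle non-independence are all assumptions made for illustration.

```python
import math
import random


def simulate_reads(n_reads, cytosines_per_read, seed=0):
    """Simulate binary methylation calls (1 = methylated) grouped by read.

    Hypothetical model: each read is mostly methylated (p = 0.95) or mostly
    unmethylated (p = 0.05), so calls within a read are correlated.
    """
    rng = random.Random(seed)
    reads = []
    for _ in range(n_reads):
        p_read = 0.95 if rng.random() < 0.7 else 0.05
        reads.append(
            [1 if rng.random() < p_read else 0 for _ in range(cytosines_per_read)]
        )
    return reads


def asymptotic_margin(calls, z=1.96):
    """Polling-style (Wald) estimate assuming independent calls.

    Returns (estimated methylation fraction, margin of error).
    """
    n = len(calls)
    p_hat = sum(calls) / n
    return p_hat, z * math.sqrt(p_hat * (1.0 - p_hat) / n)


def bootstrap_interval(reads, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval, resampling whole reads with replacement."""
    rng = random.Random(seed)
    # Pre-compute (methylated count, total count) per read for speed.
    counts = [(sum(r), len(r)) for r in reads]
    estimates = []
    for _ in range(n_boot):
        sample = rng.choices(counts, k=len(counts))
        methylated = sum(m for m, _ in sample)
        total = sum(t for _, t in sample)
        estimates.append(methylated / total)
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


if __name__ == "__main__":
    reads = simulate_reads(n_reads=2000, cytosines_per_read=5, seed=1)
    calls = [c for r in reads for c in r]
    p_hat, margin = asymptotic_margin(calls)
    lower, upper = bootstrap_interval(reads, n_boot=500, seed=2)
    print(f"global methylation estimate: {p_hat:.4f}")
    print(f"asymptotic margin of error:  +/- {margin:.4f}")
    print(f"read-level bootstrap 95% CI: [{lower:.4f}, {upper:.4f}]")
```

Under this simulated within-read correlation, the read-level bootstrap interval is wider than the asymptotic margin suggests, which illustrates how non-independent sampling of cytosines can make the naive error calculation overconfident.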