It has long been known that canonical 5' splice site (5'SS) GT>GC variants may be compatible with normal splicing. However, the proportion of canonical 5'SSs capable of generating wild-type transcripts in the case of GT>GC substitutions remains unknown. Herein, combining data derived from a meta-analysis of 45 human disease-causing 5'SS GT>GC variants and a cell culture-based full-length gene splicing assay of 103 5'SS GT>GC substitutions, we estimate that ~15-18% of canonical GT 5'SSs retain their capacity to generate between 1% and 84% normal transcripts when GT is substituted by GC. We further demonstrate that the canonical 5'SSs in which substitution of GT by GC generated normal transcripts exhibit stronger complementarity to the 5' end of U1 snRNA than those in which the substitution did not lead to the generation of normal transcripts. We also observed a correlation between the generation of wild-type transcripts and a milder-than-expected clinical phenotype, but found that none of the available splicing prediction tools were capable of reliably distinguishing 5'SS GT>GC variants that generated wild-type transcripts from those that did not. Our findings imply that 5'SS GT>GC variants in human disease genes may not invariably be pathogenic.
Keywords: canonical 5′ splice site; full-length gene splicing assay; genotype and phenotype relationship; human gene mutation database; human inherited disease; noncanonical splice donor site.
© 2019 Wiley Periodicals, Inc.