NGS Errors Due to Damage in DNA Materials

[PMID:28209900] [Science]

DNA damage is a pervasive cause of sequencing errors, directly confounding variant identification

The paper demonstrates that “many so-called low-frequency genetic variants in large public databases may be due to DNA damage”. In other words, the errors come from the DNA material itself, not from the variant calling.

The authors write: “To estimate the extent of damage in public data sets, we determined the GIV scores of individual sequencing runs from the 1000 Genomes Project and a subset of The Cancer Genome Atlas (TCGA) data set. Both data sets showed widespread damage, particularly those leading to an excess of G-to-T variants. Specifically, 41% of the 1000 Genomes Project data sets had a GIV_G_T score ≥ 1.5, indicative of damaged samples. Furthermore, 73% of the TCGA sequencing runs showed extensive damage, with a GIV_G_T > 2. This indicates that the majority of G-to-T observations are erroneous and establishes damage as a pervasive cause of errors in these data sets.”
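To make the thresholds in the quote concrete, here is a minimal Python sketch of how a per-run G-to-T GIV score might be bucketed. The two cut-offs (≥ 1.5 and > 2) come from the quoted passage. Everything else is an assumption for illustration: the excerpt does not define the GIV score, so `giv_score` assumes it is the ratio of G-to-T variant counts seen on read 1 versus read 2 of paired-end data, and the function names and labels are hypothetical.

```python
from typing import Iterable

# Thresholds taken from the quoted passage.
DAMAGED_THRESHOLD = 1.5    # GIV_G_T >= 1.5: "indicative of damaged samples"
EXTENSIVE_THRESHOLD = 2.0  # GIV_G_T > 2: "extensive damage"


def giv_score(r1_variant_count: int, r2_variant_count: int) -> float:
    """ASSUMED definition: G-to-T variant count on read 1 divided by the
    count on read 2. The excerpt does not spell out the formula."""
    if r2_variant_count <= 0:
        raise ValueError("read 2 variant count must be positive")
    return r1_variant_count / r2_variant_count


def classify(giv: float) -> str:
    """Bucket a GIV_G_T score using the two thresholds from the quote."""
    if giv > EXTENSIVE_THRESHOLD:
        return "extensive damage"
    if giv >= DAMAGED_THRESHOLD:
        return "damaged"
    return "no strong evidence of damage"


def fraction_at_or_above(scores: Iterable[float], threshold: float) -> float:
    """Share of runs whose score is at or above the threshold."""
    scores = list(scores)
    if not scores:
        return 0.0
    return sum(s >= threshold for s in scores) / len(scores)
```

Applied to the GIV_G_T scores of a set of runs, `fraction_at_or_above(scores, DAMAGED_THRESHOLD)` gives the kind of summary figure the authors report (41% for the 1000 Genomes Project), though their exact pipeline may differ from this sketch.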
