New software tool identifies genetic mutations that influence disease risk

Unique, robust method uses family data to find mutations for both common and rare diseases

MD Anderson News Release 05/30/14

Researchers at The University of Texas MD Anderson Cancer Center and other institutions have applied a newly developed software tool to identify genetic mutations that contribute to a person’s increased risk for developing common, complex diseases, such as cancer. The research is published in the May 2014 edition of the journal Nature Biotechnology.

The technology, known as pVAAST (pedigree Variant Annotation, Analysis and Search Tool), combines two different statistical methods used for identifying disease-causing gene mutations. This combined approach outperforms individual familial analysis methods by increasing the power or speed with which mutations are identified and by reducing complications in study design and analysis.

“This method allows for faster, more efficient identification and validation of genetic variants that influence disease risk,” said Chad Huff, Ph.D., professor of Epidemiology at MD Anderson. “This will eventually enable clinical labs to design genetic tests that provide better predictions of a person’s individual risk of developing cancer.”

The pVAAST tool combines two commonly used disease-gene identification methods, linkage analysis and association tests.  Linkage analysis tracks the inheritance of genetic mutations in families to identify possible causal mutations. Association tests compare unrelated individuals with a specific disease to healthy individuals in search of a common mutation in one group or the other.

“Linkage analysis and association tests were initially designed for sparse genetic markers available from earlier genotyping techniques,” said lead author of the study, Hao Hu, Ph.D., a postdoctoral fellow of Epidemiology at MD Anderson. “pVAAST integrates these two methods and repurposes them for next-generation DNA sequencing data, which is the state-of-the-art technique for genetics research. It fills in a genuine gap between molecular techniques and computational tools in familial disease studies.”

Researchers also incorporated functional variant prioritization into the tool, which predicts whether a particular mutation in a family is damaging.

For this particular study, pVAAST analyzed data to identify the genetic causes of three diseases: enteropathy, a chronic inflammation of the intestine; cardiac septal defects; and Miller Syndrome, a developmental defect of the face and multiple limbs. The tool was able to identify the exact mutations causing these diseases using DNA data from a single family. In the cardiac septal defects and Miller Syndrome families, the causal mutations had previously been identified, and the results served as a proof of concept. In the enteropathy family, the causal mutation was unknown prior to the analysis.

In addition, researchers applied pVAAST and three other statistical methods to three models of genetic disease: dominant, in which one defective copy of the gene is inherited from a parent; recessive, in which both copies of the gene must have the defect; and dominant caused by a new mutation not inherited from either parent. In each case, pVAAST needed only a fraction of the number of families that the other methods required to detect disease risk.

“For most rare diseases it is challenging to collect DNA samples from multiple patients,” said Huff. “This makes it essential to be able to incorporate relatives in a study to improve the success rate.”

In this study, the combined methodology of using multiple statistical methods increased the power of the results and reduced the complexity of the analysis. “This provides a gateway to identifying genetic variants that influence the risk of developing specific cancers,” said Huff.

Many ongoing genetic studies recruit patients with a family history of cancer to search for inherited mutations that increase the risk of developing cancer.  Huff says this tool will allow researchers to analyze the sequence data from these families to identify the genetic variants that are most likely responsible for the history of cancer in the families.

Huff says that, moving forward, the major focus for the software will be to uncover new cancer-susceptibility genes. In a separate paper published in Cancer Discovery, the software was used to support the discovery of RINT1 as a new breast-cancer susceptibility gene.

“The identification of potential cancer susceptibility genes is only a first step, and years of additional research are required to characterize these variants to conclusively establish the degree to which they influence cancer risk,” said Huff.

Co-authors with Huff and first author Hu are Shankaracharya, Ph.D., and Paul Scheet, Ph.D., both of MD Anderson; Jared Roach, Ph.D., Gustavo Glusman, Ph.D., Robert Hubley, Ph.D., Hong Li, Ph.D., and Leroy Hood, Ph.D., all of the Institute for Systems Biology; Mark Yandell, Ph.D., Hilary Coon, Ph.D., Stephen Guthery, M.D., Sean Tavtigian, Ph.D., Wilfred Wu, M.D., Ph.D., Lynn Jorde, Ph.D., Barry Moore and Karl Voelkerding, M.D., all of the University of Utah; Shuoguo Wang, Ph.D. and Jinchuan Xing, Ph.D., both of Rutgers, The State University of New Jersey; Rebecca Margraf, Ph.D. and Jacob Durtschi, both of ARUP Institute for Clinical and Experimental Pathology; Vidu Garg, M.D., of Ohio State University; David Galas, Ph.D., of the University of Luxembourg; Deepak Srivastava, M.D., of Gladstone Institute of Cardiovascular Disease and the University of California; and Martin Reese, Ph.D., of Omicia, Inc.

This work was funded by grants from the National Institutes of Health (NIH) (R01 GM104390, R01 DK091374 and R01 CA164138) as well as the University of Luxembourg’s Institute for Systems Biology Program. HH was supported by MD Anderson’s Odyssey Program. JX was supported by the NIH (R00HG005846). An allocation of computer time on MD Anderson’s Research Computing High Performance Computing (HPC) facility is gratefully acknowledged.