
The rise of genomics in medicine is clear, as even a brief search of the scientific literature will attest. Study after study in just the past few years has shed an ever brighter light on the mostly uncharted and incompletely understood world of genomics and its relationship to human health and disease. Facilitating this deep analysis of genetic variation has been the stunningly fast advance of technology. As the partnership between science and technology fuels further discoveries, genomics—whether in the form of whole-genome, whole-exome, or next-generation sequencing—is steadily and inexorably moving closer to clinical practice. Yet as spectacular as advances in the field have been, considerable heavy lifting remains before genomic analysis enters everyday practice, as attendees at the 2012 AACC Annual Meeting in July learned.

"We find ourselves today with a tremendous amount of data, remarkably powerful technologies, and increasingly golden opportunities to use genomics in ways to improve medical care and eventually to think about better approaches to treat patients with all sorts of different types of disease," said Eric Green, MD, PhD, director of the National Human Genome Research Institute (NHGRI), which is part of the National Institutes of Health (NIH). "I would tell you quite candidly we have at best a Cliff Notes view of the human genome. We have a tremendous amount to learn. We will be decades from now still interpreting and reinterpreting the human genome sequence like a great novel. Fundamentally, we know quite a bit, but it's also inspired a tremendous amount of work that's still going to be required."

Green made the remarks as the Wallace H. Coulter Lectureship speaker at AACC's 2012 Annual Meeting and Clinical Lab Expo. The meeting, with a theme of "Genomics: The Future of Laboratory Medicine," featured many experts who addressed the genomic revolution and its impact on clinical care and laboratory medicine. Green was not alone in, on the one hand, expressing excitement and wonder about all the knowledge gained thus far, and on the other, feeling humbled by the task ahead in translating that knowledge into routine clinical care.

Eric Green, MD, PhD, laid out a long-term vision of how genomic research will advance over the coming decades. "We will be decades from now still interpreting and reinterpreting the human genome sequence like a great novel."

Taking the Long View

In considering how the work of translating basic genomic discoveries into clinical relevance might unfold, Green and his colleagues in 2011 developed a strategic vision for the future of genomic research, which outlines five domains in the field: understanding the structure of genomes; understanding the biology of genomes; understanding the biology of disease; advancing the science of medicine; and improving the effectiveness of healthcare (Nature 2011;470:204–13).

The Human Genome Project, which produced the first draft human genome sequence, had, by the time it concluded in 2003, largely addressed the first domain, understanding the structure of genomes; between 2004 and 2010, much about the biology of genomes was elucidated. The coming decade, Green noted, will focus primarily on the third domain, understanding the biology of disease. However, the tasks of advancing the science of medicine and improving the effectiveness of healthcare will take decades to achieve. "Genomics is not going to solve everything immediately, even in a decade or two or three. It's a hard road ahead in doing any of these things effectively. We also have to manage expectations in realizing it's going to be many years in going from the most basic information about our human genome sequence to actually changing medical care in any serious way," he predicted.

Assessing Risk in Cardiovascular Disease

Plenary speaker Robert Roberts, MD, outlined some of the reasons that it will take quite some time to elucidate the role of and devise treatment responses to genetic risk in cardiovascular disease (CVD) and move the field into Green's proposed fifth domain—improving the effectiveness of healthcare. Roberts, who is president, chief executive officer, and chief scientific officer of the Ottawa Heart Institute and director of the Ruddy Canadian Cardiovascular Genetics Centre, led a consortium that in 2007 found that the 9p21 variant conveys independent risk for CVD, adding up to 40% over conventional factors in those who are 9p21 homozygous (Science 2007;316:1488–91). Analyzing data from large study populations, the investigative team determined that 9p21 occurs in about 75% of the population worldwide, and is present in all ethnic groups except Africans. By the end of 2011, Roberts's team and other researchers had identified and verified in independent populations 36 risk variants for CVD. He explained that while scientists now understand how 13 of the variants act, for the remaining 23, "the true mechanisms are unknown at this moment as to how they contribute to the pathogenesis of cardiovascular disease. But it is important to realize that this indicates very clearly there are several factors associated with risks through mechanisms yet to be identified."

Complicating this work is the fact that "most of these markers occur in regions of the genome that are not coding for proteins," explained Roberts. "They are presumably in regulatory regions, promoter, or repressor elements, so they either turn on or off downstream or upstream messenger RNA that does make proteins. That too does make it difficult."

Another challenge ahead is teasing out each variant's contribution to risk. "They probably act together in some synergistic way, and if that is true, then we'll be underestimating the total heritability of these factors. At the moment, no one has come up with a formulation as to which is correct, so we simply add their effects," said Roberts. "Not until we get past 17, 18, 19 risk factors do we see the separation between cases and controls. This indicates that an individual's genetic risk for cardiovascular disease is dependent on the number of genetic risk factors you've inherited rather than any particular one."
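What Roberts describes as simply adding effects is, in practice, a weighted count of the risk alleles an individual carries. The short Python sketch below illustrates that additive approach; the variant names and per-allele effect sizes are invented for illustration and are not values from his group's work.

# A minimal sketch of an additive genetic risk score, assuming hypothetical
# per-allele effects (log odds ratios) for a handful of CVD risk variants.
EFFECT_SIZES = {
    "rs_variant_A": 0.18,
    "rs_variant_B": 0.12,
    "rs_variant_C": 0.09,
}

def additive_risk_score(genotype):
    """Sum per-allele effects over the risk alleles a person carries (0, 1, or 2 copies each)."""
    return sum(EFFECT_SIZES[v] * copies for v, copies in genotype.items() if v in EFFECT_SIZES)

# Example: homozygous for variant A, heterozygous for variant B.
print(additive_risk_score({"rs_variant_A": 2, "rs_variant_B": 1}))  # 0.48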

Although genotyping for 9p21 is available commercially, Roberts cautioned about any practical use of the test clinically at this time. "I don't think there is any evidence that we should go test people for these genetic variants, including 9p21. In medicine we say, if you don't have something to treat a condition differently you might not want to know it anyway. But at the moment the only way we have therapeutically to prevent heart disease is through statins to lower cholesterol," he said. "Certainly when you go to the people who pay for these tests they want to know why you're doing it, and simply telling a patient he or she is a bomb about to explode and we've got nothing to change it will probably not impress payers."

Robert Roberts, MD, traced researchers' progress in determining the complicated influence of genetics in cardiovascular disease.

Even as Roberts emphasized the challenges ahead in fully elucidating and developing treatments to address genetic risk in CVD, plenary speaker Elaine Mardis, PhD, described the work of her lab on the frontlines of genomic research. The Genome Institute of Washington University in St. Louis, one of only three federally funded genome centers in the U.S., is pushing the envelope in combining whole-genome and next-generation sequencing to understand cancer pathology and study tumor heterogeneity, particularly in acute myeloid leukemia (AML). Mardis is the institute's co-director, director of technology development, and professor of genetics.

After analyzing hundreds of AML cases, Mardis's team has determined that there are an average of 11 genetic mutations per AML genome across all subtypes of the disease. Deep sequencing with 1,000-fold coverage of variant regions has revealed other important insights. "The high-depth, next-generation sequencing of the captured regions provides a digital read-out of the allele frequency of each mutation in the tumor cell population, enabling us to calculate the allele frequency of each mutation. The logic here is that the prevalence of a mutation reflects its history in the tumor's evolution, where older mutations are present at a higher prevalence and newer mutations exist at a lower variant allele frequency," explained Mardis. "By combining the digital data and statistical approaches, we can model the tumor heterogeneity for any sample we've sequenced to depth. We can determine the mutational profile of each subclone represented in the tumor cell population and in what proportion it exists. This is ultimately a very powerful approach for teasing apart this heterogeneity."
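As a rough illustration of the "digital read-out" Mardis describes, a mutation's variant allele frequency is simply the fraction of sequencing reads carrying the mutant allele, and mutations with similar frequencies can be grouped as candidate subclones. The Python sketch below is a deliberately simplified, assumed workflow rather than the Genome Institute's actual pipeline; the gene names and read counts are illustrative.

from collections import defaultdict

def variant_allele_frequency(alt_reads, total_reads):
    """Fraction of reads supporting the mutant allele at one site."""
    return alt_reads / total_reads if total_reads else 0.0

def group_by_vaf(variants, bin_width=0.1):
    """Crudely cluster mutations into candidate subclones by binning their variant
    allele frequencies. Higher-frequency clusters correspond to older (founding)
    mutations; lower-frequency clusters to mutations acquired later in the tumor's evolution."""
    clusters = defaultdict(list)
    for name, (alt, total) in variants.items():
        vaf = variant_allele_frequency(alt, total)
        clusters[round(vaf / bin_width) * bin_width].append(name)
    return dict(clusters)

# Hypothetical deep-sequencing read counts: (mutant reads, total reads) per mutation.
tumor = {"DNMT3A_R882": (480, 1000), "NPM1_W288fs": (470, 1000), "FLT3_ITD": (210, 1000)}
print(group_by_vaf(tumor))  # {0.5: ['DNMT3A_R882', 'NPM1_W288fs'], 0.2: ['FLT3_ITD']}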

Mardis's work also has been instrumental in pinpointing the cause of individual cases of AML relapse. This led her team to describe two models of relapse, including at least one case in which, by traditional pathology, a patient appeared to be in remission after a second round of chemotherapy but in fact had residual disease and was therefore disqualified from the recommended therapy, bone marrow transplant. However, genetic analysis revealed that the patient had extraordinarily high levels of fms-related tyrosine kinase 3 (FLT3) expression, enabling this individual to be treated successfully with the FLT3 inhibitor sunitinib and subsequently to qualify for bone marrow transplant. "In essence, this is a success story at this time, and we're happy to report it. We know that every patient won't have this comprehensive analysis. There are logistical and payment barriers to implementing it for all patients, but we think there's a place for it in the clinical workup of patients as we move into this more genomic era of cancer treatment and therapy," Mardis observed.

Based upon this work, Mardis sees a future for what she calls diagnostic trials. "These are not clinical trials because they're fundamentally a different beast. You're looking at an N of one. You're looking at each patient as a microcosm of themselves," she explained. "Cancer biopsies are studied by whole genome sequencing, exome, and transcriptome integrated analysis, where whole genome sequencing drives discovery; exome data provides a surrogate for deep digital read data, looking at heterogeneity in the disease and the subclone architecture of that patient's disease. The transcriptome tells us about aberrant gene expression and validates any fusion events that we've predicted from the structural variant detection out of whole genome data. On top of all this, we have interpretive analysis."

Elaine Mardis, PhD, spoke of her lab's use of whole-genome and next-generation sequencing to understand cancer genomics.


Is the Sequencing Process a Commodity?

If Mardis's work is on the very forefront of genomic medicine, other laboratorians also are bringing whole-genome and next-generation sequencing to bear in clinical settings, and in several annual meeting sessions they offered a treasure trove of insight to colleagues who might be considering these applications. All shared a sense of wonder at how fast sequencing technologies are advancing, and how quickly costs are coming down, causing several to question whether some organizations might subcontract the actual sequencing process to specialized labs.

"All the platforms have different characteristics and they're changing rapidly. Every three-to-six months each company comes up with a new version of their product, a new machine, a new chemistry, and every month one of the companies comes up with a breakthrough. So it's really challenging. Do you want to invest in a technology that's going to be obsolete in the next six months? Or do you just want to outsource the sequencing, which is becoming a commodity very fast?" observed Murat Sincan, MD, research fellow at the Medical Genetics Branch of the National Human Genome Research Institute. Sincan has been instrumental in developing bioinformatics used by NIH's Undiagnosed Diseases Program (UDP) to analyze sequencing data. The 4-year old UDP, which has evaluated about 500 patients with longstanding medical conditions that have eluded diagnosis, expects to expand to five or more extramural sites over the next 7 years.

How Deep Should Base Coverage Be?

Even as whole-genome and next-generation sequencing technology continues to advance rapidly, Sincan and others emphasized that the various methods all have limitations. Sequencing accuracy and unequal coverage of different regions of the genome are but two of many challenges.

"These tests and technologies that we use are not entirely complete or fully accurate. With these new sequencing methods, we often have to take additional steps to confirm what we find. There's a ratio of the amount of DNA sequenced to amount of variants detected and the more you sequence, the more variants you find. It takes a lot of time and energy to evaluate these variants, and even when we do our best job, we still don't often know what all the variants mean. This is a challenge for implementing these tests in the clinical lab," explained Heidi L. Rehm, PhD, FACMG, assistant professor of pathology at Harvard Medical School and director of the Laboratory for Molecular Medicine at the Partners Healthcare Center for Personalized Genetic Medicine in Boston.

Labs also need to balance depth of coverage with the desired sensitivity of mutation detection, with higher base coverage required for detection of rarer variants. This can be problematic in that the various sequencing methods yield different levels of coverage in different parts of the genome. "You're going to have uneven coverage, where some regions are not going to be covered at all and others covered very well. For example, we have major issues for exon 1 in a number of different genes," said Andrea Ferreira-Gonzalez, PhD, director of the molecular diagnostics laboratory at Virginia Commonwealth University in Richmond. "If you have very low coverage you don't know whether a variant was an error during the sequencing or if it's actually real, and so you won't have confidence in what you're calling. These are major issues that we're dealing with right now, and we're learning more and more that this is gene- and exon-dependent."
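One way to make the trade-off between depth of coverage and detection sensitivity concrete is a simple binomial calculation: at a given depth, what is the chance of seeing enough mutant reads to call a variant present at a given allele fraction? The depths and thresholds in the Python sketch below are illustrative assumptions, not values cited by either speaker.

from math import comb

def detection_probability(depth, allele_fraction, min_alt_reads):
    """Probability that at least min_alt_reads of depth total reads carry the
    variant allele, assuming each read independently samples the alleles."""
    return sum(
        comb(depth, k) * allele_fraction**k * (1 - allele_fraction) ** (depth - k)
        for k in range(min_alt_reads, depth + 1)
    )

print(round(detection_probability(30, 0.50, 5), 3))   # ~1.0: a heterozygous variant is easily seen at 30x
print(round(detection_probability(30, 0.05, 5), 3))   # ~0.016: a 5% subclonal variant is usually missed at 30x
print(round(detection_probability(500, 0.05, 5), 3))  # ~1.0: the same subclonal variant is recovered at 500x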

Becoming Big Data Scientists

To a person, the speakers were in agreement that being able to efficiently and accurately interpret sequencing data is by far the greatest challenge labs face in implementing whole-genome, whole-exome, or next-generation sequencing. "These machines that sequence DNA spew out so much data that they simply overwhelm our ability to interpret it as of today. This is new for us in the research community," said Green. "Suddenly we find ourselves as big data scientists. We didn't use to be big data scientists. The particle physicists, the astronomers, those are the big data scientists. But biomedicine? These technologies are putting us into the realm of big data and actually creating a bottleneck. It's a bottleneck a lot of us at NIH are trying to figure out what we can do to break down."

Others spoke in detail about just how taxing it can be to analyze the raw sequence output. Rehm reported that a review in her lab found that it takes anywhere from 22 to 120 minutes to assess each variant, depending on whether reference data about the variant already exists. In the case of Partners, this process is repeated about 300 times each month, with about 25,000 variants having been curated through the program's rare disease testing process. "At the beginning I thought, once we've tested a few thousand patients with each disease we'll have figured out all the variants and then it'll be easy sailing. But that's turned out not to be true." She went on to explain that even in the case of mature testing programs, such as for hypertrophic cardiomyopathy mutations, about 17% of variants detected continue to be novel.
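Taken together, Rehm's figures describe a substantial monthly workload. The quick Python calculation below simply multiplies the numbers she cited and converts them to hours; the result is a back-of-the-envelope estimate, not a workload figure she reported.

MINUTES_PER_VARIANT = (22, 120)   # range Rehm reported for assessing one variant
ASSESSMENTS_PER_MONTH = 300       # approximate monthly volume she described at Partners

low, high = (m * ASSESSMENTS_PER_MONTH / 60 for m in MINUTES_PER_VARIANT)
print(f"{low:.0f} to {high:.0f} hours of variant assessment per month")  # 110 to 600 hours per month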

Ferreira-Gonzalez also painted a sobering picture of the data storage dilemma. "One whole genome sequenced takes about one terabyte of data. Go talk to your information technology department and tell them you have one terabyte of information per patient and that you have 20,000 patients you need to store per year and see what they say," she said. "We also need to ensure access to the data in a useful manner, but we can't put a terabyte of data into each patient's electronic medical record. More importantly, under CLIA '88 we're supposed to keep archives of data at least five years, and in some cases 10 to 20 years. We can't keep a terabyte of data for 20 years. It's just prohibitive. So some sites are actually re-running the sample instead of storing the images because it's a lot cheaper."
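Ferreira-Gonzalez's storage numbers translate into a daunting back-of-the-envelope total. The short calculation below uses the per-genome size, annual volume, and retention period from her example; the rest is arithmetic.

TB_PER_GENOME = 1           # roughly one terabyte of data per sequenced whole genome
PATIENTS_PER_YEAR = 20_000  # annual volume in her example
RETENTION_YEARS = 20        # upper end of the retention periods she cited

petabytes_per_year = TB_PER_GENOME * PATIENTS_PER_YEAR / 1000
petabytes_retained = petabytes_per_year * RETENTION_YEARS
print(petabytes_per_year, petabytes_retained)  # 20.0 petabytes added per year, 400.0 retained over 20 years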

Even with these perplexing challenges to be worked out, experts all expressed unbridled enthusiasm about the unfolding world of genomic medicine and its role in clinical care.