Listen to the Clinical Chemistry Podcast
Gregory J. Hogan, et al. Validation of an Expanded Carrier Screen that Optimizes Sensitivity via Full-Exon Sequencing and Panel-wide Copy Number Variant Identification. Clin Chem 2018;64:1063-73.
Dr. Dale Muzzey is the senior director of scientific affairs and staff scientist in computational biology at Counsyl, a company offering clinical DNA screening and genetic counseling.
This is a podcast from Clinical Chemistry, sponsored by the Department of Laboratory Medicine at Boston Children’s Hospital. I am Bob Barrett.
In the age of genomic medicine, decreasing costs and rising quality of sequencing-related technologies mean it's now possible to routinely screen parents and prospective parents of any ethnicity for hundreds of inherited conditions from a single sample.
Previously, carrier screening has been used in at-risk and specific ethnic populations, or for a limited number of diseases. The broader screening approach, called expanded carrier screening, was endorsed in 2017 by the American College of Obstetricians and Gynecologists as an acceptable strategy for preconception and prenatal carrier screening. This is expected to have a significant clinical impact, as targeted screening approaches likely have missed a large number of affected pregnancies due to the limited conditions tested, the unknown ethnic ancestry of many individuals, and the fact that the majority of children born with a genetic disease have no family history of the condition.
Studies show that approximately 80% of couples who learn that their pregnancy is at risk for a severe condition pursue alternative reproductive options. Given these consequences and the relative rarity of the included diseases, expanded carrier screening panels must have a high detection rate to correctly identify at-risk couples and to minimize the residual risk when only one partner tests positive. Though next generation sequencing has revolutionized genetic testing as a whole, advances in detecting copy number variants by next generation sequencing have been critical to improving the analytical quality of expanded carrier screening panels.
An original research article appearing in the July 2018 issue of Clinical Chemistry describes these achievements while establishing the analytical validity of expanded carrier screening and quantifying how many pregnancies could be impacted by this approach in the general U.S. population.
For this podcast, we are joined by the article’s senior author, Dr. Dale Muzzey. He is the senior director of scientific affairs and staff scientist in computational biology at Counsyl, a company offering clinical DNA screening and genetic counseling. So Dr. Muzzey, can you please explain the mechanics of expanded carrier screening in the clinic, such as, what types of diseases are being tested for and how patients receive the testing?
That’s a good question. Patients receive the testing from their doctor, typically with a simple blood test. We also offer saliva testing, since you can extract DNA from the cells in that sample. So it begins with sourcing the sample, which then gets sent to the lab. The diseases that we’re testing for are primarily autosomal recessive diseases, and the others are X-linked diseases.
I’ll focus on the first for a minute. With an autosomal recessive disease, we’re looking at a single gene disorder. So, a disease that’s caused by having zero functional copies, so both of the two copies, the one from your mom, the one from your dad, both of them don’t work, that causes the disease in this case. However, if you had one functional copy, then you’re actually okay, and that just makes you a carrier who is asymptomatic. And what carrier screening is really looking for is prospective parents, where both parents are carriers of a disease, so they're asymptomatic, but have one dysfunctional copy.
And that puts that couple at a 25% risk of having a child who inherits zero functional copies and is therefore affected. We call these at-risk couples, and their identification is really the goal of carrier screening. So, I described that for an autosomal recessive disease, which again is the predominant type of disease on the panel. There are also some X-linked disorders, such as Fragile X syndrome, where male children are primarily at risk because they do not receive an X chromosome from their father. Only the mother needs to be a carrier to put that child at risk, and so we screen for several of those as well.
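The 25% figure follows from enumerating the four equally likely combinations of one allele from each carrier parent. The short Python sketch below illustrates that arithmetic; the function name and the allele labels `F` (functional) and `d` (dysfunctional) are hypothetical choices made for this example and do not come from the podcast or the paper.

```python
from fractions import Fraction
from itertools import product


def offspring_outcomes(parent_a, parent_b):
    """Enumerate the equally likely allele pairs a child can inherit.

    Each parent is a 2-tuple of alleles: "F" (functional) or "d"
    (dysfunctional). Labels are illustrative only.
    """
    counts = {"affected": 0, "carrier": 0, "noncarrier": 0}
    for allele_a, allele_b in product(parent_a, parent_b):
        functional_copies = (allele_a == "F") + (allele_b == "F")
        if functional_copies == 0:
            counts["affected"] += 1      # zero functional copies
        elif functional_copies == 1:
            counts["carrier"] += 1       # asymptomatic carrier
        else:
            counts["noncarrier"] += 1    # two functional copies
    total = len(parent_a) * len(parent_b)
    return {outcome: Fraction(n, total) for outcome, n in counts.items()}


carrier = ("F", "d")
risks = offspring_outcomes(carrier, carrier)
print("affected:", risks["affected"])   # 1/4 when both parents are carriers
print("carrier:", risks["carrier"])
```

If only one parent is a carrier, the same enumeration gives no affected children, which is why the screen targets couples rather than individuals.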
And why is high sensitivity particularly important in carrier screening relative to other types of clinical genomic screens?
Right. So, clinical genomic screening needs to have high sensitivity, irrespective of what's being tested. But there's a particular wrinkle to carrier screening that makes that sensitivity super important. What you’re really trying to do with carrier screening is detect at-risk couples. And so, that means you need the test to be very sensitive in both parents; you're really kind of running the test twice to get its intended outcome.
And so, if a test can only detect carriers with a sensitivity of say 90%, then it's only able to find a real at-risk couple with the square of that 90%, so like 81% efficacy. And this starts to get really bad if you have 50% sensitivity, say, for an individual, then your ability to find the couple becomes only 25%. And so, it's really important because of the squaring of the sensitivity or the detection rate that it be as high as possible.
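The squaring argument can be written out in a few lines of Python. This is a minimal sketch that assumes both partners are tested independently with the same per-individual sensitivity; the function name is hypothetical.

```python
def couple_detection_rate(individual_sensitivity):
    """Probability that BOTH carriers in an at-risk couple are detected.

    Assumes the two partners are tested independently with the same
    per-individual sensitivity, so the couple-level rate is the square.
    """
    if not 0.0 <= individual_sensitivity <= 1.0:
        raise ValueError("sensitivity must be between 0 and 1")
    return individual_sensitivity ** 2


for sensitivity in (0.99, 0.90, 0.50, 0.20):
    couple = couple_detection_rate(sensitivity)
    print(f"{sensitivity:.0%} per carrier -> {couple:.0%} per at-risk couple")
```

The loop covers the 90% and 50% cases mentioned above, along with a high-sensitivity and a low-sensitivity case for comparison.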
The screen you validated is a panel of 235 genes. What's special about the number 235? Why is that the magic number? Why not have a 500-gene panel, and conversely what would be wrong with a 100-gene panel?
It’s a really good question. So, there's nothing particularly sacred about the number 235, but it’s the number of genes on our panel that satisfied our inclusion and exclusion criteria which is something that we actually published last year. So, we surveyed more than 650 genes to ultimately get down to that 235. And the criteria that we used prioritized severity, prevalence, actionability, and sensitivity and I can go into each of those a little bit.
We don't want to be screening for super mild conditions. We really want to screen only for those that have a severe or profound effect on the health of the child. And when I say severe or profound, what I mean is ones that impact the intellectual ability of the child or the life span. So, that was severity.
Secondly, prevalence: it's actually important not to screen for super, super rare diseases. First of all, you're very unlikely to find couples who are at risk for them. But also, some of the diseases that are on panels have only a few affected people known in the world. And so, our understanding of the genetics of that disease is limited, and it becomes harder to find at-risk couples and to identify mutations in their DNA as actually being disease-causing.
Another criterion was actionability. There are some milder diseases on the panel for which maybe a change in the mother's diet during pregnancy can actually help to avert that disease. And so, that has a very clear course of action that can actually help the health of the pregnancy.
And the final criterion really is the assay being able to have high sensitivity. This is really important from the assay perspective: if there is something about a particular gene such that we can only have 20% sensitivity for it at the level of an individual, then our ability to find a couple is, again, the square of that 20%. So, it gets to a point where it’s really not even worth screening for, because you aren’t sensitive to mutations in it.
So, that's how we got to the 235 genes that we validated in this paper. The argument against a much larger panel comes from the same criteria that I went through: at some point the prevalence gets so low that you really don't have good sensitivity for those diseases, because so few people have had the disease that it’s not well understood and you can't interpret the results. And another note on that 500 is that these diseases become less and less common.
So, ultimately, the impact on the test is actually not that much higher. In terms of finding at-risk couples, adding 500 or 1,000 more genes, even though it's technically possible, yields a very small marginal increase in the number of identified at-risk couples. Now, the problem with a smaller gene panel, finally, is that at some point you really are just missing at-risk couples; you're not tapping into the entire potential of the test. And so, there will be affected children who could have been identified via screening.
A major focus of the paper is support for exon level copy number variant identification. First, can you better explain how this impacts the efficacy of the test for patients? And second, what was your strategy when validating these variants? What type of tuning was required to ensure good performance across the panel?
So, finding exon-level copy number variants impacts the efficacy of the test by really boosting the detection rate. And when I say detection rate, I mean the ability of the test to identify pathogenic variants in that gene. Those variants can take multiple different forms. They can be single-base substitutions. They can be short insertions or deletions. But some of those variants are deletions or insertions of a whole exon, or of large parts of the gene. Those are frequently deleterious, and so, if the test does not interrogate those types of mutations, then it has a sub-100% detection rate. Conversely, being able to find those types of mutations makes the detection rate of our test very high.
And this has value not just for identifying at-risk couples and individuals who are carriers; from a clinical point of view, it also has high value in terms of boosting the negative predictive value of the test. What I mean by this is that a very common workflow for expanded carrier screening is to test the mother first. If she is found to be a carrier, then you test her reproductive partner to see if he is a carrier. And when the mother screens positive and then the father screens negative, that negative has a far higher impact for that couple if you can be very confident that that person is truly negative.
By screening for copy-number variants and the other types of mutations and having a very, very high detection rate, that negative result has very high value. You know that you haven't left something on the table where there’s still a residual risk of disease. Really underscoring the clinical impact of looking for these copy-number variants is the fact that we identified 74 patients who had single-exon deletions. We really tuned the assay so it can be very, very sensitive to both large and small deletions. We identified several hundred who had multi-exon deletions, but the test characterized in this paper found, again, 74 patients with single-exon deletions out of a cohort of 36,859.
So, the reason this is important, again, is that if a test doesn't offer copy-number variant detection, which again most of them don't, and isn’t validated to show that it can find them with high efficacy, then those 74 patients who could potentially be at risk for an affected child would not be identified. That's how it really impacts the clinical efficacy of the test for patients.
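The link between detection rate and the value of a negative result can be illustrated with a standard Bayesian residual-risk calculation. The sketch below is not taken from the paper: the function name is hypothetical, the carrier frequency of 1 in 30 is an arbitrary illustrative value, and the calculation assumes the test produces no false positives. The only figures drawn from the discussion above are the 74 single-exon deletions and the cohort size of 36,859.

```python
def residual_carrier_risk(carrier_frequency, detection_rate):
    """Chance that a person who screens negative is still a carrier.

    Bayes' rule, assuming no false positives:
    P(carrier | negative) = q(1 - d) / (q(1 - d) + (1 - q)).
    """
    missed_carriers = carrier_frequency * (1 - detection_rate)
    true_noncarriers = 1 - carrier_frequency
    return missed_carriers / (missed_carriers + true_noncarriers)


# Hypothetical carrier frequency, chosen only for illustration.
carrier_frequency = 1 / 30

for detection_rate in (0.90, 0.99):
    risk = residual_carrier_risk(carrier_frequency, detection_rate)
    print(f"detection rate {detection_rate:.0%}: "
          f"residual carrier risk about 1 in {1 / risk:,.0f}")

# Single-exon deletions reported in the cohort described above.
single_exon_fraction = 74 / 36859
print(f"single-exon deletions: about 1 in {1 / single_exon_fraction:.0f} patients")
```

Under these assumptions, raising the detection rate shrinks the residual risk after a negative result, which is the sense in which a higher detection rate makes the partner's negative more informative.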
Now, validating these was nontrivial. A typical way of validating a test is to buy or source samples that have known genotypes and run your test on them to ensure that what you find is concordant with what was known about them. The problem is that there really aren't many samples available that have copy-number variants that we can test. And so, we got our hands on all of the ones that we could in the genes that are on our panel, the 235 that we validated. This included 44 copy-number variants in commercially available samples. But as I said, we have 235 genes, and each of these genes has five to 20 exons. And so, those 44 samples were not really sufficient to demonstrate that our test is effective at finding copy-number variants anywhere and of any length.
So we used in silico simulations, where we computationally made more than 250,000 samples that harbor copy-number variants of different sizes and locations, and then ran our calling algorithm on those simulated samples. That's what really helped us demonstrate that it was sensitive and specific for finding those. In terms of tuning the assay, those simulations were actually super helpful too, because in the initial stages of development we could use the simulations to point out areas in the test where we may not have been as sensitive as we needed to be, and then we could go back in with additional molecular tricks to boost that sensitivity, such that it was robust and best serving patients.
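To make the idea of in silico validation concrete, here is a toy Python sketch. It is emphatically not the calling algorithm or simulation method from the paper: the gene size, noise level, threshold, and all function names are hypothetical. It plants heterozygous deletions of random size and location in simulated per-exon read depth, runs a simple threshold caller, and estimates sensitivity and specificity.

```python
import random

# All constants below are hypothetical values chosen for illustration.
N_EXONS = 12           # exons in the toy gene
NOISE_SD = 0.08        # spread of normalized read depth
CALL_THRESHOLD = 0.75  # between 1 copy (depth ~0.5) and 2 copies (~1.0)


def simulate_sample(rng, deletion=None):
    """Return normalized depth per exon; deletion is (start, length) or None."""
    depths = []
    for exon in range(N_EXONS):
        deleted = (deletion is not None
                   and deletion[0] <= exon < deletion[0] + deletion[1])
        expected = 0.5 if deleted else 1.0
        depths.append(rng.gauss(expected, NOISE_SD))
    return depths


def call_deleted_exons(depths):
    """Toy caller: flag every exon whose depth falls below the threshold."""
    return {i for i, depth in enumerate(depths) if depth < CALL_THRESHOLD}


def estimate_performance(n_samples=2000, seed=0):
    """Estimate (sensitivity, specificity) of the toy caller by simulation."""
    rng = random.Random(seed)
    exact_calls = 0
    clean_negatives = 0
    for _ in range(n_samples):
        # Simulated carrier: deletion of random length and position.
        length = rng.randint(1, N_EXONS)
        start = rng.randint(0, N_EXONS - length)
        truth = set(range(start, start + length))
        if call_deleted_exons(simulate_sample(rng, (start, length))) == truth:
            exact_calls += 1
        # Simulated non-carrier: no exon should be flagged.
        if not call_deleted_exons(simulate_sample(rng)):
            clean_negatives += 1
    return exact_calls / n_samples, clean_negatives / n_samples


sensitivity, specificity = estimate_performance()
print(f"sensitivity: {sensitivity:.3f}, specificity: {specificity:.3f}")
```

Repeating the run with a larger `NOISE_SD` or a different `CALL_THRESHOLD` shows where a caller like this starts to lose sensitivity, which is the kind of weak spot the simulations described above were used to find.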
Doctor, your paper discusses several relatively common diseases like congenital adrenal hyperplasia that have technically challenging molecular biology. What makes them challenging and what is the impact on the clinical efficacy of the panel?
The screen relies on capturing DNA from genes of interest and finding variants in that DNA. Now, sometimes a gene has a dysfunctional copy called a pseudogene that has a very, very similar DNA sequence to the gene itself, and this makes it very hard to distinguish them. Further, the pseudogene is typically dysfunctional because it has pathogenic variants within it. And so, when you put all this together, it's really hard, because when you're doing the DNA sequencing of the test, you sometimes can't tell if a fragment is from the gene or the pseudogene. And if you sequenced a fragment from the pseudogene, you might think that the gene has a pathogenic variant in it and incorrectly issue a report to the patient saying that they’re a carrier.
You really need to be careful with the biology and the bioinformatics, so that you can properly distinguish the source of the fragments that you're sequencing. This requires a ton of care and a lot of customization. Ultimately, though, by doing that, you're able to resolve whether the gene itself contains pathogenic variants, irrespective of what’s in the pseudogene copy. The reason that’s very important to do is that, while it’s technically challenging to perform sequencing on these regions, it is also very complicated for the genome itself to replicate and deal with those sequences.
So, there's actually a lot of interchange between genes and pseudogenes. They’re very dynamic regions where you can frequently get pathogenic variants moving from the pseudogene into the gene itself. That means these diseases that are technically challenging are also some of the most common. So, it was very important to develop custom tests for them, because they have a really high impact on the efficacy of the test. For example, congenital adrenal hyperplasia, like you mentioned, is expected to affect about 9.4 births per 100,000, and that’s actually a pretty massive component of the test’s total ability to find at-risk couples. So, it was very important to add those to the test.
Well finally, Dr. Muzzey, what's next? What do you see for the future in expanded carrier screening?
There are some technical improvements that we can make to the test. There could be a few additional single gene disorders that could be added. I think that polygenic risk is something that will become much more common in carrier screening. So, this is for diseases that are influenced not by just one gene, but by variants in multiple different genes. However, I think that another thing that's really important for carrier screening is just increased awareness and wider adoption of the test.
There are still many doctors who don't offer or aren't aware of expanded carrier screening, and many insurers that do not expressly have expanded carrier screening in their medical policy. That's one of the reasons that papers like this are extremely important: to show that it has analytical utility, that it is able to identify variants in patients’ DNA very well across many, many different genes. Speaking to that limited uptake, about 45% of pregnancies undergo carrier screening.
And within that, 45% of them are expanded carrier screening. So, there’s really a lot of room for growth there. But what I think is important to stress is that growth actually makes the test better ultimately, because it increases our scientific knowledge of whether people are affected or not, if they're carriers or not. As you build up the statistical heft behind the screening, it becomes easier to interpret variants that we observe in patients’ DNA, and so it actually makes the test itself better. So that’s the horizon for expanded carrier screening.
Dr. Dale Muzzey is the senior director of scientific affairs and staff scientist in computational biology at Counsyl, a company offering clinical DNA screening and genetic counseling. He has been our guest in this podcast from Clinical Chemistry. I am Bob Barrett. Thanks for listening.