May 2009: Volume 36, Number 5
Where do Labs Fit In?
By Bill Malone
Perhaps more than any other healthcare-related project to receive funding from the American Recovery and Reinvestment Act (ARRA), comparative effectiveness research has sparked anxiety and curiosity on the part of the public and those working in healthcare, including laboratorians. Even though ARRA specifically prohibits aiming federally funded comparative effectiveness research at decisions about coverage or reimbursement, the daunting rise of healthcare costs and the increasing government deficit have left many wondering whether this might become a method to ratchet down payment or take away choices from consumers.
According to experts, clinical labs and other providers are unlikely to see a change in Medicare reimbursement in the short term due to comparative effectiveness research. However, laboratorians do have an immediate opening to get involved in the direction that comparative effectiveness takes in their profession. As part of the wider concept of evidence-based medicine, comparative effectiveness research—whether federally funded or not—represents a new take on connecting lab medicine with outcomes and refining the questions surrounding test utility and value.
“A lot of labs are concerned about getting accurate and precise results with the turnaround required by the clinician, and then they leave it at that,” said Christopher Price, PhD, visiting professor in Clinical Biochemistry at the University of Oxford and an independent consultant. “We have focused very strongly on the basic and analytical science of laboratory medicine, but now we have to go further. The essence of comparative effectiveness is looking for the most effective test or treatment. Ultimately, true effectiveness is measured in the context of an improved health outcome—and this applies as much to a test as to a treatment.”
Facets of Effectiveness Research
Common Terms Found in Discussions of Comparative Effectiveness
Primary clinical effectiveness research. Primary clinical effectiveness research refers to structured research protocols to produce data on the results of one or more diagnostic or therapeutic interventions of interest. Examples include certain randomized controlled trials, practical clinical trials, cluster randomized trials, observational studies, and cohort studies, including registries. Some of these studies focus only on the efficacy of an intervention—the extent to which an intervention produces a beneficial result under ideal circumstances. But many also examine the effectiveness of an intervention when used under ordinary circumstances—including evaluation in broader patient populations and healthcare delivery settings, or the relative risks and benefits of competing therapies. Both types of evaluation are important to an understanding of which interventions work best, for whom, and under what circumstances.
Evidence synthesis. Evidence synthesis or secondary clinical effectiveness research refers to the structured assessment of evidence from multiple primary studies to derive conclusions, considered to have greater weight than an individual study alone. This includes systematic review and technology assessment, which both describe a systematic method of identifying, assembling, and interpreting a body of data to validate or extend the interpretation of single trials, lend context to individual trials, and, where possible, arrive at common conclusions. Systematic reviews are frequently published through the peer-reviewed literature, and many assessments are more narrowly tailored to assist in policy or practice decision-making.
Comparative effectiveness. Within the overall umbrella of clinical effectiveness research, studies of comparative effectiveness compare one diagnostic or treatment option with one or more others. Primary comparative effectiveness research involves the direct generation of clinical information on the relative merits or outcomes of one intervention in comparison to one or more others, and secondary comparative effectiveness research involves the synthesis of primary studies to allow conclusions to be drawn. Secondary comparisons of the relative merits of different diagnostic or treatment interventions can be done through collective analysis of the results of multiple head-to-head studies, or indirectly, when the treatment options have not been directly compared to each other in a clinical evaluation and inferences must be drawn from the effect of each intervention relative to a common comparator, often a placebo.
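The indirect comparison described above has a simple arithmetic core: if interventions A and B have each been compared to the same placebo, their relative effect can be estimated as the difference of the two trial effects, with the uncertainties adding. A minimal sketch of this Bucher-style adjusted indirect comparison follows; the numeric inputs are hypothetical and purely illustrative, not drawn from any study cited here.

```python
import math

def indirect_comparison(d_a_placebo, se_a, d_b_placebo, se_b):
    """Adjusted indirect comparison (Bucher method):
    estimate the A-vs-B treatment effect from separate
    A-vs-placebo and B-vs-placebo trials. Effects must be
    on the same scale, e.g. log odds ratios."""
    d_ab = d_a_placebo - d_b_placebo          # difference of effects
    se_ab = math.sqrt(se_a**2 + se_b**2)      # variances add
    ci_95 = (d_ab - 1.96 * se_ab, d_ab + 1.96 * se_ab)
    return d_ab, ci_95

# Hypothetical log-odds-ratio estimates (illustrative only):
# A vs placebo = -0.50 (SE 0.15); B vs placebo = -0.20 (SE 0.20)
effect, ci = indirect_comparison(-0.50, 0.15, -0.20, 0.20)
```

The key caveat, as the sidebar notes, is that this is an inference rather than a direct observation: it assumes the two placebo-controlled trials are similar enough in populations and settings for the subtraction to be meaningful.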
Source: Institute of Medicine. 2007. Learning What Works Best: The Nation's Need for Evidence on Comparative Effectiveness in Health Care. IOM website
New Funding and Focus
ARRA will drive comparative effectiveness research via two avenues: money and policy. First, it allocates $1.1 billion that will be divided three ways: $300 million to AHRQ (Agency for Healthcare Research and Quality), $400 million to NIH, and $400 million to the Secretary of Health and Human Services (HHS). The stimulus package also includes legislation to create the Federal Coordinating Council for Comparative Effectiveness Research. This 15-member council is charged with helping all the agencies of the federal government synchronize their efforts, and will set priorities and make recommendations to the secretary on how to spend the $400 million allocated to HHS.
But the legislation also makes clear that the council will not recommend clinical guidelines for payment, coverage, or treatment. However, ARRA calls on IOM (Institute of Medicine) to recommend research priorities for the money. IOM must produce a consensus report by June 30, 2009, that offers specific recommendations to Congress and the HHS Secretary for how to spend the funds.
At a March 4 House Ways and Means Committee meeting on President Obama’s 2010 budget, White House Office of Management and Budget Director Peter Orszag emphasized that the administration will not create an entity like the UK’s National Institute for Health and Clinical Excellence (NICE), which uses comparative effectiveness to develop clinical guidelines weighing costs to justify coverage decisions by the country’s National Health Service. The NICE model had stoked fears that the Obama administration might be trying the same thing.
The funds directed to AHRQ will go to its Effective Health Care Program, which falls under the agency’s Center for Outcomes and Evidence. The Effective Health Care Program synthesizes published and unpublished scientific evidence through comparative effectiveness reviews, which have yet to tackle laboratory tests. However, AACC has already submitted three lab-related suggestions to IOM’s Committee on Comparative Effectiveness Research Priorities as topics for AHRQ research (See Box, below).
AACC Recommends Three Comparative Studies to AHRQ
Under the American Recovery and Reinvestment Act, Congress set aside $1.1 billion for comparative effectiveness research. Of the total, $300 million went to AHRQ, whose Effective Health Care Program researches tests and treatments to determine whether different approaches carry significant advantages or disadvantages. AACC has suggested three topics for research at AHRQ: one pertaining to tight glycemic control, one to iron deficiency anemia, and one involving breast cancer.
Tight Glycemic Control—Does the use of relatively more precise and accurate glucose testing methodology achieve better management and tighter glycemic control in critically ill adult patients, thereby producing better health outcomes (reduced morbidity, mortality and length of stay), as compared with use of simple glucometers, which have relatively poorer analytical performance?
Iron Deficiency Anemia—Determine whether hematologic measures such as percentage of hypochromic erythrocytes or reticulocyte hemoglobin content are better indices than serum iron, transferrin and ferritin for assessment of iron deficiency and for guiding iron replacement therapy to achieve targeted hemoglobin concentrations in patients with chronic renal failure who are being treated with erythropoietin.
Breast Cancer—Determine if premenopausal women with breast cancer who have a poor or reduced metabolism genotype for CYP2D6 should be treated with tamoxifen or an aromatase inhibitor.
The fact that the money remains to be spent is good news for the lab community, said Mary Nix, MT(ASCP)SBB, a project officer for AHRQ’s Center for Outcomes and Evidence. Nix oversees the National Guidelines Clearinghouse and National Quality Measures Clearinghouse. “The IOM is going to be looking to the lab community and other stakeholders to provide input as to how this money should be spent,” she said. “There are a number of ways that labs can participate in the efforts of AHRQ, and in particular about comparative effectiveness.” Nix noted that labs can nominate topics through the Effective Health Care Program website, as well as offer feedback on draft reports.
Right now, Nix is working on a method guidance document that will be used by systematic reviewers who are doing comparative effectiveness research or effectiveness reviews of diagnostics. “The lab field is who we automatically look to more than other types of diagnostics, and it would be good to have some really robust comments submitted once we put that methods guidance on the web,” Nix said. “I think that will be key to making the document really useful to systematic reviewers who are evaluating lab tests in and outside our program.” Nix said she expects the first draft to be completed around the middle of 2009.
So far, AHRQ has found it difficult to perform full-fledged comparative effectiveness reviews for laboratory medicine, mostly because the evidence base these studies try to build upon is limited to begin with. AHRQ performed an evidence review on BNP, but found they just couldn’t make a comparison, said Nix. The study ended up covering the effectiveness of different types of BNP assays in different patient groups and in different clinical settings. “There just wasn’t literature and data to do a comparison across like patients, across like care settings, and even the assays were very different. But the goal is that we will be able to do that some time in the near future,” she said. “It’s really a struggle to do comparative effectiveness reviews of lab tests because the evidence base is so different from drugs or devices. That’s why we really need the input from labs on this.”
CDC Uses Comparative Effectiveness for Quality Improvement
The concept of comparative effectiveness is being applied not just to looking at tests and treatments, but also at lab practices. Now in its third year, the CDC Division of Laboratory Systems and Battelle Memorial Institute are developing methods for systematic evidence reviews and evidence-based practice recommendations for lab medicine. The initiative focuses on practices in the pre- and post-analytic stages of the testing process, with methods being pilot-tested using specific practices related to the topic areas of patient specimen identification, communication of critical value test results, and blood culture contamination. The Laboratory Medicine Best Practices Initiative (LMBP) has developed a system that will use both published and unpublished evidence to fill gaps in the literature.
According to CDC Senior Economist Susan R. Snyder, MBA, PhD, who oversees the project, the LMBP’s systematic evidence reviews are, in fact, a kind of comparative effectiveness research. “Systematic reviews constitute a preferred, high quality form of comparative effectiveness research since they rely on multiple studies that meet specified inclusion criteria. In general, the aim is to develop and use a review methodology that systematically, transparently, and critically appraises existing research to synthesize knowledge in a particular topic area related to the effectiveness of an intervention or practice,” she said. LMBP’s main focus is comparing the outcomes of alternative practices that address the same quality improvement or patient safety issue. LMBP is about to release a technical report on the second year of its pilot phase, “Developing Systematic Evidence Review and Evaluation Methods for Quality Improvement.” This report details LMBP’s pilot phase evidence review and evaluation methods applied to specific practices associated with patient specimen identification (e.g., bar coding systems) and communication of critical value test results (e.g., automated electronic notification systems).
Laboratorians can keep track of the initiative’s progress online.
Effects on Coverage
Even though the new Federal Coordinating Council for Comparative Effectiveness Research is not allowed to make recommendations about payment or coverage, it’s still possible that the results of comparative studies could seep into CMS (Centers for Medicare and Medicaid Services) coverage decisions. However, any changes that do happen will be measured and incremental, said Charles Root, PhD, founder and president of CodeMap, LLC. “There are very well-defined regulatory processes for reconsidering either national or local coverage policies, and these things work pretty slow,” said Root. “The payers themselves are not very fast at writing coverage and incorporating new data.”
Rather than CMS denying coverage for a test or treatment outright in light of compelling comparative effectiveness research, the more likely scenario is that specific coverage policies could be modified with a more detailed policy, creating more restrictive conditions that define when a test or treatment can be used. And even if CMS comes out with a new restrictive coverage policy, there are always ways to modify it, said Root. “There are enough ways to either restrict or expand policy, and if people get smart and use them, that’s the way new research can get translated into the actual payment system. But it won’t happen quickly.”
When CMS or its contractors do tighten coverage for a test, a drawn-out process ensues that is often left unresolved, hanging in bureaucratic limbo. For example, one of CMS’s largest contractors, National Government Services, issued a draft policy earlier this year that would have restricted coverage of vitamin D testing to patients with chronic kidney disease, osteomalacia, hypercalcemia, or rickets. AACC, the Endocrine Society, and other stakeholders voiced opposition to this move, noting that current literature shows that vitamin D testing is useful for many other indications such as osteopenia, diabetes, cancer, cardiovascular disease, and hyperthyroidism. Typically in such scenarios, the professional societies and other groups submit comments and essentially rewrite the policy for the contractor, citing the most recent evidence. The Medicare contractor may then finalize the draft with a less restrictive coverage policy, or just sit on it and stall indefinitely, explained Root. In the case of vitamin D, National Government Services has yet to release a final coverage policy.
The only swift or dramatic coverage changes that could occur will come from Congress, according to Root. “The system itself is not really set up to save much money from comparative effectiveness research, even though Medicare is not explicitly prohibited from considering costs in coverage determinations. But if Congress and the public become aware of something that clearly doesn’t work, then Congress may step in and say, ‘no, we’re not going to pay for that.’”
Coverage changes could also be bundled into broader healthcare reform, said Root. When coverage changes are proposed for individual tests or treatments, stakeholders tend to align against them. But if change comes in a larger wave as part of more comprehensive reform, coverage restrictions stand a better chance of making it through.
Try this at Home
With funding from the federal government pushing new research and bringing comparative effectiveness to the forefront, laboratorians are taking a hard look at how this concept fits into the evidence-based lab medicine principles the profession has grappled with for years. Comparative effectiveness research for lab tests can be relatively straightforward in theory but challenging in practice, since it requires solid evidence of the effectiveness of both tests being compared. “If one looks at lab medicine overall, I start off with the basis that the evidence base itself is not very good,” said Price. “We understand the science behind the marker, and we understand why the marker changes in relation to a particular condition. But of itself, that doesn’t naturally ensure that there’s utility on the part of the marker. And the consequence of that is there isn’t much evidence on effectiveness.”
Despite the difficulties inherent in collecting robust evidence, Price pointed out that comparative effectiveness research will draw upon the very same tools laboratorians have developed under the rubric of evidence-based laboratory medicine so far. In their book “Applying Evidence-Based Laboratory Medicine: A Step-By-Step Guide,” Price and his coauthor Robert Christenson, PhD, professor of pathology and medical and research technology at the University of Maryland School of Medicine in Baltimore, lay out a heuristic called the A5 cycle: ask, acquire, appraise, apply, and assess. This process aims to reliably translate evidence into practice, beginning with defining the clinical problem, and ending with assessing how well the evidence was applied, leading to an ongoing audit cycle that continually refines the evidence and its application.
Before taking on a head-to-head comparison of test vs. test, laboratorians should develop a firm grasp of whether performing one test has a significant effect on outcomes to begin with, stressed Christenson. He offered an example in the context of managing anticoagulation therapy. In this example, a test looks at a set of genes to determine the patient’s individual metabolic rate for an anticoagulation drug versus no testing of metabolic rate. The clinical question should be set up according to PICO (patient, indicator, control group, outcome): patients are those in need of chronic anticoagulation therapy, the indicator is the genetic test of metabolism, the control group is patients who don’t receive the test, and the outcome(s) would be indicators of better therapeutic management, such as fewer thromboembolic events, major bleeding episodes, etc. Price emphasized that the starting point has to be identifying the clinical need and asking the right question, and that building the evidence base has to begin here.
Applying the A5 cycle, PICO drives the format of the question: the acquire phase is searching published evidence or generating new evidence; the appraisal phase means closely examining the quality of the evidence and looking for bias; and, if the test is determined to be more effective than no testing, the application phase might include educating clinicians on how to use the test, making sure the lab delivers the test in the right way at the right time, and ensuring the accuracy and precision the evidence demonstrates are needed. Finally, in assessment, the lab reflects on (or audits) whether the test really worked as expected, if the evidence was applied successfully, and whether the clinical question was fully addressed.
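Christenson’s anticoagulation example can be written down as a simple structured record, which is one practical way a lab might capture a PICO question before starting the acquire phase. A minimal Python sketch follows; the field names and the keyword-search helper are illustrative assumptions of mine, not a scheme from Price and Christenson’s book.

```python
from dataclasses import dataclass, field

@dataclass
class PicoQuestion:
    """A clinical question structured per PICO
    (patient, indicator, control group, outcome)."""
    patient: str
    indicator: str
    control: str
    outcomes: list = field(default_factory=list)

    def as_search_terms(self):
        # A naive keyword set a reviewer might feed into a
        # literature search during the 'acquire' phase.
        return {self.patient, self.indicator, self.control, *self.outcomes}

# The anticoagulation example from the article, expressed as PICO:
q = PicoQuestion(
    patient="patients on chronic anticoagulation therapy",
    indicator="genetic test of drug metabolism",
    control="no testing of metabolic rate",
    outcomes=["thromboembolic events", "major bleeding episodes"],
)
```

Keeping the question in a structured form like this makes the later A5 phases auditable: the same record that drove the literature search in acquire can be revisited at assess to check whether each named outcome was actually addressed.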
Going forward, laboratorians must keep a razor-sharp focus on how a test impacts outcomes, said Christenson. “Comparative effectiveness is more than just a buzzword. It all boils down to a really simple thing: do patients who get the new test do better than patients who don’t? If they do, we do the test; if not, then forget it.” With the federal government becoming more involved in comparative effectiveness research with money from ARRA, Christenson said this is an important time to make sure the lab field is heard from. “I’m excited about this because I think comparative effectiveness will help us focus our resources on things that work. And I think with universal healthcare on the way, we can’t just continue with how we’re doing things now,” he said. “They’re going to have to focus on evidence-based lab medicine to help determine what works, and the better the cost-effectiveness of tests or treatments the more powerful the incentive to use it.” Price agreed. “This is a great opportunity to show how the laboratory contributes to more effective decision making,” he said. “It puts the laboratory at the ‘top table’ for both clinical and policy decision making.”