Surrogate markers

A recent conversation about an institution’s use of the A1c (a measurement of average blood glucose levels over roughly the preceding two to three months) to grade clinician performance and adjust compensation frustrated me. The issue was the misunderstanding and misuse of surrogate markers, those things we measure when we can’t measure what we really want to know.

Let’s start with a thought experiment using traffic safety as an analogy. Imagine that several independent national organizations study accident data and determine that drivers with the fewest speeding tickets, fewest accidents, and lowest traffic mortality tend to use their directional signals more consistently than drivers with frequent speeding tickets, high accident rates, and high traffic mortality. They develop a tool that reports the percent of turns made without signaling and find that a score less than 10 is associated with low risk and a score greater than 30 is associated with high risk. Insurers respond by providing discounts to ‘safe drivers’ with scores < 10 and surcharges for ‘unsafe drivers’ with scores > 30. Over the following five years:

  • The percent of drivers who signal consistently increases to a new level and then plateaus.
  • There is no change in speeding, texting or impaired driving behavior (the causes of the majority of accidents and fatalities).
  • The number of fatalities in cars where the driver did not signal goes down.
  • The actual rate of accidents and fatalities is unchanged.
  • The insurers claim success: signaling increased.
  • Independent observers claim failure: no change in accidents and fatalities.

People signaled more – but traffic fatalities were unchanged. Why? The well-intentioned insurers confused correlation with causation. Consistent use of signals is associated with safe driving because safe drivers use their signals appropriately. Safe driving causes signaling. THE REVERSE IS NOT TRUE. Safe driving requires many other safe driving habits, and there is no reason to expect that paying a driver to signal will change his speeding, texting or drinking behavior. The data identified a surrogate marker (signaling) that is a RESULT of safe practices but not a cause of safe practices. Incentivizing signaling will not change the root causes of unsafe driving.
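The thought experiment can be made concrete with a toy simulation. This is only a sketch under stated assumptions: every probability below is an invented number, and the function name simulate is hypothetical. Safe driving is modeled as the hidden cause of both signaling and low crash risk, while the incentive changes only signaling.

```python
import random

def simulate(n_drivers, incentive, seed=1):
    """Toy model: safe driving causes signaling; signaling causes nothing."""
    rng = random.Random(seed)
    signalers = 0
    crashes = 0
    for _ in range(n_drivers):
        # Assumed: half of all drivers have safe habits (invented figure).
        safe = rng.random() < 0.5
        # Crash risk depends only on driving habits, never on signaling.
        crash_risk = 0.02 if safe else 0.10
        crashed = rng.random() < crash_risk
        # Safe drivers usually signal; paying drivers makes everyone signal.
        signal_prob = 0.9 if (safe or incentive) else 0.4
        signals = rng.random() < signal_prob
        signalers += signals
        crashes += crashed
    return signalers / n_drivers, crashes / n_drivers

before = simulate(100_000, incentive=False)
after = simulate(100_000, incentive=True)
print("signaling rate: %.2f -> %.2f" % (before[0], after[0]))
print("crash rate:     %.3f -> %.3f" % (before[1], after[1]))
```

Because both runs share the same random seed, the same simulated drivers crash in both. Only the signaling rate moves, which is exactly what the insurers measured and the independent observers did not.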

The same sort of doomed quality incentive program, built on broken logic, occurs all too often in medicine. Attempting to leverage the A1c to improve outcomes is a common example. It is well established that groups of diabetics with a low A1c have fewer and less severe complications than groups with a high A1c. This makes the A1c an indirect or surrogate marker for the likelihood of diabetic complications in groups. A whole host of well-meaning individuals and institutions (CMS, insurers, quality reporters, the press, and my own institution) have jumped to the unfortunate conclusion that improving A1c statistics will improve diabetic outcomes.

(This completely ignores the fact that the ADA and other national organizations have specifically said that the A1c data for populations is only a rough guide to individual patient care, that targets for patients should be individualized rather than applied across the board as arbitrary one-size-fits-all numbers, and that these numbers should not be used to grade the quality of care by clinicians. But that discussion will have to wait for another day.)

What are the problems with using A1c data to evaluate and compensate/incentivize clinicians?

  • The sample size is too small to be valid for individual clinicians. The margin of error is so wide that one has to pool large numbers of clinician panels (usually tens or hundreds) to reach a usable number. (Discussed for Medicare data in an article in the December 9, 2009 issue of JAMA, with an accompanying editorial by Berwick.)
  • It falsely assumes that the clinician drives health outcomes the way a driver drives a car. The A1c is NOT something the clinician can control. Incentivizing the clinician for patient behavior is like incentivizing a passenger for driver behavior. Passengers can certainly play a small role by reminding the driver, but if the driver does not heed the reminders, the passenger is helpless to improve the driver’s safety score. (A passenger can – and should – refuse to ride with an unsafe driver. More about that later.)
  • It assumes that clinicians are not motivated to help the patient control their A1c unless they can get cash rewards. Clinicians find this an offensive concept. Further, the bulk of the evidence says it doesn’t work and can cause harm by replacing a social contract with a market contract.
  • We have many good prospective studies that show we can improve outcomes by lowering blood pressure or LDL. We have no comparably strong data for the A1c; what data there is is weak, at best.
  • We know that the risk of complications in diabetes is more strongly correlated with systolic blood pressure, LDL levels, smoking status and physical activity than it is with A1c levels. The clinical message from this is that it is more important to control BP and cholesterol, and help patients quit smoking and start exercising than to get the A1c below an arbitrary target.
  • Patients have limited resources. Adding medicines to get the A1c down may compromise their ability to pay for medicines for their blood pressure or lipids, which are arguably more important. It may also lead them (because it leads their clinician) to focus on moving the fasting sugar from 150 to 120 instead of on exercise, better meals, or giving up cigarettes.
  • We know that lowering sugars is subject to both the law of diminishing returns and to very real risks. The clinician can add insulin and up-titrate the dose, drive the average sugar down and improve the A1c. This can be associated with hypoglycemia (low blood sugar), which is uncomfortable at best and quite dangerous at worst.
  • Forcing the clinician to focus on the A1c also means forcing the patient to focus on the A1c. This is not patient-centric, impairs engagement and collaboration, and may distract from work on other important issues: major depression, domestic violence (the more time one spends thinking and talking about the A1c, the less likely one is to ask about safety), alcohol or other drug issues, obesity, obstructive sleep apnea...
  • It incentivizes the clinician to avoid caring for patients with poor A1c levels. (Paying passengers if their driver signals is more likely to change whom the passenger rides with than to change driver behavior.) Clinicians game the system: patients are referred to an endocrinologist so the PCP is no longer responsible for the outcome, or the patient is discharged from the practice for ‘non-compliance.’ The former is common; the latter is less so but will likely increase if incentives and punishments become more potent.
  • Just as rewarding drivers for signaling will not change the rate of accidents from speeding, texting or alcohol, rewarding clinicians for patient A1c results will not change the rate of heart attacks, renal failure, amputations and blindness. It will aggravate clinicians, deplete scarce resources, distract and detract from interventions that improve outcomes, make it harder to provide quality care, and thereby actually harm patients.
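The sample-size point in the first item of the list above can be illustrated with a rough calculation. This is a minimal sketch, not the method used in the JAMA article: it applies the standard normal-approximation margin of error for a proportion, and the panel sizes and the 50% "at target" figure are hypothetical numbers chosen only for illustration.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a proportion p observed in n patients."""
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical panels: one clinician's diabetic patients versus pooled panels.
for n in (30, 300, 3000):
    moe = margin_of_error(0.5, n)
    print(f"{n:5d} patients: 50% at target, plus or minus {moe:.1%}")
```

Under these assumptions, a single panel of 30 patients carries a margin of error of roughly 18 percentage points, wide enough that two clinicians delivering identical care could easily land on opposite sides of a bonus threshold by chance alone.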

Published, successful clinical quality improvement programs consistently follow the same stepwise approach:

  • Identify the process or outcome to be improved. (Most have been in the hospital setting where the environment is easier to control and there are fewer variables than with outpatients.)
  • Identify metrics that will show if the targeted outcome improves.
  • Identify and monitor additional metrics to ensure that there are no unanticipated negative outcomes.
  • Clinicians examine the processes of care and identify best practices.
  • Data is collected to see where care deviates from best practices – and why.
  • Systems and resources are deployed to support best practices.
  • Data is collected to assess the response to the interventions. 
  • The cycle is modified and repeated based on results.

Many institutions replace this process with a more simplistic (and lazy) misuse of the A1c as a surrogate for quality care. Typically, they perform no evaluation of the process of care, do not identify where or why problems originate, provide no resources or systemic support, and then hold clinicians financially responsible for patient behavior. This suggests either ignorance of the behavioral and medical literature about quality improvement, or a desire to claim a QI program on the cheap, without making a true commitment to either quality or improvement. 

I understand why non-clinical leaders and managers would not understand the clinical implications. It is harder to understand why intelligent physician leaders buy into this.


Links to more on this topic: