(Circulation. 1998;97:707.)
© 1998 American Heart Association, Inc.
Measurement of Clinical Efficacy in Studies of Heart Failure
Thomas S. Rector, PhD
Senior Research Associate,
Cardiovascular Division,
University of Minnesota Medical School,
Minneapolis, Minn
To the Editor:
A recent clinical investigation reported by Packer et
al¹ used a multiplicity of end points to assess
the efficacy of carvedilol for heart failure because the most
appropriate measure(s) have not been firmly established. Statistically
significant favorable effects of carvedilol compared with placebo were
observed for measures commonly used in clinical practice, such as NYHA
classification, physician's assessment of changes in clinical status,
and asking patients if they felt better compared with baseline. These
congruent results do not represent independent assessments.
Other measures, including a previously validated quality-of-life
questionnaire, did not demonstrate significant
differences.²,³ These discrepant results may have
been due to differences in the content of measures, timing of
assessments, and statistical methods. Nevertheless, the authors
concluded that these results have important implications for future
clinical trials, implying that simple symptom assessments were adequate
to assess clinical efficacy because they correspond with clinical
practice and demonstrated differences compared with placebo. Are the
simple symptom assessments adequate measures of therapeutic
efficacy?
Reliable clinical measurements result from methods that can be applied
consistently at different times and by different
investigators.⁴ The NYHA classification has been
shown to have poor interobserver agreement, in part because asking about
"ordinary" physical activities is rather
imprecise.⁵ Using standard activities such as
walking a specific distance or climbing a flight of stairs
can enhance this measure's reliability,⁵ but
concerns about what investigators take into consideration persist. For
example, 60% of the patients in the carvedilol study were classified
as NYHA class III or IV. The protocol specified that class III patients
should be symptomatic walking 200 yards or up a flight of
stairs. However, the mean distance walked in a corridor was >345 m,
and all patients had to walk at least 150 m. Perhaps these
patients developed symptoms at much shorter distances and continued to
walk. Another possibility is that protocol criteria for NYHA
classification were not applied consistently.
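The arithmetic behind this comparison can be made explicit. The short Python sketch below is only an illustration: it converts the 200-yard protocol criterion to meters and sets it beside the two corridor-walk figures quoted above. The variable names are hypothetical, and the only value not taken from the text is the yard-to-meter conversion factor (0.9144 m per yard, exact by definition). Note that the reported mean is for all patients, not only those in class III or IV, so the ratio is indicative rather than a direct test of the protocol.

```python
# Illustrative comparison of the NYHA class III protocol criterion
# with the corridor-walk distances quoted in the letter.

YARD_IN_METERS = 0.9144  # exact international definition of the yard


def yards_to_meters(yards: float) -> float:
    """Convert a distance in yards to meters."""
    return yards * YARD_IN_METERS


# Protocol: class III patients should be symptomatic walking 200 yards.
class_iii_threshold_m = yards_to_meters(200)

# Figures quoted in the letter (the mean is a lower bound: ">345 m").
mean_walk_lower_bound_m = 345.0
minimum_walk_m = 150.0

# How far the reported mean exceeds the symptom threshold.
ratio_mean_to_threshold = mean_walk_lower_bound_m / class_iii_threshold_m

print(f"Class III symptom threshold: {class_iii_threshold_m:.2f} m")
print(f"Mean walk / threshold: at least {ratio_mean_to_threshold:.2f}")
print(f"Minimum walk below threshold: {minimum_walk_m < class_iii_threshold_m}")
```

On these numbers the symptom criterion is about 183 m, so the mean distance walked was nearly twice the distance at which class III patients were expected to be symptomatic, whereas the 150 m entry minimum lies below it.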
Similarly, asking patients if they "feel better or worse" does not
specify what the patient or investigator should focus on each time this
measurement is made. Indeed, one doesn't know what is actually being
measured when patients say they feel better. Is the response based on
changes in symptoms of heart failure or some other aspect of their
care? The large placebo response seen in the carvedilol and other
studies raises serious concerns about what was measured by these
so-called "global" questions.⁶ Single
questions are not global measures in the sense that patients do not
consider many aspects of their heart failure when asked if they feel
better. These potential measurement problems can be minimized by
carefully designed written questionnaires that ask the same specific
questions each time they are administered.
Measures of quality of life extend well beyond simple assessments of
symptoms. They focus on how changes in symptoms
affect the individual's activities and sense of
well-being.⁷ Statistically significant changes in
simple measurements of symptoms may be insufficient to alter lifestyle.
Furthermore, quality of life can be affected by more than symptoms of
heart failure. For example, side effects may adversely
affect quality of life. The more frequent dizziness reported by
patients receiving carvedilol compared with those receiving placebo may
have reduced their quality of life even though the frequency of
dyspnea, but not fatigue, was lower in the carvedilol group. Clearly,
commonly used symptom assessments cannot serve as adequate measures of
quality of life.
Symptoms are certainly an important component of clinical efficacy.
Selection of measurements of symptoms for clinical trials should not be
unduly influenced by probability values in studies of investigational
therapies. Rather, we should be explicit about which symptoms are
important to measure to determine if a treatment has value to patients.
We should then develop unambiguous measures that comprehensively
reflect what is judged to be important (ie, valid measures) and that
provide for consistent application (ie, reliable measures) in
both clinical trials and practice. On the basis of these criteria,
assessments such as the NYHA classification and so-called global
questions are not the best possible measures of therapeutic efficacy.
More recent attempts to develop comprehensive, reliable, and valid
written questionnaires should not be discarded because they don't
demonstrate statistically significant effects in some clinical trials.
Perhaps they are providing more meaningful data. We must continue to
improve our measures of symptoms, adverse effects, and quality of life
to understand the true value of treatments for heart failure.
References
1. Packer M, Colucci WS, Sackner-Bernstein JD, Liang CS,
Goldscher DA, Freeman I, Kukin ML, Kinhal V, Udelson JE, Klapholz M,
Gottlieb SS, Pearle D, Cody RJ, Gregory JJ, Kantrowitz NE, LeJemtel TH,
Young ST, Lukas MA, Shusterman NH. Double-blind, placebo-controlled
study of the effects of carvedilol in patients with moderate to severe
heart failure: the PRECISE trial. Circulation. 1996;94:2793–2799.
2. Rector TS, Cohn JN. Assessment of patient outcome with the
Minnesota Living with Heart Failure questionnaire: reliability and
validity during a randomized, double-blind, placebo-controlled trial of
pimobendan. Am Heart J. 1992;124:1017–1025.
3. Rector TS, Kubo SH, Cohn JN. Validity of the Minnesota Living
with Heart Failure questionnaire as a measure of therapeutic response
to enalapril or placebo. Am J Cardiol. 1993;71:1106–1107.
4. Feinstein AR. An additional basic science for clinical
medicine, the development of clinimetrics. Ann Intern Med. 1983;99:843–848.
5. Goldman L, Hashimoto B, Cook EF, Loscalzo A. Comparative
reproducibility and validity of systems for assessing
cardiovascular functional class: advantages of a new
specific activity scale. Circulation. 1981;64:1227–1234.
6. Rector TS, Johnson G, Dunkman B, Daniels G, Farrell L, Henrick
A, Smith B, Cohn JN. Evaluation by patients with heart failure of the
effects of enalapril compared with hydralazine plus isosorbide
dinitrate on quality of life: V-HeFT II. Circulation.
1993;87(suppl VI):VI-71-VI-77.
7. Testa MA, Simonson DC. Assessment of quality of life outcomes.
N Engl J Med. 1996;334:835–840.