(Circulation. 1997;96:1157-1164.)
© 1997 American Heart Association, Inc.
From Duke University Medical Center, Department of Medicine, Division of Cardiology, Durham, NC.
Correspondence to Thomas M. Bashore, MD, PhD, Duke University Medical Center, Erwin Rd, Box 3012, Durham, NC 27710.
| Abstract |
|---|
Methods and Results Fifty image sequences from 31 interventional procedures were viewed both in the original (uncompressed) state and after 15:1 lossy Joint Photographic Experts Group (JPEG) compression. Experienced angiographers identified dissections, suspected thrombi, and coronary stents, and their results were compared with those from a consensus panel that served as a "gold standard." The panel and the individual observers reviewed the same image sequences 4 months after the first session to determine intraobserver variability. Intraobserver agreement for original images was not significantly different from that for compressed images (89.8% versus 89.5% for 600 pairs of observations in each group). Agreement of individual observers with the consensus panel for original images was not significantly different from that for compressed images (87.6% versus 87.3%; CI for the difference, -4.0% to 4.0%). Subgroup analysis for each observer and for each detection task (dissection, suspected thrombus, and stent) revealed no significant difference in agreement.
Conclusions The identification of dissections, thrombi, and coronary stents is not substantially impaired by the application of 15:1 lossy JPEG compression to digital coronary angiograms. These data suggest that digital angiographic images compressed in this manner are acceptable for clinical decision-making.
Key Words: angiography • imaging • computers
| Introduction |
|---|
Despite these obstacles, it is widely anticipated that digital options will replace analog modes of storage and transfer in the next 5 to 10 years. Among the reasons for this are the ongoing trends in the computer industry of increased performance and declining costs. In addition, high-capacity networks are being established within and between hospitals to allow for rapid transmission of large files. Despite these advances, maximum availability and distribution of digital cardiac images will be limited in the near future.
One option for accelerating the rate of adoption of digital technology is provided through the application of data compression methods to coronary angiograms. Compression methods can significantly reduce the size of any digitally stored data, including medical images, to various degrees of reversibility. Review of angiograms subjected to low levels of compression reveals that in many cases the compressed images appear almost indistinguishable from the original images.1 Intermediate levels of compression produce images that are slightly different in appearance from original studies but that might have enough information content to allow reliable diagnoses. These subjective impressions of image quality are noteworthy, but they are not an acceptable substitute for clinical data demonstrating that diagnostic accuracy does not suffer when images are compressed before their interpretation.
To carefully rule out a small deleterious effect of compression, an assessment of data compression must include sufficient observations to detect a small difference in diagnostic accuracy. A study with insufficient statistical power may lead users to conclude that a compression mode is clinically safe when it may in fact result in potentially dangerous errors with a finite frequency. The present study was designed, therefore, with a relatively large number of observations to test the hypothesis that 15:1 lossy Joint Photographic Experts Group (JPEG) compression would not diminish the recognition of important qualitative morphological features commonly observed in coronary angiograms.
| Methods |
|---|
The final selection of sequences to be used was made by a group of three cardiologists who were not participants in later phases of the study either as individual observers or as members of a "gold standard" consensus panel. If the group did not agree that an abnormality was present, the sequence was excluded from analysis. Fifty image sequences from 31 interventional procedures were included in the final protocol. The patients included 20 men and 11 women ranging in weight from 42 to 102 kg (mean, 74.7±19.7 kg). The selection panel identified 23 dissections, 7 suspected thrombi, and 9 stents in the final data set. Of the 50 sequences, the selection panel concluded that 15 sequences did not contain any of the three features.
Acquisition, Processing, and Compression of Image Sequences
Digital angiograms were acquired at 30 frames per second in the 7-in image intensifier mode on the DCI digital angiographic imaging system (Philips Medical Systems). A Philips Optimus OM2000 x-ray generator was used and resulted in a detected x-ray exposure per frame of 18 µR. Images were temporarily stored for screening on the DCI real-time disk at a resolution of 512×480 pixels, resulting in a pixel resolution of 0.2 to 0.25 mm. After selection of appropriate image sequences, the sequences were edited by removing some frames at the beginning and end for more efficient storage. After editing, a typical sequence consisted of 100 images, or three to four cardiac cycles. Sequences were then transferred to digital tape for storage and transfer. Images were processed off-line after import into the research network of the Cardiac Imaging Research Laboratory. Edge enhancement was performed with a fixed kernel and weighting factor to reproduce the degree of enhancement routinely used in our laboratory for clinical review of digital images. Compression was applied to the enhanced image sequences with a dedicated hardware compressor/decompressor on a DEC Alpha 3000/600 workstation (Digital Equipment Corp). The same quality factor was used for all images, resulting in an approximately 15:1 reduction in the size of each sequence. This quality factor was selected in advance on an empirical basis by review of angiograms subjected to different levels of JPEG compression.
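As a rough illustration of what a 15:1 reduction means for storage, the following sketch estimates the size of a typical edited sequence from the figures given above (512×480 pixels, 100 frames). The bit depth of 8 bits per pixel is an assumption, since it is not stated in the text, and the function name is hypothetical.

```python
def sequence_size_bytes(width, height, frames, bits_per_pixel=8):
    """Uncompressed size of an image sequence in bytes.

    bits_per_pixel=8 is an assumed value; the article does not state it.
    """
    return width * height * frames * bits_per_pixel // 8


raw = sequence_size_bytes(512, 480, 100)  # typical edited sequence
compressed = raw / 15                     # nominal 15:1 reduction

print(f"uncompressed: {raw / 1e6:.1f} MB")
print(f"after 15:1:   {compressed / 1e6:.1f} MB")
```

Under these assumptions a sequence of roughly 25 MB shrinks to under 2 MB, which is the scale of saving that motivates the study.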
Viewing Sessions
A final display data set of 100 sequences was produced by combining the compressed image sequences with the original uncompressed images. The viewing order was generated with a random-number generator (SAS System for Windows, Version 3.1, SAS Institute). This resulted in two subsets, with 50 sequences of images in each subset. In the first data subset, each of the sequences was viewed in the order determined by the random-number generator. Half of the sequences in this subset were JPEG-compressed and half were original images (the assignment to compressed or original was made before randomization). In the second display subset, the 50 sequences were in the same randomized order as in the first subset, except that the data format (original versus compressed) was reversed. For example, if sequence number 1 was compressed, then sequence number 51 was its original counterpart. All the images were stored on a high-speed digital disk for review on the DEC workstation with application software developed in our laboratory. Images were displayed on a 13-in video monitor (Sony Medical Systems) that has been used in our laboratory for clinical review of images. Ambient lighting was consistent with the levels used in our procedure rooms. Observers were allowed to adjust contrast and brightness settings on the console as desired.
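The construction of the display set described above can be sketched as follows. This is only an illustration: the study used the SAS random-number generator, how the format was assigned to each sequence is not stated (a random split is assumed here), and the function name and seed are hypothetical.

```python
import random


def build_display_set(n_sequences=50, seed=0):
    """Return a list of (sequence_id, format) pairs, 2 * n_sequences long."""
    rng = random.Random(seed)
    ids = list(range(1, n_sequences + 1))

    # Format assignment is made before the order is randomized.
    # Assumption: half of the sequences are chosen at random to be compressed.
    compressed_first = set(rng.sample(ids, n_sequences // 2))

    # Randomize the viewing order once.
    order = ids[:]
    rng.shuffle(order)

    first = [(s, "compressed" if s in compressed_first else "original")
             for s in order]
    # Second subset: same order, data format reversed.
    second = [(s, "original" if fmt == "compressed" else "compressed")
              for s, fmt in first]
    return first + second


display = build_display_set()
print(display[0], display[50])  # same sequence, opposite formats
```

With this layout every sequence is seen once in each format, and the two viewings of a sequence are always 50 positions apart.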
Four experienced angiographers individually reviewed each of the 100 sequences and were asked to indicate whether a dissection, thrombus, or stent was present. Observers were told that for any given sequence, there might be zero, one, or more than one type of finding. The nature of the sequence (compressed versus original) was not revealed, and the reviewers were not asked to identify which sequences were compressed. Observers were able to view the sequences at a rate of 30 frames per second and were able to slow the frame rate and view the images in either forward or reverse. No time limitation was imposed.
In parallel, a panel of three different angiographers reviewed the 50 original (uncompressed) image sequences and were given the same instructions as the individual observers, except that the decision regarding the presence or absence of an abnormality was made after discussion. Agreement of all three panelists was required for an abnormality to be considered present.
To determine intraobserver and interobserver agreement, the individual observers and the panel repeated the sessions after a 4-month interval. The images and compression ratio were the same for the second session, but a new randomized viewing order of image sequences was generated.
Statistical Analysis
The primary hypothesis of the study is that there is no change in the error rate for detection of morphological features (defined as the difference in agreement with the panel for compressed images versus original images). To determine the overall error rate across all observers, it was necessary first to test whether the error rates of the individual observers were independent. The results of a likelihood ratio test for independence, analyzed separately by end point, indicated a significant degree of dependence among the four observers (χ² values for independence ranged from 19.65 to 42.39 for 3 df, P<.0001). This dependence was taken into account by analyzing the various combinations of possible errors across the observers, weighted by the probability of the occurrence of each error. The combined response for each feature was modeled as a multinomially distributed sample of 50 in which there were five different possible outcomes: all correct, three correct, two correct, one correct, and none correct. The responses were analyzed for each feature separately, and the error rates, variances, and CIs were calculated by the CATMOD procedure for categorical analysis (SAS). Similarly, the results of a likelihood ratio test for independence of features by observers indicated no significant dependence between observer results for different features (χ² value for all four readers=8.29 for 4 df, P>.05). Thus, the results across features were combined to determine overall error rates for each observer. The combined CI for the overall error rate, ie, all observers and all features, was constructed by combining the individual end points with the standard inverse variance weighting formula.
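For readers unfamiliar with inverse variance weighting, a minimal sketch is given below. It assumes the individual end-point estimates are independent and approximately normally distributed; the function name and the example numbers are hypothetical and are not taken from the study data.

```python
import math


def inverse_variance_combine(estimates, variances, z=1.96):
    """Pool estimates with weights 1/variance; return (pooled, (low, high))."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total
    se = math.sqrt(1.0 / total)  # standard error of the pooled estimate
    return pooled, (pooled - z * se, pooled + z * se)


# Hypothetical differences in agreement for three end points.
pooled, ci = inverse_variance_combine(
    estimates=[0.015, 0.005, -0.010],
    variances=[0.0012, 0.0011, 0.0013],
)
print(f"pooled difference: {pooled:.4f}, 95% CI: {ci[0]:.4f} to {ci[1]:.4f}")
```

End points with smaller variance receive more weight, and the pooled CI is narrower than that of any single end point.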
As described above, observer agreement was defined as the fraction of the total number of pairs of observations in which it was agreed that a finding was either present or absent. The κ statistic was also used to assess intraobserver and interobserver variability. The κ statistic is defined as κ=(observer agreement-chance agreement)/(1-chance agreement). Of note, the level of chance agreement depends on the proportion of observed findings and is not necessarily 0.5. As described by Landis and Koch,2 κ=0 to 0.20 suggests slight agreement, κ=0.21 to 0.40 suggests fair agreement, κ=0.41 to 0.60 suggests moderate agreement, κ=0.61 to 0.80 suggests substantial agreement, and κ=0.81 to 1.0 suggests almost perfect agreement.
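The κ definition and the Landis and Koch categories can be illustrated with a short sketch. Chance agreement is computed here from the marginal proportions of the two sets of readings (Cohen's form of κ), which is an assumption because the exact chance term is not spelled out in the text. The function names and the example readings are hypothetical.

```python
def cohen_kappa(a, b):
    """Kappa for two equal-length lists of binary readings (1=present)."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    p_a = sum(a) / n  # proportion rated "present" in the first reading
    p_b = sum(b) / n  # proportion rated "present" in the second reading
    chance = p_a * p_b + (1 - p_a) * (1 - p_b)
    return (observed - chance) / (1 - chance)


def landis_koch(kappa):
    """Verbal label for kappa on the Landis and Koch scale."""
    if kappa < 0:
        return "poor"  # below-zero category of the same scale
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical readings of 10 sequences on two occasions.
first = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
second = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
k = cohen_kappa(first, second)
print(f"kappa = {k:.2f} ({landis_koch(k)})")
```

In this example the raw agreement is 80%, but chance agreement is 0.52 rather than 0.5, so κ is lower than the percent agreement alone would suggest.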
To determine whether compression altered intraobserver agreement, a two-sample test for binomial proportions was used. Probability values (of type I error) and CIs for differences in agreement were determined with the normal approximation to the binomial.
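A two-sample comparison of binomial proportions with the normal approximation can be sketched as follows. The pooled standard error is used for the test statistic and the unpooled standard error for the CI, which is a common convention and an assumption about the exact procedure used here. The counts in the example (539 and 537 of 600) are back-calculated from the rounded percentages 89.8% and 89.5% and are therefore approximate; the function name is hypothetical.

```python
import math


def two_proportion_test(x1, n1, x2, n2, z_crit=1.96):
    """Return (difference, z, two-sided P, CI) for two binomial proportions."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2

    # Test statistic under the null hypothesis of equal proportions.
    pooled = (x1 + x2) / (n1 + n2)
    se_test = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = diff / se_test
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail

    # Confidence interval for the difference (unpooled standard error).
    se_ci = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return diff, z, p_value, (diff - z_crit * se_ci, diff + z_crit * se_ci)


# Intraobserver agreement, original versus compressed (approximate counts).
diff, z, p_value, ci = two_proportion_test(539, 600, 537, 600)
print(f"difference = {diff:.3%}, z = {z:.2f}, P = {p_value:.2f}")
print(f"95% CI: {ci[0]:.1%} to {ci[1]:.1%}")
```

A CI that is narrow and centered near zero is what supports a conclusion of no substantial difference, rather than the nonsignificant P value alone.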
| Results |
|---|
Results for the primary hypothesis, that compression does not alter individual agreement with the consensus panel, are shown in Fig 3 and Table 1. For each type of finding, there were no significant differences in agreement with the consensus panel. In the case of dissections, agreement was very close for detection in the two types of images (88.5% agreement with panel for original images versus 87.0% agreement for compressed images, P=NS). Agreement was also very close for detection of suspected thrombi (87.5% agreement for original images versus 87.0% for compressed images, P=NS) and coronary stents (86.5% agreement for original images versus 87.5% for compressed images, P=NS). For each observer, there was no significant difference in detection of abnormalities, and for all observers and tasks combined, the agreement was very close (87.6% agreement for original images; 87.3% agreement for compressed images; P=NS; upper limit of 95% CI for difference in agreement, 4.0%). The effect of compression on intraobserver agreement is shown in Fig 4 and Table 2. For each task and each observer, there was no significant difference between intraobserver agreement for compressed and original images.
Fig 5 shows the intraobserver variability for interpretation of original images. Agreement was substantial for dissections (κ=0.69) and thrombi (κ=0.67) and moderate for stents (κ=0.58). When tasks were considered together, each observer demonstrated substantial intraobserver agreement (κ=0.64, 0.78, 0.63, and 0.78 for observers A, B, C, and D, respectively). Overall intraobserver agreement for all items and all individual observers together was substantial (κ=0.70). The agreement of the consensus panel with itself, between the first and second readings, was substantial (κ=0.76 for all tasks). The results, expressed as percent agreement, are shown in Table 3.
Interobserver variability was generally greater than intraobserver variability for interpretation of original images (Fig 6). Compared with the consensus panel, individual observers had substantial agreement for dissections (κ=0.71), moderate agreement for thrombus (κ=0.52), and fair agreement for stents (κ=0.38). Observer A demonstrated moderate agreement with the panel (κ=0.55), and observers B, C, and D showed substantial agreement (κ=0.75, 0.64, and 0.67, respectively). Overall interobserver agreement for all items and all individual observers together was substantial (κ=0.65). Table 4 lists representative results for each observer from original images in terms of percent agreement. Agreement between observers and the consensus panel was not significantly different for the analysis of the first subset of 50 sequences compared with the second subset (87.6% versus 87.9%, respectively). Similarly, there was no significant difference for a comparison between the first review session and the second review session 4 months later (87.8% versus 88.2%, respectively).
Also shown in Tables 2 through 4 are the corresponding results for identification of sequences in which no features were found, ie, those sequences identified as "normal." The agreement between observers and the consensus panel as well as for intraobserver agreement was in general less than for identification of other features, but there was no substantial difference in agreement between original and compressed images. The lower agreement is explained in part by the failure of the consensus panel to detect a number of stents; the members of the panel were asked to make their judgment solely on the sequence under consideration and were not provided the full body of information available to the selection panel.
| Discussion |
|---|
Concern remains about the lack of data investigating the impact of compression on diagnosis.5 Lossless compression methods permit restoration of the original image data without any distortion, with a modest reduction in storage requirements (on the order of 3:1); thus, no effects on diagnosis would be expected. Lossy compression methods produce irreversible changes in images but allow much greater reduction in storage requirements (20:1 compression). The irreversible methods, however, potentially degrade the reconstructed images; the degree of distortion is dependent on the method and the image content, but in general, distortion is greater for high ratios of compression. Digital compression algorithms use abbreviated codes that emphasize the important low-frequency content of images and take advantage of the similarity of nearby pixels. One of the most common families of compression methods is JPEG, which includes procedures for both lossless and lossy compression of images. For example, the recently developed and accepted DICOM 3.0 standard for exchange of digital cardiac angiographic data specifies a lossless JPEG method as part of the standard. Although this leads to some reduction in the required capacity of the designated exchange medium, it is much less of a reduction than is possible with the lossy methods. Because much of the information content of medical images is redundant, the use of lossy JPEG compression may lead to significant reduction in storage and transfer requirements while maintaining the important diagnostic features of images.
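The distinction between reversible and irreversible compression can be made concrete with a small sketch. It does not implement JPEG: general-purpose zlib compression stands in for the lossless case, and discarding the low-order bits of each pixel before zlib stands in for a lossy step. The synthetic image and all names are hypothetical.

```python
import zlib

WIDTH, HEIGHT = 64, 64


def synthetic_image():
    """Slowly varying background plus fine detail, one byte per pixel."""
    pixels = bytearray()
    for y in range(HEIGHT):
        for x in range(WIDTH):
            base = 16 * ((x + y) // 16)                 # low-frequency content
            detail = (5 * x * x + 7 * y + 3 * x * y) % 8  # fine detail
            pixels.append(base + detail)
    return bytes(pixels)


image = synthetic_image()

# Lossless: the original bytes are restored exactly.
packed = zlib.compress(image)
restored = zlib.decompress(packed)

# Lossy stand-in: drop the 4 low-order bits, then compress.
quantized = bytes(p & 0xF0 for p in image)
packed_lossy = zlib.compress(quantized)
restored_lossy = zlib.decompress(packed_lossy)
max_error = max(abs(a - b) for a, b in zip(image, restored_lossy))

print("lossless exact:", restored == image)
print("lossless ratio: %.1f:1" % (len(image) / len(packed)))
print("lossy exact:   ", restored_lossy == image)
print("lossy ratio:    %.1f:1" % (len(image) / len(packed_lossy)))
print("largest pixel error after lossy step:", max_error)
```

The lossy path cannot recover the discarded detail, which is the sense in which such methods are irreversible; whether the lost detail matters diagnostically is the question addressed by the present study.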
Previous work has examined the effect of compression on noncardiac imaging6 7 8 and echocardiography.9 Quantitative angiographic analysis of compressed images of phantom coronary stenoses has been explored in several recent abstracts and publications.10 11 12 13 Whiting et al14 found that with up to 12:1 lossy compression, observers were able to detect dynamically displayed computer-generated filling defects but that detection was impaired with 18:1 compression. Rigolin et al10 found that 15:1 JPEG compression of coronary angiograms did not increase intraobserver variability in the visual estimation of stenosis severity or reduce the precision of quantitative coronary angiographic measurements. This suggests that 15:1 JPEG compression may be adequate for determining the presence of a significant lesion in digital angiograms.
Because clinical decision making depends on recognition of subtle morphological features as well as the determination of stenosis severity, it is important to determine whether lossy compression impairs the detection of qualitative abnormalities. For example, the recognition of a coronary dissection after angioplasty has important immediate therapeutic implications. The inability to recognize a dissection may result in the failure to provide appropriate remedial therapy (such as stenting or anticoagulation). Even a "small" percentage of errors may be unacceptably high, because the consequences of each mistake may be substantial. In addition to their clinical importance, subtle morphological features such as dissections and thrombi are often more challenging to identify than stenoses. The complete evaluation of the adequacy of a compression standard must include analysis of the most difficult diagnostic tasks, particularly when the clinical implications of errors are potentially severe. By using a large number of observations, we were able to determine with confidence that the magnitude of any error resulting from 15:1 JPEG compression is small.
In addition to studying the effects of compression, these data provide information regarding the intraobserver and interobserver variabilities for diagnosis of morphological abnormalities. It was not the objective of the present study to perform a comprehensive evaluation of the diagnostic assessment of complex lesion morphology but rather to determine the effects of image compression on that task. Previous work has demonstrated surprisingly low agreement for the visual estimation of stenosis severity15 16 17 18 19 20 and for American College of Cardiology/American Heart Association lesion grading,21 22 23 24 25 26 even among very experienced observers. As in the present study, Hermans et al22 also found a high level of agreement (89%) for the detection of dissections. We were unable to find previous work investigating the variability of identification of stents, and interobserver agreement for the detection of thrombi has been variable.22 25 26 In the present study, agreement was reasonably good for the detection of thrombi and dissections. Agreement about the presence of Palmaz-Schatz stents was only fair, confirming the clinical impression that stent visualization can be difficult. The agreement regarding the absence of any feature was also only fair; in general, it was not as high as for the detection of any single feature. This may also be explained in part by the difficulty in identification of stents solely on the basis of a single angiographic sequence. Although limited in scope, these data can be used as a benchmark for comparison with other tasks and groups of observers and for sample size calculations.
Although these results are favorable for this compression mode, there are several important caveats. First, it is very difficult to demonstrate that a compression mode has no adverse effects. Interpretation of angiograms requires assessment of many morphological details. Ideally, one would investigate other features as well (such as calcification, TIMI flow, and collaterals). Although such features differ clinically from those evaluated in the present study, they do share spatial and contrast characteristics with the features assessed in this study. We chose to look at three findings at one level of compression so that a sufficient number of observations could be made for each type of detection task. In addition, the present study was not designed to look at a large number of compression levels to determine precisely the level at which diagnostic accuracy begins to suffer. Although our study was large, a multicenter investigation with even more image sequences, types of abnormalities, observers, and compression levels would provide more information to be certain about the implications of image compression. A study of this type, sponsored by the American College of Cardiology and the European Society of Cardiology, is currently under way.
The conclusions of this study should not be extended to other types of image compression, such as Moving Picture Experts Group (MPEG), wavelet, or even JPEG implementations that use customized parameters. The "baseline" or default JPEG algorithm was selected for a number of reasons, including the widespread availability of software and hardware options for compression and decompression of digital image data at a continuous range of image quality. This is reinforced by the recent introduction of lossy JPEG methods into commercial products by angiographic equipment manufacturers. An additional factor for the use of JPEG in this evaluation was the temporal integrity of individual JPEG frames, ie, there is a unique one-to-one relationship between a given compressed image frame and the x-ray exposure used for the acquisition of the original image. Some compression methods, such as MPEG, achieve high data compression by calculating approximate "difference" images from a small number of original images spaced widely in time and may potentially lose this unique temporal relationship.
The degree of degradation resulting from lossy image compression is a function of numerous parameters in the original images that, in turn, are dependent on equipment in the x-ray imaging chain, patient thickness, geometric magnification, and detected x-ray dose. In our study, images suboptimal in the angiographic technique were excluded so as to determine whether compression impairs diagnostic accuracy in relatively "best-case" circumstances. Similarly, only experienced angiographers participated as observers. To determine whether these results are generally applicable, it would be useful to perform a similar study with less experienced angiographers interpreting images of various levels of quality, which would be representative of the full spectrum of image data likely to be encountered clinically.
It would be unrealistic to insist that one level or mode of compression be used for all types of display, transfer, and storage. For applications that do not have an immediate impact on patient care, small amounts of image distortion and diminution in diagnostic accuracy may be tolerable. For example, high levels of compression may enable cardiologists at a referral center to send images from a patient's study on a floppy disk to the primary care provider at the time of discharge for illustrative purposes, to complement the written discharge and catheterization summary. Studies that are relatively old (>5 years) with a low probability of being used for clinical purposes could be stored inexpensively in a compressed format. For these applications, image processing methods such as spatial filtering for edge enhancement can be applied before image compression. The degree of compression that would be acceptable when image processing is performed after compression may be lower than that evaluated in this study. Our results are sufficient to establish 15:1 JPEG compression after edge enhancement as safe for these applications.
In contrast, when immediate decisions regarding patient care depend on the results of image interpretation, a rigorous standard must be met before a compression mode should be broadly applied. Although these data are encouraging, we would not advocate that 15:1 JPEG compression be adopted as a universal standard for display and archiving. Rather, original images (or those stored with lossless compression) should be used for short-term interpretation and for long-distance consultation with therapeutic implications. Only as more information is obtained about the clinical utility of compression can standards be expected to be relaxed.
| Acknowledgments |
|---|
Received October 7, 1996; revision received February 3, 1997; accepted February 16, 1997.