Abstract
Purpose
To estimate the visually lossless threshold for Joint Photographic Experts Group (JPEG) 2000 compression of digital chest radiograph images.
Materials and Methods
Fifty selected chest radiograph images were compressed to five different levels: reversible (as negative control) and irreversible 5:1, 10:1, 15:1, and 20:1. By alternately displaying the original image and its paired compressed image on the same monitor, five radiologists independently determined whether the image pairs had detectable differences. For each reader, we compared the proportion of image pairs (the compressed image and the original image) rated as having detectable differences between the reversible compression and each of the four irreversible compressions using the exact test for paired proportions.
Results
For each reader, the proportion of image pairs rated as having detectable differences was not significantly different between the reversible compression and the irreversible 5:1 and 10:1 compressions. However, the proportion increased significantly with the 15:1 and 20:1 irreversible compressions versus the reversible compression in all readers (p = 7.4 × 10⁻²² to 0.027).
Chest radiographs are the most frequently performed radiologic examination and account for a large portion of medical image data. For example, the 16,110 chest radiographs performed at the Seoul National University Bundang Hospital in September 2009 produced a data volume of 229.9 gigabytes. Irreversible ("lossy") image compression appears to be an immediate and effective means of reducing the operational costs of transmitting and storing medical image data (1, 2, 3, 4). However, such compression techniques are not always accepted by radiologists because of concerns about artifacts that could hinder diagnosis.
Two main approaches have been used to determine optimal compression thresholds: the diagnostically lossless approach (5, 6, 7, 8, 9, 10, 11) and the visually lossless approach (12, 13, 14). The diagnostically lossless approach aims to preserve diagnostic accuracy in compressed images. Although this approach addresses diagnostic performance directly, its practicability is limited because the compression threshold intrinsically varies with the diagnostic task itself (e.g., the size of the target lesion) (12). The visually lossless approach is based on the concept that if compression artifacts are imperceptible, they should not affect the diagnosis. The latter approach has been advocated as more robust and conservative and, therefore, more suitable for medical image compression than the former. In this study, we focus on the principle of visual losslessness.
If radiologists cannot distinguish a compressed image from the original (non-compressed) image, there is no basis for arguing that this "visually lossless" compression impedes diagnostic accuracy (12, 15). In other words, a visually lossless threshold (VLT) for image compression can be higher than a mathematically lossless threshold (reversible threshold) and lower than a diagnostically lossless threshold. Although the visually lossless criterion would likely allow only a relatively low compression level, this conservative criterion would be more readily accepted, even by radiologists skeptical of irreversible compression (12).
The effect of image compression on chest radiographs has been studied since the early 1990s. The acceptable compression ratios were reported to be as high as between 10:1 and 25:1 (7, 8, 9, 16). These published studies all evaluated diagnostic performance (i.e., the diagnostically lossless threshold), typically in a receiver operating characteristic study.
The VLT has been studied for abdominal and chest computed tomography (CT) images compressed with JPEG 2000, and these studies show that the VLT differs with body part, scan parameters, and modality (4, 14, 17, 18, 19, 20, 21, 22, 23). To our knowledge, there is only one published report on VLT assessment in compressed digital chest radiographs (13). In that study, the chest radiographs were compressed with the JPEG algorithm and displayed on film and CRT monitors.
The purpose of this study was to evaluate the visually lossless threshold of irreversibly compressed chest radiographs under recent imaging conditions, i.e., digital radiography with a flat detector, the JPEG 2000 compression algorithm, and a flat-panel liquid crystal display.
The Institutional Review Board at the Seoul National University Bundang Hospital approved the use of clinical images in this study. Because the images were anonymized to preserve patient confidentiality, the requirement for informed patient consent was waived.
The study included 50 patients (age range, 0.6-79 years; mean age, 48.6 years; 26 males and 24 females) who underwent chest radiography using a commercial digital radiography system (DigitalDiagnost; Philips Medical Systems, Shelton, Conn, US) at the Seoul National University Bundang Hospital in September 2009.
A chest radiologist with seven years of experience selected four normal chest radiographs, 40 chest radiographs showing specific abnormal findings, and six chest radiographs showing medical instrumentation or post-operative changes for inclusion in this study. The specific abnormal findings were selected according to the glossary of terms for thoracic imaging compiled by the Fleischner Society (Table 1) (19). This glossary consists of 107 terms describing anatomy (n=19), specific disease entities (n=10), and specific abnormal findings (n=66), together with synonyms that refer to other terms (n=12). We excluded the terms describing anatomy, the specific disease entities, the synonyms, and the specific abnormal findings that can be seen on CT only (n=26), and selected the remaining 40 terms describing specific abnormal findings that can be seen on chest radiographs.
Each of the 50 original images had a bit depth of 16 bits/pixel packed into two bytes. The compression ratio was defined as the ratio of original image file size (16 bits/pixel) to the compressed size (bits/pixel) (20). Using a JPEG 2000 algorithm (Pegasus Imaging Co., Tampa, FL, USA), each image was compressed to five different compression ratios: reversible (as negative control) and irreversible 5:1, 10:1, 15:1, and 20:1. The encoder was set to default settings (20): single tile; 6 levels of wavelet decomposition; size of code block 64×64; size of precinct 32,768×32,768; and a single layer. The actual compression ratios achieved for the four nominal irreversible ratios were 4.98 ± 0.01 (mean ± SD), 9.92 ± 0.04, 14.82 ± 0.10, and 19.67 ± 0.17, respectively. Minute differences in the actual compression ratios from the nominal compression ratios were considered inconsequential in this study.
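As a worked illustration of the compression-ratio definition above, the following minimal sketch computes the ratio from file sizes and converts it to a compressed bit rate. The function names and the example image size are hypothetical and are not taken from the study; only the 16 bits/pixel original depth comes from the text.

```python
def compression_ratio(original_bytes: int, compressed_bytes: int) -> float:
    """Ratio of the original file size to the compressed file size."""
    if compressed_bytes <= 0:
        raise ValueError("compressed size must be positive")
    return original_bytes / compressed_bytes


def compressed_bits_per_pixel(ratio: float, original_bpp: int = 16) -> float:
    """Bit rate of the compressed image for a given compression ratio."""
    return original_bpp / ratio


# Hypothetical image: 3000 x 3000 pixels stored at 2 bytes (16 bits) per pixel.
original_size = 3000 * 3000 * 2
# Hypothetical compressed file size for that image.
compressed_size = 1_800_000

ratio = compression_ratio(original_size, compressed_size)
print(f"compression ratio {ratio:.2f}:1, "
      f"{compressed_bits_per_pixel(ratio):.2f} bits/pixel")
```

Under this definition, a nominal 10:1 ratio corresponds to 1.6 bits/pixel for a 16 bits/pixel original.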
Each compressed (and then decompressed) image was paired with its corresponding original, yielding 250 image pairs (50 images×5 compression ratios). Five radiologists (fourth-year residents), each with three years of experience interpreting chest radiographs, participated in the study. Each reader was informed of the purpose of the evaluation and given a description of the study protocol. The 250 pairs of original and compressed images were randomly assigned to five reading sessions so that no patient appeared more than once in a session. The order of the reading sessions was varied among readers. Reading sessions were separated by a minimum of one week.
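One way to realize such an assignment is sketched below. This is an illustrative assumption, not the procedure actually used in the study: for each patient, the five compression levels are randomly distributed over the five sessions, which guarantees that each session contains every patient exactly once. The function name, the seed, and the level labels are hypothetical.

```python
import random

LEVELS = ["reversible", "5:1", "10:1", "15:1", "20:1"]


def assign_sessions(n_patients=50, levels=LEVELS, seed=0):
    """Assign each (patient, level) pair to one of len(levels) sessions
    so that no patient appears twice in the same session."""
    rng = random.Random(seed)
    sessions = [[] for _ in levels]
    for patient in range(n_patients):
        shuffled = list(levels)
        rng.shuffle(shuffled)  # random level-to-session mapping per patient
        for session, level in zip(sessions, shuffled):
            session.append((patient, level))
    for session in sessions:
        rng.shuffle(session)  # randomize presentation order within a session
    return sessions


sessions = assign_sessions()
print([len(s) for s in sessions])
```

With this scheme each session holds 50 pairs, but the number of pairs per compression level within a session is left to chance rather than balanced.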
To compare an image pair, we used a previously reported image presentation method (21, 22, 23, 24, 25, 26). On a single monitor, the reader could rapidly toggle between the two images. The order of the original and compressed images was randomized and blinded to the readers. The reader could return to the first image as desired. Each reader independently determined whether the second image was identical to the first image or whether any detectable difference was present (binary response). This method is known to be extremely sensitive to image differences (12) and therefore provides a conservative estimate of the visually lossless threshold. When making comparisons, the readers were asked to pay attention to structural detail, particularly pulmonary vascular structures, bone, soft tissues, and, if present, abnormally increased pulmonary opacities such as nodules, masses, and consolidations.
Images were displayed in a one-by-one format using Digital Imaging and Communications in Medicine (DICOM) image-viewing software (M-view version 5.4, Infinitt Healthcare, Seoul, Korea), a flat-panel monochrome monitor (ME315, Totoku, Tokyo, Japan) with a matrix size of 2,048×2,560 and a diagonal display size of 20.8 inches (52.8 cm), and matching video hardware (LV32P1, Totoku, Tokyo, Japan). The readers were encouraged to adjust the window center and level settings. Because reading distance can affect the readers' sensitivity to compression artifacts (13), the reading distance was limited to the range used in clinical practice. A research assistant had measured this range, 35-75 cm, during 30 minutes of each reader's clinical work by aiming a laser beam from in front of the reader's forehead at a ruler perpendicular to the monitor. In a similar manner, the research assistant monitored the reading distance during the visual analysis and instructed the readers to keep their reading distance within this range.
Further magnification was not allowed. The ambient room light was subdued. Reviewing was conducted at the readers' convenience, without a time constraint.
For each reader and for each compression ratio, the proportion of image pairs rated as having detectable differences and the corresponding 95% confidence interval (27) were calculated. For each reader, the proportions in the irreversible 5:1, 10:1, 15:1, and 20:1 compressions were compared with that in the reversible compression (as negative control) using the exact test for paired proportions (28). A p-value of less than 0.05 was considered to indicate a statistically significant difference. If the proportion at an irreversible compression ratio was significantly different from that in the reversible compression, we determined the VLT to be below that compression ratio. Inter-observer agreement over the 250 image pairs was measured using kappa statistics for multiple reviewers (29). StatsDirect version 2.7.2 (StatsDirect Ltd., Cheshire, UK) was used for the statistical analyses.
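The two per-reader calculations can be sketched as follows. This is a minimal illustration under stated assumptions rather than a reproduction of the StatsDirect output: the text does not state which of the interval methods compared in reference 27 was used, so the Wilson score interval is assumed here, and the paired comparison is written as a two-sided exact binomial (McNemar-type) test on the discordant pairs. The function names and the example counts are hypothetical.

```python
from math import comb, sqrt


def wilson_interval(k, n, z=1.959964):
    """Wilson score interval (assumed method) for k positives out of n."""
    p = k / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)


def exact_paired_p(b, c):
    """Two-sided exact test for paired proportions on discordant pairs.

    b: pairs rated different only under the irreversible compression
    c: pairs rated different only under the reversible compression
    Returns None when there are no discordant pairs (test undefined).
    """
    n = b + c
    if n == 0:
        return None
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical reader: 1/50 pairs rated different at one compression level.
print(wilson_interval(1, 50))
# Hypothetical discordant counts for one reader and one comparison.
print(exact_paired_p(7, 0))
```

Note that when a reader rates no image pair as different under either condition, there are no discordant pairs and the function returns `None`, reflecting that a p-value cannot be computed in that case.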
Each reader rated 0-2% (0/50 to 1/50) of the image pairs for the reversible compression (as negative control) and 0-4% (0/50 to 2/50) of the image pairs for the irreversible 5:1 compression as having detectable differences between the compressed and original images. The corresponding proportions were 0-12% (0/50 to 6/50) for the 10:1 compression, 14-70% (7/50 to 35/50) for the 15:1 compression, and 58-96% (29/50 to 48/50) for the 20:1 compression (Table 2) (Fig. 1). The kappa statistic for the five readers' responses was 0.686. Kappa statistics for each compression level were also analyzed, but the kappa values were not statistically significant (p = 0.607-0.676).
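To illustrate how a multi-reader agreement statistic of this kind is computed, the following minimal sketch implements Fleiss' kappa for binary ratings with the same number of readers for every image pair. This is an assumption for illustration: the study cites the method of Fleiss and Cuzick (29) as implemented in StatsDirect, which may differ in detail, and the rating counts below are hypothetical rather than study data.

```python
def fleiss_kappa_binary(positive_counts, n_raters):
    """Fleiss' kappa for binary ratings.

    positive_counts[i] is the number of raters (out of n_raters) who
    reported a detectable difference for image pair i.
    """
    n_items = len(positive_counts)
    # Overall proportion of "difference detected" ratings.
    p_pos = sum(positive_counts) / (n_items * n_raters)
    p_neg = 1 - p_pos
    # Observed agreement: share of agreeing rater pairs, averaged over items.
    per_item = [
        (k * (k - 1) + (n_raters - k) * (n_raters - k - 1))
        / (n_raters * (n_raters - 1))
        for k in positive_counts
    ]
    p_obs = sum(per_item) / n_items
    # Agreement expected by chance from the marginal proportions.
    p_exp = p_pos ** 2 + p_neg ** 2
    if p_exp == 1.0:
        raise ValueError("kappa is undefined when all ratings are identical")
    return (p_obs - p_exp) / (1 - p_exp)


# Hypothetical counts for five image pairs rated by five readers.
print(round(fleiss_kappa_binary([5, 0, 5, 0, 3], n_raters=5), 3))
```

A value of 1 indicates perfect agreement, and a value of 0 indicates agreement no better than chance.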
Reader 5 rated none of the image pairs in the reversible compression and the irreversible 5:1 compression as having detectable differences between the compressed and original images. Consequently, a p-value could not be calculated for the comparison between the reversible and irreversible 5:1 compressions for reader 5. For readers 1, 2, 3, and 4, the proportion of image pairs with detectable differences was not significantly different between the reversible and irreversible 5:1 compressions. For the irreversible 10:1 compression, the proportion increased in four of the five readers (readers 1, 2, 3, and 5), but the difference between the reversible and irreversible 10:1 compressions was not statistically significant for any reader.
However, the proportion increased significantly with the irreversible 15:1 and 20:1 compressions versus the reversible compression in all readers (p = 7.4 × 10⁻²² to 0.027). We therefore concluded that the VLT lies between 10:1 and 15:1.
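The decision rule used to bracket the VLT can be sketched as follows, assuming the tested ratios are scanned in increasing order and the first ratio with a significant difference from the reversible control gives the upper bound. The function name and the p-values in the example are hypothetical and are not the values in Table 2.

```python
def bracket_vlt(p_values, alpha=0.05):
    """Return (lower, upper) compression ratios that bracket the VLT.

    p_values maps a compression ratio (e.g., 10 for 10:1) to the p-value
    of its comparison with the reversible control, or None when the
    p-value could not be calculated (no discordant pairs).
    """
    lower, upper = None, None
    for ratio in sorted(p_values):
        p = p_values[ratio]
        if p is not None and p < alpha:
            upper = ratio  # first ratio with a significant difference
            break
        lower = ratio  # highest ratio without a significant difference
    return lower, upper


# Hypothetical p-values for a single reader.
example = {5: None, 10: 0.125, 15: 0.016, 20: 1e-6}
print(bracket_vlt(example))
```

For these hypothetical values the function returns `(10, 15)`, i.e., a VLT between 10:1 and 15:1.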
There was no significant difference in VLT between normal chest radiographs and chest radiographs with abnormal findings.
In our results, the overall response patterns of the five readers were similar and suggested that there was no perceptible difference in image quality between the compressed and original images at 5:1 and 10:1 compression, whereas there was a significant difference at compression levels of 15:1 or greater. From these results, we estimate the VLT to be somewhere between 10:1 and 15:1 for chest radiograph images compressed using the JPEG 2000 algorithm.
Most previous studies have concerned the diagnostically lossless threshold (7, 8, 9, 16). However, image compression artifacts can be detectable even though their presence does not affect the reader's performance for a given diagnostic task (9, 10, 17). Such perceivable artifacts may obscure ancillary findings (18) or induce false positive findings (13). Therefore, the thresholds determined by the previous studies address only narrowly defined diagnostic tasks (11, 12). To provide an acceptable threshold that covers a broad range of potential abnormalities with confidence, many receiver operating characteristic studies would be required, which are time consuming and expensive (12). The inefficiency of the diagnostically lossless approach and the conservativeness of the VLT motivated our use of the VLT in this study. Although the VLT concept yields a very conservative threshold, we tried to include image findings as diverse as possible and selected chest radiographs covering the specific abnormal findings, among the glossary terms defined by the Fleischner Society, that can be seen on chest radiographs.
Our image comparison method was intended to yield as conservative an estimate of the visually lossless threshold as possible. We used alternating presentation of registered images on the same monitor instead of side-by-side presentation, similar to the method in Slone's previous study, because the human visual system is naturally drawn to changes in structure or brightness (12). We believe this image comparison method, together with the adoption of a visually lossless threshold, should result in a very conservative and, we hope, widely accepted threshold for the compression level. Therefore, the visually lossless threshold measured in this study is the minimum (baseline) acceptable compression level and should not be mistaken for an optimal compression level in practice.
In addition, we used the JPEG 2000 compression algorithm, which is regarded as the most sensible choice in a modern picture archiving and communication system (PACS) (27), and a brighter flat-panel monitor as the display device.
This study was conducted in the context of primary, rather than preliminary, interpretation of chest radiograph images, regardless of viewing tasks. Our results suggest that 10:1 JPEG 2000 compression is visually lossless for most images and is, therefore, potentially acceptable for the primary interpretation of chest radiograph images with minimal risk of negatively affecting diagnosis, eliminating the need to maintain the original images as the diagnostic standard. Modern hospitals transmit original (non-compressed) images to workstations for primary reading and use reversible compression for image storage. Considering the large amount of image data, the practical benefits of 10:1 (as a minimum) irreversible compression over reversible compression (3.5:1 in this study) are not insignificant. This reduction in data (approximately 65%) would directly affect operational costs in transmission and storage.
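The approximately 65% figure follows directly from the two compression ratios, as the minimal sketch below shows. The function name is hypothetical, and applying the ratios to the 229.9 gigabytes quoted in the introduction assumes, for illustration only, that this volume refers to uncompressed data.

```python
def relative_reduction(baseline_ratio: float, new_ratio: float) -> float:
    """Fractional reduction in stored data when moving from one
    compression ratio to a higher one."""
    return 1 - baseline_ratio / new_ratio


reduction = relative_reduction(3.5, 10)
print(f"data reduction: {reduction:.0%}")

# Illustrative monthly volumes, assuming 229.9 GB is the uncompressed size.
monthly_gb = 229.9
print(f"reversible 3.5:1  -> {monthly_gb / 3.5:.1f} GB")
print(f"irreversible 10:1 -> {monthly_gb / 10:.1f} GB")
```

Moving from 3.5:1 to 10:1 leaves 35% of the reversibly compressed volume, which is the 65% reduction cited above.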
The limitations of the present study are as follows. First, the tested images did not include all potential abnormal findings and anatomical variations. Furthermore, the readers had to determine only whether the image pairs had detectable differences and did not compare all the selected abnormal findings and anatomical variations in the tested images. However, we believe that our results would be reproducible even with a study sample containing other abnormalities, because our study design was sensitive enough to detect perceptible compression artifacts regardless of the image content. Second, more studies are needed to further generalize our results, since the acceptable compression level can be affected by imaging parameters (30) and radiography systems. Third, loss of information is inevitable in irreversible compression (whether visually lossless or diagnostically lossless), and it can limit future use of the images, such as quantitative analysis. However, we focused on the adequacy of using irreversibly compressed images for primary interpretation, and such future use is beyond the scope of this study.
In conclusion, chest radiographs irreversibly compressed at a level of 10:1 using the JPEG 2000 algorithm are visually lossless and are therefore potentially acceptable for primary interpretation without impeding diagnostic accuracy.
Figures and Tables
Table 2
Note: Data are the percentages of the compressed images rated as distinguishable from the corresponding original (uncompressed) images. Data in parentheses are the 95% confidence intervals of the percentages.
P-values are obtained by comparing the proportions in the irreversible 5:1, 10:1, 15:1, and 20:1 compressions with that in the reversible compression by using the exact test for paired proportions.
References
1. Rubin GD. Data explosion: the challenge of multidetector-row CT. Eur J Radiol. 2000; 36:74–80.
2. Rubin GD. 3-D imaging with MDCT. Eur J Radiol. 2003; 45(Suppl 1):S37–S41.
3. Tamm EP, Thompson S, Venable SL, McEnery K. Impact of multislice CT on PACS resources. J Digit Imaging. 2002; 15(Suppl 1):96–101.
4. Lee KH, Lee HJ, Kim JH, Kang HS, Lee KW, Hong H, et al. Managing the CT data explosion: initial experiences of archiving volumetric datasets in a mini-PACS. J Digit Imaging. 2005; 18:188–195.
5. Ko JP, Rusinek H, Naidich DP, McGuinness G, Rubinowitz AN, Leitman BS, et al. Wavelet compression of low-dose chest CT data: effect on lung nodule detection. Radiology. 2003; 228:70–75.
6. Ko JP, Chang J, Bomsztyk E, Babb JS, Naidich DP, Rusinek H. Effect of CT image compression on computer-assisted lung nodule volume measurement. Radiology. 2005; 237:83–88.
7. Ishigaki T, Sakuma S, Ikeda M, Itoh Y, Suzuki M, Iwai S. Clinical evaluation of irreversible image compression: analysis of chest imaging with computed radiography. Radiology. 1990; 175:739–743.
8. Kido S, Ikezoe J, Kondoh H, Takeuchi N, Johkoh T, Kohno N, et al. Detection of subtle interstitial abnormalities of the lungs on digitized chest radiographs: acceptable data compression ratios. AJR Am J Roentgenol. 1996; 167:111–115.
9. MacMahon H, Doi K, Sanada S, Montner SM, Giger ML, Metz CE, et al. Data compression: effect on diagnostic accuracy in digital chest radiography. Radiology. 1991; 178:175–179.
10. Mori T, Nakata H. Irreversible data compression in chest imaging using computed radiography: an evaluation. J Thorac Imaging. 1994; 9:23–30.
11. Ohgiya Y, Gokan T, Nobusawa H, Hirose M, Seino N, Fujisawa H, et al. Acute cerebral infarction: effect of JPEG compression on detection at CT. Radiology. 2003; 227:124–127.
12. Slone RM, Foos DH, Whiting BR, Muka E, Rubin DA, Pilgram TK, et al. Assessment of visually lossless irreversible image compression: comparison of three methods by using an image-comparison workstation. Radiology. 2000; 215:543–553.
13. Slone RM, Muka E, Pilgram TK. Irreversible JPEG compression of digital chest radiographs for primary interpretation: assessment of visually lossless threshold. Radiology. 2003; 228:425–429.
14. Ringl H, Schernthaner RE, Kulinna-Cosentini C, Weber M, Schaefer-Prokop C, Herold CJ, et al. Lossy three-dimensional JPEG2000 compression of abdominal CT images: assessment of the visually lossless threshold and effect of compression ratio on image quality. Radiology. 2007; 245:467–474.
15. Daly S. Application of a noise-adaptive contrast sensitivity function to image data compression. Opt Eng. 1990; 29:977–987.
16. Aberle DR, Gleeson F, Sayre JW, Brown K, Batra P, Young DA, et al. The effect of irreversible image compression on diagnostic accuracy in thoracic imaging. Invest Radiol. 1993; 28:398–403.
17. Savcenko V, Erickson BJ, Palisson PM, Persons KR, Manduca A, Hartman TE, et al. Detection of subtle abnormalities on chest radiographs after irreversible compression. Radiology. 1998; 206:609–616.
18. Kalyanpur A, Neklesa VP, Taylor CR, Daftary AR, Brink JA. Evaluation of JPEG and wavelet compression of body CT images for direct digital teleradiologic transmission. Radiology. 2000; 217:772–779.
19. Hansell DM, Bankier AA, MacMahon H, McLoud TC, Muller NL, Remy J. Fleischner Society: glossary of terms for thoracic imaging. Radiology. 2008; 246:697–722.
20. Kim KJ, Kim B, Choi SW, Kim YH, Hahn S, Kim TJ, et al. Definition of compression ratio: difference between two commercial JPEG2000 program libraries. Telemed J E Health. 2008; 14:350–354.
21. Lee KH, Kim YH, Kim BH, Kim KJ, Kim TJ, Kim HJ, et al. Irreversible JPEG 2000 compression of abdominal CT for primary interpretation: assessment of visually lossless threshold. Eur Radiol. 2007; 17:1529–1534.
22. Woo HS, Kim KJ, Kim TJ, Hahn S, Kim B, Kim YH, et al. JPEG 2000 compression of abdominal CT: difference in tolerance between thin- and thick-section images. AJR Am J Roentgenol. 2007; 189:535–541.
23. Kim KJ, Kim B, Lee KH, Kim TJ, Mantiuk R, Kang HS, et al. Regional difference in compression artifacts in low-dose chest CT images: effects of mathematical and perceptual factors. AJR Am J Roentgenol. 2008; 191:W30–W37.
24. Kim B, Lee KH, Kim KJ, Mantiuk R, Kim HR, Kim YH. Artifacts in slab average-intensity-projection images reformatted from JPEG 2000 compressed thin-section abdominal CT data sets. AJR Am J Roentgenol. 2008; 190:W342–W350.
25. Kim B, Lee KH, Kim KJ, Mantiuk R, Hahn S, Kim TJ, et al. Prediction of perceptible artifacts in JPEG 2000-compressed chest CT images using mathematical and perceptual quality metrics. AJR Am J Roentgenol. 2008; 190:328–334.
26. Kim TJ, Lee KH, Kim B, Kim KJ, Chun EJ, Bajpai V, et al. Regional variance of visually lossless threshold in compressed chest CT images: lung versus mediastinum and chest wall. Eur J Radiol. 2009; 69:483–488.
27. Newcombe RG. Two-sided confidence intervals for the single proportion: comparison of seven methods. Stat Med. 1998; 17:857–872.
28. Liddell FD. Simplified exact analysis of case-referent studies: matched pairs; dichotomous exposure. J Epidemiol Community Health. 1983; 37:82–84.
29. Fleiss JL, Cuzick J. The reliability of dichotomous judgements: unequal numbers of judges per subjects. Appl Psychol Meas. 1979; 3:537–542.
30. Erickson BJ, Manduca A, Palisson P, Persons KR, Earnest F 4th, Savcenko V, et al. Wavelet compression of medical images. Radiology. 1998; 206:599–607.