Abstract
Backgrounds/Aims
To systematically evaluate inter-reader agreement in the assessment of individual liver imaging reporting and data system (LI-RADS) category M (LR-M) imaging features in computed tomography/magnetic resonance imaging (CT/MRI) LI-RADS v2018, and to explore the causes of poor agreement in LR-M assignment.
Methods
Original studies reporting inter-reader agreement for LR-M features on multiphasic CT or MRI were identified using the MEDLINE, EMBASE, and Cochrane databases. The pooled kappa coefficient (κ) was calculated using the DerSimonian-Laird random-effects model. Heterogeneity was assessed using Cochran’s Q test and the I² statistic. Subgroup meta-regression analyses were conducted to explore the study heterogeneity.
Results
In total, 24 eligible studies with 5,163 hepatic observations were included. The pooled κ values were 0.72 (95% confidence interval [CI], 0.65-0.78) for rim arterial phase hyperenhancement, 0.52 (95% CI, 0.39-0.65) for peripheral washout, 0.60 (95% CI, 0.50-0.70) for delayed central enhancement, 0.68 (95% CI, 0.57-0.78) for targetoid restriction, 0.74 (95% CI, 0.65-0.83) for targetoid transitional phase/hepatobiliary phase appearance, 0.64 (95% CI, 0.49-0.78) for infiltrative appearance, 0.49 (95% CI, 0.30-0.68) for marked diffusion restriction, and 0.61 (95% CI, 0.48-0.73) for necrosis or severe ischemia. Substantial study heterogeneity was observed for all LR-M features (Cochran’s Q test, P<0.01; I²≥89.2%). Studies with a mean observation size of <3 cm, those performed using 1.5-T MRI, and those with three or more image readers were significantly associated with poorer agreement for LR-M features.
The Liver Imaging Reporting and Data System (LI-RADS), which was last updated in 2018, offers a comprehensive framework for standardizing the terminology, technique, interpretation, reporting, and data collection when performing liver imaging for patients with a high risk for hepatocellular carcinoma (HCC), such as those with cirrhosis or chronic hepatitis B virus infection.1 LI-RADS has gained recognition as the standard for imaging diagnosis of HCC, and has been increasingly adopted worldwide after recent endorsement by the American Association for the Study of Liver Diseases.1
In the computed tomography (CT)/magnetic resonance imaging (MRI) LI-RADS v2018, LI-RADS category M (LR-M) is defined as observations that are probably or definitely malignant but not specific for HCC, thereby preserving the specificity of HCC diagnosis without compromising the sensitivity for detection of malignancy.1 Most LR-M category lesions are non-HCC malignancies such as intrahepatic cholangiocarcinomas (CCAs) and combined hepatocellular-CCAs (cHCC-CCAs), whereas 28-29% of LR-M lesions are HCCs with atypical imaging features.2,3 LR-M criteria include five targetoid imaging features of rim arterial phase hyperenhancement (APHE), peripheral washout, delayed central enhancement (DCE), targetoid restriction, and targetoid transitional phase (TP)/hepatobiliary phase (HBP) appearance, and the three non-targetoid features of infiltrative appearance, marked diffusion restriction, and necrosis or severe ischemia.1 LR-M is an exclusion criterion, which means that any hepatic observation showing even a single targetoid imaging feature can be categorized as LR-M. Therefore, assessing individual LR-M features and assigning the LR-M category accurately and reproducibly is important in the context of HCC diagnosis.
A recent multicenter retrospective study reported that agreement was poor for the diagnostic decision of LR-M compared with not LR-M (intraclass correlation coefficient [ICC], 0.46).4 However, this study did not provide the agreement for assessments of individual LR-M features, and was therefore of limited value for exploring the causes of poor agreement in the assignment of LR-M. Additionally, although several previous studies reported inter-reader agreement for the assessment of individual LR-M features,5-13 they were subject to limitations such as a single-center study design, a small number of study subjects, and variable results among studies.
Therefore, we performed a systematic review and meta-analysis of inter-reader agreement in the assessment of individual LR-M imaging features in the CT/MRI LI-RADS v2018 and explored the causes of poor agreement in LR-M assignment.
This study was conducted and reported according to the Meta-analysis of Observational Studies in Epidemiology (MOOSE)14 and Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines (Supplementary Table 1),15 and was prospectively registered in PROSPERO (ID, CRD42024505011). Two abdominal radiologists, each with ≥10 years of experience in liver imaging and ≥5 years of experience in performing systematic reviews with meta-analysis, independently conducted the literature search, study selection, data extraction, and assessment of study quality.
Comprehensive searches were conducted across the MEDLINE, EMBASE, and Cochrane databases to identify original research articles that reported inter-reader agreement in the evaluation of LI-RADS v2018 LR-M imaging features on dynamic contrast-enhanced CT/MRI. The search query was designed to ensure a sensitive literature search and to minimize the risk of overlooking relevant articles. Key search terms included HCC, LI-RADS, CT, and MRI. Search terms are defined in detail in Supplementary Table 2. The start date of the search period was set to January 1, 2018, to encompass all original articles utilizing LI-RADS v2018, and updates were performed until December 6, 2023. The search was restricted to studies comprising human subjects and those published in English. To broaden the search further, the bibliographies of the selected articles were scrutinized for potentially relevant studies.
After removing duplicates, articles were reviewed for eligibility based on the following criteria: (1) population: patients with HCC risk factors defined by LI-RADS,1 (2) index test: multiphasic CT or MRI, (3) comparator: no specific requirements, (4) outcomes: inter-reader agreement of LR-M imaging features, including five targetoid features (rim APHE, peripheral washout, DCE, targetoid restriction, and targetoid TP/HBP appearance) and three non-targetoid features (infiltrative appearance, marked diffusion restriction, and necrosis or severe ischemia), and (5) study design: any type of study, including observational studies and clinical trials. The exclusion criteria were as follows: (1) animal studies, case reports, review articles, editorials, abstracts/conference proceedings, meta-analyses, and systematic reviews, (2) studies outside the scope of interest of this investigation, (3) studies lacking sufficient data for extracting inter-reader agreement (e.g., absence of variance or 95% confidence intervals [CI]), and (4) studies with overlapping patient cohorts. The initial screening involved titles and abstracts, followed by full-text reviews of the selected potentially eligible abstracts.
Data from each study were extracted using a standardized form prepared in Microsoft Excel (Microsoft, Redmond, WA, USA). The extracted information included study characteristics, such as first author, year of publication, and study design (prospective vs. retrospective); subject characteristics, including sample size, sex, age, underlying liver disease, number of hepatic observations, percentages of HCC and/or other malignancy among observations, and observation size; imaging characteristics, including imaging modality (CT vs. MRI), MRI scanner field strength, and contrast agent used; details related to imaging interpretation, such as the number of readers, level of experience, and whether readers were blinded to the reference standard; the statistical method for evaluating inter-reader agreement; and the study outcomes of interest. Inter-reader agreement was determined using kappa values (κ) with 95% CI. When an article lacked sufficient data, the corresponding authors were contacted via email to request additional information or clarification.
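Because pooling requires a variance for each κ, a reported 95% CI has to be converted into a standard error. The following is a minimal sketch of one common conversion, assuming the reported interval is a symmetric normal-approximation (Wald) interval. The function name `se_from_ci` and the example values are hypothetical and are not taken from the analysis code or data of this study.

```python
def se_from_ci(lower: float, upper: float, z: float = 1.96) -> float:
    """Approximate the standard error from a symmetric Wald-type CI.

    Assumes the interval was built as estimate +/- z * SE, so the full
    width of the interval equals 2 * z * SE.
    """
    if upper < lower:
        raise ValueError("upper bound must not be below lower bound")
    return (upper - lower) / (2.0 * z)


# Hypothetical study reporting kappa = 0.70 (95% CI, 0.60-0.80)
example_se = se_from_ci(0.60, 0.80)
example_variance = example_se ** 2
```

If a study reports an asymmetric or bootstrap interval, this approximation no longer holds exactly, and the variance would need to be obtained from the authors instead.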
The guidelines for Reporting Reliability and Agreement Studies16 served as the basis for evaluating the quality of the selected articles. The risk of bias was appraised across seven domains: (1) description of the index test (MRI techniques and imaging sequences), (2) the study subjects (recruitment methods and demographic characteristics), (3) the readers (number and experience level), (4) the reading process (availability of clinical information and independent review), (5) blinding to reference standard, (6) statistical analysis, and (7) the actual numbers of subjects and observations. A high-quality score was assigned to each domain if it was sufficiently detailed with no apparent potential bias.
The pooled κ values and their 95% CIs for the assessment of eight LR-M features were calculated using the DerSimonian-Laird random-effects model with Knapp and Hartung adjustment,17 based on the κ and 95% CI reported in the individual studies. κ values were categorized according to Landis and Koch as follows: poor (<0.20), fair (0.21-0.40), moderate (0.41-0.60), substantial (0.61-0.80), or almost perfect agreement (0.81-1.00).18 Heterogeneity was assessed using Cochran’s Q test (P<0.1 indicating high heterogeneity) and the I² statistic (I² >50% indicating substantial heterogeneity).19 Publication bias was evaluated using a funnel plot with a rank test. A subgroup analysis was also performed that was restricted to studies using Cohen’s κ as the agreement statistic.
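The pooling steps described above can be illustrated with a short sketch. It computes the DerSimonian-Laird between-study variance, Cochran's Q, I², the random-effects pooled κ, and the Knapp-Hartung standard error, and maps a κ value to the agreement categories listed above. This is an illustration under stated assumptions, not the R code used in this study: the function names are hypothetical, the input data in the usage note are invented, and boundary values are assumed to fall into the lower category (for example, κ=0.60 is treated as moderate).

```python
import math


def dersimonian_laird(kappas, ses):
    """Random-effects pooling of study-level kappas (DerSimonian-Laird)."""
    k = len(kappas)
    if k < 2 or k != len(ses):
        raise ValueError("need at least two studies with matching lengths")
    v = [s ** 2 for s in ses]                 # within-study variances
    w = [1.0 / vi for vi in v]                # fixed-effect weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, kappas)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effect mean
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, kappas))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)        # between-study variance
    i2 = max(0.0, (q - (k - 1)) / q) * 100.0 if q > 0 else 0.0
    ws = [1.0 / (vi + tau2) for vi in v]      # random-effects weights
    sws = sum(ws)
    pooled = sum(wi * yi for wi, yi in zip(ws, kappas)) / sws
    # Knapp-Hartung variance of the pooled estimate (k - 1 degrees of freedom)
    var_hk = sum(wi * (yi - pooled) ** 2 for wi, yi in zip(ws, kappas)) / ((k - 1) * sws)
    return {"pooled": pooled, "tau2": tau2, "Q": q, "I2": i2,
            "se_hk": math.sqrt(var_hk), "df": k - 1}


def landis_koch(kappa):
    """Agreement category, with boundary values assigned to the lower class."""
    if kappa <= 0.20:
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"
```

For example, `dersimonian_laird([0.5, 0.7, 0.9], [0.1, 0.1, 0.1])` uses three invented studies. A Knapp-Hartung 95% CI would then be the pooled value plus or minus the 0.975 quantile of a t-distribution with `df` degrees of freedom multiplied by `se_hk`; that quantile is left to the caller because the Python standard library does not provide it.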
Subgroup meta-regression analysis with the following covariates was performed to explore the causes of study heterogeneity: (1) proportion of liver cirrhosis (≥50% vs. <50%), (2) dominant etiology of underlying liver disease (hepatitis B virus vs. others), (3) proportion of other malignancy among hepatic observations (≥10% vs. <10%), (4) mean size of hepatic observations (≥3 cm vs. <3 cm), (5) MRI scanner field strength (3.0-T only vs. 1.5-T or both), (6) MRI contrast agent (hepatobiliary contrast agent [HBA] only vs. extracellular contrast agent or both), and (7) number of readers (≥3 vs. 2). The analyses were performed using R version 4.3.2 (The R Foundation for Statistical Computing, Vienna, Austria), with P<0.05 considered statistically significant.
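To make the idea of a meta-regression on a dichotomized covariate concrete, the sketch below fits a weighted regression of κ on a single 0/1 covariate. With one dummy covariate, the weighted least-squares fit reduces to the difference between the two inverse-variance weighted subgroup means. This is a simplified illustration and not the model fitted in R for this study: the residual between-study variance `tau2` is assumed to be supplied by the caller, the test is a normal-approximation z-test rather than a Knapp-Hartung test, and the function name and data are hypothetical.

```python
import math


def binary_meta_regression(kappas, ses, group, tau2=0.0):
    """Weighted regression of kappa on one 0/1 covariate.

    Weights are 1 / (SE^2 + tau2). Returns the two subgroup estimates,
    their difference (the regression slope), and a two-sided p-value.
    """
    if not (len(kappas) == len(ses) == len(group)):
        raise ValueError("inputs must have the same length")
    w = [1.0 / (s ** 2 + tau2) for s in ses]

    def weighted_mean(flag):
        idx = [i for i, g in enumerate(group) if g == flag]
        if not idx:
            raise ValueError("each subgroup needs at least one study")
        sw = sum(w[i] for i in idx)
        return sum(w[i] * kappas[i] for i in idx) / sw, sw

    m0, sw0 = weighted_mean(0)
    m1, sw1 = weighted_mean(1)
    diff = m1 - m0
    se = math.sqrt(1.0 / sw0 + 1.0 / sw1)     # SE of the difference
    z = diff / se
    p = math.erfc(abs(z) / math.sqrt(2.0))    # two-sided normal p-value
    return {"kappa_group0": m0, "kappa_group1": m1,
            "difference": diff, "se": se, "z": z, "p": p}
```

As a usage example, `binary_meta_regression([0.5, 0.6, 0.8, 0.9], [0.1] * 4, [0, 0, 1, 1])` compares two invented subgroups, where the code 0 could stand for 1.5-T or both and 1 for 3.0-T only.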
Among the 1,029 articles identified through the search, 627 were screened after removing duplicates. Subsequently, 364 articles were excluded based on their titles and abstracts (Fig. 1). After a thorough full-text review, 239 articles were excluded from analysis. Finally, 24 eligible articles reporting the inter-reader agreement of LR-M imaging features, encompassing a total of 25 datasets,5-13,20-34 were included. Among them, 22 utilized only MRI. The remaining two articles used either CT or MRI, with one presenting the CT and MRI results separately22 and the other presenting them together.6
The characteristics of the included articles are summarized in Table 1. The 24 eligible studies included 4,789 patients with 5,163 hepatic observations. One study was prospective,20 whereas the other studies were retrospective. Three studies exclusively included patients with cirrhosis.5,9,32 The dominant etiology of liver disease was hepatitis B virus in all but two studies, of which alcoholic liver disease was the dominant etiology in one9 and hepatitis C virus in the other.6 Twelve studies reported proportions of other malignancy such as CCA or cHCC-CCA that exceeded 10% of total observations.5-9,11,12,20,21,25,30,34 Twelve studies utilized only 3.0-T MRI scanners.8,11,12,20,21,23-25,27,31,33,34 Eighteen studies employed only HBA including gadoxetate or gadobenate dimeglumine as the MRI contrast agent.5,7,9-13,20,21,23,26-29,31-34 In all studies, CT/MRI interpretation was conducted by multiple image readers, with three or more readers participating in two studies.11,22
All the included studies demonstrated a low risk of bias in at least six of the seven domains evaluated (Supplementary Table 3). However, five studies were deemed to have a high risk of bias concerning the study subject domain, primarily due to the absence of demographic characteristics such as patient age or the proportion of individuals with liver cirrhosis.7,8,10,27,28 One study was identified as having a high risk of bias in the readers’ domain because the imaging analysis was conducted by inexperienced readers, specifically radiology residents.33 Additionally, one study did not provide information regarding the experience levels of the image readers.25 Furthermore, the availability of clinical information during the review process was unclear in two studies.12,21
All 24 eligible articles reported inter-reader agreement for rim APHE.5-13,20-34 Of them, 20 studies reported inter-reader agreement for peripheral washout,5-8,10-13,20-25,27,30-34 20 for DCE,5-12,21-27,29-32,34 18 for targetoid restriction,5-7,9-12,21-25,27,30-34 19 for targetoid TP/HBP appearance,5-12,21,23-27,30-34 12 for infiltrative appearance,5,6,8,9,12,13,20,22,25-27,29 12 for marked diffusion restriction,5-7,9,10,13,24-27,29,30 and 15 for necrosis or severe ischemia.5-7,9,12,13,20,22,24-27,29,30,33
Table 2 and Fig. 2 summarize the inter-reader agreements for the LR-M features. For rim APHE, 24 studies with 5,163 observations reported κ values ranging from 0.39 to 0.94 (Fig. 2A), with a pooled κ of 0.72 (95% CI, 0.65-0.78), indicating substantial inter-reader agreement. For peripheral washout, 20 studies comprising 4,134 observations showed κ ranging from 0.05 to 0.87 (Fig. 2B), with a pooled κ of 0.52 (95% CI, 0.39-0.65), suggesting moderate inter-reader agreement. For DCE, 20 studies with 3,900 observations reported κ ranging from 0.09 to 0.92 (Fig. 2C), with a pooled κ of 0.60 (95% CI, 0.50-0.70), indicating moderate inter-reader agreement. For targetoid restriction, 18 studies with 3,849 observations presented κ ranging from 0.20 to 0.97 (Fig. 2D), with a pooled κ of 0.68 (95% CI, 0.57-0.78), indicating substantial inter-reader agreement. For targetoid TP/HBP appearance, 19 studies with 3,998 observations reported κ ranging from 0.33 to 0.98 (Fig. 2E), with a pooled κ of 0.74 (95% CI, 0.65-0.83), indicating substantial inter-reader agreement. For infiltrative appearance, 12 studies with 2,192 observations reported κ ranging from 0.24 to 0.92 (Fig. 2F), with a pooled κ of 0.64 (95% CI, 0.49-0.78), indicating substantial inter-reader agreement. For marked diffusion restriction, 12 studies with 2,291 observations reported κ ranging from 0.03 to 0.98 (Fig. 2G), with a pooled κ of 0.49 (95% CI, 0.30-0.68), indicating moderate inter-reader agreement. For necrosis or severe ischemia, 16 studies with 3,118 observations reported κ ranging from 0.07 to 0.91 (Fig. 2H), with a pooled κ of 0.61 (95% CI, 0.48-0.73), indicating substantial inter-reader agreement. Additionally, the proportion of studies showing substantial to almost perfect agreement was the highest for targetoid TP/HBP appearance at 76.5% and lowest for marked diffusion restriction at 45.5% (Fig. 3).
The results of the subgroup of 21 studies using Cohen’s κ, which were similar to the values obtained for all included studies, are provided in Supplementary Table 4.
Substantial heterogeneity was observed across all LR-M imaging features (Cochran’s Q test, P<0.01; I² ≥89.2%). No significant publication bias was found (P>0.05), except for peripheral washout (P=0.04) and necrosis or severe ischemia (P=0.03; Table 2 and Supplementary Fig. 1).
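The article does not state which rank test was applied to the funnel plot. As an illustration only, the sketch below assumes a Begg and Mazumdar type rank correlation test, which relates standardized effect estimates to their variances using Kendall's tau. The function name and data are hypothetical, and the normal approximation is used without continuity or tie corrections.

```python
import math


def begg_rank_test(effects, variances):
    """Rank correlation test for funnel plot asymmetry (Begg-type sketch)."""
    n = len(effects)
    if n < 3 or n != len(variances):
        raise ValueError("need at least three studies with matching lengths")
    w = [1.0 / v for v in variances]
    sw = sum(w)
    pooled = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Standardized deviates: Var(y_i - pooled) = v_i - 1 / sum(w)
    dev = [(y - pooled) / math.sqrt(v - 1.0 / sw)
           for y, v in zip(effects, variances)]
    # Kendall's S: concordant minus discordant pairs
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (dev[i] - dev[j]) * (variances[i] - variances[j])
            s += (prod > 0) - (prod < 0)
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    z = s / math.sqrt(var_s)
    tau = s / (n * (n - 1) / 2.0)
    p = math.erfc(abs(z) / math.sqrt(2.0))    # two-sided normal p-value
    return {"tau": tau, "z": z, "p": p}
```

In invented data where less precise studies report systematically higher κ, such as `begg_rank_test([0.9, 0.8, 0.7, 0.6, 0.5], [0.05, 0.04, 0.03, 0.02, 0.01])`, every pair is concordant and the test flags asymmetry.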
Of the 24 included studies, 22 reported inter-reader agreement for the assessment of MRI LR-M features. One reported combined CT and MRI results, and the other reported CT and MRI results separately. Therefore, the inter-reader agreement for the assessment of CT LR-M features was available in only one study:22 κ=0.70 (95% CI, 0.50-0.91) for rim APHE, 0.32 (95% CI, -0.10 to 0.81) for peripheral washout, 0.68 (95% CI, 0.52-0.85) for infiltrative appearance, and 0.66 (95% CI, 0.56-0.76) for necrosis or severe ischemia.
The results of the subgroup meta-regression analyses of targetoid features are summarized in Table 3. In the subgroup analysis of rim APHE, studies with a small observation size (<3 cm) showed significantly lower inter-reader agreement than those with a large observation size (≥3 cm; κ=0.59 vs. κ=0.78, P<0.01). Additionally, studies that used 1.5-T MRI showed poorer agreement than those that used 3.0-T MRI (κ=0.65 vs. κ=0.79, P=0.04). For peripheral washout, studies with a small observation size (<3 cm) exhibited poorer agreement than those with a large observation size (≥3 cm; κ=0.35 vs. κ=0.64, P<0.01). For the features of DCE and targetoid restriction, studies using 1.5-T MRI demonstrated significantly poorer agreement than those using 3.0-T MRI (DCE, κ=0.55 vs. κ=0.78, P=0.01; targetoid restriction, κ=0.49 vs. κ=0.70, P=0.02). For targetoid TP/HBP appearance, studies using 1.5-T MRI showed poorer agreement than those using 3.0-T MRI (κ=0.64 vs. κ=0.82, P=0.02), and studies with three or more readers showed poorer agreement than those with two readers (κ=0.33 vs. κ=0.87, P<0.01).
The results of the subgroup meta-regression analysis for non-targetoid features are summarized in Table 4. For infiltrative appearance, marked diffusion restriction, and necrosis or severe ischemia, only MRI field strength was demonstrated as a contributing factor to significant differences in agreement (infiltrative appearance, 3.0-T, κ=0.83 vs. 1.5-T, κ=0.48, P=0.03; marked diffusion restriction, 3.0-T, κ=0.78 vs. 1.5-T, κ=0.39, P=0.04; necrosis or severe ischemia, 3.0-T, κ=0.77 vs. 1.5-T, κ=0.46, P=0.02).
Across the 24 published studies with 5,163 hepatic observations, the overall pooled inter-reader agreement for the assessment of LR-M features was moderate to substantial. Most LR-M features showed substantial agreement, whereas peripheral washout (κ=0.52) and marked diffusion restriction (κ=0.49) showed moderate agreement. In addition, studies with small observation sizes, 1.5-T MRI scans, and three or more image readers were significantly associated with poorer agreement for the assessment of LR-M features.
Our results regarding the pooled inter-reader agreement for the assessment of LR-M features compare favorably with the inter-reader agreement for assessment of the LI-RADS major features of non-rim APHE (κ=0.72), non-peripheral washout (κ=0.69), and enhancing capsule (κ=0.66).35 Although LI-RADS has shown excellent performance when used to stratify the probability of HCC based on the assigned categories,36 concerns about imperfect inter-reader agreement using LI-RADS have been raised.37 In particular, poor agreement for the LR-M criteria was reported in both research and clinical reading (ICC=0.46 for both).4 This may be important because LR-M lesions usually result in a biopsy for diagnostic confirmation and cannot be classified non-invasively.4 Given these imperfect agreements for the LR-M criteria, our study can be useful for understanding the causes of poor agreement because it provides the pooled inter-reader agreement for individual LR-M features and explores the factors associated with poor agreement in LR-M features. Furthermore, since our study included a large number of hepatic observations and involved a robust analysis, it should be helpful in understanding the current status and limitations of the CT/MRI LI-RADS v2018 LR-M criteria.
Of the LR-M features, the lowest agreement was found for peripheral washout and marked diffusion restriction (κ=0.52 and 0.49, respectively), which had the lowest pooled κ among the targetoid and non-targetoid features, respectively. LI-RADS has released a lexicon that provides a standardized vocabulary for liver imaging.38,39 In this LI-RADS lexicon, peripheral washout is defined as a subtype of washout that is mainly observed at the observation periphery, and marked diffusion restriction is defined as intensity on diffusion-weighted imaging that is higher than that of the liver, is not caused only by T2 shine-through, and is marked in degree.38 However, these definitions have limitations because several details remain unclear. For example, it is not clear which area should be considered the “observation periphery,” or how “marked diffusion restriction” can be differentiated from “general restricted diffusion.” Furthermore, given the lower frequency of peripheral washout and marked diffusion restriction compared with other LR-M features (3% vs. 44% for rim APHE or 38% for targetoid TP/HBP appearance),10 it is possible that radiologists are less familiar with these two features and have less experience in assessing them. As all LR-M features are assessed qualitatively, some variability owing to the inherent subjectivity of imaging feature characterization may be unavoidable. Although the adoption of standardized lexicons has promoted clarity and consistency of communication in clinical practice and scientific literature, the situation is not perfect. Therefore, efforts to overcome inter-reader variability, including clearer definitions for qualitative assessment, illustrations of specific imaging features, and education, should continue.
The subgroup meta-regression analysis identified three significant factors associated with poor agreement in the assessment of LR-M features. Notably, for most LR-M features, 1.5-T MRI showed poorer agreement than 3.0-T MRI. This result can be explained by the fact that 3.0-T MRI has a higher signal-to-noise ratio and lesion-to-liver contrast than 1.5-T MRI.40 Furthermore, meta-regression analysis revealed that a small observation size and the participation of three or more image readers were significant factors associated with poor agreement for LR-M features. In particular, for small HCC lesions, the definition of the peripheral portion may differ between readers, and ancillary features such as fat and blood products may prevent readers from detecting rim APHE.11 In addition, the interpretation may differ due to bias between readers, which might result from differences in training, experience, and frame of reference among readers. Therefore, as the number of image readers increases, variability in reader experience is more likely to occur.
Our study has several limitations. First, substantial study heterogeneity was noted, which precludes the creation of solid meta-analytic summary estimates. To overcome this limitation, the causes of study heterogeneity (i.e., the causes of poor agreement) were robustly analyzed by performing a subgroup meta-regression analysis using various covariates. Second, during the second round of screening for eligibility, we excluded 19 articles because of insufficient data for extracting inter-reader agreement (Supplementary Table 5). As the meta-analysis for inter-reader agreement required variance as well as κ from each original study, the necessary information was sought by emailing the authors of studies that reported only the value of κ without variance, and if the variance could not be obtained, these studies were excluded. Third, most studies included in this meta-analysis evaluated LR-M imaging features using MRI, with only one study presenting results using CT. Consequently, drawing conclusions regarding the differences in inter-reader agreement between CT and MRI in LR-M evaluations based on the results of this study is challenging.
In conclusion, substantial inter-reader agreement was found for most LR-M features; however, the agreement for peripheral washout and marked diffusion restriction was limited. Additionally, a small observation size, the use of 1.5-T MRI, and the participation of three or more image readers contributed to the poor agreement for LR-M features. Therefore, LI-RADS should continue to clarify and refine the definitions of individual LR-M features and provide educational programs, including a comprehensive manual with both schematic figures and clinical examples, to illustrate these features.
Notes
Data Availability
The data presented in this study are available upon request from the corresponding author.
Author Contributions
Conceptualization: SHC
Data curation: DHK, SHC
Formal analysis: DHK, SHC
Funding acquisition: SHC
Investigation: DHK, SHC
Methodology: DHK, SHC
Project administration: SHC
Resources: DHK, SHC
Software: DHK, SHC
Supervision: SHC
Validation: DHK, SHC
Visualization: DHK, SHC
Writing - original draft: DHK, SHC
Writing - review and editing: DHK, SHC
Approval of final manuscript: DHK, SHC
Supplementary Material
Supplementary data can be found with this article online at https://doi.org/10.17998/jlc.2024.04.05.
References
1. Singal AG, Llovet JM, Yarchoan M, Mehta N, Heimbach JK, Dawson LA, et al. AASLD practice guidance on prevention, diagnosis, and treatment of hepatocellular carcinoma. Hepatology. 2023; 78:1922–1965.
2. Kim DH, Choi SH, Park SH, Kim KW, Byun JH, Kim SY, et al. Liver imaging reporting and data system category M: a systematic review and meta-analysis. Liver Int. 2020; 40:1477–1487.
3. Hwang SH, Rhee H. Radiologic features of hepatocellular carcinoma related to prognosis. J Liver Cancer. 2023; 23:143–156.
4. Hong CW, Chernyak V, Choi JY, Lee S, Potu C, Delgado T, et al. A multicenter assessment of inter-reader reliability of LI-RADS version 2018 for MRI and CT. Radiology. 2023; 307:e222855.
5. Kim YY, Kim MJ, Kim EH, Roh YH, An C. Hepatocellular carcinoma versus other hepatic malignancy in cirrhosis: performance of LI-RADS version 2018. Radiology. 2019; 291:72–80.
6. Ludwig DR, Fraum TJ, Cannella R, Ballard DH, Tsai R, Naeem M, et al. Hepatocellular carcinoma (HCC) versus non-HCC: accuracy and reliability of Liver Imaging Reporting and Data System v2018. Abdom Radiol (NY). 2019; 44:2116–2132.
7. Kim MY, Joo I, Kang HJ, Bae JS, Jeon SK, Lee JM. LI-RADS M (LR-M) criteria and reporting algorithm of v2018: diagnostic values in the assessment of primary liver cancers on gadoxetic acid-enhanced MRI. Abdom Radiol (NY). 2020; 45:2440–2448.
8. Kim SS, Lee S, Choi JY, Lim JS, Park MS, Kim MJ. Diagnostic performance of the LR-M criteria and spectrum of LI-RADS imaging features among primary hepatic carcinomas. Abdom Radiol (NY). 2020; 45:3743–3754.
9. Lim K, Kwon H, Cho J. Inter-reader agreement and imaging-pathology correlation of the LI-RADS M on gadoxetic acid-enhanced magnetic resonance imaging: efforts to improve diagnostic performance. Abdom Radiol (NY). 2020; 45:2430–2439.
10. Jang JK, Choi SH, Byun JH, Park SY, Lee SJ, Kim SY, et al. New strategy for Liver Imaging Reporting and Data System category M to improve diagnostic performance of MRI for hepatocellular carcinoma ≤3.0 cm. Abdom Radiol (NY). 2022; 47:2289–2298.
11. Min JH, Lee MW, Park HS, Lee DH, Park HJ, Lee JE, et al. LI-RADS version 2018 targetoid appearances on gadoxetic acid-enhanced MRI: interobserver agreement and diagnostic performance for the differentiation of HCC and non-HCC malignancy. AJR Am J Roentgenol. 2022; 219:421–432.
12. Zheng W, Huang H, She D, Xiong M, Chen X, Lin X, et al. Added-value of ancillary imaging features for differentiating hepatocellular carcinoma from intrahepatic mass-forming cholangiocarcinoma on Gd-BOPTA-enhanced MRI in LI-RADS M. Abdom Radiol (NY). 2022; 47:957–968.
13. Yang T, Wei H, Wu Y, Qin Y, Chen J, Jiang H, et al. Predicting histologic differentiation of solitary hepatocellular carcinoma up to 5 cm on gadoxetate disodium-enhanced MRI. Insights Imaging. 2023; 14:3.
14. Stroup DF, Berlin JA, Morton SC, Olkin I, Williamson GD, Rennie D, et al. Meta-analysis of observational studies in epidemiology: a proposal for reporting. Meta-analysis of observational studies in epidemiology (MOOSE) group. JAMA. 2000; 283:2008–2012.
15. McInnes MDF, Moher D, Thombs BD, McGrath TA, Bossuyt PM; The PRISMA-DTA Group, et al. Preferred reporting items for a systematic review and meta-analysis of diagnostic test accuracy studies: the PRISMA-DTA statement. JAMA. 2018; 319:388–396.
16. Kottner J, Audigé L, Brorson S, Donner A, Gajewski BJ, Hróbjartsson A, et al. Guidelines for reporting reliability and agreement studies (GRRAS) were proposed. J Clin Epidemiol. 2011; 64:96–106.
17. IntHout J, Ioannidis JP, Borm GF. The Hartung-Knapp-Sidik-Jonkman method for random effects meta-analysis is straightforward and considerably outperforms the standard DerSimonian-Laird method. BMC Med Res Methodol. 2014; 14:25.
18. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977; 33:159–174.
19. Higgins JP, Thompson SG, Deeks JJ, Altman DG. Measuring inconsistency in meta-analyses. BMJ. 2003; 327:557–560.
20. Jiang H, Liu X, Chen J, Wei Y, Lee JM, Cao L, et al. Man or machine? Prospective comparison of the version 2018 EASL, LI-RADS criteria and a radiomics model to diagnose hepatocellular carcinoma. Cancer Imaging. 2019; 19:84.
21. Min JH, Kim SH, Hwang JA, Hyun SH, Ha SY, Choi SY, et al. Prognostic value of LI-RADS category on gadoxetic acid-enhanced MRI and 18F-FDG PET-CT in patients with primary liver carcinomas. Eur Radiol. 2021; 31:3649–3660.
22. Cannella R, Burgio MD, Beaufrère A, Trapani L, Paradis V, Hobeika C, et al. Imaging features of histological subtypes of hepatocellular carcinoma: Implication for LI-RADS. JHEP Rep. 2021; 3:100380.
23. Moon JY, Min JH, Kim YK, Cha D, Hwang JA, Ko SE, et al. Prognosis after curative resection of single hepatocellular carcinoma with a focus on LI-RADS targetoid appearance on preoperative gadoxetic acid-enhanced MRI. Korean J Radiol. 2021; 22:1786–1796.
24. Shin J, Lee S, Kim SS, Chung YE, Choi JY, Park MS, et al. Characteristics and early recurrence of hepatocellular carcinomas categorized as LR-M: comparison with those categorized as LR-4 or 5. J Magn Reson Imaging. 2021; 54:1446–1454.
25. Yoon J, Hwang JA, Lee S, Lee JE, Ha SY, Park YN. Clinicopathologic and MRI features of combined hepatocellular-cholangiocarcinoma in patients with or without cirrhosis. Liver Int. 2021; 41:1641–1651.
26. Chen Y, Qin Y, Wu Y, Wei H, Wei Y, Zhang Z, et al. Preoperative prediction of glypican-3 positive expression in solitary hepatocellular carcinoma on gadoxetate-disodium enhanced magnetic resonance imaging. Front Immunol. 2022; 13:973153.
27. Liang Y, Xu F, Wang Z, Tan C, Zhang N, Wei X, et al. A gadoxetic acid-enhanced MRI-based multivariable model using LI-RADS v2018 and other imaging features for preoperative prediction of macrotrabecular-massive hepatocellular carcinoma. Eur J Radiol. 2022; 153:110356.
28. Yang H, Han P, Huang M, Yue X, Wu L, Li X, et al. The role of gadoxetic acid-enhanced MRI features for predicting microvascular invasion in patients with hepatocellular carcinoma. Abdom Radiol (NY). 2022; 47:948–956.
29. Yang J, Jiang H, Xie K, Bashir MR, Wan H, Huang J, et al. Profiling hepatocellular carcinoma aggressiveness with contrast-enhanced ultrasound and gadoxetate disodium-enhanced MRI: an intra-individual comparative study based on the Liver Imaging Reporting and Data System. Eur J Radiol. 2022; 154:110397.
30. Hwang JA, Lee S, Lee JE, Yoon J, Choi SY, Shin J. LI-RADS category on MRI is associated with recurrence of intrahepatic cholangiocarcinoma after surgery: a multicenter study. J Magn Reson Imaging. 2023; 57:930–938.
31. Min JH, Lee MW, Rhim H, Han S, Song KD, Kang TW, et al. LI-RADS category is associated with treatment outcomes of small single HCC: surgical resection vs. radiofrequency ablation. Eur Radiol. 2024; 34:525–537.
32. Oh NE, Choi SH, Kim S, Lee H, Jang HJ, Byun JH, et al. Suboptimal performance of LI-RADS v2018 on gadoxetic acid-enhanced MRI for detecting hepatocellular carcinoma in liver transplant candidates. Eur Radiol. 2024; 34:465–474.
33. Park JH, Park YN, Kim MJ, Park MS, Choi JY, Chung YE, et al. Steatotic hepatocellular carcinoma: association of MRI findings to underlying liver disease and clinicopathological characteristics. Liver Int. 2023; 43:1332–1344.
34. Wu H, Liang Y, Wang Z, Tan C, Yang R, Wei X, et al. Optimizing CT and MRI criteria for differentiating intrahepatic mass-forming cholangiocarcinoma and hepatocellular carcinoma. Acta Radiol. 2023; 64:926–935.
35. Kang JH, Choi SH, Lee JS, Park SH, Kim KW, Kim SY, et al. Interreader agreement of Liver Imaging Reporting and Data System on MRI: a systematic review and meta-analysis. J Magn Reson Imaging. 2020; 52:795–804.
36. van der Pol CB, Lim CS, Sirlin CB, McGrath TA, Salameh JP, Bashir MR, et al. Accuracy of the Liver Imaging Reporting and Data System in computed tomography and magnetic resonance image analysis of hepatocellular carcinoma or overall malignancy-a systematic review. Gastroenterology. 2019; 156:976–986.
37. Chernyak V, Sirlin CB. Editorial for “interreader agreement of Liver Imaging Reporting and Data System on MRI: a systematic review and meta-analysis”. J Magn Reson Imaging. 2020; 52:805–806.
38. American College of Radiology. LI-RADS lexicon [Internet]. Reston, VA (US): American College of Radiology; [cited 2024 Feb 4]. Available from: https://www.acr.org/-/media/ACR/Files/RADS/LI-RADS/LIRAD-SLexicon-Table.pdf.
Table 1.
Study | Study design | Number of patients* | Patient age | Cirrhosis patients† | Dominant etiology of liver disease | Number of hepatic observations‡ | Size of observations (cm) | Imaging modality | MRI magnet | MRI contrast agent | Agreement metric | Number of readers§ | Blinding to reference standard |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Jiang et al. [20] (2019) | Prospective | 211 (80.1) | 51.4 (26.0-83.0) | 134 (63.5) | Hepatitis B virus (95.3) | 229 (14.0) | 5.43 (1.0-14.9) | MRI | 3.0-T | Gadoxetate | Cohen’s κ | 2 abdominal radiologists (4, 10) | Yes |
Kim et al. [5] (2019) | Retrospective | 220 (81.4) | 58.0 (29.0-80.0) | 220 (100.0) | Hepatitis B virus (75.9) | 220 (25.0) | 3.6 (1.0-11.5) | MRI | 1.5-T or 3.0-T | Gadoxetate | Cohen’s κ | 2 board-certified radiologists (3, 25) | Yes |
Ludwig et al. [6] (2019) | Retrospective | 178 (77.5) | 61.9 (26.0-86.0) | 174 (97.8) | Hepatitis C virus (46.6) | 178 (41.0) | 3.46 (1.4-19.0) | CT or MRI | 1.5-T or 3.0-T | Gadoxetate, gadoversetamide, gadobenate | Cohen’s κ | 2 abdominal radiologists (3, 7) | Yes |
Kim et al. [7] (2020) | Retrospective | 165 (78.2) | 58.1 | NR | Hepatitis B virus (83.6) | 165 (31.5) | 3.79 | MRI | 1.5-T or 3.0-T | Gadoxetate | Cohen’s κ | 2 abdominal radiologists (6, 8) | Yes |
Kim et al. [8] (2020) | Retrospective | 98 (83.7) | 60 | NR | Hepatitis B virus (NR) | 98 (50.0) | 3.10 | MRI | 3.0-T | Gadoxetate, gadoterate meglumine | Cohen’s κ | 2 abdominal radiologists (8, 9) | Yes |
Lim et al. [9] (2020) | Retrospective | 65 (83.1) | 62.7 (40.0-84.0) | 65 (100.0) | Alcohol (46.2) | 65 (50.8) | NR | MRI | 1.5-T | Gadoxetate | ICC | 2 abdominal radiologists (5, 15) | Yes |
Cannella et al. [22] (2021) | Retrospective | 266 (77.8) | 64.0 (55.0-70.0) | 85 (32.0) | Hepatitis B virus (26.3) | CT, 253 (0.0); MRI, 227 (0.0) | CT, 4.39; MRI, 4.20 | CT or MRI | NR | Gadobenate dimeglumine, gadoterate meglumine | Cohen’s κ | 3 abdominal radiologists (6, 8, 10) | Yes |
Min et al. [21] (2021) | Retrospective | 189 (70.4) | 59.4 (36.0-80.0) | 68 (36.0) | Hepatitis B virus (77.8) | 189 (33.3) | 3.63 | MRI | 3.0-T | Gadoxetate | Cohen’s κ | 2 abdominal radiologists (11, 23) | Yes |
Moon et al. [23] (2021) | Retrospective | 242 (78.5) | 57.1 (31.0-84.0) | 108 (44.6) | Hepatitis B virus (82.2) | 242 (0.0) | 2.59 (0.9-4.8) | MRI | 3.0-T | Gadoxetate | Cohen’s κ | 2 abdominal radiologists (10, 11) | Yes |
Shin et al. [24] (2021) | Retrospective | 281 (68.0) | 57 | 157 (55.9) | Hepatitis B virus (89.3) | 281 (0.0) | 3.23 | MRI | 3.0-T | Gadoxetate, gadoterate meglumine | Fleiss’ κ | 3 abdominal radiologists (5, 9, 27) | Yes |
Yoon et al. [25] (2021) | Retrospective | 113 (80.5) | 58.6±8.8 | 68 (60.2) | Hepatitis B virus (87.6) | 113 (100.0) | 3.54 | MRI | 3.0-T | Gadoxetate, gadoterate meglumine | Cohen’s κ | 2 abdominal radiologists (NR) | Yes |
Chen et al. [26] (2022) | Retrospective | 278 (79.9) | 53.9±11.7 | 161 (57.9) | Hepatitis B virus (91.0) | 278 (0.0) | 3.90 | MRI | 1.5-T or 3.0-T | Gadoxetate | Cohen’s κ | 2 abdominal radiologists (6, 8) | Yes |
Jang et al. [10] (2022) | Retrospective | 384 (77.6) | 60 (33.0-84.0) | NR | Hepatitis B virus (80.7) | 463 (6.9) | 2.1 (0.5-3.0) | MRI | 1.5-T or 3.0-T | Gadoxetate | Cohen’s κ | 2 abdominal radiologists (>5) | Yes |
Liang et al. [27] (2022) | Retrospective | 93 (81.7) | NR | 73 (78.5) | Hepatitis B virus (98.9) | 93 (0.0) | 6.86 | MRI | 3.0-T | Gadoxetate | Cohen’s κ | 2 abdominal radiologists (8, 15) | Yes |
Min et al. [11] (2022) | Retrospective | 100 (76.0) | 58 (31.0-77.0) | 53 (53.0) | Hepatitis B virus (86.0) | 100 (25.0) | 2.5 (1.1-4.7) | MRI | 3.0-T | Gadoxetate | Fleiss’ κ | 8 abdominal radiologists (1, 2, 3, 5, 8, 10, 11, 15) | Yes |
Yang et al. [28] (2022) | Retrospective | 134 (86.6) | 54.3±11.3 | NR | Hepatitis B virus (97.0) | 134 (0.0) | 3.25±1.86 | MRI | 1.5-T | Gadoxetate | Cohen’s κ | 2 abdominal radiologists (5, 8) | Yes |
Yang et al. [29] (2022) | Retrospective | 140 (82.9) | 51.9±11.0 | 72 (51.4) | Hepatitis B virus (88.6) | 140 (0.0) | 4.8±3.6 | MRI | 1.5-T or 3.0-T | Gadoxetate | Cohen’s κ | 2 abdominal radiologists (5, 7) | Yes |
Zheng et al. [12] (2022) | Retrospective | 116 (81.9) | 56.3 | 68 (58.6) | Hepatitis B virus (78.4) | 116 (29.3) | 6.39 | MRI | 3.0-T | Gadobenate dimeglumine | Cohen’s κ | 2 abdominal radiologists (5, 11) | Yes |
Hwang et al. [30] (2023) | Retrospective | 113 (65.6) | 61.1±10.1 | 52 (46.0) | Hepatitis B virus (76.1) | 113 (100.0) | 3.5±1.7 | MRI | 1.5-T or 3.0-T | Gadoxetate, gadoterate meglumine | Cohen’s κ | 2 abdominal radiologists (10, 14) | Yes |
Park et al. [33] (2023) | Retrospective | 465 (75.5) | 58.0 (28.0-86.0) | 203 (43.7) | Hepatitis B virus (75.1) | 465 (0.0) | 2.7 (2.0-4.0)Π | MRI | 3.0-T | Gadoxetate | Cohen’s κ | 1 abdominal radiologist and 1 radiology resident (9, 3) | Yes |
Wu et al. [34] (2023) | Retrospective | 118 (75.4) | 59.1 | 23 (19.5) | Hepatitis B virus (72.1) | 118 (34.7) | 4.5 | MRI | 3.0-T | Gadoxetate | Cohen’s κ | 2 abdominal radiologists (8, 15) | Yes |
Yang et al. [13] (2023) | Retrospective | 182 (79.7) | 52.9 (28.0-75.0) | 109 (59.9) | Hepatitis B virus (94.0) | 182 (0.0) | 2.7 (1.6-4.0) | MRI | 1.5-T or 3.0-T | Gadoxetate | Cohen’s κ | 2 abdominal radiologists (5, 7) | Yes |
Min et al. [31] (2024) | Retrospective | 357 (76.2) | 58.8 (34.0-82.0) | 239 (66.9) | Hepatitis B virus (81.8) | 357 (0.0) | 1.85 | MRI | 3.0-T | Gadoxetate | Cohen’s κ | 2 abdominal radiologists (13, 22) | Yes |
Oh et al. [32] (2024) | Retrospective | 281 (75.4) | 60.0±9.0 | 281 (100.0) | Hepatitis B virus (77.9) | 344 (4.4) | 2.0 | MRI | 1.5-T or 3.0-T | Gadoxetate | Cohen’s κ | 2 abdominal radiologists (2, 7) | Yes |
Table 2.
Table 3.
Covariate* | Rim arterial phase hyperenhancement, pooled κ (95% CI) | P-value | Peripheral washout, pooled κ (95% CI) | P-value | Delayed central enhancement, pooled κ (95% CI) | P-value | Targetoid restriction, pooled κ (95% CI) | P-value | Targetoid TP/HBP appearance, pooled κ (95% CI) | P-value |
---|---|---|---|---|---|---|---|---|---|---|
% of cirrhosis | | 0.74 | | 0.57 | | 0.053 | | 0.27 | | 0.09 |
≥50% (n=14) | 0.71 (0.60-0.81) | | 0.48 (0.28-0.68) | | 0.52 (0.37-0.66) | | 0.67 (0.52-0.82) | | 0.69 (0.56-0.83) | |
<50% (n=6) | 0.74 (0.59-0.88) | | 0.57 (0.33-0.81) | | 0.77 (0.61-0.93) | | 0.82 (0.68-0.96) | | 0.87 (0.85-0.90) | |
Dominant etiology of liver disease | | 0.90 | | 0.13 | | 0.29 | | 0.42 | | 0.36 |
Hepatitis B virus (n=22) | 0.72 (0.65-0.78) | | 0.54 (0.41-0.67) | | 0.61 (0.51-0.72) | | 0.69 (0.57-0.81) | | 0.76 (0.66-0.85) | |
Others (n=2) | 0.72 (0.00-2.86) | | 0.05 (0.00-0.47) | | 0.44 (0.00-2.25) | | 0.55 (0.00-2.89) | | 0.66 (0.00-1.53) | |
% of other malignancy | | 0.05 | | 0.33 | | 0.52 | | 0.48 | | 0.51 |
≥10% of hepatic observations (n=12) | 0.78 (0.69-0.86) | | 0.57 (0.38-0.76) | | 0.62 (0.47-0.77) | | 0.64 (0.45-0.82) | | 0.71 (0.57-0.86) | |
<10% of hepatic observations (n=12) | 0.65 (0.55-0.75) | | 0.46 (0.25-0.66) | | 0.57 (0.43-0.72) | | 0.72 (0.59-0.86) | | 0.78 (0.66-0.90) | |
Mean size of hepatic observations | | <0.01 | | <0.01 | | 0.16 | | 0.69 | | 0.18 |
≥3 cm (n=16) | 0.78 (0.70-0.85) | | 0.64 (0.51-0.77) | | 0.67 (0.56-0.77) | | 0.70 (0.53-0.87) | | 0.82 (0.71-0.93) | |
<3 cm (n=7) | 0.59 (0.50-0.68) | | 0.35 (0.12-0.58) | | 0.52 (0.27-0.77) | | 0.65 (0.46-0.84) | | 0.68 (0.49-0.86) | |
MRI scanner field strength | | 0.04 | | 0.11 | | 0.02 | | 0.01 | | 0.02 |
3.0-T only (n=12) | 0.79 (0.71-0.87) | | 0.60 (0.46-0.75) | | 0.70 (0.59-0.82) | | 0.78 (0.66-0.91) | | 0.82 (0.71-0.94) | |
1.5-T or both (n=11) | 0.65 (0.54-0.76) | | 0.39 (0.09-0.68) | | 0.49 (0.32-0.66) | | 0.55 (0.39-0.71) | | 0.64 (0.52-0.76) | |
MRI contrast agent | | 0.68 | | 0.70 | | 0.22 | | 0.69 | | 0.28 |
HBA only (n=18) | 0.71 (0.63-0.79) | | 0.51 (0.34-0.68) | | 0.56 (0.42-0.69) | | 0.69 (0.55-0.83) | | 0.71 (0.60-0.83) | |
ECA or both (n=6) | 0.75 (0.58-0.92) | | 0.62 (0.38-0.85) | | 0.70 (0.57-0.82) | | 0.69 (0.50-0.87) | | 0.87 (0.76-0.98) | |
Number of image readers | | 0.18 | | 0.11 | | 0.19 | | 0.15 | | <0.01 |
≥3 (n=3) | 0.60 (0.40-0.81) | | 0.20 (0.07-0.33) | | 0.37 (0.15-0.60) | | 0.43 (0.38-0.48) | | 0.33 (0.30-0.37) | |
2 (n=21) | 0.73 (0.66-0.80) | | 0.56 (0.43-0.70) | | 0.63 (0.52-0.73) | | 0.71 (0.59-0.82) | | 0.87 (0.76-0.98) | |
Table 4.
Covariate* | Infiltrative appearance, pooled κ (95% CI) | P-value | Marked diffusion restriction, pooled κ (95% CI) | P-value | Necrosis or severe ischemia, pooled κ (95% CI) | P-value |
---|---|---|---|---|---|---|
% of cirrhosis | | 0.82 | | 0.26 | | 0.52 |
≥50% (n=14) | 0.65 (0.46-0.83) | | 0.48 (0.24-0.72) | | 0.60 (0.43-0.77) | |
<50% (n=6) | 0.69 (0.50-0.89) | | 0.88 (0.16-0.44) | | 0.70 (0.56-0.84) | |
Dominant etiology of liver disease | | 0.64 | | 0.69 | | 0.94 |
Hepatitis B virus (n=22) | 0.65 (0.49-0.81) | | 0.50 (0.29-0.72) | | 0.60 (0.48-0.73) | |
Others (n=2) | 0.52 (0.00-3.88) | | 0.39 (0.00-4.47) | | 0.49 (0.00-5.49) | |
% of other malignancy | | 0.85 | | 0.74 | | 0.50 |
≥10% of hepatic observations (n=12) | 0.65 (0.38-0.91) | | 0.46 (0.07-0.85) | | 0.55 (0.33-0.78) | |
<10% of hepatic observations (n=12) | 0.62 (0.40-0.84) | | 0.52 (0.25-0.78) | | 0.64 (0.46-0.82) | |
Mean size of hepatic observations | | 0.14 | | 0.49 | | 0.96 |
≥3 cm (n=16) | 0.67 (0.50-0.85) | | 0.52 (0.21-0.82) | | 0.58 (0.42-0.74) | |
<3 cm (n=7) | 0.39 (0.00-1.02) | | 0.39 (0.27-0.50) | | 0.59 (0.11-1.07) | |
MRI scanner field strength | | 0.03 | | 0.04 | | 0.02 |
3.0-T only (n=12) | 0.83 (0.57-1.08) | | 0.78 (0.33-1.23) | | 0.77 (0.63-0.92) | |
1.5-T or both (n=11) | 0.48 (0.27-0.69) | | 0.39 (0.18-0.60) | | 0.46 (0.24-0.67) | |
MRI contrast agent | | 0.53 | | 0.41 | | 0.50 |
HBA only (n=18) | 0.66 (0.46-0.87) | | 0.43 (0.19-0.67) | | 0.57 (0.39-0.74) | |
ECA or both (n=6) | 0.55 (0.08-1.02) | | 0.63 (0.14-1.13) | | 0.68 (0.38-0.98) | |
Number of image readers | | 0.72 | | Not applicable† | | 0.90 |
≥3 (n=3) | 0.69 (0.50-0.86) | | | | 0.63 (0.20-1.06) | |
2 (n=21) | 0.62 (0.44-0.80) | | | | 0.60 (0.45-0.75) | |
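
The pooled κ values reported in Tables 3 and 4 are random-effects estimates. As a minimal sketch of how a DerSimonian-Laird pooled estimate, Cochran's Q, and I² can be computed from study-level κ values and their standard errors, consider the following Python example. The function name, the input values, and the normal-approximation 95% CI are illustrative assumptions; they are not taken from the included studies or from the authors' analysis code.

```python
import math


def dersimonian_laird(estimates, std_errors):
    """Pool study-level estimates with a DerSimonian-Laird random-effects model.

    estimates  : per-study kappa values (hypothetical inputs in this sketch)
    std_errors : per-study standard errors of kappa
    """
    k = len(estimates)
    # Inverse-variance (fixed-effect) weights
    w = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed-effect mean
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))

    # Method-of-moments estimate of the between-study variance tau^2
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]
    sum_w_star = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum_w_star
    se_pooled = math.sqrt(1.0 / sum_w_star)

    # I^2: share of total variability attributed to heterogeneity (percent)
    i2 = max(0.0, (q - (k - 1)) / q) * 100.0 if q > 0 else 0.0

    return {
        "pooled": pooled,
        "ci_low": pooled - 1.96 * se_pooled,   # assumed Wald-type interval
        "ci_high": pooled + 1.96 * se_pooled,
        "tau2": tau2,
        "q": q,
        "i2": i2,
    }


if __name__ == "__main__":
    # Hypothetical kappa values and standard errors for three studies
    print(dersimonian_laird([0.78, 0.59, 0.65], [0.04, 0.05, 0.06]))
```

A subgroup analysis such as those in Tables 3 and 4 would apply the same pooling separately within each covariate level before the levels are compared.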