Abstract
Background
Hyperkalemia is a potentially fatal condition that mandates rapid identification in emergency departments (EDs). Although a 12-lead electrocardiogram (ECG) can indicate hyperkalemia, subtle changes in the ECG often pose detection challenges. An artificial intelligence application that accurately assesses hyperkalemia risk from ECGs could revolutionize patient screening and treatment. We aimed to evaluate the efficacy and reliability of a smartphone application, which utilizes camera-captured ECG images, in quantifying hyperkalemia risk compared to human experts.
Methods
We performed a retrospective analysis of ED hyperkalemic patients (serum potassium ≥ 6 mmol/L) and their age- and sex-matched non-hyperkalemic controls. The application was tested by five users and its performance was compared to five board-certified emergency physicians (EPs).
Results
Our study included 125 patients. The area under the receiver operating characteristic curve (AUC-ROC) of the application’s output was nearly identical among the users, ranging from 0.898 to 0.904 (median: 0.902), indicating almost perfect interrater agreement (Fleiss’ kappa 0.948). The application demonstrated high sensitivity (0.797), specificity (0.934), negative predictive value (NPV) (0.815), and positive predictive value (PPV) (0.927). In contrast, the EPs showed moderate interrater agreement (Fleiss’ kappa 0.551), and their consensus score had a significantly lower AUC of 0.662. The physicians’ consensus demonstrated a sensitivity of 0.203, specificity of 0.934, NPV of 0.527, and PPV of 0.765. Notably, this performance difference remained significant regardless of patients’ sex and age (P < 0.001 for both).
Graphical Abstract
Hyperkalemia is a common electrolyte disturbance that occurs when serum potassium levels exceed the normal range of 3.5–5.0 mmol/L. Severe hyperkalemia, defined as serum potassium levels greater than 6.5 mmol/L, can lead to life-threatening cardiac arrhythmias and cardiac arrest.1 Early detection and prompt management of hyperkalemia are crucial to prevent adverse outcomes.
The 12-lead electrocardiogram (ECG) is a widely available and non-invasive tool that can aid in the diagnosis of hyperkalemia.2,3 The classic ECG changes associated with hyperkalemia include tall, peaked T waves, widened QRS complex, and ultimately, loss of P waves.4,5 However, these ECG changes can be subtle and difficult to quantify, particularly in the early stages of hyperkalemia.6,7
Artificial intelligence (AI) algorithms have been proposed as a potential solution to improve the accuracy and clinical usability of ECG interpretation for various conditions including hyperkalemia.8,9,10,11 By quantifying the risk of hyperkalemia using AI, clinicians may be able to make more timely and informed decisions regarding patient management, especially in the emergency department (ED).
However, incorporating existing AI algorithms for ECG analysis into clinicians’ routines poses significant challenges. Most AI algorithms depend on raw waveform data for their analyses. Because most physicians have access only to printed (on-screen or paper) ECGs, they cannot use these algorithms unless such services are integrated into their ECG devices or electronic health record (EHR) systems. Such system-level integration would be costly because of the diversity of ECG manufacturers and EHR systems.
Some of these challenges may be mitigated through the development of AI capable of analyzing printed ECG images. Therefore, we previously developed a smartphone application, “ECG Buddy™”.12,13 The application automatically detects, captures, and analyzes ECG waveforms using the smartphone’s camera and is capable of extracting various digital biomarkers from printed 12-lead ECGs.
In this study, we evaluated its performance and compared it to that of human experts. Because it uses photographic images as input, which can be subject to various sources of noise and variability, we also tested its reliability as a digital biomarker.
This was a retrospective study that included patients who visited the ED of an academic hospital between July and September 2021. We included patients who had a serum potassium level of 6 mmol/L or more (hyperkalemic group) with a 12-lead ECG obtained at the visit, and their age- and sex-matched controls with a serum potassium level less than 6 mmol/L (non-hyperkalemic group) and an ECG from the same period.
We collected demographic information including age and sex of the patient, chief complaints, initial serum potassium level and captured images of the waveform area of the ECGs from the electronic medical record system. Patients suspected of having pseudohyperkalemia were excluded.
An AI smartphone application, named “ECG Buddy,” was used to analyze ECG images (Fig. 1). It can analyze 12-lead ECGs and provide risk scores (Quantitative ECG [QCG®] scores, ranging from 0 to 100) for various cardiac, hemodynamic, and electrolyte problems using deep learning algorithms.12,13 Briefly, users take a picture of a 12-lead ECG, printed on paper or shown on a computer display, and the application analyzes the image and returns AI risk scores for 10 conditions: critical events (respiratory or circulatory failure), acute coronary syndrome, ST-elevation myocardial infarction, myocardial injury, pulmonary edema, large pericardial effusion, left ventricular dysfunction, right ventricular dysfunction, pulmonary hypertension, and hyperkalemia.
To establish a benchmark for the AI score for hyperkalemia, a consensus score was obtained from a group of five board-certified emergency physicians who were blinded to the patients’ clinical information and potassium levels. The physicians were emergency medicine (EM) professors with at least 6 years of clinical work experience in tertiary care centers. Each physician was presented with ECG images and asked to determine whether the patient had hyperkalemia, defined as a potassium level of 6 mmol/L or higher. The percentage of experts who voted “yes” was calculated as the consensus score, and the consensus decision was based on the majority vote among the experts. This consensus score and decision were used as the comparator for the accuracy of the AI model.
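The consensus score and majority decision described above can be illustrated with a minimal Python sketch. The function name `consensus` and the example votes are hypothetical and are not taken from the study data; the sketch only assumes that each physician gives a yes/no answer per ECG.

```python
from typing import List, Tuple


def consensus(votes: List[bool]) -> Tuple[float, bool]:
    """Return (consensus score in percent, majority decision) for one ECG.

    votes holds one yes/no answer per physician (True = hyperkalemia).
    """
    yes = sum(votes)
    score = 100.0 * yes / len(votes)
    # Strict majority; with five raters a tie cannot occur.
    decision = yes > len(votes) / 2
    return score, decision


# Hypothetical votes from five physicians for two ECGs (illustrative only).
print(consensus([True, True, True, False, False]))    # (60.0, True)
print(consensus([False, True, False, False, False]))  # (20.0, False)
```

With five raters the consensus score can only take the values 0, 20, 40, 60, 80, or 100, so a ROC curve built from it has at most six distinct operating points.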
The application’s performance was evaluated in comparison to the consensus score obtained from the panel of physicians, using the area under the receiver operating characteristic curve (AUC-ROC). The sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) of the application were calculated by binning the AI score based on the Youden index. To assess the inter-rater agreement among the application users and among the expert physicians, Fleiss' Kappa was calculated. The data analysis was performed using R software, version 4.1.0.14,15,16
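As a rough illustration of the two quantities named above, the following Python sketch computes an AUC-ROC and a Youden-index cut-off from scratch. It is not the authors' R code: the function names, the toy scores, and the choice of observed scores as candidate cut-offs are all assumptions made for this example (some ROC tools use midpoints between adjacent scores instead).

```python
from typing import Sequence, Tuple


def auc_roc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a positive case outscores a negative one.

    Ties count as 0.5 (equivalent to the Mann-Whitney U statistic).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_threshold(scores: Sequence[float],
                     labels: Sequence[int]) -> Tuple[float, float]:
    """Return (cut-off, J) maximizing J = sensitivity + specificity - 1.

    A case is called positive when score >= cut-off. On ties in J the
    lowest cut-off is kept.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_t, best_j = float("nan"), -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < t)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


# Toy risk scores (0-100) and labels (1 = hyperkalemic); illustrative only.
toy_scores = [2.0, 5.0, 30.0, 8.0, 40.0, 75.0, 90.0, 15.0]
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]

print(auc_roc(toy_scores, toy_labels))           # 0.9375
print(youden_threshold(toy_scores, toy_labels))  # (15.0, 0.75)
```

Once the cut-off is fixed, sensitivity, specificity, PPV, and NPV follow from the resulting 2×2 table.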
A total of 125 patients (64 hyperkalemic and 61 non-hyperkalemic) were included in this study (Table 1). The mean age of the hyperkalemic group was 73.4 ± 10.6 years and that of the non-hyperkalemic group was 71.6 ± 11.3 years. There were no significant differences in age or sex, and the most common chief complaint was dyspnea in both groups. The QCG score differed significantly between the groups: 75.9 (28.2–92.2) in the hyperkalemic group and 2.8 (1.5–7.0) in the non-hyperkalemic group.
The application was evaluated by five evaluators: two physicians, two nurses, and a paramedic (Table 2). The AUC-ROC of the application for detecting hyperkalemia ranged from 0.898 (95% confidence interval, 0.842–0.954) to 0.904 (0.849–0.959) with almost perfect inter-rater agreement (Fleiss’ Kappa 0.948). Because all five evaluators showed almost identical results (Fig. 2), the following analyses are based on the assessment results of physician #1, whose AUC was 0.900.
The AUC of the EM physicians ranged from 0.522 (0.472–0.572) to 0.638 (0.568–0.709) with moderate inter-rater agreement (Fleiss’ Kappa 0.551; Table 3). Even the consensus score of the EM physicians achieved an AUC of only 0.662 (0.585–0.739), which was significantly lower than that of every application evaluation (P < 0.001 for all).
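For readers unfamiliar with the agreement statistic reported here, the sketch below implements the standard Fleiss' kappa formula in plain Python. It assumes that every rater's output has been reduced to one category per ECG (for example, hyperkalemia yes/no); how the continuous application scores were categorized for this purpose is not stated in the text, so that step is left out. The function name and the toy ratings are hypothetical.

```python
from typing import List


def fleiss_kappa(counts: List[List[int]]) -> float:
    """Fleiss' kappa for a subjects x categories table of rating counts.

    counts[i][j] is the number of raters who assigned subject i to
    category j. Every subject must be rated by the same number of raters.
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])

    # Share of all assignments that fall into each category.
    p_j = [sum(row[j] for row in counts) / (n_subjects * n_raters)
           for j in range(n_categories)]

    # Observed agreement for each subject, then averaged.
    p_i = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
           for row in counts]
    p_bar = sum(p_i) / n_subjects

    # Agreement expected by chance.
    p_e = sum(p * p for p in p_j)
    return (p_bar - p_e) / (1.0 - p_e)


# Toy table: 4 ECGs, 5 raters, columns = [no, yes]; illustrative only.
toy_counts = [[5, 0], [0, 5], [4, 1], [1, 4]]
print(round(fleiss_kappa(toy_counts), 3))  # 0.6
```

A value of 1 means complete agreement and 0 means agreement no better than chance; the labels "moderate" and "almost perfect" used in the text correspond to conventional bands of this scale.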
Binned at the threshold of 22.1, the application (physician #1) showed a sensitivity of 0.797 (0.703–0.891), specificity of 0.934 (0.869–0.984), NPV of 0.815 (0.746–0.891), and PPV of 0.927 (0.862–0.982; Table 4). The consensus decision of the EM physicians showed a sensitivity of 0.203 (0.109–0.297), specificity of 0.934 (0.869–0.984), NPV of 0.527 (0.491–0.563), and PPV of 0.765 (0.545–0.944).
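The relationship between these four metrics can be checked with a short Python sketch. The 2×2 counts below are not taken from Table 4; they are back-calculated here from the reported sensitivity, specificity, and group sizes (64 hyperkalemic, 61 non-hyperkalemic) and should be read as an assumption. The predictive values they produce agree with those given in the abstract to within about 0.001.

```python
from typing import Dict


def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Sensitivity, specificity, PPV and NPV from a 2x2 confusion table."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Back-calculated counts (assumption, not copied from the paper's tables):
# sensitivity 0.797 x 64 is about 51 true positives for the application,
# sensitivity 0.203 x 64 is about 13 for the physicians' consensus,
# specificity 0.934 x 61 is about 57 true negatives for both.
app = diagnostic_metrics(tp=51, fn=13, tn=57, fp=4)
eps = diagnostic_metrics(tp=13, fn=51, tn=57, fp=4)

for name, m in (("application", app), ("EP consensus", eps)):
    print(name, {k: round(v, 3) for k, v in m.items()})
```

Because both readers share the same specificity, the gap in predictive values comes entirely from the difference in sensitivity: the physicians' consensus misses most hyperkalemic ECGs, which lowers its NPV towards the prevalence-driven baseline.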
When analyzed by age category (≥ 73 years, the median age, as the cutoff) or sex, the AUC-ROC of the QCG score was 0.905 (0.826–0.984) for the older age group, 0.896 (0.816–0.977) for the younger age group, 0.961 (0.916–1.000) for female patients, and 0.863 (0.778–0.947) for male patients. The difference between sexes was significant (P = 0.045), whereas the difference between age groups was not (P = 0.879; Table 5). The performance was consistently higher than that of the EM physicians’ consensus score in all subgroups (P < 0.001 for all).
In this study, we evaluated the performance of a smartphone-based AI analyzer that uses camera input to screen for hyperkalemia in ED patients. Our findings demonstrate that the image-based AI ECG reader outperforms human experts in identifying hyperkalemia from initial ECGs and exhibits near-perfect inter-rater consistency. To our knowledge, this is the first study to report the accuracy of a smartphone-based, camera-operated ECG image analyzer in diagnosing hyperkalemia. Furthermore, it is among the few studies that report the inter-rater reliability of an AI application.17
Prompt detection of hyperkalemia is crucial in the ED, given that severe cases can progress to fatal cardiac arrest. Immediate interventions, including the administration of insulin and glucose, bicarbonate, and beta-agonists, can preempt this critical event. Nevertheless, waiting for laboratory results to confirm hyperkalemia may cause considerable delays, thereby increasing the risk of adverse outcomes.
In contrast to laboratory tests, an ECG takes only about a minute. Thus, the primary advantage of using AI for hyperkalemia screening in the ED over laboratory results is its speed, which can facilitate the immediate initiation of the appropriate therapeutic strategy. In this study, we have demonstrated the accuracy and reliability of AI as a diagnostic tool for hyperkalemia screening. Given that emergency treatment of hyperkalemia is straightforward and relatively safe, while severe hyperkalemia can precipitate cardiac arrest, the application of ECG AI for hyperkalemia might significantly enhance ED triage and emergency treatment.
Earlier studies have demonstrated that ECG AIs can accurately detect hyperkalemia. For instance, Galloway et al.9 predicted hyperkalemia with a convolutional neural network using a variable number of ECG leads. Lin et al.8 utilized a hierarchical attention network architecture (ECG12Net) and reported superior accuracy in detecting dyskalemia compared to emergency physicians and cardiologists. Moreover, Kwon et al.18 developed deep learning models to predict abnormalities in electrolyte levels, including potassium, sodium, and calcium. Although these methods demonstrated excellent accuracy, such AI services have not seen widespread use because they require raw ECG signal data, whereas healthcare providers typically have access only to printed ECGs.
Thus, the use of camera input data, as opposed to raw ECG data, for AI analysis as demonstrated in our study may considerably increase healthcare workers’ access to AI technology. Because smartphones with cameras are ubiquitous among healthcare workers, the technology can be accessed simply by downloading the application. Moreover, this approach has a financial advantage, given that there is no need to replace or upgrade existing ECG machines or EHR systems.
The AI demonstrated enhanced performance within the female group. While this could potentially be a coincidental observation, it's important to acknowledge the existence of sex differences in cardiac electrophysiology. For example, ECGs from men typically display a higher T-wave amplitude and an increased ST angle, while those from women often show a longer QT duration.19 Furthermore, reports have indicated sex differences in ECG changes associated with pathological conditions, such as left ventricular hypertrophy (LVH).20 Despite these known differences, no previous studies have specifically addressed the sex-based performance disparity of AI in hyperkalemia screening. As such, further research is necessary to determine whether this finding can be replicated.
Our study is not without limitations. Firstly, it is a single-center, retrospective study, potentially limiting the generalizability of our findings. Secondly, the sample size is also relatively small, suggesting a need for verification in a larger, multi-center study. Thirdly, the decision threshold for the QCG score, set at 22.1 based on the Youden index, yielded a sensitivity of 79.7% and a specificity of 93.4%. However, in a real clinical application, this threshold could be adjusted to meet predetermined minimum sensitivity or specificity requirements, depending on the ED's policies or available resources. Fourthly, the AI application lacks an explainability feature, which represents a significant limitation. This absence hinders users from learning from the AI algorithms, as understanding the features used to classify signals could provide valuable insights. Lastly, while our results illustrate the superior ability of the image-based AI ECG reader over clinicians in detecting hyperkalemia, the clinical efficacy of this technology needs further exploration. Future investigations should aim to evaluate the impact of this technology on patient outcomes and healthcare costs.
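The threshold adjustment mentioned in the third limitation can be sketched in Python as follows. This is only an illustration of one possible policy, choosing the highest cut-off that still meets a required minimum sensitivity; the function name, the toy scores, and the sensitivity targets are hypothetical and do not come from the study.

```python
from typing import Optional, Sequence


def threshold_for_min_sensitivity(scores: Sequence[float],
                                  labels: Sequence[int],
                                  min_sensitivity: float) -> Optional[float]:
    """Highest cut-off (score >= cut-off is positive) meeting the target.

    Scanning from the highest score downwards means the first cut-off that
    reaches the required sensitivity also keeps specificity as high as
    possible. labels must contain at least one positive case (1).
    """
    n_pos = sum(labels)
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        if tp / n_pos >= min_sensitivity:
            return t
    return None  # only reached if the target exceeds 1.0


# Toy risk scores (0-100) and labels (1 = hyperkalemic); illustrative only.
toy_scores = [2.0, 5.0, 30.0, 8.0, 40.0, 75.0, 90.0, 15.0]
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]

print(threshold_for_min_sensitivity(toy_scores, toy_labels, 0.75))  # 40.0
print(threshold_for_min_sensitivity(toy_scores, toy_labels, 1.0))   # 15.0
```

A rule-out policy would demand a high minimum sensitivity and accept more false positives, whereas a department with limited resources might fix a minimum specificity instead; the same scan can be mirrored for that case.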
In conclusion, an image-based AI ECG reader with camera input is an accurate and reliable tool for hyperkalemia screening at ED triage. It may enable earlier treatment of hyperkalemic patients and lead to improved patient outcomes.
Notes
Funding: This research was supported by a grant of the Korea Health Technology R&D Project through the Korea Health Industry Development Institute (KHIDI), funded by the Ministry of Health & Welfare, Republic of Korea (grant number: RS-2023-00265933).
Disclosure: Joonghee Kim, MD developed the algorithm and founded a start-up company, ARPI Inc., of which he is the CEO. Youngjin Cho works for the company as a research director. Eunkyoung Lee and Bumi Jeong work for the company as clinical researchers. The other authors have no conflicts of interest to disclose.
References
1. Diercks DB, Shumaik GM, Harrigan RA, Brady WJ, Chan TC. Electrocardiographic manifestations: electrolyte abnormalities. J Emerg Med. 2004; 27(2):153–160. PMID: 15261358.
2. Dillon JJ, DeSimone CV, Sapir Y, Somers VK, Dugan JL, Bruce CJ, et al. Noninvasive potassium determination using a mathematically processed ECG: proof of concept for a novel “blood-less, blood test”. J Electrocardiol. 2015; 48(1):12–18. PMID: 25453193.
3. Attia ZI, DeSimone CV, Dillon JJ, Sapir Y, Somers VK, Dugan JL, et al. Novel bloodless potassium determination using a signal-processed single-lead ECG. J Am Heart Assoc. 2016; 5(1):e002746. PMID: 26811164.
4. Velagapudi V, O’Horo JC, Vellanki A, Baker SP, Pidikiti R, Stoff JS, et al. Computer-assisted image processing 12 lead ECG model to diagnose hyperkalemia. J Electrocardiol. 2017; 50(1):131–138. PMID: 27662777.
5. Laks MM, Elek SR. The effect of potassium on the electrocardiogram: clinical and transmembrane correlations. Dis Chest. 1967; 51(6):573–586. PMID: 6027025.
6. Wrenn KD, Slovis CM, Slovis BS. The ability of physicians to predict hyperkalemia from the ECG. Ann Emerg Med. 1991; 20(11):1229–1232. PMID: 1952310.
7. Montague BT, Ouellette JR, Buller GK. Retrospective review of the frequency of ECG changes in hyperkalemia. Clin J Am Soc Nephrol. 2008; 3(2):324–330. PMID: 18235147.
8. Lin CS, Lin C, Fang WH, Hsu CJ, Chen SJ, Huang KH, et al. A deep-learning algorithm (ECG12Net) for detecting hypokalemia and hyperkalemia by electrocardiography: algorithm development. JMIR Med Inform. 2020; 8(3):e15931. PMID: 32134388.
9. Galloway CD, Valys AV, Shreibati JB, Treiman DL, Petterson FL, Gundotra VP, et al. Development and validation of a deep-learning model to screen for hyperkalemia from the electrocardiogram. JAMA Cardiol. 2019; 4(5):428–436. PMID: 30942845.
10. Chiu IM, Cheng JY, Chen TY, Wang YM, Cheng CY, Kung CT, et al. Using deep transfer learning to detect hyperkalemia from ambulatory electrocardiogram monitors in intensive care units: personalized medicine approach. J Med Internet Res. 2022; 24(12):e41163. PMID: 36469396.
11. Corsi C, Cortesi M, Callisesi G, De Bie J, Napolitano C, Santoro A, et al. Noninvasive quantification of blood potassium concentration from ECG in hemodialysis patients. Sci Rep. 2017; 7(1):42492. PMID: 28198403.
12. Choi YJ, Park MJ, Ko Y, Soh MS, Kim HM, Kim CH, et al. Artificial intelligence versus physicians on interpretation of printed ECG images: diagnostic performance of ST-elevation myocardial infarction on electrocardiography. Int J Cardiol. 2022; 363:6–10. PMID: 35691440.
13. Kim D, Hwang JE, Cho Y, Cho HW, Lee W, Lee JH, et al. A retrospective clinical evaluation of an artificial intelligence screening method for early detection of STEMI in the emergency department. J Korean Med Sci. 2022; 37(10):e81. PMID: 35289140.
14. Conger AJ. Integration and generalization of kappas for multiple raters. Psychol Bull. 1980; 88(2):322–328.
15. Fleiss JL. Measuring nominal scale agreement among many raters. Psychol Bull. 1971; 76(5):378–382.
16. Fleiss JL, Levin B, Paik MC. Statistical Methods for Rates and Proportions. 3rd ed. New York, NY, USA: John Wiley & Sons; 2003.
17. Chan KS, Chan YM, Tan AH, Liang S, Cho YT, Hong Q, et al. Clinical validation of an artificial intelligence-enabled wound imaging mobile application in diabetic foot ulcers. Int Wound J. 2022; 19(1):114–124. PMID: 33942998.
18. Kwon JM, Jung MS, Kim KH, Jo YY, Shin JH, Cho YH, et al. Artificial intelligence for detecting electrolyte imbalance using electrocardiography. Ann Noninvasive Electrocardiol. 2021; 26(3):e12839. PMID: 33719135.
19. Prajapati C, Koivumäki J, Pekkanen-Mattila M, Aalto-Setälä K. Sex differences in heart: from basics to clinics. Eur J Med Res. 2022; 27(1):241. PMID: 36352432.
20. Ochi H, Noda A, Miyata S, Skegawa M, Iwase M, Koike Y, et al. Sex differences in the relationships between electrocardiographic abnormalities and the extent of left ventricular hypertrophy by echocardiography. Ann Noninvasive Electrocardiol. 2006; 11(3):222–229. PMID: 16846436.