INTRODUCTION
Clinical laboratories currently generate large volumes of test results and report them to clinical departments. They continuously work to ensure the quality and reliability of these results through their own quality control schemes and operating algorithms.
Together with quality control, comparing serial laboratory results is important for monitoring patient condition, estimating responsiveness to therapy, and predicting when additional interventions are necessary [1, 2]. Clinical laboratories are therefore also required to provide additional information on the interpretability of changes in serial laboratory results, but it is not easy to set decision limits for changes from healthy conditions to morbid or disease status and vice versa. Biological variation is an important concept for explaining these changes [3], and its utility has been emphasized in many reports [4–9]. Biochemical analytes respond to metabolic changes, and these natural fluctuations can influence laboratory results even for analytes measured within a single day [1]. Analytical error should also be considered, including pre-analytical variables, analytical imprecision, and the possibility of random error. Analytical error can be minimized but cannot be eliminated. Thus, both biological variation and analytical error should be considered when establishing acceptable decision limits [5, 10].
The concept of reference change values (RCVs) based on biological variation is widely used for interpreting changes in serial laboratory results; it has been applied to clinical laboratory procedures such as delta check and auto-verification [1, 11–15]. However, the RCV still has limitations, as many biochemical analytes have high intra-individual variation [13, 14]. For these reasons, we sought to introduce the concept of the confidence interval (CI) into the interpretation of serial clinical chemistry test results. A CI gives the range within which the true value is expected to lie at a stated confidence level; a 95% CI covers the true value with 95% probability [16]. Overlapping CIs can indicate statistical significance when two measured results are compared: if the two CIs do not overlap, the results can be considered significantly different [16, 17]. To the best of our knowledge, no studies to date have examined the association between overlapping CIs and clinical chemistry test results.
The aims of this study were to: (1) introduce the concept of overlapping CIs into the interpretation of changes in serial clinical chemistry test results; (2) compare their clinical utility with that of conventional RCVs; and (3) determine the possibility of practical clinical application.
MATERIALS AND METHODS
Data collection
We retrospectively analyzed all the clinical chemistry test results obtained from 1,421,595 outpatients at Wonju Severance Christian Hospital, Wonju, Korea between January and December 2017. In total, 2,022,643 test results were obtained from outpatients. The number of paired results was 1,202,096 (59.4%). To minimize the effects of fluctuations due to treatment, intervention, or sudden physiological changes, we excluded the records of hospitalized and emergency room patients. The clinical chemistry test items included the following: albumin (ALB), alkaline phosphatase (ALP), alanine aminotransferase (ALT), aspartate aminotransferase (AST), blood urea nitrogen (BUN), complement 3 (C3), complement 4 (C4), calcium (CA), creatine kinase (CK), chloride (CL), total carbon dioxide (CO2), creatinine (CRE), C-reactive protein (CRP), direct bilirubin (DB), iron (FE), gamma-glutamyl transferase (GGT), glucose (GLU), high-density lipoprotein (HDL) cholesterol, immunoglobulin (Ig)A, IgG, IgM, potassium (K), lactate dehydrogenase (LDH), low-density lipoprotein (LDL) cholesterol, magnesium (MG), sodium (NA), inorganic phosphorus (P), rheumatoid factor (RF), total bilirubin (TB), total cholesterol (TCHO), triglyceride (TG), total protein (TP), and uric acid (UA). All tests were performed using the Cobas 8000 system (Cobas c 702 and e 601 modules; Roche Diagnostics, Basel, Switzerland). This study was approved by the Institutional Review Board (IRB) of Wonju Severance Christian Hospital (IRB No. CR318097), which waived the requirement for informed consent.
Patient data analysis
To calculate the changes in serial test results, pairs of test results from the same patient were collected. To remove the effect of the time order of the previous and current results, we used absolute values. The absolute value of the percent difference (absolute delta%) between the previous result and the current result was calculated as follows:
absolute delta% = |current result − previous result| / previous result × 100
The distributions of absolute delta% were assessed for all analytes tested in this study; the 97.5th percentile values of our data distribution for each item were also calculated.
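The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's analysis code: it assumes that the previous result is the denominator of the percent difference and that percentiles are taken by linear interpolation between order statistics. The function names and the example values are hypothetical.

```python
def absolute_delta_percent(previous: float, current: float) -> float:
    """Absolute percent difference between two serial results.

    Assumption: the previous result is used as the denominator.
    """
    if previous == 0:
        raise ValueError("previous result must be non-zero")
    return abs(current - previous) / abs(previous) * 100.0


def percentile(values, p: float) -> float:
    """p-th percentile (0-100) by linear interpolation between order statistics."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = p / 100.0 * (len(ordered) - 1)
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


# Hypothetical (previous, current) pairs for one analyte.
pairs = [(100.0, 110.0), (50.0, 25.0), (80.0, 80.0), (40.0, 46.0)]
deltas = [absolute_delta_percent(prev, curr) for prev, curr in pairs]
cutoff_975 = percentile(deltas, 97.5)
```

Other percentile conventions exist and give slightly different cut-offs for small samples; with data sets of the size described here the choice is unlikely to matter.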
Interpreting the changes in serial test results
The RCVs for each analyte were calculated using the following formula [1, 11–15]:
RCV = √2 × Z-score × √(CVA² + CVI²)
where RCV is the reference change value; Z-score is the number of standard deviations corresponding to the specified probability in the normal distribution (i.e., the Z-scores corresponding to 95% and 99% are 1.96 and 2.58, respectively); CVA is the analytical coefficient of variation obtained from the internal quality control program of our clinical chemistry laboratory; and CVI is the intra-individual variation defined by the European Federation of Clinical Chemistry and Laboratory Medicine [18] and the Westgard database of biological variation [3, 19].
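As an illustration of the RCV calculation, the following sketch assumes the commonly used form RCV = √2 × Z × √(CVA² + CVI²), with both coefficients of variation expressed in percent. The function name and the example CV values are hypothetical and are not taken from the study data.

```python
import math


def reference_change_value(cv_a: float, cv_i: float, z: float = 1.96) -> float:
    """RCV in percent from analytical (cv_a) and intra-individual (cv_i) CVs in percent."""
    return math.sqrt(2.0) * z * math.sqrt(cv_a ** 2 + cv_i ** 2)


# Hypothetical analyte with CVA = 3.0% and CVI = 4.0% (combined CV = 5.0%).
rcv_95 = reference_change_value(3.0, 4.0, z=1.96)
rcv_99 = reference_change_value(3.0, 4.0, z=2.58)
```

A serial change whose absolute delta% exceeds `rcv_95` would be flagged as significant at the 95% level under this formula.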
The concept of overlapping CIs was used to interpret the changes in serial laboratory results. The ranges of the CIs, which were also based on the biological variation database for each analyte, were calculated using the following formula:
CI = test result ± Ztest × √(CVA² + CVI²) / 100 × test result
where Ztest is the Z-score at which non-overlap of the CIs at a given confidence level rejects the null hypothesis.
Previous studies have noted that non-overlap of CIs at a confidence level of 83.4% indicates that two means are significantly different at the α=0.05 level [20, 21]. Following the same approach, we calculated that a confidence level of 93.2% corresponds to a significant difference between two means at the α=0.01 level. In addition, we compared the values obtained at the 95% confidence level. The Ztest values were 1.39, 1.82, and 1.96 at confidence levels of 83.4%, 93.2%, and 95%, respectively.
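One way to see where such confidence levels come from is the following sketch. It assumes two independent results with equal standard errors, so that non-overlap of two intervals of half-width Z × SE is equivalent to a difference larger than 2 × Z × SE, while a two-sided test at level α requires a difference larger than Zα × √2 × SE. This derivation and the function name are illustrative assumptions; the text does not state how its values were computed.

```python
import math
from statistics import NormalDist


def overlap_z_and_level(alpha: float):
    """Return (z_test, confidence_level) such that non-overlap of the two
    intervals matches a two-sided test at significance level alpha.

    Assumes independent results with equal standard errors.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1.0 - alpha / 2.0)  # e.g. about 1.96 for alpha = 0.05
    z_test = z_alpha / math.sqrt(2.0)                # from 2 * z_test = sqrt(2) * z_alpha
    level = 2.0 * std_normal.cdf(z_test) - 1.0       # two-sided coverage of each interval
    return z_test, level


z_05, level_05 = overlap_z_and_level(0.05)
z_01, level_01 = overlap_z_and_level(0.01)
```

With unequal standard errors the matching confidence level differs, so these values should be read as the equal-variance case only.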
Development of the monitoring system
We developed a monitoring system for the changes in serial laboratory results using the concept of overlapping CIs and integrated it into our laboratory information system (LIS) technology. Details of the criteria and formulae for interpreting the changes in serial test results are described in the ‘Interpreting the changes in serial test results’ section.
Fig. 1 provides an example of the results obtained using this new system. The 95% CIs of the current and previous test results, which reflect biological variation, are displayed and compared. If the 95% CIs do not overlap, the two results are flagged as significantly different from each other. For example, the 95% CI for TG was 65.1–148.9 for the initial result and 158.8–363.2 for the follow-up result. As these CIs do not overlap, the “95% CI_O” sign was displayed, indicating a significant increase. Additionally, through the administration menu of our system, we input the CVA values, which were obtained from our annual internal quality control data. CVI is a fixed value for each test item, but it can be changed if the Westgard database is updated.
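The overlap check in the TG example can be sketched as follows. This is a minimal illustration under stated assumptions, not the LIS implementation: the interval is taken as result ± Z × √(CVA² + CVI²)% of the result, the two results (107 and 261) are the midpoints of the intervals quoted above, and the CVA of 1.8% and CVI of 19.9% are illustrative inputs chosen because they approximately reproduce the quoted interval widths. All names are hypothetical.

```python
import math


def confidence_interval(result: float, cv_a: float, cv_i: float, z: float):
    """Interval around a single result; cv_a and cv_i are in percent."""
    half_width = z * math.sqrt(cv_a ** 2 + cv_i ** 2) / 100.0 * result
    return result - half_width, result + half_width


def intervals_overlap(first, second) -> bool:
    """True if two closed intervals (low, high) share at least one point."""
    return first[0] <= second[1] and second[0] <= first[1]


# Illustrative inputs (assumptions, see the paragraph above).
CV_A, CV_I, Z_95 = 1.8, 19.9, 1.96
previous_ci = confidence_interval(107.0, CV_A, CV_I, Z_95)
current_ci = confidence_interval(261.0, CV_A, CV_I, Z_95)

# No overlap and a higher current result would be flagged as a significant increase.
significant_increase = (not intervals_overlap(previous_ci, current_ci)
                        and current_ci[0] > previous_ci[1])
```

Because the half-width scales with the result itself, the interval around the higher result is wider in absolute terms, which is why the two intervals in the example have different widths.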
Statistical analysis
Normality of the data distributions was assessed using the Kolmogorov-Smirnov test, with P>0.05 taken to indicate a normal distribution. As the distributions of absolute delta% were not normal, they were presented as median and interquartile range (IQR). The percentages of test results exceeding the RCV and those in which the CIs did not overlap were normally distributed and were presented as mean and 95% CI. All statistical analyses were performed using SPSS version 23.0 (IBM Corp., Armonk, NY, USA) and the Analyse-it version 5.01 (Analyse-It Software, Ltd., Leeds, UK) add-on in Microsoft Excel 2016 (Microsoft Corp., Redmond, WA, USA).
RESULTS
The numbers and percentages of the 33 clinical chemistry test items are listed in Table 1. The values of CVA, CVI, inter-individual variation (CVG), and indexes of individuality (the ratio of CVI to CVG) [22] are also listed in Table 1. The statistical characteristics of the absolute delta% for each analyte are summarized in Table 2.
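For reference, the index of individuality is a simple ratio and can be computed as in the sketch below. The function name and the example CV values are hypothetical; the cut-offs of 0.6 and 1.4 are the ones referred to later in the Discussion.

```python
def index_of_individuality(cv_i: float, cv_g: float) -> float:
    """Ratio of intra-individual (cv_i) to inter-individual (cv_g) variation."""
    if cv_g <= 0:
        raise ValueError("cv_g must be positive")
    return cv_i / cv_g


# Hypothetical analyte with CVI = 5.0% and CVG = 20.0%.
ii = index_of_individuality(5.0, 20.0)
low_individuality_index = ii < 0.6    # lower cut-off referred to in the Discussion
high_individuality_index = ii > 1.4   # upper cut-off referred to in the Discussion
```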
Many analytes showed right-skewed patterns in the histogram; however, some analytes (ALT, CK, CRP, DB, FE, and GGT) showed relatively high proportions of results with an absolute delta% exceeding 100%. The distributions of absolute delta% for each analyte are illustrated in Supplemental Data Fig. S1.
Table 3 compares the percentages of test results exceeding the RCV with those in which the CIs do not overlap. For CA and TB, when the cut-off was changed from RCV95% to RCV99%, the percentage of test results exceeding the RCV was reduced by more than half. The percentages of non-overlapping CIs also varied according to the analyte; however, the overall tendency was similar to that of the RCVs. The distributions of percentages exceeding RCV95% and RCV99% were similar to those of non-overlapping 83.4% CIs and 93.2% CIs, respectively. However, the excess rates were slightly lower when using the CI than when using the RCV. In particular, for CRP, the decrease in the excess rate was more pronounced when using the CI than when using the RCV.
Additionally, we examined how the percentage of test results in which the CIs did not overlap changed after correction using the reference interval. For some analytes, such as CRE, CRP, DB, and TB, a change in absolute value of <1 may be overestimated owing to the large percent change it produces. Additionally, for analytes for which most test results are within the reference interval, monitoring changes in serial results might be less important. We therefore performed further analysis only if at least one of the previous and current test results deviated from the reference interval, as shown in Table 3. After correction using the reference interval, most of the percentages decreased.
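The reference-interval correction amounts to a filter on the result pairs. The sketch below shows one possible reading, in which a pair is kept when at least one of its two results lies outside the reference interval. The function names, the interval limits, and the example pairs are hypothetical.

```python
def outside_reference_interval(value: float, lower: float, upper: float) -> bool:
    """True if a result lies below the lower or above the upper reference limit."""
    return value < lower or value > upper


def keep_pair(previous: float, current: float, lower: float, upper: float) -> bool:
    """Keep a pair only if at least one of its results deviates from the interval."""
    return (outside_reference_interval(previous, lower, upper)
            or outside_reference_interval(current, lower, upper))


# Hypothetical reference interval and (previous, current) pairs.
LOWER, UPPER = 0.5, 1.2
pairs = [(0.8, 0.9), (0.9, 1.5), (1.4, 1.0), (0.4, 0.3)]
corrected_pairs = [pair for pair in pairs if keep_pair(pair[0], pair[1], LOWER, UPPER)]
```

Pairs in which both results fall inside the interval, such as the first one here, are dropped before the non-overlap percentages are recalculated.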
DISCUSSION
In this study, we attempted to introduce a new method that considers both CVA and CVI to replace the existing concept of RCV used for monitoring changes in serial clinical chemistry results. The 95% CI covers the true value with 95% probability. However, the 95% CIs of two means can overlap even if the two means are significantly different at the α=0.05 level [20, 21]. Austin and Hux [20] demonstrated that two means are statistically significantly different at the α=0.05 level when the ratio of overlapping CIs between two means is <29%. Furthermore, Knol, et al. [21] found that the probability of a type 1 error is 0.0056, rather than 0.05, when the 95% CIs of the two means do not overlap. Therefore, the CI level must be chosen so that non-overlap corresponds to a specified type 1 error probability. When we calculated the CI corresponding to a specific type 1 error probability according to the method introduced by Knol, et al. [21], a CI level of 83.4% corresponded to a type 1 error probability of 0.05 and a CI level of 93.2% corresponded to a type 1 error probability of 0.01. As mentioned in the Results section, the distributions of percentages exceeding RCV95% and RCV99% were similar to those of non-overlapping 83.4% CIs and 93.2% CIs, respectively. The percentages of results in which the 95% CIs did not overlap were lower than those exceeding RCV95%, RCV97%, and RCV99%.
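The relationship between the CI level and the type 1 error implied by non-overlap can be checked numerically. The sketch below assumes two independent results with equal standard errors, in which case non-overlap of two intervals of half-width Z × SE is equivalent to a standardized difference larger than √2 × Z. The function name is hypothetical, and the equal-standard-error setting is an assumption of this sketch rather than a statement of the cited method.

```python
import math
from statistics import NormalDist


def type1_error_for_nonoverlap(confidence_level: float) -> float:
    """Two-sided type 1 error implied by declaring a difference whenever two
    intervals at the given confidence level (e.g. 0.95) do not overlap.

    Assumes independent results with equal standard errors.
    """
    std_normal = NormalDist()
    z_ci = std_normal.inv_cdf(0.5 + confidence_level / 2.0)  # half-width in SE units
    z_diff = math.sqrt(2.0) * z_ci                           # threshold for the difference
    return 2.0 * (1.0 - std_normal.cdf(z_diff))


alpha_from_834 = type1_error_for_nonoverlap(0.834)
alpha_from_95 = type1_error_for_nonoverlap(0.95)
```

Under these assumptions, 83.4% intervals give a type 1 error close to 0.05, and 95% intervals give a considerably smaller one, which is why non-overlap of 95% CIs is a more conservative criterion.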
In this study, most of the 97.5th percentile cut-off values for each analyte were larger than the corresponding RCVs, indicating that intra-individual fluctuations in biochemical analytes were greater than the values from the biological variation database. This is consistent with previous reports that the utility of RCV may be somewhat limited in clinical settings [14]. Additionally, we found that the distribution of delta% for each analyte tended to be proportional to CVI. On the other hand, the indexes of individuality were <0.6 for many analytes, and no analyte showed a value >1.4. This finding is consistent with the fact that CVI is much smaller than CVG [2, 13, 23] and indicates that analysis of changes in serial clinical chemistry results is more important for interpreting patient condition and progress than comparison with conventional reference values [11, 24]. RCV has limited significance, especially for analytes with a large CVI value [13, 14]. However, because it accounts for each CVI, using overlapping CI levels to compare two serial values can be clinically meaningful for monitoring patients.
We also analyzed how the percentage of test results without overlapping CIs changes after correction using the reference interval. Additional analysis showed an overall decrease in percentage in most analytes. However, as the significance of the reference interval may vary from analyte to analyte, application should be conducted according to the characteristics of each analyte or the clinical situation of each laboratory.
This study has some limitations. First, we applied this monitoring system only to outpatients. Hospitalized patients undergo dynamic changes, and their blood analyte levels are significantly altered by treatment, intervention, or sudden physiological changes. Furthermore, the number of blood tests is much higher for hospitalized patients than for outpatients. As hospitalized patients require increased monitoring and analysis results, the LIS server capacity is a huge barrier against actual application. We intend to apply this system to hospitalized patients in the near future.
Second, the levels of some analytes fluctuate in a time-dependent manner. Therefore, for these analytes, time differences should be considered [25]. However, we did not consider the time variable in our monitoring system and compared serial laboratory results only by statistical methods.
Third, some of the CVI values differ significantly between studies or publications, especially for analytes with a large CVI value [26]. Means, medians, or the most frequently used values could be used for different purposes. However, as there are no criteria for verifying or correcting them, we used the values in the Westgard database.
Fourth, we did not use SI units for some analytes. Fig. 1 is an actual screen capture of our LIS. The units are assigned according to the requests of clinicians and the judgement of laboratory physicians in our laboratory. Therefore, we could not convert the units to SI units.
Fifth, the concepts of RCV and overlapping CIs are similar in that they consider both CVA and CVI. However, our monitoring system provides the CI ranges of two serially measured results. Clinicians and patients can easily see whether a change is statistically significant by visually comparing the 95% CI ranges. Additionally, as demonstrated in the above sections, RCV95% corresponds to the 83.4% CI overlap and RCV99% corresponds to the 93.2% CI overlap. We hypothesize that 95% CI overlap broadens the statistically significant ranges and these two-sided comparisons are expected to overcome the limitations of existing RCVs, reflecting the variability of each test result, especially for analytes with large intra-individual variation. The concept of overlapping CIs has been used in some clinical studies [20, 21]. We have applied this concept to an LIS for the first time; thus, there is no convincing clinical evidence to date. Further studies are required that apply this concept in real clinical situations, using actual clinical data reflecting patient disease status and response to treatment, and that validate its clinical utility.
In conclusion, we applied the concept of overlapping CIs to interpret the changes in serial clinical chemistry test results. We suggest that this strategy can overcome the limitations of existing RCVs and replace them, especially for analytes with a large CVI value.