Yonsei Med J, v.48(2)

Bartolucci: Meta-Analysis: Some Clinical and Statistical Contributions in Several Medical Disciplines

Abstract

Meta-analysis in its present form, the statistical integration of information from several studies that share a common underlying theme, has been around for over 25 years. The medical field has seen many attempts by many investigators to pull summary data together from various sources within a discipline with the goal of making some definitive statement about the state of the science in that discipline. Likewise, authors of manuscripts typically use the background and rationale section of their paper to summarize what they believe to be the state of affairs up to the time of the presentation of their own results. The new data and results they present in their current publication are an attempt to update the progress in that field. Thus, in a sense, they have performed a partial meta-analysis by summarizing information from the past, presenting their added contribution and thus updating the knowledge base. They have not quite integrated past data in a rigorous statistical way with their new data, but have merely used the data history to justify their current research, which pretty much stands on its own. Meta-analysis, then, is an after-the-fact attempt to pull together the current knowledge base, whether it be publications or raw data, to present a statistical synthesis of all the information, and to reach a conclusion as to the best treatment or intervention strategy based on all these past contributions. It is now time to look back at some of these meta-analyses and determine what contributions, if any, they have made to the knowledge base within certain medical disciplines. Many disciplines, including psychiatry, have been visited by meta-analysis. We examine some of these studies in the areas of oncology, orthopedics, psychiatry, pediatrics and cardiology. The purpose is to determine, given the information presented, what contributions, statistical challenges and peripheral issues in these disciplines have been brought to light by these meta-analyses.

INTRODUCTION

The goal of a meta-analysis is to present a quantitative synthesis of randomized clinical trials, usually motivated by the fact that past studies on a particular therapy or intervention are inconclusive, moderate or controversial. The ideal approach is a meta-analysis of pooled data, in which one obtains individual patient or subject data from the studies of interest and performs a comprehensive analysis of the combined results, including subset analyses, covariate associations, and other analyses of interest. The more common meta-analysis combines the results (summary statistics such as means, standard errors, odds ratios, hazard ratios, etc.) of available studies that examined the same question, such as the effect of aspirin versus no aspirin in the prevention of cardiovascular disease. There are many references describing the goals, successes and limitations of meta-analyses.1-4 It should also be noted that meta-analysis is usually the last step in a systematic review, in which one identifies relevant publications, evaluates their quality and then performs the analytic synthesis of their results. The procedures for determining the quality of studies and their inclusion in the meta-analysis are quite comprehensive and involve such items as well defined endpoints, units of analysis and their detailed description, well defined eligibility criteria, proper randomization or treatment assignment, and adequacy of follow up in a longitudinal study. Such criteria are well explained in most articles presenting a systematic review and meta-analysis strategy.5 The purpose of this article is to touch upon several disciplines in medicine, including psychiatry, and to discuss some major contributions made by meta-analysis as well as the challenges it has raised. This is not a comprehensive review; that would be impossible given the number of meta-analytic studies in the literature and the number of disciplines studied.
We take some of the work that has been done and look at the contributions made by meta-analysis.
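To make the summary-statistic approach described above concrete, the following is a minimal sketch of inverse-variance (fixed effect) pooling, one common way of combining study-level estimates. It is not taken from any of the studies discussed in this article. The three log odds ratios and their standard errors are hypothetical values chosen only for illustration, and the function name pool_fixed_effect is our own.

```python
import math


def pool_fixed_effect(log_effects, std_errors):
    """Inverse-variance (fixed effect) pooled estimate on the log scale."""
    # Each study is weighted by the reciprocal of its variance.
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, log_effects)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Hypothetical log odds ratios and standard errors for three studies.
log_or = [math.log(0.80), math.log(0.90), math.log(0.70)]
se = [0.20, 0.10, 0.25]

pooled, pooled_se = pool_fixed_effect(log_or, se)
low = math.exp(pooled - 1.96 * pooled_se)
high = math.exp(pooled + 1.96 * pooled_se)
print(f"pooled OR = {math.exp(pooled):.3f}, 95% CI {low:.3f} to {high:.3f}")
```

The pooled standard error is always smaller than that of any single contributing study, which is the sense in which combining summary statistics refines the precision of the estimated treatment effect.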

RESULTS

Five studies were evaluated for recent updates in the medical field. They covered the disciplines of oncology, orthopedics, psychiatry, pediatrics, and cardiology. Each meta-analysis was evaluated for clearly and realistically stated objectives, well defined inclusion and exclusion criteria, reasonable statistical strategies, justifiable conclusions, overall contribution to the knowledge base in that discipline, and any novel features presented in the article that contribute to the validity of the results.
One of the more noted meta-analyses was the pooled analysis resulting in the publication, "Tamoxifen for early breast cancer: an overview of the randomized trials" by the Early Breast Cancer Trialists' Collaborative Group.6 The objective was to determine the effectiveness of tamoxifen on survival of women with early stage breast cancer. The inclusion criteria were clearly stated and involved cancer restricted to the breast, or to the breast and positive lymph nodes, that was removed surgically. The only issue here is that micro metastases might still remain, and this is not easily discernible. As the inclusion criteria are rather strict, the exclusion criteria are not necessarily relevant for this particular analysis, since they were incorporated into the individual randomized trial protocols. This study was actually a pooled analysis, as defined above in the Introduction, which included 37,000 women from 55 randomized trials. This was an ITT (intent to treat) analysis, as should be the case for a meta-analysis. The results are rather comprehensive, taking into account the various numbers of years that subjects were on tamoxifen (1, 2 or 5, for example) with a weighting factor for such. Proportional benefits and absolute benefits were couched in terms of the odds ratio or hazard ratio and the reduction of the death rate at a particular point in time, respectively. The endpoints were well defined for comparing tamoxifen to a control in terms of time to recurrence and mortality due to any cause. The primary statistic was the log rank test, and several Kaplan Meier curves were presented showing the clear advantage of tamoxifen over time. The conclusions appeared to be well justified for the various cohorts to which they applied. That is, the overall contribution of this work appears to affirm that for women who are ER (estrogen receptor) positive or of unknown ER status, several years of tamoxifen improves the 10 year survival.
The proportional reduction in breast cancer recurrence and in mortality is largely unaffected by other patient characteristics or other supportive treatments. The Forest plots, which display the odds ratios and 95% confidence intervals for all studies included in the analysis, clearly show that the odds of failure are much reduced for tamoxifen relative to the control or non tamoxifen arm. The summary odds ratio for all studies combined is rather pronounced in favor of tamoxifen as well. For women who are ER negative, the authors state that administration of adjuvant tamoxifen is a "matter of research". Other conclusions drawn from this meta-analysis were that adjuvant tamoxifen may produce substantial benefit for women aged 50 to 69 and those aged 70 or more. However, contrary to earlier reports,7-11 benefit to women younger than 50 was also seen in this study. One of the most attractive features of this study was the ability to continue to update the analysis periodically by updating the survival and recurrence status of the participants. This is clearly an advantage of doing a meta-analysis of pooled data. Another feature of this study which contributes to the knowledge base is the ability, with long term follow up, to determine the occurrence of second primaries such as endometrial cancer and colon cancer, as were found in these data. This provides an opportunity to examine the associations, if any, of these second malignancies with treatment duration.
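Each row of a Forest plot of the kind described above is an odds ratio with its 95% confidence interval. As a minimal sketch of how one such row can be computed from a two by two table, the following uses the standard large sample interval on the log odds ratio scale. This is an assumption made for illustration and is not the method of the tamoxifen overview, which worked from individual patient data and log rank statistics. The trial labels and counts are hypothetical, and odds_ratio_ci is our own function name.

```python
import math


def odds_ratio_ci(events_trt, n_trt, events_ctl, n_ctl, z=1.96):
    """Odds ratio of failure and its log-scale confidence interval."""
    a, b = events_trt, n_trt - events_trt  # treated: failures, non-failures
    c, d = events_ctl, n_ctl - events_ctl  # control: failures, non-failures
    if min(a, b, c, d) <= 0:
        raise ValueError("every cell must be positive for this simple formula")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(odds_ratio) - z * se_log_or)
    high = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, low, high


# Hypothetical trials: (label, failures on treatment, n, failures on control, n).
trials = [
    ("Trial A", 30, 200, 45, 200),
    ("Trial B", 12, 150, 20, 150),
    ("Trial C", 60, 500, 85, 500),
]

for label, et, nt, ec, nc in trials:
    or_, low, high = odds_ratio_ci(et, nt, ec, nc)
    print(f"{label}: OR = {or_:.2f} (95% CI {low:.2f} to {high:.2f})")
```

An odds ratio below 1 favors the treatment arm, and an interval that excludes 1 corresponds to a statistically significant result for that single trial at the 5% level.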
In the area of orthopedics, a Norwegian review of the short-term efficacy of pharmacotherapeutic interventions in osteoarthritic knee (OAK) pain12 resulted in a meta-analysis of 63 randomized placebo-controlled trials involving 14,060 patients. This group made use of the Cochrane Controlled Trials Register. The Cochrane Collaboration is geared to providing reliable evidence of the effectiveness of health care through systematic reviews of randomised controlled trials (RCTs), and stresses the importance of prospectively registering trials so that the evidence assessed and material presented is complete and unbiased. This is an excellent resource for those wishing to pursue a meta-analysis. The OAK research was carefully conducted, including appropriate diagnoses, randomized placebo controlled trials with well defined outcome measures, heterogeneity testing (testing for differences across studies that may affect the results) and overall study appraisal for inclusion in the meta-analysis. The detailed protocol for inclusion of studies was specified prior to analysis and included a three step reviewing procedure: locating randomized placebo controlled trials in which patients were treated with specific interventions for knee osteoarthritis, evaluating methodology according to specific predefined criteria, and calculating the pooled effect with an appropriate weighting scheme. The results were expressed as mean difference effect sizes, together with secondary time effect profiles over several weeks. This following of subjects over time is critical to truly assess the clinical significance of the results. Given this procedure, the overall conclusions were justified: there was a short term effect of therapy that was not maintained over time. There were several major contributions noted by this meta-analysis. The authors mention patient selection bias in several oral NSAID (non steroidal anti inflammatory drug) trials.
Apparently there was exclusion of non responders in several of the studies. That is to say, the patients enrolled were mainly regular NSAID users who were required to show a minimum increase in disease activity after pre trial NSAID discontinuation. This had a tendency to favor oral NSAID intervention compared to other treatment strategies. The authors also point out that only trials with oral and topical NSAIDs were significantly heterogeneous; these analyses were appropriately performed with a random effects model, while other interventions were analyzed with fixed effects models. Also, for opioid trials a large dropout rate was noted, causing the last observation to be carried forward (LOCF) in several longitudinal trials, which inflated the response at later times. Statistically, using LOCF often causes one to make unwarranted assumptions about missing data, yielding an inaccurate estimate of the variance covariance structure in the model and either underestimating or overestimating the treatment effects. Although the authors do not concern themselves with these statistical details, they are astute enough to realize the practical shortcomings of the LOCF methodology. At any rate, they are thorough in their conclusion that the short term statistical significance of some of the non placebo interventions does not translate into long term clinical significance.
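The way in which carrying the last observation forward can inflate the response at later times is easy to see in a small sketch. The data below are entirely hypothetical: four patients with pain reduction from baseline recorded at four visits, where None marks a visit missed after dropout. The function names locf and mean_at_visit are our own, and the sketch does not reproduce any analysis from the trials reviewed above.

```python
def locf(series):
    """Replace each missing visit (None) with the last observed value."""
    filled = []
    last = None
    for value in series:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def mean_at_visit(patients, visit):
    """Mean of the values actually available at one visit index."""
    values = [p[visit] for p in patients if p[visit] is not None]
    return sum(values) / len(values)


# Hypothetical pain reduction (larger is better) at weeks 2, 4, 8 and 12.
# Two patients complete the trial and their response wanes over time;
# two drop out early, while their recorded response is still high.
patients = [
    [20, 15, 8, 5],
    [18, 14, 9, 6],
    [22, None, None, None],
    [24, 20, None, None],
]

completers_week12 = mean_at_visit(patients, 3)
locf_week12 = mean_at_visit([locf(p) for p in patients], 3)
print(f"week 12 mean, observed data only: {completers_week12}")
print(f"week 12 mean, after LOCF:         {locf_week12}")
```

In this constructed example the early, favorable values of the dropouts are counted as if they had persisted to week 12, so the week 12 mean after LOCF is larger than the mean among those actually observed.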
Meta-analysis has come into play in the area of psychotherapy.13 The authors of this particular study wished to evaluate the efficacy of psychotherapy, i.e. cognitive behavioral therapy (CBT), for childhood anxiety disorders excluding posttraumatic stress disorder (PTSD) and obsessive compulsive disorder (OCD). The studies included were required to have investigated the efficacy of a specific treatment for anxiety disorder in children against a control condition or a credible psychotherapeutic treatment. The literature search included published, peer reviewed randomized studies. The inclusion criteria provided for diagnoses that met the Diagnostic and Statistical Manual of Mental Disorders (DSM) or the International Classification of Diseases (ICD). The studies also had to meet the basic CONSORT (consolidated standards of reporting trials) criteria. Studies with fewer than 10 patients were excluded due to lack of power. To be included, studies had to be published by March 2005. As a result, 24 studies met these criteria out of a pool of 36 treatment outcome studies on anxiety disorders in children and adolescents. We note that if this had been a pooled analysis involving the raw subject data, then the condition of a minimum of 10 subjects would not have been an issue. This meta-analysis actually used a multidimensional approach in that not only were mean effect sizes used for continuous outcomes comparing CBT to a waiting list control condition, but percent recovery, defined as no longer meeting the diagnostic criteria for the principal pretreatment anxiety disorder, was also considered. The analysis also took into account the intent to treat sample as well as those completing the study. Follow up, when feasible, was examined for the lasting effects of therapy. Many of the results were presented as two sided confidence intervals, with 3 childhood measures as the primary endpoints as well as an overall measure.
The conclusions appeared to be well justified, although it would have been nice to have the exact p-values of the intervention comparisons incorporated into the results. One of the nice features of this study is that the authors included a fail safe analysis, in which they projected the number of studies with an effect size of 0 that would be needed to substantially reduce the mean effect size of the overall result. This addresses somewhat the issue of publication bias, in which only positive studies, i.e. those showing a statistically significant advantage of the intervention of interest (in this case CBT), are published. The contribution appears to be that CBT is effective in treating childhood anxiety disorder. The completion rate of the studies appears to be rather good, in the range of 83% to 86%. Although PTSD was not a diagnosis considered in this young group of subjects, it is of interest in adult subjects, and one study14 presenting a multidimensional meta-analysis of psychotherapy for PTSD did show results suggesting that psychotherapy for PTSD leads to a large initial improvement from baseline measures. Its authors also reported other metrics besides effect size, which showed a more varied account of outcome. The methodology allowed them to generalize their results further to this diagnostic group. The results of these two studies lead one to believe that in certain psychiatric diagnoses some cognitive or psychotherapeutic intervention is certainly worth considering.
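The fail safe idea described above, asking how many unpublished studies with an effect size of 0 it would take to pull the mean effect down to some small criterion value, can be sketched with Orwin's formula. The article reviewed does not tell us here which fail safe formula its authors used, so the choice of Orwin's version is our assumption, as are the function name orwin_fail_safe_n and the illustrative mean effect size of 0.8 and criterion of 0.2. Only the count of 24 studies comes from the text.

```python
def orwin_fail_safe_n(k, mean_effect, criterion_effect, missing_effect=0.0):
    """Number of missing studies needed to reduce the mean effect size.

    k                 number of studies in the meta-analysis
    mean_effect       observed (unweighted) mean effect size
    criterion_effect  smaller mean effect regarded as negligible
    missing_effect    assumed mean effect of the missing studies
    """
    if not missing_effect < criterion_effect < mean_effect:
        raise ValueError("need missing_effect < criterion_effect < mean_effect")
    return k * (mean_effect - criterion_effect) / (criterion_effect - missing_effect)


# 24 studies as in the text; the two effect sizes are hypothetical.
n_missing = orwin_fail_safe_n(k=24, mean_effect=0.8, criterion_effect=0.2)
print(f"null studies needed: about {round(n_missing)}")

# Check: the mean over the published and the missing null studies together.
combined_mean = (24 * 0.8 + n_missing * 0.0) / (24 + n_missing)
print(f"mean effect after adding them: {combined_mean:.3f}")
```

A large fail safe number relative to the number of published studies suggests that the overall result is unlikely to be an artifact of unpublished null studies alone, although it does not rule out other forms of bias.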
One of the medical issues appearing to take center stage at times is that of determining the amount of antibiotic usage for a particular indication in medical practice. A meta-analysis addressing a similar issue15 was written to determine whether long course antibiotic therapy was more effective than short course therapy in the treatment of urinary tract infection (UTI) in children. The authors searched online using Medline and the Cochrane Clinical Trials Registry and discovered 16 studies that met the inclusion criteria. That is to say, candidate studies were restricted to RCTs comparing short term (<=3 days) and long term (7-14 days) outpatient therapy for acute UTI in children ages 0 to 18 years. Study quality was evaluated using a 9 item scoring system developed by the investigators. For the sake of example it is worth noting the criteria for the 9 item scale, since they demonstrate the type of detail required in many meta-analyses to ensure both study quality and consistency. These included exclusion of children with anatomic and/or functional urinary tract abnormalities; ability to distinguish between upper and lower tract infection by listed signs and symptoms; UTI defined by symptoms and bacteriologic findings; distinction between persistent infection, relapse with the same organism and re infection with a different organism; placebo-control for the short course arm; blinding; method of subject allocation or randomization; intent to treat analysis; and equal duration of follow up for treatment and control groups. The results were fairly straightforward in that the authors examined the relative risk (RR) of treatment failure of short term vs. long term therapy. Setting the odds ratio as a dependent variable, they used a random effects regression model to determine if any study effect could be contributing to heterogeneity. The study quality rating or score was non significant. The article used good visuals such as Forest plots to demonstrate the RR.
Funnel plots were referenced, but not shown, as indicating that there was no publication bias (p=0.22). When possible the intent to treat analysis was used. The conclusion was that long term (7 to 14 days) administration of antibiotics was associated with fewer treatment failures. Some of the cautionary features included distinguishing between lower and upper tract infection. Three studies did not meet the criterion of lower tract infection and thus were eliminated for a more interpretable result. The interesting features of quality scoring and use of the random effects regression model to focus on sources of heterogeneity were well justified. The authors also noted that all but 2 of the studies showed no statistically significant difference. Meta-analysis is well suited for this type of investigation. One could interpret those non significant results to imply that short course therapy was at least as effective as long course therapy. A more plausible argument is that there may have been a difference, but the sample size in the individual studies was insufficient to allow for statistical significance or, as some would say, the studies did not have the power to detect this difference. As the authors note, when one properly combines information from many smaller studies one gains sufficient power to show that short term therapy was indeed associated with a statistically significant increase in treatment failures compared to long course therapy. Lastly, it is important to note that when examining the re infection rate for 1 day vs. 3 day therapy, the RR was not statistically significant.
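The use of a regression model to ask whether a study level feature, such as the quality score, explains heterogeneity can be sketched as a weighted least squares regression of each study's log effect on that feature. The sketch below is a simplified fixed effect version, with weights equal to the inverse of each study's variance and no between study variance term, so it is an assumption made for illustration rather than the model the authors fitted. The log relative risks, standard errors and quality scores are hypothetical, and weighted_meta_regression is our own function name.

```python
import math


def weighted_meta_regression(effects, std_errors, covariate):
    """Inverse-variance weighted regression of effect on one covariate.

    Returns (intercept, slope, slope_se). The standard error of the slope
    treats the within-study variances as known.
    """
    w = [1.0 / se ** 2 for se in std_errors]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, covariate)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, covariate))
    if sxx <= 0:
        raise ValueError("covariate must vary across studies")
    sxy = sum(
        wi * (xi - x_bar) * (yi - y_bar)
        for wi, xi, yi in zip(w, covariate, effects)
    )
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    slope_se = math.sqrt(1.0 / sxx)
    return intercept, slope, slope_se


# Hypothetical studies: log relative risk, its standard error, quality score (0-9).
log_rr = [0.55, 0.30, 0.62, 0.41, 0.48]
se = [0.30, 0.25, 0.40, 0.20, 0.35]
quality = [4, 7, 3, 8, 6]

intercept, slope, slope_se = weighted_meta_regression(log_rr, se, quality)
z = slope / slope_se
print(f"slope per quality point = {slope:.3f} (SE {slope_se:.3f}, z = {z:.2f})")
```

A slope whose z statistic is small in absolute value, conventionally below 1.96, would be read as no significant association between the quality score and the estimated effect, which is the kind of finding reported above for the quality rating.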
The last meta-analysis we consider addresses the ongoing discussion of the role of aspirin therapy in the prevention of cardiovascular events.16 Aspirin is an antiplatelet agent that has been shown to be effective for the primary and secondary prevention of cardiovascular events. With the recent completion of the Women's Health Study (WHS),17 there are six trials that have addressed the question of the benefits of aspirin in the primary prevention of cardiovascular events: the Physicians' Health Study (PHS),18 British Doctors Trial (BDT),19 Hypertension Optimal Treatment Trial (HOT),20 Primary Prevention Project (PPP),21 Thrombosis Prevention Trial (TPT)22 and the WHS. Meta-analyses of the first five trials demonstrated a positive outcome for total coronary heart disease (CHD) events and nonfatal myocardial infarctions (MI), but not for cardiovascular (CV) death, total stroke or all cause mortality. The aim of this analysis was to add data from the WHS in order to better understand the meta-analytical contribution of all six trials.
The literature search was by design complete, as it involved these 6 major controlled trials. The computed odds ratios demonstrated statistical superiority of aspirin intervention in 3 categories: 1) total CHD, defined as nonfatal and fatal MI and death due to CHD, 2) non fatal MI, defined as confirmed MI that did not result in death, and 3) total CV events, a composite of CV death, MI or stroke. The p-value in all three cases was p=0.001. There were nearly 93,000 subjects, and each of the studies was weighted for its sample size. The sources of heterogeneity were primarily the different study designs, the diversity of gender content across studies, and the mix of subjects at low to high risk for coronary disease. The authors used a random effects model to adjust for this diversity. The primary endpoint considered for aspirin failure vs. non aspirin failure was the odds ratio (OR). Publication and small study bias were also discussed. Forest plots showed the reduction in odds of failure with aspirin vs. control or no aspirin intervention. This study came under scrutiny shortly after publication and was criticized for not including the Pulmonary Embolism Prevention (PEP)23 trial, which included 13,356 additional subjects with adequate data to determine the risk of non fatal MI for aspirin vs. a control. In the PEP study the OR was 1.33, p=0.05, in favor of the control. This result, when added to the 6 primary prevention trials16 mentioned here (resulting in 106,000+ subjects), yielded an overall OR=0.818 in favor of aspirin for non fatal MI, p=0.001. Thus the conclusion of this meta-analysis for non fatal MI was unaffected by the additional data. The new PEP result did add to the significance of the heterogeneity, but this was accommodated by a random effects model analysis overall. This is a prime example where one can continually update the results of a meta-analysis with new information to confirm, augment, or refute previous results.
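Updating a random effects analysis when a new trial becomes available can be sketched with the DerSimonian and Laird estimator, a widely used method for estimating the between study variance. The article reviewed above does not tell us here which random effects estimator was used, so this choice is an assumption. The six log odds ratios and all of the standard errors below are hypothetical and do not reproduce the published aspirin results; only the odds ratio of 1.33 for the added seventh study echoes the PEP figure quoted in the text. The function name dersimonian_laird is our own.

```python
import math


def dersimonian_laird(effects, std_errors):
    """Random effects pooled estimate (DerSimonian and Laird).

    Returns (pooled, pooled_se, q, tau2), where q is Cochran's
    heterogeneity statistic and tau2 the between-study variance.
    """
    k = len(effects)
    w = [1.0 / se ** 2 for se in std_errors]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    # Method of moments estimate, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    pooled_se = math.sqrt(1.0 / sum(w_star))
    return pooled, pooled_se, q, tau2


def report(label, effects, std_errors):
    pooled, pooled_se, q, tau2 = dersimonian_laird(effects, std_errors)
    low = math.exp(pooled - 1.96 * pooled_se)
    high = math.exp(pooled + 1.96 * pooled_se)
    print(f"{label}: OR = {math.exp(pooled):.3f} "
          f"(95% CI {low:.3f} to {high:.3f}), Q = {q:.2f}, tau^2 = {tau2:.4f}")


# Hypothetical log odds ratios and standard errors for six trials.
log_or = [math.log(x) for x in (0.60, 0.95, 0.85, 0.70, 0.68, 1.00)]
se = [0.10, 0.20, 0.12, 0.25, 0.15, 0.11]
report("six trials", log_or, se)

# Add a seventh trial whose odds ratio favors the control (SE hypothetical).
report("seven trials", log_or + [math.log(1.33)], se + [0.14])
```

Comparing the two printed lines shows how a discordant new study can increase the heterogeneity statistic and widen the interval while the random effects model continues to provide a pooled estimate.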

SUMMARY

The contributions of meta-analyses are beyond question, as we have seen from the studies we chose to investigate. The overall contribution of the work with tamoxifen6 appears to be to affirm that for women who are ER (estrogen receptor) positive or of unknown ER status, several years of tamoxifen improves the 10 year survival. Other patient characteristics, such as demographics and pre trial clinical status, as well as other supportive treatments leave unaffected the proportional reduction in breast cancer recurrence and in mortality. The odds ratios and 95% confidence intervals for all studies included in the analysis clearly show that the odds of failure are much reduced for tamoxifen relative to the control or non tamoxifen arm. The summary odds ratio for all studies combined is rather pronounced in favor of tamoxifen as well. For women who are ER negative, the addition of adjuvant tamoxifen may or may not be beneficial. In the study of knee arthritis12 the authors were thorough in their conclusion that the short term statistical significance of some of the non placebo interventions does not translate into long term clinical significance. This article brought to light the importance of the Cochrane Collaboration in examining available data for summary considerations, as well as the statistical concepts of heterogeneity, the weighting of evidence from the various studies, and the danger of an inflated response at later times introduced by carrying the last observation forward in studies with a large dropout rate. In the area of psychiatry13 one sees the importance of cognitive behavioral therapy (CBT) for childhood anxiety disorders excluding posttraumatic stress disorder (PTSD) and obsessive compulsive disorder. This study points out the importance of assuring correct and consistent diagnoses when combining studies, via the DSM, ICD and CONSORT criteria.
The multidimensional approach to measuring effectiveness by considering more than just one endpoint is stressed, and is actually confirmed in an earlier adult meta-analysis14 in PTSD which we reference above. The authors also introduce a fail safe analysis so as not to over inflate their results and to address in part the issue of publication bias. Upon examining another pediatric study, but in UTI,15 the authors make us aware of the importance of statistical concepts such as the study rating score, which determines if articles are of sufficiently high quality to be considered in a meta-analysis; the use of a random effects regression model to possibly determine sources of heterogeneity; and the importance of examining sub diagnoses, in this case lower UTI, to determine where the therapy considered most advantageous, such as longer term antibiotic therapy, may apply. The upside of this examination of subgroups in meta-analysis is that one may not be as closely tied to the issues of multiple statistical testing and p-value adjustment as one may be in a prospectively randomized trial. Another point made by the authors of this study is that their meta-analysis addressed the issue of too many smaller, underpowered studies being left on their own and thus not providing a definitive statistical answer on the efficacy of short term versus long term antibiotic therapy. The combination of these studies, if handled appropriately, can lead to the conclusion that the longer term therapy is most effective. The last study16 considered the role of aspirin in reducing the risks of cardiovascular events. This meta-analysis added the WHS to the other 5 studies that had been considered in previous meta-analyses of this issue.
The results were consistent in that aspirin was superior to a non aspirin control in 3 categories: 1) total CHD, defined as nonfatal and fatal MI and death due to CHD, 2) non fatal MI, defined as confirmed MI that did not result in death, and 3) total CV events, a composite of CV death, MI or stroke. The p-value in all three cases was p=0.001. There were nearly 93,000 subjects in this analysis. It also demonstrated that aspirin was not statistically advantageous in preventing stroke. The point of all this is that as studies become known they can be added to a meta-analysis and the results updated, provided the correct statistical safeguards are in place, such as checking for heterogeneity and its sources. Random effects models, as considered in this article, are in place to address the diversity that may exist across studies. The point was also made above that the authors did not include a large cohort of about 13,000 patients. However, when those data were added to the six studies the results were consistent. Again there is the ability to update the data and confirm or question established results.
Thus it is important to note that there are many features to the design and analysis of meta-analytic studies, all of which can affect the quality of their conclusions. An attempt was made here to discuss some of the meta-analyses in print and show the statistical considerations made by the authors to enhance the quality of their work. Having presented all this, one must not lose sight of the fact that the prospectively randomized clinical trial is still the mainstay of clinical research.23 It is within these trials that issues such as heterogeneity can be avoided and thus clinical questions can be addressed in the purest environment of a well designed study. Meta-analysis has the role of helping to integrate results from different trials with much the same objective. We often expect it will help to refine the precision of the treatment effect by integrating the results of several studies. At the very least, we hope that meta-analyses will confirm the trend in treatment performance.

References

1. Bartolucci A. The significance of clinical trials and the role of meta-analysis. J Surg Oncol. 1999. 72:121–123.
2. Bartolucci AA, Katholi CR, Singh KP, Alarcon GS. Issues in meta-analysis: an overview. Arthritis Care Res. 1994. 7:156–160.
3. Mosteller F, Chalmers C. Some progress and problems in meta-analysis of clinical trials. Stat Sci. 1992. 7:227–236.
4. Dickersin K, Berlin JA. Meta-analysis: state of the science. Epidemiol Rev. 1992. 14:154–176.
5. Counsell C. Formulating questions and locating primary studies for inclusion in systematic reviews. Ann Intern Med. 1997. 127:380–387.
6. Early Breast Cancer Trialists' Collaborative Group. Tamoxifen for early breast cancer: an overview of the randomized trials. Lancet. 1998. 351:1451–1467.
7. Early Breast Cancer Trialists' Collaborative Group. Effects of adjuvant tamoxifen and of cytotoxic therapy on mortality in early breast cancer. An overview of 61 randomised trials among 28,896 women. N Engl J Med. 1988. 319:1681–1692.
8. Early Breast Cancer Trialists' Collaborative Group. Treatment of early breast cancer, vol 1: worldwide evidence 1985-1990. 1990. 1. Oxford: Oxford University Press.
9. Early Breast Cancer Trialists' Collaborative Group. Systemic treatment of early breast cancer by hormonal, cytotoxic, or immune therapy. 133 randomised trials involving 31,000 recurrences and 24,000 deaths among 75,000 women. Lancet. 1992. 339:71–85.
10. Early Breast Cancer Trialists' Collaborative Group. Effects of radiotherapy and surgery in early breast cancer. An overview of the randomized trials. N Engl J Med. 1995. 333:1444–1455.
11. Early Breast Cancer Trialists' Collaborative Group. Ovarian ablation in early breast cancer: overview of the randomised trials. Lancet. 1996. 348:1189–1196.
12. Bjordal JM, Klovning A, Ljunggren AE, Slordal L. Short-term efficacy of pharmacotherapeutic interventions in osteoarthritic knee pain: A meta-analysis of randomised placebo-controlled trials. Eur J Pain. 2007. 11:125–138.
13. In-Albon T, Schneider S. Psychotherapy of childhood anxiety disorders: A meta-analysis. Psychother Psychosom. 2007. 76:15–24.
14. Bradley R, Greene J, Russ E, Dutra L, Westen D. A multidimensional meta-analysis of psychotherapy for PTSD. Am J Psychiatry. 2005. 162:214–227.
15. Keren R, Chan E. A meta-analysis of randomized controlled trials comparing short- and long-course antibiotic therapy for urinary tract infections in children. Pediatrics. 2002. 109:E70.
16. Bartolucci AA, Howard G. Meta-analysis of data from the six primary prevention trials of cardiovascular events using aspirin. Am J Cardiol. 2006. 98:746–750.
17. Ridker PM, Cook NR, Lee IM, Gordon D, Gaziano JM, Manson JE, et al. A randomized trial of low-dose aspirin in the primary prevention of cardiovascular disease in women. N Engl J Med. 2005. 352:1293–1304.
18. Final report on the aspirin component of the ongoing Physicians' Health Study. Steering Committee of the Physicians' Health Study Research Group. N Engl J Med. 1989. 321:129–135.
19. Peto R, Gray R, Collins R, Wheatley K, Hennekens C, Jamrozik K, et al. Randomised trial of prophylactic daily aspirin in British male doctors. Br Med J (Clin Res Ed). 1988. 296:313–316.
20. Hansson L, Zanchetti A, Carruthers SG, Dahlof B, Elmfeldt D, Julius S, et al. Effects of intensive blood-pressure lowering and low-dose aspirin in patients with hypertension: principal results of the Hypertension Optimal Treatment (HOT) randomised trial. HOT Study Group. Lancet. 1998. 351:1755–1762.
21. Low-dose aspirin and vitamin E in people at cardiovascular risk: a randomised trial in general practice. Collaborative Group of the Primary Prevention Project. Lancet. 2001. 357:89–95.
22. Thrombosis prevention trial: randomised trial of low-intensity oral anticoagulation with warfarin and low-dose aspirin in the primary prevention of ischaemic heart disease in men at increased risk. The Medical Research Council's General Practice Research Framework. Lancet. 1998. 351:233–241.
23. Bartolucci A. The significance of clinical trials and the role of meta-analyses. J Surg Oncol. 1999. 72:121–123.