Abstract
The aim of this study is to evaluate the reporting quality of animal experiments in Korea using the Animals in Research: Reporting In Vivo Experiments (ARRIVE) guideline, which was developed in 2010 to address the reproducibility problem and to encourage compliance with the principle of replacement, refinement and reduction of animals in research (the 3R's principle). We reviewed 50 papers published by a Korean research group from 2013 to 2016 and scored their conformity with the 20-item ARRIVE guideline. The median conformity score was 39.50%. For a more precise evaluation, the 20 items were subdivided into 57 sub-items. Among the sub-items, the status of experimental animals and housing and husbandry were reported below the average level. Microenvironment sub-items, such as enrichment, bedding material, cage type, and number of companions, scored under 10%. Although the statistical methods used were given in most publications (84%), sample size calculation and assessment of statistical assumptions were rarely described. Most publications mentioned IACUC approval, but only 8% mentioned welfare-related assessments and interventions, and only 4% mentioned any implications of the experimental methods or findings for the 3Rs. We recommend revising the present IACUC proposal forms to collect more detailed information and improving educational programs for animal researchers in line with the ARRIVE guideline.
In a recent survey of 1,576 researchers, 52% of the respondents perceived a crisis of reproducibility, that is, a growing inability to reproduce other scientists' experiments. Over 60% of scientists in biology and medicine reported not being able to reproduce results obtained by other researchers [1]. In the past decade, academic journals have noted the scientific and ethical implications for the entire research process of failing to describe research methods and to report results comprehensibly and accurately [2]. The main shortcomings were found in statistical methods, including randomization and blinding, and in the description of the status of experimental animals. For example, in a survey of 271 publications on the reporting of research methods, only 59% of the studies described the number and characteristics of the animals used, and 30% of the studies did not present statistical information. Less than 20% of the surveyed publications mentioned randomization or blinding to reduce bias in the experiments [3].
The 3R's principle (replacement, refinement and reduction of animals in research) has been recognized from the perspective of improving the quality of science, because treating experimental animals in the most humane way is a prerequisite for a successful animal experiment [4,5,6]. The principle is not limited to the implementation phase but applies to the whole process of scientific investigation, from formulation to communication and replication. More recently, the 3R's principle has become acknowledged as an integral part of conducting high quality bioscience and a means of addressing issues of poor reproducibility and high rates of attrition in drug development [7].
The Animals in Research: Reporting In Vivo Experiments (ARRIVE) guideline was developed in 2010 by the National Centre for the Replacement, Refinement & Reduction of Animals in Research (NC3Rs) to improve the design, analysis, and reporting of research using animals. The ARRIVE guideline provides a 20-item checklist of the minimum information that a scientific paper describing animal research should include. The items include study design, experimental procedures, experimental animals, housing and husbandry, and sample size [2].
Since 2010, leading biomedical journals have adopted the ARRIVE guideline for reporting research methods and results to ensure the quality of published scientific papers. In the intervening 8 years, the reporting quality in these publications has not appreciably changed. Nevertheless, the guideline has proven successful in motivating the scientific community to recognize the problems related to animal research and to strive to improve the situation. For example, an evaluation of 83 publications assessing new compounds for Chagas disease in animal models showed a slight improvement in the reporting of information on animals and on the macro- and microenvironment, although reporting accuracy remained substantially lacking [5]. Also, a review of 396 scientific studies involving animals in China showed that publications after 2010 had a significantly larger ARRIVE value than those published in 2010 [8].
Genetically engineered mice (GEM) have become a major research tool for identifying the function of human genes [9] and the mechanisms of human disease [10]. Mouse phenotyping, the work of identifying the function of mouse genes, is proceeding through international cooperation under the leadership of the International Mouse Phenotyping Consortium (IMPC). The IMPC requires member organizations to report the results of phenotypic analyses in detail and to ensure the reproducibility of experiments by sharing them.
The present study examined 50 scientific papers published by the Korea Mouse Phenotyping Center (KMPC), an IMPC member and one of the largest mouse research teams in Korea, to assess their conformity with the ARRIVE guideline. The evaluation focuses on the accurate description of experimental animals and statistical methods from the reproducibility perspective, and on ethical consideration for experimental animals from the 3R's perspective. Revealing the weaknesses and strengths in the reporting of animal experimentation in Korea, and the need to enhance research quality by adopting reporting guidelines for in vivo experiments, should benefit our science community.
Titles and abstracts of 88 papers published in English in academic journals between 2013 and 2016 by KMPC members were screened. Papers using animals were selected by the authors, who included a laboratory animal veterinarian. When selection based on titles and abstracts was difficult, the full texts and supplementary materials were reviewed. The final set was obtained by excluding potentially eligible papers that fell outside the scope of this study. For example, in vitro studies and reviews were excluded.
The ARRIVE guideline consists of 20 checklist items and 39 sub-items. To provide more detailed information on the reporting situation, we added 18 sub-items to the checklist by subdividing items related to experimental animals (item 8), housing and husbandry (item 9), and study design about minimization of subjective bias (item 6).
All the 57 sub-items were assessed as “YES (described in the study)” or “NO (not described in the study)”. If a sub-item could not be assessed due to the research type (for example, “quality of water for fish” in the case of mouse experiments), it was indicated as “NONE”.
The data were summarized using Microsoft Excel. Evaluation scores are expressed as absolute numbers and as percentages for the 20 items and 57 sub-items. “NONE” cases were excluded from the calculations. All the scores for the papers on animal experiments are presented in Figures 1, 2, 3, 4, and 5 and in Table 1.
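The scoring rule described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the spreadsheet calculation actually used in the study: the function names `conformity_score` and `summarize` and the example ratings are assumptions introduced here. It assumes the score for a sub-item is the share of "YES" ratings among the papers for which the sub-item was applicable.

```python
from statistics import mean, median


def conformity_score(ratings):
    """Percent of YES among applicable ratings; NONE entries are excluded.

    Returns None when no paper could be assessed for this sub-item.
    """
    applicable = [r for r in ratings if r != "NONE"]
    if not applicable:
        return None
    return 100.0 * sum(r == "YES" for r in applicable) / len(applicable)


def summarize(scores):
    """Median and mean of the scores that could be computed."""
    valid = [s for s in scores if s is not None]
    return median(valid), mean(valid)


# Hypothetical ratings for one sub-item across five papers (not study data).
example = ["YES", "NO", "NONE", "YES", "NO"]
score = conformity_score(example)  # 2 YES out of 4 applicable papers
```

Excluding "NONE" from the denominator keeps a sub-item that is inapplicable to mouse studies from lowering the apparent conformity of a paper.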
Of the 88 KMPC papers, 50 were selected for evaluation of their conformity with the ARRIVE guideline. Thirty-seven papers were excluded during the screening of titles and abstracts, and one more was excluded after assessment of the full text. Among the 50 selected papers, 41 were in vivo studies that compared test groups with control groups and analyzed specific phenotypes, and the remainder were performed under ex vivo conditions.
The scores for the 20 ARRIVE items are shown in Figure 1. The median score was 39.50% (average 51.17%). The five most frequently reported items were title, abstract, objectives, outcomes and estimation, and funding. In contrast, the five least frequently reported items were experimental procedures, housing and husbandry, allocating animals, baseline data, and adverse events. The scores for the 57 sub-items are presented in Table 1. The median score was 30.00% (average 39.94%). Sub-items that scored over 75% were 1, 2, 3-a, 4, 5, 7-a, 8-a1, 8-a2, 8-a4, 8-b1, 8-b2, 13-a, 16, 18-a, and 20. These sub-items mainly correspond to information on title, abstract, background, objectives, experimental animals, outcomes and estimation, and funding. In contrast, sub-items that scored under 25% were 6-b1, 6-b2, 7-b, 7-c, 7-d, 8-a5, 8-b3, 9-a2, 9-a3, 9-a4, 9-b3, 9-c, 10-b, 11-a, 14, 17-a, 18-b, and 18-c. Sub-items 8-b4, 9-b4, 9-b7, 11-b, 13-c and 17-b were not described in any paper. These low-scoring sub-items are chiefly related to experimental procedures, housing and husbandry, allocating animals, and adverse events.
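The grouping of sub-items by score used above (over 75%, under 25%, and not described at all) can be expressed as a small helper. The sketch below is hypothetical: the band labels and the example scores are introduced here for illustration and are not values taken from Table 1.

```python
def band(score):
    """Label a sub-item score (in percent) using the thresholds from the text."""
    if score == 0:
        return "not described"
    if score < 25:
        return "low"
    if score > 75:
        return "high"
    return "intermediate"


# Hypothetical scores for illustration only; these are not values from Table 1.
hypothetical = {
    "sub-item A": 96.0,
    "sub-item B": 8.0,
    "sub-item C": 0.0,
    "sub-item D": 40.0,
}
bands = {name: band(score) for name, score in hypothetical.items()}
```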
The reporting scores for the experimental animals are shown in Figure 2, categorized as basic and additional information. Basic information on the experimental animals, such as strain, sex, and age, was described frequently. Animal weight, however, was mentioned in only 25% of the papers. Microbiological status and the drug or test history of the animals used were described in fewer than 10% of the papers.
Information on the microenvironment and macroenvironment was assessed in the sub-items of housing and husbandry (Figure 3). Microenvironment refers to the immediate physical environment surrounding the animal, such as type of cage, bedding material, number of cage companions, type of food, access to food and water, and environmental enrichment. Macroenvironment parameters describe the physical environment of the secondary enclosure, such as type of facility, light/dark cycle, temperature, and humidity [11]. Information about housing and husbandry was described in fewer than 30% of the reviewed publications. In particular, the microenvironment sub-items, such as enrichment, bedding material, cage type, and number of companions, scored under 10%. No paper described environmental enrichment.
Most papers did not report randomization in animal selection or blinding in outcome measurements to minimize the effects of subjective bias (Figure 4). Methods for allocating animals to experimental groups and the order of treatment or assessment were rarely mentioned. The score for reporting the statistical methods used for each analysis was relatively high at 84%, while only 8% of the papers described the sample size calculation. No study reported an assessment of statistical assumptions.
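As an illustration of what a reportable allocation method can look like, the following is a minimal sketch of simple randomization with a fixed seed. It is not a procedure taken from any of the reviewed papers; the function name `allocate`, the group labels, and the seed value are assumptions made for this example.

```python
import random


def allocate(animal_ids, groups, seed):
    """Randomly assign animals to groups of near-equal size.

    A seeded generator is used so that the allocation can be reported
    and regenerated exactly.
    """
    rng = random.Random(seed)
    shuffled = list(animal_ids)
    rng.shuffle(shuffled)
    allocation = {g: [] for g in groups}
    # Deal the shuffled animals out to the groups in turn.
    for i, animal in enumerate(shuffled):
        allocation[groups[i % len(groups)]].append(animal)
    return allocation


# Hypothetical example: 12 animals, two groups, arbitrary seed.
allocation = allocate(range(1, 13), ["control", "treatment"], seed=2016)
```

Reporting the method, the seed, and the resulting group sizes would be enough for a reader to see how animals were assigned.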
The results of the assessment of laboratory animal ethics are shown in Figure 5. Protocol review and approval by the Institutional Animal Care and Use Committee (IACUC) was stated in 80% of the papers: 34% of the papers gave the approval number, while the remaining 46% did not. A further 16% of the papers only stated that they followed the guide of the authors' affiliation, and 4% did not include a specific ethical statement. Only 8% of the papers mentioned welfare-related assessments and interventions. Only 4% of the papers mentioned any implications of the experimental methods or findings of the study for the 3Rs.
The ARRIVE guideline was developed to ensure the reproducibility of animal experiments and to avoid unnecessary animal use. It also aids more transparent, comprehensive, and logical reporting of research findings. The recent phenomenon described as a ‘reproducibility crisis’ in science can be effectively addressed by following this guideline during design, conduct, and analysis [12,13,14].
Of the reviewed publications, only 30–40% included the information about their experiments that ARRIVE requires. Considering that the publications were produced by a group specialized in animal experiments, this score probably represents the upper level of conformity with the standards.
These scores, however, do not directly correspond to the reproducibility of the papers. The ARRIVE guideline is a ‘reporting’ guideline aimed at improving reproducibility and ethical consideration in published papers. Some experimental steps may have been omitted from the published text because the peer reviewers considered their description unnecessary. Some sub-items of the ARRIVE guideline could not be applied because of the characteristics of the experiments. Nevertheless, it is clear that more accurate reporting is needed for the micro- and macroenvironment of the animals, the study design, the statistics, and ethical consideration in research using animals.
First, the sub-items about the status of the animals and their environment, which our team mainly subdivided from the original guideline, need to be reported more accurately. Animal status sub-items, such as health status, weight, and sex, and environment sub-items, including bedding material, type of cage, and number of companions, are important information needed to reproduce the experimental environment. In particular, microenvironmental conditions can directly affect physiologic processes and behavior and may alter disease susceptibility [11]. In some descriptions of health status or type of facility, the term SPF (specific pathogen-free) was simply used without specifying the microorganisms concerned or whether the conditions met the standards recommended by the relevant academic society or association. To maximize the information reported about the environment, researchers should be familiar with the animal care program of their laboratory animal facility.
The second noteworthy finding was the low scores for study design and statistics. Inadequate reporting of statistics has frequently been pointed out as a major reason for the reproducibility crisis. For example, one study reported that in past surveys less than 10% of animal studies in Nature or PLoS journals reported randomization, and around 20% of them mentioned blinding [15]. Another study found that sample size was not calculated in any of the papers on Chagas disease that used animals [5], and a review of conformity with ARRIVE in China made a similar finding [8]. Because the use of correct statistical methods is increasingly emphasized in animal experiments as a way to overcome the reproducibility crisis, researchers need to report in detail the procedure for allocating animals, the number of animals before and during the experiments, and the reasons for the chosen sample size and statistical methods. In particular, the description of how the sample size was determined is related to the reduction principle of the 3Rs. It can therefore be considered an essential item that should be reported in research using animals.
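To make concrete what a reported sample size calculation might contain, the sketch below uses the standard normal-approximation formula for comparing the means of two equal-sized groups, n = 2(z_{1-α/2} + z_{1-β})² / d², where d is the standardized effect size. This is a generic textbook approximation and not a method used in the reviewed papers; the function name `n_per_group` and the default values for `alpha` and `power` are assumptions chosen for illustration.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Animals needed per group for a two-sided, two-group comparison.

    effect_size is the expected difference in means divided by the
    common standard deviation. Normal approximation; an exact t-based
    calculation gives a slightly larger number for small groups.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for two-sided alpha
    z_beta = z.inv_cdf(power)           # quantile corresponding to the power
    return ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)


n_large_effect = n_per_group(1.0)   # 16 per group under these assumptions
n_medium_effect = n_per_group(0.5)  # 63 per group under these assumptions
```

Stating the assumed effect size, significance level, and power alongside the resulting number lets readers judge whether the number of animals used was justified.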
Third, ethical consideration for experiments needs to be more clearly indicated in publications. According to the Animal Protection Act and the Laboratory Animals Act in Korea, all animal experiments must be approved by the relevant IACUC. Presently, 80% of the reviewed papers mentioned IACUC approval or supplied the approval number. However, because public concern about the health and husbandry conditions of laboratory animals is growing, researchers need to state more actively their intention and the steps taken to maintain animal welfare during the experiments, and to acknowledge adherence to the 3R's principle. Welfare-related assessments and interventions carried out prior to, during, or after the experiment can be considered part of the post-approval monitoring (PAM) that is required by laws, regulations, and policies. PAM helps ensure the well-being of the animals and may also provide opportunities to refine research procedures [11]. To improve reporting accuracy, IACUC guidelines should require consideration of the ARRIVE guideline items at the study design stage. We recommend revising existing IACUC proposal forms with reference to the sub-items of the ARRIVE guideline to help researchers report their experiments to a standard that matches the international level. Also, an educational program on the ARRIVE guideline would help researchers understand the reproducibility crisis and the importance of accurate reporting of research using animals. A better understanding of the context of ARRIVE will encourage researchers to apply the guideline to improve the quality of their future studies.
Acknowledgments
This research was supported by the Korea Mouse Phenotyping Project (2014M3A9D5A01074636) of the Ministry of Science, ICT and Future Planning through the National Research Foundation. The authors thank Hanah Sung and Misun Mang for their help in data collection.
References
1. Baker M. Is there a reproducibility crisis? Nature. 2016; 533(7604):452–454. PMID: 27225100.
2. Kilkenny C, Browne WJ, Cuthill IC, Emerson M, Altman DG. Improving bioscience research reporting: the ARRIVE guidelines for reporting animal research. PLoS Biol. 2010; 8(6):e1000412. PMID: 20613859.
3. Kilkenny C, Parsons N, Kadyszewski E, Festing MF, Cuthill IC, Fry D, Hutton J, Altman DG. Survey of the quality of experimental design, statistical analysis and reporting of research using animals. PLoS One. 2009; 4(11):e7824. PMID: 19956596.
4. Burden N, Chapman K, Sewell F, Robinson V. Pioneering better science through the 3Rs: an introduction to the national centre for the replacement, refinement, and reduction of animals in research (NC3Rs). J Am Assoc Lab Anim Sci. 2015; 54(2):198–208. PMID: 25836967.
5. Gulin JE, Rocco DM, García-Bournissen F. Quality of reporting and adherence to ARRIVE guidelines in animal studies for Chagas disease preclinical drug research: a systematic review. PLoS Negl Trop Dis. 2015; 9(11):e0004194. PMID: 26587586.
6. Tannenbaum J, Bennett BT. Russell and Burch's 3Rs then and now: the need for clarity in definition and purpose. J Am Assoc Lab Anim Sci. 2015; 54(2):120–132. PMID: 25836957.
7. Graham ML, Prescott MJ. The multifactorial role of the 3Rs in shifting the harm-benefit analysis in animal models of disease. Eur J Pharmacol. 2015; 759:19–29. PMID: 25823812.
8. Liu Y, Zhao X, Mai Y, Li X, Wang J, Chen L, Mu J, Jin G, Gou H, Sun W, Feng Y. Adherence to ARRIVE guidelines in Chinese journal reports on neoplasms in animals. PLoS One. 2016; 11(5):e0154657. PMID: 27182788.
9. Kim IY, Shin JH, Seong JK. Mouse phenogenomics, toolbox for functional annotation of human genome. BMB Rep. 2010; 43(2):79–90. PMID: 20193125.
10. Prattis S, Jurjus A. Spontaneous and transgenic rodent models of inflammatory bowel disease. Lab Anim Res. 2015; 31(2):47–68. PMID: 26155200.
11. National Research Council. Guide for the care and use of laboratory animals. National Academies Press; 2010. Available from: http://www.nap.edu/catalog/12910.html.
12. Bryant FB. Enhancing predictive accuracy and reproducibility in clinical evaluation research: Commentary on the special section of the Journal of Evaluation in Clinical Practice. J Eval Clin Pract. 2016; 22(6):829–834. PMID: 27870286.
14. Mullane K, Williams M. Enhancing reproducibility: failures from reproducibility initiatives underline core challenges. Biochem Pharmacol. 2017; 138:7–18. PMID: 28396196.
15. Baker D, Lidster K, Sottomayor A, Amor S. Two years later: journals are not yet enforcing the ARRIVE guidelines on reporting standards for pre-clinical animal studies. PLoS Biol. 2014; 12(1):e1001756. PMID: 24409096.