Abstract
Background/Aims
Endoscopic retrograde cholangiopancreatography (ERCP) requires a unique skill set. Currently, there is no objective methodology to assess and train a professional to perform ERCP. This study aimed to develop and validate a novel ERCP simulator.
Methods
The simulator consists of papillae presenting different anatomy and positioned in varied locations. Deep cannulation of the pancreatic duct, followed by the bile duct, was performed. The time allotted was 5 minutes. The content validity indexes (CVIs) for realism, relevance, and representativeness were calculated. Correlation between ERCP experience and simulator score was determined.
Results
Twenty-three participants completed the simulation. The CVIs for realism were as follows: orientation of duodenoscope to papilla (1.00), angulation of papillotome to achieve cannulation (0.71), and haptic feedback during cannulation (0.80). The CVIs for relevance were as follows: use of elevator (1.00), wheels to achieve en face orientation (1.00), and papillotome for selective cannulation (1.00). The CVIs for representativeness were as follows: basic cannulation (0.83), papilla locations (0.83), and papilla anatomies (0.80). The novice, intermediate, and experienced groups scored 6.7±8.7, 30.0±16.3, and 74.4±43.9, respectively (p<0.0001). There was a strong correlation between the ERCP experience level and the individual’s simulator score (Pearson value of 0.77, R² of 0.60).
Endoscopic retrograde cholangiopancreatography (ERCP) is one of the more invasive endoscopic procedures performed by gastroenterologists. Optimal performance requires comprehensive knowledge of pancreaticobiliary anatomy, an understanding of indications, contraindications, and alternatives, proper handling of equipment and accessory tools, and complex manual dexterity. ERCP also carries substantial risks, including post-ERCP pancreatitis (3.5%), bleeding (1.3%), infection (1%), and perforation (0.5%) [1-3].
Traditionally, training in ERCP has followed an apprenticeship model. During a fixed duration and under the supervision of a trainer, trainees (in this case, fellows) progress sequentially through the following steps: observing procedures, assisting with the procedure, learning passage of the duodenoscope and functions of various ERCP accessory tools, attempting cannulation, and finally, performing the entire ERCP procedure. As trainees become more comfortable with the basic techniques, they are allowed to attempt therapeutic intervention. In most programs, these steps are performed by trainees on patients during their clinical training [4]. However, a limitation of this approach is the increased mental workload that the trainee endures while learning both the devices and procedural techniques simultaneously during clinical cases, which could impact outcomes and patient safety. As a result, in recent years, ERCP simulators have been developed as an alternative and/or adjunct to the current clinical training model in order to shorten the learning curve and improve patient outcomes [5].
Currently, ERCP simulators may be divided into 3 categories. The first category consists of the explanted animal organ model. This includes the porcine model, as well as the composite chicken heart muscle and porcine duodenum model [6,7]. While these models provide relatively realistic haptic feedback at a lower cost [8], they have a few disadvantages, including lengthy preparation time, the need for tissue disposal, and the need for special duodenoscopes reserved for use on animals. The second category is computer-based devices. These simulators are usually complex and have modules that simulate diverse scenarios with varied difficulty levels. Only limited data support their validity [9], and the high cost of computer simulators usually prevents most training programs from acquiring one. Additionally, due to their large size, these simulators are usually located in a separate unit isolated from the endoscopy suite, which may hinder availability for trainees. The last category is the mechanical simulator. These simulators usually lack realism and variety, which may limit their usefulness.
Our previous work described the development and validation of an endoscopic part-task mechanical simulator, also known as the Thompson Endoscopic Skills Trainer (TEST) box, focusing on upper endoscopy and colonoscopy skills [10,11]. Studies show that this simulator can differentiate endoscopic skills based on clinical experience and that it may be beneficial in the preclinical setting [12]. The simulator is compact and is currently available commercially (EndoSim, LLC, Bolton, MA, USA). Based on a similar process, a novel ERCP mechanical simulator was developed that focuses on fundamental cannulation skills.
This study describes the process of developing and validating a novel ERCP mechanical simulator. Validity evidence regarding test content and relationships to other variables is presented to assess the following characteristics of the simulator: realism, relevance, representativeness of clinical ERCP, and the capacity to differentiate ERCP skills based on prior experience.
The primary objective of this study was to describe the systematic process for the development and validation of a novel mechanical ERCP cannulation simulator and its scoring system. Following an extensive literature review, expert survey, and prototype iterations, the final simulator, which focuses on selective pancreatic and biliary cannulation, was constructed. The simulator consists of 6 silicone papillae, each of which varies by difficulty level. Participants are allotted 5 minutes per session and must approach papillae 1 to 5 sequentially, followed by a bonus papilla that focuses on altered anatomy. Deep cannulation of the pancreatic duct, followed by the bile duct, must be achieved prior to advancing to the next papilla. To validate the simulator, we conducted a study assessing its realism, relevance, and representativeness (these terms are described in more detail below) compared to the technical skills required during clinical ERCP, as well as its ability to differentiate cannulation skills based on prior ERCP experience.
This was a prospective study conducted at an academic institution. Participants with varying levels of ERCP experience were recruited. Novice, intermediate and experienced groups were defined as those who had performed 0–20, 21–200, and greater than 200 ERCPs, respectively. None of the participants had used the simulator prior to the study. The participants completed an assessment session using the ERCP simulator. Their performance was scored and recorded, along with information on their prior endoscopic experience. Additionally, participants filled out a questionnaire regarding simulation experience, level of comfort and demand.
A literature review and expert interview were conducted to identify skills that are deemed important for cannulation. For this simulator, only technical skills (with the exclusion of cognitive skills) were simulated. A panel of 3 experts was employed to rate the pertinent skills, and a final list of fundamental skills was generated.
Based on the final list of fundamental ERCP skills, several prototypes were constructed using polyethylene and polypropylene materials to allow rapid modification of the prototypes. Each prototype was tested for practicality, consistency, and realism. The prototype with the highest ranked design was selected for the final version of the simulator.
The scoring system was developed based on the previous system that was used for the endoscopic part-task training box [10]. Through iterations of scoring systems, the system that was selected, and ultimately passed the validation tests, was the one that allotted 5 minutes per module. Each task successfully completed was awarded 10 points. One point was awarded for each second remaining after task completion.
Once the prototype and scoring system were finalized, the validation stage was initiated. According to the Standards for Educational and Psychological Testing of the American Educational Research Association, American Psychological Association, and the National Council on Measurement in Education, there are 5 main sources of evidence that can be used to support the validity of an interpretation for a new test or, in this case, to support the validity of a scoring system for a new simulator [13,14]. These 5 sources include content, internal structure, response process, relationships to other variables, and consequences of testing evidence. In this study, evidence regarding test content and relationships to other variables was provided to support the newly developed ERCP simulator and scoring system.
Evidence based on test content refers to the degree to which test content corresponds to testing purposes. In this study, a panel of expert endoscopists was asked to rate the simulator on a four-point scale based on its realism, relevance, and representativeness (Fig. 1). In this case, realism refers to realistic haptic feedback during cannulation of the papillae, compared to real cannulation during clinical ERCP, while considering technical features, such as: (1) orientation of the duodenoscope to the papilla, (2) angulation of the papillotome to achieve selective pancreatic and bile duct cannulation, and (3) haptic feedback during wire-guided cannulation. The relevance factor refers to the capacity of the simulator to assess the relevant technical ERCP skills, including: (1) use of the duodenoscope elevator, (2) use of the duodenoscope wheels to achieve en face orientation, and (3) use of the papillotome with application of tension to bend the device to achieve selective cannulation. Lastly, the representativeness factor assesses the capacity of the simulator to encompass essential technical ERCP cannulation skills, including: (1) basic selective cannulation skills, (2) varied papilla locations that require relevant adjustments to achieve en face orientation, and (3) papillae with different common channel lengths. Subsequently, content validity indexes (CVIs) for realism, relevance, and representativeness were calculated as the proportion of experts who rated the item as “content valid”, which was defined as a rating of 3 (agree) or 4 (strongly agree).
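The CVI calculation described above can be sketched in a few lines of Python. This is a minimal illustration, assuming each expert gives one integer rating from 1 to 4 per item; the function name and the example ratings are hypothetical and are not study data.

```python
from typing import Sequence


def content_validity_index(ratings: Sequence[int], threshold: int = 3) -> float:
    """Proportion of expert ratings at or above `threshold` on the four-point scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r not in (1, 2, 3, 4) for r in ratings):
        raise ValueError("ratings must be integers from 1 to 4")
    content_valid = sum(1 for r in ratings if r >= threshold)
    return content_valid / len(ratings)


# Hypothetical ratings from seven experts for a single item (not study data).
example_ratings = [4, 3, 3, 2, 4, 2, 3]
example_cvi = content_validity_index(example_ratings)
```

In this hypothetical case, five of the seven ratings are 3 or 4, so the item CVI is 5/7.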
Additionally, as part of evidence based on test content, all participants were asked to comment on the face validity of the simulator. More specifically, participants rated the simulator based on the following characteristics: (1) the capacity of the simulator to differentiate between different levels of cannulation skill, (2) potential to improve clinical cannulation skill, and (3) whether the simulator should be used prior to allowing the trainee to initiate ERCP intervention with human cases.
Evidence based on relationships to other variables refers to the correlation between scores obtained from a new assessment tool and an existing measure [13,14]. In this study, the number of ERCPs previously performed was used as a criterion standard to measure the level of ERCP experience. Traditionally, it is believed that endoscopists with a higher number of procedures are generally more technically proficient than those with considerably less experience. In the Gastroenterology Core Curriculum (GCC), it is recommended that at least 200 ERCPs be performed before assessing competency [15]. Therefore, in this study, participants were divided into three groups: 0–20 ERCPs (up to 10% of the GCC threshold), 21–200 ERCPs (more than 10% and up to 100% of the GCC threshold), and greater than 200 ERCPs (above the GCC threshold). A sensitivity analysis was also performed.
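The grouping rule can be written as a small function. The sketch below assumes only the cut-offs stated in the text (20 and 200 ERCPs); the function and constant names are illustrative.

```python
GCC_THRESHOLD = 200      # ERCPs recommended before competency assessment [15]
NOVICE_UPPER_BOUND = 20  # 10% of the GCC threshold


def experience_group(ercp_count: int) -> str:
    """Map a participant's prior ERCP count to the study's experience group."""
    if ercp_count < 0:
        raise ValueError("ERCP count cannot be negative")
    if ercp_count <= NOVICE_UPPER_BOUND:
        return "novice"
    if ercp_count <= GCC_THRESHOLD:
        return "intermediate"
    return "experienced"
```

A two-group comparison (non-experienced versus experienced) follows by pooling the "novice" and "intermediate" labels.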
Participants were asked to fill out a questionnaire following the simulation session. The questionnaire consisted of the unweighted NASA Task Load Index (NASA-TLX), which was used to assess the perceived workload of the simulator [13]. Seven subscales were assessed, including mental demand, physical demand, temporal demand, effort, performance, frustration level, and technical difficulty. Each subscale was presented as a 12-cm line divided into 20 equal intervals. A visual analog scale was used with 21 vertical tick marks on each scale, which divided the scale from 0 to 100 in increments of 5. If a subject marked between 2 ticks, the value corresponding to the right tick was used [16,17].
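The reading rule for each subscale can be illustrated as follows. This is a sketch under the assumption that a mark is recorded as its distance in centimeters from the left end of the 12-cm line; the function name is hypothetical, and the small rounding step only guards against floating-point noise for marks placed exactly on a tick.

```python
import math

LINE_LENGTH_CM = 12.0  # length of each subscale line
INTERVALS = 20         # 21 tick marks define 20 equal intervals
STEP = 5               # each tick is worth 5 points on the 0-100 scale


def tlx_subscale_score(mark_cm: float) -> int:
    """Convert a mark position to a 0-100 score, rounding up to the right-hand tick."""
    if not 0.0 <= mark_cm <= LINE_LENGTH_CM:
        raise ValueError("mark must lie on the 12-cm line")
    ticks_from_left = mark_cm / LINE_LENGTH_CM * INTERVALS
    # A mark between two ticks takes the value of the tick to its right.
    return math.ceil(round(ticks_from_left, 9)) * STEP
```

For example, a mark at the midpoint of the line falls exactly on a tick, while a mark slightly to its right is assigned the next tick.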
Data were presented as mean±standard deviation for continuous variables, or proportion (%) for categorical variables. Means were compared using a Student’s t-test. Proportions were compared using a Chi-squared test. A p-value of less than 0.05 was deemed statistically significant. Statistics were performed using SAS 9.4 (SAS Institute Inc., Cary, NC, USA).
A review of professional society recommendations and published literature resulted in a list of essential cannulation-related ERCP skills. These skills were divided into cognitive and technical skills, and the technical skills were considered for inclusion in our simulator prototype (Table 1). Therapeutic techniques, including sphincterotomy, balloon and basket extraction, and dilation and stent placement, were considered beyond the scope of the development of our simulator and were not included in the list of cannulation-related ERCP skills.
After an expert panel reviewed the list of skills, a final list of fundamental ERCP skills was created to be considered for inclusion in our simulator prototype. Criteria for skill selection included not only the essence of the skills themselves, but also the practicality of simulating the skills and including them in a simulator module, as well as the durability of the simulator without requiring replacement of parts. The final list of skills included: (1) positioning the duodenoscope to achieve an en face view to the papilla, (2) usage of a sphincterotome to assist with selective cannulation, (3) selective cannulation of the bile duct and pancreatic duct, and (4) selective cannulation for papilla in different locations and in altered anatomy.
More than 20 prototypes were developed to simulate the selected fundamental ERCP skills. In order to determine the appropriate duodenoscope positions and to allow scope stabilization without gravitational drifting, the following parameters had to be adjusted: box dimensions, scope entry site, and papilla locations. Different wires and sphincterotomes were also tested to ensure generalizability across various instrument platforms. To allow for varied difficulty levels in selective cannulation, the following components had to be adjusted: (1) papilla materials, (2) length, diameter and angle of the common channel, and (3) angle and rotation between the pancreatic duct and bile duct. The locations of each papilla and the order of cannulation were also modified until general skill differentiation was achieved without compromising realism. Additionally, special attention was given to ensuring that papillae were placed in a location that mimicked cannulation in a supine position and in altered anatomy.
The final prototype included six silicone papillae, each with a bile duct and pancreatic duct (Fig. 1). Five of the six papillae represent variants of normal anatomy, including those in the proximal second portion of the duodenum (D2), standard D2, distal D2, and supine positions. Difficulty levels increase from the first to the fifth papilla, as the length of the common channel and the angle between the pancreatic duct and bile duct increase (Fig. 2). Scope position also becomes increasingly unstable as an operator moves from the first to the fifth papilla. The sixth papilla mimics cannulation in patients with a Billroth II and/or Roux-en-Y gastric bypass (RYGB) anatomy. The simulator also allows double-wire cannulation and minor papilla cannulation. However, evaluation of these competencies is not included in the skill assessment test. Fig. 3 shows the simulator and room setup during the simulation session.
During the design process, scores for novice, intermediate, and experienced operators were collected. The scoring system was modified to maintain differentiation in scores among the 3 groups. The final scoring system allows 5 minutes per session. Once the duodenoscope is positioned en face at the starting point, the timer is initiated. The duodenoscope must be advanced using gestures similar to those required to advance through the duodenal sweep and positioned at a red indicator. Subsequently, papillae 1 through 5 are encountered sequentially. Once the duodenoscope is positioned en face with each papilla, deep cannulation of the pancreatic duct, followed by the bile duct, must be achieved before advancing to the next papilla. A standard sphincterotome and a guidewire with a hydrophilic tip are used to perform cannulation. In order to ensure deep cannulation, participants are instructed to advance the wire until the markings are seen through the clear papilla. Each duct that is successfully cannulated is awarded 10 points. In order to reward efficiency, one additional point is given for each second remaining after completion of the session. In addition, participants who complete the session within 5 minutes are given the option to cannulate a bonus papilla. This bonus papilla represents Billroth II or RYGB anatomy and may be approached with a standard sphincterotome, a cannulation catheter, or a flexible tip cannula. In this case, only the bile duct needs to be cannulated, and if this is accomplished within 1 minute, an additional 20 points are awarded and added to the original score (Supplementary video 1).
We recommend that an experienced proctor administer the simulator sessions. Uniform instructions covering the objectives, specific instructions, and the scoring system are read to all participants.
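The scoring rules can be summarized in a short sketch. Several details are assumptions made only for illustration and are not stated in the text: time points are counted in whole seconds, they are awarded only when all ten ducts of papillae 1 to 5 are cannulated before the 5-minute limit, and the bonus papilla is timed separately. All names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

SESSION_SECONDS = 300       # 5 minutes per session
POINTS_PER_DUCT = 10
TOTAL_DUCTS = 10            # pancreatic and bile duct for papillae 1 to 5
BONUS_POINTS = 20
BONUS_WINDOW_SECONDS = 60   # bonus bile duct must be cannulated within 1 minute


@dataclass
class SessionResult:
    ducts_cannulated: int                         # 0 to 10
    seconds_used: float                           # time elapsed in the main session
    bonus_seconds: Optional[float] = None         # None if bonus not cannulated


def simulator_score(result: SessionResult) -> int:
    """Score a session: 10 points per duct, plus time and bonus points if completed."""
    if not 0 <= result.ducts_cannulated <= TOTAL_DUCTS:
        raise ValueError("ducts_cannulated must be between 0 and 10")
    if not 0 <= result.seconds_used <= SESSION_SECONDS:
        raise ValueError("seconds_used must be between 0 and 300")

    score = result.ducts_cannulated * POINTS_PER_DUCT
    completed = result.ducts_cannulated == TOTAL_DUCTS
    if completed:
        # Assumption: one point per whole second remaining in the session.
        score += int(SESSION_SECONDS - result.seconds_used)
        bonus = result.bonus_seconds
        if bonus is not None and bonus <= BONUS_WINDOW_SECONDS:
            score += BONUS_POINTS
    return score
```

Under these assumptions, a participant who cannulates three ducts before time expires scores 30, and one who completes all ten ducts with a minute to spare scores 160 before any bonus.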
A total of 23 participants completed the simulation session. The baseline characteristics for each group are summarized in Table 2.
Seven experts graded the test content following the simulation session (Table 3). The average CVI for realism, relevance and representativeness was 0.84, 1.00 and 0.82, respectively. The composite CVI was 0.89, suggesting consensus among the reviewers that the simulator had demonstrated content-related validity in terms of its realism, relevance and representativeness.
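As an arithmetic check, the domain averages can be reproduced from the item-level CVIs reported for this study. The sketch assumes that each domain average is the unweighted mean of its three items and that the composite is the mean of the three domain averages; the exact aggregation used by the authors is not stated.

```python
from statistics import mean

# Item-level CVIs as reported for each domain.
item_cvis = {
    "realism": [1.00, 0.71, 0.80],
    "relevance": [1.00, 1.00, 1.00],
    "representativeness": [0.83, 0.83, 0.80],
}

# Assumption: unweighted mean of the three items in each domain.
domain_cvi = {domain: mean(values) for domain, values in item_cvis.items()}

# Assumption: composite is the mean of the three domain averages.
composite_cvi = mean(list(domain_cvi.values()))

for domain, value in domain_cvi.items():
    print(f"{domain}: {value:.2f}")
print(f"composite: {composite_cvi:.2f}")
```

Because every domain has three items, averaging all nine items directly gives the same composite.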
In addition, participants completed a questionnaire regarding their impression of the simulator. Overall, 87% of participants believed that the simulator was able to distinguish between different levels of cannulation skill, that it could improve cannulation skills for clinical ERCP, and that it should be used prior to the trainee initiating ERCP intervention with human cases.
There was a statistically significant correlation between prior ERCP experience levels and simulator scores. Average simulator scores of the novice, intermediate, and experienced groups were 6.7±8.7, 30.0±16.3, and 74.4±43.9, respectively (p<0.0001) (Fig. 4A). Additionally, there was a strong positive correlation between the number of ERCPs previously performed and the individual’s simulator score (Pearson value of 0.77, R² of 0.60) (Fig. 4B).
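For readers less familiar with the statistic, the Pearson coefficient and its square can be computed as sketched below. The paired values are hypothetical and serve only to show the calculation; they are not the study data. For a simple linear fit with one predictor, the coefficient of determination equals the square of the Pearson value, which is consistent with the reported 0.77 and 0.60 up to rounding.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical ERCP counts and simulator scores (not study data).
ercp_counts = [0, 5, 15, 60, 150, 400, 900]
simulator_scores = [0, 10, 10, 30, 40, 70, 110]

r = pearson_r(ercp_counts, simulator_scores)
r_squared = r ** 2  # equals R² for a simple linear regression
```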
A sensitivity analysis of the non-experienced (novice + intermediate) versus experienced groups showed similar results, with average simulator scores being 16.9±17.0 versus 74.4±43.9 in the two respective groups (p=0.0001).
The scores for perceived workload for all participants were as follows (out of 100): mental demand 58, physical demand 46, temporal demand 60, effort 58, performance 46, frustration level 53, and technical difficulty 57. The levels of mental demand, effort, and perceived technical difficulty differed significantly among the 3 groups (novice, intermediate, and experienced) (Fig. 5). The average workload levels of the novice, intermediate, and experienced groups were: mental demand 81±4, 63±18, and 38±27 (p=0.015); effort level 77±7, 65±21, and 40±21; and perceived technical difficulty 78±8, 50±0, and 41±22 (p=0.01), respectively.
Traditionally, ERCP training has relied upon meeting an arbitrary volume threshold as a surrogate for determination of competence. In the past, various objective criteria have also been proposed to assess ERCP skills, with the most common being a native papilla cannulation rate of at least 90% [18]. Additionally, the American Society for Gastrointestinal Endoscopy (ASGE) has recently recommended that training programs consider using the endoscopic ultrasound and ERCP Skills Assessment Tool, which assesses a variety of technical and cognitive skills. This assessment occurs immediately after completion of the ERCP procedure and should be performed periodically throughout the fellowship training [19].
This study constitutes a different approach to objectively assess ERCP skills, with the intent to serve as an adjunct to clinical assessment. Based on contemporary methods of content development, an ERCP mechanical simulator was developed as a tool to assess fundamental cannulation skills and to allow trainees to practice basic maneuvers prior to initiation of clinical cases. Validation studies were performed to demonstrate its realism, relevance, representativeness, and ability to differentiate ERCP skills based on clinical experience.
To date, only a few ERCP mechanical simulators have been described in the literature. In 2011, Leung et al. described a mechanical simulator consisting of a papilla with biliary and pancreatic ducts that were positioned 60° apart [20]. The papilla was detachable to allow placement of different designs for stricture dilation and stent placement. Around the same time, Frimberger et al. developed the X-vision ERCP training system that consisted of 4 models, including selective cannulation, problem papillae, selective stent placement, and sphincterotomy [21,22]. In 2016, the Boskoski-Costamagna ERCP trainer was described; it consisted of an esophagus, stomach, and duodenum, which were attached to different papillae and represented anatomical and patient-position variations [23]. Validation studies of these simulators revealed that operators with more clinical ERCP experience performed better on the simulators compared to those with less experience [20-24]. In contrast to previous models, our simulator contains multiple papillae that vary based on several factors, including: (1) angles or axes of the papilla, (2) common channel length, (3) angle between biliary and pancreatic ducts, and (4) location within the duodenum. Additionally, a papilla to simulate Billroth II and RYGB anatomy was included for assessment of advanced cannulation skill. Unlike previous simulators, which measure time to task completion as the primary outcome, our simulator measures the number of tasks successfully performed within an allotted time. Benefits of this system include having a predetermined time duration per simulation session, which may increase the practicality of the simulation session. This system was based on our prior part-task endoscopic simulator, which was extensively validated [10,11].
While most mechanical simulators are criticized for their lack of realism [5], this study provided strong content-related validity evidence regarding the simulator’s realism. This was achieved through a rigorous developmental strategy based on a literature review and expert opinion, in which multiple prototypes were built from various materials until a consensus on quality of realism was achieved. Subsequently, the simulator was judged by another set of experts who deemed that this device was realistic in the context of cannulation (CVI for realism was 0.84). Additionally, CVI for relevance and representativeness were 1.00 and 0.82, respectively, making construct under-representation and construct irrelevance, the two major threats to validity, less likely [25]. In addition, a group of 23 participants with varied ERCP experience agreed that the simulator was able to distinguish between different levels of cannulation skill (87% agreement). This content-related validity evidence supports the use of the simulator and its scoring system as a skill assessment tool.
In addition to content-related validity evidence, our study provides strong evidence based on the relationships between variables. For example, the scoring system was able to significantly differentiate cannulation skill based on clinical ERCP experience. A Pearson analysis was also performed to assess the correlation between the individual’s simulator score and number of ERCPs previously performed. The analysis demonstrated strong positive correlation between these 2 variables, with a Pearson value of 0.77.
Alternatively, this simulator may serve as a useful tool for ERCP skill development, especially in a preclinical setting. Traditionally, ERCP training has involved the apprenticeship model, wherein trainees observe and learn different steps during clinical cases that gradually increase in the level of complexity. This approach increases trainees’ mental workload, especially during the beginning of the learning curve, when they are learning the function of the duodenoscope and accessory devices, in addition to basic ERCP steps. Simulation, therefore, may play an important role during this step by allowing trainees to become familiar with the devices and scope positioning prior to focusing on clinical procedures, which involve many other factors.
It is important to recognize that the same simulator should not be used for both skill development and assessment purposes. This practice could result in a trainee learning the simulator and not the necessary ERCP skills. Additionally, the simulator cannot be used in lieu of clinical experience, which incorporates a variety of important factors beyond cannulation that are not captured by this simulator.
This study has a few limitations. The simulator described herein focuses only on the technical aspects of ERCP skills, although both technical and cognitive skills are important for optimal ERCP performance. Moreover, more advanced therapeutic techniques are not included in this fundamental ERCP simulator. The sample size is relatively small; however, the results are statistically significant, and the number of participants is comparable to that of a majority of simulation studies.
In summary, this novel ERCP mechanical simulator appears to be realistic, relevant, and representative of the technical aspects of ERCP cannulation. The simulator is also able to differentiate cannulation skill by clinical experience level. Therefore, the simulator may be useful as a tool to determine whether a trainee is equipped to be evaluated on their level of competency in ERCP cannulation. Additionally, given the risk profile of ERCP, simulation use should be encouraged prior to initiation of clinical cases; however, the same simulator should not be used for both training and skill assessment.
REFERENCES
1. Andriulli A, Loperfido S, Napolitano G, et al. Incidence rates of post-ERCP complications: a systematic survey of prospective studies. Am J Gastroenterol. 2007; 102:1781–1788.
2. Freeman ML, Nelson DB, Sherman S, et al. Complications of endoscopic biliary sphincterotomy. N Engl J Med. 1996; 335:909–918.
3. Masci E, Toti G, Mariani A, et al. Complications of diagnostic and therapeutic ERCP: a prospective multicenter study. Am J Gastroenterol. 2001; 96:417–423.
4. Chutkan RK, Ahmad AS, Cohen J, et al. ERCP core curriculum. Gastrointest Endosc. 2006; 63:361–376.
5. Desilets DJ, Banerjee S, Barth BA, et al. Endoscopic simulators. Gastrointest Endosc. 2011; 73:861–867.
6. Matthes K, Cohen J. The Neo-Papilla: a new modification of porcine ex vivo simulators for ERCP training (with videos). Gastrointest Endosc. 2006; 64:570–576.
7. Neumann M, Mayer G, Ell C, et al. The Erlangen Endo-Trainer: lifelike simulation for diagnostic and interventional endoscopic retrograde cholangiography. Endoscopy. 2000; 32:906–910.
8. Sedlack R, Petersen B, Binmoeller K, Kolars J. A direct comparison of ERCP teaching models. Gastrointest Endosc. 2003; 57:886–890.
9. Bittner JG 4th, Mellinger JD, Imam T, Schade RR, Macfadyen BV Jr. Face and construct validity of a computer-based virtual reality simulator for ERCP. Gastrointest Endosc. 2010; 71:357–364.
10. Thompson CC, Jirapinyo P, Kumar N, et al. Development and initial validation of an endoscopic part-task training box. Endoscopy. 2014; 46:735–744.
11. Jirapinyo P, Kumar N, Thompson CC. Validation of an endoscopic part-task training box as a skill assessment tool. Gastrointest Endosc. 2015; 81:967–973.
12. Jirapinyo P, Abidi WM, Aihara H, et al. Preclinical endoscopic training using a part-task simulator: learning curve assessment and determination of threshold score for advancement to clinical endoscopy. Surg Endosc. 2017; 31:4010–4015.
13. American Educational Research Association; American Psychological Association; National Council on Measurement in Education; Joint Committee on Standards for Educational and Psychological Testing (U.S.). Standards for educational and psychological testing. Washington, D.C.: American Educational Research Association;1999.
14. Downing SM. Validity: on meaningful interpretation of assessment data. Med Educ. 2003; 37:830–837.
15. American Association for the Study of Liver Diseases; American College of Gastroenterology; American Gastroenterological Association (AGA) Institute; American Society for Gastrointestinal Endoscopy. The gastroenterology core curriculum, Third edition. Gastroenterology. 2007; 132:2012–2018.
16. Hart SG. NASA Task Load Index (TLX). vol. 1.0. paper and pencil package. Moffett Field (CA): NASA Ames Research Center;1986. p. 26.
17. Hart SG, Staveland LE. NASA Task Load Index (TLX). vol. 1.0. paper and pencil package. Moffett Field (CA): NASA Ames Research Center;1986. p. 136–183.
18. Adler DG, Lieb JG 2nd, Cohen J, et al. Quality indicators for ERCP. Gastrointest Endosc. 2015; 81:54–66.
19. Wani S, Keswani RN, Petersen B, et al. Training in EUS and ERCP: standardizing methods to assess competence. Gastrointest Endosc. 2018; 87:1371–1382.
20. Leung JW, Yen D. ERCP training - the potential role of simulation practice. J Interv Gastroenterol. 2011; 1:14–18.
21. Frimberger E, von Delius S, Rösch T, Karagianni A, Schmid RM, Prinz C. A novel and practicable ERCP training system with simulated fluoroscopy. Endoscopy. 2008; 40:517–520.
22. von Delius S, Thies P, Meining A, et al. Validation of the X-Vision ERCP training system and technical challenges during early training of sphincterotomy. Clin Gastroenterol Hepatol. 2009; 7:389–396.
23. Boškoski I, Costamagna G. The Boskoski-Costamagna ERCP trainer: from dream to reality. Endoscopy. 2016; 48:593.