Eun Young Lim; Mi Kyoung Yim; Sun Huh

doi:10.3352/jeehp.2018.15.33

Abstract

Purpose

Smart device-based testing (SBT) is being introduced into the Republic of Korea’s high-stakes examination system, starting with the Korean Emergency Medicine Technician Licensing Examination (KEMTLE) in December 2017. In order to minimize the effects of variation in examinees’ environment on test scores, this study aimed to identify any associations of variables related to examinees’ individual characteristics and their perceived acceptability of SBT with their SBT practice test scores.

Methods

Of the 569 candidate students who took the KEMTLE on September 12, 2015, 560 responded to a survey questionnaire on the acceptability of SBT after the examination. The questionnaire addressed 8 individual characteristics and contained 2 satisfaction, 9 convenience, and 9 preference items. A comparative analysis according to individual variables was performed. Furthermore, a generalized linear model (GLM) analysis was conducted to identify the effects of individual characteristics and perceived acceptability of SBT on test scores.

Results

Among those who preferred SBT over paper-and-pencil testing, test scores were higher for male participants (mean± standard deviation [SD], 4.36± 0.72) than for female participants (mean± SD, 4.21± 0.73). According to the GLM, no variables evaluated— including gender and experience with computer-based testing, SBT, or using a tablet PC—showed a statistically significant relationship with the total score, scores on multimedia items, or scores on text items.

Conclusion

Individual characteristics and perceived acceptability of SBT did not affect the SBT practice test scores of emergency medicine technician students in Korea. It should be possible to adopt SBT for the KEMTLE without interference from the variables examined in this study.

Introduction

Computer-based testing (CBT) has been successfully used for high-stakes medical health licensing examinations in the United States, Canada, and Taiwan. In the Republic of Korea, 24 medical health licensing examinations are managed by the Korea Health Personnel Licensing Examination Institute (KHPLEI). The KHPLEI decided to introduce CBT for the Korean Emergency Medical Technician Licensing Examination (KEMTLE), which is one of the 24 medical health licensing exams managed by the KHPLEI, starting in late 2017 [1,2]. The KEMTLE is the first professional licensing examination that will use SBT in Korea.

The KHPLEI began to administer CBT practice tests in 2014, and decided to introduce smart device-based testing (SBT), which involves the use of a tablet PC instead of a desktop PC. A tablet PC was chosen to avoid placing limitations on testing locations and the number of examinees. If a desktop PC is used for the exam, specifically equipped test centers would be needed, and the number of desktop PCs at the test center would limit the number of examinees. In contrast, using a tablet PC for the exam increases flexibility in the testing locations, and enables the administration of as many exams as the KHPLEI provides tablet PCs for. Therefore, in this report, we use the term SBT instead of CBT. Based on the results of the practice test scores and the questionnaire on examinees’ perceived acceptability of SBT, it may be possible to identify individual characteristics and acceptability-related variables that affect test scores. If such variables are found, we would need to make an effort to minimize their effects in order to achieve comparability between SBT scores and conventional test scores.

In a recent study on SBT in Korea, satisfaction with, convenience of, and preference for SBT compared to paper-and-pencil testing were sufficient to determine that administering SBT was worthwhile [3]. In a focus group interview after CBT at a medical school in Korea, CBT was reported to be good for student learning because it strengthened the clinical context [4]. In another study, experience with computers and anxiety about computers did not affect the CBT test scores of health professions students [5]. In medical school in the United States, content familiarity was found to be related to differences in performance, but not gender, competitiveness, or familiarity with computers [6]. Although some evidence suggests that individual characteristics might affect CBT test scores, more extensive research is needed on the impacts of those characteristics and the perceived acceptability of SBT on SBT test scores. Therefore, we aimed to determine whether individual characteristics and perceived acceptability affected the test scores of examinees on the KEMTLE practice test using SBT. Specifically, we investigated whether individual characteristics affected the perceived acceptability of SBT and whether individual characteristics and perceived acceptability affected the test scores. The acceptability variables consisted of 3 subcategories: satisfaction with, convenience of, and preference for SBT. The null hypotheses of this study were as follows: first, variables relating to individual characteristics would not affect perceived acceptability; and second, variables relating to individual characteristics and perceived acceptability would not affect examinees’ test scores.

Methods

Ethics approval

Students participated in the survey after providing written informed consent. This study was approved by the Institutional Review Board of Hallym University (HIRB-2015-092).

Study design

The study had an observational design based on test results and a questionnaire survey. A generalized linear model (GLM) analysis was conducted to evaluate the effects of individual characteristics and perceived acceptability of SBT on test scores.

Setting

The SBT KEMTLE practice test and questionnaire were administered to 569 candidate students (examinees) at the same sitting on September 12, 2015 in Daejon, Korea. A smart device (a 10-inch tablet PC) was distributed to each examinee, and they marked their responses on the screen of the device. The test items consisted of 50 multimedia items and 80 text items. They were given 120 minutes to complete the examination. All items contained 5 options with 1 best answer. All 569 examinees who were present took the examination; and 560 students responded to the questionnaire on the acceptability of SBT after the examination. The original questionnaires consisted of 8 items regarding individual characteristics, as well as 2 satisfaction, 13 convenience, and 16 preference items (Supplement 1), but based on the results of exploratory factor analysis, 9 convenience and 9 preference items were selected for this study. Items were scored on a 5-point Likert scales (1, strongly disagree; 2, disagree; 3, neutral; 4, agree; 5, strongly disagree). The questionnaire was also administered on the tablet PC. The exam and questionnaire were not internet-based; instead, stand-alone tablet-based testing was used. After the examination and survey, the data in the tablet PCs were moved to a separate location and the responses were transferred to a server. The collected data comprised the test scores of the examinees (Supplement 1) and their responses to the survey questionnaire. Fig. 1 presents a diagram of the study process.

Fig. 1.

Diagram of the study process. KEMTLE, Korea Emergency Medicine Technician Licensing Examination.

Participants

A total of 569 examinees were included from the 41 emergency medicine technician schools in Korea, who were arbitrarily selected to be administered the practice test and questionnaire on the perceived acceptability of SBT. They were in their final year of study (i.e., third-year students from 3-year programs or fourth-year students from 4-year programs). The total annual enrollment in the 41 schools was 1,400 based on a national regulation; therefore, the 569 participants corresponded to 40.6% of the target population. The characteristics of the participants are presented in greater detail in Table 1. Of the 569 subjects who took the examination, 560 participated in the questionnaire survey. The validity test was conducted using responses from 162 students, and responses from the other 398 students were used for the null test.

Table 1.

Background of the 560 emergency medicine technician students who took the exam in 2015 in Korea and responded to the questionnaire

Background	Frequency (%)
Background	Survey validation (N=162)	t-test and generalized linear model analysis (N=398)
Gender
Male	91 (56.2)	227 (57.0)
Female	71 (43.8)	171 (43.0)
Age (yr)
20–24	127 (78.4)	312 (78.4)
≥ 25	35 (21.6)	86 (21.6)
Program length (yr)
3	98 (60.5)	222 (55.8)
4	64 (39.5)	176 (44.2)
Experience taking a computer-based test
Yes	74 (45.7)	161 (40.5)
No	88 (54.3)	237 (59.5)
Experience taking a smart device-based test
Yes	20 (12.3)	80 (20.1)
No	142 (87.7)	318 (79.9)
Experience using a smart device
Yes	160 (98.8)	380 (95.5)
No	2 (1.2)	18 (4.5)

Variables

The variables related to individual characteristics and perceived acceptability of SBT are listed in Tables 1-4. The examinees’ test scores were considered to be the outcome. The variables for individual characteristics were treated as dichotomous values. The variables for acceptability were on a 5-point Likert scale. Test scores were a continuous variable.

Table 2.

Results of exploratory factor analysis of the scale for the convenience of smart device-based testing features from 162 emergency medicine technician students in 2015 in Korea

Convenience of smart device-based testing	Factor
Convenience of smart device-based testing	1	2
It was convenient to check the items that were not solved before submitting the answers.	0.755	0.270
The functions for selecting the correct answer and changing the selection were convenient.	0.736	0.322
The functions for seeing the previous item, the next item, and the list of all items was convenient.	0.728	0.282
It was convenient to see one item on one screen.	0.604	0.292
The indication of the time remaining on the test was more convenient than announcements of the remaining time.	0.516	0.267
The screen user interface was appropriate.	0.261	0.849
The loading time was adequate for video playing.	0.300	0.705
Zooming in and out of the figure and replaying of video were appropriate.	0.329	0.669
The font was good.	0.367	0.580

Table 3.

Results of exploratory factor analysis of the scale for preferences for smart device-based testing features from 162 emergency medicine technician students in 2015 in Korea

Preference for smart device-based testing	Factor
Preference for smart device-based testing	1	2
Tablet PC-based testing is an improved system over existing desktop PC-based testing.	0.754	0.291
Finger touch input to the tablet PC was more convenient than mouse-click input to a desktop PC.	0.752	0.363
Cheating may be less likely when using a tablet PC than when using a desktop PC.	0.745	0.275
The test device was simple and convenient, allowing me to focus more on the test.	0.737	0.475
There was no noise or heat from the test device, so that the test environment was comfortable.	0.718	0.215
There was less eye strain with the tablet PC than with the desktop PC.	0.693	0.349
The psychological burden (tension) decreased because there was no answer-marking procedure on the OMR card.	0.236	0.772
The font size was large and clear, so that it was convenient for me to solve the problems.	0.332	0.771
The lack of an answer-marking procedure on the OMR card helped me allocate time for solving items.	0.389	0.724

OMR, optical mark reader.

Table 4.

The content, number of items, and Cronbach alpha coefficient of each scale in the questionnaire on the perceived acceptability of SBT administered to 162 emergency medicine technician students in 2015 in Korea (N=162)

Scale (description)	Content factor	No. of items	Mean±standard deviation	Cronbach alpha
Satisfaction with SBT		2	4.20 ± 0.88	0.875
Convenience of SBT (degree of convenience of SBT features)	All	9	4.30 ± 0.58	0.885
	1) Related to solving items	5	4.40 ± 0.62	0.836
	2) Related to user-interface	4	4.19 ± 0.66	0.849
Preference for SBT (preference for SBT compared to paper-and-pencil testing and computer-based testing)	All	9	4.20 ± 0.66	0.920
	1) Compared to computer-based testing	6	4.13 ± 0.73	0.905
	2) Compared to paper-and-pencil testing	3	4.35 ± 0.70	0.898

SBT, smart device-based testing.

Data sources/measurement

The source of all variables was response data from the survey questionnaire. The measurement methods were exploratory factor analysis for validity, the Cronbach alpha for reliability of the survey items on perceived acceptability of SBT, the t-test for the relationships of individual variables with perceived acceptability, and a GLM for the effects of variables related to individual characteristics and perceived acceptability on test scores.

Bias

There was no noteworthy source of bias in data collection or analysis. Nine of the 569 examinees did not respond to the acceptability questionnaire after SBT; this was low enough to have a negligible influence on the analysis.

Study size

The sample size (N= 569) corresponded to 40.6% of the total target student population, and examinees were drawn from 100% of the 41 emergency medicine technician schools; therefore, the sample size in this study was sufficient for the statistical analysis to be representative of the student population.

Quantitative variables

All variables were quantitative. They were subjected to a parametric analysis.

Statistical methods

Three procedures were conducted to test 2 null hypotheses. First, the survey questionnaire on the acceptability of SBT was validated and its reliability was confirmed; second, t-test analyses were performed to evaluate relationships between individual variables and perceived acceptability of SBT; and third, a GLM analysis was conducted to evaluate the effects of individual characteristics and perceived acceptability of SBT on test scores.

To confirm the validity of the questionnaire on the acceptability of SBT, exploratory factor analysis was conducted with the principal axis for the factor extraction method and varimax for factor rotation with 162 examinees. A total of 560 subjects were arbitrarily divided into 2 groups for survey validation (N= 162) and analysis using the t-test and GLM (N= 398). Reliability was assessed using the Cronbach alpha.

To test the null hypotheses, t-test analyses were performed with the results of the questionnaire on the acceptability of SBT and test scores according to the background variables of gender, age, type of university, and experience with CAT, SBT, and use of a tablet PC. Test scores on the KEMTLE were used as the dependent variable. The KEMTLE used for the practice test was composed of 130 items, including multimedia items and text items.

To determine the effect of individual characteristics and perceived acceptability of SBT on test scores, 3 different GLM models were analyzed using 3 different sets of test scores as dependent variables, with the same independent variables that were analyzed using the ttest. More specifically, GLM analyses were conducted of test scores on all 130 items (total scores), test scores on the 50 multimedia items, and test scores on the 80 text items. For this study, 14 variables were available: 6 categorical variables related to individual background characteristics, and 5 factors from the questionnaire regarding perceived acceptability of SBT and the 3 different types of test scores. The factors relating to perceived acceptability of SBT and the test scores were continuous variables. For the GLM analyses, examinees’ characteristics, which were used as independent variables, were selected based on the t-test results. Furthermore, 3 composites derived from the questionnaire on the acceptability of SBT were employed as independent variables (satisfaction with SBT, convenience of each of two SBT features, item solving, and the interface), as well as 2 factors related to preferences for SBT compared to paperand-pencil testing and compared to CBT. SAS ver. 9.4 (SAS Institute Inc., Cary, NC, USA.) was used for the analysis.

Results

Descriptive data of participants

Table 1 shows the number of examinees who responded to the survey questionnaire based on their background, subdivided according whether their responses were used for survey validation or the t-test and GLM analysis.

Outcome

The outcomes of this study were 6 variables related to individual characteristics, their perceived acceptability of SBT, and 3 sets of test scores from the practice examination (Supplement 1).

Validity and reliability of the acceptability questionnaire

Tables 2 and 3 present the results of exploratory factor analysis of the scale for the convenience of SBT features and the scale for preferences for SBT, respectively.

In addition to these 2 scales, overall satisfaction with using SBT was included in the SBT evaluation survey. The survey was composed of 3 scales: a scale for satisfaction with SBT (2 items), the scale for the convenience of SBT features (9 items), and the scale for preferences for SBT (9 items). The scale for the convenience of SBT features was composed of 2 factors (convenience related to item-solving, and convenience related to the user interface). The scale for preferences for SBT was also composed of 2 factors (preference for SBT compared to CBT and preference for SBT compared to paperand-pencil testing). Table 4 shows the description, the number of items, and the Cronbach alpha coefficient of each scale in the SBT evaluation survey. The range of reliability of scales and factors in each scale for the evaluation survey was 0.836 (convenience of SBT features relative to computer-based test) to 0.920 (preference for SBT). All scales and the factors in each scale showed strong internal consistency and a high level of reliability.

Descriptive statistics of 8 variables

Table 5 presents the descriptive statistics of the 8 variables related to test scores and the perceived acceptability of SBT.

Table 5.

Descriptive statistics of 8 variables related to test scores and perceived acceptability of SBT from 398 emergency medical technician students in Korea in 2015 (N=398)

Continuous variable	Min	Max	Mean ± standard deviation
Test scores
Total score (no. of items = 115)	33	115	77.65 ± 14.23
Score on multimedia items (no. of items = 50)	15	43	29.32 ± 5.44
Scores on text items (no. of items = 80)	18	75	48.33 ± 9.77
Satisfaction with SBT	1.00	5.00	4.01 ± 0.98
Convenience of SBT features
1) Solving items	2.20	5.00	4.35 ± 0.66
2) Interface	2.00	5.00	4.13 ± 0.71
Preference for SBT
1) Compared to paper-and-pencil testing	1.83	5.00	4.10 ± 0.74
2) Compared to computer-based testing	2.00	5.00	4.30 ± 0.72

SBT, smart device-based testing.

Effects of individual background characteristics on the SBT evaluation survey and test scores

Tables 6-10 show the mean values and standard deviations of the results of the SBT evaluation survey according to background variables and their t-test results. The mean results of the evaluation survey by each background category were higher than 3.81 (the mean score for satisfaction with SBT among examinees who had no experience of using a tablet PC) and examinees had high values of satisfaction with SBT, convenience of SBT features, and preference for SBT. The t-test analyses did not yield many statistically significant results. The mean differences in preferences for SBT (t= 2.132, degrees of freedom [df]= 396) and preference for SBT compared to paper-and-pencil testing (t= 2.076, df= 396) by gender were statistically significant at the level of 0.05. No other statistically significant results were found. The mean score for preferences for SBT among male participants (mean± standard deviation [SD], 4.23± 0.63) was higher than among female participants (mean± SD, 4.08± 0.66). Furthermore, the mean score for preference for SBT compared to paper-and-pencil testing among male participants (mean±SD, 4.36± 0.72) was higher than among female participants (mean± SD, 4.21± 0.73). The gender difference in preferences for SBT might have reflected gender differences in adaptability and favorable attitudes to using new information technology. Thus, for the GLM analyses, we needed to confirm whether preferences for SBT or gender affected test scores.