Background/rationale
Blended learning is increasingly recognized as a pivotal educational paradigm in the field of health professions education [1-4]. Blended learning programs are educational interventions that integrate synchronous learning modalities (e.g., real-time, in-person or videoconferencing sessions) with asynchronous learning modalities (e.g., pre-recorded online modules and learning management systems) [5,6]. In so doing, blended learning programs offer more flexible and personalized learning experiences than traditional face-to-face or fully online learning modalities [1-6]. Learners engaged in blended learning report greater control over the content, sequence, pace, and timing of their learning, often leading to more meaningful educational experiences [1-6]. Educators adopting a blended learning approach can teach knowledge-building content (i.e., memorization-focused material) through asynchronous modules and skill-building content (i.e., practical, experience-based learning) in synchronous sessions [3-6]. This flexibility can enhance learner engagement, satisfaction, and educational outcomes [1-4].
Despite these benefits, evaluations of blended learning programs in health professions education remain haphazard, hindering quality improvement, scaling, and systematic comparisons [5]. One challenge is that evaluative terminology is often undefined and poorly conceptualized across health professions education (e.g., some studies treat an increase in learner satisfaction as sufficient evidence of a program’s effectiveness, whereas others accept only an increase in post-intervention test scores) [5]. Additionally, although questionnaires are the most widely used approach to evaluating blended learning programs in health professions education, most are not designed or validated for this purpose (e.g., many studies rely on their institution’s generic end-of-course questionnaire as a baseline measure for program evaluation) [5]. Recently, evaluation scholarship has turned toward the construct of “usability” to support comprehensive and meaningful evaluations of blended learning programs [5,6].
Usability, as perceived by learners, is a multidimensional construct that encompasses the following domains: effectiveness, efficiency, satisfaction, accessibility, organization, and the overall experience of engaging with a product, technology, and/or service [6]. Usability thus goes beyond simply measuring “ease of use” to comprehensively evaluate the quality of systems, products, and services [5,6]. Although usability has been widely applied to the evaluation of e-learning programs, it has rarely been applied to blended learning programs [5,6]. This gap likely reflects the added complexity of blended learning programs (i.e., content spread across different learning modalities), compared with the more straightforward evaluation of online-only learning settings (e.g., the accessibility and organization of a learning management system).
To enable rigorous usability-focused evaluations of blended learning programs in health professions education, the Blended Learning Usability Evaluation–Questionnaire (BLUE-Q) was developed [6,7]. To date, content validity (i.e., whether items are understandable, meaningful, and comprehensive, and whether sufficient item-domain correlation exists) for the BLUE-Q has been established through a Bayesian questionnaire validation approach with medical and health science faculty members [7]. However, other evidence for the BLUE-Q’s construct validity (i.e., the degree to which a tool measures the theoretical construct it intends to assess) [8,9], including reliability evidence (i.e., the degree to which a tool is free from random error) [9,10], as established through real-world application of the BLUE-Q with learners, remains unexplored.
Importantly, in recent years, mixed methods have gained traction as an effective approach to establishing construct validity [8]. Specifically, integrating quantitative and qualitative data generates a more comprehensive understanding of how users conceptualize and rate the theoretical construct underlying the tool being validated. This breadth of data overcomes the limitations of traditional validation methods, which are primarily quantitative and therefore often fall short in capturing the nuances and contextual factors that are critical to understanding a construct’s full meaning and relevance across diverse settings and populations [8]. Pairing mixed methods construct validity evidence with reliability evidence deepens the evaluation of educational tools by working toward both statistical rigor and real-world applicability.