INTRODUCTION
Multiple methods of feedback exist, including downward feedback, upward feedback, peer feedback and self-evaluation. The most commonly known form is downward appraisal, in which the supervisor gives feedback to the subordinate [1]. However, upward feedback, in which feedback is given by the subordinate to the supervisor, is becoming more widely recognized and adopted, especially in the private sector. It has been reported that over 90% of Fortune 100 companies in the United States participate in some form of upward feedback [1]. The role of upward feedback has also been widely acknowledged within the educational sector, where students give feedback to their lecturers [2-7]. Within medical training, the General Medical Council (GMC) in the United Kingdom has adopted upward feedback to monitor teaching performance for quality control purposes [8]. Although upward feedback has been advocated by the GMC, it is not immune from bias, and there has been much debate about its accuracy [9-17]. This systematic review has been prompted by the increasingly significant role of upward feedback as medical training becomes more closely regulated. Bias within upward feedback could potentially skew feedback on medical training, and this review aims to identify the factors responsible.
METHODS
Search strategy
In order to obtain a comprehensive overview of the literature on upward feedback, a total of 35 databases were searched (Embase, Medline, PsycINFO, Cochrane and EBM Reviews, Allied and Complementary Medicine, CAB and ATLA Religion Database, EconLit, GeoBase, Global Health, Health and Psychosocial Instruments, HMIC Health and Management, Index to Foreign Legal Periodicals, International Pharmaceutical Abstracts, Maternity and Infant Care, The Philosopher’s Index, Social Policy and Practice, Zoological Records, BNI, CINAHL, Health Business Elite, ERIC, British Educational Index, ASSIA, Web of Knowledge, Social Care Online, Sage Full Text Journals, IBBS, National Research Register Archive, Proquest, Wiley Online Library, Taylor and Francis, Engineering Village, Scopus, Science Direct, PubMed). A stratified search involving multiple keywords was used (Fig. 1).
Searches were initially conducted across all fields. If more than 1,000 results were returned, the search was repeated within keywords, then within the abstract and finally within the title, in order to narrow the results to fewer than 1,000 articles. Result sets of fewer than 1,000 articles were reviewed by reading the abstracts, and relevant abstracts were shortlisted. If no abstract was available but the title appeared relevant, the article was provisionally shortlisted until further information could be obtained from the full text. Further references were found by reviewing the bibliographies of the shortlisted articles.
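The stepwise narrowing procedure described above can be sketched in pseudocode-like Python. This is an illustrative sketch only, not the authors' actual tooling: the `run_query` interface and field names are assumptions for the purpose of illustration.

```python
# Illustrative sketch of the stratified narrowing procedure: a query is
# re-run against progressively narrower fields until it returns fewer
# than 1,000 records, at which point abstracts are screened manually.

FIELDS = ["all_fields", "keywords", "abstract", "title"]  # broad -> narrow
MAX_RESULTS = 1000

def narrow_search(run_query, query):
    """run_query(query, field) -> list of records (assumed interface).

    Returns the first field level whose result set is below the
    threshold, together with its results; if even the title-only
    search exceeds the threshold, the title results are returned.
    """
    results = []
    for field in FIELDS:
        results = run_query(query, field)
        if len(results) < MAX_RESULTS:
            return field, results
    return FIELDS[-1], results
```

In the review, result sets below the threshold were then screened by abstract (or by title when no abstract was available).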
Inclusion and exclusion criteria
Both medical and non-medical articles written in English were included. No time limit was set. Books were excluded from the search.
Data management techniques
A proforma was developed to allow efficient and relevant data extraction. This included: study method (e.g., observational or review article), profession, type of participant, geographical location, purpose of feedback (e.g., summative or formative), feedback subject (e.g., trainer, training or environment), qualitative/quantitative feedback, the use of controls and type of intervention involved (e.g., counseling, timing of feedback), type of feedback used (e.g., paper survey, semi-structured interviews), quality of questions (e.g., closed, open), duration of study, number of participants, response rates, types of bias present (overt and implied), Kirkpatrick level [18] and whether outcomes were addressed.
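The proforma items listed above can be represented as a single extraction record per study. The sketch below is a hypothetical representation only; the field names and types are illustrative assumptions, not the authors' actual instrument.

```python
# Hypothetical data-extraction record mirroring the proforma items
# described in the text; names and types are illustrative.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class ExtractionRecord:
    study_method: str                 # e.g., "observational", "review"
    profession: str
    participant_type: str             # e.g., "undergraduate", "postgraduate"
    location: str
    feedback_purpose: str             # "summative" or "formative"
    feedback_subject: str             # e.g., "trainer", "training", "environment"
    is_quantitative: bool
    used_controls: bool
    intervention: Optional[str]       # e.g., "counseling", "timing of feedback"
    feedback_type: str                # e.g., "paper survey", "semi-structured interview"
    question_style: str               # e.g., "closed", "open"
    duration_months: Optional[float]
    n_participants: Optional[int]
    response_rate: Optional[float]
    biases: list = field(default_factory=list)  # overt and implied
    kirkpatrick_level: int = 1                  # per Kirkpatrick [18]
    outcomes_addressed: bool = False
```

Structuring extraction this way makes it straightforward to tabulate counts such as those reported in the Results (e.g., studies per profession or per Kirkpatrick level).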
RESULTS
Literature search and selection
A total of 8,914 potential articles were found using the search strategy (Fig. 1), from which 291 articles were shortlisted. The shortlisted articles were then pooled and duplicates removed, leaving 169 articles. Reviewing the bibliographies of these shortlisted articles yielded a further 70 articles, giving a total of 239 shortlisted references. After full review, 35 articles were excluded from further analysis: 10 were not relevant to the objective, 1 was a book, complete versions could not be obtained for 21, 2 were not written in English and 1 was a duplicate of another shortlisted reference published under a different title. This led to a total of 204 articles being analyzed, all of which are presented in Table 1.
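The selection arithmetic above can be checked step by step; the snippet below simply re-derives the reported totals from the stated counts.

```python
# Arithmetic check of the article-selection flow reported in the text.
shortlisted_after_dedup = 169       # 291 shortlisted, duplicates removed
from_reference_lists = 70           # found via bibliographies
total_shortlisted = shortlisted_after_dedup + from_reference_lists  # 239

exclusions = {
    "not relevant to objective": 10,
    "book": 1,
    "full text unobtainable": 21,
    "not in English": 2,
    "duplicate under different title": 1,
}
n_excluded = sum(exclusions.values())          # 35
n_analyzed = total_shortlisted - n_excluded    # 204
```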
Demographics
More than 50% of the references were related to the medical profession (n=109). Other professions that have commonly utilized upward feedback include teaching and education (n=39), nursing (n=22) and management (n=18). The majority of references included postgraduate participants (n=106). Thirteen references included both undergraduate and postgraduate participants. A large proportion of references were from North America (Fig. 2).
Types of studies and feedback
Studies were categorized according to the definitions in Table 2. Most references were evaluation studies (n=176) and most studies were done for formative purposes (n=172). A large majority of studies were quantitative (n=152) and a high proportion used paper surveys as a means of evaluating upward feedback (n=124). Most studies (n=162) only covered Kirkpatrick level 1, reaction. The median response rate was 76%, the median number of participants was 198 and the median duration of the study was 6 months. Only one-third of references addressed the outcomes of their study by developing an action plan. Furthermore, only 11 studies used controls to compare different interventions (Fig. 3).
Types of bias
Bias was separated into implied and overt bias. Implied bias involves factors that could potentially affect the upward feedback process but were not explicitly acknowledged within the article. Overt bias includes factors affecting the upward feedback process that were mentioned within the article. A summary of the different types of bias found in this systematic review can be found in Table 3. Accountability and confidentiality were the most commonly recognized biases within the references. On the other hand, the method of feedback, which includes the type of survey, the location, the use and methodology of reminders and the duration, was most commonly implied within articles but not explicitly acknowledged (Table 4).
DISCUSSION
This review shows that multiple sources of bias in the use of feedback to assess training quality have already been described.
Feedback philosophy
Although there has been extensive research on upward feedback within an undergraduate classroom setting [2-7,9,19-46], the high proportion of references related to the medical profession and to postgraduate participants confirms the popularity of upward feedback in postgraduate medical training. The majority used surveys for formative purposes, which can provide the trainer/teacher with guidance on their current performance. The lack of studies for summative purposes could be due to raters tending to be over-lenient when upward feedback is used for administrative purposes [14,17,39]. In contrast, Smith and Fortunato [16] found that rating purpose did not affect intentions to provide honest ratings, since raters may use the purpose as a tool to retaliate against or reward their supervisors. Upward feedback could potentially be used as a tool to develop clinical trainers and to give clinical educators guidance on their own career plans [47]. However, the effectiveness of upward feedback could be confounded by multiple factors, which are discussed below. Most studies only evaluated Kirkpatrick level 1 (reaction), which mostly involved surveying subordinates' views on certain topics. Only 10 studies covered Kirkpatrick level 4 (outcomes) [1,4,5,38,44,48-52]. The majority of studies did not address the consequences or results of the study. This could be because it is difficult to develop specific action plans based on Kirkpatrick level-one evidence. Furthermore, very few studies specifically compared the different factors or their effect on feedback quality.
Study administration
Upward feedback usually involves subordinates appraising their superiors or their training, hence it is not surprising that the majority of studies were evaluation studies. Only one study was a randomized controlled trial, which stratified participants into 3 groups (online survey, simultaneous paper and online survey, sequential online and paper survey) [53]. This study found that the sequential survey method, in which online and paper surveys were administered at different times, gave the highest response rate but increased costs [53]. The small number of studies involving controls could be due to time and financial constraints. Controlled trials of educational interventions are rare, but more studies may need to include controls if we are to assess the efficacy of the different interventions. Without evidence for the effectiveness of interventions, it may be difficult for trainers to accept upward feedback from their subordinates. Tews and Tracey [49] showed that managers who participated in either self-coaching courses or an upward feedback intervention improved their interpersonal scores compared to controls. Managers who participated in the upward feedback training scored higher overall [49]. This could be due to the fact that upward feedback, if utilized appropriately, can facilitate information sharing, act as a refresher to avoid complacency and promote further development of skills [48]. Another form of support in upward feedback was the use of feedback reports, as demonstrated in the study by Smither et al. [54]. Feedback reports enabled managers to improve their managerial skills and also encouraged communication with their subordinates. However, adequate support with regular formal feedback to facilitate the process [48] may be difficult to orchestrate in medical training, where clinical educators work shift patterns. Moreover, the costs of facilitating upward feedback support may be quite high.
It is only in recent years, as the internet has become widely accessible, that online surveys have become more commonly utilized, which explains why paper surveys were still the most commonly used feedback method within this review. Online surveys are cheaper and easier to administer than paper surveys and allow people to complete the survey at a time that is convenient for them [55]. The study by Scott et al. [53] showed that although trainee doctors did not give the highest response rates overall, they gave the highest response rate when the survey was online. This may suggest an increasing role for online surveys among the newer generation of doctors. Furthermore, using online surveys to monitor training and trainers could make the data more representative of the population of doctors in training.
Human factors in upward feedback bias
Affect describes the feeling of liking someone [56,57]. Affect is thought to lead to leniency because it can impair one's ability to evaluate someone objectively and rationally [58]. Al-Issa [9] found that students gave higher ratings to teachers with whom they got along. Moreover, Antonioni and Park [56] showed that this leniency was more pronounced in both peer and upward feedback than in downward feedback, suggesting that affect may play a role in both. In contrast, a study by Ryan et al. [59] found that recipients of feedback were more likely to accept feedback from those with whom they were already acquainted, a finding confirmed in another study [60]. This could suggest that supervisors may be more accepting of honest feedback, which in turn may encourage subordinates who have a positive relationship with their supervisors to give honest feedback.
Antonioni [61] found that participants who were not anonymous when giving upward feedback gave higher ratings than anonymous participants. Furthermore, fewer participants stayed in the study after finding out they were in the group that could be identified [61]. However, this study was conducted within an insurance company, where upward feedback could potentially be used for summative purposes. This could lead to greater rating inflation in order to minimize negative consequences. In contrast, upward feedback in medical training is more likely to be formative, aimed at the further development of the clinical educator. Many studies have kept upward feedback responses confidential because of the potential for rating inflation [3,4,7,12-15,17,22-24,26,28,34-39,43-45,47-50,52-55,57,58,61-142]; hence accountability and confidentiality were the most commonly acknowledged types of bias found within this systematic review. In contrast, Roch and McNall [67], who investigated whether anonymity affected ratings, found that students who were not anonymous actually gave lower ratings than anonymous raters. Non-anonymous raters may feel more pressure to give high-quality ratings [67], so there may still be a role for surveys in which subordinates are accountable for their ratings. Furthermore, supervisors seem to be more accepting of accountable surveys [61]. Unfortunately, in potentially negative situations, anonymity seems likely to be the best policy.
Reward anticipation could be related to evaluation inflation. Previous studies have found that course grades can significantly predict student ratings [7,9], but the causation is unclear. Marsh and Roche [30] found that giving high grades was not related to higher student evaluations; instead, much of the variation in student evaluations could be accounted for by prior subject interest, higher and more challenging workloads, and learning. Furthermore, Abrami et al. [6] found that student grades were unlikely to have an effect on student ratings. The relationship between reward and ratings has been inconsistent and is subject to interpretation, hence the need for further research in this area.
Even if confidentiality concerns are addressed, participation may still be affected by fear of retaliation [10,12,15,61,62,132]. A mismatch between self-perception and upward feedback results could affect the acceptability and credibility of upward feedback, since it threatens self-esteem [143]. Multiple factors can affect people's receptivity to feedback, including their motivation, fear and expectations [60]. However, if feedback is delivered appropriately and is perceived as valuable, the risk of negative emotions and dismissal of the feedback can be minimized [60]. This is likely to require specialist input (e.g., counseling), which may have additional cost implications.
A lack of trust and cynicism was not an uncommon finding in both medical [45,53,55,58,137,142,144-147] and non-medical feedback [5,9,15-17,21,26,38-40,52,61,67,70,71,75,81,82,91,148-150]. If there is a discrepancy between self-ratings and upward feedback ratings [128,145], the recipient may not find the feedback credible. Poorly designed surveys that lack useful feedback can also lead to reluctance to change. Even trainees question the credibility of some of the feedback provided by their supervisors [151], hence supervisors are likely to do the same with feedback from trainees. Moreover, upward feedback, especially in an undergraduate setting, has been compared to a 'popularity contest.' A review by Aleamoni [46] demonstrated that the evidence supports students' ability to judge the effectiveness of teaching. However, attitudes are harder to modify, and this misperception may still make faculty more resistant to change. This resistance could in turn affect raters' enthusiasm, especially if previous experiences of upward feedback led to no improvement.
Limitations
Although a comprehensive search was conducted, it may not be representative of all the available data on upward feedback. In addition, 35 of the articles shortlisted in the systematic review were not included in the results, so other types of bias could be present in literature that was not reviewed here. Moreover, although we have identified a number of different biases involved in upward feedback, we have not investigated how these biases can be minimized. Further research will be required to determine whether these biases are interrelated and whether it is possible to minimize the effects of the different biases, especially human factors.
CONCLUSION
Upward feedback is a multidimensional form of feedback that can lead to improvement if facilitated and implemented appropriately. This systematic review has shown that multiple different types of bias can exist within upward feedback. The established literature acknowledges and suggests likely causes of bias without thoroughly investigating their effect on feedback quality. This highlights the importance for those who manage training of considering factors such as survey method and intended use when designing and interpreting feedback. Currently, a mixed approach with triangulation of methods seems to be the best way to evaluate medical training. Further research is required to evaluate which types of bias are associated with specific survey characteristics and which factors are potentially modifiable.