Bioequivalence data analysis for the case of separate hospitalization

Kyun-Seop Bae; Seung-Ho Kang

doi:10.12793/tcp.2017.25.2.93


Original Article

Translational and Clinical Pharmacology 2017; 25(2): 93-100.

Published online: 15 June 2017

DOI: https://doi.org/10.12793/tcp.2017.25.2.93

Kyun-Seop Bae¹, Seung-Ho Kang²

¹Department of Clinical Pharmacology and Therapeutics Asan Medical Center, University of Ulsan, Seoul 05505, Republic of Korea.

²Department of Applied Statistics, Yonsei University, Seoul 03722, Republic of Korea.

Correspondence: K. S. Bae; Tel: +82-2-3010-4611, Fax: +82-2-3010-4623, ksbae@amc.seoul.kr

Received 7 May 2017 Revised 30 May 2017 Accepted 1 June 2017

Open access: this article is distributed under a license identical to the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/).

Abstract

A bioequivalence study is usually conducted with same-day drug administration. However, hospitalization is occasionally separated for logistical, operational, or other reasons. Recently, there was a case of separate hospitalization because of difficulties in subject recruitment. This article suggests a better way of analyzing bioequivalence data in the case of separate hospitalization. The key features are (1) treating the hospitalization date as a random effect rather than a fixed effect and (2) using “PROC MIXED” instead of “PROC GLM” to include incomplete subject data.

Keywords: Bioequivalence, Separate hospitalization, Mixed effects model

Introduction

Determining a final model among many competitive models is usually not a matter of “right or wrong” but of “better or worse.” In other words, it is important to remember the famous statement by George Box, “All models are wrong, but some are useful.”

The result of a bioequivalence study with separate hospitalization was discussed at the Central Pharmaceutical Affairs Advisory Committee (CPAC) of the Korea Ministry of Food and Drug Safety (MFDS) in January 2017. The content of this article is the authors' opinion as expert advisors. The sponsor company agreed to the publication of this information and provided the data for this article.

Methods

A 2×2 bioequivalence study was planned to include 24 subjects in each of the two treatment sequence groups (48 subjects in total). The study required the subjects to undergo a long period of hospitalization with strict avoidance of sunlight exposure. Therefore, there were not many volunteers for this condition. Subject disposition is shown in Figure 1, and maximum concentration (C_max) data are listed in Table 1.

SAS® 9.4 was used for data analysis; the script for data loading and an explanation of variable names are specified in Figure 2. At least 10 data analysis models, from the most naïve to the most complex ones, were considered (Table 2).

Results

Model 1. Independent two-group t-test

Figure 3 summarizes the results of the independent two-group t-test, the most naïve approach. Equality of variances between the treatments could be assumed (p=0.8986), and the null hypothesis (i.e., there is no difference between the treatments) could not be rejected (p=0.8809). The width of the 90% confidence interval (CI) for the geometric mean ratio was 0.3784, which is relatively wide and makes this the least efficient of the methods presented in Table 2. Nevertheless, bioequivalence of the test treatment, within the limits of [0.8, 1.25], was observed. However, no regulatory body would accept this model as a final model: current regulatory guidelines require the final model of a bioequivalence study to include effects such as sequence, period, and a random subject effect nested within sequence.

Model 2. Conventional 2×2 model

If we ignore the effect of separate hospitalization (drug administration), the final model could be the conventional 2×2 crossover bioequivalence study model (Fig. 4). This model can only be used after the full model (considering the effect of separate hospitalization) is examined and when the additional effects such as hospitalization date can be ignored. This was the final model of the sponsor company after consulting a professor of statistics who advised that those insignificant additional effects (hospitalization and its interaction effects) could be removed.

Model 3A. Full model with administration (ADM) as fixed factor and period (PRD) nested within ADM

Figure 5 shows the result of this model. The interaction term between ADM and treatment was not significant (p = 0.1387), and many statisticians would agree to remove this term. The 90% CI (0.99480–1.34591) did not meet the bioequivalence limits, which was the main reason why the Korea MFDS convened the CPAC. In fact, the European Medicines Agency (EMA) prohibits this kind of analysis, but some CPAC members wanted this to be the final model or analysis.

Model 3B. Reduced model of 3A by removing the interaction term between ADM and treatment

After removing the insignificant interaction term, the CI (0.91159–1.13029) satisfied the bioequivalence criteria, and the analysis of variance (ANOVA) result was acceptable (Fig. 6). Many statisticians would be comfortable with this as a final model. A more simplified model, such as Model 2, would also be acceptable. The ANOVA table shows F values small enough to justify pooling further terms into the error term to increase the efficiency of the estimation, a practice explained in statistics textbooks on experimental design.[1] A rule of thumb for pooling is “F ≤1.” This model is the same one that the EMA suggested.[2]
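The acceptance rule applied to Models 3A and 3B can be sketched as below. This is an illustrative Python check, not part of the authors' SAS analysis; the function name `within_be_limits` is hypothetical, and the only inputs are the limits [0.8, 1.25] and the two CIs quoted in the text. It also shows that the limits are symmetric on the log scale, which is the scale on which the analysis is performed.

```python
import math

# Conventional bioequivalence limits for the geometric mean ratio (from the text).
BE_LOWER, BE_UPPER = 0.8, 1.25

def within_be_limits(lower: float, upper: float) -> bool:
    """Return True when the whole 90% CI lies inside [0.8, 1.25]."""
    return BE_LOWER <= lower and upper <= BE_UPPER

# 90% CIs quoted in the text for Models 3A and 3B.
ci_3a = (0.99480, 1.34591)
ci_3b = (0.91159, 1.13029)

# On the log scale the limits are symmetric around zero: ln(0.8) = -ln(1.25).
log_limits = (math.log(BE_LOWER), math.log(BE_UPPER))

print("log-scale limits:", log_limits)
print("Model 3A within limits:", within_be_limits(*ci_3a))
print("Model 3B within limits:", within_be_limits(*ci_3b))
```

The point estimate plays no role in the decision; only the two confidence limits are compared with the acceptance range.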

Model 4A. Full model with ADM as fixed factor and PRD not nested

The EMA suggests using Model 3B, in which PRD is nested within ADM. However, some may consider PRD as not nested. The ANOVA results are not much different (data not shown), and the CIs of this and the other models are summarized in Table 3. This model, along with all the following models, showed desirable ANOVA results. Note, however, that the upper limit of the 90% CI for Model 4A (1.31840) exceeds 1.25 in Table 3, as it does for Model 3A, so it is Model 4B and the following models that satisfied the bioequivalence criteria.

Model 4B. Reduced model of 4A by removing the interaction term between ADM and treatment

After removing the insignificant interaction term, the result met the bioequivalence criteria. The confidence limits are summarized in Table 3.

Models 5A–6B. Models considering ADM as a random factor and using PROC MIXED to include the subject data with PRD 1 only

Models 5A–6B correspond to Models 3A–4B, respectively, but use PROC MIXED instead of PROC GLM for the CI calculation. PROC MIXED used the data of subjects who dropped out after PRD 1, whereas PROC GLM did not. Another important difference is that these models treat ADM as a random factor, following the statistics textbook.[1] Models 5A and 5B seem controversial because some consider that a fixed factor (PRD) nested within a random factor (ADM) should itself be a random factor.[3] All models examined here showed satisfactory results and met the equivalence criteria. The CIs are summarized in Table 3. Model 6B was the most efficient model and showed the narrowest CI (Table 3). In addition, the point estimates of Models 3A and 4A appear biased compared with those of the other models.
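The efficiency comparison can be reproduced from the confidence limits in Table 3. The following is a minimal Python sketch for illustration (the article's own analysis is in SAS); the limits are transcribed from Table 3, and widths recomputed from these rounded limits may differ from the tabulated widths in the last digit.

```python
# (lower, upper) 90% CI limits transcribed from Table 3.
table3 = {
    "3A": (0.99480, 1.34591),
    "3B": (0.91159, 1.13029),
    "4A": (1.00848, 1.31840),
    "4B": (0.92013, 1.13556),
    "5A": (0.82907, 1.20483),
    "5B": (0.89733, 1.11318),
    "6A": (0.83938, 1.21055),
    "6B": (0.90713, 1.12014),
}

# Interval width = upper limit - lower limit; a narrower interval is more efficient.
widths = {model: upper - lower for model, (lower, upper) in table3.items()}

# Rank the models from the narrowest to the widest interval.
ranked = sorted(widths, key=widths.get)
for model in ranked:
    print(f"{model}: width = {widths[model]:.5f}")

narrowest = ranked[0]
print("narrowest CI:", narrowest)
```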

Discussion

All data acquired during the trial should be included if they increase precision and do not introduce additional bias. Thus, we suggest that using PROC MIXED is better than using PROC GLM. Several references comparing PROC MIXED and PROC GLM are available.[4,5,6]
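To make concrete how much data is at stake, the sketch below tallies the subjects in Table 1 by the number of periods with a C_max value. It is an illustrative Python count, not the authors' code; the dictionary name `periods_observed` and the 0/1/2 coding are assumptions introduced here, with subjects 13 and 44 coded 0 because Table 1 lists no values for them.

```python
# Number of periods with a C_max value per subject, transcribed from Table 1.
# 2 = both periods, 1 = period 1 only, 0 = no usable value.
periods_observed = {
    (1, "RT"): {"02": 2, "03": 2, "06": 2, "07": 2, "09": 2, "11": 2, "13": 0,
                "15": 2, "17": 2, "20": 2, "22": 2, "23": 2},
    (1, "TR"): {"01": 2, "04": 2, "05": 2, "08": 2, "10": 2, "12": 2, "14": 2,
                "16": 2, "18": 2, "19": 2, "21": 2, "24": 2},
    (2, "RT"): {"25": 2, "28": 2, "30": 2, "31": 2, "33": 2, "36": 2, "37": 2,
                "39": 2, "42": 2},
    (2, "TR"): {"26": 2, "27": 2, "29": 2, "32": 2, "34": 2, "35": 1, "38": 2,
                "40": 1, "41": 2, "43": 2},
    (3, "RT"): {"44": 0, "46": 2, "48": 2, "113": 2},
    (3, "TR"): {"45": 2, "47": 1},
}

# Complete subjects per (ADM, SEQ) cell.
complete = {cell: sum(n == 2 for n in subjects.values())
            for cell, subjects in periods_observed.items()}

# Subjects with period 1 data only.
period1_only = sorted(subj for subjects in periods_observed.values()
                      for subj, n in subjects.items() if n == 1)

print("complete subjects per cell:", complete)
print("total complete subjects:", sum(complete.values()))
print("period 1 only:", period1_only)
```

The subjects listed in `period1_only` are the ones whose data, according to the Results, PROC MIXED used and PROC GLM did not.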

Another point of discussion is whether to treat the drug administration (hospitalization) date, ADM, as a fixed or a random effect. We strongly suggest that this effect should be considered random, following the textbook[1] written by Sung Hyun Park, a professor of statistics at Seoul National University and president of the South Korean Academy of Science and Technology. Many other references also support this.[3,7,8,9,10,11] Table 4 summarizes the fixed versus random factor concept. For both fixed and random factors, randomization is easy for some (treatment for a fixed factor, drug bottle for a random factor) but difficult for others (sex for a fixed factor, hospitalization date for a random factor). Therefore, randomization is not a classification criterion.
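For readers who prefer a formula, Model 6B in Table 2 (fixed terms SEQ(ADM), PRD, and TRT; random terms ADM and SUBJ(ADM*SEQ)) can be written out as below. This is a sketch under assumptions: the subscripts and the symbols other than a_i and σ_A² (which follow Table 4) are introduced here, and the normality and independence of the random terms are the usual mixed-model assumptions rather than statements from the article.

$$
y_{ijmk} = \mu + \gamma_{j(i)} + \pi_{k} + \tau_{d(j,k)} + a_{i} + s_{m(ij)} + \varepsilon_{ijmk},
\qquad
a_{i} \sim N(0, \sigma_{A}^{2}),\quad
s_{m(ij)} \sim N(0, \sigma_{S}^{2}),\quad
\varepsilon_{ijmk} \sim N(0, \sigma_{e}^{2})
$$

Here y is LNCMAX, i indexes the hospitalization group (ADM), j the sequence (SEQ), m the subject within an ADM×SEQ cell, and k the period (PRD); d(j,k) is the treatment (TRT) received in period k of sequence j. Treating a_i as random means that only σ_A² is estimated, not the mean of each specific date.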

Precision or efficiency (small or minimum variance) is one of the criteria used to judge whether an estimation is good or not. If bias is not a problem, a more precise estimation will result in a narrower CI. As seen in Table 3, Model 6B was the most efficient (CI width, 0.21300), and Model 6B is likely to be less biased than Models 3A or 4A. A possible reason for Models 3A and 4A being biased and less efficient can be found in the following paragraph from the EMA[2]:

“A model which also includes a term for a formulation*stage interaction would give equal weight to the two stages, even if the number of subjects in each stage is very different. The results can be very misleading; hence, such a model is not considered acceptable. Furthermore, this model assumes that the formulation effect is truly different in each stage. If such an assumption were true, there is no single formulation effect that can be applied to the general population, and the estimate from the study has no real meaning.

Conclusion

...

3) A term for a formulation*stage interaction should not be fitted.”

“Formulation” and “stage” in the above passage are equivalent to “treatment” and “hospitalization,” respectively, in the present article.
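The equal-weighting problem described by the EMA can be illustrated with a small numerical sketch in Python. The group sizes are the numbers of complete subjects per hospitalization group counted from Table 1 (23, 17, and 4); the stage-specific effects are HYPOTHETICAL values chosen only to show the mechanism and are not results of the study. The size-weighted average is a simple contrast and is not claimed to be the estimator that any particular SAS procedure computes.

```python
import math

# Complete subjects per hospitalization group (ADM 1, 2, 3), counted from Table 1.
n_per_stage = [23, 17, 4]

# HYPOTHETICAL stage-specific treatment effects on the log scale (not study results):
# no difference in the two large groups, a large difference in the small one.
log_effect_per_stage = [0.0, 0.0, 0.45]

# Equal weighting: every stage counts the same regardless of its size.
equal_weight = sum(log_effect_per_stage) / len(log_effect_per_stage)

# Size weighting: every stage counts in proportion to its number of subjects.
size_weight = (sum(n * e for n, e in zip(n_per_stage, log_effect_per_stage))
               / sum(n_per_stage))

print(f"equal-weight ratio: {math.exp(equal_weight):.3f}")
print(f"size-weight ratio:  {math.exp(size_weight):.3f}")
```

With equal weighting, the four-subject group contributes one third of the estimate, so a single unusual subject in that group can move the overall ratio substantially.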

Many more models can be considered with different arrangements of effect terms. However, all important models are addressed here.

In retrospect, the third hospitalization should not have been conducted, because the sample size of the earlier two hospitalization groups appeared sufficient (post hoc power analysis indicated that 16 subjects/group achieved a power of 80%[12]), whereas the third hospitalization group was too small to be balanced. With one subject dropping out, the allocation ratio became 3:1. Therefore, one apparent outlier (subject ID: 48) had a strong influence on the third group, which in turn would have carried too much weight in the estimation had we used a fixed effect model. In contrast, the models treating ADM as a random effect were resistant to this kind of bias or outlier. In practice, we could neither assign or specify ADM at the time of protocol development or trial planning nor reproduce that date effect thereafter. Moreover, ADM could not (and should not) be of interest as a fixed effect (i.e., the level means of specific dates are not our concern). A very large inter-day variability compared with the treatment effect could be a concern for doctors. However, this was not the case (F <1). Therefore, the authors insist on the use of a random effect model for the hospitalization (or drug administration) date to increase efficiency and robustness. Table 5 compares PROC MIXED and PROC GLM to help in choosing a procedure.
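The influence of subject 48 can be seen directly from the raw values in Table 1. The sketch below, written in Python for illustration only, computes the within-subject test/reference C_max ratio for the four complete subjects of the third hospitalization group. It deliberately ignores period and sequence effects, so the ratios are descriptive and are not model estimates.

```python
import math

# C_max values of the complete subjects in hospitalization group 3 (Table 1),
# stored as (test, reference) regardless of the period in which each was given.
adm3 = {
    "45": (1016.63, 575.24),   # sequence TR
    "46": (231.11, 302.31),    # sequence RT
    "48": (816.28, 227.17),    # sequence RT
    "113": (797.59, 731.40),   # sequence RT
}

# Within-subject test/reference ratio on the original scale.
ratios = {subj: test / ref for subj, (test, ref) in adm3.items()}

# The most extreme subject is the one farthest from a ratio of 1 on the log scale.
most_extreme = max(ratios, key=lambda subj: abs(math.log(ratios[subj])))

for subj, ratio in ratios.items():
    print(f"subject {subj}: T/R = {ratio:.3f}")
print("most extreme subject:", most_extreme)
```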

Our prescriptive conclusions are summarized below from the highest to lowest priority:

1. Treat hospitalization date as a random factor
2. Use PROC MIXED rather than PROC GLM to use all acquired data
3. Do not nest period within hospitalization date

Acknowledgements

The authors would like to thank Dr. Sungpil Han for helping in proofreading and drawing figures.

This manuscript was initially invited by the EIC of TCP as a tutorial or opinion paper. However, at the authors' discretion, it was finally written in the format of an original article.

Notes

Conflict of Interest: The authors declare that they have no conflict of interest.

References

1. Park SH. Design of Experiments. 2nd ed. Seoul: Min-Young Sa;2003. p. 58–60. p. 107–109. p. 146–148.

2. EMA. Questions & Answers: positions on specific questions addressed to the Pharmacokinetics Working Party (PKWP). 2015. p. 32.

3. Smith MK. Inappropriately Designating a Factor as Fixed or Random. Accessed 1 May 2017. https://www.ma.utexas.edu/users/mks/statmistakes/fixedvsrandom.html.

4. SAS. SAS/STAT 9.3 User's Guide. SAS Institute;2011. p. 217–218.

5. SAS. SAS/STAT 14.1 User's Guide. SAS Institute;2015. p. 123.

6. Elliott AC, Woodward WA. SAS Essentials: Mastering SAS for Data Analytics. 2nd ed. Wiley;2015.

7. Galwey NW. Introduction to Mixed Modeling: Beyond Regression and Analysis of Variance. 2nd ed. Wiley;2014.

8. Haney SA, Bowman D, Chakravarty A, Davies A, Shamu C. An Introduction to High Content Screening: Imaging Technology, Assay Development, and Data Analysis in Biology and Drug Discovery. Wiley;2015.

9. Torbeck LD. Pharmaceutical and Medical Device Validation by Experimental Design. CRC Press;2017.

10. Aris VM. 8. Using microarrays to measure cellular changes induced by biomaterials. Characterization of biomaterials. Woodhead Publishing;2012.

11. Gibson D. Methods in Comparative Plant Population Ecology. 2nd ed. Oxford University Press;2014.

12. Diletti E, Hauschke D, Steinijans VW. Sample size determination for bioequivalence assessment by means of confidence intervals. Int J Clin Pharmacol Ther Toxicol. 1992; 30(Suppl 1):S51–S58. PMID: 1601532.

Figure 1

Subject disposition.

Figure 2

SAS script for data loading. ADM, hospitalization (drug administration) group code (1, 2, or 3); SEQ, treatment sequence group (RT, reference then test treatment; TR, test then reference treatment); PRD, period (1 or 2); TRT, treatment (T, test treatment; R, reference treatment); SUBJ, subject ID; C_MAX, maximum concentration (C_max) value in original scale; LNCMAX, Cmax value in natural log scale.

Figure 3

Results of the independent two-group t-test.

Figure 4

Result of conventional 2 × 2 model (Model 2). (a) ANOVA result, (b) 90% confidence interval. ANOVA, analysis of variance; SEQ, treatment sequence group; SUBJ, subject ID; PRD, period; TRT, treatment; PE, point estimate; LL, lower limit; UL, upper limit; WD, width of confidence interval.

Figure 5

Result of full model (Model 3A) with drug administration (ADM) as a fixed factor and period (PRD) nested within ADM. (a) ANOVA result, (b) 90% confidence interval. ANOVA, analysis of variance; SEQ, treatment sequence group; SUBJ, subject ID; TRT, treatment; PE, point estimate; LL, lower limit; UL, upper limit; WD, width of confidence interval.

Figure 6

Result of reduced model (Model 3B) with drug administration (ADM) as a fixed factor and period (PRD) nested within ADM. (a) ANOVA result, (b) 90% confidence interval. ANOVA, analysis of variance; SEQ, treatment sequence group; SUBJ, subject ID; TRT, treatment; PE, point estimate; LL, lower limit; UL, upper limit; WD, width of confidence interval.

Table 1

Maximum concentration (C_max) data before log transformation

Hospitalization or Drug Administration Group (ADM)	Sequence Group (SEQ): RT			Sequence Group (SEQ): TR		
	Subject ID (SUBJ)	Period (PRD) 1 (Reference)	Period (PRD) 2 (Test)	Subject ID (SUBJ)	Period (PRD) 1 (Test)	Period (PRD) 2 (Reference)
1	02	506.42	596.23	01	351.85	530.60
	03	295.81	335.76	04	681.67	751.05
	06	450.59	251.70	05	601.97	645.09
	07	394.44	357.95	08	226.18	204.77
	09	585.16	300.40	10	420.29	563.72
	11	414.42	877.16	12	177.30	183.03
	13	Dropped		14	687.42	1010.04
	15	564.39	478.58	16	453.37	316.43
	17	161.49	156.34	18	1387.18	1021.87
	20	648.87	661.65	19	165.27	143.67
	22	754.37	475.66	21	613.72	362.84
	23	437.20	378.81	24	329.92	322.86
2	25	919.83	382.16	26	509.45	338.34
	28	541.73	606.97	27	504.76	327.09
	30	175.83	310.46	29	929.18	641.00
	31	363.42	536.39	32	410.74	434.10
	33	510.25	421.44	34	421.18	351.56
	36	251.42	203.29	35	168.70	Dropped
	37	457.28	440.53	38	786.90	1410.20
	39	362.80	205.46	40	252.79	Dropped
	42	253.98	200.54	41	1338.45	1403.20
				43	584.43	379.52
3	44	Protocol Violation		45	1016.63	575.24
	46	302.31	231.11	47	378.18	Dropped
	48	227.17	816.28
	113*	731.40	797.59

*Subject 113 is the replacement for subject 13.

Table 2

Ten models for data in Table 1

Model No.	Description	SAS Script
1	Independent two-group t-test	PROC TTEST DIST=LOGNORMAL ALPHA=0.1; CLASS TRT2; VAR CMAX;
2	Conventional 2×2 model	PROC GLM; CLASS SEQ PRD TRT SUBJ; MODEL LNCMAX = SEQ SUBJ(SEQ) PRD TRT; RANDOM SUBJ(SEQ) / TEST; LSMEANS TRT /PDIFF=CONTROL('R') CL ALPHA=0.1;
3A	Full model with ADM as fixed factor and PRD nested within ADM	PROC GLM; CLASS ADM SEQ PRD TRT SUBJ; MODEL LNCMAX = ADM SEQ(ADM) SUBJ(ADM*SEQ) PRD(ADM) ADM*TRT TRT; RANDOM SUBJ(ADM*SEQ) / TEST; LSMEANS TRT /PDIFF=CONTROL('R') CL ALPHA=0.1;
3B	Reduced model of 3A removing ADM*TRT	PROC GLM; CLASS ADM SEQ PRD TRT SUBJ; MODEL LNCMAX = ADM SEQ(ADM) SUBJ(ADM*SEQ) PRD(ADM) TRT; RANDOM SUBJ(ADM*SEQ) / TEST; LSMEANS TRT /PDIFF=CONTROL('R') CL ALPHA=0.1;
4A	Full model with ADM as fixed factor and PRD not nested	PROC GLM; CLASS ADM SEQ PRD TRT SUBJ; MODEL LNCMAX = ADM SEQ(ADM) SUBJ(ADM*SEQ) PRD ADM*TRT TRT; RANDOM SUBJ(ADM*SEQ) / TEST; LSMEANS TRT /PDIFF=CONTROL('R') CL ALPHA=0.1;
4B	Reduced model of 4A removing ADM*TRT	PROC GLM; CLASS ADM SEQ PRD TRT SUBJ; MODEL LNCMAX = ADM SEQ(ADM) SUBJ(ADM*SEQ) PRD TRT; RANDOM SUBJ(ADM*SEQ) / TEST; LSMEANS TRT /PDIFF=CONTROL('R') CL ALPHA=0.1;
5A	Full model with ADM as random factor and PRD nested within ADM	PROC MIXED; CLASS ADM SEQ TRT SUBJ PRD; MODEL LNCMAX = SEQ(ADM) PRD(ADM) TRT; RANDOM ADM SUBJ(ADM*SEQ) ADM*TRT; ESTIMATE 'T VS R' TRT -1 1 / CL ALPHA=0.1;
5B	Reduced model of 5A removing ADM*TRT	PROC MIXED; CLASS ADM SEQ TRT SUBJ PRD; MODEL LNCMAX = SEQ(ADM) PRD(ADM) TRT; RANDOM ADM SUBJ(ADM*SEQ); ESTIMATE 'T VS R' TRT -1 1 / CL ALPHA=0.1;
6A	Full model with ADM as random factor and PRD not nested	PROC MIXED; CLASS ADM SEQ TRT SUBJ PRD; MODEL LNCMAX = SEQ(ADM) PRD TRT; RANDOM ADM SUBJ(ADM*SEQ) ADM*TRT; ESTIMATE 'T VS R' TRT -1 1 / CL ALPHA=0.1;
6B	Reduced model of 6A removing ADM*TRT	PROC MIXED; CLASS ADM SEQ TRT SUBJ PRD; MODEL LNCMAX = SEQ(ADM) PRD TRT; RANDOM ADM SUBJ(ADM*SEQ); ESTIMATE 'T VS R' TRT -1 1 / CL ALPHA=0.1;

ADM, hospitalization and drug administration group code (1, 2, or 3); SEQ, treatment sequence group (RT, reference then test treatment; TR, test then reference treatment); PRD, period (1 or 2); TRT, treatment (T, test treatment; R, reference treatment); SUBJ, subject ID; LNCMAX, maximum concentration (C_max) value in natural log scale.

Table 3

Comparison of 90% confidence intervals

Hospitalization Date (ADM)	Period (PRD)	ADM*TRT Interaction Term	Model No.	Point Estimate	Lower Limit	Upper Limit	Interval Width
As Fixed Factor^a)	Nested	Present	3A	1.15711	0.99480	1.34591^b)	0.35111
	Nested	Removed	3B	1.01507	0.91159	1.13029	0.21870
	Not nested	Present	4A	1.15307	1.00848	1.31840^b)	0.30992
	Not nested	Removed	4B	1.02219	0.92013	1.13556	0.21542
As Random Factor^a)	Nested	Present	5A	0.99945	0.82907	1.20483	0.37576
	Nested	Removed	5B	0.99945	0.89733	1.11318	0.21585
	Not nested	Present	6A	1.00802	0.83938	1.21055	0.37117
	Not nested	Removed	6B	1.00802	0.90713	1.12014	0.21300^c)

^a) Fixed factor Models (3A–4B) used PROC GLM, and random factor Models (5A–6B) used PROC MIXED. ^b) These values do not satisfy the bioequivalence criteria. ^c) The narrowest and most efficient confidence interval. ADM, drug administration; TRT, treatment.

Table 4

Fixed vs. random factors

	Fixed Factor	Random Factor
Characteristics	Factors could have some unique level values (male, female) or experimenters could assign that level (treatment A, treatment B). Some can be randomized.	Level values are picked among many possible values. Those are not necessarily randomized.
Example	Treatment, Sex, Ethnicity, Season as an idealized one, Relatively permanent and small number of machines	Each patient (subject), Hospitalization date, Drug administration date, Drug bottle, Source barrel, Temporary machines, Some of many machines
Level means and differences after ANOVA (post hoc analysis)	Those can be estimated and tested.	Those should not be estimated nor tested. Only the size of variability (variance) is a concern and should be estimated.
Expectation of a level (a_i)	E(a_i) = a_i	E(a_i) = 0
Variance of a level (a_i)	Var(a_i) = 0	Var(a_i) ≠ 0
Summation of level effects	Σa_i = 0, a̅ = 0	Σa_i ≠ 0, a̅ ≠ 0
Variability among k levels, Variability of a_i	$\sigma_{A}^{2} = \sum_{i=1}^{k} a_{i}^{2} / (k - 1)$	$\sigma_{A}^{2} = \sum_{i=1}^{k} (a_{i} - \bar{a})^{2} / (k - 1)$

Table 5

Usage of PROC MIXED and PROC GLM

		Hospitalization Date	
		Fixed Factor	Random Factor
Dataset	Complete Subjects Only	PROC GLM or MIXED (current practice)	PROC MIXED
Dataset	All Data	PROC MIXED	PROC MIXED (authors' suggestion)
