Statistical Monitoring in Clinical Trials: Best Practices for Detecting Anomalies Suggestive of Fabrication or Misconduct

Publication Type
Journal Article
Authors
Knepper, D; Lindblad, A; Sharma, G; Gensler, G; Manukyan, Z; Matthews, A; Seifu, Y
Journal
Therapeutic Innovation and Regulatory Science
Keywords
central monitoring; fabrication; fraud; misconduct; risk-based monitoring; statistical monitoring; TransCelerate


Traditional site-monitoring techniques are not optimal for detecting data fabrication and other nonrandom data distributions, which have the greatest potential to jeopardize the validity of study results. TransCelerate BioPharma conducted an experiment to test the utility of statistical methods for detecting implanted fabricated data and other signals of noncompliance.


TransCelerate tested statistical monitoring on a data set from a chronic obstructive pulmonary disease (COPD) clinical study with 178 sites and 1554 subjects. Clinicians with expertise in COPD selectively implanted fabricated data in 7 sites and 43 subjects. The data set was partitioned to simulate studies of different sizes. Analyses of vital signs, spirometry, visit dates, and adverse events included distributions of standard deviations, correlations, repeated values, digit preference, and outlier/inlier detection. An interpretation team, including clinicians, statisticians, site monitors, and data managers, reviewed the results and created an algorithm to flag sites for fabricated data.
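
As an illustration of one of the screening analyses named above, digit preference, the following is a minimal sketch and not the algorithm used in the study. It assumes that terminal digits of a measured quantity should be roughly uniform and compares them with a chi-square statistic against the 5% critical value for 9 degrees of freedom. The function names and the blood pressure readings are hypothetical.

```python
from collections import Counter

# Critical value of the chi-square distribution with 9 degrees of freedom
# at the 5% level (10 digit categories minus 1).
CHI2_CRIT_DF9_05 = 16.919


def terminal_digit_chi2(values):
    """Chi-square statistic of terminal digits against a uniform distribution."""
    digits = [abs(int(v)) % 10 for v in values]
    expected = len(digits) / 10  # uniform expectation per digit
    counts = Counter(digits)
    return sum((counts.get(d, 0) - expected) ** 2 / expected for d in range(10))


def flag_digit_preference(values, critical=CHI2_CRIT_DF9_05):
    """Return True when terminal digits deviate from uniform at the 5% level."""
    return terminal_digit_chi2(values) > critical


# Hypothetical systolic blood pressure readings (mmHg), 50 per site.
# Site A records only values ending in 0 or 5; site B uses every digit equally.
site_a = [110 + 5 * (i % 10) for i in range(50)]
site_b = list(range(110, 160))

print(terminal_digit_chi2(site_a), flag_digit_preference(site_a))
print(terminal_digit_chi2(site_b), flag_digit_preference(site_b))
```

A flag from a screen like this is only a signal for review. Rounding to 0 or 5 can reflect measurement practice rather than fabrication, and the chi-square approximation is unreliable when the expected count per digit is small.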


The algorithm identified 11 sites (19%), 19 sites (31%), 28 sites (16%), and 45 sites (25%) as having potentially fabricated data for studies 2A, 2, 1A, and 1, respectively. For study 2A, 3 of 7 sites with fabricated data were detected, 5 of 7 were detected for studies 2 and 1A, and 6 of 7 for study 1. Except for study 2A, the algorithm had good sensitivity and specificity (>70%) for identifying sites with fabricated data.
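
The sensitivity and specificity figures can be checked from the reported counts. The sketch below does this for study 1 only, under two assumptions not stated explicitly in the text: that study 1 is the full 178-site data set (consistent with 45 flagged sites being 25%), and that the 6 detected sites are among the 45 flagged sites. The function name is hypothetical.

```python
def sensitivity_specificity(n_sites, n_fabricated, n_flagged, n_detected):
    """Site-level sensitivity and specificity from reported counts."""
    tp = n_detected                      # fabricated sites that were flagged
    fn = n_fabricated - n_detected       # fabricated sites that were missed
    fp = n_flagged - n_detected          # clean sites that were flagged
    tn = n_sites - n_fabricated - fp     # clean sites that were not flagged
    return tp / (tp + fn), tn / (tn + fp)


# Study 1: assumed 178 sites, 7 with fabricated data, 45 flagged, 6 detected.
sens, spec = sensitivity_specificity(178, 7, 45, 6)
print(f"sensitivity {sens:.1%}, specificity {spec:.1%}")
```

Under these assumptions, sensitivity is 6/7 (about 86%) and specificity is 132/171 (about 77%), both above the 70% threshold cited for study 1.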


We recommend a cross-functional, collaborative approach to statistical monitoring that can adapt to study design and data source and that combines statistical screening techniques with confirmatory graphics.