One important goal of pharmaco-epidemiology studies is to understand the causal relationship between drug exposures and their clinical outcomes, including adverse drug events. Achieving this goal, however, requires resolving several challenges. Most pharmaco-epidemiology data are observational, and confounding is pervasive because patients take many co-medications. The study data set is often sampled from a large medical record database using a matched case-control design, so it may not be representative of the original patient population in that database. The data analysis method must also handle sample sizes too large for existing statistical analysis packages. In this paper, we tackle these challenges both methodologically and computationally. We propose a definition of the conditional causal log-odds ratio (OR) to characterize the causal effects of drug exposures on a binary adverse drug event, adjusting for individual-level confounders. Under a case-control design, we present a propensity score estimation method that uses only case samples, and we provide sufficient conditions for the consistency of the causal log-odds ratio estimator based on these case-based propensity scores. Computationally, we implement a principal component analysis to reduce the dimension of the high-dimensional confounders. Extensive simulation studies demonstrate that our method outperforms existing methods. Finally, we apply the proposed method to drug-induced myopathy data sampled from a de-identified subset (close to 5 million patient records) of a medical record database, the Indiana Network for Patient Care. Our method identified 70 drugs associated with myopathy (p < 0.05) out of 72 drugs that list myopathy as a side effect on their FDA drug labels. These 70 drugs include three statins, which are known for their myopathy side effects.
Keywords: Case-control design; OR; causal inference; pharmaco-epidemiology; principal components; propensity scores.