Chemical structure data and the corresponding measured bioactivities of compounds are now readily available from public and commercial databases. However, these databases contain heterogeneous data from different laboratories, measured under different protocols, and sometimes include erroneous entries. In this study, we evaluated the use of data from bioactivity databases for generating high-quality in silico models of off-target-mediated toxicity as decision support in early drug discovery and crop-protection research. We chose human acetylcholinesterase (hAChE) inhibition as an exemplary endpoint for our case study. We established a standardized and thorough quality management routine for input data comprising more than 2,200 chemical entities from bioactivity databases. This procedure enables the development of predictive QSAR models based on heterogeneous in vitro data from multiple laboratories. An extended applicability domain approach was used, and regression results were refined by an error estimation routine. Subsequent classification, augmented by special consideration of borderline candidates, achieved high accuracy in external validation, with a correct classification rate of 96%. The standardized process described herein is implemented as a (semi)automated workflow and is thus easily transferable to other off-targets and assay readouts.
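As a rough illustration of two of the steps named above (consolidating heterogeneous measurements, and classifying with special treatment of borderline candidates), the following minimal Python sketch may help. It is not the workflow used in the study: the function names `curate` and `classify`, the pIC50 threshold of 6.0, and the maximum replicate spread of 1.0 log units are all hypothetical values chosen for illustration, and the per-compound error is simply taken as an input rather than derived by the study's error estimation routine.

```python
from statistics import median
from typing import Dict, List

# Hypothetical values for illustration only; not taken from the study.
ACTIVITY_THRESHOLD = 6.0  # assumed pIC50 cut-off between active and inactive
MAX_SPREAD = 1.0          # assumed maximum tolerated pIC50 range among replicates


def curate(records: Dict[str, List[float]]) -> Dict[str, float]:
    """Merge replicate pIC50 values per compound and drop inconsistent ones."""
    curated: Dict[str, float] = {}
    for compound, values in records.items():
        if not values:
            continue  # no measurement available
        if max(values) - min(values) > MAX_SPREAD:
            continue  # conflicting reports, e.g. from different laboratories
        curated[compound] = median(values)
    return curated


def classify(predicted: float, error: float) -> str:
    """Three-way call; 'borderline' when the error interval spans the threshold."""
    if predicted - error >= ACTIVITY_THRESHOLD:
        return "active"
    if predicted + error < ACTIVITY_THRESHOLD:
        return "inactive"
    return "borderline"


if __name__ == "__main__":
    data = {"cmpd_a": [6.0, 6.5], "cmpd_b": [4.0, 7.5], "cmpd_c": []}
    print(curate(data))
    print(classify(7.0, 0.5), classify(5.0, 0.5), classify(6.2, 0.5))
```

Flagging compounds whose predicted value lies within the estimated error of the threshold, instead of forcing a binary call, is one plausible way to read "special consideration of borderline candidates"; the actual criterion used by the authors may differ.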