Abstract
Early in the emergence of virtual high-throughput screening (VHTS), it was recognized that for validation to be robust and reliable, decoys should match actives as closely as possible in as many respects as possible. This has given rise to several generations of validation sets, each addressing reported shortcomings of earlier collections. Such iterative curation is expensive and leaves ample scope for discrepancies between actives and decoys to creep in. It has previously been conjectured that in silico isomerization offers an attractive alternative for generating decoys for drug-like compounds, one that naturally mitigates many of these discrepancies. Here, we explore this proposition and prove the conjecture. We show that isomerization can produce molecules whose hydrogen bond acceptor counts, hydrogen bond donor counts, rotatable bond counts, charge, and surface area distributions match those of experimental actives more closely than those of experimental decoys do. Beyond these properties, which receive a lot of attention in drug design, we also show that isomerization can produce decoys that are positioned more closely to actives in property hyperspace than current experimental decoys, which tend to be highly dissimilar from the actives. The latter is a significant shortcoming that has thus far remained unreported and unaddressed. Herein, we build upon the methods, tools, and work of others to facilitate the generation of new and better validation sets more cheaply and efficiently, in the hope of moving the field of VHTS toward maturity. To that end, we make our code fully and freely available on GitHub (https://github.com/sivanovMU-Sofia/isomerization).
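To make the notion of decoys being "positioned more closely to actives in property hyperspace" concrete, the following is a minimal sketch of one way such a comparison could be set up. It is not the method used in this work or in the linked repository: the function names, the choice of a z-scored Euclidean metric, the three property columns, and all numeric values are illustrative assumptions.

```python
import math
from statistics import mean, pstdev


def fit_scaler(actives):
    """Per-property mean and standard deviation, taken from the actives only."""
    columns = list(zip(*actives))
    means = [mean(col) for col in columns]
    # Fall back to 1.0 so a constant property does not cause division by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return means, stds


def scale(rows, means, stds):
    """Z-score every property vector with the supplied statistics."""
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def mean_nearest_active_distance(actives, decoys):
    """Average Euclidean distance from each decoy to its nearest active.

    Scaling uses the actives' statistics, so different decoy sets are
    compared on the same axes. This metric is an assumption for illustration.
    """
    means, stds = fit_scaler(actives)
    scaled_actives = scale(actives, means, stds)
    scaled_decoys = scale(decoys, means, stds)
    return mean(
        min(math.dist(decoy, active) for active in scaled_actives)
        for decoy in scaled_decoys
    )


# Hypothetical property vectors: (H-bond acceptors, H-bond donors, rotatable bonds).
# These numbers are invented for illustration and are not data from the study.
actives = [[2, 1, 4], [3, 2, 5], [4, 1, 6]]
candidate_decoys_a = [[3, 1, 5], [2, 2, 4]]
candidate_decoys_b = [[9, 6, 12], [8, 5, 11]]

print(mean_nearest_active_distance(actives, candidate_decoys_a))
print(mean_nearest_active_distance(actives, candidate_decoys_b))
```

Under this sketch, a smaller value means the decoy set sits closer to the actives in the chosen property space. In practice the property vectors would come from a cheminformatics toolkit rather than being typed by hand.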
Supplementary materials
Title
In Silico Isomerization Produces Apt Negative Data for VHTS Validation
Description
Molecular property distributions broken down by target
Supplementary weblinks
Title
GitHub code
Description
Input files and code used to generate the results and paper