Ligand-Based Virtual Screening

The idea of using molecular similarity to search molecular databases has a long history [1], and has been recently reviewed [2]. Combining the best of the results from a wide variety of these approaches (superpositional, non-superpositional, electrostatic and chemical feature-based) gives a diverse set of potentially similar ligands for subsequent analysis or assay.

A suitable searching method needs to possess certain characteristics:

  • It should be able to identify compounds which are “biologically similar”; in other words compounds that have an increased likelihood of sharing biological activity with the query molecule.

  • It should identify compounds that are chemically distinct from the query; close analogues of the query are much less interesting than compounds that possess a very different chemical scaffold.  This suggests that the comparison should be based not on the molecule’s skeleton of atoms and bonds but on its external appearance – its shape, electrostatic potential and other relevant fields.

  • It must be very fast, to allow the calculation to complete in a reasonable time even when many millions of compounds are being considered.

The suite of Affinity methods collectively known as RAMS (Rapid Assessment of Molecular Similarity) enables lightning-fast identification of compounds that are similar to a query molecule in terms of molecular shape, or a combination of shape and other molecular properties.  It is well known that these properties, in particular shape, electrostatics and lipophilicity, are key determinants of molecular recognition, and it is similarity in these terms, rather than in the connection of atoms or functional groups, that determines bio-isosterism.  Whilst this has been appreciated for a long time, these molecular features are complex objects to represent and compare computationally.  Historically, shape matching has been performed by molecular superposition algorithms that are computationally intensive, which limits their application to large databases.  The development of alignment-free molecular similarity measures that can describe shape and properties effectively has removed this bottleneck.

RAMS

RAMS is a family of alignment-free, 3D molecular similarity methods that perform near real-time ligand searching based on combinations of shape, charge and various physico-chemical properties.

The key technology behind the exceptionally fast searching that is possible with the RAMS methodologies is the calculation of a compact descriptor that captures the complexities of molecular shape. A key enhancement over previously described non-superpositional methods is that CSR (Chiral Shape Recognition) takes into account the chirality of the molecules being compared, while retaining the speed and efficiency of these methods. This difference is important because interactions between proteins and small molecules are often chiral in nature. Using CSR, similarly shaped compounds can be quickly identified from within even the largest molecular databases. In addition, the problematic requirement of aligning molecules for comparison is circumvented, as the descriptor is built from distributions that are independent of molecular orientation. CSR has been demonstrated [3] to provide superior enrichment to previously described methods.
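As an illustration of how an alignment-free, chirality-aware shape descriptor can work, the following is a minimal sketch and not Affinity's implementation. It assumes a moment-based scheme: atom distances are measured to four reference points, and each distance distribution is summarised by three moments. The function names, the choice of moments and the scaling of the fourth (cross-product) reference point are all assumptions made for the example.

```python
import math

def _dist(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def _moments(values):
    """Mean, standard deviation and signed cube root of the third central moment."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return [mean, math.sqrt(var), math.copysign(abs(m3) ** (1.0 / 3.0), m3)]

def shape_descriptor(coords):
    """12-number descriptor from distances to four reference points (assumed scheme)."""
    n = len(coords)
    c1 = tuple(sum(p[i] for p in coords) / n for i in range(3))  # centroid
    c2 = max(coords, key=lambda p: _dist(p, c1))                 # atom farthest from c1
    c3 = max(coords, key=lambda p: _dist(p, c2))                 # atom farthest from c2
    a = tuple(c2[i] - c1[i] for i in range(3))
    b = tuple(c3[i] - c1[i] for i in range(3))
    # The cross product flips sign under mirroring, which makes c4 chirality-sensitive.
    cross = (a[1] * b[2] - a[2] * b[1],
             a[2] * b[0] - a[0] * b[2],
             a[0] * b[1] - a[1] * b[0])
    norm = math.sqrt(sum(x * x for x in cross))
    # Assumed scaling: place c4 at distance |a| from the centroid (c1 if degenerate).
    scale = _dist(c2, c1) / norm if norm > 1e-12 else 0.0
    c4 = tuple(c1[i] + scale * cross[i] for i in range(3))
    descriptor = []
    for ref in (c1, c2, c3, c4):
        descriptor.extend(_moments([_dist(p, ref) for p in coords]))
    return descriptor

def similarity(d1, d2):
    """Score in (0, 1]; 1 means identical descriptors."""
    mean_abs_diff = sum(abs(x - y) for x, y in zip(d1, d2)) / len(d1)
    return 1.0 / (1.0 + mean_abs_diff)

# Toy "molecule": five points with no symmetry.
query = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 1.0, 0.0),
         (0.0, 0.0, 0.5), (-1.0, -0.5, 0.3)]
rotated = [(-y, x, z) for x, y, z in query]    # 90 degree rotation about z
mirrored = [(-x, y, z) for x, y, z in query]   # mirror image

d_query = shape_descriptor(query)
sim_rotated = similarity(d_query, shape_descriptor(rotated))
sim_mirrored = similarity(d_query, shape_descriptor(mirrored))
print("rotated copy:", sim_rotated)
print("mirror image:", sim_mirrored)
```

Because only distances to internally defined reference points are used, the rotated copy scores as identical without any alignment step, while the mirror image receives a lower score.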

Building on this approach to molecular shape comparison, ElectroShape [4] incorporates electrostatic and other physico-chemical properties of the atoms, so that these properties are included in the comparison, in addition to the molecular shape and stereochemistry. Combining these properties maximizes the discovery of relevant lead molecules within the top few percent of structures screened, nearly doubling the enrichment ratio at 1% over previously published shape-based methods.
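One way to picture how an atomic property can be folded into a shape descriptor is to treat it as an extra coordinate. The sketch below is an assumption-laden illustration, not the published ElectroShape algorithm: each atom's partial charge is multiplied by a hypothetical scaling constant and appended as a fourth dimension, and distance moments are then taken in that 4D space. The constant `CHARGE_SCALE`, the three reference points and the function names are all chosen for the example.

```python
import math

# Hypothetical scaling (distance units per unit charge); illustrative value only.
CHARGE_SCALE = 25.0

def _dist(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def _moments(values):
    """Mean, standard deviation and signed cube root of the third central moment."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return [mean, math.sqrt(var), math.copysign(abs(m3) ** (1.0 / 3.0), m3)]

def shape_charge_descriptor(coords, charges, charge_scale=CHARGE_SCALE):
    """9-number descriptor over (x, y, z, scaled charge) points (assumed scheme)."""
    points = [tuple(xyz) + (charge_scale * q,) for xyz, q in zip(coords, charges)]
    n = len(points)
    c1 = tuple(sum(p[i] for p in points) / n for i in range(4))  # 4D centroid
    c2 = max(points, key=lambda p: _dist(p, c1))                 # farthest from c1
    c3 = max(points, key=lambda p: _dist(p, c2))                 # farthest from c2
    descriptor = []
    for ref in (c1, c2, c3):
        descriptor.extend(_moments([_dist(p, ref) for p in points]))
    return descriptor

# Same geometry, two different (made-up) charge distributions.
geometry = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 1.2, 0.0), (0.0, 0.0, 0.9)]
neutral = [0.0, 0.0, 0.0, 0.0]
polar = [0.4, -0.4, 0.1, -0.1]

d_neutral = shape_charge_descriptor(geometry, neutral)
d_polar = shape_charge_descriptor(geometry, polar)
max_diff = max(abs(x - y) for x, y in zip(d_neutral, d_polar))
print("largest descriptor difference:", max_diff)
```

Two molecules with the same shape but different charge distributions then produce different descriptors, and the scaling constant controls how strongly charge counts relative to geometry.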

RAMS methods can search databases of millions of compounds (hundreds of millions of conformations) in seconds using commodity hardware.
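The speed comes from the fact that, once descriptors have been precomputed, a search reduces to comparing short vectors of numbers. The following minimal sketch assumes a hypothetical in-memory dictionary of descriptors and a Manhattan-distance-based score; the compound identifiers and values are made up, and a production system would use a far more optimised data layout.

```python
import heapq

def similarity(d1, d2):
    """Score in (0, 1] from the mean absolute difference of two descriptors."""
    mean_abs_diff = sum(abs(x - y) for x, y in zip(d1, d2)) / len(d1)
    return 1.0 / (1.0 + mean_abs_diff)

def top_hits(query_descriptor, database, k):
    """Return the k best (compound_id, score) pairs, highest score first."""
    scored = ((cid, similarity(query_descriptor, desc))
              for cid, desc in database.items())
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])

# Hypothetical precomputed descriptors (shortened to three numbers for readability).
database = {
    "cmpd_a": [1.0, 2.0, 3.0],
    "cmpd_b": [1.1, 2.0, 3.0],
    "cmpd_c": [4.0, 0.0, 1.0],
    "cmpd_d": [1.0, 2.5, 3.0],
}
query_descriptor = [1.0, 2.0, 3.0]

hits = top_hits(query_descriptor, database, k=2)
for compound_id, score in hits:
    print(compound_id, round(score, 3))
```

Each comparison costs a handful of arithmetic operations and needs no alignment, so the work scales linearly with the number of stored conformations.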

Additionally, as-yet-unpublished methods achieve even better performance in benchmarking studies.
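Benchmark figures such as the enrichment ratio at 1% are commonly computed by ranking a library that contains known actives and decoys, then comparing the hit rate in the top fraction of the ranking with the hit rate in the whole library. The sketch below shows that calculation under this common definition; the function name and the synthetic ranking are assumptions for the example, not data from any published benchmark.

```python
def enrichment_factor(ranked_is_active, fraction=0.01):
    """Hit rate in the top `fraction` of a ranking divided by the overall hit rate.

    `ranked_is_active` is a list of booleans ordered from best to worst score.
    """
    n_total = len(ranked_is_active)
    n_actives = sum(ranked_is_active)
    if n_total == 0 or n_actives == 0:
        raise ValueError("need a non-empty ranking containing at least one active")
    n_top = max(1, round(n_total * fraction))
    hits_in_top = sum(ranked_is_active[:n_top])
    # (hits_in_top / n_top) / (n_actives / n_total), written to avoid rounding noise
    return (hits_in_top * n_total) / (n_top * n_actives)

# Synthetic ranking: 1000 compounds, 10 actives, 4 of them in the top 10 positions.
ranking = [False] * 1000
for position in (0, 2, 5, 9, 40, 120, 300, 450, 700, 950):
    ranking[position] = True

ef_1_percent = enrichment_factor(ranking, fraction=0.01)
print("EF at 1%:", ef_1_percent)
```

A value of 1 corresponds to random selection, so larger values mean that more of the known actives are concentrated at the top of the ranked list.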

Taken together, the RAMS approach is a powerful tool for lead identification.

COver

COver is Affinity’s proprietary superpositional search method, based on original research in Professor Graham Richards’ group in Oxford. A typical use case is to run a first pass with a fast search method, such as ElectroShape, and then to use COver in its superpositional mode to align the top hits on the query molecule.  This is especially useful where the query is a co-crystallized ligand structure.  COver can also be used to quickly assess the steric overlap between molecules, which is particularly useful for filtering results from our de novo fragment-based ligand design software, LOx.

References

[1] A.C. Good, E.E. Hodgkin, and W.G. Richards (1992). "Similarity screening of molecular data sets." Journal of Computer-Aided Molecular Design, 6(5): 513–520.

[2] P.W. Finn and G.M. Morris (2013). "Shape-based similarity searching in chemical databases." Wiley Interdisciplinary Reviews: Computational Molecular Science, 3(3): 226–241.

[3] M.S. Armstrong, G.M. Morris, P.W. Finn, R. Sharma, and W.G. Richards (2009). "Molecular similarity including chirality." Journal of Molecular Graphics and Modelling, 28: 368–370.

[4] M.S. Armstrong, G.M. Morris, P.W. Finn, R. Sharma, L. Moretti, R.I. Cooper, and W.G. Richards (2010). "ElectroShape: fast molecular similarity calculations incorporating shape, chirality and electrostatics." Journal of Computer-Aided Molecular Design, 24(9): 789–801. Epub 2010 Jul 8.

[5] N. Stiefl and K. Baumann (2003). "Mapping property distributions of molecular surfaces: Algorithm and evaluation of a novel 3D quantitative structure-activity relationship technique." Journal of Medicinal Chemistry, 46(8): 1390–1407.

[6] P. Willett, J.M. Barnard, and G.M. Downs (1998). "Chemical similarity searching." Journal of Chemical Information and Computer Sciences, 38(6): 983–996.