Performance of Similarity Measures in 2D Fragment-Based Similarity Searching: Comparison of Structural Descriptors and Similarity Coefficients

Bibliographic Details
Published in: Journal of Chemical Information and Computer Sciences, 2002-11, Vol. 42 (6), p. 1407-1414
Main authors: Chen, Xin; Reynolds, Charles H
Format: Article
Language: English
Online access: Full text
Description
Summary: 2D fragment-based similarity searching is one of the most popular techniques for searching a large database of chemical structures and has been widely applied in drug discovery. However, its performance, especially its effectiveness in retrieving active structural analogues, has not been adequately studied. We report a series of computational experiments in which we systematically studied the influence of structural descriptors and similarity coefficients on the effectiveness of similarity searching. The study was conducted using two large public data sets, NCI anti-AIDS and MDDR. Four sets of 2D linear fragment descriptors, based on the original definitions of atom pairs and atom sequences, were compared. The effect of using the Tanimoto coefficient and the Euclidean distance was studied as a function of descriptor set. The results clearly indicate that the Tanimoto coefficient is superior to the Euclidean distance in 2D fragment-based similarity searching in terms of hit rate, while atom sequences demonstrate the best overall performance among the structural descriptors studied.
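
The two similarity measures named in the summary can be illustrated with a minimal Python sketch. It assumes binary fingerprints, that is, each structure is reduced to the set of fragments it contains; the record does not say whether the study used binary or count-based descriptors. The fragment keys (`f1` ... `f10`), the database entries `A` to `D`, and the helper names are hypothetical and do not come from the article, and the example does not reproduce its results.

```python
import math


def tanimoto(a: set, b: set) -> float:
    """Tanimoto coefficient: shared fragments / fragments present in either set."""
    if not a and not b:
        return 1.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def euclidean(a: set, b: set) -> float:
    """Euclidean distance between the fingerprints viewed as 0/1 vectors.

    For binary vectors this is the square root of the number of
    fragments present in exactly one of the two structures.
    """
    return math.sqrt(len(a ^ b))


def frags(*ids: int) -> set:
    """Build a fingerprint from hypothetical fragment ids."""
    return {f"f{i}" for i in ids}


query = frags(1, 2, 3, 4)

# Hypothetical database: A and B are close analogues, C is a small
# unrelated structure, D is a large structure containing the whole query.
database = {
    "A": frags(1, 2, 3),
    "B": frags(1, 2, 3, 4, 5, 6, 7),
    "C": frags(9),
    "D": frags(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
}

# Higher Tanimoto means more similar; lower Euclidean distance means more similar.
by_tanimoto = sorted(database, key=lambda k: tanimoto(query, database[k]), reverse=True)
by_euclidean = sorted(database, key=lambda k: euclidean(query, database[k]))

print("Tanimoto ranking: ", by_tanimoto)
print("Euclidean ranking:", by_euclidean)
```

In this toy data the two measures agree on the top two entries but swap the last two: the Euclidean distance places the small unrelated structure `C` ahead of `D`, even though `D` contains every fragment of the query, because it counts only mismatched fragments and ignores how many are shared.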
ISSN: 0095-2338, 1549-960X
DOI: 10.1021/ci025531g