It is common practice to use the core set as a test set and the remaining complexes in the refined set as a training set. Cyscore was tested on two independent sets: the PDBbind v2007 core set (N = 195) and the PDBbind v2012 core set (N = 201), whose experimental binding affinities span 12.56 and 9.85 pKd units, respectively. Cyscore was trained, however, on a special set of 247 complexes carefully selected from the PDBbind v2012 refined set using certain criteria [13] (e.g. structural resolution < 1.8 Å, binding affinity spanning 1 to 11 kcal/mol, and protein sequence similarity and ligand chemical composition different from those of the test sets), ensuring that the training complexes are of high quality and do not overlap with either of the two test sets. In this study we used exactly the same training set and the same test sets in order to make a fair comparison to Cyscore. Furthermore, since 16 classical scoring functions have already been evaluated [24] on the PDBbind v2007 core set, and the top-performing ones (e.g. X-Score) were trained on the remaining 1105 complexes in the PDBbind v2007 refined set, we also used these 1105 complexes as another training set to permit a direct comparison. Using predefined training and test sets, on which other scoring functions had previously been trained and tested, has the advantage of reducing the risk of using a benchmark complementary to one particular scoring function. Likewise, for the PDBbind v2012 benchmark we used an additional training set comprising the complexes in the PDBbind v2012 refined set excluding those in the PDBbind v2012 core set. This led to a total of 2696 complexes.
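The refined-minus-core construction described above can be illustrated with a minimal sketch. The function name `split_refined_by_core` and the placeholder identifiers are hypothetical and are not taken from PDBbind or from the authors' code. The refined set size of 2897 is inferred from the figures in the text (2696 + 201), assuming the core set is a subset of the refined set.

```python
def split_refined_by_core(refined_ids, core_ids):
    """Split a refined set into (training, test) lists.

    The test set is the core set; the training set is every refined
    complex that is not in the core set. Assumes the core set is a
    subset of the refined set, and raises if that is not the case.
    """
    core = set(core_ids)
    missing = core - set(refined_ids)
    if missing:
        raise ValueError(f"{len(missing)} core complexes are not in the refined set")
    training = [cid for cid in refined_ids if cid not in core]
    test = [cid for cid in refined_ids if cid in core]
    return training, test


# Placeholder identifiers (not real PDB codes), sized like the
# PDBbind v2012 benchmark as implied by the text: 2696 + 201 = 2897.
refined = [f"cplx{i:04d}" for i in range(2897)]
core = refined[:201]  # arbitrary choice, for illustration only

training, test = split_refined_by_core(refined, core)

# The two sets cannot overlap, because membership in one excludes the other.
assert set(training).isdisjoint(test)
print(len(training), len(test))  # 2696 201
```

The same arithmetic is consistent with the v2007 figures quoted above: 1105 training complexes plus 195 core complexes.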
By construction, this training set does not overlap with the test set.

PDBbind v2013 round-robin benchmark

Table 1 The three combinations of three different sets of features used to train RF models in this study

Model                  Features
RF::Cyscore            4 Cyscore features
RF::CyscoreVina        4 Cyscore features + 6 AutoDock Vina features
RF::CyscoreVinaElem    4 Cyscore features + 6 AutoDock Vina features + 36 RF-Score features

Li et al. BMC Bioinformatics 2014, 15:291 http://www.biomedcentral.com/1471-2105/15/

We propose a new benchmark to investigate how the prediction performance of the four models changes under cross validation and with varying numbers of training samples. We used the PDBbind v2013 refined set (N = 2959), which is the latest version and constitutes the most comprehensive publicly available structural dataset suitable for training scoring functions. We used 5-fold cross validation, as was done for the recently published empirical scoring function ID-Score [23], to reduce overfitting and thus generalization errors. The entire refined set was divided into five partitions of nearly equal size using uniform sampling on a round-robin basis: all 2959 complexes were first sorted in ascending order of their measured binding affinity; the complexes with the 1st, 6th, 11th, etc. lowest binding affinity were assigned to the first partition, those with the 2nd, 7th, 12th, etc. lowest binding affinity to the second partition, and so on. This partitioning method, though not completely random, has two advantages: on one hand, each partition is guaranteed to span the largest range of binding affinities and incorporates the largest structural diversity of different protein families; on the other hand, each partition is composed of a deterministic list of complexes, permitting reprod.
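The round-robin partitioning described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the affinities are synthetic random values standing in for measured binding affinities, and ties are broken by input order, which the text does not specify.

```python
import random


def round_robin_partitions(affinities, n_partitions=5):
    """Assign complexes to partitions on a round-robin basis.

    Complexes are sorted by ascending affinity; the 1st, 6th, 11th, ...
    go to partition 0, the 2nd, 7th, 12th, ... to partition 1, and so on.
    Returns a list of lists of indices into `affinities`.
    Python's sort is stable, so ties keep their input order
    (an assumption of this sketch).
    """
    order = sorted(range(len(affinities)), key=lambda i: affinities[i])
    return [order[k::n_partitions] for k in range(n_partitions)]


def cross_validation_folds(partitions):
    """Yield (train_indices, test_indices), one fold per partition."""
    for k, test in enumerate(partitions):
        train = [i for j, part in enumerate(partitions) if j != k for i in part]
        yield train, test


# Synthetic stand-in for the 2959 measured affinities (hypothetical values).
rng = random.Random(0)
affinities = [rng.uniform(2.0, 12.0) for _ in range(2959)]

partitions = round_robin_partitions(affinities)
sizes = [len(p) for p in partitions]
print(sizes)  # [592, 592, 592, 592, 591]

folds = list(cross_validation_folds(partitions))
```

Because 2959 is not divisible by five, four partitions hold 592 complexes and one holds 591. The assignment depends only on the sorted affinities, so rerunning it on the same data yields the same partitions.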