Table 7 Performance statistics of selected genes on NSCLC data (multiple-class case)

From: Weighted-SAMGSR: combining significance analysis of microarray-gene set reduction algorithm with pathway topology-based weights to select relevant genes

A. Performance comparison (training set values are 5-fold CV results)

| Method (n) | Training Error (%) | Training GBS | Training BCM | Training AUPR | Test Error (%) | Test GBS | Test BCM | Test AUPR |
| SAMGSR (30)^a | 40.7 | 0.279 | 0.377 | 0.462 | 51.3 | 0.348 | 0.407 | 0.486 |
| W-SAMGSR (27)^a | 37.2 | 0.276 | 0.378 | 0.453 | 51.3 | 0.345 | 0.405 | 0.492 |
| LASSO (95) | 38.6 | 0.281 | 0.458 | 0.483 | 52.7 | 0.395 | 0.456 | 0.485 |
| pSVM (>100) | 42.8 | 0.370 | 0.344 | 0.428 | 53.3 | 0.433 | 0.385 | 0.397 |
| gelnet (>400) | 36.6 | 0.284 | 0.346 | 0.416 | 54.7 | 0.343 | 0.377 | 0.489 |
| RRFE (>200) | 36.6 | 0.272 | 0.395 | 0.448 | 54 | 0.336 | 0.410 | 0.468 |

B. Performance of the top 3 teams in sbv NSCLC sub-challenge (among 54 teams)

| Study (size) | Training data used | Method used | Error (%) | GBS | BCM | AUPR |
| Ben-Hamo’s (23) | GSE10245, GSE18842, GSE31799 | PAM | 49.3 | -- | 0.48 | 0.46 |
| Tarca’s (25) | GSE10245, GSE18842, GSE2109 | moderated t-tests + LDA | -- | -- | 0.459 | 0.454 |
| Tian’s (66) | GSE10245, GSE18842, GSE2109 | TGDR in hierarchical way | 53.3 | 0.374 | 0.440 | 0.471 |
Note: W-SAMGSR, weighted-SAMGSR; pSVM, penalized support vector machine (SCAD penalty term); gelnet, generalized elastic net; RRFE, reweighted recursive feature elimination; LDA, linear discriminant analysis; PAM, partitioning around medoid; TGDR, threshold gradient descent regularization.
^a The sizes shown are those of the final models for the stage segmentation, because the results for the subtype segmentation are identical for both algorithms (with a final size > 300). Ben-Hamo’s study [31], Tarca’s study [44] and Tian’s study [45] are the 3 best studies in the sbv LC sub-challenge.
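
As a reading aid, the error rates in part A can be compared with a few lines of arithmetic. The sketch below is not part of the original study: it only transcribes the Error (%) columns of part A and computes, for each method, the difference between the test-set error and the 5-fold CV error. The names ERRORS, error_gaps and rank_by_test_error are hypothetical and chosen for illustration.

```python
# Error rates (%) transcribed from part A of Table 7:
# method -> (training set 5-fold CV error, test set error).
ERRORS = {
    "SAMGSR": (40.7, 51.3),
    "W-SAMGSR": (37.2, 51.3),
    "LASSO": (38.6, 52.7),
    "pSVM": (42.8, 53.3),
    "gelnet": (36.6, 54.7),
    "RRFE": (36.6, 54.0),
}


def error_gaps(errors):
    """Return {method: test error minus CV error}, in percentage points."""
    return {m: round(test - cv, 1) for m, (cv, test) in errors.items()}


def rank_by_test_error(errors):
    """Return method names sorted by ascending test error (ties keep input order)."""
    return sorted(errors, key=lambda m: errors[m][1])


if __name__ == "__main__":
    gaps = error_gaps(ERRORS)
    for method in rank_by_test_error(ERRORS):
        cv, test = ERRORS[method]
        print(f"{method:9s} CV {cv:4.1f}  test {test:4.1f}  gap {gaps[method]:+.1f}")
```

The gap is purely descriptive: it shows how much each method's error rises from cross-validation to the test set, and says nothing about whether differences between methods are statistically meaningful.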