
Table 7 Performance statistics of selected genes on NSCLC data (multiple-class case)

From: Weighted-SAMGSR: combining significance analysis of microarray-gene set reduction algorithm with pathway topology-based weights to select relevant genes

 

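A. Performance comparison

Training-set columns report 5-fold CV results.

| Method (n) | Training Error (%) | Training GBS | Training BCM | Training AUPR | Test Error (%) | Test GBS | Test BCM | Test AUPR |
|---|---|---|---|---|---|---|---|---|
| SAMGSR (30)^a | 40.7 | 0.279 | 0.377 | 0.462 | 51.3 | 0.348 | 0.407 | 0.486 |
| W-SAMGSR (27)^a | 37.2 | 0.276 | 0.378 | 0.453 | 51.3 | 0.345 | 0.405 | 0.492 |
| LASSO (95) | 38.6 | 0.281 | 0.458 | 0.483 | 52.7 | 0.395 | 0.456 | 0.485 |
| pSVM (>100) | 42.8 | 0.370 | 0.344 | 0.428 | 53.3 | 0.433 | 0.385 | 0.397 |
| gelnet (>400) | 36.6 | 0.284 | 0.346 | 0.416 | 54.7 | 0.343 | 0.377 | 0.489 |
| RRFE (>200) | 36.6 | 0.272 | 0.395 | 0.448 | 54 | 0.336 | 0.410 | 0.468 |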

B. Performance of the top 3 teams in sbv NSCLC sub-challenge (among 54 teams)

| Study (size) | Training data used | Method used | Error (%) | GBS | BCM | AUPR |
|---|---|---|---|---|---|---|
| Ben-Hamo’s (23) | GSE10245, GSE18842, GSE31799 | PAM | 49.3 | -- | 0.48 | 0.46 |
| Tarca’s (25) | GSE10245, GSE18842, GSE2109 | moderated t-tests + LDA | -- | -- | 0.459 | 0.454 |
| Tian’s (66) | GSE10245, GSE18842, GSE2109 | TGDR in hierarchical way | 53.3 | 0.374 | 0.440 | 0.471 |

  1. Note: W-SAMGSR, weighted-SAMGSR; pSVM, penalized support vector machine (SCAD penalty term); gelnet, generalized elastic net; RRFE, reweighted recursive feature elimination; LDA, linear discriminant analysis; PAM, partitioning around medoid; TGDR, threshold gradient descent regularization.
  2. ^a The sizes shown are those of the final model for the stage segmentation, because the results of the subtype segmentation are identical for both algorithms (but the final size is > 300). Ben-Hamo’s study [31], Tarca’s study [44] and Tian’s study [45] are the 3 best studies in the sbv LC sub-challenge.