Feature Selection Challenge
The challenge ended on December 12, 2003, but the web site is now open again for people who want to benchmark their systems against the challenge entries.

The results are evaluated according to the following performance measures.

Performance Measures

The results for a classifier can be represented in a confusion matrix, where a, b, c and d represent the number of examples falling into each possible outcome:

                      Prediction
                      Class -1    Class +1
Truth    Class -1     a           b
         Class +1     c           d

Balanced Error Rate (BER)

The balanced error rate is the average of the errors on each class: BER = 0.5*(b/(a+b) + c/(c+d)).
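
As a minimal sketch of the formula above, the BER can be computed directly from the four confusion-matrix counts. The function name balanced_error_rate and the counts in the usage line are hypothetical and chosen only for illustration; this is not the challenge's own scoring code.

```python
def balanced_error_rate(a, b, c, d):
    """Return BER = 0.5 * (b/(a+b) + c/(c+d)).

    a: truth -1, predicted -1    b: truth -1, predicted +1
    c: truth +1, predicted -1    d: truth +1, predicted +1
    """
    if a + b == 0 or c + d == 0:
        # The per-class error is undefined for an empty class.
        raise ValueError("each class needs at least one example")
    error_neg = b / (a + b)  # error rate on class -1
    error_pos = c / (c + d)  # error rate on class +1
    return 0.5 * (error_neg + error_pos)


# Hypothetical counts: 20% error on class -1, 10% error on class +1.
ber = balanced_error_rate(a=80, b=20, c=10, d=90)
print(f"BER = {ber:.3f}")
```

Because each class contributes equally, a classifier that always predicts the majority class gets a BER of 0.5 regardless of how unbalanced the classes are.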

Area Under Curve (AUC)

The area under curve is defined as the area under the ROC curve. This area is equivalent to the area under the curve obtained by plotting a/(a+b) against d/(c+d) for each threshold on the confidence values, starting at (0,1) and ending at (1,0). The area under this curve is calculated using the trapezoid method. When no confidence values are supplied with the classification, the curve is given by the three points {(0,1),(d/(c+d),a/(a+b)),(1,0)} and AUC = 1 - BER.
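
The construction described above can be sketched as follows. This is an illustrative reading of the text, not the challenge's scoring code: the function name auc_from_confidences is hypothetical, labels are assumed to be -1 or +1, a larger confidence value is assumed to mean "more likely class +1", and examples with tied confidence values are assumed to move across the threshold together.

```python
def auc_from_confidences(labels, scores):
    """Trapezoid area under the curve of a/(a+b) against d/(c+d)."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("each class needs at least one example")

    # Sort by decreasing confidence; lowering the threshold past a value
    # turns the examples with that value into +1 predictions.
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)

    points = [(0.0, 1.0)]  # nothing predicted +1: d/(c+d)=0, a/(a+b)=1
    d = 0  # class +1 examples predicted +1
    b = 0  # class -1 examples predicted +1
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                d += 1
            else:
                b += 1
            i += 1
        points.append((d / n_pos, (n_neg - b) / n_neg))
    # The last point is (1, 0): everything is predicted +1.

    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0  # trapezoid rule
    return area


# Hypothetical labels and confidence values.
print(auc_from_confidences([-1, -1, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

Passing the hard predictions (-1 or +1) as the scores reproduces the three-point curve, so the result equals 1 - BER in that case.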

Fraction of Features (FF)

The fraction of features is simply the ratio of the number of features used by the classifier to the total number of features in the dataset.

Fraction of Probes (FP)

Additional features, termed probes, were added to the original datasets; their distributions are similar to those of the original features. The fraction of probes is the ratio of the number of probes used to the number of features used by a classifier.
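
Both feature-based measures reduce to simple ratios, as the sketch below illustrates. The function names, the representation of features as integer indices, and the example values are all assumptions made for illustration; how the challenge actually identifies probes is not specified here.

```python
def fraction_of_features(selected, n_total):
    """FF: features used by the classifier / total features in the dataset."""
    if n_total <= 0:
        raise ValueError("the dataset must contain at least one feature")
    return len(set(selected)) / n_total


def fraction_of_probes(selected, probe_indices):
    """FP: probes among the used features / number of features used."""
    used = set(selected)
    if not used:
        raise ValueError("the classifier must use at least one feature")
    return len(used & set(probe_indices)) / len(used)


# Hypothetical dataset: 10 features, of which indices 5..9 are probes.
selected = [0, 2, 5, 7]
probes = range(5, 10)
print(fraction_of_features(selected, n_total=10))  # 4 of 10 features used
print(fraction_of_probes(selected, probes))        # 2 of the 4 used are probes
```

Note the different denominators: FF is relative to the whole dataset, while FP is relative to the features the classifier actually selected.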