Feature selection algorithms (with a few notable exceptions) perform a search through the
space of feature subsets, and, as a consequence, must address four basic issues affecting
the nature of the search.
- Starting point : Selecting a point in the feature subset space from which to begin the
search can affect the direction of the search. One option is to begin with no features
and successively add attributes. In this case, the search is said to proceed forward
through the search space. Conversely, the search can begin with all features and
successively remove them. In this case, the search proceeds backward through the
search space. Another alternative is to begin somewhere in the middle and move
outwards from this point.
- Search organisation : An exhaustive search of the feature subspace is prohibitive
for all but a small initial number of features. With N initial features there exist
2^N possible subsets. Heuristic search strategies are more feasible than exhaustive
ones and can give good results, although they do not guarantee finding the optimal
subset.
- Evaluation strategy : How feature subsets are evaluated is the single biggest
differentiating factor among feature selection algorithms for machine learning. One
paradigm, dubbed the filter, operates independently of any learning algorithm:
undesirable features are filtered out of the data before learning begins. These
algorithms use heuristics based on general characteristics of the data to evaluate
the merit of feature subsets. Another school of thought argues that the bias
of a particular induction algorithm should be taken into account when selecting
features. This method, called the wrapper, uses an induction algorithm along with
a statistical re-sampling technique such as cross-validation to estimate the final
accuracy of feature subsets (both evaluation strategies are sketched after this list).
- Stopping criterion : A feature selector must decide when to stop searching through
the space of feature subsets. Depending on the evaluation strategy, a feature selector
might stop adding or removing features when none of the alternatives improves
upon the merit of the current feature subset. Alternatively, the algorithm might continue
to revise the feature subset as long as the merit does not degrade. A further
option could be to continue generating feature subsets until reaching the opposite
end of the search space and then select the best.
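
The four choices above can be combined in a single procedure. The sketch below, in Python, is one illustrative combination rather than a prescribed algorithm: it starts from the empty feature set (a forward search), organises the search greedily instead of exhaustively enumerating all 2^N subsets, evaluates each candidate subset with a wrapper (cross-validated accuracy of an induction algorithm), and stops when no single addition improves the current merit. The use of scikit-learn's decision tree and cross_val_score is an assumption made for the example; any learner and re-sampling scheme could be substituted.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier


def wrapper_merit(X, y, subset):
    """Wrapper evaluation: estimate the accuracy an induction algorithm
    (a decision tree, chosen only for illustration) achieves on the
    candidate subset, using cross-validation as the re-sampling technique."""
    if not subset:
        return 0.0
    scores = cross_val_score(DecisionTreeClassifier(random_state=0),
                             X[:, sorted(subset)], y, cv=5)
    return scores.mean()


def forward_select(X, y, merit):
    """Greedy forward search: begin with no features, repeatedly add the
    single feature whose addition yields the largest merit, and stop as
    soon as no candidate addition improves on the current subset."""
    current, current_merit = set(), merit(X, y, set())
    while True:
        candidates = [f for f in range(X.shape[1]) if f not in current]
        if not candidates:
            break  # reached the opposite end of the search space
        best_merit, best_f = max((merit(X, y, current | {f}), f)
                                 for f in candidates)
        if best_merit <= current_merit:
            break  # stopping criterion: no alternative improves the merit
        current.add(best_f)
        current_merit = best_merit
    return sorted(current), current_merit


if __name__ == "__main__":
    X, y = load_iris(return_X_y=True)
    subset, acc = forward_select(X, y, wrapper_merit)
    print("selected features:", subset, "estimated accuracy: %.3f" % acc)
```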
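For contrast, a filter evaluates features from general characteristics of the data and never consults a learning algorithm. The fragment below is a deliberately crude stand-in for such heuristics: it ranks features by the absolute correlation of each feature with a numerically coded class and keeps the k best, where the scoring function and the choice of k are assumptions of the example rather than part of any particular filter method.

```python
import numpy as np


def filter_rank(X, y, k):
    """Filter evaluation: score each feature by a general characteristic of
    the data (here, absolute correlation with the class coding) and keep the
    k highest-scoring features, entirely before any learning takes place."""
    scores = np.array([abs(np.corrcoef(X[:, j], y)[0, 1])
                       for j in range(X.shape[1])])
    return sorted(np.argsort(scores)[-k:].tolist())
```

Because a filter's merit requires no training of a learner, it is typically much cheaper to evaluate than the wrapper above, at the cost of ignoring the bias of the particular induction algorithm.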