Characteristics of Feature Selection Algorithms for Machine Learning

Feature selection algorithms (with a few notable exceptions) perform a search through the
space of feature subsets, and, as a consequence, must address four basic issues affecting
the nature of the search.

  • Starting point :  Selecting a point in the feature subset space from which to begin the
    search can affect the direction of the search. One option is to begin with no features
    and successively add attributes. In this case, the search is said to proceed forward
    through the search space. Conversely, the search can begin with all features and
    successively remove them. In this case, the search proceeds backward through the
    search space. Another alternative is to begin somewhere in the middle and move
    outwards from this point.
  • Search organisation :  An exhaustive search of the feature subset space is prohibitive
    for all but a small initial number of features. With N initial features there exist
    2^N possible subsets. Heuristic search strategies are more feasible than exhaustive
    ones and can give good results, although they do not guarantee finding the optimal
    subset.
  • Evaluation strategy :  How feature subsets are evaluated is the single biggest
    differentiating factor among feature selection algorithms for machine learning. One
    paradigm, dubbed the filter, operates independently of any learning algorithm:
    undesirable features are filtered out of the data before learning begins. These
    algorithms use heuristics based on general characteristics of the data to evaluate
    the merit of feature subsets. Another school of thought argues that the bias
    of a particular induction algorithm should be taken into account when selecting
    features. This method, called the wrapper, uses an induction algorithm along with
    a statistical re-sampling technique such as cross-validation to estimate the final
    accuracy of feature subsets.
  • Stopping criterion :  A feature selector must decide when to stop searching through
    the space of feature subsets. Depending on the evaluation strategy, a feature selector
    might stop adding or removing features when none of the alternatives improves
    upon the merit of a current feature subset. Alternatively, the algorithm might continue
    to revise the feature subset as long as the merit does not degrade. A further
    option could be to continue generating feature subsets until reaching the opposite
    end of the search space and then select the best.
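
The four issues above can be made concrete with a small sketch. The code below is illustrative only, not any particular published algorithm: the feature names, the RELEVANCE and REDUNDANCY tables, the SIZE_PENALTY constant and the toy_merit function are all hypothetical stand-ins for a filter-style evaluation heuristic. In a wrapper, the merit function would instead return a cross-validated accuracy estimate from the induction algorithm. Both greedy searches stop as soon as no single addition or removal improves the merit, which is the first stopping criterion described above.

```python
from itertools import combinations

# Hypothetical toy data: per-feature relevance scores and a pairwise
# redundancy cost. These numbers are made up purely for illustration.
FEATURES = ["a", "b", "c", "d"]
RELEVANCE = {"a": 60, "b": 50, "c": 10, "d": 0}
REDUNDANCY = {frozenset({"a", "b"}): 50}
SIZE_PENALTY = 5  # small cost per selected feature


def toy_merit(subset):
    """Filter-style merit: relevance minus redundancy minus a size penalty."""
    relevance = sum(RELEVANCE[f] for f in subset)
    redundancy = sum(cost for pair, cost in REDUNDANCY.items() if pair <= subset)
    return relevance - redundancy - SIZE_PENALTY * len(subset)


def forward_selection(features, merit):
    """Start with no features; greedily add the single best one each step."""
    current = frozenset()
    current_merit = merit(current)
    while True:
        candidates = [current | {f} for f in features if f not in current]
        if not candidates:
            break  # reached the opposite end of the search space
        best = max(candidates, key=merit)
        if merit(best) <= current_merit:
            break  # stopping criterion: no alternative improves the merit
        current, current_merit = best, merit(best)
    return current, current_merit


def backward_elimination(features, merit):
    """Start with all features; greedily remove the single best one each step."""
    current = frozenset(features)
    current_merit = merit(current)
    while True:
        candidates = [current - {f} for f in features if f in current]
        if not candidates:
            break
        best = max(candidates, key=merit)
        if merit(best) <= current_merit:
            break
        current, current_merit = best, merit(best)
    return current, current_merit


def all_subsets(features):
    """Enumerate all 2^N subsets, smallest first."""
    features = list(features)
    for size in range(len(features) + 1):
        for combo in combinations(features, size):
            yield frozenset(combo)


def exhaustive_search(features, merit):
    """Evaluate every subset; only feasible for a small number of features."""
    best = max(all_subsets(features), key=merit)
    return best, merit(best)


if __name__ == "__main__":
    for search in (forward_selection, backward_elimination, exhaustive_search):
        subset, score = search(FEATURES, toy_merit)
        print(search.__name__, sorted(subset), score)
```

On this toy data the two greedy searches happen to reach the same subset as the exhaustive search, but that is a property of the example and not a guarantee. With 4 features the exhaustive search evaluates 2^4 = 16 subsets, and that count doubles with every additional feature.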