patternrecognition.co.za

The No Free Lunch Theorem (NFL)

The NFL theorem assumes that the functions we want to approximate in machine learning are distributed uniformly, meaning that no function or problem is more probable than another. The theorem is irrelevant to machine learning at the base level because we do not apply the principle of indifference; instead, we assume that some functions are more likely than others. There are high-level regularities in nature, i.e. meta-rules that lead us to perform induction at the base level in one way rather than another, hence there is a bias favouring some probability distributions of functions over others. This leads to the machine learning assumption, which voids the NFL theorem for the purposes of base learning because the functions that we learn are not uniformly distributed as the NFL theorem assumes.
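The uniform case can be checked by brute force on a tiny problem. The following is a minimal sketch, not part of the original argument: the 3-bit input space, the 5/3 train/test split and the two toy learners (`majority_learner`, `minority_learner`) are all assumptions chosen for illustration. It enumerates every Boolean target function, weights them equally, and measures accuracy only on inputs that were not seen during training.

```python
from itertools import product

# Assumed toy setting: 3-bit inputs, so 8 inputs and 2**8 = 256 target functions.
INPUTS = list(product([0, 1], repeat=3))
TRAIN, TEST = INPUTS[:5], INPUTS[5:]  # 5 seen inputs, 3 unseen inputs


def majority_learner(train_labels):
    """Predict the most common training label for every unseen input."""
    guess = int(2 * sum(train_labels) > len(train_labels))
    return lambda x: guess


def minority_learner(train_labels):
    """Deliberately perverse learner: predict the least common training label."""
    guess = int(2 * sum(train_labels) <= len(train_labels))
    return lambda x: guess


def mean_ots_accuracy(learner):
    """Average off-training-set accuracy over all target functions, equally weighted."""
    hits = 0
    trials = 0
    for labels in product([0, 1], repeat=len(INPUTS)):
        target = dict(zip(INPUTS, labels))
        model = learner([target[x] for x in TRAIN])
        hits += sum(model(x) == target[x] for x in TEST)
        trials += len(TEST)
    return hits / trials


uniform_majority = mean_ots_accuracy(majority_learner)
uniform_minority = mean_ots_accuracy(minority_learner)
print(uniform_majority, uniform_minority)
```

Under the uniform weighting, the sensible learner and the perverse one come out identical, which is the situation the NFL theorem describes.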

Machine Learning Assumption
We assume that some functions are in reality more likely than others (we do not apply the principle of indifference). This assumption is sufficient to justify the claim that some algorithms are better than others.

In science we assume that induction is valid: that we can generalize from what we have seen to things we have yet to encounter. The justification is intuitive rather than formal: humans can learn from experience of the complex and difficult problems that tend to appear in the real world. This gives some indication that an ultimate learning algorithm must exist.

Ultimate Learning Algorithm (ULA)
Each learning algorithm has a bias, and thus an area of expertise. Finding a ULA consists of finding a learning algorithm whose induced models closely match the underlying distribution of functions in our world.

If data in nature tend toward regularities, such as the recurrence of the constant e and the normal distributions predicted by the central limit theorem, will real-world functions not also tend toward certain distributions of functions? If some functions are more probable than others, then the functions we try to predict are not uniformly distributed, and there must exist algorithms that are more likely to do better than others. Algorithms whose underlying assumptions match the distributions of real-world data will have a higher probability of doing well on any real-world data set.
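The effect of a non-uniform distribution over functions can be illustrated with a small brute-force sketch. Everything specific here is an assumption made for illustration: the 3-bit input space, the two toy learners, and above all the prior, which treats each label as independently equal to 1 with probability `P_ONE = 0.8`. It stands in for "some functions are more probable than others" and is not a model of real-world data.

```python
from itertools import product

# Assumed toy setting: 3-bit inputs, 5 seen inputs, 3 unseen inputs.
INPUTS = list(product([0, 1], repeat=3))
TRAIN, TEST = INPUTS[:5], INPUTS[5:]
P_ONE = 0.8  # assumed, illustrative probability that any single label is 1


def majority_learner(train_labels):
    """Predict the most common training label for every unseen input."""
    guess = int(2 * sum(train_labels) > len(train_labels))
    return lambda x: guess


def minority_learner(train_labels):
    """Predict the least common training label for every unseen input."""
    guess = int(2 * sum(train_labels) <= len(train_labels))
    return lambda x: guess


def prior(labels):
    """Probability of one target function under the assumed skewed prior."""
    ones = sum(labels)
    return P_ONE ** ones * (1 - P_ONE) ** (len(labels) - ones)


def expected_ots_accuracy(learner):
    """Off-training-set accuracy, weighted by how probable each function is."""
    expected = 0.0
    for labels in product([0, 1], repeat=len(INPUTS)):
        target = dict(zip(INPUTS, labels))
        model = learner([target[x] for x in TRAIN])
        hits = sum(model(x) == target[x] for x in TEST)
        expected += prior(labels) * hits / len(TEST)
    return expected


skewed_majority = expected_ots_accuracy(majority_learner)
skewed_minority = expected_ots_accuracy(minority_learner)
print(skewed_majority, skewed_minority)
```

Once the functions are no longer equally likely, the learner whose bias agrees with the prior scores above chance and the one whose bias opposes it scores below chance.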

Cross-validation (CV) model selection
Why is cross-validation model selection not sufficient?
CV is regularly used to select among competing learning algorithms. The problem, however, is that the NFL result holds for CV even under the machine learning assumption. CV therefore cannot generalize and is not a viable way to build an ultimate learning algorithm.
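For readers unfamiliar with the procedure, here is a minimal sketch of CV model selection using leave-one-out folds. The toy input space, the two candidate learners and the helper names (`loo_cv_hits`, `cv_select`) are assumptions for illustration. The sketch only checks the uniform case, where the CV-selected model behaves like any other deterministic learner; it does not demonstrate the stronger claim about the machine learning assumption, for which see [1].

```python
from itertools import product

# Assumed toy setting: 3-bit inputs, 5 seen inputs, 3 unseen inputs.
INPUTS = list(product([0, 1], repeat=3))
TRAIN, TEST = INPUTS[:5], INPUTS[5:]


def majority_learner(xs, ys):
    """Predict the most common training label everywhere (ties go to 0)."""
    guess = int(2 * sum(ys) > len(ys))
    return lambda x: guess


def minority_learner(xs, ys):
    """Predict the least common training label everywhere (ties go to 1)."""
    guess = int(2 * sum(ys) <= len(ys))
    return lambda x: guess


CANDIDATES = [majority_learner, minority_learner]


def loo_cv_hits(learner, xs, ys):
    """Leave-one-out CV: hold out each point in turn and count correct predictions."""
    hits = 0
    for i in range(len(xs)):
        model = learner(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        hits += model(xs[i]) == ys[i]
    return hits


def cv_select(xs, ys):
    """Pick the candidate with the best CV score, then refit it on all training data."""
    best = max(CANDIDATES, key=lambda learner: loo_cv_hits(learner, xs, ys))
    return best(xs, ys)


def mean_ots_accuracy(learner):
    """Average off-training-set accuracy over all target functions, equally weighted."""
    hits = 0
    trials = 0
    for labels in product([0, 1], repeat=len(INPUTS)):
        target = dict(zip(INPUTS, labels))
        model = learner(TRAIN, [target[x] for x in TRAIN])
        hits += sum(model(x) == target[x] for x in TEST)
        trials += len(TEST)
    return hits / trials


cv_accuracy = mean_ots_accuracy(cv_select)
print(cv_accuracy)
```

Because `cv_select` is itself a fixed rule from training data to a model, averaging over all equally likely targets leaves it at chance level, just like its candidates.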

Meta-learning vs. Base learning
Base learning focuses on accumulating experience on a specific learning task. Meta-learning is concerned with accumulating experience over multiple applications of a learning system. Meta-learning is thus focused on mapping classification tasks to algorithms.
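The idea of mapping tasks to algorithms can be sketched as follows. This is a hypothetical toy, not a method described in the text: the two base learners (`persist`, `alternate`), the single meta-feature (`change_rate`), the nearest-neighbour recommendation rule and the four entries in `PAST_TASKS` are all invented for illustration. Each task is a short label sequence split into a training part and a test part.

```python
def persist(train):
    """Base learner: predict that the last seen label keeps repeating."""
    last = train[-1]
    return lambda step: last


def alternate(train):
    """Base learner: predict that labels keep flipping after the last seen one."""
    last = train[-1]
    return lambda step: last if step % 2 == 0 else 1 - last


ALGORITHMS = {"persist": persist, "alternate": alternate}


def accuracy(learner, train, test):
    """Base level: fit on one task and score on its held-out continuation."""
    model = learner(train)
    return sum(model(i) == y for i, y in enumerate(test, start=1)) / len(test)


def change_rate(train):
    """Meta-feature: fraction of adjacent training labels that differ."""
    return sum(a != b for a, b in zip(train, train[1:])) / (len(train) - 1)


def best_algorithm(train, test):
    """Name of the base learner that scored highest on this task."""
    return max(ALGORITHMS, key=lambda name: accuracy(ALGORITHMS[name], train, test))


# Hypothetical past tasks: (training labels, held-out labels).
PAST_TASKS = [
    ([0, 0, 0, 0], [0, 0]),
    ([1, 1, 1, 1], [1, 1]),
    ([0, 1, 0, 1], [0, 1]),
    ([1, 0, 1, 0], [1, 0]),
]

# Meta level: experience gathered across tasks, stored as (meta-feature, winner).
meta_experience = [(change_rate(tr), best_algorithm(tr, te)) for tr, te in PAST_TASKS]


def recommend(train):
    """Map a new task to an algorithm via the most similar past task."""
    rate = change_rate(train)
    return min(meta_experience, key=lambda entry: abs(entry[0] - rate))[1]


print(recommend([1, 0, 1, 0, 1]), recommend([1, 1, 1, 1, 1]))
```

The base learners only ever see one task, while `recommend` draws on what happened across several tasks, which is the distinction the definitions above make.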

Where does the NFL theorem apply?
The NFL theorem applies to the meta-level of learning. All meta-learners have equivalent performance when averaged over the entire feature space (i.e. over all problems). See [1] for the proof.

References

[1] C. Giraud-Carrier and F. Provost, "Toward a Justification of Meta-learning: Is the No Free Lunch Theorem a Show-Stopper?"
Copyright 2007. www.patternrecognition.co.za