patternrecognition.co.za


Online Classifier Description

The applet implements the Naïve Bayes, Gaussian, Gaussian Mixture Model, k-Nearest-Neighbours, Decision Tree, Multilayer Perceptron (Neural Network) and Support Vector Machine classifiers.

The applet determines the optimal parameters of each classifier by performing 10-fold cross-validation. The reported error rates are the 10-fold cross-validation error rates obtained with those optimal parameters.
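
The parameter selection described above can be sketched roughly as follows. This is a minimal illustration, not the applet's code: the function names, the shuffled fold assignment and the `fit_predict(X_train, y_train, X_test, param)` callback signature are all assumptions made for the example.

```python
import random

def k_fold_indices(n_samples, n_folds=10, seed=0):
    """Split indices 0..n_samples-1 into n_folds shuffled, near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]

def cross_val_error(fit_predict, X, y, param, n_folds=10):
    """Misclassification rate of one parameter value over all held-out samples."""
    errors = 0
    for held_out in k_fold_indices(len(X), n_folds):
        held = set(held_out)
        train_idx = [i for i in range(len(X)) if i not in held]
        X_train = [X[i] for i in train_idx]
        y_train = [y[i] for i in train_idx]
        X_test = [X[i] for i in held_out]
        # fit_predict is a hypothetical callback: train on the fold, label X_test.
        predictions = fit_predict(X_train, y_train, X_test, param)
        errors += sum(p != y[i] for p, i in zip(predictions, held_out))
    return errors / len(X)

def select_parameter(fit_predict, X, y, candidates, n_folds=10):
    """Return (best_param, best_error) for the candidate with the lowest CV error."""
    scored = [(cross_val_error(fit_predict, X, y, p, n_folds), p) for p in candidates]
    best_error, best_param = min(scored, key=lambda pair: pair[0])
    return best_param, best_error
```

Any of the classifiers below could be plugged in through `fit_predict`, with `candidates` set to the parameter range that classifier iterates over.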

NB - Naive Bayes Classifier
A simple Naive Bayes classifier implementation. Each class is assumed to have a Gaussian distribution with independent variables. Maximum likelihood estimation is performed to determine the mean and variance of each variable for each class. The prior probability of each class is determined by the number of occurrences of that class in the training set. Bayes' rule is used to determine the posterior probabilities of the test observations belonging to each class. Each observation in the test set is assigned to the class with the highest posterior probability.
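
A minimal sketch of these steps, assuming observations are lists of floats and labels are hashable values. The function names and the small variance floor are assumptions for illustration, not details of the applet.

```python
import math
from collections import defaultdict

def fit_naive_bayes(X, y):
    """Estimate per-class prior, feature means and feature variances."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # Maximum likelihood variance divides by n. The floor is an assumption
        # added here to avoid division by zero for constant features.
        variances = [max(sum((v - m) ** 2 for v in col) / n, 1e-9)
                     for col, m in zip(columns, means)]
        model[label] = (n / len(X), means, variances)
    return model

def predict_naive_bayes(model, x):
    """Return the class with the highest posterior probability for x."""
    def log_posterior(label):
        prior, means, variances = model[label]
        # Independence: the log-likelihood is a sum over the features.
        log_lik = sum(-0.5 * math.log(2 * math.pi * var) - (xi - m) ** 2 / (2 * var)
                      for xi, m, var in zip(x, means, variances))
        # Unnormalised: the evidence term is the same for every class.
        return math.log(prior) + log_lik
    return max(model, key=log_posterior)
```

Working in log space avoids numerical underflow when many features are multiplied together.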

Gauss - Gaussian Classifier
A Gaussian classifier implementation. Each class is assumed to have a Gaussian distribution. Maximum likelihood estimation is performed to determine the mean and covariance of each class. The prior probability of each class is determined by the number of occurrences of that class in the training set. Bayes' rule is used to determine the posterior probabilities of the test observations belonging to each class. Each observation in the test set is assigned to the class with the highest posterior probability.
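
Written out, the steps above correspond to the standard equations below. The notation is introduced here for illustration and does not come from the applet: x_n are the training observations with labels y_n, N_c is the number of training observations in class c, N is the total number of training observations and d is the number of variables.

```latex
\begin{align*}
\hat{\mu}_c &= \frac{1}{N_c} \sum_{n:\, y_n = c} x_n, \qquad
\hat{\Sigma}_c = \frac{1}{N_c} \sum_{n:\, y_n = c} (x_n - \hat{\mu}_c)(x_n - \hat{\mu}_c)^{\top}, \qquad
\hat{P}(c) = \frac{N_c}{N} \\
p(x \mid c) &= \frac{1}{(2\pi)^{d/2}\, |\hat{\Sigma}_c|^{1/2}}
  \exp\!\Big( -\tfrac{1}{2} (x - \hat{\mu}_c)^{\top} \hat{\Sigma}_c^{-1} (x - \hat{\mu}_c) \Big) \\
P(c \mid x) &= \frac{p(x \mid c)\, \hat{P}(c)}{\sum_{c'} p(x \mid c')\, \hat{P}(c')}, \qquad
\hat{c}(x) = \arg\max_{c} P(c \mid x)
\end{align*}
```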

GMM - Gaussian Mixture Model Classifier
A simple implementation of a GMM classifier. The number of mixtures per class is given to the classifier. For each class the Expectation Maximization (EM) algorithm is used to determine the mixture/group means and covariances. Equal group prior probabilities are assumed. The class-conditional probability density function of each class is determined by substituting the mean vector and covariance matrix of each mixture into the multivariate Gaussian distribution equation and then adding all these probability values. The class prior probabilities are determined by the proportion of samples belonging to a specific class in the training set. To classify a new observation, the class posterior probabilities are calculated using Bayes' rule.

The number of mixtures per class is iterated from 1 to 10 to determine the optimal number of mixtures per class.

Note that the Gaussian Mixture Model classifier uses diagonal covariance matrices for the mixtures.
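
The classification step can be sketched as below, assuming the mixture means and diagonal variances have already been fitted by EM (the EM step itself is omitted). The function names and data layout are assumptions for the example, and the equal group priors are interpreted here as a weight of 1/M for each of the M mixtures so that the class-conditional density integrates to one.

```python
import math

def diag_gaussian_pdf(x, mean, variances):
    """Multivariate Gaussian density with a diagonal covariance matrix."""
    log_p = sum(-0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
                for xi, m, v in zip(x, mean, variances))
    return math.exp(log_p)

def class_conditional_density(x, components):
    """Equal-weight mixture: the average of the component densities.

    components is a list of (mean, variances) pairs, one per mixture.
    """
    return sum(diag_gaussian_pdf(x, m, v) for m, v in components) / len(components)

def gmm_posteriors(x, class_models, class_priors):
    """Bayes' rule: posterior probability of each class given x."""
    joint = {c: class_priors[c] * class_conditional_density(x, comps)
             for c, comps in class_models.items()}
    evidence = sum(joint.values())
    return {c: j / evidence for c, j in joint.items()}
```

A new observation would then be assigned to the class with the largest value in the dictionary returned by `gmm_posteriors`.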

kNN - k-Nearest-Neighbours Classifier
Implementation of a kNN classifier. The kNN classifier classifies a sample by determining the k nearest data points to the given point in Euclidean space. N Euclidean distances are thus calculated for each sample that has to be classified, where N is the total number of samples in the training set. After the k nearest points (neighbours) have been determined, their corresponding class labels are used to make a classification prediction for the new observation.

Values of k between 1 and 10 are used with 10-fold cross-validation to determine the optimal k value.
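
A minimal sketch of the kNN prediction for one observation. The text does not say how the neighbours' labels are combined, so the majority vote used here is an assumption, as are the function name and the tie-breaking behaviour.

```python
import math
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    """Predict the label of x from its k nearest training points."""
    # One Euclidean distance per training sample (N distances in total).
    distances = [(math.dist(row, x), label) for row, label in zip(X_train, y_train)]
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    # Majority vote (assumed). On a tie, Counter keeps first-seen order,
    # so the label of the closer neighbour wins.
    return Counter(nearest_labels).most_common(1)[0][0]
```

Because every prediction scans the whole training set, the cost per classified sample grows linearly with N.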

DT - Decision Tree Classifier
A Bayes Decision Tree classifier. Decision trees capture dependencies between variables without assuming total independence. The dependencies are, however, also limited to prevent overly complex models. The probability distribution of each class is modelled as a tree-dependent distribution. The probability distribution is expressed as the product of pairwise conditional probability densities between variables. After the class-conditional probability density functions of each class are determined, classification can be performed by substituting a new observation into each probability density and selecting the most probable class.
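
A tree-dependent distribution of this kind is commonly written as below. The notation is an assumption made for illustration: x_1, ..., x_d are the variables, r is the variable chosen as the root of the tree for class c, and π(i) is the parent of variable i in that tree. How the applet chooses the tree structure is not described here.

```latex
p(x \mid c) = p(x_r \mid c) \prod_{i \neq r} p\big(x_i \mid x_{\pi(i)},\, c\big)
```

Each variable depends on at most one other variable, which is what keeps the model between full independence and a full joint distribution.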

MLP - Multilayer Perceptron
Implementation of a feed-forward neural network trained with backpropagation. The MLP is a specific type of neural network; the term neural network covers Radial Basis Function networks as well as Multilayer Perceptrons. A single hidden layer is used in this implementation. The number of nodes in the hidden layer is iterated from 2 to 10, and 10-fold cross-validation is performed to determine the optimal number of hidden nodes.
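
A small sketch of a single-hidden-layer network with one backpropagation update. Everything beyond "one hidden layer, feed-forward, backpropagation" is an assumption made for the example: the sigmoid activations, the single output unit (two-class case), the squared-error loss, the learning rate and the function names.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def init_mlp(n_inputs, n_hidden, seed=0):
    """Random small weights; the last entry of each weight list is the bias."""
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
                for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    return w_hidden, w_out

def forward(net, x):
    """Feed-forward pass: returns (hidden activations, output activation)."""
    w_hidden, w_out = net
    hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + ws[-1]) for ws in w_hidden]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + w_out[-1])
    return hidden, output

def train_step(net, x, target, lr=0.5):
    """One backpropagation update for the loss 0.5 * (output - target)**2."""
    w_hidden, w_out = net
    hidden, output = forward(net, x)
    delta_out = (output - target) * output * (1.0 - output)
    # Hidden deltas are computed with the output weights from before the update.
    delta_hidden = [delta_out * w_out[j] * h * (1.0 - h) for j, h in enumerate(hidden)]
    for j, h in enumerate(hidden):
        w_out[j] -= lr * delta_out * h
    w_out[-1] -= lr * delta_out
    for j, ws in enumerate(w_hidden):
        for i, xi in enumerate(x):
            ws[i] -= lr * delta_hidden[j] * xi
        ws[-1] -= lr * delta_hidden[j]

def total_loss(net, data):
    """Sum of squared-error losses over (x, target) pairs."""
    return sum(0.5 * (forward(net, x)[1] - t) ** 2 for x, t in data)
```

Selecting the number of hidden nodes would then amount to repeating the training for `n_hidden` from 2 to 10 and comparing the cross-validation errors.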

SVM - Support Vector Machine
Implementation of a C-Support Vector classification algorithm with a radial basis function kernel. The optimal penalty factor (C) and basis function width (g) are determined by performing a grid search with the Golden Ratio Search algorithm.
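
The text does not describe how the search is organised over the two parameters, so the sketch below only illustrates the one-dimensional building block: a golden-section search that minimises a unimodal function on an interval. Treating "Golden Ratio Search" as golden-section search is an assumption, as are the function name and the tolerance; in this setting the objective would be the cross-validation error as a function of one parameter (for example log C with g held fixed).

```python
import math

INV_PHI = (math.sqrt(5) - 1) / 2  # reciprocal of the golden ratio, about 0.618

def golden_section_minimize(f, lo, hi, tol=1e-4):
    """Locate the minimiser of a unimodal function f on [lo, hi]."""
    c = hi - INV_PHI * (hi - lo)
    d = lo + INV_PHI * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc < fd:
            # Minimum lies in [lo, d]; the old c is reused as the new d.
            hi, d, fd = d, c, fc
            c = hi - INV_PHI * (hi - lo)
            fc = f(c)
        else:
            # Minimum lies in [c, hi]; the old d is reused as the new c.
            lo, c, fc = c, d, fd
            d = lo + INV_PHI * (hi - lo)
            fd = f(d)
    return (lo + hi) / 2
```

Each iteration reuses one of the two previous function values, so only one new evaluation (one cross-validation run in this setting) is needed per step.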


Copyright 2007. www.patternrecognition.co.za