Online Classifier Description
The applet implements the Naive Bayes, Gaussian, Gaussian Mixture Model,
k-Nearest-Neighbours, Decision Tree, Multilayer Perceptron (Neural Network) and
Support Vector Machine classifiers.
The applet determines the optimal parameters of each classifier by performing
10-fold cross-validation. The reported error rates are the 10-fold
cross-validation error rates obtained with those optimal parameters.
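As a rough illustration of the procedure described above, 10-fold cross-validation can be sketched as follows. The function names (k_fold_indices, cross_validation_error, select_parameter) and the train_fn/predict_fn callbacks are hypothetical and are not taken from the applet's source; the shuffling and fold assignment are assumptions.

```python
import random


def k_fold_indices(n_samples, n_folds=10, seed=0):
    """Split the indices 0..n_samples-1 into n_folds disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every n_folds-th index, so fold sizes differ by at most one.
    return [indices[i::n_folds] for i in range(n_folds)]


def cross_validation_error(samples, labels, train_fn, predict_fn, n_folds=10):
    """Fraction of samples misclassified when each fold is held out in turn."""
    errors = 0
    for held_out in k_fold_indices(len(samples), n_folds):
        held = set(held_out)
        train_x = [x for i, x in enumerate(samples) if i not in held]
        train_y = [y for i, y in enumerate(labels) if i not in held]
        model = train_fn(train_x, train_y)
        errors += sum(predict_fn(model, samples[i]) != labels[i] for i in held_out)
    return errors / len(samples)


def select_parameter(samples, labels, candidates, make_train_fn, predict_fn):
    """Return (best_value, best_error) over the candidate parameter values."""
    scored = [
        (cross_validation_error(samples, labels, make_train_fn(c), predict_fn), c)
        for c in candidates
    ]
    best_error, best_value = min(scored, key=lambda pair: pair[0])
    return best_value, best_error
```

Here make_train_fn(c) is expected to return a training function with the parameter fixed to c, for example a k value for kNN or a number of hidden nodes for the MLP.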
NB - Naive Bayes Classifier
A simple Naive Bayes classifier implementation. Each class is assumed to have a
Gaussian distribution with independent variables. Maximum likelihood estimation
is performed to determine the means and variances of each class (the
independence assumption makes the covariance matrices diagonal). The prior
probabilities of each class are determined by the number of occurrences of each
class in the training set. Bayes' rule is used to determine the posterior
probabilities of the test observations belonging to each class. A
classification decision is made for each observation in the test set by
selecting the class with the highest posterior probability.
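The steps above can be sketched in a few lines. This is a minimal illustration, not the applet's code: the function names are hypothetical, the variance floor of 1e-9 is an added assumption to avoid division by zero, and posteriors are compared in log space (which gives the same decision as comparing the posteriors themselves).

```python
import math
from collections import defaultdict


def fit_naive_bayes(samples, labels):
    """Estimate class priors and per-feature means and variances (ML)."""
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)
    model = {}
    for cls, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # ML variance divides by n; the floor is an assumption, not from the source.
        variances = [
            max(sum((v - m) ** 2 for v in col) / n, 1e-9)
            for col, m in zip(columns, means)
        ]
        model[cls] = (n / len(samples), means, variances)
    return model


def log_posterior_scores(model, x):
    """Unnormalised log posterior: log prior plus per-feature log densities."""
    scores = {}
    for cls, (prior, means, variances) in model.items():
        score = math.log(prior)
        for v, m, var in zip(x, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        scores[cls] = score
    return scores


def predict_naive_bayes(model, x):
    """Pick the class with the highest posterior probability."""
    scores = log_posterior_scores(model, x)
    return max(scores, key=scores.get)
```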
Gauss - Gaussian Classifier
A Gaussian classifier implementation. Each class is assumed to have a Gaussian
distribution. Maximum likelihood estimation is performed to determine the means
and covariances of each class. The prior probabilities of each class are
determined by the number of occurrences of each class in the training set.
Bayes' rule is used to determine the posterior probabilities of the test
observations belonging to each class. A classification decision is made for
each observation in the test set by selecting the class with the highest
posterior probability.
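Written out, the estimates and the decision rule described above take the following form. The notation (x for an observation, d for its dimension, N_c for the number of training samples in class c, N for the training set size) is introduced here for illustration only.

```latex
\begin{align*}
\hat{\mu}_c &= \frac{1}{N_c}\sum_{i:\,y_i=c} x_i, \qquad
\hat{\Sigma}_c = \frac{1}{N_c}\sum_{i:\,y_i=c}
  (x_i-\hat{\mu}_c)(x_i-\hat{\mu}_c)^{\top}, \qquad
\hat{P}(c) = \frac{N_c}{N} \\
p(x \mid c) &= \frac{1}{(2\pi)^{d/2}\,|\hat{\Sigma}_c|^{1/2}}
  \exp\!\left(-\tfrac{1}{2}(x-\hat{\mu}_c)^{\top}
  \hat{\Sigma}_c^{-1}(x-\hat{\mu}_c)\right) \\
P(c \mid x) &= \frac{p(x \mid c)\,\hat{P}(c)}{\sum_{c'} p(x \mid c')\,\hat{P}(c')},
  \qquad \hat{c}(x) = \arg\max_{c} P(c \mid x)
\end{align*}
```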
GMM - Gaussian Mixture Model Classifier
A simple implementation of a GMM classifier. The number of mixtures per class
is given to the classifier. For each class, the Expectation Maximization (EM)
algorithm is used to determine the mixture/group means and covariances. Equal
group prior probabilities are assumed. The class-conditional probability
density function of each class is determined by substituting the mean vector
and covariance matrix of each mixture into the multivariate Gaussian
distribution equation and then adding all these probability density values.
The class prior probabilities are determined by the proportion of samples
belonging to each class in the training set. To classify a new observation, the
class posterior probabilities are calculated using Bayes' rule.
The number of mixtures per class is iterated from 1 to 10 to determine the
optimal number of mixtures per class.
Note that the Gaussian Mixture Model classifier uses diagonal covariance
matrices for the mixtures.
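A minimal sketch of the class-conditional density and of one EM iteration, for a single class, with diagonal covariances and equal mixture weights. The function names, the variance floor and the guards against zero density are assumptions made for this illustration. The sum of component densities is divided by the number of mixtures M here so that it remains a proper density; the description above only says the values are added.

```python
import math


def diag_gaussian_pdf(x, mean, variances):
    """Density of a Gaussian with a diagonal covariance matrix."""
    log_p = 0.0
    for v, m, var in zip(x, mean, variances):
        log_p += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
    return math.exp(log_p)


def class_conditional_density(x, components):
    """Equal-weight mixture: components is a list of (mean, variances) pairs."""
    return sum(diag_gaussian_pdf(x, m, v) for m, v in components) / len(components)


def em_step(rows, components, min_variance=1e-6):
    """One EM iteration with the mixture weights held fixed at 1/M."""
    # E-step: responsibility of each component for each sample.
    resp = []
    for x in rows:
        dens = [diag_gaussian_pdf(x, m, v) for m, v in components]
        total = sum(dens) or 1e-300  # guard against underflow (assumption)
        resp.append([d / total for d in dens])
    # M-step: responsibility-weighted means and per-feature variances.
    dim = len(rows[0])
    updated = []
    for k, old in enumerate(components):
        weight = sum(r[k] for r in resp)
        if weight < 1e-12:
            updated.append(old)  # keep a component that attracted no samples
            continue
        mean = [sum(r[k] * x[j] for r, x in zip(resp, rows)) / weight
                for j in range(dim)]
        var = [max(sum(r[k] * (x[j] - mean[j]) ** 2
                       for r, x in zip(resp, rows)) / weight, min_variance)
               for j in range(dim)]
        updated.append((mean, var))
    return updated
```

In practice em_step would be repeated until the component parameters stop changing, once for each class and for each candidate number of mixtures.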
kNN - k-Nearest-Neighbours Classifier
Implementation of a kNN classifier. The kNN classifier classifies a sample by
determining the k nearest data points to the given point in Euclidean space. N
Euclidean distances are thus calculated for each sample that has to be
classified, where N is the total number of samples in the training set. After
the k nearest points (neighbours) have been determined, their corresponding
class labels are used to make a classification prediction for the new
observation.
Values of k between 1 and 10 are used with 10-fold cross-validation to
determine the optimal k value.
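The procedure can be sketched as below. The description does not say how the neighbours' labels are combined; a majority vote is assumed here, with ties going to the label seen first among the nearest neighbours. The function name knn_predict is hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, x, k=3):
    """Classify x by a majority vote among its k nearest training points."""
    # One Euclidean distance per training sample, N in total.
    distances = sorted(
        ((math.dist(p, x), label) for p, label in zip(train_x, train_y)),
        key=lambda pair: pair[0],
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]
```

For example, with training points clustered around (0, 0) for class "a" and around (5, 5) for class "b", knn_predict(train_x, train_y, [0.2, 0.2], k=3) would return "a".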
DT - Decision Tree Classifier
A Bayes Decision Tree classifier. Decision trees capture
dependencies between variables without assuming total independence. The
dependencies are, however, also limited to prevent overly complex models. The
probability distribution of each class is modelled as a tree-dependent
distribution. The probability distribution is expressed as the product of
pairwise conditional probability densities between variables. After the
class-conditional probability density functions of each class are determined,
classification can be performed by substituting a new observation into each
probability density and selecting the most probable class.
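One common way to write such a tree-dependent distribution is shown below. The notation is introduced here for illustration: r is the root variable of the tree and \pi(j) is the parent of variable j. How the applet chooses the tree structure is not stated in this description.

```latex
p(x \mid c) = p(x_{r} \mid c) \prod_{j \neq r} p\bigl(x_j \mid x_{\pi(j)},\, c\bigr)
```

Each factor involves at most two variables, which is what keeps the model simpler than a fully dependent distribution while still capturing some dependence.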
MLP - Multilayer Perceptron
Implementation of a feed-forward neural network trained with backpropagation.
The MLP is a specific type of neural network; the term Neural Network also
covers Radial Basis Function networks. A single hidden
layer is used in this implementation. The number of nodes in the hidden layer
is iterated from 2 to 10 and 10-fold cross-validation is performed to determine
the optimal number of hidden nodes.
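A minimal sketch of a single-hidden-layer network trained with backpropagation follows. The sigmoid activations, squared-error loss, learning rate and weight initialisation are assumptions chosen for illustration; the description does not specify them, and all names are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_mlp(n_inputs, n_hidden, n_outputs, seed=0):
    """Random small weights; each row holds one node's weights plus a bias."""
    rng = random.Random(seed)

    def layer(n_in, n_out):
        return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                for _ in range(n_out)]

    return {"hidden": layer(n_inputs, n_hidden), "output": layer(n_hidden, n_outputs)}


def forward(net, x):
    """Feed-forward pass; returns the hidden and output activations."""
    hidden = [sigmoid(sum(w * v for w, v in zip(row, list(x) + [1.0])))
              for row in net["hidden"]]
    output = [sigmoid(sum(w * v for w, v in zip(row, hidden + [1.0])))
              for row in net["output"]]
    return hidden, output


def train_step(net, x, target, rate=0.5):
    """One backpropagation update for a single sample (squared-error loss)."""
    hidden, output = forward(net, x)
    delta_out = [(o - t) * o * (1 - o) for o, t in zip(output, target)]
    # Propagate the output deltas back through the (not yet updated) weights.
    delta_hid = [h * (1 - h) * sum(d * net["output"][k][j]
                                   for k, d in enumerate(delta_out))
                 for j, h in enumerate(hidden)]
    for k, d in enumerate(delta_out):
        for j, v in enumerate(hidden + [1.0]):
            net["output"][k][j] -= rate * d * v
    for j, d in enumerate(delta_hid):
        for i, v in enumerate(list(x) + [1.0]):
            net["hidden"][j][i] -= rate * d * v


def mean_squared_error(net, samples, targets):
    total = 0.0
    for x, t in zip(samples, targets):
        _, out = forward(net, x)
        total += sum((o - ti) ** 2 for o, ti in zip(out, t))
    return total / len(samples)
```

To mirror the search described above, init_mlp would be called with n_hidden ranging from 2 to 10 and each resulting network scored by 10-fold cross-validation.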
SVM - Support Vector Machine
Implementation of a C-Support Vector Classification algorithm with a radial
basis function kernel. The optimal penalty factor (C) and basis function
width (g) are determined by performing a grid search with the Golden
Ratio Search algorithm.
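The description does not say how the grid search and the Golden Ratio Search are combined. As one possible reading, the sketch below applies a golden-section search to each parameter in turn on a logarithmic scale. The function names, the search ranges and the alternating scheme are all assumptions, and error_fn stands in for a cross-validation error that the applet would compute.

```python
import math

INV_PHI = (math.sqrt(5) - 1) / 2  # reciprocal of the golden ratio


def golden_section_minimize(f, low, high, tol=1e-5):
    """Minimise a unimodal function f on [low, high] by golden-section search."""
    c = high - INV_PHI * (high - low)
    d = low + INV_PHI * (high - low)
    fc, fd = f(c), f(d)
    while high - low > tol:
        if fc < fd:
            # The minimum lies in [low, d]; the old c becomes the new d.
            high, d, fd = d, c, fc
            c = high - INV_PHI * (high - low)
            fc = f(c)
        else:
            # The minimum lies in [c, high]; the old d becomes the new c.
            low, c, fc = c, d, fd
            d = low + INV_PHI * (high - low)
            fd = f(d)
    return (low + high) / 2


def tune_two_parameters(error_fn, c_range, g_range, rounds=3):
    """Alternate one-dimensional searches over log C and log g (assumed scheme)."""
    log_c = sum(c_range) / 2
    log_g = sum(g_range) / 2
    for _ in range(rounds):
        log_c = golden_section_minimize(lambda v: error_fn(v, log_g), *c_range)
        log_g = golden_section_minimize(lambda v: error_fn(log_c, v), *g_range)
    return log_c, log_g
```

Golden-section search assumes the function is unimodal on the interval. A real cross-validation error surface is noisy, so a coarse grid is often used first to find a promising region.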