Classifications of ML Methods
Here are a few ways commonly used to classify ML systems:
- supervision based
- incremental learning based (batch vs online)
- generalisability (instance based vs model based)
Supervision
This classification has four main classes, based on the amount or type of supervision:
Supervised
- Data
- Labelled examples \large \{(\mathbf{x}_i, y_i)\}_{i=1}^N
- \large \mathbf{x}_i is a feature vector (an \large n-dimensional vector of numerical features \large x^{(j)})
- represent objects numerically e.g. for an image, \large x^{(1)} could be the hue, and \large x^{(2)} could be the intensity
- useful for comparing objects e.g. via Euclidean distance
- granularity depends on the purpose
- \large y_i can be a member of a finite set of classes, a real number, a vector, a matrix, etc.
- Tasks
- Classification e.g. spam filter
- Regression - predict numeric values from predictors (features) e.g. house price given number of rooms and floor area
- Goal
- Train a model on a dataset to predict labels for new input feature vectors (a minimal workflow is sketched after this list)
- Methods
- Linear regression
- Logistic regression
- kNN
- SVM
- DT (& random forests)
- Neural networks
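A minimal sketch of this workflow, assuming scikit-learn (one possible toolkit, not prescribed by these notes) and its built-in iris dataset as a stand-in for labelled examples \large (\mathbf{x}_i, y_i):

```python
# Supervised learning sketch: fit a classifier on labelled examples, then
# score it on held-out feature vectors it has not seen.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)                  # X: feature vectors x_i, y: labels y_i
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000)          # one of the methods listed above
model.fit(X_train, y_train)                        # learn parameters from the labelled examples
print(model.score(X_test, y_test))                 # accuracy on unseen feature vectors
```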
Unsupervised
- Data
- Unlabelled examples \large \{\mathbf{x}_i\}_{i=1}^N
- \large \mathbf{x}_i is a feature vector
- Tasks
- Clustering - group data points that share attributes to expose structure in the data e.g. grouping molecules by structural similarity
- Anomaly/Outlier detection - find unusual data points e.g. fraud detection
- Association rule learning - discover relationships between attributes e.g. items frequently bought together
- Goal
- Clustering - transform feature vector \large \mathbf{x} into a useful value e.g. a cluster id
- Dimensionality reduction - the output feature vector should have fewer features than \large \mathbf{x}
- Anomaly/Outlier detection - the output value quantifies how different \large \mathbf{x} is from the rest of the data
- Methods (a k-means and PCA sketch follows below)
- Clustering
- Exclusive
- K-means
- Hierarchical
- Soft (probabilistic) - a point can belong to more than one cluster
- GMM
- Expectation Maximisation
- Association
- Apriori
- Eclat
- Dimensionality reduction
- PCA
- SVD
- LLE
- t-SNE
- Challenges
- Computation and time complexity of training
- The basis for the resulting clusters can be unclear, making results hard to interpret
- Accuracy is difficult to assess because there are no labels to compare against
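A rough sketch of two of the unsupervised tasks above, again assuming scikit-learn and synthetic data (make_blobs is just a convenient stand-in for unlabelled examples):

```python
# Unsupervised learning sketch: cluster unlabelled feature vectors with
# k-means, and reduce them to 2 dimensions with PCA.
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

X, _ = make_blobs(n_samples=300, n_features=5, centers=3, random_state=0)      # unlabelled examples

cluster_ids = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)   # a cluster id per x
X_2d = PCA(n_components=2).fit_transform(X)                                    # fewer features than x
print(cluster_ids[:10], X_2d.shape)
```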
Semi-Supervised
- Data usually partially labelled
- Combines supervised and unsupervised techniques
- Methods
- Deep Belief Networks (DBNs) - based on stacked Restricted Boltzmann Machines (RBMs); a simpler runnable example follows below
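The notes name DBNs as the method here; as a simpler runnable illustration of learning from partially labelled data, the sketch below uses scikit-learn's LabelSpreading instead (a graph-based technique, not the DBN approach itself), with unlabelled points marked as -1:

```python
# Semi-supervised sketch: hide most of the labels, then learn from the
# labelled and unlabelled points together.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.semi_supervised import LabelSpreading

X, y = load_iris(return_X_y=True)
rng = np.random.RandomState(0)
y_partial = y.copy()
y_partial[rng.rand(len(y)) < 0.7] = -1          # -1 marks an unlabelled example

model = LabelSpreading(kernel="knn", n_neighbors=7)
model.fit(X, y_partial)                         # uses both labelled and unlabelled points
print((model.transduction_ == y).mean())        # fraction of points assigned the true label
```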
Reinforcement
- Agent - learning system
- observes the environment - state
- makes decisions
- performs actions
- feedback loop - receives penalties or rewards - aims to maximise reward over time
- Learns a policy - a function that takes the state as an input feature vector and outputs the action expected to yield the highest average reward (a toy Q-learning sketch follows)
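A toy tabular Q-learning sketch of this loop (the chain environment, reward, and hyperparameters below are all illustrative assumptions, not from the notes): the agent observes its state, acts, receives a reward, and updates its estimate of each action's value until the greedy policy is "move right".

```python
# Reinforcement learning sketch: tabular Q-learning on a 5-state chain where
# only reaching the right-most state gives a reward.
import numpy as np

n_states = 5                                    # states 0..4, state 4 is terminal
Q = np.zeros((n_states, 2))                     # Q-table: rows = states, columns = actions (0=left, 1=right)
alpha, gamma, eps = 0.1, 0.9, 0.5               # learning rate, discount factor, exploration rate

rng = np.random.default_rng(0)
for _ in range(1000):                           # episodes
    s = 0
    for _ in range(100):                        # cap the steps per episode
        a = int(rng.integers(2)) if rng.random() < eps else int(Q[s].argmax())
        s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
        r = 1.0 if s_next == n_states - 1 else 0.0                    # reward only at the goal
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])    # feedback loop: learn from reward
        s = s_next
        if s == n_states - 1:
            break

print(Q[:-1].argmax(axis=1))                    # learned policy: best action in each non-terminal state
```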
Incremental
Split into batch (non-incremental) and online (incremental) learning
Batch Learning
- Offline - unable to learn incrementally from a data stream (usually due to complexity)
- The system is trained first, then deployed; it does not learn further unless it is taken offline and retrained on the full dataset
- Retraining can be automated e.g. weekly
- High computational requirements, since each retrain uses all of the data (a small retraining sketch follows)
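A very small sketch of the batch cycle (scikit-learn and synthetic data assumed): every retrain starts from scratch on the full accumulated dataset.

```python
# Batch (offline) learning sketch: fit on everything collected so far,
# deploy, and only update by repeating the full fit later.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X_all, y_all = make_classification(n_samples=5000, n_features=20, random_state=0)  # all data so far

model = LogisticRegression(max_iter=1000).fit(X_all, y_all)   # full retrain over all of the data
print(model.score(X_all, y_all))
# Deploy `model` as-is; to use new data, append it to X_all/y_all and refit
# from scratch (e.g. on a weekly schedule).
```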
Online learning
- Training uses small sequential chunks of data - streamed (see the sketch after this list)
- Does not require storage of previous data
- Learning steps are of low complexity therefore can be performed on mobile systems
- Can also be used to process extremely large datasets as a stream
- Learning rate
- if high, will forget old data faster, sensitive to noise
- if low, inertia will be high, slower to learn, less sensitive to noise or outliers
- Corrupted data or errors in the stream can degrade performance in real time
- pair with anomaly detection to catch bad inputs before they reach the model
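An online-learning sketch, assuming scikit-learn's SGDClassifier: each chunk triggers one cheap partial_fit update and can then be discarded, and eta0 plays the role of the learning rate discussed above.

```python
# Online (incremental) learning sketch: stream the data in small chunks and
# update the model after each one.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier

X, y = make_classification(n_samples=10_000, n_features=20, random_state=0)
classes = np.unique(y)                           # all class labels must be known up front

# A higher constant learning rate eta0 forgets old data faster but is noise-sensitive;
# a lower one adapts more slowly.
model = SGDClassifier(learning_rate="constant", eta0=0.01, random_state=0)
for X_chunk, y_chunk in zip(np.array_split(X, 100), np.array_split(y, 100)):
    model.partial_fit(X_chunk, y_chunk, classes=classes)   # one low-cost step; the chunk need not be stored

print(model.score(X[-1000:], y[-1000:]))         # quick check on the most recent data
```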
Generalisability
ML systems can also be categorised by how they generalise to new data
Instance based learning
- Learn the training examples (essentially by heart), then generalise to new data using a measure of similarity to those stored examples (see the kNN sketch below)
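A kNN sketch of instance-based generalisation (scikit-learn assumed): training amounts to storing the examples, and prediction compares a new point to its nearest stored neighbours.

```python
# Instance-based learning sketch: k-nearest neighbours classifies a new point
# by similarity (Euclidean distance by default) to stored training examples.
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
knn = KNeighborsClassifier(n_neighbors=5).fit(X, y)   # "training" is essentially memorising the data
print(knn.predict(X[:3]))                             # labels inferred from the nearest stored examples
```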
Model based learning
- Build or select a model from prior examples
- Define a fitness (utility) function or a cost function
- Maximise the fitness or minimise the cost, depending on which function is chosen
- Train the model on the training data to find parameter values that give a reasonable fit
- Use the fitted model to make predictions on new data (see the sketch below)
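A plain NumPy sketch of those steps for a toy model (the data, the linear model y = w·x + b, and the learning rate are all illustrative assumptions): define a mean-squared-error cost and minimise it by gradient descent.

```python
# Model-based learning sketch: choose a model, define a cost function, and
# fit the parameters by minimising that cost.
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 200)
y = 3.0 * x + 2.0 + rng.normal(0, 1, 200)      # noisy samples from a hidden line

w, b, lr = 0.0, 0.0, 0.01                      # model parameters and gradient-descent step size
for _ in range(2000):
    err = (w * x + b) - y                      # prediction error on the training data
    cost = (err ** 2).mean()                   # mean squared error cost to minimise
    w -= lr * 2 * (err * x).mean()             # gradient step for w
    b -= lr * 2 * err.mean()                   # gradient step for b

print(f"w={w:.2f}, b={b:.2f}, final cost={cost:.3f}")   # w, b should be close to 3 and 2
print("prediction for x=4:", w * 4.0 + b)               # use the fitted model on a new input
```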
References
- Burkov, A. (2019) The Hundred-Page Machine Learning Book.
- Géron, A. (2017) Hands-On Machine Learning with Scikit-Learn & TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems.