TLDR: Isolation Forest - Fast and Efficient Anomaly Detection
Date: 2020-01-31 Source: https://arpitbhayani.me/blogs/isolation-forest
Overview
Isolation Forest is an unsupervised algorithm for detecting anomalies. The article covers its core principle, how its trees are constructed, and how anomalies are scored.
Key Points
- Characteristics of anomalies: Because anomalies deviate from the norm, they are few in number (a minority) and/or have attribute values that differ markedly from those of normal instances.
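- The core principle: The algorithm “isolates” anomalies by building decision trees that split on randomly selected attributes. Because anomalies are few and different, they tend to be separated from the rest of the data in fewer splits.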
- Construction of decision tree: The tree is built by splitting the sub-sample of points/instances at a split value of a randomly selected attribute. Instances whose value for that attribute is smaller than the split value go left and the others go right. The process continues recursively until the tree is fully constructed.
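
The construction step above can be sketched in a few lines of Python. This is a minimal illustration, not the article's implementation. The names `build_tree` and `path_length`, the dict-based node layout, and the `max_depth` cap are hypothetical. Drawing the split value uniformly between the attribute's minimum and maximum is also an assumption, since the summary does not say how the split value is chosen.

```python
import random


def build_tree(points, depth=0, max_depth=10, rng=random):
    """Recursively build one isolation tree over a list of numeric tuples."""
    # Leaf: nothing left to separate, or the (assumed) depth cap is reached.
    if len(points) <= 1 or depth >= max_depth:
        return {"size": len(points)}

    # Only attributes whose values actually vary can separate the points.
    splittable = [
        a for a in range(len(points[0]))
        if min(p[a] for p in points) < max(p[a] for p in points)
    ]
    if not splittable:  # every remaining point is identical
        return {"size": len(points)}

    # Randomly select an attribute, then a split value within its range
    # (uniform sampling is an assumption of this sketch).
    attr = rng.choice(splittable)
    lo = min(p[attr] for p in points)
    hi = max(p[attr] for p in points)
    split = rng.uniform(lo, hi)

    # Values smaller than the split go left, the others go right.
    left = [p for p in points if p[attr] < split]
    right = [p for p in points if p[attr] >= split]
    return {
        "attr": attr,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(tree, point):
    """Count the splits a point passes through before it reaches a leaf."""
    depth = 0
    while "attr" in tree:
        branch = "left" if point[tree["attr"]] < tree["split"] else "right"
        tree = tree[branch]
        depth += 1
    return depth


if __name__ == "__main__":
    rng = random.Random(0)
    # A tight 5x5 grid of "normal" points plus one far-away point.
    data = [(i / 10, j / 10) for i in range(5) for j in range(5)]
    data.append((10.0, 10.0))
    trees = [build_tree(data, rng=rng) for _ in range(100)]
    for p in [(10.0, 10.0), (0.2, 0.2)]:
        mean = sum(path_length(t, p) for t in trees) / len(trees)
        print(p, "mean path length:", mean)
```

Averaged over many such trees, the far-away point should reach a leaf after fewer splits than a point in the middle of the cluster, which is the sense in which the trees “isolate” anomalies.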