DBSCAN

DBSCAN is a powerful algorithm that can easily solve non-convex problems where K-means fails. The main idea is quite simple: a cluster is a high-density area (there are no restrictions on its shape) surrounded by a low-density one. This statement is generally true and doesn't need an initial declaration about the number of expected clusters. The procedure is mainly based on a metric function (normally the Euclidean distance) and a radius, ε. Given a sample xi, its boundary is checked for other samples. If it is surrounded by at least nmin points, it becomes a core point:

A sample xj is defined as directly reachable from a core point ...

Get Machine Learning Algorithms - Second Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.