
Chapter 8: Unsupervised Learning

Discover hidden patterns in unlabeled data—clustering, dimensionality reduction, and anomaly detection.


Metadata

Field          Value
Track          Practitioner
Time           8 hours
Prerequisites  Chapters 1–6

Learning Objectives

  • Implement K-Means clustering from scratch using NumPy
  • Apply hierarchical clustering and interpret dendrograms
  • Use DBSCAN for density-based clustering with noise detection
  • Evaluate clusters with silhouette scores and the elbow method
  • Reduce dimensionality with PCA and t-SNE
  • Detect anomalies with Isolation Forest and statistical methods
  • Build a complete customer segmentation pipeline
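To make the first objective concrete, here is a minimal sketch of K-Means (Lloyd's algorithm) in NumPy. It is not the chapter's own implementation: the function name `kmeans`, its parameters, and the synthetic two-blob data are assumptions chosen for illustration.

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Illustrative Lloyd's algorithm: returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    # Initialise centroids by sampling k distinct data points.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Assignment step: distance from every point to every centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each centroid to the mean of its points;
        # keep the old centroid if a cluster ends up empty.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        # Stop once the centroids no longer move.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels

# Two well-separated synthetic blobs (made-up data, for illustration only).
rng = np.random.default_rng(42)
X = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(50, 2)),
    rng.normal(loc=5.0, scale=0.5, size=(50, 2)),
])
centroids, labels = kmeans(X, k=2)
```

On data this cleanly separated, each blob should end up in its own cluster. Real datasets usually need several random restarts or a smarter initialisation such as k-means++.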

What's Included

Notebooks

Notebook               Description
01_introduction.ipynb  K-Means from scratch, evaluation, elbow method
02_intermediate.ipynb  Hierarchical, DBSCAN, Gaussian Mixture Models
03_advanced.ipynb      PCA, t-SNE, anomaly detection, customer segmentation capstone

Scripts

  • unsupervised_toolkit.py — Core implementations (KMeansScratch, PCAScratch) and plotting utilities
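The API of the toolkit's PCAScratch class is not shown on this page. As a rough idea of what a from-scratch PCA involves, the sketch below centers the data and takes an SVD. The function name `pca`, its return values, and the synthetic data are assumptions, not the toolkit's interface.

```python
import numpy as np

def pca(X, n_components):
    """Illustrative PCA via SVD: returns (projected data, components, variance ratio)."""
    # Center each feature so the SVD describes variance around the mean.
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # Squared singular values are proportional to the variance per direction.
    explained_variance = (S ** 2) / (len(X) - 1)
    ratio = explained_variance[:n_components] / explained_variance.sum()
    return X_centered @ components.T, components, ratio

# Synthetic 3-D data that varies mostly along a single direction (made up).
rng = np.random.default_rng(0)
t = rng.normal(size=200)
X = np.column_stack([
    t,
    2 * t + rng.normal(scale=0.1, size=200),
    rng.normal(scale=0.1, size=200),
])
Z, components, ratio = pca(X, n_components=2)
```

Because the data lies almost on a line, the first entry of `ratio` should account for nearly all of the variance, which is the signal used to decide how many components to keep.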

Exercises

  • 5 exercises with solutions (in solutions/ branch)

SVG Diagrams

  • 3 visual diagrams for clustering algorithms, dimensionality reduction, and anomaly detection


Read Online

You can read the full chapter content on the website, or try the code in the Playground.

How to Use This Chapter

Quick Start

Follow these steps to get coding in minutes.

1. Clone and install dependencies

git clone https://github.com/luigipascal/berta-chapters.git
cd berta-chapters
pip install -r requirements.txt

2. Navigate to the chapter

cd chapters/chapter-08-unsupervised-learning

3. Launch Jupyter

jupyter notebook notebooks/01_introduction.ipynb

GitHub Folder

All chapter materials live in: chapters/chapter-08-unsupervised-learning/

SciPy

This chapter uses SciPy for hierarchical clustering dendrograms. Ensure it's installed: pip install scipy
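As a quick check that SciPy is installed and working, the sketch below builds a Ward linkage on a small synthetic dataset and computes the dendrogram layout without plotting. The data and variable names are made up for illustration; the chapter's notebooks may use different settings.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, fcluster, linkage

# Two small synthetic groups of points (made-up data).
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 0.3, size=(10, 2)),
    rng.normal(4.0, 0.3, size=(10, 2)),
])

# Agglomerative clustering: each row of Z records one merge.
Z = linkage(X, method="ward")

# Cut the tree so that at most two flat clusters remain.
labels = fcluster(Z, t=2, criterion="maxclust")

# no_plot=True returns the dendrogram layout without needing matplotlib.
tree = dendrogram(Z, no_plot=True)
```

Dropping `no_plot=True` draws the dendrogram with matplotlib, if it is installed. The height at which branches merge indicates how dissimilar the merged clusters are.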


Created by Luigi Pascal Rondanini | Generated by Berta AI