ML Atlas — Knowledge Base

Machine Learning, Deeply Explained.

Not textbook summaries. Every algorithm explained with intuition, math, code, and interview-level depth. Built for engineers who want to actually understand ML.

48/55

Topics · built · in progress

720+

Interview Qs · per topic with follow-ups

480+

Tricky Qs · deep understanding tests

Zero to ML Foundations

4 topics

What is Machine Learning?

Rule-based programming vs ML, training, inference, and failure modes.

ML Problem Types

Regression, classification, clustering, anomaly detection, ranking, recommendation, and forecasting.

Dataset Thinking

Features, labels, splits, leakage, noise, missing data, outliers, and imbalance.

ML Pipeline Overview

From problem framing to monitoring: the full map of a real ML workflow.

Core ML Thinking

15 topics

Overfitting vs Underfitting

Diagnose whether the model is too simple or too fragile.

Bias-Variance Tradeoff

Core tradeoff lens for choosing model capacity and regularization.

Regularization

Control complexity to improve stability and generalization.

Loss Functions

Define what the model optimizes and what errors are expensive.

Optimization

How parameter updates drive convergence and stability.

Generalization

Measure unseen-data performance, not just training fit.

Data Leakage

Prevent hidden leaks that fake performance.

Model Complexity

Choose capacity to balance fit quality and robustness.

Feature Scaling

Normalize magnitudes for stable optimization and geometry.

Model Interpretability

Explain model behavior and decision drivers clearly.

Error Analysis

Turn failure slices into targeted model improvements.

Model Selection

Select models by quality, latency, and maintainability tradeoffs.

Baseline Models

Anchor progress before advanced modeling.

Evaluation Strategy

Design validation that reflects product reality.

Production Failure Cases

Common deployment failures and prevention patterns.

Foundations

6 topics

Linear Regression

Fit a line. Predict a number. Understand everything.

Logistic Regression

Binary classification via sigmoid probability outputs.

Gradient Descent

The engine behind almost every ML algorithm.

Regularization

Ridge, Lasso, and ElasticNet — fighting overfitting.

Bias-Variance Tradeoff

The fundamental tension at the heart of every ML model.

Maximum Likelihood Estimation

The math behind most loss functions — deeply explained.

Tree-Based Methods

4 topics

Decision Trees

Recursive binary splits on a dataset.

Random Forest

An ensemble of decorrelated trees.

Gradient Boosting

Sequential learners that fix prior errors — XGBoost, LightGBM.

Advanced · Soon

Ensemble Methods

Stacking, blending, and voting — combining models intelligently.

22 min

Distance-Based

2 topics

K-Nearest Neighbors

Classification by majority vote of neighbors.

Support Vector Machine

Maximum margin hyperplane classification.

Clustering

8 topics

K-Means Clustering

Partition data into k clusters by iteratively updating centroids.

K-Medoids

Robust variant of K-Means that uses actual data points as cluster centers.

Hierarchical Clustering

Build a tree of merges or splits (dendrograms).

DBSCAN

Density-based spatial clustering with noise detection.

OPTICS

Density-based ordering for variable-density clusters.

BIRCH

Balanced iterative reducing and clustering using hierarchies.

Affinity Propagation

Message-passing clustering without k selection.

Mean Shift

Non-parametric density mode seeking algorithm.

Probabilistic

2 topics

Naive Bayes

Probabilistic classification via Bayes' theorem.

Gaussian Mixture Models

Soft clustering via the EM algorithm and probabilistic assignments.

Dimensionality Reduction

4 topics

Principal Component Analysis

Project high-dimensional data to lower dimensions.

t-SNE

Visualize high-dimensional data in 2D with perplexity-based embeddings.

UMAP

Topology-preserving dimensionality reduction, typically faster than t-SNE.

LDA (Linear Discriminant Analysis)

Supervised dimensionality reduction maximizing class separability.

Neural Networks

3 topics

Intermediate · Soon

Neural Network Basics

Perceptrons, layers, weights — the foundation of deep learning.

28 min

Advanced · Soon

Backpropagation

How neural networks learn — chain rule through computation graphs.

26 min

Intermediate · Soon

Activation Functions

ReLU, sigmoid, tanh, GELU — what they do and why it matters.

16 min

Practical ML

3 topics

Intermediate · Soon

Handling Imbalanced Data

SMOTE, class weights, undersampling — real-world class imbalance.

18 min

Intermediate · Soon

Anomaly Detection

Isolation Forest, LOF, and one-class SVM for outlier detection.

20 min

Intermediate · Soon

Missing Data & Imputation

MCAR, MAR, MNAR — and how to handle each correctly.

16 min

Evaluation & Best Practices

4 topics

Evaluation Metrics

MSE, R², Accuracy, F1, AUC-ROC — decoded.

Cross-Validation

K-Fold, Stratified, Leave-One-Out — reliable model assessment.

Feature Engineering

Transform raw data into ML-ready signals.

Hyperparameter Tuning

Grid search, random search, Bayesian optimization with Optuna.