ML Problem Types | ML Atlas

Concept Overview

In Plain English

The problem type fixes the output shape and the evaluation strategy before any model is chosen.

Why It Exists

Different outputs require different losses, metrics, and tradeoffs.

Problem It Solves

Prevents solving the right business question with the wrong ML framing.

Real-Life Analogy

"Before choosing tools, decide if you're painting, drilling, or measuring."

When To Use

At project scoping
Before feature engineering and model selection

When NOT To Use

Never skip this; it is always needed

Core Intuition

Many early ML failures stem from framing the problem wrongly rather than from choosing the wrong algorithm.

For the same data, framing as classification vs ranking can produce very different outcomes.

Metrics must match the task: RMSE for regression, F1/AUC for classification, NDCG for ranking.
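As a rough illustration of matching the metric to the task, the sketch below implements RMSE, F1, and NDCG in plain Python. The function names, the `METRIC_FOR_TASK` mapping, and the linear-gain form of NDCG are assumptions made for this example, not a reference implementation.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error for continuous (regression) outputs."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def f1(y_true, y_pred):
    """F1 score for binary labels, where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def ndcg(relevances, k=None):
    """NDCG for one ranked list; `relevances` are graded labels in predicted order.

    Assumption: linear gain (the relevance itself) with a log2 position discount.
    """
    k = len(relevances) if k is None else k

    def dcg(rels):
        return sum(r / math.log2(i + 2) for i, r in enumerate(rels[:k]))

    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# Each task family is scored with the metric that matches its output type.
METRIC_FOR_TASK = {"regression": rmse, "classification": f1, "ranking": ndcg}
```

Note that the three functions take differently shaped inputs (numbers, class labels, an ordered list of relevances), which is the practical reason a metric cannot be reused across task families.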

The Metaphor

"Task type is the contract. Models are implementations of that contract."

Beginner Mental Model

First decide output type, then decide model.

Technical Theory

Formal Definition

A problem type is defined by output space Y, objective function, and evaluation metric.
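To make the output space concrete, it can be written out for a few task families. The notation below is an illustrative sketch; the symbols K (number of classes), n (number of items to order), and S_n (the set of permutations of n items) are assumptions introduced here.

```latex
\begin{aligned}
\text{Regression:} \quad & Y = \mathbb{R} \\
\text{Binary classification:} \quad & Y = \{0, 1\} \\
\text{Multiclass classification:} \quad & Y = \{1, \dots, K\} \\
\text{Ranking:} \quad & Y = S_n
\end{aligned}
```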

Key Terms

Regression: Predict continuous values.
Binary Classification: Predict one of two classes.
Multiclass Classification: Predict one class among many.
Clustering: Group unlabeled data by similarity.
Anomaly Detection: Detect rare or abnormal patterns.
Ranking: Order items by relevance.
Recommendation: Predict user-item preference.
Forecasting: Predict future values over time.

Step-by-Step Working

1. Define business decision.
2. Define prediction unit.
3. Define output and horizon.
4. Map to problem type.
5. Select metric tied to decision quality.
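The five steps above can be captured in a small written task spec. The sketch below is one possible shape for it; the `TaskSpec` class, its field names, and the example values (including the 7-day horizon) are hypothetical and chosen only to mirror the steps.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TaskSpec:
    business_decision: str  # step 1: the decision being automated
    prediction_unit: str    # step 2: what one prediction is made for
    output: str             # step 3: what the model emits
    horizon: str            # step 3: how far ahead the output applies
    problem_type: str       # step 4: the task family
    metric: str             # step 5: metric tied to decision quality

# Hypothetical example for a support-escalation use case.
spec = TaskSpec(
    business_decision="Route risky support tickets to a senior agent",
    prediction_unit="one support ticket",
    output="escalates: yes/no",
    horizon="within 7 days of ticket creation",
    problem_type="binary classification",
    metric="recall at fixed precision",
)
print(spec.problem_type, "|", spec.metric)
```

Freezing the dataclass is a deliberate choice in this sketch: the spec acts as a contract, so changing it should mean writing a new spec rather than mutating the old one.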

Inputs

Problem statement, available data, and decision objective.

Outputs

Task family and baseline metric plan.

Model Assumptions

01. Labels or weak signals are available when needed.
02. Evaluation data represents target production behavior.

Important Edge Cases

▸ Multi-objective tasks
▸ Label ambiguity
▸ Class imbalance in rare-event tasks

Methodology / Workflow

Role in the ML Pipeline

This stage prevents costly rework in later modeling and deployment phases.

Data Preprocessing

01. Define label generation logic clearly.
02. Check if labels are stable over time.
03. For ranking/recommendation, define interaction windows.
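Explicit label logic is easiest to review when it is written as a function. The sketch below labels a support ticket as positive if an escalation happens within a fixed window after creation; the function name, the 7-day default, and the sample dates are assumptions for illustration. The same idea of a stated time window carries over to interaction windows in ranking and recommendation.

```python
from datetime import datetime, timedelta

def label_escalated(created_at, escalation_times, window=timedelta(days=7)):
    """Return 1 if any escalation falls within `window` after creation, else 0."""
    deadline = created_at + window
    return int(any(created_at <= t <= deadline for t in escalation_times))

created = datetime(2024, 1, 1)
print(label_escalated(created, [datetime(2024, 1, 5)]))   # escalation inside the window
print(label_escalated(created, [datetime(2024, 1, 20)]))  # escalation after the window
print(label_escalated(created, []))                        # no escalation recorded
```

Changing `window` changes which tickets count as positives, so the window belongs in the task spec rather than buried in a data pipeline.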

Training Process

01. Build a baseline model for the chosen task family.
02. Validate metric alignment with business outcomes.
03. Iterate framing if offline and online goals diverge.

Implementation Checklist

1. Write task spec
2. Create baseline dataset
3. Train baseline
4. Evaluate against product objective

Mathematical Chamber

Implementation

Business goal -> Prediction target -> Output type -> Candidate metrics -> Problem type

Sample Input

Goal: reduce support escalations

Sample Output

Task: binary classification (escalate vs not), metric: recall at fixed precision
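"Recall at fixed precision" can be computed by sweeping the decision threshold and keeping the best recall among thresholds that meet the precision floor. The following is a minimal sketch in plain Python; the function name and the toy labels and scores are assumptions for illustration.

```python
def recall_at_precision(y_true, scores, min_precision):
    """Best recall over all thresholds whose precision is >= min_precision.

    Returns (recall, threshold), or (0.0, None) if no threshold qualifies.
    """
    total_pos = sum(y_true)
    best = (0.0, None)
    # Try each distinct score as a threshold: predict positive when score >= threshold.
    for thr in sorted(set(scores), reverse=True):
        preds = [s >= thr for s in scores]
        tp = sum(1 for t, p in zip(y_true, preds) if t == 1 and p)
        fp = sum(1 for t, p in zip(y_true, preds) if t == 0 and p)
        precision = tp / (tp + fp)  # at least one example is predicted positive
        recall = tp / total_pos if total_pos else 0.0
        if precision >= min_precision and recall > best[0]:
            best = (recall, thr)
    return best

y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
print(recall_at_precision(y_true, scores, min_precision=0.75))  # (1.0, 0.6)
```

In this toy data, lowering the threshold to 0.6 catches every escalation while precision stays at exactly 0.75, so the metric also yields the operating threshold to deploy.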

Key Implementation Insights

→ Task framing quality dominates early project success.
→ Metrics should proxy real decision quality, not convenience.

Common Implementation Mistakes

✗ Using accuracy for highly imbalanced tasks
✗ Framing ranking tasks as plain classification

Dataset Applicability

Labeled tabular

Excellent

Best for supervised tasks

💡 Ensure label quality.

Unlabeled event logs

Good

Useful for clustering/anomaly tasks

💡 Feature engineering matters heavily.

Visualizations

Problem Type Decision Flow

Choose by output: numeric, class, groups, order, or future horizon.

Flowchart recommendation: output numeric -> regression; categorical -> classification; no labels -> clustering/anomaly; ordered list -> ranking; user-item next action -> recommendation; future timestamp values -> forecasting.
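The flow above can be sketched as a small lookup function. The `output_kind` strings and the choice to check for labels first are assumptions made for this example; real projects often need a judgment call where the categories overlap.

```python
def problem_type(has_labels, output_kind):
    """Map a coarse description of the desired output to a task family.

    `output_kind` is one of the hypothetical strings: "numeric", "categorical",
    "ordered_list", "user_item_next_action", "future_values".
    """
    # Without labels, the flow points to unsupervised framings.
    if not has_labels:
        return "clustering/anomaly detection"
    mapping = {
        "numeric": "regression",
        "categorical": "classification",
        "ordered_list": "ranking",
        "user_item_next_action": "recommendation",
        "future_values": "forecasting",
    }
    if output_kind not in mapping:
        raise ValueError(f"unknown output kind: {output_kind!r}")
    return mapping[output_kind]

print(problem_type(True, "numeric"))       # regression
print(problem_type(True, "ordered_list"))  # ranking
print(problem_type(False, "categorical"))  # clustering/anomaly detection
```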

Metric Tradeoff Snapshot

The same model can look good or bad depending on the metric used to score it.

Compare Accuracy vs F1 on imbalanced data and NDCG vs CTR for ranking.
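The accuracy vs F1 gap is easy to reproduce with toy numbers. In the sketch below, a "model" that never predicts the positive class is scored on data with a 1% positive rate; the data and function names are assumptions for illustration.

```python
# 1000 examples, 10 of them positive (1% positive rate); toy numbers.
y_true = [1] * 10 + [0] * 990
always_negative = [0] * 1000  # a "model" that never flags anything

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1(y_true, y_pred):
    """F1 score for binary labels, where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)

print(accuracy(y_true, always_negative))  # 0.99
print(f1(y_true, always_negative))        # 0.0
```

Accuracy rewards the model for the 990 easy negatives, while F1 is zero because not a single positive was found.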

Advantages & Limitations

Advantages

Clearer System Design
Task family clarifies data, model, and metric choices.
Better Stakeholder Alignment
Shared language around output and tradeoffs.

Limitations

Ambiguous Boundaries
Some products need multiple task families combined.
Metric Drift Risk
Business objective may change, requiring re-framing.

Practical Use Cases

Ranking

Order results by relevance and freshness.

Media

Recommendation

Personalized content ranking per user.

Comparison

Different task families optimize different objectives.

Aspect	Classification	Ranking
Similarity	Discrete outputs	Can use relevance labels
Key Difference	Predicts class labels	Optimizes order quality
Choose When	Decision categories are explicit	Top-k ordering matters more than exact class
Primary Metric	F1/AUC	NDCG/MAP

How to choose a problem type:

Choose based on the product decision you need to automate.

Evaluation

RMSE/MAE

Numeric error for regression.

F1/AUC

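Classification quality: F1 balances precision and recall at a fixed threshold; AUC summarizes how well scores separate the classes across all thresholds.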

NDCG

Top-rank relevance quality.

Evaluation Process

01. Confirm business objective
02. Select task-family metric
03. Validate against baseline

Evaluation Traps

▸ Metric mismatch with product KPI
▸ Ignoring threshold strategy in binary tasks

Real-World Interpretation Example

A model with a higher AUC may still underperform on the product KPI if its decision threshold is poorly tuned.

Common Mistakes

Students

× Memorizing algorithms without task framing.

Developers

× Choosing a model before deciding the output contract.

In Interviews

× Confusing multiclass with multilabel.
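The multiclass vs multilabel distinction shows up directly in the shape of the target. The sketch below uses a hypothetical label set for support tickets; the class names and the helper function are assumptions for illustration.

```python
CLASSES = ["billing", "bug", "feature_request"]  # hypothetical label set

# Multiclass: exactly one class per example, so a single class index suffices.
multiclass_target = 1  # index of "bug"

# Multilabel: any subset of classes per example, so one 0/1 flag per class.
multilabel_target = [1, 1, 0]  # both "billing" and "bug" apply

def decode_multilabel(flags, classes=CLASSES):
    """Return the names of all classes whose flag is set."""
    return [c for c, f in zip(classes, flags) if f]

print(CLASSES[multiclass_target])            # bug
print(decode_multilabel(multilabel_target))  # ['billing', 'bug']
```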

Real Projects

× Using one metric for all tasks.

Core ML Thinking Lens

What kind of bias does this model have?

Bias depends on model assumptions and feature expressiveness.

What kind of variance does it have?

Variance grows with model flexibility and weak regularization.

How does it overfit?

Overfitting usually appears as strong train performance but weaker validation/test behavior.

How do we regularize it?

Use complexity constraints, robust validation, and data-centric cleanup.

What kind of data does it like?

Prefers representative, low-leakage data with stable feature definitions.

What kind of data breaks it?

Breaks under leakage, severe distribution drift, noisy labels, and poorly engineered features.

Summary Cheat Sheet

Quick Revision Reference

Key Takeaways

Task type first, model second.
Metrics must follow task + business objective.

Critical Formulas

BCE (binary cross-entropy)
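Written out, binary cross-entropy, the usual training loss for binary classification, takes the form below. The symbols are assumptions introduced here: N examples, labels y_i in {0, 1}, and predicted positive-class probabilities \hat{p}_i.

```latex
\mathrm{BCE} = -\frac{1}{N} \sum_{i=1}^{N} \left[ y_i \log \hat{p}_i + (1 - y_i) \log\left(1 - \hat{p}_i\right) \right]
```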

Best For

✓ Project scoping and baseline planning

Avoid When

✗ Skipping due to time pressure

Interview Must-Know

★ Explain regression vs classification vs ranking with concrete examples.

Interview Questions

Tricky Questions

These questions are designed to break assumptions and expose weak understanding. Most people will answer them wrong on their first attempt. Work through each one carefully.