Mission 00 · Launchpad · Mission Control

New to AI? This is the map before the buzzwords do cartwheels.

Launchpad is mission control. We chart the whole neighborhood of AI: machine learning, deep learning, neural networks, CNNs, RNNs, transformers, attention, and robotics, then climb a gentle ladder from Beginner to Expert, meeting real math and real code along the way.

By the end you will be able to explain:

AI vs ML vs deep learning, in plain words
Which tool fits images, sequences, or context
What a single neuron and softmax actually compute
How robots sense, think, act, and learn

Start the ladder · See the mission guide

Mission map: the concepts orbiting you. Visit each in its own room later.

The big picture

How it all connects, and why.

These words aren't a random pile. They nest and feed into each other. This living map shows the family tree: AI contains Machine Learning, which contains Deep Learning, which grows the architectures behind today's models. Tap any node to see what it is and why it's wired to its neighbours, or trace a whole story end to end.

Mission guide

The full map: what each room teaches, why it matters, and a path through them.

Twelve rooms, grouped into tracks. The two Foundations rooms (Machine Learning and Deep Learning) are the deep ones: broad and detailed, the backbone everything else is built on. Here's a path that builds up nicely, but every room stands alone, so jump in wherever you're curious.

Suggested path: Launchpad → Machine Learning → Embeddings & Vector Search → Deep Learning → Transformers & LLMs → Diffusion Models → Reinforcement Learning & Control. Advanced: World Models, Vision-Language-Action. Coding skills (DSA, Python) any time.

Start here

Orientation

Get the big picture and an animated map of how everything connects before diving in.

You are here · Orientation

Launchpad

The map of the whole AI neighborhood, plus this guide to every other room.

You'll learn

AI vs machine learning vs deep learning, in plain words
What a single neuron and softmax actually compute
Which tool fits images, sequences, or context
How robots sense, think, act, and learn

Keep scrolling to climb the ladder ↓

Foundations · deep dives

The two pillars

Nearly all modern AI is a flavour of these. They're the deepest, broadest rooms. Learn them well and the rest clicks into place.

Deep dive · Broad & deep

Machine Learning

The big idea behind all of it: show examples and let the machine find the pattern.

Why it matters: it's the umbrella every AI technique lives under, and these basics (train/test, loss, overfitting) reappear in every other room.

You'll learn

Supervised / unsupervised / reinforcement learning
Regression & classification (fit a line, draw a boundary)
Overfitting, regularization, ROC/AUC, bias-variance
k-means, k-NN, decision trees, SVM, PCA, ensembles

Enter Machine Learning →

Deep dive · Broad & deep

Deep Learning

Machine learning with many layers of neurons, the engine under almost everything modern.

Why it matters: GPT, diffusion, vision, and robotics are all special kinds of deep net. This room is the backbone for the tracks below.

You'll learn

Neurons, activations, the forward pass, why depth matters (XOR)
Backpropagation & the chain rule: train a real net live
CNNs, RNNs, GANs, autoencoders, U-Net, the architecture map
Frontier (with proofs): autodiff, gradient-descent convergence, momentum, Adam, init, double descent

Enter Deep Learning →

Core building blocks

The mechanism everything reuses

Every model turns data into vectors. (Gradient descent, how models learn, now lives in Deep Learning's frontier tier.)

Core concept · Beginner → Frontier

Embeddings & Vector Search

How words, images, and items become "arrows of meaning" (vectors).

Why it matters: every model turns inputs into vectors first; embeddings power search, recommendations, RAG, and the input to transformers.

You'll learn

Vectors, embeddings, dot product, cosine similarity vs. distance
How vector / semantic search finds nearest matches
Frontier: approximate-NN indexes (HNSW), contrastive learning, RAG

Enter Embeddings & Vector Search →

Core concept · Beginner → Research

Probability for Robotics

A robot never knows its exact state, so it carries a belief and updates it with Bayes' rule every time it moves and senses.

Why it matters: the recursive Bayes filter is the engine under the Kalman filter, the EKF/UKF, particle filters, HMMs, and SLAM, the math every robot and many learning systems reuse to reason under uncertainty.

You'll learn

Belief, Bayes' rule, Gaussians, and the recursive Bayes filter
The filter zoo: Kalman, EKF, UKF, particle filters, HMMs, information filter & SLAM
Research (with proofs & code): the free-energy principle, the Bayesian brain, energy-based models, learned differentiable filters

Enter Probability for Robotics →

Language models & Attention

The transformer stack behind ChatGPT

One deep room from self-attention to full language models, with a research-frontier tier on top.

Transformers · Beginner → Frontier

Transformers & LLMs

The transformer's core trick (every word decides which other words to look at) built all the way up to large language models.

Why it matters: the transformer is the dominant architecture in modern AI, and attention is its beating heart.

You'll learn

Q/K/V, scaled dot-product attention, multi-head & the full block
Tokenization (BPE), sampling (temperature, top-k/p), cross-entropy & perplexity
The attention zoo: self, cross, causal, local, sparse, all as one masked operation
Frontier (with proofs): RoPE, FlashAttention, Mixture of Experts, RLHF & DPO

Enter Transformers & LLMs →

Generative & Embodied AI

Making things, and acting in the world

Beyond predicting text: generating images and controlling robots. Both build directly on the foundations.

Generative · Beginner → Frontier

Diffusion Models

How image generators turn random static into a picture by removing noise, step by step.

Why it matters: diffusion powers today's image, video, and audio generators, and even robot motion planning.

You'll learn

From autoencoders to VAEs, the ELBO & reparameterization
Forward/reverse diffusion; predict the noise that was added
Flows: normalizing flows, continuous normalizing flows, flow matching
Diffusion variants: score/SDE, DDIM, EDM, latent diffusion, consistency models
Recent robotics (with papers): Diffusion Policy, π0, RDT, 3D Diffuser Actor, Diffusion Forcing, UniPi

Enter Diffusion Models →

Embodied · Beginner → Frontier

Reinforcement Learning & Control

AI with a body: the loop a robot runs to perceive the world and act in it.

Why it matters: embodied AI (robots, self-driving) is where perception, control, and learning meet the real world.

You'll learn

The sense → think → act → learn loop; policies & controllers
Proportional control, discounted return, the sim-to-real gap
Frontier (with proofs): Bellman/value iteration, policy gradients, Q-learning, PPO, LQR

Enter Reinforcement Learning & Control →

Advanced Topics · research-grade

For the frontier

Math-first dives into active research, with step-through proofs, animations, and citations. For after the foundations.

Advanced · Research-grade

World Models for Robotics

How robots learn to imagine the future and plan in it: the math, the proofs, the tricks.

Why it matters: world models are the engine of modern sample-efficient, planning-based robot learning, and a frontier of AI research.

You'll learn

The world-model objective & the ELBO (with an interactive proof)
RSSM / PlaNet / Dreamer latent imagination; TD-MPC planning
JEPA & V-JEPA 2: predicting in representation space
Diffusion world models, Genie, and 2026 frontier work, all cited

Enter World Models →

Advanced · Research-grade

Vision-Language-Action Models

How robots turn pixels and an instruction straight into motor commands: the math, the proofs, the tricks.

Why it matters: VLAs are the dominant paradigm for generalist robots (a single network that sees, reads, and acts) and one of the most active frontiers in AI.

You'll learn

Behavior cloning & the compounding-error theorem (with an interactive proof)
Action chunking (ACT), Diffusion Policy & π0 flow matching
RT-1/RT-2, Open X-Embodiment, OpenVLA: tokenized actions & web transfer
FAST, RDT, Gemini Robotics, GR00T N1, π0.5, LAPA: all cited

Enter VLA Models →

Useful Skills Wing

The craft around the AI

Not AI-specific, but the practical foundation for any AI work. Take them any time.

Skill · Coding craft

DSA Arena

Thirty coding-interview patterns, each with its algorithm, variants, and three animated LeetCode solutions.

Why it matters: the pattern fluency every coding interview tests, and the algorithmic thinking behind efficient ML code.

You'll learn

Sliding window, two pointers, binary search
Graphs, trees, backtracking, tries, union-find
Heaps, intervals, greedy, bit tricks
The full DP family, animated in Python

Enter DSA Arena →

Skill · Coding craft

Python Playground

Python fundamentals made visual, funny, and memorable, with animated execution traces.

Why it matters: Python is the language of AI; solid fundamentals make every other room easier.

You'll learn

Data types, slicing, comprehensions, sorting
Functions, scope, lambdas, recursion, decorators
OOP: classes, dunders, dataclasses, inheritance
Generators, iterators, context managers, gotchas

Enter Python Playground →

Prefer to ease in first? Keep scrolling. The rest of this Launchpad teaches the core ideas on a Beginner → Intermediate → Advanced → Expert ladder before you head into the rooms.

Level 1 · Beginner

What is AI? The friendly map.

Big ideas first, in plain English. No math yet, just the lay of the land.

Explain

AI is not one thing. It is a family.

Some members recognize photos, some read sentences, some generate text, some drive robots. The whole trick of getting started is knowing which cousin does what. Tap a card to meet each one with everyday examples.

Interact

Worked example

One photo, the whole family helps

You snap a picture of a dog. A CNN recognizes "dog". A transformer then writes a caption ("a brown dog on grass"), using attention to focus on the right words. If this ran on a robot pet, it would then act: wag, follow, or fetch. Every one of those is a different branch of the same AI family.

Visualize

The family tree, drawn as a mind map.

AI is the whole school. ML is the student who learns from examples. Deep learning is the student with many notebooks (layers). Transformers are the student who keeps asking, "who should I pay attention to?"

Lines link the central idea (AI) to the relatives. Resize the window; it stays crisp.

Level 2 · Intermediate

How the pieces nest and which tool fits the job.

Now we look at the mechanics: who contains whom, and how to pick a model.

Explain

AI ⊃ ML ⊃ Deep Learning.

These words are not synonyms; they are nested boxes. Deep learning is a kind of machine learning, which is a kind of AI. Click a ring to see what lives there.

Interact

Worked example

Spam filter, three ways

A hand-written rule ("block any email saying FREE MONEY") is plain AI, no learning. A model that learns spam words from thousands of labelled emails is ML. A model that reads the raw text with stacked neural layers and figures out the features itself is deep learning. Same task, three depths of the same box.
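The first two depths can be sketched in a few lines of Python. This is a toy illustration, not a real spam filter: the four labelled emails are made up, and the word-count scoring in `train_word_scores` is a hypothetical stand-in for whatever a real ML model would learn.

```python
from collections import Counter

def rule_filter(text):
    """Plain AI: a hand-written rule, nothing is learned."""
    return "free money" in text.lower()

# Made-up toy training data: (email text, is_spam)
labelled = [
    ("win free money now", True),
    ("free prize claim now", True),
    ("lunch at noon tomorrow", False),
    ("project notes for tomorrow", False),
]

def train_word_scores(examples):
    """ML: learn a score per word from labelled examples."""
    spam, ham = Counter(), Counter()
    for text, is_spam in examples:
        (spam if is_spam else ham).update(text.lower().split())
    # Positive score = seen more often in spam than in normal mail
    return {word: spam[word] - ham[word] for word in set(spam) | set(ham)}

def learned_filter(text, scores):
    """Flag the email if its words lean towards spam overall."""
    return sum(scores.get(word, 0) for word in text.lower().split()) > 0

scores = train_word_scores(labelled)
message = "claim your prize now"
print(rule_filter(message))             # the fixed rule misses it
print(learned_filter(message, scores))  # the learned scores catch it
```

The rule only ever catches the exact phrase it was written for, while the learned scores generalize to wording the author never listed. The deep-learning version would replace the hand-chosen word counts with features the network discovers itself.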

Explain

CNN vs RNN vs Transformer: which tool for which job?

Different shapes of data want different models. Pick what your data looks like and see the recommended tool with a reason.

Interact

Worked example

Same word, different tool

To find a cat in a photo → CNN (local patches). To predict tomorrow's temperature from the last 30 days → RNN or a small transformer (order matters). To answer a question about a paragraph → transformer (every word can look at every other word at once).

Explain

Robots: sense → think → act → learn.

AI usually lives in software. Robotics gives it a body. The loop never stops: sensors gather signals, models interpret them, planners choose an action, motors move, then it improves. Tap a stage.

Interact

Worked example

A robot vacuum, one lap

Sense: bump and cliff sensors plus a camera. Think: "wall ahead, carpet below". Act: turn 90° and keep cleaning. Learn: remember the room map so next time it is faster. That is the whole loop, once per second.
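The loop can be sketched as a tiny simulation. This is a simplified, hypothetical model: the room is a one-dimensional corridor, so the robot turns around instead of turning 90°, and "learning" is just remembering which cells it has visited. The function name `run_vacuum` and its parameters are invented for illustration.

```python
def run_vacuum(corridor_length, steps):
    """Toy sense -> think -> act -> learn loop in a 1-D corridor."""
    position, direction = 0, +1
    room_map = set()   # learned memory: cells the robot has visited
    bumps = 0

    for _ in range(steps):
        # Sense: is there a wall directly ahead?
        next_cell = position + direction
        wall_ahead = next_cell < 0 or next_cell >= corridor_length

        # Think: choose an action from what was sensed
        action = "turn" if wall_ahead else "forward"

        # Act: carry out the chosen action
        if action == "turn":
            direction = -direction
            bumps += 1
        else:
            position = next_cell

        # Learn: remember this cell for next time
        room_map.add(position)

    return position, room_map, bumps

position, room_map, bumps = run_vacuum(corridor_length=5, steps=12)
print(position, sorted(room_map), bumps)
```

After twelve steps in a five-cell corridor the robot has bumped into both ends and its map covers every cell, which is the information a real vacuum would reuse to clean faster on the next lap.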

Level 3 · Advanced

Meet the real math, gently.

Two tiny equations power almost everything. Here they are, in words, with symbols, and with a knob to turn.

Explain

One artificial neuron.

A neuron multiplies each input by a weight (how much it matters), adds them up with a bias (a base level), and squashes the result through an activation so the output stays in a friendly range. That is the whole atom of a neural network.

\[ y=\sigma\!\Big(\textstyle\sum_{i} w_i x_i + b\Big),\qquad \sigma(z)=\frac{1}{1+e^{-z}} \]

In words: weigh each input, sum them, add a bias, then squeeze through the sigmoid \(\sigma\) so the answer lands between 0 and 1.

\(x_i\): the inputs (the evidence)
\(w_i\): weights (how much each input matters; can be negative)
\(b\): bias (shifts the decision left or right)
\(\sigma\): the activation (here sigmoid), turning a raw score into a 0 to 1 signal
\(y\): the neuron's output
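The equation translates almost line for line into code. The sketch below uses only the Python standard library; the function names `sigmoid` and `neuron` are chosen here for illustration, and the input values match the widget defaults listed in this section.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through the sigmoid."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z), z

# x1 = 1.0, x2 = -0.5, w1 = 0.8, w2 = 1.5, b = 0.0
y, z = neuron([1.0, -0.5], [0.8, 1.5], 0.0)
print(f"weighted sum z = {z:.2f}, output y = {y:.4f}")
# weighted sum z = 0.05, output y = 0.5125
```

A weighted sum of exactly 0 would give an output of exactly 0.5, so this neuron sits just on the "yes" side of the boundary.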

Interact

Slider defaults: input x₁ = 1.0, weight w₁ = 0.8, input x₂ = -0.5, weight w₂ = 1.5, bias b = 0.0

The curve is the sigmoid. The dot is where your current weighted sum lands.

Worked example

Should I bring an umbrella?

Let \(x_1\) = cloudiness, \(x_2\) = humidity. Give clouds a weight \(w_1=0.8\) and humidity a bigger one, \(w_2=1.5\). With a slightly negative bias the neuron stays calm on clear days but fires (output near 1 = "yes, umbrella") once the weighted score climbs. Drag the sliders and watch the dot cross 0.5.

Explain

Why stack layers? One neuron draws a line; two layers draw a curve.

A single neuron's "yes" region is always cut by a straight line (the place where \(w_1x_1+w_2x_2+b=0\)). Some patterns, like the four dots below where diagonal corners share a class, simply cannot be split by any one line. Add a hidden layer and the boundary is free to bend, wrapping the dots correctly. That bending power is the whole reason deep networks exist.

Visualize

Pick the model and watch the boundary change shape:

Greener = the model says "class A"; clay = "class B". The bright contour is the 0.5 boundary, the actual decision line. The single neuron can only split the dots with a straight cut, so it always misses at least one; two layers bend around them.
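The four-dot pattern is the classic XOR problem, and the claim can be checked with a small sketch. Two simplifications are assumed here: a hard threshold (`step`) replaces the sigmoid so the outputs are exactly 0 or 1, and the two-layer weights are wired by hand rather than learned. The grid search over single-neuron weights is only a spot check, not a proof.

```python
def step(z):
    """Hard threshold: 1 on the 'yes' side of the boundary, else 0."""
    return 1 if z > 0 else 0

# Diagonal corners share a class (XOR)
XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

def count_errors(predict):
    return sum(predict(x1, x2) != label for (x1, x2), label in XOR.items())

def make_single_neuron(w1, w2, b):
    """One neuron: its boundary is the straight line w1*x1 + w2*x2 + b = 0."""
    return lambda x1, x2: step(w1 * x1 + w2 * x2 + b)

def two_layer(x1, x2):
    """Hand-wired hidden layer: 'at least one input on' and 'both on'."""
    h_or = step(x1 + x2 - 0.5)
    h_and = step(x1 + x2 - 1.5)
    return step(h_or - h_and - 0.5)  # on for OR but not AND

# Try every single-neuron setting on a coarse grid from -3 to 3
grid = [i / 2 for i in range(-6, 7)]
best_single = min(
    count_errors(make_single_neuron(w1, w2, b))
    for w1 in grid for w2 in grid for b in grid
)

print("best single neuron misses:", best_single)
print("two-layer network misses:", count_errors(two_layer))
```

Every single-neuron setting on the grid misses at least one dot, while the two-layer network gets all four right.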

Explain

Softmax: turning scores into probabilities.

Models output raw scores (logits) that can be any number. Softmax exponentiates them (so bigger scores get much bigger) then divides by the total, giving positive numbers that add up to 1. It is how a model says "I'm 70% sure it's a cat, 25% a dog, 5% a fox."

\[ \operatorname{softmax}(z)_i=\frac{e^{z_i}}{\sum_{j} e^{z_j}} \]

In words: raise \(e\) to each score, then normalize by the sum so the outputs are probabilities that total 1.

\(z_i\): the raw score (logit) for option \(i\)
\(e^{z_i}\): exponentiation, which sharpens differences and keeps everything positive
\(\sum_j e^{z_j}\): the total over all options, the normalizer
output: a probability in \([0,1]\); all of them sum to \(1\)
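A minimal softmax in plain Python follows the formula directly. One detail is an addition not shown in the equation: subtracting the largest score before exponentiating, a common trick that leaves the result unchanged but avoids overflow on large scores.

```python
import math

def softmax(scores):
    """Turn a list of raw scores (logits) into probabilities that sum to 1."""
    largest = max(scores)  # stability trick; cancels out in the ratio
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.0])       # Cat, Dog, Fox
print([round(p, 3) for p in probs])    # [0.665, 0.245, 0.09]

dominant = softmax([6.0, 1.0, 0.0])    # one score pulls far ahead
print(round(dominant[0], 3))           # 0.991
```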

Interact

Worked example

Why exponentials, not just "share of the total"?

With scores 2, 1, 0 a plain share would give 67/33/0 (and breaks on negatives). Softmax gives roughly 67/24/9: the top share is close, but every option stays positive and the output reacts sharply when one score pulls ahead. Slide Cat up to 6 and watch it dominate almost completely.

Explain

One knob on softmax: temperature.

Divide every score by a temperature \(T\) before softmax and you control how decisive the model is. Low \(T\) (\(<1\)) sharpens: the top option dominates. High \(T\) (\(>1\)) flattens: everything moves toward an even split. This is the exact knob a chat model turns when it samples the next word: low for focused, factual answers; high for varied, creative ones.

\[ \operatorname{softmax}(z/T)_i=\frac{e^{z_i/T}}{\sum_{j} e^{z_j/T}} \]

In words: shrink or stretch the gaps between scores first, then run the same softmax. The bigger the gaps, the more confident the distribution.

\(T\to 0^{+}\): gaps explode; the highest score wins almost all the probability (greedy)
\(T=1\): ordinary softmax, unchanged
\(T\to\infty\): gaps vanish; the distribution approaches uniform \(1/n\)

Worked example

Same scores, three temperatures.

Take the scores \(2,\,1,\,0\) (Cat, Dog, Fox). At \(T=1\) softmax gives about \(66.5\,/\,24.5\,/\,9.0\). Cool it to \(T=0.5\) and the scores become \(4,2,0\), so Cat jumps to about \(86.7\%\). Heat it to \(T=2\) and the scores become \(1,\,0.5,\,0\), flattening to roughly \(50.6\,/\,30.7\,/\,18.6\). Same evidence, very different confidence: that is temperature.
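The three temperatures in this example can be reproduced with a short sketch. The function name `softmax_with_temperature` is chosen here for illustration; the max-subtraction is a standard stability trick that does not change the result.

```python
import math

def softmax_with_temperature(scores, temperature=1.0):
    """Divide every score by T, then apply the ordinary softmax."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [s / temperature for s in scores]
    largest = max(scaled)
    exps = [math.exp(s - largest) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

scores = [2.0, 1.0, 0.0]  # Cat, Dog, Fox
for T in (0.5, 1.0, 2.0):
    percents = [round(100 * p, 1) for p in softmax_with_temperature(scores, T)]
    print(f"T = {T}: {percents}")
```

The ranking never changes, only how much of the probability the top option takes.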

Latest (2024 to 2026)

Where you'll meet this knob.

Every current chat model (GPT-class, Claude, Gemini, Llama) ends each step with logits, then a softmax over the whole vocabulary, then samples the next token. The temperature setting in an API call is exactly the \(T\) above; top-p / top-k just trim which options stay in the running first. Reasoning-tuned models often sample at low temperature for math and code, and higher for brainstorming. The tiny equation here is the production knob.
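A rough sketch of that last step, under stated assumptions: the function `sample_next` and its parameters are hypothetical names, only top-k is shown (top-p is omitted), and real inference libraries differ in details such as the order in which trimming and temperature are applied.

```python
import math
import random

def softmax(scores, temperature=1.0):
    scaled = [s / temperature for s in scores]
    largest = max(scaled)
    exps = [math.exp(s - largest) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_next(logits, temperature=1.0, top_k=None, rng=random):
    """Pick the index of the next token from raw logits."""
    # Rank options from highest to lowest score
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    # top-k: keep only the k best options in the running
    if top_k is not None:
        ranked = ranked[:top_k]
    # Temperature softmax over the survivors, then a weighted random draw
    probs = softmax([logits[i] for i in ranked], temperature)
    return rng.choices(ranked, weights=probs, k=1)[0]

rng = random.Random(0)          # seeded so the run is repeatable
logits = [2.0, 1.0, 0.0]        # toy vocabulary: Cat, Dog, Fox
greedy = [sample_next(logits, top_k=1, rng=rng) for _ in range(20)]
varied = [sample_next(logits, temperature=2.0, top_k=2, rng=rng) for _ in range(200)]
print(set(greedy), set(varied))
```

With `top_k=1` the draw is always the highest-scoring option; with `top_k=2` the third option can never be chosen, however high the temperature.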

Explain

Scoring the guess: cross-entropy.

Softmax says what the model predicts; a loss says how wrong it was, giving a single number to push down during training. For classification that number is cross-entropy: it looks only at the probability the model gave the true answer and penalizes being unsure about it.

\[ \mathcal{L}=-\sum_{i} y_i\,\log p_i \;=\; -\log p_{\text{true}} \]

In words: with a one-hot true label, every term is zero except the true class, so the loss is just the negative log of the probability you gave the right answer.

\(y_i\): the true label, \(1\) for the correct class and \(0\) elsewhere (one-hot)
\(p_i\): the softmax probability the model assigned to class \(i\)
\(-\log p_{\text{true}}\): small when the model is confident and right, large when it is unsure or wrong

Worked example

Confident and right vs. unsure.

Suppose the true class is Cat. If the model gave Cat \(p=0.665\), the loss is \(-\log 0.665 \approx 0.41\). Nudge the scores until Cat reaches \(0.90\) and the loss drops to \(-\log 0.90 \approx 0.11\). A perfect \(p=1\) gives loss \(0\); a badly wrong \(p=0.01\) gives \(\approx 4.6\). Training repeatedly nudges weights to shrink this number; that is learning, in one equation.
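These numbers can be checked with a few lines of Python. The sketch assumes natural logarithms, as the worked values imply, and the function name `cross_entropy` is chosen here for illustration.

```python
import math

def cross_entropy(probs, one_hot):
    """Full sum -sum(y_i * log p_i); zero-label terms are skipped."""
    return -sum(y * math.log(p) for y, p in zip(one_hot, probs) if y > 0)

# True class is Cat, so the one-hot label is [1, 0, 0]
label = [1, 0, 0]
probs = [0.665, 0.245, 0.090]

full_sum = cross_entropy(probs, label)
shortcut = -math.log(probs[0])   # only the true class matters
print(round(full_sum, 2), round(shortcut, 2))   # 0.41 0.41

for p_true in (0.90, 0.01):
    print(p_true, round(-math.log(p_true), 2))
```

The loss grows without bound as the probability on the true class approaches zero, which is why confident mistakes are punished so heavily.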

This is the link to the next rooms: softmax (here) feeds cross-entropy, and gradient descent shrinks it, the loop behind Deep Learning and Transformer training alike.
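That loop can be sketched on the Cat, Dog, Fox scores. Two things here are assumptions not spelled out in this section: the sketch updates the logits directly (real training updates the weights that produce them), and it uses the standard result that the gradient of cross-entropy with respect to the logits is the softmax output minus the one-hot label. The learning rate of 1.0 is an arbitrary choice.

```python
import math

def softmax(scores):
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.0]   # Cat, Dog, Fox
label = [1, 0, 0]          # the true class is Cat
learning_rate = 1.0        # arbitrary illustrative value

losses = []
for _ in range(5):
    probs = softmax(logits)
    losses.append(-math.log(probs[0]))               # cross-entropy on Cat
    gradient = [p - y for p, y in zip(probs, label)]  # dL/dz = p - y
    # Gradient descent: step each logit against its gradient
    logits = [z - learning_rate * g for z, g in zip(logits, gradient)]

print([round(loss, 3) for loss in losses])
```

The loss starts at about 0.41 and shrinks on every step as the Cat logit rises and the others fall.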

Level 4 · Expert

Real vocabulary, real code, and your next mission.

The formal names you'll meet in courses, papers, and code, then a challenge and the launch board.

Reference

The formal vocabulary table.

Launchpad gives the friendly story; this gives the names you'll be quizzed and interviewed on.

AI (Artificial Intelligence): systems that perform tasks normally requiring human intelligence.

ML (Machine Learning): learning patterns from data instead of hand-written rules.

Deep Learning: neural networks with many layers that learn their own features (representations).

Neuron / Perceptron: the atom, a weighted sum plus bias passed through an activation function.

CNN (Convolutional Neural Network): shares small filters over local image/grid patches.

RNN (Recurrent Neural Network): processes sequences step by step, carrying a hidden state.

Transformer: architecture built on self-attention + feed-forward blocks; processes a whole sequence in parallel.

Attention: a weighted blend in which each token decides how much to focus on every other token.

Softmax: turns a vector of scores into a probability distribution that sums to 1.

Logits: the raw, un-normalized scores a model outputs before softmax.

Robotics: embodied AI, with sensors, perception, planning, control, and actuators in a sense-think-act loop.

Embedding: a learned vector that represents a word, image patch, or item as numbers.

Code

PyTorch mini examples.

The same ideas in real code. Toggle it open and read the per-line notes underneath.

From neuron to transformer

import torch
import torch.nn as nn

# 1) A single neuron: weighted sum + bias, then an activation
neuron = nn.Sequential(nn.Linear(2, 1), nn.Sigmoid())

# 2) A tiny multilayer network (an MLP): input -> hidden -> output
mlp = nn.Sequential(
    nn.Linear(4, 8),   # 4 inputs to 8 hidden units
    nn.ReLU(),         # nonlinearity
    nn.Linear(8, 3),   # 8 hidden to 3 class scores (logits)
)
logits = mlp(torch.randn(5, 4))     # 5 examples
probs  = torch.softmax(logits, dim=-1)  # scores -> probabilities

# 3) A CNN layer: scans image-like grids with shared filters
conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3)

# 4) A recurrent layer (GRU): walks a whole sequence, carrying hidden state
rnn = nn.GRU(input_size=10, hidden_size=20, batch_first=True)

# 5) A transformer encoder layer: attention + feed-forward
layer  = nn.TransformerEncoderLayer(d_model=32, nhead=4, batch_first=True)
tokens = torch.randn(2, 5, 32)      # 2 sentences, 5 tokens, 32 features
out    = layer(tokens)
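The attention step inside that encoder layer can be sketched without any library. This is a simplified illustration of scaled dot-product attention, not what `nn.TransformerEncoderLayer` does internally: it uses a single head and feeds the raw token vectors in as queries, keys, and values, whereas a real layer first applies learned projections.

```python
import math

def softmax(scores):
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Score this token against every token, scaled by sqrt(d)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)          # how much to focus on each token
        # Output is the weighted blend of the value vectors
        blended = [sum(w * v[c] for w, v in zip(weights, values))
                   for c in range(len(values[0]))]
        outputs.append(blended)
        all_weights.append(weights)
    return outputs, all_weights

tokens = [[1.0, 0.0], [0.0, 1.0]]   # two toy token vectors
outputs, weights = attention(tokens, tokens, tokens)
print([[round(w, 3) for w in row] for row in weights])
```

Each row of weights sums to 1, and with these toy vectors every token attends most strongly to itself.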

Challenge

Checkpoint quiz.

No grades, no doom, just proof the map is sticking. Each answer gives instant feedback.

Launchpad quiz


Reflect

In your own words: what is the difference between AI, ML, and deep learning?

Pick any concept above. When would you reach for it, and why that one?

These are just for you; nothing is sent or stored anywhere.

Launch

You are ready. Pick your first room.

You've got the map and the core math. The other rooms (with what each one teaches) are laid out in the mission guide at the top. A good first step is Machine Learning (the big idea behind all of it).

Start with Machine Learning → · Back to the mission guide