Guided curriculum
Understand every moving part of a neural network.
This learning path connects the concepts directly to the live GradVex playground, so every formula has an experiment attached to it.
Signal flow — what each layer does
Input Layer
Raw pixel values
Converts your 28×28 drawing into 784 numbers (0=black, 1=white). No spatial understanding — just a flat list of brightness values. Every pixel gets its own neuron.
Analogy
Like reading a spreadsheet — the network sees columns of numbers, not a picture.
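The flattening step described above can be sketched in a few lines of plain Python. This is a minimal illustration, not GradVex's actual code: the helper name `flatten_image`, the row-by-row ordering, and the sample drawing are all assumptions made for the example.

```python
def flatten_image(image):
    """Flatten a 2D grid of brightness values into one flat list, row by row."""
    return [pixel for row in image for pixel in row]

# Hypothetical 28x28 drawing: all black (0.0) with a short white (1.0) vertical stroke.
image = [[0.0] * 28 for _ in range(28)]
for row in range(10, 18):
    image[row][14] = 1.0

x = flatten_image(image)

print(len(x))            # 784
print(x[10 * 28 + 14])   # 1.0 (row 10, column 14 of the original grid)
```

After flattening, the pixel at row r and column c sits at index r * 28 + c. The network receives only that index, not the fact that two pixels were neighbors in the drawing.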
x ∈ ℝ⁷⁸⁴
Hidden Layer 1
Stroke & edge detection
Each of the 128 neurons looks at all 784 pixels with a different weighted lens. Some neurons learn to fire on horizontal strokes, others on curves or diagonals. ReLU zeros out negatives — only positive evidence passes through.
Analogy
Like 128 inspectors, each trained to spot one type of pen stroke.
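To make the weighted sum and the ReLU step concrete, here is a toy-sized sketch with 3 inputs and 2 neurons instead of 784 and 128. The weights, biases, and helper names (`dense`, `relu`) are invented for illustration; a trained network would have learned its own values.

```python
def relu(values):
    """Replace every negative value with 0.0 and keep positive values unchanged."""
    return [max(0.0, v) for v in values]

def dense(weights, biases, inputs):
    """Compute one weighted sum plus bias per neuron (one row of weights per neuron)."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

x = [1.0, 0.0, 1.0]            # toy input with 3 "pixels"
W1 = [[0.5, -1.0, 0.5],        # neuron 0: rewards pixels 0 and 2
      [-1.0, 2.0, -1.0]]       # neuron 1: rewards pixel 1, penalizes the others
b1 = [0.0, 0.5]

z1 = dense(W1, b1, x)          # raw sums before the activation
a1 = relu(z1)                  # activations after ReLU

print(z1)  # [1.0, -1.5]
print(a1)  # [1.0, 0.0]
```

Neuron 1 computes a negative sum for this input, so ReLU silences it; only neuron 0 passes evidence on to the next layer.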
a₁ = ReLU(W₁x + b₁)
Hidden Layer 2
Shape & part composition
The 64 neurons combine stroke evidence from Hidden Layer 1 into higher-level shapes: loops, curves, corners, crossings. A "closed loop" detector, for example, might combine several curve detectors. This is where digit parts emerge.
Analogy
Like assembling Lego bricks — strokes combine into recognizable digit parts.
a₂ = ReLU(W₂a₁ + b₂)
Output Layer
Digit classification
One neuron per digit (0–9). Each reads the Hidden Layer 2 features and produces a raw score. Softmax converts all 10 scores into probabilities that sum to 1. The highest probability wins.
Analogy
Like 10 judges each voting on how likely the drawing is their digit.
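The softmax step can be sketched as follows. The ten raw scores are made up for the example, and subtracting the maximum before exponentiating is a common numerical-stability habit rather than something the description above prescribes.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)                           # shift so the largest exponent is 0
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for digits 0 through 9; digit 7 has the strongest evidence.
scores = [0.1, 0.3, 0.2, 0.0, 0.1, 0.2, 0.4, 3.0, 0.5, 1.2]

probs = softmax(scores)
predicted = probs.index(max(probs))

print(predicted)  # 7
```

Softmax preserves the ranking of the raw scores, so the digit with the highest score is also the digit with the highest probability.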
ŷ = Softmax(W₃a₂ + b₃)
STEP 01
Input
784 normalized pixels
STEP 02
Hidden 1
Stroke detectors
STEP 03
Hidden 2
Shape combinations
STEP 04
Output
10 digit probabilities
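The four steps can be chained into a single forward pass. The sketch below uses the layer sizes from this page (784, 128, 64, 10) but fills the weights with small random numbers, so its output is not a meaningful prediction. The helper names and the initialization range are assumptions for illustration, not GradVex internals.

```python
import math
import random

def dense(weights, biases, inputs):
    """One weighted sum plus bias per neuron."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def relu(values):
    return [max(0.0, v) for v in values]

def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def init_layer(n_out, n_in, rng):
    """Untrained placeholder layer: small random weights, zero biases."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def forward(x, params):
    """Run steps 01 to 04: input -> hidden 1 -> hidden 2 -> output probabilities."""
    (W1, b1), (W2, b2), (W3, b3) = params
    a1 = relu(dense(W1, b1, x))         # 128 values
    a2 = relu(dense(W2, b2, a1))        # 64 values
    y_hat = softmax(dense(W3, b3, a2))  # 10 probabilities
    return a1, a2, y_hat

rng = random.Random(0)
params = [init_layer(128, 784, rng), init_layer(64, 128, rng), init_layer(10, 64, rng)]
x = [rng.random() for _ in range(784)]  # stand-in for 784 normalized pixels

a1, a2, y_hat = forward(x, params)
print(len(a1), len(a2), len(y_hat))     # 128 64 10
```

Training would adjust the entries of `params`; the structure of `forward` stays the same.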
Beginner
1. Mental model
A neural network is a stack of small decision units that transform raw input into useful evidence.
Formula
prediction = model(input)
In GradVex, the input is a handwritten digit represented by 784 pixel values.
Each layer turns the previous representation into a more useful one: pixels become strokes, strokes become shapes, shapes become digit evidence.
The network is not memorizing one drawing. It learns reusable patterns from many examples.
Real-world example
Fraud systems transform raw transaction fields into risk evidence; medical imaging models transform pixels into signs of disease.
Advantages
- Easy mental model for beginners
- Works across images, tabular data, audio features, and text embeddings
Limitations
- Can still be hard to interpret at scale
- Needs representative training data to generalize well
Try it in the playground
Draw a 7, then draw it with a crossbar. Watch which output probabilities compete.