Guided curriculum

Understand every moving part of a neural network.

This learning path connects the concepts directly to the live GradVex playground, so every formula has an experiment attached to it.

Signal flow — what each layer does

01 · 784 neurons

Input Layer

Raw pixel values

Converts your 28×28 drawing into 784 numbers, one per pixel, each ranging from 0 (black) to 1 (white). The layer has no spatial understanding: it sees only a flat list of brightness values. Every pixel gets its own neuron.

Analogy

Like reading a spreadsheet — the network sees columns of numbers, not a picture.

x ∈ ℝ⁷⁸⁴
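The flattening step can be sketched in a few lines of plain Python. This is an illustration only: the helper name `flatten_image` and the assumption that raw brightness arrives as 0 to 255 integers are hypothetical, not taken from the GradVex source.

```python
# Hypothetical helper: flatten a 28x28 grid of 0-255 brightness values
# into a flat list of 784 floats between 0.0 (black) and 1.0 (white).
def flatten_image(grid):
    """Row-major flatten, scaling each pixel from 0-255 down to 0.0-1.0."""
    return [pixel / 255.0 for row in grid for pixel in row]

# Toy drawing: all black except one vertical white stroke in column 14.
grid = [[255 if col == 14 else 0 for col in range(28)] for _ in range(28)]

x = flatten_image(grid)
print(len(x))   # 784
print(sum(x))   # 28.0 (one white pixel per row)
```

After flattening, the pixel at row r and column c sits at index r * 28 + c, so vertically adjacent pixels end up 28 positions apart. That is the sense in which the input layer has no spatial understanding.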

02 · 128 neurons · ReLU

Hidden Layer 1

Stroke & edge detection

Each of the 128 neurons looks at all 784 pixels with a different weighted lens. Some neurons learn to fire on horizontal strokes, others on curves or diagonals. ReLU zeros out negatives — only positive evidence passes through.

Analogy

Like 128 inspectors, each trained to spot one type of pen stroke.

a₁ = ReLU(W₁x + b₁)
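A minimal sketch of this formula, using only the Python standard library. The weights here are random placeholders rather than trained values, and the function names `dense` and `relu` are chosen for illustration.

```python
import random

def dense(weights, bias, inputs):
    """Compute W·x + b, where `weights` is a list of rows (one row per neuron)."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, bias)]

def relu(values):
    """Replace every negative value with 0.0 and keep positives unchanged."""
    return [max(0.0, v) for v in values]

rng = random.Random(0)                      # fixed seed for repeatability
x = [rng.random() for _ in range(784)]      # stand-in for a flattened drawing

# Assumption: small random weights and zero biases, as an untrained stand-in.
W1 = [[rng.uniform(-0.05, 0.05) for _ in range(784)] for _ in range(128)]
b1 = [0.0] * 128

a1 = relu(dense(W1, b1, x))
print(len(a1))            # 128 activations, one per neuron
print(min(a1) >= 0.0)     # True: ReLU leaves no negative values
```

Each row of `W1` is one neuron's "weighted lens" over all 784 pixels. In a trained network those rows would encode stroke patterns instead of noise.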

03 · 64 neurons · ReLU

Hidden Layer 2

Shape & part composition

The 64 neurons combine stroke evidence from Hidden Layer 1 into higher-level shapes such as loops, curves, corners, and crossings. A "closed loop" detector, for example, might combine several curve detectors. This is where digit parts emerge.

Analogy

Like assembling Lego bricks — strokes combine into recognizable digit parts.

a₂ = ReLU(W₂a₁ + b₂)
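To make the composition idea concrete, here is a toy version of a single Hidden Layer 2 neuron. The four curve inputs, the weights, and the bias are hand-picked for illustration. A trained network learns its own values, and they are rarely this tidy.

```python
def relu(value):
    """Pass positive evidence through, zero out everything else."""
    return max(0.0, value)

def loop_detector(curve_activations):
    """Hypothetical neuron that fires only when all four curve detectors fire.

    curve_activations: [top, bottom, left, right] curve evidence from the
    previous layer, each assumed to be between 0.0 and 1.0.
    """
    weights = [1.0, 1.0, 1.0, 1.0]   # hand-picked, not learned
    bias = -3.0                      # needs more than 3 units of evidence
    total = sum(w * a for w, a in zip(weights, curve_activations)) + bias
    return relu(total)

print(loop_detector([1.0, 1.0, 1.0, 1.0]))  # 1.0 (all four curves present)
print(loop_detector([1.0, 1.0, 0.0, 0.0]))  # 0.0 (only half a loop)
```

The negative bias acts as a threshold: partial evidence is cancelled by ReLU, so the neuron responds only to the full combination.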

04 · 10 neurons · Softmax

Output Layer

Digit classification

One neuron per digit (0–9). Each reads the Hidden Layer 2 features and produces a raw score. Softmax converts all 10 scores into probabilities that sum to 1. The digit with the highest probability wins.

Analogy

Like 10 judges each voting on how likely the drawing is their digit.

ŷ = Softmax(W₃a₂ + b₃)
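The softmax step can be sketched as follows. The ten raw scores are invented for illustration, and subtracting the maximum score before exponentiating is a common numerical-stability choice, not something the playground is known to do.

```python
import math

def softmax(scores):
    """Turn raw scores into positive probabilities that sum to 1."""
    peak = max(scores)                              # shift for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up raw scores for digits 0 through 9 (index = digit).
scores = [0.1, 0.3, -1.2, 0.0, 0.4, -0.5, 0.2, 3.1, -0.8, 1.9]

probs = softmax(scores)
predicted = max(range(10), key=lambda digit: probs[digit])
print(predicted)  # 7, the digit with the largest raw score
```

Softmax preserves the ranking of the raw scores, so the winning digit is always the one with the highest score. What it adds is a comparable scale: the runner-up (9 in this example) keeps a visible share of the probability.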

STEP 01

Input

784 normalized pixels

STEP 02

Hidden 1

Stroke detectors

STEP 03

Hidden 2

Shape combinations

STEP 04

Output

10 digit probabilities
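For a rough sense of scale, the four steps above imply a fixed number of learnable values. The sketch below counts them, assuming every layer is fully connected with one bias per neuron. The layer formulas suggest this, but the playground's exact implementation is not shown here.

```python
# Layer sizes taken from the signal-flow description: 784 -> 128 -> 64 -> 10.
layer_sizes = [784, 128, 64, 10]

def count_parameters(sizes):
    """Weights (n_in * n_out) plus one bias per output neuron, per layer."""
    return [n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:])]

per_layer = count_parameters(layer_sizes)
print(per_layer)       # [100480, 8256, 650]
print(sum(per_layer))  # 109386
```

Most of the parameters sit in the first hidden layer, because that is where each neuron connects to all 784 pixels.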

Beginner

1. Mental model

A neural network is a stack of small decision units that transform raw input into useful evidence.

Formula

prediction = model(input)

In GradVex, the input is a handwritten digit represented by 784 pixel values.

Each layer turns the previous representation into a more useful one: pixels become strokes, strokes become shapes, shapes become digit evidence.

The network is not memorizing one drawing. It learns reusable patterns from many examples.
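Putting the pieces together, `prediction = model(input)` can be read as three transformations applied in order. The sketch below uses random, untrained weights, so its output probabilities are meaningless. The helper names and the initialization range are assumptions for illustration only.

```python
import math
import random

def dense(weights, bias, inputs):
    """Compute W·x + b with `weights` stored as a list of rows."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, bias)]

def relu(values):
    return [max(0.0, v) for v in values]

def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def init_layer(rng, n_out, n_in):
    """Hypothetical initializer: small random weights, zero biases."""
    weights = [[rng.uniform(-0.05, 0.05) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out

def model(x, layers):
    """pixels -> stroke evidence -> shape evidence -> digit probabilities."""
    (W1, b1), (W2, b2), (W3, b3) = layers
    a1 = relu(dense(W1, b1, x))        # 784 -> 128
    a2 = relu(dense(W2, b2, a1))       # 128 -> 64
    return softmax(dense(W3, b3, a2))  # 64 -> 10

rng = random.Random(42)
layers = [init_layer(rng, 128, 784),
          init_layer(rng, 64, 128),
          init_layer(rng, 10, 64)]

x = [rng.random() for _ in range(784)]   # stand-in for a flattened drawing
prediction = model(x, layers)
print(len(prediction))                   # 10 probabilities, one per digit
```

Training would adjust the values inside `layers` while leaving `model` itself unchanged, which is why the same function can go from guessing to recognizing digits.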

Real-world example

Fraud systems transform raw transaction fields into risk evidence; medical imaging models transform pixels into signs of disease.

Advantages

Easy mental model for beginners
Works across images, tabular data, audio features, and text embeddings

Limitations

Hard to interpret once the network grows large
Needs representative training data to generalize well

Try it in the playground

Draw a 7, then draw it with a crossbar. Watch which output probabilities compete.