manim-mcp 0.1.0
- package/README.md +104 -0
- package/dist/demo.mp4 +0 -0
- package/dist/index.js +65 -0
- package/dist/mcp-app.html +142 -0
- package/dist/server.js +1492 -0
- package/package.json +67 -0
- package/references/composer/SKILL.md +154 -0
- package/references/composer/references/3b1b-series-patterns.md +217 -0
- package/references/composer/references/domain-planning-guides/calculus-planning.md +188 -0
- package/references/composer/references/domain-planning-guides/linear-algebra-planning.md +169 -0
- package/references/composer/references/domain-planning-guides/ml-planning.md +286 -0
- package/references/composer/references/domain-planning-guides/number-theory-planning.md +187 -0
- package/references/composer/references/domain-planning-guides/physics-planning.md +249 -0
- package/references/composer/references/domain-planning-guides/probability-planning.md +200 -0
- package/references/composer/references/mathematical-storytelling.md +359 -0
- package/references/composer/references/narrative-patterns.md +221 -0
- package/references/composer/references/opening-patterns.md +284 -0
- package/references/composer/references/pacing-guide.md +289 -0
- package/references/composer/references/scene-archetypes.md +534 -0
- package/references/composer/references/scene-examples.md +379 -0
- package/references/composer/references/visual-techniques.md +480 -0
- package/references/composer/templates/scenes-template.md +147 -0
- package/references/manimce/SKILL.md +166 -0
- package/references/manimce/examples/3d_visualization.py +373 -0
- package/references/manimce/examples/basic_animations.py +212 -0
- package/references/manimce/examples/graph_plotting.py +401 -0
- package/references/manimce/examples/lorenz_attractor.py +172 -0
- package/references/manimce/examples/math_visualization.py +315 -0
- package/references/manimce/examples/updater_patterns.py +369 -0
- package/references/manimce/rules/3b1b-translation.md +594 -0
- package/references/manimce/rules/3d.md +254 -0
- package/references/manimce/rules/advanced-animations.md +594 -0
- package/references/manimce/rules/animation-groups.md +212 -0
- package/references/manimce/rules/animations.md +128 -0
- package/references/manimce/rules/api-pitfalls.md +89 -0
- package/references/manimce/rules/axes.md +214 -0
- package/references/manimce/rules/camera.md +208 -0
- package/references/manimce/rules/cli.md +232 -0
- package/references/manimce/rules/color-conventions.md +444 -0
- package/references/manimce/rules/colors.md +199 -0
- package/references/manimce/rules/config.md +264 -0
- package/references/manimce/rules/creation-animations.md +158 -0
- package/references/manimce/rules/graphing.md +233 -0
- package/references/manimce/rules/grouping.md +220 -0
- package/references/manimce/rules/latex.md +202 -0
- package/references/manimce/rules/lines.md +241 -0
- package/references/manimce/rules/long-form-video.md +552 -0
- package/references/manimce/rules/mathematical-domains.md +689 -0
- package/references/manimce/rules/mobjects.md +116 -0
- package/references/manimce/rules/multi-scene-composition.md +112 -0
- package/references/manimce/rules/pedagogy.md +532 -0
- package/references/manimce/rules/physics-simulations.md +610 -0
- package/references/manimce/rules/positioning.md +211 -0
- package/references/manimce/rules/scenes.md +121 -0
- package/references/manimce/rules/shapes.md +300 -0
- package/references/manimce/rules/styling.md +177 -0
- package/references/manimce/rules/text-animations.md +222 -0
- package/references/manimce/rules/text.md +189 -0
- package/references/manimce/rules/timing.md +227 -0
- package/references/manimce/rules/transform-animations.md +157 -0
- package/references/manimce/rules/updaters.md +226 -0
- package/references/manimce/templates/basic_scene.py +64 -0
- package/references/manimce/templates/camera_scene.py +100 -0
- package/references/manimce/templates/threed_scene.py +138 -0
|
@@ -0,0 +1,169 @@
|
|
|
1
|
+
# Linear Algebra Video Planning Guide
|
|
2
|
+
|
|
3
|
+
Based on the Essence of Linear Algebra series (2016), supplemental linear algebra content across multiple years, and cross-cutting patterns from the 3b1b codebase.
|
|
4
|
+
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
## Core Visual Language
|
|
8
|
+
|
|
9
|
+
### The Grid Transformation (THE signature visual)
|
|
10
|
+
|
|
11
|
+
Every linear algebra concept in 3b1b is grounded in the `NumberPlane` grid transformation. This is non-negotiable -- it IS the visual language.
|
|
12
|
+
|
|
13
|
+
```python
|
|
14
|
+
plane = NumberPlane()
|
|
15
|
+
i_hat = Vector(RIGHT, color=GREEN) # Always GREEN
|
|
16
|
+
j_hat = Vector(UP, color=RED) # Always RED
|
|
17
|
+
```
|
|
18
|
+
|
|
19
|
+
**Color conventions (from EoLA):**
|
|
20
|
+
- i-hat (first basis vector): GREEN
|
|
21
|
+
- j-hat (second basis vector): RED
|
|
22
|
+
- k-hat (third basis vector, 3D): Typically BLUE or MAROON
|
|
23
|
+
- Arbitrary vector v: YELLOW
|
|
24
|
+
- Grid lines: BLUE variations (BLUE_D, BLUE_B)
|
|
25
|
+
- Determinant area: Fill with opacity, color by sign (positive=BLUE, negative=RED)
|
|
26
|
+
- Eigenspace: Distinct color, often TEAL or MAROON
|
|
27
|
+
|
|
28
|
+
### Transformation Matrix Display
|
|
29
|
+
|
|
30
|
+
Always show the matrix alongside the grid transformation:
|
|
31
|
+
```python
|
|
32
|
+
matrix = Matrix([[a, b], [c, d]])
|
|
33
|
+
# Position matrix to the side (usually upper-right or lower-right)
|
|
34
|
+
# Column 1 colored GREEN (where i-hat lands)
|
|
35
|
+
# Column 2 colored RED (where j-hat lands)
|
|
36
|
+
```
|
|
37
|
+
|
|
38
|
+
---
|
|
39
|
+
|
|
40
|
+
## Chapter Planning by Topic
|
|
41
|
+
|
|
42
|
+
### Vectors
|
|
43
|
+
|
|
44
|
+
**Key scenes:**
|
|
45
|
+
1. Three perspectives (physics arrow, CS list, mathematician's abstract element)
|
|
46
|
+
2. Addition as tip-to-tail
|
|
47
|
+
3. Scaling as stretching/shrinking
|
|
48
|
+
4. Coordinates as scalars that scale the basis vectors
|
|
49
|
+
|
|
50
|
+
**Visual techniques:**
|
|
51
|
+
- `Arrow` for vectors (NOT `Line` -- arrows have direction)
|
|
52
|
+
- `vector_coordinate_label()` for showing [x, y] next to a vector
|
|
53
|
+
- Color-coded components: x-component in one color, y-component in another
|
|
54
|
+
|
|
55
|
+
### Linear Transformations
|
|
56
|
+
|
|
57
|
+
**Key scenes:**
|
|
58
|
+
1. Show transformation by tracking grid lines
|
|
59
|
+
2. "Where do i-hat and j-hat land?" -- the key insight
|
|
60
|
+
3. Any vector v = xi + yj transforms to x(new i) + y(new j)
|
|
61
|
+
4. `apply_matrix()` on NumberPlane for the dramatic transformation
|
|
62
|
+
|
|
63
|
+
**The critical insight to sell:** If you know where the basis vectors go, you know where EVERYTHING goes. This is why matrices have two columns -- one for each basis vector's landing spot.
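A quick numeric check of this claim, written as a minimal sketch in plain Python rather than Manim (the helper names `transform_via_basis` and `matrix_times_vector` are hypothetical, not part of any library): scaling the landing spots of the basis vectors gives the same result as the usual matrix-vector product.

```python
def transform_via_basis(new_i, new_j, v):
    """Map v = x*i_hat + y*j_hat to x*new_i + y*new_j."""
    x, y = v
    return (x * new_i[0] + y * new_j[0],
            x * new_i[1] + y * new_j[1])


def matrix_times_vector(matrix, v):
    """Ordinary row-times-column product for a 2x2 matrix."""
    (a, b), (c, d) = matrix
    x, y = v
    return (a * x + b * y, c * x + d * y)


# A shear: i-hat stays at (1, 0), j-hat lands on (1, 1).
new_i, new_j = (1, 0), (1, 1)

# The matrix columns are exactly the two landing spots.
shear = [[new_i[0], new_j[0]],
         [new_i[1], new_j[1]]]

v = (2, 3)
landed = transform_via_basis(new_i, new_j, v)
by_matrix = matrix_times_vector(shear, v)
```

Both routes give the same point, which is the geometric reading of the matrix columns that the scene is meant to sell.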
|
|
64
|
+
|
|
65
|
+
### Matrix Multiplication
|
|
66
|
+
|
|
67
|
+
**Key scenes:**
|
|
68
|
+
1. Apply transformation A
|
|
69
|
+
2. Apply transformation B
|
|
70
|
+
3. "What single transformation does both?" -> That's the product BA (read right to left: A is applied first, then B)
|
|
71
|
+
4. Order matters: AB != BA (show a rotation then shear vs shear then rotation)
|
|
72
|
+
|
|
73
|
+
**Visual technique:** Apply one matrix, then a second, then show the composed matrix gives the same result.
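The composition check can be sketched in plain Python. This is an illustrative example with hypothetical helper names (`matmul`, `apply`); the rotation and shear matrices are chosen only for the demonstration. It applies a rotation and then a shear step by step, compares that with the single composed matrix, and shows that swapping the order gives a different matrix.

```python
def matmul(second, first):
    """2x2 product second @ first: the map that applies `first`, then `second`."""
    return [[sum(second[r][k] * first[k][c] for k in range(2))
             for c in range(2)]
            for r in range(2)]


def apply(matrix, v):
    """Apply a 2x2 matrix to a 2D vector."""
    return (matrix[0][0] * v[0] + matrix[0][1] * v[1],
            matrix[1][0] * v[0] + matrix[1][1] * v[1])


rotation = [[0, -1], [1, 0]]  # 90 degrees counterclockwise
shear = [[1, 1], [0, 1]]

v = (1, 2)
step_by_step = apply(shear, apply(rotation, v))  # rotate first, then shear

rotate_then_shear = matmul(shear, rotation)
shear_then_rotate = matmul(rotation, shear)
```

Note the right-to-left reading: the transformation applied first sits on the right of the product.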
|
|
74
|
+
|
|
75
|
+
### Determinant
|
|
76
|
+
|
|
77
|
+
**Key scenes:**
|
|
78
|
+
1. Unit square with area 1
|
|
79
|
+
2. Transform: watch the area change
|
|
80
|
+
3. Determinant = factor by which areas scale
|
|
81
|
+
4. Negative determinant = orientation flip (grid "flips")
|
|
82
|
+
5. Determinant 0 = squished to lower dimension
|
|
83
|
+
|
|
84
|
+
**Visual techniques:**
|
|
85
|
+
- Fill the unit square with semi-transparent color
|
|
86
|
+
- Show area value updating as transformation is applied
|
|
87
|
+
- Use `SurroundingRectangle` or `Polygon` for area visualization
|
|
88
|
+
- Animate orientation flip by showing handedness (clockwise vs counterclockwise)
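The number shown by an "area value" label can be computed directly from the transformed unit square. A minimal sketch in plain Python, assuming hypothetical helper names (`signed_area`, `transform_point`, `area_after`): the shoelace formula returns a signed area, so its sign tracks the orientation flip and its value matches the determinant.

```python
def signed_area(polygon):
    """Shoelace formula; positive when vertices run counterclockwise."""
    total = 0.0
    n = len(polygon)
    for k in range(n):
        x1, y1 = polygon[k]
        x2, y2 = polygon[(k + 1) % n]
        total += x1 * y2 - x2 * y1
    return total / 2


def transform_point(matrix, p):
    """Apply a 2x2 matrix to a point."""
    return (matrix[0][0] * p[0] + matrix[0][1] * p[1],
            matrix[1][0] * p[0] + matrix[1][1] * p[1])


UNIT_SQUARE = [(0, 0), (1, 0), (1, 1), (0, 1)]  # counterclockwise


def area_after(matrix):
    """Signed area of the unit square after the transformation."""
    return signed_area([transform_point(matrix, p) for p in UNIT_SQUARE])


stretch = [[3, 0], [0, 2]]   # scales areas by 6
flip = [[0, 1], [1, 0]]      # swaps the axes: orientation reverses
squish = [[1, 2], [2, 4]]    # columns are parallel: everything lands on a line
```

Evaluating `area_after` at intermediate matrices between the identity and the target is one way to drive an updating label during the animation.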
|
|
89
|
+
|
|
90
|
+
### Eigenvectors and Eigenvalues
|
|
91
|
+
|
|
92
|
+
**Key scenes:**
|
|
93
|
+
1. Apply transformation: most vectors change direction
|
|
94
|
+
2. But SOME vectors stay on their span (just scaled)
|
|
95
|
+
3. These are eigenvectors; the scaling factor is the eigenvalue
|
|
96
|
+
4. Eigenbasis: if all basis vectors are eigenvectors, the matrix is diagonal
|
|
97
|
+
|
|
98
|
+
**Visual techniques:**
|
|
99
|
+
- Highlight eigenvectors with distinct color before and after transformation
|
|
100
|
+
- Show the "span line" that the eigenvector stays on
|
|
101
|
+
- Arrow staying on its line while other arrows rotate
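Which arrows to highlight can be decided numerically before animating. The sketch below is plain Python with hypothetical helper names (`stays_on_span`, `scaling_factor`) and an illustrative matrix; it tests whether a vector stays on its own span by checking that the 2D cross product of the vector and its image is zero.

```python
def apply(matrix, v):
    """Apply a 2x2 matrix to a 2D vector."""
    return (matrix[0][0] * v[0] + matrix[0][1] * v[1],
            matrix[1][0] * v[0] + matrix[1][1] * v[1])


def stays_on_span(matrix, v, tol=1e-9):
    """True if matrix @ v is parallel to v (v is assumed to be nonzero)."""
    w = apply(matrix, v)
    return abs(v[0] * w[1] - v[1] * w[0]) < tol


def scaling_factor(matrix, v):
    """For an eigenvector v, the eigenvalue: how much v is stretched."""
    w = apply(matrix, v)
    k = 0 if abs(v[0]) >= abs(v[1]) else 1  # divide by the larger component
    return w[k] / v[k]


matrix = [[3, 1], [0, 2]]

on_span = [v for v in [(1, 0), (-1, 1), (0, 1), (1, 1)]
           if stays_on_span(matrix, v)]
```

For this matrix only `(1, 0)` and `(-1, 1)` pass the test, with scaling factors 3 and 2; the other arrows get knocked off their span.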
|
|
102
|
+
|
|
103
|
+
---
|
|
104
|
+
|
|
105
|
+
## The "Your Coordinates vs Jennifer's Coordinates" Pattern
|
|
106
|
+
|
|
107
|
+
From EoLA Chapter 10 (Change of Basis). This is a powerful pedagogical device: personify different bases as different people's "languages."
|
|
108
|
+
|
|
109
|
+
**Structure:**
|
|
110
|
+
1. Introduce "your" standard basis (i-hat, j-hat)
|
|
111
|
+
2. Introduce "Jennifer's" basis (different vectors)
|
|
112
|
+
3. Same vector, described differently in each system
|
|
113
|
+
4. The change-of-basis matrix is a "translator"
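The "translator" idea can be sketched in plain Python. The basis vectors below are illustrative assumptions, and the function names `jennifer_to_ours` and `ours_to_jennifer` are hypothetical. The change-of-basis matrix has Jennifer's basis vectors (written in our coordinates) as its columns, and its inverse translates in the other direction.

```python
from fractions import Fraction

# Jennifer's basis vectors, written in our standard coordinates (assumed values).
B1 = (2, 1)
B2 = (-1, 1)


def jennifer_to_ours(coords):
    """Translate Jennifer's coordinates (a, b) into ours: a*B1 + b*B2."""
    a, b = coords
    return (a * B1[0] + b * B2[0], a * B1[1] + b * B2[1])


def ours_to_jennifer(v):
    """Apply the inverse change-of-basis matrix (2x2 inverse formula)."""
    det = B1[0] * B2[1] - B2[0] * B1[1]
    x, y = v
    a = Fraction(x * B2[1] - B2[0] * y, det)
    b = Fraction(B1[0] * y - B1[1] * x, det)
    return (a, b)


same_vector_in_ours = jennifer_to_ours((-1, 2))
same_vector_in_hers = ours_to_jennifer((3, 2))
```

`Fraction` keeps the inverse exact, which is convenient when the translated coordinates are shown as labels.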
|
|
114
|
+
|
|
115
|
+
---
|
|
116
|
+
|
|
117
|
+
## 3D Linear Algebra
|
|
118
|
+
|
|
119
|
+
From EoLA Chapter 5 and `_2015/three_dimensions.py`:
|
|
120
|
+
|
|
121
|
+
**Key differences from 2D:**
|
|
122
|
+
- Three basis vectors: i-hat, j-hat, k-hat
|
|
123
|
+
- 3x3 matrices
|
|
124
|
+
- Determinant = volume scaling
|
|
125
|
+
- Cross product as the "third vector" from two inputs
|
|
126
|
+
|
|
127
|
+
**Visual techniques:**
|
|
128
|
+
- `ThreeDScene` or `InteractiveScene` with `frame.reorient()`
|
|
129
|
+
- `set_floor_plane("xz")` for ground-plane perspective
|
|
130
|
+
- `always_sort_to_camera()` for depth ordering
|
|
131
|
+
- Ambient camera rotation: `frame.add_updater(lambda m, dt: m.increment_theta(-0.1 * dt))`
|
|
132
|
+
|
|
133
|
+
---
|
|
134
|
+
|
|
135
|
+
## Equation Patterns
|
|
136
|
+
|
|
137
|
+
### Common LaTeX with Color Coding
|
|
138
|
+
|
|
139
|
+
```python
|
|
140
|
+
# Matrix-vector product
|
|
141
|
+
Tex(R"""
|
|
142
|
+
\begin{bmatrix} a & b \\ c & d \end{bmatrix}
|
|
143
|
+
\begin{bmatrix} x \\ y \end{bmatrix}
|
|
144
|
+
= x \begin{bmatrix} a \\ c \end{bmatrix}
|
|
145
|
+
+ y \begin{bmatrix} b \\ d \end{bmatrix}
|
|
146
|
+
""", t2c={
|
|
147
|
+
"a": GREEN, "c": GREEN, # First column = i-hat landing
|
|
148
|
+
"b": RED, "d": RED, # Second column = j-hat landing
|
|
149
|
+
"x": YELLOW, "y": YELLOW, # Input vector components
|
|
150
|
+
})
|
|
151
|
+
```
|
|
152
|
+
|
|
153
|
+
### Eigenvalue equation
|
|
154
|
+
```python
|
|
155
|
+
Tex(R"A \vec{v} = \lambda \vec{v}", t2c={
|
|
156
|
+
R"\vec{v}": MAROON,
|
|
157
|
+
R"\lambda": YELLOW,
|
|
158
|
+
})
|
|
159
|
+
```
|
|
160
|
+
|
|
161
|
+
---
|
|
162
|
+
|
|
163
|
+
## Common Pitfalls in Linear Algebra Videos
|
|
164
|
+
|
|
165
|
+
1. **Jumping to computation too fast**: Always show the GEOMETRIC meaning before the arithmetic.
|
|
166
|
+
2. **Not animating the transformation**: Static before/after images don't work. The viewer needs to WATCH the grid move.
|
|
167
|
+
3. **Treating matrix multiplication as "row times column"**: This is the computation, not the meaning. The meaning is composition of transformations.
|
|
168
|
+
4. **Ignoring the null space**: When the determinant is 0, SHOW the squishing. This is often more illuminating than the non-degenerate case.
|
|
169
|
+
5. **Abstract spaces without grounding**: If discussing function spaces, first show the concrete 2D analogy, THEN generalize.
|
|
@@ -0,0 +1,286 @@
|
|
|
1
|
+
# Machine Learning Video Planning Guide
|
|
2
|
+
|
|
3
|
+
Based on the Neural Networks series (2017), Transformers series (2024), and related ML content across the 3b1b codebase.
|
|
4
|
+
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
## Core Visual Language
|
|
8
|
+
|
|
9
|
+
### Neural Network Visualization
|
|
10
|
+
|
|
11
|
+
**`NeuralNetwork` mobject** (`_2024/transformers/helpers.py`):
|
|
12
|
+
```python
class NeuralNetwork(VGroup):
    def __init__(self, layer_sizes=[5, 10, 5]):
        self.layers = VGroup(VGroup(Dot() for _ in range(n)) for n in layer_sizes)
        self.lines = VGroup(...)  # Connections between layers

    def randomize_line_style(self):
        # Stroke widths and colors vary by weight value
        ...

    def randomize_layer_values(self):
        # Dot fill opacities represent activation levels
        ...
```
|
|
24
|
+
|
|
25
|
+
**Key principle**: Dot opacity = activation level. Connection stroke width = weight magnitude. Connection color = weight sign (blue=positive, red=negative).
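The mapping in this principle can be written as two small pure functions. This is a sketch only: the names `connection_style` and `neuron_opacity`, the maximum stroke width, and the clamping ranges are all assumptions for illustration, not the helpers used in the 3b1b codebase. Colors are returned as plain strings so the example runs without Manim.

```python
def connection_style(weight, max_width=3.0, max_abs_weight=1.0):
    """Map a weight to a stroke width (by magnitude) and a color (by sign)."""
    magnitude = min(abs(weight) / max_abs_weight, 1.0)
    return {
        "stroke_width": max_width * magnitude,
        "color": "BLUE" if weight >= 0 else "RED",
    }


def neuron_opacity(activation):
    """Clamp an activation into [0, 1] so it can be used as a fill opacity."""
    return min(max(activation, 0.0), 1.0)


strong_negative = connection_style(-0.5)
saturated = neuron_opacity(1.7)
```

In a scene, the returned values would be passed to the corresponding stroke and fill setters of each line and dot.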
|
|
26
|
+
|
|
27
|
+
### Weight Matrix Visualization
|
|
28
|
+
|
|
29
|
+
**`WeightMatrix`** (`_2024/transformers/helpers.py`):
|
|
30
|
+
```python
class WeightMatrix(DecimalMatrix):
    # Each entry colored by value_to_color()
    # Blue for positive, Red for negative
    # Intensity proportional to magnitude
    ...

def value_to_color(value):
    # Positive -> BLUE gradient
    # Negative -> RED gradient
    # Near zero -> dark/muted
    ...
```
|
|
41
|
+
|
|
42
|
+
### Embedding Visualization
|
|
43
|
+
|
|
44
|
+
**`NumericEmbedding`** -- Column vector with color-coded entries:
|
|
45
|
+
```python
|
|
46
|
+
class NumericEmbedding(WeightMatrix):
|
|
47
|
+
def __init__(self, length=8):
|
|
48
|
+
super().__init__(shape=(length, 1))
|
|
49
|
+
```
|
|
50
|
+
|
|
51
|
+
**`EmbeddingArray`** -- Row of embeddings for a sequence:
|
|
52
|
+
```python
|
|
53
|
+
class EmbeddingArray(VGroup):
|
|
54
|
+
# Array of NumericEmbedding objects
|
|
55
|
+
# With brackets and dot separators
|
|
56
|
+
```
|
|
57
|
+
|
|
58
|
+
### Machine Metaphor
|
|
59
|
+
|
|
60
|
+
**`MachineWithDials`** -- Visual metaphor for model parameters:
|
|
61
|
+
```python
|
|
62
|
+
class MachineWithDials(VGroup):
|
|
63
|
+
def random_change_animation(self):
|
|
64
|
+
# Dials rotate to new values
|
|
65
|
+
# Represents gradient descent step
|
|
66
|
+
```
|
|
67
|
+
|
|
68
|
+
---
|
|
69
|
+
|
|
70
|
+
## Topic Planning
|
|
71
|
+
|
|
72
|
+
### Neural Network Basics
|
|
73
|
+
|
|
74
|
+
**Series structure (from `_2017/nn/`):**
|
|
75
|
+
|
|
76
|
+
Part 1: What IS a neural network?
|
|
77
|
+
- Start with MNIST digit recognition
|
|
78
|
+
- `PixelsAsSquares`: show raw pixel input
|
|
79
|
+
- Layer-by-layer architecture
|
|
80
|
+
- Activation function (sigmoid, ReLU)
|
|
81
|
+
- "What does each neuron represent?"
|
|
82
|
+
|
|
83
|
+
Part 2: Gradient descent
|
|
84
|
+
- Cost function visualization (landscape with valleys)
|
|
85
|
+
- Negative gradient as direction of steepest descent (the gradient itself points uphill)
|
|
86
|
+
- Learning rate as step size
|
|
87
|
+
- "Ball rolling downhill" analogy
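The Part 2 ideas fit in a few lines of plain Python. A minimal sketch, assuming a one-dimensional cost function chosen only for illustration (the function name `gradient_descent` and its parameters are hypothetical): each step moves against the gradient, and the learning rate sets the step size.

```python
def gradient_descent(grad, start, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient; return every visited point."""
    x = start
    path = [x]
    for _ in range(steps):
        x = x - learning_rate * grad(x)  # downhill = minus the gradient
        path.append(x)
    return path


# Illustrative cost: cost(x) = (x - 3)^2, so the gradient is 2 * (x - 3)
# and the valley floor is at x = 3.
path = gradient_descent(lambda x: 2 * (x - 3), start=0.0)
```

The returned `path` is the sequence a "ball rolling downhill" animation would trace; a larger learning rate takes bigger steps and can overshoot the valley.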
|
|
88
|
+
|
|
89
|
+
Part 3: Backpropagation intuition
|
|
90
|
+
- Which weights matter most?
|
|
91
|
+
- Chain rule through the network
|
|
92
|
+
- Sensitivity analysis
|
|
93
|
+
|
|
94
|
+
Part 4: Backpropagation calculus
|
|
95
|
+
- Full derivative computation
|
|
96
|
+
- Matrix notation
|
|
97
|
+
|
|
98
|
+
**Pixel visualization** (`_2017/nn/part1.py`):
|
|
99
|
+
```python
|
|
100
|
+
class PixelsAsSquares(VGroup):
|
|
101
|
+
# Converts image to grid of colored squares
|
|
102
|
+
# Each square's opacity = pixel brightness
|
|
103
|
+
```
|
|
104
|
+
|
|
105
|
+
### Transformer Architecture
|
|
106
|
+
|
|
107
|
+
**Series structure (from `_2024/transformers/`):**
|
|
108
|
+
|
|
109
|
+
**High-Level Flow** (`network_flow.py`):
|
|
110
|
+
```python
|
|
111
|
+
class HighLevelNetworkFlow(InteractiveScene):
|
|
112
|
+
def show_initial_text_embedding(self):
|
|
113
|
+
# Tokenization -> embedding vectors
|
|
114
|
+
|
|
115
|
+
def progress_through_attention_block(self):
|
|
116
|
+
# 3D block with arc-based attention
|
|
117
|
+
|
|
118
|
+
def progress_through_mlp_block(self):
|
|
119
|
+
# Neuron clusters with weighted connections
|
|
120
|
+
|
|
121
|
+
def mention_repetitions(self):
|
|
122
|
+
# "Many repetitions" brace
|
|
123
|
+
|
|
124
|
+
def show_unembedding(self):
|
|
125
|
+
# Final distribution over vocabulary
|
|
126
|
+
```
|
|
127
|
+
|
|
128
|
+
**Attention Visualization** (`attention.py`):
|
|
129
|
+
- Query-Key dot product as "looking at" another token
|
|
130
|
+
- Softmax as normalization (making weights sum to 1)
|
|
131
|
+
- Value vectors weighted and summed
|
|
132
|
+
- Arc-based visual: arcs between token positions
|
|
133
|
+
|
|
134
|
+
```python
|
|
135
|
+
class ContextAnimation(LaggedStart):
|
|
136
|
+
# Curved arcs between embedding positions
|
|
137
|
+
# Color and width encode attention weight
|
|
138
|
+
```
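The numbers behind the arcs can be sketched in plain Python. This example assumes the standard scaled dot-product formulation (the division by the square root of the dimension is not stated in the list above) and uses hypothetical names (`softmax`, `attend`) and toy vectors. The resulting weights are what the arc color and width would encode.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Weights from query-key dot products, then a weighted sum of values."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    mixed = [sum(w * value[i] for w, value in zip(weights, values))
             for i in range(len(values[0]))]
    return weights, mixed


query = (1.0, 0.0)
keys = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0)]
values = [(1.0, 0.0), (0.0, 1.0), (0.0, 0.0)]

weights, mixed = attend(query, keys, values)
```

The key most aligned with the query receives the largest weight, so its arc would be drawn thickest.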
|
|
139
|
+
|
|
140
|
+
**Embedding Space** (`embedding.py`):
|
|
141
|
+
- Tokenization with tiktoken
|
|
142
|
+
- Color-coded token rectangles
|
|
143
|
+
- Word2Vec demonstrations (king - man + woman ≈ queen)
|
|
144
|
+
|
|
145
|
+
```python
|
|
146
|
+
def break_into_tokens(phrase):
|
|
147
|
+
# Uses tiktoken to split text into GPT tokens
|
|
148
|
+
|
|
149
|
+
def get_piece_rectangles(words):
|
|
150
|
+
# Colored rectangles around each token
|
|
151
|
+
```
|
|
152
|
+
|
|
153
|
+
**Auto-regression** (`auto_regression.py`):
|
|
154
|
+
- Live GPT-2 inference in the animation
|
|
155
|
+
- Bar chart of next-token probabilities
|
|
156
|
+
- Random sampling with animated highlight
|
|
157
|
+
|
|
158
|
+
```python
|
|
159
|
+
def gpt2_predict_next_token(text, n_shown=7):
|
|
160
|
+
tokenizer = get_gpt2_tokenizer()
|
|
161
|
+
model = get_gpt2_model()
|
|
162
|
+
# Returns top tokens and probabilities
|
|
163
|
+
```
|
|
164
|
+
|
|
165
|
+
**The full auto-regressive loop:**
|
|
166
|
+
```python
|
|
167
|
+
class SimpleAutogregression(InteractiveScene):
|
|
168
|
+
def new_selection_cycle(self, text_mob, ...):
|
|
169
|
+
self.animate_text_input(text_mob, machine)
|
|
170
|
+
bar_groups = self.animate_prediction_output(machine, self.cur_str)
|
|
171
|
+
self.animate_random_sample(bar_groups)
|
|
172
|
+
new_text_mob = self.animate_word_addition(bar_groups, text_mob, ...)
|
|
173
|
+
```
|
|
174
|
+
|
|
175
|
+
### ML Taxonomy
|
|
176
|
+
|
|
177
|
+
**From `_2024/transformers/ml_basics.py`:**
|
|
178
|
+
|
|
179
|
+
Nested boxes showing: AI > Machine Learning > Deep Learning > Transformers
|
|
180
|
+
|
|
181
|
+
```python
class MLWithinDeepL(InteractiveScene):
    # `opacity` was undefined in the excerpt; the default here is an assumed value
    def get_titled_box(self, text, color, opacity=0.1):
        title = Text(text)
        box = Rectangle(...)
        box.set_fill(interpolate_color(BLACK, color, opacity), 1)
        box.set_stroke(color, 2)
```
|
|
189
|
+
|
|
190
|
+
---
|
|
191
|
+
|
|
192
|
+
## The 3D Block Metaphor
|
|
193
|
+
|
|
194
|
+
Transformer blocks rendered as 3D prisms create a visceral sense of data flowing through layers:
|
|
195
|
+
|
|
196
|
+
```python
|
|
197
|
+
blocks = VGroup(VPrism(3, 2, 0.2) for n in range(10))
|
|
198
|
+
blocks.set_fill(GREY_D, 1)
|
|
199
|
+
blocks.set_shading(0.25, 0.5, 0.2)
|
|
200
|
+
blocks.arrange(OUT)
|
|
201
|
+
blocks.rotate(self.machine_phi, RIGHT, about_edge=OUT)
|
|
202
|
+
blocks.rotate(self.machine_theta, UP, about_edge=OUT)
|
|
203
|
+
```
|
|
204
|
+
|
|
205
|
+
Data flows INTO the stack, through each layer, and OUT the other side. This is more engaging than flat 2D diagrams.
|
|
206
|
+
|
|
207
|
+
---
|
|
208
|
+
|
|
209
|
+
## Live Model Integration
|
|
210
|
+
|
|
211
|
+
A unique technique in the Transformer series: actual model inference happens during the animation.
|
|
212
|
+
|
|
213
|
+
**GPT-2** (local, via HuggingFace transformers):
|
|
214
|
+
```python
|
|
215
|
+
from transformers import GPT2Tokenizer, GPT2LMHeadModel
|
|
216
|
+
import torch
|
|
217
|
+
|
|
218
|
+
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
|
|
219
|
+
model = GPT2LMHeadModel.from_pretrained('gpt2')
|
|
220
|
+
```
|
|
221
|
+
|
|
222
|
+
**GPT-3** (API, via OpenAI):
|
|
223
|
+
```python
import os
import openai  # legacy (pre-1.0) SDK interface

openai.api_key = os.getenv('OPENAI_KEY')
response = openai.Completion.create(engine="gpt-3.5-turbo-instruct", ...)
```
|
|
227
|
+
|
|
228
|
+
**When to use live model integration:**
|
|
229
|
+
- Demonstrating next-token prediction
|
|
230
|
+
- Showing probability distributions over vocabulary
|
|
231
|
+
- Illustrating temperature effects on sampling
|
|
232
|
+
- Making the abstract architecture tangible
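Temperature effects can be previewed without loading a model. The following is a sketch in plain Python with made-up logits and hypothetical function names (`softmax_with_temperature`, `sample_token`): dividing the logits by the temperature before the softmax sharpens the distribution when the temperature is low and flattens it when the temperature is high.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Softmax over logits / temperature; temperature must be positive."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(tokens, probs, rng):
    """Draw one token according to the given probabilities."""
    return rng.choices(tokens, weights=probs, k=1)[0]


tokens = ["the", "a", "cat"]   # toy vocabulary (assumed)
logits = [2.0, 1.0, 0.1]       # toy scores (assumed)

cold = softmax_with_temperature(logits, 0.5)
neutral = softmax_with_temperature(logits, 1.0)
hot = softmax_with_temperature(logits, 2.0)

picked = sample_token(tokens, hot, random.Random(0))
```

The three lists map directly onto three bar charts of next-token probabilities, one per temperature.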
|
|
233
|
+
|
|
234
|
+
---
|
|
235
|
+
|
|
236
|
+
## Storytelling Patterns for ML
|
|
237
|
+
|
|
238
|
+
### The "Black Box Then Open It" Pattern
|
|
239
|
+
|
|
240
|
+
1. Show the model as a black box: input goes in, output comes out
|
|
241
|
+
2. "But what's happening inside?"
|
|
242
|
+
3. Open one component at a time
|
|
243
|
+
4. By the end, the black box is transparent
|
|
244
|
+
|
|
245
|
+
This is the structure of the entire Transformer series: Part 1 is the black box overview, subsequent parts open each component.
|
|
246
|
+
|
|
247
|
+
### The "Why This Architecture?" Pattern
|
|
248
|
+
|
|
249
|
+
Don't just explain what attention does -- explain why you'd INVENT it:
|
|
250
|
+
1. "What problem are we solving?" (context mixing)
|
|
251
|
+
2. "What's the naive approach?" (fully connected layers)
|
|
252
|
+
3. "Why doesn't that scale?" (parameter explosion)
|
|
253
|
+
4. "What if we could selectively attend?" (attention mechanism)
|
|
254
|
+
|
|
255
|
+
### Concrete Before Abstract
|
|
256
|
+
|
|
257
|
+
- Show MNIST before discussing abstract neural networks
|
|
258
|
+
- Show GPT generating text before discussing transformer architecture
|
|
259
|
+
- Show a specific attention pattern before the general mechanism
|
|
260
|
+
|
|
261
|
+
---
|
|
262
|
+
|
|
263
|
+
## Color Conventions for ML
|
|
264
|
+
|
|
265
|
+
| Element | Color | Context |
|
|
266
|
+
|---------|-------|---------|
|
|
267
|
+
| Positive weights | BLUE gradient | Weight matrices |
|
|
268
|
+
| Negative weights | RED gradient | Weight matrices |
|
|
269
|
+
| Activation (high) | Bright/opaque | Neuron activations |
|
|
270
|
+
| Activation (low) | Dark/transparent | Neuron activations |
|
|
271
|
+
| Input data | BLUE | Network flow |
|
|
272
|
+
| Output/prediction | GREEN or YELLOW | Network flow |
|
|
273
|
+
| Attention arcs | Random bright colors | Token-to-token attention |
|
|
274
|
+
| Embedding vectors | Color-coded per entry | Using value_to_color() |
|
|
275
|
+
| Token rectangles | Pastel variants | Tokenization display |
|
|
276
|
+
|
|
277
|
+
---
|
|
278
|
+
|
|
279
|
+
## Common Pitfalls in ML Videos
|
|
280
|
+
|
|
281
|
+
1. **Architecture diagram without motivation**: Don't show the transformer diagram first. Show the PROBLEM first (language modeling), then derive the architecture.
|
|
282
|
+
2. **Too many numbers**: Weight matrices should be visualized as color grids, not numerical tables.
|
|
283
|
+
3. **Skipping the "why" of softmax**: Don't just say "apply softmax." Show WHY you need to normalize, and why exponentials are the right way.
|
|
284
|
+
4. **Static architecture diagrams**: Use animation to show data FLOWING through the network. LaggedStart, VShowPassingFlash, and arc animations create flow.
|
|
285
|
+
5. **Ignoring scale**: Mention that real models have billions of parameters. The visualized 4x4 matrix stands in for weight matrices with thousands of rows and columns, and so millions of entries each.
|
|
286
|
+
6. **Not grounding in real outputs**: Use actual model outputs (GPT-2 predictions) to make the abstract architecture tangible.
|
|
@@ -0,0 +1,187 @@
|
|
|
1
|
+
# Number Theory Video Planning Guide
|
|
2
|
+
|
|
3
|
+
Based on the Zeta function series (2016, 2022), prime spirals (2019), Euler's identity explorations, and number-theoretic content across the 3b1b codebase.
|
|
4
|
+
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
## Core Visual Language
|
|
8
|
+
|
|
9
|
+
### Prime Number Visualization
|
|
10
|
+
|
|
11
|
+
**Prime-colored number lines** (`_2022/zeta/part1.py`):
|
|
12
|
+
```python
PRIME_COLOR = YELLOW  # Primes are always YELLOW

def get_prime_animation(self, numberline, prime):
    point = numberline.n2p(prime)
    dot = GlowDot(point, color=PRIME_COLOR)
    arrow = Vector(DOWN, color=PRIME_COLOR)
    return AnimationGroup(
        ShowCreation(arrow),
        MoveToTarget(dot, rate_func=rush_into),
        numberline.numbers[prime].animate.set_color(PRIME_COLOR),
    )
```
|
|
25
|
+
|
|
26
|
+
**Staggered prime reveal** using `time_span`:
|
|
27
|
+
```python
|
|
28
|
+
# Primes appear one at a time along the number line
|
|
29
|
+
all_prime_animations.append(self.get_prime_animation(
|
|
30
|
+
numberline, n,
|
|
31
|
+
run_time=11,
|
|
32
|
+
time_span=(t, t + 1) # Staggered timing within one play() call
|
|
33
|
+
))
|
|
34
|
+
```
|
|
35
|
+
|
|
36
|
+
### Prime Spirals
|
|
37
|
+
|
|
38
|
+
**From `_2019/spirals.py`:**
|
|
39
|
+
|
|
40
|
+
Plot natural numbers in polar coordinates: number n at radius n and angle n radians, i.e. the point (r, θ) = (n, n). Primes form visible spiral arms!
|
|
41
|
+
|
|
42
|
+
**Ulam spiral variant**: Numbers on a square spiral, primes highlighted. Diagonal lines emerge.
|
|
43
|
+
|
|
44
|
+
### Complex Plane Visualization
|
|
45
|
+
|
|
46
|
+
**For zeta function, Euler's identity, etc.:**
|
|
47
|
+
```python
|
|
48
|
+
plane = ComplexPlane()
|
|
49
|
+
# Plot function as a curve in the complex plane
|
|
50
|
+
zeta_spiral = ParametricCurve(
|
|
51
|
+
lambda t: plane.n2p(complex(mpmath.zeta(complex(0.5, t)))),
|
|
52
|
+
t_range=(0, 42),
|
|
53
|
+
)
|
|
54
|
+
```
|
|
55
|
+
|
|
56
|
+
**Phase coloring:**
|
|
57
|
+
```python
|
|
58
|
+
def z_to_color(z, sat=0.5, lum=0.5):
|
|
59
|
+
angle = math.atan2(z.imag, z.real)
|
|
60
|
+
return Color(hsl=(angle / TAU, sat, lum))
|
|
61
|
+
```
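For experimenting outside Manim, the same phase-to-color idea can be sketched with only the standard library. This is an assumed variant, not the codebase's helper: the names `phase_to_hue` and `phase_to_rgb` are hypothetical, and the hue is wrapped into [0, 1) because the phase of a complex number is negative in the lower half-plane.

```python
import cmath
import colorsys
import math


def phase_to_hue(z):
    """Map the argument of z to a hue in [0, 1)."""
    return (cmath.phase(z) % math.tau) / math.tau


def phase_to_rgb(z, sat=0.5, lum=0.5):
    """RGB triple whose hue encodes the phase of z."""
    # colorsys takes arguments in the order (hue, lightness, saturation).
    return colorsys.hls_to_rgb(phase_to_hue(z), lum, sat)


positive_real = phase_to_rgb(1 + 0j)
negative_imag = phase_to_rgb(-1j)
```

Each returned triple has components in [0, 1], which is the range most color constructors expect.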
|
|
62
|
+
|
|
63
|
+
---
|
|
64
|
+
|
|
65
|
+
## Topic Planning
|
|
66
|
+
|
|
67
|
+
### The Riemann Zeta Function
|
|
68
|
+
|
|
69
|
+
**From `_2016/zeta.py` and `_2022/zeta/part1.py`:**
|
|
70
|
+
|
|
71
|
+
**Structure:**
|
|
72
|
+
1. Start with the sum: 1 + 1/4 + 1/9 + ... = pi^2/6 (Basel problem hook)
|
|
73
|
+
2. Generalize: zeta(s) = sum of 1/n^s
|
|
74
|
+
3. Visualize on the number line (prime density)
|
|
75
|
+
4. Move to the complex plane (analytic continuation)
|
|
76
|
+
5. The critical strip and the hypothesis
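Steps 1 and 2 of this structure can be checked numerically. A minimal sketch in plain Python (the function name `zeta_partial_sum` is hypothetical): the partial sums of 1/n^2 creep up toward pi^2/6, which is the kind of running total an opening scene can display.

```python
import math


def zeta_partial_sum(s, terms):
    """Sum of 1/n^s for n = 1..terms (meaningful for real s > 1)."""
    return sum(1 / n ** s for n in range(1, terms + 1))


basel_target = math.pi ** 2 / 6
approximations = [zeta_partial_sum(2, terms) for terms in (10, 1000, 100000)]
```

The error after N terms is roughly 1/N, so convergence is slow; that slowness is itself worth showing on screen.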
|
|
77
|
+
|
|
78
|
+
**Prime density visualization** (`ShowPrimeDensity`):
|
|
79
|
+
- Number line with primes highlighted in YELLOW
|
|
80
|
+
- Logarithmic weighting shows primes thin out
|
|
81
|
+
- 1/ln(x) approximation overlaid
|
|
82
|
+
|
|
83
|
+
**The zeta spiral**: Plotting zeta(1/2 + it) as a parametric curve in the complex plane. The zeros are where the curve passes through the origin.
|
|
84
|
+
|
|
85
|
+
### Euler's Identity and e^(i*pi)
|
|
86
|
+
|
|
87
|
+
**From `_2019/diffyq/` Part 5 and various:**
|
|
88
|
+
|
|
89
|
+
**The rotational interpretation:**
|
|
90
|
+
1. e^x as exponential growth on the number line
|
|
91
|
+
2. Move to e^(it): now it's rotation in the complex plane
|
|
92
|
+
3. e^(i*pi) = walking pi radians around the unit circle = -1
|
|
93
|
+
4. "The most beautiful equation": e^(i*pi) + 1 = 0
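The rotational interpretation can be verified with the standard library. In this short sketch (the helper name `walk_unit_circle` is hypothetical), the points e^(it) for t from 0 to pi all lie on the unit circle, and the walk ends at -1.

```python
import cmath
import math


def walk_unit_circle(steps):
    """Points e^(i*t) for t evenly spaced from 0 to pi, inclusive."""
    return [cmath.exp(1j * math.pi * k / steps) for k in range(steps + 1)]


trail = walk_unit_circle(100)
end_point = trail[-1]
```

The list `trail` is the path a traced dot would follow, up to floating-point rounding.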
|
|
94
|
+
|
|
95
|
+
**Visual technique**: Point walking around the unit circle, with trail showing the path.
|
|
96
|
+
|
|
97
|
+
### Modular Arithmetic and Number Patterns
|
|
98
|
+
|
|
99
|
+
**From `_2019/spirals.py`:**
|
|
100
|
+
|
|
101
|
+
**Residue patterns**: Color numbers by their residue mod n. In the spiral, same-residue numbers form spiral arms.
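One way to decide which residue classes deserve a highlight color is to check which of them can contain primes at all. The sketch below is plain Python with hypothetical helper names (`primes_up_to`, `coprime_residues`, `occupied_residues`); it relies on the fact that a prime not dividing the modulus must fall in a residue class coprime to it.

```python
from math import gcd


def primes_up_to(limit):
    """Sieve of Eratosthenes."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for n in range(2, int(limit ** 0.5) + 1):
        if sieve[n]:
            for multiple in range(n * n, limit + 1, n):
                sieve[multiple] = False
    return [n for n, is_prime in enumerate(sieve) if is_prime]


def coprime_residues(modulus):
    """Residue classes mod `modulus` that share no factor with it."""
    return [r for r in range(modulus) if gcd(r, modulus) == 1]


def occupied_residues(modulus, limit):
    """Residues actually hit by primes up to `limit` (excluding prime factors of the modulus)."""
    return sorted({p % modulus for p in primes_up_to(limit) if modulus % p != 0})


arms_mod_6 = coprime_residues(6)
arms_mod_44 = coprime_residues(44)
```

Residue classes outside these lists contain at most one prime, so their arms go almost entirely dark when only primes are drawn.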
|
|
102
|
+
|
|
103
|
+
**Clock arithmetic visualization**: Numbers on a circle, connecting those related by a multiplication.
|
|
104
|
+
|
|
105
|
+
### Subset Sum and Integers (`_2022/puzzles/subsets.py`)
|
|
106
|
+
|
|
107
|
+
**Set display:**
|
|
108
|
+
```python
|
|
109
|
+
def get_set_tex(values, max_shown=7):
|
|
110
|
+
# {1, 2, 3, ..., 2000}
|
|
111
|
+
# Handles ellipsis for large sets
|
|
112
|
+
|
|
113
|
+
def get_subset_highlights(set_tex, subset):
|
|
114
|
+
# Rounded rectangles around specific elements
|
|
115
|
+
```
|
|
116
|
+
|
|
117
|
+
**Connection to zeta function**: The "unreasonable usefulness of complex numbers" for discrete problems.
|
|
118
|
+
|
|
119
|
+
### Combinatorics and Counting
|
|
120
|
+
|
|
121
|
+
**Moser's Circle Problem** (`_2023/moser_reboot/main.py`):
|
|
122
|
+
|
|
123
|
+
```python
|
|
124
|
+
def moser(n):
|
|
125
|
+
return choose(n, 4) + choose(n, 2) + 1
|
|
126
|
+
```
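The same formula can be run standalone with the standard library's `math.comb` in place of the `choose` helper, which makes the broken pattern easy to see. This is a small sketch for checking the numbers, not the scene code.

```python
from math import comb


def moser(n):
    """Regions formed by joining n points on a circle with all chords."""
    return comb(n, 4) + comb(n, 2) + 1


region_counts = [moser(n) for n in range(1, 8)]
powers_of_two = [2 ** (n - 1) for n in range(1, 8)]
```

The two lists agree for the first five entries and part ways at n = 6, where the count is 31 rather than 32.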
|
|
127
|
+
|
|
128
|
+
**Circle diagram**:
|
|
129
|
+
```python
|
|
130
|
+
points = [circle.pfp(a) for a in np.arange(0, 1, 1/n)]
|
|
131
|
+
chords = VGroup(
|
|
132
|
+
Line(p1, p2).set_stroke(BLUE_B, 1)
|
|
133
|
+
for p1, p2 in it.combinations(points, 2)
|
|
134
|
+
)
|
|
135
|
+
```
|
|
136
|
+
|
|
137
|
+
**Pascal's Triangle** (`_2018/eop/pascal.py`): Visual construction with highlighting of patterns (even/odd, mod 3, etc.)
|
|
138
|
+
|
|
139
|
+
---
|
|
140
|
+
|
|
141
|
+
## Storytelling Patterns for Number Theory
|
|
142
|
+
|
|
143
|
+
### The "Pattern That Breaks" Hook
|
|
144
|
+
|
|
145
|
+
Number theory is perfect for this: patterns seem obvious, then fail.
|
|
146
|
+
|
|
147
|
+
- 1, 2, 4, 8, 16... then 31 (Moser)
|
|
148
|
+
- Borwein integrals equal pi... until they don't
|
|
149
|
+
- Fermat numbers seem prime... until they aren't
|
|
150
|
+
|
|
151
|
+
### The "Unexpected Connection" Arc
|
|
152
|
+
|
|
153
|
+
Number theory's greatest moments come from surprising connections:
|
|
154
|
+
- Primes and pi (Basel problem, prime counting)
|
|
155
|
+
- Primes and complex analysis (zeta function)
|
|
156
|
+
- Discrete problems and continuous methods
|
|
157
|
+
|
|
158
|
+
### The "Visualization Reveals Structure" Pattern
|
|
159
|
+
|
|
160
|
+
Number theory concepts that seem abstract become obvious when visualized:
|
|
161
|
+
- Prime spirals: plot numbers in polar coordinates, primes form patterns
|
|
162
|
+
- Gaussian integers: lattice points in the complex plane
|
|
163
|
+
- Modular arithmetic: clock face with connections
|
|
164
|
+
|
|
165
|
+
---
|
|
166
|
+
|
|
167
|
+
## Color Conventions for Number Theory
|
|
168
|
+
|
|
169
|
+
| Element | Color | Context |
|
|
170
|
+
|---------|-------|---------|
|
|
171
|
+
| Primes | YELLOW | Universal across 3b1b |
|
|
172
|
+
| Composite numbers | WHITE/GREY | Default |
|
|
173
|
+
| Even numbers | BLUE | Residue coloring |
|
|
174
|
+
| Odd numbers | RED | Residue coloring |
|
|
175
|
+
| Complex values | Phase-colored (HSL by angle) | Zeta function |
|
|
176
|
+
| Set elements (highlighted) | TEAL or MAROON | Subset problems |
|
|
177
|
+
| Fibonacci/special sequences | GREEN | Sequence visualization |
|
|
178
|
+
|
|
179
|
+
---
|
|
180
|
+
|
|
181
|
+
## Common Pitfalls in Number Theory Videos
|
|
182
|
+
|
|
183
|
+
1. **Starting with the formalism**: "Let s = sigma + it where sigma > 1..." -- NO. Start with "what's the sum of 1/n^2?"
|
|
184
|
+
2. **Not enough visual grounding**: Number theory CAN feel abstract. Every concept needs a picture -- a number line, a spiral, a complex plane.
|
|
185
|
+
3. **Assuming too much background**: Explain complex numbers before using them for zeta. Show what "analytic continuation" means visually.
|
|
186
|
+
4. **Forgetting the "so what?"**: After proving a result, connect it to something the viewer cares about. "This is why your phone's encryption works."
|
|
187
|
+
5. **Not using the surprise factor**: Number theory is FULL of surprises. Lean into them. "Wait, pi shows up in PRIME NUMBERS?"
|