j1-template 2022.4.2 → 2022.4.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (40)
  1. checksums.yaml +4 -4
  2. data/assets/themes/j1/core/js/template.min.js.map +1 -1
  3. data/assets/themes/j1/modules/vega/js/vega-lite/README.md +0 -13
  4. data/lib/j1/version.rb +1 -1
  5. data/lib/starter_web/Gemfile +1 -1
  6. data/lib/starter_web/_config.yml +1 -1
  7. data/lib/starter_web/_data/modules/defaults/nbinteract.yml +1 -1
  8. data/lib/starter_web/_data/modules/navigator_menu.yml +60 -73
  9. data/lib/starter_web/_data/modules/nbinteract.yml +291 -314
  10. data/lib/starter_web/_plugins/lunr_index.rb +1 -1
  11. data/lib/starter_web/assets/images/modules/attics/shubham-dhage-2-1920x1280.jpg +0 -0
  12. data/lib/starter_web/package.json +1 -1
  13. data/lib/starter_web/pages/public/jupyter/notebooks/textbooks/j1_altair_interactive.html +2216 -0
  14. data/lib/starter_web/pages/public/jupyter/notebooks/textbooks/j1_altair_non_interactive.html +1170 -0
  15. data/lib/starter_web/pages/public/jupyter/notebooks/textbooks/j1_bokeh_01_basic_plotting.html +1479 -0
  16. data/lib/starter_web/pages/public/jupyter/notebooks/textbooks/j1_bokeh_02_styling_and_theming.html +1524 -0
  17. data/lib/starter_web/pages/public/jupyter/notebooks/textbooks/j1_bokeh_03_data_sources_and_transformations.html +983 -0
  18. data/lib/starter_web/pages/public/jupyter/notebooks/textbooks/j1_bokeh_04_adding_annotations.html +1280 -0
  19. data/lib/starter_web/pages/public/jupyter/notebooks/textbooks/j1_bokeh_05_presentation_layouts.html +660 -0
  20. data/lib/starter_web/pages/public/jupyter/notebooks/textbooks/j1_bokeh_06_linking_and_interactions.html +1563 -0
  21. data/lib/starter_web/pages/public/jupyter/notebooks/textbooks/j1_bokeh_07_bar_and_categorical_data_plots.html +1888 -0
  22. data/lib/starter_web/pages/public/jupyter/notebooks/textbooks/j1_bokeh_08_graph_and_network_plots.html +689 -0
  23. data/lib/starter_web/pages/public/jupyter/notebooks/textbooks/j1_bokeh_09_geographic_plots.html +767 -0
  24. data/lib/starter_web/pages/public/jupyter/notebooks/textbooks/j1_circular_times_table.html +2 -1
  25. data/lib/starter_web/pages/public/jupyter/notebooks/textbooks/j1_interactive_widgets.html +21 -0
  26. data/lib/starter_web/utilsrv/_defaults/package.json +1 -1
  27. data/lib/starter_web/utilsrv/package.json +1 -1
  28. metadata +14 -14
  29. data/lib/starter_web/pages/public/jupyter/notebooks/textbooks/nbi_docs_examples_central_limit_theorem.html +0 -290
  30. data/lib/starter_web/pages/public/jupyter/notebooks/textbooks/nbi_docs_examples_correlation.html +0 -818
  31. data/lib/starter_web/pages/public/jupyter/notebooks/textbooks/nbi_docs_examples_empirical_distributions.html +0 -351
  32. data/lib/starter_web/pages/public/jupyter/notebooks/textbooks/nbi_docs_examples_linear_regression.html +0 -106
  33. data/lib/starter_web/pages/public/jupyter/notebooks/textbooks/nbi_docs_examples_probability_distribution_plots.html +0 -228
  34. data/lib/starter_web/pages/public/jupyter/notebooks/textbooks/nbi_docs_examples_sampling_from_a_population.html +0 -518
  35. data/lib/starter_web/pages/public/jupyter/notebooks/textbooks/nbi_docs_examples_variability_of_the_sample_mean.html +0 -372
  36. data/lib/starter_web/pages/public/jupyter/notebooks/textbooks/nbi_docs_recipes_graphing.html +0 -473
  37. data/lib/starter_web/pages/public/jupyter/notebooks/textbooks/nbi_docs_recipes_interactive_questions.html +0 -242
  38. data/lib/starter_web/pages/public/jupyter/notebooks/textbooks/nbi_docs_recipes_layout.html +0 -496
  39. data/lib/starter_web/pages/public/jupyter/notebooks/textbooks/nbi_docs_tutorial_interact.html +0 -329
  40. data/lib/starter_web/pages/public/jupyter/notebooks/textbooks/nbi_docs_tutorial_monty_hall.html +0 -866
@@ -1,818 +0,0 @@
1
- <div class="cell text_cell">
2
- <button class="js-nbinteract-widget">
3
- Loading widgets...
4
- </button>
5
- </div>
6
-
7
-
8
-
9
-
10
-
11
-
12
- <div class="nbinteract-hide_in
13
- cell border-box-sizing code_cell rendered">
14
- <div class="input">
15
-
16
- <div class="inner_cell">
17
- <div class="input_area">
18
- <div class=" highlight hl-ipython3"><pre><span></span><span class="c1"># HIDDEN</span>
19
- <span class="kn">from</span> <span class="nn">datascience</span> <span class="kn">import</span> <span class="o">*</span>
20
- <span class="o">%</span><span class="k">matplotlib</span> inline
21
- <span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="nn">plots</span>
22
- <span class="n">plots</span><span class="o">.</span><span class="n">style</span><span class="o">.</span><span class="n">use</span><span class="p">(</span><span class="s1">&#39;fivethirtyeight&#39;</span><span class="p">)</span>
23
- <span class="kn">import</span> <span class="nn">math</span>
24
- <span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span>
25
- <span class="kn">from</span> <span class="nn">scipy</span> <span class="kn">import</span> <span class="n">stats</span>
26
- <span class="kn">from</span> <span class="nn">ipywidgets</span> <span class="kn">import</span> <span class="n">interact</span><span class="p">,</span> <span class="n">interactive</span><span class="p">,</span> <span class="n">fixed</span><span class="p">,</span> <span class="n">interact_manual</span>
27
- <span class="kn">import</span> <span class="nn">ipywidgets</span> <span class="k">as</span> <span class="nn">widgets</span>
28
- <span class="kn">import</span> <span class="nn">nbinteract</span> <span class="k">as</span> <span class="nn">nbi</span>
29
- </pre></div>
30
-
31
- </div>
32
- </div>
33
- </div>
34
-
35
- </div>
36
- <div class="cell border-box-sizing text_cell rendered"><div class="inner_cell">
37
- <div class="text_cell_render border-box-sizing rendered_html">
38
- <h3 id="Correlation">Correlation<a class="anchor-link" href="#Correlation">&#182;</a></h3><p>In this section we will develop a measure of how tightly clustered a scatter diagram is about a straight line. Formally, this is called measuring <em>linear association</em>.</p>
39
-
40
- </div>
41
- </div>
42
- </div>
43
- <div class="cell border-box-sizing text_cell rendered"><div class="inner_cell">
44
- <div class="text_cell_render border-box-sizing rendered_html">
45
- <h4 id="The-correlation-coefficient">The correlation coefficient<a class="anchor-link" href="#The-correlation-coefficient">&#182;</a></h4><p>The <em>correlation coefficient</em> measures the strength of the linear relationship between two variables. Graphically, it measures how clustered the scatter diagram is around a straight line.</p>
46
- <p>The term <em>correlation coefficient</em> is rather long, so it is usually shortened to <em>correlation</em> and denoted by $r$.</p>
47
- <p>Here are some mathematical facts about $r$ that we will just observe by simulation.</p>
48
- <ul>
49
- <li>The correlation coefficient $r$ is a number between -1 and 1.</li>
50
- <li>$r$ measures the extent to which the scatter plot clusters around a straight line.</li>
51
- <li>$r$ = 1 if the scatter diagram is a perfect straight line sloping upwards, and $r$ = -1 if the scatter diagram is a perfect straight line sloping downwards.</li>
52
- </ul>
53
-
54
- </div>
55
- </div>
56
- </div>
57
- <div class="cell border-box-sizing text_cell rendered"><div class="inner_cell">
58
- <div class="text_cell_render border-box-sizing rendered_html">
59
- <p>The function <code>r_scatter</code> takes a value of $r$ as its argument and simulates a scatter plot with a correlation very close to $r$. Because of randomness in the simulation, the correlation is not expected to be exactly equal to $r$.</p>
60
- <p>Call <code>r_scatter</code> a few times, with different values of $r$ as the argument, and see how the scatter plot changes.</p>
61
- <p>When $r$ = 1 the scatter plot is perfectly linear and slopes upward. When $r$ = -1, the scatter plot is perfectly linear and slopes downward. When $r$ = 0, the scatter plot is a formless cloud around the horizontal axis, and the variables are said to be <em>uncorrelated</em>.</p>
62
-
63
- </div>
64
- </div>
65
- </div>
66
-
67
-
68
-
69
- <div class="
70
- cell border-box-sizing code_cell rendered">
71
- <div class="input">
72
-
73
- <div class="inner_cell">
74
- <div class="input_area">
75
- <div class=" highlight hl-ipython3"><pre><span></span><span class="n">z</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">normal</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">500</span><span class="p">)</span>
76
- <span class="k">def</span> <span class="nf">r_scatter</span><span class="p">(</span><span class="n">xs</span><span class="p">,</span> <span class="n">r</span><span class="p">):</span>
77
- <span class="sd">&quot;&quot;&quot;</span>
78
- <span class="sd"> Generate y-values for a scatter plot with correlation approximately r</span>
79
- <span class="sd"> &quot;&quot;&quot;</span>
80
- <span class="k">return</span> <span class="n">r</span><span class="o">*</span><span class="n">xs</span> <span class="o">+</span> <span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">sqrt</span><span class="p">(</span><span class="mi">1</span><span class="o">-</span><span class="n">r</span><span class="o">**</span><span class="mi">2</span><span class="p">))</span><span class="o">*</span><span class="n">z</span>
81
-
82
- <span class="n">corr_opts</span> <span class="o">=</span> <span class="p">{</span>
83
- <span class="s1">&#39;aspect_ratio&#39;</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span>
84
- <span class="s1">&#39;xlim&#39;</span><span class="p">:</span> <span class="p">(</span><span class="o">-</span><span class="mf">3.5</span><span class="p">,</span> <span class="mf">3.5</span><span class="p">),</span>
85
- <span class="s1">&#39;ylim&#39;</span><span class="p">:</span> <span class="p">(</span><span class="o">-</span><span class="mf">3.5</span><span class="p">,</span> <span class="mf">3.5</span><span class="p">),</span>
86
- <span class="p">}</span>
87
-
88
- <span class="n">nbi</span><span class="o">.</span><span class="n">scatter</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">normal</span><span class="p">(</span><span class="n">size</span><span class="o">=</span><span class="mi">500</span><span class="p">),</span> <span class="n">r_scatter</span><span class="p">,</span> <span class="n">options</span><span class="o">=</span><span class="n">corr_opts</span><span class="p">,</span> <span class="n">r</span><span class="o">=</span><span class="p">(</span><span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mf">0.05</span><span class="p">))</span>
89
- </pre></div>
90
-
91
- </div>
92
- </div>
93
- </div>
94
-
95
- <div class="output_wrapper">
96
- <div class="output">
97
-
98
-
99
- <div class="output_area">
100
-
101
-
102
-
103
-
104
-
105
- <div class="output_subarea output_widget_view ">
106
- <button class="js-nbinteract-widget">
107
- Loading widgets...
108
- </button>
109
- </div>
110
-
111
- </div>
112
-
113
- </div>
114
- </div>
115
-
116
- </div>
117
- <div class="cell border-box-sizing text_cell rendered"><div class="inner_cell">
118
- <div class="text_cell_render border-box-sizing rendered_html">
119
- <h4 id="Calculating-the-correlation">Calculating the correlation<a class="anchor-link" href="#Calculating-the-correlation">&#182;</a></h4><p>The formula for $r$ is not apparent from our observations so far. It has a mathematical basis that is outside the scope of this class. However, as you will see, the calculation is straightforward and helps us understand several of the properties of $r$.</p>
120
- <p><strong>Formula</strong> for $r$:</p>
121
- <p>$r$ is the <strong>average of the products of the two variables</strong>, when both variables are measured in standard units.</p>
122
- <p>Here are the steps in the calculation. We will apply the steps to a simple table of values of <strong>x</strong> and <strong>y</strong>.</p>
123
-
124
- </div>
125
- </div>
126
- </div>
127
-
128
-
129
-
130
- <div class="
131
- cell border-box-sizing code_cell rendered">
132
- <div class="input">
133
-
134
- <div class="inner_cell">
135
- <div class="input_area">
136
- <div class=" highlight hl-ipython3"><pre><span></span><span class="n">x</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">arange</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">7</span><span class="p">,</span> <span class="mi">1</span><span class="p">)</span>
137
- <span class="n">y</span> <span class="o">=</span> <span class="n">make_array</span><span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">7</span><span class="p">)</span>
138
- <span class="n">t</span> <span class="o">=</span> <span class="n">Table</span><span class="p">()</span><span class="o">.</span><span class="n">with_columns</span><span class="p">(</span>
139
- <span class="s1">&#39;x&#39;</span><span class="p">,</span> <span class="n">x</span><span class="p">,</span>
140
- <span class="s1">&#39;y&#39;</span><span class="p">,</span> <span class="n">y</span>
141
- <span class="p">)</span>
142
- <span class="n">t</span>
143
- </pre></div>
144
-
145
- </div>
146
- </div>
147
- </div>
148
-
149
- <div class="output_wrapper">
150
- <div class="output">
151
-
152
-
153
- <div class="output_area">
154
-
155
-
156
-
157
-
158
- <div class="output_html rendered_html output_subarea output_execute_result">
159
- <table border="1" class="dataframe">
160
- <thead>
161
- <tr>
162
- <th>x</th> <th>y</th>
163
- </tr>
164
- </thead>
165
- <tbody>
166
- <tr>
167
- <td>1 </td> <td>2 </td>
168
- </tr>
169
- <tr>
170
- <td>2 </td> <td>3 </td>
171
- </tr>
172
- <tr>
173
- <td>3 </td> <td>1 </td>
174
- </tr>
175
- <tr>
176
- <td>4 </td> <td>5 </td>
177
- </tr>
178
- <tr>
179
- <td>5 </td> <td>2 </td>
180
- </tr>
181
- <tr>
182
- <td>6 </td> <td>7 </td>
183
- </tr>
184
- </tbody>
185
- </table>
186
- </div>
187
-
188
- </div>
189
-
190
- </div>
191
- </div>
192
-
193
- </div>
194
- <div class="cell border-box-sizing text_cell rendered"><div class="inner_cell">
195
- <div class="text_cell_render border-box-sizing rendered_html">
196
- <p>Based on the scatter diagram, we expect that $r$ will be positive but not equal to 1.</p>
197
-
198
- </div>
199
- </div>
200
- </div>
201
-
202
-
203
-
204
- <div class="
205
- cell border-box-sizing code_cell rendered">
206
- <div class="input">
207
-
208
- <div class="inner_cell">
209
- <div class="input_area">
210
- <div class=" highlight hl-ipython3"><pre><span></span><span class="n">nbi</span><span class="o">.</span><span class="n">scatter</span><span class="p">(</span><span class="n">t</span><span class="o">.</span><span class="n">column</span><span class="p">(</span><span class="mi">0</span><span class="p">),</span> <span class="n">t</span><span class="o">.</span><span class="n">column</span><span class="p">(</span><span class="mi">1</span><span class="p">),</span> <span class="n">options</span><span class="o">=</span><span class="p">{</span><span class="s1">&#39;aspect_ratio&#39;</span><span class="p">:</span> <span class="mi">1</span><span class="p">})</span>
211
- </pre></div>
212
-
213
- </div>
214
- </div>
215
- </div>
216
-
217
- <div class="output_wrapper">
218
- <div class="output">
219
-
220
-
221
- <div class="output_area">
222
-
223
-
224
-
225
-
226
-
227
- <div class="output_subarea output_widget_view ">
228
- <button class="js-nbinteract-widget">
229
- Loading widgets...
230
- </button>
231
- </div>
232
-
233
- </div>
234
-
235
- </div>
236
- </div>
237
-
238
- </div>
239
- <div class="cell border-box-sizing text_cell rendered"><div class="inner_cell">
240
- <div class="text_cell_render border-box-sizing rendered_html">
241
- <p><strong>Step 1.</strong> Convert each variable to standard units.</p>
242
-
243
- </div>
244
- </div>
245
- </div>
246
-
247
-
248
-
249
- <div class="
250
- cell border-box-sizing code_cell rendered">
251
- <div class="input">
252
-
253
- <div class="inner_cell">
254
- <div class="input_area">
255
- <div class=" highlight hl-ipython3"><pre><span></span><span class="k">def</span> <span class="nf">standard_units</span><span class="p">(</span><span class="n">nums</span><span class="p">):</span>
256
- <span class="k">return</span> <span class="p">(</span><span class="n">nums</span> <span class="o">-</span> <span class="n">np</span><span class="o">.</span><span class="n">mean</span><span class="p">(</span><span class="n">nums</span><span class="p">))</span> <span class="o">/</span> <span class="n">np</span><span class="o">.</span><span class="n">std</span><span class="p">(</span><span class="n">nums</span><span class="p">)</span>
257
- </pre></div>
258
-
259
- </div>
260
- </div>
261
- </div>
262
-
263
- </div>
264
-
265
-
266
-
267
- <div class="
268
- cell border-box-sizing code_cell rendered">
269
- <div class="input">
270
-
271
- <div class="inner_cell">
272
- <div class="input_area">
273
- <div class=" highlight hl-ipython3"><pre><span></span><span class="n">t_su</span> <span class="o">=</span> <span class="n">t</span><span class="o">.</span><span class="n">with_columns</span><span class="p">(</span>
274
- <span class="s1">&#39;x (standard units)&#39;</span><span class="p">,</span> <span class="n">standard_units</span><span class="p">(</span><span class="n">x</span><span class="p">),</span>
275
- <span class="s1">&#39;y (standard units)&#39;</span><span class="p">,</span> <span class="n">standard_units</span><span class="p">(</span><span class="n">y</span><span class="p">)</span>
276
- <span class="p">)</span>
277
- <span class="n">t_su</span>
278
- </pre></div>
279
-
280
- </div>
281
- </div>
282
- </div>
283
-
284
- <div class="output_wrapper">
285
- <div class="output">
286
-
287
-
288
- <div class="output_area">
289
-
290
-
291
-
292
-
293
- <div class="output_html rendered_html output_subarea output_execute_result">
294
- <table border="1" class="dataframe">
295
- <thead>
296
- <tr>
297
- <th>x</th> <th>y</th> <th>x (standard units)</th> <th>y (standard units)</th>
298
- </tr>
299
- </thead>
300
- <tbody>
301
- <tr>
302
- <td>1 </td> <td>2 </td> <td>-1.46385 </td> <td>-0.648886 </td>
303
- </tr>
304
- <tr>
305
- <td>2 </td> <td>3 </td> <td>-0.87831 </td> <td>-0.162221 </td>
306
- </tr>
307
- <tr>
308
- <td>3 </td> <td>1 </td> <td>-0.29277 </td> <td>-1.13555 </td>
309
- </tr>
310
- <tr>
311
- <td>4 </td> <td>5 </td> <td>0.29277 </td> <td>0.811107 </td>
312
- </tr>
313
- <tr>
314
- <td>5 </td> <td>2 </td> <td>0.87831 </td> <td>-0.648886 </td>
315
- </tr>
316
- <tr>
317
- <td>6 </td> <td>7 </td> <td>1.46385 </td> <td>1.78444 </td>
318
- </tr>
319
- </tbody>
320
- </table>
321
- </div>
322
-
323
- </div>
324
-
325
- </div>
326
- </div>
327
-
328
- </div>
329
- <div class="cell border-box-sizing text_cell rendered"><div class="inner_cell">
330
- <div class="text_cell_render border-box-sizing rendered_html">
331
- <p><strong>Step 2.</strong> Multiply each pair of standard units.</p>
332
-
333
- </div>
334
- </div>
335
- </div>
336
-
337
-
338
-
339
- <div class="
340
- cell border-box-sizing code_cell rendered">
341
- <div class="input">
342
-
343
- <div class="inner_cell">
344
- <div class="input_area">
345
- <div class=" highlight hl-ipython3"><pre><span></span><span class="n">t_product</span> <span class="o">=</span> <span class="n">t_su</span><span class="o">.</span><span class="n">with_column</span><span class="p">(</span><span class="s1">&#39;product of standard units&#39;</span><span class="p">,</span> <span class="n">t_su</span><span class="o">.</span><span class="n">column</span><span class="p">(</span><span class="mi">2</span><span class="p">)</span> <span class="o">*</span> <span class="n">t_su</span><span class="o">.</span><span class="n">column</span><span class="p">(</span><span class="mi">3</span><span class="p">))</span>
346
- <span class="n">t_product</span>
347
- </pre></div>
348
-
349
- </div>
350
- </div>
351
- </div>
352
-
353
- <div class="output_wrapper">
354
- <div class="output">
355
-
356
-
357
- <div class="output_area">
358
-
359
-
360
-
361
-
362
- <div class="output_html rendered_html output_subarea output_execute_result">
363
- <table border="1" class="dataframe">
364
- <thead>
365
- <tr>
366
- <th>x</th> <th>y</th> <th>x (standard units)</th> <th>y (standard units)</th> <th>product of standard units</th>
367
- </tr>
368
- </thead>
369
- <tbody>
370
- <tr>
371
- <td>1 </td> <td>2 </td> <td>-1.46385 </td> <td>-0.648886 </td> <td>0.949871 </td>
372
- </tr>
373
- <tr>
374
- <td>2 </td> <td>3 </td> <td>-0.87831 </td> <td>-0.162221 </td> <td>0.142481 </td>
375
- </tr>
376
- <tr>
377
- <td>3 </td> <td>1 </td> <td>-0.29277 </td> <td>-1.13555 </td> <td>0.332455 </td>
378
- </tr>
379
- <tr>
380
- <td>4 </td> <td>5 </td> <td>0.29277 </td> <td>0.811107 </td> <td>0.237468 </td>
381
- </tr>
382
- <tr>
383
- <td>5 </td> <td>2 </td> <td>0.87831 </td> <td>-0.648886 </td> <td>-0.569923 </td>
384
- </tr>
385
- <tr>
386
- <td>6 </td> <td>7 </td> <td>1.46385 </td> <td>1.78444 </td> <td>2.61215 </td>
387
- </tr>
388
- </tbody>
389
- </table>
390
- </div>
391
-
392
- </div>
393
-
394
- </div>
395
- </div>
396
-
397
- </div>
398
- <div class="cell border-box-sizing text_cell rendered"><div class="inner_cell">
399
- <div class="text_cell_render border-box-sizing rendered_html">
400
- <p><strong>Step 3.</strong> $r$ is the average of the products computed in Step 2.</p>
401
-
402
- </div>
403
- </div>
404
- </div>
405
-
406
-
407
-
408
- <div class="
409
- cell border-box-sizing code_cell rendered">
410
- <div class="input">
411
-
412
- <div class="inner_cell">
413
- <div class="input_area">
414
- <div class=" highlight hl-ipython3"><pre><span></span><span class="c1"># r is the average of the products of standard units</span>
415
-
416
- <span class="n">r</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">mean</span><span class="p">(</span><span class="n">t_product</span><span class="o">.</span><span class="n">column</span><span class="p">(</span><span class="mi">4</span><span class="p">))</span>
417
- <span class="n">r</span>
418
- </pre></div>
419
-
420
- </div>
421
- </div>
422
- </div>
423
-
424
- <div class="output_wrapper">
425
- <div class="output">
426
-
427
-
428
- <div class="output_area">
429
-
430
-
431
-
432
-
433
-
434
- <div class="output_text output_subarea output_execute_result">
435
- <pre>0.6174163971897709</pre>
436
- </div>
437
-
438
- </div>
439
-
440
- </div>
441
- </div>
442
-
443
- </div>
444
- <div class="cell border-box-sizing text_cell rendered"><div class="inner_cell">
445
- <div class="text_cell_render border-box-sizing rendered_html">
446
- <p>As expected, $r$ is positive but not equal to 1.</p>
447
-
448
- </div>
449
- </div>
450
- </div>
451
- <div class="cell border-box-sizing text_cell rendered"><div class="inner_cell">
452
- <div class="text_cell_render border-box-sizing rendered_html">
453
- <h4 id="Properties-of-$r$">Properties of $r$<a class="anchor-link" href="#Properties-of-$r$">&#182;</a></h4><p>The calculation shows that:</p>
454
- <ul>
455
- <li>$r$ is a pure number. It has no units. This is because $r$ is based on standard units.</li>
456
- <li>$r$ is unaffected by changing the units on either axis. This too is because $r$ is based on standard units.</li>
457
- <li>$r$ is unaffected by switching the axes. Algebraically, this is because the product of standard units does not depend on which variable is called <strong>x</strong> and which <strong>y</strong>. Geometrically, switching axes reflects the scatter plot about the line <strong>y = x</strong>, but does not change the amount of clustering nor the sign of the association.</li>
458
- </ul>
459
-
460
- </div>
461
- </div>
462
- </div>
463
-
464
-
465
-
466
- <div class="
467
- cell border-box-sizing code_cell rendered">
468
- <div class="input">
469
-
470
- <div class="inner_cell">
471
- <div class="input_area">
472
- <div class=" highlight hl-ipython3"><pre><span></span><span class="n">nbi</span><span class="o">.</span><span class="n">scatter</span><span class="p">(</span><span class="n">t</span><span class="o">.</span><span class="n">column</span><span class="p">(</span><span class="mi">1</span><span class="p">),</span> <span class="n">t</span><span class="o">.</span><span class="n">column</span><span class="p">(</span><span class="mi">0</span><span class="p">),</span> <span class="n">options</span><span class="o">=</span><span class="p">{</span><span class="s1">&#39;aspect_ratio&#39;</span><span class="p">:</span> <span class="mi">1</span><span class="p">})</span>
473
- </pre></div>
474
-
475
- </div>
476
- </div>
477
- </div>
478
-
479
- <div class="output_wrapper">
480
- <div class="output">
481
-
482
-
483
- <div class="output_area">
484
-
485
-
486
-
487
-
488
-
489
- <div class="output_subarea output_widget_view ">
490
- <button class="js-nbinteract-widget">
491
- Loading widgets...
492
- </button>
493
- </div>
494
-
495
- </div>
496
-
497
- </div>
498
- </div>
499
-
500
- </div>
501
- <div class="cell border-box-sizing text_cell rendered"><div class="inner_cell">
502
- <div class="text_cell_render border-box-sizing rendered_html">
503
- <h4 id="The-correlation-function">The correlation function<a class="anchor-link" href="#The-correlation-function">&#182;</a></h4><p>We are going to be calculating correlations repeatedly, so it will help to define a function that computes it by performing all the steps described above. Let's define a function <code>correlation</code> that takes a table and the labels of two columns in the table. The function returns $r$, the mean of the products of those column values in standard units.</p>
504
-
505
- </div>
506
- </div>
507
- </div>
508
-
509
-
510
-
511
- <div class="
512
- cell border-box-sizing code_cell rendered">
513
- <div class="input">
514
-
515
- <div class="inner_cell">
516
- <div class="input_area">
517
- <div class=" highlight hl-ipython3"><pre><span></span><span class="k">def</span> <span class="nf">correlation</span><span class="p">(</span><span class="n">t</span><span class="p">,</span> <span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">):</span>
518
- <span class="k">return</span> <span class="n">np</span><span class="o">.</span><span class="n">mean</span><span class="p">(</span><span class="n">standard_units</span><span class="p">(</span><span class="n">t</span><span class="o">.</span><span class="n">column</span><span class="p">(</span><span class="n">x</span><span class="p">))</span><span class="o">*</span><span class="n">standard_units</span><span class="p">(</span><span class="n">t</span><span class="o">.</span><span class="n">column</span><span class="p">(</span><span class="n">y</span><span class="p">)))</span>
519
- </pre></div>
520
-
521
- </div>
522
- </div>
523
- </div>
524
-
525
- </div>
526
-
527
-
528
-
529
- <div class="
530
- cell border-box-sizing code_cell rendered">
531
- <div class="input">
532
-
533
- <div class="inner_cell">
534
- <div class="input_area">
535
- <div class=" highlight hl-ipython3"><pre><span></span><span class="n">interact</span><span class="p">(</span><span class="n">correlation</span><span class="p">,</span> <span class="n">t</span><span class="o">=</span><span class="n">fixed</span><span class="p">(</span><span class="n">t</span><span class="p">),</span>
536
- <span class="n">x</span><span class="o">=</span><span class="n">widgets</span><span class="o">.</span><span class="n">ToggleButtons</span><span class="p">(</span><span class="n">options</span><span class="o">=</span><span class="p">[</span><span class="s1">&#39;x&#39;</span><span class="p">,</span> <span class="s1">&#39;y&#39;</span><span class="p">],</span> <span class="n">description</span><span class="o">=</span><span class="s1">&#39;x-axis&#39;</span><span class="p">),</span>
537
- <span class="n">y</span><span class="o">=</span><span class="n">widgets</span><span class="o">.</span><span class="n">ToggleButtons</span><span class="p">(</span><span class="n">options</span><span class="o">=</span><span class="p">[</span><span class="s1">&#39;x&#39;</span><span class="p">,</span> <span class="s1">&#39;y&#39;</span><span class="p">],</span> <span class="n">description</span><span class="o">=</span><span class="s1">&#39;y-axis&#39;</span><span class="p">))</span>
538
- </pre></div>
539
-
540
- </div>
541
- </div>
542
- </div>
543
-
544
- <div class="output_wrapper">
545
- <div class="output">
546
-
547
-
548
- <div class="output_area">
549
-
550
-
551
-
552
-
553
-
554
- <div class="output_subarea output_widget_view ">
555
- <button class="js-nbinteract-widget">
556
- Loading widgets...
557
- </button>
558
- </div>
559
-
560
- </div>
561
-
562
- <div class="output_area">
563
-
564
-
565
-
566
-
567
-
568
- <div class="output_text output_subarea output_execute_result">
569
- <pre>&lt;function __main__.correlation(t, x, y)&gt;</pre>
570
- </div>
571
-
572
- </div>
573
-
574
- </div>
575
- </div>
576
-
577
- </div>
578
- <div class="cell border-box-sizing text_cell rendered"><div class="inner_cell">
579
- <div class="text_cell_render border-box-sizing rendered_html">
580
- <p>Let's call the function on the <code>x</code> and <code>y</code> columns of <code>t</code>. The function returns the same answer to the correlation between $x$ and $y$ as we got by direct application of the formula for $r$.</p>
581
-
582
- </div>
583
- </div>
584
- </div>
585
-
586
-
587
-
588
- <div class="
589
- cell border-box-sizing code_cell rendered">
590
- <div class="input">
591
-
592
- <div class="inner_cell">
593
- <div class="input_area">
594
- <div class=" highlight hl-ipython3"><pre><span></span><span class="n">correlation</span><span class="p">(</span><span class="n">t</span><span class="p">,</span> <span class="s1">&#39;x&#39;</span><span class="p">,</span> <span class="s1">&#39;y&#39;</span><span class="p">)</span>
595
- </pre></div>
596
-
597
- </div>
598
- </div>
599
- </div>
600
-
601
- <div class="output_wrapper">
602
- <div class="output">
603
-
604
-
605
- <div class="output_area">
606
-
607
-
608
-
609
-
610
-
611
- <div class="output_text output_subarea output_execute_result">
612
- <pre>0.6174163971897709</pre>
613
- </div>
614
-
615
- </div>
616
-
617
- </div>
618
- </div>
619
-
620
- </div>
621
- <div class="cell border-box-sizing text_cell rendered"><div class="inner_cell">
622
- <div class="text_cell_render border-box-sizing rendered_html">
623
- <p>As we noticed, the order in which the variables are specified doesn't matter.</p>
624
-
625
- </div>
626
- </div>
627
- </div>
628
-
629
-
630
-
631
- <div class="
632
- cell border-box-sizing code_cell rendered">
633
- <div class="input">
634
-
635
- <div class="inner_cell">
636
- <div class="input_area">
637
- <div class=" highlight hl-ipython3"><pre><span></span><span class="n">correlation</span><span class="p">(</span><span class="n">t</span><span class="p">,</span> <span class="s1">&#39;y&#39;</span><span class="p">,</span> <span class="s1">&#39;x&#39;</span><span class="p">)</span>
638
- </pre></div>
639
-
640
- </div>
641
- </div>
642
- </div>
643
-
644
- <div class="output_wrapper">
645
- <div class="output">
646
-
647
-
648
- <div class="output_area">
649
-
650
-
651
-
652
-
653
-
654
- <div class="output_text output_subarea output_execute_result">
655
- <pre>0.6174163971897709</pre>
656
- </div>
657
-
658
- </div>
659
-
660
- </div>
661
- </div>
662
-
663
- </div>
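The symmetry seen in the two calls above follows from $r$ being the mean of the products of the two variables in standard units, and a product does not depend on the order of its factors. The sketch below illustrates this with plain NumPy arrays rather than the table <code>t</code>. The helper names <code>standard_units</code> and <code>correlation_arrays</code> and the sample values are assumptions made for illustration; they are not taken from the notebook's own definitions.

```python
import numpy as np

def standard_units(values):
    """Convert an array to standard units (mean 0, SD 1, population SD)."""
    values = np.asarray(values, dtype=float)
    return (values - values.mean()) / values.std()

def correlation_arrays(x, y):
    """Hypothetical helper: mean of the products of x and y in standard units."""
    return np.mean(standard_units(x) * standard_units(y))

# Illustrative values only (an assumption, not necessarily the contents of t)
x = np.array([1, 2, 3, 4, 5, 6])
y = np.array([2, 3, 1, 5, 2, 7])

r_xy = correlation_arrays(x, y)
r_yx = correlation_arrays(y, x)

# The two orders agree, and both match NumPy's built-in Pearson correlation
print(r_xy, r_yx, np.corrcoef(x, y)[0, 1])
```

Because the standard-unit products are multiplied elementwise, exchanging the arguments only exchanges the factors of each product, so the mean is unchanged.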
664
- <div class="cell border-box-sizing text_cell rendered"><div class="inner_cell">
665
- <div class="text_cell_render border-box-sizing rendered_html">
666
- <p>Calling <code>correlation</code> on columns of the table <code>suv</code> gives us the correlation between price (<code>msrp</code>) and mileage (<code>mpg</code>), as well as the correlation between price and acceleration.</p>
667
-
668
- </div>
669
- </div>
670
- </div>
671
-
672
-
673
-
674
- <div class="
675
- cell border-box-sizing code_cell rendered">
676
- <div class="input">
677
-
678
- <div class="inner_cell">
679
- <div class="input_area">
680
- <div class=" highlight hl-ipython3"><pre><span></span><span class="n">suv</span> <span class="o">=</span> <span class="p">(</span><span class="n">Table</span><span class="o">.</span><span class="n">read_table</span><span class="p">(</span><span class="s1">&#39;https://raw.githubusercontent.com/data-8/materials-fa17/master/lec/hybrid.csv&#39;</span><span class="p">)</span>
681
- <span class="o">.</span><span class="n">where</span><span class="p">(</span><span class="s1">&#39;class&#39;</span><span class="p">,</span> <span class="s1">&#39;SUV&#39;</span><span class="p">))</span>
682
-
683
- <span class="n">interact</span><span class="p">(</span><span class="n">correlation</span><span class="p">,</span> <span class="n">t</span><span class="o">=</span><span class="n">fixed</span><span class="p">(</span><span class="n">suv</span><span class="p">),</span>
684
- <span class="n">x</span><span class="o">=</span><span class="n">widgets</span><span class="o">.</span><span class="n">ToggleButtons</span><span class="p">(</span><span class="n">options</span><span class="o">=</span><span class="p">[</span><span class="s1">&#39;mpg&#39;</span><span class="p">,</span> <span class="s1">&#39;msrp&#39;</span><span class="p">,</span> <span class="s1">&#39;acceleration&#39;</span><span class="p">],</span>
685
- <span class="n">description</span><span class="o">=</span><span class="s1">&#39;x-axis&#39;</span><span class="p">),</span>
686
- <span class="n">y</span><span class="o">=</span><span class="n">widgets</span><span class="o">.</span><span class="n">ToggleButtons</span><span class="p">(</span><span class="n">options</span><span class="o">=</span><span class="p">[</span><span class="s1">&#39;mpg&#39;</span><span class="p">,</span> <span class="s1">&#39;msrp&#39;</span><span class="p">,</span> <span class="s1">&#39;acceleration&#39;</span><span class="p">],</span>
687
- <span class="n">description</span><span class="o">=</span><span class="s1">&#39;y-axis&#39;</span><span class="p">))</span>
688
- </pre></div>
689
-
690
- </div>
691
- </div>
692
- </div>
693
-
694
- <div class="output_wrapper">
695
- <div class="output">
696
-
697
-
698
- <div class="output_area">
699
-
700
-
701
-
702
-
703
-
704
- <div class="output_subarea output_widget_view ">
705
- <button class="js-nbinteract-widget">
706
- Loading widgets...
707
- </button>
708
- </div>
709
-
710
- </div>
711
-
712
- <div class="output_area">
713
-
714
-
715
-
716
-
717
-
718
- <div class="output_text output_subarea output_execute_result">
719
- <pre>&lt;function __main__.correlation(t, x, y)&gt;</pre>
720
- </div>
721
-
722
- </div>
723
-
724
- </div>
725
- </div>
726
-
727
- </div>
728
-
729
-
730
-
731
- <div class="
732
- cell border-box-sizing code_cell rendered">
733
- <div class="input">
734
-
735
- <div class="inner_cell">
736
- <div class="input_area">
737
- <div class=" highlight hl-ipython3"><pre><span></span><span class="n">correlation</span><span class="p">(</span><span class="n">suv</span><span class="p">,</span> <span class="s1">&#39;mpg&#39;</span><span class="p">,</span> <span class="s1">&#39;msrp&#39;</span><span class="p">)</span>
738
- </pre></div>
739
-
740
- </div>
741
- </div>
742
- </div>
743
-
744
- <div class="output_wrapper">
745
- <div class="output">
746
-
747
-
748
- <div class="output_area">
749
-
750
-
751
-
752
-
753
-
754
- <div class="output_text output_subarea output_execute_result">
755
- <pre>-0.6667143635709919</pre>
756
- </div>
757
-
758
- </div>
759
-
760
- </div>
761
- </div>
762
-
763
- </div>
764
-
765
-
766
-
767
- <div class="
768
- cell border-box-sizing code_cell rendered">
769
- <div class="input">
770
-
771
- <div class="inner_cell">
772
- <div class="input_area">
773
- <div class=" highlight hl-ipython3"><pre><span></span><span class="n">correlation</span><span class="p">(</span><span class="n">suv</span><span class="p">,</span> <span class="s1">&#39;acceleration&#39;</span><span class="p">,</span> <span class="s1">&#39;msrp&#39;</span><span class="p">)</span>
774
- </pre></div>
775
-
776
- </div>
777
- </div>
778
- </div>
779
-
780
- <div class="output_wrapper">
781
- <div class="output">
782
-
783
-
784
- <div class="output_area">
785
-
786
-
787
-
788
-
789
-
790
- <div class="output_text output_subarea output_execute_result">
791
- <pre>0.48699799279959155</pre>
792
- </div>
793
-
794
- </div>
795
-
796
- </div>
797
- </div>
798
-
799
- </div>
800
- <div class="cell border-box-sizing text_cell rendered"><div class="inner_cell">
801
- <div class="text_cell_render border-box-sizing rendered_html">
802
- <p>These values confirm what we had observed:</p>
803
- <ul>
804
- <li>There is a negative association between price and mileage, whereas the association between price and acceleration is positive.</li>
805
- <li>The linear relation between price and acceleration is a little weaker (correlation about 0.5) than the one between price and mileage (correlation about -0.67).</li>
806
- </ul>
807
-
808
- </div>
809
- </div>
810
- </div>
811
- <div class="cell border-box-sizing text_cell rendered"><div class="inner_cell">
812
- <div class="text_cell_render border-box-sizing rendered_html">
813
- <p>Correlation is a simple and powerful concept, but it is sometimes misused. Before using $r$, it is important to be aware of what correlation does and does not measure.</p>
814
-
815
- </div>
816
- </div>
817
- </div>
818
-