my-markdown-library 0.1.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +7 -0
- data/F24LS_md/ Lecture 4 - Public.md +347 -0
- data/F24LS_md/Lecture 1 - Introduction and Overview.md +327 -0
- data/F24LS_md/Lecture 10 - Development_.md +631 -0
- data/F24LS_md/Lecture 11 - Econometrics.md +345 -0
- data/F24LS_md/Lecture 12 - Finance.md +692 -0
- data/F24LS_md/Lecture 13 - Environmental Economics.md +299 -0
- data/F24LS_md/Lecture 15 - Conclusion.md +272 -0
- data/F24LS_md/Lecture 2 - Demand.md +349 -0
- data/F24LS_md/Lecture 3 - Supply.md +329 -0
- data/F24LS_md/Lecture 5 - Production C-D.md +291 -0
- data/F24LS_md/Lecture 6 - Utility and Latex.md +440 -0
- data/F24LS_md/Lecture 7 - Inequality.md +607 -0
- data/F24LS_md/Lecture 8 - Macroeconomics.md +704 -0
- data/F24LS_md/Lecture 8 - Macro.md +700 -0
- data/F24LS_md/Lecture 9 - Game Theory_.md +436 -0
- data/F24LS_md/summary.yaml +105 -0
- data/F24Lec_MD/LecNB_summary.yaml +206 -0
- data/F24Lec_MD/lec01/lec01.md +267 -0
- data/F24Lec_MD/lec02/Avocados_demand.md +425 -0
- data/F24Lec_MD/lec02/Demand_Steps_24.md +126 -0
- data/F24Lec_MD/lec02/PriceElasticity.md +83 -0
- data/F24Lec_MD/lec02/ScannerData_Beer.md +171 -0
- data/F24Lec_MD/lec02/demand-curve-Fa24.md +213 -0
- data/F24Lec_MD/lec03/3.0-CubicCostCurve.md +239 -0
- data/F24Lec_MD/lec03/3.1-Supply.md +274 -0
- data/F24Lec_MD/lec03/3.2-sympy.md +332 -0
- data/F24Lec_MD/lec03/3.3a-california-energy.md +120 -0
- data/F24Lec_MD/lec03/3.3b-a-really-hot-tuesday.md +121 -0
- data/F24Lec_MD/lec04/lec04-CSfromSurvey-closed.md +335 -0
- data/F24Lec_MD/lec04/lec04-CSfromSurvey.md +331 -0
- data/F24Lec_MD/lec04/lec04-Supply-Demand-closed.md +519 -0
- data/F24Lec_MD/lec04/lec04-Supply-Demand.md +514 -0
- data/F24Lec_MD/lec04/lec04-four-plot-24.md +34 -0
- data/F24Lec_MD/lec04/lec04-four-plot.md +34 -0
- data/F24Lec_MD/lec05/Lec5-Cobb-Douglas.md +131 -0
- data/F24Lec_MD/lec05/Lec5-CobbD-AER1928.md +283 -0
- data/F24Lec_MD/lec06/6.1-Sympy-Differentiation.md +253 -0
- data/F24Lec_MD/lec06/6.2-3D-utility.md +287 -0
- data/F24Lec_MD/lec06/6.3-QuantEcon-Optimization.md +399 -0
- data/F24Lec_MD/lec06/6.4-latex.md +138 -0
- data/F24Lec_MD/lec06/6.5-Edgeworth.md +269 -0
- data/F24Lec_MD/lec07/7.1-inequality.md +283 -0
- data/F24Lec_MD/lec07/7.2-historical-inequality.md +237 -0
- data/F24Lec_MD/lec08/macro-fred-api.md +313 -0
- data/F24Lec_MD/lec09/lecNB-prisoners-dilemma.md +88 -0
- data/F24Lec_MD/lec10/Lec10.2-waterguard.md +401 -0
- data/F24Lec_MD/lec10/lec10.1-mapping.md +199 -0
- data/F24Lec_MD/lec11/11.1-slr.md +305 -0
- data/F24Lec_MD/lec11/11.2-mlr.md +171 -0
- data/F24Lec_MD/lec12/Lec12-4-PersonalFinance.md +590 -0
- data/F24Lec_MD/lec12/lec12-1_Interest_Payments.md +267 -0
- data/F24Lec_MD/lec12/lec12-2-stocks-options.md +235 -0
- data/F24Lec_MD/lec13/Co2_ClimateChange.md +139 -0
- data/F24Lec_MD/lec13/ConstructingMAC.md +213 -0
- data/F24Lec_MD/lec13/EmissionsTracker.md +170 -0
- data/F24Lec_MD/lec13/KuznetsHypothesis.md +219 -0
- data/F24Lec_MD/lec13/RoslingPlots.md +217 -0
- data/F24Lec_MD/lec15/vibecession.md +485 -0
- data/F24Textbook_MD/00-intro/index.md +292 -0
- data/F24Textbook_MD/01-demand/01-demand.md +152 -0
- data/F24Textbook_MD/01-demand/02-example.md +131 -0
- data/F24Textbook_MD/01-demand/03-log-log.md +284 -0
- data/F24Textbook_MD/01-demand/04-elasticity.md +248 -0
- data/F24Textbook_MD/01-demand/index.md +15 -0
- data/F24Textbook_MD/02-supply/01-supply.md +203 -0
- data/F24Textbook_MD/02-supply/02-eep147-example.md +86 -0
- data/F24Textbook_MD/02-supply/03-sympy.md +138 -0
- data/F24Textbook_MD/02-supply/04-market-equilibria.md +204 -0
- data/F24Textbook_MD/02-supply/index.md +16 -0
- data/F24Textbook_MD/03-public/govt-intervention.md +73 -0
- data/F24Textbook_MD/03-public/index.md +10 -0
- data/F24Textbook_MD/03-public/surplus.md +351 -0
- data/F24Textbook_MD/03-public/taxes-subsidies.md +282 -0
- data/F24Textbook_MD/04-production/index.md +15 -0
- data/F24Textbook_MD/04-production/production.md +178 -0
- data/F24Textbook_MD/04-production/shifts.md +296 -0
- data/F24Textbook_MD/05-utility/budget-constraints.md +166 -0
- data/F24Textbook_MD/05-utility/index.md +15 -0
- data/F24Textbook_MD/05-utility/utility.md +136 -0
- data/F24Textbook_MD/06-inequality/historical-inequality.md +253 -0
- data/F24Textbook_MD/06-inequality/index.md +15 -0
- data/F24Textbook_MD/06-inequality/inequality.md +226 -0
- data/F24Textbook_MD/07-game-theory/bertrand.md +257 -0
- data/F24Textbook_MD/07-game-theory/cournot.md +333 -0
- data/F24Textbook_MD/07-game-theory/equilibria-oligopolies.md +96 -0
- data/F24Textbook_MD/07-game-theory/expected-utility.md +61 -0
- data/F24Textbook_MD/07-game-theory/index.md +19 -0
- data/F24Textbook_MD/07-game-theory/python-classes.md +340 -0
- data/F24Textbook_MD/08-development/index.md +35 -0
- data/F24Textbook_MD/09-macro/CentralBanks.md +101 -0
- data/F24Textbook_MD/09-macro/Indicators.md +77 -0
- data/F24Textbook_MD/09-macro/fiscal_policy.md +36 -0
- data/F24Textbook_MD/09-macro/index.md +14 -0
- data/F24Textbook_MD/09-macro/is_curve.md +76 -0
- data/F24Textbook_MD/09-macro/phillips_curve.md +70 -0
- data/F24Textbook_MD/10-finance/index.md +10 -0
- data/F24Textbook_MD/10-finance/options.md +178 -0
- data/F24Textbook_MD/10-finance/value-interest.md +60 -0
- data/F24Textbook_MD/11-econometrics/index.md +16 -0
- data/F24Textbook_MD/11-econometrics/multivariable.md +218 -0
- data/F24Textbook_MD/11-econometrics/reading-econ-papers.md +25 -0
- data/F24Textbook_MD/11-econometrics/single-variable.md +483 -0
- data/F24Textbook_MD/11-econometrics/statsmodels.md +58 -0
- data/F24Textbook_MD/12-environmental/KuznetsHypothesis-Copy1.md +187 -0
- data/F24Textbook_MD/12-environmental/KuznetsHypothesis.md +187 -0
- data/F24Textbook_MD/12-environmental/MAC.md +254 -0
- data/F24Textbook_MD/12-environmental/index.md +36 -0
- data/F24Textbook_MD/LICENSE.md +11 -0
- data/F24Textbook_MD/intro.md +26 -0
- data/F24Textbook_MD/references.md +25 -0
- data/F24Textbook_MD/summary.yaml +414 -0
- metadata +155 -0
@@ -0,0 +1,345 @@
---
title: "Lecture 11 - Econometrics"
type: slides
week: 11
source_path: "/Users/ericvandusen/Documents/Data88E-ForTraining/F24LS/Lecture 11 - Econometrics.pptx"
---

## Slide 1: Lecture 11: Econometrics

- Lecture 11: Econometrics
- Data 88E: Economic Models

## Slide 2: Announcements

- Project 4 will be released - Econometrics
- No Lab

## Slide 3: Academic Dishonesty and Attendance

- Using ChatGPT to paste in code is academic dishonesty

## Slide 4: Follow-ups

- Last time I ranted about Facebook 2016 and 2020 - it's in court today!!!
- Did Facebook lie to shareholders about knowledge of risks?
- Brett Kavanaugh's best friend Joel Kaplan is Facebook Policy head?

## Slide 5: Fed Meeting tomorrow!

- What is the forecast for tomorrow?
- How might the election affect the Fed's decisions?

## Slide 6: (untitled)

- https://www.spglobal.com/marketintelligence/en/news-insights/latest-news-headlines/fed-s-autonomy-could-be-at-risk-if-trump-wins-in-november-85804211

## Slide 7: Project 2025 has a chapter on the Fed - Chapter 24!

- Do away with the Dual Mandate
- https://static.project2025.org/2025\_MandateForLeadership\_CHAPTER-24.pdf

## Slide 8: Today’s class

- Intro to econometrics
- Intro to regression
- Simple linear regression
- Multiple linear regression
- Dummy variables
- Reading econometrics tables
- Project demo
- Upper division econometrics classes

## Slide 9: Memes of the day

## Slide 10: Intro to econometrics

- An important part of economics is understanding the relationship between variables, especially cause-and-effect relationships
  - How does price affect the quantity demanded?
  - How does income tax affect consumption?
  - How does schooling affect income?
- Ideally, we would run randomized controlled trials to see whether one variable (the independent variable) causes another (the dependent variable): the gold standard for establishing causality
- But we usually can’t conduct experiments in economics; the best we have is observational data
  - What do you think are some challenges with conducting experiments in economics?
- This is why we use econometrics: we use statistical techniques (like regression) to model relationships between economic variables

## Slide 11: Intro to regression

- Really important tool in econometrics
- Used for modelling relationships between variables
  - E.g. the relationship between price and quantity demanded
- 3 main purposes of regression:
  - Describing associations: What kind of association do x and y have? Positive/negative? Strong/weak?
  - Prediction: E.g. by how much will quantity demanded decrease if price increases by $1?
  - Causal inference: Does x cause y, or is it just a correlation?

## Slide 12: Simple linear regression

- You hypothesize that there’s some association between two variables – say, price and quantity demanded
- Start by creating a scatter plot to visualize the association
  - This gives you an intuition for the association
  - Gives you a sense of whether the association is positive/negative and strong/weak/none
- There is some randomness: e.g. in the example, y doesn’t always increase as x increases (otherwise the points would all lie on a straight line)

## Slide 13: Simple linear regression

- You can quantify this association by calculating the correlation coefficient, r (DATA 8 review!)
- Review of the correlation coefficient:
  - Measure of linear association (won’t work for nonlinear association), i.e. the slope is constant with respect to x
  - Value between -1 and 1
  - The sign tells you the direction of the association: positive or negative
  - Take the absolute value of r to determine the strength of the association
  - 0 means no correlation
- Correlation is not causation!

## Slide 14: Simple linear regression

- How to calculate the correlation coefficient:
  - Step 1: Convert the data to standard units
    - Express each value as the number of standard deviations it is from the mean (“distance” from the mean)
  - Step 2: Take the mean of the products of x and y (in standard units)

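The two steps above (standard units, then the mean of the products) can be sketched in NumPy; the data here is made up, and `np.corrcoef` is used only as a cross-check:

```python
import numpy as np

def standard_units(a):
    # Step 1: express each value as its distance from the mean, in standard deviations
    return (a - a.mean()) / a.std()

def correlation(x, y):
    # Step 2: take the mean of the products of x and y in standard units
    return np.mean(standard_units(x) * standard_units(y))

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])    # hypothetical data
y = np.array([2.0, 3.9, 6.1, 8.0, 10.1])

r = correlation(x, y)
# matches NumPy's built-in correlation coefficient
assert np.isclose(r, np.corrcoef(x, y)[0, 1])
```
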
## Slide 15: Simple linear regression

- We can draw a line of best fit through the data: the line that most accurately models the relationship between x and y
- Recall that this line has the form y = slope \* x + intercept
- We can use the correlation coefficient to find the slope and intercept
  - Slope: r \* SD(y) / SD(x)
  - Intercept: mean(y) - slope \* mean(x)
- Equation of the line of best fit → the regression equation
- A note on notation: the hats indicate that the values are estimates (rather than actual values)
- (fact: because the averages of x and y are always on the regression line)

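Assuming the standard Data 8 formulas above, the slope and intercept can be computed from r directly; made-up data, with `np.polyfit` as a cross-check:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])    # hypothetical data
y = np.array([2.0, 3.9, 6.1, 8.0, 10.1])

r = np.corrcoef(x, y)[0, 1]

slope = r * np.std(y) / np.std(x)              # slope = r * SD(y) / SD(x)
intercept = np.mean(y) - slope * np.mean(x)    # line passes through the point of averages

# Same line as np.polyfit's least-squares fit
fit = np.polyfit(x, y, 1)
assert np.isclose(slope, fit[0]) and np.isclose(intercept, fit[1])
```
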
## Slide 16: Simple linear regression

- Can plot the regression line (the line of best fit) using the equation we found
- This is the same as the line of best fit you get from a function you’ve already used: np.polyfit
- Not all the points lie on the regression line because of
  - Random variation
  - The model is not a perfect fit (not perfectly accurate)
- Note: this assumes all the assumptions of linear regression are satisfied

## Slide 17: Avocado Demand in Week 2? np.polyfit(X,Y,1)

## Slide 18: polyfit

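The avocado demo itself isn't reproduced in this export; a minimal sketch of the `np.polyfit(X, Y, 1)` call with made-up prices and quantities:

```python
import numpy as np

price = np.array([0.8, 1.0, 1.2, 1.4, 1.6])           # hypothetical avocado prices
quantity = np.array([100.0, 92.0, 83.0, 76.0, 68.0])  # hypothetical quantities sold

# degree-1 fit returns (slope, intercept) of the least-squares line
slope, intercept = np.polyfit(price, quantity, 1)

predicted = slope * price + intercept   # predictions from the fitted demand line
```

The slope comes out negative here, as a demand curve should.
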
## Slide 19: Simple linear regression

- Use the regression equation to generate predictions for y
- Once we’ve generated predictions, we want to know how accurate they are
- We can calculate the root mean squared error (RMSE) to measure accuracy:
  - Calculate the residuals → square them → take the average → take the square root

## Slide 20: Simple linear regression

- RMSE: on average, how far are your predictions from the actual values?
- Think about whether your value for RMSE is a large number
- Most useful for comparing across models
- Regression minimizes the RMSE of your data: of all the lines you can draw through your points, the regression line has the lowest RMSE
  - This is why it’s called (ordinary) least squares (OLS) regression
  - We can use the minimize() function to see this
  - NumPy minimizes RMSE when it calculates the slope and intercept (e.g. when you use np.polyfit)

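The RMSE recipe, and the claim that the regression line minimizes it, can be checked with plain NumPy (made-up data; the class's `minimize()` demo is replaced here by a simple spot-check that no nearby line beats the regression line):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])    # hypothetical data
y = np.array([2.0, 3.9, 6.1, 8.0, 10.1])

def rmse(slope, intercept):
    residuals = y - (slope * x + intercept)   # calculate residuals
    return np.sqrt(np.mean(residuals ** 2))   # square -> average -> square root

slope, intercept = np.polyfit(x, y, 1)        # the least-squares (OLS) line
best = rmse(slope, intercept)

# every perturbed line does worse than the regression line
for ds in (-0.5, -0.1, 0.1, 0.5):
    for di in (-0.5, -0.1, 0.1, 0.5):
        assert rmse(slope + ds, intercept + di) > best
```
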
## Slide 21: Simple linear regression

- So far: DATA 8 version of regression
- We use the statsmodels library to do regression in Python
- First, import statsmodels.api as sm
- Code for regression:
- To get the coefficients (intercept, slope): result.params

## Slide 22: (untitled)

## Slide 23: Simple linear regression

- Always include an intercept term in your model (that is, don’t forget sm.add\_constant)
  - The regression line passes through the point of averages (a property of the regression line)
  - The best prediction for the average of x is the average of y
  - Including the intercept term makes sure that your regression line passes through the point of averages
  - Also, you can’t generally assume that your model doesn’t have a y-intercept

## Slide 24: Simple linear regression

- Important things to focus on in the regression output:
  - Slope (𝛃): By how much does y increase/decrease when you increase x by 1 unit?
  - Intercept (𝛂): What is the value of y when x = 0?
    - Not always meaningful (e.g. what are earnings when height is 0 inches?)
  - Confidence interval: Is there a statistically significant association between x and y? (H0: 𝛃 = 0, H1: 𝛃 ≠ 0)

## Slide 25: Simple linear regression

- Confidence interval on coefficient estimates:
- Related to hypothesis testing:
  - Null hypothesis: No association between x and y (𝛃 = 0)
  - Alternative hypothesis: There is an association between x and y (𝛃 ≠ 0)
- Regression output gives you a 95% CI → test hypotheses at the 5% significance level
  - If the CI contains 0: evidence for the null
  - If the CI doesn’t contain 0: evidence against the null
- Can simulate using bootstrapping
- Center of the CI: the regression slope

## Slide 26: Regression Output (Statsmodels)

## Slide 27: Regression Output (SM vs R vs Stata)

- Python (statsmodels)
- R - lm
- Stata

## Slide 28: Regression Output (Statsmodels)

- R-squared: the amount of variation explained by the model
- Number of Observations: reality check on the size of the data - N matters!
- coef = 𝛃 = magnitude of the estimated coefficient
- std err = variability of the estimate of the coefficient
- t = a t-test testing whether 𝛃 = 0
- P>|t| = the p-value of that t-test

## Slide 29: Jump to Notebook 1

## Slide 30: Interactive Demo

## Slide 31: (untitled)

## Slide 32: (untitled)

## Slide 33: Multiple linear regression

- Variables that affect the dependent variable and are correlated with the independent variable, but are not included in the model, are called omitted variables
  - E.g. your socioeconomic background affects your earnings and is positively correlated with how much education you get
- They cause omitted variable bias: they cause your estimate of the regression slope to differ from the actual slope
  - What would a positive vs. negative value for omitted variable bias mean?
- Can use multiple linear regression to account for omitted variables (a really important purpose of regression!)

## Slide 34: Multiple linear regression

- Variables that affect the dependent variable and are correlated with the independent variable, but are not included in the model, are called omitted variables
  - E.g. your socioeconomic background affects your earnings and is positively correlated with how much education you get
- They cause omitted variable bias: they cause your estimate of the regression slope to differ from the actual slope
- Can use multiple linear regression to account for omitted variables

## Slide 35: Multiple linear regression

- Using statsmodels: just select multiple columns when defining the x-variable
- Multiple regression slopes: each independent variable has a slope
- Compare the slope on the independent variable of interest from simple and multiple linear regression to see if it’s overstated or understated
- Example of an MLR model:

## Slide 36: Multiple linear regression

- Mathematical intuition:
  - Each slope is the partial effect of the corresponding independent variable on y: the effect holding all other independent variables constant

## Slide 37: Multiple linear regression

- Graphically, with 2 independent variables: a 3D plot with a regression plane (not a line)
  - Y
  - X1 - continuous variable
  - X2 - dummy variable
    - Can only be 0 or 1

## Slide 38: Dummy variables

- Dummy variables are variables that take on a value of either 0 or 1 to indicate presence or absence in a category
  - E.g. smoker or non-smoker, went to college or didn’t go to college
- Also known as indicator variables
- Mutually exclusive: you can only be in one of the categories (you either went to college or you didn’t – not both)
- Collectively exhaustive: you must be in one of the categories (only 2 possible scenarios)

## Slide 39: Dummy variables

- Example of a regression model with a dummy variable:
- col is a dummy variable indicating whether or not the person went to college (0 = didn’t go to college)
- Coefficient on col → the difference in mean earnings between col = 1 and col = 0

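The claim that the dummy's coefficient equals the difference in group means can be checked directly (hypothetical earnings data; `col` is the college dummy from the slide):

```python
import numpy as np

col = np.array([0, 0, 0, 1, 1, 1, 1, 0])   # college dummy (0/1)
earnings = np.array([30.0, 35.0, 28.0, 55.0, 70.0, 58.0, 80.0, 42.0])

# Fit earnings = intercept + beta * col
beta, intercept = np.polyfit(col, earnings, 1)

diff_in_means = earnings[col == 1].mean() - earnings[col == 0].mean()
assert np.isclose(beta, diff_in_means)   # slope on the dummy = difference of group means
```
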
## Slide 40: Dummy variables

- Beware of the dummy variable trap
- Say you include all possible categories of a dummy variable
  - E.g. col (for whether or not you went to college) and notcol (for whether or not you didn’t go to college)
- Example:

## Slide 41: Dummy variables

- There are infinitely many solutions for the coefficients (slopes and intercepts)
- This is because the independent variables in the model have a perfect correlation – perfect multicollinearity
- Happens any time one independent variable is a linear combination of another
  - E.g. if you have age and schooling in your model and age = schooling + 3
- Use pd.get\_dummies() to convert categorical variables to dummy variables

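A minimal sketch of `pd.get_dummies` on made-up data; `drop_first=True` drops one category, which is the standard way to avoid the dummy variable trap:

```python
import pandas as pd

df = pd.DataFrame({"college": ["yes", "no", "yes", "no"]})   # hypothetical column

# One column per category: college_no and college_yes (perfectly collinear)
all_dummies = pd.get_dummies(df["college"], prefix="college")

# Dropping the first category avoids the dummy variable trap
safe = pd.get_dummies(df["college"], prefix="college", drop_first=True)

print(list(all_dummies.columns))   # both categories
print(list(safe.columns))          # one category dropped
```
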
## Slide 42: Jump to Lec NB2

## Slide 43: Reading econometrics tables

- In papers, the results of econometric analysis are summarized in regression tables
- Example of regression table from project 4:

## Slide 44: Reading econometrics tables

- In papers, the results of econometric analysis are summarized in regression tables
- Example of regression table from project 4:
  - Filtering Variable
  - X1
  - X2
  - X3,X4,X5…
  - Y
  - n

## Slide 45: Reading econometrics tables

- Always look at how the dependent variable is being measured: in the example, the authors are taking log earnings
  - Recall what you learned about semi-log demand curves: interpreted as the % change in y per unit change in x
- Look at what is being controlled for, e.g.
  - They are estimating separate models for men and women, so they are controlling for gender
  - In the second column for each group, they are controlling for test scores

## Slide 46: Project demo

- In this project, you will use regression to analyze the relationship between a person’s height and earnings
- Based on a study conducted by Anne Case and Christina Paxson (very interesting study!)
- Divided into 3 parts: simple linear regression, multiple linear regression, and reading econometrics tables

## Slide 47: Upper division econometrics classes

- ECON 140: Economic Statistics and Econometrics
  - Edwards - in R in Fall
- ECON 141: Econometric Analysis (with Linear Algebra)
- ECON 142: Applied Econometrics and Public Policy
- ECON 143: Econometrics: Advanced Methods and Applications (New)
- ECON 144: Financial Econometrics
- STAT 153: Time Series
- ECON 148: Data Science for Economists
- Learn more here: http://guide.berkeley.edu/courses/econ/

## Slide 48: Aside - Econometrics vs ML

- The idea of econometrics is that the model has an underlying structure
  - Economic theory gives us a reason to structure the model
  - We seek to explain the effect of X on Y
  - We also need to hold multiple variables constant
  - We will adapt a lot of techniques to get an unbiased estimator
  - OLS is the starting point - variations depart from there
- Machine learning - we don’t need to have an underlying model
  - Whatever does the best job in prediction! Or modeling, or classification...
- ML has many models that can inform econometrics!
  - Random forests, network/graph theory, nonparametric approaches
  - Neural networks

## Slide 49: Terminology - an aside: econometrics vs ML

- Dependent Variable (Y) ~ predictor and predicted values
- Independent Variable (X) ~ regressors, explanatory variables ~ coefficients, betas
  - X in the ML world might be called “model features”
  - Y in the ML world might be called the “target”
- 0-1 variables in the Y variable call for a different class of models - especially the Logit model; in ML this would be a classifier model
- 0-1 dummy variables in the X variable would be called “one-hot encoding” of categorical variables
- ML - split up your dataset:
  - Training data - used for training the model
  - Testing data - used for measuring the accuracy of your model

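The train/test split in the last bullet can be sketched with plain NumPy (hypothetical data and an 80/20 split; scikit-learn's `train_test_split` does the same job):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))   # hypothetical feature matrix ("model features")
y = rng.normal(size=100)        # hypothetical target

# Shuffle the row indices, then split 80/20 into training and testing sets
indices = rng.permutation(len(X))
train_idx, test_idx = indices[:80], indices[80:]

X_train, y_train = X[train_idx], y[train_idx]   # used for fitting the model
X_test, y_test = X[test_idx], y[test_idx]       # used for measuring accuracy
```
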
## Slide 50: Matrix notation for Linear Regression

- https://online.stat.psu.edu/stat462/node/132/
- Formula for the data model: Y = X𝛃 + ε
- Formula for OLS: 𝛃-hat = (XᵀX)⁻¹XᵀY

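The matrix-form OLS estimator (XᵀX)⁻¹XᵀY can be checked numerically on simulated data, with NumPy's least-squares solver as the reference:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # intercept + 2 regressors
beta_true = np.array([1.0, 2.0, -0.5])
y = X @ beta_true + rng.normal(scale=0.1, size=n)           # data model: Y = X b + e

# OLS estimator: beta_hat = (X'X)^{-1} X'y  (solve is preferred over an explicit inverse)
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Agrees with NumPy's least-squares solver
assert np.allclose(beta_hat, np.linalg.lstsq(X, y, rcond=None)[0])
```
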
## Slide 51: SKlearn