-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathMathReview_v10.Rmd
694 lines (499 loc) · 25.9 KB
/
MathReview_v10.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
---
title: "Math Review - MIDS W203 - Statistics"
author: "w203 Staff"
date: "24 August 2020"
version: "11"
header-includes:
- \usepackage{amssymb}
output: pdf_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
library(ggplot2)
# To Do
# 1. add chain rule and integration by parts examples
# 2. add u-substitution example
# 3. add exponential example - mutilplying two exp's together
# 4. add exampe of integration to inf - ref page in book define limits not as inf...
```
This document is designed to provide a quick refresher on exponents and logarithms, functions, differential and integral calculus, histograms, and density functions for MIDS W203 students.
This document is made available as both a PDF and an R Markdown document. It is full of examples of how to format reports using R Markdown and LaTeX, formally \LaTeX (pronounced "LAY-Tek"). I encourage you to keep the source file handy through the semester as a mini-cookbook for making nice reports with many of the mathematical expressions you will use.
## 1. Mathematical Symbols and Nomenclature
\begin{align}
wrt\hspace{6mm} &\text{with respect to}\notag \\
\mathbb{R} \hspace{6mm} &\text{The set of Real numbers}\notag \\
\mathbb{Z} \hspace{6mm} &\text{The set of Integers} \notag \\
\mathcal{N}(\mu,\sigma^2) \hspace{6mm} &\text{Normal distribution with mean }\mu \text{ and variance }\sigma^2 \notag \\
\forall \hspace{6mm} &\text{For all (members of a set)}\notag \\
| \hspace{6mm} &\text{Such that (the pipe symbol, more common)}\notag \\
\ni \hspace{6mm} &\text{Such that (less common)}\notag \\
\in \hspace{6mm} &\text{In (a set)}\notag \\
\approx \hspace{6mm} &\text{Approximately equal to}\notag \\
\left\{ x \in A \hspace{1mm}| \hspace{1mm}0\leq Pr(x) \leq 1\right\} \hspace{6mm} &\text{The set of all x in A such that the probability of x is greater than}\notag \\
\hspace{6mm} &\text {or equal to 0 and less than or equal to 1}\notag \\
\left\{ x \in \mathbb{R} \hspace{1mm}: x > 0\right\} \hspace{6mm} &\text{The set of all x in the set of Real numbers such that x is greater than zero}\notag
\end{align} \notag
## 2. Rules of Logarithms and Exponents
\begin{align}
a^n \times a^m &= a^{n+m} \notag \\
a^n \times b^n &= (a \cdot b)^n \notag \\
a^n / a^m &= a^{n-m} \notag \\
a^n / b^n &= (a/b)^n\notag \\
(b^n)^m &= b^{n \cdot m}\notag \\
(b^n)^m &= (b^m)^n \notag \\
\sqrt[m]{b^n} &= b^{n/m} \notag \\
\notag \\
\notag \\
log(uv) &= log(u) + log(v) \notag \\
log(u/v) &= log(u) - log(v) \notag \\
log(u^n) &= n\cdot log(u) \notag \\
\end{align} \notag
\pagebreak
## 3. Discrete Data and Summation Notation
Some data can assume only specific values or they form a series of values which are countable (although the series may be infinite). For discrete data, we common use subscripts to denote individual instances of a variable. Examples that will become very familiar include:
The sample mean:
$$\bar{x} = \frac{x_1 + x_2 + \ldots + x_n}{n} = \frac{\sum\limits_{i=1}^{n}{x_i}}{n}$$
The sample variance:
$$s^2 = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \ldots + (x_n - \bar{x})^2} {n-1} = \frac{\displaystyle\sum_{i=1}^{n}(x_i - \bar{x})^2} {n-1}$$
\pagebreak
## 4. Functions and Functional Notation
### a. What is a function?
A function is a mathematical rule that maps one or more input (independent) variables to a unique value of an output (dependent) variable. Here are some examples of discrete and continuous functions:
$$Pr(x) =
\begin{cases}
p(1-p)^{x-1} &\text{x = 1, 2, 3,...}\\
0 &\text {otherwise.}
\end{cases}$$
Can you think of a real-world relationship that this equation might represent?
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + ... + \beta_k x_k + \mu$$
This is the equation for the multivariate linear regression model. You will know a great deal about this equation by the end of the term...
### b. What is not a function:
The following rule is not a function:
$$ 9 = x^2 + y^2$$
It is the equation for a circle of radius 3.
```{r circle, echo=FALSE}
library("plotrix")
plot(1:5,seq(-5,5,length=5),type="n",xlim=c(-8,8), ylim=c(-8,8),
xlab="X",ylab="y",main="Circle Graph")
draw.circle(0,0,3)
grid(nx=18, ny=18)
abline(v=-1.9, col="red")
abline(v=0, col="black", lty=2 )
abline(h=0, col="black", lty=2 )
```
The circle is not a function because for $x=-2$, there are two values of y that satisfy the equation. In general, if you can draw a vertical line through two or more points on a graph, it's not a $function$. The problem arises from the squared term on $y$. Here $y$ can take on the positive or negative value, making the outcome variable not a single value.
### c. Consider the line: $y = 2x-4$
```{r line plot, echo = FALSE}
x <- -1:6
dat <- data.frame(x,y=2*x - 4)
f <- function(x) 2*x-4
ggplot(dat, aes(x,y)) +
stat_function(fun=f, colour="firebrick") +
annotate("text", x = 1, y = 0, label = "f(x)" , color="firebrick", size=5 , angle=0, fontface="italic") +
ylim(-10,10)
```
We can also write this funciton as $f(x) = 2x - 4$ since $f$ is a function of $x$.
### d. Let's add a second-degree polynomial function, $g(x)$, and we'll write it in two different ways:
$$g(x) = (x-2)(x-5) = x^2 - 7x +10$$
What's the benefit of seeing the equation in these two ways? $(x-2)(x-5)$ and $x^2 - 7x + 10$
(Seriously, this is where you think about this insightful question. Why did we take the time and effort to write this equation in two different two ways? Each way has its own merit.)
\pagebreak
```{r line plot3, echo = FALSE}
x <- -1:8
dat <- data.frame(x,y=x^2-7*x+10)
f <- function(x) x^2-7*x+10
ggplot(dat, aes(x,y)) +
stat_function(fun=f, colour="darkblue") +
annotate("text", x = 2, y = 2, label = "g(x)" , color="darkblue", size=5 , angle=0, fontface="italic") +
ylim(-3,20) +
geom_hline(yintercept = 0) +
geom_vline(xintercept = 0)
```
This is the plot of $g(x) = (x-2)(x-5) = x^2 - 7x +10$.
The pedantic instructor might insist that the plot and the axes should have arrows on them, else it might imply that the function is actually an arc or a segment. Let's agree to forgive each other for the lack of arrows.
You should be able to show on this graph why your answer to the previous question makes sense. Can you?
Note that $g(x)$ is a second-order polynomial, because the $x^2$ term is of degree 2, the highest of the degrees of the three terms $x^2$, $-7x$, and $10$.
### e. Zero-degree polynomial.
OK, quick, what does a zero-degree polynomial function look like?
(This is the part where you think, again. What's the graph look like? Draw a coordinate plane and sketch it in your notebook, or draw it with your finger in the palm of your hand - before you go to the next page, please.)
\pagebreak
```{r h_neg3, echo = FALSE}
f <- function(x) -3
ggplot(dat, aes(x,y)) +
stat_function(fun=f, colour="darkgreen") +
annotate("text", x = 1, y = -2.5, label = "h(x)" , color="darkgreen", size=5 , angle=0, fontface="italic") +
ylim(-4,1)+
geom_hline(yintercept = 0) +
geom_vline(xintercept = 0)
```
Above is a plot of $h(x) = -3$, a zero-order function. How does $h(x)$ vary with $x$?
### f. Functions derived from other functions.
Let:
$$j(x) = f(x) + g(x)$$
and
$$k(x) = f(x) + g(x) + h(x)$$
$$
\begin{aligned}
j(x) &= f(x) + g(x) \\
&= (2x - 4) + (x^2 - 7x + 10) \\
&= x^2 - 5x + 6\\
&= (x-2)(x-3)
\end{aligned}
$$
Which is a nice form for graphing this equation, as we get the zero-crossings, $2$ and $3$.
```{r j_x, echo = FALSE, message=FALSE, warning=FALSE}
x <- -1:8
dat <- data.frame(x,y=x^2-5*x+6)
f <- function(x) x^2-5*x+6
ggplot(dat, aes(x,y)) +
stat_function(fun=f, colour="darkmagenta") +
annotate("text", x = 1.5, y = 2, label = "j(x)" , color="darkmagenta", size=5 , angle=0, fontface="italic") +
ylim(-4,6) +
geom_hline(yintercept = 0) +
geom_vline(xintercept = 0)
```
$$
\begin{aligned}
k(x) &= f(x) + g(x) + h(x)\\
&= (2x - 4) + (x^2 - 7x + 10) -3 \\
&= x^2 - 5x + 6 - 3\\
&= (x-3)(x-2) - 3
\end{aligned}
$$
Why did I not write this as $k(x) = x^2 - 5x + 3$?
Let's look at $j(x)$ and $k(x)$ on the same plot...
```{r two_plots, echo = FALSE}
x <- 0:5
pointdata1 <- data.frame(
xname = c(0.7, 4.3),
ypos = c(0,0)
)
pointdata2 <- data.frame(
xname = c(2, 3),
ypos = c(0, 0)
)
dat <- data.frame(x,y=x^2-5*x+6)
f <- function(x) x^2-5*x+6
k <- function(x) x^2 - 5*x + 3
ggplot(dat, aes(x,y)) +
stat_function(fun=f, color="darkmagenta") +
stat_function(fun=k, color="darkgreen") +
annotate("text", x = 1.5, y = 2, label = "j(x)" , color="darkmagenta", size=5 , angle=0, fontface="italic") +
annotate("text", x = 1, y = -2.5, label = "k(x)" , color="darkgreen", size=5 , angle=0, fontface="italic") +
ylim(-4,6) +
geom_point(data = pointdata1, mapping = aes(x = xname, y = ypos), color="darkgreen")+
geom_point(data = pointdata2, mapping = aes(x = xname, y = ypos), color="darkmagenta") +
geom_vline(xintercept=2.5, linetype='dotted', colour="black")+
geom_hline(yintercept = 0) +
geom_vline(xintercept = 0)
```
Note that $k(x)$ is just $j(x) - 3$. In other words, $k(x)$ is shifted down by $3$ on the $y$ axis. The two functions have their minima at the same value of $x$, and they have the same slope for every value of $x$. Do you see that?
Also, note that for $k(x)$, at $y=-3$, $x$ has values of $2$ and $3$. We can write this as $k(2) = k(3) = -3.$
Can you write the equation for the values at which $k(x)$ crosses the x-axis?
Here's a hint (the quadratic formula):
$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$$
which is the solution for the values of $x$ which make $y = 0$ in an equation of form: $y = ax^2 + bx + c$.
What are $a$, $b$, and $c$?
So, we get $\frac{5 \pm \sqrt {13}}{2} \approx 2.5 \pm 1.8 \approx 0.7, 4.3$ for the zero-crossings.
\pagebreak
## 5. Differential Calculus
### a. How does $f(x)$ change as $x$ changes? $f(x) = 2x - 4$
```{r line plot2, echo = FALSE}
x <- -1:6
dat <- data.frame(x,y=2*x - 4)
f <- function(x) 2*x-4
ggplot(dat, aes(x,y)) +
stat_function(fun=f, colour="firebrick") +
annotate("text", x = 1, y = 2, label = "f(x)" , color="firebrick", size=5 , angle=0, fontface="italic") +
ylim(-10,10)+
geom_hline(yintercept = 0) +
geom_vline(xintercept = 0)
```
For every increase of $x$ by $1$, $f(x)$ increases by $2$. How does this fact reveal itself in the graph of $f(x)$?
We can write this in several ways:
\begin{itemize}
\item The derivative of $f(x)$ with respect to $x$ is $2$.
\item $\frac{d}{dx}f(x) = 2$ This is Liebniz' Notation
\item $f^\prime(x) = 2$ This is Lagrange's Notation
\item $y^\prime = 2$
\end{itemize}
These are all equivalent. Which do you like best?
### b. The derivative of $h(x)$ wrt $x$.
Recall the graph of $h(x)$:
```{r h_no_change, echo = FALSE}
x <- -1:5
dat <- data.frame(x,y=-3)
f <- function(x) -3
ggplot(dat, aes(x,y)) +
stat_function(fun=f, colour="darkgreen") +
annotate("text", x = 1, y = -2.5, label = "h(x)" , color="darkgreen", size=5 , angle=0, fontface="italic") +
ylim(-4,1)+
geom_hline(yintercept = 0) +
geom_vline(xintercept = 0)
```
How does $h(x)$ change as $x$ changes?
This equation captures that relationship: $\frac{d}{dx}h(x) = 0$
In other words, $h(x)$ does not vary with $x$; it has constant value of $-3$ regardless of the value of $x$. The slope of this line is zero, which is another way of saying the same thing.
\pagebreak
Recall that:
$$k(x) = j(x) + h(x)$$
We can differentiate both sides of this equation, with respect to $x$ to get:
$$\frac{d}{dx}k(x) = \frac{d}{dx}j(x) + \frac{d}{dx}h(x)$$
Since
$$\frac{d}{dx}h(x) = 0$$
we have:
$$\frac{d}{dx}k(x) = \frac{d}{dx}j(x)$$
Given your understanding that the derivative of a function is the slope of that function, the equation above should be clear from the graph below:
```{r two_plots_again, echo = FALSE}
x <- 0:5
dat <- data.frame(x,y=x^2-5*x+6)
f <- function(x) x^2-5*x+6
k <- function(x) x^2 - 5*x + 3
ggplot(dat, aes(x,y)) +
stat_function(fun=f, color="darkmagenta") +
stat_function(fun=k, color="darkgreen") +
annotate("text", x = 1.5, y = 2, label = "j(x)" , color="darkmagenta", size=5 , angle=0, fontface="italic") +
annotate("text", x = 1, y = -2.5, label = "k(x)" , color="darkgreen", size=5 , angle=0, fontface="italic") +
ylim(-4,6)+
geom_hline(yintercept = 0) +
geom_vline(xintercept = 0)
```
See how the functions always have the same slope for all values of $x$?
\pagebreak
### c. Taking the derivative of a polynomial function.
This is where we see the benefit of writing $g(x)$ in this form...
$$ g(x) = x^2 - 7x + 10$$
$$\frac{d}{dx}g(x) = 2x - 7$$
The formula for the derivative of a function,
$$l(x) = ax^b$$
with respect to $x$ where $a$ and $b$ are constants is
$$l^\prime(x) = abx^{b-1}$$
For polynomials, simply repeat this for each term of the function.
\pagebreak
### d. Let's look at $g(x)$ and $g^\prime(x)$ together.
```{r line plot4, fig.width = 8, fig.height = 5, echo = FALSE}
x <- 0:7
dat <- data.frame(x,y=x^2-7*x+10)
f <- function(x) x^2-7*x+10
ggplot(dat, aes(x,y)) +
stat_function(fun=f, colour="darkblue") +
annotate("text", x = 2, y = 2, label = "g(x)" , color="darkblue", size=5 , angle=0, fontface="italic") +
ylim(-3,11)+
geom_hline(yintercept = 0) +
geom_vline(xintercept = 0)
```
```{r line plot4_prime, fig.width = 8, fig.height = 5, echo = FALSE}
x <- 0:7
dat <- data.frame(x,y=2*x-7)
f <- function(x) 2*x-7
ggplot(dat, aes(x,y)) +
stat_function(fun=f, colour="darkblue") +
annotate("text", x = 2, y = 2, label = "g'(x)" , color="darkblue", size=5 , angle=0, fontface="italic") +
ylim(-8,8)+
geom_hline(yintercept = 0) +
geom_vline(xintercept = 0)
```
Note the value of $x$ where $g^\prime(x) = 0$ and the behavior of $g(x)$ at that value of $x$. How would you describe this?
### e. Plots for the derivatives of a 3rd-degree polynomial.
```{r line plot m, fig.width = 8, fig.height = 2.5, echo = FALSE}
x <- 0:4
dat <- data.frame(x,y=x^3-6*x^2+11*x-6)
f <- function(x) x^3 - 6*x^2 + 11*x - 6
ggplot(dat, aes(x,y)) +
stat_function(fun=f, colour="darkcyan") +
annotate("text", x = 2, y = 2, label = "m(x)" , color="darkcyan", size=5 , angle=0, fontface="italic") +
theme(axis.title.x=element_blank(),
axis.text.x=element_blank(),
axis.ticks.x=element_blank(),
panel.background = element_rect(fill = 'gray94')) +
ylim(-12,12)+
geom_hline(yintercept = 0) +
geom_vline(xintercept = 0)
```
```{r line m_prime, fig.width = 8, fig.height = 2.5, echo = FALSE}
x <- 0:4
dat <- data.frame(x,y=3*x^2 - 12*x + 11)
f <- function(x) 3*x^2 - 12*x + 11
ggplot(dat, aes(x,y)) +
stat_function(fun=f, colour="darkcyan") +
annotate("text", x = 2, y = 3, label = "m'(x)" , color="darkcyan", size=5 , angle=0, fontface="italic")+
theme(axis.title.x=element_blank(),
axis.text.x=element_blank(),
axis.ticks.x=element_blank(),
panel.background = element_rect(fill = 'gray88')) +
ylim(-12,12)+
geom_hline(yintercept = 0) +
geom_vline(xintercept = 0)
```
```{r line m_2prime, fig.width = 8, fig.height = 2.5, echo = FALSE}
x <- 0:4
dat <- data.frame(x,y=6*x - 12)
f <- function(x) 6*x - 12
ggplot(dat, aes(x,y)) +
stat_function(fun=f, colour="darkcyan") +
annotate("text", x = 1, y = -3, label = "m''(x)" , color="darkcyan", size=5 , angle=0, fontface="italic")+
theme(axis.title.x=element_blank(),
axis.text.x=element_blank(),
axis.ticks.x=element_blank(),
panel.background = element_rect(fill = 'gray94')) +
ylim(-12,12)+
geom_hline(yintercept = 0) +
geom_vline(xintercept = 0)
```
```{r line m_3prime, fig.width = 8, fig.height = 2.5, echo = FALSE}
x <- 0:4
dat <- data.frame(x,y=6)
f <- function(x) 6
ggplot(dat, aes(x,y)) +
stat_function(fun=f, colour="darkcyan") +
annotate("text", x = 1, y = 3, label = "m'''(x)" , color="darkcyan", size=5 , angle=0, fontface="italic") +
theme(axis.title.x=element_blank(),
axis.text.x=element_blank(),
axis.ticks.x=element_blank(),
panel.background = element_rect(fill = 'gray88')) +
ylim(-12,12)+
geom_hline(yintercept = 0) +
geom_vline(xintercept = 0)
```
\pagebreak
### f. Notes on derivatives and notation
If $$n(y) = 3y^2 + 4y + 7$$
and $y$ is not a function of $x$, then
$$ \frac{d}{dx}n(y) = 0$$
but
$$\frac{d}{dy}n(y) = 6y + 4$$
Which is just a way of encouraging you to make sure you pay attention to the independent variable that you are using when taking a derivative. This is a reason I do not like the $y^\prime = 2$ notation: it can be ambiguous as to the independent variable (although usually not, in context).
Finally, let's review the notation for higher derivatives showing Leibniz and Lagrange notation:
$$\frac{dy}{dx} = f^{\prime}(x)$$
$$\frac{d^2y}{dx^2} = f^{\prime\prime}(x)$$
$$\frac{d^3y}{dx^3} = f^{\prime\prime\prime}(x)$$
### g. Partial Derivatives
Functions may have many input variables. For such a function, the derivative of that function with respect to a single input variable is referred to as the partial derivative of that function with respect to that single variable. It is interpreted as the change in the output variable based on a change in a given input variable, *holding all other variables constant.* For partial derivatives, it is standard to use the greek letter delta, $\delta$, instead of $d$. For example, consider:
$$f(x,y) = x^2 + xy + 4$$
$$\frac{\delta}{\delta x}f(x,y) = 2x + y$$
$$\frac{\delta}{\delta y}f(x,y) = x$$
Take a moment to consider what these statements mean, in the context of *holding all other variables constant.*
\pagebreak
### h. Important Derivatives
(From Tallarida's "Pocket Book of Integrals and Mathematical Functions, 2nd Edition")
Note: $u = f(x)$, $v = f(x)$, $a$ and $n$ are constants, and $e$ is the base of the natural logrithm.
$$\frac{d}{dx}(au) = a \frac{du}{dx}$$
$$\frac{d}{dx}(u + v) = \frac{du}{dx} + \frac{dv}{dx}$$
$$\frac{d}{dx}(uv) = u\frac{dv}{dx} + v\frac{du}{dx}$$
$$\frac{d}{dx}(u^n) = nu^{n-1}\frac{du}{dx}$$
$$\frac{d}{dx}(e^u) = e^u\frac{du}{dx}$$
$$\frac{d}{dx}ln(u) = \frac{1}{u}\frac{du}{dx}$$
### i. The Chain Rule
(From Paul's Online Math notes [Paul Dawkins at Lamar University] https://tutorial.math.lamar.edu/classes/calci/chainrule.aspx)
Suppose we have two functions $f(x)$ and $g(x)$ and they are both differentiable.
If we define $h(x) = f(g(x))$, then the derivative of $h(x)$ is:
$$ h'(x) = f'(g(x)) \cdot g'(x) $$
Put another way, if we have $y = f(u)$ and $u = g(x)$ then the derivative of $y$ is:
$$ \frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx}$$
### j. An application - The Wage Equation
Let's look at one example you will see this semester, the wage equation, where wage is a function of education. Note that Woolridge uses $log$ to denote the natural logarithm with a base of $e$, differing from $log_{10}$, which has a base of 10.
$$log(wage) = \beta_0 + \beta_1 educ + u$$
For reasons that you may already understand, and which you will certainly understand in a few weeks, $wage$ and other monetary variables are often modeled as natural log functions.
To take the derivative of the Wage Equation $wrt$ $education$, we apply the rule:
$$\frac{d}{dx}log(u) = \frac{1}{u}\frac{du}{dx}$$
Differentiating both sides of the Wage Equation $wrt$ $education$, we get:
$$\frac{d}{d\,educ}log(wage) = \frac{d}{d\,educ}(\beta_0 + \beta_1\,educ + u) $$
$$\frac{1}{wage}\frac{d\,wage}{d\,educ} = \beta_1$$
Rearranging, we get:
$$\beta_1 = \frac{\frac{d\,wage}{wage}}{d\,educ}$$
which is interpted as $\beta_1$ being the *percent change* in $wage$ that results from an additional unit (typically a year) of education. Note that this *percent change* outcome is a direct result of $\frac{d}{dx}e^x = e^x$, and it is for this reason that we use the *natural* log for variables such as $wage$. Does intuitively make sense to you that people people, regardless of their salary level would view a 10% raise similarly? Lots more on this later in the semester.
The perceptive student will question if $\beta_0$ and $\beta_1$ are truly constants $wrt$ $educ$. We will cover these important assumptions during the class, and learn the meaning of *exogeneity* and *endogeneity*.
\pagebreak
## 6. Integral Calculus
### a. Integrals - Basic Idea and Intuition
The integral of a function represents how that function accumulates as a function of some input variable. Graphically, the integral of a function is the area under the curve, either for the entire range of the function (the indefinite integral), or for some specific range (the definite integral).
The integral of a function is often written as a capital letter-version of the function. For example:
$$ F(x) = \int f(x) dx$$
The area under the curve of the function $f(x)$ from $a$ to $b$ is the integral of $f(x)$ $wrt$ $x$ evaulated at $b$ minus that same function evaluated at $a$. This is known as the Fundamental Theorem of Calculus.
$$ F(b) - F(a) = \int_a^b f(x) dx$$
### b. Integrating a linear function.
Consider $$d(x) = \frac{1}{2} x$$
which has the indefinite integral:
$$D(x) = \int \frac{1}{2}x \, dx$$
```{r line plot8, echo = FALSE, fig.width = 8, fig.height = 3}
x <- seq(-0.01,8.01,0.1)
dat <- data.frame(x,y=0.5*x)
f <- function(x) 0.5*x
ggplot(dat, aes(x,y)) +
stat_function(fun=f, colour="firebrick") +
annotate("text", x = 1, y = 1.5, label = "d(x)" , color="firebrick", size=5 , angle=0, fontface="italic") +
annotate("text", x = 6.5, y = 1, label = "D(x)" , color="deepskyblue", size=5 , angle=0, fontface="italic") +
ylim(-1,5)+
geom_area(mapping = aes(x = ifelse(x>= 0 & x<= 6 , x, 0)), fill = "deepskyblue")
```
We can define a function $D(a)$, the area of the triangle shown in blue, as follows:
$$D(a) = \int_0^a \frac{1}{2}x \, dx$$
$$D(a) = \frac{x^2}{4}\Big|_0^a = [\frac{a^2}{4} - \frac{0^2}{4}] = \frac{a^2}{4}$$
Computing, the area shown in blue: $D(6) = 9$. To check, we can compute the area of the triangle under $d(x)$ from $x=0$ to $x=6$.
The area of a triangle is $\frac{1}{2}\space base \times height$
The height is: $d(6) = \frac{1}{2} \times 6 = 3$
The base is 6, so the area is $\frac{1}{2} \times 6 \times 3 = 9$.
### c. Cumulative Distribution Functions (CDFs)
Pay close allention to the indices on the integral and the variables of the function being integrated. See p. 148 in Devore. *F(x)* is a function of *x*, which is the upper index on the integral. *F(x)* is the cumulative area under the curve from $-\infty$ to *x* of the probability density function $f(y)$.
$$F(x) = P(X\le x) = \int_{-\infty}^x f(y)\,dy$$
### d. Double Integrals
$$D(a,b) = \int_{x=0}^a \int_{y=0}^b f(x,y) \, dy \, dx = \int_{y=0}^b \int_{x=0}^a f(x,y) \, dx \, dy$$
Intuitively, if you are going to compute the volume of a solid, it does not matter whether you compute (or measure) along the x-axis or y-axis first, you will get the same answer. That's what this equation says. On the left side, you first integrate over $y$, then over $x$, working your way out. On the right side, you integrate over $x$ first, and over $y$.
Here's an example:
<!-- (the '&' marks the point in the equation that gets aligned vertically) -->
<!-- by the way this is a comment in an R Markdown file... -->
<!-- also, I'd like to have line breaks between the lines of this block, so if you figure that out, let me know that too -->
$$\begin{aligned}
f(x,y) &= 2xy \\
F(a,b) &= \int_{y=0}^a \int_{x=0}^b 2xy \, dx \, dy\\
&=\int_{y=0}^a 2y \int_{x=0}^b x \, dx \, dy \\
&=\int_{y=0}^a 2y \, \frac{x^2}{2}\Big|_0^b \, dy \\
&=\int_{y=0}^a 2y \, [\frac{b^2}{2}- \frac{0^2}{2}] \, dy \\
&=\int_{y=0}^a y \, b^2 dy \\
&= b^2 \, \int_{y=0}^a y \, dy \\
&= b^2 \, \frac{y^2}{2}\Big|_0^a \\
&= \frac{a^2b^2}{2}
\end{aligned}$$
\pagebreak
### e. Double Summations
Note that for a discrete function, $p(x,y)$, the equivalent functional representation would be:
$$D(a,b) = \sum_{x=0}^a \,\sum_{y=0}^b \,p(x,y) = \sum_{y=0}^b \, \sum_{x=0}^a \, p(x,y)$$
\pagebreak
## 7. Histograms and Density/Mass Functions
Consider the set of Data Points (7, 8, 5, 10, 5, 6, 8, 7, 8, 7, 4, 9, 6, 7, 6, 9, 6, 7, 12, 9).
### a. Stem-and-leaf plot - a horizontal histogram
Let's order and bin the data:
4 \newline
5 5 \newline
6 6 6 6 \newline
7 7 7 7 7 \newline
8 8 8 \newline
9 9 9 \newline
10 \newline
12
### b. Histogram vs. Density Function
Note the differences in the following two plots, and look at the R code to see how similar they are. Can you spot the difference? Also note how you can include code as part of of your reporting. This will be a basic skill to master.
```{r histogram}
x <- c(7,8,5,10,5,6,8,7,8,7,4,9,6,7,6,9,6,7,12,9)
hist(x, col="lightblue", main="Histogram of x", breaks = seq(3.5, 12.5, by=1))
```
Why did I include this statement "breaks = seq(3.5, 12.5, by=1)" in the code? What is its function?
Please use this pattern, where appropriate, when you generate histrograms.
\pagebreak
```{r histogram2}
#
# Note: this is the place to comment your code, NOT the place to write analysis that is
# part of your report.
#
# As a data scientist, write your report for readers who may not be able to read
# your code. Putting prose in the code comments like this is a TERRIBLE idea.
# Please don't do it. However, commenting your code so that others can understand
# it, and so that you can understand it later, is a fabulous idea.
#
# The seq(3.5, 12.5, by=1) comment offsets the histogram by 0.5 so that the labels appear
# directly under the center of the bars - an example of a useful code comment
# and the answer to the question on the previous page.
#
hist(x, prob=TRUE, col="lightblue", main="PMF of x", breaks = seq(3.5, 12.5, by=1))
```
*Probability Density Functions* are for continuous input variables; *Probability Mass Functions* are for discrete input variables. Devore makes this distinction (pp 100, 143). However, Woolridge does not make this clear distinction, nor does R (as you can see in the graph above). If you want to get this perfect, override the default y-axis name when generating histograms using hist in R.