Mathematical Miscellany
[latex]\newcommand{\IQR}{\mbox{IQR}} \newcommand{\pr}[1]{P(#1)} \newcommand{\var}[1]{\mbox{var}(#1)} \newcommand{\mean}[1]{\mbox{E}(#1)} \newcommand{\sd}[1]{\mbox{sd}(#1)} \newcommand{\Binomial}[3]{#1 \sim \mbox{Binomial}(#2,#3)} \newcommand{\Student}[2]{#1 \sim \mbox{Student}(#2)} \newcommand{\Normal}[3]{#1 \sim \mbox{Normal}(#2,#3)} \newcommand{\Poisson}[2]{#1 \sim \mbox{Poisson}(#2)} \newcommand{\se}[1]{\mbox{se}(#1)} \newcommand{\prbig}[1]{P\left(#1\right)} \newcommand{\degc}{$^{\circ}$C}[/latex]
This appendix gives a little background on some of the functions used in statistics, as well as “computing formulas” for some of the statistics we have used.
Logarithms
The logarithm of [latex]x[/latex] to the base [latex]b[/latex] is the number [latex]y[/latex] such that [latex]b^y = x[/latex], and is denoted by [latex]\log_b (x)[/latex]. For example, [latex]\log_{10}(100) = 2[/latex] and [latex]\log_{2}(0.25) = -2[/latex].
Logarithms obey some simple rules, all of which have been used several times in this book. The first rule,
\[ \log_b (xy) = \log_b (x) + \log_b (y), \]
says that the logarithm of a product is the sum of the logarithms. That is, logarithms turn multiplication into addition. Similarly,
\[ \log_b (x/y) = \log_b (x) - \log_b (y), \]
so the logarithm of a ratio is the difference of the logarithms. From these rules it is not hard to see that logarithms turn powers into multiplication, so that
\[ \log_b (x^y) = y \log_b (x). \]
Sometimes you might want a logarithm to the base [latex]c[/latex] but your calculator can only give values for base [latex]b[/latex]. In this case, the change of base formula
\[ \log_c (x) = \frac{\log_b(x)}{\log_b(c)} \]
can be used. For example, [latex]\log_2(16) = \log_{10}(16)/\log_{10}(2) = 1.204/0.301 = 4[/latex].
In many of our applications of logarithms, we work with these formulas to find a value [latex]y[/latex] for [latex]\log_{b}(x)[/latex]. We are usually interested in the value [latex]x[/latex], rather than its logarithm, and we can find it simply by calculating [latex]x = b^y[/latex].
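The rules above are easy to check numerically. The following Python sketch is only an illustration: the helper name `log_base` and the test values 69 and 242 are choices made here, and it relies on the standard `math` module.

```python
import math

def log_base(x, c, b=10):
    """Logarithm of x to base c, computed from base-b logarithms
    using the change of base formula log_c(x) = log_b(x) / log_b(c)."""
    return math.log(x, b) / math.log(c, b)

x, y = 69.0, 242.0

# Product rule: log(xy) = log(x) + log(y)
print(math.log10(x * y), math.log10(x) + math.log10(y))

# Ratio rule: log(x/y) = log(x) - log(y)
print(math.log10(x / y), math.log10(x) - math.log10(y))

# Power rule: log(x^y) = y log(x), here with the power 5
print(math.log10(x ** 5), 5 * math.log10(x))

# Change of base: log_2(16) from base-10 logarithms
print(log_base(16, 2))

# Recovering x from its logarithm: x = b^y
print(10 ** math.log10(x))
```

Each printed pair should agree up to floating-point rounding.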
Natural Logarithms
Two popular bases for logarithms are 10 and [latex]e[/latex]. Base 10 is useful since it is directly related to our decimal number system. For example, if someone tells you the [latex]\log_{10}[/latex] of a number is 2.1 then you know the number is a bit more than 100. We’ll also see how to exploit this relationship below when using logarithm tables.
Base [latex]e[/latex] logarithms, known as natural logarithms, are very important in mathematics because of a fundamental role they play in calculus. The natural logarithm of [latex]x[/latex], written as [latex]\log_{e}(x)[/latex] or [latex]\ln(x)[/latex], is defined as the area under the hyperbola [latex]y = \frac{1}{x}[/latex] between 1 and [latex]x[/latex], and [latex]e[/latex] is defined to be the number such that the area between 1 and [latex]e[/latex] is 1. It is amazing that this definition of logarithms, involving the area under a curve, matches up with the notion of logarithms in terms of powers given above.
The inverse of the natural logarithm function, the function that gives the [latex]x[/latex] value such that the area under the hyperbola between 1 and [latex]x[/latex] is [latex]y[/latex], is called the exponential function, written [latex]\exp(y)[/latex] or [latex]e^y[/latex]. There is a simple formula for [latex]\exp(y)[/latex],
\[ \exp(y) = 1 + \frac{y}{1!} + \frac{y^2}{2!} + \frac{y^3}{3!} + \frac{y^4}{4!} + \cdots, \]
though you have to keep summing to infinity to get the exact answer. The exponential function has some remarkable properties, such as being its own derivative, which make it a fundamental part of mathematical modelling.
For this reason, the number [latex]e[/latex] is a very important constant, second only to [latex]\pi[/latex] in fame. Like [latex]\pi[/latex], the area of the unit circle, [latex]e[/latex] is irrational and transcendental. (An irrational number is one that cannot be written as a ratio of whole numbers, while a transcendental number is one that cannot be expressed as the solution of a polynomial equation in [latex]x[/latex] with whole-number coefficients (Newman, 1997; Niven, Zuckerman, & Montgomery, 1991). For example, [latex]\sqrt{2}[/latex] is irrational but is not transcendental, since it is a solution of [latex]x^2 = 2[/latex].) Using the above formula for [latex]\exp(y)[/latex] we have that
\[ e = \exp(1) = 1 + 1 + \frac{1}{2!} + \frac{1}{3!}+ \frac{1}{4!} + \cdots. \]
The table below gives the value of [latex]e[/latex] obtained by adding up the first 739 terms of this infinite sum.
1800 decimal places of [latex]e[/latex]
2.71828 | 18284 | 59045 | 23536 | 02874 | 71352 | 66249 | 77572 | 47093 | 69995 | 95749 | 66967 |
62772 | 40766 | 30353 | 54759 | 45713 | 82178 | 52516 | 64274 | 27466 | 39193 | 20030 | 59921 |
81741 | 35966 | 29043 | 57290 | 03342 | 95260 | 59563 | 07381 | 32328 | 62794 | 34907 | 63233 |
82988 | 07531 | 95251 | 01901 | 15738 | 34187 | 93070 | 21540 | 89149 | 93488 | 41675 | 09244 |
76146 | 06680 | 82264 | 80016 | 84774 | 11853 | 74234 | 54424 | 37107 | 53907 | 77449 | 92069 |
55170 | 27618 | 38606 | 26133 | 13845 | 83000 | 75204 | 49338 | 26560 | 29760 | 67371 | 13200 |
70932 | 87091 | 27443 | 74704 | 72306 | 96977 | 20931 | 01416 | 92836 | 81902 | 55151 | 08657 |
46377 | 21112 | 52389 | 78442 | 50569 | 53696 | 77078 | 54499 | 69967 | 94686 | 44549 | 05987 |
93163 | 68892 | 30098 | 79312 | 77361 | 78215 | 42499 | 92295 | 76351 | 48220 | 82698 | 95193 |
66803 | 31825 | 28869 | 39849 | 64651 | 05820 | 93923 | 98294 | 88793 | 32036 | 25094 | 43117 |
30123 | 81970 | 68416 | 14039 | 70198 | 37679 | 32068 | 32823 | 76464 | 80429 | 53118 | 02328 |
78250 | 98194 | 55815 | 30175 | 67173 | 61332 | 06981 | 12509 | 96181 | 88159 | 30416 | 90351 |
59888 | 85193 | 45807 | 27386 | 67385 | 89422 | 87922 | 84998 | 92086 | 80582 | 57492 | 79610 |
48419 | 84443 | 63463 | 24496 | 84875 | 60233 | 62482 | 70419 | 78623 | 20900 | 21609 | 90235 |
30436 | 99418 | 49146 | 31409 | 34317 | 38143 | 64054 | 62531 | 52096 | 18369 | 08887 | 07016 |
76839 | 64243 | 78140 | 59271 | 45635 | 49061 | 30310 | 72085 | 10383 | 75051 | 01157 | 47704 |
17189 | 86106 | 87396 | 96552 | 12671 | 54688 | 95703 | 50354 | 02123 | 40784 | 98193 | 34321 |
06817 | 01210 | 05627 | 88023 | 51930 | 33224 | 74501 | 58539 | 04730 | 41995 | 77770 | 93503 |
66041 | 69973 | 29725 | 08868 | 76966 | 40355 | 57071 | 62268 | 44716 | 25607 | 98826 | 51787 |
13419 | 51246 | 65201 | 03059 | 21236 | 67719 | 43252 | 78675 | 39855 | 89448 | 96970 | 96409 |
75459 | 18569 | 56380 | 23637 | 01621 | 12047 | 74272 | 28364 | 89613 | 42251 | 64450 | 78182 |
44235 | 29486 | 36372 | 14174 | 02388 | 93441 | 24796 | 35743 | 70263 | 75529 | 44483 | 37998 |
01612 | 54922 | 78509 | 25778 | 25620 | 92622 | 64832 | 62779 | 33386 | 56648 | 16277 | 25164 |
01910 | 59004 | 91644 | 99828 | 93150 | 56604 | 72580 | 27786 | 31864 | 15519 | 56532 | 44258 |
69829 | 46959 | 30801 | 91529 | 87211 | 72556 | 34754 | 63964 | 47910 | 14590 | 40905 | 86298 |
49679 | 12874 | 06870 | 50489 | 58586 | 71747 | 98546 | 67757 | 57320 | 56812 | 88459 | 20541 |
33405 | 39220 | 00113 | 78630 | 09455 | 60688 | 16674 | 00169 | 84205 | 58040 | 33637 | 95376 |
45203 | 04024 | 32256 | 61352 | 78369 | 51177 | 88386 | 38744 | 39662 | 53224 | 98506 | 54995 |
88623 | 42818 | 99707 | 73327 | 61717 | 83928 | 03494 | 65014 | 34558 | 89707 | 19425 | 86398 |
77275 | 47109 | 62953 | 74152 | 11151 | 36835 | 06275 | 26023 | 26484 | 72870 | 39207 | 64310 |
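The series for [latex]e[/latex] can be summed with whole-number arithmetic to any number of decimal places. The Python sketch below is one possible way to do this, not necessarily how the table above was produced. The function name `e_digits` and the `guard` parameter (extra working digits that absorb rounding in the individual terms) are assumptions made for this illustration.

```python
def e_digits(places, guard=10):
    """Return e truncated to `places` decimal places, as a string,
    by summing 1/0! + 1/1! + 1/2! + ... in scaled integer arithmetic."""
    scale = 10 ** (places + guard)   # work with integers representing value * scale
    total = 0
    term = scale                     # the k = 0 term, 1/0! = 1
    k = 0
    while term > 0:                  # stop once terms fall below the working precision
        total += term
        k += 1
        term //= k                   # next term: previous term divided by k
    digits = str(total // 10 ** guard)   # drop the guard digits
    return digits[0] + "." + digits[1:]

print(e_digits(50))
```

Because the terms shrink like [latex]1/k![/latex], only a few dozen terms are needed for 50 decimal places. The result can be compared with the first row of the table.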
For most of the transformations in this book, it does not matter which logarithm base you use. Typically 10 is used, since it is easy to interpret, unless there is a mathematical reason for using base [latex]e[/latex]. Unless otherwise specified, [latex]\log(x)[/latex] will be used for [latex]\log_{10}(x)[/latex] and [latex]\ln(x)[/latex] will be used for [latex]\log_e(x)[/latex].
Log Tables
In the days before calculators, logarithms and their properties were an important part of life through logarithm tables and slide rules. The reason for this is that it is fairly easy to add two numbers together but it is usually harder to multiply two numbers, particularly when they are large. Since logarithms turn multiplication into addition they can be used to convert the harder problem into the easier problem.
The next table gives a small table of logarithms to the base 10. Suppose we want to multiply 69 times 242. The table only gives logs for numbers between 1 and 10, but using our rules we can find
\[ \log_{10} (69) = \log_{10}(10 \times 6.9) = \log_{10}(10) + \log_{10}(6.9) = 1 + 0.839 = 1.839. \]
In general, you simply move the decimal point to the left and add 1 to your log for each step (since these are base 10). Similarly, [latex]\log_{10}(242) = 2.384[/latex] since the table gives [latex]\log_{10}(2.42) = 0.384[/latex]. Together we find that
\[ \log_{10}(69 \times 242) = \log_{10}(69) + \log_{10}(242) = 1.839 + 2.384 = 4.223. \]
We can use the log table in reverse to find that [latex]10^{0.223} = 1.67[/latex]. Thus
\[ 10^{4.223} = 10^4 \times 10^{0.223} \simeq 10000 \times 1.67 = 16700. \]
Hence our value for 69 times 242 is 16700. The real value is 16698, so this is not too bad, limited in accuracy by having tables with only 3 digits.
To calculate 69 divided by 242, we use
\[ \log_{10}(69/242) = \log_{10}(69) - \log_{10}(242) = 1.839 - 2.384 = -0.545. \]
Now [latex]-0.545 = -1 + 0.455[/latex]. Using the log table in reverse gives [latex]10^{0.455} = 2.85[/latex], so
\[ \frac{69}{242} = 10^{-1} \times 10^{0.455} = 0.1 \times 2.85 = 0.285. \]
The correct answer is 0.2851.
As a final example, suppose you wanted to find [latex]0.0365^5[/latex]. The log table gives [latex]\log_{10}(3.65) = 0.562[/latex] so
\[ \log_{10}(0.0365^5) = 5 \log_{10}(0.0365) = 5(-2 + 0.562) = 5 \times -1.438 = -7.190. \]
This is [latex]-8 + 0.810[/latex] and using the table below in reverse gives [latex]10^{0.810} = 6.455[/latex] (averaging the two values, 6.45 and 6.46, that give 0.810 in the table). Thus
\[ 0.0365^5 = 6.455 \times 10^{-8} = 0.00000006455. \]
Logarithms
[latex]x[/latex] | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
---|---|---|---|---|---|---|---|---|---|---|
1.0 | 0.000 | 0.004 | 0.009 | 0.013 | 0.017 | 0.021 | 0.025 | 0.029 | 0.033 | 0.037 |
1.1 | 0.041 | 0.045 | 0.049 | 0.053 | 0.057 | 0.061 | 0.064 | 0.068 | 0.072 | 0.076 |
1.2 | 0.079 | 0.083 | 0.086 | 0.090 | 0.093 | 0.097 | 0.100 | 0.104 | 0.107 | 0.111 |
1.3 | 0.114 | 0.117 | 0.121 | 0.124 | 0.127 | 0.130 | 0.134 | 0.137 | 0.140 | 0.143 |
1.4 | 0.146 | 0.149 | 0.152 | 0.155 | 0.158 | 0.161 | 0.164 | 0.167 | 0.170 | 0.173 |
1.5 | 0.176 | 0.179 | 0.182 | 0.185 | 0.188 | 0.190 | 0.193 | 0.196 | 0.199 | 0.201 |
1.6 | 0.204 | 0.207 | 0.210 | 0.212 | 0.215 | 0.217 | 0.220 | 0.223 | 0.225 | 0.228 |
1.7 | 0.230 | 0.233 | 0.236 | 0.238 | 0.241 | 0.243 | 0.246 | 0.248 | 0.250 | 0.253 |
1.8 | 0.255 | 0.258 | 0.260 | 0.262 | 0.265 | 0.267 | 0.270 | 0.272 | 0.274 | 0.276 |
1.9 | 0.279 | 0.281 | 0.283 | 0.286 | 0.288 | 0.290 | 0.292 | 0.294 | 0.297 | 0.299 |
2.0 | 0.301 | 0.303 | 0.305 | 0.307 | 0.310 | 0.312 | 0.314 | 0.316 | 0.318 | 0.320 |
2.1 | 0.322 | 0.324 | 0.326 | 0.328 | 0.330 | 0.332 | 0.334 | 0.336 | 0.338 | 0.340 |
2.2 | 0.342 | 0.344 | 0.346 | 0.348 | 0.350 | 0.352 | 0.354 | 0.356 | 0.358 | 0.360 |
2.3 | 0.362 | 0.364 | 0.365 | 0.367 | 0.369 | 0.371 | 0.373 | 0.375 | 0.377 | 0.378 |
2.4 | 0.380 | 0.382 | 0.384 | 0.386 | 0.387 | 0.389 | 0.391 | 0.393 | 0.394 | 0.396 |
2.5 | 0.398 | 0.400 | 0.401 | 0.403 | 0.405 | 0.407 | 0.408 | 0.410 | 0.412 | 0.413 |
2.6 | 0.415 | 0.417 | 0.418 | 0.420 | 0.422 | 0.423 | 0.425 | 0.427 | 0.428 | 0.430 |
2.7 | 0.431 | 0.433 | 0.435 | 0.436 | 0.438 | 0.439 | 0.441 | 0.442 | 0.444 | 0.446 |
2.8 | 0.447 | 0.449 | 0.450 | 0.452 | 0.453 | 0.455 | 0.456 | 0.458 | 0.459 | 0.461 |
2.9 | 0.462 | 0.464 | 0.465 | 0.467 | 0.468 | 0.470 | 0.471 | 0.473 | 0.474 | 0.476 |
3.0 | 0.477 | 0.479 | 0.480 | 0.481 | 0.483 | 0.484 | 0.486 | 0.487 | 0.489 | 0.490 |
3.1 | 0.491 | 0.493 | 0.494 | 0.496 | 0.497 | 0.498 | 0.500 | 0.501 | 0.502 | 0.504 |
3.2 | 0.505 | 0.507 | 0.508 | 0.509 | 0.511 | 0.512 | 0.513 | 0.515 | 0.516 | 0.517 |
3.3 | 0.519 | 0.520 | 0.521 | 0.522 | 0.524 | 0.525 | 0.526 | 0.528 | 0.529 | 0.530 |
3.4 | 0.531 | 0.533 | 0.534 | 0.535 | 0.537 | 0.538 | 0.539 | 0.540 | 0.542 | 0.543 |
3.5 | 0.544 | 0.545 | 0.547 | 0.548 | 0.549 | 0.550 | 0.551 | 0.553 | 0.554 | 0.555 |
3.6 | 0.556 | 0.558 | 0.559 | 0.560 | 0.561 | 0.562 | 0.563 | 0.565 | 0.566 | 0.567 |
3.7 | 0.568 | 0.569 | 0.571 | 0.572 | 0.573 | 0.574 | 0.575 | 0.576 | 0.577 | 0.579 |
3.8 | 0.580 | 0.581 | 0.582 | 0.583 | 0.584 | 0.585 | 0.587 | 0.588 | 0.589 | 0.590 |
3.9 | 0.591 | 0.592 | 0.593 | 0.594 | 0.595 | 0.597 | 0.598 | 0.599 | 0.600 | 0.601 |
4.0 | 0.602 | 0.603 | 0.604 | 0.605 | 0.606 | 0.607 | 0.609 | 0.610 | 0.611 | 0.612 |
4.1 | 0.613 | 0.614 | 0.615 | 0.616 | 0.617 | 0.618 | 0.619 | 0.620 | 0.621 | 0.622 |
4.2 | 0.623 | 0.624 | 0.625 | 0.626 | 0.627 | 0.628 | 0.629 | 0.630 | 0.631 | 0.632 |
4.3 | 0.633 | 0.634 | 0.635 | 0.636 | 0.637 | 0.638 | 0.639 | 0.640 | 0.641 | 0.642 |
4.4 | 0.643 | 0.644 | 0.645 | 0.646 | 0.647 | 0.648 | 0.649 | 0.650 | 0.651 | 0.652 |
4.5 | 0.653 | 0.654 | 0.655 | 0.656 | 0.657 | 0.658 | 0.659 | 0.660 | 0.661 | 0.662 |
4.6 | 0.663 | 0.664 | 0.665 | 0.666 | 0.667 | 0.667 | 0.668 | 0.669 | 0.670 | 0.671 |
4.7 | 0.672 | 0.673 | 0.674 | 0.675 | 0.676 | 0.677 | 0.678 | 0.679 | 0.679 | 0.680 |
4.8 | 0.681 | 0.682 | 0.683 | 0.684 | 0.685 | 0.686 | 0.687 | 0.688 | 0.688 | 0.689 |
4.9 | 0.690 | 0.691 | 0.692 | 0.693 | 0.694 | 0.695 | 0.695 | 0.696 | 0.697 | 0.698 |
5.0 | 0.699 | 0.700 | 0.701 | 0.702 | 0.702 | 0.703 | 0.704 | 0.705 | 0.706 | 0.707 |
5.1 | 0.708 | 0.708 | 0.709 | 0.710 | 0.711 | 0.712 | 0.713 | 0.713 | 0.714 | 0.715 |
5.2 | 0.716 | 0.717 | 0.718 | 0.719 | 0.719 | 0.720 | 0.721 | 0.722 | 0.723 | 0.723 |
5.3 | 0.724 | 0.725 | 0.726 | 0.727 | 0.728 | 0.728 | 0.729 | 0.730 | 0.731 | 0.732 |
5.4 | 0.732 | 0.733 | 0.734 | 0.735 | 0.736 | 0.736 | 0.737 | 0.738 | 0.739 | 0.740 |
5.5 | 0.740 | 0.741 | 0.742 | 0.743 | 0.744 | 0.744 | 0.745 | 0.746 | 0.747 | 0.747 |
5.6 | 0.748 | 0.749 | 0.750 | 0.751 | 0.751 | 0.752 | 0.753 | 0.754 | 0.754 | 0.755 |
5.7 | 0.756 | 0.757 | 0.757 | 0.758 | 0.759 | 0.760 | 0.760 | 0.761 | 0.762 | 0.763 |
5.8 | 0.763 | 0.764 | 0.765 | 0.766 | 0.766 | 0.767 | 0.768 | 0.769 | 0.769 | 0.770 |
5.9 | 0.771 | 0.772 | 0.772 | 0.773 | 0.774 | 0.775 | 0.775 | 0.776 | 0.777 | 0.777 |
6.0 | 0.778 | 0.779 | 0.780 | 0.780 | 0.781 | 0.782 | 0.782 | 0.783 | 0.784 | 0.785 |
6.1 | 0.785 | 0.786 | 0.787 | 0.787 | 0.788 | 0.789 | 0.790 | 0.790 | 0.791 | 0.792 |
6.2 | 0.792 | 0.793 | 0.794 | 0.794 | 0.795 | 0.796 | 0.797 | 0.797 | 0.798 | 0.799 |
6.3 | 0.799 | 0.800 | 0.801 | 0.801 | 0.802 | 0.803 | 0.803 | 0.804 | 0.805 | 0.806 |
6.4 | 0.806 | 0.807 | 0.808 | 0.808 | 0.809 | 0.810 | 0.810 | 0.811 | 0.812 | 0.812 |
6.5 | 0.813 | 0.814 | 0.814 | 0.815 | 0.816 | 0.816 | 0.817 | 0.818 | 0.818 | 0.819 |
6.6 | 0.820 | 0.820 | 0.821 | 0.822 | 0.822 | 0.823 | 0.823 | 0.824 | 0.825 | 0.825 |
6.7 | 0.826 | 0.827 | 0.827 | 0.828 | 0.829 | 0.829 | 0.830 | 0.831 | 0.831 | 0.832 |
6.8 | 0.833 | 0.833 | 0.834 | 0.834 | 0.835 | 0.836 | 0.836 | 0.837 | 0.838 | 0.838 |
6.9 | 0.839 | 0.839 | 0.840 | 0.841 | 0.841 | 0.842 | 0.843 | 0.843 | 0.844 | 0.844 |
7.0 | 0.845 | 0.846 | 0.846 | 0.847 | 0.848 | 0.848 | 0.849 | 0.849 | 0.850 | 0.851 |
7.1 | 0.851 | 0.852 | 0.852 | 0.853 | 0.854 | 0.854 | 0.855 | 0.856 | 0.856 | 0.857 |
7.2 | 0.857 | 0.858 | 0.859 | 0.859 | 0.860 | 0.860 | 0.861 | 0.862 | 0.862 | 0.863 |
7.3 | 0.863 | 0.864 | 0.865 | 0.865 | 0.866 | 0.866 | 0.867 | 0.867 | 0.868 | 0.869 |
7.4 | 0.869 | 0.870 | 0.870 | 0.871 | 0.872 | 0.872 | 0.873 | 0.873 | 0.874 | 0.874 |
7.5 | 0.875 | 0.876 | 0.876 | 0.877 | 0.877 | 0.878 | 0.879 | 0.879 | 0.880 | 0.880 |
7.6 | 0.881 | 0.881 | 0.882 | 0.883 | 0.883 | 0.884 | 0.884 | 0.885 | 0.885 | 0.886 |
7.7 | 0.886 | 0.887 | 0.888 | 0.888 | 0.889 | 0.889 | 0.890 | 0.890 | 0.891 | 0.892 |
7.8 | 0.892 | 0.893 | 0.893 | 0.894 | 0.894 | 0.895 | 0.895 | 0.896 | 0.897 | 0.897 |
7.9 | 0.898 | 0.898 | 0.899 | 0.899 | 0.900 | 0.900 | 0.901 | 0.901 | 0.902 | 0.903 |
8.0 | 0.903 | 0.904 | 0.904 | 0.905 | 0.905 | 0.906 | 0.906 | 0.907 | 0.907 | 0.908 |
8.1 | 0.908 | 0.909 | 0.910 | 0.910 | 0.911 | 0.911 | 0.912 | 0.912 | 0.913 | 0.913 |
8.2 | 0.914 | 0.914 | 0.915 | 0.915 | 0.916 | 0.916 | 0.917 | 0.918 | 0.918 | 0.919 |
8.3 | 0.919 | 0.920 | 0.920 | 0.921 | 0.921 | 0.922 | 0.922 | 0.923 | 0.923 | 0.924 |
8.4 | 0.924 | 0.925 | 0.925 | 0.926 | 0.926 | 0.927 | 0.927 | 0.928 | 0.928 | 0.929 |
8.5 | 0.929 | 0.930 | 0.930 | 0.931 | 0.931 | 0.932 | 0.932 | 0.933 | 0.933 | 0.934 |
8.6 | 0.934 | 0.935 | 0.936 | 0.936 | 0.937 | 0.937 | 0.938 | 0.938 | 0.939 | 0.939 |
8.7 | 0.940 | 0.940 | 0.941 | 0.941 | 0.942 | 0.942 | 0.943 | 0.943 | 0.943 | 0.944 |
8.8 | 0.944 | 0.945 | 0.945 | 0.946 | 0.946 | 0.947 | 0.947 | 0.948 | 0.948 | 0.949 |
8.9 | 0.949 | 0.950 | 0.950 | 0.951 | 0.951 | 0.952 | 0.952 | 0.953 | 0.953 | 0.954 |
9.0 | 0.954 | 0.955 | 0.955 | 0.956 | 0.956 | 0.957 | 0.957 | 0.958 | 0.958 | 0.959 |
9.1 | 0.959 | 0.960 | 0.960 | 0.960 | 0.961 | 0.961 | 0.962 | 0.962 | 0.963 | 0.963 |
9.2 | 0.964 | 0.964 | 0.965 | 0.965 | 0.966 | 0.966 | 0.967 | 0.967 | 0.968 | 0.968 |
9.3 | 0.968 | 0.969 | 0.969 | 0.970 | 0.970 | 0.971 | 0.971 | 0.972 | 0.972 | 0.973 |
9.4 | 0.973 | 0.974 | 0.974 | 0.975 | 0.975 | 0.975 | 0.976 | 0.976 | 0.977 | 0.977 |
9.5 | 0.978 | 0.978 | 0.979 | 0.979 | 0.980 | 0.980 | 0.980 | 0.981 | 0.981 | 0.982 |
9.6 | 0.982 | 0.983 | 0.983 | 0.984 | 0.984 | 0.985 | 0.985 | 0.985 | 0.986 | 0.986 |
9.7 | 0.987 | 0.987 | 0.988 | 0.988 | 0.989 | 0.989 | 0.989 | 0.990 | 0.990 | 0.991 |
9.8 | 0.991 | 0.992 | 0.992 | 0.993 | 0.993 | 0.993 | 0.994 | 0.994 | 0.995 | 0.995 |
9.9 | 0.996 | 0.996 | 0.997 | 0.997 | 0.997 | 0.998 | 0.998 | 0.999 | 0.999 | 1.000 |
This table gives [latex]\log_{10}(x)[/latex] for [latex]1 \leq x \lt 10[/latex].
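The table-based calculations for 69 times 242 and 69 divided by 242 can be imitated in code. This Python sketch mimics a 3-digit table by rounding. The function names `table_log` and `table_antilog` are invented here, and rounding the reverse lookup to 2 decimal places is only an approximation to reading the nearest table entry.

```python
import math

def table_log(x):
    """Mimic a 3-digit log table: log10(x) rounded to 3 decimal places."""
    return round(math.log10(x), 3)

def table_antilog(y):
    """Use the table 'in reverse': split y into a whole part and a
    fractional part in [0, 1), then round 10**fraction to 2 decimals."""
    whole = math.floor(y)
    fraction = y - whole
    return round(10 ** fraction, 2) * 10.0 ** whole

# Multiplication becomes addition of logarithms
product = table_antilog(table_log(69) + table_log(242))
# Division becomes subtraction of logarithms
quotient = table_antilog(table_log(69) - table_log(242))

print(product, 69 * 242)
print(quotient, 69 / 242)
```

Each line prints the table-style estimate next to the directly computed value, so the loss of accuracy from 3-digit logarithms is visible.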
Factorials
The factorial of [latex]n[/latex], written [latex]n![/latex], is defined by
\[ n! = n \times (n-1) \times \cdots \times 2 \times 1. \]
In terms of counting, [latex]n![/latex] gives the number of ways that [latex]n[/latex] objects can be arranged in order. If you have people numbered 1 to 10 then there are 10! = 3628800 different ways you can place them in a line. Factorials get very large as [latex]n[/latex] increases and most calculators will decline to give you anything bigger than [latex]69![/latex].
The table below gives the value of [latex]739![/latex], a number with 1801 digits. Compare this to the random digits given in Chapter 2 or the 1800 decimal places of [latex]e[/latex] shown earlier in this appendix.
Digits of 739!
65777 | 74393 | 87227 | 82976 | 77590 | 72639 | 47193 | 98743 | 92760 | 94504 | 84743 | 64308 |
73699 | 73298 | 10727 | 48216 | 64078 | 22537 | 39417 | 56481 | 22808 | 00936 | 43565 | 08067 |
64735 | 00062 | 27685 | 77794 | 83336 | 84682 | 23940 | 01890 | 23966 | 85231 | 08912 | 15161 |
49057 | 43094 | 55453 | 08258 | 47461 | 25323 | 06103 | 29184 | 57905 | 44808 | 67757 | 57569 |
76706 | 28099 | 02000 | 85074 | 80491 | 47433 | 91302 | 81886 | 39132 | 43597 | 07362 | 89668 |
41158 | 65621 | 69966 | 26353 | 28002 | 72375 | 76125 | 50461 | 47685 | 27169 | 34080 | 47075 |
87254 | 02486 | 23490 | 23082 | 89110 | 60643 | 64925 | 30899 | 90130 | 89274 | 03875 | 91422 |
78509 | 02954 | 90984 | 63963 | 64871 | 25836 | 04319 | 52328 | 87702 | 32602 | 75245 | 03865 |
51244 | 26241 | 50212 | 40475 | 60401 | 78440 | 93955 | 78829 | 25177 | 33611 | 56263 | 74980 |
44091 | 66667 | 06634 | 94137 | 42804 | 67737 | 64888 | 92090 | 71228 | 08194 | 12326 | 76596 |
82985 | 72392 | 11666 | 57048 | 07030 | 64589 | 29110 | 05555 | 38079 | 00007 | 43449 | 49296 |
15667 | 60971 | 05284 | 56944 | 92298 | 59859 | 12684 | 22186 | 94954 | 22183 | 75208 | 74082 |
67127 | 59912 | 72131 | 66808 | 59621 | 14393 | 25131 | 72127 | 88965 | 47410 | 42645 | 82315 |
88663 | 61616 | 47249 | 34668 | 82461 | 12366 | 87988 | 74228 | 16382 | 66467 | 99109 | 75705 |
11726 | 72920 | 37565 | 00061 | 72445 | 77521 | 88650 | 17229 | 90334 | 84793 | 90035 | 47342 |
38468 | 06538 | 67848 | 73491 | 76419 | 69887 | 23524 | 32100 | 33464 | 45914 | 50837 | 25186 |
27320 | 33480 | 11280 | 66846 | 41595 | 20597 | 71791 | 47152 | 83651 | 67201 | 69428 | 43337 |
88888 | 19759 | 70947 | 64558 | 69184 | 59498 | 23952 | 53032 | 33148 | 75478 | 93150 | 94171 |
07102 | 56869 | 48509 | 06656 | 83125 | 42705 | 19608 | 89941 | 21933 | 87455 | 03516 | 52536 |
16940 | 72997 | 00879 | 75990 | 21460 | 45613 | 05888 | 03137 | 21574 | 08620 | 28689 | 87718 |
53815 | 87059 | 77617 | 68564 | 85244 | 95272 | 73073 | 77777 | 08928 | 75826 | 03255 | 63678 |
79676 | 44580 | 59782 | 99618 | 07659 | 53358 | 98093 | 23290 | 45288 | 88975 | 07257 | 48732 |
29450 | 47003 | 26741 | 09254 | 70029 | 67187 | 08781 | 76365 | 15154 | 12984 | 77805 | 92189 |
11394 | 45092 | 38673 | 95013 | 82294 | 20659 | 10356 | 60404 | 86693 | 66763 | 49904 | 92147 |
80016 | 45757 | 83462 | 33527 | 77171 | 21596 | 20189 | 99232 | 14715 | 22829 | 08188 | 29723 |
00337 | 20179 | 14672 | 41706 | 98915 | 43968 | 36104 | 88825 | 81428 | 17655 | 34941 | 93829 |
84895 | 73300 | 35262 | 43289 | 34647 | 74506 | 51304 | 85495 | 51406 | 35380 | 81287 | 37280 |
00000 | 00000 | 00000 | 00000 | 00000 | 00000 | 00000 | 00000 | 00000 | 00000 | 00000 | 00000 |
00000 | 00000 | 00000 | 00000 | 00000 | 00000 | 00000 | 00000 | 00000 | 00000 | 00000 | 00000 |
00000 | 00000 | 00000 | 00000 | 00000 | 00000 | 00000 | 00000 | 00000 | 00000 | 00000 | 000000 |
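The factorial values quoted in this section can be checked with exact integer arithmetic. The short Python sketch below uses the standard `math.factorial` function. Python integers have arbitrary precision, which is an assumption about the tool rather than anything specific to this book.

```python
import math

# 10 people can be placed in a line in 10! different orders
print(math.factorial(10))

# 739! is computed exactly, so its digits can be counted and inspected
digits = str(math.factorial(739))
print(len(digits))    # number of digits in 739!
print(digits[:5])     # leading digits, for comparison with the table
```

Unlike a calculator that stops at [latex]69![/latex], exact integer arithmetic has no difficulty with numbers of this size.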
Since logarithms are useful for dealing with large numbers they are an important tool for working with factorials. The following table gives the logarithms of factorials up to [latex]169![/latex], and we can use this table to estimate Binomial coefficients. For example, suppose we want to calculate [latex]\pr{X = 25}[/latex] when [latex]\Binomial{X}{80}{p}[/latex]. The first step is to calculate the Binomial coefficient [latex]{80 \choose 25}[/latex]. Taking logarithms we have
\begin{eqnarray*}
\log\left(\frac{80!}{25! \times 55!}\right) & = & \log(80!) - (\log(25!) + \log(55!)) \\
& = & 118.855 - (25.191 + 73.104) = 20.560.
\end{eqnarray*}
Using the log table in reverse, this gives
\[ {80 \choose 25} = 10^{0.560} \times 10^{20} = 3.63 \times 10^{20}, \]
compared to the exact value of 363413731121503794368. This can then be used with the Binomial formula to calculate [latex]\pr{X = 25}[/latex], where logarithms can also be used to raise [latex]p[/latex] and [latex]1-p[/latex] to their respective powers.
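The same calculation can be sketched in Python. Here the logarithms of factorials come from the standard log-gamma function `math.lgamma`, using the fact that `lgamma(n + 1)` equals [latex]\ln(n!)[/latex], rather than from a printed table. The helper names `log10_factorial` and `log10_choose` are invented for this illustration.

```python
import math

def log10_factorial(n):
    """log10(n!) via the log-gamma function, since lgamma(n + 1) = ln(n!)."""
    return math.lgamma(n + 1) / math.log(10)

def log10_choose(n, k):
    """log10 of the Binomial coefficient n choose k."""
    return log10_factorial(n) - (log10_factorial(k) + log10_factorial(n - k))

log_coef = log10_choose(80, 25)
whole = math.floor(log_coef)          # power of 10
fraction = log_coef - whole           # looked up 'in reverse' in a log table

print(round(log_coef, 3))
print(10 ** fraction, "x 10 ^", whole)
print(math.comb(80, 25))              # exact value for comparison
```

Working with logarithms keeps every intermediate quantity small, which is why the approach was practical with tables and remains useful when factorials would overflow ordinary floating-point numbers.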
Logarithms of factorials
[latex]n[/latex] | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
---|---|---|---|---|---|---|---|---|---|---|
0 | 0.000 | 0.000 | 0.301 | 0.778 | 1.380 | 2.079 | 2.857 | 3.702 | 4.606 | 5.560 |
1 | 6.560 | 7.601 | 8.680 | 9.794 | 10.940 | 12.116 | 13.321 | 14.551 | 15.806 | 17.085 |
2 | 18.386 | 19.708 | 21.051 | 22.412 | 23.793 | 25.191 | 26.606 | 28.037 | 29.484 | 30.947 |
3 | 32.424 | 33.915 | 35.420 | 36.939 | 38.470 | 40.014 | 41.571 | 43.139 | 44.719 | 46.310 |
4 | 47.912 | 49.524 | 51.148 | 52.781 | 54.425 | 56.078 | 57.741 | 59.413 | 61.094 | 62.784 |
5 | 64.483 | 66.191 | 67.907 | 69.631 | 71.363 | 73.104 | 74.852 | 76.608 | 78.371 | 80.142 |
6 | 81.920 | 83.706 | 85.498 | 87.297 | 89.103 | 90.916 | 92.736 | 94.562 | 96.394 | 98.233 |
7 | 100.078 | 101.930 | 103.787 | 105.650 | 107.520 | 109.395 | 111.275 | 113.162 | 115.054 | 116.952 |
8 | 118.855 | 120.763 | 122.677 | 124.596 | 126.520 | 128.450 | 130.384 | 132.324 | 134.268 | 136.218 |
9 | 138.172 | 140.131 | 142.095 | 144.063 | 146.036 | 148.014 | 149.996 | 151.983 | 153.974 | 155.970 |
10 | 157.970 | 159.974 | 161.983 | 163.996 | 166.013 | 168.034 | 170.059 | 172.089 | 174.122 | 176.160 |
11 | 178.201 | 180.246 | 182.295 | 184.349 | 186.405 | 188.466 | 190.531 | 192.599 | 194.671 | 196.746 |
12 | 198.825 | 200.908 | 202.995 | 205.084 | 207.178 | 209.275 | 211.375 | 213.479 | 215.586 | 217.697 |
13 | 219.811 | 221.928 | 224.049 | 226.172 | 228.299 | 230.430 | 232.563 | 234.700 | 236.840 | 238.983 |
14 | 241.129 | 243.278 | 245.431 | 247.586 | 249.744 | 251.906 | 254.070 | 256.237 | 258.408 | 260.581 |
15 | 262.757 | 264.936 | 267.118 | 269.302 | 271.490 | 273.680 | 275.873 | 278.069 | 280.268 | 282.469 |
16 | 284.673 | 286.880 | 289.090 | 291.302 | 293.517 | 295.734 | 297.954 | 300.177 | 302.402 | 304.630 |
This table gives [latex]\log_{10}(n!)[/latex]. The row label gives the leading digit(s) of [latex]n[/latex] and the column heading gives its final digit, so the entry in row 8, column 0 is [latex]\log_{10}(80!) = 118.855[/latex].
Stirling’s Formula
In addition to the values given in the table above, there is also a somewhat surprising approximation for [latex]n![/latex], particularly useful for large values of [latex]n[/latex]. This is known as Stirling’s formula (Newman, 1997) and is given by
\[ n! \approx \sqrt{2\pi n}\left(\frac{n}{e}\right)^n, \]
where [latex]\pi[/latex] and [latex]e[/latex] are as usual. In logarithms this is
\[ \log(n!) \approx \frac{\log(2) + \log(\pi) + \log(n)}{2} + n(\log(n) - \log(e)). \]
For example, this formula gives [latex]\log(69!) \approx 98.2328[/latex], compared to the correct value of 98.2333. For [latex]1000![/latex], Stirling’s formula gives [latex]\log(1000!) \approx 2567.60461[/latex], very close to the true value of 2567.60464.
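The logarithmic form of Stirling's formula translates directly into code. The Python sketch below compares it with a reference value obtained from the standard log-gamma function. The function names `stirling_log10` and `exact_log10` are invented here, and `math.lgamma` is itself a floating-point approximation, so "exact" means accurate to machine precision only.

```python
import math

def stirling_log10(n):
    """Base-10 logarithm of Stirling's approximation to n!."""
    half_sum = (math.log10(2) + math.log10(math.pi) + math.log10(n)) / 2
    return half_sum + n * (math.log10(n) - math.log10(math.e))

def exact_log10(n):
    """Reference value of log10(n!) using lgamma(n + 1) = ln(n!)."""
    return math.lgamma(n + 1) / math.log(10)

for n in (69, 1000):
    print(n, stirling_log10(n), exact_log10(n))
```

The gap between the two columns shrinks as [latex]n[/latex] grows, in line with the formula being most useful for large [latex]n[/latex].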
Arithmetic-Geometric Mean
We have used two types of mean in this book. The first was the sample mean, also known as the arithmetic mean since it involves a simple sum of the numbers (see Chapter 5). The second was the geometric mean, the [latex]n[/latex]th root of the product of [latex]n[/latex] numbers (the square root of the product when there are two), which arose through calculating the sample (arithmetic) mean of log-transformed data (see Chapter 14). The geometric mean is always less than or equal to the arithmetic mean.
There is an interesting combination of these two means, the arithmetic-geometric mean (AGM). It is only defined for the mean of two numbers, so it is not much use in data analysis, but we mention it here because of a surprising role it plays in calculating [latex]\pi[/latex].
The AGM of two numbers [latex]a[/latex] and [latex]b[/latex] is not calculated directly but is instead the result of a sequence of calculations. We start with two values, [latex]a_0 = a[/latex] and [latex]b_0 = b[/latex], and calculate their arithmetic mean, [latex]a_1[/latex], and their geometric mean, [latex]b_1[/latex]. We then repeat this with [latex]a_1[/latex] and [latex]b_1[/latex] instead. In general, for any step [latex]k[/latex], we calculate
\[ a_{k+1} = \frac{a_k + b_k}{2} \mbox{ and } b_{k+1} = \sqrt{a_k b_k}. \]
For example, to calculate the AGM of 3 and 8 we calculate the average [latex]a_1 = \frac{3+8}{2} = 5.5[/latex] and the geometric mean [latex]b_1 = \sqrt{3 \times 8} = 4.89897949[/latex]. Then we work out [latex]a_2 = \frac{a_1 + b_1}{2}[/latex] and [latex]b_2 = \sqrt{a_1 b_1}[/latex], giving
\[ a_2 = 5.199489743, b_2 = 5.190798317 \]
\[ a_3 = 5.195144030, b_3 = 5.195142212 \]
\[ a_4 = 5.195143121, b_4 = 5.195143121 \]
We now have [latex]a_4 = b_4[/latex] to (at least) 9 decimal places, so the AGM of 3 and 8 is 5.195143121, in between the arithmetic and geometric means.
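The iteration is short enough to write out in full. In this Python sketch the function name `agm` and the relative tolerance `tol` used as a stopping rule are choices made for illustration. The text itself simply repeats the step until the two values agree.

```python
import math

def agm(a, b, tol=1e-12):
    """Arithmetic-geometric mean of two positive numbers a and b.

    Repeats a <- (a + b) / 2 and b <- sqrt(a * b) until the two
    values agree to within a relative tolerance."""
    while abs(a - b) > tol * max(a, b):
        # Both updates use the values from the previous step
        a, b = (a + b) / 2, math.sqrt(a * b)
    return a

# Show the first few steps for the example of 3 and 8
a, b = 3.0, 8.0
for k in range(1, 5):
    a, b = (a + b) / 2, math.sqrt(a * b)
    print(k, a, b)

print(agm(3, 8))
```

The simultaneous assignment matters: updating `a` first and then using the new `a` to compute `b` would give a different sequence from the one defined above.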
The AGM and [latex]\pi[/latex]
At each step of the above calculation we can define [latex]c_k = a_k^2 - b_k^2[/latex]. Starting with [latex]a_0 = 1[/latex] and [latex]b_0 = \frac{1}{\sqrt{2}}[/latex], a modification of a formula derived by Salamin (1976) and Brent (1976) gives
\[ \pi \simeq \frac{4 a_k^2}{1 - \sum_{j=1}^k 2^{j+1} c_j}, \]
with the approximation getting better as [latex]k[/latex] increases. The amazing thing about this formula is that it gives a lot more decimal places of [latex]\pi[/latex] for each step of [latex]k[/latex]. Notice that in the above example the [latex]a_k[/latex] and [latex]b_k[/latex] values get close to each other very quickly, with the number of matching decimal places doubling after each step. In the same way, this formula for [latex]\pi[/latex] roughly doubles the number of accurate decimal places after each step. For [latex]k=1[/latex] it is correct to 1 decimal place; for [latex]k=2[/latex] it is correct to 3 decimal places; for [latex]k=3[/latex] it is correct to 9 places, almost enough for a standard calculator. After just 8 more steps ([latex]k=11[/latex]) this simple process gives [latex]\pi[/latex] correct to 2792 decimal places. The following table shows the first 1800 decimal places from this number. In 1999, Kanada and Takahashi used a method based on this formula to calculate 206 billion digits of [latex]\pi[/latex], then the world record. (Yasumasa Kanada led many world records for computations of [latex]\pi[/latex] but his work was surpassed by a newer algorithm in the late 2000s. This algorithm was used by Google employee Emma Haruka Iwao in 2019 to calculate [latex]\pi[/latex] to a record 31.4159 ([latex]10\pi[/latex]) trillion decimal places.)
1800 decimal places of [latex]\pi[/latex]
3.14159 | 26535 | 89793 | 23846 | 26433 | 83279 | 50288 | 41971 | 69399 | 37510 | 58209 | 74944 |
59230 | 78164 | 06286 | 20899 | 86280 | 34825 | 34211 | 70679 | 82148 | 08651 | 32823 | 06647 |
09384 | 46095 | 50582 | 23172 | 53594 | 08128 | 48111 | 74502 | 84102 | 70193 | 85211 | 05559 |
64462 | 29489 | 54930 | 38196 | 44288 | 10975 | 66593 | 34461 | 28475 | 64823 | 37867 | 83165 |
27120 | 19091 | 45648 | 56692 | 34603 | 48610 | 45432 | 66482 | 13393 | 60726 | 02491 | 41273 |
72458 | 70066 | 06315 | 58817 | 48815 | 20920 | 96282 | 92540 | 91715 | 36436 | 78925 | 90360 |
01133 | 05305 | 48820 | 46652 | 13841 | 46951 | 94151 | 16094 | 33057 | 27036 | 57595 | 91953 |
09218 | 61173 | 81932 | 61179 | 31051 | 18548 | 07446 | 23799 | 62749 | 56735 | 18857 | 52724 |
89122 | 79381 | 83011 | 94912 | 98336 | 73362 | 44065 | 66430 | 86021 | 39494 | 63952 | 24737 |
19070 | 21798 | 60943 | 70277 | 05392 | 17176 | 29317 | 67523 | 84674 | 81846 | 76694 | 05132 |
00056 | 81271 | 45263 | 56082 | 77857 | 71342 | 75778 | 96091 | 73637 | 17872 | 14684 | 40901 |
22495 | 34301 | 46549 | 58537 | 10507 | 92279 | 68925 | 89235 | 42019 | 95611 | 21290 | 21960 |
86403 | 44181 | 59813 | 62977 | 47713 | 09960 | 51870 | 72113 | 49999 | 99837 | 29780 | 49951 |
05973 | 17328 | 16096 | 31859 | 50244 | 59455 | 34690 | 83026 | 42522 | 30825 | 33446 | 85035 |
26193 | 11881 | 71010 | 00313 | 78387 | 52886 | 58753 | 32083 | 81420 | 61717 | 76691 | 47303 |
59825 | 34904 | 28755 | 46873 | 11595 | 62863 | 88235 | 37875 | 93751 | 95778 | 18577 | 80532 |
17122 | 68066 | 13001 | 92787 | 66111 | 95909 | 21642 | 01989 | 38095 | 25720 | 10654 | 85863 |
27886 | 59361 | 53381 | 82796 | 82303 | 01952 | 03530 | 18529 | 68995 | 77362 | 25994 | 13891 |
24972 | 17752 | 83479 | 13151 | 55748 | 57242 | 45415 | 06959 | 50829 | 53311 | 68617 | 27855 |
88907 | 50983 | 81754 | 63746 | 49393 | 19255 | 06040 | 09277 | 01671 | 13900 | 98488 | 24012 |
85836 | 16035 | 63707 | 66010 | 47101 | 81942 | 95559 | 61989 | 46767 | 83744 | 94482 | 55379 |
77472 | 68471 | 04047 | 53464 | 62080 | 46684 | 25906 | 94912 | 93313 | 67702 | 89891 | 52104 |
75216 | 20569 | 66024 | 05803 | 81501 | 93511 | 25338 | 24300 | 35587 | 64024 | 74964 | 73263 |
91419 | 92726 | 04269 | 92279 | 67823 | 54781 | 63600 | 93417 | 21641 | 21992 | 45863 | 15030 |
28618 | 29745 | 55706 | 74983 | 85054 | 94588 | 58692 | 69956 | 90927 | 21079 | 75093 | 02955 |
32116 | 53449 | 87202 | 75596 | 02364 | 80665 | 49911 | 98818 | 34797 | 75356 | 63698 | 07426 |
54252 | 78625 | 51818 | 41757 | 46728 | 90977 | 77279 | 38000 | 81647 | 06001 | 61452 | 49192 |
17321 | 72147 | 72350 | 14144 | 19735 | 68548 | 16136 | 11573 | 52552 | 13347 | 57418 | 49468 |
43852 | 33239 | 07394 | 14333 | 45477 | 62416 | 86251 | 89835 | 69485 | 56209 | 92192 | 22184 |
27255 | 02542 | 56887 | 67179 | 04946 | 01653 | 46680 | 49886 | 27232 | 79178 | 60857 | 84383 |
Choosing Histogram Bins
Choosing the number of bins for a histogram is a rather subjective process, and you should always vary the number to check whether the pattern you see persists across a range of bin widths. However, Sturges (1926) proposed a simple rule for choosing the number of bins based on what happens with the Binomial distribution. For example, consider the distribution of [latex]\Binomial{X}{3}{0.5}[/latex] that we calculated from scratch in Chapter 8. This arose from looking at the different samples of size 3 we could get involving males and females. There were [latex]2^3 = 8[/latex] possible samples if we kept the outcomes in order, but there were only [latex]3 + 1 = 4[/latex] possible counts of females, as shown in the [latex]\Binomial{X}{3}{0.5}[/latex] distribution figure. Similarly, see the Binomial(10,0.5) distribution figure. This comes from [latex]2^{10} = 1024[/latex] samples which give [latex]10 + 1 = 11[/latex] possible counts.
Since the Binomial distribution matches the Normal distribution quite closely, it would thus be reasonable to draw a histogram of data that is roughly Normal using the number of bins present in the corresponding Binomial distribution. That is, if we have [latex]n[/latex] observations then we should choose [latex]b[/latex] bins such that
\[ 2^{b-1} \approx n. \]
For example, for the 42 observations plotted in the histogram of islander heights, we would use 6 bins since [latex]2^{6-1} = 2^5 = 32 \approx 42[/latex].
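Solving [latex]2^{b-1} \approx n[/latex] for [latex]b[/latex] gives [latex]b \approx 1 + \log_2(n)[/latex]. The following Python sketch carries out this calculation. The function name is ours, and rounding to the nearest whole number is an assumption made to match the example above; some presentations of Sturges' rule round up instead.

```python
import math

def sturges_bins(n: int) -> int:
    """Suggest a number of bins b so that 2**(b - 1) is roughly n.

    Assumption: we round 1 + log2(n) to the nearest whole number.
    """
    if n < 1:
        raise ValueError("n must be a positive number of observations")
    return round(1 + math.log2(n))

# The 42 islander heights from the text.
print(sturges_bins(42))
```

For 42 observations, [latex]1 + \log_2(42) \approx 6.39[/latex], which rounds to the 6 bins used above.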
Scott (1992) gives a more comprehensive theory of how to select the bin width for different data sets. The important point is that Sturges’ rule could only ever make sense for symmetric data and so better rules will need to look at more aspects of the data than just the sample size. For example, Scott gives the following rule for choosing the bin width, [latex]h[/latex]:
\[ h = \frac{2 \times \IQR}{\sqrt[3]{n}}, \]
which uses the interquartile range to take into consideration the spread of the data. For example, for the 42 observations plotted in the histogram of islander heights the [latex]\IQR[/latex] was 20 cm, giving [latex]h = 11.5[/latex] cm. Since the whole range was 53 cm, this suggests using [latex]\frac{53}{11.5} = 4.61[/latex] bins, a little fewer than Sturges’ rule.
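The same arithmetic can be sketched in Python. The function name is ours; the values 20 cm, 42 and 53 cm are the ones quoted above for the islander heights.

```python
def iqr_bin_width(iqr: float, n: int) -> float:
    """Bin width h = 2 * IQR / n**(1/3), as in the rule quoted in the text."""
    if n < 1:
        raise ValueError("n must be a positive number of observations")
    return 2 * iqr / n ** (1 / 3)

h = iqr_bin_width(20, 42)   # IQR of 20 cm from 42 observations
suggested_bins = 53 / h     # the whole range was 53 cm
print(round(h, 1), round(suggested_bins, 2))
```

In practice the suggested number of bins would be rounded to a whole number, here 5.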
Desert Island Formulas
Computing formulas were once an important part of doing statistical calculations, giving expressions equivalent to the original definitions but easier to evaluate by hand with a simple calculator. However, these are all now routinely calculated by software and many are also available on inexpensive scientific calculators. We include these formulas mainly for historical interest and just in case you find yourself stuck on a desert island with only a primitive calculator and this book. In fact, with the logarithm tables earlier in this chapter you could probably survive without a calculator too!
Sample Standard Deviation
The computing formula for a sum of squared deviations from a mean is
\[ \sum (x_j - \overline{x})^2 = \sum x_j^2 - \frac{1}{n} \left( \sum x_j \right)^2. \]
This is the main idea behind many computing formulas, breaking the original expression into pieces which can be calculated easily, such as the sum of all the values or the sum of all the squared values.
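One way to see where the computing formula comes from is to expand the square and substitute [latex]\overline{x} = \frac{1}{n} \sum x_j[/latex]:
\[ \begin{aligned} \sum (x_j - \overline{x})^2 &= \sum x_j^2 - 2 \overline{x} \sum x_j + n \overline{x}^2 \\ &= \sum x_j^2 - \frac{2}{n} \left( \sum x_j \right)^2 + \frac{1}{n} \left( \sum x_j \right)^2 \\ &= \sum x_j^2 - \frac{1}{n} \left( \sum x_j \right)^2. \end{aligned} \]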
The sample standard deviation can then be calculated using
\[ s = \sqrt{\frac{\sum x_j^2 - \left( \sum x_j \right)^2/n}{n-1}}. \]
You can see similarities amongst the formulas based on squared deviations, such as this one for [latex]s[/latex] and those for [latex]r[/latex] and [latex]b_1[/latex] given below.
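As a check on the computing formula for [latex]s[/latex], the following Python sketch compares it with the standard deviation calculated from the definition by the standard library. The function name and the data values are made up for illustration.

```python
import math
import statistics

def sd_computing(xs):
    """Sample standard deviation from the sum of values and sum of squares."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    sum_x = sum(xs)
    sum_x2 = sum(x * x for x in xs)
    return math.sqrt((sum_x2 - sum_x ** 2 / n) / (n - 1))

# Hypothetical heights in cm, not data from the book.
heights = [150, 162, 171, 158, 180]
print(sd_computing(heights), statistics.stdev(heights))
```

The two values should agree up to rounding. Note that on a computer the computing formula can lose accuracy when the values are large relative to their spread, since it subtracts two nearly equal numbers; this is one reason software usually works from the definition instead.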
Correlation
The correlation coefficient can be calculated using
\[ r = \frac{ \sum x_j y_j - \frac{1}{n} \left( \sum x_j \right) \left( \sum y_j \right)}{(n-1)s_x s_y}, \]
where [latex]s_x[/latex] and [latex]s_y[/latex] are the sample standard deviations of the [latex]x[/latex] and [latex]y[/latex] values, respectively.
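A minimal Python sketch of this computing formula is given below. The function name and the data are hypothetical; the standard deviations are taken from the standard library rather than recomputed by hand.

```python
import statistics

def correlation_computing(xs, ys):
    """Correlation coefficient r from the computing formula."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("xs and ys must have the same length, at least 2")
    # Numerator: sum of products minus (sum of x)(sum of y)/n.
    sxy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    s_x = statistics.stdev(xs)
    s_y = statistics.stdev(ys)
    return sxy / ((n - 1) * s_x * s_y)

# Hypothetical paired observations, not data from the book.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
print(correlation_computing(x, y))
```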
Least-Squares Line
The intercept and slope of the least-squares line can be computed using
\[ b_1 = \frac{\sum x_j y_j - \frac{1}{n} \left( \sum x_j \right) \left( \sum y_j \right)}{\sum x_j^2 - \frac{1}{n} \left( \sum x_j \right)^2} \]
for the slope and [latex]b_0 = \overline{y} - b_1 \overline{x}[/latex] for the intercept.
Note that the formula for [latex]b_1[/latex] (and so [latex]b_0[/latex]) involves the sum of squared deviations of the [latex]x[/latex] values from the section on the sample standard deviation above. We saw this term in the formulas for the standard deviations of these estimates in Chapter 18. Also note the similarities between the formulas for [latex]b_1[/latex] and [latex]r[/latex]. Comparing these, along with the formula for [latex]s_x[/latex], shows the useful relationship
\[ b_1 = r \frac{s_y}{s_x}. \]
As discussed by Moore and McCabe (1999), this shows that a change of one standard deviation in [latex]x[/latex] corresponds to a change of [latex]r[/latex] standard deviations in [latex]y[/latex].
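The computing formulas for the line, and the relationship [latex]b_1 = r s_y / s_x[/latex], can be checked numerically. The Python sketch below uses function names and data of our own choosing; it is an illustration rather than a replacement for regression software.

```python
import statistics

def least_squares_line(xs, ys):
    """Return (b0, b1), the intercept and slope, from the computing formulas."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    sxx = sum(x * x for x in xs) - sum_x ** 2 / n
    b1 = sxy / sxx
    b0 = sum_y / n - b1 * sum_x / n   # b0 = mean(y) - b1 * mean(x)
    return b0, b1

def slope_from_correlation(xs, ys):
    """Return r * s_y / s_x, which should match the least-squares slope."""
    n = len(xs)
    s_x, s_y = statistics.stdev(xs), statistics.stdev(ys)
    sxy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    r = sxy / ((n - 1) * s_x * s_y)
    return r * s_y / s_x

# Hypothetical paired observations, not data from the book.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1 = least_squares_line(x, y)
print(b0, b1, slope_from_correlation(x, y))
```

The last two printed values should agree up to rounding.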
To carry out a hypothesis test for the slope of a line you need the standard error of the least-squares estimate. Sedcole (2010) notes that
\[ \se{b_1} = \frac{b_1 \tan(\cos^{-1}(r))}{\sqrt{n-2}}, \]
a useful shortcut if you have already calculated the slope and correlation coefficient. For example, for the data on basal plasma oxytocin and age in the oxytocin example we found [latex]b_1 = -0.0097[/latex] and [latex]r = -0.5165[/latex] from [latex]n = 24[/latex] subjects. This gives
\[ \se{b_1} = \frac{ -0.0097 \times \tan(\cos^{-1}(-0.5165))}{\sqrt{22}} = \frac{-0.0097 \times -1.6579}{\sqrt{22}} = 0.0034, \]
the same value given in the regression summary table in Chapter 18. It may seem strange to find trigonometric functions in formulas for least-squares lines but, in fact, the theory of regression and analysis of variance can all be developed from a geometric foundation, in contrast to the algebraic approach we have used in this book. Saville and Wood (1996) and Kaplan (2009) give excellent introductions to this interesting perspective on statistical modelling.
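The shortcut for the standard error is easy to evaluate in Python, as sketched below with the oxytocin values quoted above. The function name is ours. Since [latex]\tan(\cos^{-1}(r)) = \sqrt{1 - r^2}/r[/latex], the shortcut cannot be used as written when [latex]r = 0[/latex].

```python
import math

def se_slope_shortcut(b1: float, r: float, n: int) -> float:
    """Standard error of the slope from b1, r and n, using the tan/arccos shortcut."""
    if n <= 2:
        raise ValueError("need more than two observations")
    return b1 * math.tan(math.acos(r)) / math.sqrt(n - 2)

# Values quoted in the text for the oxytocin example.
se = se_slope_shortcut(-0.0097, -0.5165, 24)
print(round(se, 4))
```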
Other Distribution Functions
F Distribution
The probability density function of the [latex]F(n,d)[/latex] distribution with [latex]n[/latex] and [latex]d[/latex] degrees of freedom is given by
\[ f(x) = \frac{\Gamma\!\left(\frac{n+d}{2}\right) n^{\frac{n}{2}} d^{\frac{d}{2}} x^{\frac{n-2}{2}}}{\Gamma\!\left(\frac{n}{2}\right) \Gamma\!\left(\frac{d}{2}\right) (d + nx)^{\frac{n+d}{2}}}. \]
This looks reminiscent of the density curve for the [latex]t(n)[/latex] distribution, and indeed we saw that the [latex]F[/latex] test for analysis of variance was a generalisation of the two-sample [latex]t[/latex] test.
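The density above can be evaluated directly, as in the following Python sketch. It works on the logarithmic scale to avoid overflow in the Gamma functions. The function names are ours, and the formula used for the [latex]t(d)[/latex] density is the standard one, assumed here rather than quoted from this section. The final lines illustrate the link with the [latex]t[/latex] distribution: if [latex]T[/latex] has a [latex]t(d)[/latex] distribution then [latex]T^2[/latex] has an [latex]F(1,d)[/latex] distribution, so the [latex]F(1,d)[/latex] density at [latex]x[/latex] equals the [latex]t(d)[/latex] density at [latex]\sqrt{x}[/latex] divided by [latex]\sqrt{x}[/latex].

```python
import math

def f_density(x: float, n: float, d: float) -> float:
    """Density of the F(n, d) distribution at x (taken as 0 for x <= 0)."""
    if x <= 0:
        return 0.0
    log_f = (math.lgamma((n + d) / 2) - math.lgamma(n / 2) - math.lgamma(d / 2)
             + (n / 2) * math.log(n) + (d / 2) * math.log(d)
             + ((n - 2) / 2) * math.log(x)
             - ((n + d) / 2) * math.log(d + n * x))
    return math.exp(log_f)

def t_density(t: float, d: float) -> float:
    """Density of Student's t(d) distribution (standard formula, an assumption here)."""
    log_c = (math.lgamma((d + 1) / 2) - math.lgamma(d / 2)
             - 0.5 * math.log(d * math.pi))
    return math.exp(log_c - ((d + 1) / 2) * math.log(1 + t * t / d))

# Compare the F(1, 10) density with the transformed t(10) density at x = 2.5.
x = 2.5
print(f_density(x, 1, 10), t_density(math.sqrt(x), 10) / math.sqrt(x))
```

The two printed values should agree up to rounding.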
Compound Interest and Student’s t Distribution
Chi-Square Distribution
Studentized Range Distribution
\[\pr{Q_{k,d} \ge q}=C(d) \int_0^{\infty} x^{d-1}e^{-\frac{d x^2}{2}}\left\{ k \int_{-\infty}^{\infty} \theta(y)[\Theta(y) - \Theta(y - qx)]^{k-1} dy \right\} dx\]