Errata List for "The Data Science Design Manual" by Steven Skiena

Last addition: December 7, 2021

Non-trivial errata are denoted with a (*).  I thank Prof. Robert Piche
of the Tampere University of Technology for a particularly extensive list of
corrections.

=============================================================

Page vi  reducto -> reductio

Page 1 comes challenges -> come challenges

Page 1 have lead -> have led

Page 3 There exist an important class -> There exists an important class

Page 5 line -14  disinterested -> uninterested

Page 7 zero to ten stars scale -> one to ten starts scale

Page 12, line 10: "FOA" should be "FOIA" 

Page 15, line 11: "characters" may be better written as "options".

Page 16, line 10: (age and martial status) should be (age and marital status).

Page 26 each face baring -> each face bearing

Page 28, 83, 198, 208, 229, 230, 257, 261, 279,  360, 361, 400, 403, 413 alternate -> alternative

Page 29 a French nobleman -> a French writer  [“chevalier de Méré” was the pseudonym of Antoine Gombaud, who was not a nobleman]

Page 31, 32, 150, 153, 354, 356  Bayes theorem -> Bayes’ theorem

(*) Page 31, line -12: P(B) = 8/36, not 9/36.  This error propagates through
the next two corrections.

(*) Page 31, line -11: Should be $P(A)=27/36$ and $P(B|A) = 8/27$.

(*) Page 31 ,line -1: Should be $P(B|A)= (1 \cdot 8/36)/(27/36) = 8/27$.

(*) Page 32, Figure 2.2: The CDF's jumps should occur at integer-valued abscissas.

(*) Page 33 always get higher -> can only get higher

(*) Page 33, line -7:  The equation for $P(k = X)$ should be:
      $P(k=X) = C'(k) = C(X \leq k ) - C(X \leq k - \delta)$

Page 36, 109, 128, 287, 400  Alternately -> Alternatively

Page 37 there is no real phenomena -> there are no real phenomena

Page 40 degree to which $Y$ is a function of $X$ -> degree to which $Y$ is a linear function of $X$

Page 40 social economic status -> socioeconomic status

Page 43 $R^2$ -> $r^2$

Page 45 $V(Y)$ -> $V(y)$

Page 45, line 13: In the formula 1-r^2 = V(r)/V(y), the variable r is being used for two different things.  The first one is r-square and the second one is residual error.

Page 49 does terrible -> does terribly

Page 58 statistical methods, which -> statistical methods which

Page 62 LaTex -> LaTeX

Page 65: "The Freedom of Information Act (FOI) enables" should be "The Freedom of Information Act (FOIA) enables"

(*) Page 79 exponentially with $k$ -> exponentially with $k^2$

Page 82, line 12: "then wrote then wrote" should be "then wrote"

(*) Page 82, line 14: Delete the sentence:
"The median score among my students was closer than any single guess".
First it was meant to read "any other guess".   But actually 2350 is closer.

(*) Page 101, line -9: "92.5th percentile" should be "97.7th percentile"

(*) Page 103 unlikely to not make -> unlikely to make

Page 115 y and z, prefer -> y and z prefer

Page 116 Issac -> Isaac

Page 118 a through introduction -> a thorough introduction

Page 119 et. al. -> et al.

Page 119, problem 4-1:  "Suppose we observe X = 5.08." X should be lower case italic x

Page 121, 205, 289, 297, 299, 300, 435: Baysian -> Bayesian

Page 122 on the look out -> on the lookout

(*) Page 124 the most likely number of failures will be zero ->  the most likely number of failures will be one [ Prob(#fail=0)=0.3677, Prob(#fail=1)= 0.3681]

Page 124 the number of hits per season are drawn -> the number of hits per season is drawn

Page 124, 125, 139,  373  phenomenon -> phenomena

Page 125 are free to arbitrary real numbers -> are free to be arbitrary real numbers

Page 127 the average value of the distribution -> the mean of the distribution

(*) Page 128, line -2, $\lambda$ should be $\mu$.

Page 132 -- inequality of the world -> inequities of the world

Page 132,  Section 5.2 Line 2: "it is pays" should be "it pays"

(*) Page 132, last line, The equation for $P(k = X)$ should be:
      $P(k=X) = C'(k) = C(X \leq k ) - C(X \leq k - \delta)$

Page 134 Monte Carlo sampling  ->  acceptance-rejection sampling

Page 136  $1,000 bucks -> 1,000 bucks

(*) Page 136 line -6: formula should be $d=(|\mu-\mu'|)/\mbox{pooled}(\sigma}$

(*) Page 137, line 15: The order of effect sizes is reversed: it should
be "small effects 85\% overlap, medium effects 67\% overlap, and large effects 53\% overlap".

Page 138 weighted 166.2 pounds  ->  weighed 166.2 pounds

Page 138 if I what I observed ->  if what I observed

Page 138 The number of samples are large enough -> The number of samples is large enough

(*) Page 139, line 6: "essentially the sample sizes" should be $n_1+n_2-2$.

(*) Page 139, line 8: "$t > v$" should be "$|t| > |v|$".

(*) Page 144, line 26: (1-.0672) should be (1-.672)

Page 152, line -11:: "person will weight over" should be "person will weigh over"

Page 154 Each of friends -> Each friend

Page 156 changing in the way -> changing the way

Page 160 peak at the dot plots -> peek at the dot plots

Page 162 painter's palate -> painter's palette

Page 172 Thus grouping -> Grouping

Page 176 we are faced -> we are faced with

Page 177 \emph{Oh G-d} -> \emph{Oh G-d!}

Page 185 when each bin a non-trivial number of element -> when each bin has a non-trivial number of elements

Page 187 The cdf is monotonically increasing -> The cdf is monotonically nondecreasing

Page 188 Nobel gases -> noble gases [2 occurrences]
Page 191 to the handle of a single water pump. They changed the handle -> to a single public water pump. The authorities disabled the pump by removing its handle

(*) Page 195 There are few potential downsides -> There are a few potential downsides

Page 195 Knobs and features get added because they can, -> Knobs and features get added because they can be,

Page 197 Stronger friendships -> Stronger associations

Page 200 problems 6-11 - "Describe good practices in data visualization?"
Should end with a period instead of question mark

Page 200, line -9: Problem 6-13: "How would you to determine" should be "How would you determine"

Page 202 a 13th-century theologian -> a 14th-century theologian

(*) Page 203, line 7:  "Models that do much better on testing data than training data are overfit" should be "Models that do much better on training data than testing data are overfit".

Page 203 There are always a range of possible outcomes -> There is always a range of possible outcomes

Page 212 reduce the performance of a model -> reduce the performance assessment of a model

(*) Page 219, line 2: "false positive and false negative rates" should be "true positive and false positive rates"

(**) Page 219, line 3: The true positive rate is just recall, i.e., TP / (TP + FN). The false positive rate is FP / (FP + TN).

(**) Page 219, paragraphs 2 and 3:  This is a mess, because I reversed left and left and right..
line 5: the sweep should go from right to left.
line 8: "very left" should be "very right"
line 10: "as far to the right" should be "as far as the left"
line 15: "sweep our threshhold to the right" should be "sweep our threshhold to the left"

Page 220, line -5: "Ours in a hard task" should be "Ours is a hard task"

(*) Page 221, line 3: "none are classified as 1800" should be "almost none (1\%) are classified as 1800"

Page 221, line 14: "Sparse" rows and columns in the confusion matrix refers to raw counts.   Figure 7.7 has been normalized so the counts are indeterminable.

Page 222 adding an constant offset -> adding a constant offset

Page 225 Upon taking position of a data set -> Upon taking possession of a data set

Page 225: "Observe the the evaluation" should be "Observe the evaluation"

Page 236 back testing -> backtesting

Page 236, problem 7-16: "following types of betable events"
betable should be spelled bettable

Page 238, line 8: Add last plus to get
$y = c_0 + c_1 x_1 + c_2 x_2 + \cdots + c_{m-1} x_{m-1}$

(*) Page 239, line 6: The left variable of the equation should be $w$, not $c$.

(*) Page 239, line 7: "Ax = b" should be "Aw = b".

Page 243, Figure 8.4 caption: the transposition is in the center, not on the right as printed.

(*) Page 243, line 1: The transpose of a matrix sort of “rotates” it by 180 degrees -> Transposing a matrix flips it about the main diagonal
[  transposition is not a rotation, Lincoln^T parts his hair differently from Lincoln ]

(*) Page 243 The matrix product $X Y^T$ of these two vectors produces a $1 \times 1$ matrix -> The matrix product $X^T Y$ of these two column-vectors produces a $1 \times 1$  matrix

(*) Page 243 $C_{ij}= \sum_{i=1}^k A_{ik}\cdot B_{kj}$        ->       $C_{ij}= \sum_{l=1}^k A_{il} B_{lj}$

(*) Page 243 For any pair of non-square matrices A and B, at most one of either AB or BA has compatible dimensions. -> [remove the statement; a counterexample is  B=A^T].

(*) Page 244, line 12: Change 4 to 3 in lower-left entry of leftmost matrix:
$$ \Bigg(
\begin{bmatrix}
    1 & 2 \\
    3 & 4 \\
\end{bmatrix}
\begin{bmatrix}
    1 & 0 \\
    0 & 2 \\
\end{bmatrix}
\Bigg) $$

(*) Page 248-249 We say $A^{−1}$ is the multiplicative inverse of matrix $A$ if $A \cdot A^{−1} = I$ ->  We say $A^{−1}$ is the multiplicative inverse of square matrix $A$ if $A A^{−1} = I$, from which it follows that $A^{−1}A=I$.

(*) Page 249 such matrices are not invertible or \emph{singular}, -> such matrices are not invertible, they are \emph{singular},

Page 249 & 264 matricies -> matrices

Page 250 $C \cdot x$ -> $C X$

(*) Page 251 there is not enough information to solve an undetermined system of linear equations -> there is not enough information to determine a unique solution of an undetermined system of linear equations

(*) Page 257 Section 8.5, line 3:  $\lambda_i \geq  \lambda_{i−i} $ for all $i$  ->  $\lambda_i \geq  \lambda_{i−1} $ for all $i>1$

(*) Page 259, Section 8.5.1, lines 15 and 16: In both the equation and the
line following, $B_k^T$ should be $B_k$.

(*) Page 259 Should $M$ contain convex numbers ->  Should $M$ contain complex numbers

Page 264 8-10 "Ax=b?" Should end with a period instead of question mark

(*) Page 264  Problem 8.17 Show that the eigenvalues of $MM^T$ are the same as that of $M^T M$ -> Show that for any square matrix $M$, the eigenvalues of $MM^T$ are the same as those of $M^T M$

Page 265 What are the size of the numerical residuals -> What are the sizes of the numerical residuals

Page 268 verses -> versus

Page 269 a singular point -> a single point

(*) Page 270 is the projection of $y_i − f(X)$ down to $X$ ->   is $y_i-f(x_i)$

(*) Page 271, line 20:  In the right hand side of the equation for $w_1$,
the $(sigma_x/sigma_y)$ part should be flipped to $(sigma_y/sigma_x)$.

(*) Page 273 it is a linear function of its non-linear input values -> it is a linear function of the non-linear functions of input values

Page 275, line -4: between 0 and 12 + 4 + 5 = 19 years,
19 --> 21

Page 279 line-2: effects the results -> affects the results

Page 280 the error function is shaped sort of like a parabola -> the error function is shaped like a parabola

Page 280  has exactly one local minima -> has exactly one local minimum   [and many other occurrences where plural “minima” is used in place of singular “minimum” ]

Page 281 maxima or minima -> maximum or minimum

Page 281 & 284 optima -> optimum

(*) Page 283 Figure 9.9 add "t := t+1” inside the loop

(*) Page 283, lines 3,4: there are errors in the equation, which is twice missing partial derivative symbols.
2 --> \342\210\202  ($\partial$)

(*) Page 285, line 9: there are errors in the equation, which is twice missing partial derivative symbols.
2 --> âˆ‚  ($\partial$)
All partial derivatives should not have a 2 in the numerator.  The 1/2 factor cancels from the squared error term when differentiating.

(*) Page 287 line 2 w_j^2 -> w_{j-1}^2

(*) Page 287 dependent variables -> independent variables

(*) Page 287 Let $\Gamma$  be our $n \times n$ ->  Let $\Gamma$  be our $m \times m$

(*) Page 287  || \lambda \Gamma w ||^2 ->    \lambda  || \Gamma w ||^2

Page 287 & 288 criteria -> criterion

(*) Page 293 y_i is positive, i.e. y_i=1 ->  y_i is negative, i.e. y_i=1

(*) Page 293, line -3: punish it aggressively when $f(y_i) \rightarrow  0$,
$y_i$ --> $x_i$

(*) Page 294, line -1: there is an error in the summation symbol and square bracket.
[ \sum --> \sum [

Page 294 Figure 9.16 caption: positive (blue) and negative (right) -> positive (blue) and negative (red)

Page 301 the apples residual -> the apple's residual

Page 301 the set of variables are ->  the set of variables is

Page 307, line 24: principle component analysis -> principal component analysis

(*) Page 308, line 25:  0.03 should be $0.03^2$

Page 311 represents fair dice roll ->  represents a fair dice roll

Page 311 does not satisfies -> does not satisfy

(*) Page 317 restricting to vectors instead of points. Recall that sets of d-dimensional vectors can be thought of as points -> restricting to unit-length vectors instead of points. Recall that d-dimensional unit-length vectors can be thought of as points

Page 318 ”Thus we can compute the exact probability…” : change all instances of “plane” to “line” in this paragraph

Page 319 in a particular organisms -> in a particular organism

Page 319, line 6: LSH wants similar items to recieve the exact same hash code
recieve --> receive

(*) Page 320 node-ink diagrams -> node-link diagrams

Page 321 the the adjacency matrix -> the adjacency matrix

Page 321, line 7: like eigenvalue or singular value decomposition (see Section 8.5) on the the
the the --> the (duplicate)

Page 323, line -4: people do not see that a distance or similarity matrix is really just a graph than can take advantage of other tools.
"than" -> "that"

Page 323 dense graphs have a quadratic number of edges -> dense graphs have a number of edges that grows quadratically with the number of vertices

Page 325, line 21: The Pagerank formula has a long dash in out-degree, should be \text{out-degree}

Page 326, line 15: principle eigenvector -> principal eigenvector

Page 328, line -5: records in a (say) a hundred distinct subsets each ordered by similarity
in a --> into

Page 328, line -2: more accurate on this restricted class of items then a general model trained over all items.
then --> than

Page 332, line -3: sometimes called the k-mediods algorithm
mediod --> medoid

Page 333, line 17: k-means can proceed by reading the distances off this matrix.
off --> off of

Page 335  number of true sources ->  the number of true sources

Page 335  formal criterias of merit -> formal criteria of merit

Page 336 cultural/ethic groupings -> cultural/ethnic groupings
 
Page 336, Figure 10.17 -- this figure is much too low resolution and blurry.  Sorry.

Page 338 Identifying the right position -> Identifying the correct position

Page 339 Figure 10.20  on right -> on the right

Page 339 is tree drawn -> is the tree drawn

Page 340 this is the criteria -> this is the criterion

Page 340, line -11: point in each cluster (medioid) as a representative in the general case
medioid --> medoid

Page 340, line -4, "... with using the fastest algorithm ," extra space between 'algorithm' and ','

Page 341, line -12: where I asked asked how many clusters
asked asked --> asked (duplicate)

Page 342 reflecting the similar of -> reflecting the similarity of

Page 345 much better soon as -> much better as soon as

(*) Page 346  Problem 10-4: angular distance is a distance metric ->  angular distance is a distance metric for unit-length vectors

Page 346 Problem 10.11 (a): "Might is possible this classifier" should be "Might it be possible that this classifier"

Page 349 10-31: How can you ... of the data.  Should end with a question mark instead of period

Page 353 training test -> training set

Page 354 P(X) -> p(X) [2 occurrences]

Page 354, first equation (line 13): $P(B)$ should be $p(B)$.

Page 357 the frequency all outcomes -> the frequency of all outcomes

Page 358 faction of the examples -> fraction of the examples

Page 360 the complete set of observations of $x_i$ are -> the complete set of observations of $x_i$ is

Page 360 This criteria -> This criterion

Page 360, line 4: $t \in (10.5, 11.5, 17)$ might be clearer as $t \in (10.5, 12.5, 17)$, since (11 + 14) / 2 = 12.5

(*) Page 361, line -4: The XOR logic function is defined as (A or ~B) or (~A or B) butr it should instead be: (A and ~B) or (~A and B). 

Page 363 equality dictates -> fairness dictates

Page 364 principle components analysis -> principal component analysis

(*) Page 365 $C_1$ if $x_i ≥ t_i$ ->  $C_1$ if $x_i < t_i$

Page 367 This a natural objective -> This is a natural objective

Page 369 The optimization algorithm of efficient solvers -> The optimization algorithms of efficient solvers

(*) Page 369, line 6: repeated equations, one should be equal to -1 the other to 1

(*) Page 371, line -8: This take home lesson was badly botched.  It should read:
"Kernal fuctions are what gives SVMs their power.  They project $n$ $d$-dimensional points to $n$ dimensions, while somehow still taking no more than $d$ steps to manipulate each point."

Page 373, paragraph 4, line 5: "... and F[i, j] reflects ... times work i ..." work -> word

Page 373 represent genuine phenomenon -> represent genuine phenomena

Page 375  a large corpora -> a large corpus

Page 376 and, if so, replace them ->  and, if there are, replace them

Page 377 comes down to quality of -> comes down to the quality of

Page 378  the full weight of these networks are -> the full weight of these networks is

Page 378 split out -> spit out

Page 378 The value actually passed to y is the ->  The value actually passed to y is

(*) Page 382 The ReLU function remains differentiable -> The ReLU function remains differentiable except at x=0

Page 384 the number of documents in corpus -> the number of documents in the corpus

Page 384 (2x) principle components analysis -> principal component analysis

Page 386 training text be that would -> training text that would

Page 389 a large text corpora ->  a large text corpus

Page 389 11-14 What are some ... from traditional machine learning
No punctuation at end, should have question mark
(*) Page 403 second display equation, V(a) =  -> \frac{n-1}{n} V(A) =    [the RHS is equal to the “divide by n” formula of sample variance, not the “divide by n-1” formula]

Page 406 the chances of such phenomenon -> the chances of such issues

Page 411 Representatives of these class of problems —> Representatives of this class of problems

Page 413 a largest piece that is often substantially higher -> a largest piece whose size is often substantially greater

Page 419 line 7,  Chapter notes: Rajarman -> Rajaraman

Page 424 Be aware -> Beware

Page 442 (Index), Obama, Barack appearing 3 times in sequence (probably an issue with the square brackets)

Page 444 (Index), T-test —> t-test (the term was taken from a capitalized title)