Credits
My list of people to thank is large enough that I probably missed some.
I will try to enumerate them systematically to minimize omissions, and I ask those I've unfairly neglected for absolution.
First, I thank those who made concrete contributions to help me put this book together.
Yeseul Lee served as an apprentice on this project, helping with figures, exercises, and more during summer 2016 and beyond.
You will see evidence of her handiwork on almost every page, and I greatly
appreciate her help and dedication.
Aakriti Mittal and Jack Zheng also contributed to a few of the figures.
Students in my Fall 2016 Introduction to Data Science course (CSE 519) helped to debug the manuscript, and they found plenty of things to debug.
I particularly thank Rebecca Siford, who proposed over one hundred
corrections on her own.
Several data science friends/sages reviewed specific chapters for me, and I
thank Anshul Gandhi, Yifan Hu, Klaus Mueller, Francesco Orabona, Andy Schwartz, and Charles Ward for their efforts here.
Figures
My apprentice student Yeseul Lee helped me enormously with the figures,
drawing many of them to specification (and redrawing when my specification
changed).
Many of these are based on other figures for inspiration or constructed using
specific data sets, and I try to give proper credit here:
-
Chapter 1
Yeseul Lee:
Fig 1.6
You should cite:
Fig 1.6 - The taxi data is from www.nyc.gov.
-
Chapter 2
Yeseul Lee:
1, 2, 3, 4, 6, 7, 8, 9, 11, 12, 13
You should cite:
Fig 2.3 - The data is from www.statista.com.
-
Chapter 4
Yeseul Lee:
1-3 are from another student of Skiena, and I edited them.
4, 5, 6, 7, 8, 9
-
Chapter 5
Yeseul Lee:
1, 2, 3, 4, 5, 8, 9, 11, 12, 13, 14, 15, 16, 18
You should cite:
and (on page 129 English Wiki) - the data is from
www.wordfrequency.info.
-
Chapter 6
Yeseul Lee:
1, 2, 3, 5, 6, 10, 13, 15 (I made), 17, 18, 19, 20, 21, 22, 23, 24, 29, 30, 31
-
Chapter 7
Yeseul Lee:
2, 3, 4, 5, 6, 9
-
Chapter 8
Yeseul Lee:
1, 2, matrices
-
Chapter 9
Yeseul Lee:
1, 2, 3, 4, 5, 6, 7 (3D linear regression), 8, 9 (wrote in LaTeX), 10, 11, 12,
13, 14, 15, 16, 17 (logistic regression), 18, 19
-
Chapter 10
Yeseul Lee:
1, 2, 3, 4, 5 (left), 6, 7, 8, 11, 12, 14, 15, 16, 18, 19, 20, 21, 22
-
Chapter 11
Yeseul Lee:
1 (wrote in LaTeX), 2, 3, 5 (wrote in LaTeX), 6, 7, 8, 9, 10, 11, 12
-
Chapter 12
Yeseul Lee:
Python code on page 390
Figures 1, 2
Exercises
Several exercises were originated by colleagues or inspired by other sources.
Reconstructing the original sources years later can be challenging, but credits for each problem (to the best of my recollection) appear below.
-
Chapter 1
- 1-6 Open Intro, book, pg 64
- 1-7 Open Intro, book, pg 64
- 1-8 Open Intro, book, pg 61
- 1-12 DeZyre, https://www.dezyre.com/article/100-data-science-interview-questions-and-answers-general-for-2016/184
- 1-13 Workable, http://resources.workable.com/data-scientist-analysis-interview-questions
- 1-14 Vincent Granville, http://www.datasciencecentral.com/profiles/blogs/66-job-interview-questions-for-data-scientists
- 1-15 Vincent Granville, http://www.datasciencecentral.com/profiles/blogs/66-job-interview-questions-for-data-scientists
-
Chapter 2
- 2-1 Open Intro, book, pg 119 https://www.openintro.org/stat/textbook.php?stat_book=os
- 2-2 Open Intro, book, pg 119
- 2-3 Introduction to Machine Learning, Alex Smola, pg 35
- 2-4 Introduction to Machine Learning, Alex Smola, pg 35
- 2-5 Zvi Bar-Yosef, CMU, midterm or final exam question.
- 2-6 Introductory Statistics with Randomization and Simulation, First Edition, David M Diez
- 2-10 Open Intro, book, pg 359
- 2-11 Open Intro, book, pg 362
- 2-12 Introductory Statistics with Randomization and Simulation, First Edition, David M Diez, pg.250
- 2-30 Workable, http://resources.workable.com/data-scientist-analysis-interview-questions
- 2-31 Workable, http://resources.workable.com/data-scientist-analysis-interview-questions
- 2-32 DeZyre, https://www.dezyre.com/article/100-data-science-interview-questions-and-answers-general-for-2016/184
-
Chapter 3
-
Chapter 4
- 4-1 Open Intro, book, pg 130
- 4-2 Open Intro, book, pg 158
- 4-3 Open Intro, book, pg 158
- 4-4 Yeseul Lee
- 4-10 DeZyre, https://www.dezyre.com/article/100-data-science-interview-questions-and-answers-general-for-2016/184
- 4-11 Jonathan DAHAN, dahan1jonathan@gmail.com, http://rpubs.com/JDAHAN/172473
- 4-12 Vincent Granville, http://www.datasciencecentral.com/profiles/blogs/66-job-interview-questions-for-data-scientists
-
Chapter 5
- 5-4 Open Intro, book, pg 69
- 5-5 Open Intro, book, pg 69
- 5-7 Open Intro, book, pg 162
- 5-12 Workable, http://resources.workable.com/data-scientist-analysis-interview-questions
- 5-13 Workable, http://resources.workable.com/data-scientist-analysis-interview-questions
- 5-14 Jonathan DAHAN, dahan1jonathan@gmail.com, http://rpubs.com/JDAHAN/172473
- 5-15 DeZyre, https://www.dezyre.com/article/100-data-science-interview-questions-and-answers-general-for-2016/184
- 5-16 DeZyre, https://www.dezyre.com/article/100-data-science-interview-questions-and-answers-general-for-2016/184
- 5-17 DeZyre, https://www.dezyre.com/article/100-data-science-interview-questions-and-answers-general-for-2016/184
- 5-18 Jonathan DAHAN, dahan1jonathan@gmail.com, http://rpubs.com/JDAHAN/172473
-
Chapter 6
-
Chapter 7
- 7-7 Gregory Piatetsky, http://www.kdnuggets.com/2016/02/21-data-science-interview-questions-answers.html
- 7-8 Gregory Piatetsky, http://www.kdnuggets.com/2016/02/21-data-science-interview-questions-answers.html
- 7-9 Gregory Piatetsky, http://www.kdnuggets.com/2016/02/21-data-science-interview-questions-answers-part2.html
- 7-11 Jonathan DAHAN, http://rpubs.com/JDAHAN/172473
- 7-12 Workable, http://resources.workable.com/data-scientist-analysis-interview-questions
- 7-13 Workable, http://resources.workable.com/data-scientist-analysis-interview-questions
- 7-14 Workable, http://resources.workable.com/data-scientist-analysis-interview-questions
- 7-19 Workable, http://resources.workable.com/data-scientist-analysis-interview-questions
- 7-20 DeZyre, https://www.dezyre.com/article/100-data-science-interview-questions-and-answers-general-for-2016/184
- 7-21 Jonathan DAHAN, dahan1jonathan@gmail.com, http://rpubs.com/JDAHAN/172473
- 7-22 Jonathan DAHAN, dahan1jonathan@gmail.com, http://rpubs.com/JDAHAN/172473
- 7-23 Jonathan DAHAN, dahan1jonathan@gmail.com, http://rpubs.com/JDAHAN/172473
-
Chapter 8
- 8-5 Pettofrezzo, Matrices and Transformations, pg 12.
- 8-6 Stanford, CS246, HW2, http://web.stanford.edu/class/cs246/homeworks/hw2/hw2.pdf
- 8-7 Pettofrezzo, Matrices and Transformations, pg 35.
- 8-8 Pettofrezzo, Matrices and Transformations, pg 35.
- 8-9 Linear Algebra, David Cherney, pg.61
- 8-10 Jonathan DAHAN, http://rpubs.com/JDAHAN/172473
- 8-11 Linear Algebra, David Cherney, pg.162
- 8-12 Linear Algebra, David Cherney, pg.231
- 8-13 Pettofrezzo, Matrices and Transformations, pg 87.
- 8-14 Pettofrezzo, Matrices and Transformations, pg 87.
- 8-15 Linear Algebra, David Cherney, pg.232
- 8-17 Stanford, CS246, HW2, http://web.stanford.edu/class/cs246/homeworks/hw2/hw2.pdf
- 8-20 DeZyre, https://www.dezyre.com/article/100-data-science-interview-questions-and-answers-general-for-2016/184
- 8-21 Jonathan DAHAN, http://rpubs.com/JDAHAN/172473
- 8-22 Jonathan DAHAN, http://rpubs.com/JDAHAN/172473
-
Chapter 9
- 9-2 Open Intro, book, pg 362
- 9-3 Zvi Bar-Yosef, CMU, midterm or final exam question.
- 9-4 Zvi Bar-Yosef, CMU, midterm or final exam question.
- 9-5 Jonathan DAHAN, http://rpubs.com/JDAHAN/172473
- 9-15 Workable, http://resources.workable.com/data-scientist-analysis-interview-questions
- 9-16 DeZyre, https://www.dezyre.com/article/100-data-science-interview-questions-and-answers-general-for-2016/184
- 9-17 Jonathan DAHAN, http://rpubs.com/JDAHAN/172473
- 9-18 Jonathan DAHAN, http://rpubs.com/JDAHAN/172473
-
Chapter 10
- 10-6 Leskovec, Mining of Massive Datasets
- 10-9 Zvi Bar-Yosef, CMU, midterm or final exam question.
- 10-10 Zvi Bar-Yosef, CMU, midterm or final exam question.
- 10-11 Zvi Bar-Yosef, CMU, midterm or final exam question.
- 10-19 An Introduction to Statistical Learning with Applications in R, Gareth James, pg.414
- 10-21 Leskovec, Mining of Massive Datasets
- 10-23 Zvi Bar-Yosef, CMU, midterm or final exam question.
- 10-24 Zvi Bar-Yosef, CMU, midterm or final exam question.
- 10-27 Jonathan DAHAN, http://rpubs.com/JDAHAN/172473
- 10-28 Workable, http://resources.workable.com/data-scientist-analysis-interview-questions
- 10-29 Workable, http://resources.workable.com/data-scientist-analysis-interview-questions
- 10-30 Workable, http://resources.workable.com/data-scientist-analysis-interview-questions
- 10-31 Melissa Hill, http://blog.udacity.com/2015/04/data-science-interview-questions.html
- 10-32 Jonathan DAHAN, http://rpubs.com/JDAHAN/172473
-
Chapter 11
- 11-2 Cathy O'Neil, Doing Data Science
- 11-3 DeZyre, https://www.dezyre.com/article/100-data-science-interview-questions-and-answers-general-for-2016/184
- 11-6 Zvi Bar-Yosef, CMU, midterm or final exam question.
- 11-9 Zvi Bar-Yosef, CMU, midterm or final exam question.
- 11-10 Zvi Bar-Yosef, CMU, midterm or final exam question.
- 11-11 Zvi Bar-Yosef, CMU, midterm or final exam question.
- 11-14 Workable, https://resources.workable.com/machine-learning-engineer-interview-questions
- 11-15 Jonathan DAHAN, http://rpubs.com/JDAHAN/172473
- 11-16 Jonathan DAHAN, http://rpubs.com/JDAHAN/172473
- 11-17 Jonathan DAHAN, http://rpubs.com/JDAHAN/172473
-
Chapter 12
- 12-1 Yeseul Lee
- 12-2 Yeseul Lee
- 12-3 Leskovec et al., Mining of Massive Datasets
- 12-4 Leskovec et al., Mining of Massive Datasets
- 12-13 Jonathan DAHAN, http://rpubs.com/JDAHAN/172473
- 12-14 Vincent Granville, http://www.datasciencecentral.com/profiles/blogs/66-job-interview-questions-for-data-scientists
- 12-15 Vincent Granville, http://www.datasciencecentral.com/profiles/blogs/66-job-interview-questions-for-data-scientists
- 12-16 Vincent Granville, http://www.datasciencecentral.com/profiles/blogs/66-job-interview-questions-for-data-scientists
- 12-17 Vincent Granville, http://www.datasciencecentral.com/profiles/blogs/66-job-interview-questions-for-data-scientists
- 12-18 Workable, http://resources.workable.com/data-scientist-analysis-interview-questions
- 12-19 Vincent Granville, http://www.datasciencecentral.com/profiles/blogs/66-job-interview-questions-for-data-scientists
- 12-20 Yeseul Lee
- 12-21 Vincent Granville, http://www.datasciencecentral.com/profiles/blogs/66-job-interview-questions-for-data-scientists