\documentclass[12pt]{article}
\usepackage{amsmath, amssymb, amsthm, graphicx, epsfig, fancyhdr, url}
\title{Preface}
\author {Arjen P. de Vries}
\setlength{\headheight}{28pt}
\pagestyle{fancy}
\fancyhf{}
\fancyhead[R]{Arjen P. de Vries \\ Preface}
\fancyfoot[C]{\thepage}
\begin{document}
\begin{center} \Large Preface\end{center}
In 2001, William S. Cleveland was the first to define \emph{Data Science} as a new field of study, in a Bell Labs technical report intended as an \emph{action plan} for the practicing data analyst.%
\footnote{\url{%
https://web.archive.org/web/20060111162626/http://cm.bell-labs.com/cm/ms/departments/sia/doc/datascience.pdf%
}}
More than a decade later, society has embraced this call for experts who combine a strong mathematics background with a solid understanding of computer science.
Data scientists are experts in machine learning and its mathematical underpinnings \emph{as well as} the computer science necessary to process data at scale. To illustrate the challenge of training those experts: a majority of the recent Turing Awards have recognised breakthrough ideas that are crucial to understanding data science. Leslie Valiant (2010) and Judea Pearl (2011) received this honour for their contributions to the theory of computation and learning; Leslie Lamport (2013) for distributed and concurrent systems; Mike Stonebraker (2014) for modern database systems; and, of course, we should not overlook the invention of the Web by Tim Berners-Lee (2016). This year, 2018, David Patterson has been lauded for his contributions to computer architecture. Yes, the hardware itself is important too: at Google, for example, Patterson helped design Tensor Processing Units (TPUs) that enable 15--30$\times$ faster execution of machine learning algorithms at orders of magnitude lower energy consumption \cite{Jouppi:2017:IPA:3140659.3080246}. We can only conclude that the data scientist needs a solid foundation to grasp the key concepts in all of these sub-areas of maths and computing.
As a researcher with a background in data management and information retrieval, I have long been intrigued by the idea that data powers insight to help improve science and society. I recall being excited by the wonderful collection of essays titled `The Fourth Paradigm: Data-intensive Scientific Discovery',%
\footnote{\url{%
https://web.archive.org/web/20091223044640/http://research.microsoft.com/en-us/collaboration/fourthparadigm/4th_paradigm_book_complete_lr.pdf%
}}
edited by Microsoft Research, which showcases a kaleidoscope of scientific progress enabled by the use of computers to gain understanding from data created and stored electronically.
But the impact of data science reaches far beyond science itself. Can you name one organisation, public or private, that is not looking to hire data scientists?
The fact that I highlighted scientific contributions to our area from three different \emph{industry labs} in the introduction of this academic textbook is no coincidence: hands-on experience is so immediately valuable for success in this new domain that it really is industry that pushes us forward, asking us to deliver graduates who develop smoothly into capable data scientists.
Five years have already passed since the Harvard Business Review put the spotlight on this exciting field%
\footnote{\url{%
https://web.archive.org/web/20120920114903/http://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century/ar/pr%
%http://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century/%
}}
-- \emph{and} predicted a shortage of qualified people! Higher education, however, has not yet reached a definite answer to the question of what the curriculum of the data scientist should be, or even where in the institution it should be taught \cite{DBLP:journals/cacm/BermanRHCDEFMRS18}.
The book you have in front of you is a very welcome contribution towards resolving this situation, which so urgently needs a response. Grounded in the Data Science Master's programme offered at the University of Sk\"ovde, the authors cover the topics that every data scientist should be intimately familiar with. I especially appreciate that the book explores both theory and practice, does not ignore the societal and organisational context the data scientist will work in, and includes ample material to develop practical skills -- exactly what has been missing in curricula in the past. I believe this was Cleveland's main motivation for defining Data Science as a new field, and I expect that students who master this book will not only have acquired a future-proof foundation to follow developments in this fast-paced area of study, but will at the same time be ready to apply their analytic skills to real-life problems.
Now read this book cover to cover, develop your programming skills, and find yourself ready to help shape this bright future that realises the promise of data science!
\bibliographystyle{alpha}
\bibliography{preface}
\end{document}