- ---
- interact_link: content/chapters/01/1/intro.md
- kernel_name:
- has_widgets: false
- title: |-
- Introduction
- prev_page:
- url: /chapters/01/what-is-data-science.html
- title: |-
- Data Science
- next_page:
- url: /chapters/01/1/1/computational-tools.html
- title: |-
- Computational Tools
- comment: "***PROGRAMMATICALLY GENERATED, DO NOT EDIT. SEE ORIGINAL FILES IN /content***"
- ---
- <div class="jb_cell">
- <div class="cell border-box-sizing text_cell rendered"><div class="inner_cell">
- <div class="text_cell_render border-box-sizing rendered_html">
- <h1 id="Chapter-1:-Introduction">Chapter 1: Introduction<a class="anchor-link" href="#Chapter-1:-Introduction"> </a></h1><p>Data are descriptions of the world around us, collected through observation and
- stored on computers. Computers enable us to infer properties of the world from
- these descriptions. Data science is the discipline of drawing conclusions from
- data using computation. There are three core aspects of effective data
- analysis: exploration, prediction, and inference. This text develops a
- consistent approach to all three, introducing statistical ideas and fundamental
- ideas in computer science concurrently. We focus on a minimal set of core
- techniques that can be applied to a vast range of real-world
- applications. A foundation in data science requires not only understanding
- statistical and computational techniques, but also recognizing how they apply
- to real scenarios.</p>
- <p>For whatever aspect of the world we wish to study—whether it's the Earth's
- weather, the world's markets, political polls, or the human mind—data we
- collect typically offer an incomplete description of the subject at hand. A
- central challenge of data science is to draw reliable conclusions from this
- partial information.</p>
- <p>In this endeavor, we will combine two essential tools: computation and
- randomization. For example, we may want to understand climate change trends
- using temperature observations. Computers will allow us to use all available
- information to draw conclusions. Rather than focusing only on the average
- temperature of a region, we will consider the whole range of temperatures
- together to construct a more nuanced analysis. Randomness will allow us to
- consider the many different ways in which incomplete information might be
- completed. Rather than assuming that temperatures vary in a particular way, we
- will learn to use randomness as a way to imagine many possible scenarios that
- are all consistent with the data we observe.</p>
- <p>Applying this approach requires learning to program a computer, and so this
- text interleaves its data science material with a complete introduction to programming that assumes no prior
- knowledge. Readers with programming experience will find that we cover several
- topics in computation that do not appear in a typical introductory computer
- science curriculum. Data science also requires careful reasoning about numerical
- quantities, but this text does not assume any background in mathematics or
- statistics beyond basic algebra. You will find very few equations in this text.
- Instead, techniques are described to readers in the same language in which they
- are described to the computers that execute them—a programming language.</p>
- </div>
- </div>
- </div>
- </div>
-