statistical-techniques.html 2.3 KB

---
interact_link: content/chapters/01/1/2/statistical-techniques.md
kernel_name:
has_widgets: false
title: |-
  Statistical Techniques
prev_page:
  url: /chapters/01/1/1/computational-tools.html
  title: |-
    Computational Tools
next_page:
  url: /chapters/01/2/why-data-science.html
  title: |-
    Why Data Science?
comment: "***PROGRAMMATICALLY GENERATED, DO NOT EDIT. SEE ORIGINAL FILES IN /content***"
---
<div class="jb_cell">
<div class="cell border-box-sizing text_cell rendered"><div class="inner_cell">
<div class="text_cell_render border-box-sizing rendered_html">
<h2 id="Statistical-Techniques">Statistical Techniques<a class="anchor-link" href="#Statistical-Techniques"> </a></h2><p>The discipline of statistics has long addressed the same fundamental challenge
as data science: how to draw robust conclusions about the world using incomplete
information. One of the most important contributions of statistics is a
consistent and precise vocabulary for describing the relationship between
observations and conclusions. This text continues in the same tradition,
focusing on a set of core inferential problems from statistics: testing
hypotheses, estimating confidence, and predicting unknown quantities.</p>
<p>Data science extends the field of statistics by taking full advantage of
computing, data visualization, machine learning, optimization, and access
to information. The combination of fast computers and the Internet gives
anyone the ability to access and analyze vast datasets: millions of news
articles, full encyclopedias, databases for any domain, and massive
repositories of music, photos, and video.</p>
<p>Applications to real datasets motivate the statistical techniques that we
describe throughout the text. Real data often do not follow regular patterns or
match standard equations. The interesting variation in real data can be lost by
focusing too much attention on simplistic summaries such as average values.
Computers enable a family of methods based on resampling that apply to a wide
range of inference problems, take into account all available information, and
require few assumptions or conditions. Although these techniques have often
been reserved for advanced courses in statistics, their flexibility and
simplicity are a natural fit for data science applications.</p>
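<p>As a minimal sketch of one resampling method, the bootstrap, the Python code
below repeatedly draws samples with replacement from a single observed sample
and uses the spread of the resampled medians to form an approximate interval
for the median. The function name <code>bootstrap_median_interval</code>, its
parameters, and the data values are hypothetical and chosen only for
illustration; this is not necessarily the exact procedure developed in this
text.</p>

```python
import random
import statistics


def bootstrap_median_interval(sample, num_resamples=2000, level=0.95, seed=0):
    """Approximate percentile-bootstrap interval for the median of `sample`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    # Resample n values with replacement, record each resample's median, and sort.
    medians = sorted(
        statistics.median(rng.choices(sample, k=n))
        for _ in range(num_resamples)
    )
    # Drop an equal share of the sorted medians from each tail.
    cut = round((1 - level) / 2 * num_resamples)
    return medians[cut], medians[num_resamples - cut - 1]


# Hypothetical skewed data, invented for illustration only.
observed = [2, 3, 3, 4, 5, 5, 6, 8, 13, 40]
low, high = bootstrap_median_interval(observed)
print(f"Approximate 95% interval for the median: {low} to {high}")
```

<p>The sketch relies mainly on the assumption that the observed sample resembles
the population it was drawn from. It needs no formula for the sampling
distribution of the median, and the same loop works for other statistics by
swapping out <code>statistics.median</code>.</p>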
</div>
</div>
</div>
</div>