---
interact_link: content/chapters/01/what-is-data-science.md
kernel_name:
has_widgets: false
title: |-
  Data Science
prev_page:
  url: /chapters/intro.html
  title: |-
    Introduction
next_page:
  url: /chapters/01/1/intro.html
  title: |-
    Introduction
comment: "***PROGRAMMATICALLY GENERATED, DO NOT EDIT. SEE ORIGINAL FILES IN /content***"
---
<div class="jb_cell">
<div class="cell border-box-sizing text_cell rendered"><div class="inner_cell">
<div class="text_cell_render border-box-sizing rendered_html">
<h1 id="What-is-Data-Science?">What is Data Science?<a class="anchor-link" href="#What-is-Data-Science?"> </a></h1><p>Data Science is about drawing useful conclusions from large and diverse data
sets through exploration, prediction, and inference. Exploration involves
identifying patterns in information. Prediction involves using information
we know to make informed guesses about values we wish we knew. Inference
involves quantifying our degree of certainty: will the patterns that we found in our data also appear in new observations? How accurate are our predictions? Our primary
tools for exploration are visualizations and descriptive statistics, for
prediction are machine learning and optimization, and for inference are
statistical tests and models.</p>
<p>Statistics is a central component of data science because statistics
studies how to draw robust conclusions from incomplete information. Computing
is a central component because programming allows us to apply analysis
techniques to the large and diverse data sets that arise in real-world
applications: not just numbers, but text, images, videos, and sensor readings.
Data science is all of these things, but it is more than the sum of its parts
because of the applications. Through understanding a particular domain, data
scientists learn to ask appropriate questions about their data and correctly
interpret the answers provided by their inferential and computational tools.</p>
</div>
</div>
</div>
</div>