1234567891011121314151617181920212223242526272829303132333435363738394041 |
- ---
- redirect_from:
- - "/chapters/16/inference-for-regression"
- interact_link: content/chapters/16/Inference_for_Regression.ipynb
- kernel_name: python3
- has_widgets: false
- title: |-
- Inference for Regression
- prev_page:
- url: /chapters/15/6/Numerical_Diagnostics.html
- title: |-
- Numerical Diagnostics
- next_page:
- url: /chapters/16/1/Regression_Model.html
- title: |-
- A Regression Model
- comment: "***PROGRAMMATICALLY GENERATED, DO NOT EDIT. SEE ORIGINAL FILES IN /content***"
- ---
- <div class="jb_cell tag_remove_input">
- <div class="cell border-box-sizing code_cell rendered">
- </div>
- </div>
- <div class="jb_cell">
- <div class="cell border-box-sizing text_cell rendered"><div class="inner_cell">
- <div class="text_cell_render border-box-sizing rendered_html">
- <h3 id="Inference-for-Regression">Inference for Regression<a class="anchor-link" href="#Inference-for-Regression"> </a></h3><p>Thus far, our analysis of the relation between variables has been purely descriptive. We know how to find the best straight line to draw through a scatter plot. The line is the best in the sense that it has the smallest mean squared error of estimation among all straight lines.</p>
- <p>But what if our data were only a sample from a larger population? If in the sample we found a linear relation between the two variables, would the same be true for the population? Would it be exactly the same linear relation? Could we predict the response of a new individual who is not in our sample?</p>
- <p>Such questions of inference and prediction arise if we believe that a scatter plot reflects the underlying relation between the two variables being plotted but does not specify the relation completely. For example, a scatter plot of birth weight versus gestational days shows us the precise relation between the two variables in our sample; but we might wonder whether that relation holds true, or almost true, for all babies in the population from which the sample was drawn, or indeed among babies in general.</p>
- <p>As always, inferential thinking begins with a careful examination of the assumptions about the data. Sets of assumptions are known as <em>models</em>. Sets of assumptions about randomness in roughly linear scatter plots are called <em>regression models</em>.</p>
- </div>
- </div>
- </div>
- </div>
-
|