{ "cells": [ { "cell_type": "code", "execution_count": 1, "metadata": { "tags": [ "remove_input" ] }, "outputs": [], "source": [ "path_data = '../../../../data/'\n", "\n", "import numpy as np\n", "import pandas as pd\n", "%matplotlib inline\n", "import matplotlib.pyplot as plt\n", "plt.style.use('fivethirtyeight')\n", "\n", "import warnings\n", "warnings.filterwarnings('ignore')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Sampling from a Population ###\n", "\n", "The law of averages also holds when the random sample is drawn from individuals in a large population.\n", "\n", "As an example, we will study a population of flight delay times. The table `united` contains data for United Airlines domestic flights departing from San Francisco in the summer of 2015. The data are made publicly available by the [Bureau of Transportation Statistics](http://www.transtats.bts.gov/Fields.asp?Table_ID=293) in the United States Department of Transportation.\n", "\n", "There are 13,825 rows, each corresponding to a flight. The columns are the date of the flight, the flight number, the destination airport code, and the departure delay time in minutes. Some delay times are negative; those flights left early." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
| | Date | Flight Number | Destination | Delay |
|---|---|---|---|---|
| 0 | 6/1/15 | 73 | HNL | 257 |
| 1 | 6/1/15 | 217 | EWR | 28 |
| 2 | 6/1/15 | 237 | STL | -3 |
| 3 | 6/1/15 | 250 | SAN | 0 |
| 4 | 6/1/15 | 267 | PHL | 64 |
| ... | ... | ... | ... | ... |
| 13820 | 8/31/15 | 1978 | LAS | -4 |
| 13821 | 8/31/15 | 1993 | IAD | 8 |
| 13822 | 8/31/15 | 1994 | ORD | 3 |
| 13823 | 8/31/15 | 2000 | PHX | -1 |
| 13824 | 8/31/15 | 2013 | EWR | -2 |
13825 rows × 4 columns
\n", "