change to notebooks

Robert Blair 4 years ago
parent
commit
3b07ac8f6b
96 changed files with 18148 additions and 6205 deletions
  1. BIN
      content/chapters/.DS_Store
  2. BIN
      content/chapters/02/.DS_Store
  3. 1 1
      content/chapters/03/1/Expressions.ipynb
  4. 1 1
      content/chapters/03/2/1/Growth.ipynb
  5. 5 13
      content/chapters/03/2/Names.ipynb
  6. 1 1
      content/chapters/03/3/Calls.ipynb
  7. 260 35
      content/chapters/03/4/Introduction_to_Tables.ipynb
  8. 1 1
      content/chapters/04/1/Numbers.ipynb
  9. 1 1
      content/chapters/04/2/1/String_Methods.ipynb
  10. 5 13
      content/chapters/04/2/Strings.ipynb
  11. 1 1
      content/chapters/04/3/Comparison.ipynb
  12. 14 3
      content/chapters/04/Data_Types.ipynb
  13. 17 17
      content/chapters/05/1/Arrays.ipynb
  14. 95 18
      content/chapters/05/2/Ranges.ipynb
  15. 28 6
      content/chapters/05/3/More_on_Arrays.ipynb
  16. 28 6
      content/chapters/05/Sequences.ipynb
  17. 9 9
      content/chapters/06/1/Sorting_Rows.ipynb
  18. 33 33
      content/chapters/06/2/Selecting_Rows.ipynb
  19. 2754 2754
      content/chapters/06/3/Example_Trends_in_the_Population_of_the_United_States.ipynb
  20. 59 59
      content/chapters/06/4/Example_Gender_Ratio_in_the_US_Population.ipynb
  21. 135 135
      content/chapters/06/Tables.ipynb
  22. 67 8
      content/chapters/07/1/Visualizing_Categorical_Distributions.ipynb
  23. 433 9
      content/chapters/07/2/Visualizing_Numerical_Distributions.ipynb
  24. 127 5
      content/chapters/07/3/Overlaid_Graphs.ipynb
  25. 1 1
      content/chapters/07/Visualization.ipynb
  26. 556 26
      content/chapters/08/1/Applying_a_Function_to_a_Column.ipynb
  27. 894 30
      content/chapters/08/2/Classifying_by_One_Variable.ipynb
  28. 894 27
      content/chapters/08/3/Cross-Classifying_by_More_than_One_Variable.ipynb
  29. 635 20
      content/chapters/08/4/Joining_Tables_by_Columns.ipynb
  30. 267 8
      content/chapters/08/5/Bike_Sharing_in_the_Bay_Area.ipynb
  31. 130 25
      content/chapters/08/Functions_and_Tables.ipynb
  32. 96 23
      content/chapters/09/1/Conditional_Statements.ipynb
  33. 151 29
      content/chapters/09/2/Iteration.ipynb
  34. 173 17
      content/chapters/09/3/Simulation.ipynb
  35. 385 22
      content/chapters/09/4/Monty_Hall_Problem.ipynb
  36. 76 5
      content/chapters/09/5/Finding_Probabilities.ipynb
  37. 145 24
      content/chapters/09/Randomness.ipynb
  38. 75 5
      content/chapters/10/1/Empirical_Distributions.ipynb
  39. 165 9
      content/chapters/10/2/Sampling_from_a_Population.ipynb
  40. 6 4
      content/chapters/10/3/Empirical_Distribution_of_a_Statistic.ipynb
  41. 965 15
      content/chapters/10/Sampling_and_Empirical_Distributions.ipynb
  42. 38 14
      content/chapters/11/1/Assessing_Models.ipynb
  43. 85 8
      content/chapters/11/2/Multiple_Categories.ipynb
  44. 542 19
      content/chapters/11/3/Decisions_and_Uncertainty.ipynb
  45. 6 4
      content/chapters/11/4/Error_Probabilities.ipynb
  46. 231 9
      content/chapters/12/1/AB_Testing.ipynb
  47. 732 30
      content/chapters/12/2/Deflategate.ipynb
  48. 818 27
      content/chapters/12/3/Causality.ipynb
  49. 2 2
      content/chapters/12/Comparing_Two_Samples.ipynb
  50. 142 10
      content/chapters/13/1/Percentiles.ipynb
  51. 951 11
      content/chapters/13/2/Bootstrap.ipynb
  52. 300 9
      content/chapters/13/3/Confidence_Intervals.ipynb
  53. 9 7
      content/chapters/13/4/Using_Confidence_Intervals.ipynb
  54. 2 2
      content/chapters/13/Estimation.ipynb
  55. 104 25
      content/chapters/14/1/Properties_of_the_Mean.ipynb
  56. 365 22
      content/chapters/14/2/Variability.ipynb
  57. 32 8
      content/chapters/14/3/SD_and_the_Normal_Curve.ipynb
  58. 185 9
      content/chapters/14/4/Central_Limit_Theorem.ipynb
  59. 117 7
      content/chapters/14/5/Variability_of_the_Sample_Mean.ipynb
  60. 5 3
      content/chapters/14/6/Choosing_a_Sample_Size.ipynb
  61. BIN
      content/chapters/15/.DS_Store
  62. 0 0
      content/chapters/15/1/Correlation.ipynb
  63. 129 7
      content/chapters/15/2/Regression_Line.ipynb
  64. BIN
      content/chapters/15/2/regline.png
  65. 65 6
      content/chapters/15/3/Method_of_Least_Squares.ipynb
  66. 217 8
      content/chapters/15/4/Least_Squares_Regression.ipynb
  67. 143 9
      content/chapters/15/5/Visual_Diagnostics.ipynb
  68. 247 25
      content/chapters/15/6/Numerical_Diagnostics.ipynb
  69. 116 6
      content/chapters/15/Prediction.ipynb
  70. 7 5
      content/chapters/16/1/Regression_Model.ipynb
  71. 9 7
      content/chapters/16/2/Inference_for_the_True_Slope.ipynb
  72. 7 5
      content/chapters/16/3/Prediction_Intervals.ipynb
  73. 3 5
      content/chapters/16/Inference_for_Regression.ipynb
  74. 678 15
      content/chapters/17/1/Nearest_Neighbors.ipynb
  75. 154 9
      content/chapters/17/2/Training_and_Testing.ipynb
  76. 435 16
      content/chapters/17/3/Rows_of_Tables.ipynb
  77. 310 8
      content/chapters/17/4/Implementing_the_Classifier.ipynb
  78. 830 19
      content/chapters/17/5/Accuracy_of_the_Classifier.ipynb
  79. BIN
      content/chapters/17/5/benign.png
  80. BIN
      content/chapters/17/5/malignant.png
  81. 32 8
      content/chapters/17/6/Multiple_Regression.ipynb
  82. 1 1
      content/chapters/17/Classification.ipynb
  83. 0 303
      content/chapters/17_A/1/Nearest_Neighbors.ipynb
  84. 0 159
      content/chapters/17_A/2/Training_and_Testing.ipynb
  85. 0 236
      content/chapters/17_A/3/Rows_of_Tables.ipynb
  86. 0 151
      content/chapters/17_A/4/Implementing_the_Classifier.ipynb
  87. 0 321
      content/chapters/17_A/5/Accuracy_of_the_Classifier.ipynb
  88. 0 166
      content/chapters/17_A/6/Multiple_Regression.ipynb
  89. 0 94
      content/chapters/17_A/Classification.ipynb
  90. 219 15
      content/chapters/18/1/More_Likely_than_Not_Binary_Classifier.ipynb
  91. 187 18
      content/chapters/18/2/Making_Decisions.ipynb
  92. 2 2
      content/chapters/18/Updating_Predictions.ipynb
  93. 0 415
      content/chapters/18_A/1/More_Likely_than_Not_Binary_Classifier.ipynb
  94. 0 458
      content/chapters/18_A/2/Making_Decisions.ipynb
  95. 0 58
      content/chapters/18_A/Updating_Predictions.ipynb
  96. 2 5
      content/chapters/intro.md

BIN
content/chapters/.DS_Store


BIN
content/chapters/02/.DS_Store


+ 1 - 1
content/chapters/03/1/Expressions.ipynb

@@ -182,7 +182,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.8.5"
+   "version": "3.6.12"
   }
  },
  "nbformat": 4,

+ 1 - 1
content/chapters/03/2/1/Growth.ipynb

@@ -299,7 +299,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.8.5"
+   "version": "3.6.12"
   }
  },
  "nbformat": 4,

+ 5 - 13
content/chapters/03/2/Names.ipynb

@@ -12,9 +12,7 @@
   {
    "cell_type": "code",
    "execution_count": 1,
-   "metadata": {
-    "collapsed": false
-   },
+   "metadata": {},
    "outputs": [
     {
      "data": {
@@ -43,9 +41,7 @@
   {
    "cell_type": "code",
    "execution_count": 2,
-   "metadata": {
-    "collapsed": false
-   },
+   "metadata": {},
    "outputs": [
     {
      "data": {
@@ -74,9 +70,7 @@
   {
    "cell_type": "code",
    "execution_count": 3,
-   "metadata": {
-    "collapsed": false
-   },
+   "metadata": {},
    "outputs": [
     {
      "data": {
@@ -104,9 +98,7 @@
   {
    "cell_type": "code",
    "execution_count": 4,
-   "metadata": {
-    "collapsed": false
-   },
+   "metadata": {},
    "outputs": [
     {
      "data": {
@@ -147,7 +139,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.6.5"
+   "version": "3.6.12"
   }
  },
  "nbformat": 4,

+ 1 - 1
content/chapters/03/3/Calls.ipynb

@@ -254,7 +254,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.8.5"
+   "version": "3.6.12"
   }
  },
  "nbformat": 4,

+ 260 - 35
content/chapters/03/4/Introduction_to_Tables.ipynb

@@ -220,7 +220,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 13,
+   "execution_count": 4,
    "metadata": {},
    "outputs": [
     {
@@ -235,7 +235,7 @@
        "Name: Flavor, dtype: object"
       ]
      },
-     "execution_count": 13,
+     "execution_count": 4,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -252,7 +252,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 9,
+   "execution_count": 5,
    "metadata": {},
    "outputs": [
     {
@@ -318,7 +318,7 @@
        "5   bubblegum"
       ]
      },
-     "execution_count": 9,
+     "execution_count": 5,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -335,7 +335,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 10,
+   "execution_count": 6,
    "metadata": {},
    "outputs": [
     {
@@ -350,7 +350,7 @@
        "Name: Flavor, dtype: object"
       ]
      },
-     "execution_count": 10,
+     "execution_count": 6,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -368,7 +368,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 6,
+   "execution_count": 7,
    "metadata": {},
    "outputs": [
     {
@@ -448,7 +448,7 @@
        "5   bubblegum         pink   4.75"
       ]
      },
-     "execution_count": 6,
+     "execution_count": 7,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -466,9 +466,84 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 8,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>Flavor</th>\n",
+       "      <th>Price</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>strawberry</td>\n",
+       "      <td>3.55</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>chocolate</td>\n",
+       "      <td>4.75</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>chocolate</td>\n",
+       "      <td>5.25</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>strawberry</td>\n",
+       "      <td>5.25</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4</th>\n",
+       "      <td>chocolate</td>\n",
+       "      <td>5.25</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>5</th>\n",
+       "      <td>bubblegum</td>\n",
+       "      <td>4.75</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "       Flavor  Price\n",
+       "0  strawberry   3.55\n",
+       "1   chocolate   4.75\n",
+       "2   chocolate   5.25\n",
+       "3  strawberry   5.25\n",
+       "4   chocolate   5.25\n",
+       "5   bubblegum   4.75"
+      ]
+     },
+     "execution_count": 8,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "cones[['Flavor', 'Price']]"
    ]
@@ -482,9 +557,84 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 9,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>Flavor</th>\n",
+       "      <th>Price</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>strawberry</td>\n",
+       "      <td>3.55</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>chocolate</td>\n",
+       "      <td>4.75</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>chocolate</td>\n",
+       "      <td>5.25</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>strawberry</td>\n",
+       "      <td>5.25</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4</th>\n",
+       "      <td>chocolate</td>\n",
+       "      <td>5.25</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>5</th>\n",
+       "      <td>bubblegum</td>\n",
+       "      <td>4.75</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "       Flavor  Price\n",
+       "0  strawberry   3.55\n",
+       "1   chocolate   4.75\n",
+       "2   chocolate   5.25\n",
+       "3  strawberry   5.25\n",
+       "4   chocolate   5.25\n",
+       "5   bubblegum   4.75"
+      ]
+     },
+     "execution_count": 9,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "cones.drop(columns=['Color'])"
    ]
@@ -498,9 +648,84 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 10,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>Flavor</th>\n",
+       "      <th>Price</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>strawberry</td>\n",
+       "      <td>3.55</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>chocolate</td>\n",
+       "      <td>4.75</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>chocolate</td>\n",
+       "      <td>5.25</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>strawberry</td>\n",
+       "      <td>5.25</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4</th>\n",
+       "      <td>chocolate</td>\n",
+       "      <td>5.25</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>5</th>\n",
+       "      <td>bubblegum</td>\n",
+       "      <td>4.75</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "       Flavor  Price\n",
+       "0  strawberry   3.55\n",
+       "1   chocolate   4.75\n",
+       "2   chocolate   5.25\n",
+       "3  strawberry   5.25\n",
+       "4   chocolate   5.25\n",
+       "5   bubblegum   4.75"
+      ]
+     },
+     "execution_count": 10,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "no_colors = cones.drop(columns=['Color'])\n",
     "\n",
@@ -532,7 +757,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 14,
+   "execution_count": 11,
    "metadata": {},
    "outputs": [
     {
@@ -612,7 +837,7 @@
        "4   chocolate   dark brown   5.25"
       ]
      },
-     "execution_count": 14,
+     "execution_count": 11,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -632,7 +857,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 15,
+   "execution_count": 12,
    "metadata": {},
    "outputs": [
     {
@@ -712,7 +937,7 @@
        "0  strawberry         pink   3.55"
       ]
      },
-     "execution_count": 15,
+     "execution_count": 12,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -740,7 +965,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 16,
+   "execution_count": 13,
    "metadata": {},
    "outputs": [
     {
@@ -799,7 +1024,7 @@
        "4  chocolate   dark brown   5.25"
       ]
      },
-     "execution_count": 16,
+     "execution_count": 13,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -819,7 +1044,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 17,
+   "execution_count": 14,
    "metadata": {},
    "outputs": [
     {
@@ -859,7 +1084,7 @@
        "Index: []"
       ]
      },
-     "execution_count": 17,
+     "execution_count": 14,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -904,7 +1129,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 18,
+   "execution_count": 15,
    "metadata": {},
    "outputs": [
     {
@@ -1034,7 +1259,7 @@
        "[417 rows x 4 columns]"
       ]
      },
-     "execution_count": 18,
+     "execution_count": 15,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -1052,7 +1277,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 19,
+   "execution_count": 16,
    "metadata": {},
    "outputs": [
     {
@@ -1099,7 +1324,7 @@
        "121  Stephen Curry       PG  Golden State Warriors  11.370786"
       ]
      },
-     "execution_count": 19,
+     "execution_count": 16,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -1117,7 +1342,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 20,
+   "execution_count": 17,
    "metadata": {},
    "outputs": [
     {
@@ -1268,7 +1493,7 @@
        "130   Anderson Varejao       PF  Golden State Warriors   0.289755"
       ]
      },
-     "execution_count": 20,
+     "execution_count": 17,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -1287,7 +1512,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 21,
+   "execution_count": 18,
    "metadata": {},
    "outputs": [
     {
@@ -1438,7 +1663,7 @@
        "130   Anderson Varejao       PF  Golden State Warriors   0.289755"
       ]
      },
-     "execution_count": 21,
+     "execution_count": 18,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -1456,7 +1681,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 22,
+   "execution_count": 19,
    "metadata": {},
    "outputs": [
     {
@@ -1586,7 +1811,7 @@
        "[417 rows x 4 columns]"
       ]
      },
-     "execution_count": 22,
+     "execution_count": 19,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -1606,7 +1831,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 23,
+   "execution_count": 20,
    "metadata": {},
    "outputs": [
     {
@@ -1736,7 +1961,7 @@
        "[417 rows x 4 columns]"
       ]
      },
-     "execution_count": 23,
+     "execution_count": 20,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -1777,7 +2002,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.8.5"
+   "version": "3.6.12"
   }
  },
  "nbformat": 4,

+ 1 - 1
content/chapters/04/1/Numbers.ipynb

@@ -494,7 +494,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.8.5"
+   "version": "3.6.12"
   }
  },
  "nbformat": 4,

+ 1 - 1
content/chapters/04/2/1/String_Methods.ipynb

@@ -139,7 +139,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.8.5"
+   "version": "3.6.12"
   }
  },
  "nbformat": 4,

+ 5 - 13
content/chapters/04/2/Strings.ipynb

@@ -14,9 +14,7 @@
   {
    "cell_type": "code",
    "execution_count": 1,
-   "metadata": {
-    "collapsed": false
-   },
+   "metadata": {},
    "outputs": [
     {
      "data": {
@@ -43,9 +41,7 @@
   {
    "cell_type": "code",
    "execution_count": 2,
-   "metadata": {
-    "collapsed": false
-   },
+   "metadata": {},
    "outputs": [
     {
      "data": {
@@ -72,9 +68,7 @@
   {
    "cell_type": "code",
    "execution_count": 3,
-   "metadata": {
-    "collapsed": false
-   },
+   "metadata": {},
    "outputs": [
     {
      "data": {
@@ -108,9 +102,7 @@
   {
    "cell_type": "code",
    "execution_count": 4,
-   "metadata": {
-    "collapsed": false
-   },
+   "metadata": {},
    "outputs": [
     {
      "data": {
@@ -145,7 +137,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.6.5"
+   "version": "3.6.12"
   }
  },
  "nbformat": 4,

+ 1 - 1
content/chapters/04/3/Comparison.ipynb

@@ -145,7 +145,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.8.5"
+   "version": "3.6.12"
   }
  },
  "nbformat": 4,

+ 14 - 3
content/chapters/04/Data_Types.ipynb

@@ -18,9 +18,20 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 1,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "builtin_function_or_method"
+      ]
+     },
+     "execution_count": 1,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "type(abs)"
    ]
@@ -50,7 +61,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.8.5"
+   "version": "3.6.12"
   }
  },
  "nbformat": 4,

+ 17 - 17
content/chapters/05/1/Arrays.ipynb

@@ -2,7 +2,7 @@
  "cells": [
   {
    "cell_type": "code",
-   "execution_count": 2,
+   "execution_count": 1,
    "metadata": {
     "tags": [
      "remove_input"
@@ -28,7 +28,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 3,
+   "execution_count": 2,
    "metadata": {},
    "outputs": [
     {
@@ -38,7 +38,7 @@
        "       'preposition', 'interjection'], dtype='<U12')"
       ]
      },
-     "execution_count": 3,
+     "execution_count": 2,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -57,7 +57,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 4,
+   "execution_count": 3,
    "metadata": {},
    "outputs": [
     {
@@ -66,7 +66,7 @@
        "array([13.6  , 14.387, 14.585, 15.164])"
       ]
      },
-     "execution_count": 4,
+     "execution_count": 3,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -89,7 +89,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 5,
+   "execution_count": 4,
    "metadata": {},
    "outputs": [
     {
@@ -98,7 +98,7 @@
        "array([56.48  , 57.8966, 58.253 , 59.2952])"
       ]
      },
-     "execution_count": 5,
+     "execution_count": 4,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -123,7 +123,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 6,
+   "execution_count": 5,
    "metadata": {},
    "outputs": [
     {
@@ -132,7 +132,7 @@
        "4"
       ]
      },
-     "execution_count": 6,
+     "execution_count": 5,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -143,7 +143,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 7,
+   "execution_count": 6,
    "metadata": {},
    "outputs": [
     {
@@ -152,7 +152,7 @@
        "57.736000000000004"
       ]
      },
-     "execution_count": 7,
+     "execution_count": 6,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -163,7 +163,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 8,
+   "execution_count": 7,
    "metadata": {},
    "outputs": [
     {
@@ -172,7 +172,7 @@
        "14.434000000000001"
       ]
      },
-     "execution_count": 8,
+     "execution_count": 7,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -191,7 +191,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 9,
+   "execution_count": 8,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -207,7 +207,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 10,
+   "execution_count": 9,
    "metadata": {},
    "outputs": [
     {
@@ -216,7 +216,7 @@
        "array([0.787, 0.198, 0.579])"
       ]
      },
-     "execution_count": 10,
+     "execution_count": 9,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -302,7 +302,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.8.5"
+   "version": "3.6.12"
   }
  },
  "nbformat": 4,

+ 95 - 18
content/chapters/05/2/Ranges.ipynb

@@ -2,7 +2,7 @@
  "cells": [
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 1,
    "metadata": {
     "tags": [
      "remove_input"
@@ -35,9 +35,20 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 2,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "array([0, 1, 2, 3, 4])"
+      ]
+     },
+     "execution_count": 2,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "np.arange(5)"
    ]
@@ -59,9 +70,20 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 3,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "array([3, 4, 5, 6, 7, 8])"
+      ]
+     },
+     "execution_count": 3,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "np.arange(3, 9)"
    ]
@@ -76,9 +98,20 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 4,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "array([ 3,  8, 13, 18, 23, 28])"
+      ]
+     },
+     "execution_count": 4,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "np.arange(3, 30, 5)"
    ]
@@ -94,9 +127,20 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 5,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "array([ 1.5,  1. ,  0.5,  0. , -0.5, -1. , -1.5])"
+      ]
+     },
+     "execution_count": 5,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "np.arange(1.5, -2, -0.5)"
    ]
@@ -140,9 +184,20 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 6,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "array([ 1,  5,  9, 13, 17])"
+      ]
+     },
+     "execution_count": 6,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "by_four_to_20 = np.arange(1, 20, 4)\n",
     "by_four_to_20"
@@ -157,9 +212,20 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 7,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "array([   1,    5,    9, ..., 9989, 9993, 9997])"
+      ]
+     },
+     "execution_count": 7,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "positive_term_denominators = np.arange(1, 10000, 4)\n",
     "positive_term_denominators"
@@ -174,7 +240,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 8,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -190,7 +256,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 9,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -206,9 +272,20 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 10,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "3.1413926535917955"
+      ]
+     },
+     "execution_count": 10,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "4 * ( sum(positive_terms) - sum(negative_terms) )"
    ]
@@ -247,7 +324,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.8.5"
+   "version": "3.6.12"
   }
  },
  "nbformat": 4,

+ 28 - 6
content/chapters/05/3/More_on_Arrays.ipynb

@@ -119,9 +119,20 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 5,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "array([11.472, 12.016, 11.711, 11.436])"
+      ]
+     },
+     "execution_count": 5,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "highs - lows"
    ]
@@ -187,7 +198,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 6,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -205,9 +216,20 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 7,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "3.1415910827951143"
+      ]
+     },
+     "execution_count": 7,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "2 * np.prod(even/one_below_even) * np.prod(even/one_above_even)"
    ]
@@ -246,7 +268,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.8.5"
+   "version": "3.6.12"
   }
  },
  "nbformat": 4,

+ 28 - 6
content/chapters/05/Sequences.ipynb

@@ -2,7 +2,7 @@
  "cells": [
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 1,
    "metadata": {
     "scrolled": true,
     "tags": [
@@ -29,11 +29,22 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 2,
    "metadata": {
     "scrolled": true
    },
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "array([13.6  , 14.387, 14.585, 15.164])"
+      ]
+     },
+     "execution_count": 2,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "baseline_high = 14.48\n",
     "highs = np.array([baseline_high - 0.880, baseline_high - 0.093,\n",
@@ -50,11 +61,22 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 3,
    "metadata": {
     "scrolled": true
    },
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "14.434000000000001"
+      ]
+     },
+     "execution_count": 3,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "sum(highs)/len(highs)"
    ]
@@ -100,7 +122,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.8.5"
+   "version": "3.6.12"
   }
  },
  "nbformat": 4,

+ 9 - 9
content/chapters/06/1/Sorting_Rows.ipynb

@@ -271,7 +271,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 5,
+   "execution_count": 4,
    "metadata": {},
    "outputs": [
     {
@@ -350,7 +350,7 @@
        "1        Al Horford        C           Atlanta Hawks       12.000000"
       ]
      },
-     "execution_count": 5,
+     "execution_count": 4,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -372,7 +372,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 11,
+   "execution_count": 5,
    "metadata": {},
    "outputs": [
     {
@@ -502,7 +502,7 @@
        "[417 rows x 4 columns]"
       ]
      },
-     "execution_count": 11,
+     "execution_count": 5,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -525,7 +525,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 13,
+   "execution_count": 6,
    "metadata": {},
    "outputs": [
     {
@@ -655,7 +655,7 @@
        "[417 rows x 4 columns]"
       ]
      },
-     "execution_count": 13,
+     "execution_count": 6,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -682,7 +682,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 14,
+   "execution_count": 7,
    "metadata": {},
    "outputs": [
     {
@@ -691,7 +691,7 @@
      "text": [
       "Help on method sort_values in module pandas.core.frame:\n",
       "\n",
-      "sort_values(by, axis=0, ascending=True, inplace=False, kind='quicksort', na_position='last', ignore_index=False, key: Union[Callable[[ForwardRef('Series')], Union[ForwardRef('Series'), ~AnyArrayLike]], NoneType] = None) method of pandas.core.frame.DataFrame instance\n",
+      "sort_values(by, axis=0, ascending=True, inplace=False, kind='quicksort', na_position='last', ignore_index=False, key:Union[Callable[[_ForwardRef('Series')], Union[_ForwardRef('Series'), ~AnyArrayLike]], NoneType]=None) method of pandas.core.frame.DataFrame instance\n",
       "    Sort by the values along either axis.\n",
       "    \n",
       "    Parameters\n",
@@ -869,7 +869,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.8.5"
+   "version": "3.6.12"
   }
  },
  "nbformat": 4,

+ 33 - 33
content/chapters/06/2/Selecting_Rows.ipynb

@@ -2,7 +2,7 @@
  "cells": [
   {
    "cell_type": "code",
-   "execution_count": 4,
+   "execution_count": 1,
    "metadata": {
     "tags": [
      "remove_input"
@@ -18,7 +18,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 5,
+   "execution_count": 2,
    "metadata": {
     "tags": [
      "remove_input"
@@ -51,7 +51,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 6,
+   "execution_count": 3,
    "metadata": {},
    "outputs": [
     {
@@ -181,7 +181,7 @@
        "[417 rows x 4 columns]"
       ]
      },
-     "execution_count": 6,
+     "execution_count": 3,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -192,7 +192,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 7,
+   "execution_count": 4,
    "metadata": {},
    "outputs": [
     {
@@ -205,7 +205,7 @@
        "Name: 0, dtype: object"
       ]
      },
-     "execution_count": 7,
+     "execution_count": 4,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -216,7 +216,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 8,
+   "execution_count": 5,
    "metadata": {},
    "outputs": [
     {
@@ -263,7 +263,7 @@
        "0  Paul Millsap       PF  Atlanta Hawks  18.671659"
       ]
      },
-     "execution_count": 8,
+     "execution_count": 5,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -283,7 +283,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 9,
+   "execution_count": 6,
    "metadata": {},
    "outputs": [
     {
@@ -346,7 +346,7 @@
        "5  Thabo Sefolosha       SF  Atlanta Hawks  4.000000"
       ]
      },
-     "execution_count": 9,
+     "execution_count": 6,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -364,7 +364,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 10,
+   "execution_count": 7,
    "metadata": {},
    "outputs": [
     {
@@ -443,7 +443,7 @@
        "131    Dwight Howard        C      Houston Rockets  22.359364"
       ]
      },
-     "execution_count": 10,
+     "execution_count": 7,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -470,7 +470,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 12,
+   "execution_count": 8,
    "metadata": {},
    "outputs": [
     {
@@ -479,7 +479,7 @@
        "(array([  0,   1,  29, ..., 400, 401, 402]),)"
       ]
      },
-     "execution_count": 12,
+     "execution_count": 8,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -504,7 +504,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 13,
+   "execution_count": 9,
    "metadata": {},
    "outputs": [
     {
@@ -634,7 +634,7 @@
        "[69 rows x 4 columns]"
       ]
      },
-     "execution_count": 13,
+     "execution_count": 9,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -652,7 +652,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 14,
+   "execution_count": 10,
    "metadata": {},
    "outputs": [
     {
@@ -699,7 +699,7 @@
        "121  Stephen Curry       PG  Golden State Warriors  11.370786"
       ]
      },
-     "execution_count": 14,
+     "execution_count": 10,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -719,7 +719,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 15,
+   "execution_count": 11,
    "metadata": {},
    "outputs": [
     {
@@ -870,7 +870,7 @@
        "130   Anderson Varejao       PF  Golden State Warriors   0.289755"
       ]
      },
-     "execution_count": 15,
+     "execution_count": 11,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -896,7 +896,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 16,
+   "execution_count": 12,
    "metadata": {},
    "outputs": [
     {
@@ -975,7 +975,7 @@
        "400          John Wall       PG     Washington Wizards  15.851950"
       ]
      },
-     "execution_count": 16,
+     "execution_count": 12,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -996,7 +996,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 17,
+   "execution_count": 13,
    "metadata": {},
    "outputs": [
     {
@@ -1075,7 +1075,7 @@
        "368   DeMar DeRozan       SG     Toronto Raptors  10.050000"
       ]
      },
-     "execution_count": 17,
+     "execution_count": 13,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -1093,7 +1093,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 18,
+   "execution_count": 14,
    "metadata": {},
    "outputs": [
     {
@@ -1134,7 +1134,7 @@
        "Index: []"
       ]
      },
-     "execution_count": 18,
+     "execution_count": 14,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -1159,7 +1159,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 19,
+   "execution_count": 15,
    "metadata": {},
    "outputs": [
     {
@@ -1310,7 +1310,7 @@
        "130   Anderson Varejao       PF  Golden State Warriors   0.289755"
       ]
      },
-     "execution_count": 19,
+     "execution_count": 15,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -1328,7 +1328,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 20,
+   "execution_count": 16,
    "metadata": {},
    "outputs": [
     {
@@ -1458,7 +1458,7 @@
        "[181 rows x 4 columns]"
       ]
      },
-     "execution_count": 20,
+     "execution_count": 16,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -1476,7 +1476,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 21,
+   "execution_count": 17,
    "metadata": {},
    "outputs": [
     {
@@ -1579,7 +1579,7 @@
        "268     Kevin Durant       SF  Oklahoma City Thunder  20.158622"
       ]
      },
-     "execution_count": 21,
+     "execution_count": 17,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -1614,7 +1614,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.8.5"
+   "version": "3.6.12"
   }
  },
  "nbformat": 4,

File diff suppressed because it is too large
+ 2754 - 2754
content/chapters/06/3/Example_Trends_in_the_Population_of_the_United_States.ipynb


File diff suppressed because it is too large
+ 59 - 59
content/chapters/06/4/Example_Gender_Ratio_in_the_US_Population.ipynb


+ 135 - 135
content/chapters/06/Tables.ipynb

@@ -2,7 +2,7 @@
  "cells": [
   {
    "cell_type": "code",
-   "execution_count": 3,
+   "execution_count": 1,
    "metadata": {
     "tags": [
      "remove_input"
@@ -32,7 +32,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 4,
+   "execution_count": 2,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -48,7 +48,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 9,
+   "execution_count": 3,
    "metadata": {},
    "outputs": [
     {
@@ -121,7 +121,7 @@
        "4  0  0  0"
       ]
      },
-     "execution_count": 9,
+     "execution_count": 3,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -146,7 +146,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 10,
+   "execution_count": 4,
    "metadata": {},
    "outputs": [
     {
@@ -197,7 +197,7 @@
        "2                 5"
       ]
      },
-     "execution_count": 10,
+     "execution_count": 4,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -215,7 +215,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 11,
+   "execution_count": 5,
    "metadata": {},
    "outputs": [
     {
@@ -270,7 +270,7 @@
        "2                 5       rose"
       ]
      },
-     "execution_count": 11,
+     "execution_count": 5,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -291,7 +291,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 12,
+   "execution_count": 6,
    "metadata": {},
    "outputs": [
     {
@@ -325,19 +325,19 @@
        "      <th>0</th>\n",
        "      <td>8</td>\n",
        "      <td>lotus</td>\n",
-       "      <td>{yellow, red, pink}</td>\n",
+       "      <td>{pink, red, yellow}</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>1</th>\n",
        "      <td>34</td>\n",
        "      <td>sunflower</td>\n",
-       "      <td>{yellow, red, pink}</td>\n",
+       "      <td>{pink, red, yellow}</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2</th>\n",
        "      <td>5</td>\n",
        "      <td>rose</td>\n",
-       "      <td>{yellow, red, pink}</td>\n",
+       "      <td>{pink, red, yellow}</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
@@ -345,12 +345,12 @@
       ],
       "text/plain": [
        "   Number of petals       Name                Color\n",
-       "0                 8      lotus  {yellow, red, pink}\n",
-       "1                34  sunflower  {yellow, red, pink}\n",
-       "2                 5       rose  {yellow, red, pink}"
+       "0                 8      lotus  {pink, red, yellow}\n",
+       "1                34  sunflower  {pink, red, yellow}\n",
+       "2                 5       rose  {pink, red, yellow}"
       ]
      },
-     "execution_count": 12,
+     "execution_count": 6,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -375,7 +375,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 13,
+   "execution_count": 7,
    "metadata": {},
    "outputs": [
     {
@@ -409,19 +409,19 @@
        "      <th>0</th>\n",
        "      <td>8</td>\n",
        "      <td>lotus</td>\n",
-       "      <td>{yellow, red, pink}</td>\n",
+       "      <td>{pink, red, yellow}</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>1</th>\n",
        "      <td>34</td>\n",
        "      <td>sunflower</td>\n",
-       "      <td>{yellow, red, pink}</td>\n",
+       "      <td>{pink, red, yellow}</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2</th>\n",
        "      <td>5</td>\n",
        "      <td>rose</td>\n",
-       "      <td>{yellow, red, pink}</td>\n",
+       "      <td>{pink, red, yellow}</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
@@ -429,12 +429,12 @@
       ],
       "text/plain": [
        "   Number of petals       Name                Color\n",
-       "0                 8      lotus  {yellow, red, pink}\n",
-       "1                34  sunflower  {yellow, red, pink}\n",
-       "2                 5       rose  {yellow, red, pink}"
+       "0                 8      lotus  {pink, red, yellow}\n",
+       "1                34  sunflower  {pink, red, yellow}\n",
+       "2                 5       rose  {pink, red, yellow}"
       ]
      },
-     "execution_count": 13,
+     "execution_count": 7,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -454,7 +454,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 14,
+   "execution_count": 8,
    "metadata": {},
    "outputs": [
     {
@@ -509,7 +509,7 @@
        "2                 5       rose"
       ]
      },
-     "execution_count": 14,
+     "execution_count": 8,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -533,7 +533,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 15,
+   "execution_count": 9,
    "metadata": {},
    "outputs": [
     {
@@ -645,7 +645,7 @@
        "7       26.8      54.3    Moiodexno   Retreat      12000"
       ]
      },
-     "execution_count": 15,
+     "execution_count": 9,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -675,7 +675,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 16,
+   "execution_count": 10,
    "metadata": {},
    "outputs": [
     {
@@ -684,7 +684,7 @@
        "(8, 5)"
       ]
      },
-     "execution_count": 16,
+     "execution_count": 10,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -695,7 +695,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 17,
+   "execution_count": 11,
    "metadata": {},
    "outputs": [
     {
@@ -704,7 +704,7 @@
        "5"
       ]
      },
-     "execution_count": 17,
+     "execution_count": 11,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -715,7 +715,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 18,
+   "execution_count": 12,
    "metadata": {},
    "outputs": [
     {
@@ -724,7 +724,7 @@
        "8"
       ]
      },
-     "execution_count": 18,
+     "execution_count": 12,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -744,7 +744,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 19,
+   "execution_count": 13,
    "metadata": {},
    "outputs": [
     {
@@ -753,7 +753,7 @@
        "5"
       ]
      },
-     "execution_count": 19,
+     "execution_count": 13,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -764,7 +764,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 20,
+   "execution_count": 14,
    "metadata": {},
    "outputs": [
     {
@@ -773,7 +773,7 @@
        "8"
       ]
      },
-     "execution_count": 20,
+     "execution_count": 14,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -792,7 +792,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 21,
+   "execution_count": 15,
    "metadata": {},
    "outputs": [
     {
@@ -801,7 +801,7 @@
        "Index(['Longitude', 'Latitude', 'City', 'Direction', 'Survivors'], dtype='object')"
       ]
      },
-     "execution_count": 21,
+     "execution_count": 15,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -819,7 +819,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 22,
+   "execution_count": 16,
    "metadata": {},
    "outputs": [
     {
@@ -931,7 +931,7 @@
        "7       26.8      54.3    Moiodexno   Retreat      12000"
       ]
      },
-     "execution_count": 22,
+     "execution_count": 16,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -949,7 +949,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 23,
+   "execution_count": 17,
    "metadata": {},
    "outputs": [
     {
@@ -1061,7 +1061,7 @@
        "7       26.8      54.3    Moiodexno   Retreat      12000"
       ]
      },
-     "execution_count": 23,
+     "execution_count": 17,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -1079,7 +1079,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 24,
+   "execution_count": 18,
    "metadata": {},
    "outputs": [
     {
@@ -1191,7 +1191,7 @@
        "7       26.8      54.3    Moiodexno   Retreat      12000"
       ]
      },
-     "execution_count": 24,
+     "execution_count": 18,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -1211,7 +1211,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 25,
+   "execution_count": 19,
    "metadata": {
     "scrolled": true
    },
@@ -1230,7 +1230,7 @@
        "Name: Survivors, dtype: int64"
       ]
      },
-     "execution_count": 25,
+     "execution_count": 19,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -1250,7 +1250,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 26,
+   "execution_count": 20,
    "metadata": {},
    "outputs": [
     {
@@ -1259,7 +1259,7 @@
        "pandas.core.frame.DataFrame"
       ]
      },
-     "execution_count": 26,
+     "execution_count": 20,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -1277,7 +1277,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 27,
+   "execution_count": 21,
    "metadata": {},
    "outputs": [
     {
@@ -1353,7 +1353,7 @@
        "7      12000"
       ]
      },
-     "execution_count": 27,
+     "execution_count": 21,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -1364,7 +1364,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 28,
+   "execution_count": 22,
    "metadata": {},
    "outputs": [
     {
@@ -1373,7 +1373,7 @@
        "pandas.core.frame.DataFrame"
       ]
      },
-     "execution_count": 28,
+     "execution_count": 22,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -1397,7 +1397,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 29,
+   "execution_count": 23,
    "metadata": {},
    "outputs": [
     {
@@ -1414,7 +1414,7 @@
        "Name: Survivors, dtype: int64"
       ]
      },
-     "execution_count": 29,
+     "execution_count": 23,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -1432,7 +1432,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 30,
+   "execution_count": 24,
    "metadata": {},
    "outputs": [
     {
@@ -1441,7 +1441,7 @@
        "145000"
       ]
      },
-     "execution_count": 30,
+     "execution_count": 24,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -1452,7 +1452,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 31,
+   "execution_count": 25,
    "metadata": {},
    "outputs": [
     {
@@ -1461,7 +1461,7 @@
        "24000"
       ]
      },
-     "execution_count": 31,
+     "execution_count": 25,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -1481,7 +1481,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 32,
+   "execution_count": 26,
    "metadata": {},
    "outputs": [
     {
@@ -1490,7 +1490,7 @@
        "24000"
       ]
      },
-     "execution_count": 32,
+     "execution_count": 26,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -1509,7 +1509,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 33,
+   "execution_count": 27,
    "metadata": {},
    "outputs": [
     {
@@ -1630,7 +1630,7 @@
        "7       26.8      54.3    Moiodexno   Retreat      12000           0.082759"
       ]
      },
-     "execution_count": 33,
+     "execution_count": 27,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -1656,93 +1656,93 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 34,
+   "execution_count": 28,
    "metadata": {},
    "outputs": [
     {
      "data": {
       "text/html": [
        "<style  type=\"text/css\" >\n",
-       "</style><table id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122\" ><thead>    <tr>        <th class=\"blank level0\" ></th>        <th class=\"col_heading level0 col0\" >Longitude</th>        <th class=\"col_heading level0 col1\" >Latitude</th>        <th class=\"col_heading level0 col2\" >City Name</th>        <th class=\"col_heading level0 col3\" >Direction</th>        <th class=\"col_heading level0 col4\" >Survivors</th>        <th class=\"col_heading level0 col5\" >Percent Surviving</th>    </tr></thead><tbody>\n",
+       "</style><table id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23\" ><thead>    <tr>        <th class=\"blank level0\" ></th>        <th class=\"col_heading level0 col0\" >Longitude</th>        <th class=\"col_heading level0 col1\" >Latitude</th>        <th class=\"col_heading level0 col2\" >City Name</th>        <th class=\"col_heading level0 col3\" >Direction</th>        <th class=\"col_heading level0 col4\" >Survivors</th>        <th class=\"col_heading level0 col5\" >Percent Surviving</th>    </tr></thead><tbody>\n",
        "                <tr>\n",
-       "                        <th id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122level0_row0\" class=\"row_heading level0 row0\" >0</th>\n",
-       "                        <td id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122row0_col0\" class=\"data row0 col0\" >32.000000</td>\n",
-       "                        <td id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122row0_col1\" class=\"data row0 col1\" >54.800000</td>\n",
-       "                        <td id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122row0_col2\" class=\"data row0 col2\" >Smolensk</td>\n",
-       "                        <td id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122row0_col3\" class=\"data row0 col3\" >Advance</td>\n",
-       "                        <td id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122row0_col4\" class=\"data row0 col4\" >145000</td>\n",
-       "                        <td id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122row0_col5\" class=\"data row0 col5\" >100.00%</td>\n",
+       "                        <th id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23level0_row0\" class=\"row_heading level0 row0\" >0</th>\n",
+       "                        <td id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23row0_col0\" class=\"data row0 col0\" >32.000000</td>\n",
+       "                        <td id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23row0_col1\" class=\"data row0 col1\" >54.800000</td>\n",
+       "                        <td id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23row0_col2\" class=\"data row0 col2\" >Smolensk</td>\n",
+       "                        <td id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23row0_col3\" class=\"data row0 col3\" >Advance</td>\n",
+       "                        <td id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23row0_col4\" class=\"data row0 col4\" >145000</td>\n",
+       "                        <td id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23row0_col5\" class=\"data row0 col5\" >100.00%</td>\n",
        "            </tr>\n",
        "            <tr>\n",
-       "                        <th id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122level0_row1\" class=\"row_heading level0 row1\" >1</th>\n",
-       "                        <td id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122row1_col0\" class=\"data row1 col0\" >33.200000</td>\n",
-       "                        <td id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122row1_col1\" class=\"data row1 col1\" >54.900000</td>\n",
-       "                        <td id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122row1_col2\" class=\"data row1 col2\" >Dorogobouge</td>\n",
-       "                        <td id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122row1_col3\" class=\"data row1 col3\" >Advance</td>\n",
-       "                        <td id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122row1_col4\" class=\"data row1 col4\" >140000</td>\n",
-       "                        <td id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122row1_col5\" class=\"data row1 col5\" >96.55%</td>\n",
+       "                        <th id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23level0_row1\" class=\"row_heading level0 row1\" >1</th>\n",
+       "                        <td id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23row1_col0\" class=\"data row1 col0\" >33.200000</td>\n",
+       "                        <td id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23row1_col1\" class=\"data row1 col1\" >54.900000</td>\n",
+       "                        <td id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23row1_col2\" class=\"data row1 col2\" >Dorogobouge</td>\n",
+       "                        <td id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23row1_col3\" class=\"data row1 col3\" >Advance</td>\n",
+       "                        <td id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23row1_col4\" class=\"data row1 col4\" >140000</td>\n",
+       "                        <td id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23row1_col5\" class=\"data row1 col5\" >96.55%</td>\n",
        "            </tr>\n",
        "            <tr>\n",
-       "                        <th id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122level0_row2\" class=\"row_heading level0 row2\" >2</th>\n",
-       "                        <td id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122row2_col0\" class=\"data row2 col0\" >34.400000</td>\n",
-       "                        <td id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122row2_col1\" class=\"data row2 col1\" >55.500000</td>\n",
-       "                        <td id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122row2_col2\" class=\"data row2 col2\" >Chjat</td>\n",
-       "                        <td id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122row2_col3\" class=\"data row2 col3\" >Advance</td>\n",
-       "                        <td id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122row2_col4\" class=\"data row2 col4\" >127100</td>\n",
-       "                        <td id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122row2_col5\" class=\"data row2 col5\" >87.66%</td>\n",
+       "                        <th id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23level0_row2\" class=\"row_heading level0 row2\" >2</th>\n",
+       "                        <td id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23row2_col0\" class=\"data row2 col0\" >34.400000</td>\n",
+       "                        <td id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23row2_col1\" class=\"data row2 col1\" >55.500000</td>\n",
+       "                        <td id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23row2_col2\" class=\"data row2 col2\" >Chjat</td>\n",
+       "                        <td id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23row2_col3\" class=\"data row2 col3\" >Advance</td>\n",
+       "                        <td id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23row2_col4\" class=\"data row2 col4\" >127100</td>\n",
+       "                        <td id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23row2_col5\" class=\"data row2 col5\" >87.66%</td>\n",
        "            </tr>\n",
        "            <tr>\n",
-       "                        <th id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122level0_row3\" class=\"row_heading level0 row3\" >3</th>\n",
-       "                        <td id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122row3_col0\" class=\"data row3 col0\" >37.600000</td>\n",
-       "                        <td id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122row3_col1\" class=\"data row3 col1\" >55.800000</td>\n",
-       "                        <td id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122row3_col2\" class=\"data row3 col2\" >Moscou</td>\n",
-       "                        <td id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122row3_col3\" class=\"data row3 col3\" >Advance</td>\n",
-       "                        <td id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122row3_col4\" class=\"data row3 col4\" >100000</td>\n",
-       "                        <td id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122row3_col5\" class=\"data row3 col5\" >68.97%</td>\n",
+       "                        <th id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23level0_row3\" class=\"row_heading level0 row3\" >3</th>\n",
+       "                        <td id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23row3_col0\" class=\"data row3 col0\" >37.600000</td>\n",
+       "                        <td id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23row3_col1\" class=\"data row3 col1\" >55.800000</td>\n",
+       "                        <td id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23row3_col2\" class=\"data row3 col2\" >Moscou</td>\n",
+       "                        <td id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23row3_col3\" class=\"data row3 col3\" >Advance</td>\n",
+       "                        <td id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23row3_col4\" class=\"data row3 col4\" >100000</td>\n",
+       "                        <td id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23row3_col5\" class=\"data row3 col5\" >68.97%</td>\n",
        "            </tr>\n",
        "            <tr>\n",
-       "                        <th id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122level0_row4\" class=\"row_heading level0 row4\" >4</th>\n",
-       "                        <td id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122row4_col0\" class=\"data row4 col0\" >34.300000</td>\n",
-       "                        <td id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122row4_col1\" class=\"data row4 col1\" >55.200000</td>\n",
-       "                        <td id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122row4_col2\" class=\"data row4 col2\" >Wixma</td>\n",
-       "                        <td id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122row4_col3\" class=\"data row4 col3\" >Retreat</td>\n",
-       "                        <td id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122row4_col4\" class=\"data row4 col4\" >55000</td>\n",
-       "                        <td id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122row4_col5\" class=\"data row4 col5\" >37.93%</td>\n",
+       "                        <th id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23level0_row4\" class=\"row_heading level0 row4\" >4</th>\n",
+       "                        <td id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23row4_col0\" class=\"data row4 col0\" >34.300000</td>\n",
+       "                        <td id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23row4_col1\" class=\"data row4 col1\" >55.200000</td>\n",
+       "                        <td id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23row4_col2\" class=\"data row4 col2\" >Wixma</td>\n",
+       "                        <td id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23row4_col3\" class=\"data row4 col3\" >Retreat</td>\n",
+       "                        <td id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23row4_col4\" class=\"data row4 col4\" >55000</td>\n",
+       "                        <td id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23row4_col5\" class=\"data row4 col5\" >37.93%</td>\n",
        "            </tr>\n",
        "            <tr>\n",
-       "                        <th id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122level0_row5\" class=\"row_heading level0 row5\" >5</th>\n",
-       "                        <td id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122row5_col0\" class=\"data row5 col0\" >32.000000</td>\n",
-       "                        <td id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122row5_col1\" class=\"data row5 col1\" >54.600000</td>\n",
-       "                        <td id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122row5_col2\" class=\"data row5 col2\" >Smolensk</td>\n",
-       "                        <td id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122row5_col3\" class=\"data row5 col3\" >Retreat</td>\n",
-       "                        <td id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122row5_col4\" class=\"data row5 col4\" >24000</td>\n",
-       "                        <td id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122row5_col5\" class=\"data row5 col5\" >16.55%</td>\n",
+       "                        <th id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23level0_row5\" class=\"row_heading level0 row5\" >5</th>\n",
+       "                        <td id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23row5_col0\" class=\"data row5 col0\" >32.000000</td>\n",
+       "                        <td id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23row5_col1\" class=\"data row5 col1\" >54.600000</td>\n",
+       "                        <td id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23row5_col2\" class=\"data row5 col2\" >Smolensk</td>\n",
+       "                        <td id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23row5_col3\" class=\"data row5 col3\" >Retreat</td>\n",
+       "                        <td id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23row5_col4\" class=\"data row5 col4\" >24000</td>\n",
+       "                        <td id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23row5_col5\" class=\"data row5 col5\" >16.55%</td>\n",
        "            </tr>\n",
        "            <tr>\n",
-       "                        <th id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122level0_row6\" class=\"row_heading level0 row6\" >6</th>\n",
-       "                        <td id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122row6_col0\" class=\"data row6 col0\" >30.400000</td>\n",
-       "                        <td id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122row6_col1\" class=\"data row6 col1\" >54.400000</td>\n",
-       "                        <td id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122row6_col2\" class=\"data row6 col2\" >Orscha</td>\n",
-       "                        <td id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122row6_col3\" class=\"data row6 col3\" >Retreat</td>\n",
-       "                        <td id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122row6_col4\" class=\"data row6 col4\" >20000</td>\n",
-       "                        <td id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122row6_col5\" class=\"data row6 col5\" >13.79%</td>\n",
+       "                        <th id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23level0_row6\" class=\"row_heading level0 row6\" >6</th>\n",
+       "                        <td id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23row6_col0\" class=\"data row6 col0\" >30.400000</td>\n",
+       "                        <td id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23row6_col1\" class=\"data row6 col1\" >54.400000</td>\n",
+       "                        <td id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23row6_col2\" class=\"data row6 col2\" >Orscha</td>\n",
+       "                        <td id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23row6_col3\" class=\"data row6 col3\" >Retreat</td>\n",
+       "                        <td id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23row6_col4\" class=\"data row6 col4\" >20000</td>\n",
+       "                        <td id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23row6_col5\" class=\"data row6 col5\" >13.79%</td>\n",
        "            </tr>\n",
        "            <tr>\n",
-       "                        <th id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122level0_row7\" class=\"row_heading level0 row7\" >7</th>\n",
-       "                        <td id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122row7_col0\" class=\"data row7 col0\" >26.800000</td>\n",
-       "                        <td id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122row7_col1\" class=\"data row7 col1\" >54.300000</td>\n",
-       "                        <td id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122row7_col2\" class=\"data row7 col2\" >Moiodexno</td>\n",
-       "                        <td id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122row7_col3\" class=\"data row7 col3\" >Retreat</td>\n",
-       "                        <td id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122row7_col4\" class=\"data row7 col4\" >12000</td>\n",
-       "                        <td id=\"T_5192bdfc_4f7f_11eb_bd4f_acde48001122row7_col5\" class=\"data row7 col5\" >8.28%</td>\n",
+       "                        <th id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23level0_row7\" class=\"row_heading level0 row7\" >7</th>\n",
+       "                        <td id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23row7_col0\" class=\"data row7 col0\" >26.800000</td>\n",
+       "                        <td id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23row7_col1\" class=\"data row7 col1\" >54.300000</td>\n",
+       "                        <td id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23row7_col2\" class=\"data row7 col2\" >Moiodexno</td>\n",
+       "                        <td id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23row7_col3\" class=\"data row7 col3\" >Retreat</td>\n",
+       "                        <td id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23row7_col4\" class=\"data row7 col4\" >12000</td>\n",
+       "                        <td id=\"T_175380e4_5106_11eb_a7e3_685b35b96a23row7_col5\" class=\"data row7 col5\" >8.28%</td>\n",
        "            </tr>\n",
        "    </tbody></table>"
       ],
       "text/plain": [
-       "<pandas.io.formats.style.Styler at 0x7fe4933694f0>"
+       "<pandas.io.formats.style.Styler at 0x7fe11dc64358>"
       ]
      },
-     "execution_count": 34,
+     "execution_count": 28,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -1770,7 +1770,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 35,
+   "execution_count": 29,
    "metadata": {},
    "outputs": [
     {
@@ -1855,7 +1855,7 @@
        "7       26.8      54.3"
       ]
      },
-     "execution_count": 35,
+     "execution_count": 29,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -1875,7 +1875,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 36,
+   "execution_count": 30,
    "metadata": {},
    "outputs": [
     {
@@ -1960,7 +1960,7 @@
        "7       26.8      54.3"
       ]
      },
-     "execution_count": 36,
+     "execution_count": 30,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -1978,7 +1978,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 37,
+   "execution_count": 31,
    "metadata": {},
    "outputs": [
     {
@@ -1995,7 +1995,7 @@
        "Name: Survivors, dtype: int64"
       ]
      },
-     "execution_count": 37,
+     "execution_count": 31,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -2013,7 +2013,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 38,
+   "execution_count": 32,
    "metadata": {},
    "outputs": [
     {
@@ -2107,7 +2107,7 @@
        "7    Moiodexno      12000           0.082759"
       ]
      },
-     "execution_count": 38,
+     "execution_count": 32,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -2125,7 +2125,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 39,
+   "execution_count": 33,
    "metadata": {},
    "outputs": [
     {
@@ -2246,7 +2246,7 @@
        "7       26.8      54.3    Moiodexno   Retreat      12000           0.082759"
       ]
      },
-     "execution_count": 39,
+     "execution_count": 33,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -2287,7 +2287,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.8.5"
+   "version": "3.6.12"
   }
  },
  "nbformat": 4,

File diff suppressed because it is too large
+ 67 - 8
content/chapters/07/1/Visualizing_Categorical_Distributions.ipynb


File diff suppressed because it is too large
+ 433 - 9
content/chapters/07/2/Visualizing_Numerical_Distributions.ipynb


File diff suppressed because it is too large
+ 127 - 5
content/chapters/07/3/Overlaid_Graphs.ipynb


File diff suppressed because it is too large
+ 1 - 1
content/chapters/07/Visualization.ipynb


File diff suppressed because it is too large
+ 556 - 26
content/chapters/08/1/Applying_a_Function_to_a_Column.ipynb


File diff suppressed because it is too large
+ 894 - 30
content/chapters/08/2/Classifying_by_One_Variable.ipynb


File diff suppressed because it is too large
+ 894 - 27
content/chapters/08/3/Cross-Classifying_by_More_than_One_Variable.ipynb


+ 635 - 20
content/chapters/08/4/Joining_Tables_by_Columns.ipynb

@@ -2,7 +2,7 @@
  "cells": [
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 11,
    "metadata": {
     "tags": [
      "remove_input"
@@ -45,9 +45,78 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 12,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>Flavor</th>\n",
+       "      <th>Price</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>strawberry</td>\n",
+       "      <td>3.55</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>vanilla</td>\n",
+       "      <td>4.75</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>chocolate</td>\n",
+       "      <td>6.55</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>strawberry</td>\n",
+       "      <td>5.25</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4</th>\n",
+       "      <td>chocolate</td>\n",
+       "      <td>5.75</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "       Flavor  Price\n",
+       "0  strawberry   3.55\n",
+       "1     vanilla   4.75\n",
+       "2   chocolate   6.55\n",
+       "3  strawberry   5.25\n",
+       "4   chocolate   5.75"
+      ]
+     },
+     "execution_count": 12,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "cones = pd.DataFrame(\n",
     "    {'Flavor':np.array(['strawberry', 'vanilla', 'chocolate', 'strawberry', 'chocolate']),\n",
@@ -58,9 +127,66 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 13,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>Kind</th>\n",
+       "      <th>Stars</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>strawberry</td>\n",
+       "      <td>2.5</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>chocolate</td>\n",
+       "      <td>3.5</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>vanilla</td>\n",
+       "      <td>4.0</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "         Kind  Stars\n",
+       "0  strawberry    2.5\n",
+       "1   chocolate    3.5\n",
+       "2     vanilla    4.0"
+      ]
+     },
+     "execution_count": 13,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "ratings = pd.DataFrame(\n",
     "    {'Kind':np.array(['strawberry', 'chocolate', 'vanilla']),\n",
@@ -84,9 +210,90 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 14,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>Flavor</th>\n",
+       "      <th>Price</th>\n",
+       "      <th>Kind</th>\n",
+       "      <th>Stars</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>strawberry</td>\n",
+       "      <td>3.55</td>\n",
+       "      <td>strawberry</td>\n",
+       "      <td>2.5</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>vanilla</td>\n",
+       "      <td>4.75</td>\n",
+       "      <td>vanilla</td>\n",
+       "      <td>4.0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>chocolate</td>\n",
+       "      <td>6.55</td>\n",
+       "      <td>chocolate</td>\n",
+       "      <td>3.5</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>strawberry</td>\n",
+       "      <td>5.25</td>\n",
+       "      <td>strawberry</td>\n",
+       "      <td>2.5</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4</th>\n",
+       "      <td>chocolate</td>\n",
+       "      <td>5.75</td>\n",
+       "      <td>chocolate</td>\n",
+       "      <td>3.5</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "       Flavor  Price        Kind  Stars\n",
+       "0  strawberry   3.55  strawberry    2.5\n",
+       "1     vanilla   4.75     vanilla    4.0\n",
+       "2   chocolate   6.55   chocolate    3.5\n",
+       "3  strawberry   5.25  strawberry    2.5\n",
+       "4   chocolate   5.75   chocolate    3.5"
+      ]
+     },
+     "execution_count": 14,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "cones = pd.DataFrame(\n",
     "    {'Flavor':np.array(['strawberry', 'vanilla', 'chocolate', 'strawberry', 'chocolate']),\n",
@@ -112,9 +319,84 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 15,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>Flavor</th>\n",
+       "      <th>Price</th>\n",
+       "      <th>Stars</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>strawberry</td>\n",
+       "      <td>3.55</td>\n",
+       "      <td>2.5</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>vanilla</td>\n",
+       "      <td>4.75</td>\n",
+       "      <td>4.0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>chocolate</td>\n",
+       "      <td>6.55</td>\n",
+       "      <td>3.5</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>strawberry</td>\n",
+       "      <td>5.25</td>\n",
+       "      <td>2.5</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4</th>\n",
+       "      <td>chocolate</td>\n",
+       "      <td>5.75</td>\n",
+       "      <td>3.5</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "       Flavor  Price  Stars\n",
+       "0  strawberry   3.55    2.5\n",
+       "1     vanilla   4.75    4.0\n",
+       "2   chocolate   6.55    3.5\n",
+       "3  strawberry   5.25    2.5\n",
+       "4   chocolate   5.75    3.5"
+      ]
+     },
+     "execution_count": 15,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "#rated = rates[['Flavor', 'Price', 'Stars']].sort_values(by=(['Flavor']))\n",
     "\n",
@@ -140,9 +422,90 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 16,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>Flavor</th>\n",
+       "      <th>Price</th>\n",
+       "      <th>Stars</th>\n",
+       "      <th>$/Price</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>vanilla</td>\n",
+       "      <td>4.75</td>\n",
+       "      <td>4.0</td>\n",
+       "      <td>1.187500</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>strawberry</td>\n",
+       "      <td>3.55</td>\n",
+       "      <td>2.5</td>\n",
+       "      <td>1.420000</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4</th>\n",
+       "      <td>chocolate</td>\n",
+       "      <td>5.75</td>\n",
+       "      <td>3.5</td>\n",
+       "      <td>1.642857</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>chocolate</td>\n",
+       "      <td>6.55</td>\n",
+       "      <td>3.5</td>\n",
+       "      <td>1.871429</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>strawberry</td>\n",
+       "      <td>5.25</td>\n",
+       "      <td>2.5</td>\n",
+       "      <td>2.100000</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "       Flavor  Price  Stars   $/Price\n",
+       "1     vanilla   4.75    4.0  1.187500\n",
+       "0  strawberry   3.55    2.5  1.420000\n",
+       "4   chocolate   5.75    3.5  1.642857\n",
+       "2   chocolate   6.55    3.5  1.871429\n",
+       "3  strawberry   5.25    2.5  2.100000"
+      ]
+     },
+     "execution_count": 16,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "rated['$/Price'] = rated['Price'] / rated['Stars']\n",
     "\n",
@@ -165,9 +528,84 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 17,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>Kind</th>\n",
+       "      <th>Stars</th>\n",
+       "      <th>Price</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>strawberry</td>\n",
+       "      <td>2.5</td>\n",
+       "      <td>3.55</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>strawberry</td>\n",
+       "      <td>2.5</td>\n",
+       "      <td>5.25</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>chocolate</td>\n",
+       "      <td>3.5</td>\n",
+       "      <td>6.55</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>chocolate</td>\n",
+       "      <td>3.5</td>\n",
+       "      <td>5.75</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>vanilla</td>\n",
+       "      <td>4.0</td>\n",
+       "      <td>4.75</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "         Kind  Stars  Price\n",
+       "0  strawberry    2.5   3.55\n",
+       "0  strawberry    2.5   5.25\n",
+       "1   chocolate    3.5   6.55\n",
+       "1   chocolate    3.5   5.75\n",
+       "2     vanilla    4.0   4.75"
+      ]
+     },
+     "execution_count": 17,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "cones1 = pd.DataFrame(\n",
     "    {'Flavor':np.array(['strawberry', 'vanilla', 'chocolate', 'strawberry', 'chocolate']),\n",
@@ -195,9 +633,72 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 18,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>Flavor</th>\n",
+       "      <th>Stars</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>vanilla</td>\n",
+       "      <td>5</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>chocolate</td>\n",
+       "      <td>3</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>vanilla</td>\n",
+       "      <td>5</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>chocolate</td>\n",
+       "      <td>4</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "      Flavor  Stars\n",
+       "0    vanilla      5\n",
+       "1  chocolate      3\n",
+       "2    vanilla      5\n",
+       "3  chocolate      4"
+      ]
+     },
+     "execution_count": 18,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "reviews = pd.DataFrame(\n",
     "    {'Flavor':np.array(['vanilla', 'chocolate', 'vanilla', 'chocolate']),\n",
@@ -208,9 +709,62 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 19,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>Stars average</th>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>Flavor</th>\n",
+       "      <th></th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>chocolate</th>\n",
+       "      <td>3.5</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>vanilla</th>\n",
+       "      <td>5.0</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "           Stars average\n",
+       "Flavor                  \n",
+       "chocolate            3.5\n",
+       "vanilla              5.0"
+      ]
+     },
+     "execution_count": 19,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "average_review = reviews.groupby('Flavor').mean()\n",
     "\n",
@@ -228,9 +782,70 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 20,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>Flavor</th>\n",
+       "      <th>Price</th>\n",
+       "      <th>Stars average</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>chocolate</td>\n",
+       "      <td>6.55</td>\n",
+       "      <td>3.5</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4</th>\n",
+       "      <td>chocolate</td>\n",
+       "      <td>5.75</td>\n",
+       "      <td>3.5</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>vanilla</td>\n",
+       "      <td>4.75</td>\n",
+       "      <td>5.0</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "      Flavor  Price  Stars average\n",
+       "2  chocolate   6.55            3.5\n",
+       "4  chocolate   5.75            3.5\n",
+       "1    vanilla   4.75            5.0"
+      ]
+     },
+     "execution_count": 20,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "reviewers = cones.join(average_review, on='Flavor')\n",
     "\n",
@@ -273,7 +888,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.8.5"
+   "version": "3.6.12"
   }
  },
  "nbformat": 4,

File diff suppressed because it is too large
+ 267 - 8
content/chapters/08/5/Bike_Sharing_in_the_Bay_Area.ipynb


+ 130 - 25
content/chapters/08/Functions_and_Tables.ipynb

@@ -2,7 +2,7 @@
  "cells": [
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 1,
    "metadata": {
     "tags": [
      "remove_input"
@@ -48,7 +48,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 2,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -79,18 +79,40 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 3,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "34"
+      ]
+     },
+     "execution_count": 3,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "double(17)"
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 4,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "-0.3"
+      ]
+     },
+     "execution_count": 4,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "double(-0.6/4)"
    ]
@@ -112,9 +134,20 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 5,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "84"
+      ]
+     },
+     "execution_count": 5,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "any_name = 42\n",
     "double(any_name)"
@@ -129,9 +162,20 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 6,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "array([ 6,  8, 10])"
+      ]
+     },
+     "execution_count": 6,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "double(np.array([3, 4, 5]))"
    ]
@@ -147,13 +191,25 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 7,
    "metadata": {
     "tags": [
      "raises-exception"
     ]
    },
-   "outputs": [],
+   "outputs": [
+    {
+     "ename": "NameError",
+     "evalue": "name 'x' is not defined",
+     "output_type": "error",
+     "traceback": [
+      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
+      "\u001b[0;31mNameError\u001b[0m                                 Traceback (most recent call last)",
+      "\u001b[0;32m<ipython-input-7-6fcf9dfbd479>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mx\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
+      "\u001b[0;31mNameError\u001b[0m: name 'x' is not defined"
+     ]
+    }
+   ],
    "source": [
     "x"
    ]
@@ -171,7 +227,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 8,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -194,9 +250,20 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 9,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "16.5"
+      ]
+     },
+     "execution_count": 9,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "percent(33, 200)"
    ]
@@ -210,7 +277,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 10,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -229,9 +296,20 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 11,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "array([33.33, 47.62, 19.05])"
+      ]
+     },
+     "execution_count": 11,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "some_array = np.array([7, 10, 4])\n",
     "percents(some_array)"
@@ -246,9 +324,17 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 12,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "The biggest difference is 5\n"
+     ]
+    }
+   ],
    "source": [
     "def biggest_difference(array_x):\n",
     "    \"\"\"Find the biggest difference in absolute value between two adjacent elements of array_x.\"\"\"\n",
@@ -286,9 +372,19 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 13,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Rounded to 1 decimal place:  [28.6 14.3 57.1]\n",
+      "Rounded to 2 decimal places: [28.57 14.29 57.14]\n",
+      "Rounded to 3 decimal places: [28.571 14.286 57.143]\n"
+     ]
+    }
+   ],
    "source": [
     "def percents(counts, decimal_places):\n",
     "    \"\"\"Convert the values in array_x to percents out of the total of array_x.\"\"\"\n",
@@ -310,9 +406,18 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 14,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Rounded to 1 decimal place: [28.6 14.3 57.1]\n",
+      "Rounded to the default number of decimal places: [28.57 14.29 57.14]\n"
+     ]
+    }
+   ],
    "source": [
     "def percents(counts, decimal_places=2):\n",
     "    \"\"\"Convert the values in array_x to percents out of the total of array_x.\"\"\"\n",
@@ -359,7 +464,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.8.5"
+   "version": "3.6.12"
   }
  },
  "nbformat": 4,

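Reviewer note: the hunks above show only the signature and docstring of `percents(counts, decimal_places=2)` together with its recorded outputs. The following is a minimal sketch of what such a function might look like; the body is an assumption reconstructed from the outputs, not the notebook's actual code.

```python
import numpy as np

def percents(counts, decimal_places=2):
    """Convert the values in counts to percents out of the total of counts.

    Assumed body: the diff shows only the signature and the docstring.
    """
    return np.round(counts / np.sum(counts) * 100, decimal_places)

some_array = np.array([7, 10, 4])
rounded_default = percents(some_array)   # uses the default of 2 decimal places
rounded_one = percents(some_array, 1)    # overrides the default
```

With the default argument, the call on `some_array` agrees with the recorded output `array([33.33, 47.62, 19.05])` in the hunk above.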
+ 96 - 23
content/chapters/09/1/Conditional_Statements.ipynb

@@ -2,7 +2,7 @@
  "cells": [
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 1,
    "metadata": {
     "tags": [
      "remove_input"
@@ -11,13 +11,13 @@
    "outputs": [],
    "source": [
     "path_data = '../../../../data/'\n",
+    "\n",
     "import numpy as np\n",
     "import pandas as pd\n",
-    "import matplotlib\n",
+    "\n",
     "%matplotlib inline\n",
     "import matplotlib.pyplot as plt\n",
     "plt.style.use('fivethirtyeight')\n",
-    "import numpy as np\n",
     "\n",
     "import warnings\n",
     "warnings.filterwarnings('ignore')"
@@ -39,7 +39,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 2,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -51,9 +51,20 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 3,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "'Positive'"
+      ]
+     },
+     "execution_count": 3,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "sign(3)"
    ]
@@ -67,7 +78,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 4,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -83,7 +94,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 5,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -105,9 +116,20 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 6,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "'Negative'"
+      ]
+     },
+     "execution_count": 6,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "sign(-3)"
    ]
@@ -121,7 +143,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 7,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -139,9 +161,20 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 8,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "'Neither positive nor negative'"
+      ]
+     },
+     "execution_count": 8,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "sign(0)"
    ]
@@ -155,7 +188,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 9,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -173,9 +206,20 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 10,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "'Neither positive nor negative'"
+      ]
+     },
+     "execution_count": 10,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "sign(0)"
    ]
@@ -216,7 +260,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 11,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -239,9 +283,20 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 12,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "(-1, -1, 0, 0, 1, 1)"
+      ]
+     },
+     "execution_count": 12,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "one_bet(1), one_bet(2), one_bet(3), one_bet (4), one_bet(5), one_bet(6)"
    ]
@@ -261,7 +316,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 13,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -277,9 +332,20 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 14,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "0"
+      ]
+     },
+     "execution_count": 14,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "one_bet(np.random.choice(np.arange(1, 7)))"
    ]
@@ -290,6 +356,13 @@
    "source": [
     "At this point it is natural to want to collect the results of all the bets so that we can analyze them. In the next section we develop a way to do this without running the cell over and over again."
    ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
   }
  ],
  "metadata": {
@@ -309,7 +382,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.8.5"
+   "version": "3.6.12"
   }
  },
  "nbformat": 4,

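Reviewer note: the cells that define `sign` and `one_bet` are not visible in the hunks above, only their calls and recorded outputs. A sketch consistent with those outputs could look like this; both bodies are assumptions, and the betting rule in particular is inferred only from the tuple `(-1, -1, 0, 0, 1, 1)`.

```python
def sign(x):
    """Describe the sign of a number (assumed body, inferred from the outputs)."""
    if x > 0:
        return 'Positive'
    elif x < 0:
        return 'Negative'
    else:
        return 'Neither positive nor negative'

def one_bet(x):
    """Net gain for one roll x of a six-sided die (assumed rule)."""
    if x <= 2:
        return -1   # rolls 1 and 2 lose one unit
    elif x <= 4:
        return 0    # rolls 3 and 4 break even
    else:
        return 1    # rolls 5 and 6 win one unit

results = tuple(one_bet(roll) for roll in range(1, 7))
```

The `elif` branch is only checked when the first condition is false, which is why `x <= 4` is enough to pick out rolls 3 and 4.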
Diff suppressed because it is too large
+ 151 - 29
content/chapters/09/2/Iteration.ipynb


Diff suppressed because it is too large
+ 173 - 17
content/chapters/09/3/Simulation.ipynb


Diff suppressed because it is too large
+ 385 - 22
content/chapters/09/4/Monty_Hall_Problem.ipynb


Diff suppressed because it is too large
+ 76 - 5
content/chapters/09/5/Finding_Probabilities.ipynb


+ 145 - 24
content/chapters/09/Randomness.ipynb

@@ -2,7 +2,7 @@
  "cells": [
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 1,
    "metadata": {
     "tags": [
      "remove_input"
@@ -39,9 +39,20 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 2,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "'treatment'"
+      ]
+     },
+     "execution_count": 2,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "two_groups = np.array(['treatment', 'control'])\n",
     "np.random.choice(two_groups)"
@@ -56,9 +67,22 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 3,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "array(['treatment', 'treatment', 'control', 'control', 'treatment',\n",
+       "       'control', 'treatment', 'treatment', 'control', 'treatment'],\n",
+       "      dtype='<U9')"
+      ]
+     },
+     "execution_count": 3,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "np.random.choice(two_groups, 10)"
    ]
@@ -87,9 +111,20 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 4,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "True"
+      ]
+     },
+     "execution_count": 4,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "3 > 1 + 1"
    ]
@@ -119,22 +154,42 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 5,
    "metadata": {
     "tags": [
      "raises-exception"
     ]
    },
-   "outputs": [],
+   "outputs": [
+    {
+     "ename": "SyntaxError",
+     "evalue": "can't assign to literal (<ipython-input-5-e8c755f5e450>, line 1)",
+     "output_type": "error",
+     "traceback": [
+      "\u001b[0;36m  File \u001b[0;32m\"<ipython-input-5-e8c755f5e450>\"\u001b[0;36m, line \u001b[0;32m1\u001b[0m\n\u001b[0;31m    5 = 10/2\u001b[0m\n\u001b[0m            ^\u001b[0m\n\u001b[0;31mSyntaxError\u001b[0m\u001b[0;31m:\u001b[0m can't assign to literal\n"
+     ]
+    }
+   ],
    "source": [
     "5 = 10/2"
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 6,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "True"
+      ]
+     },
+     "execution_count": 6,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "5 == 10/2"
    ]
@@ -148,9 +203,20 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 7,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "True"
+      ]
+     },
+     "execution_count": 7,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "1 < 1 + 1 < 3"
    ]
@@ -164,9 +230,20 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 8,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "True"
+      ]
+     },
+     "execution_count": 8,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "x = 12\n",
     "y = 5\n",
@@ -184,9 +261,20 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 9,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "True"
+      ]
+     },
+     "execution_count": 9,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "'Dog' > 'Catastrophe' > 'Cat'"
    ]
@@ -200,9 +288,20 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 10,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "False"
+      ]
+     },
+     "execution_count": 10,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "np.random.choice(two_groups) == 'treatment'"
    ]
@@ -224,9 +323,20 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 11,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "array([False,  True, False,  True,  True])"
+      ]
+     },
+     "execution_count": 11,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "tosses = np.array(['Tails', 'Heads', 'Tails', 'Heads', 'Heads'])\n",
     "tosses == 'Heads'"
@@ -241,9 +351,20 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 12,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "3"
+      ]
+     },
+     "execution_count": 12,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "np.count_nonzero(tosses == 'Heads')"
    ]
@@ -273,7 +394,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.8.5"
+   "version": "3.6.12"
   }
  },
  "nbformat": 4,

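Reviewer note: the comparison cells in the hunks above can be summarized in one short sketch. It reuses the expressions shown in the diff; the variable names `is_heads` and `num_heads` are introduced here for illustration only.

```python
import numpy as np

# Chained comparisons are True only if every adjacent pair compares True.
chained_numbers = 1 < 1 + 1 < 3
# Strings compare alphabetically, character by character.
chained_strings = 'Dog' > 'Catastrophe' > 'Cat'

tosses = np.array(['Tails', 'Heads', 'Tails', 'Heads', 'Heads'])
is_heads = tosses == 'Heads'             # elementwise comparison gives a Boolean array
num_heads = np.count_nonzero(is_heads)   # True counts as nonzero, so this counts heads
```

Comparing an array to a single value produces one Boolean per element, so counting the nonzero entries gives the number of matches without an explicit loop.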
Diff suppressed because it is too large
+ 75 - 5
content/chapters/10/1/Empirical_Distributions.ipynb


Diff suppressed because it is too large
+ 165 - 9
content/chapters/10/2/Sampling_from_a_Population.ipynb


Diff suppressed because it is too large
+ 6 - 4
content/chapters/10/3/Empirical_Distribution_of_a_Statistic.ipynb


Diff suppressed because it is too large
+ 965 - 15
content/chapters/10/Sampling_and_Empirical_Distributions.ipynb


Diff suppressed because it is too large
+ 38 - 14
content/chapters/11/1/Assessing_Models.ipynb


Diff suppressed because it is too large
+ 85 - 8
content/chapters/11/2/Multiple_Categories.ipynb


Diff suppressed because it is too large
+ 542 - 19
content/chapters/11/3/Decisions_and_Uncertainty.ipynb


Diff suppressed because it is too large
+ 6 - 4
content/chapters/11/4/Error_Probabilities.ipynb


Diff suppressed because it is too large
+ 231 - 9
content/chapters/12/1/AB_Testing.ipynb


Diff suppressed because it is too large
+ 732 - 30
content/chapters/12/2/Deflategate.ipynb


Diff suppressed because it is too large
+ 818 - 27
content/chapters/12/3/Causality.ipynb


+ 2 - 2
content/chapters/12/Comparing_Two_Samples.ipynb

@@ -2,7 +2,7 @@
  "cells": [
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 1,
    "metadata": {
     "tags": [
      "remove_input"
@@ -58,7 +58,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.8.5"
+   "version": "3.6.12"
   }
  },
  "nbformat": 4,

Diff suppressed because it is too large
+ 142 - 10
content/chapters/13/1/Percentiles.ipynb


Diff suppressed because it is too large
+ 951 - 11
content/chapters/13/2/Bootstrap.ipynb


Diff suppressed because it is too large
+ 300 - 9
content/chapters/13/3/Confidence_Intervals.ipynb


Diff suppressed because it is too large
+ 9 - 7
content/chapters/13/4/Using_Confidence_Intervals.ipynb


+ 2 - 2
content/chapters/13/Estimation.ipynb

@@ -2,7 +2,7 @@
  "cells": [
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 1,
    "metadata": {
     "tags": [
      "remove_input"
@@ -82,7 +82,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.8.5"
+   "version": "3.6.12"
   }
  },
  "nbformat": 4,

Diff suppressed because it is too large
+ 104 - 25
content/chapters/14/1/Properties_of_the_Mean.ipynb


Diff suppressed because it is too large
+ 365 - 22
content/chapters/14/2/Variability.ipynb


Diff suppressed because it is too large
+ 32 - 8
content/chapters/14/3/SD_and_the_Normal_Curve.ipynb


Diff suppressed because it is too large
+ 185 - 9
content/chapters/14/4/Central_Limit_Theorem.ipynb


Diff suppressed because it is too large
+ 117 - 7
content/chapters/14/5/Variability_of_the_Sample_Mean.ipynb


Diff suppressed because it is too large
+ 5 - 3
content/chapters/14/6/Choosing_a_Sample_Size.ipynb


BIN
content/chapters/15/.DS_Store


Diff suppressed because it is too large
+ 0 - 0
content/chapters/15/1/Correlation.ipynb


Diff suppressed because it is too large
+ 129 - 7
content/chapters/15/2/Regression_Line.ipynb


BIN
content/chapters/15/2/regline.png


Diff suppressed because it is too large
+ 65 - 6
content/chapters/15/3/Method_of_Least_Squares.ipynb


Diff suppressed because it is too large
+ 217 - 8
content/chapters/15/4/Least_Squares_Regression.ipynb


Diff suppressed because it is too large
+ 143 - 9
content/chapters/15/5/Visual_Diagnostics.ipynb


Diff suppressed because it is too large
+ 247 - 25
content/chapters/15/6/Numerical_Diagnostics.ipynb


Diff suppressed because it is too large
+ 116 - 6
content/chapters/15/Prediction.ipynb


Diff suppressed because it is too large
+ 7 - 5
content/chapters/16/1/Regression_Model.ipynb


Diff suppressed because it is too large
+ 9 - 7
content/chapters/16/2/Inference_for_the_True_Slope.ipynb


Diff suppressed because it is too large
+ 7 - 5
content/chapters/16/3/Prediction_Intervals.ipynb


+ 3 - 5
content/chapters/16/Inference_for_Regression.ipynb

@@ -40,10 +40,8 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 2,
-   "metadata": {
-    "collapsed": true
-   },
+   "execution_count": null,
+   "metadata": {},
    "outputs": [],
    "source": []
   }
@@ -65,7 +63,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.8.5"
+   "version": "3.6.12"
   }
  },
  "nbformat": 4,

Diff suppressed because it is too large
+ 678 - 15
content/chapters/17/1/Nearest_Neighbors.ipynb


Diff suppressed because it is too large
+ 154 - 9
content/chapters/17/2/Training_and_Testing.ipynb


Diff suppressed because it is too large
+ 435 - 16
content/chapters/17/3/Rows_of_Tables.ipynb


Diff suppressed because it is too large
+ 310 - 8
content/chapters/17/4/Implementing_the_Classifier.ipynb


Diff suppressed because it is too large
+ 830 - 19
content/chapters/17/5/Accuracy_of_the_Classifier.ipynb


BIN
content/chapters/17/5/benign.png


BIN
content/chapters/17/5/malignant.png


Diff suppressed because it is too large
+ 32 - 8
content/chapters/17/6/Multiple_Regression.ipynb


+ 1 - 1
content/chapters/17/Classification.ipynb

@@ -81,7 +81,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.8.5"
+   "version": "3.6.12"
   }
  },
  "nbformat": 4,

Diff suppressed because it is too large
+ 0 - 303
content/chapters/17_A/1/Nearest_Neighbors.ipynb


Diff suppressed because it is too large
+ 0 - 159
content/chapters/17_A/2/Training_and_Testing.ipynb


Diff suppressed because it is too large
+ 0 - 236
content/chapters/17_A/3/Rows_of_Tables.ipynb


Diff suppressed because it is too large
+ 0 - 151
content/chapters/17_A/4/Implementing_the_Classifier.ipynb


Diff suppressed because it is too large
+ 0 - 321
content/chapters/17_A/5/Accuracy_of_the_Classifier.ipynb


Diff suppressed because it is too large
+ 0 - 166
content/chapters/17_A/6/Multiple_Regression.ipynb


+ 0 - 94
content/chapters/17_A/Classification.ipynb

@@ -1,94 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "code",
-   "execution_count": 1,
-   "metadata": {
-    "collapsed": false,
-    "deletable": true,
-    "editable": true,
-    "tags": [
-     "remove_input"
-    ]
-   },
-   "outputs": [],
-   "source": [
-    "import matplotlib\n",
-    "#matplotlib.use('Agg')\n",
-    "path_data = '../../../data/'\n",
-    "from datascience import *\n",
-    "%matplotlib inline\n",
-    "import matplotlib.pyplot as plt\n",
-    "from mpl_toolkits.mplot3d import Axes3D\n",
-    "import numpy as np\n",
-    "import math\n",
-    "import scipy.stats as stats\n",
-    "plt.style.use('fivethirtyeight')"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "deletable": true,
-    "editable": true
-   },
-   "source": [
-    "### Classification ###\n",
-    "\n",
-    "*[David Wagner](https://en.wikipedia.org/wiki/David_A._Wagner) is the primary author of this chapter.*"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "deletable": true,
-    "editable": true
-   },
-   "source": [
-    "*Machine learning* is a class of techniques for automatically finding patterns in data and using it to draw inferences or make predictions.  You have already seen linear regression, which is one kind of machine learning.  This chapter introduces a new one: *classification*.\n",
-    "\n",
-    "Classification is about learning how to make predictions from past examples.  We are given some examples where we have been told what the correct prediction was, and we want to learn from those examples how to make good predictions in the future.  Here are a few applications where classification is used in practice:\n",
-    "\n",
-    "- For each order Amazon receives, Amazon would like to predict: ***is this order fraudulent?***  They have some information about each order (e.g., its total value, whether the order is being shipped to an address this customer has used before, whether the shipping address is the same as the credit card holder's billing address).  They have lots of data on past orders, and they know which of those past orders were fraudulent and which weren't.  They want to learn patterns that will help them predict, as new orders arrive, whether those new orders are fraudulent.\n",
-    "\n",
-    "- Online dating sites would like to predict: ***are these two people compatible?***  Will they hit it off?  They have lots of data on which matches they've suggested to their customers in the past, and they have some idea which ones were successful.  As new customers sign up, they'd like to make predictions about who might be a good match for them.\n",
-    "\n",
-    "- Doctors would like to know: ***does this patient have cancer?***  Based on the measurements from some lab test, they'd like to be able to predict whether the particular patient has cancer.  They have lots of data on past patients, including their lab measurements and whether they ultimately developed cancer, and from that, they'd like to try to infer what measurements tend to be characteristic of cancer (or non-cancer) so they can diagnose future patients accurately.\n",
-    "\n",
-    "- Politicians would like to predict: ***are you going to vote for them?***  This will help them focus fundraising efforts on people who are likely to support them, and focus get-out-the-vote efforts on voters who will vote for them.  Public databases and commercial databases have a lot of information about most people: e.g., whether they own a home or rent; whether they live in a rich neighborhood or poor neighborhood; their interests and hobbies; their shopping habits; and so on.  And political campaigns have surveyed some voters and found out who they plan to vote for, so they have some examples where the correct answer is known.  From this data, the campaigns would like to find patterns that will help them make predictions about all other potential voters.\n",
-    "\n",
-    "All of these are classification tasks.  Notice that in each of these examples, the prediction is a yes/no question -- we call this *binary classification*, because there are only two possible predictions.\n",
-    "\n",
-    "In a classification task, each individual or situation where we'd like to make a prediction is called an *observation*.  We ordinarily have many observations.  Each observation has multiple *attributes*, which are known (for example, the total value of the order on Amazon, or the voter's annual salary).  Also, each observation has a *class*, which is the answer to the question we care about (for example, fraudulent or not, or voting for you or not).\n",
-    "\n",
-    "When Amazon is predicting whether orders are fraudulent, each order corresponds to a single observation.  Each observation has several attributes: the total value of the order, whether the order is being shipped to an address this customer has used before, and so on.  The class of the observation is either 0 or 1, where 0 means that the order is not fraudulent and 1 means that the order is fraudulent.  When a customer makes a new order, we do not observe whether it is fraudulent, but we do observe its attributes, and we will try to predict its class using those attributes.\n",
-    "\n",
-    "Classification requires data.  It involves looking for patterns, and to find patterns, you need data.  That's where the data science comes in.  In particular, we're going to assume that we have access to *training data*: a bunch of observations, where we know the class of each observation.  The collection of these pre-classified observations is also called a training set.  A classification algorithm is going to analyze the training set, and then come up with a classifier: an algorithm for predicting the class of future observations.\n",
-    "\n",
-    "Classifiers do not need to be perfect to be useful.  They can be useful even if their accuracy is less than 100%.  For instance, if the online dating site occasionally makes a bad recommendation, that's OK; their customers already expect to have to meet many people before they'll find someone they hit it off with.  Of course, you don't want the classifier to make too many errors &mdash; but it doesn't have to get the right answer every single time."
-   ]
-  }
- ],
- "metadata": {
-  "anaconda-cloud": {},
-  "kernelspec": {
-   "display_name": "Python 3",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.5.2"
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 0
-}

+ 219 - 15
content/chapters/18/1/More_Likely_than_Not_Binary_Classifier.ipynb

@@ -2,7 +2,7 @@
  "cells": [
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 1,
    "metadata": {
     "tags": [
      "remove_input"
@@ -62,7 +62,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 2,
    "metadata": {
     "tags": [
      "remove_input"
@@ -80,9 +80,66 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 3,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>Year</th>\n",
+       "      <th>Major</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>Second</td>\n",
+       "      <td>Undeclared</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>Second</td>\n",
+       "      <td>Undeclared</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>Second</td>\n",
+       "      <td>Undeclared</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "     Year       Major\n",
+       "0  Second  Undeclared\n",
+       "1  Second  Undeclared\n",
+       "2  Second  Undeclared"
+      ]
+     },
+     "execution_count": 3,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "students.head(3)"
    ]
@@ -96,9 +153,66 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 4,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th>Major</th>\n",
+       "      <th>Declared</th>\n",
+       "      <th>Undeclared</th>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>Year</th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>Second</th>\n",
+       "      <td>30</td>\n",
+       "      <td>30</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>Third</th>\n",
+       "      <td>32</td>\n",
+       "      <td>8</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "Major   Declared  Undeclared\n",
+       "Year                        \n",
+       "Second        30          30\n",
+       "Third         32           8"
+      ]
+     },
+     "execution_count": 4,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "students_pivot = students.groupby(['Year', 'Major']).Major.agg('count').to_frame('count').reset_index()\n",
     "\n",
@@ -126,9 +240,20 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 5,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "0.5161290322580645"
+      ]
+     },
+     "execution_count": 5,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "32/(30+32)"
    ]
@@ -145,9 +270,66 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 6,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th>Major</th>\n",
+       "      <th>Declared</th>\n",
+       "      <th>Undeclared</th>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>Year</th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>Second</th>\n",
+       "      <td>30</td>\n",
+       "      <td>30</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>Third</th>\n",
+       "      <td>32</td>\n",
+       "      <td>8</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "Major   Declared  Undeclared\n",
+       "Year                        \n",
+       "Second        30          30\n",
+       "Third         32           8"
+      ]
+     },
+     "execution_count": 6,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "pd.pivot_table(students_pivot, values='count', index=['Year'], columns=['Major'], aggfunc=np.sum).fillna(0)"
    ]
@@ -174,9 +356,20 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 7,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "0.5161290322580645"
+      ]
+     },
+     "execution_count": 7,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "(0.4 * 0.8)/(0.6 * 0.5  +  0.4 * 0.8)"
    ]
@@ -228,9 +421,20 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 8,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "0.4838709677419354"
+      ]
+     },
+     "execution_count": 8,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "(0.6 * 0.5)/(0.6 * 0.5  +  0.4 * 0.8)"
    ]
@@ -275,7 +479,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.8.5"
+   "version": "3.6.12"
   }
  },
  "nbformat": 4,
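
The pivot output and the `32/(30+32)` cell in the hunks above can be cross-checked without pandas. The following is a minimal sketch, not the notebook's own code: it rebuilds the 100-student class from the counts shown in the output and tallies the four cells with the standard library. The names `cell_counts`, `declared_total`, and `p_third_given_declared` are illustrative assumptions.

```python
from collections import Counter

# Rebuild the class from the counts in the pivot output:
# 60 Second Years (30 Undeclared, 30 Declared), 40 Third Years (8 Undeclared, 32 Declared).
year = ['Second'] * 60 + ['Third'] * 40
major = (['Undeclared'] * 30 + ['Declared'] * 30
         + ['Undeclared'] * 8 + ['Declared'] * 32)

# Cross-classify: one counter entry per (Year, Major) cell of the pivot table.
cell_counts = Counter(zip(year, major))

# Restrict to the Declared column, then take the Third Year share of it.
declared_total = sum(n for (yr, mj), n in cell_counts.items() if mj == 'Declared')
p_third_given_declared = cell_counts[('Third', 'Declared')] / declared_total

print(dict(cell_counts))
print(p_third_given_declared)
```

The final value should agree with the `0.5161290322580645` recorded in the notebook output, since both divide 32 by 62.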

+ 187 - 18
content/chapters/18/2/Making_Decisions.ipynb

@@ -2,7 +2,7 @@
  "cells": [
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 1,
    "metadata": {
     "tags": [
      "remove_input"
@@ -25,7 +25,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 2,
    "metadata": {
     "tags": [
      "remove_input"
@@ -99,9 +99,20 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 3,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "0.44295302013422816"
+      ]
+     },
+     "execution_count": 3,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "(0.004 * 0.99)/(0.004 * 0.99  +  0.996*0.005 )"
    ]
@@ -123,9 +134,66 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 4,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th>Test_Result</th>\n",
+       "      <th>Negative</th>\n",
+       "      <th>Positive</th>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>True Condition</th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>Disease</th>\n",
+       "      <td>4</td>\n",
+       "      <td>396</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>No Disease</th>\n",
+       "      <td>99102</td>\n",
+       "      <td>498</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "Test_Result     Negative  Positive\n",
+       "True Condition                    \n",
+       "Disease                4       396\n",
+       "No Disease         99102       498"
+      ]
+     },
+     "execution_count": 4,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "pop_004 = population(0.004)\n",
     "\n",
@@ -149,9 +217,20 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 5,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "0.4429530201342282"
+      ]
+     },
+     "execution_count": 5,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "396/(396 + 498)"
    ]
@@ -215,9 +294,20 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 6,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "0.9124423963133641"
+      ]
+     },
+     "execution_count": 6,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "(0.05 * 0.99)/(0.05 * 0.99  +  0.95 * 0.005)"
    ]
@@ -243,9 +333,66 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 7,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th>Test_Result</th>\n",
+       "      <th>Negative</th>\n",
+       "      <th>Positive</th>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>True Condition</th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>Disease</th>\n",
+       "      <td>50</td>\n",
+       "      <td>4950</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>No Disease</th>\n",
+       "      <td>94525</td>\n",
+       "      <td>475</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "Test_Result     Negative  Positive\n",
+       "True Condition                    \n",
+       "Disease               50      4950\n",
+       "No Disease         94525       475"
+      ]
+     },
+     "execution_count": 7,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "pop_004 = population(0.05)\n",
     "\n",
@@ -267,9 +414,20 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 8,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "0.9124423963133641"
+      ]
+     },
+     "execution_count": 8,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "4950/(4950 + 475)"
    ]
@@ -283,7 +441,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 9,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -303,9 +461,20 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 10,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "0.912639405204461"
+      ]
+     },
+     "execution_count": 10,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "len(sample[sample['True Condition'] == 'Disease'])/len(positive)"
    ]
@@ -337,7 +506,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.8.5"
+   "version": "3.6.12"
   }
  },
  "nbformat": 4,
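
The two Bayes' Rule cells in this notebook (priors 0.004 and 0.05) differ only in the prior, so the calculation can be wrapped in one function. This is a hedged sketch rather than code from the commit: the function name and its parameter names are assumptions, and the default rates (0.99 true positive rate, 0.005 false positive rate) are taken from the expressions in the cells above.

```python
def posterior_disease_given_positive(prior,
                                     true_positive_rate=0.99,
                                     false_positive_rate=0.005):
    """Return P(Disease | Positive) by Bayes' Rule.

    prior is P(Disease); the two rates are P(Positive | Disease)
    and P(Positive | No Disease).
    """
    true_pos = prior * true_positive_rate            # Disease and Positive
    false_pos = (1 - prior) * false_positive_rate    # No Disease and Positive
    return true_pos / (true_pos + false_pos)


for prior in (0.004, 0.05):
    print(prior, posterior_disease_given_positive(prior))
```

With the prior at 0.004 the posterior stays below one half, and at 0.05 it rises above 0.9, matching the outputs recorded in the diff up to floating-point rounding.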

+ 2 - 2
content/chapters/18/Updating_Predictions.ipynb

@@ -2,7 +2,7 @@
  "cells": [
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 1,
    "metadata": {
     "tags": [
      "remove_input"
@@ -53,7 +53,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.8.5"
+   "version": "3.6.12"
   }
  },
  "nbformat": 4,

+ 0 - 415
content/chapters/18_A/1/More_Likely_than_Not_Binary_Classifier.ipynb

@@ -1,415 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "code",
-   "execution_count": 1,
-   "metadata": {
-    "collapsed": true,
-    "tags": [
-     "remove_input"
-    ]
-   },
-   "outputs": [],
-   "source": [
-    "import matplotlib\n",
-    "from datascience import *\n",
-    "path_data = '../../../../data/'\n",
-    "%matplotlib inline\n",
-    "import matplotlib.pyplot as plots\n",
-    "import numpy as np\n",
-    "plots.style.use('fivethirtyeight')"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "### A \"More Likely Than Not\" Binary Classifier ###\n",
-    "Let's try to use data to classify a point into one of two categories, choosing the category that we think is more likely than not. To do this, we not only need the data but also a clear description of how chances are involved.\n",
-    "\n",
-    "We will start out in a simple artificial setting just to develop the main technique, and then move to a more intriguing example.\n",
-    "\n",
-    "Suppose there is a university class with the following composition:\n",
-    "- 60% of the students are Second Years and the remaining 40% are Third Years\n",
-    "- 50% of the Second Years have declared their major\n",
-    "- 80% of the Third Years have declared their major\n",
-    "\n",
-    "Now suppose **I pick a student at random from the class**. Can you classify the student as Second Year or Third Year, using our \"more likely than not\" criterion?\n",
-    "\n",
-    "You can, because the student is picked at random and so you know that the chance that the student is a Second Year is 60%. That's greater than the 40% chance of being a Third Year, so you would classify the student as Second Year.\n",
-    "\n",
-    "The information about the majors is irrelevant, as we already know the proportions of Second and Third Years in the class.\n",
-    "\n",
-    "We have a pretty simple classifier! But now suppose I give you some additional information about the student who was picked:\n",
-    "\n",
-    "**The student has declared a major.**\n",
-    "\n",
-    "Would this knowledge change your classification?"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "### Updating the Prediction Based on New Information ###\n",
-    "Now that we know the student has declared a major, it becomes important to look at the relation between year and major declaration. It's still true that more students are Second Years than Third Years. But it's also true that among the Third Years, a much higher percent have declared their major than among the Second Years. Our classifier has to take both of these observations into account.\n",
-    "\n",
-    "To visualize this, we will use a table `students` that consists of one row for each of 100 students whose years and majors have the same proportions as given in the data."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 2,
-   "metadata": {
-    "collapsed": true,
-    "tags": [
-     "remove_input"
-    ]
-   },
-   "outputs": [],
-   "source": [
-    "year = np.array(['Second']*60 + ['Third']*40)\n",
-    "major = np.array(['Undeclared']*30+['Declared']*30+['Undeclared']*8+['Declared']*32)\n",
-    "students = Table().with_columns(\n",
-    "    'Year', year,\n",
-    "    'Major', major\n",
-    ")"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 3,
-   "metadata": {
-    "collapsed": false
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/html": [
-       "<table border=\"1\" class=\"dataframe\">\n",
-       "    <thead>\n",
-       "        <tr>\n",
-       "            <th>Year</th> <th>Major</th>\n",
-       "        </tr>\n",
-       "    </thead>\n",
-       "    <tbody>\n",
-       "        <tr>\n",
-       "            <td>Second</td> <td>Undeclared</td>\n",
-       "        </tr>\n",
-       "        <tr>\n",
-       "            <td>Second</td> <td>Undeclared</td>\n",
-       "        </tr>\n",
-       "        <tr>\n",
-       "            <td>Second</td> <td>Undeclared</td>\n",
-       "        </tr>\n",
-       "    </tbody>\n",
-       "</table>\n",
-       "<p>... (97 rows omitted)</p>"
-      ],
-      "text/plain": [
-       "<IPython.core.display.HTML object>"
-      ]
-     },
-     "metadata": {},
-     "output_type": "display_data"
-    }
-   ],
-   "source": [
-    "students.show(3)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "To check that the proportions are correct, let's use `pivot` to cross-classify each student according to the two variables."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 4,
-   "metadata": {
-    "collapsed": false
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/html": [
-       "<table border=\"1\" class=\"dataframe\">\n",
-       "    <thead>\n",
-       "        <tr>\n",
-       "            <th>Year</th> <th>Declared</th> <th>Undeclared</th>\n",
-       "        </tr>\n",
-       "    </thead>\n",
-       "    <tbody>\n",
-       "        <tr>\n",
-       "            <td>Second</td> <td>30      </td> <td>30        </td>\n",
-       "        </tr>\n",
-       "        <tr>\n",
-       "            <td>Third </td> <td>32      </td> <td>8         </td>\n",
-       "        </tr>\n",
-       "    </tbody>\n",
-       "</table>"
-      ],
-      "text/plain": [
-       "Year   | Declared | Undeclared\n",
-       "Second | 30       | 30\n",
-       "Third  | 32       | 8"
-      ]
-     },
-     "execution_count": 4,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "students.pivot('Major', 'Year')"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "The total count is 100 students, of whom 60 are Second Years and 40 are Third Years. Among the Second Years, 50% are in each of the Major categories. Among the 40 Third Years, 20% are Undeclared and 80% Declared. So this population of 100 students has the same proportions as the class in our problem, and we can assume that our student has been picked at random from among all 100 students.\n",
-    "\n",
-    "We have to pick which row the student is most likely to be in. When we knew nothing more about the student, the student could be in any of the four cells, and was therefore more likely to be in the top row (Second Year) because that row contains more students.\n",
-    "\n",
-    "But now we know that the student has declared a major, so the space of possible outcomes has decreased: now the student can only be in one of the two Declared cells. \n",
-    "\n",
-    "There are 62 students in those cells, and 32 out of the 62 are Third Years. That's more than half, even though not by much. \n",
-    "\n",
-    "So, in the light of the new information about the student's major, we have to update our prediction and now classify the student as a Third Year.\n",
-    "\n",
-    "What is the chance that our classification is correct? We will be right for all the 32 Third Years who are Declared, and wrong for the 30 Second Years who are Declared. The chance that we are correct is therefore about 0.516.\n",
-    "\n",
-    "In other words, the chance that we are correct is **the proportion of Third Years among the students who have Declared**."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 5,
-   "metadata": {
-    "collapsed": false
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "0.5161290322580645"
-      ]
-     },
-     "execution_count": 5,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "32/(30+32)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "### Tree Diagram ###\n",
-    "The proportion that we have just calculated was based on a class of 100 students. But there's no reason the class couldn't have had 200 students, for example, as long as all the proportions in the cells were correct. Then our calculation would just have been 64/(60 + 64) which is 0.516 as before.\n",
-    "\n",
-    "So the calculation depends only on the proportions in the different categories, not on the counts. The proportions can be visualized in a *tree diagram*, shown directly below the pivot table for ease of comparison."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 6,
-   "metadata": {
-    "collapsed": false
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/html": [
-       "<table border=\"1\" class=\"dataframe\">\n",
-       "    <thead>\n",
-       "        <tr>\n",
-       "            <th>Year</th> <th>Declared</th> <th>Undeclared</th>\n",
-       "        </tr>\n",
-       "    </thead>\n",
-       "    <tbody>\n",
-       "        <tr>\n",
-       "            <td>Second</td> <td>30      </td> <td>30        </td>\n",
-       "        </tr>\n",
-       "        <tr>\n",
-       "            <td>Third </td> <td>32      </td> <td>8         </td>\n",
-       "        </tr>\n",
-       "    </tbody>\n",
-       "</table>"
-      ],
-      "text/plain": [
-       "Year   | Declared | Undeclared\n",
-       "Second | 30       | 30\n",
-       "Third  | 32       | 8"
-      ]
-     },
-     "execution_count": 6,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "students.pivot('Major', 'Year')"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "![Students Tree Diagram](../../../images/tree_students.png)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Like the pivot table, this diagram *partitions* the students into four distinct groups known as \"branches\". Notice that the \"Third Year, Declared\" branch contains the proportion 0.4 x 0.8 = 0.32 of the students, corresponding to the 32 students in the \"Third Year, Declared\" cell of the pivot table. The \"Second Year, Declared\" branch contains 0.6 x 0.5 = 0.3 of the students, corresponding to the 30 in the \"Second Year, Declared\" cell of the pivot table.\n",
-    "\n",
-    "We know that the student who was picked belongs to a \"Declared\" branch; that is, the student is either in the top branch or the third from top. Those two branches now form our reduced space of possibilities, and all chances have to be calculated relative to the total chance of this reduced space.\n",
-    "\n",
-    "So, given that the student is Declared, the chance of them being a Third Year can be calculated directly from the tree. The answer is the proportion in the \"Third Year, Declared\" branch relative to the total proportion in the two \"Declared\" branches.\n",
-    "\n",
-    "That is, the answer is **the proportion of Third Years among students who are Declared**, as before."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 7,
-   "metadata": {
-    "collapsed": false
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "0.5161290322580645"
-      ]
-     },
-     "execution_count": 7,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "(0.4 * 0.8)/(0.6 * 0.5  +  0.4 * 0.8)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "### Bayes' Rule ###\n",
-    "The method that we have just used is due to the Reverend [Thomas Bayes](https://en.wikipedia.org/wiki/Thomas_Bayes) (1701-1761). His method solved what was called an \"inverse probability\" problem: given new data, how can you update chances you had found earlier? Though Bayes lived three centuries ago, his method is [widely used now](https://en.wikipedia.org/wiki/Naive_Bayes_classifier) in machine learning.\n",
-    "\n",
-    "We will state the rule in the context of our population of students. First, some terminology:\n",
-    "\n",
-    "**Prior probabilities.** Before we knew the chosen student's major declaration status, the chance that the student was a Second Year was 60% and the chance that the student was a Third Year was 40%. These are the *prior* probabilities of the two categories.\n",
-    "\n",
-    "**Likelihoods.** These are the chances of the Major status, given the category of student; thus they can be read off the tree diagram. For example, the likelihood of Declared status given that the student is a Second Year is 0.5.\n",
-    "\n",
-    "**Posterior probabilities.** These are the chances of the two Year categories, *after* we have taken into account information about the Major declaration status. We computed one of these:\n",
-    "\n",
-    "The *posterior probability* that the student is a Third Year, given that the student has Declared, is denoted $P(\\text{Third Year} ~\\big{\\vert}~ \\text{Declared})$ and is calculated as follows."
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "$$\n",
-    "\\begin{align*}\n",
-    "P(\\mbox{Third Year} ~\\big{\\vert}~ \\mbox{Declared}) \n",
-    "~ &=~ \\frac{ 0.4 \\times 0.8}{0.6 \\times 0.5 ~+~ 0.4 \\times  0.8} \\\\ \\\\\n",
-    "&=~ \\frac{\\mbox{(prior probability of Third Year)} \\times\n",
-    "\\mbox{(likelihood of Declared given Third Year)}}\n",
-    "{\\mbox{total probability of Declared}}\n",
-    "\\end{align*}\n",
-    "$$\n",
-    "\n",
-    "The other posterior probability is\n",
-    "\n",
-    "$$\n",
-    "\\begin{align*}\n",
-    "P(\\mbox{Second Year} ~\\big{\\vert}~ \\mbox{Declared})\n",
-    "~ &=~ \\frac{ 0.6 \\times 0.5}{0.6 \\times 0.5 ~+~ 0.4 \\times  0.8} \\\\ \\\\\n",
-    "&=~ \\frac{\\mbox{(prior probability of Second Year)} \\times\n",
-    "\\mbox{(likelihood of Declared given Second Year)}}\n",
-    "{\\mbox{total probability of Declared}}\n",
-    "\\end{align*}\n",
-    "$$"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 8,
-   "metadata": {
-    "collapsed": false
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "0.4838709677419354"
-      ]
-     },
-     "execution_count": 8,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "(0.6 * 0.5)/(0.6 * 0.5  +  0.4 * 0.8)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "That's about 0.484, which is less than half, consistent with our classification of Third Year. \n",
-    "\n",
-    "Notice that both the posterior probabilities have the same denominator: the chance of the new information, which is that the student has Declared.\n",
-    "\n",
-    "Because of this, Bayes' method is sometimes summarized as a statement about proportionality:\n",
-    "\n",
-    "$$\n",
-    "\\mbox{posterior} ~ \\propto ~ \\mbox{prior} \\times \\mbox{likelihood}\n",
-    "$$"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Formulas are great for efficiently describing calculations. But in settings like our example about students, it is simpler not to think in terms of formulas. Just use the tree diagram."
-   ]
-  }
- ],
- "metadata": {
-  "anaconda-cloud": {},
-  "kernelspec": {
-   "display_name": "Python 3",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.6.5"
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 2
-}
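
The proportionality statement in the removed notebook, posterior ∝ prior × likelihood, can be sketched as a small normalization step. This is an illustration under stated assumptions, not part of the commit: the helper `posteriors` and the dictionary names are hypothetical, and the numbers are the priors and likelihoods given in the student example.

```python
def posteriors(priors, likelihoods):
    """Normalize prior * likelihood over all categories.

    Both arguments are dicts keyed by category. The shared denominator
    is the total probability of the observed evidence.
    """
    unnormalized = {k: priors[k] * likelihoods[k] for k in priors}
    total = sum(unnormalized.values())   # total probability of Declared
    return {k: v / total for k, v in unnormalized.items()}


# Priors for Year, and likelihoods of Declared given each Year.
priors = {'Second': 0.6, 'Third': 0.4}
likelihood_declared = {'Second': 0.5, 'Third': 0.8}

post = posteriors(priors, likelihood_declared)
print(post)
```

Because both posteriors share the same denominator, only the products prior × likelihood are needed to decide which category is more likely; dividing by their sum turns them into probabilities.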

+ 0 - 458
content/chapters/18_A/2/Making_Decisions.ipynb

@@ -1,458 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "code",
-   "execution_count": 1,
-   "metadata": {
-    "collapsed": true,
-    "tags": [
-     "remove_input"
-    ]
-   },
-   "outputs": [],
-   "source": [
-    "import matplotlib\n",
-    "from datascience import *\n",
-    "path_data = '../../../../data/'\n",
-    "%matplotlib inline\n",
-    "import matplotlib.pyplot as plots\n",
-    "import numpy as np\n",
-    "plots.style.use('fivethirtyeight')"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 2,
-   "metadata": {
-    "collapsed": true,
-    "tags": [
-     "remove_input"
-    ]
-   },
-   "outputs": [],
-   "source": [
-    "def population(prior_prob_disease):\n",
-    "    n_d = int(prior_prob_disease*100000)\n",
-    "    n_nd = 100000 - n_d\n",
-    "    n_pos_d = int(0.99*n_d)\n",
-    "    n_neg_d = n_d - n_pos_d\n",
-    "    n_pos_nd = int(0.005*n_nd)\n",
-    "    n_neg_nd = n_nd - n_pos_nd\n",
-    "    condition = np.array(['Disease']*n_d + ['No Disease']*n_nd)\n",
-    "    d_test = np.array(['Positive']*n_pos_d + ['Negative']*n_neg_d)\n",
-    "    nd_test = np.array(['Positive']*n_pos_nd + ['Negative']*n_neg_nd)\n",
-    "    test = np.append(d_test, nd_test)\n",
-    "    t = Table().with_columns(\n",
-    "        'True Condition', condition,\n",
-    "        'Test Result', test\n",
-    "    )\n",
-    "    return t"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "### Making Decisions ###\n",
-    "A primary use of Bayes' Rule is to make decisions based on incomplete information, incorporating new information as it comes in. This section points out the importance of keeping your assumptions in mind as you make decisions."
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Many medical tests for diseases return Positive or Negative results. A Positive result means that according to the test, the patient has the disease. A Negative result means the test concludes that the patient doesn't have the disease. \n",
-    "\n",
-    "Medical tests are carefully designed to be very accurate. But few tests are accurate 100% of the time. Almost all tests make errors of two kinds:\n",
-    "\n",
-    "- A **false positive** is an error in which the test concludes Positive but the patient doesn't have the disease.\n",
-    "\n",
-    "- A **false negative** is an error in which the test concludes Negative but the patient does have the disease.\n",
-    "\n",
-    "These errors can affect people's decisions. False positives can cause anxiety and unnecessary treatment (which in some cases is expensive or dangerous). False negatives can have even more serious consequences if the patient doesn't receive treatment because of their Negative test result."
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "### A Test for a Rare Disease ###\n",
-    "Suppose there is a large population and a disease that strikes a tiny proportion of the population. The tree diagram below summarizes information about such a disease and about a medical test for it.\n",
-    "\n",
-    "![Tree Rare Disease](../../../images/tree_disease_rare.png)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Overall, only 4 in 1000 of the population has the disease. The test is quite accurate: it has a very small false positive rate of 5 in 1000, and a somewhat larger (though still small) false negative rate of 1 in 100.\n",
-    "\n",
-    "Individuals might or might not know whether they have the disease; typically, people get tested to find out whether they have it.\n",
-    "\n",
-    "So **suppose a person is picked at random from the population** and tested. If the test result is Positive, how would you classify them: Disease, or No disease?\n",
-    "\n",
-    "We can answer this by applying Bayes' Rule and using our \"more likely than not\" classifier. Given that the person has tested Positive, the chance that he or she has the disease is the proportion in the top branch, relative to the total proportion in the Test Positive branches."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 3,
-   "metadata": {
-    "collapsed": false
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "0.44295302013422816"
-      ]
-     },
-     "execution_count": 3,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "(0.004 * 0.99)/(0.004 * 0.99  +  0.996*0.005 )"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Given that the person has tested Positive, the chance that he or she has the disease is about 44%. So we will classify them as: No disease.\n",
-    "\n",
-    "This is a strange conclusion. We have a pretty accurate test, and a person who has tested Positive, and our classification is ... that they **don't** have the disease? That doesn't seem to make any sense.\n",
-    "\n",
-    "When faced with a disturbing answer, the first thing to do is to check the calculations. The arithmetic above is correct. Let's see if we can get the same answer in a different way.\n",
-    "\n",
-    "The function `population` returns a table of outcomes for 100,000 patients, with columns that show the `True Condition` and `Test Result`. The test is the same as the one described in the tree. But the proportion who have the disease is an argument to the function.\n",
-    "\n",
-    "We will call `population` with 0.004 as the argument, and then pivot to cross-classify each of the 100,000 people."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 4,
-   "metadata": {
-    "collapsed": false
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/html": [
-       "<table border=\"1\" class=\"dataframe\">\n",
-       "    <thead>\n",
-       "        <tr>\n",
-       "            <th>True Condition</th> <th>Negative</th> <th>Positive</th>\n",
-       "        </tr>\n",
-       "    </thead>\n",
-       "    <tbody>\n",
-       "        <tr>\n",
-       "            <td>Disease       </td> <td>4       </td> <td>396     </td>\n",
-       "        </tr>\n",
-       "        <tr>\n",
-       "            <td>No Disease    </td> <td>99102   </td> <td>498     </td>\n",
-       "        </tr>\n",
-       "    </tbody>\n",
-       "</table>"
-      ],
-      "text/plain": [
-       "True Condition | Negative | Positive\n",
-       "Disease        | 4        | 396\n",
-       "No Disease     | 99102    | 498"
-      ]
-     },
-     "execution_count": 4,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "population(0.004).pivot('Test Result', 'True Condition')"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "The cells of the table have the right counts. For example, according to the description of the population, 4 in 1000 people have the disease. There are 100,000 people in the table, so 400 should have the disease. That's what the table shows: 4 + 396 = 400. Of these 400, 99% get a Positive test result: 0.99 x 400 = 396.\n",
-    "\n",
-    "Among the Positives, the proportion that have the disease is:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 5,
-   "metadata": {
-    "collapsed": false
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "0.4429530201342282"
-      ]
-     },
-     "execution_count": 5,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "396/(396 + 498)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "That's the answer we got by using Bayes' Rule. The counts in the Positives column show why it is less than 1/2. Among the Positives, more people **don't** have the disease than do have the disease. "
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "The reason is that a huge fraction of the population doesn't have the disease in the first place. The tiny fraction of those that falsely test Positive are still greater in number than the people who correctly test Positive. This is easier to visualize in the tree diagram:\n",
-    "\n",
-    "![Tree Rare Disease](../../../images/tree_disease_rare.png)\n",
-    "\n",
-    "- The proportion of true Positives is a large fraction (0.99) of a tiny fraction (0.004) of the population.\n",
-    "- The proportion of false Positives is a tiny fraction (0.005) of a large fraction (0.996) of the population.\n",
-    "\n",
-    "These two proportions are comparable; the second is a little larger.\n",
-    "\n",
-    "So, given that the randomly chosen person tested positive, we were right to classify them as more likely than not to **not** have the disease."
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "### A Subjective Prior ###\n",
-    "Being right isn't always satisfying. Classifying a Positive patient as not having the disease still seems somehow wrong, for such an accurate test. Since the calculations are right, let's take a look at the basis of our probability calculation: the assumption of randomness.\n",
-    "\n",
-    "Our assumption was that a randomly chosen person was tested and got a Positive result. But this doesn't happen in reality. People go in to get tested because they think they might have the disease, or because their doctor thinks they might have the disease. **People getting tested are not randomly chosen members of the population.**\n",
-    "\n",
-    "That is why our intuition about people getting tested was not fitting well with the answer that we got. We were imagining a realistic situation of a patient going in to get tested because there was some reason for them to do so, whereas the calculation was based on a randomly chosen person being tested.\n",
-    "\n",
-    "So let's redo our calculation under the more realistic assumption that the patient is getting tested because the doctor thinks there's a chance the patient has the disease.\n",
-    "\n",
-    "Here it's important to note that \"the doctor thinks there's a chance\" means that the chance is the doctor's opinion, not the proportion in the population. It is called a *subjective probability*. In our context of whether or not the patient has the disease, it is also a *subjective prior* probability.\n",
-    "\n",
-    "Some researchers insist that all probabilities must be relative frequencies, but subjective probabilities abound. The chance that a candidate wins the next election, the chance that a big earthquake will hit the Bay Area in the next decade, the chance that a particular country wins the next soccer World Cup: none of these are based on relative frequencies or long run frequencies. Each one contains a subjective element. All calculations involving them thus have a subjective element too."
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Suppose the doctor's subjective opinion is that there is a 5% chance that the patient has the disease. Then just the prior probabilities in the tree diagram will change:\n",
-    "\n",
-    "![Tree: Subjective Prior](../../../images/tree_disease_subj.png)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Given that the patient tests Positive, the chance that he or she has the disease is given by Bayes' Rule."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 6,
-   "metadata": {
-    "collapsed": false
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "0.9124423963133641"
-      ]
-     },
-     "execution_count": 6,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "(0.05 * 0.99)/(0.05 * 0.99  +  0.95 * 0.005)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "The effect of changing the prior is stunning. Even though the doctor has a pretty low prior probability (5%) that the patient has the disease, once the patient tests Positive the posterior probability of having the disease shoots up to more than 91%. \n",
-    "\n",
-    "If the patient tests Positive, it would be reasonable for the doctor to proceed as though the patient has the disease."
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "### Confirming the Answer ###\n",
-    "Though the doctor's opinion is subjective, we can generate an artificial population in which 5% of the people have the disease and are tested using the same test. Then we can count people in different categories to see if the counts are consistent with the answer we got by using Bayes' Rule.\n",
-    "\n",
-    "We can use `population(0.05)` and `pivot` to construct the corresponding population and look at the counts in the four cells."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 7,
-   "metadata": {
-    "collapsed": false
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/html": [
-       "<table border=\"1\" class=\"dataframe\">\n",
-       "    <thead>\n",
-       "        <tr>\n",
-       "            <th>True Condition</th> <th>Negative</th> <th>Positive</th>\n",
-       "        </tr>\n",
-       "    </thead>\n",
-       "    <tbody>\n",
-       "        <tr>\n",
-       "            <td>Disease       </td> <td>50      </td> <td>4950    </td>\n",
-       "        </tr>\n",
-       "        <tr>\n",
-       "            <td>No Disease    </td> <td>94525   </td> <td>475     </td>\n",
-       "        </tr>\n",
-       "    </tbody>\n",
-       "</table>"
-      ],
-      "text/plain": [
-       "True Condition | Negative | Positive\n",
-       "Disease        | 50       | 4950\n",
-       "No Disease     | 94525    | 475"
-      ]
-     },
-     "execution_count": 7,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "population(0.05).pivot('Test Result', 'True Condition')"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "In this artificially created population of 100,000 people, 5000 people (5%) have the disease, and 99% of them test Positive, leading to 4950 true Positives. Compare this with 475 false Positives: among the Positives, the proportion that have the disease is the same as what we got by Bayes' Rule."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 8,
-   "metadata": {
-    "collapsed": false
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "0.9124423963133641"
-      ]
-     },
-     "execution_count": 8,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "4950/(4950 + 475)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Because we can generate a population that has the right proportions, we can also use simulation to confirm that our answer is reasonable. The table `pop_05` contains a population of 100,000 people generated with the doctor's prior disease probability of 5% and the error rates of the test. We take a simple random sample of size 10,000 from the population, and extract the table `positive` consisting only of those in the sample that had Positive test results."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 9,
-   "metadata": {
-    "collapsed": true
-   },
-   "outputs": [],
-   "source": [
-    "pop_05 = population(0.05)\n",
-    "\n",
-    "sample = pop_05.sample(10000, with_replacement=False)\n",
-    "\n",
-    "positive = sample.where('Test Result', are.equal_to('Positive'))"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Among these Positive results, what proportion were true Positives? That's the proportion of Positives that had the disease:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 10,
-   "metadata": {
-    "collapsed": false
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "0.906896551724138"
-      ]
-     },
-     "execution_count": 10,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "positive.where('True Condition', are.equal_to('Disease')).num_rows/positive.num_rows"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Run the two cells a few times and you will see that the proportion of true Positives among the Positives hovers around the value of 0.912 that we calculated by Bayes' Rule.\n",
-    "\n",
-    "You can also use the `population` function with a different argument to change the prior disease probability and see how the posterior probabilities are affected."
-   ]
-  }
- ],
- "metadata": {
-  "anaconda-cloud": {},
-  "kernelspec": {
-   "display_name": "Python 3",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.6.5"
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 2
-}
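
Reviewer note on the removed notebook above: its arithmetic can be reproduced without the `population` helper, whose definition does not appear in this diff. The sketch below is a stand-in under stated assumptions, not the notebook's own code. The function names (`posterior_given_positive`, `expected_counts`, `simulated_true_positive_rate`) are hypothetical, the test's rates (0.99 true-positive rate, 0.005 false-positive rate) are taken from the notebook text, and NumPy replaces the `datascience` table methods.

```python
import numpy as np

# Rates quoted in the notebook text for the test.
TRUE_POSITIVE_RATE = 0.99
FALSE_POSITIVE_RATE = 0.005


def posterior_given_positive(prior):
    """Bayes' Rule: P(Disease | Positive) for a given prior P(Disease)."""
    true_pos = prior * TRUE_POSITIVE_RATE
    false_pos = (1 - prior) * FALSE_POSITIVE_RATE
    return true_pos / (true_pos + false_pos)


def expected_counts(prior, n=100_000):
    """Hypothetical stand-in for population(prior).pivot(...):
    the four cell counts in a population of n people."""
    diseased = round(n * prior)
    healthy = n - diseased
    tp = round(diseased * TRUE_POSITIVE_RATE)   # Disease, Positive
    fp = round(healthy * FALSE_POSITIVE_RATE)   # No Disease, Positive
    return {
        ("Disease", "Positive"): tp,
        ("Disease", "Negative"): diseased - tp,
        ("No Disease", "Positive"): fp,
        ("No Disease", "Negative"): healthy - fp,
    }


def simulated_true_positive_rate(prior, sample_size=10_000, seed=0):
    """Sample people without replacement and return the proportion of
    sampled Positives who have the disease."""
    c = expected_counts(prior)
    reps = [c[("Disease", "Positive")], c[("Disease", "Negative")],
            c[("No Disease", "Positive")], c[("No Disease", "Negative")]]
    disease = np.repeat([True, True, False, False], reps)
    positive = np.repeat([True, False, True, False], reps)
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(disease), size=sample_size, replace=False)
    return disease[idx][positive[idx]].mean()


print(posterior_given_positive(0.004))      # population prior
print(posterior_given_positive(0.05))       # doctor's subjective prior
print(expected_counts(0.05))
print(simulated_true_positive_rate(0.05))
```

Changing the `seed` argument plays the role of re-running the notebook's sampling cells: the simulated proportion should vary around the Bayes' Rule value rather than match it exactly.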

+ 0 - 58
content/chapters/18_A/Updating_Predictions.ipynb

@@ -1,58 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "code",
-   "execution_count": 1,
-   "metadata": {
-    "collapsed": true,
-    "tags": [
-     "remove_input"
-    ]
-   },
-   "outputs": [],
-   "source": [
-    "import matplotlib\n",
-    "from datascience import *\n",
-    "path_data = '../../../data/'\n",
-    "%matplotlib inline\n",
-    "import matplotlib.pyplot as plots\n",
-    "import numpy as np\n",
-    "plots.style.use('fivethirtyeight')"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "### Updating Predictions ###\n",
-    "We know how to use training data to classify a point into one of two categories. Our classification is just a prediction of the class, based on the most common class among the training points that are nearest our new point. \n",
-    "\n",
-    "Suppose that we eventually find out the true class of our new point. Then we will know whether we got the classification right. Also, we will have a new point that we can add to our training set, because we know its class. This *updates* our training set. So, naturally, we will want to *update our classifier* based on the new training set.\n",
-    "\n",
-    "This chapter looks at some simple scenarios where new data leads us to update our predictions. While the examples in the chapter are simple in terms of calculation, the method of updating can be generalized to work in complex settings and is one of the most powerful tools used for machine learning."
-   ]
-  }
- ],
- "metadata": {
-  "anaconda-cloud": {},
-  "kernelspec": {
-   "display_name": "Python 3",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.4.5"
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 0
-}

+ 2 - 5
content/chapters/intro.md

@@ -8,12 +8,9 @@ The Foundations of Data Science
 
 Contributions by [David Wagner](https://www.cs.berkeley.edu/~daw/) and Henry Milner
 
-This is the textbook for the [Foundations of Data Science class at UC Berkeley][data8].
+This is the textbook for the Foundations of Data Science module at the University of East Anglia.
 
-[View this textbook online on GitHub Pages.][ghpages]
-
-[data8]: http://data8.org/
-[ghpages]: https://inferentialthinking.com
+Amendments by Rob Blair
 
 The contents of this book are licensed for free consumption under the following license:  
 [Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)](https://creativecommons.org/licenses/by-nc-nd/4.0/).

Some files were not shown because too many files changed in this diff