{ "cells": [ { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false, "tags": [ "remove_input" ] }, "outputs": [], "source": [ "import numpy as np\n", "np.set_printoptions(threshold=50)\n", "path_data = '../../../data/'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Tables\n", "\n", "Tables are a fundamental object type for representing data sets. A table can be viewed in two ways:\n", "* a sequence of named columns that each describe a single aspect of all entries in a data set, or\n", "* a sequence of rows that each contain all information about a single entry in a data set.\n", "\n", "In order to use tables, import all of the module called `datascience`, a module created for this text." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": true }, "outputs": [ { "ename": "ModuleNotFoundError", "evalue": "No module named 'datascience'", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mModuleNotFoundError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0;32mfrom\u001b[0m \u001b[0mdatascience\u001b[0m \u001b[0;32mimport\u001b[0m \u001b[0;34m*\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mModuleNotFoundError\u001b[0m: No module named 'datascience'" ] } ], "source": [ "from datascience import *" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Empty tables can be created using the `Table` function. An empty table is usefuly because it can be extended to contain new rows and columns." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", "\n", " \n", " \n", " \n", " \n", "
" ], "text/plain": [] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Table()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `with_columns` method on a table constructs a new table with additional labeled columns. Each column of a table is an array. To add one new column to a table, call `with_columns` with a label and an array. (The `with_column` method can be used with the same effect.)\n", "\n", "Below, we begin each example with an empty table that has no columns. " ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Number of petals
8
34
5
" ], "text/plain": [ "Number of petals\n", "8\n", "34\n", "5" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Table().with_columns('Number of petals', make_array(8, 34, 5))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To add two (or more) new columns, provide the label and array for each column. All columns must have the same length, or an error will occur." ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Number of petals Name
8 lotus
34 sunflower
5 rose
" ], "text/plain": [ "Number of petals | Name\n", "8 | lotus\n", "34 | sunflower\n", "5 | rose" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Table().with_columns(\n", " 'Number of petals', make_array(8, 34, 5),\n", " 'Name', make_array('lotus', 'sunflower', 'rose')\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can give this table a name, and then extend the table with another column." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Number of petals Name Color
8 lotus pink
34 sunflower yellow
5 rose red
" ], "text/plain": [ "Number of petals | Name | Color\n", "8 | lotus | pink\n", "34 | sunflower | yellow\n", "5 | rose | red" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "flowers = Table().with_columns(\n", " 'Number of petals', make_array(8, 34, 5),\n", " 'Name', make_array('lotus', 'sunflower', 'rose')\n", ")\n", "\n", "flowers.with_columns(\n", " 'Color', make_array('pink', 'yellow', 'red')\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `with_columns` method creates a new table each time it is called, so the original table is not affected. For example, the table `flowers` still has only the two columns that it had when it was created." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Number of petals Name
8 lotus
34 sunflower
5 rose
" ], "text/plain": [ "Number of petals | Name\n", "8 | lotus\n", "34 | sunflower\n", "5 | rose" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "flowers" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Creating tables in this way involves a lot of typing. If the data have already been entered somewhere, it is usually possible to use Python to read it into a table, instead of typing it all in cell by cell.\n", "\n", "Often, tables are created from files that contain comma-separated values. Such files are called CSV files.\n", "\n", "Below, we use the Table method `read_table` to read a CSV file that contains some of the data used by Minard in his graphic about Napoleon's Russian campaign. The data are placed in a table named `minard`." ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Longitude Latitude City Direction Survivors
32 54.8 Smolensk Advance 145000
33.2 54.9 Dorogobouge Advance 140000
34.4 55.5 Chjat Advance 127100
37.6 55.8 Moscou Advance 100000
34.3 55.2 Wixma Retreat 55000
32 54.6 Smolensk Retreat 24000
30.4 54.4 Orscha Retreat 20000
26.8 54.3 Moiodexno Retreat 12000
" ], "text/plain": [ "Longitude | Latitude | City | Direction | Survivors\n", "32 | 54.8 | Smolensk | Advance | 145000\n", "33.2 | 54.9 | Dorogobouge | Advance | 140000\n", "34.4 | 55.5 | Chjat | Advance | 127100\n", "37.6 | 55.8 | Moscou | Advance | 100000\n", "34.3 | 55.2 | Wixma | Retreat | 55000\n", "32 | 54.6 | Smolensk | Retreat | 24000\n", "30.4 | 54.4 | Orscha | Retreat | 20000\n", "26.8 | 54.3 | Moiodexno | Retreat | 12000" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "minard = Table.read_table(path_data + 'minard.csv')\n", "minard" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We will use this small table to demonstrate some useful Table methods. We will then use those same methods, and develop other methods, on much larger tables of data." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### The Size of the Table ###\n", "\n", "The method `num_columns` gives the number of columns in the table, and `num_rows` the number of rows." ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "5" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "minard.num_columns" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "8" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "minard.num_rows" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Column Labels \n", "The method `labels` can be used to list the labels of all the columns. With `minard` we don't gain much by this, but it can be very useful for tables that are so large that not all columns are visible on the screen." ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "('Longitude', 'Latitude', 'City', 'Direction', 'Survivors')" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "minard.labels" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can change column labels using the `relabeled` method. This creates a new table and leaves `minard` unchanged." ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Longitude Latitude City Name Direction Survivors
32 54.8 Smolensk Advance 145000
33.2 54.9 Dorogobouge Advance 140000
34.4 55.5 Chjat Advance 127100
37.6 55.8 Moscou Advance 100000
34.3 55.2 Wixma Retreat 55000
32 54.6 Smolensk Retreat 24000
30.4 54.4 Orscha Retreat 20000
26.8 54.3 Moiodexno Retreat 12000
" ], "text/plain": [ "Longitude | Latitude | City Name | Direction | Survivors\n", "32 | 54.8 | Smolensk | Advance | 145000\n", "33.2 | 54.9 | Dorogobouge | Advance | 140000\n", "34.4 | 55.5 | Chjat | Advance | 127100\n", "37.6 | 55.8 | Moscou | Advance | 100000\n", "34.3 | 55.2 | Wixma | Retreat | 55000\n", "32 | 54.6 | Smolensk | Retreat | 24000\n", "30.4 | 54.4 | Orscha | Retreat | 20000\n", "26.8 | 54.3 | Moiodexno | Retreat | 12000" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "minard.relabeled('City', 'City Name')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "However, this method does not change the original table. " ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Longitude Latitude City Direction Survivors
32 54.8 Smolensk Advance 145000
33.2 54.9 Dorogobouge Advance 140000
34.4 55.5 Chjat Advance 127100
37.6 55.8 Moscou Advance 100000
34.3 55.2 Wixma Retreat 55000
32 54.6 Smolensk Retreat 24000
30.4 54.4 Orscha Retreat 20000
26.8 54.3 Moiodexno Retreat 12000
" ], "text/plain": [ "Longitude | Latitude | City | Direction | Survivors\n", "32 | 54.8 | Smolensk | Advance | 145000\n", "33.2 | 54.9 | Dorogobouge | Advance | 140000\n", "34.4 | 55.5 | Chjat | Advance | 127100\n", "37.6 | 55.8 | Moscou | Advance | 100000\n", "34.3 | 55.2 | Wixma | Retreat | 55000\n", "32 | 54.6 | Smolensk | Retreat | 24000\n", "30.4 | 54.4 | Orscha | Retreat | 20000\n", "26.8 | 54.3 | Moiodexno | Retreat | 12000" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "minard" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A common pattern is to assign the original name `minard` to the new table, so that all future uses of `minard` will refer to the relabeled table." ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Longitude Latitude City Name Direction Survivors
32 54.8 Smolensk Advance 145000
33.2 54.9 Dorogobouge Advance 140000
34.4 55.5 Chjat Advance 127100
37.6 55.8 Moscou Advance 100000
34.3 55.2 Wixma Retreat 55000
32 54.6 Smolensk Retreat 24000
30.4 54.4 Orscha Retreat 20000
26.8 54.3 Moiodexno Retreat 12000
" ], "text/plain": [ "Longitude | Latitude | City Name | Direction | Survivors\n", "32 | 54.8 | Smolensk | Advance | 145000\n", "33.2 | 54.9 | Dorogobouge | Advance | 140000\n", "34.4 | 55.5 | Chjat | Advance | 127100\n", "37.6 | 55.8 | Moscou | Advance | 100000\n", "34.3 | 55.2 | Wixma | Retreat | 55000\n", "32 | 54.6 | Smolensk | Retreat | 24000\n", "30.4 | 54.4 | Orscha | Retreat | 20000\n", "26.8 | 54.3 | Moiodexno | Retreat | 12000" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "minard = minard.relabeled('City', 'City Name')\n", "minard" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Accessing the Data in a Column ###\n", "We can use a column's label to access the array of data in the column." ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([145000, 140000, 127100, 100000, 55000, 24000, 20000, 12000])" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "minard.column('Survivors')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The 5 columns are indexed 0, 1, 2, 3, and 4. The column `Survivors` can also be accessed by using its column index." ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([145000, 140000, 127100, 100000, 55000, 24000, 20000, 12000])" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "minard.column(4)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The 8 items in the array are indexed 0, 1, 2, and so on, up to 7. The items in the column can be accessed using `item`, as with any array." ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "145000" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "minard.column(4).item(0)" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "24000" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "minard.column(4).item(5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Working with the Data in a Column ###\n", "Because columns are arrays, we can use array operations on them to discover new information. For example, we can create a new column that contains the percent of all survivors at each city after Smolensk." ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Longitude Latitude City Name Direction Survivors Percent Surviving
32 54.8 Smolensk Advance 145000 1
33.2 54.9 Dorogobouge Advance 140000 0.965517
34.4 55.5 Chjat Advance 127100 0.876552
37.6 55.8 Moscou Advance 100000 0.689655
34.3 55.2 Wixma Retreat 55000 0.37931
32 54.6 Smolensk Retreat 24000 0.165517
30.4 54.4 Orscha Retreat 20000 0.137931
26.8 54.3 Moiodexno Retreat 12000 0.0827586
" ], "text/plain": [ "Longitude | Latitude | City Name | Direction | Survivors | Percent Surviving\n", "32 | 54.8 | Smolensk | Advance | 145000 | 1\n", "33.2 | 54.9 | Dorogobouge | Advance | 140000 | 0.965517\n", "34.4 | 55.5 | Chjat | Advance | 127100 | 0.876552\n", "37.6 | 55.8 | Moscou | Advance | 100000 | 0.689655\n", "34.3 | 55.2 | Wixma | Retreat | 55000 | 0.37931\n", "32 | 54.6 | Smolensk | Retreat | 24000 | 0.165517\n", "30.4 | 54.4 | Orscha | Retreat | 20000 | 0.137931\n", "26.8 | 54.3 | Moiodexno | Retreat | 12000 | 0.0827586" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "initial = minard.column('Survivors').item(0)\n", "minard = minard.with_columns(\n", " 'Percent Surviving', minard.column('Survivors')/initial\n", ")\n", "minard" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To make the proportions in the new columns appear as percents, we can use the method `set_format` with the option `PercentFormatter`. The `set_format` method takes `Formatter` objects, which exist for dates (`DateFormatter`), currencies (`CurrencyFormatter`), numbers, and percentages." ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Longitude Latitude City Name Direction Survivors Percent Surviving
32 54.8 Smolensk Advance 145000 100.00%
33.2 54.9 Dorogobouge Advance 140000 96.55%
34.4 55.5 Chjat Advance 127100 87.66%
37.6 55.8 Moscou Advance 100000 68.97%
34.3 55.2 Wixma Retreat 55000 37.93%
32 54.6 Smolensk Retreat 24000 16.55%
30.4 54.4 Orscha Retreat 20000 13.79%
26.8 54.3 Moiodexno Retreat 12000 8.28%
" ], "text/plain": [ "Longitude | Latitude | City Name | Direction | Survivors | Percent Surviving\n", "32 | 54.8 | Smolensk | Advance | 145000 | 100.00%\n", "33.2 | 54.9 | Dorogobouge | Advance | 140000 | 96.55%\n", "34.4 | 55.5 | Chjat | Advance | 127100 | 87.66%\n", "37.6 | 55.8 | Moscou | Advance | 100000 | 68.97%\n", "34.3 | 55.2 | Wixma | Retreat | 55000 | 37.93%\n", "32 | 54.6 | Smolensk | Retreat | 24000 | 16.55%\n", "30.4 | 54.4 | Orscha | Retreat | 20000 | 13.79%\n", "26.8 | 54.3 | Moiodexno | Retreat | 12000 | 8.28%" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "minard.set_format('Percent Surviving', PercentFormatter)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Choosing Sets of Columns ###\n", "The method `select` creates a new table that contains only the specified columns." ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Longitude Latitude
32 54.8
33.2 54.9
34.4 55.5
37.6 55.8
34.3 55.2
32 54.6
30.4 54.4
26.8 54.3
" ], "text/plain": [ "Longitude | Latitude\n", "32 | 54.8\n", "33.2 | 54.9\n", "34.4 | 55.5\n", "37.6 | 55.8\n", "34.3 | 55.2\n", "32 | 54.6\n", "30.4 | 54.4\n", "26.8 | 54.3" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "minard.select('Longitude', 'Latitude')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The same selection can be made using column indices instead of labels." ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Longitude Latitude
32 54.8
33.2 54.9
34.4 55.5
37.6 55.8
34.3 55.2
32 54.6
30.4 54.4
26.8 54.3
" ], "text/plain": [ "Longitude | Latitude\n", "32 | 54.8\n", "33.2 | 54.9\n", "34.4 | 55.5\n", "37.6 | 55.8\n", "34.3 | 55.2\n", "32 | 54.6\n", "30.4 | 54.4\n", "26.8 | 54.3" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "minard.select(0, 1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The result of using `select` is a new table, even when you select just one column." ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Survivors
145000
140000
127100
100000
55000
24000
20000
12000
" ], "text/plain": [ "Survivors\n", "145000\n", "140000\n", "127100\n", "100000\n", "55000\n", "24000\n", "20000\n", "12000" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "minard.select('Survivors')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Notice that the result is a table, unlike the result of `column`, which is an array." ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([145000, 140000, 127100, 100000, 55000, 24000, 20000, 12000])" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "minard.column('Survivors')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Another way to create a new table consisting of a set of columns is to `drop` the columns you don't want." ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
City Name Survivors Percent Surviving
Smolensk 145000 100.00%
Dorogobouge 140000 96.55%
Chjat 127100 87.66%
Moscou 100000 68.97%
Wixma 55000 37.93%
Smolensk 24000 16.55%
Orscha 20000 13.79%
Moiodexno 12000 8.28%
" ], "text/plain": [ "City Name | Survivors | Percent Surviving\n", "Smolensk | 145000 | 100.00%\n", "Dorogobouge | 140000 | 96.55%\n", "Chjat | 127100 | 87.66%\n", "Moscou | 100000 | 68.97%\n", "Wixma | 55000 | 37.93%\n", "Smolensk | 24000 | 16.55%\n", "Orscha | 20000 | 13.79%\n", "Moiodexno | 12000 | 8.28%" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "minard.drop('Longitude', 'Latitude', 'Direction')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Neither `select` nor `drop` change the original table. Instead, they create new smaller tables that share the same data. The fact that the original table is preserved is useful! You can generate multiple different tables that only consider certain columns without worrying that one analysis will affect the other." ] }, { "cell_type": "code", "execution_count": 26, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Longitude Latitude City Name Direction Survivors Percent Surviving
32 54.8 Smolensk Advance 145000 100.00%
33.2 54.9 Dorogobouge Advance 140000 96.55%
34.4 55.5 Chjat Advance 127100 87.66%
37.6 55.8 Moscou Advance 100000 68.97%
34.3 55.2 Wixma Retreat 55000 37.93%
32 54.6 Smolensk Retreat 24000 16.55%
30.4 54.4 Orscha Retreat 20000 13.79%
26.8 54.3 Moiodexno Retreat 12000 8.28%
" ], "text/plain": [ "Longitude | Latitude | City Name | Direction | Survivors | Percent Surviving\n", "32 | 54.8 | Smolensk | Advance | 145000 | 100.00%\n", "33.2 | 54.9 | Dorogobouge | Advance | 140000 | 96.55%\n", "34.4 | 55.5 | Chjat | Advance | 127100 | 87.66%\n", "37.6 | 55.8 | Moscou | Advance | 100000 | 68.97%\n", "34.3 | 55.2 | Wixma | Retreat | 55000 | 37.93%\n", "32 | 54.6 | Smolensk | Retreat | 24000 | 16.55%\n", "30.4 | 54.4 | Orscha | Retreat | 20000 | 13.79%\n", "26.8 | 54.3 | Moiodexno | Retreat | 12000 | 8.28%" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "minard" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "All of the methods that we have used above can be applied to any table." ] } ], "metadata": { "anaconda-cloud": {}, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.5" } }, "nbformat": 4, "nbformat_minor": 2 }