{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"tags": [
"remove_input"
]
},
"outputs": [],
"source": [
"import pandas as pd\n",
"import numpy as np\n",
"path_data = '../../../../data/'\n",
"np.set_printoptions(threshold=50)"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"tags": [
"remove_input"
]
},
"outputs": [],
"source": [
"nba_salaries = pd.read_csv(path_data + 'nba_salaries.csv')\n",
"nba = nba_salaries.rename(columns={\"'15-'16 SALARY\": 'SALARY'})"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Selecting Rows\n",
"\n",
"Often, we would like to extract just those rows that correspond to entries with a particular feature. For example, we might want only the rows corresponding to the Warriors, or to players who earned more than $\\$10$ million. Or we might just want the top five earners."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Specified Rows\n",
"The fact that a DataFrame creates an index by default startts to become very useful here as we can specify which rows (by default) we wish to inspect by stating an index or an index range. The argument used a row index or array of indices, and it creates a new DataFrame consisting of only those rows.\n",
"\n",
"For example, if we wanted just the first row of `nba`, we could use `df.iloc[]` as follows."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" PLAYER | \n",
" POSITION | \n",
" TEAM | \n",
" SALARY | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" Paul Millsap | \n",
" PF | \n",
" Atlanta Hawks | \n",
" 18.671659 | \n",
"
\n",
" \n",
" 1 | \n",
" Al Horford | \n",
" C | \n",
" Atlanta Hawks | \n",
" 12.000000 | \n",
"
\n",
" \n",
" 2 | \n",
" Tiago Splitter | \n",
" C | \n",
" Atlanta Hawks | \n",
" 9.756250 | \n",
"
\n",
" \n",
" 3 | \n",
" Jeff Teague | \n",
" PG | \n",
" Atlanta Hawks | \n",
" 8.000000 | \n",
"
\n",
" \n",
" 4 | \n",
" Kyle Korver | \n",
" SG | \n",
" Atlanta Hawks | \n",
" 5.746479 | \n",
"
\n",
" \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
"
\n",
" \n",
" 412 | \n",
" Gary Neal | \n",
" PG | \n",
" Washington Wizards | \n",
" 2.139000 | \n",
"
\n",
" \n",
" 413 | \n",
" DeJuan Blair | \n",
" C | \n",
" Washington Wizards | \n",
" 2.000000 | \n",
"
\n",
" \n",
" 414 | \n",
" Kelly Oubre Jr. | \n",
" SF | \n",
" Washington Wizards | \n",
" 1.920240 | \n",
"
\n",
" \n",
" 415 | \n",
" Garrett Temple | \n",
" SG | \n",
" Washington Wizards | \n",
" 1.100602 | \n",
"
\n",
" \n",
" 416 | \n",
" Jarell Eddie | \n",
" SG | \n",
" Washington Wizards | \n",
" 0.561716 | \n",
"
\n",
" \n",
"
\n",
"
417 rows × 4 columns
\n",
"
"
],
"text/plain": [
" PLAYER POSITION TEAM SALARY\n",
"0 Paul Millsap PF Atlanta Hawks 18.671659\n",
"1 Al Horford C Atlanta Hawks 12.000000\n",
"2 Tiago Splitter C Atlanta Hawks 9.756250\n",
"3 Jeff Teague PG Atlanta Hawks 8.000000\n",
"4 Kyle Korver SG Atlanta Hawks 5.746479\n",
".. ... ... ... ...\n",
"412 Gary Neal PG Washington Wizards 2.139000\n",
"413 DeJuan Blair C Washington Wizards 2.000000\n",
"414 Kelly Oubre Jr. SF Washington Wizards 1.920240\n",
"415 Garrett Temple SG Washington Wizards 1.100602\n",
"416 Jarell Eddie SG Washington Wizards 0.561716\n",
"\n",
"[417 rows x 4 columns]"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nba"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"PLAYER Paul Millsap\n",
"POSITION PF\n",
"TEAM Atlanta Hawks\n",
"SALARY 18.6717\n",
"Name: 0, dtype: object"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nba.iloc[0]"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" PLAYER | \n",
" POSITION | \n",
" TEAM | \n",
" SALARY | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" Paul Millsap | \n",
" PF | \n",
" Atlanta Hawks | \n",
" 18.671659 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" PLAYER POSITION TEAM SALARY\n",
"0 Paul Millsap PF Atlanta Hawks 18.671659"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nba.iloc[[0]]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This is a new table with just the single row that we specified.\n",
"\n",
"We could also get the fourth, fifth, and sixth rows by specifying a range of indices as the argument."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" PLAYER | \n",
" POSITION | \n",
" TEAM | \n",
" SALARY | \n",
"
\n",
" \n",
" \n",
" \n",
" 3 | \n",
" Jeff Teague | \n",
" PG | \n",
" Atlanta Hawks | \n",
" 8.000000 | \n",
"
\n",
" \n",
" 4 | \n",
" Kyle Korver | \n",
" SG | \n",
" Atlanta Hawks | \n",
" 5.746479 | \n",
"
\n",
" \n",
" 5 | \n",
" Thabo Sefolosha | \n",
" SF | \n",
" Atlanta Hawks | \n",
" 4.000000 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" PLAYER POSITION TEAM SALARY\n",
"3 Jeff Teague PG Atlanta Hawks 8.000000\n",
"4 Kyle Korver SG Atlanta Hawks 5.746479\n",
"5 Thabo Sefolosha SF Atlanta Hawks 4.000000"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nba.iloc[np.arange(3, 6)]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If we want a table of the top 5 highest paid players, we can first sort the list by salary and then `df.iloc[]` the first five rows:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" PLAYER | \n",
" POSITION | \n",
" TEAM | \n",
" SALARY | \n",
"
\n",
" \n",
" \n",
" \n",
" 169 | \n",
" Kobe Bryant | \n",
" SF | \n",
" Los Angeles Lakers | \n",
" 25.000000 | \n",
"
\n",
" \n",
" 29 | \n",
" Joe Johnson | \n",
" SF | \n",
" Brooklyn Nets | \n",
" 24.894863 | \n",
"
\n",
" \n",
" 72 | \n",
" LeBron James | \n",
" SF | \n",
" Cleveland Cavaliers | \n",
" 22.970500 | \n",
"
\n",
" \n",
" 255 | \n",
" Carmelo Anthony | \n",
" SF | \n",
" New York Knicks | \n",
" 22.875000 | \n",
"
\n",
" \n",
" 131 | \n",
" Dwight Howard | \n",
" C | \n",
" Houston Rockets | \n",
" 22.359364 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" PLAYER POSITION TEAM SALARY\n",
"169 Kobe Bryant SF Los Angeles Lakers 25.000000\n",
"29 Joe Johnson SF Brooklyn Nets 24.894863\n",
"72 LeBron James SF Cleveland Cavaliers 22.970500\n",
"255 Carmelo Anthony SF New York Knicks 22.875000\n",
"131 Dwight Howard C Houston Rockets 22.359364"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nba.sort_values('SALARY', ascending=False).iloc[(np.arange(5))]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Rows Corresponding to a Specified Feature\n",
"More often, we will want to access data in a set of rows that have a certain feature, but whose indices we don't know ahead of time. For example, we might want data on all the players who made more than $\\$10$ million, but we don't want to spend time counting rows in the sorted table.\n",
"\n",
"Array version - if we wish to work with an array we can use `np.where(df['column'] criteria)`. \n",
"\n",
"[np.where()](https://numpy.org/doc/stable/reference/generated/numpy.where.html)\n",
"\n",
"DataFrame version - to implement a selection criteria the df is called with selection criteria being applied to the df.col i.e. `df[df['column_name']criteria]`.\n",
"\n",
"In the first example, we extract the data for all those who earned more than $\\$10$ million."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(array([ 0, 1, 29, ..., 400, 401, 402]),)"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"np.where(nba['SALARY'] > 10)\n",
"\n",
"# or - this is an example of alternatives being available to select,\n",
"# this may depend upon preference, the task at hand, the impact of processing time or export requirements\n",
"\n",
"#nba[nba['SALARY'] >10]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The use of the argument `df[df[col] > 10]` ensured that each selected row had a value of `SALARY` that was greater than 10.\n",
"\n",
"There are 69 rows in the new table, corresponding to the 69 players who made more than $10$ million dollars. Arranging these rows in order makes the data easier to analyze. DeMar DeRozan of the Toronto Raptors was the \"poorest\" of this group, at a salary of just over $10$ million dollars."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" PLAYER | \n",
" POSITION | \n",
" TEAM | \n",
" SALARY | \n",
"
\n",
" \n",
" \n",
" \n",
" 368 | \n",
" DeMar DeRozan | \n",
" SG | \n",
" Toronto Raptors | \n",
" 10.050000 | \n",
"
\n",
" \n",
" 298 | \n",
" Gerald Wallace | \n",
" SF | \n",
" Philadelphia 76ers | \n",
" 10.105855 | \n",
"
\n",
" \n",
" 204 | \n",
" Luol Deng | \n",
" SF | \n",
" Miami Heat | \n",
" 10.151612 | \n",
"
\n",
" \n",
" 144 | \n",
" Monta Ellis | \n",
" SG | \n",
" Indiana Pacers | \n",
" 10.300000 | \n",
"
\n",
" \n",
" 95 | \n",
" Wilson Chandler | \n",
" SF | \n",
" Denver Nuggets | \n",
" 10.449438 | \n",
"
\n",
" \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
"
\n",
" \n",
" 131 | \n",
" Dwight Howard | \n",
" C | \n",
" Houston Rockets | \n",
" 22.359364 | \n",
"
\n",
" \n",
" 255 | \n",
" Carmelo Anthony | \n",
" SF | \n",
" New York Knicks | \n",
" 22.875000 | \n",
"
\n",
" \n",
" 72 | \n",
" LeBron James | \n",
" SF | \n",
" Cleveland Cavaliers | \n",
" 22.970500 | \n",
"
\n",
" \n",
" 29 | \n",
" Joe Johnson | \n",
" SF | \n",
" Brooklyn Nets | \n",
" 24.894863 | \n",
"
\n",
" \n",
" 169 | \n",
" Kobe Bryant | \n",
" SF | \n",
" Los Angeles Lakers | \n",
" 25.000000 | \n",
"
\n",
" \n",
"
\n",
"
69 rows × 4 columns
\n",
"
"
],
"text/plain": [
" PLAYER POSITION TEAM SALARY\n",
"368 DeMar DeRozan SG Toronto Raptors 10.050000\n",
"298 Gerald Wallace SF Philadelphia 76ers 10.105855\n",
"204 Luol Deng SF Miami Heat 10.151612\n",
"144 Monta Ellis SG Indiana Pacers 10.300000\n",
"95 Wilson Chandler SF Denver Nuggets 10.449438\n",
".. ... ... ... ...\n",
"131 Dwight Howard C Houston Rockets 22.359364\n",
"255 Carmelo Anthony SF New York Knicks 22.875000\n",
"72 LeBron James SF Cleveland Cavaliers 22.970500\n",
"29 Joe Johnson SF Brooklyn Nets 24.894863\n",
"169 Kobe Bryant SF Los Angeles Lakers 25.000000\n",
"\n",
"[69 rows x 4 columns]"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nba[nba['SALARY'] >10].sort_values('SALARY')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"How much did Stephen Curry make? For the answer, we have to access the row where the value of `PLAYER` is equal to `Stephen Curry`. That is placed a table consisting of just one line:"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" PLAYER | \n",
" POSITION | \n",
" TEAM | \n",
" SALARY | \n",
"
\n",
" \n",
" \n",
" \n",
" 121 | \n",
" Stephen Curry | \n",
" PG | \n",
" Golden State Warriors | \n",
" 11.370786 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" PLAYER POSITION TEAM SALARY\n",
"121 Stephen Curry PG Golden State Warriors 11.370786"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nba[nba['PLAYER'] == 'Stephen Curry']"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Curry made just under $\\$11.4$ million dollars. That's a lot of money, but it's less than half the salary of LeBron James. You'll find that salary in the \"Top 5\" table earlier in this section, or you could find it replacing `'Stephen Curry'` by `'LeBron James'` in the line of code above.\n",
"\n",
"Thus for example you can get a DataFrame where the 'TEAM' is exactly equal to 'Golden State Warriors':"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" PLAYER | \n",
" POSITION | \n",
" TEAM | \n",
" SALARY | \n",
"
\n",
" \n",
" \n",
" \n",
" 117 | \n",
" Klay Thompson | \n",
" SG | \n",
" Golden State Warriors | \n",
" 15.501000 | \n",
"
\n",
" \n",
" 118 | \n",
" Draymond Green | \n",
" PF | \n",
" Golden State Warriors | \n",
" 14.260870 | \n",
"
\n",
" \n",
" 119 | \n",
" Andrew Bogut | \n",
" C | \n",
" Golden State Warriors | \n",
" 13.800000 | \n",
"
\n",
" \n",
" 120 | \n",
" Andre Iguodala | \n",
" SF | \n",
" Golden State Warriors | \n",
" 11.710456 | \n",
"
\n",
" \n",
" 121 | \n",
" Stephen Curry | \n",
" PG | \n",
" Golden State Warriors | \n",
" 11.370786 | \n",
"
\n",
" \n",
" 122 | \n",
" Jason Thompson | \n",
" PF | \n",
" Golden State Warriors | \n",
" 7.008475 | \n",
"
\n",
" \n",
" 123 | \n",
" Shaun Livingston | \n",
" PG | \n",
" Golden State Warriors | \n",
" 5.543725 | \n",
"
\n",
" \n",
" 124 | \n",
" Harrison Barnes | \n",
" SF | \n",
" Golden State Warriors | \n",
" 3.873398 | \n",
"
\n",
" \n",
" 125 | \n",
" Marreese Speights | \n",
" C | \n",
" Golden State Warriors | \n",
" 3.815000 | \n",
"
\n",
" \n",
" 126 | \n",
" Leandro Barbosa | \n",
" SG | \n",
" Golden State Warriors | \n",
" 2.500000 | \n",
"
\n",
" \n",
" 127 | \n",
" Festus Ezeli | \n",
" C | \n",
" Golden State Warriors | \n",
" 2.008748 | \n",
"
\n",
" \n",
" 128 | \n",
" Brandon Rush | \n",
" SF | \n",
" Golden State Warriors | \n",
" 1.270964 | \n",
"
\n",
" \n",
" 129 | \n",
" Kevon Looney | \n",
" SF | \n",
" Golden State Warriors | \n",
" 1.131960 | \n",
"
\n",
" \n",
" 130 | \n",
" Anderson Varejao | \n",
" PF | \n",
" Golden State Warriors | \n",
" 0.289755 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" PLAYER POSITION TEAM SALARY\n",
"117 Klay Thompson SG Golden State Warriors 15.501000\n",
"118 Draymond Green PF Golden State Warriors 14.260870\n",
"119 Andrew Bogut C Golden State Warriors 13.800000\n",
"120 Andre Iguodala SF Golden State Warriors 11.710456\n",
"121 Stephen Curry PG Golden State Warriors 11.370786\n",
"122 Jason Thompson PF Golden State Warriors 7.008475\n",
"123 Shaun Livingston PG Golden State Warriors 5.543725\n",
"124 Harrison Barnes SF Golden State Warriors 3.873398\n",
"125 Marreese Speights C Golden State Warriors 3.815000\n",
"126 Leandro Barbosa SG Golden State Warriors 2.500000\n",
"127 Festus Ezeli C Golden State Warriors 2.008748\n",
"128 Brandon Rush SF Golden State Warriors 1.270964\n",
"129 Kevon Looney SF Golden State Warriors 1.131960\n",
"130 Anderson Varejao PF Golden State Warriors 0.289755"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nba[nba['TEAM'] == 'Golden State Warriors']"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This portion of the table is already sorted by salary, because the original table listed players sorted by salary within the same team. By not using `.head()` at the end of the line all rows are shown, not just the first 10."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Multiple Features ###\n",
"You can access rows that have multiple specified features, by using the boolean `&` operator. For example, here is a way to extract all the Point Guards whose salaries were over $\\$15$ million."
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" PLAYER | \n",
" POSITION | \n",
" TEAM | \n",
" SALARY | \n",
"
\n",
" \n",
" \n",
" \n",
" 60 | \n",
" Derrick Rose | \n",
" PG | \n",
" Chicago Bulls | \n",
" 20.093064 | \n",
"
\n",
" \n",
" 74 | \n",
" Kyrie Irving | \n",
" PG | \n",
" Cleveland Cavaliers | \n",
" 16.407501 | \n",
"
\n",
" \n",
" 156 | \n",
" Chris Paul | \n",
" PG | \n",
" Los Angeles Clippers | \n",
" 21.468695 | \n",
"
\n",
" \n",
" 269 | \n",
" Russell Westbrook | \n",
" PG | \n",
" Oklahoma City Thunder | \n",
" 16.744218 | \n",
"
\n",
" \n",
" 400 | \n",
" John Wall | \n",
" PG | \n",
" Washington Wizards | \n",
" 15.851950 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" PLAYER POSITION TEAM SALARY\n",
"60 Derrick Rose PG Chicago Bulls 20.093064\n",
"74 Kyrie Irving PG Cleveland Cavaliers 16.407501\n",
"156 Chris Paul PG Los Angeles Clippers 21.468695\n",
"269 Russell Westbrook PG Oklahoma City Thunder 16.744218\n",
"400 John Wall PG Washington Wizards 15.851950"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nba[(nba['POSITION'] == 'PG') & (nba['SALARY'] > 15)]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### General Form ###\n",
"By now you will have realized that the general way to create a new df by selecting rows with a given feature is to use `&` or `OR` with the appropriate condition:\n",
"\n",
"`df[df['column_label_string'] condition(<, >, ==, =>, etc) criteria]`"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" PLAYER | \n",
" POSITION | \n",
" TEAM | \n",
" SALARY | \n",
"
\n",
" \n",
" \n",
" \n",
" 144 | \n",
" Monta Ellis | \n",
" SG | \n",
" Indiana Pacers | \n",
" 10.300000 | \n",
"
\n",
" \n",
" 204 | \n",
" Luol Deng | \n",
" SF | \n",
" Miami Heat | \n",
" 10.151612 | \n",
"
\n",
" \n",
" 298 | \n",
" Gerald Wallace | \n",
" SF | \n",
" Philadelphia 76ers | \n",
" 10.105855 | \n",
"
\n",
" \n",
" 356 | \n",
" Danny Green | \n",
" SG | \n",
" San Antonio Spurs | \n",
" 10.000000 | \n",
"
\n",
" \n",
" 368 | \n",
" DeMar DeRozan | \n",
" SG | \n",
" Toronto Raptors | \n",
" 10.050000 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" PLAYER POSITION TEAM SALARY\n",
"144 Monta Ellis SG Indiana Pacers 10.300000\n",
"204 Luol Deng SF Miami Heat 10.151612\n",
"298 Gerald Wallace SF Philadelphia 76ers 10.105855\n",
"356 Danny Green SG San Antonio Spurs 10.000000\n",
"368 DeMar DeRozan SG Toronto Raptors 10.050000"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nba[(nba['SALARY'] >= 10) & (nba['SALARY'] <=10.3)]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If we specify a condition that isn't satisfied by any row, we get a table with column labels but no rows."
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" PLAYER | \n",
" POSITION | \n",
" TEAM | \n",
" SALARY | \n",
"
\n",
" \n",
" \n",
" \n",
"
\n",
"
"
],
"text/plain": [
"Empty DataFrame\n",
"Columns: [PLAYER, POSITION, TEAM, SALARY]\n",
"Index: []"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nba[nba['PLAYER'] == 'Barack Obama']"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We end the section with a series of examples. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The use of `are.containing` can help save some typing. For example, you can just specify `Warriors` instead of `Golden State Warriors`:"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" PLAYER | \n",
" POSITION | \n",
" TEAM | \n",
" SALARY | \n",
"
\n",
" \n",
" \n",
" \n",
" 117 | \n",
" Klay Thompson | \n",
" SG | \n",
" Golden State Warriors | \n",
" 15.501000 | \n",
"
\n",
" \n",
" 118 | \n",
" Draymond Green | \n",
" PF | \n",
" Golden State Warriors | \n",
" 14.260870 | \n",
"
\n",
" \n",
" 119 | \n",
" Andrew Bogut | \n",
" C | \n",
" Golden State Warriors | \n",
" 13.800000 | \n",
"
\n",
" \n",
" 120 | \n",
" Andre Iguodala | \n",
" SF | \n",
" Golden State Warriors | \n",
" 11.710456 | \n",
"
\n",
" \n",
" 121 | \n",
" Stephen Curry | \n",
" PG | \n",
" Golden State Warriors | \n",
" 11.370786 | \n",
"
\n",
" \n",
" 122 | \n",
" Jason Thompson | \n",
" PF | \n",
" Golden State Warriors | \n",
" 7.008475 | \n",
"
\n",
" \n",
" 123 | \n",
" Shaun Livingston | \n",
" PG | \n",
" Golden State Warriors | \n",
" 5.543725 | \n",
"
\n",
" \n",
" 124 | \n",
" Harrison Barnes | \n",
" SF | \n",
" Golden State Warriors | \n",
" 3.873398 | \n",
"
\n",
" \n",
" 125 | \n",
" Marreese Speights | \n",
" C | \n",
" Golden State Warriors | \n",
" 3.815000 | \n",
"
\n",
" \n",
" 126 | \n",
" Leandro Barbosa | \n",
" SG | \n",
" Golden State Warriors | \n",
" 2.500000 | \n",
"
\n",
" \n",
" 127 | \n",
" Festus Ezeli | \n",
" C | \n",
" Golden State Warriors | \n",
" 2.008748 | \n",
"
\n",
" \n",
" 128 | \n",
" Brandon Rush | \n",
" SF | \n",
" Golden State Warriors | \n",
" 1.270964 | \n",
"
\n",
" \n",
" 129 | \n",
" Kevon Looney | \n",
" SF | \n",
" Golden State Warriors | \n",
" 1.131960 | \n",
"
\n",
" \n",
" 130 | \n",
" Anderson Varejao | \n",
" PF | \n",
" Golden State Warriors | \n",
" 0.289755 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" PLAYER POSITION TEAM SALARY\n",
"117 Klay Thompson SG Golden State Warriors 15.501000\n",
"118 Draymond Green PF Golden State Warriors 14.260870\n",
"119 Andrew Bogut C Golden State Warriors 13.800000\n",
"120 Andre Iguodala SF Golden State Warriors 11.710456\n",
"121 Stephen Curry PG Golden State Warriors 11.370786\n",
"122 Jason Thompson PF Golden State Warriors 7.008475\n",
"123 Shaun Livingston PG Golden State Warriors 5.543725\n",
"124 Harrison Barnes SF Golden State Warriors 3.873398\n",
"125 Marreese Speights C Golden State Warriors 3.815000\n",
"126 Leandro Barbosa SG Golden State Warriors 2.500000\n",
"127 Festus Ezeli C Golden State Warriors 2.008748\n",
"128 Brandon Rush SF Golden State Warriors 1.270964\n",
"129 Kevon Looney SF Golden State Warriors 1.131960\n",
"130 Anderson Varejao PF Golden State Warriors 0.289755"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nba[nba['TEAM'].str.contains('Warriors')]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can extract data for all the guards, both Point Guards and Shooting Guards:"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" PLAYER | \n",
" POSITION | \n",
" TEAM | \n",
" SALARY | \n",
"
\n",
" \n",
" \n",
" \n",
" 3 | \n",
" Jeff Teague | \n",
" PG | \n",
" Atlanta Hawks | \n",
" 8.000000 | \n",
"
\n",
" \n",
" 4 | \n",
" Kyle Korver | \n",
" SG | \n",
" Atlanta Hawks | \n",
" 5.746479 | \n",
"
\n",
" \n",
" 8 | \n",
" Dennis Schroder | \n",
" PG | \n",
" Atlanta Hawks | \n",
" 1.763400 | \n",
"
\n",
" \n",
" 9 | \n",
" Tim Hardaway Jr. | \n",
" SG | \n",
" Atlanta Hawks | \n",
" 1.304520 | \n",
"
\n",
" \n",
" 11 | \n",
" Jason Richardson | \n",
" SG | \n",
" Atlanta Hawks | \n",
" 0.947276 | \n",
"
\n",
" \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
"
\n",
" \n",
" 409 | \n",
" Alan Anderson | \n",
" SG | \n",
" Washington Wizards | \n",
" 4.000000 | \n",
"
\n",
" \n",
" 411 | \n",
" Ramon Sessions | \n",
" PG | \n",
" Washington Wizards | \n",
" 2.170465 | \n",
"
\n",
" \n",
" 412 | \n",
" Gary Neal | \n",
" PG | \n",
" Washington Wizards | \n",
" 2.139000 | \n",
"
\n",
" \n",
" 415 | \n",
" Garrett Temple | \n",
" SG | \n",
" Washington Wizards | \n",
" 1.100602 | \n",
"
\n",
" \n",
" 416 | \n",
" Jarell Eddie | \n",
" SG | \n",
" Washington Wizards | \n",
" 0.561716 | \n",
"
\n",
" \n",
"
\n",
"
181 rows × 4 columns
\n",
"
"
],
"text/plain": [
" PLAYER POSITION TEAM SALARY\n",
"3 Jeff Teague PG Atlanta Hawks 8.000000\n",
"4 Kyle Korver SG Atlanta Hawks 5.746479\n",
"8 Dennis Schroder PG Atlanta Hawks 1.763400\n",
"9 Tim Hardaway Jr. SG Atlanta Hawks 1.304520\n",
"11 Jason Richardson SG Atlanta Hawks 0.947276\n",
".. ... ... ... ...\n",
"409 Alan Anderson SG Washington Wizards 4.000000\n",
"411 Ramon Sessions PG Washington Wizards 2.170465\n",
"412 Gary Neal PG Washington Wizards 2.139000\n",
"415 Garrett Temple SG Washington Wizards 1.100602\n",
"416 Jarell Eddie SG Washington Wizards 0.561716\n",
"\n",
"[181 rows x 4 columns]"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nba[nba['POSITION'].str.contains('G')]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can get all the players who were not Cleveland Cavaliers and had a salary of no less than $\\$20$ million:"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" PLAYER | \n",
" POSITION | \n",
" TEAM | \n",
" SALARY | \n",
"
\n",
" \n",
" \n",
" \n",
" 29 | \n",
" Joe Johnson | \n",
" SF | \n",
" Brooklyn Nets | \n",
" 24.894863 | \n",
"
\n",
" \n",
" 60 | \n",
" Derrick Rose | \n",
" PG | \n",
" Chicago Bulls | \n",
" 20.093064 | \n",
"
\n",
" \n",
" 131 | \n",
" Dwight Howard | \n",
" C | \n",
" Houston Rockets | \n",
" 22.359364 | \n",
"
\n",
" \n",
" 156 | \n",
" Chris Paul | \n",
" PG | \n",
" Los Angeles Clippers | \n",
" 21.468695 | \n",
"
\n",
" \n",
" 169 | \n",
" Kobe Bryant | \n",
" SF | \n",
" Los Angeles Lakers | \n",
" 25.000000 | \n",
"
\n",
" \n",
" 201 | \n",
" Chris Bosh | \n",
" PF | \n",
" Miami Heat | \n",
" 22.192730 | \n",
"
\n",
" \n",
" 255 | \n",
" Carmelo Anthony | \n",
" SF | \n",
" New York Knicks | \n",
" 22.875000 | \n",
"
\n",
" \n",
" 268 | \n",
" Kevin Durant | \n",
" SF | \n",
" Oklahoma City Thunder | \n",
" 20.158622 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" PLAYER POSITION TEAM SALARY\n",
"29 Joe Johnson SF Brooklyn Nets 24.894863\n",
"60 Derrick Rose PG Chicago Bulls 20.093064\n",
"131 Dwight Howard C Houston Rockets 22.359364\n",
"156 Chris Paul PG Los Angeles Clippers 21.468695\n",
"169 Kobe Bryant SF Los Angeles Lakers 25.000000\n",
"201 Chris Bosh PF Miami Heat 22.192730\n",
"255 Carmelo Anthony SF New York Knicks 22.875000\n",
"268 Kevin Durant SF Oklahoma City Thunder 20.158622"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"other_than_Cavs = nba[nba['TEAM'] != 'Cleveland Cavaliers']\n",
"other_than_Cavs[other_than_Cavs['SALARY'] > 20]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The same table can be created in many ways. Here is another, and no doubt you can think of more."
]
}
],
"metadata": {
"anaconda-cloud": {},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.12"
}
},
"nbformat": 4,
"nbformat_minor": 2
}