{"nbformat":4,"nbformat_minor":0,"metadata":{"colab":{"name":"2020-11-05 - class #10 - activities.ipynb","provenance":[],"collapsed_sections":["8d6J9xZiFtU2","II3iXc9lS-Xb"]},"kernelspec":{"name":"python3","display_name":"Python 3"}},"cells":[{"cell_type":"markdown","metadata":{"id":"BRAp37uklN9X"},"source":["# Class \\#10 activities"]},{"cell_type":"markdown","metadata":{"id":"75Fru6pykhxR"},"source":["# Practice with `pandas`: Ballard Locks salmon counts"]},{"cell_type":"code","metadata":{"id":"XANgNomokhQp","colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"status":"ok","timestamp":1604626782369,"user_tz":480,"elapsed":412,"user":{"displayName":"Ethan C Campbell","photoUrl":"https://lh3.googleusercontent.com/a-/AOh14GjCBYTiuomqOsCakND1k_5wj0kYvFY53Jt7kunt=s64","userId":"11255944928409084259"}},"outputId":"ca4d09ea-e99e-4779-9eab-8baf33447421"},"source":["# Import NumPy, Pandas, Matplotlib, and datetime at the top of your code\n","import numpy as np\n","import pandas as pd\n","import matplotlib.pyplot as plt\n","from datetime import datetime, timedelta\n","\n","# Give Colab access to Google Drive\n","from google.colab import drive\n","drive.mount('/content/drive')"],"execution_count":null,"outputs":[{"output_type":"stream","text":["Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount(\"/content/drive\", force_remount=True).\n"],"name":"stdout"}]},{"cell_type":"code","metadata":{"id":"c258zxyZlOuk"},"source":["# Filepath for Ballard Locks salmon count data\n","\n","# Note: you may need to change this to match your own filepath,\n","# which you can get by opening the left sidebar (folder icon),\n","# navigating to the file, clicking the \"...\" on the file, and\n","# selecting \"Copy path\"\n","filepath = '/content/drive/My Drive/OCEAN 215 - Autumn \\'20/OCEAN 215 - Autumn \\'20 - Course documents/Zoom class slides and notebooks/2020-11-05 - class #10 - 
data/ballard_salmon_counts.csv'"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"eQCX7qkNSpUB"},"source":["## **Breakout rooms, round 1**"]},{"cell_type":"markdown","metadata":{"id":"arXwiCsM6wL1"},"source":["0. Assign roles:\n","> * **Choose one person to write code and share their screen.**\n","> * **Choose a second person to take notes on the answers to report back to the class.**\n","1. Load the salmon data CSV file into Pandas.\n","> * When you do this, specify that the 0th column (the dates) should be the index.\n",">\n","> * Also specify that Pandas should parse the index as dates (datetimes).\n",">\n","> * Consult the documentation for [`pd.read_csv()`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html) to find the arguments to specify these two things.\n","2. Display the data.\n","3. Use `.describe()` to view the summary statistics.\n","4. Answer the following questions with your group:\n","\n","* How many salmon species are counted?\n","* When does this data start and end?\n","* What are the average daily counts for each species?\n","* What are the highest daily counts for each species?"]},{"cell_type":"code","metadata":{"id":"I93RpkPttjHH","colab":{"base_uri":"https://localhost:8080/","height":708},"executionInfo":{"status":"ok","timestamp":1604626785489,"user_tz":480,"elapsed":369,"user":{"displayName":"Ethan C Campbell","photoUrl":"https://lh3.googleusercontent.com/a-/AOh14GjCBYTiuomqOsCakND1k_5wj0kYvFY53Jt7kunt=s64","userId":"11255944928409084259"}},"outputId":"601bfea5-e38c-408e-ccee-95d31ebe780b"},"source":["# Load the salmon data file from Google Drive as a Pandas DataFrame\n","salmon_data = pd.read_csv(filepath,index_col=0,parse_dates=True)\n","\n","# View data and stats\n","display(salmon_data)\n","salmon_data.describe()\n","\n","# Answers:\n","# a. 3 species\n","# b. June 2013 to October 2020\n","# c. 224, 247, 1316 for Chinook, Coho, Sockeye\n","# d. 
916, 1026, 12936"],"execution_count":null,"outputs":[{"output_type":"display_data","data":{"text/html":["
<div>\n",
"<table>\n",
"  <thead>\n",
"    <tr><th></th><th>Chinook</th><th>Coho</th><th>Sockeye</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>2013-06-12</th><td>NaN</td><td>NaN</td><td>2778.0</td></tr>\n",
"    <tr><th>2013-06-13</th><td>NaN</td><td>NaN</td><td>2424.0</td></tr>\n",
"    <tr><th>2013-06-14</th><td>NaN</td><td>NaN</td><td>1285.0</td></tr>\n",
"    <tr><th>2013-06-15</th><td>NaN</td><td>NaN</td><td>2430.0</td></tr>\n",
"    <tr><th>2013-06-16</th><td>NaN</td><td>NaN</td><td>3081.0</td></tr>\n",
"    <tr><th>...</th><td>...</td><td>...</td><td>...</td></tr>\n",
"    <tr><th>2020-09-28</th><td>NaN</td><td>219.0</td><td>NaN</td></tr>\n",
"    <tr><th>2020-09-29</th><td>NaN</td><td>81.0</td><td>NaN</td></tr>\n",
"    <tr><th>2020-09-30</th><td>NaN</td><td>13.0</td><td>NaN</td></tr>\n",
"    <tr><th>2020-10-01</th><td>NaN</td><td>44.0</td><td>NaN</td></tr>\n",
"    <tr><th>2020-10-02</th><td>NaN</td><td>38.0</td><td>NaN</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"<p>419 rows × 3 columns</p>\n",
"</div>
"],"text/plain":[" Chinook Coho Sockeye\n","2013-06-12 NaN NaN 2778.0\n","2013-06-13 NaN NaN 2424.0\n","2013-06-14 NaN NaN 1285.0\n","2013-06-15 NaN NaN 2430.0\n","2013-06-16 NaN NaN 3081.0\n","... ... ... ...\n","2020-09-28 NaN 219.0 NaN\n","2020-09-29 NaN 81.0 NaN\n","2020-09-30 NaN 13.0 NaN\n","2020-10-01 NaN 44.0 NaN\n","2020-10-02 NaN 38.0 NaN\n","\n","[419 rows x 3 columns]"]},"metadata":{"tags":[]}},{"output_type":"execute_result","data":{"text/html":["
<div>\n",
"<table>\n",
"  <thead>\n",
"    <tr><th></th><th>Chinook</th><th>Coho</th><th>Sockeye</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>count</th><td>55.000000</td><td>32.000000</td><td>370.000000</td></tr>\n",
"    <tr><th>mean</th><td>224.290909</td><td>246.718750</td><td>1315.670270</td></tr>\n",
"    <tr><th>std</th><td>218.044395</td><td>238.581052</td><td>1746.660107</td></tr>\n",
"    <tr><th>min</th><td>0.000000</td><td>13.000000</td><td>0.000000</td></tr>\n",
"    <tr><th>25%</th><td>59.000000</td><td>75.750000</td><td>159.250000</td></tr>\n",
"    <tr><th>50%</th><td>164.000000</td><td>154.500000</td><td>699.000000</td></tr>\n",
"    <tr><th>75%</th><td>305.000000</td><td>378.500000</td><td>1745.000000</td></tr>\n",
"    <tr><th>max</th><td>916.000000</td><td>1026.000000</td><td>12936.000000</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
"],"text/plain":[" Chinook Coho Sockeye\n","count 55.000000 32.000000 370.000000\n","mean 224.290909 246.718750 1315.670270\n","std 218.044395 238.581052 1746.660107\n","min 0.000000 13.000000 0.000000\n","25% 59.000000 75.750000 159.250000\n","50% 164.000000 154.500000 699.000000\n","75% 305.000000 378.500000 1745.000000\n","max 916.000000 1026.000000 12936.000000"]},"metadata":{"tags":[]},"execution_count":3}]},{"cell_type":"markdown","metadata":{"id":"h3XnSPJfSvB5"},"source":["## **Breakout rooms, round 2**"]},{"cell_type":"markdown","metadata":{"id":"M_QJKrnm8uaH"},"source":["**Plot the data!**\n","1. Create a single blank figure. Set the `figsize` argument so it will take up the entire width of the page.\n","2. Then use `plt.plot()` or `ax.plot()` to make line plots of each of the three species' counts over time. In other words, the x-values should be datetimes from the index and the y-values should be daily salmon counts.\n","3. Choose the following colors for each line:\n","> * Chinook: 'forestgreen'\n",">\n","> * Coho: 'darkcyan'\n",">\n","> * Sockeye: 'salmon'\n",">\n","\n","\n","4. Label your plot axes and add a title.\n","5. 
Add a grid to your plot."]},{"cell_type":"code","metadata":{"id":"Ut1zDnxE9yNF","colab":{"base_uri":"https://localhost:8080/","height":284},"executionInfo":{"status":"ok","timestamp":1604626822531,"user_tz":480,"elapsed":499,"user":{"displayName":"Ethan C Campbell","photoUrl":"https://lh3.googleusercontent.com/a-/AOh14GjCBYTiuomqOsCakND1k_5wj0kYvFY53Jt7kunt=s64","userId":"11255944928409084259"}},"outputId":"ae73cdd5-e8e0-4c61-fc50-ac10318bbfda"},"source":["# Plot the data here:\n","\n","plt.figure(figsize=(14,5))\n","# Alternative way of setting up a figure; then you'll call ax.plot()\n","# fig, ax = plt.subplots(figsize=(30,15))\n","\n","# You can save each to a variable...\n","# x_chinook = salmon_data['Chinook'].index\n","# y_chinook = salmon_data['Chinook']\n","# p_chinook = plt.plot(x_chinook,y_chinook,c='forestgreen')\n","\n","# ... or you can plot each in a single line of code\n","plt.plot(salmon_data['Chinook'].index,salmon_data['Chinook']
,c='forestgreen')\n","plt.plot(salmon_data['Coho'].index,salmon_data['Coho']
,c='darkcyan')\n","plt.plot(salmon_data['Sockeye'].index,salmon_data['Sockeye']
,c='salmon')\n","\n","plt.xlabel('Years')\n","plt.ylabel('Daily salmon count')\n","plt.title('Salmon counts at Ballard Locks')\n","plt.grid()"],"execution_count":null,"outputs":[{"output_type":"display_data","data":{"image/png":"\n","text/plain":["
"]},"metadata":{"tags":[],"needs_background":"light"}}]},{"cell_type":"markdown","metadata":{"id":"AQnarvUKSxA3"},"source":["## **Breakout rooms, round 3**"]},{"cell_type":"markdown","metadata":{"id":"2qGdvYufAY__"},"source":["**Index into the data to discover information**\n","1. Use indexing to find out how many coho salmon passed through Ballard Locks on September 30, 2020, the day that Autumn Quarter started at UW. Hint: you'll need either `.iloc[]` or `.loc[]` to do this.\n","2. Use indexing with slicing (`:`) to get the sockeye salmon counts for all dates in the years specified below. You may wish to `print()` the results to check that they're correct.\n","> * First, do this for 2020. Save the result as a new variable, `sockeye_2020`.\n",">\n","> * Then, do the same for 2013. Save this as `sockeye_2013`.\n",">\n","3. Apply NumPy functions to `sockeye_2020` and `sockeye_2013` to find the following:\n","> * The highest daily sockeye count in each year (2013 and 2020).\n",">\n","> * The total number of sockeye that passed through Ballard Locks in each year.\n",">\n","4. 
Think about how you'd do Steps 2-3 in a single line of code — in other words, without saving the sliced Pandas objects as new variables.\n"]},{"cell_type":"code","metadata":{"id":"U1rdkWr2AYTy","colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"status":"ok","timestamp":1604626910424,"user_tz":480,"elapsed":359,"user":{"displayName":"Ethan C Campbell","photoUrl":"https://lh3.googleusercontent.com/a-/AOh14GjCBYTiuomqOsCakND1k_5wj0kYvFY53Jt7kunt=s64","userId":"11255944928409084259"}},"outputId":"fa834797-5ac3-44e4-eeef-f775497a317b"},"source":["# Part 1\n","salmon_data.loc[datetime(2020,9,30)]['Coho']\n","print('Number of coho on 9/30/20:',salmon_data.loc[datetime(2020,9,30)]['Coho'])\n","\n","# Alternative methods:\n","# print(salmon_data['Coho'].loc[datetime(2020,9,30)])\n","# print(salmon_data.loc['2020-09-30']['Coho'])\n","\n","# Part 2\n","sockeye_2020 = salmon_data['Sockeye'].loc[datetime(2020,1,1):datetime(2020,12,31)]\n","sockeye_2013 = salmon_data['Sockeye'].loc[datetime(2013,1,1):datetime(2013,12,31)]\n","\n","# Alternative methods:\n","# sockeye_2020 = salmon_data['Sockeye'].loc[datetime(2020,1,1):]\n","# sockeye_2013 = salmon_data['Sockeye'].loc['2013-1-1':'2013-12-31']\n","# sockeye_2013 = salmon_data['Sockeye'].loc['2013']\n","\n","# Part 3\n","print('Highest sockeye count in 2013:',sockeye_2013.max())\n","print('Highest sockeye count in 2020:',sockeye_2020.max())\n","print('Total sockeye in 2013:',sockeye_2013.sum())\n","print('Total sockeye in 2020:',sockeye_2020.sum())\n","\n","# Part 4\n","# Example of syntax:\n","salmon_data['Sockeye'].loc[datetime(2020,1,1):datetime(2020,12,31)].max()"],"execution_count":null,"outputs":[{"output_type":"stream","text":["Number of coho on 9/30/20: 13.0\n","Highest sockeye count in 2013: 12936.0\n","Highest sockeye count in 2020: 1961.0\n","Total sockeye in 2013: 178422.0\n","Total sockeye in 2020: 
22954.0\n"],"name":"stdout"},{"output_type":"execute_result","data":{"text/plain":["1961.0"]},"metadata":{"tags":[]},"execution_count":6}]},{"cell_type":"markdown","metadata":{"id":"8d6J9xZiFtU2"},"source":["# Practice with `xarray`: World Ocean Atlas global ocean temperatures"]},{"cell_type":"code","metadata":{"id":"XR63HVK2Fsu1","colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"status":"ok","timestamp":1604619659481,"user_tz":480,"elapsed":5307,"user":{"displayName":"Ethan C Campbell","photoUrl":"https://lh3.googleusercontent.com/a-/AOh14GjCBYTiuomqOsCakND1k_5wj0kYvFY53Jt7kunt=s64","userId":"11255944928409084259"}},"outputId":"3828557c-11b1-4067-c1d7-b7088b54eee2"},"source":["# Import xarray and download netCDF4 library\n","import xarray as xr\n","!pip install netcdf4 # You can comment this out once it has run\n","\n","# Filepath for World Ocean Atlas 2018 (WOA18) temperature netCDF file\n","# Note: you may need to change this to match your own filepath\n","filepath = '/content/drive/My Drive/OCEAN 215 - Autumn \\'20/OCEAN 215 - Autumn \\'20 - Course documents/Zoom class slides and notebooks/2020-11-05 - class #10 - data/woa18_temp.nc'"],"execution_count":null,"outputs":[{"output_type":"stream","text":["Collecting netcdf4\n","\u001b[?25l Downloading https://files.pythonhosted.org/packages/09/39/3687b2ba762a709cd97e48dfaf3ae36a78ae603ec3d1487f767ad58a7b2e/netCDF4-1.5.4-cp36-cp36m-manylinux1_x86_64.whl (4.3MB)\n","\u001b[K |████████████████████████████████| 4.3MB 4.5MB/s \n","\u001b[?25hCollecting cftime\n","\u001b[?25l Downloading https://files.pythonhosted.org/packages/81/f4/31cb9b65f462ea960bd334c5466313cb7b8af792f272546b68b7868fccd4/cftime-1.2.1-cp36-cp36m-manylinux1_x86_64.whl (287kB)\n","\u001b[K |████████████████████████████████| 296kB 46.3MB/s \n","\u001b[?25hRequirement already satisfied: numpy>=1.9 in /usr/local/lib/python3.6/dist-packages (from netcdf4) (1.18.5)\n","Installing collected packages: cftime, netcdf4\n","Successfully installed 
cftime-1.2.1 netcdf4-1.5.4\n"],"name":"stdout"}]},{"cell_type":"markdown","metadata":{"id":"2RgrRzzmS2lF"},"source":["## **Breakout rooms, round 4**"]},{"cell_type":"markdown","metadata":{"id":"mgowxnn4Jr0-"},"source":["0. Assign new roles:\n","> * **Choose a different person to write code and share their screen.**\n","> * **Choose a different person to take notes on the answers to report back to the class.**\n","1. Load the WOA18 netCDF file into xarray using `xr.open_dataset()`.\n","2. Display the data.\n","3. Using just the interactive display, answer the following questions with your group:\n","\n","* How many data variables are there?\n","* The variable abbreviations aren't very informative. Using the attributes button (page icon), can you tell what the variables represent?\n","* What is the time range of the data?\n","* What is the latitude and longitude resolution (spacing) of the data? Note that we call this the \"grid spacing\" or \"resolution\" of the data.\n","* What is the deepest depth level in the data?\n","* Take a peek at the 46 attributes. What is one thing you can learn from them?"]},{"cell_type":"code","metadata":{"id":"7r4RNUVAJr0_"},"source":["# Load the data file from Google Drive as an xarray Dataset\n","\n","\n","# View data and stats\n","\n"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"QP5nPdjQS4fV"},"source":["## **Breakout rooms, round 5**"]},{"cell_type":"markdown","metadata":{"id":"iF62lvBHNQn2"},"source":["![Image](https://www.mapsofworld.com/images/map-of-world-oceans.jpg)\n","\n","1. Use the ocean map to find the longitude (in units of °E) and latitude (in units of °N) of your favorite part of the global oceans.\n","2. Use indexing to find out the most recent ocean surface temperature (in 2011) at that location. Hint: you'll need either `.isel()` or `.sel()` to do this.\n","3. 
Convert this result from `xarray` format to a single float number."]},{"cell_type":"code","metadata":{"id":"vDQK3HYuJpOe"},"source":["# Write code here:\n","\n"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"46ZS5QgBS7wv"},"source":["## **Breakout rooms, round 6**"]},{"cell_type":"markdown","metadata":{"id":"muw9gRBaO4hZ"},"source":["1. Use indexing to get a time series of ocean surface temperature at the location you chose earlier. Save this as a new variable, `time_series`. (In other words, select in latitude, longitude, and depth, leaving a single dimension: time.)\n","2. Use indexing to get a depth profile of ocean temperature in 2011 at the location you chose earlier. Save this as a new variable, `depth_profile`. (In other words, select in latitude, longitude, and time, leaving a single dimension: depth.)\n","3. Take a peek into these new variables using `display()`. How would you convert these from `xarray` format to 1-D NumPy arrays?"]},{"cell_type":"code","metadata":{"id":"ZjDw5qBLO3i5"},"source":["# Write code here:\n","\n"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"II3iXc9lS-Xb"},"source":["## **Breakout rooms, round 7**"]},{"cell_type":"markdown","metadata":{"id":"T-9fo04ZRGHj"},"source":["1. Apply a NumPy function to `time_series` to calculate the average temperature at your chosen location over time. Get the answer as a single float number, not an `xarray` object.\n","\n","2. Create a new blank Matplotlib figure with two subplots, side-by-side. Use `ax.plot()` to make a line plot on each subplot:\n","> * On the left subplot, plot the time series (time vs. temperature).\n",">\n","> * On the right subplot, plot the depth profile (temperature vs. depth). 
Reverse the depth axis using `ax.invert_yaxis()`.\n",">\n","> * Label your axes and add a grid.\n","\n"]},{"cell_type":"code","metadata":{"id":"TyUljlReP5U3"},"source":["# Part 1:\n","\n","\n","# Part 2:\n","\n"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"zsWD2olqQceF"},"source":[""],"execution_count":null,"outputs":[]}]}