# Assignment \#4



**Due:** Monday, December 1 at 11:59 pm PT

**Objective:**
This assignment will give you experience loading and manipulating CSV and netCDF data files using Pandas and xarray, and making histograms, 2-D pseudocolor plots, and maps of data using Matplotlib.

**Instructions:**
1. This version of the assignment cannot be edited. To save an editable version, copy this Colab file to your individual class Google Drive folder ("OCEAN 215 - Autumn '20 - {your name}") by right-clicking on this file and selecting "Move to".
2. Open the version you copied.
3. Complete the assignment by writing and executing text and code cells as specified. **To complete this assignment, you do not need any material beyond Lesson #14.** However, you may use material from beyond Lesson #14 if you wish as long as it has been discussed in a lesson or class, and is not prohibited in the question.
4. When you're finished and are ready to submit the assignment, simply save the Colab file ("File" menu –> "Save") before the deadline, close the file, and keep it in your individual class Google Drive folder.
5. If you need more time, please see the section "Late work policy" in the syllabus for details.

**Honor code:** In the space below, you can acknowledge and describe any assistance you've received on this assignment, whether that was from an instructor, classmate (either directly or on Piazza), and/or online resources other than official Python documentation websites like docs.python.org or numpy.org. Alternatively, if you prefer, you may acknowledge assistance at the relevant point(s) in your code using a Python comment (#). You do not have to acknowledge OCEAN 215 class or lesson resources.

*Acknowledge assistance here:*

## Question 1 (5 points)
### **Interpreting errors**

*Useful resources:* Class #11 discussion on error types (note that the graphic in the [Class #11 slides](https://drive.google.com/file/d/1K_8-6bqFOjLEf_lJeuaUNeD1nja9S_J9/view?usp=sharing) has some, but not all, of the answers)

##### Name the type of Python error you would receive if you made the following mistakes. You do not have to write code to test these situations, but you can if you want. Provide your answers using a variable named with the question part, set to a string containing the error name.

Example: `q1_part6 = 'AttributeError'`

1. Printing a variable with a typo in its name.
2. Subtracting an integer from a list.
3. Forgetting to close the parentheses after function arguments.
4. Having a typo in your filepath when loading a file (note: there are two different answers that would be acceptable here).
5. Trying to access a list element that does not exist.
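
If you do decide to test a situation, one option is to catch the exception and print the name of its type, so that the cell keeps running. The following is a minimal sketch that deliberately uses a mistake that is not one of the five listed above; the variable names are only illustrative.

```python
# Trigger a mistake that is NOT one of the five listed above
# (dividing by zero) and report the name of the error Python raises.
try:
    result = 1 / 0
except Exception as err:
    # type(err).__name__ gives the error name as a string
    error_name = type(err).__name__
    print(error_name)
```

Note that this approach only works for errors raised while the code runs; some mistakes stop Python before any line in the cell is executed, so they cannot be caught this way within the same cell.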

In [None]:
# Write your answers here:



## Question 2 (30 points)
### **Ocean glider measurements**

*Useful resources:* Class #8 activities on handling NaN values, Lesson #9 on Pandas and xarray, Lessons #11 and #12 on histograms, Lesson #12 on 2-D plotting, Lesson #14 on trends and interpolation

Glider|Trajectory
---|---
<img src="https://www.whoi.edu/wp-content/uploads/2019/01/slocum_en_42909.jpg" height="250"/>| <img src="https://www.seanoe.org/data/00453/56509/illustration.gif" height="250"/>

Ocean gliders are an autonomous sensor platform designed to operate down to a depth of about 1000 m. Unlike Argo floats, gliders can be "flown" (navigated) by scientists to collect measurements from specific areas of interest. They do not have propellers, but instead move through the water by adjusting their volume (and thus their buoyancy) and using short wings and a fin to change the dive angle and direction. They descend and ascend in a sawtooth pattern, taking depth profiles along the way to user-provided waypoints. Gliders often measure temperature and salinity, but can also carry other sensors. 

You are given a data file, `Oceanglider.csv`, from a glider deployed by [Rutgers Center for Ocean Observing Leadership (RUCOOL)](https://rucool.marine.rutgers.edu/data/underwater-gliders/). Use this data file to complete the following problem.

##### Run the following cell to import required packages and give Colab access to Google Drive:

In [None]:
# Import NumPy, Pandas, Matplotlib, and necessary SciPy packages for Parts 4 and 5
import numpy as np
import pandas as pd
from scipy import stats
from scipy import interpolate
import matplotlib.pyplot as plt

# Give Colab access to Google Drive
from google.colab import drive
drive.mount('/content/drive')

##### 1. Load the ocean glider data using `pd.read_csv()`, making sure that datetimes are parsed correctly. Examine the data using the `display()` and `.describe()` functions, then answer the following questions. (Hard-coding numbers in your print statements is okay here.)

> a. How many valid temperature and salinity measurements are provided in this data set?
>
> b. What are the average temperature and salinity in this data set?
>
> c. What were the start and end dates of this glider deployment?
>
> d. Where was the glider deployed? Use a world map or Google Maps to identify the ocean region or a nearby location.
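
As a reminder of how datetime parsing and `.describe()` behave, here is a small sketch that reads a made-up CSV from a string instead of a file. The column names (`time`, `temperature`, `salinity`) and the values are assumptions for illustration only and may not match the glider file.

```python
import io
import pandas as pd

# Tiny made-up CSV standing in for a real file; column names are hypothetical
csv_text = """time,temperature,salinity
2020-01-01 00:00:00,25.1,36.0
2020-01-01 00:05:00,,36.1
2020-01-02 12:00:00,24.7,
"""

# parse_dates converts the listed column(s) from strings to datetimes
df = pd.read_csv(io.StringIO(csv_text), parse_dates=['time'])
print(df.dtypes)

# In .describe(), the 'count' row counts only non-NaN values in each column
summary = df[['temperature', 'salinity']].describe()
print(summary)

# With parsed datetimes, the earliest and latest times are easy to find
print(df['time'].min(), df['time'].max())
```

Because missing entries are read as NaN, the `count` row can be smaller than the number of rows in the DataFrame.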

In [None]:
# Write your code here:



##### 2. In this part, you'll plot a histogram showing the distribution of temperatures measured by the glider.

> a. Set up a Matplotlib figure and create a histogram of the temperature measurements. Specify that bins should be at intervals of 1°C within the range 5°C to 30°C.
>
> b. Draw a vertical solid line at the mean value of temperature. Add text next to the line indicating its value (with units), rounded to 2 decimal places. 
>
> c. Draw vertical dashed lines at the mean plus one standard deviation ($+1\sigma$) and the mean minus one standard deviation ($-1\sigma$). Add text next to each line that indicates the positive or negative standard deviation value with units, rounded to 2 decimal places.
>
> d. Properly label your axes, set y-axis limits that match the vertical lines, and select formatting options of your choosing.
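
The sketch below shows one way to build evenly spaced bin edges and annotate a vertical line on a histogram. It uses synthetic random numbers in arbitrary units rather than glider data, and the bin range, label position, and styling are arbitrary choices.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic values standing in for real measurements (not glider data)
rng = np.random.default_rng(0)
values = rng.normal(loc=50.0, scale=10.0, size=500)

# np.arange excludes the stop value, so go one step past the last edge
bin_edges = np.arange(20, 80 + 5, 5)   # edges at 20, 25, ..., 80

fig, ax = plt.subplots()
counts, edges, patches = ax.hist(values, bins=bin_edges)

# Vertical line at the mean, with a text label just to its right
mean_val = values.mean()
ax.axvline(mean_val, color='k', linestyle='-')
ax.text(mean_val + 1, counts.max() * 0.9, f'mean = {mean_val:.2f}')

ax.set_xlabel('Value (arbitrary units)')
ax.set_ylabel('Count')
```

Values that fall outside the first and last bin edges are simply left out of the histogram.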


In [None]:
# Write your code here:



##### 3. Create a scatter plot of the glider's salinity measurements, with depth on the y-axis and latitude on the x-axis. Choose a colormap to represent the salinity values. Format your figure properly by inverting the y-axis and adding axis labels, a colorbar labeled with the parameter and units, a title, and a grid.

Then, based on the figure, answer the following True/False questions using print statements.

Example: `print('Part 3e: False')`

> a. The least salty waters are at the surface. (True/False)
>
> b. The highest salinities occur between 200 and 400 m. (True/False)
>
> c. The salinities clearly decrease across all depths as the glider moved north. (True/False)
>
> d. Some salinity profiles have gaps where bad data (or no data) was collected. (True/False)
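
For reference, the general pattern for a scatter plot colored by a third variable looks roughly like the sketch below. The data are synthetic, and the colormap, marker size, and labels are arbitrary choices rather than required settings.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic data: a horizontal position, a depth, and a value to color by
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 200)
depth = rng.uniform(0, 100, 200)
value = depth * 0.1 + rng.normal(0, 0.5, 200)

fig, ax = plt.subplots()
sc = ax.scatter(x, depth, c=value, cmap='viridis', s=10)
ax.invert_yaxis()                 # put depth = 0 at the top of the plot

# The colorbar is built from the object returned by scatter()
cbar = fig.colorbar(sc, ax=ax)
cbar.set_label('Value (arbitrary units)')

ax.set_xlabel('x (arbitrary units)')
ax.set_ylabel('Depth (m)')
ax.grid(True)
```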


In [None]:
# Write your code here:



##### 4. Each vertical profile collected by the glider has a unique profile number (`profile_id`). **In a single line of code**, perform the following operations on your glider data DataFrame and save the final result as a new variable:

* Group the DataFrame by profile number using [`.groupby()`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.groupby.html).
* Calculate the maximum values of the grouped data using a NumPy function.
* Sort the resulting DataFrame by latitude using [`.sort_values()`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.sort_values.html).
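
The chaining pattern described in the bullets above can be sketched on a toy DataFrame as follows. The column names are hypothetical, and `np.mean` is used purely for illustration instead of the function the question asks for. Depending on your pandas version, passing a NumPy function to `.agg()` may produce a warning, which is an assumption worth checking in your own environment.

```python
import numpy as np
import pandas as pd

# Toy DataFrame with hypothetical column names
df = pd.DataFrame({
    'group_id': [2, 1, 2, 1, 3],
    'x': [5.0, 1.0, 4.0, 3.0, 0.5],
    'y': [10.0, 30.0, 20.0, 40.0, 50.0],
})

# Chain: group -> aggregate each group -> sort by one of the columns
# (np.mean is used here only for illustration)
result = df.groupby('group_id').agg(np.mean).sort_values('x')
print(result)
```

After grouping, the grouping column becomes the index of the result, so the number of rows equals the number of unique groups.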

Note that your result is a DataFrame with its columns representing the maximum lat, lon, depth, salinity, and temperature for each profile. Using this new DataFrame, do the following:

> a. Find how many profiles are in the glider data, and print the answer.
> 
> b. Save the maximum profile latitudes, maximum profile longitudes, and maximum profile temperatures, converted to NumPy arrays. These should be saved as three new variables.
>
> c. Make a simple Matplotlib figure with a black line plot of the maximum profile latitudes vs. maximum profile temperatures. Add axis labels and a title. Think about whether there appears to be a significant trend.
>
> d. Use SciPy's [`linregress()`](https://docs.scipy.org/doc/scipy-1.5.4/reference/generated/scipy.stats.linregress.html) function to calculate a linear regression of maximum profile latitudes vs. maximum profile temperatures.
>
> e. Print answers to the following questions: (1) What is the gradient (slope) of maximum temperatures by latitude? (Round your answer to 1 decimal place and include units.) (2) Is the trend of maximum temperatures with latitude significant at the 95% confidence level? (Find the answer by writing a comparison expression that will give the answer as a boolean – True or False.)
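
As a refresher on reading the output of `linregress()`, here is a short sketch using synthetic data with a known slope. The variable names are illustrative, and 0.05 is the p-value threshold that corresponds to the 95% confidence level.

```python
import numpy as np
from scipy import stats

# Synthetic data with a built-in slope of 2 plus a little noise
rng = np.random.default_rng(42)
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(0, 0.5, 50)

fit = stats.linregress(x, y)

# The slope has units of (y units) per (x unit)
print(f'slope = {fit.slope:.1f} (y units per x unit)')

# A p-value below 0.05 means the trend is significant at the 95% level
is_significant = fit.pvalue < 0.05
print(is_significant)
```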

In [None]:
# One line of code for the first operation:


# Part 4a


# Part 4b


# Part 4c


# Part 4d


# Part 4e



##### 5. In this part, you will plot a 2-D depth section of the temperature data (similar to Part 3, except filled in). However, there's a problem: the temperature measurements are irregularly spaced in latitude and depth, and Matplotlib needs them on a regularly-spaced grid to create the contour plot.

> a. First, we must create that grid. Use [`np.linspace()`](https://numpy.org/doc/stable/reference/generated/numpy.linspace.html) to create regularly spaced 1-D coordinates for latitude (from 17.90°N to 18.15°N) and depth (from 0 m to 500 m), with 40 points in each array.
>
> b. Next, use [`np.meshgrid()`](https://numpy.org/doc/stable/reference/generated/numpy.meshgrid.html) to "mesh" those 1-D coordinate arrays into 2-D latitude and depth grid arrays. Print the shape of one of the grids you've created. Does the shape make sense?
>
> c. Then use SciPy's [`griddata()`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.interpolate.griddata.html) function to interpolate the original, irregular temperature measurements to the regular latitude and depth grids you just created. Specify the argument `method='linear'` to choose linear interpolation. The result will be a NumPy array of gridded temperature data with the same shape as in Part 5b.
>
> d. On a new figure, use [`pcolormesh()`](https://matplotlib.org/3.3.2/api/_as_gen/matplotlib.pyplot.pcolormesh.html) to plot the gridded temperature data; choose a colormap different from the one you used for salinity in Part 3. Then use [`contour()`](https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.pyplot.contour.html) to add 20 black contour lines. Label the contour lines, add a colorbar, label the colorbar, invert the y-axis, label your axes, and set a title.
>
> e. Add upside-down triangle markers, all at a depth of 0 m (the surface), indicating the maximum latitude of each profile (use your result from Part 4b for this).
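
To illustrate how `np.meshgrid()` and `griddata()` fit together, the sketch below interpolates synthetic scattered samples of a simple function onto a small regular grid. The grid sizes and variable names are arbitrary and differ from those requested in this part.

```python
import numpy as np
from scipy import interpolate

# Synthetic scattered samples of a simple function, z = x + 2y
rng = np.random.default_rng(0)
x_obs = rng.uniform(0, 1, 300)
y_obs = rng.uniform(0, 1, 300)
z_obs = x_obs + 2 * y_obs

# Regular 1-D coordinates, then 2-D grids
x_1d = np.linspace(0.2, 0.8, 5)
y_1d = np.linspace(0.2, 0.8, 3)
x_grid, y_grid = np.meshgrid(x_1d, y_1d)
print(x_grid.shape)   # (rows, columns) = (len(y_1d), len(x_1d))

# Interpolate the scattered values onto the regular grid
z_grid = interpolate.griddata((x_obs, y_obs), z_obs,
                              (x_grid, y_grid), method='linear')
print(z_grid.shape)
```

With `method='linear'`, grid points that lie outside the area covered by the original measurements are filled with NaN by default, which shows up as blank regions in a pseudocolor plot.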


In [None]:
# Part 5a


# Part 5b


# Part 5c


# Part 5d


# Part 5e



## Question 3 (15 points)
### **Mapping and bathymetry**

*Useful resources:* Lesson #9 on Pandas and xarray, Lesson #11 on plotting, Lesson #12 on mapping


<img src="https://www.ngdc.noaa.gov/mgg/image/color_etopo1_ice_low.jpg" width="750"/>

Ocean bathymetry influences the physics of ocean circulation as well as where oceanographers can deploy instruments like moorings, floats, and gliders. For these reasons, it is important to have a high-resolution data set for ocean bathymetry. ETOPO1 (pictured above) is a frequently-used 1 arc-minute (1/60°-resolution) global relief model that combines both land topography and ocean bathymetry.

##### Run the required installation and import statements:

In [None]:
# # This code installs the netCDF4 module
# # Run this code once per session, then comment it out
# !pip install netcdf4

# # This code allows Cartopy to work with Google Colab
# # Run this code once per session, then comment it out
# !grep '^deb ' /etc/apt/sources.list | \
#   sed 's/^deb /deb-src /g' | \
#   tee /etc/apt/sources.list.d/deb-src.list
# !apt-get -qq update
# !apt-get -qq build-dep python3-cartopy
# !pip uninstall -y shapely
# !pip install shapely --no-binary shapely
# !pip install cartopy

# Import NumPy, xarray, Matplotlib, Cartopy (and related imports)
import numpy as np
import xarray as xr
import matplotlib.pyplot as plt

import cartopy.crs as ccrs
import cartopy.feature as cfeature
from cartopy.mpl.gridliner import LONGITUDE_FORMATTER, LATITUDE_FORMATTER

# Give Colab access to Google Drive
from google.colab import drive
drive.mount('/content/drive')

##### 1. Visit the [ETOPO1 website](https://www.ngdc.noaa.gov/mgg/global/) and answer the following questions. Provide your answers in print statements labeled with the question part. Example: `print('Part 1d: This is the answer.')`

> a. What is the difference between the "ice surface" and "bedrock" versions of ETOPO1?
>
> b. What part of grid cells do latitude and longitude lines correspond to in the "grid/node-registered" version of ETOPO1? What part do they correspond to in the "cell/pixel-registered" version?
>
> c. Download the ETOPO1 grid-registered bedrock netCDF file (use the "gmt4" version) and upload it to your Google Drive. Make a variable for the filepath and print the variable. Note that the file ends in `.grd.gz`. It is still a netCDF file, but it is in a compressed (`.gz`) format. If you want, you can unzip the file before uploading to Google Drive, or you can keep it compressed — xarray can handle it either way.

In [None]:
# Write your code here:



##### 2. In this part, you'll map the bathymetry around the glider measurements from Question 2.

> a. Open the data using `xr.open_dataset()`. Display the data and identify the variables you'll be working with. Then use selection-by-label and slicing to select the ETOPO1 data surrounding the glider locations from Question 2. Choose latitudes 17.5°N to 18.5°N and longitudes 64.5°W to 65.0°W; be careful about the sign [+/-] of the longitudes.
>
> b. Use Matplotlib and Cartopy to create a map with the PlateCarree projection. Display coastlines and add land features in a dark color.
>
> c. Use `contourf()` to draw 10 filled contour levels of the ETOPO1 bathymetry data on the map. Make sure that your colormap is intuitive (e.g. dark = deep), and set the transparency (`alpha`) to 30%.
>
> d. Use `contour()` to add labeled contour lines of bathymetry at the same 10 levels. Add formatted gridlines, and label only the left and bottom axes of your plot.
>
> e. Use `scatter()` to add points representing the maximum glider profile temperatures at the corresponding maximum profile longitudes and latitudes that you saved in Question 2, Part 4b. Color these points by temperature, using a different colormap than your bathymetry. Add and label a colorbar for these points.
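
The note about longitude signs matters because label-based slicing in xarray follows the sorted order of the coordinate. The sketch below demonstrates this on a small synthetic grid; the variable and coordinate names (`z`, `x`, `y`) and the region selected are assumptions for illustration, so check the names shown when you display the real file.

```python
import numpy as np
import xarray as xr

# Synthetic 1-degree "elevation" grid; the names x, y, z are hypothetical
lon = np.arange(-5.0, 6.0, 1.0)     # -5 ... 5; negative = west
lat = np.arange(-3.0, 4.0, 1.0)     # -3 ... 3; negative = south
z = np.zeros((lat.size, lon.size))
ds = xr.Dataset({'z': (('y', 'x'), z)}, coords={'x': lon, 'y': lat})

# Slice bounds follow the coordinate's sorted order, so for western
# longitudes the more negative value comes first
subset = ds['z'].sel(y=slice(0, 2), x=slice(-4, -2))
print(subset.shape)

# Reversing the bounds on an increasing coordinate selects nothing
empty = ds['z'].sel(x=slice(-2, -4))
print(empty.sizes['x'])
```

If a selection comes back with a dimension of length zero, the order or sign of the slice bounds is a likely cause.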

In [None]:
# Open the data file:


# Part 2a


# Part 2b


# Part 2c


# Part 2d


# Part 2e

