Quickstart
Authenticate
To use the ODP SDK, you need to authenticate with the API key you were given. Pass it as the api_key argument when instantiating ODPClient:
from odp_sdk import ODPClient
client = ODPClient(api_key="<my-api-key>")
You can also set the COGNITE_API_KEY environment variable:
$ export COGNITE_API_KEY=<my-api-key>
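To check which key will be picked up before creating a client, the lookup can be sketched roughly as follows. The helper name resolve_api_key is hypothetical and not part of the SDK, and preferring an explicit argument over the environment variable is an assumption about precedence rather than documented behaviour.

```python
import os


def resolve_api_key(api_key=None, env_var="COGNITE_API_KEY"):
    """Hypothetical helper: return an explicit key, else the environment value."""
    if api_key:
        return api_key
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"No API key given and {env_var} is not set")
    return key
```

The returned string could then be passed on as ODPClient(api_key=...).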
Download Ocean Data
Once you have instantiated the ODPClient, ocean data can be downloaded with a single call. The data is returned as a Pandas DataFrame:
df = client.casts(longitude=[-25, 35], latitude=[50, 80], timespan=["2018-06-01", "2018-06-30"])
You can also specify which parameters to download:
df = client.casts(
    longitude=[-25, 35],
    latitude=[50, 80],
    timespan=["2018-06-01", "2018-06-30"],
    parameters=["date", "lon", "lat", "z", "Temperature", "Salinity"],
)
Sometimes you need to filter the casts before downloading the data. To do so, first list the available casts:
casts = client.get_available_casts(
    longitude=[-25, 35],
    latitude=[50, 80],
    timespan=["2018-06-01", "2018-06-30"],
    metadata_parameters=["extId", "date", "time", "lat", "lon", "country", "Platform", "dataset_code"],
)
Then apply the desired filters before downloading the data:
casts_norway = casts[casts.country == "NORWAY"]
df = client.download_data_from_casts(
    casts_norway.extId.tolist(),
    parameters=["date", "lat", "lon", "z", "Temperature", "Salinity"],
)
You can also download the cast metadata:
df = client.get_metadata(casts_norway.extId.tolist())
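The filtering step is ordinary pandas boolean indexing, so several conditions can be combined. The sketch below uses a small hand-made DataFrame as a stand-in for the result of get_available_casts; the column names come from the metadata_parameters listed above, while the rows are invented for illustration.

```python
import pandas as pd

# Stand-in for the result of client.get_available_casts(...); the rows are made up.
casts = pd.DataFrame({
    "extId": ["cast_a", "cast_b", "cast_c"],
    "country": ["NORWAY", "ICELAND", "NORWAY"],
    "dataset_code": ["ctd", "xbt", "xbt"],
})

# Combine conditions with & and wrap each one in parentheses.
mask = (casts["country"] == "NORWAY") & (casts["dataset_code"] == "ctd")
selected_ids = casts.loc[mask, "extId"].tolist()
print(selected_ids)  # ['cast_a']
```

The resulting list of extId values has the same form as the one passed to download_data_from_casts above.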
Utilities
Advanced Helper Functions
Interpolate Casts to Z
UtilityFunctions.interpolate_casts_to_z(variable, z_int, max_z_extrapolation=3, max_z_copy_single_value=1, kind='linear')
Interpolate profiles in dataframe to prescribed depth level.
Takes a complete DataFrame from ODP, selects the rows belonging to each unique cast, and interpolates each cast separately.
Parameters: - df – Pandas DataFrame from ODP
- variable – Name of the variable to interpolate, as it appears in the DataFrame (Temperature, Oxygen, etc.)
- z_int – List of the desired depth levels to return, e.g. [0, 10, 20]
- max_z_extrapolation – The maximum distance to allow extrapolating. Values beyond this distance are set to NaN.
- max_z_copy_single_value – If only one row is present in the cast, this is the maximum distance between the point and the interpolation level for copying the value
- kind – Type of interpolation, as in interpolate_profile
Returns: DataFrame of parameter values at prescribed depth levels.
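The per-cast approach described above can be sketched with a pandas groupby and numpy's linear interpolation. This is an illustration only, not the SDK's implementation: the function name interpolate_casts_sketch is hypothetical, it assumes the columns extId and z exist, and it does no extrapolation or single-value copying.

```python
import numpy as np
import pandas as pd


def interpolate_casts_sketch(df, variable, z_int):
    """Illustrative only: linearly interpolate each cast (grouped by extId) to z_int."""
    rows = []
    for ext_id, cast in df.groupby("extId"):
        cast = cast.dropna(subset=["z", variable]).sort_values("z")
        if cast.empty:
            continue  # nothing to interpolate for this cast
        # np.interp needs increasing depths; left/right=nan disables extrapolation.
        values = np.interp(
            z_int,
            cast["z"].to_numpy(dtype=float),
            cast[variable].to_numpy(dtype=float),
            left=np.nan,
            right=np.nan,
        )
        for z, v in zip(z_int, values):
            rows.append({"extId": ext_id, "z": z, variable: v})
    return pd.DataFrame(rows)


# Two made-up casts with temperature measured at a few depths.
df = pd.DataFrame({
    "extId": ["a", "a", "a", "b", "b"],
    "z": [0.0, 10.0, 20.0, 0.0, 20.0],
    "Temperature": [8.0, 7.0, 5.0, 10.0, 6.0],
})
result = interpolate_casts_sketch(df, "Temperature", [0, 5, 15, 30])
```

Each cast contributes one row per requested depth level, with NaN where the level lies outside the measured depth range.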
Interpolate Casts to grid
UtilityFunctions.interpolate_to_grid(values, int_points, interp_type='linear', minimum_neighbors=3, gamma=0.25, kappa_star=5.052, search_radius=0.1, rbf_func='linear', rbf_smooth=0.001, rescale=True)
Interpolate unstructured N-D data to an N-D grid.
Powered by the metpy library
Parameters: - points – (N, D) array of points, typically latitude and longitude
- values – (N, 1) array of corresponding values, e.g. Temperature, Oxygen, etc.
- int_points – List of arrays defining the grid, e.g. for a longitude/latitude grid: (np.linspace(-25, 35, 60*10+1), np.linspace(50, 80, 30*10+1))
- interp_type – What type of interpolation to use. Available options include: 1) “linear”, “nearest”, “cubic”, or “rbf” from scipy.interpolate. 2) “natural_neighbor”, “barnes”, or “cressman” from metpy.interpolate. Default “linear”.
- minimum_neighbors – Minimum number of neighbors needed to perform barnes or cressman interpolation for a point. Default is 3.
- gamma – Adjustable smoothing parameter for the barnes interpolation. Default 0.25.
- kappa_star – Response parameter for barnes interpolation, specified nondimensionally in terms of the Nyquist. Default 5.052.
- search_radius – A search radius to use for the barnes and cressman interpolation schemes. If search_radius is not specified, it will default to the average spacing of observations.
- rbf_func – Specifies which function to use for Rbf interpolation. Options include: ‘multiquadric’, ‘inverse’, ‘gaussian’, ‘linear’, ‘cubic’, ‘quintic’, and ‘thin_plate’. Default ‘linear’. See scipy.interpolate.Rbf for more information.
- rbf_smooth – Smoothing value applied to rbf interpolation. Higher values result in more smoothing.
- rescale –
Returns: Array representing the interpolated values for each input point
Return type: values_interpolated
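To make the shapes of points, values and int_points concrete, here is a minimal nearest-neighbour gridding sketch written in plain numpy. It is not the SDK's implementation (which, per the description above, relies on scipy and metpy); the observations are made up, and using Euclidean distance in degrees is a simplifying assumption.

```python
import numpy as np

# Scattered observations: (N, 2) points as (lon, lat) and N values (made-up numbers).
points = np.array([[-20.0, 55.0], [0.0, 60.0], [30.0, 75.0]])
values = np.array([8.0, 6.5, 1.0])

# Grid axes in the form described for int_points (much coarser than 0.1 degrees).
int_points = (np.linspace(-25, 35, 7), np.linspace(50, 80, 4))
grid_lon, grid_lat = np.meshgrid(*int_points)  # each has shape (4, 7)

# Nearest-neighbour lookup: distance from every grid node to every observation.
nodes = np.column_stack([grid_lon.ravel(), grid_lat.ravel()])           # (28, 2)
dist = np.linalg.norm(nodes[:, None, :] - points[None, :, :], axis=2)   # (28, 3)
grid = values[dist.argmin(axis=1)].reshape(grid_lon.shape)              # (4, 7)
```

The arrays grid_lon, grid_lat and grid have matching (M, N) shapes, which is the form plot_grid expects for int_lon, int_lat and g.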
Interpolate profile
UtilityFunctions.interpolate_profile(z_int, max_z_extrapolation=10, max_z_copy_single_value=1, kind='linear')
Interpolate profile zv (depth, parameter) to a user defined depth.
Parameters: - zv – 2-D array of depth and a parameter (temperature, oxygen, …)
- z_int – 1-D array of depth levels to interpolate to
- max_z_extrapolation – Maximum distance to extrapolate outside profile. Use 0 for no extrapolation.
- max_z_copy_single_value – Maximum distance for copying the value of a single value profile.
- kind – Specifies the kind of interpolation as a string: ‘linear’, ‘nearest’, ‘zero’, ‘slinear’, ‘quadratic’, ‘cubic’, ‘previous’, or ‘next’
Returns: Array of interpolated values
Example:
from numpy import array

# Column 0 is depth, column 1 is the parameter value.
zv = array([
    [0.0, 21.64599991],
    [9.93530941, 21.54500008],
    [19.87013626, 20.96299934],
    [29.80448341, 20.40699959],
    [49.67173004, 19.36800003],
    [74.50308228, 18.8010006],
    [99.3314209, 18.27400017],
])
z_int = [0, 25, 50, 75, 100, 125]
v_int = interpolate_profile(zv, z_int)
print(v_int)
# >>> array([21.64599991, 20.67589412, 19.36050431, 18.79045314, 18.25980907, nan])
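The role of max_z_extrapolation can be illustrated with a small numpy sketch. The function name interpolate_profile_sketch is hypothetical and this is not the SDK's code: it assumes linear interpolation, at least two samples sorted by increasing depth, and it only extrapolates below the deepest sample.

```python
import numpy as np


def interpolate_profile_sketch(zv, z_int, max_z_extrapolation=10):
    """Illustrative only: linear interpolation with bounded linear extrapolation."""
    zv = np.asarray(zv, dtype=float)
    z, v = zv[:, 0], zv[:, 1]
    z_int = np.asarray(z_int, dtype=float)
    # Inside the profile: plain linear interpolation, NaN outside.
    out = np.interp(z_int, z, v, left=np.nan, right=np.nan)
    # Below the deepest sample: extend the last segment, up to the allowed distance.
    below = (z_int > z[-1]) & (z_int - z[-1] <= max_z_extrapolation)
    slope = (v[-1] - v[-2]) / (z[-1] - z[-2])
    out[below] = v[-1] + slope * (z_int[below] - z[-1])
    return out


# Made-up profile: depth in column 0, value in column 1.
profile = [[0.0, 10.0], [10.0, 8.0], [20.0, 7.0]]
levels = [0, 5, 20, 25, 40]
interpolated = interpolate_profile_sketch(profile, levels)
```

Here the level at depth 25 lies 5 units below the deepest sample and is extrapolated, while depth 40 lies 20 units below and is therefore returned as NaN.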
Plot Casts
UtilityFunctions.plot_casts(df, longitude, latitude, cmap='viridis', vrange=[None, None])
Plot casts.
Parameters: - variable – Name of the oceanographic variable as a string, e.g. ‘Temperature’
- df – Pandas DataFrame from ODP with lat, lon, and variable columns
- longitude – List of min and max longitude, e.g. [-10, 35]
- latitude – List of min and max latitude, e.g. [50, 80]
- cmap – Colormap specification
- vrange – Range of variable values to be shown, e.g. [0, 20]
Returns: Map with variable measurements plotted as points
Plot Grid
UtilityFunctions.plot_grid(latitude, int_lon, int_lat, g, cmap='viridis', vrange=[None, None], crs_latlon=<sphinx.ext.autodoc.importer._MockObject object>, variable_name='')
Plot grid.
Parameters: - int_lon – (M, N) array of longitude grid
- int_lat – (M, N) array of latitude grid
- g – (M, N) grid to be shown
- cmap – Colormap
- vrange – Range of grid values to be shown, e.g. [0, 35]
- crs_latlon –
- variable_name –
Returns: Map with interpolated values
Get Units
UtilityFunctions.get_units()
Get dict describing the units of the different columns.
Returns: Dict of units
Plot percentage of nulls for each variable in variable list
UtilityFunctions.plot_nulls(var_list=None)
Plot percentage of nulls for each variable in variable list.
Takes a DataFrame from ODP and a list of variables and plots the percentage of missing values.
Parameters: - df – Pandas DataFrame from ODP
- var_list – List of variables (column names) the user is interested in. Defaults to all columns.
Returns: Plot of percentage of values missing at each measurement (lat, lon, depth)
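The quantity being plotted can be computed directly in pandas. The sketch below is an assumption about the underlying calculation, not the SDK's code, and uses made-up values; it gives one overall percentage per variable rather than a breakdown per measurement position.

```python
import numpy as np
import pandas as pd

# Made-up measurements with some missing values.
df = pd.DataFrame({
    "Temperature": [8.1, np.nan, 7.4, 7.0],
    "Salinity": [35.0, 34.9, np.nan, np.nan],
    "Oxygen": [np.nan, np.nan, np.nan, 6.2],
})
var_list = ["Temperature", "Salinity", "Oxygen"]

# isna() marks missing cells; the column mean of that mask is the missing fraction.
null_pct = df[var_list].isna().mean() * 100
```

Calling null_pct.plot.bar() would turn the resulting Series into a bar chart, given a working matplotlib installation.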
Plot metadata-statistics
UtilityFunctions.plot_meta_stats(variable)
Get bar graph of percentage of data belonging to a specific variable subset in the metadata.
Parameters: - df – Pandas DataFrame with extId-column
- variable – Variable in subset of metadata
Returns: Bar graph with percentage of data belonging to variable subset (i.e. data belonging to different modes of data collection (‘dataset’))
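The percentages behind such a bar graph can be sketched with value_counts. This is an illustration with invented rows, not the SDK's implementation; the dataset_code column is assumed to be present, as in the metadata parameters used in the quickstart.

```python
import pandas as pd

# Made-up cast metadata, one row per cast.
meta = pd.DataFrame({
    "extId": ["cast_a", "cast_b", "cast_c", "cast_d"],
    "dataset_code": ["ctd", "ctd", "xbt", "ctd"],
})

# normalize=True returns fractions of the total, which are scaled to percent.
share = meta["dataset_code"].value_counts(normalize=True) * 100
```

The Series is indexed by dataset code, so share["ctd"] gives the percentage of casts collected with that mode.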
Plot distribution of values
UtilityFunctions.plot_distributions(var_list)
Plot the distributions of the values for a list of variables.
Parameters: - df – Pandas DataFrame from ODP containing oceanographic variables and values
- var_list – list of variables (column names) that should be plotted
Returns: Plots of distributions of values for each variable in variable list
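A distribution plot is essentially a histogram of each variable. As a rough sketch of the numbers behind one such plot, with made-up temperatures and a bin layout chosen only for illustration:

```python
import numpy as np

# Made-up temperature measurements in degrees Celsius.
temperature = np.array([1.5, 2.0, 2.5, 7.0, 7.5, 12.0])

# Three equal-width bins covering 0 to 15: [0, 5), [5, 10), [10, 15].
counts, edges = np.histogram(temperature, bins=3, range=(0, 15))
print(counts.tolist())  # [3, 2, 1]
```

Repeating this for every name in var_list gives one histogram per variable.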
Plot casts belonging to specific dataset
UtilityFunctions.plot_datasets(variable, latitude, longitude)
Plots on a map casts belonging to specific dataset (mode of data collection, i.e. ctd, xbt)
Parameters: - df – Pandas DataFrame
- variable – Variable of choice
- latitude – Bounding box latitude
- longitude – Bounding box longitude
Returns: Map with color coded casts based on dataset_code
Internal Helper Functions
UtilityFunctions.geo_map()
Helper function for mapping.
Parameters: - ax – Matplotlib axis
UtilityFunctions.missing_values(var_list)
Get dataframe of nulls for each variable in variable list.
Takes a DataFrame from ODP and a list of variables and returns a DataFrame of missing values.
Parameters: - df – Pandas DataFrame from ODP
- var_list – List of variables (column names) the user is interested in. Defaults to all columns.
Returns: DataFrame with the percentage of values missing at each measurement (lat, lon, depth)