PyWR module

PyWR.calc_classifiability(P, Q)

Implement the Michelangeli et al. (1995) Classifiability Index.

The variable naming here is not Pythonic, but it follows the notation of the 1995 paper, which makes the computation easier to follow. You should not need to call this function directly; it is called in cluster_xr_eof.

Parameters
  • P (array) – A cluster centroid

  • Q (array) – Another cluster centroid

Returns

ci – Classifiability index value.

Return type

float

PyWR.download_data(url, authkey, outfile, force_download=False)

A smart function to download data from IRI Data Library.

If the data can be read in and force_download is False, the function reads from file. Otherwise it downloads from IRIDL and then reads from file.

Parameters
  • url (str) – The url pointing to the data.nc file.

  • authkey (str) – The authentication key for IRI DL (see above).

  • outfile (str) – The data filename.

  • force_download (Bool, optional) – False if it’s OK to read from existing file, True if data must be re-downloaded.

Returns

data – Dataframe of dataset specified in url or file.

Return type

dataFrame
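The read-or-download behaviour described for download_data can be sketched as a small caching pattern. This is only an illustration under assumptions: load_or_fetch and its fetch callable are hypothetical names, the check used here is whether the file exists (PyWR may instead test whether the file can be read), and the IRI Data Library request and authentication are deliberately left out.

```python
import os
from typing import Callable


def load_or_fetch(outfile: str, fetch: Callable[[], bytes],
                  force_download: bool = False) -> bytes:
    """Return the contents of outfile, fetching it first when needed.

    Hypothetical helper: `fetch` stands in for the actual download step.
    """
    if force_download or not os.path.isfile(outfile):
        # No local copy (or a forced refresh): fetch and overwrite the file.
        payload = fetch()
        with open(outfile, "wb") as fh:
            fh.write(payload)
    # Always read back from disk so both paths return the same thing.
    with open(outfile, "rb") as fh:
        return fh.read()
```

With this pattern, a second call with the same outfile does not invoke fetch again unless force_download=True.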

PyWR.get_classifiability_index(centroids: numpy.ndarray) → Tuple[float, int]

Get the classifiability of a set of centroids.

This function will compute the classifiability index for a set of centroids and indicate which is the best one.

Parameters

centroids (array) – Input array of centroids, indexed [simulation, dimension]

Returns

  • classifiability (float) – Classifiability index value.

  • best_part (int) – Index of the best set of centroids.
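As a rough illustration of the idea behind the classifiability index, the sketch below compares every pair of k-means solutions. It is not PyWR's implementation: the function names are hypothetical, the centroids are assumed to be indexed [simulation, cluster, dimension], and Pearson correlation is used as the similarity between two centroids.

```python
import numpy as np


def pair_similarity(P: np.ndarray, Q: np.ndarray) -> float:
    """Similarity of two partitions, each indexed [cluster, dimension]."""
    k = P.shape[0]
    # corrcoef on the stacked rows gives a (2k, 2k) matrix; keep the P-vs-Q block.
    A = np.corrcoef(P, Q)[:k, k:]
    # Best match in Q for every centroid of P, then the worst of those matches.
    return float(A.max(axis=1).min())


def classifiability_sketch(centroids: np.ndarray):
    """Mean pairwise similarity and the partition most similar to the others.

    Assumes centroids is indexed [simulation, cluster, dimension], n_sim >= 2.
    """
    n_sim = centroids.shape[0]
    c = np.full((n_sim, n_sim), np.nan)
    for i in range(n_sim):
        for j in range(n_sim):
            if i != j:  # a partition is not compared with itself
                c[i, j] = pair_similarity(centroids[i], centroids[j])
    index = float(np.nanmean(c))
    best = int(np.nanargmax(np.nanmean(c, axis=1)))
    return index, best
```

An index close to 1 means the repeated clusterings recover nearly the same centroids regardless of initialization.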

PyWR.get_number_eof(X: numpy.ndarray, var_to_explain: float, plot=False) → int

Get the number of EOFs of X that explain a given variance proportion.

Parameters
  • X (ndarray) – Input data array whose EOFs are computed.

  • var_to_explain (float) – Proportion (0 to 1) of variance to be explained.

  • plot (Bool, optional) – Default plot=False will not generate a plot.

Returns

  • n_eof (int) – Number of EOFs retained for the chosen proportion of variance.

Notes

Plot generated by the function is ‘number of EOFs’ versus ‘Cumulative proportion of variance explained’.
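One way to pick the number of EOFs for a target variance proportion is to look at the cumulative variance from a singular value decomposition. The sketch below is a minimal version under assumptions: n_eof_for_variance is a hypothetical name, X is taken to be indexed [time, feature], and PyWR may compute the decomposition differently (for example with a PCA routine).

```python
import numpy as np


def n_eof_for_variance(X: np.ndarray, var_to_explain: float) -> int:
    """Smallest number of EOFs whose cumulative variance fraction reaches the target."""
    # Remove the time mean so singular values measure variance about the mean.
    anomalies = X - X.mean(axis=0)
    s = np.linalg.svd(anomalies, compute_uv=False)
    explained = s**2 / np.sum(s**2)
    cumulative = np.cumsum(explained)
    # searchsorted gives the first index where cumulative >= target.
    n = int(np.searchsorted(cumulative, var_to_explain)) + 1
    # Guard against round-off when the target is 1.0.
    return min(n, cumulative.size)
```

The cumulative array is also what the plot described in the Notes would show against the number of EOFs.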

PyWR.loop_kmeans(X: numpy.ndarray, n_cluster: int, n_sim: int) → Tuple[numpy.ndarray, numpy.ndarray]

Compute weather types using k means clustering.

Should have more information on what this does.

Parameters
  • X (array) – PCA reanalysis data.

  • n_cluster (int) – How many clusters to compute.

  • n_sim (int) – How many times to initialize the clusters (note: computation increases as order n_sim**2).

Returns

  • centroids (array) – Centroids.

  • w_types (array) – Weather types.

Notes

X should be in reduced dimension space already; indexed [time, dimension].
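To make the idea of repeating k-means from several initializations concrete, here is a small NumPy-only sketch. It is an assumption-laden illustration, not PyWR's code: repeated_kmeans, kmeans_once, and assign are hypothetical names, the algorithm is plain Lloyd iteration with random starting points, and the output layout ([simulation, cluster, dimension] and [simulation, time]) is assumed.

```python
import numpy as np


def assign(X, centroids):
    """Label each row of X with the index of its nearest centroid."""
    # Squared distances, shape [time, cluster].
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def kmeans_once(X, n_cluster, rng, n_iter=100):
    """One run of Lloyd's algorithm from randomly chosen starting points."""
    centroids = X[rng.choice(X.shape[0], size=n_cluster, replace=False)]
    for _ in range(n_iter):
        labels = assign(X, centroids)
        # Recompute each centroid; keep the old one if a cluster is empty.
        new = np.array([
            X[labels == k].mean(axis=0) if np.any(labels == k) else centroids[k]
            for k in range(n_cluster)
        ])
        if np.allclose(new, centroids):
            break
        centroids = new
    return centroids, assign(X, centroids)


def repeated_kmeans(X, n_cluster, n_sim, seed=0):
    """Run k-means n_sim times; X is indexed [time, dimension]."""
    rng = np.random.default_rng(seed)
    runs = [kmeans_once(X, n_cluster, rng) for _ in range(n_sim)]
    centroids = np.stack([c for c, _ in runs])     # [simulation, cluster, dimension]
    w_types = np.stack([lab for _, lab in runs])   # [simulation, time]
    return centroids, w_types
```

Keeping every run, rather than only the best one, is what allows the runs to be compared with each other afterwards.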

PyWR.plot_procrustesAnalysis(Procrustes, savefig=False)

Plot results of procrustes analysis.

Contour plots which show the different elements of the procrustes analysis used to correct the weather types.

Parameters
  • Procrustes (dataFrame) – Output of procrustesAnalysis().

  • savefig (Bool, optional) – Determines if plot will be saved.

Returns

plt.show() – Contour plots showing model weather types without correction, and the scaled, rotated, and translated data.

Return type

matplotlib plot

Notes

If savefig=True, figure will be saved to current directory as ‘ProcrustesAnalysis_ + {model} + .pdf’

Procrustes includes the original model weather type data, as well as the scale, rotation, and translation data used to correct the model data in the procrustes analysis.

PyWR.plot_procrustesCorrection(WTf, savefig=False)

Plot the WTf output of the procrustes analysis.

Plot the corrected weather type model outputs against the original WT model and reanalysis datasets.

Parameters
  • WTf (dataFrame) – Output of procrustesAnalysis() which includes reanalysis and model WTs, and the corrected WTs.

  • savefig (Bool, optional) – Determines if plot will be saved.

Returns

plt.show() – Contour plots showing comparison between model, reanalysis, and corrected weather types.

Return type

matplotlib plot

Notes

If savefig=True, figure will be saved to current directory as ‘ProcrustesCorrection_ + {model} + .pdf’

PyWR.plot_reaVSmod(WTmod, WTrea, model, reanalysis='MERRA', savefig=False)

Plot reanalysis and model datasets

Plots smoothed reanalysis data and smoothed, interpolated model datasets as contour maps.

Parameters
  • WTmod (dataFrame) – Model dataset.

  • WTrea (dataFrame) – Reanalysis dataset.

  • model (str) – Name of model dataset.

  • reanalysis (str, optional) – Name of reanalysis dataset. Default reanalysis=’MERRA’ (data used in example calculations).

  • savefig (Bool, optional) – Determines if plot will be saved.

Returns

plt.show() – Contour plot showing reanalysis and model WTs.

Return type

matplotlib plot

Notes

If savefig=True, figure will be saved to current directory as ‘plot_reaVSmod.png’

PyWR.prepareDS_procrustes(WTmod, WTrea)

Prepare model and reanalysis datasets for analysis / plotting.

Takes raw model and reanalysis datasets and returns them smoothed, with the reanalysis dataset also interpolated.

Parameters
  • WTmod (dataFrame) – Model dataset

  • WTrea (dataFrame) – Reanalysis dataset

Returns

  • WTmod (dataFrame) – Smoothed version of the original model dataset

  • WTrea (dataFrame) – Smoothed, interpolated version of the reanalysis dataset

Notes

WTrea domain needs to be slightly larger than WTmod domain in order to avoid NaNs after interpolation.

PyWR.procrustes2(data1, data2)

Perform procrustes analysis.

Needs more info on what exactly this is doing.

Parameters
  • data1 (dataFrame) – Reanalysis dataFrame.

  • data2 (dataFrame) – Model dataFrame.

Returns

  • mtx1 (array) – Matrix to be mapped.

  • mtx2 (array) – Target matrix.

  • disparity (float) – Dissimilarity between the two datasets.

  • R (ndarray) – The matrix solution of the orthogonal Procrustes problem. Returned from scipy’s orthogonal_procrustes() function.

  • s (float) – Scale; Sum of the singular values of mtx1.T @ mtx2

PyWR.procrustes2d(X, Y, scaling=True, reflection='best')

Procrustes analysis

A port of MATLAB’s procrustes function to Numpy. – Modified by Á.G. Muñoz (agmunoz@iri.columbia.edu)

Procrustes analysis determines a linear transformation (translation, reflection, orthogonal rotation and scaling) of the points in Y to best conform them to the points in matrix X, using the sum of squared errors as the goodness of fit criterion.

d, Z, [tform] = procrustes(X, Y)

Parameters
  • X (array) – The reference or target field.

  • Y (array) – The field to be transformed.

  • scaling (Bool, optional) – If False, the scaling component of the transformation is forced to 1.

  • reflection (str or Bool, optional) – If ‘best’ (default), the transformation solution may or may not include a reflection component, depending on which fits the data best. Setting reflection to True or False forces a solution with reflection or no reflection, respectively.

Returns

  • d (float) – The residual sum of squared errors, normalized according to a measure of the scale of X, ((X - X.mean(axis=0))**2).sum()

  • Z (array) – The matrix of transformed Y-values.

  • tform (dict) – The rotation, translation and scaling that map Y onto Z.

Notes

X and Y must have equal numbers of points (rows), but Y may have fewer dimensions (columns) than X.

The components of tform are:
  • c – The translation component.

  • T – The orthogonal rotation and reflection component.

  • b – The scale component.

That is, Z = TRANSFORM.b * Y * TRANSFORM.T + TRANSFORM.c.
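The transformation Z = b * Y * T + c can be illustrated with a compact NumPy sketch. This is a simplified version under assumptions, not the PyWR function: procrustes_sketch is a hypothetical name, X and Y are assumed to have the same shape, scaling is always applied, and reflection is left to whichever solution fits best. The dictionary keys b, T, and c simply mirror the notation used in the Notes.

```python
import numpy as np


def procrustes_sketch(X: np.ndarray, Y: np.ndarray):
    """Fit Z = b * Y @ T + c to X in the least-squares sense."""
    mu_x, mu_y = X.mean(axis=0), Y.mean(axis=0)
    X0, Y0 = X - mu_x, Y - mu_y
    norm_x, norm_y = np.linalg.norm(X0), np.linalg.norm(Y0)
    # Centre and scale both point sets to unit Frobenius norm.
    X0, Y0 = X0 / norm_x, Y0 / norm_y

    # Optimal orthogonal matrix from the SVD of the cross-product matrix.
    U, s, Vt = np.linalg.svd(X0.T @ Y0)
    T = Vt.T @ U.T
    trace = s.sum()

    b = trace * norm_x / norm_y      # scale component
    c = mu_x - b * mu_y @ T          # translation component
    Z = b * Y @ T + c                # transformed Y
    d = 1.0 - trace**2               # residual, normalized by the scale of X
    return d, Z, {"b": b, "T": T, "c": c}
```

When Y is an exactly rotated, scaled, and shifted copy of X, d is zero up to round-off and Z reproduces X.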

PyWR.procrustesAnalysis(WTmod, WTrea, model, reanalysis='MERRA', smooth='SingleDay', printDisparity=False)

Run procrustes analysis

Takes model weather type data and reanalysis weather type data and performs procrustes analysis to correct and improve reanalysis WT dataset.

Parameters
  • WTmod (dataFrame) – Model WT dataset.

  • WTrea (dataFrame) – Reanalysis WT dataset.

  • model (str) – Name of the model data used.

  • reanalysis (str, optional) – Name of the reanalysis data used. Default reanalysis=’MERRA’.

  • smooth (str, optional) – Determines if data will be smoothed, can be either ‘SingleDay’ (no smoothing) or ‘5DayAVG’ (smoothing).

  • printDisparity (Bool, optional) – If True, disparity values for each weather type will be printed. Default printDisparity=False where no output is printed.

Returns

  • WTf (Dataframe) – Includes model, reanalysis, and adjusted reanalysis WT data.

  • Procrustes (Dataframe) – Includes model data and procrustes analysis with components of scale, rotation, translation.

PyWR.resort_labels(old_labels)

Re-sort cluster labels.

Re-orders labels so that the lowest number is the most common, and the highest number is the least common.

Parameters

old_labels (vector) – The previous labels of the clusters.

Returns

new_labels – The new cluster labels, ranked by frequency of occurrence.

Return type

vector
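Relabelling clusters by frequency can be sketched with numpy.unique. The helper below is hypothetical and makes one assumption that may differ from PyWR: the new labels start at 0 (most common) rather than at 1.

```python
import numpy as np


def resort_by_frequency(old_labels):
    """Relabel clusters so 0 is the most common and the largest label the rarest."""
    old_labels = np.asarray(old_labels).ravel()
    values, inverse, counts = np.unique(
        old_labels, return_inverse=True, return_counts=True
    )
    # Positions of the unique labels, most frequent first (ties keep label order).
    order = np.argsort(-counts, kind="stable")
    # rank[i] is the new label assigned to values[i].
    rank = np.empty_like(order)
    rank[order] = np.arange(order.size)
    return rank[inverse]
```

For example, with old labels [2, 2, 2, 0, 1, 1], label 2 occurs most often and becomes 0, label 1 stays 1, and label 0 becomes 2.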

PyWR.shiftedColorMap(cmap, start=0, midpoint=0.5, stop=1.0, name='shiftedcmap')

Offset center of colormap

Useful for data with a negative min and positive max when you want the middle of the colormap’s dynamic range to be at zero.

Parameters
  • cmap (matplotlib colormap) – The matplotlib colormap to be altered.

  • start (float, optional) – Offset from lowest point in the colormap’s range. Defaults to 0.0 (no lower offset). Should be between 0.0 and midpoint.

  • midpoint (float, optional) – The new center of the colormap. Defaults to 0.5 (no shift). Should be between 0.0 and 1.0.

  • stop (float, optional) – Offset from highest point in the colormap’s range. Defaults to 1.0 (no upper offset). Should be between midpoint and 1.0.

Returns

newcmap – New colormap that can be used for plotting.

Return type

matplotlib colormap

Notes

For midpoint, in general it should be 1 - vmax / (vmax + abs(vmin))

For example, if your data range from -15.0 to +5.0 and you want the center of the colormap at 0.0, midpoint should be set to 1 - 5/(5 + 15), or 0.75.
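The midpoint rule in the Notes can be wrapped in a tiny helper so the value does not have to be worked out by hand. The function name colormap_midpoint is hypothetical and not part of PyWR; it only evaluates the formula 1 - vmax / (vmax + abs(vmin)).

```python
def colormap_midpoint(vmin: float, vmax: float) -> float:
    """Fraction of the colormap range at which zero falls for data in [vmin, vmax]."""
    if not vmin < 0 < vmax:
        # The rule only makes sense when the data straddle zero.
        raise ValueError("expected vmin < 0 < vmax")
    return 1 - vmax / (vmax + abs(vmin))


# Data ranging from -15.0 to +5.0 gives a midpoint of 0.75.
midpoint = colormap_midpoint(-15.0, 5.0)
```

The result can then be passed as the midpoint argument of shiftedColorMap.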