PyWR module
-
PyWR.calc_classifiability(P, Q)
Implement the Michelangeli et al. (1995) Classifiability Index.
The variable naming here is not pythonic but follows the notation in the 1995 paper, which makes it easier to follow what is going on. You shouldn’t need to call this function directly, but it is called in cluster_xr_eof.
- Parameters
P (array) – A cluster centroid
Q (array) – Another cluster centroid
- Returns
ci – Classifiability index value.
- Return type
float
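As a rough illustration of the idea, the sketch below compares two sets of centroids through their pattern correlations. It assumes that P and Q each hold their centroids as rows, and that similarity is the worst-case best match: each centroid in P is paired with its most correlated centroid in Q, and the smallest of those correlations is returned. The name `classifiability_sketch` is hypothetical, and this is not necessarily how PyWR computes the value.

```python
import numpy as np


def classifiability_sketch(P, Q):
    """Worst-case best-match correlation between two sets of centroids.

    Assumption: P and Q are 2-D arrays indexed [cluster, dimension].
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    k = P.shape[0]
    # np.corrcoef stacks the rows of P and Q as variables; the upper-right
    # block holds the correlation of every row of P with every row of Q.
    A = np.corrcoef(P, Q)[:k, k:]
    # Best match in Q for each centroid of P, then the weakest of those.
    return float(A.max(axis=1).min())


P = np.array([[1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0], [1.0, 3.0, 2.0, 4.0]])
same_but_reordered = P[::-1]
ci = classifiability_sketch(P, same_but_reordered)
```

With this definition, two partitions that contain the same centroids in a different order score close to 1, because every centroid finds a perfectly correlated partner.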
-
PyWR.download_data(url, authkey, outfile, force_download=False)
A smart function to download data from IRI Data Library.
If the data can be read from outfile and force_download is False, the function reads from the file. Otherwise it downloads from IRIDL and then reads from the file.
- Parameters
url (str) – The url pointing to the data.nc file.
authkey (str) – The authentication key for IRI DL (see above).
outfile (str) – The data filename.
force_download (Bool, optional) – False if it’s OK to read from existing file, True if data must be re-downloaded.
- Returns
data – Dataframe of dataset specified in url or file.
- Return type
dataFrame
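The read-or-download behaviour described above follows a common caching pattern, sketched here without any network access. The names `load_or_fetch`, `fetch`, and `read` are hypothetical stand-ins: in this sketch `fetch` is any callable returning bytes and `read` is any callable that loads a file, whereas the real function contacts IRIDL with the authentication key.

```python
from pathlib import Path


def load_or_fetch(outfile, fetch, read, force_download=False):
    """Read `outfile` if possible, otherwise fetch it and then read it.

    Assumptions: `fetch()` returns the file contents as bytes and
    `read(path)` loads the file; both are supplied by the caller.
    """
    path = Path(outfile)
    if not force_download and path.exists():
        try:
            return read(path)
        except Exception:
            # The cached file could not be read; fall through and re-fetch.
            pass
    path.write_bytes(fetch())
    return read(path)
```

Passing the download and read steps in as callables keeps the caching decision separate from the transport, which makes the logic easy to exercise with a fake downloader.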
-
PyWR.get_classifiability_index(centroids: numpy.ndarray) → Tuple[float, int]
Get the classifiability of a set of centroids.
This function will compute the classifiability index for a set of centroids and indicate which is the best one.
- Parameters
centroids (array) – Input array of centroids, indexed [simulation, dimension]
- Returns
classifiability (float) – Classifiability index value.
best_part (int) – The best centroid.
-
PyWR.get_number_eof(X: numpy.ndarray, var_to_explain: float, plot=False) → int
Get the number of EOFs of X that explain a given variance proportion.
- Parameters
X (ndarray) –
var_to_explain (float) – Proportion (0 to 1) of variance to be explained.
plot (Bool, optional) – Default plot=False will not generate a plot.
- Returns
n_eof (int) – Number of EOFs retained for the chosen proportion of variance.
Notes
Plot generated by the function is ‘number of EOFs’ versus ‘Cumulative proportion of variance explained’.
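One way to find such a count is to look at the cumulative explained variance from a singular value decomposition. The sketch below assumes X is indexed [time, feature] and that anomalies are taken about the time mean; `n_eof_sketch` is a hypothetical name, and the actual function may rely on a different PCA routine.

```python
import numpy as np


def n_eof_sketch(X, var_to_explain):
    """Smallest number of EOFs whose cumulative variance reaches the target.

    Assumption: X is indexed [time, feature].
    """
    X = np.asarray(X, dtype=float)
    anomalies = X - X.mean(axis=0)
    # Squared singular values are proportional to the variance of each EOF.
    s = np.linalg.svd(anomalies, compute_uv=False)
    cumulative = np.cumsum(s**2 / np.sum(s**2))
    # Index of the first EOF at which the target proportion is reached.
    n = int(np.searchsorted(cumulative, var_to_explain)) + 1
    # Guard against round-off leaving the final cumulative value just below 1.
    return min(n, len(cumulative))


# Two orthogonal directions carrying about 90% and 10% of the variance.
X = np.array([[3.0, 0.0], [-3.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
n_for_80 = n_eof_sketch(X, 0.80)
n_for_95 = n_eof_sketch(X, 0.95)
```

In this toy dataset a single EOF covers an 80% target, while a 95% target needs both.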
-
PyWR.loop_kmeans(X: numpy.ndarray, n_cluster: int, n_sim: int) → Tuple[numpy.ndarray, numpy.ndarray]
Compute weather types using k means clustering.
Should have more information on what this does.
- Parameters
X (array) – PCA reanalysis data.
n_cluster (int) – How many clusters to compute.
n_sim (int) – How many times to initialize the clusters (note: computation increases on the order of n_sim**2).
- Returns
centroids (array) – Centroids.
w_types (array) – Weather types.
Notes
X should be in reduced dimension space already; indexed [time, dimension].
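To make the repeated-initialization idea concrete, here is a NumPy-only sketch that runs a plain Lloyd-style k-means n_sim times from different random starts and stacks the results. Everything in it is an assumption for illustration: the names `kmeans_once`, `assign`, and `loop_kmeans_sketch`, the `seed` argument, and the output shapes [simulation, cluster, dimension] and [simulation, time]. The real function may well delegate to a library clusterer.

```python
import numpy as np


def assign(X, centroids):
    """Label each row of X with the index of its nearest centroid."""
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def kmeans_once(X, n_cluster, rng, n_iter=100):
    """One k-means run starting from randomly chosen data points."""
    start = rng.choice(len(X), size=n_cluster, replace=False)
    centroids = X[start].copy()
    for _ in range(n_iter):
        labels = assign(X, centroids)
        updated = centroids.copy()
        for k in range(n_cluster):
            members = X[labels == k]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                updated[k] = members.mean(axis=0)
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return centroids, assign(X, centroids)


def loop_kmeans_sketch(X, n_cluster, n_sim, seed=0):
    """Repeat k-means n_sim times; X is indexed [time, dimension]."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    centroids = np.empty((n_sim, n_cluster, X.shape[1]))
    w_types = np.empty((n_sim, X.shape[0]), dtype=int)
    for i in range(n_sim):
        centroids[i], w_types[i] = kmeans_once(X, n_cluster, rng)
    return centroids, w_types


X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
centroids, w_types = loop_kmeans_sketch(X, n_cluster=2, n_sim=5)
```

Each run costs roughly the same, so a quadratic growth in n_sim would most plausibly come from comparing every pair of runs afterwards, for example when scoring classifiability; that reading is an inference rather than something stated here.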
-
PyWR.plot_procrustesAnalysis(Procrustes, savefig=False)
Plot results of procrustes analysis.
Contour plots which show the different elements of the procrustes analysis used to correct the weather types.
- Parameters
Procrustes (dataFrame) – Output of procrustesAnalysis().
savefig (Bool, optional) – Determines if plot will be saved.
- Returns
plt.show() – Contour plots showing model weather types without correction, and the scaled, rotated, and translated data.
- Return type
matplotlib plot
Notes
If savefig=True, figure will be saved to current directory as ‘ProcrustesAnalysis_ + {model} + .pdf’
Procrustes includes the original model weather type data, as well as the scale, rotation, and translation data used to correct the model data in the procrustes analysis.
-
PyWR.plot_procrustesCorrection(WTf, savefig=False)
Plot the WTf output of the procrustes analysis.
Plot the corrected weather type model outputs against the original WT model and reanalysis datasets.
- Parameters
WTf (dataFrame) – Output of procrustesAnalysis() which includes reanalysis and model WTs, and the corrected WTs.
savefig (Bool, optional) – Determines if plot will be saved.
- Returns
plt.show() – Contour plots showing comparison between model, reanalysis, and corrected weather types.
- Return type
matplotlib plot
Notes
If savefig=True, figure will be saved to current directory as ‘ProcrustesCorrection_ + {model} + .pdf’
-
PyWR.plot_reaVSmod(WTmod, WTrea, model, reanalysis='MERRA', savefig=False)
Plot reanalysis and model datasets.
Plots smoothed reanalysis data and smoothed, interpolated model datasets as contour maps.
- Parameters
WTmod (dataFrame) – Model dataset.
WTrea (dataFrame) – Reanalysis dataset.
model (str) – Name of model dataset.
reanalysis (str, optional) – Name of reanalysis dataset. Default reanalysis=’MERRA’ (data used in example calculations).
savefig (Bool, optional) – Determines if plot will be saved.
- Returns
plt.show() – Contour plot showing reanalysis and model WTs.
- Return type
matplotlib plot
Notes
If savefig=True, figure will be saved to current directory as ‘plot_reaVSmod.png’
-
PyWR.prepareDS_procrustes(WTmod, WTrea)
Prepare model and reanalysis datasets for analysis / plotting.
Takes the raw model and reanalysis datasets.
- Parameters
WTmod (dataFrame) – Model dataset
WTrea (dataFrame) – Reanalysis dataset
- Returns
WTmod (dataFrame) – Smoothed version of the original model dataset
WTrea (dataFrame) – Smoothed, interpolated version of the reanalysis dataset
Notes
WTrea domain needs to be slightly larger than WTmod domain in order to avoid NaNs after interpolation.
-
PyWR.procrustes2(data1, data2)
Perform procrustes analysis.
Needs more info on what exactly this is doing.
- Parameters
data1 (dataFrame) – Reanalysis dataFrame.
data2 (dataFrame) – Model dataFrame.
- Returns
mtx1 (array) – Matrix to be mapped.
mtx2 (array) – Target matrix.
disparity (float) – Dissimilarity between the two datasets.
R (ndarray) – The matrix solution of the orthogonal Procrustes problem. Returned from scipy’s orthogonal_procrustes() function.
s (float) – Scale; Sum of the singular values of mtx1.T @ mtx2
-
PyWR.procrustes2d(X, Y, scaling=True, reflection='best')
Procrustes analysis.
A port of MATLAB’s procrustes function to Numpy. – Modified by Á.G. Muñoz (agmunoz@iri.columbia.edu)
Procrustes analysis determines a linear transformation (translation, reflection, orthogonal rotation and scaling) of the points in Y to best conform them to the points in matrix X, using the sum of squared errors as the goodness of fit criterion.
d, Z, [tform] = procrustes(X, Y)
- Parameters
X (array) – The reference or target field.
Y (array) – The field to be transformed.
scaling (Bool, optional) – If False, the scaling component of the transformation is forced to 1.
reflection (str or Bool, optional) – If ‘best’ (default), the transformation solution may or may not include a reflection component, depending on which fits the data best. Setting reflection to True or False forces a solution with reflection or no reflection, respectively.
- Returns
d (float) – The residual sum of squared errors, normalized according to a measure of the scale of X, ((X - X.mean(axis=0))**2).sum()
Z (array) – The matrix of transformed Y-values.
tform (dict) – A dict specifying the rotation, translation and scaling that maps Y onto X.
Notes
X and Y must have equal numbers of points (rows), but Y may have fewer dimensions (columns) than X.
c: The translation component
T: The orthogonal rotation and reflection component
b: The scale component
That is, Z = TRANSFORM.b * Y * TRANSFORM.T + TRANSFORM.c.
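The relation Z = b * Y * T + c can be checked with a compact NumPy sketch of the same kind of fit. It is a simplified illustration, not the function documented here: it assumes X and Y have the same number of columns, always applies scaling, and always uses the best-fitting rotation or reflection. The name `procrustes_sketch` and the dict keys `b`, `T`, `c` are chosen only to mirror the notation above.

```python
import numpy as np


def procrustes_sketch(X, Y):
    """Fit Z = b * Y @ T + c so that Z is as close as possible to X.

    Assumptions: X and Y are [points, dimensions] arrays of equal shape.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    muX, muY = X.mean(axis=0), Y.mean(axis=0)
    X0, Y0 = X - muX, Y - muY
    normX = np.sqrt((X0**2).sum())
    normY = np.sqrt((Y0**2).sum())
    X0, Y0 = X0 / normX, Y0 / normY  # centred, unit-norm configurations
    # Optimal orthogonal matrix from the SVD of the cross-covariance.
    U, s, Vt = np.linalg.svd(X0.T @ Y0)
    T = Vt.T @ U.T
    trace = s.sum()
    b = trace * normX / normY          # scale component
    c = muX - b * (muY @ T)            # translation component
    d = 1.0 - trace**2                 # normalized residual sum of squares
    Z = b * (Y @ T) + c
    return d, Z, {"b": b, "T": T, "c": c}


X = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 2.0]])
R = np.array([[0.0, -1.0], [1.0, 0.0]])           # 90 degree rotation
Y = 2.0 * (X @ R) + np.array([1.0, -3.0])          # rotated, scaled, shifted copy
d, Z, tform = procrustes_sketch(X, Y)
```

Because Y here is an exact rotated, scaled, and shifted copy of X, the fit recovers X with a residual near zero and a scale near 0.5.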
-
PyWR.procrustesAnalysis(WTmod, WTrea, model, reanalysis='MERRA', smooth='SingleDay', printDisparity=False)
Run procrustes analysis.
Takes model weather type data and reanalysis weather type data and performs procrustes analysis to correct and improve reanalysis WT dataset.
- Parameters
WTmod (dataFrame) – Model WT dataset.
WTrea (dataFrame) – Reanalysis WT dataset.
model (str) – Name of the model data used.
reanalysis (str) – Name of the reanalysis data used.
smooth (str, optional) – Determines if data will be smoothed, can be either ‘SingleDay’ (no smoothing) or ‘5DayAVG’ (smoothing).
printDisparity (Bool, optional) – If True, disparity values for each weather type will be printed. Default printDisparity=False where no output is printed.
- Returns
WTf (dataFrame) – Includes model, reanalysis, and adjusted reanalysis WT data.
Procrustes (dataFrame) – Includes model data and procrustes analysis with components of scale, rotation, translation.
-
PyWR.resort_labels(old_labels)
Re-sort cluster labels.
Re-orders labels so that the lowest number is the most common, and the highest number is the least common.
- Parameters
old_labels (vector) – The previous labels of the clusters.
- Returns
new_labels – The new cluster labels, ranked by frequency of occurrence.
- Return type
vector
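A frequency-ranked relabelling of this kind can be sketched with `numpy.unique`. The sketch assumes the new labels start at 0 and that ties keep the order of the original label values; both are assumptions, and `resort_labels_sketch` is a hypothetical name.

```python
import numpy as np


def resort_labels_sketch(old_labels):
    """Relabel clusters so that 0 is the most common label.

    Assumptions: new labels start at 0; ties keep the original value order.
    """
    old_labels = np.asarray(old_labels)
    values, counts = np.unique(old_labels, return_counts=True)
    # Stable sort on negative counts ranks labels from most to least common.
    order = values[np.argsort(-counts, kind="stable")]
    new_labels = np.empty_like(old_labels)
    for rank, value in enumerate(order):
        new_labels[old_labels == value] = rank
    return new_labels


relabelled = resort_labels_sketch([2, 2, 2, 0, 0, 1])
```

Here the old label 2 occurs most often and becomes 0, the old label 0 becomes 1, and the old label 1 becomes 2.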
-
PyWR.shiftedColorMap(cmap, start=0, midpoint=0.5, stop=1.0, name='shiftedcmap')
Offset center of colormap.
Useful when your data have a negative min and a positive max and you want the middle of the colormap’s dynamic range to be at zero.
- Parameters
cmap (matplotlib colormap) – The matplotlib colormap to be altered.
start (float, optional) – Offset from lowest point in the colormap’s range. Defaults to 0.0 (no lower offset). Should be between 0.0 and midpoint.
midpoint (float, optional) – The new center of the colormap. Defaults to 0.5 (no shift). Should be between 0.0 and 1.0.
stop (float, optional) – Offset from highest point in the colormap’s range. Defaults to 1.0 (no upper offset). Should be between midpoint and 1.0.
- Returns
newcmap – New colormap that can be used for plotting.
- Return type
matplotlib colormap
Notes
For midpoint, in general it should be 1 - vmax / (vmax + abs(vmin)).
For example, if your data range from -15.0 to +5.0 and you want the center of the colormap at 0.0, midpoint should be set to 1 - 5/(5 + 15), or 0.75.
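The midpoint rule is simple enough to wrap in a small helper. The function name `midpoint_for_zero` is hypothetical, and the sketch assumes vmin is negative and vmax is positive.

```python
def midpoint_for_zero(vmin, vmax):
    """Midpoint that places zero at the center of the colormap.

    Assumption: vmin < 0 < vmax.
    """
    return 1 - vmax / (vmax + abs(vmin))


# Data ranging from -15.0 to +5.0, as in the worked example.
midpoint = midpoint_for_zero(-15.0, 5.0)
```

The result could then be passed as the midpoint argument when building the shifted colormap.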