ncdata.utils package
General user utility functions.
- ncdata.utils.rename_dimension(ncdata, name_from, name_to)
Rename a dimension of an NcData.
This function calls ncdata.dimensions.rename, but it also renames the dimension in every variable that references it, including those in sub-groups.
See: Rename
- Parameters:
- Return type:
None
Notes
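The cascading rename described above can be pictured with a small stand-alone sketch. It does not use ncdata: the nested-dict layout and the name rename_dimension_sketch are hypothetical stand-ins chosen for illustration, and the sketch ignores the case where a sub-group defines its own dimension with the same name.

```python
def rename_dimension_sketch(group, name_from, name_to):
    """Rename a dimension in a nested-dict dataset model, in place.

    Hypothetical stand-in for illustration only, not the ncdata implementation.
    """
    dims = group["dimensions"]
    if name_from in dims:
        # Rebuild the mapping so the renamed entry keeps its position.
        group["dimensions"] = {
            (name_to if name == name_from else name): size
            for name, size in dims.items()
        }
    # Update every variable in this group that references the old name.
    for var in group["variables"].values():
        var["dimensions"] = [
            name_to if name == name_from else name
            for name in var["dimensions"]
        ]
    # Recurse, so that variables in sub-groups are updated too.
    for subgroup in group.get("groups", {}).values():
        rename_dimension_sketch(subgroup, name_from, name_to)


dataset = {
    "dimensions": {"x": 3, "y": 2},
    "variables": {"v": {"dimensions": ["y", "x"]}},
    "groups": {
        "inner": {
            "dimensions": {},
            "variables": {"w": {"dimensions": ["x"]}},
            "groups": {},
        }
    },
}
rename_dimension_sketch(dataset, "x", "lon")
print(dataset["variables"]["v"]["dimensions"])
print(dataset["groups"]["inner"]["variables"]["w"]["dimensions"])
```

Renaming only the entry in the dimensions mapping would leave "v" and "w" pointing at a dimension that no longer exists, which is the inconsistency the cascading rename avoids.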
- ncdata.utils.dataset_differences(dataset_or_path_1, dataset_or_path_2, check_names=False, check_dims_order=True, check_dims_unlimited=True, check_vars_order=True, check_attrs_order=True, check_groups_order=True, check_var_data=True, show_n_first_different=2, suppress_warnings=False)
Compare two netcdf datasets.
Accepts paths, path strings, open netCDF4.Dataset objects or NcData objects. File paths are opened with the netCDF4 module.
See: Equality Testing
- Parameters:
dataset_or_path_1 (str or Path or netCDF4.Dataset or NcData) – First dataset to compare: either an open netCDF4.Dataset, a path to open one, or an NcData object.
dataset_or_path_2 (str or Path or netCDF4.Dataset or NcData) – Second dataset to compare: either an open netCDF4.Dataset, a path to open one, or an NcData object.
check_dims_order (bool, default True) – If False, no error results from the same dimensions appearing in a different order. However, unless suppress_warnings is True, the error string is issued as a warning.
check_vars_order (bool, default True) – If False, no error results from the same variables appearing in a different order. However, unless suppress_warnings is True, the error string is issued as a warning.
check_attrs_order (bool, default True) – If False, no error results from the same attributes appearing in a different order. However, unless suppress_warnings is True, the error string is issued as a warning.
check_groups_order (bool, default True) – If False, no error results from the same groups appearing in a different order. However, unless suppress_warnings is True, the error string is issued as a warning.
check_names (bool, default False) – Whether to warn if the names of the top-level datasets are different
check_dims_unlimited (bool, default True) – Whether to compare the ‘unlimited’ status of dimensions
check_var_data (bool, default True) – If True, all variable data is also checked for equality. If False, only dtype and shape are compared. NOTE: comparison of arrays is done in-memory, so could be highly inefficient for large variable data.
show_n_first_different (int, default 2) – Number of value differences to display.
suppress_warnings (bool, default False) – When False (the default), report changes in content order as Warnings. When True, ignore changes in ordering. See also : Container ordering.
- Returns:
errs – A list of “error” strings, describing differences between the inputs. If empty, no differences were found.
- Return type:
Examples
>>> data = NcData(
...     name="a",
...     variables=[NcVariable("b", data=[1, 2, 3, 4])],
...     attributes={"a1": 4}
... )
>>> data2 = data.copy()
>>> data2.avals.update({"a1":3, "v":7})
>>> data2.variables["b"].data = np.array([1, 7, 3, 99])  # must be an array!
>>> print('\n'.join(dataset_differences(data, data2)))
Dataset attribute lists do not match: ['a1'] != ['a1', 'v']
Dataset "a1" attribute values differ : 4 != 3
Dataset variable "b" data contents differ, at 2 points: @INDICES[(1,), (3,)] : LHS=[2, 4], RHS=[7, 99]
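The interaction between the check_*_order flags and suppress_warnings can be sketched without ncdata. The helper compare_names_sketch and its message texts are hypothetical, written only to mirror the documented behaviour: an order difference is an error when order is checked, a warning when it is not, and silent when warnings are suppressed.

```python
import warnings


def compare_names_sketch(names1, names2, check_order=True, suppress_warnings=False):
    """Return error strings for two name lists (hypothetical stand-in, not ncdata code)."""
    errs = []
    if sorted(names1) != sorted(names2):
        # Different contents are always reported as an error.
        errs.append(f"name lists do not match: {names1} != {names2}")
    elif names1 != names2:
        # Same contents, different order: error, warning, or nothing.
        msg = f"name lists differ in order: {names1} != {names2}"
        if check_order:
            errs.append(msg)
        elif not suppress_warnings:
            warnings.warn(msg)
    return errs


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    strict = compare_names_sketch(["x", "y"], ["y", "x"])
    relaxed = compare_names_sketch(["x", "y"], ["y", "x"], check_order=False)
    silent = compare_names_sketch(
        ["x", "y"], ["y", "x"], check_order=False, suppress_warnings=True
    )

print(strict)
print(relaxed, len(caught))
```

Only the strict call returns an error string. The relaxed call returns an empty list but emits one warning, and the silent call does neither.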
See also
- ncdata.utils.variable_differences(v1, v2, check_attrs_order=True, check_var_data=True, show_n_first_different=2, suppress_warnings=False, _group_id_string=None)
Compare variables.
See: Equality Testing
- Parameters:
v1 (NcVariable) – variables to compare
v2 (NcVariable) – variables to compare
check_attrs_order (bool, default True) – If False, no error results from the same contents in a different order. However, unless suppress_warnings is True, the error string is issued as a warning.
check_var_data (bool, default True) – If True, all variable data is also checked for equality. If False, only dtype and shape are compared. NOTE: comparison of large arrays is done in-memory, so may be highly inefficient.
show_n_first_different (int, default 2) – Number of value differences to display.
suppress_warnings (bool, default False) – When False (the default), report changes in content order as Warnings. When True, ignore changes in ordering entirely.
_group_id_string (str) – (internal use only)
- Returns:
errs – A list of “error” strings, describing differences between the inputs. If empty, no differences were found.
- Return type:
See also
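As a rough illustration of what show_n_first_different controls, the sketch below reports the total number of differing points but lists only the first few. It works on plain one-dimensional lists of equal length. The function name is hypothetical and the message layout is only modelled on the dataset_differences example output, so it should not be read as the library's implementation.

```python
def data_differences_sketch(lhs, rhs, show_n_first_different=2):
    """Describe value differences between two equal-length 1-D lists.

    Hypothetical stand-in for illustration only, not the ncdata implementation.
    """
    # Positions at which the two sequences disagree.
    diff_inds = [i for i, (a, b) in enumerate(zip(lhs, rhs)) if a != b]
    if not diff_inds:
        return []
    # Report the total count, but display only the first few differences.
    shown = diff_inds[:show_n_first_different]
    indices = [(i,) for i in shown]
    lhs_vals = [lhs[i] for i in shown]
    rhs_vals = [rhs[i] for i in shown]
    return [
        f"data contents differ, at {len(diff_inds)} points: "
        f"@INDICES{indices} : LHS={lhs_vals}, RHS={rhs_vals}"
    ]


print("\n".join(data_differences_sketch([1, 2, 3, 4], [1, 7, 3, 99])))
```

A smaller show_n_first_different shortens the listed indices and values without changing the reported total.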
- ncdata.utils.index_by_dimensions(ncdata, **dim_index_kwargs)
Index an NcData over named dimensions.
- Parameters:
- Return type:
A new copy of ‘ncdata’, with dimensions and all relevant variables sub-indexed.
Examples
>>> data1 = index_by_dimensions(data, time=slice(0, 10))  # equivalent to [:10]
>>> data2 = index_by_dimensions(data, levels=[1,2,5])
>>> data3 = index_by_dimensions(data, time=3, levels=slice(2, 10, 3))
Notes
Where a dimension key is a single value, the dimension will be removed. This mimics how numpy arrays behave, i.e. the difference between a[1] and a[[1]] or a[1:2].
Supported types of index key are: a single number; a slice; a list of indices or booleans. A tuple, or one-dimensional array can also be used in place of a list.
Key types not supported are: multi-dimensional arrays; Ellipsis; np.newaxis / None.
A Slicer provides the same functionality with a slicing syntax.
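The key rules in these notes can be illustrated on a single dimension with plain Python lists. The helper index_one_dimension is a hypothetical stand-in, not part of ncdata. It returns the indexed values together with a flag saying whether the dimension survives, assuming that a boolean list acts as a mask.

```python
def index_one_dimension(values, key):
    """Apply one index key to a list, returning (result, dimension_kept).

    Hypothetical stand-in for illustration only, not the ncdata implementation.
    """
    if isinstance(key, int) and not isinstance(key, bool):
        # A single number selects one element and removes the dimension.
        return values[key], False
    if isinstance(key, slice):
        # A slice keeps the dimension, even when it selects one element.
        return values[key], True
    if isinstance(key, (list, tuple)):
        if key and all(isinstance(k, bool) for k in key):
            # A list of booleans is treated as a mask (an assumption of this sketch).
            return [v for v, keep in zip(values, key) if keep], True
        # A list of indices keeps the dimension, like a[[1]] in numpy.
        return [values[k] for k in key], True
    # Ellipsis, None and anything else are rejected.
    raise TypeError(f"unsupported index key: {key!r}")


levels = [10, 20, 30]
print(index_one_dimension(levels, 1))
print(index_one_dimension(levels, [1]))
print(index_one_dimension(levels, slice(1, 2)))
```

The three calls select the same value, but only the first one drops the dimension.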
See also
- class ncdata.utils.Slicer
Bases: object
An object which can index an NcData over its dimensions.
This wraps the index_by_dimensions() function for convenience, returning an object which supports the Python extended slicing syntax.
Examples
>>> subdata = Slicer(data, "time")[:3]
>>> ds = Slicer(data, 'levels', 'time')
>>> subdata_2 = ds[:10, :2]
>>> subdata_3 = ds[1, [1,2,4]]
>>> subdata_4 = Slicer(data)[:3, 1:4]
Notes
A Slicer contains the original ncdata and presents it in a “sliceable” form. Indexing it returns a new NcData, so the original data is unchanged. The Slicer is also unchanged and can be reused.
index_by_dimensions() provides the same functionality in a different form. See there for more exact details of the operation.
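One way to picture the relationship between the two forms is a thin wrapper whose __getitem__ pairs each positional key with a dimension name and forwards them as keyword arguments. The class SlicerSketch and the function record_index_call are hypothetical, written only for this illustration. The sketch also leaves out the case where no dimension names are given.

```python
class SlicerSketch:
    """Hypothetical stand-in showing how slicing syntax can map onto keyword indexing."""

    def __init__(self, index_function, data, *dimension_names):
        self.index_function = index_function
        self.data = data
        self.dim_names = dimension_names

    def __getitem__(self, keys):
        # A single key arrives bare; several keys arrive as a tuple.
        if not isinstance(keys, tuple):
            keys = (keys,)
        if len(keys) > len(self.dim_names):
            raise IndexError("too many index keys")
        # Pair each key with its dimension name, in order.
        dim_kwargs = dict(zip(self.dim_names, keys))
        # Neither the wrapper nor the original data is modified.
        return self.index_function(self.data, **dim_kwargs)


def record_index_call(data, **dim_index_kwargs):
    """Hypothetical indexing function: just report the keywords it received."""
    return dim_index_kwargs


ds = SlicerSketch(record_index_call, "some-data", "levels", "time")
print(ds[:10, :2])
print(ds[1, [1, 2, 4]])
```

Because indexing only reads from the wrapper, the same object can be indexed repeatedly, which matches the reuse described in the notes.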
See also
- __init__(ncdata, *dimension_names)
Create an indexer for an NcData, applying to specific dimensions.
This can then be indexed to produce a derived (sub-indexed) dataset.
- ncdata
data to be indexed.
- dim_names
dimensions to index, in order.
- ncdata.utils.save_errors(ncdata)
Scan a dataset for consistency and completeness.
See: Correctness and Consistency
Describe any aspects of this dataset which would prevent it from saving (cause an error). If there are any such problems, then an attempt to save the ncdata to a netcdf file will fail. If there are none, then a save should succeed.
- Parameters:
ncdata (NcData) – data to check
- Returns:
A list of strings, error messages describing problems with the dataset. If no errors, returns an empty list.
- Return type:
errors
Notes
The checks made are roughly the following:
(1) check names in all components (dimensions, variables, attributes and groups):
all names are valid netcdf names
all element names match their key in the component, i.e.
component[key].name == key
(2) check that all attribute values have netcdf-compatible dtypes
(e.g. no object or compound (recarray) dtypes).
(3) check that, for all contained variables:
its dimensions are all present in the enclosing dataset
it has an attached data array, of a netcdf-compatible dtype
the shape of its data matches the lengths of its dimensions
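The name and variable checks listed above can be sketched on a simplified model. Everything here is hypothetical: VarSketch, DatasetSketch and save_errors_sketch are illustration-only names, the name rule is a crude placeholder for the real netcdf naming rules, a shape tuple stands in for an attached data array, and attribute dtype checks and groups are left out.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class VarSketch:
    """Hypothetical variable model: 'shape' stands in for an attached data array."""
    name: str
    dimensions: tuple = ()
    shape: Optional[tuple] = None  # None means "no data attached"


@dataclass
class DatasetSketch:
    """Hypothetical dataset model, without attributes or groups."""
    dimensions: dict = field(default_factory=dict)  # dimension name -> length
    variables: dict = field(default_factory=dict)   # key -> VarSketch


def save_errors_sketch(ds):
    """Return a list of problem descriptions (illustration only, not ncdata code)."""
    errors = []
    for key, var in ds.variables.items():
        # Placeholder name rule: the real netcdf naming rules are stricter.
        if not var.name or "/" in var.name:
            errors.append(f"variable name {var.name!r} is not valid")
        if var.name != key:
            errors.append(f"variable key {key!r} does not match name {var.name!r}")
        missing = [d for d in var.dimensions if d not in ds.dimensions]
        if missing:
            errors.append(f"variable {key!r} references unknown dimensions {missing}")
        if var.shape is None:
            errors.append(f"variable {key!r} has no data")
        elif not missing:
            expected = tuple(ds.dimensions[d] for d in var.dimensions)
            if var.shape != expected:
                errors.append(
                    f"variable {key!r} has shape {var.shape}, expected {expected}"
                )
    return errors


good = DatasetSketch({"x": 3}, {"v": VarSketch("v", ("x",), (3,))})
bad = DatasetSketch({"x": 3}, {"v": VarSketch("w", ("x", "y"), None)})
print(save_errors_sketch(good))
print("\n".join(save_errors_sketch(bad)))
```

As with the function documented here, an empty list means no problems were found by the checks that the sketch performs.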
- ncdata.utils.ncdata_copy(ncdata)
Return a copy of the data.
The operation makes fresh copies of all ncdata objects, but does not copy variable data arrays.
See: Copying
- Parameters:
ncdata (NcData) – data to copy
- Returns:
identical but distinct copy of input
- Return type:
ncdata
Notes
This operation is now also available as an object method: copy(). Syntactically, this is generally more convenient, but the operation is identical.
For example:
>>> data1 = ncdata_copy(data)
>>> data2 = data.copy()
>>> data1 == data2
True
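The practical consequence of copying the objects but not the variable data arrays can be shown with a small stand-alone model. VarSketch and copy_sketch are hypothetical names, a plain list stands in for a data array, and copying the attribute mapping is an assumption of this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class VarSketch:
    """Hypothetical variable model: a plain list stands in for the data array."""
    name: str
    data: list
    attributes: dict = field(default_factory=dict)


def copy_sketch(variables):
    """Copy the variable objects, but share their data (illustration only)."""
    return {
        name: VarSketch(var.name, var.data, dict(var.attributes))
        for name, var in variables.items()
    }


original = {"b": VarSketch("b", [1, 2, 3, 4], {"units": "m"})}
duplicate = copy_sketch(original)

# Changing the copied object leaves the original object alone ...
duplicate["b"].attributes["units"] = "km"
# ... but an in-place change to the shared data is seen through both.
duplicate["b"].data[0] = 99

print(original["b"].attributes)
print(original["b"].data)
```

In this model, replacing a variable's data with a new array on the copy is safe, whereas modifying the shared array in place affects both datasets.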