earthdaily.earthdatastore.cube_utils package
Module contents
- class earthdaily.earthdatastore.cube_utils.GeometryManager(geometry)[source]
Bases:
object
A class to manage and convert various types of geometries into different formats.
- Parameters:
geometry (various types) – Input geometry, can be GeoDataFrame, GeoSeries, WKT string, or GeoJSON.
- geometry
The input geometry provided by the user.
- Type:
various types
- _obj
The geometry converted into a GeoDataFrame.
- Type:
GeoDataFrame
- input_type
Type of input geometry inferred during processing.
- Type:
str
- __call__():
Returns the stored GeoDataFrame.
- to_intersects(crs='EPSG:4326'):
Converts geometry to GeoJSON intersects format with a specified CRS.
- to_wkt(crs='EPSG:4326'):
Converts geometry to WKT format with a specified CRS.
- to_json(crs='EPSG:4326'):
Converts geometry to GeoJSON format with a specified CRS.
- to_geopandas():
Returns the GeoDataFrame of the input geometry.
- to_bbox(crs='EPSG:4326'):
Returns the bounding box of the geometry with a specified CRS.
- buffer_in_meter(distance, crs_meters='EPSG:3857', **kwargs):
Applies a buffer in meters to the geometry and returns it with the original CRS.
- buffer_in_meter(distance, crs_meters='EPSG:3857', **kwargs)[source]
Applies a buffer in meters to the geometry and returns it with the original CRS.
- Parameters:
distance (int) – The buffer distance in meters.
crs_meters (str, optional) – The CRS to use for calculating the buffer (default is EPSG:3857).
**kwargs (dict, optional) – Additional keyword arguments to pass to the buffer method.
- Returns:
The buffered geometry in the original CRS.
- Return type:
GeoSeries
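The project, buffer, reproject idea behind buffer_in_meter can be sketched without any geospatial dependency. The snippet below is a minimal illustration, not the library's implementation: it expands a longitude/latitude bounding box (rather than buffering a full geometry) using the spherical Web Mercator formulas of EPSG:3857. The helper names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by EPSG:3857


def lonlat_to_mercator(lon, lat):
    """Project degrees (EPSG:4326) to Web Mercator meters (EPSG:3857)."""
    x = EARTH_RADIUS_M * math.radians(lon)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
    return x, y


def mercator_to_lonlat(x, y):
    """Inverse projection: Web Mercator meters back to degrees."""
    lon = math.degrees(x / EARTH_RADIUS_M)
    lat = math.degrees(2 * math.atan(math.exp(y / EARTH_RADIUS_M)) - math.pi / 2)
    return lon, lat


def buffer_bbox_in_meter(bbox, distance):
    """Hypothetical helper: grow [minx, miny, maxx, maxy] by `distance` meters.

    The box is projected to the metric CRS, expanded there, and the result
    is returned in the original (degree) coordinates.
    """
    minx, miny, maxx, maxy = bbox
    x0, y0 = lonlat_to_mercator(minx, miny)
    x1, y1 = lonlat_to_mercator(maxx, maxy)
    lon0, lat0 = mercator_to_lonlat(x0 - distance, y0 - distance)
    lon1, lat1 = mercator_to_lonlat(x1 + distance, y1 + distance)
    return [lon0, lat0, lon1, lat1]


buffered = buffer_bbox_in_meter([2.0, 48.0, 3.0, 49.0], 1000)
```

Distances in EPSG:3857 are stretched by roughly 1/cos(latitude), so a 1000 m buffer computed this way covers less than 1000 m on the ground away from the equator. Passing a local projected CRS as crs_meters may give distances closer to true meters.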
- to_bbox(crs='EPSG:4326')[source]
Returns the bounding box of the geometry.
- Parameters:
crs (str, optional) – The CRS to convert to (default is EPSG:4326).
- Returns:
The bounding box as an array [minx, miny, maxx, maxy].
- Return type:
numpy.ndarray
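To make the [minx, miny, maxx, maxy] ordering concrete, here is a small pure-Python sketch that derives the same kind of bounding box from a GeoJSON Polygon. It is an illustration only; the function name is hypothetical and the library itself returns a numpy.ndarray rather than a list.

```python
def polygon_bbox(geojson_polygon):
    """Return [minx, miny, maxx, maxy] for a GeoJSON Polygon geometry dict."""
    xs, ys = [], []
    for ring in geojson_polygon["coordinates"]:
        for position in ring:
            # GeoJSON positions are [x, y] or [x, y, z]; only x and y matter here.
            xs.append(position[0])
            ys.append(position[1])
    return [min(xs), min(ys), max(xs), max(ys)]


polygon = {
    "type": "Polygon",
    "coordinates": [
        [[2.0, 48.0], [3.5, 48.2], [3.0, 49.0], [1.5, 48.7], [2.0, 48.0]]
    ],
}
bbox = polygon_bbox(polygon)
```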
- to_geopandas()[source]
Converts the input geometry to a GeoDataFrame.
- Returns:
The input geometry as a GeoDataFrame.
- Return type:
GeoDataFrame
- to_intersects(crs='EPSG:4326')[source]
Converts the geometry to GeoJSON intersects format.
- Parameters:
crs (str, optional) – The coordinate reference system (CRS) to convert to (default is EPSG:4326).
- Returns:
The geometry in GeoJSON intersects format.
- Return type:
dict
- to_json(crs='EPSG:4326')[source]
Converts the geometry to GeoJSON format.
- Parameters:
crs (str, optional) – The CRS to convert to (default is EPSG:4326).
- Returns:
The geometry in GeoJSON format.
- Return type:
dict
- to_wkt(crs='EPSG:4326')[source]
Converts the geometry to WKT format.
- Parameters:
crs (str, optional) – The CRS to convert to (default is EPSG:4326).
- Returns:
The geometry in WKT format. If there is only one geometry, a single WKT string is returned; otherwise, a list of WKT strings.
- Return type:
str or list of str
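Because the return type depends on the number of geometries, calling code usually has to handle both cases. The sketch below mimics that "single string or list" convention in plain Python and shows one way a caller might normalize the result. The helper names are hypothetical and the WKT formatting is simplified to single-ring polygons.

```python
def ring_to_wkt(ring):
    """Format one closed ring of (x, y) pairs as a WKT POLYGON string."""
    coords = ", ".join(f"{x:g} {y:g}" for x, y in ring)
    return f"POLYGON (({coords}))"


def rings_to_wkt(rings):
    """Return one WKT string for a single ring, otherwise a list of strings."""
    wkts = [ring_to_wkt(ring) for ring in rings]
    return wkts[0] if len(wkts) == 1 else wkts


def as_list(wkt_result):
    """Normalize the str-or-list result so callers can always iterate."""
    return [wkt_result] if isinstance(wkt_result, str) else list(wkt_result)


square = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
shifted = [(x + 2, y) for x, y in square]

single = rings_to_wkt([square])
several = rings_to_wkt([square, shifted])
```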
- earthdaily.earthdatastore.cube_utils.zonal_stats(dataset, geometries, method='numpy', lazy_load=True, max_memory_mb=None, reducers=['mean'], all_touched=True, preserve_columns=True, buffer_meters=None, **kwargs)[source]
Calculate zonal statistics for xarray Dataset based on geometric boundaries.
This function computes statistical summaries of Dataset values within each geometry’s zone, supporting parallel processing through xarray’s apply_ufunc and multiple computation methods.
- Parameters:
dataset (xarray.Dataset) – Input dataset containing variables for statistics computation.
geometries (Union[geopandas.GeoDataFrame, geopandas.GeoSeries]) – Geometries defining the zones for statistics calculation.
method (str, optional) –
- Method for computation. Options:
‘numpy’: Uses numpy functions with parallel processing
‘xvec’: Uses xvec library (must be installed)
Default is ‘numpy’.
lazy_load (bool, optional) – If True, optimizes memory usage by loading data in chunks for the ‘numpy’ method. Default is True.
max_memory_mb (float, optional) – Maximum memory to use in megabytes. If None, uses maximum available memory. Default is None.
reducers (list[str], optional) – List of statistical operations to perform. Functions should be numpy nan-functions (e.g., ‘mean’ uses np.nanmean). Default is [‘mean’].
all_touched (bool, optional) – If True, includes all pixels touched by geometries in computation. Default is True.
preserve_columns (bool, optional) – If True, preserves all columns from input geometries in output. Default is True.
buffer_meters (Union[int, float, None], optional) – Buffer distance in meters to apply to geometries before computation. Default is None.
**kwargs (dict) – Additional keyword arguments passed to underlying computation functions.
- Returns:
- Dataset containing computed statistics with dimensions:
time (if present in input)
feature (number of geometries)
zonal_statistics (number of reducers)
Additional coordinates include geometry WKT and preserved columns if requested.
- Return type:
xarray.Dataset
See also
xarray.apply_ufunc
Function used for parallel computation
rasterio.features
Used for geometry rasterization
Notes
Memory usage is optimized for time series data when lazy_load=True by processing in chunks determined by available system memory.
The ‘xvec’ method requires the xvec package to be installed separately.
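How a memory budget could translate into time chunks can be sketched as follows. This is an assumption about the general approach, not the library's actual logic: the function names, the per-step size estimate, and the megabyte conversion are all hypothetical.

```python
def time_steps_per_chunk(n_time, height, width, itemsize, max_memory_mb):
    """Hypothetical estimate of how many time steps fit in the memory budget."""
    bytes_per_step = height * width * itemsize
    budget_bytes = int(max_memory_mb * 1024 * 1024)
    # Always process at least one step, never more than the series length.
    steps = max(1, budget_bytes // bytes_per_step)
    return min(n_time, steps)


def chunk_slices(n_time, steps):
    """Split the time axis into consecutive slices of at most `steps` items."""
    return [
        slice(start, min(start + steps, n_time))
        for start in range(0, n_time, steps)
    ]


# 100 time steps of a 1000 x 1000 float64 raster with a 64 MB budget.
steps = time_steps_per_chunk(100, 1000, 1000, itemsize=8, max_memory_mb=64)
slices = chunk_slices(100, steps)
```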
Examples
>>> import xarray as xr
>>> import geopandas as gpd
>>> from earthdaily.earthdatastore.cube_utils import zonal_stats
>>> dataset = xr.open_dataset("temperature.nc")
>>> polygons = gpd.read_file("zones.geojson")
>>> stats = zonal_stats(
...     dataset,
...     polygons,
...     reducers=["mean", "max"],
...     lazy_load=True
... )
- Raises:
ImportError – If ‘xvec’ method is selected but xvec package is not installed.
ValueError – If invalid method or reducer is specified.
DeprecationWarning – If deprecated parameters are used.
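The relationship between reducer names, numpy nan-functions, and the (time, feature, zonal_statistics) output layout can be illustrated with a small numpy sketch. It is not the library's implementation: it assumes the geometries have already been rasterized into an integer zone array (0 for outside, 1..n for each feature), and the function and argument names are hypothetical.

```python
import numpy as np


def zonal_stats_sketch(values, zones, reducers=("mean",)):
    """Reduce a (time, y, x) array per zone.

    values : float array of shape (time, y, x)
    zones  : int array of shape (y, x); 0 = no feature, 1..n = feature id
    Returns an array of shape (time, feature, reducer).
    """
    feature_ids = np.unique(zones[zones > 0])
    out = np.full((values.shape[0], feature_ids.size, len(reducers)), np.nan)
    for f, feature_id in enumerate(feature_ids):
        # Boolean mask on the two spatial axes gives shape (time, n_pixels).
        pixels = values[:, zones == feature_id]
        for r, name in enumerate(reducers):
            # 'mean' resolves to np.nanmean, 'max' to np.nanmax, and so on.
            reducer = getattr(np, f"nan{name}")
            out[:, f, r] = reducer(pixels, axis=1)
    return out


values = np.array(
    [
        [[1.0, 2.0], [3.0, np.nan]],
        [[10.0, 20.0], [30.0, 40.0]],
    ]
)
zones = np.array([[1, 1], [2, 2]])
stats = zonal_stats_sketch(values, zones, reducers=["mean", "max"])
```

Using the nan-variants means a missing pixel inside a zone is ignored rather than turning the whole statistic into NaN.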