earthdaily.earthdatastore.cube_utils package

Module contents

class earthdaily.earthdatastore.cube_utils.GeometryManager(geometry)[source]

Bases: object

A class to manage and convert various types of geometries into different formats.

Parameters:

geometry (various types) – Input geometry: a GeoDataFrame, GeoSeries, WKT string, or GeoJSON.

geometry

The input geometry provided by the user.

Type:

various types

_obj

The geometry converted into a GeoDataFrame.

Type:

GeoDataFrame

input_type

Type of input geometry inferred during processing.

Type:

str

__call__():

Returns the stored GeoDataFrame.

to_intersects(crs='EPSG:4326'):

Converts geometry to GeoJSON intersects format with a specified CRS.

to_wkt(crs='EPSG:4326'):

Converts geometry to WKT format with a specified CRS.

to_json(crs='EPSG:4326'):

Converts geometry to GeoJSON format with a specified CRS.

to_geopandas():

Returns the GeoDataFrame of the input geometry.

to_bbox(crs='EPSG:4326'):

Returns the bounding box of the geometry with a specified CRS.

buffer_in_meter(distance, crs_meters='EPSG:3857', **kwargs):

Applies a buffer in meters to the geometry and returns it with the original CRS.
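A minimal usage sketch built only from the signatures documented on this page. The helper name `summarize_geometry` is hypothetical, and the import is deferred into the function so the snippet can be loaded without the earthdaily package installed.

```python
def summarize_geometry(geometry):
    """Collect a few representations of one input geometry.

    `summarize_geometry` is a hypothetical helper; only the GeometryManager
    attributes and methods it calls are taken from this page.
    """
    # Deferred import: the earthdaily package is only needed at call time.
    from earthdaily.earthdatastore.cube_utils import GeometryManager

    manager = GeometryManager(geometry)
    return {
        "input_type": manager.input_type,        # type inferred from the input
        "bbox": manager.to_bbox(),                # [minx, miny, maxx, maxy]
        "wkt": manager.to_wkt(),                  # str, or list of str
        "geodataframe": manager.to_geopandas(),   # the stored GeoDataFrame
    }
```

Since WKT strings are listed among the accepted inputs, a call such as `summarize_geometry("POLYGON ((0 0, 1 0, 1 1, 0 1, 0 0))")` would be the expected way to use it.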

buffer_in_meter(distance, crs_meters='EPSG:3857', **kwargs)[source]

Applies a buffer in meters to the geometry and returns it with the original CRS.

Parameters:
  • distance (int) – The buffer distance in meters.

  • crs_meters (str, optional) – The CRS to use for calculating the buffer (default is EPSG:3857).

  • **kwargs (dict, optional) – Additional keyword arguments to pass to the buffer method.

Returns:

The buffered geometry in the original CRS.

Return type:

GeoSeries
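The round trip described above (project to a metric CRS, offset by a distance in meters, project back) can be sketched for a single point with the spherical Web Mercator formulas behind EPSG:3857. This is an illustration of the idea only, not the library's implementation; all function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by EPSG:3857


def lonlat_to_mercator(lon, lat):
    """Project degrees (EPSG:4326) to Web Mercator meters (EPSG:3857)."""
    x = EARTH_RADIUS_M * math.radians(lon)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
    return x, y


def mercator_to_lonlat(x, y):
    """Inverse projection, from Web Mercator meters back to degrees."""
    lon = math.degrees(x / EARTH_RADIUS_M)
    lat = math.degrees(2 * math.atan(math.exp(y / EARTH_RADIUS_M)) - math.pi / 2)
    return lon, lat


def buffered_bounds(lon, lat, distance):
    """Bounds of a square of half-width `distance` (projected meters) around
    a point, returned as (min_lon, min_lat, max_lon, max_lat) in degrees."""
    x, y = lonlat_to_mercator(lon, lat)
    min_lon, min_lat = mercator_to_lonlat(x - distance, y - distance)
    max_lon, max_lat = mercator_to_lonlat(x + distance, y + distance)
    return min_lon, min_lat, max_lon, max_lat
```

Web Mercator meters match ground meters only near the equator; the scale factor grows as 1/cos(latitude). Passing a local projected CRS (for example a UTM zone) as `crs_meters` is therefore worth considering at high latitudes.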

to_bbox(crs='EPSG:4326')[source]

Returns the bounding box of the geometry.

Parameters:

crs (str, optional) – The CRS to convert to (default is EPSG:4326).

Returns:

The bounding box as an array [minx, miny, maxx, maxy].

Return type:

numpy.ndarray
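To make the element order of the returned array concrete, the sketch below turns a `[minx, miny, maxx, maxy]` sequence back into a closed rectangle ring. `bbox_to_ring` is a hypothetical helper and is not part of the package.

```python
def bbox_to_ring(bbox):
    """Turn [minx, miny, maxx, maxy] into a closed counter-clockwise ring."""
    minx, miny, maxx, maxy = (float(v) for v in bbox)
    return [
        (minx, miny),
        (maxx, miny),
        (maxx, maxy),
        (minx, maxy),
        (minx, miny),  # repeat the first vertex to close the ring
    ]


ring = bbox_to_ring([2.0, 48.0, 3.5, 49.25])
```

The helper accepts any four-element sequence, so an array returned by `to_bbox()` could be passed directly.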

to_geopandas()[source]

Converts the input geometry to a GeoDataFrame.

Returns:

The input geometry as a GeoDataFrame.

Return type:

GeoDataFrame

to_intersects(crs='EPSG:4326')[source]

Converts the geometry to GeoJSON intersects format.

Parameters:

crs (str, optional) – The coordinate reference system (CRS) to convert to (default is EPSG:4326).

Returns:

The geometry in GeoJSON intersects format.

Return type:

dict

to_json(crs='EPSG:4326')[source]

Converts the geometry to GeoJSON format.

Parameters:

crs (str, optional) – The CRS to convert to (default is EPSG:4326).

Returns:

The geometry in GeoJSON format.

Return type:

dict

to_wkt(crs='EPSG:4326')[source]

Converts the geometry to WKT format.

Parameters:

crs (str, optional) – The CRS to convert to (default is EPSG:4326).

Returns:

The geometry in WKT format. If there is only one geometry, a single WKT string is returned; otherwise, a list of WKT strings.

Return type:

str or list of str
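Because the return type depends on the number of geometries, calling code may want to normalize the result before iterating over it. A small sketch, with `as_wkt_list` as a hypothetical helper name:

```python
def as_wkt_list(wkt_result):
    """Normalize a `str or list of str` result to a list of WKT strings."""
    if isinstance(wkt_result, str):
        # A single geometry comes back as one bare string.
        return [wkt_result]
    return list(wkt_result)


single = as_wkt_list("POINT (0 0)")
several = as_wkt_list(["POINT (0 0)", "POINT (1 1)"])
```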

earthdaily.earthdatastore.cube_utils.zonal_stats(dataset, geometries, method='numpy', lazy_load=True, max_memory_mb=None, reducers=['mean'], all_touched=True, preserve_columns=True, buffer_meters=None, **kwargs)[source]

Calculate zonal statistics for xarray Dataset based on geometric boundaries.

This function computes statistical summaries of Dataset values within each geometry’s zone, supporting parallel processing through xarray’s apply_ufunc and multiple computation methods.

Parameters:
  • dataset (xarray.Dataset) – Input dataset containing variables for statistics computation.

  • geometries (Union[geopandas.GeoDataFrame, geopandas.GeoSeries]) – Geometries defining the zones for statistics calculation.

  • method (str, optional) –

    Method for computation. Options:
    • ‘numpy’: Uses numpy functions with parallel processing

    • ‘xvec’: Uses xvec library (must be installed)

    Default is ‘numpy’.

  • lazy_load (bool, optional) – If True, reduces memory usage by loading the data in chunks when the ‘numpy’ method is used. Default is True.

  • max_memory_mb (float, optional) – Maximum memory to use in megabytes. If None, uses maximum available memory. Default is None.

  • reducers (list[str], optional) – List of statistical operations to perform. Functions should be numpy nan-functions (e.g., ‘mean’ uses np.nanmean). Default is [‘mean’].

  • all_touched (bool, optional) – If True, includes all pixels touched by geometries in computation. Default is True.

  • preserve_columns (bool, optional) – If True, preserves all columns from input geometries in output. Default is True.

  • buffer_meters (Union[int, float, None], optional) – Buffer distance in meters to apply to geometries before computation. Default is None.

  • **kwargs (dict) – Additional keyword arguments passed to underlying computation functions.
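The reducer naming convention described above ('mean' uses np.nanmean) can be sketched as a lookup on the numpy namespace. This is an assumption about how such a mapping might be written, not the package's code; `resolve_reducers` is a hypothetical name.

```python
import numpy as np


def resolve_reducers(reducers):
    """Map reducer names to numpy nan-functions, e.g. 'mean' -> np.nanmean."""
    resolved = {}
    for name in reducers:
        func = getattr(np, f"nan{name}", None)
        if func is None:
            raise ValueError(f"Unknown reducer: {name!r}")
        resolved[name] = func
    return resolved


values = np.array([1.0, np.nan, 3.0])
funcs = resolve_reducers(["mean", "max"])
# NaN pixels are ignored by the nan-functions.
summary = {name: float(func(values)) for name, func in funcs.items()}
```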

Returns:

Dataset containing computed statistics with dimensions:
  • time (if present in input)

  • feature (number of geometries)

  • zonal_statistics (number of reducers)

Additional coordinates include geometry WKT and preserved columns if requested.

Return type:

xarray.Dataset
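The output layout (time, feature, zonal_statistics) can be illustrated with plain numpy arrays. The sketch below assumes the zones have already been rasterized to boolean masks and is not the package's implementation; `zonal_reduce` is a hypothetical name.

```python
import numpy as np


def zonal_reduce(data, zone_masks, reducers=("mean",)):
    """Reduce a (time, y, x) array inside each boolean (y, x) zone mask.

    Returns an array of shape (time, feature, zonal_statistics).
    """
    out = np.full((data.shape[0], len(zone_masks), len(reducers)), np.nan)
    for f, mask in enumerate(zone_masks):
        pixels = data[:, mask]  # shape (time, n_pixels_in_zone)
        for s, name in enumerate(reducers):
            out[:, f, s] = getattr(np, f"nan{name}")(pixels, axis=1)
    return out


data = np.arange(2 * 2 * 3, dtype=float).reshape(2, 2, 3)  # 2 dates, 2x3 grid
left = np.array([[True, False, False], [True, False, False]])
right = ~left
stats = zonal_reduce(data, [left, right], reducers=("mean", "max"))
```

Here `stats[t, f, s]` holds reducer `s` for zone `f` at time step `t`.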

See also

xarray.apply_ufunc

Function used for parallel computation

rasterio.features

Used for geometry rasterization

Notes

Memory usage is optimized for time series data when lazy_load=True by processing in chunks determined by available system memory.

The ‘xvec’ method requires the xvec package to be installed separately.
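The page does not describe how chunk sizes are derived from the memory limit. One plausible scheme, shown purely as an assumption, is to count how many time steps fit in the budget and slice the time axis accordingly. Both helper names are hypothetical.

```python
def time_steps_per_chunk(n_time, height, width, itemsize, max_memory_mb):
    """Number of time steps whose pixels fit in the memory budget (at least 1)."""
    bytes_per_step = height * width * itemsize
    budget_bytes = max_memory_mb * 1024 * 1024
    return max(1, min(n_time, int(budget_bytes // bytes_per_step)))


def chunk_slices(n_time, steps):
    """Consecutive slices covering range(n_time) in blocks of `steps`."""
    return [
        slice(start, min(start + steps, n_time))
        for start in range(0, n_time, steps)
    ]


# A 1024 x 1024 float32 grid takes 4 MiB per time step, so 16 MiB holds 4 steps.
steps = time_steps_per_chunk(
    n_time=10, height=1024, width=1024, itemsize=4, max_memory_mb=16
)
slices = chunk_slices(10, steps)
```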

Examples

>>> import xarray as xr
>>> import geopandas as gpd
>>> from earthdaily.earthdatastore.cube_utils import zonal_stats
>>> dataset = xr.open_dataset("temperature.nc")
>>> polygons = gpd.read_file("zones.geojson")
>>> stats = zonal_stats(
...     dataset,
...     polygons,
...     reducers=["mean", "max"],
...     lazy_load=True
... )

Raises:
  • ImportError – If ‘xvec’ method is selected but xvec package is not installed.

  • ValueError – If invalid method or reducer is specified.

  • DeprecationWarning – If deprecated parameters are used.
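To avoid the ImportError case, a caller can check for xvec before choosing a method. A minimal sketch using only the standard library; `pick_method` is a hypothetical helper and the fallback policy is an assumption.

```python
import importlib.util


def pick_method(preferred="xvec"):
    """Return `preferred` when it is usable, otherwise fall back to 'numpy'."""
    if preferred not in ("numpy", "xvec"):
        raise ValueError(f"Invalid method: {preferred!r}")
    if preferred == "xvec" and importlib.util.find_spec("xvec") is None:
        return "numpy"  # xvec is not installed
    return preferred
```

The result could then be passed on, for example as `zonal_stats(dataset, geometries, method=pick_method())`.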