perf: Keep data in minimal form (no pre-broadcasting) #575

Conversation
1. Added two helper functions in io.py:
   - _reduce_constant_dims(): reduces dimensions where all values are constant
   - _expand_reduced_dims(): restores the original dimensions in the correct order
2. Added a reduce_constants parameter to FlowSystem.to_dataset():
   - Default: False (opt-in)
   - When True: reduces constant dimensions for memory efficiency
3. Updated FlowSystem.from_dataset():
   - Expands both collapsed arrays (from NetCDF) and reduced dimensions
4. Key fixes:
   - Store the original dimension order in attrs to preserve (cluster, time) vs. (time, cluster) ordering
   - Skip solution variables during reduction (they are prefixed with solution|)

The optimization is opt-in via to_dataset(reduce_constants=True). For file storage, save_dataset_to_netcdf still collapses constants to scalars by default.

Benchmarks show a 99%+ reduction in memory and file size for multi-period models, with faster IO. The optimization is now enabled by default.

Co-Authored-By: Claude Opus 4.5 <[email protected]>
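For illustration, the constant-dimension reduction idea looks roughly like this (a minimal sketch under assumed names, not the exact flixopt code; it also folds in the `.item()` fix requested later in this review):

```python
# Hedged sketch: drop every dimension along which all values are identical.
import numpy as np
import xarray as xr

def reduce_constant_dims(da: xr.DataArray) -> xr.DataArray:
    """Return ``da`` with dimensions removed where values are constant."""
    reduced = da
    for dim in list(da.dims):
        if dim not in reduced.dims:
            continue  # already removed
        first_slice = reduced.isel({dim: 0}, drop=True)
        # .all() yields a 0-D DataArray; .item() converts it to a Python bool.
        if bool((reduced == first_slice).all().item()):
            reduced = first_slice
    return reduced

da = xr.DataArray(np.ones((4, 3)), dims=('time', 'scenario'))
print(reduce_constant_dims(da).dims)  # () -- constant along both dims
```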
📝 Walkthrough
This PR introduces dimension-aware utilities for safe xarray operations and refactors data conversion to validate dimensions rather than broadcast them, shifting broadcasting responsibility from DataConverter.to_dataarray to FlowSystemModel.add_variables via new helper functions.
Sequence Diagram(s)
```mermaid
sequenceDiagram
    participant User
    participant DataConverter
    participant Validator as _validate_dataarray_dims
    participant FlowSystemModel
    participant Broadcaster as _ensure_coords

    Note over User,Broadcaster: OLD FLOW: Broadcast in to_dataarray
    User->>DataConverter: to_dataarray(data, target_coords)
    DataConverter->>DataConverter: _broadcast_dataarray_to_target_specification
    DataConverter-->>User: Broadcasted array

    Note over User,Broadcaster: NEW FLOW: Validate, then defer broadcasting
    User->>DataConverter: to_dataarray(data, target_coords)
    DataConverter->>Validator: _validate_dataarray_dims(data, coords, dims)
    Validator->>Validator: Check dims subset, validate coords, transpose
    Validator-->>DataConverter: Validated (minimal dims) array
    DataConverter-->>User: Validated array
    User->>FlowSystemModel: add_variables(lower, upper, coords)
    FlowSystemModel->>Broadcaster: _ensure_coords(bounds, coords)
    Broadcaster->>Broadcaster: Broadcast scalars/lower-dims to target
    Broadcaster-->>FlowSystemModel: Broadcasted bounds
    FlowSystemModel-->>User: linopy.Variable
```
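In code, the split might look roughly like this (a hedged sketch: the helper names come from the walkthrough above, but these bodies are assumptions, not the flixopt implementation):

```python
import numpy as np
import xarray as xr

def validate_dataarray_dims(da: xr.DataArray, target_dims: tuple[str, ...]) -> xr.DataArray:
    """Check dims are a subset of the model dims and reorder; no broadcasting."""
    extra = set(da.dims) - set(target_dims)
    if extra:
        raise ValueError(f'Unexpected dims: {extra}')
    # Transpose to canonical order, keeping only the dims the array actually has.
    return da.transpose(*[d for d in target_dims if d in da.dims])

def ensure_coords(data, coords: dict) -> xr.DataArray:
    """Broadcast minimal-form data to the full variable shape at the linopy interface."""
    template = xr.DataArray(coords=coords, dims=list(coords))
    if not isinstance(data, xr.DataArray):
        return xr.full_like(template, data, dtype=float)
    return data.broadcast_like(template)

coords = {'time': np.arange(4), 'scenario': ['a', 'b']}
minimal = validate_dataarray_dims(xr.DataArray(np.arange(4.0), dims='time'), ('time', 'scenario'))
print(minimal.dims)                         # ('time',) -- stays 1D after conversion
print(ensure_coords(minimal, coords).dims)  # ('time', 'scenario') -- broadcast only here
```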
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 3 passed
* Add safe isel wrappers for dimension-independent operations
  - Add _scalar_safe_isel_drop() to modeling.py for selecting from potentially reduced arrays (handles both scalar and array cases)
  - Update Storage validation to use _scalar_safe_isel for relative_minimum/maximum_charge_state access at time=0
  - Update StorageModel._relative_charge_state_bounds to handle reduced arrays properly with dimension checks
  These wrappers prepare the codebase for a future optimization where _expand_reduced_dims() could be removed from from_dataset(), keeping data compact throughout the system.

* Remove pre-broadcasting from DataConverter - data stays minimal
  Major change: DataConverter.to_dataarray() now validates dimensions instead of broadcasting to full target dimensions. This keeps data compact (scalars stay scalar, 1D arrays stay 1D) and lets linopy handle broadcasting at variable creation time.
  Changes:
  - core.py: Replace _broadcast_dataarray_to_target_specification with _validate_dataarray_dims in the to_dataarray() method
  - components.py: Fix _relative_charge_state_bounds to handle scalar inputs that have no time dimension (expand to timesteps_extra)
  - conftest.py: Add an assert_dims_compatible helper for tests
  - test_*.py: Update dimension assertions to use subset checking
  Memory impact (8760 timesteps × 20 periods):
  - Before: 24.1 MB dataset size
  - After: 137.1 KB dataset size

* Remove redundant dimension reduction - data is minimal from the start
  Since DataConverter no longer broadcasts, these optimizations are redundant:
  - Remove _reduce_constant_dims() and _expand_reduced_dims() from io.py
  - Remove the reduce_constants parameter from to_dataset()
  - Remove the _expand_reduced_dims() call from from_dataset()
  Also ensure the Dataset always has model coordinates (time, period, scenario, cluster) even if no data variable uses them. This is important for Dataset validity and downstream operations.
  Update benchmark_io.py to reflect the simplified API.
  Memory footprint now:
  - 24 timesteps: 568 bytes (23/24 vars are scalar)
  - 168 timesteps: 2.8 KB
  - 8760 timesteps × 20 periods: ~137 KB (vs. 24 MB with the old broadcasting)

* Remove the _collapse_constant_arrays / _expand_collapsed_arrays optimization
  Files modified:
  1. flixopt/io.py:
     - Removed the COLLAPSED_VAR_PREFIX constant
     - Removed _collapse_constant_arrays() (~45 lines) and _expand_collapsed_arrays() (~40 lines)
     - Removed the collapse_constants parameter from save_dataset_to_netcdf()
     - Removed the expansion logic from load_dataset_from_netcdf()
  2. flixopt/flow_system.py: Removed the import and call of _expand_collapsed_arrays in from_dataset()
  3. benchmark_io.py: Simplified benchmark_netcdf() to benchmark a single save/load (no more comparison); removed the collapse_constants parameter from the roundtrip function
  Rationale: Since data is now kept in minimal form throughout the system (scalars stay scalars, 1D arrays stay 1D), there is no need for an extra collapsing/expanding layer when saving to NetCDF. File sizes are naturally small because the data isn't expanded to full dimensions.
  Test results: all 10 IO tests pass, all 4 integration tests pass, and the benchmark runs successfully.
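A hedged sketch of the scalar-safe isel wrapper introduced in the first commit above (the real helper lives in modeling.py; this signature is an assumption):

```python
import xarray as xr

def scalar_safe_isel_drop(da: xr.DataArray, indexers: dict[str, int]) -> xr.DataArray:
    """Select along dims that exist, returning reduced/scalar arrays unchanged."""
    present = {dim: idx for dim, idx in indexers.items() if dim in da.dims}
    if not present:
        return da  # scalar or already-reduced array: nothing to select
    return da.isel(present, drop=True)

full = xr.DataArray([[0.1, 0.2], [0.3, 0.4]], dims=('time', 'scenario'))
scalar = xr.DataArray(0.5)  # minimal form: no dims at all
print(scalar_safe_isel_drop(full, {'time': 0}).values)    # [0.1 0.2]
print(scalar_safe_isel_drop(scalar, {'time': 0}).values)  # 0.5 (unchanged)
```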
* Summary of fixes:
  1. _dataset_resample handling all-scalar data (transform_accessor.py)
     When all data variables are scalars (no time dimension), the resample function now:
     - Creates a dummy variable to resample the time coordinate
     - Preserves all original scalar data variables
     - Preserves all non-time coordinates (period, scenario, cluster)
  2. _scalar_safe_reduce helper (modeling.py)
     Added a new helper function that safely performs aggregation operations (mean/sum/etc.) over a dimension:
     - Returns reduced data if the dimension exists
     - Returns the data unchanged if the dimension doesn't exist (scalar data)
  3. Updated Storage intercluster linking (components.py)
     Used _scalar_safe_reduce for:
     - relative_loss_per_hour.mean('time')
     - timestep_duration.mean('time')
  4. Updated clustering expansion (transform_accessor.py)
     Used _scalar_safe_reduce for relative_loss_per_hour.mean('time')
  5. Fixed dimension order in expand_data (clustering/base.py)
     When expanding clustered data back to the original timesteps:
     - Added a transpose to ensure (cluster, time) dimension order before numpy indexing
     - Fixes an IndexError when dimensions were in a different order

* Added reduction on loading old datasets

* Summary of changes made:
  1. flixopt/structure.py: Added a dimension transpose in _ensure_coords() to ensure correct dimension order when data already has all dims but in the wrong order
  2. tests/test_io_conversion.py: Renamed test_returns_same_object → test_returns_equivalent_dataset, since convert_old_dataset now creates a new dataset when reducing constants
  3. tests/deprecated/test_flow.py: Updated 6 assertions to expect minimal dimensions ('time',) instead of broadcasting to all model coords
  4. tests/deprecated/test_component.py: Updated 2 assertions to expect minimal dimensions ('time',)
  5. tests/deprecated/test_linear_converter.py: Updated 1 assertion to expect minimal dimensions ('time',)

The key change is that data now stays in minimal form: a 1D time-varying array keeps ('time',) dimensions rather than being pre-broadcast to ('time', 'scenario') or other full model dimensions. Broadcasting happens at the linopy interface layer in FlowSystemModel.add_variables() when needed.

Co-authored-by: Claude Opus 4.5 <[email protected]>
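The _scalar_safe_reduce helper described in point 2 might look roughly like this (hedged sketch; names and signature assumed):

```python
import xarray as xr

def scalar_safe_reduce(da: xr.DataArray, dim: str, op: str = 'mean') -> xr.DataArray:
    """Aggregate over ``dim`` if present; return ``da`` unchanged otherwise."""
    if dim not in da.dims:
        return da  # scalar or already-reduced data: nothing to aggregate
    return getattr(da, op)(dim)

losses = xr.DataArray([0.01, 0.02, 0.03], dims='time')
constant_loss = xr.DataArray(0.01)  # minimal form: stays scalar

print(scalar_safe_reduce(losses, 'time').values)         # ~0.02 (mean over time)
print(scalar_safe_reduce(constant_loss, 'time').values)  # 0.01 (returned as-is)
```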
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@flixopt/io.py`:
- Around line 602-613: The conditional currently uses the xarray DataArray
scalar returned by (reduced == first_slice).all() directly in an if, which
raises when used as a truth value; change the check in the loop over dim
(working with reduced and first_slice) to convert that 0-D boolean DataArray to
a Python bool (e.g., call .item() or wrap with bool(...)) so the if uses a
proper boolean before removing the dimension.
In `@flixopt/structure.py`:
- Around lines 44-78: In _ensure_coords, detect a 0-D DataArray
containing infinity and return it as a plain scalar so it isn't broadcast;
specifically, inside the function (before creating template and broadcasting)
add a guard: if isinstance(data, xr.DataArray) and data.ndim == 0 and
np.isinf(data.item()): return data.item(). This preserves the existing behavior
of keeping infinities scalar for linopy while avoiding broadcasting of 0-D
DataArray(np.inf).
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (18)
- flixopt/clustering/base.py
- flixopt/components.py
- flixopt/core.py
- flixopt/flow_system.py
- flixopt/io.py
- flixopt/modeling.py
- flixopt/structure.py
- flixopt/transform_accessor.py
- tests/conftest.py
- tests/deprecated/test_component.py
- tests/deprecated/test_dataconverter.py
- tests/deprecated/test_flow.py
- tests/deprecated/test_linear_converter.py
- tests/test_component.py
- tests/test_dataconverter.py
- tests/test_flow.py
- tests/test_io_conversion.py
- tests/test_linear_converter.py
💤 Files with no reviewable changes (1)
- tests/deprecated/test_dataconverter.py
🧰 Additional context used
🧬 Code graph analysis (12)
tests/test_component.py (2)
  tests/conftest.py (1): assert_dims_compatible (832-848)
  flixopt/structure.py (2): get_coords (350-385), get_coords (1721-1726)
flixopt/flow_system.py (2)
  tests/test_comparison.py (1): timesteps (23-24)
  flixopt/clustering/base.py (1): clusters (810-946)
tests/deprecated/test_flow.py (2)
  flixopt/flow_system.py (1): dims (2018-2038)
  flixopt/structure.py (1): dims (285-287)
flixopt/structure.py (1)
  flixopt/flow_system.py (2): coords (2083-2101), dims (2018-2038)
tests/test_flow.py (2)
  tests/conftest.py (1): assert_dims_compatible (832-848)
  flixopt/structure.py (2): get_coords (350-385), get_coords (1721-1726)
tests/test_io_conversion.py (1)
  flixopt/io.py (1): convert_old_dataset (784-838)
flixopt/components.py (1)
  flixopt/modeling.py (3): _scalar_safe_isel (14-32), _scalar_safe_isel_drop (35-54), _scalar_safe_reduce (57-76)
tests/test_linear_converter.py (2)
  tests/conftest.py (1): assert_dims_compatible (832-848)
  flixopt/structure.py (2): get_coords (350-385), get_coords (1721-1726)
flixopt/clustering/base.py (2)
  flixopt/flow_system.py (1): dims (2018-2038)
  flixopt/structure.py (1): dims (285-287)
tests/deprecated/test_component.py (2)
  flixopt/flow_system.py (1): dims (2018-2038)
  flixopt/structure.py (1): dims (285-287)
tests/conftest.py (2)
  flixopt/flow_system.py (1): dims (2018-2038)
  flixopt/structure.py (1): dims (285-287)
flixopt/transform_accessor.py (1)
  flixopt/modeling.py (1): _scalar_safe_reduce (57-76)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: test (3.13)
- GitHub Check: test (3.12)
- GitHub Check: test (3.14)
- GitHub Check: test (3.11)
🔇 Additional comments (28)
flixopt/clustering/base.py (1)
  442-444: Good safeguard for dimension ordering before indexing. Ensuring ('cluster', 'time') order avoids mismatched numpy indexing when upstream data arrives as ('time', 'cluster').

tests/deprecated/test_linear_converter.py (1)
  285-286: Keeps the time-varying factor minimal. This aligns with the new “no pre-broadcasting” behavior.

flixopt/io.py (2)
  551-552: No actionable change in this line.
  784-838: Conversion flow looks good. The opt-in reduce_constants flag keeps backward compatibility while enabling automatic reduction.

flixopt/modeling.py (1)
  14-76: Scalar-safe helpers look solid. They preserve compact inputs while keeping xarray behavior consistent for callers.

flixopt/transform_accessor.py (4)
  19-19: No actionable change here.
  371-389: Scalar-only resample handling is robust. Creating a dummy variable to drive time resampling keeps metadata consistent.
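An illustrative sketch of that dummy-variable trick for resampling an all-scalar dataset (the variable names and rebuild steps here are assumptions, not the flixopt code):

```python
import numpy as np
import pandas as pd
import xarray as xr

# All data variables are scalar; only the time *coordinate* carries the time dim.
ds = xr.Dataset({'eta': 0.9}, coords={'time': pd.date_range('2024-01-01', periods=8, freq='h')})

# A dummy time-indexed variable gives .resample() something to operate on.
dummy = xr.DataArray(np.zeros(ds.sizes['time']), dims='time', coords={'time': ds['time']})
new_time = dummy.resample(time='4h').mean()['time']

# Rebuild: keep the scalar variables, swap in the resampled time coordinate.
resampled = ds.drop_dims('time').assign_coords(time=new_time)
print(resampled['eta'].values, resampled.sizes['time'])  # 0.9 2
```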
  424-429: Good preservation of non-time coords. This prevents coordinate loss when merging resampled and non-time variables.
  1378-1379: Nice use of scalar-safe reduction. Prevents failures when loss data is reduced or scalar.

flixopt/components.py (5)
  19-19: No actionable change here.
  573-578: Scalar-safe checks are appropriate here. This keeps plausibility checks working with reduced-dimensional parameters.
  1107-1147: Bounds expansion logic looks correct. Handles scalar vs. time-indexed bounds while preserving final-step semantics.
  1507-1509: Scalar-safe decay computation is a good fit. Maintains correctness when loss rates are reduced or constant.
  1552-1554: Consistent scalar-safe reduction. Keeps decay and timestep computations stable with minimal-form inputs.

tests/test_linear_converter.py (1)
  7-7: Dims-compatibility assertion is a good fit here. Keeps the test aligned with the minimal-dims strategy while still validating model compatibility. Also applies to: 285-286

flixopt/flow_system.py (1)
  694-704: Ensuring model coordinates are always present is solid. This keeps dataset structure explicit even when no variable carries a given dimension.

tests/deprecated/test_component.py (2)
  134-135: Time-only dims expectation is clear and consistent. Matches the new “minimal form” policy for upper bound arrays.
  315-316: The same minimal-dims assertion is appropriate here. Keeps the test aligned with scalar/1D preservation.

tests/conftest.py (1)
  832-848: Centralized dims-compatibility helper looks good. Helps keep tests aligned with the minimal-dims behavior.
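A plausible shape for that helper (hedged sketch; the real one is in tests/conftest.py at lines 832-848, and this signature is an assumption):

```python
import xarray as xr

def assert_dims_compatible(da: xr.DataArray, model_dims: tuple[str, ...]) -> None:
    """Assert an array's dims are a subset of the model's dims.

    Under the minimal-form policy, data may carry fewer dims than the model,
    but never dims the model doesn't know about.
    """
    extra = set(da.dims) - set(model_dims)
    assert not extra, f'Array has dims {extra} not in model dims {model_dims}'

# A 1D time series is compatible with a (time, scenario) model.
assert_dims_compatible(xr.DataArray([1.0, 2.0], dims='time'), ('time', 'scenario'))
```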
tests/test_component.py (2)
  10-10: Updated dims-compatibility check is the right adjustment. Keeps the test robust under minimal-dims storage. Also applies to: 135-136
  315-316: Consistent dims-compatibility check here as well. Nice alignment with the new minimal-form behavior.

tests/deprecated/test_flow.py (1)
  72-74: Minimal-dim assertions look correct. These checks align with the new “keep data minimal” behavior in deprecated tests. Also applies to: 186-188, 252-254, 328-330, 397-399, 637-639

tests/test_flow.py (1)
  7-7: Good use of assert_dims_compatible for minimal-form inputs. Keeps tests resilient while enforcing compatibility with model coordinates. Also applies to: 72-73, 185-186, 250-251, 325-326, 393-394, 632-633

tests/test_dataconverter.py (2)
  49-54: Minimal-form assertions are solid and consistent. These updates comprehensively validate the “no broadcast at conversion time” behavior across scalar/1D/2D/Series/DataFrame/TimeSeriesData and edge cases. Also applies to: 57-62, 65-70, 73-80, 111-118, 120-125, 139-147, 191-197, 206-213, 250-257, 285-293, 440-448, 450-456, 458-472, 489-499, 532-537, 539-550, 552-563, 564-578, 608-613, 712-722, 724-734, 757-767, 849-862, 864-875, 877-894, 919-946, 980-985, 1071-1075, 1113-1117, 1207-1212
  275-283: Canonical-order transpose expectations are well captured. The multidimensional permutation tests clearly enforce canonical dimension ordering. Also applies to: 306-313
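For reference, canonical reordering without broadcasting is a one-liner in xarray (illustrative snippet, not flixopt code):

```python
import numpy as np
import xarray as xr

canonical = ('time', 'scenario', 'cluster')
da = xr.DataArray(np.zeros((2, 3)), dims=('scenario', 'time'))  # wrong order

# Keep only the dims the array actually has, in canonical order.
da = da.transpose(*[d for d in canonical if d in da.dims])
print(da.dims)  # ('time', 'scenario')
```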
flixopt/core.py (1)
  369-416: Dimension validation plus canonical reordering is a good fit here. This cleanly enforces compatibility without eager broadcasting. Also applies to: 532-534

flixopt/structure.py (1)
  139-155: Bounds broadcasting at the linopy interface looks good. This preserves compact internal data while ensuring variable shapes are correctly inferred.

tests/test_io_conversion.py (1)
  207-212: LGTM — equivalence check matches the conversion contract. The attrs-focused assertion aligns with the updated conversion behavior and avoids relying on object identity.
```python
# Try to reduce each dimension
reduced = da
for dim in list(da.dims):
    if dim not in reduced.dims:
        continue  # Already removed
    # Check if constant along this dimension
    first_slice = reduced.isel({dim: 0})
    is_constant = (reduced == first_slice).all()
    if is_constant:
        # Remove this dimension by taking first slice
        reduced = first_slice
```
🧩 Analysis chain

🏁 Script executed:
```shell
# First, locate and read the relevant file
find . -name "io.py" -path "*/flixopt/*" | head -5
```
Repository: flixOpt/flixopt
Length of output: 74

🏁 Script executed:
```shell
# Read the specific lines mentioned in the review
sed -n '600,615p' flixopt/io.py
```
Repository: flixOpt/flixopt
Length of output: 585

🏁 Script executed:
```shell
# Let's also check the context around this code
sed -n '595,620p' flixopt/io.py
```
Repository: flixOpt/flixopt
Length of output: 879
🏁 Script executed:
```shell
python3 << 'EOF'
import xarray as xr
import numpy as np

# Create test data
data = xr.DataArray(np.array([[1, 1], [1, 1]]), dims=['x', 'y'])
print("Original data:")
print(data)
print()

# Test 1: Check what .all() returns
comparison = (data == data).all()
print("Result of (data == data).all():")
print(comparison)
print(f"Type: {type(comparison)}")
print(f"Is it a scalar DataArray? {isinstance(comparison, xr.DataArray)}")
print()

# Test 2: Try using scalar DataArray in if statement
try:
    if comparison:
        print("Scalar DataArray in if: SUCCESS")
except ValueError as e:
    print(f"Scalar DataArray in if: ValueError - {e}")
except Exception as e:
    print(f"Scalar DataArray in if: {type(e).__name__} - {e}")
print()

# Test 3: Check isel with and without drop
print("Testing isel behavior:")
data_2d = xr.DataArray(np.array([[1, 2], [3, 4]]), dims=['x', 'y'])
print("Original shape:", data_2d.shape, "dims:", data_2d.dims)
isel_without_drop = data_2d.isel({'x': 0})
print("After isel({'x': 0}):")
print(f"  Shape: {isel_without_drop.shape}, Dims: {isel_without_drop.dims}")
isel_with_drop = data_2d.isel({'x': 0}, drop=True)
print("After isel({'x': 0}, drop=True):")
print(f"  Shape: {isel_with_drop.shape}, Dims: {isel_with_drop.dims}")
print()

# Test 4: Check if first_slice from isel can be assigned back
print("Testing assignment:")
data_reduced = data_2d
first_slice_no_drop = data_2d.isel({'x': 0})
first_slice_drop = data_2d.isel({'x': 0}, drop=True)
print("Original shape:", data_reduced.shape)
print("first_slice (no drop) shape:", first_slice_no_drop.shape)
print("first_slice (drop=True) shape:", first_slice_drop.shape)
EOF
```
Repository: flixOpt/flixopt
Length of output: 145
🏁 Script executed:
```shell
python3 << 'EOF'
import xarray as xr
import numpy as np

# Quick test for scalar DataArray in if
da = xr.DataArray(True)
try:
    if da:
        pass
    print("No error with scalar in if")
except ValueError as e:
    print(f"ValueError: {e}")
EOF
```
Repository: flixOpt/flixopt
Length of output: 177
🏁 Script executed:
```shell
python3 << 'EOF'
import xarray as xr
import numpy as np

# Test isel with/without drop
da = xr.DataArray([[1, 2], [3, 4]], dims=['x', 'y'])
s1 = da.isel({'x': 0})
s2 = da.isel({'x': 0}, drop=True)
print("No drop - dims:", s1.dims, "shape:", s1.shape)
print("With drop - dims:", s2.dims, "shape:", s2.shape)
EOF
```
Repository: flixOpt/flixopt
Length of output: 177
🌐 Web query:
xarray scalar DataArray if statement ValueError
💡 Result:
When you use an if on an xarray DataArray that contains more than one element Python can't reduce it to a single True/False — you get the familiar ValueError about ambiguous truth. Use one of these fixes:
- If it really is a scalar (0‑D) DataArray, convert to a Python scalar first: val = da.item() or val = da.values.item(); then use if val: ...
- If it may be multi‑element and you want to test "any" or "all", reduce explicitly: if da.any(): ... or if da.all(): ...
- If you want to act where a condition holds, use boolean masking or xarray methods (e.g. da.where(cond) or da.sel/isel with the condition).
Examples:
- Scalar: if da.item() == 0: ...
- Multi: if (da > 0).any(): ...
- Masking: da2 = da.where(da > 0)
References: general ambiguity and fixes (NumPy/arrays). [1]; xarray where/boolean masking docs. [2]; note on single‑element behavior. [3].
🌐 Web query:
xarray isel drop=True parameter behavior dimension
💡 Result:
Short answer: isel's drop parameter only affects coordinate variables, not the data-array dimension removal semantics. When you index a dimension with an integer, xarray removes that dimension from the returned DataArray/Dataset; if drop=True, any coordinate variables that were indexed by that integer are dropped instead of being turned into 0‑d (scalar) coordinate variables. See the xarray docs. [1][2]
Sources:
- xarray.DataArray.isel documentation. [1]
- xarray.Dataset.isel documentation. [2]
Fix boolean scalar handling in the conditional check.
(reduced == first_slice).all() returns a 0-D scalar DataArray, which cannot be used directly in an if statement. Convert it to a Python boolean using .item().

```diff
             first_slice = reduced.isel({dim: 0})
-            is_constant = (reduced == first_slice).all()
+            is_constant = bool((reduced == first_slice).all().item())
             if is_constant:
                 reduced = first_slice
```

📝 Committable suggestion
‼️ IMPORTANT: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
Suggested change (full block):
```python
# Try to reduce each dimension
reduced = da
for dim in list(da.dims):
    if dim not in reduced.dims:
        continue  # Already removed
    # Check if constant along this dimension
    first_slice = reduced.isel({dim: 0})
    is_constant = bool((reduced == first_slice).all().item())
    if is_constant:
        # Remove this dimension by taking first slice
        reduced = first_slice
```
```python
def _ensure_coords(
    data: xr.DataArray | float | int,
    coords: xr.Coordinates | dict,
) -> xr.DataArray | float:
    """Broadcast data to coords if needed.

    This is used at the linopy interface to ensure bounds are properly broadcasted
    to the target variable shape. Linopy needs at least one bound to have all
    dimensions to determine the variable shape.

    Note: Infinity values (-inf, inf) are kept as scalars because linopy uses
    special checks like `if (lower != -inf)` that fail with DataArrays.
    """
    # Handle both dict and xr.Coordinates
    if isinstance(coords, dict):
        coord_dims = list(coords.keys())
    else:
        coord_dims = list(coords.dims)

    # Keep infinity values as scalars (linopy uses them for special checks)
    if not isinstance(data, xr.DataArray):
        if np.isinf(data):
            return data
        # Finite scalar - create full DataArray
        return xr.DataArray(data, coords=coords, dims=coord_dims)

    if set(data.dims) == set(coord_dims):
        # Has all dims - ensure correct order
        if data.dims != tuple(coord_dims):
            return data.transpose(*coord_dims)
        return data

    # Broadcast to full coords (broadcast_like ensures correct dim order)
    template = xr.DataArray(coords=coords, dims=coord_dims)
    return data.broadcast_like(template)
```
Handle scalar DataArray ±inf before broadcasting.
The helper keeps Python/NumPy infinities scalar, but a 0‑D xr.DataArray(np.inf) (e.g., from DataConverter) will be broadcast and can trigger linopy’s lower != -inf checks. A small guard avoids that edge case.
🐛 Proposed fix
```diff
     if not isinstance(data, xr.DataArray):
         if np.isinf(data):
             return data
         # Finite scalar - create full DataArray
         return xr.DataArray(data, coords=coords, dims=coord_dims)
+
+    # Keep scalar DataArray infinities as scalars too
+    if data.ndim == 0 and np.isinf(data.item()):
+        return data.item()
```
+ return data.item()🤖 Prompt for AI Agents
In `@flixopt/structure.py` around lines 44 - 78, _hotfix: In _ensure_coords detect
a 0-D DataArray containing infinity and return it as a plain scalar so it isn't
broadcast; specifically, inside the function (before creating template and
broadcasting) add a guard: if isinstance(data, xr.DataArray) and data.ndim == 0
and np.isinf(data.item()): return data.item(). This preserves the existing
behavior of keeping infinities scalar for linopy while avoiding broadcasting of
0-D DataArray(np.inf).
Summary
Major optimization: Data now stays in minimal form throughout the system. Scalars stay scalar, 1D arrays stay 1D - no pre-broadcasting to full model dimensions. This dramatically reduces memory usage and file sizes.
Key Changes
Core: Minimal Data Form
- DataConverter.to_dataarray() now validates dimensions instead of broadcasting
- Scalars stay scalar (e.g., relative_minimum=0 stays scalar)
- Broadcasting happens in FlowSystemModel.add_variables() when needed

New Helper: _ensure_coords() in structure.py
New Helper: _scalar_safe_reduce() in modeling.py

Clustering Fixes
- expand_data(): Added a transpose to ensure (cluster, time) dimension order before numpy indexing
- _dataset_resample(): Handles all-scalar datasets by creating a dummy variable for time-coordinate resampling

Old Dataset Compatibility
- convert_old_dataset(): Now reduces constant arrays dimension-by-dimension

Benchmark Results
File sizes follow similar reductions. IO speed improved 25-63%.

Files Changed
- flixopt/core.py: _validate_dataarray_dims() replaces broadcasting and adds a transpose to canonical dimension order
- flixopt/structure.py: _ensure_coords() helper and FlowSystemModel.add_variables() override
- flixopt/modeling.py: _scalar_safe_reduce() helper
- flixopt/components.py: uses _scalar_safe_reduce for storage calculations
- flixopt/clustering/base.py: expand_data() dimension-order fix
- flixopt/transform_accessor.py: _dataset_resample() handling for all-scalar data
- flixopt/io.py: _reduce_constant_arrays() for old datasets; removed the collapse/expand layer

Type of Change
Testing
Checklist
Summary by CodeRabbit
Bug Fixes
Refactor