Conversation

@FBumann (Member) commented Jan 16, 2026

Summary

Major optimization: data now stays in minimal form throughout the system. Scalars stay scalar and 1D arrays stay 1D; nothing is pre-broadcast to full model dimensions. This dramatically reduces memory usage and file sizes.

Key Changes

Core: Minimal Data Form

  • DataConverter.to_dataarray() now validates dimensions instead of broadcasting
  • Data keeps its natural dimensionality (e.g., relative_minimum=0 stays scalar)
  • Broadcasting happens at the linopy interface via FlowSystemModel.add_variables() when needed
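To illustrate the idea with plain xarray (a hedged sketch; the coordinate names are illustrative, not flixopt's actual API):

```python
import numpy as np
import pandas as pd
import xarray as xr

time = pd.date_range("2026-01-01", periods=8760, freq="h")
periods = np.arange(20)

# Minimal form: a constant parameter stays a 0-D scalar DataArray and a
# time series stays 1D; neither is expanded to (time, period) up front.
relative_minimum = xr.DataArray(0.0)
profile = xr.DataArray(np.random.rand(len(time)), coords={"time": time}, dims=["time"])

# Broadcasting to the full model shape happens only where it is needed,
# e.g. when creating solver variables:
template = xr.DataArray(coords={"time": time, "period": periods}, dims=["time", "period"])
full = profile.broadcast_like(template)
print(relative_minimum.nbytes, profile.nbytes, full.nbytes)  # 8, 70080, 1401600
```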

New Helper: _ensure_coords() in structure.py

  • Broadcasts bounds to target coords only when creating linopy variables
  • Ensures correct dimension order via transpose when needed
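As a rough illustration of the transpose behavior (hypothetical data; the real helper is quoted in full in the review below):

```python
import numpy as np
import xarray as xr

# Data that already carries all target dims, but in the wrong order, is
# transposed rather than broadcast (mirrors the _ensure_coords logic).
da = xr.DataArray(
    np.zeros((2, 3)),
    coords={"scenario": ["a", "b"], "time": np.arange(3)},
    dims=["scenario", "time"],
)
target_dims = ("time", "scenario")
if da.dims != target_dims:
    da = da.transpose(*target_dims)
print(da.dims)  # ('time', 'scenario')
```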

New Helper: _scalar_safe_reduce() in modeling.py

  • Safely performs aggregations (mean/sum) over a dimension
  • Returns the data unchanged if the dimension doesn't exist (handles scalar data)
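A minimal sketch of the pattern (the actual helper in flixopt/modeling.py may differ in signature and supported operations):

```python
import xarray as xr

def scalar_safe_reduce(da: xr.DataArray, dim: str, op: str = "mean") -> xr.DataArray:
    """Apply `op` over `dim` if present; pass already-reduced data through."""
    if dim not in da.dims:
        return da  # scalar or reduced data: nothing to aggregate
    return getattr(da, op)(dim)

ts = xr.DataArray([1.0, 2.0, 3.0], dims=["time"])
const = xr.DataArray(2.0)
print(scalar_safe_reduce(ts, "time").item())     # 2.0
print(scalar_safe_reduce(const, "time").item())  # 2.0
```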

Clustering Fixes

  • expand_data(): Added transpose to ensure (cluster, time) dimension order before numpy indexing
  • _dataset_resample(): Handles all-scalar datasets by creating dummy variable for time coordinate resampling
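Why the transpose matters, as a hedged sketch: positional indexing on .values assumes a fixed axis layout, so (time, cluster) input would mis-index (or raise an IndexError) without it.

```python
import numpy as np
import xarray as xr

n_clusters, cluster_len = 3, 4
da = xr.DataArray(
    np.arange(n_clusters * cluster_len).reshape(cluster_len, n_clusters),
    dims=["time", "cluster"],  # arrives in the "wrong" order
)
cluster_order = np.array([2, 0, 1, 1])  # cluster assigned to each original segment

# Transpose first so axis 0 is 'cluster'; values[cluster_order] then picks
# one cluster profile per original segment.
values = da.transpose("cluster", "time").values  # shape (cluster, time)
expanded = values[cluster_order]                 # shape (len(order), time)
print(expanded.shape)  # (4, 4)
```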

Old Dataset Compatibility

  • convert_old_dataset(): Now reduces constant arrays dimension-by-dimension
  • Old files with pre-broadcasted data are automatically reduced on load
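The effect, sketched on synthetic data (the actual reduction loop is quoted in the review below):

```python
import numpy as np
import xarray as xr

# An old file may store a constant broadcast over (time, scenario);
# reducing dimension-by-dimension recovers the original scalar.
da = xr.DataArray(np.full((8760, 3), 5.0), dims=["time", "scenario"])

reduced = da
for dim in list(da.dims):
    first = reduced.isel({dim: 0}, drop=True)
    if bool((reduced == first).all().item()):  # constant along this dim
        reduced = first

print(reduced.dims, float(reduced))  # () 5.0
```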

Benchmark Results

| Configuration | Memory Before | Memory After | Reduction |
| --- | --- | --- | --- |
| 8760 timesteps (1 year) | 1.3 MB | 137 KB | 89% |
| 8760 × 20 periods | 24.1 MB | 137 KB | 99.4% |

File sizes show similar reductions, and IO speed improved by 25-63%.

Files Changed

| File | Changes |
| --- | --- |
| flixopt/core.py | _validate_dataarray_dims() replaces broadcasting; adds transpose to canonical order |
| flixopt/structure.py | _ensure_coords() helper; FlowSystemModel.add_variables() override |
| flixopt/modeling.py | _scalar_safe_reduce() helper |
| flixopt/components.py | Use _scalar_safe_reduce for storage calculations |
| flixopt/clustering/base.py | Fix dimension order in expand_data() |
| flixopt/transform_accessor.py | Fix _dataset_resample() for all-scalar data |
| flixopt/io.py | _reduce_constant_arrays() for old datasets; removed collapse/expand layer |

Type of Change

  • Performance optimization
  • Internal refactoring (backward compatible API)

Testing

  • All 1476 tests pass
  • Roundtrip tested (to_dataset → from_dataset preserves all data)
  • Old dataset compatibility tested

Checklist

  • Backward compatible (users see no API changes)
  • Old files load correctly and get reduced automatically

Summary by CodeRabbit

  • Bug Fixes

    • Fixed cluster-based data expansion to correctly handle dimension ordering
    • Improved scalar data handling for storage bounds and inter-cluster operations
    • Better reduction of constant broadcasted arrays in conversions
  • Refactor

    • Enhanced internal validation of data dimensions for improved consistency and robustness


FBumann and others added 4 commits January 15, 2026 20:04
    - _reduce_constant_dims(): Reduces dimensions where all values are constant
    - _expand_reduced_dims(): Restores original dimensions with correct order
  2. Added reduce_constants parameter to FlowSystem.to_dataset():
    - Default: False (opt-in)
    - When True: Reduces constant dimensions for memory efficiency
  3. Updated FlowSystem.from_dataset():
    - Expands both collapsed arrays (from NetCDF) and reduced dimensions
  4. Key fixes:
    - Store original dimension order in attrs to preserve (cluster, time) vs (time, cluster) ordering
    - Skip solution variables in reduction (they're prefixed with solution|)

  The optimization is opt-in via to_dataset(reduce_constants=True). For file storage, save_dataset_to_netcdf still collapses constants to scalars by default.
Benchmarks show 99%+ reduction in memory and file size for multi-period
models, with faster IO speeds. The optimization is now enabled by default.

Co-Authored-By: Claude Opus 4.5 <[email protected]>
coderabbitai bot (Contributor) commented Jan 16, 2026

📝 Walkthrough

This PR introduces dimension-aware utilities for safe xarray operations and refactors data conversion to validate dimensions instead of broadcasting eagerly, shifting broadcasting responsibility from DataConverter.to_dataarray to FlowSystemModel.add_variables via new helper functions.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Scalar-safe dimensional helpers: flixopt/modeling.py, flixopt/components.py, flixopt/transform_accessor.py | Introduces _scalar_safe_isel, _scalar_safe_isel_drop, and _scalar_safe_reduce utilities that safely apply dimensional operations only when dimensions exist; updates components storage and linked storage logic to use these helpers for relative bounds and loss reductions; applies safe reduction in SOC decay calculation. |
| Data validation and conversion: flixopt/core.py, flixopt/structure.py | Adds _validate_dataarray_dims method to replace broadcasting in DataConverter.to_dataarray with dimension validation; introduces _ensure_coords helper and a new FlowSystemModel.add_variables override to defer broadcasting until variable creation, ensuring minimal-dimensional scalar preservation. |
| Dataset and IO utilities: flixopt/flow_system.py, flixopt/io.py | Ensures model coordinates (time, period, scenario, cluster) are always present in Dataset output; adds _reduce_constant_arrays helper and reduce_constants parameter to convert_old_dataset for post-conversion dimension reduction. |
| Cluster expansion: flixopt/clustering/base.py | Ensures the data operand is transposed to (cluster, time) order before indexing in cluster-based expansion within _expand_slice. |
| Test infrastructure: tests/conftest.py | Introduces assert_dims_compatible helper to validate DataArray dimensions are a subset of model coordinates. |
| Updated test expectations: tests/test_component.py, tests/test_flow.py, tests/test_linear_converter.py | Migrates dimensional assertions to use assert_dims_compatible instead of direct equality checks. |
| Deprecated tests removed/modified: tests/deprecated/test_dataconverter.py, tests/deprecated/test_component.py, tests/deprecated/test_flow.py, tests/deprecated/test_linear_converter.py | Removes extensive DataConverter broadcasting tests (1262 lines); updates deprecated test expectations from full coordinate broadcasts to 1D time-only dimensions. |
| DataConverter test refactor: tests/test_dataconverter.py | Shifts all assertions to expect 0-D scalar results instead of broadcasted arrays; documents that broadcasting is deferred to linopy/higher-level interfaces. |
| IO conversion tests: tests/test_io_conversion.py | Renames test from test_returns_same_object to test_returns_equivalent_dataset and changes the assertion from object identity to attribute equivalence. |

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant DataConverter
    participant Validator as _validate_dataarray_dims
    participant FlowSystemModel
    participant Broadcaster as _ensure_coords

    Note over User,Broadcaster: OLD FLOW: Broadcast in to_dataarray
    User->>DataConverter: to_dataarray(data, target_coords)
    DataConverter->>DataConverter: _broadcast_dataarray_to_target_specification
    DataConverter-->>User: Broadcasted array

    Note over User,Broadcaster: NEW FLOW: Validate then defer broadcasting
    User->>DataConverter: to_dataarray(data, target_coords)
    DataConverter->>Validator: _validate_dataarray_dims(data, coords, dims)
    Validator->>Validator: Check dims subset, validate coords, transpose
    Validator-->>DataConverter: Validated (minimal dims) array
    DataConverter-->>User: Validated array

    User->>FlowSystemModel: add_variables(lower, upper, coords)
    FlowSystemModel->>Broadcaster: _ensure_coords(bounds, coords)
    Broadcaster->>Broadcaster: Broadcast scalars/lower-dims to target
    Broadcaster-->>FlowSystemModel: Broadcasted bounds
    FlowSystemModel-->>User: linopy.Variable
```

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • #552: Modifies cluster expansion logic in flixopt/clustering/base.py to ensure correct dimension ordering (cluster, time) during indexing operations, complementing this PR's dimension-validation strategy.

Poem

🐰 Hops through dimensions with care,
Safe indexing here and there,
Broadcasting deferred with grace,
Validation finds its rightful place—
Scalars shine in minimal state,
A better dimensional debate! ✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description check | ✅ Passed | The PR description comprehensively covers key changes, benchmarks, file modifications, and testing results; however, it does not follow the provided template structure (Type of Change checkboxes, Related Issues section). |
| Docstring Coverage | ✅ Passed | Docstring coverage is 85.42%, above the required threshold of 80.00%. |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main performance optimization: keeping data in minimal form without pre-broadcasting to model dimensions. |



@FBumann changed the title from "Feature/speedup io" to "Optimize IO: Reduce constant dimensions in to_dataset()" Jan 16, 2026
* Add safe isel wrappers for dimension-independent operations

- Add _scalar_safe_isel_drop() to modeling.py for selecting from
  potentially reduced arrays (handles both scalar and array cases)
- Update Storage validation to use _scalar_safe_isel for
  relative_minimum/maximum_charge_state access at time=0
- Update StorageModel._relative_charge_state_bounds to handle
  reduced arrays properly with dimension checks

These wrappers prepare the codebase for future optimization where
_expand_reduced_dims() could be removed from from_dataset(),
keeping data compact throughout the system.

Co-Authored-By: Claude Opus 4.5 <[email protected]>

* Remove pre-broadcasting from DataConverter - data stays minimal

Major change: DataConverter.to_dataarray() now validates dimensions
instead of broadcasting to full target dimensions. This keeps data
compact (scalars stay scalar, 1D arrays stay 1D) and lets linopy
handle broadcasting at variable creation time.

Changes:
- core.py: Replace _broadcast_dataarray_to_target_specification with
  _validate_dataarray_dims in to_dataarray() method
- components.py: Fix _relative_charge_state_bounds to handle scalar
  inputs that have no time dimension (expand to timesteps_extra)
- conftest.py: Add assert_dims_compatible helper for tests
- test_*.py: Update dimension assertions to use subset checking

Memory impact (8760 timesteps × 20 periods):
- Before: 24.1 MB dataset size
- After: 137.1 KB dataset size

Co-Authored-By: Claude Opus 4.5 <[email protected]>

* Remove redundant dimension reduction - data is minimal from start

Since DataConverter no longer broadcasts, these optimizations are redundant:
- Remove _reduce_constant_dims() and _expand_reduced_dims() from io.py
- Remove reduce_constants parameter from to_dataset()
- Remove _expand_reduced_dims() call from from_dataset()

Also ensure Dataset always has model coordinates (time, period, scenario,
cluster) even if no data variable uses them. This is important for Dataset
validity and downstream operations.

Update benchmark_io.py to reflect the simplified API.

Memory footprint now:
- 24 timesteps: 568 bytes (23/24 vars are scalar)
- 168 timesteps: 2.8 KB
- 8760 timesteps × 20 periods: ~137 KB (vs 24 MB with old broadcasting)

Co-Authored-By: Claude Opus 4.5 <[email protected]>

* Done. I've removed the _collapse_constant_arrays / _expand_collapsed_arrays optimization from the codebase. Here's a summary of the changes:

  Files modified:

  1. flixopt/io.py:
    - Removed COLLAPSED_VAR_PREFIX constant
    - Removed _collapse_constant_arrays() function (~45 lines)
    - Removed _expand_collapsed_arrays() function (~40 lines)
    - Removed collapse_constants parameter from save_dataset_to_netcdf()
    - Removed expansion logic from load_dataset_from_netcdf()
  2. flixopt/flow_system.py:
    - Removed import and call to _expand_collapsed_arrays in from_dataset()
  3. benchmark_io.py:
    - Simplified benchmark_netcdf() to only benchmark single save/load (no more comparison)
    - Removed collapse_constants parameter from roundtrip function

  Rationale: Since data is now kept in minimal form throughout the system (scalars stay scalars, 1D arrays stay 1D), there's no need for the extra collapsing/expanding layer when saving to NetCDF. The file sizes are naturally small because the data isn't expanded to full dimensions.

  Test results:
  - All 10 IO tests pass
  - All 4 integration tests pass
  - Benchmark runs successfully

* Summary of Fixes

  1. _dataset_resample handling all-scalar data (transform_accessor.py)

  When all data variables are scalars (no time dimension), the resample function now (see the sketch after this summary):
  - Creates a dummy variable to resample the time coordinate
  - Preserves all original scalar data variables
  - Preserves all non-time coordinates (period, scenario, cluster)

  2. _scalar_safe_reduce helper (modeling.py)

  Added a new helper function that safely performs aggregation operations (mean/sum/etc) over a dimension:
  - Returns reduced data if the dimension exists
  - Returns data unchanged if the dimension doesn't exist (scalar data)

  3. Updated Storage intercluster linking (components.py)

  Used _scalar_safe_reduce for:
  - relative_loss_per_hour.mean('time')
  - timestep_duration.mean('time')

  4. Updated clustering expansion (transform_accessor.py)

  Used _scalar_safe_reduce for relative_loss_per_hour.mean('time')

  5. Fixed dimension order in expand_data (clustering/base.py)

  When expanding clustered data back to original timesteps:
  - Added transpose to ensure (cluster, time) dimension order before numpy indexing
  - Fixes IndexError when dimensions were in different order
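
A hedged sketch of the all-scalar resample workaround described in item 1 (coordinate names match the PR; the dummy-variable mechanics are illustrative, not the exact implementation):

```python
import numpy as np
import pandas as pd
import xarray as xr

time = pd.date_range("2026-01-01", periods=24, freq="h")
ds = xr.Dataset({"eta": 0.9, "cost": 42.0}, coords={"time": time})  # all scalars

# Resampling a Dataset with no time-dimensioned variables would yield
# nothing, so resample the time coordinate via a dummy variable instead.
dummy = xr.DataArray(np.zeros(len(time)), coords={"time": time}, dims=["time"])
new_time = dummy.resample(time="4h").mean().indexes["time"]

# Drop the old time coordinate, keep the scalar variables, re-attach time.
resampled = ds.drop_dims("time").assign_coords(time=new_time)
print(dict(resampled.sizes))  # {'time': 6}
```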

* Added reduction on loading old datasets

* Summary of changes made:

  1. flixopt/structure.py: Added dimension transpose in _ensure_coords() to ensure correct dimension order when data already has all dims but in wrong order
  2. tests/test_io_conversion.py: Updated test_returns_same_object → test_returns_equivalent_dataset since convert_old_dataset now creates a new dataset when reducing constants
  3. tests/deprecated/test_flow.py: Updated 6 assertions to expect minimal dimensions ('time',) instead of broadcasting to all model coords
  4. tests/deprecated/test_component.py: Updated 2 assertions to expect minimal dimensions ('time',)
  5. tests/deprecated/test_linear_converter.py: Updated 1 assertion to expect minimal dimensions ('time',)

  The key change is that data now stays in minimal form - a 1D time-varying array stays as ('time',) dimensions rather than being pre-broadcast to ('time', 'scenario') or other full model dimensions. Broadcasting happens at the linopy interface layer in FlowSystemModel.add_variables() when needed.

---------

Co-authored-by: Claude Opus 4.5 <[email protected]>
@FBumann changed the title from "Optimize IO: Reduce constant dimensions in to_dataset()" to "Optimize IO: Keep data in minimal form (no pre-broadcasting)" Jan 16, 2026
@FBumann marked this pull request as ready for review January 16, 2026 11:06
coderabbitai bot (Contributor) left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@flixopt/io.py`:
- Around line 602-613: The conditional currently uses the xarray DataArray
scalar returned by (reduced == first_slice).all() directly in an if, which
raises when used as a truth value; change the check in the loop over dim
(working with reduced and first_slice) to convert that 0-D boolean DataArray to
a Python bool (e.g., call .item() or wrap with bool(...)) so the if uses a
proper boolean before removing the dimension.

In `@flixopt/structure.py`:
- Around lines 44-78: In _ensure_coords, detect a 0-D DataArray containing
infinity and return it as a plain scalar so it isn't broadcast;
specifically, inside the function (before creating the template and
broadcasting) add a guard: if isinstance(data, xr.DataArray) and data.ndim == 0
and np.isinf(data.item()): return data.item(). This preserves the existing
behavior of keeping infinities scalar for linopy while avoiding broadcasting of
0-D DataArray(np.inf).
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fb0fee4 and 72b437c.

📒 Files selected for processing (18)
  • flixopt/clustering/base.py
  • flixopt/components.py
  • flixopt/core.py
  • flixopt/flow_system.py
  • flixopt/io.py
  • flixopt/modeling.py
  • flixopt/structure.py
  • flixopt/transform_accessor.py
  • tests/conftest.py
  • tests/deprecated/test_component.py
  • tests/deprecated/test_dataconverter.py
  • tests/deprecated/test_flow.py
  • tests/deprecated/test_linear_converter.py
  • tests/test_component.py
  • tests/test_dataconverter.py
  • tests/test_flow.py
  • tests/test_io_conversion.py
  • tests/test_linear_converter.py
💤 Files with no reviewable changes (1)
  • tests/deprecated/test_dataconverter.py
🧰 Additional context used
🧬 Code graph analysis (12)
tests/test_component.py (2)
tests/conftest.py (1)
  • assert_dims_compatible (832-848)
flixopt/structure.py (2)
  • get_coords (350-385)
  • get_coords (1721-1726)
flixopt/flow_system.py (2)
tests/test_comparison.py (1)
  • timesteps (23-24)
flixopt/clustering/base.py (1)
  • clusters (810-946)
tests/deprecated/test_flow.py (2)
flixopt/flow_system.py (1)
  • dims (2018-2038)
flixopt/structure.py (1)
  • dims (285-287)
flixopt/structure.py (1)
flixopt/flow_system.py (2)
  • coords (2083-2101)
  • dims (2018-2038)
tests/test_flow.py (2)
tests/conftest.py (1)
  • assert_dims_compatible (832-848)
flixopt/structure.py (2)
  • get_coords (350-385)
  • get_coords (1721-1726)
tests/test_io_conversion.py (1)
flixopt/io.py (1)
  • convert_old_dataset (784-838)
flixopt/components.py (1)
flixopt/modeling.py (3)
  • _scalar_safe_isel (14-32)
  • _scalar_safe_isel_drop (35-54)
  • _scalar_safe_reduce (57-76)
tests/test_linear_converter.py (2)
tests/conftest.py (1)
  • assert_dims_compatible (832-848)
flixopt/structure.py (2)
  • get_coords (350-385)
  • get_coords (1721-1726)
flixopt/clustering/base.py (2)
flixopt/flow_system.py (1)
  • dims (2018-2038)
flixopt/structure.py (1)
  • dims (285-287)
tests/deprecated/test_component.py (2)
flixopt/flow_system.py (1)
  • dims (2018-2038)
flixopt/structure.py (1)
  • dims (285-287)
tests/conftest.py (2)
flixopt/flow_system.py (1)
  • dims (2018-2038)
flixopt/structure.py (1)
  • dims (285-287)
flixopt/transform_accessor.py (1)
flixopt/modeling.py (1)
  • _scalar_safe_reduce (57-76)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: test (3.13)
  • GitHub Check: test (3.12)
  • GitHub Check: test (3.14)
  • GitHub Check: test (3.11)
🔇 Additional comments (28)
flixopt/clustering/base.py (1)

442-444: Good safeguard for dimension ordering before indexing.

Ensuring ('cluster', 'time') order avoids mismatched numpy indexing when upstream data arrives as ('time','cluster').

tests/deprecated/test_linear_converter.py (1)

285-286: Keeps time-varying factor minimal.
This aligns with the new “no pre-broadcasting” behavior.

flixopt/io.py (2)

551-552: No actionable change in this line.


784-838: Conversion flow looks good.
The opt-in reduce_constants flag keeps backward compatibility while enabling automatic reduction.

flixopt/modeling.py (1)

14-76: Scalar-safe helpers look solid.
They preserve compact inputs while keeping xarray behavior consistent for callers.

flixopt/transform_accessor.py (4)

19-19: No actionable change here.


371-389: Scalar-only resample handling is robust.
Creating a dummy variable to drive time resampling keeps metadata consistent.


424-429: Good preservation of non-time coords.
This prevents coordinate loss when merging resampled and non-time variables.


1378-1379: Nice use of scalar-safe reduction.
Prevents failures when loss data is reduced or scalar.

flixopt/components.py (5)

19-19: No actionable change here.


573-578: Scalar-safe checks are appropriate here.
This keeps plausibility checks working with reduced-dimensional parameters.


1107-1147: Bounds expansion logic looks correct.
Handles scalar vs time-indexed bounds while preserving final-step semantics.


1507-1509: Scalar-safe decay computation is a good fit.
Maintains correctness when loss rates are reduced or constant.


1552-1554: Consistent scalar-safe reduction.
Keeps decay and timestep computations stable with minimal-form inputs.

tests/test_linear_converter.py (1)

7-7: Dims-compatibility assertion is a good fit here.
Keeps the test aligned with the minimal-dims strategy while still validating model compatibility.

Also applies to: 285-286

flixopt/flow_system.py (1)

694-704: Ensuring model coordinates are always present is solid.
This keeps dataset structure explicit even when no variable carries a given dimension.

tests/deprecated/test_component.py (2)

134-135: Time-only dims expectation is clear and consistent.
Matches the new “minimal form” policy for upper bound arrays.


315-316: Same minimal-dims assertion is appropriate here.
Keeps the test aligned with scalar/1D preservation.

tests/conftest.py (1)

832-848: Centralized dims-compatibility helper looks good.
Helps keep tests aligned with the minimal-dims behavior.

tests/test_component.py (2)

10-10: Updated dims-compatibility check is the right adjustment.
Keeps the test robust under minimal-dims storage.

Also applies to: 135-136


315-316: Consistent dims-compatibility check here as well.
Nice alignment with the new minimal-form behavior.

tests/deprecated/test_flow.py (1)

72-74: Minimal-dim assertions look correct.
These checks align with the new “keep data minimal” behavior in deprecated tests.

Also applies to: 186-188, 252-254, 328-330, 397-399, 637-639

tests/test_flow.py (1)

7-7: Good use of assert_dims_compatible for minimal-form inputs.
Keeps tests resilient while enforcing compatibility with model coordinates.

Also applies to: 72-73, 185-186, 250-251, 325-326, 393-394, 632-633

tests/test_dataconverter.py (2)

49-54: Minimal-form assertions are solid and consistent.
These updates comprehensively validate the “no broadcast at conversion time” behavior across scalar/1D/2D/Series/DataFrame/TimeSeriesData and edge cases.

Also applies to: 57-62, 65-70, 73-80, 111-118, 120-125, 139-147, 191-197, 206-213, 250-257, 285-293, 440-448, 450-456, 458-472, 489-499, 532-537, 539-550, 552-563, 564-578, 608-613, 712-722, 724-734, 757-767, 849-862, 864-875, 877-894, 919-946, 980-985, 1071-1075, 1113-1117, 1207-1212


275-283: Canonical-order transpose expectations are well captured.
The multidimensional permutation tests clearly enforce canonical dimension ordering.

Also applies to: 306-313

flixopt/core.py (1)

369-416: Dimension validation + canonical reordering is a good fit here.
This cleanly enforces compatibility without eager broadcasting.

Also applies to: 532-534

flixopt/structure.py (1)

139-155: Bounds broadcasting at the linopy interface looks good.
This preserves compact internal data while ensuring variable shapes are correctly inferred.

tests/test_io_conversion.py (1)

207-212: LGTM — equivalence check matches the conversion contract.
The attrs-focused assertion aligns with the updated conversion behavior and avoids relying on object identity.


Comment on lines +602 to +613
```python
# Try to reduce each dimension
reduced = da
for dim in list(da.dims):
    if dim not in reduced.dims:
        continue  # Already removed
    # Check if constant along this dimension
    first_slice = reduced.isel({dim: 0})
    is_constant = (reduced == first_slice).all()
    if is_constant:
        # Remove this dimension by taking first slice
        reduced = first_slice
```


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

```shell
# First, locate and read the relevant file
find . -name "io.py" -path "*/flixopt/*" | head -5
```

Repository: flixOpt/flixopt

Length of output: 74


🏁 Script executed:

```shell
# Read the specific lines mentioned in the review
sed -n '600,615p' flixopt/io.py
```

Repository: flixOpt/flixopt

Length of output: 585


🏁 Script executed:

```shell
# Let's also check the context around this code
sed -n '595,620p' flixopt/io.py
```

Repository: flixOpt/flixopt

Length of output: 879


🏁 Script executed:

```shell
python3 << 'EOF'
import xarray as xr
import numpy as np

# Create test data
data = xr.DataArray(np.array([[1, 1], [1, 1]]), dims=['x', 'y'])
print("Original data:")
print(data)
print()

# Test 1: Check what .all() returns
comparison = (data == data).all()
print("Result of (data == data).all():")
print(comparison)
print(f"Type: {type(comparison)}")
print(f"Is it a scalar DataArray? {isinstance(comparison, xr.DataArray)}")
print()

# Test 2: Try using scalar DataArray in if statement
try:
    if comparison:
        print("Scalar DataArray in if: SUCCESS")
except ValueError as e:
    print(f"Scalar DataArray in if: ValueError - {e}")
except Exception as e:
    print(f"Scalar DataArray in if: {type(e).__name__} - {e}")
print()

# Test 3: Check isel with and without drop
print("Testing isel behavior:")
data_2d = xr.DataArray(np.array([[1, 2], [3, 4]]), dims=['x', 'y'])
print("Original shape:", data_2d.shape, "dims:", data_2d.dims)

isel_without_drop = data_2d.isel({'x': 0})
print("After isel({'x': 0}):")
print(f"  Shape: {isel_without_drop.shape}, Dims: {isel_without_drop.dims}")

isel_with_drop = data_2d.isel({'x': 0}, drop=True)
print("After isel({'x': 0}, drop=True):")
print(f"  Shape: {isel_with_drop.shape}, Dims: {isel_with_drop.dims}")
print()

# Test 4: Check if first_slice from isel can be assigned back
print("Testing assignment:")
data_reduced = data_2d
first_slice_no_drop = data_2d.isel({'x': 0})
first_slice_drop = data_2d.isel({'x': 0}, drop=True)
print("Original shape:", data_reduced.shape)
print("first_slice (no drop) shape:", first_slice_no_drop.shape)
print("first_slice (drop=True) shape:", first_slice_drop.shape)
EOF
```

Repository: flixOpt/flixopt

Length of output: 145


🏁 Script executed:

```shell
python3 << 'EOF'
import xarray as xr
import numpy as np

# Quick test for scalar DataArray in if
da = xr.DataArray(True)
try:
    if da:
        pass
    print("No error with scalar in if")
except ValueError as e:
    print(f"ValueError: {e}")
EOF
```

Repository: flixOpt/flixopt

Length of output: 177


🏁 Script executed:

```shell
python3 << 'EOF'
import xarray as xr
import numpy as np

# Test isel with/without drop
da = xr.DataArray([[1, 2], [3, 4]], dims=['x', 'y'])
s1 = da.isel({'x': 0})
s2 = da.isel({'x': 0}, drop=True)
print("No drop - dims:", s1.dims, "shape:", s1.shape)
print("With drop - dims:", s2.dims, "shape:", s2.shape)
EOF
```

Repository: flixOpt/flixopt

Length of output: 177


🌐 Web query:

xarray scalar DataArray if statement ValueError

💡 Result:

When you use an if on an xarray DataArray that contains more than one element Python can't reduce it to a single True/False — you get the familiar ValueError about ambiguous truth. Use one of these fixes:

  • If it really is a scalar (0‑D) DataArray, convert to a Python scalar first: val = da.item() or val = da.values.item(); then use if val: ...
  • If it may be multi‑element and you want to test "any" or "all", reduce explicitly: if da.any(): ... or if da.all(): ...
  • If you want to act where a condition holds, use boolean masking or xarray methods (e.g. da.where(cond) or da.sel/isel with the condition).

Examples:

  • Scalar: if da.item() == 0: ...
  • Multi: if (da > 0).any(): ...
  • Masking: da2 = da.where(da > 0)

References: general ambiguity and fixes (NumPy/arrays). [1]; xarray where/boolean masking docs. [2]; note on single‑element behavior. [3].


🌐 Web query:

xarray isel drop=True parameter behavior dimension

💡 Result:

Short answer: isel's drop parameter only affects coordinate variables, not the data-array dimension removal semantics. When you index a dimension with an integer, xarray removes that dimension from the returned DataArray/Dataset; if drop=True, any coordinate variables that were indexed by that integer are dropped instead of being turned into 0‑d (scalar) coordinate variables. See the xarray docs. [1][2]

Sources:

  • xarray.DataArray.isel documentation. [1]
  • xarray.Dataset.isel documentation. [2]

Fix boolean scalar handling in conditional check.

(reduced == first_slice).all() returns a 0-D scalar DataArray, which cannot be used directly in an if statement. Convert to a Python boolean using .item().

```diff
-            first_slice = reduced.isel({dim: 0})
-            is_constant = (reduced == first_slice).all()
-            if is_constant:
-                reduced = first_slice
+            first_slice = reduced.isel({dim: 0})
+            is_constant = bool((reduced == first_slice).all().item())
+            if is_constant:
+                reduced = first_slice
```
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

```diff
 # Try to reduce each dimension
 reduced = da
 for dim in list(da.dims):
     if dim not in reduced.dims:
         continue  # Already removed
     # Check if constant along this dimension
     first_slice = reduced.isel({dim: 0})
-    is_constant = (reduced == first_slice).all()
+    is_constant = bool((reduced == first_slice).all().item())
     if is_constant:
         # Remove this dimension by taking first slice
         reduced = first_slice
```
🤖 Prompt for AI Agents
In `@flixopt/io.py` around lines 602 - 613, The conditional currently uses the
xarray DataArray scalar returned by (reduced == first_slice).all() directly in
an if, which raises when used as a truth value; change the check in the loop
over dim (working with reduced and first_slice) to convert that 0-D boolean
DataArray to a Python bool (e.g., call .item() or wrap with bool(...)) so the if
uses a proper boolean before removing the dimension.

Comment on lines +44 to +78
```python
def _ensure_coords(
    data: xr.DataArray | float | int,
    coords: xr.Coordinates | dict,
) -> xr.DataArray | float:
    """Broadcast data to coords if needed.

    This is used at the linopy interface to ensure bounds are properly broadcasted
    to the target variable shape. Linopy needs at least one bound to have all
    dimensions to determine the variable shape.

    Note: Infinity values (-inf, inf) are kept as scalars because linopy uses
    special checks like `if (lower != -inf)` that fail with DataArrays.
    """
    # Handle both dict and xr.Coordinates
    if isinstance(coords, dict):
        coord_dims = list(coords.keys())
    else:
        coord_dims = list(coords.dims)

    # Keep infinity values as scalars (linopy uses them for special checks)
    if not isinstance(data, xr.DataArray):
        if np.isinf(data):
            return data
        # Finite scalar - create full DataArray
        return xr.DataArray(data, coords=coords, dims=coord_dims)

    if set(data.dims) == set(coord_dims):
        # Has all dims - ensure correct order
        if data.dims != tuple(coord_dims):
            return data.transpose(*coord_dims)
        return data

    # Broadcast to full coords (broadcast_like ensures correct dim order)
    template = xr.DataArray(coords=coords, dims=coord_dims)
    return data.broadcast_like(template)
```

⚠️ Potential issue | 🟡 Minor

Handle scalar DataArray ±inf before broadcasting.
The helper keeps Python/NumPy infinities scalar, but a 0‑D xr.DataArray(np.inf) (e.g., from DataConverter) will be broadcast and can trigger linopy’s lower != -inf checks. A small guard avoids that edge case.

🐛 Proposed fix
```diff
-    if not isinstance(data, xr.DataArray):
-        if np.isinf(data):
-            return data
-        # Finite scalar - create full DataArray
-        return xr.DataArray(data, coords=coords, dims=coord_dims)
+    if not isinstance(data, xr.DataArray):
+        if np.isinf(data):
+            return data
+        # Finite scalar - create full DataArray
+        return xr.DataArray(data, coords=coords, dims=coord_dims)
+
+    # Keep scalar DataArray infinities as scalars too
+    if data.ndim == 0 and np.isinf(data.item()):
+        return data.item()
```
🤖 Prompt for AI Agents
In `@flixopt/structure.py` around lines 44-78: In _ensure_coords, detect
a 0-D DataArray containing infinity and return it as a plain scalar so it isn't
broadcast; specifically, inside the function (before creating the template and
broadcasting) add a guard: if isinstance(data, xr.DataArray) and data.ndim == 0
and np.isinf(data.item()): return data.item(). This preserves the existing
behavior of keeping infinities scalar for linopy while avoiding broadcasting of
0-D DataArray(np.inf).
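
For illustration, the suggested guard in isolation (a standalone sketch, not the patched function):

```python
import numpy as np
import xarray as xr

# A 0-D DataArray(inf) is returned as a plain float instead of being
# broadcast to the variable's coords, matching linopy's inf checks.
data = xr.DataArray(np.inf)
if isinstance(data, xr.DataArray) and data.ndim == 0 and np.isinf(data.item()):
    data = data.item()
print(data, type(data))  # inf <class 'float'>
```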

@FBumann changed the title from "Optimize IO: Keep data in minimal form (no pre-broadcasting)" to "perf: Keep data in minimal form (no pre-broadcasting)" Jan 16, 2026
@FBumann merged commit d3ce064 into main Jan 16, 2026
8 checks passed
