perf: Keep data in minimal form (no pre-broadcasting) #575

Conversation
1. Added two helper functions in io.py:
   - _reduce_constant_dims(): reduces dimensions where all values are constant
   - _expand_reduced_dims(): restores the original dimensions in the correct order
2. Added a reduce_constants parameter to FlowSystem.to_dataset():
   - Default: False (opt-in)
   - When True: reduces constant dimensions for memory efficiency
3. Updated FlowSystem.from_dataset():
   - Expands both collapsed arrays (from NetCDF) and reduced dimensions
4. Key fixes:
   - Store the original dimension order in attrs to preserve (cluster, time) vs. (time, cluster) ordering
   - Skip solution variables during reduction (they are prefixed with solution|)

The optimization is opt-in via to_dataset(reduce_constants=True). For file storage, save_dataset_to_netcdf still collapses constants to scalars by default.

Benchmarks show a 99%+ reduction in memory and file size for multi-period models, with faster IO. The optimization is now enabled by default.

Co-Authored-By: Claude Opus 4.5 <[email protected]>
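For illustration, the constant-dimension reduction idea looks roughly like this (a minimal sketch under assumed names, not the exact flixopt code; it also folds in the `.item()` fix requested later in this review):

```python
# Hedged sketch: drop every dimension along which all values are identical.
import numpy as np
import xarray as xr

def reduce_constant_dims(da: xr.DataArray) -> xr.DataArray:
    """Return ``da`` with dimensions removed where values are constant."""
    reduced = da
    for dim in list(da.dims):
        if dim not in reduced.dims:
            continue  # already removed
        first_slice = reduced.isel({dim: 0}, drop=True)
        # .all() yields a 0-D DataArray; .item() converts it to a Python bool.
        if bool((reduced == first_slice).all().item()):
            reduced = first_slice
    return reduced

da = xr.DataArray(np.ones((4, 3)), dims=('time', 'scenario'))
print(reduce_constant_dims(da).dims)  # () -- constant along both dims
```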
📝 Walkthrough
This PR introduces dimension-aware utilities for safe xarray operations and refactors data conversion to validate dimensions rather than broadcast them, shifting broadcasting responsibility from DataConverter.to_dataarray to FlowSystemModel.add_variables via new helper functions.
Sequence Diagram(s)
```mermaid
sequenceDiagram
    participant User
    participant DataConverter
    participant Validator as _validate_dataarray_dims
    participant FlowSystemModel
    participant Broadcaster as _ensure_coords

    Note over User,Broadcaster: OLD FLOW: Broadcast in to_dataarray
    User->>DataConverter: to_dataarray(data, target_coords)
    DataConverter->>DataConverter: _broadcast_dataarray_to_target_specification
    DataConverter-->>User: Broadcasted array

    Note over User,Broadcaster: NEW FLOW: Validate, then defer broadcasting
    User->>DataConverter: to_dataarray(data, target_coords)
    DataConverter->>Validator: _validate_dataarray_dims(data, coords, dims)
    Validator->>Validator: Check dims subset, validate coords, transpose
    Validator-->>DataConverter: Validated (minimal dims) array
    DataConverter-->>User: Validated array
    User->>FlowSystemModel: add_variables(lower, upper, coords)
    FlowSystemModel->>Broadcaster: _ensure_coords(bounds, coords)
    Broadcaster->>Broadcaster: Broadcast scalars/lower-dims to target
    Broadcaster-->>FlowSystemModel: Broadcasted bounds
    FlowSystemModel-->>User: linopy.Variable
```
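In code, the split might look roughly like this (a hedged sketch: the helper names come from the walkthrough above, but these bodies are assumptions, not the flixopt implementation):

```python
import numpy as np
import xarray as xr

def validate_dataarray_dims(da: xr.DataArray, target_dims: tuple[str, ...]) -> xr.DataArray:
    """Check dims are a subset of the model dims and reorder; no broadcasting."""
    extra = set(da.dims) - set(target_dims)
    if extra:
        raise ValueError(f'Unexpected dims: {extra}')
    # Transpose to canonical order, keeping only the dims the array actually has.
    return da.transpose(*[d for d in target_dims if d in da.dims])

def ensure_coords(data, coords: dict) -> xr.DataArray:
    """Broadcast minimal-form data to the full variable shape at the linopy interface."""
    template = xr.DataArray(coords=coords, dims=list(coords))
    if not isinstance(data, xr.DataArray):
        return xr.full_like(template, data, dtype=float)
    return data.broadcast_like(template)

coords = {'time': np.arange(4), 'scenario': ['a', 'b']}
minimal = validate_dataarray_dims(xr.DataArray(np.arange(4.0), dims='time'), ('time', 'scenario'))
print(minimal.dims)                         # ('time',) -- stays 1D after conversion
print(ensure_coords(minimal, coords).dims)  # ('time', 'scenario') -- broadcast only here
```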
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 3 passed
* Add safe isel wrappers for dimension-independent operations
  - Add _scalar_safe_isel_drop() to modeling.py for selecting from potentially reduced arrays (handles both scalar and array cases)
  - Update Storage validation to use _scalar_safe_isel for relative_minimum/maximum_charge_state access at time=0
  - Update StorageModel._relative_charge_state_bounds to handle reduced arrays properly with dimension checks
  These wrappers prepare the codebase for a future optimization where _expand_reduced_dims() could be removed from from_dataset(), keeping data compact throughout the system.

* Remove pre-broadcasting from DataConverter - data stays minimal
  Major change: DataConverter.to_dataarray() now validates dimensions instead of broadcasting to full target dimensions. This keeps data compact (scalars stay scalar, 1D arrays stay 1D) and lets linopy handle broadcasting at variable creation time.
  Changes:
  - core.py: Replace _broadcast_dataarray_to_target_specification with _validate_dataarray_dims in the to_dataarray() method
  - components.py: Fix _relative_charge_state_bounds to handle scalar inputs that have no time dimension (expand to timesteps_extra)
  - conftest.py: Add an assert_dims_compatible helper for tests
  - test_*.py: Update dimension assertions to use subset checking
  Memory impact (8760 timesteps × 20 periods):
  - Before: 24.1 MB dataset size
  - After: 137.1 KB dataset size

* Remove redundant dimension reduction - data is minimal from the start
  Since DataConverter no longer broadcasts, these optimizations are redundant:
  - Remove _reduce_constant_dims() and _expand_reduced_dims() from io.py
  - Remove the reduce_constants parameter from to_dataset()
  - Remove the _expand_reduced_dims() call from from_dataset()
  Also ensure the Dataset always has model coordinates (time, period, scenario, cluster) even if no data variable uses them. This is important for Dataset validity and downstream operations.
  Update benchmark_io.py to reflect the simplified API.
  Memory footprint now:
  - 24 timesteps: 568 bytes (23/24 vars are scalar)
  - 168 timesteps: 2.8 KB
  - 8760 timesteps × 20 periods: ~137 KB (vs. 24 MB with the old broadcasting)

* Remove the _collapse_constant_arrays / _expand_collapsed_arrays optimization
  Files modified:
  1. flixopt/io.py:
     - Removed the COLLAPSED_VAR_PREFIX constant
     - Removed _collapse_constant_arrays() (~45 lines) and _expand_collapsed_arrays() (~40 lines)
     - Removed the collapse_constants parameter from save_dataset_to_netcdf()
     - Removed the expansion logic from load_dataset_from_netcdf()
  2. flixopt/flow_system.py: Removed the import and call of _expand_collapsed_arrays in from_dataset()
  3. benchmark_io.py: Simplified benchmark_netcdf() to benchmark a single save/load (no more comparison); removed the collapse_constants parameter from the roundtrip function
  Rationale: Since data is now kept in minimal form throughout the system (scalars stay scalars, 1D arrays stay 1D), there is no need for an extra collapsing/expanding layer when saving to NetCDF. File sizes are naturally small because the data isn't expanded to full dimensions.
  Test results: all 10 IO tests pass, all 4 integration tests pass, and the benchmark runs successfully.
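A hedged sketch of the scalar-safe isel wrapper introduced in the first commit above (the real helper lives in modeling.py; this signature is an assumption):

```python
import xarray as xr

def scalar_safe_isel_drop(da: xr.DataArray, indexers: dict[str, int]) -> xr.DataArray:
    """Select along dims that exist, returning reduced/scalar arrays unchanged."""
    present = {dim: idx for dim, idx in indexers.items() if dim in da.dims}
    if not present:
        return da  # scalar or already-reduced array: nothing to select
    return da.isel(present, drop=True)

full = xr.DataArray([[0.1, 0.2], [0.3, 0.4]], dims=('time', 'scenario'))
scalar = xr.DataArray(0.5)  # minimal form: no dims at all
print(scalar_safe_isel_drop(full, {'time': 0}).values)    # [0.1 0.2]
print(scalar_safe_isel_drop(scalar, {'time': 0}).values)  # 0.5 (unchanged)
```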
* Summary of fixes:
  1. _dataset_resample handling all-scalar data (transform_accessor.py)
     When all data variables are scalars (no time dimension), the resample function now:
     - Creates a dummy variable to resample the time coordinate
     - Preserves all original scalar data variables
     - Preserves all non-time coordinates (period, scenario, cluster)
  2. _scalar_safe_reduce helper (modeling.py)
     Added a new helper function that safely performs aggregation operations (mean/sum/etc.) over a dimension:
     - Returns reduced data if the dimension exists
     - Returns the data unchanged if the dimension doesn't exist (scalar data)
  3. Updated Storage intercluster linking (components.py)
     Used _scalar_safe_reduce for:
     - relative_loss_per_hour.mean('time')
     - timestep_duration.mean('time')
  4. Updated clustering expansion (transform_accessor.py)
     Used _scalar_safe_reduce for relative_loss_per_hour.mean('time')
  5. Fixed dimension order in expand_data (clustering/base.py)
     When expanding clustered data back to the original timesteps:
     - Added a transpose to ensure (cluster, time) dimension order before numpy indexing
     - Fixes an IndexError when dimensions were in a different order

* Added reduction on loading old datasets

* Summary of changes made:
  1. flixopt/structure.py: Added a dimension transpose in _ensure_coords() to ensure correct dimension order when data already has all dims but in the wrong order
  2. tests/test_io_conversion.py: Renamed test_returns_same_object → test_returns_equivalent_dataset, since convert_old_dataset now creates a new dataset when reducing constants
  3. tests/deprecated/test_flow.py: Updated 6 assertions to expect minimal dimensions ('time',) instead of broadcasting to all model coords
  4. tests/deprecated/test_component.py: Updated 2 assertions to expect minimal dimensions ('time',)
  5. tests/deprecated/test_linear_converter.py: Updated 1 assertion to expect minimal dimensions ('time',)

The key change is that data now stays in minimal form: a 1D time-varying array keeps ('time',) dimensions rather than being pre-broadcast to ('time', 'scenario') or other full model dimensions. Broadcasting happens at the linopy interface layer in FlowSystemModel.add_variables() when needed.

Co-authored-by: Claude Opus 4.5 <[email protected]>
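The _scalar_safe_reduce helper described in point 2 might look roughly like this (hedged sketch; names and signature assumed):

```python
import xarray as xr

def scalar_safe_reduce(da: xr.DataArray, dim: str, op: str = 'mean') -> xr.DataArray:
    """Aggregate over ``dim`` if present; return ``da`` unchanged otherwise."""
    if dim not in da.dims:
        return da  # scalar or already-reduced data: nothing to aggregate
    return getattr(da, op)(dim)

losses = xr.DataArray([0.01, 0.02, 0.03], dims='time')
constant_loss = xr.DataArray(0.01)  # minimal form: stays scalar

print(scalar_safe_reduce(losses, 'time').values)         # ~0.02 (mean over time)
print(scalar_safe_reduce(constant_loss, 'time').values)  # 0.01 (returned as-is)
```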
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@flixopt/io.py`:
- Around line 602-613: The conditional currently uses the xarray DataArray
scalar returned by (reduced == first_slice).all() directly in an if, which
raises when used as a truth value; change the check in the loop over dim
(working with reduced and first_slice) to convert that 0-D boolean DataArray to
a Python bool (e.g., call .item() or wrap with bool(...)) so the if uses a
proper boolean before removing the dimension.
In `@flixopt/structure.py`:
- Around lines 44-78: In _ensure_coords, detect a 0-D DataArray
containing infinity and return it as a plain scalar so it isn't broadcast;
specifically, inside the function (before creating template and broadcasting)
add a guard: if isinstance(data, xr.DataArray) and data.ndim == 0 and
np.isinf(data.item()): return data.item(). This preserves the existing behavior
of keeping infinities scalar for linopy while avoiding broadcasting of 0-D
DataArray(np.inf).
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (18)
- flixopt/clustering/base.py
- flixopt/components.py
- flixopt/core.py
- flixopt/flow_system.py
- flixopt/io.py
- flixopt/modeling.py
- flixopt/structure.py
- flixopt/transform_accessor.py
- tests/conftest.py
- tests/deprecated/test_component.py
- tests/deprecated/test_dataconverter.py
- tests/deprecated/test_flow.py
- tests/deprecated/test_linear_converter.py
- tests/test_component.py
- tests/test_dataconverter.py
- tests/test_flow.py
- tests/test_io_conversion.py
- tests/test_linear_converter.py
💤 Files with no reviewable changes (1)
- tests/deprecated/test_dataconverter.py
🧰 Additional context used
🧬 Code graph analysis (12)
tests/test_component.py (2)
  tests/conftest.py (1): assert_dims_compatible (832-848)
  flixopt/structure.py (2): get_coords (350-385), get_coords (1721-1726)
flixopt/flow_system.py (2)
  tests/test_comparison.py (1): timesteps (23-24)
  flixopt/clustering/base.py (1): clusters (810-946)
tests/deprecated/test_flow.py (2)
  flixopt/flow_system.py (1): dims (2018-2038)
  flixopt/structure.py (1): dims (285-287)
flixopt/structure.py (1)
  flixopt/flow_system.py (2): coords (2083-2101), dims (2018-2038)
tests/test_flow.py (2)
  tests/conftest.py (1): assert_dims_compatible (832-848)
  flixopt/structure.py (2): get_coords (350-385), get_coords (1721-1726)
tests/test_io_conversion.py (1)
  flixopt/io.py (1): convert_old_dataset (784-838)
flixopt/components.py (1)
  flixopt/modeling.py (3): _scalar_safe_isel (14-32), _scalar_safe_isel_drop (35-54), _scalar_safe_reduce (57-76)
tests/test_linear_converter.py (2)
  tests/conftest.py (1): assert_dims_compatible (832-848)
  flixopt/structure.py (2): get_coords (350-385), get_coords (1721-1726)
flixopt/clustering/base.py (2)
  flixopt/flow_system.py (1): dims (2018-2038)
  flixopt/structure.py (1): dims (285-287)
tests/deprecated/test_component.py (2)
  flixopt/flow_system.py (1): dims (2018-2038)
  flixopt/structure.py (1): dims (285-287)
tests/conftest.py (2)
  flixopt/flow_system.py (1): dims (2018-2038)
  flixopt/structure.py (1): dims (285-287)
flixopt/transform_accessor.py (1)
  flixopt/modeling.py (1): _scalar_safe_reduce (57-76)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: test (3.13)
- GitHub Check: test (3.12)
- GitHub Check: test (3.14)
- GitHub Check: test (3.11)
🔇 Additional comments (28)
flixopt/clustering/base.py (1)
  442-444: Good safeguard for dimension ordering before indexing. Ensuring ('cluster', 'time') order avoids mismatched numpy indexing when upstream data arrives as ('time', 'cluster').

tests/deprecated/test_linear_converter.py (1)
  285-286: Keeps the time-varying factor minimal. This aligns with the new “no pre-broadcasting” behavior.

flixopt/io.py (2)
  551-552: No actionable change in this line.
  784-838: Conversion flow looks good. The opt-in reduce_constants flag keeps backward compatibility while enabling automatic reduction.

flixopt/modeling.py (1)
  14-76: Scalar-safe helpers look solid. They preserve compact inputs while keeping xarray behavior consistent for callers.

flixopt/transform_accessor.py (4)
  19-19: No actionable change here.
  371-389: Scalar-only resample handling is robust. Creating a dummy variable to drive time resampling keeps metadata consistent.
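An illustrative sketch of that dummy-variable trick for resampling an all-scalar dataset (the variable names and rebuild steps here are assumptions, not the flixopt code):

```python
import numpy as np
import pandas as pd
import xarray as xr

# All data variables are scalar; only the time *coordinate* carries the time dim.
ds = xr.Dataset({'eta': 0.9}, coords={'time': pd.date_range('2024-01-01', periods=8, freq='h')})

# A dummy time-indexed variable gives .resample() something to operate on.
dummy = xr.DataArray(np.zeros(ds.sizes['time']), dims='time', coords={'time': ds['time']})
new_time = dummy.resample(time='4h').mean()['time']

# Rebuild: keep the scalar variables, swap in the resampled time coordinate.
resampled = ds.drop_dims('time').assign_coords(time=new_time)
print(resampled['eta'].values, resampled.sizes['time'])  # 0.9 2
```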
  424-429: Good preservation of non-time coords. This prevents coordinate loss when merging resampled and non-time variables.
  1378-1379: Nice use of scalar-safe reduction. Prevents failures when loss data is reduced or scalar.

flixopt/components.py (5)
  19-19: No actionable change here.
  573-578: Scalar-safe checks are appropriate here. This keeps plausibility checks working with reduced-dimensional parameters.
  1107-1147: Bounds expansion logic looks correct. Handles scalar vs. time-indexed bounds while preserving final-step semantics.
  1507-1509: Scalar-safe decay computation is a good fit. Maintains correctness when loss rates are reduced or constant.
  1552-1554: Consistent scalar-safe reduction. Keeps decay and timestep computations stable with minimal-form inputs.

tests/test_linear_converter.py (1)
  7-7: Dims-compatibility assertion is a good fit here. Keeps the test aligned with the minimal-dims strategy while still validating model compatibility. Also applies to: 285-286

flixopt/flow_system.py (1)
  694-704: Ensuring model coordinates are always present is solid. This keeps dataset structure explicit even when no variable carries a given dimension.

tests/deprecated/test_component.py (2)
  134-135: Time-only dims expectation is clear and consistent. Matches the new “minimal form” policy for upper bound arrays.
  315-316: The same minimal-dims assertion is appropriate here. Keeps the test aligned with scalar/1D preservation.

tests/conftest.py (1)
  832-848: Centralized dims-compatibility helper looks good. Helps keep tests aligned with the minimal-dims behavior.
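A plausible shape for that helper (hedged sketch; the real one is in tests/conftest.py at lines 832-848, and this signature is an assumption):

```python
import xarray as xr

def assert_dims_compatible(da: xr.DataArray, model_dims: tuple[str, ...]) -> None:
    """Assert an array's dims are a subset of the model's dims.

    Under the minimal-form policy, data may carry fewer dims than the model,
    but never dims the model doesn't know about.
    """
    extra = set(da.dims) - set(model_dims)
    assert not extra, f'Array has dims {extra} not in model dims {model_dims}'

# A 1D time series is compatible with a (time, scenario) model.
assert_dims_compatible(xr.DataArray([1.0, 2.0], dims='time'), ('time', 'scenario'))
```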
tests/test_component.py (2)
  10-10: Updated dims-compatibility check is the right adjustment. Keeps the test robust under minimal-dims storage. Also applies to: 135-136
  315-316: Consistent dims-compatibility check here as well. Nice alignment with the new minimal-form behavior.

tests/deprecated/test_flow.py (1)
  72-74: Minimal-dim assertions look correct. These checks align with the new “keep data minimal” behavior in deprecated tests. Also applies to: 186-188, 252-254, 328-330, 397-399, 637-639

tests/test_flow.py (1)
  7-7: Good use of assert_dims_compatible for minimal-form inputs. Keeps tests resilient while enforcing compatibility with model coordinates. Also applies to: 72-73, 185-186, 250-251, 325-326, 393-394, 632-633

tests/test_dataconverter.py (2)
  49-54: Minimal-form assertions are solid and consistent. These updates comprehensively validate the “no broadcast at conversion time” behavior across scalar/1D/2D/Series/DataFrame/TimeSeriesData and edge cases. Also applies to: 57-62, 65-70, 73-80, 111-118, 120-125, 139-147, 191-197, 206-213, 250-257, 285-293, 440-448, 450-456, 458-472, 489-499, 532-537, 539-550, 552-563, 564-578, 608-613, 712-722, 724-734, 757-767, 849-862, 864-875, 877-894, 919-946, 980-985, 1071-1075, 1113-1117, 1207-1212
  275-283: Canonical-order transpose expectations are well captured. The multidimensional permutation tests clearly enforce canonical dimension ordering. Also applies to: 306-313
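For reference, canonical reordering without broadcasting is a one-liner in xarray (illustrative snippet, not flixopt code):

```python
import numpy as np
import xarray as xr

canonical = ('time', 'scenario', 'cluster')
da = xr.DataArray(np.zeros((2, 3)), dims=('scenario', 'time'))  # wrong order

# Keep only the dims the array actually has, in canonical order.
da = da.transpose(*[d for d in canonical if d in da.dims])
print(da.dims)  # ('time', 'scenario')
```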
flixopt/core.py (1)
  369-416: Dimension validation plus canonical reordering is a good fit here. This cleanly enforces compatibility without eager broadcasting. Also applies to: 532-534

flixopt/structure.py (1)
  139-155: Bounds broadcasting at the linopy interface looks good. This preserves compact internal data while ensuring variable shapes are correctly inferred.

tests/test_io_conversion.py (1)
  207-212: LGTM — equivalence check matches the conversion contract. The attrs-focused assertion aligns with the updated conversion behavior and avoids relying on object identity.
```python
# Try to reduce each dimension
reduced = da
for dim in list(da.dims):
    if dim not in reduced.dims:
        continue  # Already removed
    # Check if constant along this dimension
    first_slice = reduced.isel({dim: 0})
    is_constant = (reduced == first_slice).all()
    if is_constant:
        # Remove this dimension by taking first slice
        reduced = first_slice
```
🧩 Analysis chain

🏁 Script executed:
```shell
# First, locate and read the relevant file
find . -name "io.py" -path "*/flixopt/*" | head -5
```
Repository: flixOpt/flixopt
Length of output: 74

🏁 Script executed:
```shell
# Read the specific lines mentioned in the review
sed -n '600,615p' flixopt/io.py
```
Repository: flixOpt/flixopt
Length of output: 585

🏁 Script executed:
```shell
# Let's also check the context around this code
sed -n '595,620p' flixopt/io.py
```
Repository: flixOpt/flixopt
Length of output: 879
🏁 Script executed:
```shell
python3 << 'EOF'
import xarray as xr
import numpy as np

# Create test data
data = xr.DataArray(np.array([[1, 1], [1, 1]]), dims=['x', 'y'])
print("Original data:")
print(data)
print()

# Test 1: Check what .all() returns
comparison = (data == data).all()
print("Result of (data == data).all():")
print(comparison)
print(f"Type: {type(comparison)}")
print(f"Is it a scalar DataArray? {isinstance(comparison, xr.DataArray)}")
print()

# Test 2: Try using scalar DataArray in if statement
try:
    if comparison:
        print("Scalar DataArray in if: SUCCESS")
except ValueError as e:
    print(f"Scalar DataArray in if: ValueError - {e}")
except Exception as e:
    print(f"Scalar DataArray in if: {type(e).__name__} - {e}")
print()

# Test 3: Check isel with and without drop
print("Testing isel behavior:")
data_2d = xr.DataArray(np.array([[1, 2], [3, 4]]), dims=['x', 'y'])
print("Original shape:", data_2d.shape, "dims:", data_2d.dims)
isel_without_drop = data_2d.isel({'x': 0})
print("After isel({'x': 0}):")
print(f"  Shape: {isel_without_drop.shape}, Dims: {isel_without_drop.dims}")
isel_with_drop = data_2d.isel({'x': 0}, drop=True)
print("After isel({'x': 0}, drop=True):")
print(f"  Shape: {isel_with_drop.shape}, Dims: {isel_with_drop.dims}")
print()

# Test 4: Check if first_slice from isel can be assigned back
print("Testing assignment:")
data_reduced = data_2d
first_slice_no_drop = data_2d.isel({'x': 0})
first_slice_drop = data_2d.isel({'x': 0}, drop=True)
print("Original shape:", data_reduced.shape)
print("first_slice (no drop) shape:", first_slice_no_drop.shape)
print("first_slice (drop=True) shape:", first_slice_drop.shape)
EOF
```
Repository: flixOpt/flixopt
Length of output: 145
🏁 Script executed:
```shell
python3 << 'EOF'
import xarray as xr
import numpy as np

# Quick test for scalar DataArray in if
da = xr.DataArray(True)
try:
    if da:
        pass
    print("No error with scalar in if")
except ValueError as e:
    print(f"ValueError: {e}")
EOF
```
Repository: flixOpt/flixopt
Length of output: 177
🏁 Script executed:
```shell
python3 << 'EOF'
import xarray as xr
import numpy as np

# Test isel with/without drop
da = xr.DataArray([[1, 2], [3, 4]], dims=['x', 'y'])
s1 = da.isel({'x': 0})
s2 = da.isel({'x': 0}, drop=True)
print("No drop - dims:", s1.dims, "shape:", s1.shape)
print("With drop - dims:", s2.dims, "shape:", s2.shape)
EOF
```
Repository: flixOpt/flixopt
Length of output: 177
🌐 Web query:
xarray scalar DataArray if statement ValueError
💡 Result:
When you use an if on an xarray DataArray that contains more than one element Python can't reduce it to a single True/False — you get the familiar ValueError about ambiguous truth. Use one of these fixes:
- If it really is a scalar (0‑D) DataArray, convert to a Python scalar first: val = da.item() or val = da.values.item(); then use if val: ...
- If it may be multi‑element and you want to test "any" or "all", reduce explicitly: if da.any(): ... or if da.all(): ...
- If you want to act where a condition holds, use boolean masking or xarray methods (e.g. da.where(cond) or da.sel/isel with the condition).
Examples:
- Scalar: if da.item() == 0: ...
- Multi: if (da > 0).any(): ...
- Masking: da2 = da.where(da > 0)
References: general ambiguity and fixes (NumPy/arrays). [1]; xarray where/boolean masking docs. [2]; note on single‑element behavior. [3].
🌐 Web query:
xarray isel drop=True parameter behavior dimension
💡 Result:
Short answer: isel's drop parameter only affects coordinate variables, not the data-array dimension removal semantics. When you index a dimension with an integer, xarray removes that dimension from the returned DataArray/Dataset; if drop=True, any coordinate variables that were indexed by that integer are dropped instead of being turned into 0‑d (scalar) coordinate variables. See the xarray docs. [1][2]
Sources:
- xarray.DataArray.isel documentation. [1]
- xarray.Dataset.isel documentation. [2]
Fix boolean scalar handling in the conditional check.
(reduced == first_slice).all() returns a 0-D scalar DataArray, which cannot be used directly in an if statement. Convert it to a Python boolean using .item().

```diff
             first_slice = reduced.isel({dim: 0})
-            is_constant = (reduced == first_slice).all()
+            is_constant = bool((reduced == first_slice).all().item())
             if is_constant:
                 reduced = first_slice
```

📝 Committable suggestion
‼️ IMPORTANT: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
Suggested change (full block):
```python
# Try to reduce each dimension
reduced = da
for dim in list(da.dims):
    if dim not in reduced.dims:
        continue  # Already removed
    # Check if constant along this dimension
    first_slice = reduced.isel({dim: 0})
    is_constant = bool((reduced == first_slice).all().item())
    if is_constant:
        # Remove this dimension by taking first slice
        reduced = first_slice
```
```python
def _ensure_coords(
    data: xr.DataArray | float | int,
    coords: xr.Coordinates | dict,
) -> xr.DataArray | float:
    """Broadcast data to coords if needed.

    This is used at the linopy interface to ensure bounds are properly broadcasted
    to the target variable shape. Linopy needs at least one bound to have all
    dimensions to determine the variable shape.

    Note: Infinity values (-inf, inf) are kept as scalars because linopy uses
    special checks like `if (lower != -inf)` that fail with DataArrays.
    """
    # Handle both dict and xr.Coordinates
    if isinstance(coords, dict):
        coord_dims = list(coords.keys())
    else:
        coord_dims = list(coords.dims)

    # Keep infinity values as scalars (linopy uses them for special checks)
    if not isinstance(data, xr.DataArray):
        if np.isinf(data):
            return data
        # Finite scalar - create full DataArray
        return xr.DataArray(data, coords=coords, dims=coord_dims)

    if set(data.dims) == set(coord_dims):
        # Has all dims - ensure correct order
        if data.dims != tuple(coord_dims):
            return data.transpose(*coord_dims)
        return data

    # Broadcast to full coords (broadcast_like ensures correct dim order)
    template = xr.DataArray(coords=coords, dims=coord_dims)
    return data.broadcast_like(template)
```
Handle scalar DataArray ±inf before broadcasting.
The helper keeps Python/NumPy infinities scalar, but a 0‑D xr.DataArray(np.inf) (e.g., from DataConverter) will be broadcast and can trigger linopy’s lower != -inf checks. A small guard avoids that edge case.
🐛 Proposed fix
```diff
     if not isinstance(data, xr.DataArray):
         if np.isinf(data):
             return data
         # Finite scalar - create full DataArray
         return xr.DataArray(data, coords=coords, dims=coord_dims)
+
+    # Keep scalar DataArray infinities as scalars too
+    if data.ndim == 0 and np.isinf(data.item()):
+        return data.item()
```
+ return data.item()🤖 Prompt for AI Agents
In `@flixopt/structure.py` around lines 44 - 78, _hotfix: In _ensure_coords detect
a 0-D DataArray containing infinity and return it as a plain scalar so it isn't
broadcast; specifically, inside the function (before creating template and
broadcasting) add a guard: if isinstance(data, xr.DataArray) and data.ndim == 0
and np.isinf(data.item()): return data.item(). This preserves the existing
behavior of keeping infinities scalar for linopy while avoiding broadcasting of
0-D DataArray(np.inf).
Summary
Major optimization: Data now stays in minimal form throughout the system. Scalars stay scalar, 1D arrays stay 1D - no pre-broadcasting to full model dimensions. This dramatically reduces memory usage and file sizes.
Key Changes
Core: Minimal Data Form
- DataConverter.to_dataarray() now validates dimensions instead of broadcasting
- Scalars stay scalar (e.g., relative_minimum=0 stays scalar)
- Broadcasting happens in FlowSystemModel.add_variables() when needed

New Helper: _ensure_coords() in structure.py
New Helper: _scalar_safe_reduce() in modeling.py

Clustering Fixes
- expand_data(): Added a transpose to ensure (cluster, time) dimension order before numpy indexing
- _dataset_resample(): Handles all-scalar datasets by creating a dummy variable for time-coordinate resampling

Old Dataset Compatibility
- convert_old_dataset(): Now reduces constant arrays dimension-by-dimension

Benchmark Results
File sizes follow similar reductions. IO speed improved 25-63%.

Files Changed
- flixopt/core.py: _validate_dataarray_dims() replaces broadcasting and adds a transpose to canonical dimension order
- flixopt/structure.py: _ensure_coords() helper and FlowSystemModel.add_variables() override
- flixopt/modeling.py: _scalar_safe_reduce() helper
- flixopt/components.py: uses _scalar_safe_reduce for storage calculations
- flixopt/clustering/base.py: expand_data() dimension-order fix
- flixopt/transform_accessor.py: _dataset_resample() handling for all-scalar data
- flixopt/io.py: _reduce_constant_arrays() for old datasets; removed the collapse/expand layer

Type of Change
Testing
Checklist
Summary by CodeRabbit
Bug Fixes
Refactor