Skip to content

Conversation

@roylou
Copy link

@roylou roylou commented Dec 30, 2025

Summary

  • Added comprehensive tooling for deriving Scroll Galileo gas fee parameters
  • Includes data collection from blockchain and parameter analysis capabilities
  • Supports both real-time collection and cached data loading modes

Changes

  • Added derive_galileo_gas_parameter.py: Main script for collecting batch data and deriving optimal gas parameters
  • Added show_batches.py: Utility to display batch information
  • Added Python environment setup with Pipfile for dependency management
  • Added CLAUDE.md with project documentation and usage guidelines

Summary by CodeRabbit

  • New Features

    • Added a utility to collect, load, and analyze on-chain gas and batch metrics, plus a companion viewer to inspect generated summaries.
  • Documentation

    • Added comprehensive usage, setup, and CLI guidance including examples and monitoring notes.
  • Chores

    • Added environment examples, dependency configuration, and repository ignore rules to streamline setup and data caching.

✏️ Tip: You can customize this high-level summary in your review settings.

Add scripts and environment setup for deriving Scroll gas fee parameters
using data collected from L1 batches. Includes data collection, analysis,
and batch visualization utilities.
@coderabbitai
Copy link

coderabbitai bot commented Dec 30, 2025

📝 Walkthrough

Walkthrough

This PR adds scripts/derive_galileo_gas_parameter/, providing tools to derive Galileo L1 gas parameters from on-chain data via a two-stage process, including a main analysis script, a data-inspection utility, Pipenv config, env examples, .gitignore, and documentation.

Changes

Cohort / File(s) Summary
Configuration & Examples
scripts/derive_galileo_gas_parameter/Pipfile, scripts/derive_galileo_gas_parameter/.env.example, scripts/derive_galileo_gas_parameter/.gitignore
Adds Pipenv dependency file (python 3.10; web3<7,>=6, pandas, numpy, rlp, zstandard, requests, async_timeout), example environment variables for RPC endpoints, and a project .gitignore.
Documentation
scripts/derive_galileo_gas_parameter/claude.md, scripts/derive_galileo_gas_parameter/README.md
Adds usage and monitoring documentation describing environment setup, CLI modes/options (collect/load, batches, penalty settings), logging, and examples.
Main Analysis Script
scripts/derive_galileo_gas_parameter/derive_galileo_gas_parameter.py
New comprehensive script: Web3/beacon integration, event and blob parsing, transaction compression/size computation, parallelized data collection, DataFrame pipeline, penalty_multiplier derivation (P95), aggregation of batch costs, algebraic solution for commit_scalar/blob_scalar, analysis (RMSE/MAE/ recovery), save/load pickle, and CLI entrypoint.
Data Inspection Utility
scripts/derive_galileo_gas_parameter/show_batches.py
Adds script and public function show_batches(filename) to load cached pickle data and print human-readable batch and transaction summaries.

Sequence Diagram(s)

sequenceDiagram
    actor User
    participant Script as derive_galileo_gas_parameter.py
    participant RPC as Blockchain / Beacon RPC
    participant FS as File System

    User->>Script: run (mode=collect|load, options)
    alt collect
        Script->>RPC: fetch finalized batch events & block headers
        RPC-->>Script: events & block data
        Script->>RPC: fetch blob data, transactions, receipts (parallel)
        RPC-->>Script: raw txs, blobs, receipts
        Script->>Script: parse blobs, compress txs, compute sizes & L1 fees
        Script->>Script: build tx_df and batch_df
        Script->>FS: save pickle (tx_df, batch_df)
    else load
        Script->>FS: load pickle (tx_df, batch_df)
        FS-->>Script: cached data
    end
    Script->>Script: calculate penalty_multiplier (P95) or use fixed
    Script->>Script: aggregate batch-level costs and effective sizes
    Script->>Script: solve for commit_scalar & blob_scalar
    Script->>Script: analyze results (RMSE, MAE, recovery, penalty analysis)
    Script-->>User: print parameters, metrics, and file locations
    User->>FS: optionally run show_batches.py
    FS-->>User: human-readable batch/tx summary
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

🐰 I hopped through blocks and bytes tonight,

Spun P95 tails into penalty light,
I counted blobs and compressed the race,
Scalars solved with gentle grace —
A tiny rabbit cheers your data's flight. 🥕

Pre-merge checks and finishing touches

❌ Failed checks (1 inconclusive)
Check name Status Explanation Resolution
Description check ❓ Inconclusive The PR description provides summary and changes but lacks required sections from the template such as explicit 'Purpose/design rationale', 'Breaking change label', and 'Deployment tag versioning' checkboxes. Complete the PR description by filling in the template sections: add Purpose/design rationale, select deployment tag versioning option, and specify if there are breaking changes.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The PR title 'feat: add Galileo gas parameter derivation tooling' follows conventional commits format and clearly describes the main change: adding tooling for deriving Galileo gas parameters.
Docstring Coverage ✅ Passed Docstring coverage is 96.15% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (6)
scripts/derive_galileo_gas_parameter/Pipfile (1)

6-13: Consider pinning dependency versions for reproducibility.

Using wildcard "*" for most dependencies can lead to non-reproducible builds and potential breaking changes when new versions are released. For production tooling, consider pinning to specific versions or at least setting upper bounds.

🔎 Suggested improvement
 [packages]
 web3 = "<7,>=6"
-pandas = "*"
-numpy = "*"
-rlp = "*"
-zstandard = "*"
-requests = "*"
-async_timeout = "*"
+pandas = ">=2.0,<3"
+numpy = ">=1.24,<2"
+rlp = ">=3.0,<4"
+zstandard = ">=0.21,<1"
+requests = ">=2.28,<3"
+async_timeout = ">=4.0,<5"

Alternatively, run pipenv lock to generate a Pipfile.lock that captures exact versions for reproducibility.

scripts/derive_galileo_gas_parameter/show_batches.py (2)

31-31: Remove extraneous f-prefix from string literal.

This f-string has no placeholders.

🔎 Proposed fix
-        print(f"\nBatch IDs:")
+        print("\nBatch IDs:")

72-73: Consider logging the exception type for better debugging.

While catching broad Exception is acceptable for a CLI utility, including the exception type would help with debugging.

🔎 Proposed fix
     except Exception as e:
-        print(f"Error reading file: {e}")
+        print(f"Error reading file: {type(e).__name__}: {e}")
scripts/derive_galileo_gas_parameter/derive_galileo_gas_parameter.py (3)

373-374: Global mutable state for caching.

Using a global dictionary for caching works for single-threaded execution but could cause issues if this module is ever used in a multi-threaded context. Consider using a class-based approach or passing the cache explicitly if thread-safety becomes a concern.


426-426: Unused variable version.

This variable is assigned but never used. Consider either removing it or using it for validation (e.g., asserting expected version).

🔎 Proposed fix
-    version = int(batch_data[0])
+    version = int(batch_data[0])  # Currently unused, but parsed from batch header
+    # TODO: Add version validation if needed

Or remove entirely if not needed:

-    version = int(batch_data[0])
+    # batch_data[0] contains version byte (unused)

1107-1112: DataFrame mutation side effect.

This function modifies tx_df directly by adding new columns. Since tx_df is passed in and also used later in analyze_results (which also modifies it), this side effect is likely intentional. However, consider either:

  1. Documenting this side effect in the docstring
  2. Using tx_df = tx_df.copy() if you want to avoid mutation
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7de388e and 8fc6d65.

⛔ Files ignored due to path filters (1)
  • scripts/derive_galileo_gas_parameter/Pipfile.lock is excluded by !**/*.lock
📒 Files selected for processing (4)
  • scripts/derive_galileo_gas_parameter/Pipfile
  • scripts/derive_galileo_gas_parameter/claude.md
  • scripts/derive_galileo_gas_parameter/derive_galileo_gas_parameter.py
  • scripts/derive_galileo_gas_parameter/show_batches.py
🧰 Additional context used
🪛 Ruff (0.14.10)
scripts/derive_galileo_gas_parameter/show_batches.py

19-19: pickle and modules that wrap it can be unsafe when used to deserialize untrusted data, possible security issue

(S301)


31-31: f-string without any placeholders

Remove extraneous f prefix

(F541)


72-72: Do not catch blind exception: Exception

(BLE001)

scripts/derive_galileo_gas_parameter/derive_galileo_gas_parameter.py

38-38: Avoid specifying long messages outside the exception class

(TRY003)


40-40: Avoid specifying long messages outside the exception class

(TRY003)


146-146: Probable use of requests call without timeout

(S113)


186-186: Avoid specifying long messages outside the exception class

(TRY003)


331-331: Avoid specifying long messages outside the exception class

(TRY003)


398-398: Do not catch blind exception: Exception

(BLE001)


426-426: Local variable version is assigned to but never used

Remove assignment to unused variable version

(F841)


476-476: f-string without any placeholders

Remove extraneous f prefix

(F541)


518-518: Avoid specifying long messages outside the exception class

(TRY003)


522-522: f-string without any placeholders

Remove extraneous f prefix

(F541)


566-566: f-string without any placeholders

Remove extraneous f prefix

(F541)


607-607: f-string without any placeholders

Remove extraneous f prefix

(F541)


678-678: f-string without any placeholders

Remove extraneous f prefix

(F541)


744-744: f-string without any placeholders

Remove extraneous f prefix

(F541)


795-795: f-string without any placeholders

Remove extraneous f prefix

(F541)


809-809: f-string without any placeholders

Remove extraneous f prefix

(F541)


846-846: f-string without any placeholders

Remove extraneous f prefix

(F541)


884-884: f-string without any placeholders

Remove extraneous f prefix

(F541)


948-948: f-string without any placeholders

Remove extraneous f prefix

(F541)


959-959: f-string without any placeholders

Remove extraneous f prefix

(F541)


968-968: f-string without any placeholders

Remove extraneous f prefix

(F541)


1022-1022: Avoid specifying long messages outside the exception class

(TRY003)


1025-1025: pickle and modules that wrap it can be unsafe when used to deserialize untrusted data, possible security issue

(S301)


1067-1067: f-string without any placeholders

Remove extraneous f prefix

(F541)


1096-1096: Docstring contains ambiguous × (MULTIPLICATION SIGN). Did you mean x (LATIN SMALL LETTER X)?

(RUF002)


1097-1097: Docstring contains ambiguous × (MULTIPLICATION SIGN). Did you mean x (LATIN SMALL LETTER X)?

(RUF002)


1127-1127: f-string without any placeholders

Remove extraneous f prefix

(F541)


1168-1168: f-string without any placeholders

Remove extraneous f prefix

(F541)


1170-1170: String contains ambiguous × (MULTIPLICATION SIGN). Did you mean x (LATIN SMALL LETTER X)?

(RUF001)


1173-1173: f-string without any placeholders

Remove extraneous f prefix

(F541)


1175-1175: String contains ambiguous × (MULTIPLICATION SIGN). Did you mean x (LATIN SMALL LETTER X)?

(RUF001)


1228-1228: f-string without any placeholders

Remove extraneous f prefix

(F541)


1240-1240: f-string without any placeholders

Remove extraneous f prefix

(F541)


1297-1297: f-string without any placeholders

Remove extraneous f prefix

(F541)


1302-1302: f-string without any placeholders

Remove extraneous f prefix

(F541)


1307-1307: f-string without any placeholders

Remove extraneous f prefix

(F541)


1330-1330: f-string without any placeholders

Remove extraneous f prefix

(F541)


1335-1335: f-string without any placeholders

Remove extraneous f prefix

(F541)


1340-1340: f-string without any placeholders

Remove extraneous f prefix

(F541)


1409-1409: Avoid specifying long messages outside the exception class

(TRY003)


1440-1440: f-string without any placeholders

Remove extraneous f prefix

(F541)


1444-1444: f-string without any placeholders

Remove extraneous f prefix

(F541)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: tests
🔇 Additional comments (5)
scripts/derive_galileo_gas_parameter/claude.md (1)

1-70: Documentation looks good overall.

The documentation clearly explains the CLI options, modes, and monitoring approach. A couple of minor suggestions:

  1. The hardcoded /tmp/ path won't work on Windows - consider noting this is Linux/macOS specific or using a cross-platform alternative.
  2. Consider renaming to CLAUDE.md (uppercase) or README.md for better visibility in the directory listing.
scripts/derive_galileo_gas_parameter/show_batches.py (1)

17-19: Pickle deserialization - acceptable for local tooling.

The static analysis flags pickle.load() as a potential security issue (S301). Since this script only loads locally-generated pickle files from the companion derive_galileo_gas_parameter.py script, this is acceptable. However, adding a brief comment noting this assumption would be helpful for future maintainers.

scripts/derive_galileo_gas_parameter/derive_galileo_gas_parameter.py (3)

740-741: Verify block range calculation - potential off-by-one issue.

The last_block is calculated as max(initial_L2_block_number + num_blocks), and then range(first_block, last_block) is used. Since Python's range() excludes the end value, the last block in the range won't be fetched.

If num_blocks represents the count of blocks starting from initial_L2_block_number, the calculation should be:

  • Last block index = initial_L2_block_number + num_blocks - 1
  • Range should be range(first_block, last_block + 1) to include it

Please verify the intended behavior:

  1. Does num_blocks include or exclude the initial block?
  2. Should the last block be fetched/processed?

Also applies to: 775-775, 813-813


198-331: Well-structured transaction serialization with good type coverage.

The get_raw_transaction_from_structured function handles multiple transaction types comprehensively (Legacy, EIP-2930, EIP-1559, L1 message, and EIP-7702). The error handling for unsupported types is appropriate.


1362-1458: Well-structured main function with clear execution flow.

The main function provides a clear step-by-step execution flow with good progress reporting. The argument handling is appropriate, and returning a results dictionary enables programmatic use of the script.

Comment on lines +144 to +147
url = f"{mainnet_beacon_url}/eth/v1/beacon/headers/head"
headers = {'accept': 'application/json'}
response = requests.get(url, headers=headers)
return int(response.json()['data']['header']['message']['slot']) - (latest_block_number - l1_head)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Add timeout to prevent indefinite hangs.

This requests.get() call lacks a timeout, which could cause the script to hang indefinitely if the beacon API is unresponsive.

🔎 Proposed fix
     url = f"{mainnet_beacon_url}/eth/v1/beacon/headers/head"
     headers = {'accept': 'application/json'}
-    response = requests.get(url, headers=headers)
+    response = requests.get(url, headers=headers, timeout=10)
     return int(response.json()['data']['header']['message']['slot']) - (latest_block_number - l1_head)
🧰 Tools
🪛 Ruff (0.14.10)

146-146: Probable use of requests call without timeout

(S113)

🤖 Prompt for AI Agents
In scripts/derive_galileo_gas_parameter/derive_galileo_gas_parameter.py around
lines 144 to 147, the requests.get call has no timeout and can hang
indefinitely; modify the call to include a reasonable timeout (e.g., timeout=10)
and handle potential requests.exceptions.Timeout and other RequestException
errors by raising or returning a clear error so the script doesn't block
forever.

Comment on lines +800 to +806
while l1_base_fee == 0:
block = scroll_w3.eth.get_block(cur_block_num, full_transactions=True)
for tx in reversed(block['transactions']):
if tx.to == '0x5300000000000000000000000000000000000002' and tx.input.hex()[:10] == '0x39455d3a':
l1_base_fee = int.from_bytes(bytes.fromhex(tx.input.hex()[2:])[-64:-32], 'big')
l1_blob_base_fee = int.from_bytes(bytes.fromhex(tx.input.hex()[2:])[-32:], 'big')
cur_block_num -= 1
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Potential infinite loop or crash if L1 fee oracle transaction not found.

This while loop decrements cur_block_num without a bound check. If the fee oracle transaction is never found in the blocks being searched, this will either loop indefinitely or eventually crash when trying to fetch block 0 or negative block numbers.

🔎 Proposed fix
+    max_blocks_to_search = 1000  # Safety limit
+    blocks_searched = 0
     while l1_base_fee == 0:
+        if blocks_searched >= max_blocks_to_search:
+            raise RuntimeError(f"Could not find L1 fee oracle transaction in {max_blocks_to_search} blocks starting from {first_block}")
         block = scroll_w3.eth.get_block(cur_block_num, full_transactions=True)
         for tx in reversed(block['transactions']):
             if tx.to == '0x5300000000000000000000000000000000000002' and tx.input.hex()[:10] == '0x39455d3a':
                 l1_base_fee = int.from_bytes(bytes.fromhex(tx.input.hex()[2:])[-64:-32], 'big')
                 l1_blob_base_fee = int.from_bytes(bytes.fromhex(tx.input.hex()[2:])[-32:], 'big')
         cur_block_num -= 1
+        blocks_searched += 1
🤖 Prompt for AI Agents
In scripts/derive_galileo_gas_parameter/derive_galileo_gas_parameter.py around
lines 800 to 806, the while loop that decrements cur_block_num can run forever
or underflow if the target L1 fee oracle transaction is never found; add a
bounded search by introducing a minimum block number or max_lookback counter and
check it each iteration, bail out when reached (raise a clear exception or log
and exit) and wrap the block fetch in try/except to handle out-of-range
requests; optionally add a configurable parameter (e.g., MAX_LOOKBACK or
min_block_num) and use it to stop the loop gracefully instead of looping until
negative block numbers.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Fix all issues with AI Agents 🤖
In @scripts/derive_galileo_gas_parameter/derive_galileo_gas_parameter.py:
- Around line 123-141: The latest_finalized_event_block function can loop
indefinitely or compute negative block ranges; add safety bounds by enforcing a
minimum from_block of 0 and a maximum iteration/count limit (e.g., max_attempts)
inside the while loop and exit gracefully if exceeded. Update the loop to
decrement from_block but clamp it to >= 0, increment an attempt counter each
iteration, and if attempts exceed the limit or from_block reaches 0 with no
events, return None or raise a clear exception so callers of
latest_finalized_event_block can handle the missing FinalizeBatch event.
- Around line 393-416: The loop using found_the_blob and cur_slot can run
forever if blob_hash is never found; add a maximum search depth (e.g.,
max_search_slots or max_iterations) and decrement cur_slot only while count <
max; if the limit is reached, stop the loop and raise an explicit error or
return a clear sentinel instead of looping forever. Update the loop around
found_the_blob/cur_slot (and the return path that uses indexed_blob[blob_hash])
to enforce the bound and handle the "not found within max" case by raising an
exception or returning (None, None, None) so callers don't get a KeyError.
🧹 Nitpick comments (4)
scripts/derive_galileo_gas_parameter/derive_galileo_gas_parameter.py (4)

429-432: Remove unused version variable.

The version variable is assigned but never used. Either remove it or add a comment explaining why it's extracted if it's for documentation purposes.

🔎 Proposed fix
-    version = int(batch_data[0])
+    _ = int(batch_data[0])  # version byte (unused)
     payload_N = int.from_bytes(batch_data[1:4], 'big')

1027-1028: Consider pickle security for untrusted data scenarios.

Using pickle.load() is fine for locally-generated cache files, but be aware that pickle can execute arbitrary code during deserialization. If this tool might ever load files from external sources, consider adding integrity checks or using a safer serialization format like JSON for the metadata.


953-959: Consider pre-building block-to-batch index for better performance.

The current implementation iterates over batch_df for every transaction, resulting in O(n×m) complexity. For larger datasets, pre-building a block-to-batch mapping would improve performance.

🔎 Proposed optimization
# Pre-build block to batch mapping
block_to_batch_map = {}
for idx, row in batch_df.iterrows():
    start = row['initial_L2_block_number']
    end = start + row['num_blocks']
    for block_num in range(start, end):
        block_to_batch_map[block_num] = idx

# Use O(1) lookup
tx_df['batch_index'] = tx_df['block_number'].map(block_to_batch_map)

758-770: Consider more specific exception handling.

The rate-limit detection relies on string matching ('429' in str(e)), which could be fragile. For more robust handling, consider catching requests.exceptions.HTTPError and checking response.status_code, or using the web3 library's specific exception types.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8fc6d65 and c799808.

📒 Files selected for processing (4)
  • scripts/derive_galileo_gas_parameter/.env.example
  • scripts/derive_galileo_gas_parameter/.gitignore
  • scripts/derive_galileo_gas_parameter/README.md
  • scripts/derive_galileo_gas_parameter/derive_galileo_gas_parameter.py
✅ Files skipped from review due to trivial changes (1)
  • scripts/derive_galileo_gas_parameter/.gitignore
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-11-27T18:50:44.578Z
Learnt from: Thegaram
Repo: scroll-tech/scroll PR: 1769
File: rollup/internal/config/relayer.go:112-114
Timestamp: 2025-11-27T18:50:44.578Z
Learning: In `rollup/internal/config/relayer.go`, the fields `L1BaseFeeLimit` and `L1BlobBaseFeeLimit` in `GasOracleConfig` should never be set to 0. Zero values would break the gas oracle fee enforcement logic in `l1_relayer.go` by capping all fees to 0.

Applied to files:

  • scripts/derive_galileo_gas_parameter/derive_galileo_gas_parameter.py
🪛 dotenv-linter (4.0.0)
scripts/derive_galileo_gas_parameter/.env.example

[warning] 2-2: [UnorderedKey] The MAINNET_URL key should go before the SCROLL_URL key

(UnorderedKey)


[warning] 3-3: [UnorderedKey] The BEACON_URL key should go before the MAINNET_URL key

(UnorderedKey)

🪛 Ruff (0.14.10)
scripts/derive_galileo_gas_parameter/derive_galileo_gas_parameter.py

39-39: Avoid specifying long messages outside the exception class

(TRY003)


41-41: Avoid specifying long messages outside the exception class

(TRY003)


43-43: Avoid specifying long messages outside the exception class

(TRY003)


149-149: Probable use of requests call without timeout

(S113)


189-189: Avoid specifying long messages outside the exception class

(TRY003)


334-334: Avoid specifying long messages outside the exception class

(TRY003)


401-401: Do not catch blind exception: Exception

(BLE001)


429-429: Local variable version is assigned to but never used

Remove assignment to unused variable version

(F841)


479-479: f-string without any placeholders

Remove extraneous f prefix

(F541)


521-521: Avoid specifying long messages outside the exception class

(TRY003)


525-525: f-string without any placeholders

Remove extraneous f prefix

(F541)


569-569: f-string without any placeholders

Remove extraneous f prefix

(F541)


610-610: f-string without any placeholders

Remove extraneous f prefix

(F541)


681-681: f-string without any placeholders

Remove extraneous f prefix

(F541)


747-747: f-string without any placeholders

Remove extraneous f prefix

(F541)


798-798: f-string without any placeholders

Remove extraneous f prefix

(F541)


812-812: f-string without any placeholders

Remove extraneous f prefix

(F541)


849-849: f-string without any placeholders

Remove extraneous f prefix

(F541)


887-887: f-string without any placeholders

Remove extraneous f prefix

(F541)


951-951: f-string without any placeholders

Remove extraneous f prefix

(F541)


962-962: f-string without any placeholders

Remove extraneous f prefix

(F541)


971-971: f-string without any placeholders

Remove extraneous f prefix

(F541)


1025-1025: Avoid specifying long messages outside the exception class

(TRY003)


1028-1028: pickle and modules that wrap it can be unsafe when used to deserialize untrusted data, possible security issue

(S301)


1070-1070: f-string without any placeholders

Remove extraneous f prefix

(F541)


1099-1099: Docstring contains ambiguous × (MULTIPLICATION SIGN). Did you mean x (LATIN SMALL LETTER X)?

(RUF002)


1100-1100: Docstring contains ambiguous × (MULTIPLICATION SIGN). Did you mean x (LATIN SMALL LETTER X)?

(RUF002)


1130-1130: f-string without any placeholders

Remove extraneous f prefix

(F541)


1171-1171: f-string without any placeholders

Remove extraneous f prefix

(F541)


1173-1173: String contains ambiguous × (MULTIPLICATION SIGN). Did you mean x (LATIN SMALL LETTER X)?

(RUF001)


1176-1176: f-string without any placeholders

Remove extraneous f prefix

(F541)


1178-1178: String contains ambiguous × (MULTIPLICATION SIGN). Did you mean x (LATIN SMALL LETTER X)?

(RUF001)


1231-1231: f-string without any placeholders

Remove extraneous f prefix

(F541)


1243-1243: f-string without any placeholders

Remove extraneous f prefix

(F541)


1300-1300: f-string without any placeholders

Remove extraneous f prefix

(F541)


1305-1305: f-string without any placeholders

Remove extraneous f prefix

(F541)


1310-1310: f-string without any placeholders

Remove extraneous f prefix

(F541)


1333-1333: f-string without any placeholders

Remove extraneous f prefix

(F541)


1338-1338: f-string without any placeholders

Remove extraneous f prefix

(F541)


1343-1343: f-string without any placeholders

Remove extraneous f prefix

(F541)


1412-1412: Avoid specifying long messages outside the exception class

(TRY003)


1443-1443: f-string without any placeholders

Remove extraneous f prefix

(F541)


1447-1447: f-string without any placeholders

Remove extraneous f prefix

(F541)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: tests
🔇 Additional comments (4)
scripts/derive_galileo_gas_parameter/README.md (1)

1-54: LGTM!

Clear and comprehensive documentation covering prerequisites, installation, and usage examples for both collect and load modes. The CLI options are well-documented with sensible defaults.

scripts/derive_galileo_gas_parameter/.env.example (1)

1-3: LGTM!

Environment variable examples are clear and consistent with the script's requirements. The placeholder YOUR_API_KEY pattern appropriately indicates where users need to substitute their own credentials.

scripts/derive_galileo_gas_parameter/derive_galileo_gas_parameter.py (2)

32-43: LGTM!

Good practice validating required environment variables at startup. The validation provides clear error messages for missing configuration.


1365-1461: LGTM!

Well-structured main function with clear workflow orchestration. The argument parsing is comprehensive and the step-by-step execution with progress output provides good user feedback.

Comment on lines +123 to +141
def latest_finalized_event_block(width=5):
"""Find the latest L1 block with a FinalizeBatch event"""
finalized_l1_head = -1

to_block = w3.eth.block_number
from_block = to_block - width

while finalized_l1_head == -1:
event_filter = rollup_contract.events.FinalizeBatch.create_filter(fromBlock=from_block, toBlock=to_block)
events = event_filter.get_all_entries()

if len(events) > 0:
finalized_l1_head = events[-1]['blockNumber']
break

to_block = from_block - 1
from_block = from_block - width - 1

return finalized_l1_head
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Add safety bounds to prevent indefinite looping.

The while loop at line 130 could loop indefinitely or access invalid block numbers if no FinalizeBatch events are ever found. Consider adding a minimum block number check or maximum iteration count.

🔎 Proposed fix
 def latest_finalized_event_block(width=5):
     """Find the latest L1 block with a FinalizeBatch event"""
     finalized_l1_head = -1
 
     to_block = w3.eth.block_number
     from_block = to_block - width
+    max_iterations = 1000  # Safety limit
+    iterations = 0
 
     while finalized_l1_head == -1:
+        if iterations >= max_iterations or from_block < 0:
+            raise RuntimeError(f"Could not find FinalizeBatch event within {iterations} iterations")
         event_filter = rollup_contract.events.FinalizeBatch.create_filter(fromBlock=from_block, toBlock=to_block)
         events = event_filter.get_all_entries()
 
         if len(events) > 0:
             finalized_l1_head = events[-1]['blockNumber']
             break
 
         to_block = from_block - 1
         from_block = from_block - width - 1
+        iterations += 1
 
     return finalized_l1_head
🤖 Prompt for AI Agents
In @scripts/derive_galileo_gas_parameter/derive_galileo_gas_parameter.py around
lines 123-141, The latest_finalized_event_block function can loop indefinitely
or compute negative block ranges; add safety bounds by enforcing a minimum
from_block of 0 and a maximum iteration/count limit (e.g., max_attempts) inside
the while loop and exit gracefully if exceeded. Update the loop to decrement
from_block but clamp it to >= 0, increment an attempt counter each iteration,
and if attempts exceed the limit or from_block reaches 0 with no events, return
None or raise a clear exception so callers of latest_finalized_event_block can
handle the missing FinalizeBatch event.

Comment on lines +393 to +416
found_the_blob = False
while not found_the_blob:
cur_slot -= 1
url = f"{mainnet_beacon_url}/eth/v1/beacon/blob_sidecars/{cur_slot}"

headers = {'accept': 'application/json'}
try:
response = requests.get(url, headers=headers, timeout=5)
except Exception as e:
print(f"[warn] request error at slot {cur_slot}: {e}")
continue

if response.status_code != 200:
print(f"[warn] non-200 from beacon at slot {cur_slot}: {response.status_code}")
continue

if 'data' in response.json().keys():
for blob in response.json()['data']:
hash = kzg_to_versioned_hash(blob['kzg_commitment'])
if blob_hash == hash:
found_the_blob = True
indexed_blob[hash] = blob['blob']

return indexed_blob[blob_hash], cur_slot, cur_block
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Add maximum slot search depth to prevent infinite loop.

The while not found_the_blob loop at line 394 could loop indefinitely if the blob hash is never found. Even though individual requests have timeouts, the outer loop has no termination bound.

🔎 Proposed fix
     found_the_blob = False
+    max_slots_to_search = 500  # Safety limit
+    slots_searched = 0
     while not found_the_blob:
+        if slots_searched >= max_slots_to_search:
+            raise RuntimeError(f"Could not find blob {blob_hash} within {max_slots_to_search} slots")
         cur_slot -= 1
+        slots_searched += 1
         url = f"{mainnet_beacon_url}/eth/v1/beacon/blob_sidecars/{cur_slot}"
🧰 Tools
🪛 Ruff (0.14.10)

401-401: Do not catch blind exception: Exception

(BLE001)

🤖 Prompt for AI Agents
In @scripts/derive_galileo_gas_parameter/derive_galileo_gas_parameter.py around
lines 393-416, The loop using found_the_blob and cur_slot can run forever if
blob_hash is never found; add a maximum search depth (e.g., max_search_slots or
max_iterations) and decrement cur_slot only while count < max; if the limit is
reached, stop the loop and raise an explicit error or return a clear sentinel
instead of looping forever. Update the loop around found_the_blob/cur_slot (and
the return path that uses indexed_blob[blob_hash]) to enforce the bound and
handle the "not found within max" case by raising an exception or returning
(None, None, None) so callers don't get a KeyError.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants