Add per-request cost support (obj_cost) and cost-miss reporting #309

Merged
1a1a11a merged 2 commits into develop from pr308 on Mar 15, 2026

Conversation

@1a1a11a
Owner

@1a1a11a 1a1a11a commented Mar 15, 2026

Motivation

  • Provide first-class per-request "cost" in traces so simulations can report cost-weighted miss ratios in addition to object counts and bytes.
  • Allow traces to carry a dedicated cost column (CSV/binary) without breaking existing traces that only supply object size.

Description

  • Added obj_cost to request_t with default initialization to 1 and included cost in print_request debug output.
  • Extended reader_init_param_t with obj_cost_field and added parsing of obj-cost-col / cost-col in parse_reader_params to configure a cost column from CSV parameters.
  • CSV and binary readers now parse an optional cost field (obj_cost) when present and wire the cost-field index/offset through readerInternal structures; when not configured the reader sets req->obj_cost = req->obj_size as a backward-compatible fallback.
  • read_one_req initializes obj_cost and ensures ignore_obj_size behavior still applies; verbose logging now prints size and cost for each read request.
  • cachesim simulation logic collects req_cost and miss_cost, computes cost_miss_ratio, and prints it alongside request miss ratio and byte miss ratio in the output.
  • Updated README.md examples and plotting script invocation to demonstrate obj-cost-col usage.
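
The accumulation described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the struct `sim_totals_t` and the functions `record_request` and `cost_miss_ratio` are hypothetical names, and only the `req_cost` / `miss_cost` counters are taken from the description.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical accumulator. The counter names mirror the PR description
 * (req_cost, miss_cost); the struct itself is only an illustration. */
typedef struct {
  int64_t req_cnt, miss_cnt;   /* request counts */
  int64_t req_byte, miss_byte; /* byte totals */
  double req_cost, miss_cost;  /* cost totals */
} sim_totals_t;

/* Record one request: its size, its cost, and whether it hit the cache. */
void record_request(sim_totals_t *t, int64_t obj_size, int64_t obj_cost,
                    bool hit) {
  t->req_cnt += 1;
  t->req_byte += obj_size;
  t->req_cost += (double)obj_cost;
  if (!hit) {
    t->miss_cnt += 1;
    t->miss_byte += obj_size;
    t->miss_cost += (double)obj_cost;
  }
}

/* Cost-weighted miss ratio: missed cost divided by requested cost.
 * Returns 0.0 when no positive cost was seen, to avoid dividing by zero. */
double cost_miss_ratio(const sim_totals_t *t) {
  return t->req_cost > 0 ? t->miss_cost / t->req_cost : 0.0;
}
```

For example, one miss of cost 10 followed by one hit of cost 30 gives a cost miss ratio of 10 / 40 = 0.25, even though the request miss ratio is 0.5.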

Testing

  • Attempted an automated CMake configuration (cmake -S . -B build) which failed due to a missing system dependency glib-2.0, so no full build/test ran in this environment (failure is environmental, not code-related).
  • No unit or integration tests were executed in the workspace; changes were validated via static edits and local file diffs and committed.

Codex Task

@1a1a11a 1a1a11a requested a review from haochengxia as a code owner March 15, 2026 00:50
Copilot AI review requested due to automatic review settings March 15, 2026 00:50
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the cache simulation framework by integrating object cost as a first-class metric. Previously, the simulator focused on request and byte miss ratios. With this change, users can now define a cost associated with each object, allowing the simulation to calculate and report a "cost saving ratio." This provides a more nuanced understanding of cache performance, especially in scenarios where different objects have varying economic or computational costs.

Highlights

  • Object Cost Integration: Introduced a new 'object cost' metric throughout the cache simulation framework, allowing for more nuanced performance analysis beyond traditional miss ratios.
  • New Metrics in Output: The simulation output now includes a 'cost saving ratio' which quantifies the effectiveness of the cache in reducing the total cost of missed objects.
  • Trace File Support: Updated both binary and CSV trace readers to support specifying and parsing an 'object cost' field from input trace files.
  • Simulation Logic Update: Modified the core simulation logic to track and accumulate total requested object cost and total missed object cost.
  • CLI Parameter: Added command-line parameter support (obj-cost-col or cost-col) for users to specify the column containing object cost in trace files.
Changelog
  • libCacheSim/bin/cachesim/main.c
    • Included the <math.h> header for floating-point operations.
    • Modified the output loop to calculate and display miss_ratio, cost_saving_ratio, and conditionally include byte miss ratio and cost saving ratio based on flags.
  • libCacheSim/bin/cachesim/sim.c
    • Declared new variables req_cost and miss_cost to track object costs.
    • Updated the request processing loop to accumulate req_cost and miss_cost.
    • Refactored the final output string generation to calculate and display miss_ratio, byte_miss_ratio, and cost_saving_ratio.
  • libCacheSim/bin/cli_reader_utils.c
    • Added logic to parse obj-cost-col or cost-col command-line parameters and store the column index in params->obj_cost_field.
  • libCacheSim/include/libCacheSim/cache.h
    • Extended the cache_stat_t structure with n_req_cost (double) and n_miss_cost (double) to track total requested and missed object costs.
  • libCacheSim/include/libCacheSim/reader.h
    • Added obj_cost_field (int32_t) to the reader_init_param_t structure.
  • libCacheSim/include/libCacheSim/request.h
    • Added an obj_cost (int64_t) field to the request_t structure.
    • Initialized req->obj_cost = 1 in the new_request function.
    • Updated the print_request function to include obj_cost in debug logging.
  • libCacheSim/profiler/simulator.c
    • Updated the _simulate function to accumulate req->obj_cost into result[idx].n_req_cost and result[idx].n_miss_cost during simulation.
  • libCacheSim/traceReader/generalReader/binary.c
    • Added setup logic for obj_cost_field_idx, obj_cost_format, and obj_cost_offset based on reader parameters.
    • Included obj_cost_field details in debug output.
    • Implemented reading obj_cost from binary trace data.
  • libCacheSim/traceReader/generalReader/csv.c
    • Added logic to parse obj_cost from CSV fields using strtoll.
    • Updated csv_setup_reader to initialize obj_cost_field_idx from init_params.
  • libCacheSim/traceReader/reader.c
    • Initialized req->obj_cost = 1 when a new request is read.
    • Set req->obj_cost = 1 if obj_cost_field is not specified in reader parameters.
    • Updated verbose logging to include obj_cost in the request details.
  • libCacheSim/traceReader/readerInternal.h
    • Added obj_cost_field_idx to csv_reader_params_t and binary_reader_params_t structures.
    • Added obj_cost_offset, obj_cost_field_idx, and obj_cost_format to the binary_reader_params_t structure.
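
The changelog mentions that the CSV reader parses obj_cost with strtoll. A defensive version of that step might look like the sketch below. The helper name `parse_cost_field` is hypothetical, and rejecting negative costs is an assumption made for illustration; the PR does not say how malformed or negative values are handled.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: parse one CSV field as a non-negative cost.
 * Returns true and stores the value in *out on success; leaves *out
 * untouched on failure. */
bool parse_cost_field(const char *field, int64_t *out) {
  if (field == NULL || *field == '\0') return false;
  char *end = NULL;
  errno = 0;
  long long v = strtoll(field, &end, 10);
  if (errno == ERANGE) return false;              /* value out of range */
  if (end == field || *end != '\0') return false; /* no digits, or trailing junk */
  if (v < 0) return false; /* assumption: costs are non-negative */
  *out = (int64_t)v;
  return true;
}
```

A caller could fall back to the reader's default cost whenever this returns false, rather than silently storing 0 for an unparsable field.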
Activity
  • No human activity has been recorded for this pull request yet.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request effectively integrates object cost support into the cache simulator, enhancing its analytical capabilities. The changes are well-structured, spanning from trace parsing to final statistics reporting. The updated output formatting is more flexible and informative. I've identified one potential issue regarding a division by zero that could affect the program's stability, for which I've provided a suggestion.

@1a1a11a 1a1a11a changed the title support object cost Add per-request cost support (obj_cost) and cost-miss reporting Mar 15, 2026
Contributor

Copilot AI left a comment

Pull request overview

This PR adds end-to-end support for an “object cost” attribute in traces/requests, and surfaces cost-based metrics in cachesim outputs.

Changes:

  • Extend request_t and reader init params to carry an obj_cost field (defaulting to 1 when not provided).
  • Add CSV and binary trace parsing support for an object-cost column/field.
  • Accumulate and report cost-based statistics (e.g., cost saving ratio) in simulators/CLI output.

Reviewed changes

Copilot reviewed 8 out of 11 changed files in this pull request and generated 2 comments.

| File | Description |
|---|---|
| libCacheSim/traceReader/readerInternal.h | Adds internal CSV/binary reader fields for object cost parsing. |
| libCacheSim/traceReader/reader.c | Initializes default obj_cost and includes cost in verbose read logging. |
| libCacheSim/traceReader/generalReader/csv.c | Parses obj_cost from a configured CSV column. |
| libCacheSim/traceReader/generalReader/binary.c | Parses obj_cost from a configured binary field. |
| libCacheSim/profiler/simulator.c | Accumulates request/miss cost totals into results. |
| libCacheSim/include/libCacheSim/request.h | Adds obj_cost to request_t, default init, and logging output. |
| libCacheSim/include/libCacheSim/reader.h | Adds obj_cost_field to reader init parameters. |
| libCacheSim/include/libCacheSim/cache.h | Extends cache_stat_t with cost totals. |
| libCacheSim/bin/cli_reader_utils.c | Adds CLI parsing for obj-cost-col / cost-col. |
| libCacheSim/bin/cachesim/sim.c | Adds cost tracking and optional cost-saving ratio output. |
| libCacheSim/bin/cachesim/main.c | Adds cost-saving ratio output in multi-cache summary. |

Comment on lines +124 to +133
+ "req clock_time %lu, id %llu, size %ld, cost %ld, ttl %ld, op %s, "
  "valid %d\n",
  (unsigned long)req->clock_time, (unsigned long long)req->obj_id,
- (long)req->obj_size, (long)req->ttl, req_op_str[req->op], req->valid);
+ (long)req->obj_size, (long)req->obj_cost, (long)req->ttl,
+ req_op_str[req->op], req->valid);
  #else
  LOGGING(DEBUG_LEVEL,
- "req clock_time %lu, id %llu, size %ld, op %s, valid %d\n",
+ "req clock_time %lu, id %llu, size %ld, cost %ld, op %s, valid %d\n",
  (unsigned long)req->clock_time, (unsigned long long)req->obj_id,
- (long)req->obj_size, req_op_str[req->op], req->valid);
+ (long)req->obj_size, (long)req->obj_cost, req_op_str[req->op],
Comment on lines +342 to +345
VERBOSE(
"read one req: time %lu, obj_id %lu, size %lu, cost %lu at offset %zu\n",
req->clock_time, req->obj_id, req->obj_size, req->obj_cost,
offset_before_read);
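
The VERBOSE call above passes req->obj_cost, which the changelog lists as int64_t, to a %lu conversion without a cast. The review comment text is not part of this capture, so the following is only a sketch of one portable way to print such fields, using the `<inttypes.h>` macros. The function name `format_size_cost` is hypothetical, and treating obj_size as int64_t is an assumption.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Format size and cost into buf with PRId64, so the conversion matches
 * int64_t regardless of how wide long is on the platform.
 * Returns the value returned by snprintf. */
int format_size_cost(char *buf, size_t buflen, int64_t obj_size,
                     int64_t obj_cost) {
  return snprintf(buf, buflen, "size %" PRId64 ", cost %" PRId64, obj_size,
                  obj_cost);
}
```

Casting each argument to a fixed type, as the print_request hunk does with (long), is the other common approach; the macros avoid truncation on platforms where long is 32 bits.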

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6f99bb404e

Comment on lines +85 to +86
n += snprintf(
output_str + n, sizeof(output_str) - n, ", byte miss ratio %.4lf",

P1: Guard append offset after formatted prefix truncates

If the first snprintf result in n is larger than output_str (for example with a long trace path or cache name), output_str + n points past the buffer and sizeof(output_str) - n wraps to a huge size_t; the subsequent append call can write out of bounds. This turns long-but-valid CLI inputs into a potential crash/corruption path in summary output generation.
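
One way to guard the append is to route every write through a helper that clamps the offset. The sketch below is an illustration under assumptions, not the fix applied in the PR: `append_fmt` is a hypothetical name, and the caller is assumed to track the offset as a size_t.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: append formatted text to buf at offset *n.
 * *n is kept below cap, so buf + *n stays inside the buffer and
 * cap - *n never wraps. Returns 0 on success, -1 on truncation or error. */
int append_fmt(char *buf, size_t cap, size_t *n, const char *fmt, ...) {
  if (cap == 0 || *n >= cap) return -1;
  va_list ap;
  va_start(ap, fmt);
  int w = vsnprintf(buf + *n, cap - *n, fmt, ap);
  va_end(ap);
  if (w < 0) return -1; /* encoding or output error */
  if ((size_t)w >= cap - *n) {
    *n = cap - 1; /* output was truncated; point at the terminating NUL */
    return -1;
  }
  *n += (size_t)w;
  return 0;
}
```

The key point is that snprintf returns the length the output would have had, not the number of bytes written, so adding its return value to the offset unchecked can move the offset past the end of the buffer.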

Comment on lines +101 to +102
n += snprintf(output_str + n, sizeof(output_str) - n,
", byte miss ratio %.4lf", byte_miss_ratio);

P1: Guard append offset in single-cache output formatter

This append uses output_str + n and sizeof(output_str) - n without checking whether n already exceeded the buffer in the previous snprintf; when that happens (e.g., very long trace path/cache name), the pointer and remaining-size calculation become invalid and the write goes out of bounds. The single-cache path therefore has the same memory corruption risk as the multi-cache formatter.

double miss_ratio = req_cnt > 0 ? (double)miss_cnt / (double)req_cnt : 0.0;
double byte_miss_ratio =
req_byte > 0 ? (double)miss_byte / (double)req_byte : 0.0;
double cost_saving_ratio = 1.0 - (req_cost > 0 ? miss_cost / req_cost : 0.0);

P2: Return zero saving when total request cost is zero

When req_cost is zero (for example, a trace with an explicit cost column containing all zeros), this expression evaluates cost_saving_ratio to 1.0, so the report can claim 100% cost saving even when misses occur. This makes single-cache cost metrics incorrect and inconsistent with the multi-cache path, which treats zero total cost as zero saving.
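
A version of the expression that follows this suggestion could look like the sketch below. The function wrapper is hypothetical (the quoted code computes the value inline), and whether the merged commit adopted this exact form is not shown in this thread.

```c
/* Cost saving ratio: the fraction of total request cost served from cache.
 * Returns 0.0 when the total request cost is not positive, so a trace
 * whose cost column is all zeros never reports a 100% saving. */
double cost_saving_ratio(double req_cost, double miss_cost) {
  if (req_cost <= 0.0) return 0.0;
  return 1.0 - miss_cost / req_cost;
}
```

With req_cost = 40 and miss_cost = 10 this returns 0.75; with req_cost = 0 it returns 0.0 instead of the 1.0 produced by the quoted expression.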

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@1a1a11a 1a1a11a merged commit 70ce858 into develop Mar 15, 2026
7 checks passed
@1a1a11a 1a1a11a deleted the pr308 branch March 15, 2026 01:02