Add per-request cost support (obj_cost) and cost-miss reporting#308
Add per-request cost support (obj_cost) and cost-miss reporting#308
Conversation
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the cache simulation framework by integrating a first-class concept of 'request cost' ( Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces support for per-request cost (obj_cost) and cost-based miss ratio reporting. The changes are comprehensive, touching data structures, trace readers for both CSV and binary formats, and the main simulation logic. The implementation correctly adds the obj_cost field and integrates it into the simulation and reporting.
However, the pull request description states that no tests were run due to an environmental issue. Given the breadth of the changes, this introduces a significant risk of regressions or bugs. It is highly recommended to resolve the environment issue and run the existing test suite, as well as add new tests for the cost functionality, before merging.
I have also identified a couple of areas with code duplication that could be refactored to improve maintainability.
| if (!ignore_obj_size) { | ||
| snprintf(output_str, 1024, | ||
| "%s %s cache size %8s, %16lu req, miss ratio %.4lf, throughput " | ||
| "%s %s cache size %8s, %16lu req, miss ratio %.4lf, byte miss ratio %.4lf, cost miss ratio %.4lf, throughput " | ||
| "%.2lf MQPS\n", | ||
| reader->trace_path, detailed_cache_name, size_str, | ||
| (unsigned long)req_cnt, (double)miss_cnt / (double)req_cnt, | ||
| byte_miss_ratio, cost_miss_ratio, | ||
| (double)req_cnt / 1000000.0 / runtime); | ||
| } else { | ||
| snprintf(output_str, 1024, | ||
| "%s %s cache size %8lld, %16lu req, miss ratio %.4lf, throughput " | ||
| "%s %s cache size %8lld, %16lu req, miss ratio %.4lf, byte miss ratio %.4lf, cost miss ratio %.4lf, throughput " | ||
| "%.2lf MQPS\n", | ||
| reader->trace_path, detailed_cache_name, | ||
| (long long)cache->cache_size, (unsigned long)req_cnt, | ||
| (double)miss_cnt / (double)req_cnt, | ||
| byte_miss_ratio, cost_miss_ratio, | ||
| (double)req_cnt / 1000000.0 / runtime); | ||
| } |
There was a problem hiding this comment.
The snprintf calls inside this if/else block are largely duplicated. This can be refactored to reduce code duplication and improve maintainability by building the output string in parts.
int n;
if (!ignore_obj_size) {
n = snprintf(output_str, 1024, "%s %s cache size %8s",
reader->trace_path, detailed_cache_name, size_str);
} else {
n = snprintf(output_str, 1024, "%s %s cache size %8lld",
reader->trace_path, detailed_cache_name,
(long long)cache->cache_size);
}
snprintf(output_str + n, 1024 - n,
", %16lu req, miss ratio %.4lf, byte miss ratio %.4lf, cost miss ratio %.4lf, throughput "
"%.2lf MQPS\n",
(unsigned long)req_cnt, (double)miss_cnt / (double)req_cnt,
byte_miss_ratio, cost_miss_ratio,
(double)req_cnt / 1000000.0 / runtime);| } else if (csv_params->curr_field_idx == csv_params->obj_cost_field_idx) { | ||
| req->obj_cost = (int64_t)strtoll((char *)s, &end, 0); | ||
| if (req->obj_cost == 0 && end == s) { | ||
| WARN("csvReader obj_cost is not a number: \"%s\"\n", (char *)s); | ||
| } |
There was a problem hiding this comment.
Motivation
Description
obj_costtorequest_twith default initialization to1and included cost inprint_requestdebug output.reader_init_param_twithobj_cost_fieldand added parsing ofobj-cost-col/cost-colinparse_reader_paramsto configure a cost column from CSV parameters.obj_cost) when present and wire the cost-field index/offset throughreaderInternalstructures; when not configured the reader setsreq->obj_cost = req->obj_sizeas a backward-compatible fallback.read_one_reqinitializesobj_costand ensuresignore_obj_sizebehavior still applies; verbose logging now printssizeandcostfor each read request.cachesimsimulation logic collectsreq_costandmiss_cost, computescost_miss_ratio, and prints it alongside request miss ratio and byte miss ratio in the output.README.mdexamples and plotting script invocation to demonstrateobj-cost-colusage.Testing
cmake -S . -B build) which failed due to a missing system dependencyglib-2.0, so no full build/test ran in this environment (failure is environmental, not code-related).Codex Task