feat: API pipeline run list has execution summary#96
feat: API pipeline run list has execution summary#96yuechao-qin wants to merge 1 commit intoycq/function_get_execution_status_statsfrom
Conversation
|
Warning This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
This stack of pull requests is managed by Graphite. Learn more about stacking. |
1229fcb to
f086771
Compare
ea578e3 to
489c797
Compare
f086771 to
e2e05ef
Compare
489c797 to
3df5bde
Compare
e382bc8 to
5ae141c
Compare
3df5bde to
5014f69
Compare
| if status in bts.CONTAINER_STATUSES_ENDED: | ||
| self.ended_executions += count | ||
|
|
||
| self.has_ended = self.ended_executions == self.total_executions |
There was a problem hiding this comment.
This is a pattern I would typically try to avoid in other programming languages. Here we are creating a an instance of the class with some initial state, then calling a function on that instance to later on to hydrate the instance with some values, to then return the instance to a caller.
What I would strive to due instead is have a pure / read only model that doesn't write to itself, or get written to after instatiation, but instead gets instantiated with its final values in the constructor. In this case, you could have a function build_execution_status_summar_from_stats that calculates the model attributes, instantiates the model, and returns the model in a read only state.
I could give examples of how this would look in other languages but that wouldn't be helpful.
Here is what Gemini says about implementing that in Python in a "pythonic" way:
@dataclasses.dataclass(frozen=True) # frozen=True makes instances immutable
class ExecutionStatusSummary:
"""A read-only summary of execution statuses."""
total_executions: int
ended_executions: int
has_ended: bool
@classmethod
def from_stats(
cls, stats: Dict[bts.ContainerExecutionStatus, int]
) -> "ExecutionStatusSummary":
"""
Calculates summary attributes from a stats dictionary and returns a new
immutable instance.
"""
total_executions = sum(stats.values())
ended_executions = sum(
count
for status, count in stats.items()
if status in bts.CONTAINER_STATUSES_ENDED
)
# This logic is more robust: an empty set of executions hasn't "ended".
has_ended = total_executions > 0 and ended_executions == total_executions
return cls(
total_executions=total_executions,
ended_executions=ended_executions,
has_ended=has_ended,
)
and consuming it as:
summary = ExecutionStatusSummary.from_stats(stats)
There was a problem hiding this comment.
This is not based on your exact changes and just an example ☝️ . In other words, not a copy/paste replacement and just a suggestion.
There was a problem hiding this comment.
I prefer less mutation too.
We've discussed this case with Yue Chao and the main reason for mutating implementation is because the instance will be constructed from multiple status stats objects.
Although, what we could do in that case is to first combine multiple dicts into one (via a new function or collection.Counter) and then initialize the structure.
full_status_stats = collections.Counter()
for status_stats in list_of_status_stats:
full_status_stats.update(status_stats)
ExecutionStatusSummary.from_status_stats(full_status_stats)| summary = ExecutionStatusSummary() | ||
| status_stats: dict[str, int] = {} | ||
| for status, count in stats.items(): | ||
| summary.count_execution_status(status=status, count=count) |
There was a problem hiding this comment.
Could help simplify the code here

TL;DR
Added execution summary to pipeline run responses to provide aggregated information about pipeline execution status (
GET /api/pipeline_runs/)..What changed?
execution_summaryfield toPipelineRunResponsedataclass_get_execution_status_statsto_get_execution_stats_and_summaryto return both stats and summaryHow to test?
include_execution_stats=TrueRun the new test cases in
tests/test_api_server_sql.py:Why make this change?
This closes #87