Conversation
Add strategic metadata to SubmissionCancelled
Add strategic metadata to SubmissionCancelled
SubmissionNotFound compiles
Test cancel_submission
SubmissionNotCancellableError
There was a problem hiding this comment.
Pull request overview
This PR adds end-to-end support for cancelling an in-progress submission: it introduces a cancelled submission state persisted in a new submissions_cancelled table, exposes a new producer HTTP endpoint + Rust client method, and wires the feature through the Python bindings and tests.
Changes:
- Add
SubmissionStatus::Cancelled/SubmissionCancelledand a DB cancel path that moves submissions intosubmissions_cancelled. - Expose
POST /producer/submissions/cancel/{submission_id}and a matchingClient::cancel_submissionmethod. - Add Python API + exceptions for cancellation and add roundtrip tests for cancel/not-found/not-cancellable cases.
Reviewed changes
Copilot reviewed 14 out of 16 changed files in this pull request and generated 12 comments.
Show a summary per file
| File | Description |
|---|---|
| opsqueue/src/prometheus.rs | Adds metric names for cancelled submissions. |
| opsqueue/src/producer/server.rs | Adds new HTTP route/handler for cancelling submissions. |
| opsqueue/src/producer/client.rs | Adds client method to call the cancel endpoint and decode errors. |
| opsqueue/src/common/submission.rs | Adds cancelled submission status, DB queries, cancel mutation, and cleanup logic. |
| opsqueue/src/common/errors.rs | Adds serializable cancellation-related error types. |
| opsqueue/opsqueue_example_database_schema.db | Adds an example SQLite schema DB file reflecting new tables. |
| opsqueue/migrations/20260225164600_submissions_cancelled.sql | Adds migration creating submissions_cancelled. |
| libs/opsqueue_python/tests/test_roundtrip.py | Adds Python integration tests for cancel behavior and exceptions. |
| libs/opsqueue_python/src/producer.rs | Exposes cancel_submission from Rust -> Python client. |
| libs/opsqueue_python/src/lib.rs | Registers new Python-exposed types (but see review comments). |
| libs/opsqueue_python/src/errors.rs | Maps new cancellation errors into Python exceptions. |
| libs/opsqueue_python/src/common.rs | Adds Python types for cancelled status and not-cancellable details. |
| libs/opsqueue_python/python/opsqueue/producer.py | Adds Python ProducerClient.cancel_submission wrapper + docs. |
| libs/opsqueue_python/python/opsqueue/exceptions.py | Adds Python SubmissionNotCancellableError and enriches SubmissionNotFoundError. |
| Cargo.toml | Bumps workspace version to 0.35.0. |
| Cargo.lock | Updates locked versions for opsqueue and opsqueue_python. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| Ok(Some(SubmissionStatus::InProgress(submission))) => { | ||
| panic!("Failed to cancel in progress submission {:?}", submission) |
There was a problem hiding this comment.
cancel_submission panics if cancel_submission_notx returned SubmissionNotFound but submission_status reports InProgress. Even if you believe this is unreachable, panicking here turns a recoverable inconsistency/race into a process crash. Prefer returning an internal error (e.g. E::L(DatabaseError) / dedicated invariant error) so the HTTP layer can respond with 500.
| Ok(Some(SubmissionStatus::InProgress(submission))) => { | |
| panic!("Failed to cancel in progress submission {:?}", submission) | |
| Ok(Some(SubmissionStatus::InProgress(_submission))) => { | |
| // Inconsistent state: cancel reported SubmissionNotFound but status is InProgress. | |
| // Treat this as a not-found error instead of panicking, so callers receive a | |
| // recoverable error rather than crashing the process. | |
| Err(E::R(E::L(not_found_err))) |
| histogram!(crate::prometheus::SUBMISSIONS_DURATION_CANCEL_HISTOGRAM).record( | ||
| crate::prometheus::time_delta_as_f64(Utc::now() - id.timestamp()), | ||
| ); | ||
| match submission_opt { | ||
| None => Err(E::R(SubmissionNotFound(id))), | ||
| Some(_) => { |
There was a problem hiding this comment.
cancel_submission_raw records the cancel-duration histogram even when the submission wasn't found (the None branch). This will skew latency metrics with attempts that never actually cancelled anything; the other status transitions only record duration on success. Consider recording the histogram only after a successful cancel (i.e., after Some(_)).
| histogram!(crate::prometheus::SUBMISSIONS_DURATION_CANCEL_HISTOGRAM).record( | |
| crate::prometheus::time_delta_as_f64(Utc::now() - id.timestamp()), | |
| ); | |
| match submission_opt { | |
| None => Err(E::R(SubmissionNotFound(id))), | |
| Some(_) => { | |
| match submission_opt { | |
| None => Err(E::R(SubmissionNotFound(id))), | |
| Some(_) => { | |
| histogram!(crate::prometheus::SUBMISSIONS_DURATION_CANCEL_HISTOGRAM).record( | |
| crate::prometheus::time_delta_as_f64(Utc::now() - id.timestamp()), | |
| ); |
Address comments
This PR adds a new method
cancel_submissiontoProducerClient. This method takes aSubmissionIdand returnsNoneis successful. If the submission could not be found aSubmissionNotFoundErroris raised. If the submission was already completed, failed or cancelled then aSubmissionNotCancellableErroris raised which contains the specific reason.