docs/tests: document job commands and add heartbeat command tests#816
docs/tests: document job commands and add heartbeat command tests#816mezzeddinee wants to merge 1 commit intoDIRACGrid:mainfrom
Conversation
c178410 to
41a711a
Compare
41a711a to
8972dcf
Compare
fbd607e to
95374e9
Compare
aldbr
left a comment
There was a problem hiding this comment.
Thanks @mezzeddinee 🙂
The documentation is fine to me, I just have a few comments on the tests
| def test_kill_command_created_and_delivered_once( | ||
| normal_user_client: TestClient, | ||
| valid_job_id: int, | ||
| ): | ||
| """Verify lifecycle of a Kill command. | ||
|
|
||
| 1. Job initially has no commands. | ||
| 2. Setting Status=KILLED creates a Kill command. | ||
| 3. Command is delivered on next heartbeat. | ||
| 4. Command is not re-delivered. | ||
| """ | ||
| # ------------------------------------------------------------------ | ||
| # 1️⃣ Initial heartbeat → no commands | ||
| # ------------------------------------------------------------------ | ||
| r = normal_user_client.patch( | ||
| "/api/jobs/heartbeat", | ||
| json={valid_job_id: {"Vsize": 1000}}, | ||
| ) | ||
| r.raise_for_status() | ||
| assert r.json() == [] | ||
|
|
||
| # ------------------------------------------------------------------ | ||
| # 2️⃣ Set job to KILLED (creates Kill command internally) | ||
| # ------------------------------------------------------------------ | ||
| r = normal_user_client.patch( | ||
| "/api/jobs/status", | ||
| json={ | ||
| valid_job_id: { | ||
| datetime.now(timezone.utc).isoformat(): { | ||
| "Status": JobStatus.KILLED, | ||
| "MinorStatus": "Marked for termination", | ||
| } | ||
| } | ||
| }, | ||
| ) | ||
| r.raise_for_status() | ||
|
|
||
| # Avoid heartbeat timestamp collision | ||
| sleep(1) | ||
|
|
||
| # ------------------------------------------------------------------ | ||
| # 3️⃣ First heartbeat → command delivered | ||
| # ------------------------------------------------------------------ | ||
| r = normal_user_client.patch( | ||
| "/api/jobs/heartbeat", | ||
| json={valid_job_id: {"Vsize": 1001}}, | ||
| ) | ||
| r.raise_for_status() | ||
|
|
||
| commands = r.json() | ||
|
|
||
| assert len(commands) == 1 | ||
| assert commands[0]["job_id"] == valid_job_id | ||
| assert commands[0]["command"] == "Kill" | ||
|
|
||
| sleep(1) | ||
|
|
||
| # ------------------------------------------------------------------ | ||
| # 4️⃣ Second heartbeat → command NOT delivered again | ||
| # ------------------------------------------------------------------ | ||
| r = normal_user_client.patch( | ||
| "/api/jobs/heartbeat", | ||
| json={valid_job_id: {"Vsize": 1002}}, | ||
| ) | ||
| r.raise_for_status() | ||
|
|
||
| assert r.json() == [] |
There was a problem hiding this comment.
Any difference with?
diracx/diracx-routers/tests/jobs/test_status.py
Lines 925 to 985 in ae39fb4
There was a problem hiding this comment.
Thanks for the review, you were right about the overlap.
I removed the duplicated single-job Kill/heartbeat cases, since test_status.py::test_heartbeat already covers that flow, and I removed the duplicate non-killed-status case as well. I kept a single JobStatus.RUNNING variant there.
test_heartbeat_commands.py now keeps only the distinct coverage: multi-job command isolation and the JobStatus.DELETED -> Kill path.
| assert r.json() == [] | ||
|
|
||
|
|
||
| def test_command_delivered_exactly_once( |
There was a problem hiding this comment.
Isn't it the same scenario as the first test?
| assert r.json() == [] | ||
|
|
||
|
|
||
| def test_non_terminal_status_does_not_create_command( |
There was a problem hiding this comment.
What's the difference between test_non_terminal_status_does_not_create_command and test_non_killed_status_does_not_create_command?
Other than "Running" vs JobStatus.RUNNING, which are equivalent, I don't see any difference
There was a problem hiding this comment.
I kept a single JobStatus.RUNNING variant there.
Add router-level tests covering JobCommand lifecycle: - Kill command created when status → KILLED or DELETED - Delivered exactly once via heartbeat - No duplicate delivery on subsequent heartbeats - Multiple jobs handled independently - Non-terminal transitions do not create commands Add developer documentation describing one-shot delivery semantics. Ignore local virtualenv (.venv).
95374e9 to
a996cf7
Compare
This PR adds focused router‑level coverage for heartbeat command delivery and documents the job‑command lifecycle. The new tests verify Kill command creation, single delivery, and non‑terminal transitions returning no commands. The docs include a concise overview plus sequence/activity diagrams.
Changes
• Add diracx-routers/tests/jobs/test_heartbeat_commands.py with router‑level Kill/heartbeat coverage.
• Add docs/dev/job_commands.md describing command creation/delivery and diagrams.
• Ignore local virtualenv (.venv) in .gitignore.
Tests
• pytest diracx-routers/tests/jobs/test_heartbeat_commands.py