docs/tests: document job commands and add heartbeat command tests #816

Draft
mezzeddinee wants to merge 1 commit into DIRACGrid:main from mezzeddinee:feature/heartbeat-tests

Conversation

@mezzeddinee

This PR adds focused router‑level coverage for heartbeat command delivery and documents the job‑command lifecycle. The new tests verify Kill command creation, single delivery, and non‑terminal transitions returning no commands. The docs include a concise overview plus sequence/activity diagrams.
Changes
• Add diracx-routers/tests/jobs/test_heartbeat_commands.py with router‑level Kill/heartbeat coverage.
• Add docs/dev/job_commands.md describing command creation/delivery and diagrams.
• Ignore local virtualenv (.venv) in .gitignore.
Tests
• pytest diracx-routers/tests/jobs/test_heartbeat_commands.py

@read-the-docs-community

read-the-docs-community bot commented Mar 3, 2026

Documentation build overview

📚 diracx | 🛠️ Build #31737449 | 📁 Comparing a996cf7 against latest (70bf3c7)


🔍 Preview build

Show files changed (118 files in total): 📝 117 modified | ➕ 1 added | ➖ 0 deleted
File Status
404.html 📝 modified
index.html 📝 modified
REFERENCE/index.html 📝 modified
RUN_PROD/index.html 📝 modified
SECURITY/index.html 📝 modified
SSO/index.html 📝 modified
admin/index.html 📝 modified
dev/index.html 📝 modified
roadmap/index.html 📝 modified
user/index.html 📝 modified
admin/explanations/index.html 📝 modified
admin/how-to/index.html 📝 modified
admin/reference/index.html 📝 modified
admin/tutorials/index.html 📝 modified
dev/explanations/index.html 📝 modified
dev/how-to/index.html 📝 modified
dev/reference/index.html 📝 modified
dev/tutorials/index.html 📝 modified
user/explanations/index.html 📝 modified
user/how-to/index.html 📝 modified
user/reference/index.html 📝 modified
user/tutorials/index.html 📝 modified
admin/explanations/auth-with-diracx/index.html 📝 modified
admin/explanations/auth-with-external/index.html 📝 modified
admin/explanations/chart-structure/index.html 📝 modified
admin/explanations/configuration/index.html 📝 modified
admin/explanations/database-management/index.html 📝 modified
admin/explanations/manage-web-release/index.html 📝 modified
admin/explanations/opentelemetry/index.html 📝 modified
admin/explanations/sandbox-store/index.html 📝 modified
admin/explanations/user-management/index.html 📝 modified
admin/how-to/debugging/index.html 📝 modified
admin/how-to/install/index.html 📝 modified
admin/how-to/rotate-a-secret/index.html 📝 modified
admin/how-to/upgrading/index.html 📝 modified
admin/reference/env-variables/index.html 📝 modified
admin/reference/security_model/index.html 📝 modified
admin/reference/settings-and-preferences/index.html 📝 modified
admin/reference/values/index.html 📝 modified
admin/tutorials/authentication/index.html 📝 modified
admin/tutorials/run_locally/index.html 📝 modified
dev/explanations/components/index.html 📝 modified
dev/explanations/dependency-management/index.html 📝 modified
dev/explanations/designing-functionality/index.html 📝 modified
dev/explanations/documentation-system/index.html 📝 modified
dev/explanations/extensions/index.html 📝 modified
dev/explanations/job_commands/index.html ➕ added
dev/explanations/repo-structure/index.html 📝 modified
dev/explanations/run_demo/index.html 📝 modified
dev/explanations/testing/index.html 📝 modified
dev/explanations/web-architecture/index.html 📝 modified
dev/explanations/web-testing/index.html 📝 modified
dev/how-to/add-a-cli-command/index.html 📝 modified
dev/how-to/add-a-db/index.html 📝 modified
dev/how-to/add-a-route/index.html 📝 modified
dev/how-to/add-a-setting/index.html 📝 modified
dev/how-to/add-a-task/index.html 📝 modified
dev/how-to/add-a-test/index.html 📝 modified
dev/how-to/add-functionality/index.html 📝 modified
dev/how-to/client-customization/index.html 📝 modified
dev/how-to/client-extension/index.html 📝 modified
dev/how-to/client-generation/index.html 📝 modified
dev/how-to/contribute/index.html 📝 modified
dev/how-to/contribute-to-web/index.html 📝 modified
dev/how-to/create-web-application/index.html 📝 modified
dev/how-to/develop-legacy-dirac/index.html 📝 modified
dev/how-to/extend-diracx/index.html 📝 modified
dev/how-to/manage-web-extension/index.html 📝 modified
dev/how-to/setup-web-environment/index.html 📝 modified
dev/how-to/use-the-demo/index.html 📝 modified
dev/how-to/write-docs/index.html 📝 modified
dev/reference/application-state/index.html 📝 modified
dev/reference/client-metapathfinder/index.html 📝 modified
dev/reference/coding-conventions/index.html 📝 modified
dev/reference/configuration/index.html 📝 modified
dev/reference/db-transaction-model/index.html 📝 modified
dev/reference/dependency-injection/index.html 📝 modified
dev/reference/entrypoints/index.html 📝 modified
dev/reference/env-variables/index.html 📝 modified
dev/reference/pixi-tasks/index.html 📝 modified
dev/reference/security-policies/index.html 📝 modified
dev/reference/security-properties/index.html 📝 modified
dev/reference/test-recipes/index.html 📝 modified
dev/reference/web-coding-conventions/index.html 📝 modified
dev/reference/writing-tests/index.html 📝 modified
dev/tutorials/advanced-tutorial/index.html 📝 modified
dev/tutorials/getting-started/index.html 📝 modified
dev/tutorials/larger-developments/index.html 📝 modified
dev/tutorials/making-changes/index.html 📝 modified
dev/tutorials/play-with-auth/index.html 📝 modified
dev/tutorials/run-locally/index.html 📝 modified
dev/tutorials/web-extensions/index.html 📝 modified
dev/tutorials/web-getting-started/index.html 📝 modified
user/how-to/list-and-share-applications/index.html 📝 modified
user/how-to/login-out/index.html 📝 modified
user/how-to/monitor-jobs/index.html 📝 modified
user/reference/client-configuration/index.html 📝 modified
user/reference/known-installations/index.html 📝 modified
user/reference/programmatic-usage/index.html 📝 modified
user/tutorials/getting-started/index.html 📝 modified
admin/how-to/install/connect/index.html 📝 modified
admin/how-to/install/convert-cs/index.html 📝 modified
admin/how-to/install/embracing/index.html 📝 modified
admin/how-to/install/install-kubernetes/index.html 📝 modified
admin/how-to/install/installing/index.html 📝 modified
admin/how-to/install/minimal-requirements/index.html 📝 modified
admin/how-to/install/register-a-vo/index.html 📝 modified
admin/how-to/install/register-the-admin-vo/index.html 📝 modified
dev/explanations/components/api/index.html 📝 modified
dev/explanations/components/cli/index.html 📝 modified
dev/explanations/components/client/index.html 📝 modified
dev/explanations/components/db/index.html 📝 modified
dev/explanations/components/routes/index.html 📝 modified
dev/how-to/use-the-demo/swagger/index.html 📝 modified
dev/how-to/use-the-demo/web/index.html 📝 modified
user/reference/programmatic-usage/command-line-interface/index.html 📝 modified
user/reference/programmatic-usage/https-interface/index.html 📝 modified
user/reference/programmatic-usage/python-interface/index.html 📝 modified

@mezzeddinee mezzeddinee force-pushed the feature/heartbeat-tests branch 2 times, most recently from c178410 to 41a711a Compare March 3, 2026 15:25
@DIRACGridBot DIRACGridBot marked this pull request as draft March 3, 2026 16:25
@aldbr aldbr linked an issue Mar 3, 2026 that may be closed by this pull request
@mezzeddinee mezzeddinee force-pushed the feature/heartbeat-tests branch from 41a711a to 8972dcf Compare March 4, 2026 07:50
@mezzeddinee mezzeddinee marked this pull request as ready for review March 4, 2026 08:28
@mezzeddinee mezzeddinee force-pushed the feature/heartbeat-tests branch 2 times, most recently from fbd607e to 95374e9 Compare March 4, 2026 08:48
Contributor

@aldbr aldbr left a comment

Thanks @mezzeddinee 🙂
The documentation looks fine to me; I just have a few comments on the tests.

Comment on lines +26 to +92
def test_kill_command_created_and_delivered_once(
    normal_user_client: TestClient,
    valid_job_id: int,
):
    """Verify lifecycle of a Kill command.

    1. Job initially has no commands.
    2. Setting Status=KILLED creates a Kill command.
    3. Command is delivered on next heartbeat.
    4. Command is not re-delivered.
    """
    # ------------------------------------------------------------------
    # 1️⃣ Initial heartbeat → no commands
    # ------------------------------------------------------------------
    r = normal_user_client.patch(
        "/api/jobs/heartbeat",
        json={valid_job_id: {"Vsize": 1000}},
    )
    r.raise_for_status()
    assert r.json() == []

    # ------------------------------------------------------------------
    # 2️⃣ Set job to KILLED (creates Kill command internally)
    # ------------------------------------------------------------------
    r = normal_user_client.patch(
        "/api/jobs/status",
        json={
            valid_job_id: {
                datetime.now(timezone.utc).isoformat(): {
                    "Status": JobStatus.KILLED,
                    "MinorStatus": "Marked for termination",
                }
            }
        },
    )
    r.raise_for_status()

    # Avoid heartbeat timestamp collision
    sleep(1)

    # ------------------------------------------------------------------
    # 3️⃣ First heartbeat → command delivered
    # ------------------------------------------------------------------
    r = normal_user_client.patch(
        "/api/jobs/heartbeat",
        json={valid_job_id: {"Vsize": 1001}},
    )
    r.raise_for_status()

    commands = r.json()

    assert len(commands) == 1
    assert commands[0]["job_id"] == valid_job_id
    assert commands[0]["command"] == "Kill"

    sleep(1)

    # ------------------------------------------------------------------
    # 4️⃣ Second heartbeat → command NOT delivered again
    # ------------------------------------------------------------------
    r = normal_user_client.patch(
        "/api/jobs/heartbeat",
        json={valid_job_id: {"Vsize": 1002}},
    )
    r.raise_for_status()

    assert r.json() == []
Contributor

Is there any difference from the existing test below?

def test_heartbeat(normal_user_client: TestClient, valid_job_id: int):
    search_body = {
        "search": [{"parameter": "JobID", "operator": "eq", "value": valid_job_id}]
    }
    r = normal_user_client.post("/api/jobs/search", json=search_body)
    r.raise_for_status()
    old_data = r.json()[0]
    assert old_data["HeartBeatTime"] is None
    payload = {valid_job_id: {"Vsize": 1234}}
    r = normal_user_client.patch("/api/jobs/heartbeat", json=payload)
    r.raise_for_status()
    r = normal_user_client.post("/api/jobs/search", json=search_body)
    r.raise_for_status()
    new_data = r.json()[0]
    hbt = datetime.fromisoformat(new_data["HeartBeatTime"])
    # This should be timezone aware due to the enforced tzinfo from
    # the SQLAlchemy type used for datetime fields in JobDB
    assert hbt.tzinfo is not None
    assert hbt >= datetime.now(tz=timezone.utc) - timedelta(seconds=15)
    # Kill the job by setting the status on it
    r = normal_user_client.patch(
        "/api/jobs/status",
        json={
            valid_job_id: {
                str(datetime.now(timezone.utc)): {
                    "Status": JobStatus.KILLED,
                    "MinorStatus": "Marked for termination",
                }
            }
        },
    )
    r.raise_for_status()
    sleep(1)
    # Send another heartbeat and check that a Kill job command was set
    payload = {valid_job_id: {"Vsize": 1235}}
    r = normal_user_client.patch("/api/jobs/heartbeat", json=payload)
    r.raise_for_status()
    commands = r.json()
    assert len(commands) == 1, "Exactly one job command should be returned"
    assert commands[0]["job_id"] == valid_job_id, (
        f"Wrong job id, should be '{valid_job_id}' but got {commands[0]['job_id']=}"
    )
    assert commands[0]["command"] == "Kill", (
        f"Wrong job command received, should be 'Kill' but got {commands[0]=}"
    )
    sleep(1)
    # Send another heartbeat and check the job commands are empty
    payload = {valid_job_id: {"Vsize": 1234}}
    r = normal_user_client.patch("/api/jobs/heartbeat", json=payload)
    r.raise_for_status()
    commands = r.json()
    assert len(commands) == 0, (
        "Exactly zero job commands should be returned after heartbeat commands are sent"
    )

Author

Thanks for the review, you were right about the overlap.

I removed the duplicated single-job Kill/heartbeat cases, since test_status.py::test_heartbeat already covers that flow, and I removed the duplicate non-killed-status case as well. I kept a single JobStatus.RUNNING variant there.

test_heartbeat_commands.py now keeps only the distinct coverage: multi-job command isolation and the JobStatus.DELETED -> Kill path.
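
For readers following the multi-job isolation case, here is a minimal sketch of how such an assertion could be phrased. It relies only on the response shape visible in the quoted tests (a list of entries with "job_id" and "command" keys). The helper name commands_by_job, the job IDs, and the sample response are hypothetical and are not taken from the PR.

```python
from collections import defaultdict


def commands_by_job(commands: list[dict]) -> dict[int, list[str]]:
    """Group a heartbeat response by job ID (hypothetical helper)."""
    grouped: dict[int, list[str]] = defaultdict(list)
    for entry in commands:
        grouped[entry["job_id"]].append(entry["command"])
    return dict(grouped)


# Made-up response for a heartbeat covering jobs 101 and 102,
# where only job 101 was killed beforehand.
response = [{"job_id": 101, "command": "Kill"}]
grouped = commands_by_job(response)

# Isolation: the killed job gets its command, the other job gets nothing.
assert grouped == {101: ["Kill"]}
assert 102 not in grouped
```

Grouping by job ID keeps the assertion independent of the order in which the server returns commands.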

assert r.json() == []


def test_command_delivered_exactly_once(
Contributor

Isn't it the same scenario as the first test?

assert r.json() == []


def test_non_terminal_status_does_not_create_command(
Contributor

What's the difference between test_non_terminal_status_does_not_create_command and test_non_killed_status_does_not_create_command?

Other than "Running" vs JobStatus.RUNNING, which are equivalent, I don't see any difference
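
To illustrate the equivalence, a minimal sketch, assuming JobStatus is a str-backed enum whose RUNNING member has the value "Running". The class below is a hypothetical stand-in written for this example, not the diracx definition.

```python
import json
from enum import Enum


class JobStatus(str, Enum):
    """Hypothetical stand-in for the real JobStatus enum (assumed str-backed)."""

    RUNNING = "Running"
    KILLED = "Killed"


# A str-backed member compares equal to its plain string value...
assert JobStatus.RUNNING == "Running"

# ...and serializes to the same JSON, so both request bodies are identical.
body_enum = json.dumps({"Status": JobStatus.RUNNING})
body_str = json.dumps({"Status": "Running"})
assert body_enum == body_str
```

Under that assumption the two tests send byte-identical payloads, so keeping both adds no coverage.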

Author

I kept a single JobStatus.RUNNING variant there.

@DIRACGridBot DIRACGridBot marked this pull request as draft March 5, 2026 07:40
Add router-level tests covering JobCommand lifecycle:
- Kill command created when status → KILLED or DELETED
- Delivered exactly once via heartbeat
- No duplicate delivery on subsequent heartbeats
- Multiple jobs handled independently
- Non-terminal transitions do not create commands

Add developer documentation describing one-shot delivery semantics.
Ignore local virtualenv (.venv).
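
The one-shot delivery semantics listed in this commit message can be sketched with a toy in-memory model. This is an illustration only: the class InMemoryJobCommands, its method names, and the status strings "Killed" and "Deleted" are assumptions made for the example and do not reflect the diracx implementation.

```python
from collections import defaultdict

# Statuses assumed to trigger a Kill command, per the commit message.
KILL_TRIGGERS = {"Killed", "Deleted"}


class InMemoryJobCommands:
    """Toy model of one-shot command delivery (not the diracx implementation)."""

    def __init__(self) -> None:
        self._pending: dict[int, list[str]] = defaultdict(list)

    def set_status(self, job_id: int, status: str) -> None:
        # Only KILLED/DELETED transitions queue a command.
        if status in KILL_TRIGGERS:
            self._pending[job_id].append("Kill")

    def heartbeat(self, job_ids: list[int]) -> list[dict]:
        # Pop pending commands so each one is returned at most once.
        delivered = []
        for job_id in job_ids:
            for command in self._pending.pop(job_id, []):
                delivered.append({"job_id": job_id, "command": command})
        return delivered


store = InMemoryJobCommands()
store.set_status(1, "Running")  # non-terminal: no command queued
store.set_status(2, "Deleted")  # queues a Kill command for job 2 only

first = store.heartbeat([1, 2])   # delivers the Kill command for job 2
second = store.heartbeat([1, 2])  # nothing left to deliver

assert first == [{"job_id": 2, "command": "Kill"}]
assert second == []
```

Popping the pending entry on delivery is what makes the command one-shot in this model; a real database-backed store would need an equivalent mark-as-sent step.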
@mezzeddinee mezzeddinee force-pushed the feature/heartbeat-tests branch from 95374e9 to a996cf7 Compare March 10, 2026 07:28
Development

Successfully merging this pull request may close these issues.

Add tests for job heartbeat and commands

3 participants