Add Execution Mode for ADK Agent (multi-agent cost + non-real-time UX)

**Is your feature request related to a problem? Please describe.**
Yes.

In my experience, **multi-agent workflows are expensive to run in real time, and most of my use cases do not require immediate answers.**

A concrete example: I have a “scoring agent” used to prioritize client projects/deals. This agent is expensive because it needs to retrieve and synthesize data from multiple sources (docs, CRM, email, meeting transcripts, and browsing), which means multiple complex tool calls and often multiple agent handoffs. The common expectation is:

“Please assess these deals, and return the results later.”

This becomes a cost and UX problem:
- Cost: scoring a single deal can be expensive, and we often need to score many deals.
- UX: users don’t want to wait until the process finishes; they still want to interact with the agent while the job is running.

ADK already supports batch/async execution in the context of offline evaluation, but production use cases frequently need a first-class “non-real-time execution mode” (e.g., “finish this task and email me later”).

Given that Gemini supports offline/batch-style processing, it would be very valuable if ADK could expose an execution mode that allows:
1) running agent jobs asynchronously (non-blocking)
2) optionally routing model calls through a batch backend for cost/throughput efficiency
3) resuming the workflow when the job completes

**Describe the solution you'd like**

Introduce an explicit Execution Mode for ADK agents and a first-class Job Service for job lifecycle management.

1) New Agent API: Execution Mode
Add a config option to Agent (or LlmAgent) such as:
- execution_mode="realtime" (default)
- execution_mode="offline" (enqueue job + return immediately)

Example (based on the multi-agent pattern in adk-samples):

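```python
from google.adk.agents import Agent

# `membership` (a tool) and the other sub-agents (roadside_agent, claims_agent,
# rewards_agent) are assumed to be defined elsewhere, as in adk-samples.

# Define the sub-agent first so root_agent can reference it.
membership_agent = Agent(
    name="membership_agent",
    model="gemini-2.5-flash",
    description="Registers new members",
    instruction="...",
    tools=[membership],
    execution_mode="offline",   # NEW (proposed): run this agent as an offline job
)

root_agent = Agent(
    name="root_agent",
    global_instruction="You are a helpful virtual assistant for Cymbal Auto Insurance.",
    instruction="...",
    sub_agents=[membership_agent, roadside_agent, claims_agent, rewards_agent],
    tools=[membership],
    model="gemini-2.5-flash",
    execution_mode="realtime",  # NEW (proposed): default
)
```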

2) Persist status in a shared store (JobService / JobStoreService)
Introduce a dedicated Job Service abstraction to store and retrieve job records across invocations.

Motivation:
Job state (QUEUED/RUNNING/DONE/FAILED) is operational metadata that changes frequently and needs deterministic reads/writes. This does not map cleanly to “memory” (which is primarily for long-term knowledge retrieval) or to session state alone (which is scoped per session). A dedicated service makes the semantics clear and avoids mixing concerns.

Proposed interface (conceptual):
- create_job(job_spec) -> job_id
- get_job(job_id) -> job_record
- update_job(job_id, patch)
- store_result(job_id, result_ref or result_payload)
- load_result(job_id) -> result_payload

Default implementation:
- InMemoryJobService for local dev

Pluggable implementations:
- Redis / SQL / Firestore / etc.
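
To make the proposed interface concrete, here is a minimal sketch of what an in-memory implementation could look like. Everything here is hypothetical: `InMemoryJobService` and its methods are the names proposed in this issue, not an existing ADK API, and the dict-based job record and the initial `QUEUED` status are assumptions for illustration.

```python
import threading
import uuid
from typing import Any, Dict


class InMemoryJobService:
    """Hypothetical in-memory job store for local dev (not an existing ADK class)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._jobs: Dict[str, Dict[str, Any]] = {}
        self._results: Dict[str, Any] = {}

    def create_job(self, job_spec: Dict[str, Any]) -> str:
        # Assumption: new jobs always start as QUEUED.
        job_id = uuid.uuid4().hex
        with self._lock:
            self._jobs[job_id] = {**job_spec, "job_id": job_id, "status": "QUEUED"}
        return job_id

    def get_job(self, job_id: str) -> Dict[str, Any]:
        # Return a copy so callers cannot mutate the stored record directly.
        with self._lock:
            return dict(self._jobs[job_id])

    def update_job(self, job_id: str, patch: Dict[str, Any]) -> None:
        with self._lock:
            self._jobs[job_id].update(patch)

    def store_result(self, job_id: str, result_payload: Any) -> None:
        with self._lock:
            self._results[job_id] = result_payload

    def load_result(self, job_id: str) -> Any:
        with self._lock:
            return self._results[job_id]
```

A Redis or SQL backend would expose the same five methods, so the runtime would only depend on the interface and not on where the records live.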

Job record fields (minimum viable):
- job_id
- agent_name
- parent_invocation_id (optional)
- status: QUEUED | RUNNING | DONE | FAILED
- created_at / updated_at
- result_ref (or result payload)
- error (if failed)
- optional: retry_count, next_retry_at
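
As a rough illustration, the fields above could be modeled as a dataclass. This is only a sketch: `JobRecord`, `JobStatus`, and the `transition` helper are hypothetical names, and the field types (for example, `result_ref` and `error` as strings) are assumptions.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class JobStatus(str, Enum):
    QUEUED = "QUEUED"
    RUNNING = "RUNNING"
    DONE = "DONE"
    FAILED = "FAILED"


def _now() -> datetime:
    return datetime.now(timezone.utc)


@dataclass
class JobRecord:
    """Hypothetical job record; field names follow the list in this issue."""

    job_id: str
    agent_name: str
    parent_invocation_id: Optional[str] = None
    status: JobStatus = JobStatus.QUEUED
    created_at: datetime = field(default_factory=_now)
    updated_at: datetime = field(default_factory=_now)
    result_ref: Optional[str] = None   # assumption: a string reference to the stored result
    error: Optional[str] = None        # assumption: a human-readable failure message
    retry_count: int = 0
    next_retry_at: Optional[datetime] = None

    def transition(self, status: JobStatus, error: Optional[str] = None) -> None:
        # Update the status and refresh updated_at in one place.
        self.status = status
        self.error = error
        self.updated_at = _now()
```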

3) Runtime behavior: enqueue offline agent and return control immediately
When a parent agent delegates to a child agent with execution_mode="offline":
- The runtime enqueues a job via JobService
- Returns a structured “job pending” event to the parent flow
- The root agent can immediately respond to the user with:
  - an acknowledgement (“I started the scoring job…”)
  - the job_id
  - what to do next (“You can continue chatting; ask ‘status <job_id>’ anytime.”)
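
The enqueue-and-return behavior can be sketched with plain `asyncio`. This is not how ADK works today; `enqueue_offline`, the module-level `JOBS` table (standing in for the proposed JobService), and the `OFFLINE_JOB_PENDING` event type are all hypothetical names used for illustration.

```python
import asyncio
import uuid
from typing import Any, Awaitable, Callable, Dict

# Hypothetical in-process job table standing in for the proposed JobService.
JOBS: Dict[str, Dict[str, Any]] = {}


async def _run_job(job_id: str, work: Callable[[], Awaitable[Any]]) -> None:
    JOBS[job_id]["status"] = "RUNNING"
    try:
        JOBS[job_id]["result"] = await work()
        JOBS[job_id]["status"] = "DONE"
    except Exception as exc:
        JOBS[job_id]["error"] = repr(exc)
        JOBS[job_id]["status"] = "FAILED"


def enqueue_offline(agent_name: str, work: Callable[[], Awaitable[Any]]) -> Dict[str, Any]:
    """Schedule the work and return a 'job pending' event without waiting.

    Must be called from inside a running event loop.
    """
    job_id = uuid.uuid4().hex
    JOBS[job_id] = {"agent_name": agent_name, "status": "QUEUED"}
    # Keep a reference to the task so it is not garbage-collected mid-run.
    JOBS[job_id]["task"] = asyncio.create_task(_run_job(job_id, work))
    return {"type": "OFFLINE_JOB_PENDING", "job_id": job_id}


async def demo() -> Dict[str, Any]:
    async def score_deals() -> Dict[str, float]:
        await asyncio.sleep(0.01)  # placeholder for the expensive agent run
        return {"deal-1": 0.8}

    event = enqueue_offline("scoring_agent", score_deals)
    # Control returns before the job has started running.
    status_at_return = JOBS[event["job_id"]]["status"]
    await JOBS[event["job_id"]]["task"]
    return {
        "event": event,
        "status_at_return": status_at_return,
        "final": JOBS[event["job_id"]],
    }
```

Running `asyncio.run(demo())` shows the parent receiving the pending event while the job is still `QUEUED`. A production version would hand the work to a durable queue or a batch backend instead of an in-process task.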

4) On subsequent calls, check job status before re-running
Every time the parent/root tries to invoke the offline agent again (or when the user asks for status), ADK checks JobService:

If RUNNING:
- Do not re-run.
- Emit a structured signal/event back to the root agent:
  {type: OFFLINE_JOB_RUNNING, job_id, last_update_at, optional_progress}
- Root agent can respond:
  “Your scoring job is still running. You can continue chatting; I’ll update you when it’s done.”

If DONE:
- Load the completed result payload via JobService (or result_ref).
- Inject it into the parent flow as if it were the child agent’s final response.
- Continue the workflow normally.

If FAILED:
- Attach failure details.
- Allow retry policy (either automatic retry or user-triggered retry).
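
The three branches above could be expressed as a single function that maps a job record to the event the parent flow receives. This is a sketch under assumptions: `resolve_offline_job`, the `OFFLINE_JOB_DONE` and `OFFLINE_JOB_FAILED` event types, the `max_retries` parameter, and treating `QUEUED` the same as `RUNNING` are all illustrative choices, not part of ADK.

```python
from typing import Any, Dict


def resolve_offline_job(job: Dict[str, Any], max_retries: int = 3) -> Dict[str, Any]:
    """Map a job record to a structured event instead of re-running the agent."""
    status = job["status"]
    job_id = job["job_id"]

    # Assumption: a job that is still QUEUED is reported like a RUNNING one.
    if status in ("QUEUED", "RUNNING"):
        return {
            "type": "OFFLINE_JOB_RUNNING",
            "job_id": job_id,
            "last_update_at": job.get("updated_at"),
            "progress": job.get("progress"),
        }

    if status == "DONE":
        # The result is handed back as if it were the child agent's final response.
        return {"type": "OFFLINE_JOB_DONE", "job_id": job_id, "result": job["result"]}

    if status == "FAILED":
        return {
            "type": "OFFLINE_JOB_FAILED",
            "job_id": job_id,
            "error": job.get("error"),
            "can_retry": job.get("retry_count", 0) < max_retries,
        }

    raise ValueError(f"Unknown job status: {status!r}")
```

Whether a retry is automatic or user-triggered is left to the caller; the `can_retry` flag only reports whether the retry budget has been exhausted.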

**Describe alternatives you've considered**
1) Status quo: Only use offline evaluation batching
Offline evaluation is helpful for regression testing and benchmarking, but it does not solve production async job execution patterns like:
“please finish this task and email me later.”

2) Custom job orchestration outside ADK
It’s possible to build a custom queue, persistence layer, and polling logic around ADK, but this leads to duplicated boilerplate and inconsistent implementations across teams.

3) Store job state in MemoryService or Session state
- Session state is scoped to a single session, so it is not ideal for tracking jobs across sessions.
- MemoryService is typically optimized for knowledge retrieval rather than frequent operational status updates.

A dedicated JobService is clearer, safer, and more extensible.

**Additional context**
My current production-like use cases do not require real-time completion, but do require online access to tools/data sources:

1) AI Scoring Agent (prioritize client deals/projects)
- pulls from CRM, docs, email, meeting notes, browsing
- outputs score + rationale
- expected UX: “assess these deals and return later”

2) SOW (Scope of Work) document generation
- pulls from meeting transcripts, emails, requirement docs, client metadata
- outputs a structured SOW draft
- expected UX: “generate SOW and notify me when ready”

