# Add gemini example using litellm #240
Merged
## examples/tutorials/10_async/10_temporal/100_gemini_litellm/.dockerignore
43 additions & 0 deletions
```
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg

# Environments
.env*
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# IDE
.idea/
.vscode/
*.swp
*.swo

# Git
.git
.gitignore

# Misc
.DS_Store
```
## examples/tutorials/10_async/10_temporal/100_gemini_litellm/Dockerfile
54 additions & 0 deletions
```dockerfile
# syntax=docker/dockerfile:1.3
FROM python:3.12-slim
COPY --from=ghcr.io/astral-sh/uv:0.6.4 /uv /uvx /bin/

# Install system dependencies
RUN apt-get update && apt-get install -y \
    htop \
    vim \
    curl \
    tar \
    python3-dev \
    postgresql-client \
    build-essential \
    libpq-dev \
    gcc \
    cmake \
    netcat-openbsd \
    nodejs \
    npm \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

# Install tctl (Temporal CLI)
RUN curl -L https://github.com/temporalio/tctl/releases/download/v1.18.1/tctl_1.18.1_linux_arm64.tar.gz -o /tmp/tctl.tar.gz && \
    tar -xzf /tmp/tctl.tar.gz -C /usr/local/bin && \
    chmod +x /usr/local/bin/tctl && \
    rm /tmp/tctl.tar.gz

RUN uv pip install --system --upgrade pip setuptools wheel

ENV UV_HTTP_TIMEOUT=1000

# Copy pyproject.toml and README.md to install dependencies
COPY 10_async/10_temporal/100_gemini_litellm/pyproject.toml /app/100_gemini_litellm/pyproject.toml
COPY 10_async/10_temporal/100_gemini_litellm/README.md /app/100_gemini_litellm/README.md

WORKDIR /app/100_gemini_litellm

# Copy the project code
COPY 10_async/10_temporal/100_gemini_litellm/project /app/100_gemini_litellm/project

# Install the required Python packages
RUN uv pip install --system .

ENV PYTHONPATH=/app
ENV AGENT_NAME=at100-gemini-litellm

# Run the ACP server using uvicorn
CMD ["uvicorn", "project.acp:acp", "--host", "0.0.0.0", "--port", "8000"]

# When we deploy the worker, we will replace the CMD with the following
# CMD ["python", "-m", "run_worker"]
```
## examples/tutorials/10_async/10_temporal/100_gemini_litellm/README.md
130 additions & 0 deletions
# [Temporal] Using Alternative Models with LiteLLM (Gemini)

**Part of the [OpenAI SDK + Temporal integration series](../README.md)**
## What You'll Learn

This tutorial demonstrates how to use Google's Gemini models (or any other LLM provider) with the OpenAI Agents SDK through LiteLLM. The key insight is that LiteLLM provides a unified interface, allowing you to swap models without changing your agent code structure.

**Key insight:** You can use the same OpenAI Agents SDK patterns with any LLM provider supported by LiteLLM - Gemini, Anthropic Claude, Mistral, and many more.
||
| ## Prerequisites | ||
| - Development environment set up (see [main repo README](https://github.com/scaleapi/scale-agentex)) | ||
| - Backend services running: `make dev` from repository root (includes Temporal) | ||
| - Temporal UI available at http://localhost:8233 | ||
| - **Google Gemini API key** (see setup below) | ||
| - Understanding of OpenAI Agents SDK basics (see [060_open_ai_agents_sdk_hello_world](../060_open_ai_agents_sdk_hello_world/)) | ||
|
|
||
| ## Setup | ||
|
|
||
| ### 1. Get a Gemini API Key | ||
|
|
||
| 1. Go to [Google AI Studio](https://aistudio.google.com/apikey) | ||
| 2. Create a new API key | ||
| 3. Copy the key for the next step | ||
|
|
||
| ### 2. Configure the API Key | ||
|
|
||
| Add to your environment or `manifest.yaml`: | ||
|
|
||
| **Option A: Environment variable** | ||
| ```bash | ||
| export GEMINI_API_KEY="your-gemini-api-key-here" | ||
| ``` | ||
|
|
||
| **Option B: In manifest.yaml** | ||
| ```yaml | ||
| agent: | ||
| env: | ||
| GEMINI_API_KEY: "your-gemini-api-key-here" | ||
| ``` | ||
### 3. Install LiteLLM Dependency

The `pyproject.toml` already includes `litellm>=1.52.0`. When you run the agent, dependencies are installed automatically.
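To confirm the installed version meets that floor, a quick standard-library check (assuming dependencies are already installed into the active environment):

```python
from importlib.metadata import version

# The tutorial pins litellm>=1.52.0 in pyproject.toml; confirm what's installed.
print(version("litellm"))
```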
## Quick Start

```bash
cd examples/tutorials/10_async/10_temporal/100_gemini_litellm
uv run agentex agents run --manifest manifest.yaml
```

**Monitor:** Open Temporal UI at http://localhost:8233 to see workflow execution.
## Key Code Changes

The main difference from the OpenAI examples is using `LitellmModel`:

```python
from agents import Agent, Runner
from agents.extensions.models.litellm_model import LitellmModel

# Create a LiteLLM model pointing to Gemini
gemini_model = LitellmModel(model="gemini/gemini-2.0-flash")

agent = Agent(
    name="Gemini Assistant",
    instructions="You are a helpful assistant powered by Gemini.",
    model=gemini_model,  # Use the LiteLLM model instead of the default
)

# Run works exactly the same way
result = await Runner.run(agent, user_messages)
```
## Supported Models

LiteLLM supports many providers. Just change the model string (see the sketch after the table):

| Provider | Model String Example |
|----------|---------------------|
| Google Gemini | `gemini/gemini-2.0-flash`, `gemini/gemini-1.5-pro` |
| Anthropic | `anthropic/claude-3-sonnet-20240229` |
| Mistral | `mistral/mistral-large-latest` |
| Cohere | `cohere/command-r-plus` |
| AWS Bedrock | `bedrock/anthropic.claude-3-sonnet` |

See [LiteLLM Providers](https://docs.litellm.ai/docs/providers) for the full list.
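For instance, reusing the `LitellmModel` import from the snippet above, switching providers is only a different model string (identifiers taken from the table; verify current names against the LiteLLM provider docs):

```python
from agents.extensions.models.litellm_model import LitellmModel

# Same agent code, different provider: only the model string changes.
claude_model = LitellmModel(model="anthropic/claude-3-sonnet-20240229")
mistral_model = LitellmModel(model="mistral/mistral-large-latest")
```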
## Why LiteLLM?

**Model Flexibility:** Switch between providers without code changes - just update the model string.

**Unified Interface:** Same OpenAI Agents SDK patterns work with any provider.

**Cost Optimization:** Easily compare costs across providers by switching models.

**Fallback Support:** LiteLLM supports automatic fallbacks if a provider is unavailable (see the sketch below).
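A minimal sketch of LiteLLM's router-level fallbacks, independent of the Agents SDK integration in this tutorial. The `primary`/`backup` aliases and the Claude fallback model are illustrative choices, not part of the tutorial's code:

```python
from litellm import Router

# Each model_name is an alias that maps to one or more deployments.
router = Router(
    model_list=[
        {"model_name": "primary", "litellm_params": {"model": "gemini/gemini-2.0-flash"}},
        {"model_name": "backup", "litellm_params": {"model": "anthropic/claude-3-sonnet-20240229"}},
    ],
    # If a call to "primary" fails, LiteLLM retries it against "backup".
    fallbacks=[{"primary": ["backup"]}],
)

response = router.completion(
    model="primary",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```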
## Architecture Notes

The Temporal integration remains identical:

- Workflows are durable and survive restarts
- LLM calls are wrapped as activities automatically
- Full observability in Temporal UI
- Automatic retries on failures

The only change is the model provider - everything else works the same.
## When to Use

- Using non-OpenAI models with the OpenAI Agents SDK
- Comparing model performance across providers
- Building multi-model systems with fallbacks
- Optimizing costs across different providers
- Meeting regulatory requirements for specific model providers
## Troubleshooting

**"GEMINI_API_KEY environment variable is not set"**
- Ensure you've exported the API key or added it to `manifest.yaml`

**"Model not found" errors**
- Check that the model string matches LiteLLM's expected format
- See [LiteLLM Providers](https://docs.litellm.ai/docs/providers) for correct model names

**Rate limiting errors**
- Gemini has different rate limits than OpenAI
- Consider adding retry logic or using LiteLLM's built-in retry support (sketched below)
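A minimal sketch of LiteLLM's built-in retries via the `num_retries` parameter. Note that in this tutorial the Temporal plugin already retries failed model activities, so client-side retries are an extra layer:

```python
import litellm

# Ask LiteLLM to retry transient failures (e.g., rate limits, timeouts)
# up to 3 times before raising.
response = litellm.completion(
    model="gemini/gemini-2.0-flash",
    messages=[{"role": "user", "content": "Hello!"}],
    num_retries=3,
)
```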
**Previous:** [090_claude_agents_sdk_mvp](../090_claude_agents_sdk_mvp/) - Claude SDK integration
## examples/tutorials/10_async/10_temporal/100_gemini_litellm/project/__init__.py
1 addition & 0 deletions
```python
# Gemini LiteLLM Tutorial
```
## examples/tutorials/10_async/10_temporal/100_gemini_litellm/project/acp.py
60 additions & 0 deletions
```python
import os
from datetime import timedelta

from temporalio.contrib.openai_agents import OpenAIAgentsPlugin, ModelActivityParameters
from agents.extensions.models.litellm_provider import LitellmProvider

# === DEBUG SETUP (AgentEx CLI Debug Support) ===
if os.getenv("AGENTEX_DEBUG_ENABLED") == "true":
    import debugpy

    debug_port = int(os.getenv("AGENTEX_DEBUG_PORT", "5679"))
    debugpy.configure(subProcess=False)
    debugpy.listen(debug_port)
    if os.getenv("AGENTEX_DEBUG_WAIT_FOR_ATTACH", "false").lower() == "true":
        debugpy.wait_for_client()
# === END DEBUG SETUP ===

from agentex.lib.types.fastacp import TemporalACPConfig
from agentex.lib.sdk.fastacp.fastacp import FastACP
from agentex.lib.core.temporal.plugins.openai_agents.interceptors.context_interceptor import ContextInterceptor

context_interceptor = ContextInterceptor()

# Create the ACP server.
# We use LitellmProvider instead of TemporalStreamingModelProvider
# to enable using Gemini and other models through LiteLLM.
acp = FastACP.create(
    acp_type="async",
    config=TemporalACPConfig(
        # When deployed to the cluster, the Temporal address will automatically be set to the cluster address.
        # For local development, we set the address manually to talk to the local Temporal service set up via docker compose.
        #
        # We use the OpenAI Agents SDK plugin because Temporal has built-in support for it,
        # handling serialization and activity wrapping automatically. LitellmProvider lets us
        # route to different model providers (like Gemini) while keeping all that infrastructure.
        type="temporal",
        temporal_address=os.getenv("TEMPORAL_ADDRESS", "localhost:7233"),
        plugins=[OpenAIAgentsPlugin(
            model_params=ModelActivityParameters(
                start_to_close_timeout=timedelta(days=1)
            ),
            model_provider=LitellmProvider(),
        )],
        interceptors=[context_interceptor]
    )
)

# Notice that we don't need to register any handlers when we use type="temporal".
# If you look at the code in agentex.sdk.fastacp.impl.temporal_acp
# you can see that these handlers are automatically registered when the ACP is created.

# @acp.on_task_create
# This will be handled by the method in your workflow that is decorated with @workflow.run

# @acp.on_task_event_send
# This will be handled by the method in your workflow that is decorated with @workflow.signal(name=SignalName.RECEIVE_MESSAGE)

# @acp.on_task_cancel
# This does not need to be handled by your workflow.
# It is automatically handled by the temporal client, which cancels the workflow directly.
```
## examples/tutorials/10_async/10_temporal/100_gemini_litellm/project/run_worker.py
62 additions & 0 deletions
```python
import asyncio
from datetime import timedelta

from temporalio.contrib.openai_agents import OpenAIAgentsPlugin, ModelActivityParameters
from agents.extensions.models.litellm_provider import LitellmProvider

from project.workflow import At100GeminiLitellmWorkflow
from agentex.lib.utils.debug import setup_debug_if_enabled
from agentex.lib.utils.logging import make_logger
from agentex.lib.environment_variables import EnvironmentVariables
from agentex.lib.core.temporal.activities import get_all_activities
from agentex.lib.core.temporal.workers.worker import AgentexWorker
from agentex.lib.core.temporal.plugins.openai_agents.interceptors.context_interceptor import ContextInterceptor

environment_variables = EnvironmentVariables.refresh()

logger = make_logger(__name__)


async def main():
    # Set up debug mode if enabled
    setup_debug_if_enabled()

    task_queue_name = environment_variables.WORKFLOW_TASK_QUEUE
    if task_queue_name is None:
        raise ValueError("WORKFLOW_TASK_QUEUE is not set")

    # Add activities to the worker
    all_activities = get_all_activities() + []  # add your own activities here

    # ============================================================================
    # LITELLM SETUP: Interceptor + LitellmProvider
    # ============================================================================
    # The ContextInterceptor threads task_id through activity headers using
    # Temporal's interceptor pattern. This enables runtime context without
    # forking the Temporal plugin.
    #
    # We use LitellmProvider instead of TemporalStreamingModelProvider to
    # enable routing to Gemini and other models through LiteLLM.
    context_interceptor = ContextInterceptor()

    # Create a worker with automatic tracing.
    # IMPORTANT: We use the STANDARD temporalio.contrib.openai_agents.OpenAIAgentsPlugin
    # but with LitellmProvider to handle model routing to Gemini.
    worker = AgentexWorker(
        task_queue=task_queue_name,
        plugins=[OpenAIAgentsPlugin(
            model_params=ModelActivityParameters(
                start_to_close_timeout=timedelta(days=1)
            ),
            model_provider=LitellmProvider(),
        )],
        interceptors=[context_interceptor]
    )

    await worker.run(
        activities=all_activities,
        workflow=At100GeminiLitellmWorkflow,
    )


if __name__ == "__main__":
    asyncio.run(main())
```
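The workflow module imported above (`project/workflow.py`) is not included in this diff. As a hypothetical sketch only, based on the README and the surrounding files, `At100GeminiLitellmWorkflow` might look roughly like this (names and signatures are illustrative, not the tutorial's actual code):

```python
from temporalio import workflow

from agents import Agent, Runner


@workflow.defn(name="At100GeminiLitellmWorkflow")  # hypothetical sketch
class At100GeminiLitellmWorkflow:
    @workflow.run
    async def run(self, user_message: str) -> str:
        agent = Agent(
            name="Gemini Assistant",
            instructions="You are a helpful assistant powered by Gemini.",
            # With LitellmProvider configured on the worker, a plain model
            # string is resolved to a LiteLLM-backed model in the activity.
            model="gemini/gemini-2.0-flash",
        )
        # With OpenAIAgentsPlugin registered on the worker, this call runs
        # as a Temporal activity: durable, retried, observable.
        result = await Runner.run(agent, user_message)
        return result.final_output
```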
There was an error while loading. Please reload this page.