
feat(py/google-genai): add Vertex AI rerankers and evaluators#4428

Merged
yesudeep merged 5 commits into main from yesudeep/feat/vertexai-rerank-eval
Feb 5, 2026

Conversation


@yesudeep yesudeep commented Feb 3, 2026

Summary

Adds Vertex AI rerankers and evaluators to the Google GenAI plugin, enabling RAG quality improvements and automated model output evaluation. Also includes infrastructure improvements for interactive GCP API management.

Vertex AI Rerankers

New module: plugins/google_genai/rerankers/

Rerankers improve RAG quality by re-scoring documents based on semantic relevance to a query (two-stage retrieval pattern):

  1. Fast retrieval: Get many candidates quickly (~100 docs)
  2. Quality reranking: Score candidates by relevance, keep top-k
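The two-stage pattern above can be sketched end to end. Note that `retrieve` and `rerank` here are toy stand-ins based on word overlap, purely to illustrate the data flow; they are not the Vertex AI retrieval or ranking APIs:

```python
def retrieve(query: str, corpus: list[str], limit: int) -> list[str]:
    # Stage 1: cheap, recall-oriented retrieval (toy keyword overlap here).
    q = set(query.lower().split())
    hits = [d for d in corpus if q & set(d.lower().split())]
    return hits[:limit]

def rerank(query: str, docs: list[str]) -> list[dict]:
    # Stage 2: precision-oriented re-scoring (toy shared-word fraction here,
    # standing in for a semantic ranker's relevance score).
    q = set(query.lower().split())
    return [
        {'text': d, 'score': len(q & set(d.lower().split())) / len(q)}
        for d in docs
    ]

def two_stage_retrieve(query: str, corpus: list[str],
                       candidates: int = 100, top_k: int = 5) -> list[dict]:
    """Fetch many candidates fast, then keep only the top_k after reranking."""
    docs = retrieve(query, corpus, limit=candidates)
    scored = rerank(query, docs)
    return sorted(scored, key=lambda r: r['score'], reverse=True)[:top_k]

corpus = [
    'machine learning is a field of study',
    'cooking pasta requires boiling water',
    'deep learning is part of machine learning',
]
top = two_stage_retrieve('what is machine learning', corpus, top_k=2)
```

The design point is the split of concerns: stage 1 only has to be cheap enough to scan the whole corpus, because stage 2 repairs its ranking mistakes over a small candidate set.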

Supported Models:

| Model | Description |
|---|---|
| semantic-ranker-default@latest | Latest default semantic ranker |
| semantic-ranker-default-004 | Semantic ranker version 004 |
| semantic-ranker-fast-004 | Fast variant (lower latency) |
| semantic-ranker-default-003 | Semantic ranker version 003 |
| semantic-ranker-default-002 | Semantic ranker version 002 |

Usage:

from genkit import Genkit
from genkit.plugins.google_genai import VertexAI

ai = Genkit(plugins=[VertexAI(project='my-project')])

ranked_docs = await ai.rerank(
    reranker='vertexai/semantic-ranker-default@latest',
    query='What is machine learning?',
    documents=retrieved_docs,
    options={'top_n': 5},
)

Vertex AI Evaluators

New module: plugins/google_genai/evaluators/

Evaluators are automatically registered when using the VertexAI plugin and accessed via ai.evaluate().

Supported Metrics:

| Metric | Description |
|---|---|
| BLEU | Translation quality (compared to a reference) |
| ROUGE | Summarization quality |
| FLUENCY | Language mastery and readability |
| SAFETY | Harmful/inappropriate content detection |
| GROUNDEDNESS | Hallucination detection |
| SUMMARIZATION_QUALITY | Overall summarization ability |
| SUMMARIZATION_HELPFULNESS | Usefulness as a summary substitute |
| SUMMARIZATION_VERBOSITY | Conciseness |

Usage:

from genkit import Genkit
from genkit.core.typing import BaseDataPoint
from genkit.plugins.google_genai import VertexAI

ai = Genkit(plugins=[VertexAI(project='my-project')])

results = await ai.evaluate(
    evaluator='vertexai/fluency',
    dataset=[BaseDataPoint(input='...', output='...')],
)

HTTP Client Caching (Fixes #4420)

Introduces genkit.core.http_client module for per-event-loop client caching, solving "bound to different event loop" errors with httpx.AsyncClient.
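The caching idea can be sketched as follows. This is illustrative only: the real `genkit.core.http_client` API may differ, and a dependency-free stub class stands in for `httpx.AsyncClient`:

```python
import asyncio
from weakref import WeakKeyDictionary

class StubAsyncClient:
    """Dependency-free stand-in for httpx.AsyncClient in this sketch."""
    def __init__(self):
        self.is_closed = False

# One cached client per event loop; WeakKeyDictionary drops an entry
# automatically when its event loop is garbage collected.
_clients: WeakKeyDictionary = WeakKeyDictionary()

def get_cached_client() -> StubAsyncClient:
    loop = asyncio.get_running_loop()
    client = _clients.get(loop)
    if client is None or client.is_closed:
        client = StubAsyncClient()
        _clients[loop] = client
    return client

async def _same_loop() -> bool:
    # Within one event loop, repeated calls reuse the same client.
    return get_cached_client() is get_cached_client()

same = asyncio.run(_same_loop())
```

Because the cache is keyed by the running loop, a client created under one loop is never handed to another loop, which is exactly the condition that triggers the "bound to different event loop" error.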

Interactive GCP API Management

Adds reusable gcloud helper functions to _common.sh for all samples:

| Function | Purpose |
|---|---|
| check_gcloud_installed | Verify gcloud CLI is installed |
| check_gcloud_auth | Check ADC and prompt for login if needed |
| is_api_enabled | Check if a specific API is enabled |
| enable_required_apis | Only prompts if APIs need enabling |
| run_gcp_setup | One-liner to run complete GCP setup |

Sample

New sample: py/samples/vertexai-rerank-eval/ demonstrates both features with automated gcloud setup.


Review Feedback Addressed

  • ✅ Removed redundant location fallback in google.py (self._location is set in init)
  • ✅ Fixed docstring to show correct DEFAULT_LOCATION (global, not us-central1)
  • ✅ Simplified _extract_score/_extract_reasoning by removing unnecessary None checks
  • ✅ Fixed README.md evaluator example to show correct ai.evaluate() usage
  • ✅ Updated test fixture to use constructor parameters instead of private attributes
  • ✅ Acknowledged metadata limitation in reranker response (schema constraint from upstream)
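For context on the `_extract_score`/`_extract_reasoning` simplification above: a Vertex AI `evaluateInstances`-style response nests the score under a per-metric result key. A hypothetical minimal extractor (the field names and response shape here are assumptions, not the plugin's actual code):

```python
def extract_score(result: dict) -> float:
    """Pull the numeric score from a single-metric evaluation result.

    Assumed shape (hypothetical):
    {'fluencyResult': {'score': 4.0, 'explanation': '...'}}
    """
    (metric_result,) = result.values()  # exactly one metric key expected
    return metric_result['score']

def extract_reasoning(result: dict) -> str:
    # Same assumed shape; falls back to '' if no explanation is present.
    (metric_result,) = result.values()
    return metric_result.get('explanation', '')

response = {'fluencyResult': {'score': 4.0, 'explanation': 'Clear and well formed.'}}
score = extract_score(response)
reason = extract_reasoning(response)
```

With a guaranteed single-metric response, the tuple-unpacking form makes the "exactly one result" assumption explicit and removes the need for defensive None checks.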

Files Changed

| File | Description |
|---|---|
| packages/genkit/src/genkit/core/http_client.py | Per-event-loop HTTP client caching |
| packages/genkit/tests/genkit/core/http_client_test.py | 18 unit tests |
| plugins/google-genai/evaluators/* | Evaluator implementation |
| plugins/google-genai/rerankers/* | Reranker implementation |
| plugins/google-genai/tests/* | Unit tests |
| plugins/google-genai/README.md | Documentation updates |
| samples/_common.sh | Reusable gcloud helper functions |
| samples/vertexai-rerank-eval/* | Demo sample |
| samples/firestore-retreiver/run.sh | Refactored to use common gcloud helpers |
| samples/google-genai-vertexai-hello/run.sh | Refactored to use common gcloud helpers |
| samples/google-genai-vertexai-image/run.sh | Refactored to use common gcloud helpers |
| samples/README.md | Sample documentation |

Test Plan

cd py

# Run all tests
uv run pytest plugins/google-genai/tests/ -v
uv run pytest packages/genkit/tests/genkit/core/http_client_test.py -v

# Run sample
cd samples/vertexai-rerank-eval
export GOOGLE_CLOUD_PROJECT=your-project-id
./run.sh

Related Issues

@yesudeep yesudeep requested a review from huangjeff5 as a code owner February 3, 2026 22:59
@github-actions github-actions bot added the docs (Improvements or additions to documentation), feature (New feature or request), python (Python), and config labels Feb 3, 2026
@gemini-code-assist

Summary of Changes

Hello @yesudeep, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly expands the capabilities of the Google GenAI plugin by integrating Vertex AI's advanced reranking and evaluation services. It allows users to improve the relevance of retrieved documents in RAG pipelines and to automatically assess the quality and safety of AI model outputs using a suite of metrics. A foundational improvement to HTTP client management in asynchronous contexts has also been included to ensure stability and performance for these new features.

Highlights

  • Vertex AI Rerankers Integration: Introduced support for Vertex AI rerankers within the Google GenAI plugin. These rerankers enhance Retrieval-Augmented Generation (RAG) quality by re-scoring documents based on semantic relevance to a query, following a two-stage retrieval pattern. Supported models include various versions of semantic-ranker-default and semantic-ranker-fast.
  • Vertex AI Evaluators Integration: Added Vertex AI evaluators to the Google GenAI plugin, enabling automated assessment of model outputs. These evaluators leverage the Vertex AI Evaluation API to measure quality metrics such as BLEU, ROUGE, FLUENCY, SAFETY, GROUNDEDNESS, and various summarization qualities.
  • Per-Event-Loop HTTP Client Caching: Implemented a new genkit.core.http_client module to provide per-event-loop caching for httpx.AsyncClient instances. This resolves 'bound to different event loop' errors and improves connection reuse efficiency for asynchronous operations across the framework.
  • New Sample Application: A new sample application (py/samples/vertexai-rerank-eval/) has been added to demonstrate the usage and capabilities of both the Vertex AI rerankers and evaluators, complete with automated gcloud setup scripts.


Changelog
  • py/packages/genkit/src/genkit/core/__init__.py
    • Imported clear_client_cache, close_cached_clients, and get_cached_client from the new http_client module.
    • Added the newly imported HTTP client functions to the module's __all__ export list.
  • py/packages/genkit/src/genkit/core/http_client.py
    • Added a new module for per-event-loop httpx.AsyncClient caching.
    • Implemented get_cached_client to provide cached or new httpx.AsyncClient instances, preventing 'bound to different event loop' errors.
    • Included close_cached_clients for explicit client cleanup and clear_client_cache for cache management, primarily for testing.
    • Utilizes WeakKeyDictionary to automatically clean up clients when their associated event loops are garbage collected.
  • py/packages/genkit/tests/genkit/core/http_client_test.py
    • Added comprehensive unit tests for the new http_client module.
    • Tests cover client caching, isolation across different cache keys, proper replacement of closed clients, and correct application of headers and timeouts.
    • Includes tests for cleanup utilities like close_cached_clients and clear_client_cache.
  • py/plugins/google-genai/pyproject.toml
    • Adjusted the placement of the license field for consistency.
    • Added vertexai-rerank-eval to the project.optional-dependencies.samples section.
  • py/plugins/google-genai/src/genkit/plugins/google_genai/evaluators/__init__.py
    • Added a new module to encapsulate Vertex AI Evaluators.
    • Provides an overview, key concepts, data flow, and available metrics for Vertex AI Evaluation.
    • Exports VertexAIEvaluationMetricType and create_vertex_evaluators for external use.
  • py/plugins/google-genai/src/genkit/plugins/google_genai/evaluators/evaluation.py
    • Implemented the core logic for integrating Vertex AI Evaluation API.
    • Defined VertexAIEvaluationMetricType enum and VertexAIEvaluationMetricConfig for metric specification.
    • Created EvaluatorFactory to manage API calls to the Vertex AI evaluateInstances endpoint and generate evaluator functions.
    • Configured request and response handlers for various evaluation metrics (BLEU, ROUGE, FLUENCY, SAFETY, GROUNDEDNESS, SUMMARIZATION_QUALITY, SUMMARIZATION_HELPFULNESS, SUMMARIZATION_VERBOSITY).
    • Integrated Google Cloud Application Default Credentials (ADC) for authentication and the new cached HTTP client for API requests.
  • py/plugins/google-genai/src/genkit/plugins/google_genai/google.py
    • Imported necessary types and functions to support Vertex AI rerankers.
    • Modified the init method to register available Vertex AI rerankers.
    • Updated the resolve method to handle ActionKind.RERANKER requests.
    • Added a _resolve_reranker method to create Action objects for rerankers, including request/response mapping and API interaction.
  • py/plugins/google-genai/src/genkit/plugins/google_genai/rerankers/__init__.py
    • Added a new module to encapsulate Vertex AI Rerankers.
    • Provides an overview, key concepts, data flow in RAG, available models, and usage examples for Vertex AI Reranking.
    • Exports core reranker types and functions for use within the plugin.
  • py/plugins/google-genai/src/genkit/plugins/google_genai/rerankers/reranker.py
    • Implemented the core logic for interacting with the Vertex AI Discovery Engine Ranking API.
    • Defined constants for default location, model names, and known reranker models.
    • Created Pydantic models for VertexRerankerConfig, RerankRequestRecord, RerankRequest, RerankResponseRecord, RerankResponse, and VertexRerankerClientOptions to structure API interactions.
    • Developed reranker_rank function for asynchronous API calls to the Discovery Engine.
    • Provided utility functions _to_reranker_doc and _from_rerank_response for converting between Genkit's Document format and the reranker API's record format.
    • Utilized Google Cloud ADC for authentication and the new cached HTTP client for API requests.
  • py/plugins/google-genai/test/google_plugin_test.py
    • Modified the vertexai_plugin_instance fixture to set mock _project and _location attributes on the _client for proper testing of reranker resolution.
  • py/plugins/google-genai/tests/evaluators_test.py
    • Added new unit tests for the Vertex AI Evaluators module.
    • Tests cover the VertexAIEvaluationMetricType enum, VertexAIEvaluationMetricConfig model, stringification utilities, EvaluatorFactory functionality (initialization, API call structure, error handling), and the create_vertex_evaluators function.
  • py/plugins/google-genai/tests/rerankers_test.py
    • Added new unit tests for the Vertex AI Rerankers module.
    • Tests validate constants, model name checks, Pydantic models for requests/responses, URL generation, document conversion functions, and the reranker_rank API call structure.
  • py/samples/vertexai-rerank-eval/LICENSE
    • Added the Apache License 2.0 file for the new sample.
  • py/samples/vertexai-rerank-eval/README.md
    • Added a new README file detailing the Vertex AI Rerankers and Evaluators demo.
    • Includes sections on features, quick start, manual setup, testing flows, supported models/metrics, and troubleshooting.
  • py/samples/vertexai-rerank-eval/pyproject.toml
    • Added a new pyproject.toml file for the Vertex AI Rerankers and Evaluators demo.
    • Specifies project metadata and dependencies, including genkit and genkit-plugin-google-genai.
  • py/samples/vertexai-rerank-eval/run.sh
    • Added a new run.sh script to automate the setup and execution of the Vertex AI Rerankers and Evaluators demo.
    • Handles gcloud authentication, enables required Google Cloud APIs (aiplatform.googleapis.com, discoveryengine.googleapis.com), installs dependencies, and starts the Genkit Dev UI.
  • py/samples/vertexai-rerank-eval/src/main.py
    • Added the main Python application file for the Vertex AI Rerankers and Evaluators demo.
    • Defines Genkit flows for demonstrating rerank_documents, rag_with_reranking, evaluate_fluency, evaluate_safety, evaluate_groundedness, evaluate_bleu, and evaluate_summarization.
Activity
  • New modules for Vertex AI rerankers and evaluators have been introduced, including their core logic, types, and API integrations.
  • A new HTTP client caching mechanism has been implemented and integrated to improve performance and stability for asynchronous operations.
  • Comprehensive unit tests have been added for both the new HTTP client and the Vertex AI reranker and evaluator functionalities.
  • A new sample application demonstrating the usage of Vertex AI rerankers and evaluators has been created, along with its setup scripts and documentation.

@gemini-code-assist left a comment

Code Review

This pull request is a significant feature addition, introducing Vertex AI rerankers and evaluators to the Google GenAI plugin. These are valuable for improving RAG quality and assessing model outputs. The implementation is well-structured, with clear separation of concerns into new modules for rerankers, evaluators, and a core HTTP client utility. The new http_client with per-event-loop caching is a robust solution to a common problem with async clients. The addition of a comprehensive sample application with an automated setup script is excellent for developer experience. The code is of high quality, and I have a few suggestions for refactoring to improve maintainability and adherence to best practices.

@yesudeep yesudeep force-pushed the yesudeep/feat/vertexai-rerank-eval branch from b04be26 to 6aaaeb1 Compare February 3, 2026 23:03
@yesudeep yesudeep marked this pull request as draft February 3, 2026 23:07
@yesudeep yesudeep force-pushed the yesudeep/feat/vertexai-rerank-eval branch 2 times, most recently from 89a7893 to 8415873 Compare February 3, 2026 23:11

yesudeep commented Feb 3, 2026

/gemini review

@gemini-code-assist left a comment

Code Review

This pull request introduces significant new features: Vertex AI rerankers and evaluators, along with a core HTTP client caching mechanism. The implementation is generally of high quality, with good structure, documentation, and comprehensive tests.

My review focuses on a few key areas for improvement:

  • Asynchronous Correctness: There are a couple of instances of blocking I/O calls within async functions for authentication token refreshing. These should be converted to their async equivalents to avoid blocking the event loop.
  • Error Handling: The exception handling in the new API clients can be made more specific to avoid catching unexpected errors.
  • Code Simplification: A minor opportunity for code simplification in the reranker response handling.
  • Test Accuracy: A small correction is needed in a test fixture to ensure it correctly mocks the state for the code under test.

Overall, this is a great contribution that adds valuable capabilities to the Genkit Google GenAI plugin. The new sample is also very helpful for demonstrating these features.

@yesudeep yesudeep force-pushed the yesudeep/feat/vertexai-rerank-eval branch from 8415873 to d7e9fd7 Compare February 3, 2026 23:19

yesudeep commented Feb 3, 2026

/gemini review

@gemini-code-assist

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@yesudeep yesudeep force-pushed the yesudeep/feat/vertexai-rerank-eval branch from d7e9fd7 to 5d99e74 Compare February 3, 2026 23:24

yesudeep commented Feb 3, 2026

/gemini review

@gemini-code-assist

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@yesudeep yesudeep force-pushed the yesudeep/feat/vertexai-rerank-eval branch from 88c797c to 674de70 Compare February 5, 2026 01:47

yesudeep commented Feb 5, 2026

/gemini review

@yesudeep yesudeep marked this pull request as ready for review February 5, 2026 01:52
@gemini-code-assist left a comment

Code Review

This pull request introduces support for Vertex AI rerankers and evaluators. The implementation for both features is well-structured and includes comprehensive unit tests and a new sample application.

However, there are a few significant issues to address:

  • The reranker implementation currently drops all original metadata from documents, which could break RAG pipelines.
  • The evaluator feature appears to be incompletely integrated. The VertexAI plugin does not register the evaluator actions, and the documentation provides an incorrect example of how to configure them, making the feature unusable as is.

I've also included a minor suggestion to improve the test code's clarity. Overall, this is a great addition, but the issues with metadata loss and evaluator integration need to be resolved before merging.

@yesudeep yesudeep marked this pull request as draft February 5, 2026 01:59

yesudeep commented Feb 5, 2026

Addressed.

@yesudeep yesudeep marked this pull request as ready for review February 5, 2026 18:27
Adds Vertex AI rerankers and evaluators to the Google GenAI plugin,
enabling RAG quality improvements and automated model output evaluation.

## Vertex AI Rerankers

Rerankers improve RAG quality by re-scoring documents based on semantic
relevance to a query (two-stage retrieval pattern):
1. Fast retrieval: Get many candidates quickly
2. Quality reranking: Score candidates by relevance, keep top-k

Supported models:
- semantic-ranker-default@latest
- semantic-ranker-default-004
- semantic-ranker-fast-004
- semantic-ranker-default-003
- semantic-ranker-default-002

## Vertex AI Evaluators

Evaluators assess model outputs for quality metrics using the Vertex AI
Evaluation API. Supported metrics:
- BLEU, ROUGE (translation/summarization quality)
- FLUENCY (language mastery)
- SAFETY (harmful content detection)
- GROUNDEDNESS (hallucination detection)
- SUMMARIZATION_QUALITY/HELPFULNESS/VERBOSITY

## HTTP Client Caching

Introduces genkit.core.http_client module for per-event-loop client
caching, solving "bound to different event loop" errors with httpx.

## Sample

New sample: py/samples/vertexai-rerank-eval/ demonstrates both features
with automated gcloud setup.
- Add reusable gcloud functions: check_gcloud_installed, check_gcloud_auth,
  is_api_enabled, enable_required_apis, run_gcp_setup
- Interactive API enablement: only prompts if APIs are not enabled
- Refactor vertexai-rerank-eval, firestore-retreiver, google-genai-vertexai-hello,
  google-genai-vertexai-image to use the common functions
- Reduces ~175 lines of duplicate code per sample
- Fix README.md evaluator example to show correct usage (ai.evaluate())
- Update test fixture to use constructor parameters instead of setting
  private attributes
@yesudeep yesudeep force-pushed the yesudeep/feat/vertexai-rerank-eval branch from 0e19bac to bbb7e0e Compare February 5, 2026 18:28

yesudeep commented Feb 5, 2026

/gemini review

@gemini-code-assist left a comment

Code Review

This pull request introduces significant new capabilities by adding support for Vertex AI rerankers and evaluators to the Google GenAI plugin. The changes are well-structured, including new modules for the features, comprehensive unit tests, and a new sample application to demonstrate usage. I especially appreciate the addition of reusable gcloud helper functions in _common.sh, which greatly improves the developer experience for all samples. The code quality is high, and my feedback consists of a few minor suggestions for simplification and documentation consistency.

- Remove redundant location fallback in google.py (self._location is set in __init__)
- Fix docstring to show correct DEFAULT_LOCATION (global, not us-central1)
- Simplify _extract_score/_extract_reasoning by removing unnecessary None checks
@yesudeep yesudeep enabled auto-merge (squash) February 5, 2026 18:35
@yesudeep yesudeep disabled auto-merge February 5, 2026 18:47
@yesudeep yesudeep merged commit 9997401 into main Feb 5, 2026
17 checks passed
@yesudeep yesudeep deleted the yesudeep/feat/vertexai-rerank-eval branch February 5, 2026 18:48
@yesudeep yesudeep mentioned this pull request Feb 6, 2026

Labels

config, docs (Improvements or additions to documentation), feature (New feature or request), python (Python)

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

[PY] Some sort of per-event-loop cache to avoid create a fresh httpx client in cf-ai plugin

2 participants