fix(py/typing): respect additionalProperties from JSON schema #4451

Merged
huangjeff5 merged 1 commit into main from
yesudeep/fix/schema-additionalproperties
Feb 5, 2026

@yesudeep yesudeep commented Feb 5, 2026

Summary

Fixes the Python schema generator to respect additionalProperties: true from the canonical JSON schema, matching JavaScript behavior for models using .passthrough() in Zod.

Problem

The Python schema sanitizer (py/bin/sanitize_schema_typing.py) was always setting extra='forbid' for all Pydantic models, regardless of the additionalProperties setting in the JSON schema. This caused a parity issue with the JavaScript implementation where models use .passthrough() to allow extra properties.

Impact:

  • Reranker metadata was being discarded
  • Evaluation score details beyond reasoning were rejected
  • Operation error details beyond message were rejected

Solution

Updated the schema sanitizer to:

  1. Read genkit-tools/genkit-schema.json at generation time
  2. Identify models with additionalProperties: true:
    • Top-level $defs (e.g., RankedDocumentMetadata)
    • Inline nested objects (e.g., Score.details → Details class, Operation.error → Error class)
  3. Use extra='allow' for those models (equivalent to .passthrough() in Zod)
  4. Use extra='forbid' for all other models (default behavior)
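
The detection in steps 1 and 2 can be sketched roughly as follows. This is a minimal illustration, not the code from py/bin/sanitize_schema_typing.py: the function name `load_models_allowing_extra_sketch`, the `SAMPLE_SCHEMA` document, and the rule that an inline property maps to a class named after the capitalized property are all assumptions made for the example.

```python
import json
from typing import Any, Dict, Set


def load_models_allowing_extra_sketch(schema: Dict[str, Any]) -> Set[str]:
    """Collect class names whose JSON schema sets additionalProperties: true."""
    allowing: Set[str] = set()
    for name, definition in schema.get('$defs', {}).items():
        if not isinstance(definition, dict):
            continue
        # Top-level definition, e.g. RankedDocumentMetadata.
        if definition.get('additionalProperties') is True:
            allowing.add(name)
        # Inline nested objects, e.g. Score.details -> Details.
        for prop_name, prop in definition.get('properties', {}).items():
            if isinstance(prop, dict) and prop.get('additionalProperties') is True:
                # Assumption: the generated class is the capitalized property name.
                allowing.add(prop_name[:1].upper() + prop_name[1:])
    return allowing


# Hypothetical, heavily reduced stand-in for genkit-tools/genkit-schema.json.
SAMPLE_SCHEMA = json.loads("""
{
  "$defs": {
    "RankedDocumentMetadata": {
      "type": "object",
      "additionalProperties": true
    },
    "Score": {
      "type": "object",
      "additionalProperties": false,
      "properties": {
        "details": {"type": "object", "additionalProperties": true}
      }
    },
    "Operation": {
      "type": "object",
      "additionalProperties": false,
      "properties": {
        "error": {"type": "object", "additionalProperties": true}
      }
    }
  }
}
""")

models = load_models_allowing_extra_sketch(SAMPLE_SCHEMA)
print(sorted(models))
```

Models absent from the returned set would keep extra='forbid', which preserves the strict default for everything the schema does not explicitly open up.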

Changes

| File | Description |
| --- | --- |
| py/bin/sanitize_schema_typing.py | Added load_models_allowing_extra() function to read JSON schema and identify passthrough models (both top-level and inline). Updated ClassTransformer to use extra='allow' for those models. |
| py/packages/genkit/src/genkit/core/typing.py | Regenerated - three classes now use extra='allow' instead of extra='forbid' |
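
For readers unfamiliar with this kind of generator post-processing, the ClassTransformer change might look something like the sketch below. It assumes the transformer is an `ast.NodeTransformer` that rewrites each class's `model_config`; the class name `ClassTransformerSketch`, the helper `rewrite`, and the sample source are hypothetical, and unlike a real sanitizer this version does not check that the class derives from BaseModel.

```python
import ast
from typing import Optional, Set


def create_model_config(allow_extra: bool) -> ast.stmt:
    """Build the AST for `model_config = ConfigDict(extra=...)`."""
    extra = 'allow' if allow_extra else 'forbid'
    return ast.parse(f'model_config = ConfigDict(extra={extra!r})').body[0]


def _is_model_config(stmt: ast.stmt) -> bool:
    return isinstance(stmt, ast.Assign) and any(
        isinstance(target, ast.Name) and target.id == 'model_config'
        for target in stmt.targets
    )


class ClassTransformerSketch(ast.NodeTransformer):
    """Sets extra='allow' for listed classes and extra='forbid' otherwise."""

    def __init__(self, models_allowing_extra: Optional[Set[str]] = None) -> None:
        self.models_allowing_extra = models_allowing_extra or set()

    def visit_ClassDef(self, node: ast.ClassDef) -> ast.ClassDef:
        self.generic_visit(node)
        config = create_model_config(node.name in self.models_allowing_extra)
        for index, stmt in enumerate(node.body):
            if _is_model_config(stmt):
                node.body[index] = config  # replace the existing config in place
                break
        else:
            # No config yet: insert it after the docstring, if there is one.
            has_docstring = ast.get_docstring(node) is not None
            node.body.insert(1 if has_docstring else 0, config)
        return node


def rewrite(source: str, models_allowing_extra: Set[str]) -> str:
    tree = ClassTransformerSketch(models_allowing_extra).visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


# Illustrative input only; the field names are not taken from typing.py.
SOURCE = '''
class RankedDocumentMetadata(BaseModel):
    model_config = ConfigDict(extra='forbid')
    score: float

class Score(BaseModel):
    model_config = ConfigDict(extra='forbid')
    score: float
'''

print(rewrite(SOURCE, {'RankedDocumentMetadata'}))
```

Passing the set into the transformer keeps the schema lookup separate from the AST rewrite, so the rewrite itself stays a pure function of the source text and the set of names.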

Affected Models

| Model | Source | Purpose |
| --- | --- | --- |
| RankedDocumentMetadata | Top-level $defs | Preserves arbitrary metadata during reranking |
| Details | Inline in Score.details | Allows extra evaluation score details |
| Error | Inline in Operation.error | Allows extra error details |

Before/After

Before:

class RankedDocumentMetadata(BaseModel):
    model_config = ConfigDict(..., extra='forbid', ...)  # Rejects extra fields

class Details(BaseModel):
    model_config = ConfigDict(..., extra='forbid', ...)  # Rejects extra fields

class Error(BaseModel):
    model_config = ConfigDict(..., extra='forbid', ...)  # Rejects extra fields

After:

class RankedDocumentMetadata(BaseModel):
    model_config = ConfigDict(..., extra='allow', ...)  # Matches JS .passthrough()

class Details(BaseModel):
    model_config = ConfigDict(..., extra='allow', ...)  # Matches JS .passthrough()

class Error(BaseModel):
    model_config = ConfigDict(..., extra='allow', ...)  # Matches JS .passthrough()
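
To make the behavioral difference concrete, here is a small standalone sketch using Pydantic v2. The classes `StrictDetails` and `OpenDetails` are hypothetical stand-ins for the generated Details model before and after this change, and the `confidence` key is an invented extra field.

```python
from typing import Optional

from pydantic import BaseModel, ConfigDict, ValidationError


class StrictDetails(BaseModel):
    """Stand-in for the old generated model (extra='forbid')."""

    model_config = ConfigDict(extra='forbid')
    reasoning: Optional[str] = None


class OpenDetails(BaseModel):
    """Stand-in for the regenerated model (extra='allow')."""

    model_config = ConfigDict(extra='allow')
    reasoning: Optional[str] = None


# 'confidence' is not a declared field, so it counts as an extra property.
payload = {'reasoning': 'looks relevant', 'confidence': 0.9}

try:
    StrictDetails.model_validate(payload)
    rejected = False
except ValidationError:
    rejected = True  # extra='forbid' refuses the undeclared key

open_details = OpenDetails.model_validate(payload)

print(rejected)
print(open_details.model_extra)   # undeclared keys are kept here
print(open_details.model_dump())  # and survive serialization
```

With extra='allow' the undeclared key round-trips through validation and serialization, which is the property the reranker metadata and score details rely on.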

Testing

  • All 500 tests pass
  • Verified all three models now correctly have extra='allow'
  • Verified the detection works for both top-level and inline schemas

Related

@gemini-code-assist

Summary of Changes

Hello @yesudeep, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves a critical data handling inconsistency between Python and JavaScript schema implementations. By enhancing the Python schema generation script to interpret additionalProperties: true from the canonical JSON schema, it now correctly configures Pydantic models to accept extra fields. This change specifically rectifies an issue where RankedDocumentMetadata was discarding valuable reranker metadata, ensuring consistent behavior across platforms and preventing data loss.

Highlights

  • Dynamic extra setting for Pydantic models: The Python schema generator now dynamically sets the extra configuration for Pydantic models based on the additionalProperties field in the canonical JSON schema. This ensures that models which should allow extra fields (equivalent to Zod's .passthrough()) are correctly configured with extra='allow'.
  • JSON Schema Integration: The sanitize_schema_typing.py script has been enhanced to read genkit-tools/genkit-schema.json at generation time. It identifies models where additionalProperties: true is specified, allowing for precise control over field validation.
  • Reranker Metadata Parity Fix: The RankedDocumentMetadata model in Python is now correctly configured with extra='allow'. This resolves a previous issue where extra metadata on documents was being discarded, ensuring parity with the JavaScript implementation and preventing data loss for reranker metadata.

Changelog
  • py/bin/sanitize_schema_typing.py
    • Added json import for parsing JSON schema files.
    • Modified ClassTransformer.__init__ to accept an optional models_allowing_extra set, enabling it to know which models should allow extra properties.
    • Updated create_model_config to conditionally set extra='allow' or extra='forbid' based on a new allow_extra parameter, aligning with additionalProperties in JSON schema.
    • Introduced a new function load_models_allowing_extra to read genkit-schema.json and identify model names that have additionalProperties: true.
    • Updated process_file to utilize load_models_allowing_extra and pass the resulting set to the ClassTransformer instance.
    • Modified the main function to dynamically determine the path to genkit-schema.json relative to the typing.py file being processed.
  • py/packages/genkit/src/genkit/core/typing.py
    • The RankedDocumentMetadata model's model_config was regenerated, changing its extra setting from extra='forbid' to extra='allow' to correctly handle additional properties.
Activity
  • No specific activity (comments, reviews, or progress updates) has been recorded for this pull request yet.
Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request updates the Python schema sanitizer to respect the additionalProperties: true setting from the JSON schema, which brings it to parity with the JavaScript implementation. The changes correctly load the schema, identify models that allow extra properties, and adjust the Pydantic model generation accordingly. The implementation is solid. I've added one suggestion to improve the robustness of how the script locates the schema file, making it less dependent on a fixed directory structure.
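
The suggestion itself is not reproduced in this thread. One common way to make such a lookup less dependent on a fixed directory depth is to walk up from the file being processed until the schema is found. The sketch below assumes that approach; `find_schema_file` and `SCHEMA_RELATIVE_PATH` are hypothetical names, not part of the script.

```python
from pathlib import Path
from typing import Optional

# Location of the schema relative to the repository root, as named in the PR.
SCHEMA_RELATIVE_PATH = Path('genkit-tools') / 'genkit-schema.json'


def find_schema_file(start: Path) -> Optional[Path]:
    """Search `start`'s directory and each ancestor for the schema file."""
    start = start.resolve()
    base = start if start.is_dir() else start.parent
    for directory in (base, *base.parents):
        candidate = directory / SCHEMA_RELATIVE_PATH
        if candidate.is_file():
            return candidate
    return None  # caller decides whether a missing schema is an error
```

A caller could pass the path of the typing.py being processed and fall back to extra='forbid' for every model when the function returns None.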

The Python schema sanitizer was always setting extra='forbid' for all
Pydantic models, ignoring the additionalProperties setting from the
canonical JSON schema.

This fix:
- Reads genkit-schema.json to identify models with additionalProperties: true
- Handles both top-level $defs and inline nested objects (e.g., Score.details)
- Uses extra='allow' for those models (equivalent to .passthrough() in Zod)
- Uses extra='forbid' for all other models (default behavior)

Models now correctly using extra='allow':
- RankedDocumentMetadata (reranker metadata preservation)
- Details (Score.details for evaluation reasoning)
- Error (Operation.error for error details)

This matches the JavaScript .passthrough() behavior and ensures JS/Python
parity for type validation.
@yesudeep yesudeep force-pushed the yesudeep/fix/schema-additionalproperties branch from 6418cc9 to 300a118 on February 5, 2026 02:39
@huangjeff5 huangjeff5 merged commit 09fbe22 into main Feb 5, 2026
17 checks passed
@huangjeff5 huangjeff5 deleted the yesudeep/fix/schema-additionalproperties branch February 5, 2026 16:04
@yesudeep yesudeep mentioned this pull request Feb 6, 2026