Skip to content

bluejay-validator performance optimizations#96

Open
swalkinshaw wants to merge 4 commits intomainfrom
validator-perf-optimizations
Open

bluejay-validator performance optimizations#96
swalkinshaw wants to merge 4 commits intomainfrom
validator-perf-optimizations

Conversation

@swalkinshaw
Copy link
Contributor

Reduce validator allocation overhead — ~28% faster on representative queries

Optimizations:

  • Replace HashSet/HashMap with Vec + linear scan in hot paths where N is small (parent_fragments, cycle detection, required arguments, argument equivalence, type overlap checks)
  • Eliminate Path Vec allocation by making Path Copy
  • Optimize duplicates() to skip BTreeMap allocation when no duplicates found (common case), and avoid intermediate (K,T) vec
  • Reuse Cache's fragment_definitions HashMap in FragmentSpreadTargetDefined instead of building a separate HashSet
  • Collect errors via &mut Vec in FieldSelectionMerging instead of returning intermediate Vecs

Also adds a criterion benchmark suite for the validation pipeline.

Note: this is a manually curated (with the help of Claude) and cleaned up version of an /autoresearch run

…queries

Optimizations:

- Replace HashSet/HashMap with Vec + linear scan in hot paths where N
  is small (parent_fragments, cycle detection, required arguments,
  argument equivalence, type overlap checks)
- Eliminate Path Vec allocation by making Path Copy
- Optimize duplicates() to skip BTreeMap allocation when no duplicates
  found (common case), and avoid intermediate (K,T) vec
- Reuse Cache's fragment_definitions HashMap in FragmentSpreadTargetDefined
  instead of building a separate HashSet
- Collect errors via &mut Vec in FieldSelectionMerging instead of
  returning intermediate Vecs

Also adds a criterion benchmark suite for the validation pipeline.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@swalkinshaw swalkinshaw requested a review from adampetro March 17, 2026 18:15
Comment on lines +77 to +80
// Fast path: if either type is not a composite type, spread is not applicable
if !Self::is_composite(parent_type) || !Self::is_composite(fragment_type) {
return false;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😅 kinda confusing how I wrote this because I would think spread would not be possible for non-composite parent types, I guess it kinda makes sense because we shouldn't be returning an error here because there would be a different error from a different rule (FragmentsOnCompositeTypes)

Comment on lines +29 to +32
if self
.cache
.fragment_definition(fragment_spread.name())
.is_none()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👌

Comment on lines 33 to 35
pub fn members(&self) -> &[&'a E::Selection] {
&self.members
&[]
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is just wrong now, no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah pushed a 3rd separate which adds a leave_field hook on the visitor so that the typegen crate can track this itself and we can remove the concern from the validator entirely

@swalkinshaw swalkinshaw force-pushed the validator-perf-optimizations branch 2 times, most recently from 53f7169 to afcddc5 Compare March 23, 2026 16:27
swalkinshaw and others added 2 commits March 23, 2026 14:01
- Add comment explaining Interface parent type grouping in field_selection_merging
- Use existing TypeDefinitionReference::is_composite() instead of custom method
- Rewrite types_have_overlap as match (a, b) to avoid redundant expansion of b's types
- Use Iterator::next() to avoid Vec allocation for 0-1 element iterators in duplicates()
- Use itertools::combinations(2) for pairwise duplicate check

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Path is now purely a root identifier (Copy, no Vec allocation).
Removed with_selection/members stubs that were broken after the Copy
optimization. Added leave_field to the Visitor trait so
PathsWithCustomScalarType can maintain its own field stack.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@swalkinshaw swalkinshaw force-pushed the validator-perf-optimizations branch from afcddc5 to 9320f4f Compare March 23, 2026 19:02
pub(crate) struct PathsWithCustomScalarType<'a, S: SchemaDefinition> {
schema_definition: &'a S,
paths: HashSet<Vec<String>>,
field_stack: Vec<String>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it much hassle to make this Vec<&'a str>?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Easy and defers an allocation when only inserting a path 👍

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we just delete this then since it doesn't do what it's name suggests?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm multiple rules do use PathRoot though. Ideally Path is removed and simplified down to just PathRoot? I'd rather defer that cleanup since it was cause a lot of (trivial) changes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants