Skip to content

[#9392] Update Datum order and equality#34840

Merged
mtabebe merged 1 commit intoMaterializeInc:mainfrom
mtabebe:ma/9392/datum-eq-ord
Jan 29, 2026
Merged

[#9392] Update Datum order and equality#34840
mtabebe merged 1 commit intoMaterializeInc:mainfrom
mtabebe:ma/9392/datum-eq-ord

Conversation

@mtabebe
Copy link
Contributor

@mtabebe mtabebe commented Jan 27, 2026

Motivation

Datum and Datum List had different equality (https://github.com/MaterializeInc/database-issues/issues/9392) and order operators. Equality was doing the default bytewise operation, while order was calling the object comparison.

This leads to issues where to equivalent values (e.g., -0.0 and +0.0) are order equivalent but not equal.

Solution

Fix this issue by implementing explicit equal and hash operators that operate over each entry in the datum.

Testing

Introduce new unit tests to demonstrate that order and equality now do match and hashing works as expected.

Also test manually with:

materialize=> select distinct * from (values (map['a' => '-0.0'::float]), (map['a' => '0.0'::float]));
 column1
---------
 {a=>0}

Checklist

  • This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
  • If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.

@mtabebe mtabebe requested a review from SangJunBak January 27, 2026 17:42
@mtabebe mtabebe requested a review from a team as a code owner January 27, 2026 17:42
@github-actions
Copy link

github-actions bot commented Jan 27, 2026

All contributors have signed the CLA.
Posted by the CLA Assistant Lite bot.

@mtabebe
Copy link
Contributor Author

mtabebe commented Jan 27, 2026

I have read the Contributor License Agreement (CLA) and I hereby sign the CLA.

@mtabebe
Copy link
Contributor Author

mtabebe commented Jan 27, 2026

recheck

materialize-bot added a commit to MaterializeInc/cla that referenced this pull request Jan 27, 2026
@bkirwi bkirwi self-requested a review January 27, 2026 18:17
Copy link
Member

@antiguru antiguru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine. Can you check what postgres reports for the jsonb_agg where we observe slt changes?

@@ -769,6 +770,22 @@ impl<'a> Debug for DatumList<'a> {
}
}

impl<'a> PartialEq for DatumList<'a> {
fn eq(&self, other: &DatumList<'a>) -> bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add #[inline(always)] to all of the functions.

@mtabebe
Copy link
Contributor Author

mtabebe commented Jan 27, 2026

Looks fine. Can you check what postgres reports for the jsonb_agg where we observe slt changes?

bin/sqllogictest -- test/sqllogictest/postgres/jsonb.slt was clean, but not sure where to look for the jsonb_agg

@mtabebe
Copy link
Contributor Author

mtabebe commented Jan 27, 2026

Nvm: I realize now the CI had a jsonb_agg test failure, I will look into how to run and debug this test!

Thanks

@mtabebe mtabebe force-pushed the ma/9392/datum-eq-ord branch from 3371d7a to fca6ee7 Compare January 28, 2026 17:17
@mtabebe
Copy link
Contributor Author

mtabebe commented Jan 28, 2026

Ok, I think I figured out what went wrong in the jsonb test:

The output order got flipped for these json blobs:

{"f1": 1, "f2": "2020-01-01"}
{"f1": null, "f2": "2020-01-02"}

With my change the null got sorted after the 1. The reason is that the Datum sort order comes from the enum type in scalar.rs, and NULL is the last order. Whereas before the ordering was bytewise, so null came first.

    // Keep `Null` last so that calling `<` on Datums sorts nulls last, to
    // match the default in PostgreSQL. Note that this doesn't have an effect
    // on ORDER BY, because that is handled by compare_columns. The only
    // situation it has an effect is array comparisons, e.g.,
    // `SELECT ARRAY[1] < ARRAY[NULL]::int[]`. In such a situation, we end up
    // calling `<` on Datums (see `fn lt` in scalar/func.rs).
    /// An unknown value.
    Null,
    // WARNING! DON'T PLACE NEW DATUM VARIANTS HERE!
    //
    // This order of variants of this enum determines how nulls sort. We
    // have decided that nulls should sort last in Materialize, so all
    // other datum variants should appear before `Null`.

My (AI assisted) understanding is that jsonb_agg sorts by payload which is why the order has flipped.

I wrote a unit test with null and this shows the behaviour.

Copy link
Contributor

@bkirwi bkirwi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test comments seem to have drifted a bit, but otherwise seems great. Thanks for picking this up!

Datum and Datum List had different equality and order operators.
Equality was doing the default bytewise operation, while order was
calling the object comparission.

This leads to issues where to equivalent values (e.g., -0.0 and +0.0)
are order equivalent but not equal.

Fix this issue by implementing explicit equal and hash operators that
operate over each entry in the datum.

Introduce new tests to demonstrate that order and equality now do match
and hashing works as expected.

Also test manually with:
materialize=> select distinct * from (values (map['a' => '-0.0'::float]), (map['a' => '0.0'::float]));
 column1
---------
 {a=>0}
@mtabebe mtabebe force-pushed the ma/9392/datum-eq-ord branch from fca6ee7 to 3ba4c5a Compare January 28, 2026 18:39
Copy link
Contributor

@SangJunBak SangJunBak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@mtabebe mtabebe merged commit 061c3c5 into MaterializeInc:main Jan 29, 2026
131 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants