[#9392] Update Datum order and equality#34840
Conversation
|
All contributors have signed the CLA. |
|
I have read the Contributor License Agreement (CLA) and I hereby sign the CLA. |
|
recheck |
antiguru
left a comment
There was a problem hiding this comment.
Looks fine. Can you check what postgres reports for the jsonb_agg where we observe slt changes?
| @@ -769,6 +770,22 @@ impl<'a> Debug for DatumList<'a> { | |||
| } | |||
| } | |||
|
|
|||
| impl<'a> PartialEq for DatumList<'a> { | |||
| fn eq(&self, other: &DatumList<'a>) -> bool { | |||
There was a problem hiding this comment.
Please add #[inline(always)] to all of the functions.
|
|
Nvm: I realize now the CI had a jsonb_agg test failure, I will look into how to run and debug this test! Thanks |
3371d7a to
fca6ee7
Compare
|
Ok, I think I figured out what went wrong in the jsonb test: The output order got flipped for these json blobs: With my change the null got sorted after the 1. The reason is that the Datum sort order comes from the enum type in scalar.rs, and NULL is the last order. Whereas before the ordering was bytewise, so null came first. My (AI assisted) understanding is that jsonb_agg sorts by payload which is why the order has flipped. I wrote a unit test with null and this shows the behaviour. |
bkirwi
left a comment
There was a problem hiding this comment.
Test comments seem to have drifted a bit, but otherwise seems great. Thanks for picking this up!
Datum and Datum List had different equality and order operators.
Equality was doing the default bytewise operation, while order was
calling the object comparission.
This leads to issues where to equivalent values (e.g., -0.0 and +0.0)
are order equivalent but not equal.
Fix this issue by implementing explicit equal and hash operators that
operate over each entry in the datum.
Introduce new tests to demonstrate that order and equality now do match
and hashing works as expected.
Also test manually with:
materialize=> select distinct * from (values (map['a' => '-0.0'::float]), (map['a' => '0.0'::float]));
column1
---------
{a=>0}
fca6ee7 to
3ba4c5a
Compare
Motivation
Datum and Datum List had different equality (https://github.com/MaterializeInc/database-issues/issues/9392) and order operators. Equality was doing the default bytewise operation, while order was calling the object comparison.
This leads to issues where to equivalent values (e.g., -0.0 and +0.0) are order equivalent but not equal.
Solution
Fix this issue by implementing explicit equal and hash operators that operate over each entry in the datum.
Testing
Introduce new unit tests to demonstrate that order and equality now do match and hashing works as expected.
Also test manually with:
Checklist
$T ⇔ Proto$Tmapping (possibly in a backwards-incompatible way), then it is tagged with aT-protolabel.