Skip to content

CacheManager doesn't protect against hash collisions #307

@Krzmbrzl

Description

@Krzmbrzl

This can best be seen via

auto imed_counts = container::map<size_t, size_t>{};
// visits a node and check if its hash value exists in imed_counts map
// if it does increase the count and return false (to signal stop visiting
// children nodes) otherwise returns true.
auto imed_visitor = [&imed_counts](auto&& n) -> bool {
auto&& end = imed_counts.end();
auto&& h = hash::value(*n);
if (auto&& found = imed_counts.find(h); found != end) {
++found->second;
return false;
} else
imed_counts.emplace(h, 1);
return true;
}; // imed_visitor

where it becomes clear that the CacheManager is using hash values to identify given results.

In case of a hash collision, two distinct EvalNodes will have the same hash value in which case the CacheManager will consider them to be equal. This is clearly invalid behavior and will lead to incorrect results if used in an actual computation. Most severely, this will lead to the wrong result without any kind of error or warning message and only under very specific circumstances. That is, users will evaluate the framework on a couple of example inputs and afterwards will trust that the system works. But it may just not at which point the user will be very confused due to an incorrect result (best case) or just work with the wrong result because they didn't realize something's off (worst case).

Typically, one guards against hash collisions by providing an operator== for the key object that is able to uniquely determine whether two objects really are the same or not. This process is called collision chaining. See also https://stackoverflow.com/q/21518704.

/CC @bimalgaudel

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions