
(improvement) unit tests for benchmarking query planning. #653

Open

mykaul wants to merge 4 commits into scylladb:master from mykaul:query_plan_opt_perf

Conversation


@mykaul commented Jan 24, 2026

Not a very scientific benchmark, but reasonable for getting some measurements of how different optimizations perform. Example run (on the #650 branch):


  warnings.warn(PytestDeprecationWarning(_DEFAULT_FIXTURE_LOOP_SCOPE_UNSET))
============================================================================================================ test session starts =============================================================================================================
platform linux -- Python 3.14.2, pytest-8.3.5, pluggy-1.6.0
rootdir: /home/ykaul/github/python-driver
configfile: pyproject.toml
plugins: asyncio-1.1.0, anyio-4.12.1
asyncio: mode=Mode.STRICT, asyncio_default_fixture_loop_scope=None, asyncio_default_test_loop_scope=function
collected 4 items

tests/unit/test_policy_performance.py Pinned to CPU 0
....

=== Performance Benchmarks ===
Policy                         | Ops        | Time (s)   | Kops/s
----------------------------------------------------------------------
DCAware                        | 100000     | 0.2328     | 429
RackAware                      | 100000     | 0.3637     | 274
TokenAware(DCAware)            | 100000     | 1.5884     | 62
TokenAware(RackAware)          | 100000     | 1.6816     | 59
----------------------------------------------------------------------
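
For context, the numbers above come from a simple timing loop over query-plan generation. A minimal sketch of that kind of measurement, assuming the policy is already populated with hosts (helper name and formatting are illustrative, not the actual test code):

import os
import time


def benchmark_policy(policy, n_ops=100_000):
    """Time n_ops query-plan generations for an already-populated policy."""
    # The output above says "Pinned to CPU 0": pinning the process to one CPU
    # reduces jitter from scheduler migrations (Linux-only call).
    os.sched_setaffinity(0, {0})

    start = time.perf_counter()
    for _ in range(n_ops):
        # make_query_plan() returns a generator of hosts; consume it so the
        # whole plan is actually computed.
        list(policy.make_query_plan())
    elapsed = time.perf_counter() - start

    print(f"{type(policy).__name__:<30} | {n_ops:<10} | {elapsed:<10.4f} | {n_ops / elapsed / 1000:.0f}")
    return elapsed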

Pre-review checklist

  • I have split my patch into logically separate commits.
  • All commit messages clearly explain what they change and why.
  • I added relevant tests for new features and bug fixes.
  • All commits compile, pass static checks and pass tests.
  • PR description sums up the changes and reasons why they should be introduced.
  • I have provided docstrings for the public items that I want to introduce.
  • I have adjusted the documentation in ./docs/source/.
  • I added appropriate Fixes: annotations to PR description.

@mykaul (Author) commented Jan 25, 2026

The latest commit adds the HostFilter policy benchmark. Now it looks like this:

ykaul@ykaul:~/github/python-driver$ pytest -s tests/unit/test_policy_performance.py 
/usr/lib/python3.14/site-packages/pytest_asyncio/plugin.py:211: PytestDeprecationWarning: The configuration option "asyncio_default_fixture_loop_scope" is unset.
The event loop scope for asynchronous fixtures will default to the fixture caching scope. Future versions of pytest-asyncio will default the loop scope for asynchronous fixtures to function scope. Set the default fixture loop scope explicitly in order to avoid unexpected behavior in the future. Valid fixture loop scopes are: "function", "class", "module", "package", "session"

  warnings.warn(PytestDeprecationWarning(_DEFAULT_FIXTURE_LOOP_SCOPE_UNSET))
==================================================================================================================================================== test session starts =====================================================================================================================================================
platform linux -- Python 3.14.2, pytest-8.3.5, pluggy-1.6.0
rootdir: /home/ykaul/github/python-driver
configfile: pyproject.toml
plugins: asyncio-1.1.0, anyio-4.12.1
asyncio: mode=Mode.STRICT, asyncio_default_fixture_loop_scope=None, asyncio_default_test_loop_scope=function
collected 6 items                                                                                                                                                                                                                                                                                                            

tests/unit/test_policy_performance.py Pinned to CPU 0
......

=== Performance Benchmarks ===
Policy                         | Ops        | Time (s)   | Kops/s    
----------------------------------------------------------------------
DCAware                        | 100000     | 0.0989     | 1010      
Default(DCAware)               | 100000     | 0.1532     | 652       
HostFilter(DCAware)            | 100000     | 0.3303     | 302       
RackAware                      | 100000     | 0.1149     | 870       
TokenAware(DCAware)            | 100000     | 0.2112     | 473       
TokenAware(RackAware)          | 100000     | 0.2249     | 444       
----------------------------------------------------------------------

@dkropachev (Collaborator) commented

@mykaul, thanks, looks great. Can you implement them in pytest, mark them with @pytest.mark.benchmark, and register such a flag in pytest.ini?
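
For reference, a minimal sketch of what that marking could look like at module level (registration of the marker itself would live in the pytest configuration; names below are illustrative):

import pytest

# Module-level marker: `pytest -m benchmark` selects every test in this file,
# and `pytest -m "not benchmark"` deselects them in regular runs.
# The marker still has to be registered (e.g. a `markers` entry in pytest.ini
# or under [tool.pytest.ini_options]) so pytest does not warn about an unknown mark.
pytestmark = pytest.mark.benchmark


def test_dc_aware_policy_throughput():
    ...  # timing loop goes here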

1. Move to regular Pytest, within the performance subdir, as part of a benchmark module.
2. Renamed tests/integration/standard/column_encryption/test_policies.py -> test_encrypted_policies.py - we had two test_policies.py files, which conflicted when trying to run all unit tests.

Example run:
(scylla-driver) ykaul@ykaul:~/github/python-driver$ SCYLLA_VERSION=release:2025.4.2 PROTOCOL_VRESION=4 pytest -s -m benchmark
============================================================================================================ test session starts =============================================================================================================
platform linux -- Python 3.14.2, pytest-8.4.2, pluggy-1.6.0
rootdir: /home/ykaul/github/python-driver
configfile: pyproject.toml
collected 1798 items / 1792 deselected / 6 selected

tests/performance/test_policy_performance.py Pinned to CPU 0

DCAware                        | 100000     | 0.1176     | 850        Kops/s
.
RackAware                      | 100000     | 0.1774     | 563        Kops/s
.
TokenAware(DCAware)            | 100000     | 0.6666     | 150        Kops/s
.
TokenAware(RackAware)          | 100000     | 0.7195     | 138        Kops/s
.
Default(DCAware)               | 100000     | 0.1481     | 675        Kops/s
.
HostFilter(DCAware)            | 100000     | 0.2416     | 413        Kops/s
.

Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>
@mykaul force-pushed the query_plan_opt_perf branch from a797262 to d850f81 on February 6, 2026 09:03
@mykaul (Author) commented Feb 6, 2026

@mykaul, thanks, looks great. Can you implement them in pytest, mark them with @pytest.mark.benchmark, and register such a flag in pytest.ini?

There is no pytest.ini, so I've added one. The tests were already in pytest; a follow-up commit moves them into the right directory.

@Lorak-mmk commented

There is a benchmarks folder. Maybe it's a better place for such tests?

…cement

The test_idle_heartbeat test was failing with a KeyError when connections were dynamically replaced during the test run in shard-aware environments.
Root cause: The test captures connection object IDs at the start, sleeps for heartbeat intervals, then tries to validate those connections.
In shard-aware ScyllaDB deployments, the driver may replace connections during this window.
When a connection is replaced, the new connection object has a different Python ID, causing a KeyError on lookup.

Fix: Skip validation for connections that weren't present in the initial snapshot.
This preserves the test's intent (validating heartbeats on stable connections) while being robust to dynamic connection management in shard-aware mode.
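
A rough illustration of that skip, built around the same connection-holder traversal the test already uses (the check callback and argument names are illustrative, not the actual test code):

def validate_stable_connections(cluster, initial_connection_ids, check):
    """Run `check` only on connections that were present in the initial snapshot."""
    connections = [c for holders in cluster.get_connection_holders()
                   for c in holders.get_connections()]
    for conn in connections:
        if id(conn) not in initial_connection_ids:
            # The driver replaced this connection while the test slept
            # (e.g. a shard-aware reconnect); there is no recorded baseline
            # for it, so it cannot be validated and is skipped.
            continue
        check(conn)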

I'm not sure this is a great fix, and I don't think it has anything to do with this branch. I'll push it, and if it's OK, I'll also create a separate PR for it.
Why did it pop up now?

Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>
@dkropachev (Collaborator) left a comment

I love it, though I have some comments.
Also, can you please take a look at pytest-benchmark? It looks like exactly what we need.
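
If pytest-benchmark were adopted, a test would hand the timed callable to its benchmark fixture and let the plugin handle iteration counts and statistics. A rough sketch, where mock_cluster is an illustrative fixture standing in for the benchmark's mock cluster:

import pytest
from cassandra.policies import DCAwareRoundRobinPolicy


@pytest.mark.benchmark
def test_dc_aware_query_plan(benchmark, mock_cluster):
    policy = DCAwareRoundRobinPolicy(local_dc="dc1")
    policy.populate(mock_cluster, mock_cluster.metadata.all_hosts())

    def make_plan():
        # Consume the generator so the whole plan is built on every call.
        return list(policy.make_query_plan())

    # pytest-benchmark calls make_plan repeatedly and reports the statistics.
    benchmark(make_plan)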

Comment on lines 9 to 10
"A micro-bechmark for performance of policies"

Collaborator:

Is it a real commit? There is a real change here.

Comment on lines +126 to +130
replicas = []
for r in range(3):
    idx = (i + r) % len(hosts)
    replicas.append(hosts[idx])
replicas_map[key] = replicas
Collaborator:

I don't think it is a good idea to tie the number of token ranges to the number of queries; you may want to run 1k queries on a setup with 100 token ranges.
Also, this way you would split mock cluster initialization from query initialization, which would make the fixture API more readable and flexible at the same time. It would look like this:

def test_dc_aware(policy, vnode_cluster_1000, select_10000):
  policy.populate(vnode_cluster_1000, vnode_cluster_1000.metadata.all_hosts())
  for query in select_10000:
    ...
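
For reference, the fixture side of that split might look roughly like this (fixture names follow the sketch above; build_mock_cluster stands in for whatever mock-cluster helper the suite ends up with):

import pytest
from cassandra.query import SimpleStatement


@pytest.fixture(scope="module")
def vnode_cluster_1000():
    # Cluster/metadata setup lives in its own fixture; the 1000 token ranges
    # are fixed here regardless of how many queries a given test runs.
    return build_mock_cluster(num_token_ranges=1000)


@pytest.fixture
def select_10000():
    # Query generation is independent of the cluster setup, so a test can
    # pair, say, 100 token ranges with 10k statements.
    return [SimpleStatement(f"SELECT * FROM ks.t WHERE pk = {i}") for i in range(10_000)]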

@@ -0,0 +1,242 @@
import time
Collaborator:

Can you please merge these two commits? It looks like you just moved code to another place.

Collaborator:

👍

connections = [c for holders in cluster.get_connection_holders() for c in holders.get_connections()]

# make sure requests were sent on all connections
# Note: connections can be replaced in shard-aware environments, so we skip
Collaborator:

Could you please create a separate PR for this change?

Author:

I was going to; I first wanted to see that it solves the issue, but it doesn't:

FAILED tests/integration/standard/test_cluster.py::ClusterTests::test_idle_heartbeat - assert not True
 +  where True = any(<generator object ClusterTests.test_idle_heartbeat.<locals>.<genexpr> at 0x7f63c0292d40>)

I feel like I should git bisect it - if I can reproduce this locally first.

Author:

Of course, it passes locally. Over and over, consistently. Not a single failure.
