Skip to content

LCORE-1504: Change how the OKP server is located at runtime: RH_SERVER_OKP#1347

Merged
tisnik merged 1 commit intolightspeed-core:mainfrom
syedriko:syedriko-lcore-1504
Mar 19, 2026
Merged

LCORE-1504: Change how the OKP server is located at runtime: RH_SERVER_OKP#1347
tisnik merged 1 commit intolightspeed-core:mainfrom
syedriko:syedriko-lcore-1504

Conversation

@syedriko
Copy link
Contributor

@syedriko syedriko commented Mar 18, 2026

Description

According to https://docs.google.com/document/d/15wxuPf9r1oEsVdpsgLM-aialDaz7UzaR5cl85vk2WkE/edit?tab=t.0, make the following changes to how the OKP server is located at runtime:

  • added the rhokp_url configuration variable to the OkpConfiguration section with intention to be set from the RH_SERVER_OKP environment variable:
rhokp_url: ${env.RH_SERVER_OKP}

The default value for rhokp_url is "http://localhost:8081"

  • The SOLR_URL environment variable is not read anymore
  • The URLs for offline mode start with the value of rhokp_url

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement
  • Benchmarks improvement

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

  • Assisted-by: (e.g., Claude, CodeRabbit, Ollama, etc., N/A if not used)
  • Generated by: (e.g., tool name and version; N/A if not used)

Related Tickets & Documents

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

  • Chores

    • Unified document and search backend URL resolution to use a single environment-driven base URL with a clearer default; removed the old default constant and added a new one.
  • Tests

    • Updated and expanded unit tests to validate environment-driven base-URL behavior and the revised document URL construction.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 18, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Replaced MIMIR_DOC_URL with RH_SERVER_OKP_DEFAULT_URL, switched Solr env-var usage from SOLR_URL to RH_SERVER_OKP, added helpers to read and join the OKP base URL, and updated URL-building logic and tests to use the new base-url helpers.

Changes

Cohort / File(s) Summary
Constants
src/constants.py
Removed MIMIR_DOC_URL; added and exported RH_SERVER_OKP_DEFAULT_URL.
Configuration
src/llama_stack_configuration.py
Switched Solr environment variable reference from SOLR_URL to RH_SERVER_OKP in enrich_solr, emitting rh_server_okp_env for the Solr URL.
URL resolution / Vector search
src/utils/vector_search.py
Added _get_okp_base_url() (reads RH_SERVER_OKP with fallback to RH_SERVER_OKP_DEFAULT_URL) and _join_okp_doc_url(); refactored _build_document_url and _convert_solr_chunks_to_rag_format to use the dynamic base URL and removed references to the deleted constant.
Tests
tests/unit/utils/test_vector_search.py
Added context manager to set RH_SERVER_OKP in tests; added tests for _get_okp_base_url() and updated URL construction expectations to use the environment-driven base URL; imports adjusted accordingly.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately reflects the main change: replacing SOLR_URL with RH_SERVER_OKP to locate the OKP server at runtime, which is the primary objective across all modified files.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/llama_stack_configuration.py (1)

424-446: ⚠️ Potential issue | 🟠 Major

Align RH_SERVER_OKP contract across Solr and document URLs.

At Line 424, RH_SERVER_OKP is treated as a Solr URL (.../solr), but in src/utils/vector_search.py (Line 29-31 and Line 573+) the same variable is treated as a document base URL. One value cannot reliably satisfy both shapes.

Suggested fix (single contract: RH_SERVER_OKP = OKP server base)
-        rh_server_okp_env = "${env.RH_SERVER_OKP:=http://localhost:8081/solr}"
+        rh_server_okp_env = "${env.RH_SERVER_OKP:=http://localhost:8081}"
@@
-                    "solr_url": rh_server_okp_env,
+                    "solr_url": f"{rh_server_okp_env}/solr",
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/llama_stack_configuration.py` around lines 424 - 446, The RH_SERVER_OKP
value is used inconsistently: in this file rh_server_okp_env is defined with a
trailing "/solr" but elsewhere (src/utils/vector_search.py) RH_SERVER_OKP is
expected to be the OKP server base (document URLs); fix by standardizing on a
single contract (prefer: RH_SERVER_OKP = OKP server base) and update this config
to not append "/solr" directly—change rh_server_okp_env to use the base env var
only and build the Solr endpoint when needed (e.g., set "solr_url" to
f"{rh_server_okp_env.rstrip('/')}/solr" at the point where
ls_config["providers"]["vector_io"] is populated), or alternatively update code
in vector_search.py to append "/solr" when calling Solr; adjust only the
rh_server_okp_env definition and the place that sets "solr_url" (and mention
RH_SERVER_OKP usage in vector_search) so both consumers agree on a base-URL
contract.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/utils/vector_search.py`:
- Around line 573-585: The doc_url assembly can produce malformed paths due to
raw concatenation and urljoin quirks; change both offline and online branches to
a single deterministic join function that ensures base_url ends with a single
'/' and reference_doc/reference_url is stripped of any leading '/', then join
with base_url + reference_doc (or return reference_doc unchanged if it already
starts with "http"); update the code that sets base_url, reference_doc, and
doc_url (and the similar logic at the later block) to use this helper logic so
offline/relative-path flows and the variables base_url, reference_doc,
reference_url, and doc_url produce consistent, well-formed URLs.
- Around line 29-31: The helper _get_okp_base_url currently returns the raw
RH_SERVER_OKP env var which can be an empty string; change it to treat empty or
whitespace-only values as unset by reading the env var (e.g., val =
os.environ.get("RH_SERVER_OKP")) and returning val if val and val.strip() else
constants.RH_SERVER_OKP_DEFAULT_URL so downstream URL generation uses the
default when the env var is empty; update the function _get_okp_base_url
accordingly.

---

Outside diff comments:
In `@src/llama_stack_configuration.py`:
- Around line 424-446: The RH_SERVER_OKP value is used inconsistently: in this
file rh_server_okp_env is defined with a trailing "/solr" but elsewhere
(src/utils/vector_search.py) RH_SERVER_OKP is expected to be the OKP server base
(document URLs); fix by standardizing on a single contract (prefer:
RH_SERVER_OKP = OKP server base) and update this config to not append "/solr"
directly—change rh_server_okp_env to use the base env var only and build the
Solr endpoint when needed (e.g., set "solr_url" to
f"{rh_server_okp_env.rstrip('/')}/solr" at the point where
ls_config["providers"]["vector_io"] is populated), or alternatively update code
in vector_search.py to append "/solr" when calling Solr; adjust only the
rh_server_okp_env definition and the place that sets "solr_url" (and mention
RH_SERVER_OKP usage in vector_search) so both consumers agree on a base-URL
contract.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 2017d98b-dcc3-4905-8eb4-4f803bf7b6f0

📥 Commits

Reviewing files that changed from the base of the PR and between ded94d7 and c486c32.

📒 Files selected for processing (4)
  • src/constants.py
  • src/llama_stack_configuration.py
  • src/utils/vector_search.py
  • tests/unit/utils/test_vector_search.py

Copy link
Contributor

@tisnik tisnik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why we use env. vars in the first place? Is there any strong reason about it?

@syedriko
Copy link
Contributor Author

Why we use env. vars in the first place? Is there any strong reason about it?

It's an OKP proposal - https://docs.google.com/document/d/15wxuPf9r1oEsVdpsgLM-aialDaz7UzaR5cl85vk2WkE/edit?tab=t.0

@syedriko syedriko force-pushed the syedriko-lcore-1504 branch from c486c32 to 9d4518d Compare March 18, 2026 17:01
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
src/utils/vector_search.py (1)

574-585: ⚠️ Potential issue | 🟠 Major

Document URL assembly is still inconsistent and can produce malformed links.

At Line 578/Line 585, raw concatenation can miss separators; at Line 618/Line 631, a different join strategy is used. This keeps URL output format nondeterministic across code paths for the same inputs.

🔧 Consolidated fix pattern
+def _join_okp_url(base_url: str, reference: str) -> str:
+    """Join OKP base URL and document reference deterministically."""
+    return f"{base_url.rstrip('/')}/{reference.lstrip('/')}"
+
 def _build_document_url(
@@
-    base_url = _get_okp_base_url()
+    base_url = _get_okp_base_url()
@@
-        doc_url = base_url + reference_doc if reference_doc else ""
+        doc_url = _join_okp_url(base_url, reference_doc) if reference_doc else ""
@@
-            else (base_url + reference_doc if reference_doc else "")
+            else (_join_okp_url(base_url, reference_doc) if reference_doc else "")
@@
-                    attributes["doc_url"] = urljoin(_get_okp_base_url(), parent_id)
+                    attributes["doc_url"] = _join_okp_url(
+                        _get_okp_base_url(), parent_id
+                    )
@@
-                    attributes["doc_url"] = urljoin(_get_okp_base_url(), doc_id)
+                    attributes["doc_url"] = _join_okp_url(
+                        _get_okp_base_url(), doc_id
+                    )

Use this to verify the current mismatch in output shape:

#!/bin/bash
python - <<'PY'
from urllib.parse import urljoin

cases = [
    ("https://mimir.corp.redhat.com", "doc_123"),
    ("https://mimir.corp.redhat.com/", "doc_123"),
    ("http://localhost:8081/solr", "doc_123"),
]

for base, ref in cases:
    print(f"\nbase={base!r}, ref={ref!r}")
    print("concat :", base + ref)
    print("urljoin:", urljoin(base, ref))
PY

Also applies to: 618-631

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/vector_search.py` around lines 574 - 585, The URL assembly is
inconsistent due to raw string concatenation; replace both concatenation
branches so doc_url is built with urllib.parse.urljoin using base_url and the
chosen reference (use reference_url or doc_id as before) to guarantee correct
separators; update the blocks that set reference_doc and doc_url (variables:
base_url, reference_doc, reference_url, doc_id, doc_url) and the helper call to
_get_okp_base_url(), and apply the same urljoin logic to the other affected
section around the 618–631 region to make output deterministic.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/llama_stack_configuration.py`:
- Around line 424-425: The RH_SERVER_OKP default currently includes a Solr path
which mixes server-base and Solr-specific URL shapes; change rh_server_okp_env
to default to the server base (e.g., "http://localhost:8081") instead of
".../solr" and remove any hardcoded "/solr" from collection_env initialization,
then update the Solr-specific call sites to append "/solr" at the Solr boundary
(for example in src/utils/vector_search.py where RH_SERVER_OKP is used to build
document/Solr endpoints around lines 29-32); also apply the same change to the
other occurrence referenced near line 445 so all Solr paths are constructed only
where Solr is actually invoked.

---

Duplicate comments:
In `@src/utils/vector_search.py`:
- Around line 574-585: The URL assembly is inconsistent due to raw string
concatenation; replace both concatenation branches so doc_url is built with
urllib.parse.urljoin using base_url and the chosen reference (use reference_url
or doc_id as before) to guarantee correct separators; update the blocks that set
reference_doc and doc_url (variables: base_url, reference_doc, reference_url,
doc_id, doc_url) and the helper call to _get_okp_base_url(), and apply the same
urljoin logic to the other affected section around the 618–631 region to make
output deterministic.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: e65d44af-f4cd-4f6b-b7e7-99dfb761ed5c

📥 Commits

Reviewing files that changed from the base of the PR and between c486c32 and 9d4518d.

📒 Files selected for processing (4)
  • src/constants.py
  • src/llama_stack_configuration.py
  • src/utils/vector_search.py
  • tests/unit/utils/test_vector_search.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/unit/utils/test_vector_search.py

@tisnik
Copy link
Contributor

tisnik commented Mar 18, 2026

Why we use env. vars in the first place? Is there any strong reason about it?

It's an OKP proposal - https://docs.google.com/document/d/15wxuPf9r1oEsVdpsgLM-aialDaz7UzaR5cl85vk2WkE/edit?tab=t.0

But WHO creates the env.var? Is not it that OKP is just another service we have to connect to?

@syedriko syedriko force-pushed the syedriko-lcore-1504 branch from 9d4518d to f0316d8 Compare March 18, 2026 17:09
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
tests/unit/utils/test_vector_search.py (1)

239-244: Add dedicated tests for _get_okp_base_url fallback behavior.

These changes correctly isolate _build_document_url, but they also bypass verification of env/default resolution. Please add focused tests for _get_okp_base_url with RH_SERVER_OKP set/unset (and default-port expectation) so the runtime contract is directly covered.

Suggested test additions
 from utils.vector_search import (
+    _get_okp_base_url,
     _build_document_url,
     _build_query_params,
@@
 class TestBuildDocumentUrl:
@@
         assert doc_url == "https://mimir.test/relative/path"
         assert reference_doc == "relative/path"

+class TestGetOkpBaseUrl:
+    """Tests for _get_okp_base_url helper."""
+
+    def test_uses_env_value_when_set(self, monkeypatch) -> None:
+        monkeypatch.setenv("RH_SERVER_OKP", "https://okp.example:8081")
+        assert _get_okp_base_url().startswith("https://okp.example:8081")
+
+    def test_uses_default_when_env_unset(self, monkeypatch) -> None:
+        monkeypatch.delenv("RH_SERVER_OKP", raising=False)
+        value = _get_okp_base_url()
+        assert value.startswith("https://")
+        # tighten this assertion if the contract requires explicit :8081
+```
</details>


Also applies to: 263-268

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed.

In @tests/unit/utils/test_vector_search.py around lines 239 - 244, Add dedicated
unit tests for _get_okp_base_url to verify its fallback behavior when
RH_SERVER_OKP is set and when it is unset; specifically, write one test that
sets RH_SERVER_OKP to a custom host and asserts the returned base URL uses that
host, and another test that clears RH_SERVER_OKP and asserts the returned base
URL uses the default host and includes the expected default port (e.g., :443 or
configured default). Update tests/unit/utils/test_vector_search.py to remove the
mocker.patch of _get_okp_base_url in the existing _build_document_url tests and
instead add the new focused tests around _get_okp_base_url (also add equivalents
for the region around lines 263-268) so env/default resolution is directly
covered.


</details>

</blockquote></details>

</blockquote></details>

<details>
<summary>🤖 Prompt for all review comments with AI agents</summary>

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In @tests/unit/utils/test_vector_search.py:

  • Around line 239-244: Add dedicated unit tests for _get_okp_base_url to verify
    its fallback behavior when RH_SERVER_OKP is set and when it is unset;
    specifically, write one test that sets RH_SERVER_OKP to a custom host and
    asserts the returned base URL uses that host, and another test that clears
    RH_SERVER_OKP and asserts the returned base URL uses the default host and
    includes the expected default port (e.g., :443 or configured default). Update
    tests/unit/utils/test_vector_search.py to remove the mocker.patch of
    _get_okp_base_url in the existing _build_document_url tests and instead add the
    new focused tests around _get_okp_base_url (also add equivalents for the region
    around lines 263-268) so env/default resolution is directly covered.

</details>

---

<details>
<summary>ℹ️ Review info</summary>

<details>
<summary>⚙️ Run configuration</summary>

**Configuration used**: Path: .coderabbit.yaml

**Review profile**: CHILL

**Plan**: Pro

**Run ID**: `6b9bd9c3-0099-48ca-8142-6fa08922532b`

</details>

<details>
<summary>📥 Commits</summary>

Reviewing files that changed from the base of the PR and between 9d4518d979cabaebeeabad9936e067c473a2cef9 and f0316d8e845215bb7a5dc9b1f358818ce3989d5f.

</details>

<details>
<summary>📒 Files selected for processing (4)</summary>

* `src/constants.py`
* `src/llama_stack_configuration.py`
* `src/utils/vector_search.py`
* `tests/unit/utils/test_vector_search.py`

</details>

<details>
<summary>🚧 Files skipped from review as they are similar to previous changes (2)</summary>

* src/llama_stack_configuration.py
* src/utils/vector_search.py

</details>

</details>

<!-- This is an auto-generated comment by CodeRabbit for review status -->

@syedriko
Copy link
Contributor Author

Why we use env. vars in the first place? Is there any strong reason about it?

It's an OKP proposal - https://docs.google.com/document/d/15wxuPf9r1oEsVdpsgLM-aialDaz7UzaR5cl85vk2WkE/edit?tab=t.0

But WHO creates the env.var? Is not it that OKP is just another service we have to connect to?

Whoever runs the LSC?
@Jared-Sprague should have more insight.

@syedriko syedriko force-pushed the syedriko-lcore-1504 branch from f0316d8 to f461c9a Compare March 18, 2026 17:27
src/constants.py Outdated
# SOLR OKP RAG
MIMIR_DOC_URL = "https://mimir.corp.redhat.com"
# SOLR OKP RAG - default document base URL when RH_SERVER_OKP env is unset
RH_SERVER_OKP_DEFAULT_URL = "https://mimir.corp.redhat.com"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@syedriko I think we should set this default to "http://localhost:8081"

Since mimir.corp.redhat.com is internal and behind VPN it should not be used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@syedriko syedriko force-pushed the syedriko-lcore-1504 branch from f461c9a to e5a8d6d Compare March 18, 2026 18:06
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
src/utils/vector_search.py (1)

630-632: Consider caching _get_okp_base_url() at the start of the function.

_get_okp_base_url() is called multiple times within _convert_solr_chunks_to_rag_format (at lines 631 and 646). While os.environ.get is fast, caching the base URL at the start of the function would be slightly cleaner and more efficient for large chunk lists.

♻️ Proposed refactor
 def _convert_solr_chunks_to_rag_format(
     retrieved_chunks: list[Any],
     retrieved_scores: list[float],
     offline: bool,
 ) -> list[RAGChunk]:
     ...
     rag_chunks = []
+    base_url = _get_okp_base_url()

     for i, chunk in enumerate(retrieved_chunks):
         ...
         if chunk.metadata:
             if offline:
                 parent_id = chunk.metadata.get("parent_id")
                 if parent_id:
-                    attributes["doc_url"] = _join_okp_doc_url(
-                        _get_okp_base_url(), parent_id
-                    )
+                    attributes["doc_url"] = _join_okp_doc_url(base_url, parent_id)
         ...
                 if "doc_url" not in attributes and offline and doc_id:
-                    attributes["doc_url"] = _join_okp_doc_url(
-                        _get_okp_base_url(), doc_id
-                    )
+                    attributes["doc_url"] = _join_okp_doc_url(base_url, doc_id)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/vector_search.py` around lines 630 - 632, The function
_convert_solr_chunks_to_rag_format calls _get_okp_base_url() multiple times;
cache its result once at the start of that function (e.g., base_url =
_get_okp_base_url()) and then use base_url where _get_okp_base_url() is
currently passed (such as into _join_okp_doc_url for attributes["doc_url"] and
other uses) to avoid repeated environment lookups and make the code cleaner.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/unit/utils/test_vector_search.py`:
- Around line 265-269: The test test_returns_default_when_rh_server_okp_unset
incorrectly expects HTTPS; update the assertion to match the default constant
RH_SERVER_OKP_DEFAULT_URL (which is "http://localhost:8081") by asserting
_get_okp_base_url().startswith("http://") or simply comparing equality to
constants.RH_SERVER_OKP_DEFAULT_URL; modify the assertion in that test to use
"http://" or the constant accordingly.

---

Nitpick comments:
In `@src/utils/vector_search.py`:
- Around line 630-632: The function _convert_solr_chunks_to_rag_format calls
_get_okp_base_url() multiple times; cache its result once at the start of that
function (e.g., base_url = _get_okp_base_url()) and then use base_url where
_get_okp_base_url() is currently passed (such as into _join_okp_doc_url for
attributes["doc_url"] and other uses) to avoid repeated environment lookups and
make the code cleaner.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: f39f6bca-2a82-4e92-8c88-52bab255d60f

📥 Commits

Reviewing files that changed from the base of the PR and between f0316d8 and e5a8d6d.

📒 Files selected for processing (4)
  • src/constants.py
  • src/llama_stack_configuration.py
  • src/utils/vector_search.py
  • tests/unit/utils/test_vector_search.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/llama_stack_configuration.py

@syedriko syedriko force-pushed the syedriko-lcore-1504 branch from e5a8d6d to 3adfa9a Compare March 18, 2026 18:21
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
src/utils/vector_search.py (1)

619-647: Consider hoisting _get_okp_base_url() outside the loop.

The base URL is fetched via _get_okp_base_url() inside the loop (lines 631, 646), resulting in repeated os.environ.get() calls for each chunk. Since the value doesn't change during iteration, it could be retrieved once before the loop.

♻️ Suggested optimization
 def _convert_solr_chunks_to_rag_format(
     retrieved_chunks: list[Any],
     retrieved_scores: list[float],
     offline: bool,
 ) -> list[RAGChunk]:
     ...
     rag_chunks = []
+    base_url = _get_okp_base_url()
 
     for i, chunk in enumerate(retrieved_chunks):
         ...
         if chunk.metadata:
             if offline:
                 parent_id = chunk.metadata.get("parent_id")
                 if parent_id:
-                    attributes["doc_url"] = _join_okp_doc_url(
-                        _get_okp_base_url(), parent_id
-                    )
+                    attributes["doc_url"] = _join_okp_doc_url(base_url, parent_id)
         ...
                 if "doc_url" not in attributes and offline and doc_id:
-                    attributes["doc_url"] = _join_okp_doc_url(
-                        _get_okp_base_url(), doc_id
-                    )
+                    attributes["doc_url"] = _join_okp_doc_url(base_url, doc_id)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/vector_search.py` around lines 619 - 647, The loop repeatedly calls
_get_okp_base_url() for each chunk which is unnecessary; before iterating over
retrieved_chunks, call _get_okp_base_url() once and store it in a local variable
(e.g., okp_base_url), then pass that variable into _join_okp_doc_url(...) in the
branches that build attributes (the offline branch where parent_id is used and
the Solr chunk branch where document_id is used), leaving the rest of the logic
in the for-loop unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/utils/vector_search.py`:
- Around line 619-647: The loop repeatedly calls _get_okp_base_url() for each
chunk which is unnecessary; before iterating over retrieved_chunks, call
_get_okp_base_url() once and store it in a local variable (e.g., okp_base_url),
then pass that variable into _join_okp_doc_url(...) in the branches that build
attributes (the offline branch where parent_id is used and the Solr chunk branch
where document_id is used), leaving the rest of the logic in the for-loop
unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: a4151224-0765-4c02-9573-a2c1f869bf0c

📥 Commits

Reviewing files that changed from the base of the PR and between e5a8d6d and 3adfa9a.

📒 Files selected for processing (4)
  • src/constants.py
  • src/llama_stack_configuration.py
  • src/utils/vector_search.py
  • tests/unit/utils/test_vector_search.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/llama_stack_configuration.py
  • tests/unit/utils/test_vector_search.py

@syedriko syedriko force-pushed the syedriko-lcore-1504 branch from 3adfa9a to 6311518 Compare March 19, 2026 16:55
@tisnik
Copy link
Contributor

tisnik commented Mar 19, 2026

Please rebase, the changes in PR are from previous commits (OpenAPI.json for example)

@syedriko syedriko force-pushed the syedriko-lcore-1504 branch from 6311518 to 2ef828f Compare March 19, 2026 17:09
Copy link
Contributor

@tisnik tisnik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the main part: LGTM config handling is fine, thank you

some nits about URLs (url is not any string)

  • info about rendering (might be remove from this PR, can be addressed later).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is not rendered correctly, as you can see. Being you I'd skip the "Java part" from this PR for now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, leaving PlantUML in place. The current plantuml-1.2026.2.jar still needs a couple of tweaks in the Makefile to work though.

Only relevant when ``"okp"`` is listed in ``rag.inline`` or ``rag.tool``.
"""

rhokp_url: Optional[str] = Field(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AnyHttpUrl perhaps instead of str perhaps?


def _get_okp_base_url() -> str:
"""Return OKP document base URL from configuration (rhokp_url), or default if unset."""
value = (configuration.okp.rhokp_url or "").strip()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the URL will be real url, strip() is not needed

return reference
base = base_url.rstrip("/") + "/"
ref = reference.lstrip("/")
return base + ref
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

URL classes provides "join" operations, which means that handling it manually won't be needed. But it's up to you in this case.

Copy link
Contributor Author

@syedriko syedriko Mar 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean from urllib.parse import urljoin?
It has some odd behavior (https://stackoverflow.com/questions/10893374/python-confusions-with-urljoin) so the end result is not cleaner than what we started with:

    base = str(base_url)
    if not base.endswith("/"):
        base = f"{base}/"
    ref = reference.lstrip("/")
    return urljoin(base, ref)

else constants.SOLR_CHUNK_FILTER_QUERY
)

solr_url = (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dtto about AnyHttpUrl - no stripping etc.

uv run pyreverse src/models/config.py --output puml --output-directory=docs/
mv docs/classes.puml docs/config.puml

docs/config.png: docs/config.puml ## Generate an image with configuration graph
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dunno why, but you can see that it does not render correctly (and AFAIK is outside scope of this pr)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a default 4096 pixel width limit for PNGs in PlantUML apparently, bumped it in the Makefile.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And "--theme rose" blows up like so:

$ make docs/config.svg 
pyreverse src/models/config.py --output puml --output-directory=docs/
mv docs/classes.puml docs/config.puml
pushd docs && \
java -jar ~/bin/plantuml.jar --theme rose config.puml -tsvg && \
xmllint --format classes.svg > config.svg && \
rm -f classes.svg && \
popd
~/lightspeed-core/lightspeed-stack/docs ~/lightspeed-core/lightspeed-stack
Error line 1 in file: config.puml
Some diagram description contains errors
make: *** [Makefile:80: docs/config.svg] Error 200
$

added the rhokp_url configuration variable to the OkpConfiguration
section.
@syedriko syedriko force-pushed the syedriko-lcore-1504 branch from 2ef828f to c2f6afe Compare March 19, 2026 19:22
Copy link
Contributor

@tisnik tisnik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@tisnik tisnik merged commit 55c4543 into lightspeed-core:main Mar 19, 2026
21 of 22 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants