LCORE-1500: [bug] Fixing cases where both inline and tools RAG are enabled by JslYoon · Pull Request #1351 · lightspeed-core/lightspeed-stack

JslYoon · 2026-03-18T20:14:47Z

Description

There was a bug where when querying LLM, if both inline RAG and tools RAG are enabled, LLMS run an infinite loop.
This was caused when user provides a vector_store_id in their query, which bypasses the preset inline RAG tools in the _fetch_byok_rag() function.

The Jira ticket is [LCORE-1500]

Hen

Type of change

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

Assisted-by: (e.g., Claude, CodeRabbit, Ollama, etc., N/A if not used)
Generated by: (e.g., tool name and version; N/A if not used)

Related Tickets & Documents

Related Issue #
Closes #

Checklist before requesting a review

I have performed a self-review of my code.
PR has passed all pre-merge test jobs.
If it is a core feature, I have added thorough tests.

Testing

Please provide detailed steps to perform tests related to this code change.
How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

New Features
- Added a per-request "no_tools" flag to disable tool-based sources in RAG contexts and a corresponding request field.
Behavior
- When set, tool-based sources are excluded from inline RAG context; per-request vector store IDs still take precedence over configured inline sources.
Tests
- Unit and integration tests updated to cover the new flag and related selection behavior.

Signed-off-by: Lucas <lyoon@redhat.com>

coderabbitai · 2026-03-18T20:15:04Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: f2005d9a-6094-450f-88b0-9b6a38b54f6c

📥 Commits

Reviewing files that changed from the base of the PR and between 3972b4e and fba709a.

📒 Files selected for processing (1)

src/app/endpoints/streaming_query.py

🚧 Files skipped from review as they are similar to previous changes (1)

src/app/endpoints/streaming_query.py

Walkthrough

Adds a no_tools flag to request models and threads it into endpoint handlers and RAG construction via a new RAGContextParams object; build_rag_context signature changed to accept this params object and BYOK selection logic now respects no_tools.

Changes

Cohort / File(s)	Summary
Endpoint handlers `src/app/endpoints/query.py`, `src/app/endpoints/responses.py`, `src/app/endpoints/streaming_query.py`	Invoke `build_rag_context` with a `RAGContextParams` object instead of discrete args. `query.py` passes `no_tools=True` unconditionally; other handlers pass `no_tools` from the request (default False).
Request models & tests `src/models/requests.py`, `tests/integration/endpoints/test_query_byok_integration.py`, `tests/integration/endpoints/test_streaming_query_byok_integration.py`	Added `no_tools: bool` to request models; integration tests updated to include `no_tools=True` in relevant QueryRequest setups.
RAG types `src/utils/types.py`	Added `RAGContextParams` Pydantic model: `moderation_decision`, `query`, `vector_store_ids`, `solr`, `no_tools` (default False).
RAG context & BYOK logic `src/utils/vector_search.py`, `tests/unit/utils/test_vector_search.py`	Refactored `build_rag_context` to accept `RAGContextParams`; added `_fetch_byok_rag(...)`; threaded `no_tools` into BYOK selection so tool-based sources can be excluded when true; unit tests updated to pass `RAGContextParams`.
Response tests `tests/unit/app/endpoints/test_responses.py`	Adjusted assertions to inspect the `query` field on the `RAGContextParams` passed to `build_rag_context` rather than previous positional-argument checks.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Endpoint
    participant RAGBuilder
    participant VectorStore
    Client->>Endpoint: send request (may include no_tools)
    Endpoint->>RAGBuilder: build_rag_context(RAGContextParams{query, moderation_decision, vector_store_ids, solr, no_tools})
    alt params.no_tools == true
        RAGBuilder->>VectorStore: fetch BYOK using provided vector_store_ids (skip tool-based sources)
    else params.no_tools == false
        RAGBuilder->>VectorStore: fetch BYOK using config/inline sources and vector_store_ids
    end
    VectorStore-->>RAGBuilder: return RAG chunks
    RAGBuilder-->>Endpoint: assembled RAG context
    Endpoint-->>Client: respond (with RAG context applied)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly describes the main bug fix: preventing cases where both inline and tools RAG are enabled simultaneously. This matches the core objective evident from the changeset.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

📝 Coding Plan

Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)

src/app/endpoints/responses.py (1)

218-238: ⚠️ Potential issue | 🟠 Major

no_tools is not actually disabling tool-based RAG in /responses.

At Line 222, tools are resolved unconditionally, and at Line 237 no_tools only affects inline RAG source selection. As written, requests with no_tools=True can still send tool definitions to client.responses.create(...).

🛠️ Suggested fix

     (
         responses_request.tools,
         responses_request.tool_choice,
         vector_store_ids,
     ) = await resolve_tool_choice(
         responses_request.tools,
         responses_request.tool_choice,
         auth[1],
         mcp_headers,
         request.headers,
     )
+
+    if responses_request.no_tools:
+        # keep resolved vector_store_ids for inline RAG, but prevent tool execution
+        responses_request.tools = None
+        responses_request.tool_choice = "none"

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/app/endpoints/responses.py` around lines 218 - 238, The code currently
resolves tools unconditionally and only uses responses_request.no_tools when
building inline RAG, so requests with no_tools=True can still send tools to
client.responses.create; modify the control flow to check
responses_request.no_tools before calling resolve_tool_choice (or, if you must
call it, ensure it returns empty tools and None tool_choice when
responses_request.no_tools is True) and ensure the final payload sent to
client.responses.create does not include tool definitions or tool_choice when
responses_request.no_tools is True; update references in this block
(responses_request.no_tools, resolve_tool_choice, build_rag_context, and the
eventual client.responses.create call) so no tool-based RAG is used if no_tools
is set.

src/utils/vector_search.py (2)

320-420: ⚠️ Potential issue | 🟠 Major

CI blocker: _fetch_byok_rag exceeds pylint local-variable limit.

Line 320 is currently failing with R0914 (16/15), so this PR won’t merge cleanly until locals are reduced (or explicitly waived).

✅ Minimal refactor to reduce locals

-        score_multiplier_mapping = configuration.score_multiplier_mapping
-        rag_id_mapping = configuration.rag_id_mapping
-
         # Query all vector stores in parallel
         results_per_store = await asyncio.gather(
             *[
                 _query_store_for_byok_rag(
                     client,
                     vector_store_id,
                     query,
-                    score_multiplier_mapping.get(vector_store_id, 1.0),
+                    configuration.score_multiplier_mapping.get(vector_store_id, 1.0),
                 )
                 for vector_store_id in vector_store_ids_to_query
             ]
         )
@@
-            result["source"] = rag_id_mapping.get(result["source"], result["source"])
+            result["source"] = configuration.rag_id_mapping.get(
+                result["source"], result["source"]
+            )

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/utils/vector_search.py` around lines 320 - 420, _fetcH_byok_rag currently
has too many local variables (R0914); refactor to reduce locals by extracting
small helpers: move the vector-store selection logic into a helper that returns
vector_store_ids_to_query (use resolve_vector_store_ids and filter out
constants.SOLR_DEFAULT_VECTOR_STORE_ID), and move the
result-flattening/sorting/top-N + RAGChunk conversion into another helper that
takes results_per_store, configuration.score_multiplier_mapping,
configuration.rag_id_mapping and returns (rag_chunks, referenced_documents)
using _query_store_for_byok_rag and _process_byok_rag_chunks_for_documents; call
these helpers from _fetch_byok_rag so the function keeps few locals and stays
under the pylint limit.

499-506: ⚠️ Potential issue | 🟠 Major

CI blocker: build_rag_context now violates pylint argument limits.

Line 499 fails R0913/R0917 after adding no_tools as a sixth parameter. This needs a parameter-shape change (or a targeted pylint waiver) to unblock CI.

🔧 Pragmatic unblock (targeted waiver)

-async def build_rag_context(
+async def build_rag_context(  # pylint: disable=too-many-arguments,too-many-positional-arguments
     client: AsyncLlamaStackClient,
     moderation_decision: str,
     query: str,
     vector_store_ids: Optional[list[str]],
     solr: Optional[dict[str, Any]] = None,
     no_tools: bool = False,
 ) -> RAGContext:

Also applies to: 525-525

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/utils/vector_search.py` around lines 499 - 506, build_rag_context
currently has too many positional parameters (causing R0913/R0917); fold the
less-essential/optional parameters into a single options object to reduce
parameter count. Create a small dataclass/TypedDict (e.g., BuildRAGOptions) that
contains vector_store_ids: Optional[list[str]], solr: Optional[dict[str, Any]],
and no_tools: bool, change the signature of build_rag_context to
build_rag_context(client: AsyncLlamaStackClient, moderation_decision: str,
query: str, options: BuildRAGOptions) -> RAGContext, update the function body to
read options.vector_store_ids / options.solr / options.no_tools, and update all
callers to construct and pass a BuildRAGOptions instance (or equivalent dict)
instead of separate parameters so the pylint argument limit is no longer
violated.

🧹 Nitpick comments (1)

src/models/requests.py (1)

671-672: Expose no_tools in ResponsesRequest schema examples.

Line 698 adds the field, but the request examples don’t show it, which makes the new behavior harder to discover in generated API docs.

✏️ Suggested doc example update

                 {
                     "input": "Hello World!",
                     "model": "openai/gpt-4o-mini",
                     "instructions": "You are a helpful assistant",
                     "store": True,
                     "stream": False,
+                    "no_tools": False,
                     "generate_topic_summary": True,
                 }

Also applies to: 698-698

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/models/requests.py` around lines 671 - 672, The ResponsesRequest schema
gained a new boolean field no_tools but the existing examples for
ResponsesRequest were not updated; update the API schema examples for the
ResponsesRequest model/class (the examples used by the pydantic/schema generator
in the ResponsesRequest definition) to include the no_tools key (true/false) in
each relevant example so the generated docs show the new behavior; locate the
ResponsesRequest examples object/Examples dict in src/models/requests.py and add
no_tools to every example entry alongside the other fields.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@src/app/endpoints/responses.py`:
- Around line 218-238: The code currently resolves tools unconditionally and
only uses responses_request.no_tools when building inline RAG, so requests with
no_tools=True can still send tools to client.responses.create; modify the
control flow to check responses_request.no_tools before calling
resolve_tool_choice (or, if you must call it, ensure it returns empty tools and
None tool_choice when responses_request.no_tools is True) and ensure the final
payload sent to client.responses.create does not include tool definitions or
tool_choice when responses_request.no_tools is True; update references in this
block (responses_request.no_tools, resolve_tool_choice, build_rag_context, and
the eventual client.responses.create call) so no tool-based RAG is used if
no_tools is set.

In `@src/utils/vector_search.py`:
- Around line 320-420: _fetcH_byok_rag currently has too many local variables
(R0914); refactor to reduce locals by extracting small helpers: move the
vector-store selection logic into a helper that returns
vector_store_ids_to_query (use resolve_vector_store_ids and filter out
constants.SOLR_DEFAULT_VECTOR_STORE_ID), and move the
result-flattening/sorting/top-N + RAGChunk conversion into another helper that
takes results_per_store, configuration.score_multiplier_mapping,
configuration.rag_id_mapping and returns (rag_chunks, referenced_documents)
using _query_store_for_byok_rag and _process_byok_rag_chunks_for_documents; call
these helpers from _fetch_byok_rag so the function keeps few locals and stays
under the pylint limit.
- Around line 499-506: build_rag_context currently has too many positional
parameters (causing R0913/R0917); fold the less-essential/optional parameters
into a single options object to reduce parameter count. Create a small
dataclass/TypedDict (e.g., BuildRAGOptions) that contains vector_store_ids:
Optional[list[str]], solr: Optional[dict[str, Any]], and no_tools: bool, change
the signature of build_rag_context to build_rag_context(client:
AsyncLlamaStackClient, moderation_decision: str, query: str, options:
BuildRAGOptions) -> RAGContext, update the function body to read
options.vector_store_ids / options.solr / options.no_tools, and update all
callers to construct and pass a BuildRAGOptions instance (or equivalent dict)
instead of separate parameters so the pylint argument limit is no longer
violated.

---

Nitpick comments:
In `@src/models/requests.py`:
- Around line 671-672: The ResponsesRequest schema gained a new boolean field
no_tools but the existing examples for ResponsesRequest were not updated; update
the API schema examples for the ResponsesRequest model/class (the examples used
by the pydantic/schema generator in the ResponsesRequest definition) to include
the no_tools key (true/false) in each relevant example so the generated docs
show the new behavior; locate the ResponsesRequest examples object/Examples dict
in src/models/requests.py and add no_tools to every example entry alongside the
other fields.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 1cb84421-ade6-41af-898e-2b6a71104fbc

📥 Commits

Reviewing files that changed from the base of the PR and between cdec708 and 543dc7d.

📒 Files selected for processing (5)

src/app/endpoints/query.py
src/app/endpoints/responses.py
src/app/endpoints/streaming_query.py
src/models/requests.py
src/utils/vector_search.py

tisnik

Is it rebased? does not seem so

Signed-off-by: Lucas <lyoon@redhat.com>

JslYoon · 2026-03-18T20:49:15Z

Just rebased and fixed @tisnik!

Signed-off-by: Lucas <lyoon@redhat.com>

coderabbitai

🧹 Nitpick comments (1)

src/app/endpoints/query.py (1)

63-69: Consolidate duplicate imports from utils.types.

RAGContextParams should be imported alongside the other types from utils.types on lines 63-67 rather than as a separate import on line 69.

♻️ Suggested consolidation

 from utils.types import (
     ResponsesApiParams,
     ShieldModerationResult,
     TurnSummary,
+    RAGContextParams,
 )
 from utils.vector_search import build_rag_context
-from utils.types import RAGContextParams

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/app/endpoints/query.py` around lines 63 - 69, The imports from
utils.types are duplicated: move RAGContextParams into the existing grouped
import with ResponsesApiParams, ShieldModerationResult, and TurnSummary so all
four types are imported in a single statement; update the import block that
currently lists ResponsesApiParams, ShieldModerationResult, TurnSummary to also
include RAGContextParams and remove the separate from utils.types import
RAGContextParams line to eliminate the duplicate import.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/app/endpoints/query.py`:
- Around line 63-69: The imports from utils.types are duplicated: move
RAGContextParams into the existing grouped import with ResponsesApiParams,
ShieldModerationResult, and TurnSummary so all four types are imported in a
single statement; update the import block that currently lists
ResponsesApiParams, ShieldModerationResult, TurnSummary to also include
RAGContextParams and remove the separate from utils.types import
RAGContextParams line to eliminate the duplicate import.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 8c55e906-b91f-45df-ae02-2f0333ee9895

📥 Commits

Reviewing files that changed from the base of the PR and between b19d3b9 and a53fcec.

📒 Files selected for processing (9)

src/app/endpoints/query.py
src/app/endpoints/responses.py
src/app/endpoints/streaming_query.py
src/utils/types.py
src/utils/vector_search.py
tests/integration/endpoints/test_query_byok_integration.py
tests/integration/endpoints/test_streaming_query_byok_integration.py
tests/unit/app/endpoints/test_responses.py
tests/unit/utils/test_vector_search.py

🚧 Files skipped from review as they are similar to previous changes (2)

src/app/endpoints/streaming_query.py
tests/unit/utils/test_vector_search.py

Signed-off-by: Lucas <lyoon@redhat.com>

Fixing cases where both inline and tools RAG are enabled

543dc7d

Signed-off-by: Lucas <lyoon@redhat.com>

coderabbitai bot reviewed Mar 18, 2026

View reviewed changes

JslYoon force-pushed the JslYoon-LCORE-1500 branch from eb33354 to 956da63 Compare March 18, 2026 20:29

Merge remote-tracking branch 'upstream/main' into JslYoon-LCORE-1500

f0b7d76

JslYoon force-pushed the JslYoon-LCORE-1500 branch from 956da63 to b19d3b9 Compare March 18, 2026 20:32

tisnik reviewed Mar 18, 2026

View reviewed changes

tisnik changed the title ~~[LCORE-1500] [bug] Fixing cases where both inline and tools RAG are enabled~~ LCORE-1500: [bug] Fixing cases where both inline and tools RAG are enabled Mar 18, 2026

Updating test cases to reflect changes/bug fix

08cc480

Signed-off-by: Lucas <lyoon@redhat.com>

JslYoon force-pushed the JslYoon-LCORE-1500 branch from b19d3b9 to 08cc480 Compare March 18, 2026 20:48

JslYoon requested a review from tisnik March 18, 2026 20:50

JslYoon added 3 commits March 18, 2026 17:12

fix linting

52fcf81

Signed-off-by: Lucas <lyoon@redhat.com>

Merge remote-tracking branch 'upstream/main' into JslYoon-LCORE-1500

50acac0

fix unit tests params

a53fcec

Signed-off-by: Lucas <lyoon@redhat.com>

coderabbitai bot reviewed Mar 18, 2026

View reviewed changes

Merge remote-tracking branch 'upstream/main' into JslYoon-LCORE-1500

8992bb9

tisnik requested a review from are-ces March 19, 2026 15:22

JslYoon added 2 commits March 19, 2026 15:00

Merge remote-tracking branch 'upstream/main' into JslYoon-LCORE-1500

3972b4e

pylint error

fba709a

Signed-off-by: Lucas <lyoon@redhat.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

LCORE-1500: [bug] Fixing cases where both inline and tools RAG are enabled#1351

LCORE-1500: [bug] Fixing cases where both inline and tools RAG are enabled#1351
JslYoon wants to merge 9 commits intolightspeed-core:mainfrom
JslYoon:JslYoon-LCORE-1500

JslYoon commented Mar 18, 2026 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Mar 18, 2026 •

edited

Loading

Reviews paused

Uh oh!

coderabbitai bot left a comment

Uh oh!

tisnik left a comment

Uh oh!

JslYoon commented Mar 18, 2026

Uh oh!

coderabbitai bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

JslYoon commented Mar 18, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Type of change

Tools used to create PR

Related Tickets & Documents

Checklist before requesting a review

Testing

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Mar 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

tisnik left a comment

Choose a reason for hiding this comment

Uh oh!

JslYoon commented Mar 18, 2026

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

JslYoon commented Mar 18, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Mar 18, 2026 •

edited

Loading