docs: add RAG service documentation and deployment guide #359
tsivaprasad merged 12 commits into main
📝 Walkthrough
Adds an unreleased changelog entry and expands documentation: updates the services index and introduces a comprehensive pgEdge RAG Server guide covering provisioning, configuration, hybrid vector+keyword retrieval, LLM answer synthesis, request/response contracts, deployment, and troubleshooting.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/services/rag.md`:
- Around line 333-337: The examples after the first one are missing the required
DB schema provisioning described in the intro; update examples 2–5 to include
the scripts.post_database_create entry (same semantics as the Minimal example)
so the vector extension, documents_content_chunks table, and related indexes are
created, or alternatively amend the intro to state only the Minimal example
includes the schema setup; reference the scripts.post_database_create field and
ensure it provisions CREATE EXTENSION IF NOT EXISTS vector, the
documents_content_chunks table (with embedding vector(1536)), and the HNSW and
tsvector indexes so runtime queries against documents_content_chunks succeed.
- Line 772: Replace the invalid Anthropic model identifier "claude-sonnet-4-5"
with the correct ID "claude-sonnet-4-20250514" in the RAG service configuration
entries that currently reference that string; ensure both occurrences (the one
shown in the diff and the other matching instance) are updated so they match the
working model ID used elsewhere ("claude-sonnet-4-20250514").
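The schema objects named in the first comment can be sketched as a small statement generator. This is only an illustration, not the script from `rag.md`: the column names (`id`, `content`), the index names, the cosine operator class, and the `english` text-search configuration are assumptions; only the extension, the table name, the `embedding vector(1536)` column, and the HNSW and tsvector indexes come from the review.

```python
def schema_statements(dim: int = 1536,
                      table: str = "documents_content_chunks") -> list[str]:
    """Build the SQL a post-create script might run.

    Column names, index names, and the operator class are assumptions
    made for illustration.
    """
    if dim <= 0:
        raise ValueError("embedding dimension must be positive")
    return [
        "CREATE EXTENSION IF NOT EXISTS vector;",
        f"CREATE TABLE IF NOT EXISTS {table} ("
        "id BIGSERIAL PRIMARY KEY, "
        "content TEXT NOT NULL, "
        f"embedding vector({dim}));",
        # HNSW index for approximate nearest-neighbor search (pgvector).
        f"CREATE INDEX IF NOT EXISTS {table}_embedding_idx "
        f"ON {table} USING hnsw (embedding vector_cosine_ops);",
        # GIN index over a tsvector expression for keyword search.
        f"CREATE INDEX IF NOT EXISTS {table}_content_fts_idx "
        f"ON {table} USING gin (to_tsvector('english', content));",
    ]


if __name__ == "__main__":
    for statement in schema_statements():
        print(statement)
```

Generating the statements from one function keeps the dimension in a single place, so examples that use a different embedding model can pass a different `dim` instead of copying the schema by hand.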
📒 Files selected for processing (3)
- changes/unreleased/Added-20260422-004204.yaml
- docs/services/index.md
- docs/services/rag.md
♻️ Duplicate comments (1)
docs/services/rag.md (1)
124-124: ⚠️ Potential issue | 🟠 Major
Use explicit, currently supported Anthropic model IDs consistently.
`claude-sonnet-4-5` is ambiguous in docs/examples and can break as aliases change. Please standardize on an official current ID format across the reference and all curl examples (either a stable alias or a dated snapshot), and add a short note to check Anthropic deprecation timelines.
🌐 Web query: As of April 2026, what are the valid Anthropic Claude API model identifiers for Sonnet 4.5 and Sonnet 4.6, and which are aliases versus snapshot IDs? Please use Anthropic official documentation links only.
Also applies to: 291-291, 420-420, 556-557, 664-664, 891-891
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@docs/services/rag.md`:
- Line 124: The docs use an ambiguous Anthropic model ID ("claude-sonnet-4-5") —
update the `model` examples in docs/services/rag.md and every matching
occurrence (the `model` table entry and all curl/example usages) to a specific,
currently supported Anthropic model identifier (choose either the official
stable alias or the dated snapshot ID) and make the ID consistent across all
references; add a one-line note below the `model` description reminding readers
to confirm Anthropic deprecation timelines and link to Anthropic's official
model docs for verification so examples remain valid.
📒 Files selected for processing (1)
docs/services/rag.md
- Resolve index.md merge conflict: keep RAG Server link, adopt main's connect_as-based Database Credentials section and updated Next Steps
- Apply pgEdge stylesheet to rag.md: 79-char wrap, hyphens for em-dashes, table intro sentences, bullet periods, no bold headings, Next Steps as doc links
- Remove redundant sections: Automation & Responsibilities, Loading Documents (duplicated in Step 3), User-Managed Responsibilities, Search Configuration (duplicated in Config Reference)
PLAT-495
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Keep fuller service descriptions for MCP and RAG entries; adopt main's restructured intro and connect_as wording.
PLAT-495
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Restore main's trimmed MCP description; our PR only adds the RAG Server entry.
PLAT-495
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Score values like 0.82/0.87 implied cosine similarity, but the RAG service returns RRF scores, which are much smaller (~0.008). Update both example responses to use realistic RRF score values.
PLAT-495
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
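To see why reciprocal rank fusion (RRF) scores are so much smaller than cosine similarities, here is a minimal sketch of the textbook RRF formula, where each result list contributes `1 / (k + rank)` to a document's score. The constant `k = 60`, the unweighted sum, and the document IDs are assumptions for illustration; this thread does not state the constant or weighting the RAG server uses, so absolute values will differ.

```python
def rrf_scores(rankings: list[list[str]], k: int = 60) -> dict[str, float]:
    """Fuse ranked result lists with reciprocal rank fusion.

    Each list adds 1 / (k + rank) to a document's score, with ranks
    starting at 1. k = 60 is a commonly used default, assumed here.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return scores


# Hypothetical result lists from a vector search and a keyword search.
vector_hits = ["doc-a", "doc-b", "doc-c"]
keyword_hits = ["doc-b", "doc-a", "doc-d"]

fused = rrf_scores([vector_hits, keyword_hits])
for doc_id, score in sorted(fused.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{doc_id}: {score:.4f}")
```

Because every contribution is bounded by `1 / (k + 1)`, even a document ranked first in both lists stays far below the 0.8 range that a cosine similarity would suggest.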
Actionable comments posted: 1
♻️ Duplicate comments (1)
docs/services/rag.md (1)
124-124: ⚠️ Potential issue | 🟠 Major
Verify and normalize Anthropic model IDs in examples.
`claude-sonnet-4-5` appears repeatedly, but Anthropic examples/docs typically use dated IDs (for example `claude-sonnet-4-20250514`) or the official alias pattern. Please verify what the RAG server accepts and update all occurrences to a valid, consistent identifier to avoid copy/paste failures.
🌐 Web query: Does Anthropic's Messages API accept "claude-sonnet-4-5" as a valid model ID, or should docs use "claude-sonnet-4-20250514" / official alias names? Please cite Anthropic official model list docs.
Also applies to: 291-291, 420-420, 556-556, 577-577, 664-664, 891-891
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/services/rag.md`:
- Around line 1025-1031: The docs currently suggest checking the wrong DB user;
update the verification commands to check the service user used by this
deployment (connect_as) instead of admin: replace the example command that runs
"\du+ admin" with "\du+ app_read_only" so the guide validates the actual service
user (app_read_only) and keep the table check "\dt documents_content_chunks"
as-is.
---
Duplicate comments:
In `@docs/services/rag.md`:
- Line 124: Update all occurrences of the example Anthropic model ID
`claude-sonnet-4-5` used in the `model` examples to a verified, valid Anthropic
model identifier (e.g., the dated ID `claude-sonnet-4-20250514` or the official
alias) and ensure consistency across the document; verify the exact accepted IDs
against Anthropic's official model list and replace every instance (including
the other noted occurrences of `claude-sonnet-4-5`) with the canonical ID or
alias the RAG server accepts, and add a short parenthetical note next to the
`model` field explaining that the value must match Anthropic's official model
name/alias.
📒 Files selected for processing (2)
- docs/services/index.md
- docs/services/rag.md
🚧 Files skipped from review as they are similar to previous changes (1)
- docs/services/index.md
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/services/rag.md`:
- Around line 1023-1025: Replace the hardcoded "-h localhost" in the psql
connectivity check command with the same host placeholder used throughout the
guide (e.g., "host-1" or the shared DB host variable) so the example reflects
distributed deployments; update the command shown (the psql -h ... -U admin -d
knowledge_base -c "SELECT 1") to use that placeholder instead of localhost.
- Around line 923-924: The example uses two different host
conventions—"localhost:9200" in the curl example and "host-1:9200"
earlier—causing copy/paste errors; pick one convention (preferably the earlier
"host-1:9200") and update all occurrences of "localhost:9200" (including the
curl example "curl http://localhost:9200/v1/pipelines" and the other instance
around the later snippet) to use the same host string "host-1:9200" so all
service URL examples are consistent.
📒 Files selected for processing (1)
docs/services/rag.md
### OpenAI End-to-End

In the following example, OpenAI is used for both embeddings and
answer generation:
This sentence would read more clearly as:
In the following example, OpenAI is used for both embeddings and to generate answers:
Clarified wording in the documentation regarding shared default values for pipelines.
Actionable comments posted: 1
🧹 Nitpick comments (2)
docs/services/rag.md (2)
438-500: Add dimension reminder to Ollama example.
Similar to the Voyage AI example, users of the Ollama example need to know that `nomic-embed-text` produces 768-dimensional vectors, not 1536. Without this reminder, document insertions will fail with dimension mismatch errors.
📝 Suggested inline reminder
Add a note after the example heading:

    ### Ollama (Self-Hosted)

    In the following example, the RAG service uses a self-hosted Ollama
    server for both embeddings and answer generation. No API key is
    required; the Ollama server URL is provided via `base_url`:

    !!! note
        When using `nomic-embed-text`, adjust the database schema to use
        `vector(768)` instead of `vector(1536)`. See
        [Vector Dimensions](#vector-dimensions).
371-436: Add dimension reminder to Voyage AI example.
While lines 224-231 explain that users should adjust `vector(N)` dimensions, the Voyage AI example doesn't include an inline reminder that `voyage-3` requires `vector(1024)` instead of the `vector(1536)` shown in the first example's schema. Users who jump directly to this example might miss the dimension mismatch, resulting in insertion failures.
📝 Suggested inline reminder
Add a note after the example heading:

    ### Voyage AI with Vector-Only Search

    In the following example, Voyage AI is used for embeddings and the
    service is configured for vector-only search (disabling BM25 keyword
    matching):

    !!! note
        When using `voyage-3`, adjust the database schema to use
        `vector(1024)` instead of `vector(1536)`. See
        [Vector Dimensions](#vector-dimensions).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/services/rag.md`:
- Line 124: Replace every example model identifier "claude-sonnet-4-5" with the
valid identifier "claude-sonnet-4-6" in the docs for the `model` field;
specifically update all seven occurrences referenced in the review (examples
shown alongside the `model` property in docs/services/rag.md) so examples use
"claude-sonnet-4-6" (do not change other models like `gpt-4o` or `llama3.2`).
---
Nitpick comments:
In `@docs/services/rag.md`:
- Around line 438-500: Add a short note under the "Ollama (Self-Hosted)" example
explaining that the embedding model "nomic-embed-text" produces 768-dimensional
vectors (not 1536) and that users must update their schema/vector column type
(e.g., use vector(768)) before inserting documents; place the reminder near the
embedding_llm block in the example so it’s visible when configuring the Ollama
provider.
- Around line 371-436: The Voyage AI example is missing the schema-dimension
reminder; update the "Voyage AI with Vector-Only Search" section to add a short
note after the heading that tells users that the voyage-3 model requires
vector(1024) (not vector(1536)) and to adjust the database schema accordingly
(reference the Vector Dimensions anchor), so readers who jump straight to this
example won't get failed inserts due to dimension mismatch.
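The dimension mismatch described in both nitpicks can also be caught before an insert reaches the database. The sketch below is a hypothetical pre-insert check, not part of the RAG service or of `load.py`: the function name and the lookup table are assumptions, and only the two model dimensions quoted in this review (`voyage-3` at 1024, `nomic-embed-text` at 768) are included.

```python
# Output dimensions quoted in this review; extend as needed.
EMBEDDING_DIMS = {
    "voyage-3": 1024,
    "nomic-embed-text": 768,
}


def check_embedding(model: str, embedding: list[float], column_dim: int) -> None:
    """Raise ValueError if an embedding cannot fit a vector(column_dim) column."""
    expected = EMBEDDING_DIMS.get(model)
    if expected is not None and expected != column_dim:
        raise ValueError(
            f"{model} produces {expected}-dimensional vectors, "
            f"but the column is vector({column_dim})"
        )
    if len(embedding) != column_dim:
        raise ValueError(
            f"embedding has {len(embedding)} values, "
            f"but the column is vector({column_dim})"
        )


if __name__ == "__main__":
    # Passes: a 768-value embedding for a vector(768) column.
    check_embedding("nomic-embed-text", [0.0] * 768, column_dim=768)
    try:
        # Fails: voyage-3 output against the default vector(1536) column.
        check_embedding("voyage-3", [0.0] * 1024, column_dim=1536)
    except ValueError as err:
        print(err)
```

A check like this turns a late database error into an early, readable message, which is the same goal as the inline notes suggested above.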
📒 Files selected for processing (1)
docs/services/rag.md
| Field | Type | Description |
|---|---|---|
| `provider` | string | Required. The LLM provider. One of: `anthropic`, `openai`, `ollama`. |
| `model` | string | Required. The model name (e.g., `claude-sonnet-4-5`, `gpt-4o`, `llama3.2`). |
🧩 Analysis chain
🌐 Web query:
What are the current valid Claude Sonnet model identifiers for the Anthropic API in 2026?
💡 Result:
The current valid Claude Sonnet model identifiers for the Anthropic API as of April 23, 2026, are claude-sonnet-4-6 (primary current model, alias claude-sonnet-4-6) and claude-sonnet-4-5-20250929 (still available but recommended to migrate). Older versions like claude-sonnet-4-20250514 are deprecated and scheduled for retirement on June 15, 2026. Previous Claude 3.x Sonnet models (e.g., claude-3-7-sonnet-20250219, claude-3-5-sonnet-*) have been retired.
Citations:
- 1: https://docs.anthropic.com/en/docs/about-claude/models/whats-new-sonnet-4-5
- 2: https://platform.claude.com/docs/en/about-claude/models/whats-new-sonnet-4-5
- 3: https://docs.anthropic.com/claude/reference/models
- 4: https://platform.claude.com/docs/en/api/models-list
- 5: https://docs.anthropic.com/en/docs/about-claude/models/all-models
Update all model identifier examples from claude-sonnet-4-5 to claude-sonnet-4-6.
The model identifier claude-sonnet-4-5 is not a valid Anthropic Claude model. Current valid identifiers are claude-sonnet-4-6 (primary) and claude-sonnet-4-5-20250929. The incorrect identifier appears in 7 locations throughout the documentation (lines 124, 294, 423, 558, 579, 666, 893) and will cause all examples to fail at runtime. Update these references to use claude-sonnet-4-6.
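Finding every occurrence by hand is error-prone, so a small scan can list them. The sketch below is a hypothetical helper, not a tool from this repository: it flags `claude-sonnet-4-5` only when it is not followed by an eight-digit date suffix, so the dated ID `claude-sonnet-4-5-20250929` mentioned above is left alone. The sample text is invented for the demonstration.

```python
import re

# Match the bare ID, but not a dated snapshot such as
# claude-sonnet-4-5-20250929.
AMBIGUOUS_ID = re.compile(r"claude-sonnet-4-5(?!-\d{8})")


def find_ambiguous_ids(text: str) -> list[tuple[int, str]]:
    """Return (line number, stripped line) for each line with the bare ID."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if AMBIGUOUS_ID.search(line):
            hits.append((lineno, line.strip()))
    return hits


if __name__ == "__main__":
    sample = (
        '"provider": "anthropic",\n'
        '"model": "claude-sonnet-4-5"\n'
        '"model": "claude-sonnet-4-5-20250929"\n'
    )
    for lineno, line in find_ambiguous_ids(sample):
        print(f"line {lineno}: {line}")
```

To check the real file, read `docs/services/rag.md` into a string and pass it to `find_ambiguous_ids`; the reported line numbers can then be compared with the seven locations listed in the review.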
susan-pgedge left a comment
I committed the change that I noted below... this all looks good to me!
### Minimal (OpenAI + Anthropic)

In the following example, a `curl` command provisions a RAG service
with OpenAI for embeddings and Anthropic Claude for answer generation:
Maybe:
In the following example, a curl command provisions a RAG service using OpenAI for embeddings and Anthropic Claude to generate answers:
Summary
This PR adds complete documentation for the pgEdge RAG Server service, including a configuration reference and a step-by-step deployment guide covering database creation, document loading, and pipeline querying.
Changes
- docs/services/rag.md with full configuration reference for `pipelines`, `embedding_llm`, `rag_llm`, `tables`, `search`, and `defaults` fields
- Step-by-step deployment guide: load documents → query pipeline → update config
- docs/services/index.md: link rag.md (remove "coming soon")
Testing
Verification:
Run the script to load documents
load.py
Confirm the db entries
Response from pipeline:
Checklist
PLAT-495