| author | TheSiahxyz <164138827+TheSiahxyz@users.noreply.github.com> | 2026-01-16 08:30:14 +0900 |
|---|---|---|
| committer | TheSiahxyz <164138827+TheSiahxyz@users.noreply.github.com> | 2026-01-16 08:30:14 +0900 |
| commit | 3fbb9a18372f2b6a675dd6c039ba52be76f3eeb4 (patch) | |
| tree | aa694a36cdd323a7853672ee7a2ba60409ac3b06 /tooling/vercel-ai-sdk/.claude/commands/ai-rag-setup.md | |
updates
Diffstat (limited to 'tooling/vercel-ai-sdk/.claude/commands/ai-rag-setup.md')
| -rw-r--r-- | tooling/vercel-ai-sdk/.claude/commands/ai-rag-setup.md | 252 |
1 files changed, 252 insertions, 0 deletions
diff --git a/tooling/vercel-ai-sdk/.claude/commands/ai-rag-setup.md b/tooling/vercel-ai-sdk/.claude/commands/ai-rag-setup.md
new file mode 100644
index 0000000..5003af9
--- /dev/null
+++ b/tooling/vercel-ai-sdk/.claude/commands/ai-rag-setup.md
@@ -0,0 +1,252 @@
---
allowed-tools: Read, Write, Edit, MultiEdit, Bash
description: Set up RAG (Retrieval-Augmented Generation) system
argument-hint: "[basic|advanced|conversational|agentic]"
---

## Set up RAG (Retrieval-Augmented Generation) System

Create a comprehensive RAG implementation with embeddings, vector storage, and retrieval: $ARGUMENTS

### Current Project Analysis

Existing database setup: !`find . -name "*schema*" -o -name "*migration*" -o -name "drizzle.config.*" | head -5`

Vector database configuration: !`grep -r "vector\|embedding" . --include="*.ts" --include="*.sql" | head -5`

AI SDK integration: !`grep -r "embed\|embedMany" . --include="*.ts" | head -5`

### RAG Implementation Types

**Basic RAG**: Simple query → retrieve → generate pipeline
**Advanced RAG**: Multi-query, re-ranking, hybrid search, filtering
**Conversational RAG**: Context-aware retrieval with chat history
**Agentic RAG**: Tool-based retrieval with dynamic knowledge access

### Your Task

1. **Analyze current data infrastructure** and vector storage capabilities
2. **Design embedding and chunking strategy** for optimal retrieval
3. **Set up vector database** with proper indexing and search
4. **Implement embedding pipeline** with batch processing
5. **Create retrieval system** with similarity search and ranking
6. **Build RAG generation pipeline** with context injection
7. **Add evaluation metrics** for retrieval and generation quality
8. **Implement comprehensive testing** for all RAG components

### Implementation Requirements

#### Data Processing Pipeline

- Document ingestion and preprocessing
- Intelligent chunking strategies (sentence, semantic, sliding window)
- Metadata extraction and enrichment
- Batch embedding generation with rate limiting
- Deduplication and quality filtering

#### Vector Storage and Search

- Database setup (PostgreSQL + pgvector, Pinecone, Supabase, etc.)
- Proper indexing (HNSW, IVFFlat) for performance
- Similarity search with filtering and ranking
- Hybrid search combining vector and text search
- Metadata filtering and faceted search

#### RAG Generation

- Context selection and ranking
- Prompt engineering for RAG scenarios
- Context window management
- Response grounding and source attribution
- Quality control and relevance scoring
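A minimal sketch of the generation step with the AI SDK's `streamText`, assuming chunks have already been retrieved and ranked; the chunk shape, the model id, and the top-8 cutoff are illustrative, not prescribed:

```typescript
import { openai } from '@ai-sdk/openai';
import { streamText } from 'ai';

// Shape of a retrieved chunk; adjust to match the document_chunks schema below.
type RetrievedChunk = { content: string; source: string; similarity: number };

export function answerWithContext(query: string, chunks: RetrievedChunk[]) {
  // Keep only the top-ranked chunks so the injected context fits the model's window.
  const context = chunks
    .slice(0, 8)
    .map((chunk, i) => `[${i + 1}] (${chunk.source}) ${chunk.content}`)
    .join('\n\n');

  // Ground the answer in the retrieved context and ask for inline source citations.
  return streamText({
    model: openai('gpt-4o-mini'),
    system:
      'Answer using only the context below and cite sources as [n]. ' +
      'If the context does not contain the answer, say so.\n\n' +
      `Context:\n${context}`,
    prompt: query,
  });
}
```

The returned result exposes `textStream`, which a route handler can pipe back to the client for streaming responses.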
### Expected Deliverables

1. **Document processing pipeline** with chunking and embedding
2. **Vector database setup** with optimized indexing
3. **Retrieval system** with advanced search capabilities
4. **RAG generation API** with streaming support
5. **Evaluation framework** for quality measurement
6. **Admin interface** for content management
7. **Comprehensive documentation** and examples

### Database Schema Design

#### PostgreSQL with pgvector

```sql
-- Enable vector extension
CREATE EXTENSION IF NOT EXISTS vector;

-- Documents table
CREATE TABLE documents (
  id SERIAL PRIMARY KEY,
  title VARCHAR(255),
  content TEXT NOT NULL,
  metadata JSONB,
  created_at TIMESTAMP DEFAULT NOW(),
  updated_at TIMESTAMP DEFAULT NOW()
);

-- Chunks table
CREATE TABLE document_chunks (
  id SERIAL PRIMARY KEY,
  document_id INTEGER REFERENCES documents(id) ON DELETE CASCADE,
  content TEXT NOT NULL,
  chunk_index INTEGER,
  metadata JSONB,
  embedding VECTOR(1536),
  created_at TIMESTAMP DEFAULT NOW()
);

-- Indexes for performance
CREATE INDEX ON document_chunks USING hnsw (embedding vector_cosine_ops);
CREATE INDEX ON document_chunks (document_id);
CREATE INDEX ON documents USING gin (metadata);
```

#### Drizzle ORM Schema

```typescript
import {
  index,
  integer,
  jsonb,
  pgTable,
  serial,
  text,
  timestamp,
  varchar,
  vector,
} from 'drizzle-orm/pg-core';

export const documents = pgTable('documents', {
  id: serial('id').primaryKey(),
  title: varchar('title', { length: 255 }),
  content: text('content').notNull(),
  metadata: jsonb('metadata'),
  createdAt: timestamp('created_at').defaultNow(),
  updatedAt: timestamp('updated_at').defaultNow(),
});

export const documentChunks = pgTable(
  'document_chunks',
  {
    id: serial('id').primaryKey(),
    documentId: integer('document_id').references(() => documents.id, {
      onDelete: 'cascade',
    }),
    content: text('content').notNull(),
    chunkIndex: integer('chunk_index'),
    metadata: jsonb('metadata'),
    embedding: vector('embedding', { dimensions: 1536 }),
    createdAt: timestamp('created_at').defaultNow(),
  },
  (table) => ({
    embeddingIndex: index('embedding_idx').using(
      'hnsw',
      table.embedding.op('vector_cosine_ops'),
    ),
    documentIdIndex: index('document_id_idx').on(table.documentId),
  }),
);
```
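A retrieval query against this schema, sketched with Drizzle's vector helpers and the AI SDK's `embed`; the `db` client, file paths, embedding model, and the 0.5 similarity cutoff are assumptions to adapt to the project:

```typescript
import { embed } from 'ai';
import { openai } from '@ai-sdk/openai';
import { cosineDistance, desc, gt, sql } from 'drizzle-orm';
import { db } from './db'; // assumed Drizzle client configured for this schema
import { documentChunks } from './schema';

export async function findSimilarChunks(query: string, limit = 8) {
  // Embed the query with the same model used at ingestion time (1536 dims matches the schema).
  const { embedding } = await embed({
    model: openai.embedding('text-embedding-3-small'),
    value: query,
  });

  // Cosine similarity = 1 - cosine distance; higher is better.
  const similarity = sql<number>`1 - (${cosineDistance(documentChunks.embedding, embedding)})`;

  return db
    .select({
      content: documentChunks.content,
      documentId: documentChunks.documentId,
      similarity,
    })
    .from(documentChunks)
    .where(gt(similarity, 0.5))
    .orderBy(desc(similarity))
    .limit(limit);
}
```

Tune the similarity cutoff and `limit` per corpus, and add metadata filters in the `where` clause for faceted retrieval.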
### Embedding Strategy

#### Chunking Algorithms

- **Sentence-based**: Split on sentence boundaries for coherent chunks
- **Semantic**: Use NLP models to identify semantic boundaries
- **Sliding window**: Overlapping chunks to preserve context
- **Recursive**: Hierarchical chunking for different granularities

#### Model Selection

- **OpenAI**: text-embedding-3-small/large for versatility
- **Cohere**: embed-english-v3.0 for specialized domains
- **Local models**: Sentence-transformers for privacy/cost
- **Multilingual**: Support for multiple languages

### Advanced RAG Patterns

#### Multi-Query RAG

```typescript
async function multiQueryRAG(userQuery: string) {
  // Generate multiple query variants
  const queryVariants = await generateQueryVariants(userQuery);

  // Retrieve for each variant
  const retrievalResults = await Promise.all(
    queryVariants.map(query => retrieveDocuments(query))
  );

  // Combine and re-rank results
  const combinedResults = combineAndRerankResults(retrievalResults);

  return combinedResults;
}
```

#### Conversational RAG

```typescript
async function conversationalRAG(messages: Message[], query: string) {
  // Extract conversation context
  const conversationContext = extractContext(messages);

  // Generate context-aware query
  const contextualQuery = await generateContextualQuery(query, conversationContext);

  // Retrieve with conversation awareness
  const documents = await retrieveWithContext(contextualQuery, conversationContext);

  return documents;
}
```

### Quality Evaluation

#### Retrieval Metrics

- **Precision@K**: Relevant documents in top-K results
- **Recall@K**: Coverage of relevant documents
- **MRR**: Mean Reciprocal Rank of first relevant document
- **NDCG**: Normalized Discounted Cumulative Gain

#### Generation Metrics

- **Faithfulness**: Response grounded in retrieved context
- **Relevance**: Response relevance to user query
- **Completeness**: Coverage of important information
- **Coherence**: Logical flow and readability

### Testing and Validation

#### Unit Testing

- Embedding generation accuracy
- Chunking algorithm correctness
- Similarity search precision
- Database operations integrity

#### Integration Testing

- End-to-end RAG pipeline
- Performance under load
- Quality with various document types
- Scalability testing

#### Evaluation Testing

- Golden dataset evaluation
- A/B testing with different strategies
- User feedback collection
- Continuous quality monitoring

### Performance Optimization

#### Database Optimization

- Proper indexing strategies (HNSW vs IVFFlat)
- Connection pooling and caching
- Query optimization and profiling
- Horizontal scaling considerations

#### Embedding Optimization

- Batch processing for efficiency
- Caching frequently used embeddings
- Model quantization for speed
- Parallel processing pipelines

Focus on building a production-ready RAG system that provides accurate, relevant, and fast retrieval-augmented generation with proper evaluation and optimization strategies.
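As a closing reference, a compact end-to-end ingestion sketch tying the pieces above together: naive sentence-based chunking, batch embedding with the AI SDK's `embedMany`, and insertion through the Drizzle schema. The `db` client, file paths, model id, and chunk size are assumptions, not fixed requirements:

```typescript
import { embedMany } from 'ai';
import { openai } from '@ai-sdk/openai';
import { db } from './db'; // assumed Drizzle client
import { documents, documentChunks } from './schema';

// Naive sentence-based chunking; swap in a semantic or sliding-window splitter as needed.
function chunkBySentence(text: string, maxChars = 800): string[] {
  const sentences = text.split(/(?<=[.!?])\s+/);
  const chunks: string[] = [];
  let current = '';
  for (const sentence of sentences) {
    if (current && (current + ' ' + sentence).length > maxChars) {
      chunks.push(current.trim());
      current = '';
    }
    current += ' ' + sentence;
  }
  if (current.trim()) chunks.push(current.trim());
  return chunks;
}

export async function ingestDocument(title: string, content: string) {
  const [doc] = await db.insert(documents).values({ title, content }).returning();

  const chunks = chunkBySentence(content);

  // Batch-embed all chunks in one call; add rate limiting and retries for large corpora.
  const { embeddings } = await embedMany({
    model: openai.embedding('text-embedding-3-small'),
    values: chunks,
  });

  await db.insert(documentChunks).values(
    chunks.map((chunk, i) => ({
      documentId: doc.id,
      content: chunk,
      chunkIndex: i,
      embedding: embeddings[i],
    })),
  );

  return doc;
}
```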
