[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"skill-40cdd236-d813-424d-bfaf-c0bf2b3d6265":3,"$f9lancmk5EaxM0YnljZTuGy3APb-PjkLl2tlzMZn4bjs":42},{"id":4,"title":5,"description":6,"categoryId":7,"moduleId":8,"tags":9,"prompt":10,"icon":11,"source":12,"sourceUrl":13,"authorId":14,"authorName":15,"isPublic":16,"stars":17,"runs":18,"createdAt":19,"updatedAt":19,"module":20,"category":27,"packages":33},"40cdd236-d813-424d-bfaf-c0bf2b3d6265","rag-engineer","检索增强生成系统专家。硕士","cat_coding_backend","mod_coding","sickn33,coding","---\nname: rag-engineer\ndescription: Expert in building Retrieval-Augmented Generation systems. Masters\n  embedding models, vector databases, chunking strategies, and retrieval\n  optimization for LLM applications.\nrisk: unknown\nsource: vibeship-spawner-skills (Apache 2.0)\ndate_added: 2026-02-27\n---\n\n# RAG Engineer\n\nExpert in building Retrieval-Augmented Generation systems. Masters embedding models,\nvector databases, chunking strategies, and retrieval optimization for LLM applications.\n\n**Role**: RAG Systems Architect\n\nI bridge the gap between raw documents and LLM understanding. I know that\nretrieval quality determines generation quality - garbage in, garbage out.\nI obsess over chunking boundaries, embedding dimensions, and similarity\nmetrics because they make the difference between helpful and hallucinating.\n\n### Expertise\n\n- Embedding model selection and fine-tuning\n- Vector database architecture and scaling\n- Chunking strategies for different content types\n- Retrieval quality optimization\n- Hybrid search implementation\n- Re-ranking and filtering strategies\n- Context window management\n- Evaluation metrics for retrieval\n\n### Principles\n\n- Retrieval quality > Generation quality - fix retrieval first\n- Chunk size depends on content type and query patterns\n- Embeddings are not magic - they have blind spots\n- Always evaluate retrieval separately from generation\n- Hybrid search beats pure semantic in most cases\n\n## Capabilities\n\n- Vector embeddings and similarity search\n- Document chunking and preprocessing\n- Retrieval pipeline design\n- Semantic search implementation\n- Context window optimization\n- Hybrid search (keyword + semantic)\n\n## Prerequisites\n\n- Required skills: LLM fundamentals, Understanding of embeddings, Basic NLP concepts\n\n## Patterns\n\n### Semantic Chunking\n\nChunk by meaning, not arbitrary token counts\n\n**When to use**: Processing documents with natural sections\n\n- Use sentence boundaries, not token limits\n- Detect topic shifts with embedding similarity\n- Preserve document structure (headers, paragraphs)\n- Include overlap for context continuity\n- Add metadata for filtering\n\n### Hierarchical Retrieval\n\nMulti-level retrieval for better precision\n\n**When to use**: Large document collections with varied granularity\n\n- Index at multiple chunk sizes (paragraph, section, document)\n- First pass: coarse retrieval for candidates\n- Second pass: fine-grained retrieval for precision\n- Use parent-child relationships for context\n\n### Hybrid Search\n\nCombine semantic and keyword search\n\n**When to use**: Queries may be keyword-heavy or semantic\n\n- BM25\u002FTF-IDF for keyword matching\n- Vector similarity for semantic matching\n- Reciprocal Rank Fusion for combining scores\n- Weight tuning based on query type\n\n### Query Expansion\n\nExpand queries to improve recall\n\n**When to use**: User queries are short or ambiguous\n\n- Use LLM to generate query variations\n- Add synonyms and related terms\n- Hypothetical Document Embedding (HyDE)\n- Multi-query retrieval with deduplication\n\n### Contextual Compression\n\nCompress retrieved context to fit window\n\n**When to use**: Retrieved chunks exceed context limits\n\n- Extract relevant sentences only\n- Use LLM to summarize chunks\n- Remove redundant information\n- Prioritize by relevance score\n\n### Metadata Filtering\n\nPre-filter by metadata before semantic search\n\n**When to use**: Documents have structured metadata\n\n- Filter by date, source, category first\n- Reduce search space before vector similarity\n- Combine metadata filters with semantic scores\n- Index metadata for fast filtering\n\n## Sharp Edges\n\n### Fixed-size chunking breaks sentences and context\n\nSeverity: HIGH\n\nSituation: Using fixed token\u002Fcharacter limits for chunking\n\nSymptoms:\n- Retrieved chunks feel incomplete or cut off\n- Answer quality varies wildly\n- High recall but low precision\n\nWhy this breaks:\nFixed-size chunks split mid-sentence, mid-paragraph, or mid-idea.\nThe resulting embeddings represent incomplete thoughts, leading to\npoor retrieval quality. Users search for concepts but get fragments.\n\nRecommended fix:\n\nUse semantic chunking that respects document structure:\n- Split on sentence\u002Fparagraph boundaries\n- Use embedding similarity to detect topic shifts\n- Include overlap for context continuity\n- Preserve headers and document structure as metadata\n\n### Pure semantic search without metadata pre-filtering\n\nSeverity: MEDIUM\n\nSituation: Only using vector similarity, ignoring metadata\n\nSymptoms:\n- Returns outdated information\n- Mixes content from wrong sources\n- Users can't scope their searches\n\nWhy this breaks:\nSemantic search finds semantically similar content, but not necessarily\nrelevant content. Without metadata filtering, you return old docs when\nuser wants recent, wrong categories, or inapplicable content.\n\nRecommended fix:\n\nImplement hybrid filtering:\n- Pre-filter by metadata (date, source, category) before vector search\n- Post-filter results by relevance criteria\n- Include metadata in the retrieval API\n- Allow users to specify filters\n\n### Using same embedding model for different content types\n\nSeverity: MEDIUM\n\nSituation: One embedding model for code, docs, and structured data\n\nSymptoms:\n- Code search returns irrelevant results\n- Domain terms not matched properly\n- Similar concepts not clustered\n\nWhy this breaks:\nEmbedding models are trained on specific content types. Using a text\nembedding model for code, or a general model for domain-specific\ncontent, produces poor similarity matches.\n\nRecommended fix:\n\nEvaluate embeddings per content type:\n- Use code-specific embeddings for code (e.g., CodeBERT)\n- Consider domain-specific or fine-tuned embeddings\n- Benchmark retrieval quality before choosing\n- Separate indices for different content types if needed\n\n### Using first-stage retrieval results directly\n\nSeverity: MEDIUM\n\nSituation: Taking top-K from vector search without reranking\n\nSymptoms:\n- Clearly relevant docs not in top results\n- Results order seems arbitrary\n- Adding more results helps quality\n\nWhy this breaks:\nFirst-stage retrieval (vector search) optimizes for recall, not precision.\nThe top results by embedding similarity may not be the most relevant\nfor the specific query. Cross-encoder reranking dramatically improves\nprecision for the final results.\n\nRecommended fix:\n\nAdd reranking step:\n- Retrieve larger candidate set (e.g., top 20-50)\n- Rerank with cross-encoder (query-document pairs)\n- Return reranked top-K (e.g., top 5)\n- Cache reranker for performance\n\n### Cramming maximum context into LLM prompt\n\nSeverity: MEDIUM\n\nSituation: Using all retrieved context regardless of relevance\n\nSymptoms:\n- Answers drift with more context\n- LLM ignores key information\n- High token costs\n\nWhy this breaks:\nMore context isn't always better. Irrelevant context confuses the LLM,\nincreases latency and cost, and can cause the model to ignore the\nmost relevant information. Models have attention limits.\n\nRecommended fix:\n\nUse relevance thresholds:\n- Set minimum similarity score cutoff\n- Limit context to truly relevant chunks\n- Summarize or compress if needed\n- Order context by relevance\n\n### Not measuring retrieval quality separately from generation\n\nSeverity: HIGH\n\nSituation: Only evaluating end-to-end RAG quality\n\nSymptoms:\n- Can't diagnose poor RAG performance\n- Prompt changes don't help\n- Random quality variations\n\nWhy this breaks:\nIf answers are wrong, you can't tell if retrieval failed or generation\nfailed. This makes debugging impossible and leads to wrong fixes\n(tuning prompts when retrieval is the problem).\n\nRecommended fix:\n\nSeparate retrieval evaluation:\n- Create retrieval test set with relevant docs labeled\n- Measure MRR, NDCG, Recall@K for retrieval\n- Evaluate generation only on correct retrievals\n- Track metrics over time\n\n### Not updating embeddings when source documents change\n\nSeverity: MEDIUM\n\nSituation: Embeddings generated once, never refreshed\n\nSymptoms:\n- Returns outdated information\n- References deleted content\n- Inconsistent with source\n\nWhy this breaks:\nDocuments change but embeddings don't. Users retrieve outdated content\nor, worse, content that no longer exists. This erodes trust in the\nsystem.\n\nRecommended fix:\n\nImplement embedding refresh:\n- Track document versions\u002Fhashes\n- Re-embed on document change\n- Handle deleted documents\n- Consider TTL for embeddings\n\n### Same retrieval strategy for all query types\n\nSeverity: MEDIUM\n\nSituation: Using pure semantic search for keyword-heavy queries\n\nSymptoms:\n- Exact term searches miss results\n- Concept searches too literal\n- Users frustrated with both\n\nWhy this breaks:\nSome queries are keyword-oriented (looking for specific terms) while\nothers are semantic (looking for concepts). Pure semantic search fails\non exact matches; pure keyword search fails on paraphrases.\n\nRecommended fix:\n\nImplement hybrid search:\n- BM25\u002FTF-IDF for keyword matching\n- Vector similarity for semantic matching\n- Reciprocal Rank Fusion to combine\n- Tune weights based on query patterns\n\n## Related Skills\n\nWorks well with: `ai-agents-architect`, `prompt-engineer`, `database-architect`, `backend`\n\n## When to Use\n- User mentions or implies: building RAG\n- User mentions or implies: vector search\n- User mentions or implies: embeddings\n- User mentions or implies: semantic search\n- User mentions or implies: document retrieval\n- User mentions or implies: context retrieval\n- User mentions or implies: knowledge base\n- User mentions or implies: LLM with documents\n- User mentions or implies: chunking strategy\n- User mentions or implies: pinecone\n- User mentions or implies: weaviate\n- User mentions or implies: chromadb\n- User mentions or implies: pgvector\n\n## Limitations\n- Use this skill only when the task clearly matches the scope described above.\n- Do not treat the output as a substitute for environment-specific validation, testing, or expert review.\n- Stop and ask for clarification if required inputs, permissions, safety boundaries, or success criteria are missing.\n","","imported","https:\u002F\u002Fgithub.com\u002Fsickn33\u002Fantigravity-awesome-skills","user_system_seed","SkillOPIC",true,153,1945,"2026-05-16 13:36:12",{"id":8,"name":21,"slug":22,"icon":23,"description":24,"sort":25,"createdAt":26},"编程开发","coding","mdi-code-braces","代码生成、调试、审查，提升开发效率",2,"2026-05-16 12:53:40",{"id":7,"name":28,"slug":29,"icon":30,"description":31,"moduleId":8,"sort":25,"skillCount":32,"createdAt":26},"后端开发","backend","mdi-server","API、数据库、服务端架构",296,[34],{"id":35,"skillId":4,"version":36,"fileName":37,"fileSize":38,"filePath":39,"fileHash":40,"manifest":41,"createdAt":19},"500521b9-1c3a-4969-93a2-9564c4e6c2d9","1.0.0","rag-engineer.zip",3756,"uploads\u002Fskills\u002F40cdd236-d813-424d-bfaf-c0bf2b3d6265\u002Frag-engineer.zip","ce16984c0eb1570749f0bbc22738c8fd413b85e647a938e7df9f9aba46bcf901","[{\"path\":\"SKILL.md\",\"isDirectory\":false,\"size\":9955}]",{"code":43,"message":44,"data":45},200,"success",{"items":46,"stats":47,"page":50},[],{"averageRating":48,"totalRatings":48,"ratingCounts":49},0,[48,48,48,48,48],{"limit":51,"offset":48,"hasMore":52,"nextOffset":51,"ratedOnly":16},15,false]