[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"skill-d2f6cdef-e4e5-4aaf-b23f-b6de8aa918d2":3,"$fTEovxaM-OrObhmYt_RqrxIgFs2m_jn2NoAECDy2fXhE":42},{"id":4,"title":5,"description":6,"categoryId":7,"moduleId":8,"tags":9,"prompt":10,"icon":11,"source":12,"sourceUrl":13,"authorId":14,"authorName":15,"isPublic":16,"stars":17,"runs":18,"createdAt":19,"updatedAt":19,"module":20,"category":27,"packages":33},"d2f6cdef-e4e5-4aaf-b23f-b6de8aa918d2","nosql-expert","分布式NoSQL数据库（Cassandra、DynamoDB）的专业指导。重点关注思维模型、以查询为先的建模、单表设计和避免大规模系统中的热点分区。","cat_coding_backend","mod_coding","sickn33,coding","---\nname: nosql-expert\ndescription: \"Expert guidance for distributed NoSQL databases (Cassandra, DynamoDB). Focuses on mental models, query-first modeling, single-table design, and avoiding hot partitions in high-scale systems.\"\nrisk: unknown\nsource: community\ndate_added: \"2026-02-27\"\n---\n\n# NoSQL Expert Patterns (Cassandra & DynamoDB)\n\n## Overview\n\nThis skill provides professional mental models and design patterns for **distributed wide-column and key-value stores** (specifically Apache Cassandra and Amazon DynamoDB).\n\nUnlike SQL (where you model data entities), or document stores (like MongoDB), these distributed systems require you to **model your queries first**.\n\n## When to Use\n- **Designing for Scale**: Moving beyond simple single-node databases to distributed clusters.\n- **Technology Selection**: Evaluating or using **Cassandra**, **ScyllaDB**, or **DynamoDB**.\n- **Performance Tuning**: Troubleshooting \"hot partitions\" or high latency in existing NoSQL systems.\n- **Microservices**: Implementing \"database-per-service\" patterns where highly optimized reads are required.\n\n## The Mental Shift: SQL vs. 
Distributed NoSQL\n\n| Feature | SQL (Relational) | Distributed NoSQL (Cassandra\u002FDynamoDB) |\n| :--- | :--- | :--- |\n| **Data modeling** | Model Entities + Relationships | Model **Queries** (Access Patterns) |\n| **Joins** | CPU-intensive, at read time | **Pre-computed** (Denormalized) at write time |\n| **Storage cost** | Expensive (minimize duplication) | Cheap (duplicate data for read speed) |\n| **Consistency** | ACID (Strong) | **BASE (Eventual)** \u002F Tunable |\n| **Scalability** | Vertical (Bigger machine) | **Horizontal** (More nodes\u002Fshards) |\n\n> **The Golden Rule:** In SQL, you design the data model to answer *any* query. In NoSQL, you design the data model to answer *specific* queries efficiently.\n\n## Core Design Patterns\n\n### 1. Query-First Modeling (Access Patterns)\n\nYou typically cannot \"add a query later\" without migration or creating a new table\u002Findex.\n\n**Process:**\n1.  **List all Entities** (User, Order, Product).\n2.  **List all Access Patterns** (\"Get User by Email\", \"Get Orders by User sorted by Date\").\n3.  **Design Table(s)** specifically to serve those patterns with a single lookup.\n\n### 2. The Partition Key is King\n\nData is distributed across physical nodes based on the **Partition Key (PK)**.\n-   **Goal:** Even distribution of data and traffic.\n-   **Anti-Pattern:** Using a low-cardinality PK (e.g., `status=\"active\"` or `gender=\"m\"`) creates **Hot Partitions**, limiting throughput to a single node's capacity.\n-   **Best Practice:** Use high-cardinality keys (User IDs, Device IDs, Composite Keys).\n\n### 3. Clustering \u002F Sort Keys\n\nWithin a partition, data is sorted on disk by the **Clustering Key (Cassandra)** or **Sort Key (DynamoDB)**.\n-   This allows for efficient **Range Queries** (e.g., `WHERE user_id=X AND date > Y`).\n-   It effectively pre-sorts your data for specific retrieval requirements.\n\n### 4. 
Single-Table Design (Adjacency Lists)\n\n*Primary use: DynamoDB (but concepts apply elsewhere)*\n\nStoring multiple entity types in one table to enable pre-joined reads.\n\n| PK (Partition) | SK (Sort) | Data Fields... |\n| :--- | :--- | :--- |\n| `USER#123` | `PROFILE` | `{ name: \"Ian\", email: \"...\" }` |\n| `USER#123` | `ORDER#998` | `{ total: 50.00, status: \"shipped\" }` |\n| `USER#123` | `ORDER#999` | `{ total: 12.00, status: \"pending\" }` |\n\n-   **Query:** `PK=\"USER#123\"`\n-   **Result:** Fetches User Profile AND all Orders in **one network request**.\n\n### 5. Denormalization & Duplication\n\nDon't be afraid to store the same data in multiple tables to serve different query patterns.\n-   **Table A:** `users_by_id` (PK: uuid)\n-   **Table B:** `users_by_email` (PK: email)\n\n*Trade-off: You must manage data consistency across tables (often using eventual consistency or batch writes).*\n\n## Specific Guidance\n\n### Apache Cassandra \u002F ScyllaDB\n\n-   **Primary Key Structure:** `((Partition Key), Clustering Columns)`\n-   **No Joins, No Aggregates:** Do not try to `JOIN` or `GROUP BY`. Pre-calculate aggregates in a separate counter table.\n-   **Avoid `ALLOW FILTERING`:** If you see this in production, your data model is wrong. It implies a full cluster scan.\n-   **Writes are Cheap:** Inserts and Updates are just appends to the LSM tree. Don't worry about write volume as much as read efficiency.\n-   **Tombstones:** Deletes are expensive markers. Avoid high-velocity delete patterns (like queues) in standard tables.\n\n### AWS DynamoDB\n\n-   **GSI (Global Secondary Index):** Use GSIs to create alternative views of your data (e.g., \"Search Orders by Date\" instead of by User).\n    -   *Note:* GSIs are eventually consistent.\n-   **LSI (Local Secondary Index):** Sorts data differently *within* the same partition. Must be created at table creation time.\n-   **WCU \u002F RCU:** Understand capacity modes. 
Single-table design can reduce consumed capacity units by returning related items in a single query.\n-   **TTL:** Use Time-To-Live attributes to expire old data automatically (TTL deletes consume no write capacity) without creating tombstones.\n\n## Expert Checklist\n\nBefore finalizing your NoSQL schema:\n\n-   [ ] **Access Pattern Coverage:** Does every query pattern map to a specific table or index?\n-   [ ] **Cardinality Check:** Does the Partition Key have enough unique values to spread traffic evenly?\n-   [ ] **Split Partition Risk:** Can any single partition (e.g., a single user's orders) grow without bound? (If it could exceed 10 GB, the DynamoDB item-collection limit on tables with an LSI, you need to \"shard\" the partition with a time bucket, e.g., `USER#123#2024-01`.)\n-   [ ] **Consistency Requirement:** Can the application tolerate eventual consistency for this read pattern?\n\n## Common Anti-Patterns\n\n❌ **Scatter-Gather:** Querying *all* partitions to find one item (Scan).\n❌ **Hot Keys:** Choosing a key that concentrates traffic, e.g., putting all \"Monday\" data into one partition.\n❌ **Relational Modeling:** Creating `Author` and `Book` tables and trying to join them in code. 
(Instead, embed Book summaries in Author, or duplicate Author info in Books).\n\n## Limitations\n- Use this skill only when the task clearly matches the scope described above.\n- Do not treat the output as a substitute for environment-specific validation, testing, or expert review.\n- Stop and ask for clarification if required inputs, permissions, safety boundaries, or success criteria are missing.\n","","imported","https:\u002F\u002Fgithub.com\u002Fsickn33\u002Fantigravity-awesome-skills","user_system_seed","SkillOPIC",true,70,1937,"2026-05-16 13:31:04",{"id":8,"name":21,"slug":22,"icon":23,"description":24,"sort":25,"createdAt":26},"编程开发","coding","mdi-code-braces","代码生成、调试、审查，提升开发效率",2,"2026-05-16 12:53:40",{"id":7,"name":28,"slug":29,"icon":30,"description":31,"moduleId":8,"sort":25,"skillCount":32,"createdAt":26},"后端开发","backend","mdi-server","API、数据库、服务端架构",296,[34],{"id":35,"skillId":4,"version":36,"fileName":37,"fileSize":38,"filePath":39,"fileHash":40,"manifest":41,"createdAt":19},"64defe02-c255-4309-9ef7-39273065ec3a","1.0.0","nosql-expert.zip",3125,"uploads\u002Fskills\u002Fd2f6cdef-e4e5-4aaf-b23f-b6de8aa918d2\u002Fnosql-expert.zip","588908b57ad6b74a7c7f4a54a719851d8c9a42572c73aa356fa39aad71ae9c81","[{\"path\":\"SKILL.md\",\"isDirectory\":false,\"size\":6266}]",{"code":43,"message":44,"data":45},200,"success",{"items":46,"stats":47,"page":50},[],{"averageRating":48,"totalRatings":48,"ratingCounts":49},0,[48,48,48,48,48],{"limit":51,"offset":48,"hasMore":52,"nextOffset":51,"ratedOnly":16},15,false]