[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"skill-3e7368cf-271f-4731-96dc-51e2dfeaff1f":3,"$f9tWAut-AyC3NeqQDqfJA1dlHmsZV32kZKg-oStp7db8":42},{"id":4,"title":5,"description":6,"categoryId":7,"moduleId":8,"tags":9,"prompt":10,"icon":11,"source":12,"sourceUrl":13,"authorId":14,"authorName":15,"isPublic":16,"stars":17,"runs":18,"createdAt":19,"updatedAt":19,"module":20,"category":27,"packages":33},"3e7368cf-271f-4731-96dc-51e2dfeaff1f","server-management","Server management principles and decision-making. Process management, monitoring strategy, and scaling decisions. Teaches thinking, not commands.","cat_coding_backend","mod_coding","sickn33,coding","---\nname: server-management\ndescription: \"Server management principles and decision-making. Process management, monitoring strategy, and scaling decisions. Teaches thinking, not commands.\"\nrisk: safe\nsource: community\ndate_added: \"2026-02-27\"\n---\n\n# Server Management\n\n> Server management principles for production operations.\n> **Learn to THINK, not memorize commands.**\n\n---\n\n## 1. Process Management Principles\n\n### Tool Selection\n\n| Scenario | Tool |\n|----------|------|\n| **Node.js app** | PM2 (clustering, reload) |\n| **Any app** | systemd (Linux native) |\n| **Containers** | Docker\u002FPodman |\n| **Orchestration** | Kubernetes, Docker Swarm |\n\n### Process Management Goals\n\n| Goal | What It Means |\n|------|---------------|\n| **Restart on crash** | Auto-recovery |\n| **Zero-downtime reload** | No service interruption |\n| **Clustering** | Use all CPU cores |\n| **Persistence** | Survive server reboot |\n\n---\n\n## 2. 
Monitoring Principles\n\n### What to Monitor\n\n| Category | Key Metrics |\n|----------|-------------|\n| **Availability** | Uptime, health checks |\n| **Performance** | Response time, throughput |\n| **Errors** | Error rate, types |\n| **Resources** | CPU, memory, disk |\n\n### Alert Severity Strategy\n\n| Level | Response |\n|-------|----------|\n| **Critical** | Immediate action |\n| **Warning** | Investigate soon |\n| **Info** | Review daily |\n\n### Monitoring Tool Selection\n\n| Need | Options |\n|------|---------|\n| Simple\u002FFree | PM2 metrics, htop |\n| Full observability | Grafana, Datadog |\n| Error tracking | Sentry |\n| Uptime | UptimeRobot, Pingdom |\n\n---\n\n## 3. Log Management Principles\n\n### Log Strategy\n\n| Log Type | Purpose |\n|----------|---------|\n| **Application logs** | Debug, audit |\n| **Access logs** | Traffic analysis |\n| **Error logs** | Issue detection |\n\n### Log Principles\n\n1. **Rotate logs** to prevent disk fill\n2. **Structured logging** (JSON) for parsing\n3. **Appropriate levels** (error\u002Fwarn\u002Finfo\u002Fdebug)\n4. **No sensitive data** in logs\n\n---\n\n## 4. Scaling Decisions\n\n### When to Scale\n\n| Symptom | Solution |\n|---------|----------|\n| High CPU | Add instances (horizontal) |\n| High memory | Increase RAM or fix leak |\n| Slow response | Profile first, then scale |\n| Traffic spikes | Auto-scaling |\n\n### Scaling Strategy\n\n| Type | When to Use |\n|------|-------------|\n| **Vertical** | Quick fix, single instance |\n| **Horizontal** | Sustainable, distributed |\n| **Auto** | Variable traffic |\n\n---\n\n## 5. 
Health Check Principles\n\n### What Constitutes Healthy\n\n| Check | Meaning |\n|-------|---------|\n| **HTTP 200** | Service responding |\n| **Database connected** | Data accessible |\n| **Dependencies OK** | External services reachable |\n| **Resources OK** | CPU\u002Fmemory not exhausted |\n\n### Health Check Implementation\n\n- Simple: Just return 200\n- Deep: Check all dependencies\n- Choose based on load balancer needs\n\n---\n\n## 6. Security Principles\n\n| Area | Principle |\n|------|-----------|\n| **Access** | SSH keys only, no passwords |\n| **Firewall** | Only needed ports open |\n| **Updates** | Regular security patches |\n| **Secrets** | Environment vars, not files |\n| **Audit** | Log access and changes |\n\n---\n\n## 7. Troubleshooting Priority\n\nWhen something's wrong:\n\n1. **Check if running** (process status)\n2. **Check logs** (error messages)\n3. **Check resources** (disk, memory, CPU)\n4. **Check network** (ports, DNS)\n5. **Check dependencies** (database, APIs)\n\n---\n\n## 8. Anti-Patterns\n\n| ❌ Don't | ✅ Do |\n|----------|-------|\n| Run as root | Use non-root user |\n| Ignore logs | Set up log rotation |\n| Skip monitoring | Monitor from day one |\n| Manual restarts | Auto-restart config |\n| No backups | Regular backup schedule |\n\n---\n\n> **Remember:** A well-managed server is boring. 
That's the goal.\n\n## When to Use\nUse this skill when making server management decisions: process management, monitoring strategy, and scaling.\n\n## Limitations\n- Use this skill only when the task clearly matches the scope described above.\n- Do not treat the output as a substitute for environment-specific validation, testing, or expert review.\n- Stop and ask for clarification if required inputs, permissions, safety boundaries, or success criteria are missing.\n","","imported","https:\u002F\u002Fgithub.com\u002Fsickn33\u002Fantigravity-awesome-skills","user_system_seed","SkillOPIC",true,116,1314,"2026-05-16 13:40:04",{"id":8,"name":21,"slug":22,"icon":23,"description":24,"sort":25,"createdAt":26},"Programming & Development","coding","mdi-code-braces","Code generation, debugging, and review to boost development efficiency",2,"2026-05-16 12:53:40",{"id":7,"name":28,"slug":29,"icon":30,"description":31,"moduleId":8,"sort":25,"skillCount":32,"createdAt":26},"Backend Development","backend","mdi-server","APIs, databases, and server-side architecture",296,[34],{"id":35,"skillId":4,"version":36,"fileName":37,"fileSize":38,"filePath":39,"fileHash":40,"manifest":41,"createdAt":19},"316e10a2-e170-4370-bc21-5c5ed224821f","1.0.0","server-management.zip",2054,"uploads\u002Fskills\u002F3e7368cf-271f-4731-96dc-51e2dfeaff1f\u002Fserver-management.zip","64393642f23ffabc028d82fd4b905f80989c6a79fc3e470b08dde4023a7f309f","[{\"path\":\"SKILL.md\",\"isDirectory\":false,\"size\":4161}]",{"code":43,"message":44,"data":45},200,"success",{"items":46,"stats":47,"page":50},[],{"averageRating":48,"totalRatings":48,"ratingCounts":49},0,[48,48,48,48,48],{"limit":51,"offset":48,"hasMore":52,"nextOffset":51,"ratedOnly":16},15,false]