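The Production Readiness meta-skill is a comprehensive orchestration layer that ensures a service is fully operational before it receives production traffic. Rather than checking every configuration by hand, the skill uses the Openclaw Skills ecosystem to route technical concerns (such as structured logging, circuit breaking, and secrets management) to specialized sub-agents. It synthesizes these fine-grained inputs into a final go/no-go assessment, giving developers a unified report on service health and reliability.
By coordinating domain experts in logging, error handling, performance, and security, the skill removes the reliance on tribal knowledge during service handoffs. It checks that every service can observe itself, recover automatically from transient failures, and report its health to the orchestrator, which makes it an important tool for maintaining high-availability environments.
Download: https://github.com/openclaw/skills/tree/main/skills/wpank/production-readiness
The fastest way to install the skill directly from source: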
npx clawhub@latest install production-readiness
Alternatively, copy the skill folder to one of the following locations:
Global: ~/.openclaw/skills/
Workspace: /skills/
Priority: workspace > local > built-in
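
The priority order above (workspace first, then local, then built-in) amounts to a first-match lookup over an ordered list of directories. The following is a minimal sketch of that idea, not OpenClaw's actual resolver: the function name `resolve_skill` and the notion of passing the roots in explicitly are assumptions made for illustration.

```python
from pathlib import Path
from typing import Optional, Sequence


def resolve_skill(name: str, search_roots: Sequence[Path]) -> Optional[Path]:
    """Return the first directory called `name` found under `search_roots`.

    `search_roots` must be ordered from highest to lowest priority, so a
    skill in an earlier root shadows one with the same name in a later root.
    Hypothetical helper; the real lookup logic is not shown in the source.
    """
    for root in search_roots:
        candidate = root / name
        if candidate.is_dir():
            return candidate
    return None
```

Called with a workspace skills directory first and the global `~/.openclaw/skills/` directory second, a workspace copy of `production-readiness` would win over a global one.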
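Copy this prompt into OpenClaw to install the skill automatically:
Please help me install production-readiness using Clawhub. If Clawhub is not installed yet, install it first (npm i -g clawhub).
To add this meta-skill to your environment, use the Openclaw Skills hub CLI: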
npx clawhub@latest install production-readiness
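The skill organizes its assessment data into a maturity-based taxonomy that maps the outputs of the specialized skills to operational requirements:
| Data category | Description |
|---|---|
| Health & lifecycle | Probe check results and verification of graceful shutdown draining |
| Resilience matrix | Audit of circuit breaking, rate limiting, and backpressure mechanisms |
| Secrets audit | Verification of Vault integration and externalized configuration |
| Data safety | Documentation of backup strategy, RPO/RTO, and migration reversibility |
| Operational docs | Links to runbooks, SLOs, and the escalation matrix |
| Maturity rating | Final L1-L4 classification based on cumulative audit results |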
name: production-readiness
model: reasoning
description: Meta-skill that orchestrates logging, monitoring, error handling, performance, security, deployment, and testing skills to ensure a service is fully production-ready before launch. Use before first deploy, major releases, quarterly reviews, or after incidents.
Coordinates all operational concerns into a single readiness review. Instead of duplicating domain expertise, this skill routes to specialized skills and agents for each area, then synthesizes results into a unified go/no-go assessment.
Ensure a service is production-ready by systematically checking every operational concern — logging, error handling, performance, security, deployment, testing, and documentation — before traffic hits it.
A production-ready service:
| Trigger | Context |
|---|---|
| Before first deploy | New service going to production for the first time |
| Before major release | Significant feature or architectural change shipping |
| Quarterly production review | Scheduled audit of existing services |
| After incident | Post-incident hardening to prevent recurrence |
| Dependency upgrade | Major framework, runtime, or infrastructure change |
| Team handoff | Transferring ownership of a service to another team |
Run each area sequentially or in parallel. Each step delegates to a specialized skill or agent — this skill does not re-implement their logic.
┌─────────────────────────────────────────────────┐
│           Production Readiness Review           │
├─────────────────────────────────────────────────┤
│                                                 │
│ 1. Logging & Observability ──▶ logging-observability skill
│ 2. Error Handling ───────────▶ error-handling-patterns skill
│ 3. Performance ──────────────▶ performance-agent
│ 4. Security ─────────────────▶ security-review meta-skill
│ 5. Deployment ───────────────▶ deployment-agent + docker-expert skill
│ 6. Testing ──────────────────▶ testing-workflow meta-skill
│ 7. Documentation ────────────▶ /generate-docs command
│                                                 │
│ ──▶ Synthesize results into go/no-go report     │
└─────────────────────────────────────────────────┘
| Concern | Skill / Agent | Path |
|---|---|---|
| Logging & Observability | logging-observability | ai/skills/tools/logging-observability/SKILL.md |
| Error Handling | error-handling-patterns | ai/skills/backend/error-handling-patterns/SKILL.md |
| Performance | performance-agent | ai/agents/performance/ |
| Security | security-review | ai/skills/meta/security-review/SKILL.md |
| Deployment (containers) | docker-expert | ai/skills/devops/docker/SKILL.md |
| Deployment (pipelines) | deployment-agent | ai/agents/deployment/ |
| Testing | testing-workflow | ai/skills/testing/testing-workflow/SKILL.md |
| Rate Limiting | rate-limiting-patterns | ai/skills/backend/rate-limiting-patterns/SKILL.md |
| Documentation | /generate-docs | ai/commands/documentation/ |
Routing rule: Read the target skill first, follow its instructions, then return results here for synthesis.
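
The routing-then-synthesis flow can be sketched as a lookup table plus a small aggregation step. This is a sketch under stated assumptions rather than the skill's real implementation: the paths are copied from the routing table, while the `Finding` type, the `synthesize` function, and the rule "any failed or unreviewed concern means no-go" are hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List

# Concern -> skill or agent path, copied from the routing table.
ROUTES: Dict[str, str] = {
    "Logging & Observability": "ai/skills/tools/logging-observability/SKILL.md",
    "Error Handling": "ai/skills/backend/error-handling-patterns/SKILL.md",
    "Performance": "ai/agents/performance/",
    "Security": "ai/skills/meta/security-review/SKILL.md",
    "Deployment (containers)": "ai/skills/devops/docker/SKILL.md",
    "Deployment (pipelines)": "ai/agents/deployment/",
    "Testing": "ai/skills/testing/testing-workflow/SKILL.md",
    "Rate Limiting": "ai/skills/backend/rate-limiting-patterns/SKILL.md",
    "Documentation": "ai/commands/documentation/",
}


@dataclass
class Finding:
    """Result returned by one delegated skill or agent (hypothetical shape)."""
    concern: str
    passed: bool
    notes: str = ""


def synthesize(findings: List[Finding]) -> dict:
    """Combine per-concern findings into a go/no-go decision.

    Assumed rule: the decision is "go" only if every routed concern was
    reviewed and none of the reviews failed.
    """
    reviewed = {f.concern for f in findings}
    missing = [c for c in ROUTES if c not in reviewed]
    failed = [f.concern for f in findings if not f.passed]
    decision = "go" if not missing and not failed else "no-go"
    return {"decision": decision, "failed": failed, "missing": missing}
```

Treating an unreviewed concern as a blocker is the conservative choice here; a team that wants advisory-only concerns could relax that rule.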
Health check endpoint (/healthz or /health) returns dependency status

| Level | Name | Requirements |
|---|---|---|
| L1 | MVP | Health check, basic logging, error handling, manual deploy, unit tests, README |
| L2 | Stable | Structured logging, metrics, graceful shutdown, CI/CD pipeline, integration tests, runbooks |
| L3 | Resilient | Distributed tracing, circuit breakers, auto-scaling, chaos testing, SLOs, on-call rotation |
| L4 | Optimized | Adaptive rate limiting, predictive alerting, canary deploys, full observability, error budgets, postmortem culture |
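
One way to turn the maturity table into a rating is to treat the levels as cumulative: a service reaches a level only if it also meets every lower level. The sketch below encodes that reading. The requirement strings come from the table, but the cumulative rule, the `maturity_level` function, and the `"below L1"` label are assumptions, not something the skill documents.

```python
from typing import Dict, List, Set

# Requirements per level, taken from the maturity table (lowest level first).
LEVELS: Dict[str, List[str]] = {
    "L1": ["health check", "basic logging", "error handling",
           "manual deploy", "unit tests", "README"],
    "L2": ["structured logging", "metrics", "graceful shutdown",
           "CI/CD pipeline", "integration tests", "runbooks"],
    "L3": ["distributed tracing", "circuit breakers", "auto-scaling",
           "chaos testing", "SLOs", "on-call rotation"],
    "L4": ["adaptive rate limiting", "predictive alerting", "canary deploys",
           "full observability", "error budgets", "postmortem culture"],
}


def maturity_level(satisfied: Set[str]) -> str:
    """Return the highest level whose requirements, and those of all lower
    levels, are contained in `satisfied` (assumed cumulative rule)."""
    achieved = "below L1"
    for level, requirements in LEVELS.items():  # dicts keep insertion order
        if not set(requirements) <= satisfied:
            break
        achieved = level
    return achieved
```

Under this rule a service that has every L3 item but is missing an L2 item still rates L1, which keeps the rating honest about gaps in the foundations.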
| Severity | Response Time | Escalation After | Stakeholder Notification |
|---|---|---|---|
| SEV-1 (outage) | 15 min | 30 min | Immediate — exec + customers |
| SEV-2 (degraded) | 30 min | 1 hour | Within 1 hour — eng lead |
| SEV-3 (minor) | 4 hours | Next business day | Daily standup |
| SEV-4 (cosmetic) | Next sprint | N/A | Backlog |
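
The "Escalation After" column of the severity table can be expressed as a simple time-threshold check. This is an illustrative sketch only: the `should_escalate` helper is hypothetical, it assumes the clock starts when the incident is opened, and it deliberately leaves SEV-3 ("next business day") and SEV-4 (N/A) unmodeled because they are not fixed durations.

```python
from datetime import timedelta
from typing import Dict, Optional

# Escalation thresholds from the severity table. None means the table gives
# no fixed duration (SEV-3 is calendar-based, SEV-4 is N/A).
ESCALATE_AFTER: Dict[str, Optional[timedelta]] = {
    "SEV-1": timedelta(minutes=30),
    "SEV-2": timedelta(hours=1),
    "SEV-3": None,
    "SEV-4": None,
}


def should_escalate(severity: str, elapsed: timedelta) -> bool:
    """Return True once an unresolved incident has been open for at least
    the escalation threshold of its severity (assumed interpretation)."""
    threshold = ESCALATE_AFTER[severity]
    return threshold is not None and elapsed >= threshold
```

For example, `should_escalate("SEV-1", timedelta(minutes=45))` is true, while the same elapsed time for a SEV-2 incident is not yet past its one-hour threshold.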
## Incident: [Title]
**Date:** YYYY-MM-DD | **Duration:** X hours | **Severity:** SEV-N
### Summary
One-paragraph description of what happened and impact.
### Timeline
- HH:MM — First alert fired
- HH:MM — Engineer paged, investigation started
- HH:MM — Root cause identified
- HH:MM — Mitigation applied
- HH:MM — Full resolution confirmed
### Root Cause
What broke and why. Link to code/config change if applicable.
### Impact
- Users affected: N
- Revenue impact: $X (if applicable)
- SLO budget consumed: X%
### Action Items
| Action | Owner | Due Date | Status |
|--------|-------|----------|--------|
| Fix X | @eng | YYYY-MM-DD | Open |
### Lessons Learned
- What went well
- What went poorly
- Where we got lucky
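
The "SLO budget consumed" field in the Impact section of the template can be filled in with a short calculation. The sketch below assumes a time-based availability SLO, where the error budget is the fraction of the window allowed to be unavailable; the function name, the parameters, and the 99.9% over 30 days figures in the note are illustrative assumptions, not values from the skill.

```python
def slo_budget_consumed(slo_target: float,
                        window_minutes: float,
                        downtime_minutes: float) -> float:
    """Return the percentage of the error budget used by an incident.

    slo_target is a fraction such as 0.999. The budget is the allowed
    downtime in the window: (1 - slo_target) * window_minutes.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if window_minutes <= 0 or downtime_minutes < 0:
        raise ValueError("window must be positive and downtime non-negative")
    budget_minutes = (1.0 - slo_target) * window_minutes
    return 100.0 * downtime_minutes / budget_minutes
```

With a 99.9% target over a 30-day window (43,200 minutes) the budget is about 43.2 minutes, so a 21.6-minute outage consumes roughly 50% of it.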