An open standard for documenting AI agent skill risks. Like a Material Safety Data Sheet (MSDS) for chemicals, but for the capabilities you give to autonomous systems. Rate inherent hazards with HDAC (Host, Data, Actuation, Content) scores, then prescribe containment controls.
Host (H): Risk to the local machine, environment integrity, and availability. Examples: shell execution, package installs, config modifications, persistence.
Data (D): Risk of privacy leakage, data exfiltration, or sensitive information misuse. Examples: file access, credential stores, third-party transmission.
Actuation (A): Risk of irreversible or externally impactful actions. Examples: emails, posts, purchases, deletions, production changes.
Content (C): Risk of generating harmful, NSFW, or regulated content. Examples: adult material, self-harm topics, medical/legal advice without safeguards.
Core Principle: Rate the inherent hazard, then prescribe containment. A skill that CAN send emails gets the flag and score, even if it's "supposed to" only draft them.
Score each factor 0–4 based on the highest matching rule.
| Score | Criteria |
|---|---|
| H0 | No filesystem access, no execution |
| H1 | Read-only workspace access |
| H2 | Write to workspace only, OR read user home, no exec |
| H3 | Shell/code execution, OR write to user home, OR install deps |
| H4 | Privileged execution (admin/root), OR persistence, OR write system |
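The "highest matching rule" approach can be sketched for the Host table as a top-down check that returns at the first match. This is a minimal illustration only: the `HostCapabilities` shape and the `scoreHost` name are assumptions made for this example, not part of the SSDS schema, and the same pattern would apply to the D, A, and C tables.

```typescript
// Hypothetical capability shape for illustration only; the real
// CapabilityManifest is defined in schemas/ssds.schema.ts.
interface HostCapabilities {
  readWorkspace: boolean;
  writeWorkspace: boolean;
  readUserHome: boolean;
  writeUserHome: boolean;
  writeSystem: boolean;
  exec: boolean;
  installDeps: boolean;
  privileged: boolean;
  persistence: boolean;
}

type Score = 0 | 1 | 2 | 3 | 4;

// Check rules from H4 down to H1 so the highest matching rule wins.
function scoreHost(c: HostCapabilities): Score {
  if (c.privileged || c.persistence || c.writeSystem) return 4;
  if (c.exec || c.writeUserHome || c.installDeps) return 3;
  if (c.writeWorkspace || c.readUserHome) return 2;
  if (c.readWorkspace) return 1;
  return 0;
}

// A skill that reads the workspace and can run shell commands rates H3.
const shellSkill: HostCapabilities = {
  readWorkspace: true, writeWorkspace: false, readUserHome: false,
  writeUserHome: false, writeSystem: false, exec: true,
  installDeps: false, privileged: false, persistence: false,
};
console.log("H" + scoreHost(shellSkill)); // prints "H3"
```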
| Score | Criteria |
|---|---|
| D0 | No data access beyond user-provided input |
| D1 | Reads workspace files only, no egress |
| D2 | Reads user home OR has egress to allowlist |
| D3 | Reads credential files/env vars, OR browser data, OR unrestricted egress with file access |
| D4 | Keychain access, OR browser sessions + unrestricted egress, OR multiple credential sources |
| Score | Criteria |
|---|---|
| A0 | No side effects (read-only, outputs text only) |
| A1 | Local side effects only (writes to workspace) |
| A2 | External side effects that are reversible (create issues, draft PRs) |
| A3 | Send messages, post publicly, modify external state |
| A4 | Financial transactions (purchase/transfer), OR deploy to production, OR delete external resources |
| Score | Criteria |
|---|---|
| C0 | No content generation, or pure data transformation |
| C1 | General content generation |
| C2 | Handles regulated domains (medical/legal/financial advice) |
| C3 | Can access/generate NSFW content |
| C4 | Designed for harmful content, or systematically bypasses safety |
Flags are derived deterministically from the capabilities manifest. They indicate specific capabilities the skill has.
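To illustrate what deterministic derivation could look like, here is a small sketch that maps a simplified manifest to flag names. The `MiniManifest` shape and the `deriveFlags` function are hypothetical and exist only for this example; the actual manifest fields and derivation rules are whatever the schema and its tooling define. The flag names themselves come from the HazardFlag enum.

```typescript
// Hypothetical, simplified manifest; NOT the real CapabilityManifest.
interface MiniManifest {
  shell: boolean;
  fsRead: Array<"workspace" | "user" | "system">;
  fsWrite: Array<"workspace" | "user" | "system">;
  egress: "none" | "local" | "allowlist" | "any";
}

function deriveFlags(m: MiniManifest): string[] {
  const seen: Record<string, true> = {};
  if (m.shell) seen["EXEC"] = true;
  for (const scope of m.fsRead) seen["FS_READ_" + scope.toUpperCase()] = true;
  for (const scope of m.fsWrite) seen["FS_WRITE_" + scope.toUpperCase()] = true;
  if (m.egress !== "none") seen["NET_EGRESS_" + m.egress.toUpperCase()] = true;
  // Sorting makes the result independent of the order of manifest entries,
  // so the same manifest always yields the same flag list.
  return Object.keys(seen).sort();
}
```

The function has no inputs other than the manifest and no randomness, which is the property "deterministic" asks for: two reviewers given the same manifest should arrive at the same flags.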
Select containment level based on the maximum HDAC score:
| Level | Criteria | Typical Controls |
|---|---|---|
| none | All scores ≤ 1 | Minimal or no containment needed |
| standard | Max score = 2 | Basic sandboxing, logging |
| elevated | Max score = 3 | Container sandbox, approval gates, egress filtering |
| maximum | Any score = 4 | VM isolation, all approvals required, strict allowlists |
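The level selection in the table reduces to taking the maximum of the four scores. A minimal sketch, assuming a plain `Hdac` object with numeric H, D, A, C fields (the `containmentLevel` function name is made up for this example):

```typescript
type ContainmentLevel = "none" | "standard" | "elevated" | "maximum";

interface Hdac {
  H: number;
  D: number;
  A: number;
  C: number;
}

// Pick the containment level from the highest of the four scores.
function containmentLevel(r: Hdac): ContainmentLevel {
  const max = Math.max(r.H, r.D, r.A, r.C);
  if (max >= 4) return "maximum";
  if (max === 3) return "elevated";
  if (max === 2) return "standard";
  return "none"; // all scores <= 1
}

console.log(containmentLevel({ H: 3, D: 2, A: 1, C: 0 })); // prints "elevated"
```

A single high score is enough to raise the level: a skill rated H0 D0 A4 C0 still lands in "maximum".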
| Flag | Required Control |
|---|---|
| ACT_SEND_MESSAGE | APPROVE_SEND |
| ACT_POST_PUBLIC | APPROVE_POST |
| ACT_PURCHASE or ACT_TRANSFER_MONEY | APPROVE_PURCHASE |
| ACT_DELETE_EXTERNAL | APPROVE_DELETE |
| ACT_DEPLOY | APPROVE_DEPLOY |
| EXEC or CODE_EXEC | APPROVE_EXEC or SANDBOX_CONTAINER |
| CREDS_BROWSER | CREDS_NO_BROWSER (unless explicitly needed) |
| NET_EGRESS_ANY with D≥3 | NET_EGRESS_ALLOWLIST |
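One way to apply the flag-to-control rules is a checker that reports which requirements a declared control list fails to meet. The sketch below is an assumption-laden illustration: `missingControls` is a hypothetical helper, each requirement is modelled as a list of acceptable alternatives, and the "unless explicitly needed" exception for CREDS_BROWSER is not modelled.

```typescript
// Returns the unmet requirements; each entry lists the controls
// that would satisfy it (any one of them is enough).
function missingControls(flags: string[], dScore: number, controls: string[]): string[][] {
  const has = (f: string) => flags.indexOf(f) !== -1;
  const required: string[][] = [];

  if (has("ACT_SEND_MESSAGE")) required.push(["APPROVE_SEND"]);
  if (has("ACT_POST_PUBLIC")) required.push(["APPROVE_POST"]);
  if (has("ACT_PURCHASE") || has("ACT_TRANSFER_MONEY")) required.push(["APPROVE_PURCHASE"]);
  if (has("ACT_DELETE_EXTERNAL")) required.push(["APPROVE_DELETE"]);
  if (has("ACT_DEPLOY")) required.push(["APPROVE_DEPLOY"]);
  if (has("EXEC") || has("CODE_EXEC")) required.push(["APPROVE_EXEC", "SANDBOX_CONTAINER"]);
  // Simplification: always required here; the exception is left to a human reviewer.
  if (has("CREDS_BROWSER")) required.push(["CREDS_NO_BROWSER"]);
  if (has("NET_EGRESS_ANY") && dScore >= 3) required.push(["NET_EGRESS_ALLOWLIST"]);

  return required.filter((alts) => !alts.some((c) => controls.indexOf(c) !== -1));
}

// EXEC is covered by the sandbox, but sending messages still needs an approval gate.
console.log(missingControls(["EXEC", "ACT_SEND_MESSAGE"], 1, ["SANDBOX_CONTAINER"]));
// prints [ [ 'APPROVE_SEND' ] ]
```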
The canonical SSDS schema is defined in TypeScript using Zod.
Full source: schemas/ssds.schema.ts
export const SSDS = z.object({
  // Metadata
  meta: DocumentMeta,
  // What skill is this for?
  skill: SkillIdentification,
  // What can it do? (detailed capabilities for flag derivation)
  capabilities: CapabilityManifest,
  // How dangerous is it? (the diamond + flags)
  hazards: HazardAssessment,
  // How should it be deployed? (PPE equivalent)
  containment: RecommendedContainment,
  // What could go wrong?
  risks: KnownRisks,
  // What to do if something goes wrong
  incident_response: IncidentResponse,
  // Supporting evidence
  evidence: z.array(EvidenceItem),
  // Vendor extensions
  extensions: z.record(z.string(), z.unknown()).optional(),
}).strict();
export const HdacRatings = z.object({
  H: z.number().int().min(0).max(4)
    .describe("Host risk (0-4): system access, execution, persistence"),
  D: z.number().int().min(0).max(4)
    .describe("Data risk (0-4): sensitive data access, exfiltration potential"),
  A: z.number().int().min(0).max(4)
    .describe("Actuation risk (0-4): side effects, external actions"),
  C: z.number().int().min(0).max(4)
    .describe("Content risk (0-4): harmful/sensitive content generation"),
}).strict();
export const SkillIdentification = z.object({
  name: z.string().max(200),
  version: z.string().regex(semverPattern),
  format: SkillFormat,
  description: z.string().max(5000).nullable().optional(),
  publisher: z.string().max(200),
  publisher_url: z.string().url().nullable().optional(),
  source: z.object({
    channel: DistributionChannel, // npm, pypi, github, local, other
    url: z.string().url().nullable().optional(),
    retrieved_at: z.string().datetime().nullable().optional(),
  }).strict(),
  // Cryptographic pinning - CRITICAL for supply chain security
  artifact: z.object({
    sha256: z.string().regex(sha256Pattern),
    hash_method: HashMethod.default("files_sorted"),
    git_commit: z.string().regex(gitCommitPattern).nullable().optional(),
    git_tag: z.string().max(200).nullable().optional(),
  }).strict(),
  license: z.string().max(200).nullable().optional(),
}).strict();
export const HazardFlag = z.enum([
  // Execution
  "EXEC", "CODE_EXEC", "PRIVILEGED", "PERSISTENCE",
  // Filesystem
  "FS_READ_WORKSPACE", "FS_READ_USER", "FS_READ_SYSTEM",
  "FS_WRITE_WORKSPACE", "FS_WRITE_USER", "FS_WRITE_SYSTEM", "FS_DELETE",
  // Network
  "NET_EGRESS_LOCAL", "NET_EGRESS_ALLOWLIST", "NET_EGRESS_ANY", "NET_INGRESS",
  // Credentials
  "CREDS_ENV", "CREDS_FILES", "CREDS_BROWSER", "CREDS_KEYCHAIN",
  // Services
  "SVC_EMAIL", "SVC_MESSAGING", "SVC_STORAGE", "SVC_CODE_HOST",
  "SVC_ISSUE_TRACKER", "SVC_CLOUD_INFRA", "SVC_PAYMENTS",
  "SVC_SOCIAL", "SVC_DATABASE",
  // Actions
  "ACT_SEND_MESSAGE", "ACT_POST_PUBLIC", "ACT_PURCHASE",
  "ACT_TRANSFER_MONEY", "ACT_DEPLOY", "ACT_DELETE_EXTERNAL",
  // Prompt Injection Surfaces
  "PI_WEB", "PI_EMAIL", "PI_DOCUMENTS", "PI_USER_INPUT",
  // Content
  "CONTENT_NSFW", "CONTENT_REGULATED",
]);
export const ContainmentControl = z.enum([
  // Sandboxing
  "SANDBOX_CONTAINER", "SANDBOX_VM", "SANDBOX_REMOTE", "SANDBOX_NONE_OK",
  // Filesystem
  "FS_READONLY", "FS_WORKSPACE_ONLY", "FS_NO_HOME", "FS_NO_SECRETS",
  // Network
  "NET_OFFLINE", "NET_EGRESS_BLOCK", "NET_EGRESS_ALLOWLIST", "NET_LOG_ALL",
  // Credentials
  "CREDS_VAULT_ONLY", "CREDS_NO_BROWSER", "CREDS_ROTATE_AFTER",
  // Approval Gates
  "APPROVE_SEND", "APPROVE_POST", "APPROVE_PURCHASE",
  "APPROVE_DELETE", "APPROVE_DEPLOY", "APPROVE_EXEC",
  // Monitoring
  "LOG_ACTIONS", "LOG_REDACT_SECRETS", "ALERT_ANOMALY",
  // Rate Limiting
  "RATE_LIMIT_ACTIONS", "RATE_LIMIT_EGRESS",
  // Review
  "REVIEW_OUTPUT", "REVIEW_DIFF",
]);
import { SSDS, validateSSDS, parseSSDS, isSSDS } from "ssds-schema";

// Safe parsing with error handling
const result = validateSSDS(data);
if (result.success) {
  console.log(result.data); // Typed SSDS object
} else {
  console.error(result.errors);
}

// Direct parsing (throws on invalid)
const ssds = parseSSDS(jsonString);

// Type guard
if (isSSDS(unknownData)) {
  // unknownData is now typed as SSDS
}
Set confidence based on the review performed:
| Level | Criteria |
|---|---|
| low | Self-attested only, or incomplete review |
| medium | Static analysis OR manual review (not both) |
| high | Manual review AND static analysis AND sandbox test |
| verified | Third-party audit with signed attestation |
- self_attested: Publisher's claims only
- static_analysis: Automated code scanning
- manual_review: Human code review
- sandbox_test: Tested in isolated environment
- third_party_audit: Independent security review
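The confidence criteria can be read as a function of which evidence types are present. The sketch below is one possible reading, not a normative rule: `confidenceFor` and its `signedAttestation` parameter are hypothetical, and two cases the table leaves open are resolved conservatively (manual review plus static analysis without a sandbox test stays at medium; a sandbox test on its own stays at low).

```typescript
type EvidenceType =
  | "self_attested"
  | "static_analysis"
  | "manual_review"
  | "sandbox_test"
  | "third_party_audit";

type Confidence = "low" | "medium" | "high" | "verified";

function confidenceFor(evidence: EvidenceType[], signedAttestation = false): Confidence {
  const has = (t: EvidenceType) => evidence.indexOf(t) !== -1;

  // "verified" needs both the audit and a signed attestation.
  if (has("third_party_audit") && signedAttestation) return "verified";

  const staticAnalysis = has("static_analysis");
  const manualReview = has("manual_review");
  if (manualReview && staticAnalysis && has("sandbox_test")) return "high";

  // Assumption: both reviews without a sandbox test do not reach "high".
  if (manualReview || staticAnalysis) return "medium";

  return "low"; // self-attested only, or nothing that counts as review
}

console.log(confidenceFor(["self_attested", "static_analysis"])); // prints "medium"
```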