v0.2.0

Skill Safety Data Sheet (SSDS)

An open standard for documenting the risks of AI agent skills. It works like a Material Safety Data Sheet (MSDS) for chemicals, applied to the capabilities you give to autonomous systems: rate the inherent hazards with HDAC scores, then prescribe containment controls.

HDAC: Host, Data, Actuation, Content

01 The Four Risk Factors

Host Risk

H: 0–4

Risk to the local machine, its environment integrity, and its availability. Examples: shell execution, package installs, configuration changes, persistence mechanisms.

Data Risk

D: 0–4

Risk of privacy leakage, data exfiltration, or misuse of sensitive information. Examples: file access, credential stores, transmission to third parties.

Actuation Risk

A: 0–4

Risk of irreversible or externally impactful actions. Examples: emails, posts, purchases, deletions, production changes.

Content Risk

C: 0–4

Risk of generating harmful, NSFW, or regulated content. Examples: adult material, self-harm topics, medical or legal advice without safeguards.

Core Principle: Rate the inherent hazard, then prescribe containment. A skill that CAN send emails gets the flag and score, even if it's "supposed to" only draft them.

02 HDAC Scoring Rules

Score each factor 0–4 based on the highest matching rule.

H (Host Risk) — System Access

Score  Criteria
H0     No filesystem access, no execution
H1     Read-only workspace access
H2     Write to workspace only, OR read user home, no exec
H3     Shell/code execution, OR write to user home, OR install deps
H4     Privileged execution (admin/root), OR persistence, OR write system
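
The "highest matching rule" logic for H can be sketched as a lookup followed by a maximum. This is a minimal illustration, not the reference implementation: the `HostFlag` subset, the `HOST_RULES` mapping from hazard flags to rows of the table, and the `scoreHost` name are all assumptions made for this example. The "install deps" criterion has no dedicated flag in this document, so it is left out.

```typescript
// Hypothetical subset of hazard flags that bear on the H factor.
type HostFlag =
  | "EXEC" | "CODE_EXEC" | "PRIVILEGED" | "PERSISTENCE"
  | "FS_READ_WORKSPACE" | "FS_READ_USER"
  | "FS_WRITE_WORKSPACE" | "FS_WRITE_USER" | "FS_WRITE_SYSTEM";

// Assumed mapping from each flag to the H score of the rule it matches.
const HOST_RULES: Record<HostFlag, number> = {
  FS_READ_WORKSPACE: 1,
  FS_WRITE_WORKSPACE: 2,
  FS_READ_USER: 2,
  EXEC: 3,
  CODE_EXEC: 3,
  FS_WRITE_USER: 3,
  PRIVILEGED: 4,
  PERSISTENCE: 4,
  FS_WRITE_SYSTEM: 4,
};

// The score is the highest matching rule; no matching flag means H0.
function scoreHost(flags: HostFlag[]): number {
  return flags.reduce((max, flag) => Math.max(max, HOST_RULES[flag]), 0);
}
```

Under this mapping, a skill with workspace read access plus shell execution scores H3, because the execution rule dominates.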

D (Data Risk) — Sensitive Data Exposure

Score  Criteria
D0     No data access beyond user-provided input
D1     Reads workspace files only, no egress
D2     Reads user home OR has egress to allowlist
D3     Reads credential files/env vars, OR browser data, OR unrestricted egress with file access
D4     Keychain access, OR browser sessions + unrestricted egress, OR multiple credential sources
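
Several D rules depend on combinations of capabilities rather than a single one, so a sketch of D scoring checks the rules from highest to lowest. The `scoreData` function below is hypothetical, and it makes three labeled assumptions the table does not state: `CREDS_BROWSER` stands in for both "browser data" and "browser sessions", `FS_READ_SYSTEM` is treated like a user-home read, and unrestricted egress with no file access is treated like allowlisted egress.

```typescript
// Hypothetical D scoring from hazard flags, evaluated from highest rule to lowest.
function scoreData(flags: ReadonlySet<string>): number {
  const has = (f: string): boolean => flags.has(f);

  const readsFiles =
    has("FS_READ_WORKSPACE") || has("FS_READ_USER") || has("FS_READ_SYSTEM");
  const credSources = ["CREDS_ENV", "CREDS_FILES", "CREDS_BROWSER", "CREDS_KEYCHAIN"]
    .filter(has).length;

  // D4: keychain, browser sessions with unrestricted egress, or multiple credential sources.
  if (has("CREDS_KEYCHAIN") || (has("CREDS_BROWSER") && has("NET_EGRESS_ANY")) || credSources >= 2) {
    return 4;
  }
  // D3: any single credential source, or unrestricted egress combined with file access.
  if (credSources >= 1 || (has("NET_EGRESS_ANY") && readsFiles)) return 3;
  // D2: reads beyond the workspace, or any egress (assumption: ANY counts at least as ALLOWLIST).
  if (has("FS_READ_USER") || has("FS_READ_SYSTEM") || has("NET_EGRESS_ALLOWLIST") || has("NET_EGRESS_ANY")) {
    return 2;
  }
  // D1: workspace reads only, no egress.
  if (has("FS_READ_WORKSPACE")) return 1;
  return 0;
}
```

Checking the strictest rule first keeps the function aligned with "highest matching rule" without computing every rule and taking a maximum.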

A (Actuation Risk) — Side Effects

Score  Criteria
A0     No side effects (read-only, outputs text only)
A1     Local side effects only (writes to workspace)
A2     External side effects that are reversible (create issues, draft PRs)
A3     Send messages, post publicly, modify external state
A4     Financial transactions (purchase/transfer), OR deploy to production, OR delete external resources

C (Content Risk) — Harmful Content

Score  Criteria
C0     No content generation, or pure data transformation
C1     General content generation
C2     Handles regulated domains (medical/legal/financial advice)
C3     Can access/generate NSFW content
C4     Designed for harmful content, or systematically bypasses safety

03 Hazard Flags

Flags are derived deterministically from the capabilities manifest. They indicate specific capabilities the skill has.

Execution Flags

EXEC CODE_EXEC PRIVILEGED PERSISTENCE

Filesystem Flags

FS_READ_WORKSPACE FS_READ_USER FS_READ_SYSTEM FS_WRITE_WORKSPACE FS_WRITE_USER FS_WRITE_SYSTEM FS_DELETE

Network Flags

NET_EGRESS_LOCAL NET_EGRESS_ALLOWLIST NET_EGRESS_ANY NET_INGRESS

Credential Flags

CREDS_ENV CREDS_FILES CREDS_BROWSER CREDS_KEYCHAIN

Service Flags

SVC_EMAIL SVC_MESSAGING SVC_STORAGE SVC_CODE_HOST SVC_ISSUE_TRACKER SVC_CLOUD_INFRA SVC_PAYMENTS SVC_SOCIAL SVC_DATABASE

Action Flags

ACT_SEND_MESSAGE ACT_POST_PUBLIC ACT_PURCHASE ACT_TRANSFER_MONEY ACT_DEPLOY ACT_DELETE_EXTERNAL

Prompt Injection Flags

PI_WEB PI_EMAIL PI_DOCUMENTS PI_USER_INPUT

Content Flags

CONTENT_NSFW CONTENT_REGULATED
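
Deterministic derivation means the same manifest always yields the same flag set. The sketch below illustrates the idea with a heavily simplified `Manifest` shape invented for this example; the real `CapabilityManifest` schema is not shown in this document, and `deriveFlags` is a hypothetical name. Only a few flag families are covered.

```typescript
type Scope = "workspace" | "user" | "system";

// Hypothetical, simplified manifest; NOT the real CapabilityManifest schema.
interface Manifest {
  shell: boolean;
  filesystem: { read: Scope[]; write: Scope[] };
  network: { egress: "none" | "local" | "allowlist" | "any" };
}

// Derive flags purely from manifest fields, with no heuristics or free text.
function deriveFlags(m: Manifest): string[] {
  const flags = new Set<string>();
  if (m.shell) flags.add("EXEC");
  for (const scope of m.filesystem.read) flags.add(`FS_READ_${scope.toUpperCase()}`);
  for (const scope of m.filesystem.write) flags.add(`FS_WRITE_${scope.toUpperCase()}`);
  if (m.network.egress !== "none") {
    flags.add(`NET_EGRESS_${m.network.egress.toUpperCase()}`);
  }
  // Sort so the output does not depend on field or array order.
  return Array.from(flags).sort();
}
```

Using a set and sorting the result keeps the output stable even when the manifest lists the same scope twice or in a different order.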

04 Containment Levels

Select the containment level from the highest of the four HDAC scores:

Level     Criteria        Typical Controls
none      All scores ≤ 1  Minimal or no containment needed
standard  Max score = 2   Basic sandboxing, logging
elevated  Max score = 3   Container sandbox, approval gates, egress filtering
maximum   Any score = 4   VM isolation, all approvals required, strict allowlists
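
The level selection above reduces to taking the maximum of the four scores. A minimal sketch follows; the `Hdac` interface mirrors the H, D, A, and C fields, while `containmentLevel` is a name chosen for this example.

```typescript
type ContainmentLevel = "none" | "standard" | "elevated" | "maximum";

interface Hdac {
  H: number;
  D: number;
  A: number;
  C: number;
}

// Map the highest HDAC score to a containment level, per the table above.
function containmentLevel({ H, D, A, C }: Hdac): ContainmentLevel {
  const max = Math.max(H, D, A, C);
  if (max >= 4) return "maximum";
  if (max === 3) return "elevated";
  if (max === 2) return "standard";
  return "none";
}
```

A single high score is enough to raise the level: a skill rated H1, D1, A4, C0 still lands in maximum containment.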

Required Controls by Flag

Flag                                Required Control
ACT_SEND_MESSAGE                    APPROVE_SEND
ACT_POST_PUBLIC                     APPROVE_POST
ACT_PURCHASE or ACT_TRANSFER_MONEY  APPROVE_PURCHASE
ACT_DELETE_EXTERNAL                 APPROVE_DELETE
ACT_DEPLOY                          APPROVE_DEPLOY
EXEC or CODE_EXEC                   APPROVE_EXEC or SANDBOX_CONTAINER
CREDS_BROWSER                       CREDS_NO_BROWSER (unless explicitly needed)
NET_EGRESS_ANY with D ≥ 3           NET_EGRESS_ALLOWLIST
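
One way to check a containment plan against these rules is to model each requirement as a list of acceptable alternatives and report the requirements that no chosen control satisfies. The sketch below is illustrative only: `requiredControls` and `missingControls` are hypothetical helpers, and the `CREDS_BROWSER` rule is omitted because its "unless explicitly needed" exception calls for human judgment.

```typescript
// A requirement is satisfied if ANY one of its listed controls is applied.
type Requirement = string[];

// Flag-triggered rules from the table above: [triggering flags, acceptable controls].
const FLAG_RULES: Array<[string[], Requirement]> = [
  [["ACT_SEND_MESSAGE"], ["APPROVE_SEND"]],
  [["ACT_POST_PUBLIC"], ["APPROVE_POST"]],
  [["ACT_PURCHASE", "ACT_TRANSFER_MONEY"], ["APPROVE_PURCHASE"]],
  [["ACT_DELETE_EXTERNAL"], ["APPROVE_DELETE"]],
  [["ACT_DEPLOY"], ["APPROVE_DEPLOY"]],
  [["EXEC", "CODE_EXEC"], ["APPROVE_EXEC", "SANDBOX_CONTAINER"]],
];

function requiredControls(flags: ReadonlySet<string>, dScore: number): Requirement[] {
  const reqs: Requirement[] = [];
  for (const [triggers, req] of FLAG_RULES) {
    if (triggers.some((f) => flags.has(f))) reqs.push(req);
  }
  // Unrestricted egress must be narrowed to an allowlist once D reaches 3.
  if (flags.has("NET_EGRESS_ANY") && dScore >= 3) reqs.push(["NET_EGRESS_ALLOWLIST"]);
  return reqs;
}

// Return every requirement that none of the chosen controls satisfies.
function missingControls(reqs: Requirement[], chosen: ReadonlySet<string>): Requirement[] {
  return reqs.filter((req) => !req.some((control) => chosen.has(control)));
}
```

For a skill flagged `ACT_SEND_MESSAGE`, `EXEC`, and `NET_EGRESS_ANY` with D3, a plan containing only `APPROVE_SEND` and `SANDBOX_CONTAINER` would still be missing `NET_EGRESS_ALLOWLIST`.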

Containment Control Reference

SANDBOX_CONTAINER SANDBOX_VM SANDBOX_REMOTE FS_READONLY FS_WORKSPACE_ONLY FS_NO_HOME NET_OFFLINE NET_EGRESS_BLOCK NET_LOG_ALL CREDS_VAULT_ONLY APPROVE_SEND APPROVE_POST APPROVE_PURCHASE APPROVE_DELETE APPROVE_DEPLOY APPROVE_EXEC LOG_ACTIONS LOG_REDACT_SECRETS RATE_LIMIT_ACTIONS REVIEW_OUTPUT


05 Zod Schema

The canonical SSDS schema is defined in TypeScript using Zod. Full source: schemas/ssds.schema.ts

Top-Level Structure

import { z } from "zod";

export const SSDS = z.object({
  // Metadata
  meta: DocumentMeta,
  
  // What skill is this for?
  skill: SkillIdentification,
  
  // What can it do? (detailed capabilities for flag derivation)
  capabilities: CapabilityManifest,
  
  // How dangerous is it? (the diamond + flags)
  hazards: HazardAssessment,
  
  // How should it be deployed? (PPE equivalent)
  containment: RecommendedContainment,
  
  // What could go wrong?
  risks: KnownRisks,
  
  // What to do if something goes wrong
  incident_response: IncidentResponse,
  
  // Supporting evidence
  evidence: z.array(EvidenceItem),
  
  // Vendor extensions
  extensions: z.record(z.string(), z.unknown()).optional(),
}).strict();

HDAC Ratings

export const HdacRatings = z.object({
  H: z.number().int().min(0).max(4)
    .describe("Host risk (0-4): system access, execution, persistence"),
  D: z.number().int().min(0).max(4)
    .describe("Data risk (0-4): sensitive data access, exfiltration potential"),
  A: z.number().int().min(0).max(4)
    .describe("Actuation risk (0-4): side effects, external actions"),
  C: z.number().int().min(0).max(4)
    .describe("Content risk (0-4): harmful/sensitive content generation"),
}).strict();

Skill Identification (with Cryptographic Pinning)

export const SkillIdentification = z.object({
  name: z.string().max(200),
  version: z.string().regex(semverPattern),
  format: SkillFormat,
  description: z.string().max(5000).nullable().optional(),
  
  publisher: z.string().max(200),
  publisher_url: z.string().url().nullable().optional(),
  
  source: z.object({
    channel: DistributionChannel,  // npm, pypi, github, local, other
    url: z.string().url().nullable().optional(),
    retrieved_at: z.string().datetime().nullable().optional(),
  }).strict(),
  
  // Cryptographic pinning - CRITICAL for supply chain security
  artifact: z.object({
    sha256: z.string().regex(sha256Pattern),
    hash_method: HashMethod.default("files_sorted"),
    git_commit: z.string().regex(gitCommitPattern).nullable().optional(),
    git_tag: z.string().max(200).nullable().optional(),
  }).strict(),
  
  license: z.string().max(200).nullable().optional(),
}).strict();
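
To show what pinning an artifact by hash might involve, here is a sketch of one plausible reading of the "files_sorted" method: hash each file's relative path and contents in sorted path order so the digest does not depend on enumeration order. The actual algorithm behind `hash_method` is not specified in this document, so `hashFilesSorted`, the in-memory `files` map, and the NUL separators are all assumptions.

```typescript
import { createHash } from "node:crypto";

// Assumed interpretation of "files_sorted"; the real method may differ.
// `files` maps relative paths to file contents (kept in memory for simplicity).
function hashFilesSorted(files: Record<string, string>): string {
  const hash = createHash("sha256");
  for (const path of Object.keys(files).sort()) {
    // NUL separators keep path and content boundaries distinct for text files.
    hash.update(path);
    hash.update("\0");
    hash.update(files[path]);
    hash.update("\0");
  }
  // Lowercase hex digest, 64 characters long.
  return hash.digest("hex");
}
```

Because the paths are sorted before hashing, two checkouts of the same skill produce the same digest regardless of how the filesystem lists them, and any change to a path or its contents produces a different one.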

Hazard Flags Enum

export const HazardFlag = z.enum([
  // Execution
  "EXEC", "CODE_EXEC", "PRIVILEGED", "PERSISTENCE",
  
  // Filesystem
  "FS_READ_WORKSPACE", "FS_READ_USER", "FS_READ_SYSTEM",
  "FS_WRITE_WORKSPACE", "FS_WRITE_USER", "FS_WRITE_SYSTEM", "FS_DELETE",
  
  // Network
  "NET_EGRESS_LOCAL", "NET_EGRESS_ALLOWLIST", "NET_EGRESS_ANY", "NET_INGRESS",
  
  // Credentials
  "CREDS_ENV", "CREDS_FILES", "CREDS_BROWSER", "CREDS_KEYCHAIN",
  
  // Services
  "SVC_EMAIL", "SVC_MESSAGING", "SVC_STORAGE", "SVC_CODE_HOST",
  "SVC_ISSUE_TRACKER", "SVC_CLOUD_INFRA", "SVC_PAYMENTS",
  "SVC_SOCIAL", "SVC_DATABASE",
  
  // Actions
  "ACT_SEND_MESSAGE", "ACT_POST_PUBLIC", "ACT_PURCHASE",
  "ACT_TRANSFER_MONEY", "ACT_DEPLOY", "ACT_DELETE_EXTERNAL",
  
  // Prompt Injection Surfaces
  "PI_WEB", "PI_EMAIL", "PI_DOCUMENTS", "PI_USER_INPUT",
  
  // Content
  "CONTENT_NSFW", "CONTENT_REGULATED",
]);

Containment Controls Enum

export const ContainmentControl = z.enum([
  // Sandboxing
  "SANDBOX_CONTAINER", "SANDBOX_VM", "SANDBOX_REMOTE", "SANDBOX_NONE_OK",
  
  // Filesystem
  "FS_READONLY", "FS_WORKSPACE_ONLY", "FS_NO_HOME", "FS_NO_SECRETS",
  
  // Network
  "NET_OFFLINE", "NET_EGRESS_BLOCK", "NET_EGRESS_ALLOWLIST", "NET_LOG_ALL",
  
  // Credentials
  "CREDS_VAULT_ONLY", "CREDS_NO_BROWSER", "CREDS_ROTATE_AFTER",
  
  // Approval Gates
  "APPROVE_SEND", "APPROVE_POST", "APPROVE_PURCHASE",
  "APPROVE_DELETE", "APPROVE_DEPLOY", "APPROVE_EXEC",
  
  // Monitoring
  "LOG_ACTIONS", "LOG_REDACT_SECRETS", "ALERT_ANOMALY",
  
  // Rate Limiting
  "RATE_LIMIT_ACTIONS", "RATE_LIMIT_EGRESS",
  
  // Review
  "REVIEW_OUTPUT", "REVIEW_DIFF",
]);

Validation Helpers

import { SSDS, validateSSDS, parseSSDS, isSSDS } from "ssds-schema";

// Safe parsing with error handling
const result = validateSSDS(data);
if (result.success) {
  console.log(result.data);  // Typed SSDS object
} else {
  console.error(result.errors);
}

// Direct parsing (throws on invalid)
const ssds = parseSSDS(jsonString);

// Type guard
if (isSSDS(unknownData)) {
  // unknownData is now typed as SSDS
}

06 Confidence Levels

Set confidence based on the review performed:

Level     Criteria
low       Self-attested only, or incomplete review
medium    Static analysis OR manual review (not both)
high      Manual review AND static analysis AND sandbox test
verified  Third-party audit with signed attestation
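
The table can be read as a decision ladder over the review evidence. The sketch below encodes that reading; the `Review` shape and `confidenceLevel` name are hypothetical. The table does not say how to rate manual review plus static analysis without a sandbox test, so this sketch assumes that case stays at medium.

```typescript
// Hypothetical summary of the review steps that were performed.
interface Review {
  staticAnalysis: boolean;
  manualReview: boolean;
  sandboxTest: boolean;
  signedThirdPartyAudit: boolean;
}

type Confidence = "low" | "medium" | "high" | "verified";

// Check the strongest evidence first and fall through to weaker levels.
function confidenceLevel(r: Review): Confidence {
  if (r.signedThirdPartyAudit) return "verified";
  if (r.manualReview && r.staticAnalysis && r.sandboxTest) return "high";
  // Assumption: both reviews without a sandbox test remain "medium".
  if (r.manualReview || r.staticAnalysis) return "medium";
  return "low";
}
```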

Confidence Basis Values
