OSIRIS JSON Producer Guidelines
| Field | Value |
|---|---|
| Authors | Tia Zanella skhell |
| Revision | 1.0.0-DRAFT |
| Creation date | 08 February 2026 |
| Last revision date | 16 February 2026 |
| Status | Draft |
| Document ID | OSIRIS-ADG-PR-1.0 |
| Document URI | OSIRIS-ADG-PR-1.0 |
| Document Name | OSIRIS JSON Producer Guidelines |
| Specification ID | OSIRIS-1.0 |
| Specification URI | OSIRIS-1.0 |
| Schema URI | OSIRIS-1.0 |
| License | CC BY 4.0 |
| Repository | github.com/osirisjson/osiris |
Table of Contents
- 1 The producer mindset
- 2 The ingestion workflow
- 3 Security and redaction (deep dive)
- 4 Quality assurance
1 The producer mindset
An OSIRIS producer (also commonly named a parser) is any component that reads a vendor device or platform inventory and emits an OSIRIS JSON document. OSIRIS Producers are the bridge between proprietary data sources and the OSIRIS Open Standard interchange format. Their quality determines whether downstream consumers (validators, diagram engines, CMDBs, diff tools etc.) can trust the exported snapshot at a point in time.
This guide defines the mapping contract that every producer MUST honor. It is intentionally language-agnostic: the rules apply whether the producer is written in Go (recommended for first-party producers), Python, Rust, C or any other high or low-level language. Implementation-level patterns (transport, concurrency, SDK helpers) belong in the Go producer SDK documentation, not here.
[!NOTE] Back-reference: The ecosystem boundaries and canonical truth rule are defined in OSIRIS-ADG-1.0. The validation model (levels, codes, profiles) is defined in OSIRIS-ADG-VL-1.0. Validation engine internals are defined in OSIRIS-ADG-TLB-CORE-1.0.
1.1 Discovery vs. static definition
Producer implementations fall on a spectrum between two extremes.
| Mode | Description | Typical source | Example |
|---|---|---|---|
| Live discovery | The producer queries a live API, CLI, or protocol endpoint and builds the OSIRIS document from the response. | Cloud provider APIs, SNMP/LLDP/NETCONF, device CLIs via SSH | osirisjson-producer azure, osirisjson-producer arista --ssh |
| Static ingestion | The producer reads an existing export file (Terraform state, inventory dump, CMDB extract) and transforms its resources into OSIRIS. | JSON/YAML/CSV files, database exports, IaC state files | Terraform state > OSIRIS, ServiceNow CMDB CSV > OSIRIS |
Most real-world producers combine both: they discover live inventory for some resource types and fall back to static sources for others.
Guidance:
- Producers SHOULD document which mode they use for each resource type, including known limitations and required permissions.
- Producers MUST NOT assume they will discover the complete infrastructure. Partial inventories are valid OSIRIS documents; the scope SHOULD describe what was exported and what was not (see metadata.scope).
- Discovery failures for individual resources SHOULD NOT abort the entire export. Producers SHOULD continue collecting other resources, log the failure and reflect limitations in the emitted scope or tags.
1.2 Scope of responsibility (what NOT to map)
A producer takes a point-in-time snapshot, translating existing infrastructure into OSIRIS. It MUST NOT orchestrate, provision or mutate anything. Its responsibilities are:
| MUST | SHOULD | MUST NOT |
|---|---|---|
| Emit structurally valid OSIRIS JSON (passes Level 1 schema validation) | Map to standard OSIRIS types (OSIRIS-JSON-v1.0 Chapter 7 and Appendix C) before resorting to custom types | Re-implement validation logic. Canonical validation is @osirisjson/core. Producers validate their output by invoking the canonical engine (e.g. npx @osirisjson/cli validate <file>) in CI |
| Populate all required fields (version, metadata.timestamp and required resource/connection/group fields per the OSIRIS-JSON-v1.0 specification Chapters 3-6) | Provide metadata.generator with a stable tool name and version | Invent vendor-specific interpretations of the OSIRIS-JSON-v1.0 specification or core osiris.schema.json |
| Generate stable, deterministic resource IDs | Describe export boundaries in metadata.scope | Emit data intended to provision, modify, or delete infrastructure. OSIRIS is a read-only snapshot format |
| Exclude credentials, secrets and authentication material (see Chapter 3) | Produce deterministic output for the same input | Guess or fabricate values for unknown fields. If data is not available, omit the optional field entirely (OSIRIS-JSON-v1.0 specification section 11.1.5) |
2 The ingestion workflow
The ingestion workflow is the lifecycle a producer follows to transform vendor data into an OSIRIS snapshot.
flowchart LR
D["**Discovery**
(vendor APIs, SNMP/LLDP/NETCONF, device CLIs via SSH etc.)"] --> N["**Normalization**
(types, IDs, units etc.)"]
N --> R["**Redaction**
(secrets, PII, noise etc.)"]
R --> E["**Emission**
(JSON serialization + validation)"]
2.1 Discovery, normalization, redaction, emission
Each phase has a clear contract and failure mode.
Discovery: Acquire raw vendor data (API responses, CLI output, file contents). Log what was fetched, what was skipped and why. Establish the export scope (metadata.scope).
Normalization: Transform vendor-native representations into OSIRIS specification compliant structures. This is where type mapping, ID generation, unit conversion, timestamp normalization and relationship extraction happen. Normalization standardizes representation, not meaning; it MUST NOT introduce artifacts or invent data altering the source of truth.
Redaction: Strip credentials, secrets, PII and irrelevant vendor noise before the document leaves the producer boundary. Redaction is non-negotiable and MUST NOT be skipped (see Chapter 3 of the current document).
Emission: Serialize the document to JSON and validate it against the OSIRIS schema. Producers MUST validate before publishing. Emit structured logs summarizing the run (resource counts, warnings, errors).
Failure propagation rules:
- Discovery failures for individual resources: log, skip and continue.
- Normalization errors for individual resources: log, skip the resource (or emit with status: "unknown") and continue.
- Redaction detection of secrets: MUST halt emission for the affected resource or fail the entire export, depending on configuration (see section 3.3 of the current document).
- Validation failure at Level 1: MUST fail the export pipeline.
- Validation failure at Level 2: MUST fail the export pipeline under the strict profile. Under default, producers MAY continue only if explicitly configured, but MUST emit the diagnostics and mark limitations in metadata.scope.
2.2 Identity strategy (deterministic IDs and naming patterns)
Stable identity is what makes OSIRIS useful for topology diffs, snapshot correlation and downstream automation. If IDs drift across exports, consumers cannot detect changes reliably.
2.2.1 Resource IDs
Producers MUST ensure resource id values are unique within the document and stable across exports when the underlying entity is the same.
Recommended patterns:
| Domain | Pattern | Examples |
|---|---|---|
| Hyperscalers and Public Cloud providers | provider::native-id | aws::i-0abc123def456, azure::/subscriptions/sub-123/.../vm01 |
| On-premise | site::identifier | mxp::spine-sw-01, mxp::srv-r770-010 |
| OT | site::identifier | mxp::sensor-temp-01, mxp::rfid-reader-01 |
Rules:
- If the source provides a stable unique identifier, producers SHOULD build id from it.
- Producers SHOULD NOT generate random IDs (UUIDs, timestamps) for real resources.
- When no stable native identifier exists, producers MAY derive a deterministic ID from a stable tuple (e.g. {site, name, serial}) and SHOULD document the derivation strategy.
- IDs are opaque to consumers. Consumers SHOULD NOT parse ID structure for meaning.
2.2.1.1 Provider attribution contract (native identity and origin)
Producers MUST separate OSIRIS identity (resource.id) from provider-native identity (resource.provider.native_id).
Rules:
- If the source provides a stable vendor identifier, producers MUST set resource.provider.native_id to that value.
- Producers SHOULD derive resource.id from stable inputs (often including the native id), but resource.id MUST NOT be the only place where the vendor identifier exists.
- Producers SHOULD preserve the vendor’s original type/class string in resource.provider.type when available.
- Origin/context fields (tenant/account/subscription/project/region/site/zone) MUST live in the provider object when OSIRIS defines a corresponding field. They MUST NOT be buried in extensions as ad-hoc conventions.
- Vendor-only payloads or deep raw objects MAY be preserved under extensions using a namespaced key, but only when they are not representable in OSIRIS standard fields.
Rationale: provider.* enables round-trip correlation and cross-producer consistency even when resource.id patterns differ.
2.2.2 Connection IDs
Connection IDs SHOULD be deterministic and derived from the relationship they represent.
Recommended algorithm: Producers SHOULD build a canonical key from stable parts:
- type
- direction (if omitted, treat as "bidirectional")
- canonicalized (source, target) using resource IDs
  - if direction = "bidirectional", sort (source, target) lexicographically to prevent flips across exports
- stable qualifiers when needed (examples: properties.port, properties.protocol, interface IDs, provider-native link ID)
Canonical key serialization (normative):
Serialize as:
v1|{type}|{direction}|{sourceId}|{targetId}|{qualifiers}
Where:
- sourceId/targetId are the full resource IDs (not hints)
- qualifiers is a comma-joined list of k=v pairs sorted by key (omit absent qualifiers)
Compute:
hash16 = first 16 chars of lowercase hex(SHA-256(canonical_key))
Emit this recommended pattern:
conn-{type}-{sourceHint}-to-{targetHint}-{hash16}
Collision rule:
- If an ID collision is detected within the same document, producers MUST extend the hash length (e.g. take 24 or 32 chars) rather than renumbering or adding randomness.
Example:
{
"id": "conn-dataflow.tcp-plc-line-01-to-printer-label-01-7f3a91c22f4c0a1b",
"type": "dataflow.tcp",
"source": "mxp::plc-line-01",
"target": "mxp::printer-label-01",
"direction": "forward",
"properties": {
"protocol": "tcp",
"port": 9100,
"application": "zpl_over_tcp"
}
}
Hint derivation (normative):
- sourceHint and targetHint MUST be derived from source/target resource IDs (never from name).
- Extraction rule:
  - take the substring after the last :: if present, else after the last /
  - lowercase
  - replace any sequence of non [a-z0-9] with -
  - trim leading/trailing -
  - truncate to max 24 chars
- If the result is empty, use the first 8 chars of hash16.
[!NOTE] sourceHint, targetHint, and boundaryHint are display-only slugs and MUST NOT be used as inputs to hashing (use full IDs/stable scope fields).
2.2.3 Group IDs
Group IDs SHOULD be derived from stable boundaries (region, site, rack, environment, cloud organizational unit).
Recommended algorithm:
- Producers MUST select a stable boundaryToken representing the group boundary (examples: site code mxp, region eu-west-1, rack id r42, subscription id, project id).
- The canonical key MUST NOT include members or children (membership and hierarchy may change over time and would destabilize the group ID).
Canonical key serialization (normative):
Serialize as:
v1|{type}|boundary={boundaryToken}|{scopePairs}
Where:
- boundaryToken is the stable machine token (hash input).
- scopePairs is a pipe-joined list of k=v pairs sorted by key name.
- Eligible keys (use if present):
  - provider.name
  - provider.namespace
  - provider.account
  - provider.subscription
  - provider.project
  - provider.region
  - provider.site
  - provider.zone
- Omit absent keys entirely (do not emit empty placeholders).
Compute:
hash16 = first 16 chars of lowercase hex(SHA-256(canonical_key))
Input rules:
- The canonical key MUST NOT include members or children (membership and hierarchy may change over time and would destabilize the group ID).
- When applicable, include provider scope inputs (e.g. provider name/namespace, account/subscription, region/site).
Emit this recommended pattern:
group-{type}-{boundaryHint}-{hash16}
Example:
{
"id": "group-physical.site-mxp-2c10d4a93d7a5b82",
"type": "physical.site",
"name": "MXP Datacenter",
"description": "All on-premise resources in Milan Datacenter",
"members": [
"mxp::srv-r770-001",
"mxp::srv-r770-002",
"mxp::storage-dell-me5024"
],
"properties": {
"address": "Via Malpensa 1, 21010 Vizzola Ticino VA, Italy",
"coordinates": "45.6301° N, 8.7280° E"
},
"tags": {
"location": "on-premise",
"datacenter": "mxp"
}
}
boundaryHint rule:
- boundaryHint MUST be derived from boundaryToken (display-only slug).
- boundaryHint MUST NOT be used as an input to hashing.
- If no stable boundary token exists, producers MAY omit it and emit: group-{type}-{hash16}.
Groups SHOULD remain stable across exports even when temporarily empty (OSIRIS-JSON-v1.0 specification section 2.3.8).
2.3 Schema compliance essentials
Every OSIRIS document MUST include the three required top-level fields: version, metadata and topology.
2.3.1 The $schema field
Producers MUST include $schema at the top level to enable editor resolution and tooling auto-detection.
{
"$schema": "https://osirisjson.org/schema/v1.0/osiris.schema.json",
"version": "1.0.0"
}
Producers targeting OSIRIS v1.0 MUST emit "version": "1.0.0". Top-level fields other than $schema, version, metadata and topology SHOULD NOT be emitted in v1.0.
[!NOTE] Producers emit the specification baseline version (1.0.0 for OSIRIS v1.0). Consumers and validators may accept forward-compatible PATCH updates within the same MAJOR.MINOR (e.g. 1.0.x) without requiring producer changes.
2.3.2 Top-level field rules
| Field | Requirement | Notes |
|---|---|---|
| version | MUST be "1.0.0" for v1.0 producers | SemVer string matching the specification version |
| metadata.timestamp | MUST be present | RFC 3339/ISO 8601 date-time with timezone (e.g. "2026-02-14T10:30:00Z") |
| metadata.generator.name | MUST be present | Stable producer name (e.g. "osirisjson-producer-azure") |
| metadata.generator.version | MUST be present | Producer SemVer string, not OSIRIS specification version |
| metadata.scope | SHOULD be present | Describes export boundaries (providers, regions, accounts, sites, environments) |
| topology.resources | MUST be present (array, may be empty) | Minimum: [] |
| topology.connections | SHOULD be present when relationships are known | Defaults to [] |
| topology.groups | SHOULD be present when groupings are known | Defaults to [] |
2.3.3 The provider.name = "custom" namespace requirement
When provider.name is "custom", the provider.namespace field becomes required by the schema. This is enforced at Level 1 (structural validation).
The namespace MUST follow the osiris.<identifier> pattern using reverse-domain notation.
{
"provider": {
"name": "custom",
"namespace": "osiris.com.acme",
"native_id": "asset-12345",
"site": "mxp"
}
}
Guidance:
- Use "custom" for on-premise equipment, internal systems, or any private resource. It MUST NOT be used for a device or platform from a well-known public vendor.
- Prefer well-known canonical provider names (aws, azure, gcp, cisco, arista, nokia, etc.) when applicable, as documented in the OSIRIS specification.
- Do not use "custom" as a catch-all when a standard provider name exists.
2.3.4 Properties vs. extensions (quick rule)
- Use properties for generic, broadly useful attributes that many producers could emit consistently.
- Use extensions for vendor/org-specific payloads, nested objects, or fields whose semantics are not standardized by OSIRIS.
- Extension keys MUST be namespaced (e.g. osiris.aws.*) and MUST NOT be used to store data that OSIRIS already models in standard fields.
2.4 Handling relationships (Physical vs. logical links)
Producers SHOULD emit explicit relationships whenever they are discoverable. OSIRIS provides two mechanisms: connections (graph edges) and groups (membership sets).
2.4.1 When to use connections
Connections represent relationships that consumers traverse as graph edges.
Use connections for:
- Network connectivity (physical links, logical paths, tunnels)
- Application dependencies (service-to-database, frontend-to-API)
- Data flows (producer-to-consumer, source-to-sink)
- Containment when traversal semantics are required (VM inside a host, disk attached to a server)
- Attachment (NIC-to-switch port, volume-to-VM)
Required fields: id, type, source, target.
Default direction: bidirectional when direction is omitted.
2.4.2 When to use groups
Groups represent classification, organization and boundaries without graph traversal semantics. Use groups for:
- Organizational boundaries (resource groups, projects, accounts, subscriptions)
- Physical boundaries (datacenter site, rack, floor, building)
- Logical boundaries (environment, availability zone, security zone, cost center)
- Hierarchical nesting (parent group > child group via children)
2.4.3 Avoiding duplication
Producers SHOULD NOT model the same relationship as both a contains connection and a group membership.
Choose one representation:
flowchart TD
Q["Does this relationship need
to be traversed as topology?"]
Q -- "Yes" --> CONN["Use a **connection**
(graph edge)"]
Q -- "No" --> GRP["Use a **group**
(membership set)"]
CONN --> CONN_EX["Network links, dependencies,
data flows, containment,
attachments"]
GRP --> GRP_EX["Inventory, reporting,
classification, organizational
and physical boundaries"]
2.4.4 Inferred relationships
When producers infer relationships (e.g. from naming conventions, subnet overlap, or LLDP data), they SHOULD mark inferred relationships using tags to distinguish them from explicitly discovered relationships:
{
"tags": {
"osiris.inferred": "true",
"osiris.inference_source": "lldp"
}
}
2.5 Data normalization (units, timestamps, casing standards)
Normalization ensures documents are comparable across producers and over time. Every producer SHOULD apply these conventions consistently.
2.5.1 Units
OSIRIS does not enforce specific units, but producers SHOULD follow consistent conventions to maximize interoperability.
| Measurement | Recommended unit | Property name convention | Example |
|---|---|---|---|
| Memory/RAM | Gigabytes | memory_gb | "memory_gb": 64 |
| Storage capacity | Gigabytes (or Terabytes for large volumes) | capacity_gb/capacity_tb | "capacity_tb": 1.92 |
| Network bandwidth | Gigabits per second | bandwidth_gbps | "bandwidth_gbps": 100 |
| CPU count | Virtual CPUs | vcpus | "vcpus": 8 |
| CPU frequency | Gigahertz | cpu_frequency_ghz | "cpu_frequency_ghz": 2.4 |
| Latency | Milliseconds | latency_ms | "latency_ms": 5 |
| Power | Watts | power_watts | "power_watts": 750 |
Producers MUST include the unit suffix in the property name. Bare names like "memory": 64 are ambiguous and SHOULD be avoided.
2.5.2 Timestamps
All timestamps MUST be RFC3339/ISO 8601 format with timezone. UTC is recommended.
| Good | Bad |
|---|---|
| 2026-02-14T10:30:00Z | 2026-02-14 10:30:00 (missing T separator and timezone) |
| 2026-02-14T11:30:00+01:00 | 1708000200 (Unix epoch, not RFC 3339) |
| - | Feb 14, 2026 (ambiguous locale-dependent format) |
Producers SHOULD normalize vendor timestamps to UTC when the source timezone is known.
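A minimal Go sketch of UTC normalization follows. The function name normalizeTimestamp is hypothetical, and the sketch assumes the vendor value is already RFC 3339 with an explicit offset; vendor-specific layouts would need their own parsing step first.

```go
package main

import (
	"fmt"
	"time"
)

// normalizeTimestamp parses an RFC 3339 timestamp and re-emits it in UTC.
// Inputs without a timezone, or in other layouts, are rejected rather than guessed.
func normalizeTimestamp(raw string) (string, error) {
	t, err := time.Parse(time.RFC3339, raw)
	if err != nil {
		return "", fmt.Errorf("timestamp %q is not RFC 3339 with timezone: %w", raw, err)
	}
	return t.UTC().Format(time.RFC3339), nil
}

func main() {
	out, err := normalizeTimestamp("2026-02-14T11:30:00+01:00")
	fmt.Println(out, err) // 2026-02-14T10:30:00Z <nil>
}
```

Rejecting ambiguous inputs is consistent with the rule that producers omit unknown values instead of guessing them.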
2.5.3 Casing standards
| Field | Casing rule | Example |
|---|---|---|
| provider.name | Lowercase, dots allowed | "aws", "cisco", "custom" |
| Resource/connection/group type | Lowercase, dot-separated segments | "compute.vm", "network.switch" |
| Extension namespace keys | osiris.<lowercase.segments> | "osiris.aws", "osiris.com.acme" |
| Property keys | snake_case (recommended) | "memory_gb", "serial_number" |
| Tag keys | snake_case or dotted.path | "env", "cost_center", "osiris.inferred" |
Producers MUST NOT emit uppercase characters in provider.name, type, or extension namespace keys. This is enforced by schema patterns at Level 1.
2.5.4 Vendor-native type preservation
While the OSIRIS type field is normalized to the standard taxonomy, the original vendor type string SHOULD be preserved in provider.type:
{
"type": "compute.vm",
"provider": {
"name": "aws",
"type": "AWS::EC2::Instance",
"native_id": "i-0abc123"
}
}
This enables round-trip correlation: consumers can identify the OSIRIS-normalized type (compute.vm) and the vendor’s original classification (AWS::EC2::Instance).
3 Security and redaction (deep dive)
OSIRIS documents describe infrastructure topology. They MUST be safe to share under controlled policies. Producers are the first and most critical line of defense against accidental secret disclosure.
[!NOTE] Back-reference: Normative security requirements are defined in the OSIRIS specification Chapter 13. Cross-cutting security constraints are summarized in OSIRIS-ADG-1.0 section 5.2.
3.1 Secret stripping (non-negotiable patterns)
OSIRIS documents MUST NOT contain credentials, secrets, or authentication material. Producers MUST scan document content for common credential patterns before emission and MUST refuse to emit fields known to contain secrets.
3.1.1 Prohibited content categories
- Passwords (plaintext, hashed, or encoded)
- API keys and access tokens
- SSH private keys or certificates (private material)
- Database connection strings with embedded credentials
- OAuth client secrets
- Cloud access keys/secret keys (e.g. AWS AKIA*, GCP service account JSON keys)
- Encryption keys or private certificates
- Bearer tokens, JWTs with secrets, session tokens
- PII that is not required for topology (employee names, phone numbers, email addresses in non-generator contexts)
3.1.2 Detection patterns
Producers MUST implement heuristic scanning before emission. The following patterns are non-exhaustive but represent the minimum coverage.
Key-name matching (case-insensitive):
Any property key containing: password, secret, token, credential, private_key, api_key, access_key, client_secret, auth.
Examples of value-pattern matching:
| Pattern | What it catches |
|---|---|
| AKIA[0-9A-Z]{16} | AWS access key IDs |
| [A-Za-z0-9/+=]{40} adjacent to AKIA | AWS secret access keys |
| ghp_[A-Za-z0-9]{36} | GitHub personal access tokens |
| gho_[A-Za-z0-9]{36} | GitHub OAuth tokens |
| -----BEGIN .* PRIVATE KEY----- | PEM-encoded private keys |
| Bearer [A-Za-z0-9\-._~+/]+=* | Bearer tokens |
| Basic [A-Za-z0-9+/]+=* | Base64-encoded basic auth |
| eyJ[A-Za-z0-9_-]*\.eyJ[A-Za-z0-9_-]*\. | JWT tokens |
| xox[boaprs]-[A-Za-z0-9-]+ | Slack tokens |
| Connection strings with ://user:pass@ | Embedded credentials in URIs |
3.1.3 Sanitization strategy for connection strings
Producers MUST decompose connection strings rather than emitting them whole.
Prohibited:
{ "connection_string": "postgresql://admin:<password>@db.prod.internal:5432/app" }
Required:
{
"endpoint": "db.prod.internal:5432",
"database": "app",
"protocol": "postgresql"
}
Credentials are omitted entirely, not replaced with placeholders.
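The decomposition can be sketched in Go with the standard net/url package. The Endpoint type and decompose function are hypothetical names, and the sketch assumes the connection string is a well-formed URI; key-value style strings (for example ADO.NET or libpq keyword formats) would need a different parser.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// Endpoint holds only the non-secret parts of a connection string.
type Endpoint struct {
	Endpoint string `json:"endpoint"`
	Database string `json:"database,omitempty"`
	Protocol string `json:"protocol"`
}

// decompose extracts protocol, host:port and database name.
// User information is never copied into the result.
func decompose(raw string) (Endpoint, error) {
	u, err := url.Parse(raw)
	if err != nil {
		// Do not wrap err: url.Parse errors quote the raw input,
		// which may contain the credentials.
		return Endpoint{}, errors.New("connection string is not a valid URI")
	}
	if u.Scheme == "" || u.Host == "" {
		return Endpoint{}, errors.New("connection string has no scheme or host")
	}
	return Endpoint{
		Endpoint: u.Host,
		Database: strings.TrimPrefix(u.Path, "/"),
		Protocol: u.Scheme,
	}, nil
}

func main() {
	e, err := decompose("postgresql://admin:not-a-real-password@db.prod.internal:5432/app")
	fmt.Printf("%+v %v\n", e, err)
}
```

Returning a fixed error message instead of the parser's error keeps the raw string, and therefore the password, out of logs.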
3.2 Filtering irrelevant vendor metadata
Not every field returned by a vendor API belongs in an OSIRIS document. Producers SHOULD apply data minimization (OSIRIS-JSON-v1.0 specification section 13.1.3) and emit only what serves the intended use case.
| Filter out | Keep |
|---|---|
| Internal vendor request/response metadata (request IDs, pagination tokens, HTTP headers, rate-limit counters) | Resource identity and classification (IDs, types, names, descriptions) |
| Volatile operational telemetry that changes every second (real-time CPU load, instantaneous packet counters, live session counts) | Provider attribution (native IDs, regions, accounts, zones, sites) |
| Redundant or derived fields that consumers can compute from existing data | Stable configuration properties (instance type, memory, CPU, firmware version) |
| Vendor marketing or billing metadata unrelated to topology (pricing tier names, promotional flags) | Network addressing when required for topology (IPs, MAC addresses, VLANs, subnets) |
| Debugging artifacts (stack traces, internal error codes from the vendor API) | Physical characteristics for on-premise resources (serial numbers, rack positions, hardware models) |
| - | Vendor-specific features that affect behavior, placed in extensions (e.g. ebs_optimized, fault_tolerance) |
3.2.1 Producer-configurable detail levels
Producers SHOULD support a configuration option to control emission detail (e.g. --detail minimal|detailed). This allows the same producer to serve documentation-focused exports (minimal) and audit-focused exports (detailed) as described in the OSIRIS-JSON-v1.0 specification section 13.1.3.
3.3 Safe failure behavior modes
When a producer detects potential secrets or encounters an ambiguous field, it MUST choose a safe failure mode.
3.3.1 Fail-closed (default for regulated environments)
The producer MUST halt emission for the affected resource or fail the entire export pipeline when credentials are detected. This is the safest posture and SHOULD be the default.
flowchart LR
SCAN["Scan field value"] --> DETECT{"Credential
pattern found?"}
DETECT -- "no" --> PASS["Continue
emission"]
DETECT -- "yes" --> LOG["Log error
(field path only,
**never** the value)"]
LOG --> MODE{"Producer
config?"}
MODE -- "fail-closed
(default)" --> ABORT["Skip resource
or abort run"]
MODE -- "log-and-redact
(opt-in)" --> REDACT["Replace value
with REDACTED"]
ABORT --> EXIT["Exit non-zero
(fail CI)"]
REDACT --> META["Set metadata.redacted: true
+ redaction_policy"]
3.3.2 Log-and-redact (opt-in for tolerant environments)
When explicitly configured, the producer MAY replace the detected value with a "REDACTED" placeholder and continue emission. This mode trades safety for completeness when operators accept the risk.
flowchart LR
DETECT["Credential
pattern detected"] --> REPLACE["Replace value
with REDACTED"]
REPLACE --> LOG["Log warning
(field path only,
**never** the value)"]
LOG --> META["Set metadata.redacted: true
+ metadata.redaction_policy"]
META --> CONTINUE["Continue
emission"]
Rules for both safe failure behavior modes:
- Producers MUST NOT log, print, or include the actual secret value in any output (logs, error messages, diagnostics, stack traces).
- Producers MUST NOT silently pass through detected secrets.
- The choice between fail-closed and log-and-redact MUST be an explicit producer configuration option, never implicit.
4 Quality assurance
Producers MUST be testable and their output MUST be reproducible. The OSIRIS ecosystem relies on canonical validation via @osirisjson/core; producers do not implement their own validation but MUST ensure their output passes it.
[!NOTE] Back-reference: The canonical truth rule (validation behavior is never re-implemented) is defined in OSIRIS-ADG-1.0 section 1.2.1. Validation levels and profiles are defined in OSIRIS-ADG-VL-1.0.
4.1 Golden files (Standardizing test fixtures)
A golden file is a known-good OSIRIS document that represents the expected output for a given input. Golden files are the primary regression defense for producers.
4.1.1 Structure
Each producer SHOULD maintain a testdata/ (or fixtures/) directory containing paired files:
testdata/
├── vendor_scenario_a/
│ ├── input.json # Mocked vendor API response or static source
│ ├── golden.json # Expected OSIRIS output
│ └── README.md # Scenario description and coverage notes
├── vendor_scenario_b/
│ ├── input.json
│ ├── golden.json
│ └── README.md
├── README.md # Test suite overview and run instructions
└── ...
4.1.2 Requirements for golden files
- Golden files MUST pass @osirisjson/core validation at the strict profile with zero errors.
- Golden files MUST be committed to version control and updated only through deliberate, reviewed changes.
- Golden files SHOULD include $schema for editor support.
- Golden files SHOULD NOT contain synthetic data that looks like real production infrastructure (real-looking IPs, real hostnames, real serial numbers). Use obviously fictional values documented in the OSIRIS specification examples (e.g. IPv4 address blocks reserved for documentation, such as 203.0.113.x, per RFC 5737).
- Golden files SHOULD cover edge cases: empty topologies, resources with minimal fields, resources with full properties and extensions, custom provider namespaces, multiple connection types, nested group hierarchies.
4.1.3 Golden file maintenance workflow
- Developer makes a mapping change in the producer.
- Run the producer against the mocked input.
- Diff the output against the golden file.
- If the diff is intentional: update the golden file, document the change reason in the commit message.
- If the diff is unintentional: investigate and fix the regression.
- CI validates all golden files against @osirisjson/core on every commit.
4.2 Regression testing against schema and @osirisjson/core
Producers MUST integrate canonical validation into their CI pipeline. This ensures that no mapping change silently produces invalid OSIRIS output.
4.2.1 CI validation pipeline
flowchart LR
BUILD["Producer
build"] --> RUN["Run against
test fixtures"]
RUN --> DIFF["Diff against
golden files"]
DIFF --> VALIDATE["Validate with
@osirisjson/core"]
VALIDATE --> GATE{errors?}
GATE -- "none" --> PASS["Pass"]
GATE -- "any" --> FAIL["Fail"]
Implementation:
Producers invoke the canonical TypeScript validator via the CLI. Producers MUST NOT embed @osirisjson/core as a library; they call it as an external tool.
# Validate a single golden file at the strict profile
npx @osirisjson/cli validate --profile strict testdata/vendor_scenario_a/golden.json
# Validate all golden files in a directory
npx @osirisjson/cli validate --profile strict testdata/**/golden.json
4.2.2 What to test
| Test category | What to verify | Failure means |
|---|---|---|
| Schema compliance | All golden files pass Level 1 (structural) | Producer emits malformed OSIRIS |
| Semantic integrity | All golden files pass Level 2 (referential integrity, uniqueness) | Broken references, duplicate IDs |
| Domain best practices | Golden files pass Level 3 with no errors under strict | Non-standard types, missing recommended fields |
| Determinism | Running the producer twice with the same input produces identical output (byte-for-byte after normalization) | Non-deterministic ID generation, unstable ordering |
| Redaction | Golden files contain no credential patterns (run secret scanner on output) | Secret leakage in test fixtures |
| Snapshot stability | Golden file diffs are empty when input has not changed | Unintended mapping drift |
4.2.3 Version alignment
- Producer CI SHOULD pin the @osirisjson/cli version and update deliberately.
- When @osirisjson/core introduces new validation rules (e.g. in a MINOR release), producers SHOULD update their golden files to address any new warnings before tagging a release.
- Producers SHOULD declare which OSIRIS specification MAJOR version they target in their documentation and package metadata.
4.2.4 Snapshot comparison strategy
Producers SHOULD normalize JSON output before comparison to avoid false-positive diffs from insignificant formatting changes:
- Sort top-level arrays (resources, connections, groups) by id.
- Use consistent 2-space indentation.
- Emit a trailing newline.
This produces clean, reviewable diffs when golden files change intentionally.
Go implementation note:
- Do not build output arrays by iterating over maps. Collect into slices and sort explicitly (e.g. by id) before emission.
- Always sort resources, connections, and groups deterministically (and any nested arrays with semantic meaning).