Reference Specification

Formal Protocol Specification

The immutable, deterministic contract for SIAS SEO structural scoring parity.

Specification Overview

I. Core Mathematical Engine

Implementations MUST apply the following weighted composite model to ensure binary-equivalent audit results:

$$S_{total} = \sum_{i \in \{D, C, F, V, SC\}} w_i \cdot s_i$$

Weights: $w_D = 0.25$, $w_C = 0.20$, $w_F = 0.30$, $w_V = 0.15$, $w_{SC} = 0.10$

- D-Score (Hierarchy): Fixed penalty of 0.35 if $H1 \neq 1$. Bonus 0.98 if $H2 \ge 2$.
- C-Score (Density): Logarithmic scaling with base $300$ word count. Floor at 0.1.
- V-Score (Security): Absolute requirement for TLS. Non-secure endpoints capped at 0.30.
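As a minimal sketch of the composite model, the following Python computes $S_{total}$ from five component scores using the weights listed in this section and applies the V-Score cap for non-secure endpoints. The function names, the use of the URL scheme as the TLS test, and the example component values are assumptions made for illustration; they are not defined by this specification.

```python
from urllib.parse import urlsplit

# Weights as listed in this section; iteration order is fixed (D, C, F, V, SC).
WEIGHTS = {"D": 0.25, "C": 0.20, "F": 0.30, "V": 0.15, "SC": 0.10}

V_CAP_INSECURE = 0.30  # cap for non-secure endpoints, per the V-Score rule


def cap_v_score(raw_v, url):
    """Cap the V-Score at 0.30 when the endpoint is not served over TLS.

    Assumption: "secure" is judged by the URL scheme alone.
    """
    if urlsplit(url).scheme.lower() != "https":
        return min(raw_v, V_CAP_INSECURE)
    return raw_v


def composite_score(scores):
    """Return S_total = sum of w_i * s_i over the five components."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    # Summing in the fixed WEIGHTS order keeps results reproducible
    # across runs of this sketch.
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical component scores, for illustration only.
example_scores = {
    "D": 1.0,
    "C": 0.8,
    "F": 0.9,
    "V": cap_v_score(0.9, "http://example.com/"),
    "SC": 0.5,
}
total = composite_score(example_scores)
```

Floating-point addition is order-sensitive, so an implementation aiming for binary-equivalent results would likely need to fix the summation order (as this sketch does) or use decimal arithmetic; which of these the reference implementation uses is not stated here.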

II. Canonical DOM Snapshot (CDS)

The Canonical DOM Snapshot ensures structural equivalence without heuristic drift:

1. Normalization: Unicode NFC → Case Folding → Whitespace Collapse.
2. Hierarchy Guard: Sequential order MUST be verified (H1 > H2 > H3).
3. Density Base: Minimum 300 tokens required for full saturation.
4. Asset Signals: Mandatory verification of favicon.ico and lang attributes.
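
The normalization order and the hierarchy guard above can be sketched in Python as follows. The function names are hypothetical, trimming leading and trailing whitespace is an added assumption, and the guard encodes one possible reading of "sequential order": the first heading must be an H1, and a heading may go at most one level deeper than the heading before it.

```python
import re
import unicodedata


def normalize_text(text):
    """Apply the three normalization steps in the order the list gives."""
    text = unicodedata.normalize("NFC", text)  # 1. Unicode NFC
    text = text.casefold()                     # 2. case folding
    text = re.sub(r"\s+", " ", text)           # 3. whitespace collapse
    return text.strip()                        # assumption: trim the ends


def heading_levels_sequential(levels):
    """Return True if no heading skips a level on the way down.

    `levels` is the document-order list of heading levels,
    e.g. [1, 2, 3, 2] for H1, H2, H3, H2.
    """
    previous = 0
    for level in levels:
        if level > previous + 1:  # e.g. H1 followed directly by H3
            return False
        previous = level
    return True
```

For example, `heading_levels_sequential([1, 2, 3, 2, 3])` passes, while `[1, 3]` fails because it jumps from H1 to H3.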

III. URI Integrity & Authority (F-Score)

- Canonicalization: Mandatory rel="canonical" audit.
- Schema Extraction: Support for application/ld+json with nested @graph resolution.
- Social Signals: Audit for OpenGraph Title, Description and Image parity.
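
One way to collect the three signals above is a single pass over the document with Python's standard-library `html.parser`. This is a sketch under stated assumptions, not the reference extractor: the class and function names are hypothetical, malformed JSON-LD is silently skipped, a node that carries `@graph` is replaced by its children, and the Open Graph check only reports missing `og:title`, `og:description`, and `og:image` tags.

```python
import json
from html.parser import HTMLParser

REQUIRED_OG = ("og:title", "og:description", "og:image")


def flatten_graph(doc):
    """Return a flat list of JSON-LD nodes, recursing into nested @graph."""
    nodes = []
    items = doc if isinstance(doc, list) else [doc]
    for item in items:
        if not isinstance(item, dict):
            continue
        if "@graph" in item:
            nodes.extend(flatten_graph(item["@graph"]))
        else:
            nodes.append(item)
    return nodes


class HeadSignals(HTMLParser):
    """Collects rel="canonical", og:* meta tags, and JSON-LD nodes."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.open_graph = {}
        self.json_ld_nodes = []
        self._in_json_ld = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rel_tokens = (attrs.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_tokens:
            self.canonical = attrs.get("href")
        elif tag == "meta" and (attrs.get("property") or "").startswith("og:"):
            self.open_graph[attrs["property"]] = attrs.get("content")
        elif tag == "script" and (attrs.get("type") or "").lower() == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self._in_json_ld = False
            try:
                parsed = json.loads("".join(self._buffer))
            except json.JSONDecodeError:
                return  # assumption: skip malformed JSON-LD blocks
            self.json_ld_nodes.extend(flatten_graph(parsed))


def missing_open_graph(open_graph):
    """List the required Open Graph properties that are absent or empty."""
    return [name for name in REQUIRED_OG if not open_graph.get(name)]
```

After `parser = HeadSignals()` and `parser.feed(html_text)`, a `parser.canonical` of `None` would correspond to the missing-canonical condition, and `missing_open_graph(parser.open_graph)` lists any absent social tags.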

IV. Status & Error Registry

| Code | Constant | Diagnostic Description |
|------|----------|------------------------|
| 0x00 | STATUS_OK | Zero-delta success. Structural integrity verified. |
| 0x01 | ERR_SEC_TLS | Insecure endpoint. V-Score capped. |
| 0x02 | ERR_MATH_DOMAIN | Word count underflow or illegal log argument. |
| 0x03 | ERR_DOM_HIERARCHY | H1 count violation or non-sequential jumps. |
| 0x04 | ERR_CANON_MISSING | Lack of rel="canonical" in authoritative documents. |
| 0x05 | ERR_HASH_FAIL | Integrity check failure against Master Core. |
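
The registry maps naturally onto an integer enumeration. The sketch below mirrors the codes and constants listed in this section; the class name `Status` and the helper `describe` are hypothetical and not part of the specification.

```python
from enum import IntEnum


class Status(IntEnum):
    """Status and error codes, copied from the registry."""
    STATUS_OK = 0x00
    ERR_SEC_TLS = 0x01
    ERR_MATH_DOMAIN = 0x02
    ERR_DOM_HIERARCHY = 0x03
    ERR_CANON_MISSING = 0x04
    ERR_HASH_FAIL = 0x05


def describe(code):
    """Format a numeric code as '0xNN CONSTANT'.

    Raises ValueError for codes outside the registry.
    """
    status = Status(code)
    return f"0x{int(status):02X} {status.name}"
```

Using `IntEnum` keeps the constants comparable to raw integers, so a value read from a serialized audit result can be compared directly, as in `code == Status.STATUS_OK`.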