agentprivacy-threat-adversarial - SKILL.md Agent Skill

name: agentprivacy-threat-adversarial description: > Threat modelling and adversarial analysis for 0xagentprivacy. Activates when discussing R(t) adversarial bounds, attack surface analysis, red team methodologies, trusted setup attacks, side-channel vulnerabilities, the capability-indexed adversary (moving ceiling, attestation discounting, compositional leakage), or systematic approaches to finding privacy architecture weaknesses. V6 register note (2026-06-10): conjecture and version citations resolve to agentprivacy-docs/research/CONJECTURE_REGISTER_V6.md (head C89); model head: privacy_value_v6_formal_specification.md. license: Apache-2.0 metadata: version: "6.0" category: "role" origin: "0xagentprivacy" author: "Mitchell Travers" affiliation: "0xagentprivacy, BGIN, First Person Network" status: "working_paper" target_context: "Security auditors, red teams, academic reviewers, protocol analysts" equation_term: "R(t) · reconstruction resistance under adversarial conditions, with shelf life t*" template_references: "cipher, sentinel, ranger, architect, witness"

PVM V6 Skill · Threat Model & Adversarial Analysis

Source: Privacy Value Model V6 + Zero Knowledge Spellbook (Tale 26: Vulnerability Codex) Target context: Security auditors, red teamers, academic peer reviewers, protocol analysts, penetration testers Architecture: agentprivacy.ai · Sync: sync.soulbis.com · Contact: mage@agentprivacy.ai

What this is

Every security model has a boundary. This skill defines PVM V6's boundary: what the model defends against, what it explicitly does not, and the conditions under which the entire framework fails. A defence that does not know its own breaking conditions is not a defence. V6 adds the boundary the static model could not see: the adversary improves while your archive sits still.

The adversary

Capability: The adversary has complete output streams of both agents (Swordsman and Mage). It observes everything the agents transmit externally. It is computationally bounded — probabilistic polynomial-time (PPT).

Goal: Reconstruct the principal's private state X from the combined observations.

Result (Proven, conditional regime): R_max = (C_S + C_M)/H(X) < 1. Even with complete output observation, the adversary cannot reconstruct the full private state, provided the dual-agent conditional independence holds (I(S;M|π) ≤ ε with ε < 0.1). The regime is explicit at V6: Precondition 1 (non-collusion, no third channel carries the inter-agent residue) and Precondition 2 (capacities stated against a fixed adversary class). V6 form: R(t) = (C_S(t) + C_M(t))/H(X) with shelf life t* = sup{t : R(t) < 1} (C82, ~65%).

Information bound (Proven, inside Precondition 1): I(X; O_S, O_M) ≤ I(X; O_S) + I(X; O_M) + ε. Information from both agents combines additively, not multiplicatively, under near-independence. The adversary gains less from combining both streams than the sum of what each stream reveals individually.

The capability-indexed adversary (V6)

The PPT adversary above is a snapshot. V6 indexes the adversary by time and adds three faces the static taxonomy missed:

The moving-ceiling adversary (C82, ~65%). Its decoders improve against your fixed archives. C_S(t) + C_M(t) rise with frontier capability while H(X) stays fixed, so R(t) drifts upward and every static reconstruction guarantee carries a finite shelf life t*. This adversary never attacks today. It archives now and decodes later; the mechanism is the decoder, not the data.

The attestation-reader (C81 ~70%, C84 ~50%). It does not break your proofs; it reads them. A public feasibility attestation prices the search: I(feasibility; method) > 0 (C81), and the floor is hard, leakage-resilient ZK with λ < 1 is impossible (Garg-Jain-Sahai). Every public attestation therefore discounts the migration horizon, Z_b' = Z_b − D(a) (C84): deadlines tighten on attestation alone, independent of any actual attack.

The composition adversary (C83, ~55%). It exploits inter-agent channels. Policy-only separation compounds toward (2^N − 1)ε with chain depth; amnesia separation breaks the Markov chain and caps at Nε. The empirical anchor is AgentLeak's 68.8% figure: roughly two thirds of measured leakage in agent systems flows through inter-agent channels, exactly where policy-only separation compounds.

What the model defends against

Passive surveillance. Complete observation of all agent outputs. The reconstruction ceiling guarantees the adversary cannot close the gap.
Traffic analysis. Even if the adversary sees all traffic patterns, the separation between Swordsman and Mage means traffic from one agent does not reveal the other's state.
Temporal correlation. The adversary may attempt to correlate Swordsman and Mage outputs over time. The conditional independence requirement (ε < 0.1) bounds the information leaked through temporal patterns.
Network-level observation. Observing which agents communicate and when. The VRC system provides cover through bilateral channels.
The moving ceiling, via amnesia only. Amnesia separation is the one defense the capability-indexed adversary cannot erode: if Grade-2 forgetting is a non-vanishing gluing obstruction (C86, ~30%), amnesia is the only term whose security is independent of t. Every other guarantee in the model carries a shelf life; what was never written cannot be decoded later.

What the model does NOT defend against

Side-channel attacks. If the adversary can observe timing, power consumption, electromagnetic emissions, or other physical signals from the execution environment, the model provides no guarantees. TEE integrity is assumed, not proven.

Principal-adversary collusion. If the human principal cooperates with the adversary (voluntarily or under coercion), the separation guarantees are meaningless. The model protects sovereignty — it cannot enforce it against the sovereign's own choices.

Internal state access. If the adversary can read the internal memory of either agent (through malware, compromised hardware, or TEE breach), the model breaks entirely. The reconstruction ceiling assumes output-only observation.

Social engineering. The model is mathematical. It does not defend against the adversary convincing the human to reveal information through social means.

Compromised counterparties. If VRC counterparties are adversarial, the bilateral trust model weakens. The Sybil resistance from personhood binding mitigates but does not eliminate this threat.

Quantum adversaries. The PPT assumption does not hold against quantum computers. Post-quantum ZKP backends (ZK Spellbook Tale 28) are an active research area but not currently integrated.

The four breaking conditions

PVM V6 is explicitly falsifiable. Any of these demonstrated empirically requires fundamental revision:

Breaking condition 1 (retired at V6, replaced): three-axis multiplicative failure. The original condition here was UOR structural incompatibility; C4 (the 96 vs 64 discrepancy) is resolved at the register (holographic + algebraic), so that condition is closed. The live falsification frontier is C7 (~30%): if the multiplicative three-axis form fails at its boundary cases (partial collapse, axis correlation under composition, time dependence), the gating claim needs its register-bound corrections (additive-with-floor, min(), the det(Σ) form, or the differential form). Impact: the collapse-any-axis-collapse-everything story weakens to a softer degradation law.

Breaking condition 2: ε > 0.1 in practice. If real implementations cannot achieve conditional independence ε < 0.1 between Swordsman and Mage agents, the reconstruction ceiling no longer holds. The agents leak too much mutual information, and the adversary can triangulate. Impact: Dual-agent separation is a theoretical result without practical realisability. Alternative separation mechanisms would be needed.

Breaking condition 3: Sublinear network effects. If empirical measurement shows k < 1 in the network term (1 + Σ wᵢ nᵢ/N₀)^k, privacy networks face diminishing returns to scale. This would mean the sovereignty gap (17×–12,000×) is overstated and privacy networks hit natural scale ceilings. Impact: The economic thesis weakens. Privacy remains valuable but not exponentially so.

Breaking condition 4: Singular Σ matrices in practice. If real sovereignty architectures cluster near det(Σ) ≈ 0, the four-force separation is a theoretical ideal that real systems cannot approach. The sovereignty multiplier Φ(Σ) stays near zero regardless of individual force strength. Impact: The tetrahedral model is beautiful but impractical. A simpler scalar separation model may be sufficient.

The open conjectures (register-corrected)

Beyond the breaking conditions, the register (CONJECTURE_REGISTER_V6.md, head C89) supersedes any earlier list here. C3 (edge additivity) is challenged, replaced by the path integral; C4 (96 vs 64) and C5 (~3,000× ZKP reduction) are resolved. The live set a red team should track:

C1 (open). Golden ratio φ ≈ 1.618 as optimal protect/project balance. No proof that φ is the optimum rather than merely a good ratio.
C2 (open). Logarithmic growth of temporal memory. Could be sub- or super-logarithmic.
C7 (~30%). Three-axis separation is multiplicative. The V6 falsification frontier: break it at partial collapse, axis correlation, or time dependence.
C18 (~25%). Strange attractor dynamics on the sovereignty path (λ > 0). λ is unmeasured; a red team that measures it moves the model either way.
C82 (~65%). The moving ceiling: R(t) non-decreasing under capability growth; every static guarantee has a finite shelf life t*.
C83 (~55%). Compositional leakage amplification: (2^N − 1)ε versus Nε; the policy-to-amnesia gap is exponential-to-linear in N.
C84 (~50%). Existence-leak discount: every public feasibility attestation tightens the migration deadline, Z_b' = Z_b − D(a).
C86 (~30%). Obstruction-theoretic amnesia. Falsify it by exhibiting any view-composition that recovers a Grade-2-forgotten origin O.
C42 (~50%). Stake economics as Sybil resistance: open. Until resolved, presence signals are regime-1 color, not weight; attack any deployment that reads them as trust inputs.

Measurement gaps

Four quantities the model requires but cannot currently measure:

M1: Cross-force separations. No methodology exists for measuring σ_SR, σ_SC, σ_MR, σ_MC, σ_RC in the separation matrix Σ. Only σ_SM (Swordsman-Mage separation) has a clear operational definition.

M2: Empirical decay rate λ. Different privacy configurations decay at different rates. No calibration data exists.

M3: Network exponent k. Sub-, linear, or superlinear? Requires comparative measurement across networks of different sizes and sovereignty quality distributions.

M4: Edge weight f(e). How much more valuable are vertical transitions than lateral transitions? No empirical basis for the weighting.

The cheapest attacks

For each adversary capability level, the minimum-cost attack:

Passive observation only: Wait. The temporal decay term e^{-λt} erodes value automatically. If the agent does not continuously generate attested transitions, its sovereignty decays without any adversary action.

Frontier capability growth: Also wait, but for the decoder. Archive the output streams now; reconstruct later when C_S(t) + C_M(t) have risen past the old ceiling (C82). Costs nothing but storage. The only archive this attack can never open is the one amnesia never wrote (C86).

Network-level observation: Correlation analysis. Even under separation, temporal patterns may leak information. The cheapest version: observe when Swordsman and Mage are both active and correlate activity windows.

One compromised counterparty: Poison the VRC chain. Create a bilateral relationship where one party is adversarial, then use the VRC's bilateral proverb to extract context about the principal's private state.

TEE breach: Full compromise. If the execution environment is breached, all guarantees fail. The cheapest TEE breach depends on hardware — historically, speculative execution attacks have been the most accessible.

Red team protocol

When stress-testing a PVM V6 deployment:

Measure actual ε between Swordsman and Mage. If ε > 0.1, the deployment fails the fundamental requirement before any other testing.
Measure det(Σ) for the sovereignty matrix. If near zero, the four-force model is not providing value.
Simulate passive adversary with complete output streams. Measure actual reconstruction rate R. Compare to theoretical ceiling.
Test temporal correlation between agent outputs. Measure mutual information leaked through timing.
Attempt Sybil attack at each armor tier. Measure cost-to-forge vs. legitimate progression cost. If the deployment reads any presence signal (mana, 🪢) as a trust input, flag it: regime 1 forbids that, and C42 is still open.
Estimate t* for the deployment: re-run step 3 against successively stronger decoder classes and extrapolate R(t). Then audit every public feasibility attestation the deployment has made and compute the discount D(a) on its migration horizon (C84). A deployment that has attested loudly has less time than it thinks.

Verify: agentprivacy.ai · sync.soulbis.com · github.com/mitchuski/agentprivacy-docs