▎AI & Multi-Agent

Model Extraction

Attack that recreates or approximates a model by querying it and observing outputs.

Definition

Model Extraction is attack that recreates or approximates a model by querying it and observing outputs. In defense applications, it can steal costly autonomy, targeting, or perception capabilities from exposed services. The hard part is query camouflage and partial reconstruction of sensitive behavior, especially when systems are deployed across contested links, coalition boundaries, and mixed human-machine teams. KhanBMS treats it as a threat mitigated by rate limits, watermarking, and confidential execution in KhanBMS, tying the concept back to modular command, edge execution, and auditable authority.

Reference attributes

Layer: model theft attack
Operational value: Can steal costly autonomy, targeting, or perception capabilities from exposed services
Primary risk: Query camouflage and partial reconstruction of sensitive behavior
KhanBMS role: A threat mitigated by rate limits, watermarking, and confidential execution in KhanBMS

Related terms

#security#ip#model