Skip to content

[Phase 0.2.3] Create adversarial test framework#37

Open
richard-devbot wants to merge 4 commits intoCursorTouch:mainfrom
richard-devbot:richardson/phase0-adversarial-tests
Open

[Phase 0.2.3] Create adversarial test framework#37
richard-devbot wants to merge 4 commits intoCursorTouch:mainfrom
richard-devbot:richardson/phase0-adversarial-tests

Conversation

@richard-devbot
Copy link
Copy Markdown

Closes #9

What was implemented

tests/adversarial/ directory structure

  • __init__.py — package marker
  • conftest.py — three pytest fixtures:
    • injection_payloads — parametrized fixture loading all 55 prompt injection patterns from prompt_injection.yaml (one test invocation per pattern, keyed by ID)
    • mock_llm_with_injection — builds a MagicMock LLM client whose .complete() / .acomplete() return the current injection payload as response content, simulating a compromised or adversarially-controlled model
    • attack_scenario — parametrized fixture over 5 multi-step attack chains (roleplay escalation, tool-chain exfiltration, indirect web injection, authority escalation, context poisoning via memory)

Payload library (tests/adversarial/payloads/)

File Patterns Categories
prompt_injection.yaml 55 instruction override, data exfiltration, jailbreak, prompt leakage, context poisoning, obfuscation, multi-turn, tool abuse, social engineering, nested injection
indirect_injection.yaml 33 web content, document (PDF/CSV/JSON/YAML), email, API response, database, code repository, calendar, search results, media metadata, cross-context
resource_exhaustion.yaml 28 context flood, recursive prompts, tool call abuse, memory exhaustion, computation abuse, malformed input, rate/session abuse

Test file (test_adversarial.py)

  • TestPromptInjectionPayloads — 5 parametrized test methods run against every injection payload (schema validation, null byte stripping, INST delimiter removal, <system> tag stripping, mock LLM response sanitization)
  • TestAttackScenarios — 5 test methods validate each attack scenario's schema and ordering invariants
  • 5 @given hypothesis tests: sanitizer never crashes on arbitrary text, sanitizer is idempotent, handles control/surrogate characters, response safety check handles any dict, injection delimiters stripped in all contexts

Dev dependencies added

  • hypothesis>=6.100.0
  • pyyaml>=6.0.0

Test count

With 55 injection payloads × 5 test methods + 5 scenarios × 5 test methods + 5 hypothesis tests = 305+ test invocations from this framework.

Richardson Gunde and others added 4 commits April 13, 2026 12:19
…brary

Creates tests/adversarial/ with conftest fixtures (injection_payloads,
mock_llm_with_injection, attack_scenario), 55+ prompt injection patterns,
33 indirect injection patterns, 28 resource exhaustion patterns across three
YAML payload files, and property-based fuzz tests via hypothesis. Adds
hypothesis and pyyaml to dev dependencies.

Closes CursorTouch#9

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
- Remove unused local variable `mock_session` (F841) in test_mcp_manager.py
- Prefix unused `tool_names` with underscore (F841) in test_mcp_manager.py
- Add TYPE_CHECKING guard for MCPManager import in cli/start.py (F821)
- Apply ruff --fix and ruff format across all files (12 auto-fixed, 212 reformatted)

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…[ci]

- BrowserPlugin.SYSTEM_PROMPT: add <perception>, <tool_use>, <execution_principles> sections
- BrowserPlugin.register_hooks: actually register _state_hook on BEFORE_LLM_CALL when enabled
- BrowserPlugin.unregister_hooks: unregister _state_hook from BEFORE_LLM_CALL
- BrowserPlugin.unregister_tools: call unset_extension for "browser" and "_browser"
- BrowserPlugin.enable/disable: wire hook register/unregister through lifecycle
- ComputerPlugin.SYSTEM_PROMPT: add <perception>, <tool_use>, <execution_principles> sections
- ComputerPlugin.register_hooks: register _state_hook + _wait_for_ui_hook when enabled
- ComputerPlugin.unregister_hooks: unregister both hooks
- ComputerPlugin.enable/disable: wire hook register/unregister through lifecycle
- control_center: pass kwargs._graceful_restart_fn through to _do_restart(graceful_fn=...)
- ToolRegistry.get: also check _extensions so registry.get("browser") finds the browser instance
- ruff format: reformat entire codebase to resolve style violations

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Phase 0.2.3] Create adversarial test framework

1 participant