Add integration tests for security analyzer #8774

Draft · wants to merge 3 commits into main

Conversation

@rbren (Collaborator) commented May 28, 2025

  • This change is worth documenting at https://docs.all-hands.dev/
  • Include this change in the Release Notes. If checked, you must provide an end-user friendly description for your change below

End-user friendly description of the problem this fixes or functionality this introduces.

N/A - This is a test improvement only.


Summarize what the PR does, explaining any non-trivial design decisions.

This PR adds integration tests for the security analyzer component to ensure it properly integrates with the event stream and correctly handles actions with different security risk levels. The tests verify:

  1. High-risk actions are properly identified and blocked
  2. Low-risk actions are allowed to proceed
  3. Medium-risk actions require confirmation when confirmation mode is enabled
  4. User rejection of actions works correctly
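
To make the shape of these checks concrete, here is a minimal, self-contained sketch of the logic they exercise. The names (`Risk`, `StubAnalyzer`, `gate_action`) are illustrative stand-ins for this description only, not the actual OpenHands classes or the test code in this PR:

```python
from enum import Enum


class Risk(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


class StubAnalyzer:
    """Stand-in analyzer that returns a fixed risk for every action."""

    def __init__(self, risk: Risk):
        self.risk = risk

    def security_risk(self, action: str) -> Risk:
        return self.risk


def gate_action(action: str, analyzer, confirmation_mode: bool,
                user_confirms: bool = True) -> bool:
    """Return True if the action may reach the runtime.

    HIGH risk is always blocked, MEDIUM risk needs user confirmation when
    confirmation mode is on, LOW risk always proceeds.
    """
    risk = analyzer.security_risk(action)
    if risk is Risk.HIGH:
        return False
    if risk is Risk.MEDIUM and confirmation_mode:
        return user_confirms
    return True


def test_high_risk_is_blocked():
    assert not gate_action('rm -rf /', StubAnalyzer(Risk.HIGH), confirmation_mode=True)


def test_low_risk_is_allowed():
    assert gate_action('ls', StubAnalyzer(Risk.LOW), confirmation_mode=True)


def test_medium_risk_requires_confirmation():
    analyzer = StubAnalyzer(Risk.MEDIUM)
    assert gate_action('pip install foo', analyzer, confirmation_mode=True, user_confirms=True)


def test_user_rejection_is_respected():
    analyzer = StubAnalyzer(Risk.MEDIUM)
    assert not gate_action('pip install foo', analyzer, confirmation_mode=True, user_confirms=False)
```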

Additionally, this PR adds three new security analyzers:

  1. PushoverSecurityAnalyzer: A simple analyzer that allows all actions by marking them as low risk
  2. BullySecurityAnalyzer: A simple analyzer that blocks all actions by marking them as high risk
  3. LLMSecurityAnalyzer: An analyzer that uses an LLM to evaluate actions for security risks

These new analyzers provide different security evaluation strategies:

  • The Pushover and Bully analyzers provide simple, predictable behavior for testing and can be useful in development environments where you want to either bypass security checks or enforce strict security policies.
  • The LLM-based analyzer provides a more sophisticated approach by leveraging language models to evaluate the potential risks of actions based on their content.
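
A rough sketch of how the two trivial analyzers and the LLM-backed one might be shaped, assuming a common base class with a single `security_risk` method. The class names match the PR description, but the bodies below are illustrative guesses, and `llm.complete` is a hypothetical client method rather than a specific OpenHands API:

```python
# Risk is the toy enum from the test sketch above.

class SecurityAnalyzer:
    """Illustrative base interface: map an action to a risk level."""

    def security_risk(self, action: str) -> Risk:
        raise NotImplementedError


class PushoverSecurityAnalyzer(SecurityAnalyzer):
    """Marks every action LOW risk, so everything is allowed through."""

    def security_risk(self, action: str) -> Risk:
        return Risk.LOW


class BullySecurityAnalyzer(SecurityAnalyzer):
    """Marks every action HIGH risk, so everything is blocked."""

    def security_risk(self, action: str) -> Risk:
        return Risk.HIGH


class LLMSecurityAnalyzer(SecurityAnalyzer):
    """Asks an injected LLM client to rate the action's risk."""

    def __init__(self, llm):
        self.llm = llm  # hypothetical client with a complete(prompt) -> str method

    def security_risk(self, action: str) -> Risk:
        verdict = self.llm.complete(
            'Rate the security risk of this action as LOW, MEDIUM, or HIGH:\n'
            f'{action}'
        )
        return Risk[verdict.strip().upper()]
```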

These tests help ensure that no action gets passed to the runtime until the security analyzer has evaluated it and determined it is safe to run.
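
Putting the two sketches above together, the property being asserted is that the runtime's execute path is only reachable after the analyzer has returned a verdict. For example, again reusing the illustrative names, with a toy runtime that just records what it ran:

```python
class RecordingRuntime:
    """Toy runtime that records every action it is asked to execute."""

    def __init__(self):
        self.executed = []

    def execute(self, action: str) -> None:
        self.executed.append(action)


runtime = RecordingRuntime()
for action, analyzer in [('rm -rf /', BullySecurityAnalyzer()),
                         ('ls', PushoverSecurityAnalyzer())]:
    # The analyzer scores the action before the runtime is ever reached.
    if gate_action(action, analyzer, confirmation_mode=True):
        runtime.execute(action)

assert runtime.executed == ['ls']  # only the low-risk action got through
```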


Link to any specific issues this addresses:

N/A


To run this PR locally, use the following command:

```
docker run -it --rm \
  -p 3000:3000 \
  -v /var/run/docker.sock:/var/run/docker.sock \
  --add-host host.docker.internal:host-gateway \
  -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:3a85a35-nikolaik \
  --name openhands-app-3a85a35 \
  docker.all-hands.dev/all-hands-ai/openhands:3a85a35
```
