A scalable, resilient anti-virus designed for cloud workloads, based on ClamAV.
Currently supported clouds:
- AWS
Start a ClamAV instance using docker
docker run -d -p 3310:3310 -e CLAMD_CONF_MaxFileSize=2000M -e CLAMD_CONF_StreamMaxLength=2000M -e CLAMD_CONF_MaxScanSize=2000M mk0x/docker-clamav
Configure Mousetrap
general:
  port: 3000
  scanTimeout: 3600
  markStaleAfter: 4000
  pollingInterval: 5
  # maxScanAttempts: -1 # not yet implemented
clamd:
  host: localhost
  port: 3310
storage: s3
dynamodb:
  tableName: mousetrap-tasks
  region: us-east-1
sqs:
  url: "https://sqs.us-east-1.amazonaws.com/123456789/mousetrap-tasks"
  pollingInterval: 20
  visibilityTimeout: 30
  region: us-east-1
Start Mousetrap
npm run prod
Scan a file in a bucket using either SQS
aws sqs send-message --queue-url https://sqs.us-east-1.amazonaws.com/123456789/mousetrap-tasks --message-body '{"filePath": "s3://bucket/file.csv"}'
or the REST API
curl --location --request POST 'localhost:3000/api/tasks' \
--header 'Content-Type: application/json' \
--data-raw '{
"filePath": "s3://bucket/file.csv"
}'
You can submit files for scanning either through an SQS queue or through the REST API (both documented below).
At any point you can query the REST API for the status of a specific file (documented below). Its status will be one of:
- PENDING: the file has not been picked up by a worker yet and has not been scanned
- SCANNING: a worker is currently scanning the file
- FINISHED: a worker has finished scanning the file
- FAILED: the file could not be scanned for any reason
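The four states above map naturally onto an enumeration. The following is a minimal sketch for client code, not part of Mousetrap itself: `ScanState` and `is_terminal` are hypothetical names, and treating FINISHED and FAILED as the states after which no further change is expected is an assumption.

```python
from enum import Enum


class ScanState(str, Enum):
    """Task states as listed in this README."""
    PENDING = "PENDING"
    SCANNING = "SCANNING"
    FINISHED = "FINISHED"
    FAILED = "FAILED"


def is_terminal(state: str) -> bool:
    """Return True once a task is done, successfully or not (assumption)."""
    # ScanState(state) raises ValueError for a string that is not a known state.
    return ScanState(state) in (ScanState.FINISHED, ScanState.FAILED)
```

A client that polls for status can stop as soon as `is_terminal` returns True.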
When a file has been scanned a few things happen:
- The file is marked as FINISHED in the database, and can be queried through the REST API
- Two tags, MOUSETRAP_RESULT and MOUSETRAP_TS, are added to the file in the bucket
- (optionally) An SNS notification is sent to a channel
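The tags can be read back from the object itself. The sketch below assumes you already have the object's tag set in the shape S3 returns for object tagging (for example the `TagSet` list from boto3's `get_object_tagging`). `mousetrap_tags` is a hypothetical helper, and the tag values in the example are assumed rather than taken from a real scan.

```python
from typing import Dict, List, Optional


def mousetrap_tags(tag_set: List[Dict[str, str]]) -> Optional[Dict[str, str]]:
    """Extract the two Mousetrap tags from an S3 TagSet.

    `tag_set` looks like [{"Key": "...", "Value": "..."}, ...].
    Returns None when the object carries no Mousetrap tags yet.
    """
    tags = {t["Key"]: t["Value"] for t in tag_set}
    if "MOUSETRAP_RESULT" not in tags or "MOUSETRAP_TS" not in tags:
        return None
    return {"result": tags["MOUSETRAP_RESULT"], "timestamp": tags["MOUSETRAP_TS"]}


# Example TagSet; the values are illustrative assumptions.
example = [
    {"Key": "owner", "Value": "reports"},
    {"Key": "MOUSETRAP_RESULT", "Value": "CLEAN"},
    {"Key": "MOUSETRAP_TS", "Value": "1594983711402"},
]
print(mousetrap_tags(example))
```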
These are the options that are currently supported:
general:
  port: 3000
  scanTimeout: 3600
  markStaleAfter: 4000
  pollingInterval: 5
  # maxScanAttempts: 5 # not yet implemented
clamd:
  host: localhost
  port: 3310
storage: s3 # supports only s3 currently
dynamodb:
  tableName: mousetrap-scan-tasks
  region: us-east-1
sqs:
  url: "https://sqs.us-east-1.amazonaws.com/123456789/mousetrap-tasks"
  pollingInterval: 20
  visibilityTimeout: 30
  region: us-east-1
sns:
  topicArn: "arn:aws:sns:us-east-1:123456789:mousetrap-notifications"
  region: us-east-1
You can configure Mousetrap to send a notification for every scan result.
The topic you specify in the configuration will receive a notification for every scan performed. You can also specify a topic per task, in the payload you send when submitting a file to be scanned:
Add a notifyChannels key to the payload:
{ "filePath":"s3://bucket/file.csv", "notifyChannels": ["arn:aws:sns:us-east-1:123456789:mousetrap-notifications"] }
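If you build task payloads programmatically, a small helper keeps the body consistent between the SQS and REST paths. This is a sketch only: `build_task_payload` is a hypothetical name, and rejecting paths that do not start with `s3://` is an assumption based on s3 being the only storage listed.

```python
import json
from typing import Iterable, Optional


def build_task_payload(file_path: str,
                       notify_channels: Optional[Iterable[str]] = None) -> str:
    """Serialize a scan task body with an optional notifyChannels list."""
    if not file_path.startswith("s3://"):
        # Assumption: only s3:// paths are accepted, since s3 is the only storage listed.
        raise ValueError("filePath must look like s3://<bucketName>/<pathToFile>")
    payload = {"filePath": file_path}
    if notify_channels:
        # Only add the key when at least one channel was given.
        payload["notifyChannels"] = list(notify_channels)
    return json.dumps(payload)
```

The returned string can be used as the SQS message body or as the body of a POST to `/api/tasks`.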
The message payload is a JSON document that looks like this:
{
"filePath":"s3://mousetrap-files/report.csv",
"scanResult":"CLEAN",
"viruses":[],
"timestamp":1594983711402
}
or like so in case of an error:
{
"filePath":"s3://mousetrap-files/report.csv",
"error": {
"code": "FILE_NOT_EXIST",
"message": "file does not exist in specified location"
},
"timestamp":1594983711402
}
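A subscriber has to tell the two shapes above apart. The sketch below shows one way to parse either into a single structure. `ScanNotification` and `parse_notification` are hypothetical names, and reading `timestamp` as milliseconds since the Unix epoch is an assumption based on the magnitude of the sample value.

```python
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class ScanNotification:
    file_path: str
    timestamp: datetime
    scan_result: Optional[str] = None
    viruses: List[str] = field(default_factory=list)
    error_code: Optional[str] = None
    error_message: Optional[str] = None

    @property
    def ok(self) -> bool:
        """True when the message carries a scan result rather than an error."""
        return self.error_code is None


def parse_notification(message: str) -> ScanNotification:
    doc = json.loads(message)
    error = doc.get("error") or {}
    return ScanNotification(
        file_path=doc["filePath"],
        # Assumption: the timestamp is milliseconds since the Unix epoch.
        timestamp=datetime.fromtimestamp(doc["timestamp"] / 1000, tz=timezone.utc),
        scan_result=doc.get("scanResult"),
        viruses=doc.get("viruses", []),
        error_code=error.get("code"),
        error_message=error.get("message"),
    )
```

Checking `ok` first lets a consumer route errors such as `FILE_NOT_EXIST` separately from clean or infected results.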
While you can submit a file for scanning through the REST API, it is highly recommended that you use a queue instead, for resiliency.
The expected message body is exactly the same as the REST API body, i.e.:
{ "filePath": "s3://<bucketName>/<pathToFile>" }
Returns a current snapshot of all pending & scanning tasks, as well as which task this particular worker is scanning at the time of querying.
-
URL:
/api/tasks
-
Method:
GET
-
Success Response:
currentTask specifies the task this particular worker is scanning. Should you query a worker that is idle, currentTask will be null.
- Code: 200
Content:{ "status": "success", "data": { "currentTask": [ { "scanResult": "PENDING", "viruses": [], "scanEndTs": -1, "sizeMb": 1073.07328414917, "scanAttempts": 0, "scanStartTs": 1594936370438, "createdTs": 1594928147834, "scanState": "SCANNING", "filePath": "s3://bucket/file.csv", "fileHash": "6b378f6bb00613a4b8192cfb3d805d9d-68" } ], "scanning": [ { "scanResult": "PENDING", "viruses": [], "scanEndTs": -1, "sizeMb": 1073.07328414917, "scanAttempts": 0, "scanStartTs": 1594936370438, "createdTs": 1594928147834, "scanState": "SCANNING", "filePath": "s3://bucket/file.csv", "fileHash": "6b378f6bb00613a4b8192cfb3d805d9d-68" } ], "pending": [] } }
-
Sample Call:
curl --location --request GET 'localhost:3000/api/tasks'
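For monitoring, the snapshot can be reduced to a few numbers. The sketch below works on the already-decoded JSON response. `summarize_snapshot` is a hypothetical helper; because the sample response shows `currentTask` as a list while an idle worker is described as returning null, it accepts both.

```python
from typing import Any, Dict, List


def summarize_snapshot(response: Dict[str, Any]) -> Dict[str, Any]:
    """Condense a GET /api/tasks response into counts and current file paths."""
    data = response["data"]
    current = data.get("currentTask")
    # The sample shows a list; an idle worker reports null. Handle both shapes.
    if current is None:
        current_paths: List[str] = []
    elif isinstance(current, list):
        current_paths = [task["filePath"] for task in current]
    else:
        current_paths = [current["filePath"]]
    return {
        "worker_idle": not current_paths,
        "current": current_paths,
        "scanning": len(data.get("scanning", [])),
        "pending": len(data.get("pending", [])),
    }
```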
-
URL:
/api/tasks/:filePath
-
Method:
GET
-
Success Response:
- Code: 200
Content:{ "status": "success", "data": { "task": { "scanResult": "INFECTED", "viruses": [ "Win.Test.EICAR_HDB-1" ], "scanEndTs": 1594924786399, "sizeMb": 0.000064849853515625, "scanAttempts": 1, "scanStartTs": 1594924784450, "createdTs": 1594927820899, "scanState": "FINISHED", "filePath": "s3://bucket/eicar.com.txt", "fileHash": "44d88612fea8a8f36de82e1278abb02f" } } }
-
Error Response:
- Code: 404 NOT FOUND
-
Sample Call:
curl --location --request GET 'localhost:3000/api/tasks/s3://bucket/eicar.com.txt'
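A common use of this endpoint is to wait until a file has been scanned. The sketch below polls until the task leaves the PENDING and SCANNING states. `fetch_task` and `wait_for_scan` are hypothetical names, the polling interval and limit are arbitrary defaults, and it is an assumption that a failed task reports FAILED in `scanState`. The file path is appended unencoded, mirroring the sample call above.

```python
import json
import time
import urllib.request
from typing import Any, Callable, Dict


def fetch_task(base_url: str, file_path: str) -> Dict[str, Any]:
    """GET /api/tasks/:filePath and return the `task` object.

    urlopen raises urllib.error.HTTPError on a 404 (unknown file).
    """
    with urllib.request.urlopen(f"{base_url}/api/tasks/{file_path}") as resp:
        return json.load(resp)["data"]["task"]


def wait_for_scan(file_path: str,
                  fetch: Callable[[str], Dict[str, Any]],
                  interval_s: float = 5.0,
                  max_polls: int = 60) -> Dict[str, Any]:
    """Poll `fetch` until the task is FINISHED or FAILED, then return it."""
    for _ in range(max_polls):
        task = fetch(file_path)
        if task["scanState"] in ("FINISHED", "FAILED"):
            return task
        time.sleep(interval_s)
    raise TimeoutError(f"{file_path} still not scanned after {max_polls} polls")
```

Against a local worker this could be called as `wait_for_scan("s3://bucket/eicar.com.txt", lambda p: fetch_task("http://localhost:3000", p))`. Passing the fetch function in keeps the loop testable without a running server.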
-
URL:
/api/tasks
-
Method:
POST
-
Data Params
Required:
filePath
Optional:
notifyChannels
{ "filePath": "s3://<bucketName>/<pathToFile>", "notifyChannels": [ "arn:aws:sns:us-east-1:123456789:mousetrap-notifications" ] }
Each notifyChannels entry can be any valid notifications provider.
-
Success Response:
- Code: 200
-
Error Response:
- When body is malformed
Code: 400
Content:{ "status": "fail", "data": { "message": "no 'filePath' in body", "requestId": "0bfbd02e-84a7-4d5e-a887-3f42cd059a34" } }
OR
- When file does not exist in location
Code: 422
Content:{ "status": "fail", "data": { "message": "file does not exist in specified location", "code": "FILE_NOT_EXIST", "requestId": "bc79b594-23ea-4e8b-a809-0f05d305b18c" } }
-
Sample Call:
curl --location --request POST 'localhost:3000/api/tasks' \
--header 'Content-Type: application/json' \
--data-raw '{
"filePath": "s3://bucket/file.csv"
}'
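A caller of this endpoint will usually want to distinguish the documented outcomes. The sketch below maps a status code and response body to a short explanation. `explain_post_result` is a hypothetical helper, and the fallback wording for undocumented status codes is an assumption.

```python
import json
from typing import Tuple


def explain_post_result(status_code: int, body: str = "") -> Tuple[bool, str]:
    """Return (accepted, human-readable reason) for a POST /api/tasks reply."""
    if status_code == 200:
        return True, "task accepted"
    try:
        data = json.loads(body).get("data", {})
    except (ValueError, AttributeError):
        # Body was empty, not JSON, or not a JSON object.
        data = {}
    message = data.get("message", "no message in response")
    request_id = data.get("requestId", "unknown")
    if status_code == 400:
        return False, f"malformed body: {message} (requestId {request_id})"
    if status_code == 422:
        code = data.get("code", "unknown")
        return False, f"{code}: {message} (requestId {request_id})"
    return False, f"unexpected status {status_code}: {message}"
```

Including the `requestId` in the explanation makes it easier to correlate a failed request with the worker's logs.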
Bear in mind that your infrastructure may differ from ours at Totango, but these figures should give you a rough estimate of the performance you can expect:
10mb: 1031ms
100mb: 8840ms
1000mb: 56995ms - 94773ms # large files have seen the most variation
# I'll update this with an average once I have a larger sample of files
These measurements were taken in us-east-1, with Mousetrap and ClamAV running in a Kubernetes cluster, in the same pod on the same node.
Mousetrap had 0.5 of a core and 800mb of memory. ClamAV had 1 core and 2300mb of memory.
I don't know whether this affects performance, but ClamAV ran with these env vars:
CLAMD_CONF_MaxFileSize=2000M
CLAMD_CONF_StreamMaxLength=2000M
CLAMD_CONF_MaxScanSize=2000M
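To compare the timings above across file sizes, it helps to convert them to throughput. The short calculation below uses only the figures listed in this section. `throughput_mb_per_s` is a hypothetical helper, and the 1000mb file is entered twice, once with its fastest and once with its slowest reported time.

```python
# (label, size in MB, scan time in ms) taken from the figures above.
measurements = [
    ("10mb", 10, 1031),
    ("100mb", 100, 8840),
    ("1000mb (fastest)", 1000, 56995),
    ("1000mb (slowest)", 1000, 94773),
]


def throughput_mb_per_s(size_mb: float, elapsed_ms: float) -> float:
    """Convert a file size and a scan duration into MB scanned per second."""
    return size_mb / (elapsed_ms / 1000)


for label, size_mb, elapsed_ms in measurements:
    print(f"{label}: {throughput_mb_per_s(size_mb, elapsed_ms):.1f} MB/s")
```

On these numbers the throughput lands roughly between 10 and 18 MB/s, which you can use as a starting point when estimating scan times for your own file sizes.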
Please use the issue tracker to report any bugs or file feature requests.
If you are interested in becoming a contributor and want to get involved in developing this project, send us an email at [email protected]
In general, PRs are welcome. We follow the typical "fork-and-pull" Git workflow.
- Fork the repo on GitHub
- Clone the project to your own machine
- Commit changes to your own branch
- Push your work back up to your fork
- Submit a Pull Request so that we can review your changes
NOTE: Be sure to merge the latest changes from "upstream" before making a pull request!