forked from numenta/NAB
HTM community detector (py3) #5
Merged
16 commits
12d98e5
HTM.core new detector installed
breznak 2d692db
Merge branch 'py2_compatibility' into htm_community_detector
breznak e4e12c2
Merge branch 'py2_compatibility' into htm_community_detector
breznak b08c92e
Merge branch 'master' into htm_community_detector
breznak 8de00c6
new HtmcoreDetector POC
breznak 8e62c87
Htmcore: updated Readme for this detector
breznak 1649ec4
Htm.core detector: cleanup handleRecord()
breznak ae84be0
Htm.core detector: params and Enc, SP, TM initialized
breznak 7b49b09
Htm.core runable detector!
breznak 02bbf1f
Htm.core detector remove relic from numenta's Model API
breznak 3c11d61
Updated README
breznak a1c262f
README: add htmcore detector's results
breznak c893cb2
Htmcore detector: add model params from numenta_detector
breznak 4f2e8a5
Htmcore: add params comparable with numenta detector
breznak 01f4c41
remove wrong comment
breznak 3097a72
Improved parameters for htm.core detector.
ctrl-z-9000-times
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,4 @@ | ||
The Numenta Anomaly Benchmark [![Build Status](https://travis-ci.org/numenta/NAB.svg?branch=master)](https://travis-ci.org/numenta/NAB) | ||
# The Numenta Anomaly Benchmark [![Build Status](https://travis-ci.org/numenta/NAB.svg?branch=master)](https://travis-ci.org/numenta/NAB) | ||
----------------------------- | ||
|
||
Welcome. This repository contains the data and scripts comprising the Numenta | ||
|
@@ -28,26 +28,44 @@ Ahmad, S., Lavin, A., Purdy, S., & Agha, Z. (2017). Unsupervised real-time | |
anomaly detection for streaming data. Neurocomputing, Available online 2 June | ||
2017, ISSN 0925-2312, https://doi.org/10.1016/j.neucom.2017.04.070 | ||
|
||
#### Scoreboard | ||
## Community edition | ||
|
||
This repo is the [NAB community edition](https://github.com/htm-community/NAB), which is a fork of the original [Numenta's NAB](https://github.com/numenta/NAB). One of the reasons for forking | ||
was a lack of developer activity in the upstream repo. | ||
|
||
### Features: | ||
|
||
- [x] The same algorithms and datasets as Numenta's NAB, so the results are `reproducible`. | ||
- [x] `Python 3` codebase (Python 2 reaches end-of-life on 1/1/2020 and Numenta's NAB is not yet ported) | ||
- [x] additional community-provided detectors: | ||
  - `htmcore`: currently the only HTM implementation able to run in NAB natively in Python 3 (with many improvements in the [Community HTM implementation, successor of nupic.core](https://github.com/htm-community/htm.core/)). | ||
  - `numenta`, `numentaTM` detectors (original from Numenta) made compatible with the Py3 codebase (only requires Py2 installed) | ||
- [ ] additional datasets | ||
  - TBD, none so far | ||
|
||
Statement: We'll try to contribute any changes, new detectors, and datasets back to Numenta's upstream NAB when the devs have time to apply the changes. | ||
|
||
## Scoreboard | ||
|
||
The NAB scores are normalized such that the maximum possible is 100.0 (i.e. the perfect detector), and a baseline of 0.0 is determined by the "null" detector (which makes no detections). | ||
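The normalization described above can be read as a linear rescaling. The following is a minimal sketch, assuming a raw score is rescaled so that the null detector's raw score maps to 0.0 and the perfect detector's raw score maps to 100.0. The function and parameter names are hypothetical and are not taken from NAB's code.

```python
def normalize_score(raw, null_raw, perfect_raw):
    """Linearly rescale a raw score so that null -> 0.0 and perfect -> 100.0.

    Hypothetical helper for illustration only; not NAB's actual API.
    """
    span = perfect_raw - null_raw
    if span == 0:
        # Without a gap between the two reference scores there is no scale.
        raise ValueError("perfect and null raw scores must differ")
    return 100.0 * (raw - null_raw) / span
```

Under this sketch, a detector whose raw score falls below the null detector's would receive a negative normalized score.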
|
||
| Detector | Standard Profile | Reward Low FP | Reward Low FN | | ||
|---------------|------------------|---------------|---------------| | ||
| Perfect | 100.0 | 100.0 | 100.0 | | ||
| [Numenta HTM](https://github.com/numenta/nupic)* | 70.5-69.7 | 62.6-61.7 | 75.2-74.2 | | ||
| [CAD OSE](https://github.com/smirmik/CAD)† | 69.9 | 67.0 | 73.2 | | ||
| [earthgecko Skyline](https://github.com/earthgecko/skyline) | 58.2 | 46.2 | 63.9 | | ||
| [KNN CAD](https://github.com/numenta/NAB/tree/master/nab/detectors/knncad)† | 58.0 | 43.4 | 64.8 | | ||
| [Relative Entropy](http://www.hpl.hp.com/techreports/2011/HPL-2011-8.pdf) | 54.6 | 47.6 | 58.8 | | ||
| [Random Cut Forest](http://proceedings.mlr.press/v48/guha16.pdf) **** | 51.7 | 38.4 | 59.7 | | ||
| [Twitter ADVec v1.0.0](https://github.com/twitter/AnomalyDetection)| 47.1 | 33.6 | 53.5 | | ||
| [Windowed Gaussian](https://github.com/numenta/NAB/blob/master/nab/detectors/gaussian/windowedGaussian_detector.py) | 39.6 | 20.9 | 47.4 | | ||
| [Etsy Skyline](https://github.com/etsy/skyline) | 35.7 | 27.1 | 44.5 | | ||
| Bayesian Changepoint** | 17.7 | 3.2 | 32.2 | | ||
| [EXPoSE](https://arxiv.org/abs/1601.06602v3) | 16.4 | 3.2 | 26.9 | | ||
| Random*** | 11.0 | 1.2 | 19.5 | | ||
| Null | 0.0 | 0.0 | 0.0 | | ||
| Detector | Standard Profile | Reward Low FP | Reward Low FN | Detector name | Time (s) | | ||
|---------------|------------------|---------------|---------------|---------------|------------| | ||
| Perfect | 100.0 | 100.0 | 100.0 | | | | ||
| [Numenta HTM](https://github.com/numenta/nupic)* | 70.5-69.7 | 62.6-61.7 | 75.2-74.2 | `numenta` | | | ||
| [CAD OSE](https://github.com/smirmik/CAD)† | 69.9 | 67.0 | 73.2 | | | | ||
| [earthgecko Skyline](https://github.com/earthgecko/skyline) | 58.2 | 46.2 | 63.9 | | | | ||
| [KNN CAD](https://github.com/htm-community/NAB/tree/master/nab/detectors/knncad)† | 58.0 | 43.4 | 64.8 | | | | ||
| [Relative Entropy](http://www.hpl.hp.com/techreports/2011/HPL-2011-8.pdf) | 54.6 | 47.6 | 58.8 | | | | ||
| [Random Cut Forest](http://proceedings.mlr.press/v48/guha16.pdf) **** | 51.7 | 38.4 | 59.7 | | | | ||
| [htm.core](https://github.com/htm-community/htm.core/) | 50.83 | 49.95 | 52.64 | `htmcore` | | | ||
| [Twitter ADVec v1.0.0](https://github.com/twitter/AnomalyDetection)| 47.1 | 33.6 | 53.5 | | | | ||
| [Windowed Gaussian](https://github.com/htm-community/NAB/blob/master/nab/detectors/gaussian/windowedGaussian_detector.py) | 39.6 | 20.9 | 47.4 | | | | ||
| [Etsy Skyline](https://github.com/etsy/skyline) | 35.7 | 27.1 | 44.5 | | | | ||
| Bayesian Changepoint** | 17.7 | 3.2 | 32.2 | | | | ||
| [EXPoSE](https://arxiv.org/abs/1601.06602v3) | 16.4 | 3.2 | 26.9 | | | | ||
| Random*** | 11.0 | 1.2 | 19.5 | | | | ||
| Null | 0.0 | 0.0 | 0.0 | | | | ||
|
||
*As of NAB v1.0* | ||
|
||
|
@@ -64,22 +82,6 @@ The NAB scores are normalized such that the maximum possible is 100.0 (i.e. the | |
|
||
Please see [the wiki section on contributing algorithms](https://github.com/numenta/NAB/wiki/NAB-Contributions-Criteria#anomaly-detection-algorithms) for discussion on posting algorithms to the scoreboard. | ||
|
||
#### Corpus | ||
|
||
The NAB corpus of 58 timeseries data files is designed to provide data for research | ||
in streaming anomaly detection. It is comprised of both | ||
real-world and artificial timeseries data containing labeled anomalous periods of behavior. | ||
|
||
The majority of the data is real-world from a variety of sources such as AWS | ||
server metrics, Twitter volume, advertisement clicking metrics, traffic data, | ||
and more. All data is included in the repository, with more details in the [data | ||
readme](https://github.com/numenta/NAB/tree/master/data). We are in the process | ||
of adding more data, and actively searching for more data. Please contact us at | ||
[[email protected]](mailto:[email protected]) if you have similar data (ideally with | ||
known anomalies) that you would like to see incorporated into NAB. | ||
|
||
The NAB version will be updated whenever new data (and corresponding labels) is | ||
added to the corpus; NAB is currently in v1.0. | ||
|
||
#### Additional Scores | ||
|
||
|
@@ -88,35 +90,49 @@ For comparison, here are the NAB V1.0 scores for some additional flavors of HTM. | |
* Numenta HTM using NuPIC v.0.5.6: This version of NuPIC was used to generate the data for the paper mentioned above (Unsupervised real-time anomaly detection for streaming data. Neurocomputing, ISSN 0925-2312, https://doi.org/10.1016/j.neucom.2017.04.070). If you are interested in replicating the results shown in the paper, use this version. | ||
* [HTM Java](https://github.com/numenta/htm.java) is a Community-Driven Java port of HTM. | ||
* [nab-comportex](https://github.com/floybix/nab-comportex) is a twist on HTM anomaly detection using [Comportex](https://github.com/htm-community/comportex), a community-driven HTM implementation in Clojure. Please see [Felix Andrew's blog post](http://floybix.github.io/2016/07/01/attempting-nab) on experiments with this algorithm. | ||
* NumentaTM HTM detector uses the implementation of temporal memory found | ||
[here](https://github.com/numenta/nupic.core/blob/master/src/nupic/algorithms/TemporalMemory.hpp). | ||
* Numenta HTM detector with no likelihood uses the raw anomaly scores directly. To | ||
run without likelihood, set the variable `self.useLikelihood` in | ||
[numenta_detector.py](https://github.com/numenta/NAB/blob/master/nab/detectors/numenta/numenta_detector.py) | ||
to `False`. | ||
|
||
* NumentaTM HTM detector uses the implementation of temporal memory found [here](https://github.com/numenta/nupic.core/blob/master/src/nupic/algorithms/TemporalMemory.hpp). | ||
* Numenta HTM detector with no likelihood uses the raw anomaly scores directly. To run without likelihood, set the variable `self.useLikelihood` in [numenta_detector.py](https://github.com/numenta/NAB/blob/master/nab/detectors/numenta/numenta_detector.py) to `False`. | ||
|
||
|
||
|
||
| Detector |Standard Profile | Reward Low FP | Reward Low FN | | ||
|---------------|---------|------------------|---------------| | ||
| Numenta HTM using NuPIC v0.5.6* | 70.1 | 63.1 | 74.3 | | ||
| [nab-comportex](https://github.com/floybix/nab-comportex)† | 64.6 | 58.8 | 69.6 | | ||
| [NumentaTM HTM](https://github.com/numenta/NAB/blob/master/nab/detectors/numenta/numentaTM_detector.py)* | 64.6 | 56.7 | 69.2 | | ||
| [HTM Java](https://github.com/numenta/NAB/blob/master/nab/detectors/htmjava) | 56.8 | 50.7 | 61.4 | | ||
| [NumentaTM HTM](https://github.com/htm-community/NAB/blob/master/nab/detectors/numenta/numentaTM_detector.py)* | 64.6 | 56.7 | 69.2 | | ||
| [HTM Java](https://github.com/htm-community/NAB/blob/master/nab/detectors/htmjava) | 56.8 | 50.7 | 61.4 | | ||
| Numenta HTM*, no likelihood | 53.62 | 34.15 | 61.89 | | ||
|
||
\* From NuPIC version 0.5.6 ([available on PyPI](https://pypi.python.org/pypi/nupic/0.5.6)). | ||
|
||
† Algorithm was an entry to the [2016 NAB Competition](http://numenta.com/blog/2016/08/10/numenta-anomaly-benchmark-nab-competition-2016-winners/). | ||
|
||
Installing NAB 1.0 | ||
|
||
|
||
## Corpus | ||
|
||
The NAB corpus of 58 timeseries data files is designed to provide data for research | ||
in streaming anomaly detection. It is comprised of both | ||
real-world and artificial timeseries data containing labeled anomalous periods of behavior. | ||
|
||
The majority of the data is real-world from a variety of sources such as AWS | ||
server metrics, Twitter volume, advertisement clicking metrics, traffic data, | ||
and more. All data is included in the repository, with more details in the [data | ||
readme](https://github.com/numenta/NAB/tree/master/data). We are in the process | ||
of adding more data, and actively searching for more data. Please contact us at | ||
[[email protected]](mailto:[email protected]) if you have similar data (ideally with | ||
known anomalies) that you would like to see incorporated into NAB. | ||
|
||
The NAB version will be updated whenever new data (and corresponding labels) is | ||
added to the corpus; NAB is currently in v1.0. | ||
|
||
|
||
## Installing NAB 1.0 | ||
------------------ | ||
|
||
### Supported Platforms | ||
|
||
- OSX 10.9 and higher | ||
- Amazon Linux (via AMI) | ||
- Linux | ||
|
||
Other platforms may work but have not been tested. | ||
|
||
|
@@ -125,34 +141,22 @@ Other platforms may work but have not been tested. | |
|
||
You need to manually install the following: | ||
|
||
- [Python 2.7](https://www.python.org/download/) | ||
- [Python 3](https://www.python.org/download/) | ||
- [pip](https://pip.pypa.io/en/latest/installing.html) | ||
- [NumPy](http://www.numpy.org/) | ||
- [NuPIC](http://www.github.com/numenta/nupic) (only required if running the Numenta detector) | ||
|
||
##### Download this repository | ||
|
||
Use the Github links provided in the right sidebar. | ||
#### Download this repository | ||
|
||
##### Install the Python requirements | ||
Use the GitHub [download links](https://github.com/htm-community/NAB/archive/master.zip) provided in the right sidebar, | ||
or run `git clone https://github.com/htm-community/NAB` | ||
|
||
cd NAB | ||
(sudo) pip install -r requirements.txt | ||
|
||
This will install the required modules. | ||
|
||
##### Install NAB | ||
#### Install NAB | ||
|
||
Recommended: | ||
|
||
cd NAB | ||
pip install . --user | ||
|
||
|
||
> Note: If NuPIC is not already installed, the version specified in | ||
`NAB/requirements.txt` will be installed. If NuPIC is already installed, it | ||
will not be re-installed. | ||
|
||
|
||
If you want to manage dependency versions yourself, you can skip dependencies | ||
with: | ||
|
||
|
@@ -198,13 +202,11 @@ follow the directions below to "Run a subset of NAB". | |
|
||
##### Run HTM with NAB | ||
|
||
First make sure NuPIC is installed and working properly. Then: | ||
|
||
cd /path/to/nab | ||
python run.py -d numenta --detect --optimize --score --normalize | ||
python run.py -d htmcore --detect --optimize --score --normalize | ||
|
||
This will run the Numenta detector only and produce normalized scores. Note that | ||
by default it tries to use all the cores on your machine. The above command | ||
This will run the community HTM detector `htmcore` (to run Numenta's detector use `-d numenta`) and produce normalized scores. | ||
Note that by default it tries to use all the cores on your machine. The above command | ||
should take about 20-30 minutes on a current powerful laptop with 4-8 cores. | ||
For debugging you can run subsets of the data files by modifying and specifying | ||
specific label files (see section below). Please type: | ||
|
@@ -229,11 +231,10 @@ the specific version of NuPIC (and associated nupic.core) that is noted in the | |
|
||
This will run everything and produce results files for all anomaly detection | ||
methods. Several algorithms are included in the repo, such as the Numenta | ||
HTM anomaly detection method, as well as methods from the [Etsy | ||
Skyline](https://github.com/etsy/skyline) anomaly detection library, a sliding | ||
window detector, Bayes Changepoint, and so on. This will also pass those results | ||
files to the scoring script to generate final NAB scores. **Note**: this option | ||
will take many many hours to run. | ||
HTM anomaly detection method, as well as methods from the [Etsy Skyline](https://github.com/etsy/skyline) anomaly detection library, | ||
a sliding window detector, Bayes Changepoint, and so on. | ||
This will also pass those results files to the scoring script to generate final NAB scores. | ||
**Note**: this option will take many many hours to run. | ||
|
||
##### Run subset of NAB data files | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,32 @@ | ||
# HtmcoreDetector HTM implementation from [htm.core](https://github.com/htm-community/htm.core/) | ||
|
||
This detector provides the HTM implementation from [htm.core](https://github.com/htm-community/htm.core/), | ||
which is an actively developed community version of Numenta's [nupic.core](https://github.com/numenta/nupic.core). | ||
|
||
This is a Python 3 detector called `htmcore`. As Numenta is switching NAB to Python 3, this is the closest detector you can get to the | ||
`numenta` and `numentaTM` detectors. | ||
|
||
`htm.core` offers an API and features similar to, and compatible with, the official HTM implementations `nupic` and `nupic.core`, | ||
with significant speed and feature improvements. For more details, please see [the htm.core project's README](https://github.com/htm-community/htm.core/blob/master/README.md). | ||
Bugs and questions should also be reported there. | ||
|
||
## Installation | ||
|
||
The `htmcore` detector is installed automatically with your `NAB` installation (`python setup.py install`), | ||
so you don't have to do anything to have it available. | ||
|
||
### Requirements to install | ||
|
||
- [Python 3](https://www.python.org/download/) | ||
- [Git](https://git-scm.com/downloads) | ||
|
||
|
||
## Usage | ||
|
||
Usage is the same as for the default detectors; see the [NAB README section Usage](https://github.com/htm-community/NAB/blob/master/README.md#usage). | ||
|
||
### Example | ||
Follow the instructions in the main README to run optimization, scoring, and normalization, e.g.: | ||
|
||
`python run.py -d htmcore --optimize --score --normalize` | ||
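For scripted runs, the command above can be assembled programmatically. The following is a minimal sketch that uses only the flags shown in this README (`-d`, `--detect`, `--optimize`, `--score`, `--normalize`). The helper names `build_nab_command` and `run_nab` and the `nab_dir` argument are hypothetical and are not part of NAB.

```python
import subprocess
import sys

# Step flags that appear in the README commands; nothing else is assumed.
KNOWN_STEPS = ("detect", "optimize", "score", "normalize")


def build_nab_command(detector, steps, python=sys.executable):
    """Return the argv list for `python run.py -d <detector> --<step> ...`."""
    unknown = [s for s in steps if s not in KNOWN_STEPS]
    if unknown:
        raise ValueError(f"unknown steps: {unknown}")
    return [python, "run.py", "-d", detector] + [f"--{s}" for s in steps]


def run_nab(detector, steps, nab_dir):
    """Run the command from a NAB checkout located at nab_dir (hypothetical path)."""
    return subprocess.run(build_nab_command(detector, steps), cwd=nab_dir, check=True)
```

For example, `run_nab("htmcore", ["optimize", "score", "normalize"], "/path/to/nab")` would issue the same command as the one-liner above, with `check=True` raising if `run.py` exits with a non-zero status.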
|
Empty file.
first shot results are not really good, we have to pump it up guys! 📌
I've pushed a commit that uses `numenta_detector`'s params, currently running benchmarks