TM anomaly #406

breznak · 2019-04-15T16:26:11Z

Anomaly is computed in TM directly, not in Anomaly class

TODO:

- finish implementation :)
- move AnomalyTests to TM.anomalyTests
  - new AnomalyLikelihood tests -> must be fixed in a separate PR
- remove Anomaly class
  - remove computeRawAnomalyScore() vector version, keep only SDR
- bindings for TM::getAnomalyScore()
- serialization
- doc
- ~~waiting for "TM activeCells_ is a SDR" (#442) to resolve the issue with extraActive that causes SDR to fail; triggered by the "testExtraActive" test~~

Fixes #347


breznak · 2019-05-03T23:18:42Z

Update: I came up with a workaround for

Still waiting for #442 to resolve the issue: the SDR fails to be created when activateDendrites is used with extraActive.

For now, TM does not support anomaly when extra inputs (param extra > 0) are used. Otherwise anomaly runs just fine.

ctrl-z-9000-times

I don't like these changes. I think there is a much simpler solution to this problem. Instead of removing the anomaly class and adding a different class, just reuse the existing anomaly class. The TM should keep an instance of the existing anomaly class and call Anomaly::compute inside of TM::compute, in between the calls to activateDendrites and activateCells which ensures that the TM always has valid predictions as well as the current feed forward input.

ctrl-z-9000-times · 2019-05-04T00:27:41Z

src/nupic/algorithms/Anomaly.cpp

  // Calculate and return percent of active columns that were not predicted.
  SDR both(active.dimensions);
  both.intersection(active, predicted);

  return (active.getSum() - both.getSum()) / Real(active.getSum());
 }

-Real computeRawAnomalyScore(vector<UInt>& active,
-                            vector<UInt>& predicted)


Can we keep this overload?
This is where I keep my notes about how the SDR class should be doing set intersections.

> This is where I keep my notes about how the SDR class should be doing set intersections.

We could keep it,
or keep the code just as a comment, if you're not actively developing on that?
And ideally move the comment to SDR.intersection()?

Ok, we can drop this method, but the rest of my criticism stands.

Friend classes are confusing and usually not necessary.

This doesn't include anomaly likelihood.

External predictive inputs don't work? I read through the code and it looks like they should just work, if you remove the checks that disable it.

> Instead of removing the anomaly class and adding a different class, just reuse the existing anomaly class.

I thought you wanted me to get rid of the old Anomaly class and replace it with an as-simple-as-possible implementation in TM (?), which I ended up liking.

I'm OK to revive the old Anomaly class, with a few changes *).

Or I can just move the code TMAnomaly to Anomaly.hpp and be done with it.
Which of those would you prefer?

> Friend classes are confusing and usually not necessary.

I'll avoid 'friend access' when moving to a separate new/old file.

> This doesn't include anomaly likelihood.

Yes, and it is intended. And about other refactoring in Anomaly class *):

I've implemented Anomaly class in the past to simplify various options with anomaly (thresholds, likelihood, moving average,...)

Now, with later usage experience, I don't think it's a good design, and as those features can be implemented as "one liners" with current code, I don't think they should be provided with the anomaly class. "Let TM.getAnomalyScore() be the simple raw anomaly" (#A), just conveniently computed. And let the user do what they need later with it.

threshold: simple if(score > XXX) return 1 else return 0;

likelihood:

(probably even does not truly work)

in Anomaly had a FIXME note that it's not truly configurable. And should be separated.

I could use the approach with abstract class AbstractAnomaly::compute(SDR, SDR), but it's complicating things, see argument #A

AnomalyLikelihood is not even a true anomaly (computed from 2 SDRs), but a likelihood of anomalies, thus computed AFTER and from anomaly score. So score = TM.getAnomalyScore(); Likelihood::anomalyProbability(score, timestep);

moving average: again, simple MovingAverage ma(5); ma.add(score); ma.get();

TL;DR: the proper configuration and ordering is too complicated in a wrapper Anomaly class, while relatively easy hand-tailored by user (with tools we provide).
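To illustrate the "one liners" argument above, here is a minimal sketch of a raw anomaly score with a user-side threshold and moving average. It is not the library's code: `ColumnSet`, `rawAnomaly`, `thresholded`, and `SimpleMovingAverage` are hypothetical names standing in for the SDR class, `computeRawAnomalyScore`, and `MovingAverage`, and returning 0 for an empty input is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <set>

// Hypothetical stand-in for an SDR: the set of active (or predicted) column indices.
using ColumnSet = std::set<unsigned>;

// Fraction of active columns that were not predicted
// (0 = everything was predicted, 1 = nothing was predicted).
inline float rawAnomaly(const ColumnSet& active, const ColumnSet& predicted) {
  if (active.empty()) return 0.0f;  // assumption: no activity means no anomaly
  std::size_t overlap = 0;
  for (unsigned col : active) overlap += predicted.count(col);
  return static_cast<float>(active.size() - overlap) /
         static_cast<float>(active.size());
}

// The threshold "one liner": 1 if the score exceeds the limit, else 0.
inline int thresholded(float score, float limit) { return score > limit ? 1 : 0; }

// Hypothetical fixed-window moving average over the most recent scores.
class SimpleMovingAverage {
public:
  explicit SimpleMovingAverage(std::size_t window) : window_(window) {}

  void add(float value) {
    values_.push_back(value);
    sum_ += value;
    if (values_.size() > window_) {  // drop the oldest value once the window is full
      sum_ -= values_.front();
      values_.pop_front();
    }
  }

  float get() const {
    return values_.empty() ? 0.0f : sum_ / static_cast<float>(values_.size());
  }

private:
  std::size_t window_;
  std::deque<float> values_;
  float sum_ = 0.0f;
};
```

With helpers like these, the ordering (raw score first, then smoothing or thresholding) stays in the caller's hands, which is the point made in the TL;DR.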

> External predictive inputs don't work? I read through the code and it looks like they should just work, if you remove the checks that disable it.

Externals break it in that they add "cell indices out of scope of the current TM" (idx in <numberOfCells(), numOfCells + extra>).
See for yourself in SDR::setSparseInplace(), you must use Debug build to trigger, that's why it's failing only on the OSX CI. I'm proposing to proceed as is, and discuss the fixes on external inputs in a separate issue or #442
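The out-of-range problem described above can be reproduced in isolation. The sketch below is only an illustration under assumptions: `appendExtra` and `indicesFit` are hypothetical helpers, not functions from TM or SDR, and a plain `std::vector<unsigned>` stands in for sparse SDR data.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: true if every sparse index fits an SDR of `size` cells.
inline bool indicesFit(const std::vector<unsigned>& sparse, std::size_t size) {
  for (unsigned idx : sparse) {
    if (idx >= size) return false;
  }
  return true;
}

// Hypothetical helper: append external-input indices after the TM's own cells,
// so they land in the range [numberOfCells, numberOfCells + extra).
inline std::vector<unsigned> appendExtra(std::vector<unsigned> cells,
                                         const std::vector<unsigned>& extraActive,
                                         unsigned numberOfCells) {
  for (unsigned idx : extraActive) cells.push_back(numberOfCells + idx);
  return cells;
}
```

An SDR sized for `numberOfCells` cannot hold the combined indices, while one sized for `numberOfCells + extra` can, which matches the failure only showing up where the bounds are actually checked (Debug builds).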

What would you prefer?

My goal is to have convenient way to print out everything interesting about the TM to a human readable string, and from it to determine if the TM is working. I see the anomaly metric as a diagnostic tool.

I want to know the mean & standard deviation of the anomaly. The anomaly likelihood also uses this info.

It should be enabled by default.
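As a rough sketch of the diagnostic described above, the mean and standard deviation of the anomaly score can be tracked incrementally. `RunningStats` is a hypothetical name, and Welford's online algorithm is my choice here, not something taken from the PR.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical accumulator using Welford's online algorithm:
// keeps a running mean and the sum of squared deviations from it.
class RunningStats {
public:
  void add(double x) {
    ++count_;
    const double delta = x - mean_;
    mean_ += delta / static_cast<double>(count_);
    m2_ += delta * (x - mean_);  // uses the already-updated mean
  }

  std::size_t count() const { return count_; }
  double mean() const { return mean_; }

  // Population standard deviation (divides by N, not N - 1).
  double stddev() const {
    return count_ == 0 ? 0.0 : std::sqrt(m2_ / static_cast<double>(count_));
  }

private:
  std::size_t count_ = 0;
  double mean_ = 0.0;
  double m2_ = 0.0;
};
```

Feeding it one anomaly score per iteration gives the two numbers needed for a human-readable summary without storing the score history.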

> The proper configuration and ordering is too complicated.

Agreed,

The user should be responsible for discarding anomaly readings when they haven't finished training the network.

The user should be responsible for applying the threshold.

I'm skeptical that the WEIGHTED mode can outperform the likelihood mode. I don't understand the reasoning behind it, and it's not mentioned in the published literature. Maybe we could drop this too.

AnomalyLikelihood needs 1 parameter to do what it does, and it has 4 or 5.

ctrl-z-9000-times · 2019-05-04T02:57:54Z

I would also like to use the anomaly likelihood code in the TM class. The original publication of NAB (https://arxiv.org/pdf/1510.03336.pdf) uses the likelihood code.

We could create a new API for this, like:

class AbstractAnomaly {
public:
    virtual ~AbstractAnomaly() = default;
    // Pure virtual: each subclass implements its own metric.
    virtual Real compute(const sdr::SDR& active,
                         const sdr::SDR& predicted) = 0;
};

Or

typedef std::function<Real (const SDR&, const SDR&)> AbstractAnomaly;

And then let the user specify the function (Anomaly or AnomalyLikelihood) and its parameters.

Alternatively, if one of the three methods (Anomaly, AnomalyLikelihood, Weighted) is objectively the best then we should bake it into the TM, and not worry the users about it.
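A minimal sketch of the `std::function` variant proposed above, assuming a host class that stores the callable and applies it each step. `Columns`, `unpredictedFraction`, and `AnomalyHost` are hypothetical names used only for illustration; `Columns` stands in for `sdr::SDR`.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <set>
#include <utility>

// Hypothetical stand-in for sdr::SDR: a set of active column indices.
using Columns = std::set<unsigned>;

// Any callable mapping (active, predicted) to a score can be plugged in.
using AnomalyFn = std::function<float(const Columns&, const Columns&)>;

// Default metric: fraction of active columns that were not predicted.
inline float unpredictedFraction(const Columns& active, const Columns& predicted) {
  if (active.empty()) return 0.0f;  // assumption: no activity means no anomaly
  std::size_t overlap = 0;
  for (unsigned col : active) overlap += predicted.count(col);
  return static_cast<float>(active.size() - overlap) /
         static_cast<float>(active.size());
}

// Hypothetical host that owns the user-selected metric.
class AnomalyHost {
public:
  explicit AnomalyHost(AnomalyFn fn = unpredictedFraction) : fn_(std::move(fn)) {}

  // Recompute the score from the current input and the previous predictions.
  void update(const Columns& active, const Columns& predicted) {
    score_ = fn_(active, predicted);
  }

  float score() const { return score_; }

private:
  AnomalyFn fn_;
  float score_ = 0.0f;
};
```

The trade-off is the one raised in the thread: the callable makes the metric swappable, at the cost of exposing one more configuration choice to users.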

dkeeney

I have no problems with these changes.

dkeeney · 2019-05-04T14:10:30Z

src/nupic/regions/TMRegion.cpp

-      tm_->getActiveCells(sdr);
+    tm_->getActiveCells(sdr); //active cells
+    if (args_.orColumnOutputs) { //output as columns
+      sdr = tm_->cellsToColumns(sdr);


Should we add another output to TMRegion to provide access to anomaly scores, etc?

We can decide this later when discussing "anomaly region", but yes, anomaly score would be moved to TM class, so it'll end up as a new part of existing TMRegion.

@dkeeney if you have time, would you mind pushing a commit that adds a field "float anomalyScore" to TMRegion, and is computed from TM.getAnomalyScore(), please?
I could do it but Regions are still not my cup of coffee.

Understand. Yes, I can do this.
But rather than a field I suggest that it be a new output. That way, the Anomaly score can be passed to some other module after each iteration. It would be cool if we had a plot region to send it to.

If we just make it a field that can be queried then the user needs to add a callback to get it at each iteration. We can provide that as well but it would not be as useful.

I will take a look at the old .py code to see what the old python Anomaly Region did. I assume there was one.

Thank you!

> But rather than a field I suggest that it be a new output. That way, the Anomaly score can be passed to some other module after each iteration.

Ok, this is the better solution. Would it be a problem that the output is different? (1 integer, instead of TM::getColumnDimensions(), or TM::getColumnDims + cellsPerColumn?)

> I will take a look at the old .py code to see what the old python Anomaly Region did. I assume there was one.

there wasn't any AnomalyRegion


breznak

Please review, this finishes the new TM.anomaly implementation from #445
and has some further fixes to TM.

breznak · 2019-05-05T12:56:41Z

bindings/py/cpp_src/bindings/algorithms/py_TemporalMemory.cpp

@@ -180,6 +180,9 @@ using namespace nupic::algorithms::connections;

        py_HTM.def_property_readonly("extra", [](const HTM_t &self) { return self.extra; } );

+	py_HTM.def_property_readonly("anomaly", [](const HTM_t &self) { return self.anomaly; }, 


TM.anomaly py binding

breznak · 2019-05-05T12:58:37Z

src/nupic/algorithms/TemporalMemory.cpp

@@ -1064,7 +1044,8 @@ bool TemporalMemory::operator==(const TemporalMemory &other) {
      winnerCells_ != other.winnerCells_ ||
      maxSegmentsPerCell_ != other.maxSegmentsPerCell_ ||
      maxSynapsesPerSegment_ != other.maxSynapsesPerSegment_ ||
-      iteration_ != other.iteration_) {
+      iteration_ != other.iteration_ ||
+      anomaly_ != other.anomaly_ ) {


fixed operator ==, and op!= which was broken. added test

breznak · 2019-05-05T12:59:15Z

src/nupic/algorithms/TemporalMemory.hpp

   *
   * See TM::compute() for details of the parameters. 
   *
   */
-  void activateDendrites(const bool learn = true,
-                         const vector<UInt> &extraActive  = {std::numeric_limits<UInt>::max()},
-                         const vector<UInt> &extraWinners = {std::numeric_limits<UInt>::max()});


removed the vector version of activateDendrites. All TM's public API now uses SDR

breznak · 2019-05-05T12:59:39Z

src/nupic/algorithms/TemporalMemory.hpp

@@ -472,6 +474,7 @@ using namespace nupic::algorithms::connections;
       CEREAL_NVP(activeCells_),
       CEREAL_NVP(winnerCells_),
       CEREAL_NVP(segmentsValid_),
+       CEREAL_NVP(anomaly_),


fixed serialization for new TM.anomaly

ctrl-z-9000-times

This looks good to me!

breznak · 2019-05-05T14:18:25Z

Thank you, David! I'm quite happy with the TM.anomaly as it is now 😄

Two remaining issues:

anomaly output for TMRegion
anomaly works with TM::compute() but breaks with activateCells.
- I'd prefer moving activateCells to protected.
- or we could track TM.anomaly with some iteration number, and NTA_THROW if mismatched
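The second option above (tagging the score with an iteration number and failing on mismatch) could look roughly like this. It is a sketch under assumptions: `GuardedAnomaly` is a hypothetical name, and `std::logic_error` stands in for `NTA_THROW`.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical tracker that remembers which iteration a score belongs to.
class GuardedAnomaly {
public:
  // Called at the point where both predictions and input are available.
  void update(std::size_t iteration, float score) {
    iteration_ = iteration;
    score_ = score;
    valid_ = true;
  }

  // Returns the score only if it was computed for `currentIteration`;
  // otherwise throws instead of handing back a stale value.
  float get(std::size_t currentIteration) const {
    if (!valid_ || iteration_ != currentIteration) {
      throw std::logic_error("anomaly score is stale for this iteration");
    }
    return score_;
  }

private:
  std::size_t iteration_ = 0;
  float score_ = 0.0f;
  bool valid_ = false;
};
```

A caller that steps the TM through activateCells directly, without the update, would then get an exception rather than a silently outdated score.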

dkeeney · 2019-05-05T14:34:02Z

> adds a field "float anomalyScore" to TMRegion, and is computed from TM.getAnomalyScore()

I assume you still want this new function.
But we did not decide if you also want anomaly to be a Region output.

breznak · 2019-05-05T14:39:37Z

yes, but now it's called just TM.anomaly

> But we did not decide if you also want anomaly to be a Region output.

I agree with your suggestion that output would be a better choice.

dkeeney · 2019-05-05T14:40:56Z

Ok, I am creating a new PR to do this.

dkeeney · 2019-05-05T14:47:25Z

For Outputs we currently have:

- bottomUpOut, which comes from TM::getActiveCells(): an SDR of numberOfCols if orColumnOutputs is set, or of the number of cells by default
- activeCells (which is always an SDR of number of cells)
- predictedActiveCells, which are the Winners
- anomaly (which is the new output)

With all of the recent changes, are these still the expected outputs from TMRegion?

breznak · 2019-05-05T15:01:39Z

> With all of the recent changes, is this still the expected outputs from TMRegion.

yes.

anomaly would be 1x1 scalar

> predictedActiveCells which are the Winners

- might change the name to winnerCells, since these are the winners
- might add predictiveCells populated with tm.getPredictiveCells()
- probably all of the *Cells outputs should use the orColumnOutputs logic (rename that to asColumns)

ctrl-z-9000-times · 2019-05-05T15:03:04Z

> predictedActiveCells which are the Winners

This is not 100% correct. When a mini-column bursts because it was unpredicted (this is an anomaly!), then a winner cell is arbitrarily chosen.

dkeeney · 2019-05-05T15:54:21Z

Changing the name of predictedActiveCells will break the API. But we are changing so many other things, now is probably a good time to change this as well.

I was going to get this out in just a few min, but with these name changes this will take longer. My wife says I need to go wash the windows...company coming next week. I will try to finish this PR later today.

breznak · 2019-05-05T16:07:04Z

> Changing the name of predictedActiveCells will break the API.

hmm, we wanted to keep NetworkAPI as stable as possible.

I'm not sure what @ctrl-z-9000-times meant with the comment

> This is not 100% correct. When a mini-column bursts because it was unpredicted (this is an anomaly!), then a winner cell is arbitrarily chosen.

if it's just the comment about name vs winner cells, we should keep the old output name, and add this note to the description.

dkeeney · 2019-05-05T17:45:48Z

Ok, windows are washed. On to more interesting things...

Let's see:

- a second output called predictiveCells
- orColumnOutputs: let's not change the field name
- PredictedActiveCells will keep that name. Comments already say it's the Winners.

dkeeney · 2019-05-05T18:06:57Z

What comment should I add to the predictiveCells output? Is it "The cells that were predicted" or is it "The cells that predict the next set of active cells"?

Also, the vector overload of activateDendrites() does not appear to be implemented.

  void activateDendrites(const bool learn = true,
                         const vector<UInt> &extraActive  = {std::numeric_limits<UInt>::max()},
                         const vector<UInt> &extraWinners = {std::numeric_limits<UInt>::max()});

breznak · 2019-05-05T18:24:52Z

"Cells predicted to be active in the next step" ?

> Also, The vector overload of activateDendrites( ) does not appear to be implemented.

yes, there's only SDR version now. activateDendrites(bool), or activateDendrites(bool, SDR, SDR)

dkeeney · 2019-05-05T18:32:30Z

oh, just needed to pull from master.

TM::anomalyScore moved to TM directly

ad718f1

Anomaly is computed in TM directly, not in Anomaly class

breznak added the anomaly, TM, and code (code enhancement, optimization, cleanup..programmer stuff) labels Apr 15, 2019

breznak self-assigned this Apr 15, 2019


breznak added 6 commits April 17, 2019 08:28

Merge branch 'master_community' into tm_anomaly

02a2364

Merge branch 'master_community' into tm_anomaly

e2f21cc

TM: move anomalyScore computation to activateCells

16a3461

which gets called in all cases (TM::compute, TM::activateCells) so anomalyScore would be always available

TM: anomalyScore implemented

af467a5

Anomaly: rm Anomaly class tests

f6d3730

keeping computeRawAnomaly tests + Likelihood tests

AnomalyLikelihood: moved tests to separate file

52e8ab5

breznak added the in_progress label Apr 25, 2019

breznak added 13 commits May 1, 2019 16:19

Merge branch 'master_community' into tm_anomaly

0a9ba95

TM: anomaly computation updates

372cc57

TM: small test update

8556107

TM: fix anomaly compute

f3c5157

problem was in cellsToColumns() which expects an SDR, but unfortunately (known err) does not fail if a vector is passed!

TM: anomaly computation uses TMAnomaly struct

dbf7f73

all code related to anomaly computation in TM now moved to internal struct TMAnomaly, with public TM.anomaly.score

TM: fix serialization TMAnomaly

3a34e6c

serialize the new functionality in TMAnomaly

Anomaly: remove Anomaly class

affb657

deprecated, use: - TM.anomaly.score for "normal" TM's anomaly, - computeRawAnomalyScore from Anomaly.hpp - AnomalyLikelihood

Hotgym: update anomaly scores check for TM.anomaly.score

fea3d2c

cleanup TM test

a3557eb

TM: tiny doc

187e5c9

TM anomaly: split update & getScore()

43f52c5

rework to split update() called after activateDendrites() and getScore() called when needed

TM: implement getAnomalyScore()

96e8211

AnomalyLikelihood: disabled broken tests

1072e63

not working, need to be fixed later

breznak added ready and removed in_progress labels May 2, 2019

breznak requested review from dkeeney and ctrl-z-9000-times May 2, 2019 00:58

fix typo

8a05797

ctrl-z-9000-times requested changes May 4, 2019

dkeeney reviewed May 4, 2019

ctrl-z-9000-times mentioned this pull request May 5, 2019

Prototype Anomaly metric inside of Temporal Memory. #445

Merged

3 tasks

breznak added 7 commits May 5, 2019 11:52

Merge branch 'master_community' into tm_anomaly

d7cae2b

TM.anomaly add python binding

759cf29

TM: fix operator==, !=, add test

5f8bd08

TM fix serialization for new anomaly

da7ccea

need to save/load anomaly_ member

Hotgym: remove timer for anomaly

01675a2

as it's a part of TM now; tAnLikely still exists

TM add test for correct constructor used

7d172b1

for normal computation, the constructor must always specify columnDimensions; the only exception is TM tm; for deserialization, but that should never be used for compute()

TM: activateDendrites now uses SDR

e73b54b

removes old alternative with vectors

breznak commented May 5, 2019

ctrl-z-9000-times approved these changes May 5, 2019

breznak merged commit 41f6e08 into master May 5, 2019

breznak deleted the tm_anomaly branch May 5, 2019 14:15
