
NetworkAPI Predefined Region Types

There are a number of predefined C++ region implementations included in the htm.core library. These do not need to be registered. Python-implemented regions and user-written C++ regions can also act as plugins; these must be registered so they can be loaded at run time.

The built-in C++ region implementations that are included in the htm.core library are:

  • ScalarEncoderRegion - encodes numeric and category data
  • RDSEEncoderRegion - encodes numeric and category data using a hash
  • DateEncoderRegion - encodes date and/or time data
  • SPRegion - HTM Spatial Pooler implementation
  • TMRegion - HTM Temporal Memory implementation
  • FileOutputRegion - Writes data to a file
  • FileInputRegion - Reads data from a file
  • ClassifierRegion - An SDR classifier

ScalarEncoderRegion

A ScalarEncoderRegion encapsulates the ScalarEncoder algorithm, an SDR encoder for numeric or category values. As a network runs, the client specifies new encoder inputs by setting the "sensedValue" parameter or by connecting a data stream to the region's 'values' input. An assignment to the 'sensedValue' parameter overrides the input stream. On each compute cycle, the region encodes its input into the output. The algorithm determines the number of buckets and then encodes the input so that the resulting pattern contains the required number of bits, with no overlapping bits for values in a bucket. See the HTM School videos for more insight into how this works.

These four (4) members define the total number of bits in the output pattern:

  • size (or n),
  • radius,
  • category,
  • resolution.

These are mutually exclusive and only one of them should be provided when constructing the encoder.

These two (2) members define the number of bits per bucket in the output:

  • sparsity,
  • activeBits (or w).

These are also mutually exclusive and only one of them should be provided when constructing the encoder.
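The two "only one of" rules above can be checked before a parameter set is handed to the encoder. The following is a minimal sketch in plain Python; the helper names are hypothetical and not part of htm.core, and it assumes that an unset member is absent, zero, or false.

```python
def check_exclusive(params, group, label):
    """Raise ValueError unless exactly one name in `group` is set in `params`."""
    given = [name for name in group if params.get(name)]
    if len(given) != 1:
        raise ValueError(f"{label}: expected exactly one of {group}, got {given}")
    return given[0]


def validate_scalar_encoder_params(params):
    """Hypothetical pre-flight check mirroring the two rules above."""
    width_by = check_exclusive(
        params, ("size", "radius", "category", "resolution"), "output width")
    active_by = check_exclusive(
        params, ("sparsity", "activeBits"), "active bits")
    return width_by, active_by


print(validate_scalar_encoder_params({"size": 400, "activeBits": 21}))
```

The parameter table for this region notes that sparsity also requires the size to be specified, so a real check would probably need to treat that pairing as a special case.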

Note that the output is an array of boolean values in a pattern and is output in an SDR. However, this is not a true SDR in that it is not sparse. This pattern should be passed to an SPRegion to set the sparsity.

| Parameter | Description | Access | Type | Default |
| --- | --- | --- | --- | --- |
| sensedValue | An input value. | ReadWrite | Real64 | -1 |
| size (or n) | Width of encoding (total number of boolean values in the output). | Create | UInt32 | (required entry) |
| activeBits (or w) | The number of true bits in the encoded output SDR. Each output encoding will have a contiguous block of this many 1's. | Create | UInt32 | (required entry) |
| sparsity | An alternative way to specify "activeBits". Sparsity requires that the size also be specified. Specify only one of: activeBits or sparsity. | Create | Real32 | (required entry) |
| resolution | The resolution for the encoder: how wide the quantized bucket is. | Create | Real64 | (required entry) |
| radius | How many buckets there are in the range of possible input values. | Create | Real64 | (required entry) |
| category | If true, the inputs are enumerated categories: the encoder will only encode unsigned integers, and all inputs will have unique, non-overlapping representations. | Create | Boolean | false |
| minValue | The smallest input value expected. | Create | Real64 | -1.0 |
| maxValue | The largest input value expected. | Create | Real64 | +1.0 |
| periodic | Whether the pattern repeats. | Create | Boolean | false |
| clipInput | Whether out-of-range values are clipped to minValue or maxValue. If false, an out-of-range value gives an error. | Create | Boolean | false |

| Input | Description | Data Type |
| --- | --- | --- |
| values | The value to be encoded for the current sample. | Real64 |

| Output | Description | Data Type |
| --- | --- | --- |
| encoded | The encoded bits forming the pattern for this sample. | SDR |
| bucket | The quantized input. The value of the current bucket. This is used by the Classifier while learning. | Real64 |
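To make the contiguous block of 1's concrete, here is a simplified sketch of a non-periodic scalar encoding in plain Python. It is an illustration under stated assumptions, not the htm.core implementation: the function name and the way the start position is derived from `size` and `active_bits` are assumptions chosen for clarity.

```python
def encode_scalar(value, min_value, max_value, size, active_bits, clip_input=False):
    """Simplified, non-periodic scalar encoding: one contiguous run of 1s."""
    if clip_input:
        value = max(min_value, min(max_value, value))
    elif not (min_value <= value <= max_value):
        raise ValueError(f"{value} is outside [{min_value}, {max_value}]")
    # Number of distinct start positions for a run of `active_bits` ones.
    n_buckets = size - active_bits + 1
    fraction = (value - min_value) / (max_value - min_value)
    start = int(round(fraction * (n_buckets - 1)))
    bits = [0] * size
    for i in range(start, start + active_bits):
        bits[i] = 1
    return bits, start


low, _ = encode_scalar(-1.0, -1.0, 1.0, size=20, active_bits=5)
high, _ = encode_scalar(1.0, -1.0, 1.0, size=20, active_bits=5)
print("".join(map(str, low)))   # 11111000000000000000
print("".join(map(str, high)))  # 00000000000000011111
```

Nearby values produce runs that share most of their bits, which is what lets downstream regions treat them as semantically similar.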

RDSEEncoderRegion

An RDSEEncoderRegion encapsulates the RandomDistributedScalarEncoder (RDSE) algorithm, an SDR encoder for numeric or category values. It is similar to the ScalarEncoderRegion except that the minimum and maximum values do not need to be known, and it encodes a hash of the input, which gives a better spread of values.

As a network runs, the client specifies new encoder inputs by setting the "sensedValue" parameter or by connecting a data stream to the region's 'values' input. An assignment to the 'sensedValue' parameter overrides the input stream if both are given. On each compute cycle, the region encodes its input into the output. The algorithm hashes the input, and the hashed value determines the number of buckets. It then encodes this hashed value so that the resulting pattern contains the required number of bits, with no overlapping bits for values in a bucket.

These three (3) members define the total number of bits in the output pattern:

  • radius,
  • category,
  • resolution.

These are mutually exclusive and only one of them should be provided when constructing the encoder.

These two (2) members define the number of bits per bucket in the output:

  • sparsity,
  • activeBits.

These are also mutually exclusive and only one of them should be provided when constructing the encoder.

Note that the output is an array of boolean values in a pattern and is output in an SDR. However, this is not a true SDR in that it is not sparse. This pattern should be passed to an SPRegion to set the sparsity.

| Parameter | Description | Access | Type | Default |
| --- | --- | --- | --- | --- |
| sensedValue | An input value. | ReadWrite | Real64 | -1 |
| size | Width of encoding (total number of boolean values in the output). | Create | UInt32 | (required entry) |
| activeBits | The number of true bits in the encoded output SDR. Each output encoding will have a contiguous block of this many 1's. | Create | UInt32 | (required entry) |
| sparsity | An alternative way to specify "activeBits". Sparsity requires that the size also be specified. Specify only one of: activeBits or sparsity. | Create | Real32 | (required entry) |
| resolution | The resolution for the encoder: how wide the quantized bucket is. | Create | Real64 | (required entry) |
| radius | Two inputs separated by more than the radius have non-overlapping representations. Two inputs separated by less than the radius will in general overlap in at least some of their bits. You can think of this as the radius of the input. | Create | Real64 | (required entry) |
| category | If true, the inputs are enumerated categories: the encoder will only encode unsigned integers, and all inputs will have unique, non-overlapping representations. | Create | Boolean | false |
| seed | Forces different encoders to produce different outputs, even if the inputs and all other parameters are the same. Two encoders with the same seed, parameters, and input will produce identical outputs. The seed 0 is special: it is replaced with a random number. Use a non-zero value if you want the results to be reproducible. | Create | UInt32 | 0 |
| noise | Amount of noise to add to the output SDR. 0.01 is 1%. | Create | Real64 | 0 |

| Input | Description | Data Type |
| --- | --- | --- |
| values | The value to be encoded for the current sample. | Real64 |

| Output | Description | Data Type |
| --- | --- | --- |
| encoded | The encoded bits forming the pattern for this sample. | SDR |
| bucket | The quantized input. The value of the current bucket. This is used by the Classifier while learning. | Real64 |
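The idea of hashing quantized inputs into bit positions can be sketched in a few lines of plain Python. This is a toy illustration only: the function name, the use of SHA-256, and the way neighbouring buckets are combined are all assumptions made for the example and do not reflect the hash or layout used by htm.core.

```python
import hashlib


def rdse_encode(value, size, active_bits, resolution, seed=1):
    """Toy hash-based encoding; not the library's actual hash or layout."""
    bucket = int(round(value / resolution))
    active = set()
    # Hash `active_bits` consecutive bucket indices so that neighbouring
    # values share most of their hashed positions.
    for offset in range(active_bits):
        key = f"{seed}:{bucket + offset}".encode()
        digest = hashlib.sha256(key).digest()
        active.add(int.from_bytes(digest[:8], "big") % size)
    return active


a = rdse_encode(10.0, size=1000, active_bits=20, resolution=1.0)
b = rdse_encode(11.0, size=1000, active_bits=20, resolution=1.0)
c = rdse_encode(500.0, size=1000, active_bits=20, resolution=1.0)
print("overlap of neighbours:", len(a & b))
print("overlap of distant values:", len(a & c))
```

Because positions come from a hash, two different inputs can occasionally collide on the same bit, so the number of active bits in this sketch can be slightly below `active_bits`. No minimum or maximum input value is needed, which is the property the section above describes.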

DateEncoderRegion

The DateEncoderRegion region encapsulates the DateEncoder algorithm. It encodes up to 6 attributes of a timestamp value into an array of 0's and 1's.

The input is a timestamp in Unix date/time format: an integral value representing the number of seconds elapsed since 00:00 hours, Jan 1, 1970 UTC (the Unix epoch). Some platforms (Unix and Linux) allow negative timestamps, which lets times before the epoch be expressed. Other platforms (Windows) allow only positive numbers. If the type time_t on your computer is 32 bits wide, the timestamp cannot represent times after 03:14:07 UTC on Jan 19, 2038. By default time_t is 64-bit on Windows, but on some older 32-bit Linux machines it is 32-bit. Search for "Y2K38" for details.
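As a small illustration of the timestamp format, the following Python snippet converts a UTC date to seconds since the epoch and shows the last moment a signed 32-bit time_t can represent. It uses only the standard library; the example date is arbitrary.

```python
from datetime import datetime, timezone

# Seconds since the Unix epoch for a UTC date/time.
ts = int(datetime(2020, 5, 4, 12, 0, 0, tzinfo=timezone.utc).timestamp())

# Largest value a signed 32-bit time_t can hold, and the moment it represents.
max_32bit = 2**31 - 1
limit = datetime.fromtimestamp(max_32bit, tz=timezone.utc)

print(ts)
print(limit.isoformat())  # 2038-01-19T03:14:07+00:00
```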

The output is an array containing 0's except for a contiguous block of 1's for each attribute member. This is held in an SDR container although technically this is not a sparse representation. It is normally passed to a SpatialPooler which will turn this into a true sparse representation.

The *_width parameters determine if an attribute is used in the encoding. If non-zero, it indicates the number of bits to dedicate to this attribute. The more bits specified relative to the other attributes, the more important that attribute is in the resulting semantic pattern.

| Parameter | Description | Access | Type | Default |
| --- | --- | --- | --- | --- |
| season_width | How many bits apply to the season attribute. | Create | UInt32 | 0 |
| season_radius | How many days are considered a season. | Create | Real32 | 91.5 |
| dayOfWeek_width | How many bits apply to the day-of-week attribute. | Create | UInt32 | 0 |
| dayOfWeek_radius | How many days are considered a day of week. | Create | Real32 | 1.0 |
| weekend_width | How many bits apply to the weekend attribute. | Create | UInt32 | 0 |
| timeOfDay_width | How many bits apply to the timeOfDay attribute. | Create | UInt32 | 0 |
| timeOfDay_radius | How many hours in a time-of-day bucket. | Create | Real32 | 4.0 |
| custom_width | How many bits apply to the custom day attribute. | Create | UInt32 | 0 |
| custom_days | A list of days of the week to be included in the set, e.g. 'mon,tue,fri'. | Create | String | |
| holiday_width | How many bits apply to the holiday attribute. | Create | UInt32 | 0 |
| holiday_days | A JSON-encoded list of holiday dates in the format '[month,day]' or '[year,month,day]', e.g. for two dates, [[12,25],[2020,05,04]]. | Create | String | [[12,25]] |
| verbose | If true, display debug info for each member encoded. | ReadWrite | Boolean | false |
| size | Total width of encoded output. | ReadOnly | UInt32 | 0 |
| noise | The amount of noise to add to the output SDR. 0.01 is 1%. | ReadWrite | Real32 | 0.0 |
| sensedTime | The value to encode. Unix epoch time. Overridden by input 'values'. A value of 0 means current time. | ReadWrite | Int64 | 0.0 |

| Input | Description | Data Type |
| --- | --- | --- |
| values | The value to be encoded for the current sample. | Int64 (time_t) |

| Output | Description | Data Type |
| --- | --- | --- |
| bucket | Quantized samples based on the radius. One sample for each attribute used. Becomes the title for this sample in the Classifier. | Real64 |
| encoded | The encoded bits forming the pattern for this sample. | SDR |
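Since holiday_days is a JSON-encoded list carried inside a string parameter, it can be easier to build it programmatically than by hand. The sketch below assembles a hypothetical parameter set with Python's standard json module. The chosen widths are arbitrary, and whether the region accepts exactly this string form is an assumption to verify against your htm.core version.

```python
import json

# Hypothetical parameter set for a DateEncoderRegion; widths of 0 (the
# default) leave an attribute out of the encoding.
params = {
    "timeOfDay_width": 10,
    "timeOfDay_radius": 4.0,
    "weekend_width": 3,
    "custom_width": 3,
    "custom_days": "mon,tue,fri",
    "holiday_width": 4,
    # holiday_days is itself a JSON-encoded list nested inside the parameters.
    "holiday_days": json.dumps([[12, 25], [2020, 5, 4]]),
}
param_string = json.dumps(params)
print(param_string)
```

Giving timeOfDay more bits than the other attributes, as here, makes time of day weigh more heavily in the resulting pattern.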

SPRegion

The SPRegion encapsulates the HTM Spatial Pooler algorithm described in 'BAMI https://numenta.com/biological-and-machine-intelligence/'. The Spatial Pooler is responsible for creating a sparse distributed representation of the input. Given an input it computes a set of sparse active columns and simultaneously updates its permanence, duty cycles, etc.

| Parameter | Description | Access | Type | Default |
| --- | --- | --- | --- | --- |
| columnCount | Total number of columns (coincidences). This is the output dimension for the SDR. | Create | UInt32 | 0 |
| inputWidth | Maximum size of the 'bottomUpIn' input to the SP. This is the input dimension. The input buffer width is taken from the width of all concatenated output buffers that are connected to the input. | ReadOnly | UInt32 | 0 |
| potentialRadius | This parameter determines the extent of the input that each column can potentially be connected to. This can be thought of as the input bits that are visible to each column, or a 'receptiveField' of the field of vision. A large enough value will result in 'global coverage', meaning that each column can potentially be connected to every input bit. This parameter defines a square (or hyper-square) area: a column will have a max square potential pool with sides of length '2 * potentialRadius + 1'. Default '0'. If 0, during initialization it is set to the value of the inputWidth parameter. | ReadWrite | UInt32 | 16 |
| potentialPct | The percent of the inputs, within a column's potential radius, that a column can be connected to. If set to 1, the column will be connected to every input within its potential radius. This parameter is used to give each column a unique potential pool when a large potentialRadius causes overlap between the columns. At initialization time we choose ((2 * potentialRadius + 1) ^ (# inputDimensions) * potentialPct) input bits to comprise the column's potential pool. | ReadWrite | Real32 | 0.5 |
| globalInhibition | If true, then during the inhibition phase the winning columns are selected as the most active columns from the region as a whole. Otherwise, the winning columns are selected with respect to their local neighborhoods. Using global inhibition boosts performance by a factor of 60. | ReadWrite | Boolean | true |
| localAreaDensity | The desired density of active columns within a local inhibition area (the size of which is set by the internally calculated inhibitionRadius, which is in turn determined from the average size of the connected potential pools of all columns). The inhibition logic will ensure that at most N columns remain ON within a local inhibition area, where N = localAreaDensity * (total number of columns in inhibition area). | ReadWrite | Real32 | 0 |
| stimulusThreshold | The minimum number of synapses that must be on in order for a column to turn ON. The purpose of this is to prevent noise input from activating columns. Specified as a percent of a fully grown synapse. | ReadWrite | UInt32 | 0 |
| synPermInactiveDec | The amount by which an inactive synapse is decremented in each round. Specified as a percent of a fully grown synapse. | ReadWrite | Real32 | 0.008 |
| synPermActiveInc | The amount by which an active synapse is incremented in each round. Specified as a percent of a fully grown synapse. | ReadWrite | Real32 | 0.05 |
| synPermConnected | The default connected threshold. Any synapse whose permanence value is above the connected threshold is a 'connected synapse', meaning it can contribute to the cell's firing. | ReadWrite | Real32 | 0.1 |
| minPctOverlapDutyCycles | A number between 0 and 1.0, used to set a floor on how often a column should have at least stimulusThreshold active inputs. Periodically, each column looks at the overlap duty cycle of all other columns within its inhibition radius and sets its own internal minimal acceptable duty cycle to: minPctDutyCycleBeforeInh * max(other columns' duty cycles). On each iteration, any column whose overlap duty cycle falls below this computed value will get all of its permanence values boosted up by synPermActiveInc. Raising all permanences in response to a sub-par duty cycle before inhibition allows a cell to search for new inputs when either its previously learned inputs are no longer ever active, or when the vast majority of them have been 'hijacked' by other columns. | ReadWrite | Real32 | 0.001 |
| dutyCyclePeriod | The period used to calculate duty cycles. Higher values make it take longer to respond to changes in boost or synPerConnectedCell. Shorter values make it more unstable and likely to oscillate. | ReadWrite | Real32 | 0.0 |
| boostStrength | A number greater than or equal to 0.0, used to control the strength of boosting. No boosting is applied if it is set to 0. Boosting strength increases as a function of boostStrength. Boosting encourages columns to have similar activeDutyCycles as their neighbors, which will lead to more efficient use of columns. However, too much boosting may also lead to instability of SP outputs. | ReadWrite | Real32 | 0.0 |
| seed | Seed for our own pseudo-random number generator. | Create | Int32 | 1 |
| spVerbosity | Verbosity level: 0, 1, 2, or 3 for SP. | ReadWrite | Int32 | 0 |
| wrapAround | Determines if inputs at the beginning and end of an input dimension should be considered neighbors when mapping columns to inputs. | ReadWrite | Boolean | true |
| spInputNonZeros | The indices of the non-zero inputs to the spatial pooler. | ReadOnly | SDR | |
| spOutputNonZeros | The indices of the non-zero outputs from the spatial pooler. | ReadOnly | SDR | |
| learningMode | True if the node is in learning mode. | ReadWrite | Boolean | true |
| activeOutputCount | Number of active elements in bottomUpOut output. | ReadOnly | UInt32 | 0 |

| Input | Description | Data Type |
| --- | --- | --- |
| bottomUpIn | The input pattern. | SDR |

| Output | Description | Data Type |
| --- | --- | --- |
| bottomUpOut | The output pattern generated from bottom-up inputs. | SDR |
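Two of the formulas quoted in the parameter descriptions are easy to evaluate by hand. The sketch below does so in plain Python; the function names are hypothetical, and the results are left as unrounded floats because the text does not say how the library rounds them.

```python
def potential_pool_size(potential_radius, n_input_dims, potential_pct):
    """Input bits in a column's potential pool, per the potentialPct formula."""
    side = 2 * potential_radius + 1
    return (side ** n_input_dims) * potential_pct


def max_active_columns(local_area_density, columns_in_area):
    """Upper bound N on columns left ON inside one inhibition area."""
    return local_area_density * columns_in_area


# A 1-D input with potentialRadius 16 and potentialPct 0.5.
print(potential_pool_size(16, 1, 0.5))
# A density of 2% over an inhibition area of 2048 columns.
print(max_active_columns(0.02, 2048))
```

With these example values, a column sees a window of 33 input bits and may connect to about half of them, while about 41 of 2048 columns can remain active in the inhibition area.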

TMRegion

The TMRegion region implements the temporal memory algorithm as described in 'BAMI https://numenta.com/biological-and-machine-intelligence/'.
The implementation here attempts to closely match the pseudocode in the documentation.

| Parameter | Description | Access | Type | Default |
| --- | --- | --- | --- | --- |
| numberOfCols | Number of mini-columns in the region. This value needs to be the same as the number of columns in the input from SP. Normally this value is derived from the input width but if provided, this parameter must be the same total size as the input. | Create | UInt32 | (required input) |
| cellsPerColumn | The number of cells per mini-column. | Create | UInt32 | 32 |
| activationThreshold | Number of synapses that must be active to activate a segment. | Create | UInt32 | 13 |
| initialPermanence | Initial permanence for newly created synapses. | Create | Real32 | 0.21 |
| connectedPermanence | | Create | Real32 | 0.5 |
| minThreshold | Minimum number of active synapses for a segment to be considered during search for the best-matching segments. | Create | UInt32 | 10 |
| maxNewSynapseCount | The max number of synapses added to a segment during learning. | Create | UInt32 | 20 |
| permanenceIncrement | Active synapses get their permanence counts incremented by this value. | Create | Real32 | 0.1 |
| permanenceDecrement | All other synapses get their permanence counts decremented by this value. | Create | Real32 | 0.1 |
| predictedSegmentDecrement | Predicted segment decrement. A good value is just a bit larger than (the column-level sparsity * permanenceIncrement). So, if column-level sparsity is 2% and permanenceIncrement is 0.01, this parameter should be something like 4% * 0.01 = 0.0004. | Create | Real32 | 0.0 |
| maxSegmentsPerCell | The maximum number of segments allowed on a cell. This is used to turn on 'fixed size CLA' mode. When in effect, 'globalDecay' is not applicable and must be set to 0 and 'maxAge' must be set to 0. When this is used (> 0), 'maxSynapsesPerSegment' must also be > 0. | ReadOnly | UInt32 | 255 |
| maxSynapsesPerSegment | The maximum number of synapses allowed in a segment. | ReadWrite | UInt32 | 255 |
| seed | Random number generator seed. The seed affects the random aspects of initialization like the initial permanence values. A fixed value ensures a reproducible result. | Create | UInt32 | 42 |
| inputWidth | Width of the bottomUpIn input. Will be 0 if no links. | ReadOnly | UInt32 | |
| learningMode | True if the node is learning. | Create | Boolean | true |
| activeOutputCount | Number of active elements. | ReadOnly | UInt32 | |
| anomaly | The anomaly score computed for the current iteration. This is the same as the output anomaly. | ReadOnly | Real32 | |
| orColumnOutputs | True if the bottomUpOut is to be aggregated by column by logically ORing all cells in a column. Note that if this is on, the buffer width is numberOfCols rather than numberOfCols * cellsPerColumn, so it must be set prior to initialization. | Create | Boolean | false |

| Input | Description | Data Type |
| --- | --- | --- |
| bottomUpIn | The input signal, conceptually organized as an image pyramid data structure, but internally organized as a flattened vector. The width should match the output of SP. If numberOfCols is not configured, it is set to this width; otherwise the parameter overrides. | SDR |
| externalPredictiveInputsActive | External extra active bits from an external source. These can come from anywhere and be any size. If provided, the 'externalPredictiveInputs' flag is set to dense buffer size and both extraActive and extraWinners must be provided and have the same dense buffer size. Dimensions are set by source. | SDR |
| externalPredictiveInputsWinners | The winning active bits from an external source. These can come from anywhere and be any size. If provided, the 'externalPredictiveInputs' flag is set to dense buffer size and both extraActive and extraWinners must be provided and have the same dense buffer size. Dimensions are set by source. | SDR |
| resetIn | A boolean flag that indicates whether or not the input vector received in this compute cycle represents the first training presentation in a new temporal sequence. | SDR |

| Output | Description | Data Type |
| --- | --- | --- |
| bottomUpOut | The output signal generated from the bottom-up inputs from lower levels. The width is 'numberOfCols' * 'cellsPerColumn' by default; if orColumnOutputs is set, then this returns only numberOfCols. The activations come from TM::getActiveCells(). | SDR |
| activeCells | The cells that are active from TM computations. The width is 'numberOfCols' * 'cellsPerColumn'. | SDR |
| predictedActiveCells | The cells that are active and predicted, the winners. The width is 'numberOfCols' * 'cellsPerColumn'. | SDR |
| anomaly | The anomaly score for the current iteration. | Real32 |
| predictiveCells | The cells that are predicted. The width is 'numberOfCols' * 'cellsPerColumn'. | SDR |
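The effect of orColumnOutputs on the bottomUpOut width, and what ORing cells by column means, can be sketched as follows. This is plain Python with hypothetical helper names. The mapping from a cell index to its column (integer division by cellsPerColumn) is an assumption about the cell layout, not something stated in this document.

```python
def bottom_up_out_width(number_of_cols, cells_per_column, or_column_outputs):
    """Width of bottomUpOut as described for the orColumnOutputs parameter."""
    if or_column_outputs:
        return number_of_cols
    return number_of_cols * cells_per_column


def or_cells_by_column(active_cells, cells_per_column):
    """Collapse active cell indices to active column indices.

    Assumes cell index = column * cells_per_column + position in column.
    """
    return sorted({cell // cells_per_column for cell in active_cells})


print(bottom_up_out_width(2048, 32, False))        # 65536
print(bottom_up_out_width(2048, 32, True))         # 2048
print(or_cells_by_column([0, 5, 33, 64, 70], 32))  # [0, 1, 2]
```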

FileOutputRegion

FileOutputRegion is a region that takes its input stream and writes it sequentially to a file. Note: this originally was VectorFileEffector. The current input is written (but not flushed) to the file each time the effector is executed. The file format for the file is a space-separated list of numbers, with one vector per line:

       e11 e12 e13 ... e1N
       e21 e22 e23 ... e2N
          :
       eM1 eM2 eM3 ... eMN

| Parameter | Description | Access | Type | Default |
| --- | --- | --- | --- | --- |
| outputFile | Writes output vectors to this file on each run iteration. Will append to any existing data in the file. This parameter must be set at runtime before the first compute is called. Throws an exception if it is not set or the file cannot be written to. | ReadWrite | String | (required input) |

| Input | Description | Data Type |
| --- | --- | --- |
| dataIn | The data to be written to the file. | SDR |

| Command | Description |
| --- | --- |
| flushFile | Flush file data to disk without closing. |
| closeFile | Close the file. |
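For reference, a file in the space-separated layout shown above can be produced or inspected with a few lines of Python. This sketch only mimics the file format. The helper name is hypothetical, and unlike the region it closes (and therefore flushes) the file after every vector.

```python
import os
import tempfile


def append_vector(path, vector):
    """Append one vector as a space-separated line, as in the layout above."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(" ".join(str(x) for x in vector) + "\n")


path = os.path.join(tempfile.mkdtemp(), "output.txt")
append_vector(path, [0, 1, 1, 0])
append_vector(path, [1, 0, 0, 1])
with open(path, encoding="utf-8") as f:
    print(f.read())
```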

FileInputRegion

FileInputRegion is a basic sensor region for reading files containing vectors. Note: this originally was VectorFileSensor.

FileInputRegion reads in a text file containing lists of numbers and outputs these vectors in sequence. The output is updated each time the sensor's compute() method is called. If repeatCount is > 1, then each vector is repeated that many times before moving to the next one. The sensor loops when the end of the vector list is reached. The default file format is as follows (assuming the sensor is configured with N outputs):

       e11 e12 e13 ... e1N
       e21 e22 e23 ... e2N
          :
       eM1 eM2 eM3 ... eMN

In this format the sensor ignores all whitespace in the file, including newlines. If the file contains an incorrect number of floats, the sensor has no way of checking and will silently ignore the extra numbers at the end of the file.

The sensor can also read in comma-separated (CSV) files following the format:

       e11, e12, e13, ... ,e1N
       e21, e22, e23, ... ,e2N
          :
       eM1, eM2, eM3, ... ,eMN

When reading CSV files the sensor expects that each line contains a new vector. Any line containing too few elements or any text will be ignored. If there are more than N numbers on a line, the sensor retains only the first N.
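The two parsing behaviours described above can be sketched in plain Python as follows. These helpers are hypothetical and only mirror the rules as stated in the text (whitespace ignored and trailing extras dropped for the default format; short or non-numeric lines skipped and long lines truncated for CSV). They are not the region's actual parser.

```python
def read_whitespace_vectors(text, n):
    """Split a whitespace-separated stream into vectors of n floats.

    Newlines are ignored; leftover numbers that do not fill a vector are dropped.
    """
    numbers = [float(tok) for tok in text.split()]
    usable = len(numbers) - len(numbers) % n
    return [numbers[i:i + n] for i in range(0, usable, n)]


def read_csv_vectors(text, n):
    """One vector per line; skip short or non-numeric lines, keep first n values."""
    vectors = []
    for line in text.splitlines():
        fields = [f.strip() for f in line.split(",")]
        try:
            values = [float(f) for f in fields]
        except ValueError:
            continue  # line contains text (or is empty)
        if len(values) >= n:
            vectors.append(values[:n])
    return vectors


print(read_whitespace_vectors("1 2 3\n4 5 6 7", 3))
print(read_csv_vectors("1, 2, 3\nlabel, 5, 6\n7, 8\n9, 10, 11, 12", 3))
```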

| Parameter | Description | Access | Type | Default |
| --- | --- | --- | --- | --- |
| vectorCount | The number of vectors currently loaded in memory. | ReadOnly | UInt32 | 0 |
| position | Set or get the current position within the list of vectors in memory. Index of vector last output. Before anything is output, position is -1. Setting a position will position just prior to the requested vector so that the requested value will be the next output from the next call to compute(). If the requested position is outside the range of the vector set, it will wrap to be inside the range. | ReadWrite | UInt32 | -1 |
| repeatCount | Set or get the current repeatCount. Each vector is repeated repeatCount times before moving to the next one. | ReadWrite | UInt32 | 0 |
| recentFile | Writes output vectors to this file on each iteration. Will append to any existing data in the file. This parameter must be set at runtime before the first compute is called. Throws an exception if it is not set or the file cannot be written to. | ReadOnly | String | |
| scalingMode | During an iteration, each vector is adjusted as follows. If X is the data vector, S the scaling vector and O the offset vector, then the node's output is Y[i] = S[i] * (X[i] + O[i]). Scaling is applied according to scalingMode as follows. If 'none', the vectors are unchanged, i.e. S[i] = 1 and O[i] = 0. If 'standardForm', S[i] is 1/standard deviation(i) and O[i] = -mean(i). If 'custom', each component is adjusted according to the vectors specified by the setScale and setOffset commands. | ReadWrite | String | 0 |
| scaleVector | Set or return the current scale vector S. | ReadWrite | Real32 array | 0 |
| offsetVector | Set or return the current offset vector. | ReadWrite | Real32 | 0 |
| activeOutputCount | The number of active outputs of the node. | Create | UInt32 | 0 |
| maxOutputVectorCount | The number of output vectors that can be generated by this sensor under the current configuration. | ReadOnly | UInt32 | 0 |
| hasCategoryOut | If 1, category info is present in data file. | ReadWrite | UInt32 | 0 |
| hasResetOut | If 1, a new sequence reset signal is present in data file. | ReadWrite | UInt32 | 0 |

| Output | Description | Data Type |
| --- | --- | --- |
| dataOut | The data read from the file. | Real32 |

| Command | Description |
| --- | --- |
| loadFile | loadFile <filename> [file_format]. Reads vectors from the specified file, replacing any vectors currently in the list. Position is set to zero. Available file formats are: 0 = unlabeled file with first number = element count; 1 = labeled file with first number = element count (deprecated); 2 = unlabeled file without element count (default); 3 = CSV file. |
| appendFile | appendFile <filename> [file_format]. Reads vectors from the specified file, appending to the current vector list. Position remains unchanged. Available file formats are the same as for the loadFile command. |
| saveFile | saveFile <filename> [format [begin [end]]]. Saves the currently loaded vectors to a file. Typically used for debugging but may be used to convert between formats. |
| Dump | Returns debugging information. |
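The scalingMode formula Y[i] = S[i] * (X[i] + O[i]) and its 'standardForm' variant can be worked through with a tiny data set. The sketch below is plain Python with hypothetical helper names. It assumes the population standard deviation, since the text does not say which definition is used, and it does not handle a component whose standard deviation is zero.

```python
from statistics import mean, pstdev


def standard_form(vectors):
    """Per-component S and O so that Y[i] = S[i] * (X[i] + O[i]) is standardized."""
    columns = list(zip(*vectors))
    offset = [-mean(col) for col in columns]
    # Assumption: population standard deviation; the source does not say which.
    scale = [1.0 / pstdev(col) for col in columns]
    return scale, offset


def apply_scaling(x, scale, offset):
    """Apply Y[i] = S[i] * (X[i] + O[i]) to one vector."""
    return [s * (xi + o) for xi, s, o in zip(x, scale, offset)]


data = [[1.0, 10.0], [3.0, 30.0]]
S, O = standard_form(data)
print(S, O)
print([apply_scaling(x, S, O) for x in data])
```

With 'none', the same `apply_scaling` call reduces to the identity by passing S[i] = 1 and O[i] = 0 for every component.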

ClassifierRegion

This is a wrapper around the SDRClassifier algorithm. Used to map SP and TM output back to original entries. The SDR Classifier takes the form of a single layer classification network (NN). It accepts SDRs as input and outputs a predicted distribution of categories.

Note that this will require many samples of every quantized value or category during the learning phase in order to get reasonable results.

| Parameter | Description | Access | Type | Default |
| --- | --- | --- | --- | --- |
| learn | If true, the classifier is in learning mode. | ReadWrite | Boolean | true |

| Input | Description | Data Type |
| --- | --- | --- |
| bucket | The quantized value of the current sample, one from each encoder if more than one, for the learn step. | Real64 |
| pattern | An SDR output bit pattern for a sample. Usually the output of the SP or TM. | SDR |

| Output | Description | Data Type |
| --- | --- | --- |
| pdf | Probability distribution function (pdf) for each category or bucket. Sorted by title. Warning: buffer length will grow as new bucket values are encountered while learning. | Real64 |
| titles | Quantized values of used samples, which are the titles corresponding to the pdf indexes. Sorted by title. Warning: buffer length will grow with pdf. | Real64 |
| predicted | An index (into pdf and titles) with the highest probability of being the match with the current pattern. | UInt32 |
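The relationship between the three outputs can be illustrated with made-up numbers: 'predicted' is the index of the largest entry in 'pdf', and 'titles' maps that index back to a quantized value. The helper below is a hypothetical plain-Python sketch, not a call into the classifier.

```python
def best_prediction(pdf, titles):
    """Return (index, title, probability) of the most likely bucket."""
    predicted = max(range(len(pdf)), key=lambda i: pdf[i])
    return predicted, titles[predicted], pdf[predicted]


# Made-up example values; a real run would read these from the region outputs.
pdf = [0.05, 0.70, 0.25]
titles = [10.0, 20.0, 30.0]
print(best_prediction(pdf, titles))  # (1, 20.0, 0.7)
```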