TGen supports the use of Markov models to allow the user to control how TCP streams are created. TGen uses Markov models for three distinct processes:
- In a traffic action, a Markov model can be used to configure the flow creation process, i.e., the frequency with which new flows should be created. The Markov model specifies inter-flow delay distributions. This model is configured with the flowmodelpath and markovmodelseed attributes on the traffic action.
- In a flow action, a Markov model can be used to configure the stream creation process, i.e., the frequency with which new TCP streams should be created. The Markov model specifies inter-stream delay distributions. This model is configured with the streammodelpath and markovmodelseed attributes on the flow action. It can also be configured on a traffic action in order to apply the model to all flows generated by the action.
- In a stream action, a Markov model can be used to configure the packet creation process on the associated TCP stream, i.e., the frequency with which packets should be created. The Markov model specifies inter-packet delay distributions. This model is configured with the packetmodelpath and markovmodelseed attributes on the stream action. It can also be configured on traffic or flow actions in order to apply the model to all TCP streams generated by the associated flows.
More information about how to set up a Markov model in your TGen configuration file can be found in the doc/TGen-Options.md file.
The remainder of this document explains the Markov model file format that TGen supports, and provides examples of how to generate Markov models that will pass TGen's Markov model validation.
As with the config file, TGen uses the `graphml` file format to represent Markov models. As we explain the structure supported by TGen, we provide examples of generating the corresponding `graphml` elements and attributes using `python3` and the `networkx` python3 module (installing the `TGenTools` toolkit will install the `networkx` module).
Models are constructed as directed graphs:
import networkx
G = networkx.DiGraph()
Generally, a Markov model specifies a set of states and a set of transitions between pairs of states. It also specifies a set of observations and a set of emissions that connect states to observations.
Vertices in the graph can either be Markov model "states" or "observations", and the type is encoded in the graph using the type attribute on the graph node. Each vertex must specify a type and a name. Vertex ids are required and must be unique, but are otherwise not used by TGen.
G.add_node('s0', type="state", name='start')
G.add_node('s1', type="state", name='anything_you_want')
The graph must contain one and only one vertex of type `state` whose `name` is `start`. This tells TGen in which state the Markov model begins. The names of the other vertices of type `state` are insignificant and can be set to any string.
G.add_node('o1', type="observation", name='+')
G.add_node('o2', type="observation", name='-')
Vertices of type `observation` must set one of the following as the `name`, which encodes the action to be taken upon reaching a particular vertex. Valid `name` strings are:
- `+`: Generate a packet from client to server (for packet models) or a new stream (for stream models)
- `-`: Generate a packet from server to client (for packet models) or a new stream (for stream models)
- `F`: Stop generating new packets (for packet models) or new streams (for stream models)
For stream models, there is no difference between `+` and `-`: you can simply use `+` to indicate new stream creation on stream models.
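
The vertex rules above can be checked before handing a model to TGen. The following is a minimal sketch, not TGen's own validation code: `check_vertices` is a hypothetical helper, and it takes a plain dict that maps vertex ids to attribute dicts (with networkx, `dict(G.nodes(data=True))` produces such a dict).

```python
# Observation names accepted according to the list above.
VALID_OBSERVATION_NAMES = {"+", "-", "F"}

def check_vertices(nodes):
    """Return a list of problems found in a {vertex_id: attrs} mapping."""
    problems = []
    start_count = 0
    for vertex_id, attrs in nodes.items():
        vtype = attrs.get("type")
        name = attrs.get("name")
        if vtype is None or name is None:
            problems.append(f"{vertex_id}: missing type or name")
        elif vtype == "state":
            # State names are free-form, but 'start' is special.
            if name == "start":
                start_count += 1
        elif vtype == "observation":
            if name not in VALID_OBSERVATION_NAMES:
                problems.append(f"{vertex_id}: invalid observation name {name!r}")
        else:
            problems.append(f"{vertex_id}: unknown type {vtype!r}")
    if start_count != 1:
        problems.append(f"expected exactly one start state, found {start_count}")
    return problems

nodes = {
    "s0": {"type": "state", "name": "start"},
    "s1": {"type": "state", "name": "anything_you_want"},
    "o1": {"type": "observation", "name": "+"},
    "o2": {"type": "observation", "name": "-"},
}
print(check_vertices(nodes))  # prints [] because every rule is satisfied
```

An empty list means the vertices satisfy the rules described so far; TGen's own validation may check more than this sketch does.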
Edges in the graph can either be Markov model "transitions" or "emissions", and the type is encoded in the graph using the type attribute on the graph edge. Each edge must specify a type and a weight. The source and target vertex ids must match those that were defined when creating the vertices.
G.add_edge('s0', 's1', type='transition', weight=1.0)
G.add_edge('s1', 's1', type='transition', weight=1.0)
Edges of type `transition` instruct TGen how to move between pairs of vertices of type `state`. For vertices of type `state` with multiple outgoing edges of type `transition`, TGen randomly selects one outgoing edge according to the weighted probabilities (each edge's probability is computed by dividing its weight by the sum of the weights of all outgoing `transition` edges from the same `state` vertex).
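
To make the weight normalization concrete, here is a small sketch that computes the transition probabilities for one state. It is an illustration rather than TGen's implementation: `transition_probabilities` is a hypothetical helper, the vertex `s2` is invented for the example, and edges are given as `(source, target, attrs)` triples (the shape that `G.edges(data=True)` yields in networkx).

```python
def transition_probabilities(edges, state):
    """Map each target of `state` to weight / (sum of its outgoing transition weights)."""
    outgoing = [
        (dst, attrs["weight"])
        for src, dst, attrs in edges
        if src == state and attrs["type"] == "transition"
    ]
    total = sum(weight for _, weight in outgoing)
    if total <= 0:
        raise ValueError(f"state {state!r} has no positive transition weight")
    return {dst: weight / total for dst, weight in outgoing}

edges = [
    ("s0", "s1", {"type": "transition", "weight": 1.0}),
    ("s0", "s2", {"type": "transition", "weight": 3.0}),  # 's2' is hypothetical
    ("s1", "o1", {"type": "emission", "weight": 0.5}),     # ignored: not a transition
]
print(transition_probabilities(edges, "s0"))  # {'s1': 0.25, 's2': 0.75}
```

Because only the ratio of weights matters, the weights on a state's outgoing transition edges do not need to sum to 1.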
G.add_edge('s1', 'o1', type='emission', weight=0.5, distribution='normal', param_location=5000000, param_scale=1000000)
G.add_edge('s1', 'o2', type='emission', weight=0.5, distribution='exponential', param_rate=0.001)
Whenever TGen moves between states, there is an associated event observation. Edges of type `emission` instruct TGen how to move between vertices of type `state` and vertices of type `observation`. For vertices of type `state` with multiple outgoing edges of type `emission`, TGen randomly selects one outgoing edge according to the weighted probabilities (each edge's probability is computed by dividing its weight by the sum of the weights of all outgoing `emission` edges from the same `state` vertex).
Once an `emission` edge has been selected, the `observation` vertex connected by the edge instructs TGen which type of action to take.
Additionally, each `emission` edge must specify the `distribution` attribute and the associated parameters for that distribution. These distributions encode the time delay in microseconds that TGen should wait after the observation before transitioning to the next state.
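
The weighted choice of an emission edge can be sketched with the standard library alone. This is an assumption-laden illustration, not TGen's code: `choose_emission` is a hypothetical helper, and each emission is represented as an `(observation_name, attrs)` pair mirroring the edge attributes used in this document.

```python
import random

def choose_emission(emissions, rng):
    """Pick one (observation, attrs) pair with probability proportional to its weight."""
    weights = [attrs["weight"] for _, attrs in emissions]
    return rng.choices(emissions, weights=weights, k=1)[0]

# Emission edges leaving one state, mirroring the example edges above.
emissions = [
    ("+", {"weight": 0.5, "distribution": "normal",
           "param_location": 5000000, "param_scale": 1000000}),
    ("-", {"weight": 0.5, "distribution": "exponential", "param_rate": 0.001}),
]

rng = random.Random(12345)  # fixed seed so repeated runs make the same choice
observation, attrs = choose_emission(emissions, rng)
print(observation, attrs["distribution"])
```

The selected observation name determines the action, and the selected edge's distribution attributes determine the delay that follows it.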
The following delay distributions are currently supported (more can be added as the need arises):
- `uniform`: a uniform distribution requires the attributes `param_low` (a) and `param_high` (b) such that a <= b to generate values uniformly in the range [a, b]
- `normal`: a normal distribution requires the attributes `param_location` (mu) and `param_scale` (sigma)
- `lognormal`: a lognormal distribution requires the attributes `param_location` (mu) and `param_scale` (sigma)
- `exponential`: an exponential distribution requires the attribute `param_rate` (lambda)
- `pareto`: a Pareto distribution requires the attributes `param_scale` (xm) and `param_shape` (alpha)
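
To get a feel for the delays a given set of parameters produces, the distributions can be sampled with Python's `random` module. This is a rough sketch under stated assumptions: `sample_delay` is a hypothetical helper, the parameters are assumed to follow the standard textbook parameterizations (TGen's own sampling code may differ in detail), and clamping negative draws to zero is a choice made here, not documented TGen behavior.

```python
import random

def sample_delay(attrs, rng):
    """Draw one delay (microseconds) from an emission edge's attribute dict."""
    dist = attrs["distribution"]
    if dist == "uniform":
        value = rng.uniform(attrs["param_low"], attrs["param_high"])
    elif dist == "normal":
        value = rng.normalvariate(attrs["param_location"], attrs["param_scale"])
    elif dist == "lognormal":
        # mu and sigma describe the underlying normal distribution.
        value = rng.lognormvariate(attrs["param_location"], attrs["param_scale"])
    elif dist == "exponential":
        value = rng.expovariate(attrs["param_rate"])
    elif dist == "pareto":
        # paretovariate(alpha) has minimum 1, so scale it by xm.
        value = attrs["param_scale"] * rng.paretovariate(attrs["param_shape"])
    else:
        raise ValueError(f"unsupported distribution {dist!r}")
    # Assumption of this sketch: a delay cannot be negative.
    return max(0.0, value)

rng = random.Random(2024)
edge = {"distribution": "exponential", "param_rate": 0.001}
print(sample_delay(edge, rng))
```

Under these assumptions, an exponential distribution with `param_rate=0.001` has a mean delay of 1/0.001 = 1000 microseconds.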
A final graph can be written to a file using:
networkx.write_graphml(G, 'sample.mmodel.graphml')
Below is a full example that combines the code provided above. Another example script, which we use to generate our internal default packet and stream models, can be found in the repository at tools/scripts/generate_mmodel_graphml.py.
import networkx
G = networkx.DiGraph()
G.add_node('s0', type="state", name='start')
G.add_node('s1', type="state", name='anything_you_want')
G.add_node('o1', type="observation", name='+')
G.add_node('o2', type="observation", name='-')
G.add_edge('s0', 's1', type='transition', weight=1.0)
G.add_edge('s1', 's1', type='transition', weight=1.0)
G.add_edge('s1', 'o1', type='emission', weight=0.5, distribution='normal', param_location=5000000, param_scale=1000000)
G.add_edge('s1', 'o2', type='emission', weight=0.5, distribution='exponential', param_rate=0.001)
networkx.write_graphml(G, 'sample.mmodel.graphml')