-
Notifications
You must be signed in to change notification settings - Fork 3
Numerical Optimization
The infrastructure of MrHyDE was designed to reduce the burden for users to be able to access some of the large-scale numerical optimization tools developed within the Rapid Optimization Library (ROL) at Sandia. By leveraging automatic differentiation tools and an abstract framework for solving multiscale/multiphysics problems, MrHyDE automates the construction of discrete adjoints which can then be used to provide gradient information (or Hessian vector products) which is required for higher-order optimization methods. A full derivation of the adjoint-based gradient is beyond the intended scope of this wiki, but we will mention a few key aspects that may impact a user:
- For transient problems, the adjoint problems runs backwards in time and needs to be linearized around the state approximation. Thus, the state solutions needs to be available as the adjoint solver progresses. If an adjoint solve is required, MrHyDE automatically stores the full forward solution at each time node. Checkpointing is a common approach to reduce the memory requirements in storing the forward solution, but this is not implemented in MrHyDE.
- The adjoint solve requires the derivative of the forward residual with respect to the state (the Jacobian) and the derivative of the objective function with respect to the state. The gradient calculation also requires the derivative of the forward residual with respect to the active and/or discretized parameters.
- For these discrete adjoints, the Jacobian for the adjoint system is the transpose of the forward Jacobian. MrHyDE assembles the adjoint Jacobian directly by transposing the row and column indices in the scatter step. No transposes are required at the linear algebra level.
- MrHyDE does not support Hessians or Hessian-vector products at this time (future work).
- The adjoint gradients are supported for multi-step time integrators, but not for multi-stage integrators. This is work in progress.
- For sensor (pointwise) objectives, one must be careful to measure quantities that behave nicely as the mesh is refined. Not everything that can be included in an objective function lives in
$L^{\infty}$ . - For all objectives except for sensors, MrHyDE will measure the objective at each time step and weight this by
$\Delta t$ (see below). This is to maintain consistency as the number of time steps are increased. Sensors are only measured at certain times, so they do not require this re-weighting.
MrHyDE support several different types of objective functions for optimization and inversion. The general form of the objective function is given by:
where
MrHyDE has a customized parameter manager whose sole responsibility is to deal with the various types of parameters available in MrHyDE. These types are:
- Active These are scalar or vector parameters that are eligible for sensitivities, i.e., adjoint-based gradients. Active parameters can be static, i.e., constant over time, or dynamic which may take different values on each time step. The default is always static.
- Inactive These are scalar or vector parameters that are not eligible for sensitivities. These are never updated from their initial values.
- Stochastic These are scalar or vector parameters that are not eligible for sensitivities, but are randomly distributed according to a user-defined distribution. They are assigned new values for each realization of a UQ study.
- Discretized These are parameters/controls that represent discretized fields. Other codes often call these distributed parameters. In MrHyDE, these parameters can use any of the available discretizations. The parameter manager sets up a Panzer DOF manager for these discretized parameters, the linear algebra interface is able to set up up linear algebra objects, i.e., vectors and matrices, associated with them, and sensitivities are available through the postprocess manager. Discretized parameters can also be either static or dynamic. The user can define any of these parameters from within the Parameters sublist in the input file. To define an inactive parameter, we set: Note that MrHyDe does not (yet) have a discretized stochastic parameters, i.e., it cannot handle random fields directly. It can handle random fields in a Karhunen-Loeve expansion, which only requires a basis and a vector of uncertain parameters.
The user can define any of these parameters from within the Parameters sublist in the input file. These can be used within physics modules directly or within strings defined elsewhere in the input file. For vector parameters, you will need to provide which scalar component to use, e.g., rho(2)
.
To define an inactive parameter, we set:
Parameters:
magma:
type: scalar
value: 1.0
usage: inactive
Changing this to an active parameter only requires modifying the usage
flag to active
.
To define two stochastic parameters, one uniform and one Gaussian, we set:
Parameters:
a:
type: scalar
value: 1.0 # not actually used
usage: stochastic
distribution: uniform
min: 1.0
max: 2.0
b:
type: scalar
value: 0.0 # not actually used
usage: stochastic
distribution: Gaussian
mean: 0.0
variance: 1.0
To define a discretized parameter, we set:
Parameters:
magma:
type: HGRAD
order: 1
usage: discretized
initial value: 1.0
To make this parameter dynamic, we add dynamic: true
.
An integrated response is meant for cases where the simulation produces a scalar QoI (as an integrated quantity) and the user wants to find the model input parameters that force the QoI to match a given target value. The misfit for an integrated response takes the following form:
where
where
An example of an integrated response objective is
Postprocess:
compute objective: true
compute sensitivities: true
Objective functions:
obj0:
type: integrated response
response: 'e'
target: 1.0
weight: 0.5
An integrated control attempts to force the state, or some function of the state, to match a given target function over all space and time. This involves matching (or minimizing the misfit) significantly more data than the integrated response. The misfit function for an integrated control is given by:
where
where
An example input file snippet is given below:
Postprocess:
compute objective: true
compute sensitivities: false
Objective functions:
obj0:
type: integrated control
function: '1.0*(e-targ)^2'
weight: 0.0625
A discrete control is similar to an integrated control in the sense that it seeks parameter values that cause the state to match a given target over all space/time. A key difference is in the data that is being matched. The objective function for the discrete control is given by:
where
An example input file snippet is given below:
Postprocess:
compute objective: true
compute sensitivities: true
Objective functions:
obj1:
type: discrete control
weight: 0.5
Sensors are a bit different from the previously described objective functions. These are pointwise (in space and time) measurements of functions of the state (and other variables). Settings this up requires defining both the sensor locations and the data that needs to be matched. One should be careful to only measure quantities that are well-behaved in a pointwise sense, but MrHyDE will allow almost any function to be evaluated pointwise. This capability is the main motivation for the point
forest in the function manager and some functionality in the workset. In some ways, sensors are treated as another quadrature rule, but the sensors are rarely distributed evenly amongst the elements - some have several sensors and some have none. The objective function for sensors is given by:
where
Postprocess:
compute objective: true
Objective functions:
obj0:
type: sensors
sensor points file: sensor_points.dat
sensor data file: sensor_data.dat
save sensor data: false
response: 'e'
weight: 1.0
We see that it is necessary to provide separate files for the sensor locations (physical locations in space) and the data. In the data file, the first row is the measurement times, and subsequent rows give data at these times. Regarding the sensor locations, it is best to avoid placing sensors on element boundaries or at the nodes of the mesh. Many quantities are continuous across elements, but some are not, so there can be variance in the objective function depending on which element is identified as the owner of a sensor.
The regularization functions can be any function of the parameter, but should not depend on the state.
Postprocess:
compute objective: true
Objective functions:
obj_dx:
type: sensors
sensor points file: sensor_points.dat
sensor data file: sensor_dx_data.dat
response: 'dx'
weight: 0.5
Regularization functions:
reg0:
type: integrated
location: volume
function: mufield^2
weight: 1.0e-5
breg0:
type: integrated
location: boundary
boundary name: top
function: '1.0*(grad(disc_trac)[x])^2'
weight: 0.5e-5