Modeling
with DP

Contents The examples for this section are in the Teach DP demo (teachdpdem.xls). To solve or change the model you must have the Teach DP addin loaded. Use the Relink buttons command to establish links to the worksheet buttons. 
Selecting the Model option from the menu presents the problem definition dialog shown below. Fields accept the problem name and relevant parameters. Clicking a button on the left selects one of the problem types built into the program. After entering data and pressing the OK button, a worksheet is created with the name specified in the dialog. Many ranges on the worksheet are given Excel names with the preface being the worksheet name, so that name should not be changed after the worksheet has been constructed. . In the greater part of the worksheet the program constructs the dynamic programming model consisting of the state and decision definitions and all of the other components listed in the text. We delay describing those features until later in the section and concentrate on the data structures for the various problem types. When a button is pressed to select a problem type, one of the buttons in the Objective area is selected as the default direction of optimization and the labels for the fields and checkboxes on the middle right of the dialog box are changed to reflect the problem type. The labels shown are those used for the knapsack problem. The default is to maximize the objective function. To change this, simply click on the Minimize button. The three checkboxes at the lower right are common to all problem types. When checked, the Make Random Problem box fills in the data form with random numbers appropriate to the problem type. Of course, these may be changed. The Show Comments adds a small red label to some of the cells of the worksheet. Clicking on a label presents a description of the general purpose of the cell. The Include Bounds button modifies the worksheet to accept lower and upper bounds as described in Section 13.5 of the text. Knapsack Problem As shown in the dialog above,, the knapsack model has three numerical parameters: the number of items, the maximum number of each item that can be packed, and the number of resource constraints. We use the investment problem of Section 12.1 as an example. Pressing OK builds a data structure for the problem as shown below. When the Nonlinear box is checked a matrix of nonlinear item benefits is included on the worksheet. Otherwise only one row is provided for the linear return. Below the benefit area is a row holding the maximum number of each item. In the Resources area, a row is provided for each constraint. 
Binary Knapsack
Line Partitioning Problem
Unbounded Knapsack
Grid
Network
Sequence
Traveling Salesman

A DP model uses states and decisions to describe a sequential decision process. A number of functional expressions define the beginning, middle and end of the process. A quantitative measure determines the costs and/or benefits of the process. When using this DP addin for Excel, the expressions describing the model must be placed into cells as Excel equations. All DP model information is placed in first few columns of the worksheet (the number of columns depends on the number of state variables and decision variables). Regions outside the model definition can be used to hold problem specific data or equations. We show below the Excel worksheet for the Knapsack example. Areas colored light yellow hold equations or data required by the program during the solution process and should not be changed by the user. Ranges outlined in red hold Excel equations that describe the decision process. These equations can be entered by the user, but in the case below they are filled by the computer specifically to solve the knapsack problem. In the table below, we provide a brief description of each of the model areas. In the Build Your Own Model section, the formulas are described in greater detail. 
The data in the yellow areas describe features of the model and the strategy for solution. The Solve button initiates the DP solution process.  
The state vector holds the values of the state variables during the process. This area of the worksheet has no equations but the state cells are varied by the computer during the solution process. We have filled the state with the values (2, 6, 9) to illustrate the equations below.  
The decision vector holds the values of the decision variables. We use the decision 1 (bring one of item 2) for illustration. In general, decisions are described by a vector, but the knapsack problem has only one decision variable. The cells in row 26 hold equations that determine the feasibility of a particular decision. In this case, feasible decisions are within the range specified by Min and Max (0 to 2). 

The Forward Transition Equations describe how the situation changes from one state to the next state when a decision is made. In this case a decision to bring one unit of item 2 results in a state change to (3, 12, 17).  
The Backward Transition Equations move the decision process in the reverse direction. We have filled in the state (3, 12, 17). If the decision entering this state was 1, the previous state must have been (2, 6, 9).  
In this area we provide the benefit function for a decision and the forward and backward recursive equations used for the solution of the DP. The decision benefit is the unit benefit for item 2 times the number of items (6*1=6). We explain the recursive equations later. 

The process begins in an initial state. This region contains the logical expressions that identify an initial state. The given state (2, 6, 9) is not an initial state, so we see the value False in the fourth cell outlined in red. For the example, the initial state must fall within the Min and Max bounds. The only initial state is (1, 0, 0) 

The process ends in a final state. This region contains the logical expressions that identify a final state. The given state (2, 6, 9) is not a final state, so we see the value False in the fourth red square. Again, the bounds determine the final state. In this case, any feasible state with the first value equal to 11 is a final state. 

The process must pass only through feasible states. This region contains, logical expressions that determine state feasibility. For the example, a feasible state falls withint the bounds specified by Min and Max. The specified state is feasible. 

This region provides string expressions for states and decisions. Expressions in the outlined cells construct meaningful statements that define an action. These are used in the presentation of the optimum solution. . 

Pressing the Solve button initiates the DP solution process. When it is complete, the output report at the left is generated.  

This summary at the top of the page shows the number of states and tree entries and the computation times for the parts of the solution process. States are stored in a tree structure that allows easy access to state information during the solution process. 
Operations
Research Models and Methods
by Paul A. Jensen and Jon Bard, University of Texas, Copyright
by the Authors