Skip to content

Commit 0a71431

Browse files
authored
Single policy update (#46)
Implements the single update policy heuristic
1 parent d66a73a commit 0a71431

File tree

9 files changed

+283
-2
lines changed

9 files changed

+283
-2
lines changed

docs/src/api.md

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -87,6 +87,13 @@ LocalDecisionStrategy(::Node, ::Vector{VariableRef})
8787
DecisionStrategy(::DecisionVariables)
8888
```
8989

90+
## `heuristics.jl`
91+
### Single policy update
92+
```@docs
93+
randomStrategy
94+
singlePolicyUpdate
95+
```
96+
9097
## `analysis.jl`
9198
```@docs
9299
CompatiblePaths

docs/src/decision-programming/decision-model.md

Lines changed: 6 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -55,6 +55,9 @@ The motivation for using the minimum of these bounds is that it depends on the p
5555
## Lazy Probability Cut
5656
Constraint $(6)$ is a complicating constraint involving all path compatibility variables $x(s)$ and thus adding it directly to the model may slow down the overall solution process. It may be beneficial to instead add it as a *lazy constraint*. In the solver, a lazy constraint is only generated when an incumbent solution violates it. In some instances, this allows the MILP solver to prune nodes of the branch-and-bound tree more efficiently.
5757

58+
## Single Policy Update
59+
To obtain (hopefully good) starting solutions, the SPU heuristic described in [^3] can be used. The heuristic finds a locally optimal strategy in the sense that the strategy cannot be improved by changing any single local strategy. With large problems, the heuristic can quickly provide a solution that would otherwise take very long to obtain.
60+
5861

5962
## Expected Value
6063
The **expected value** objective is defined using the path compatibility variables $x(𝐬)$ and their associated path probabilities $p(𝐬)$ and path utilities $\mathcal{U}(𝐬)$.
@@ -144,6 +147,8 @@ where the parameter $w∈[0, 1]$ expresses the decision maker's **risk tolerance
144147

145148

146149
## References
147-
[^1]: Salo, A., Andelmin, J., & Oliveira, F. (2019). Decision Programming for Multi-Stage Optimization under Uncertainty, 1–35. Retrieved from [http://arxiv.org/abs/1910.09196](http://arxiv.org/abs/1910.09196)
150+
[^1]: Salo, A., Andelmin, J., & Oliveira, F. (2022). Decision programming for mixed-integer multi-stage optimization under uncertainty. European Journal of Operational Research, 299(2), 550-565.
148151

149152
[^2]: Hölsä, O. (2020). Decision Programming Framework for Evaluating Testing Costs of Disease-Prone Pigs. Retrieved from [http://urn.fi/URN:NBN:fi:aalto-202009295618](http://urn.fi/URN:NBN:fi:aalto-202009295618)
153+
154+
[^3]: Hankimaa, H., Herrala, O., Oliveira, F., Tollander de Balsch, J. (2023). DecisionProgramming.jl -- A framework for modelling decision problems using mathematical programming. Retrieved from [https://arxiv.org/abs/2307.13299](https://arxiv.org/abs/2307.13299)

docs/src/examples/pig-breeding.md

Lines changed: 6 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -181,14 +181,19 @@ EV = expected_value(model, diagram, x_s)
181181
@objective(model, Max, EV)
182182
```
183183

184-
and set up the solver and solve the problem.
184+
and set up the solver.
185185

186186
```julia
187187
optimizer = optimizer_with_attributes(
188188
() -> Gurobi.Optimizer(Gurobi.Env()),
189189
"IntFeasTol" => 1e-9,
190190
)
191191
set_optimizer(model, optimizer)
192+
```
193+
194+
Finally, we use the single policy update heuristic to obtain an initial solution and then solve the problem.
195+
```
196+
spu = singlePolicyUpdate(diagram, model, z, x_s)
192197
optimize!(model)
193198
```
194199

examples/n_monitoring.jl

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -76,6 +76,7 @@ optimizer = optimizer_with_attributes(
7676
"IntFeasTol" => 1e-9,
7777
)
7878
set_optimizer(model, optimizer)
79+
7980
optimize!(model)
8081

8182
@info("Extracting results.")

examples/pig_breeding.jl

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -66,6 +66,9 @@ optimizer = optimizer_with_attributes(
6666
"IntFeasTol" => 1e-9,
6767
)
6868
set_optimizer(model, optimizer)
69+
70+
spu = singlePolicyUpdate(diagram, model, z, x_s)
71+
@info("Single policy update found solution $(spu[end][1]) in $(spu[end][2]/1000) seconds.")
6972
optimize!(model)
7073

7174
@info("Extracting results.")

src/DecisionProgramming.jl

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -4,6 +4,7 @@ include("influence_diagram.jl")
44
include("decision_model.jl")
55
include("random.jl")
66
include("analysis.jl")
7+
include("heuristics.jl")
78
include("printing.jl")
89

910
export Node,
@@ -56,6 +57,9 @@ export CompatiblePaths,
5657
value_at_risk,
5758
conditional_value_at_risk
5859

60+
export randomStrategy,
61+
singlePolicyUpdate
62+
5963
export print_decision_strategy,
6064
print_utility_distribution,
6165
print_state_probabilities,

src/heuristics.jl

Lines changed: 184 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,184 @@
1+
"""
2+
randomStrategy(diagram::InfluenceDiagram)
3+
4+
Generates a random decision strategy for the problem. Returns the strategy as well as
5+
the expected utility of the strategy and the paths that are compatible with the strategy.
6+
7+
# Arguments
8+
- `diagram::InfluenceDiagram`: Influence diagram structure.
9+
10+
!!! warning
11+
This function does not exclude forbidden paths: the strategy returned by this function might be forbidden if the diagram has forbidden state combinations.
12+
13+
# Examples
14+
```julia
15+
objval, Z, S_active = randomStrategy(diagram)
16+
```
17+
"""
18+
function randomStrategy(diagram::InfluenceDiagram)
19+
20+
# Initialize empty vector for local decision strategies
21+
# Z_d = Vector{LocalDecisionStrategy}[] # Doesn't work for some reason...
22+
Z_d = []
23+
24+
# Loop through all decision nodes and set local decision strategies
25+
for j in diagram.D
26+
I_j = diagram.I_j[j]
27+
28+
# Generate a matrix of correct dimensions to represent the strategy
29+
dims = diagram.S[[I_j; j]]
30+
data = zeros(Int, Tuple(dims))
31+
n_states = size(data)[end]
32+
33+
# For each information state, choose a random decision state
34+
for s_Ij in paths(diagram.S[I_j])
35+
data[s_Ij..., rand(1:n_states)] = 1
36+
end
37+
push!(Z_d, LocalDecisionStrategy(j,data))
38+
end
39+
40+
# Construct a decision strategy and obtain the compatible paths
41+
Z = DecisionStrategy(diagram.D, diagram.I_j[diagram.D], Z_d)
42+
S_active = CompatiblePaths(diagram, Z)
43+
44+
# Calculate the expected utility corresponding to the strategy
45+
EU = sum(diagram.P(s)*diagram.U(s) for s in S_active)
46+
47+
return EU, Z, collect(S_active)
48+
end
49+
50+
51+
function findBestStrategy(diagram, j, s_Ij, S_active, model, EU)
52+
# Check that the model is either a minimization or maximization problem
53+
if objective_sense(model) == MOI.MIN_SENSE
54+
bestsofar = (0, Inf, [])
55+
elseif objective_sense(model) == MOI.MAX_SENSE
56+
bestsofar = (0, -Inf, [])
57+
else
58+
throw("The given model is not a maximization or minimization problem.")
59+
end
60+
61+
# Loop through all decision states and save the one corresponding to the best expected value
62+
for s_j in 1:num_states(diagram,diagram.Names[j])
63+
# Get the expected value corresponding to a strategy where the information state s_Ij maps to s_j
64+
# and the strategy stays otherwise the same. Note that the strategy is represented by the active paths.
65+
EU_new, S_active_new = get_value(diagram, S_active, j, s_j, s_Ij, EU)
66+
67+
# Update the best value so far
68+
if objective_sense(model) == MOI.MIN_SENSE
69+
if EU_new <= bestsofar[2]
70+
bestsofar = (s_j, EU_new, S_active_new)
71+
end
72+
else #objective_sense(model) == MOI.MAX_SENSE
73+
if EU_new >= bestsofar[2]
74+
bestsofar = (s_j, EU_new, S_active_new)
75+
end
76+
end
77+
end
78+
return bestsofar
79+
end
80+
81+
82+
function get_value(diagram, S_active, j, s_j, s_Ij, EU)
83+
I_j = diagram.I_j[j] # Information set of node j
84+
# Loop through all compatible paths and update the ones that correspond to the given information state s_Ij
85+
# and update the expected utility whenever a path is updated
86+
S_active_new = copy(S_active)
87+
for (k, s) in enumerate(S_active)
88+
if s[I_j] == s_Ij
89+
EU -= diagram.P(s)*diagram.U(s)
90+
s_new = [s_j for s_j in s]
91+
s_new[j] = s_j
92+
s_new = Tuple(s_new)
93+
S_active_new[k] = s_new
94+
EU += diagram.P(s_new)*diagram.U(s_new)
95+
end
96+
end
97+
98+
return EU, S_active_new
99+
end
100+
101+
function set_MIP_start(diagram, Z, S_active, z, x_s)
102+
for (k,j) in enumerate(Z.D)
103+
for s_Ij in paths(diagram.S[Z.I_d[k]])
104+
set_start_value(z.z[k][s_Ij..., Z.Z_d[k](s_Ij)], 1)
105+
end
106+
end
107+
108+
for s in S_active
109+
set_start_value(x_s[s], 1)
110+
end
111+
end
112+
113+
"""
114+
singlePolicyUpdate(diagram::InfluenceDiagram, model::Model)
115+
116+
Finds a feasible solution using single policy update and sets the model start values to that solution.
117+
Returns a vector of tuples consisting of the value of each improved solution starting from a random policy,
118+
time (in milliseconds) since the function call and the decision strategy that gave the improved value.
119+
The purpose of all this output is to allow us to examine how fast the method finds good solutions.
120+
121+
# Arguments
122+
- `diagram::InfluenceDiagram`: Influence diagram structure.
123+
- `model::Model`: The decision model, modelled in JuMP
124+
- `z::DecisionVariables`: The decision variables
125+
- `x_s::PathCompatibilityVariables`: The path compatibility variables
126+
127+
!!! warning
128+
This function does not exclude forbidden paths: the strategies explored by this function might be forbidden if the diagram has forbidden state combinations.
129+
130+
# Examples
131+
```julia
132+
solutionhistory = singlePolicyUpdate(diagram, model)
133+
```
134+
"""
135+
function singlePolicyUpdate(diagram::InfluenceDiagram, model::Model, z::DecisionVariables, x_s::PathCompatibilityVariables)
136+
t1 = time_ns() # Start time
137+
138+
# Initialize empty values
139+
solutionhistory = []
140+
lastchange = nothing
141+
142+
# Get an initial (random) solution
143+
EU, strategy, S_active = randomStrategy(diagram)
144+
push!(solutionhistory, (EU, (time_ns()-t1)/1E6, deepcopy(strategy)))
145+
146+
# In principle, this always converges, but we set a maximum number of iterations anyway to avoid very long solution times
147+
for iter in 1:20
148+
# Loop through all nodes
149+
for (idx, j) in enumerate(diagram.D)
150+
# println("Node $(diagram.Names[j]), iteration $iter")
151+
I_j = diagram.I_j[j]
152+
# Loop through all information states
153+
for s_Ij in paths(diagram.S[I_j])
154+
# Check if any improvement has happened since the last time this node and information state was visited
155+
# If not, the algorithm terminates with a locally optimal solution
156+
if iter >= 2
157+
if lastchange == (j, s_Ij)
158+
set_MIP_start(diagram, solutionhistory[end][3], S_active, z, x_s)
159+
return solutionhistory
160+
end
161+
end
162+
163+
# Find the best decision alternative s_j for information state s_Ij
164+
s_j, bestval, S_active = findBestStrategy(diagram, j, s_Ij, S_active, model, EU)
165+
166+
# If the strategy improved, save the new strategy and its expected utility
167+
if (objective_sense(model) == MOI.MIN_SENSE && bestval < EU-1E-9) || (objective_sense(model) == MOI.MAX_SENSE && bestval > EU+1E-9)
168+
lastchange = (j, s_Ij)
169+
localstrategy = strategy.Z_d[idx].data
170+
localstrategy[s_Ij..., :] .= 0
171+
localstrategy[s_Ij..., s_j] = 1
172+
strategy.Z_d[idx] = LocalDecisionStrategy(j, localstrategy)
173+
EU = bestval
174+
push!(solutionhistory, (EU, (time_ns()-t1)/1E6, deepcopy(strategy)))
175+
end
176+
end
177+
end
178+
end
179+
180+
# Set the best found solution as the MIP start to the model
181+
set_MIP_start(diagram, solutionhistory[end][3], S_active, z, x_s)
182+
183+
return solutionhistory
184+
end

test/heuristics.jl

Lines changed: 71 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,71 @@
1+
using Test, Logging, Random, JuMP
2+
using DecisionProgramming
3+
4+
5+
@info("Creating a pig farm problem with 3 decision stages")
6+
const N = 4
7+
8+
@info("Creating the influence diagram.")
9+
diagram = InfluenceDiagram()
10+
11+
add_node!(diagram, ChanceNode("H1", [], ["ill", "healthy"]))
12+
for i in 1:N-1
13+
# Testing result
14+
add_node!(diagram, ChanceNode("T$i", ["H$i"], ["positive", "negative"]))
15+
# Decision to treat
16+
add_node!(diagram, DecisionNode("D$i", ["T$i"], ["treat", "pass"]))
17+
# Cost of treatment
18+
add_node!(diagram, ValueNode("C$i", ["D$i"]))
19+
# Health of next period
20+
add_node!(diagram, ChanceNode("H$(i+1)", ["H$(i)", "D$(i)"], ["ill", "healthy"]))
21+
end
22+
add_node!(diagram, ValueNode("MP", ["H$N"]))
23+
24+
generate_arcs!(diagram)
25+
26+
# Add probabilities for node H1
27+
add_probabilities!(diagram, "H1", [0.1, 0.9])
28+
29+
# Declare proability matrix for health nodes H_2, ... H_N-1, which have identical information sets and states
30+
X_H = ProbabilityMatrix(diagram, "H2")
31+
X_H["healthy", "pass", :] = [0.2, 0.8]
32+
X_H["healthy", "treat", :] = [0.1, 0.9]
33+
X_H["ill", "pass", :] = [0.9, 0.1]
34+
X_H["ill", "treat", :] = [0.5, 0.5]
35+
36+
# Declare proability matrix for test result nodes T_1...T_N
37+
X_T = ProbabilityMatrix(diagram, "T1")
38+
X_T["ill", "positive"] = 0.8
39+
X_T["ill", "negative"] = 0.2
40+
X_T["healthy", "negative"] = 0.9
41+
X_T["healthy", "positive"] = 0.1
42+
43+
for i in 1:N-1
44+
add_probabilities!(diagram, "T$i", X_T)
45+
add_probabilities!(diagram, "H$(i+1)", X_H)
46+
end
47+
48+
for i in 1:N-1
49+
add_utilities!(diagram, "C$i", [-100.0, 0.0])
50+
end
51+
52+
add_utilities!(diagram, "MP", [300.0, 1000.0])
53+
54+
generate_diagram!(diagram, positive_path_utility = true)
55+
56+
57+
@info("Creating the decision model.")
58+
model = Model()
59+
z = DecisionVariables(model, diagram)
60+
x_s = PathCompatibilityVariables(model, diagram, z, probability_cut = false)
61+
EV = expected_value(model, diagram, x_s)
62+
@objective(model, Max, EV)
63+
64+
65+
spu = singlePolicyUpdate(diagram, model, z, x_s)
66+
@info("Single policy update found solution $(spu[end][1]) in $(spu[end][2]/1000) seconds.")
67+
68+
@test spu[end][1] <= 726.8121 + 1E-9 # Result is not better than the known optimal value
69+
spu_pairs = zip(spu[1:end-1], spu[2:end])
70+
@test all(pair -> pair[1][1] < pair[2][1], spu_pairs) # Solution values increase
71+
@test all(pair -> pair[1][2] < pair[2][2], spu_pairs) # Times increase

test/runtests.jl

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -4,4 +4,5 @@ using Test
44
include("influence_diagram.jl")
55
include("random.jl")
66
include("decision_model.jl")
7+
include("heuristics.jl")
78
end

0 commit comments

Comments
 (0)