Solene#12
Conversation
BatyLeo
left a comment
There was a problem hiding this comment.
Thank you for opening the PR, this is a good start!
Most of comment below are minor details, except a couple correctness bugs that need to be addressed (see comments).
In addition to the comments:
- We may want to have a documentation page providing a concise explanation of the algorithm, as well as a tutorial showcasing its use
- Tests for the algorithm are needed, maybe some on the toy contextual argmax, and others on the stochastic VSP to check the different kwargs are correctly provided.
- Currently failing tests are formatting and JET (which is detecting the correctness bugs caused by methods called incorrectly)
| seed=nothing, | ||
| ) | ||
|
|
||
| train_dataset = generate_dataset(benchmark, dataset_size; seed=seed) |
There was a problem hiding this comment.
By default, generate_dataset applied to an ExogenousStochasticBenchmark generates a single context and scenario per instance. Do we want to be able to choose different nb_scenarios and contexts_per_instance values?
There was a problem hiding this comment.
I added nb_scenarios and contexts_per_instance as keyword arguments in train_policy. I am not sure of what it means to generate multiple contexts for an instance. The code for dataset generation is quite heavy now, we may want to do something else
|
|
||
| # Initialize model and create policy | ||
| model = generate_statistical_model(benchmark; seed=seed) | ||
| maximizer = generate_maximizer(benchmark) |
There was a problem hiding this comment.
Depending on the considered benchmark, generate_statistical_model and generate_maximizer may have optional kwargs. Should we optionally take them as input of the train_policy method?
There was a problem hiding this comment.
Maybe maximizer_kwargs, model_kwargs and solver_kwargs could be passed as an argument to train_policy? (assuming that anticipative_solver and parametric_anticipative_solver use the same optional kwargs)
| policy = DFLPolicy(model, maximizer) | ||
|
|
||
| # vector because we store one history per iteration | ||
| histories_per_iteration = MVHistory[] |
There was a problem hiding this comment.
MVHistory is not a concrete type, this may cause type instabilities.
There was a problem hiding this comment.
I suggest to collect histories with map. I'm not sure if it's the right thing to do.
| y, | ||
| instance = sample.context, | ||
| extra = sample.extra | ||
| ) |
There was a problem hiding this comment.
This might cause a bug for some benchmarks. Using the copy constructor instead may avoid errors:
DataSample(sample; y=y)There was a problem hiding this comment.
I tried this but it gives the following error:
ERROR: LoadError: MethodError: no method matching DataSample(::DataSample{@NamedTuple{…}, @NamedTuple{…}, Vector{…}, Nothing, Nothing}; y::Vector{Float64})
The type DataSample exists, but no method is defined for this combination of argument types when trying to construct it.
| extra = sample.extra | ||
| ) | ||
|
|
||
| push!(augmented_dataset, augmented_datasample) |
There was a problem hiding this comment.
This may cause a lot of allocations in combination with the type instability mentioned above. One workaround would be to modify the dataset in place by directly doing:
dataset[i] = augmented_datasampleThis would solve both issues at once.
There was a problem hiding this comment.
Problem: train_dataset is initialised with sampled were y = Nothing. I replaced push! by a map. I am not sure that it is the right thing to do.
A generic implementation of Mirror Descent.
The first iteration is pure imitation learning.