Skip to content

Solene#12

Open
sdelannoypavy wants to merge 4 commits into
JuliaDecisionFocusedLearning:mainfrom
sdelannoypavy:solene
Open

Solene#12
sdelannoypavy wants to merge 4 commits into
JuliaDecisionFocusedLearning:mainfrom
sdelannoypavy:solene

Conversation

@sdelannoypavy
Copy link
Copy Markdown

A generic implementation of Mirror Descent.
The first iteration is pure imitation learning.

Copy link
Copy Markdown
Member

@BatyLeo BatyLeo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for opening the PR, this is a good start!

Most of comment below are minor details, except a couple correctness bugs that need to be addressed (see comments).

In addition to the comments:

  • We may want to have a documentation page providing a concise explanation of the algorithm, as well as a tutorial showcasing its use
  • Tests for the algorithm are needed, maybe some on the toy contextual argmax, and others on the stochastic VSP to check the different kwargs are correctly provided.
  • Currently failing tests are formatting and JET (which is detecting the correctness bugs caused by methods called incorrectly)

Comment thread src/algorithms/MirrorDescent/mirror_descent.jl
Comment thread src/algorithms/MirrorDescent/mirror_descent.jl
seed=nothing,
)

train_dataset = generate_dataset(benchmark, dataset_size; seed=seed)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By default, generate_dataset applied to an ExogenousStochasticBenchmark generates a single context and scenario per instance. Do we want to be able to choose different nb_scenarios and contexts_per_instance values?

Copy link
Copy Markdown
Author

@sdelannoypavy sdelannoypavy Jun 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added nb_scenarios and contexts_per_instance as keyword arguments in train_policy. I am not sure of what it means to generate multiple contexts for an instance. The code for dataset generation is quite heavy now, we may want to do something else


# Initialize model and create policy
model = generate_statistical_model(benchmark; seed=seed)
maximizer = generate_maximizer(benchmark)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depending on the considered benchmark, generate_statistical_model and generate_maximizer may have optional kwargs. Should we optionally take them as input of the train_policy method?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe maximizer_kwargs, model_kwargs and solver_kwargs could be passed as an argument to train_policy? (assuming that anticipative_solver and parametric_anticipative_solver use the same optional kwargs)

policy = DFLPolicy(model, maximizer)

# vector because we store one history per iteration
histories_per_iteration = MVHistory[]
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MVHistory is not a concrete type, this may cause type instabilities.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest to collect histories with map. I'm not sure if it's the right thing to do.

y,
instance = sample.context,
extra = sample.extra
)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might cause a bug for some benchmarks. Using the copy constructor instead may avoid errors:

DataSample(sample; y=y)

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried this but it gives the following error:
ERROR: LoadError: MethodError: no method matching DataSample(::DataSample{@NamedTuple{…}, @NamedTuple{…}, Vector{…}, Nothing, Nothing}; y::Vector{Float64})
The type DataSample exists, but no method is defined for this combination of argument types when trying to construct it.

extra = sample.extra
)

push!(augmented_dataset, augmented_datasample)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may cause a lot of allocations in combination with the type instability mentioned above. One workaround would be to modify the dataset in place by directly doing:

dataset[i] = augmented_datasample

This would solve both issues at once.

Copy link
Copy Markdown
Author

@sdelannoypavy sdelannoypavy Jun 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Problem: train_dataset is initialised with sampled were y = Nothing. I replaced push! by a map. I am not sure that it is the right thing to do.

Comment thread src/algorithms/MirrorDescent/mirror_descent.jl
Comment thread src/DecisionFocusedLearningAlgorithms.jl
Comment thread src/algorithms/MirrorDescent/mirror_descent.jl
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants