
Ascend Branch VLM Bug #962

Open
HwVanICI wants to merge 3 commits into inclusionAI:ascend from HwVanICI:vlm_bug_fix

Conversation

@HwVanICI
Collaborator

@HwVanICI HwVanICI commented Mar 2, 2026

Description

The current ascend branch raises an error when training VLMs. The error comes from importing the string-specified reward function in areal/workflow/vision_rlvr.py. This PR fixes the issue.
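For context, resolving a reward function given as a dotted-path string typically looks like the sketch below. This is an illustrative stand-in for AReaL's own import_from_string utility, not its actual implementation; the real helper's signature and error handling may differ.

```python
import importlib

def import_from_string(path: str):
    """Resolve a "module.submodule.attr" string to the object it names.

    Illustrative equivalent of AReaL's import_from_string utility
    (assumed behavior; the real helper may differ).
    """
    module_name, _, attr_name = path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)

# Example: resolve a stdlib function from its dotted path.
sqrt = import_from_string("math.sqrt")
print(sqrt(9.0))  # 3.0
```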

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not
    work as expected)
  • Documentation update
  • Code refactoring (no functional changes)
  • Performance improvement
  • Test coverage improvement

Checklist

  • I have read the Contributing Guide
  • I have run formatting tools (pre-commit or manual)
  • I have run relevant unit tests and they pass
  • I have added tests for new functionality
  • I have updated documentation if needed
  • My branch is up to date with main
  • This PR introduces breaking changes (if yes, fill out details below)
  • If this PR changes documentation, I have built and previewed it locally with
    jb build docs
  • No critical issues raised by AI reviewers (/gemini review)

Breaking Change Details (if applicable):

Additional Context


Need help? Check the Contributing Guide or ask in
GitHub Discussions!

@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical bug in the ascend branch that prevented Vision-Language Models from training successfully. The fix ensures that reward functions, when specified as strings, are correctly imported and wrapped, allowing the training process to proceed without errors. This improves the robustness of the VLM training workflow.

Highlights

  • Bug Fix: Resolved an error that occurred during VLM (Vision-Language Model) training due to an incorrect import of the reward function.
  • Dynamic Reward Function Loading: Implemented logic to dynamically import the reward function if it is provided as a string, ensuring proper initialization and preventing runtime errors.


Changelog
  • areal/workflow/vision_rlvr.py
    • Imported AsyncRewardWrapper for asynchronous reward handling.
    • Imported import_from_string utility for dynamic module loading.
    • Added conditional logic within arun_episode to dynamically load self.reward_fn if it's a string, then wrap it with AsyncRewardWrapper.
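The lazy-initialization flow described in the changelog can be sketched as follows. AsyncRewardWrapper and VisionRLVRWorkflow are AReaL internals; the stand-in classes here only illustrate the string-to-callable-to-wrapper flow and make no claim about the real class interfaces.

```python
import asyncio
import importlib

class AsyncRewardWrapper:
    """Stand-in for AReaL's AsyncRewardWrapper (assumed behavior):
    exposes a sync reward function through an async call."""

    def __init__(self, fn):
        self.fn = fn

    async def __call__(self, *args, **kwargs):
        return self.fn(*args, **kwargs)

class VisionRLVRWorkflow:
    """Stand-in workflow showing the lazy reward_fn resolution."""

    def __init__(self, reward_fn):
        # reward_fn may be a callable or a dotted-path string.
        self.reward_fn = reward_fn
        self.async_reward_fn = None

    async def arun_episode(self, completion):
        # Lazily resolve a string reward_fn on first use, then wrap it
        # (the pattern this PR adds to vision_rlvr.py).
        if isinstance(self.reward_fn, str):
            module_name, _, attr = self.reward_fn.rpartition(".")
            self.reward_fn = getattr(
                importlib.import_module(module_name), attr
            )
            self.async_reward_fn = AsyncRewardWrapper(self.reward_fn)
        elif self.async_reward_fn is None:
            self.async_reward_fn = AsyncRewardWrapper(self.reward_fn)
        return await self.async_reward_fn(completion)

# Usage: a dotted-path reward function is resolved on the first episode.
wf = VisionRLVRWorkflow("builtins.len")
print(asyncio.run(wf.arun_episode("hello")))  # 5
```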
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@HwVanICI HwVanICI marked this pull request as ready for review March 2, 2026 22:10
@HwVanICI HwVanICI changed the title vlm bug fix Ascend Branch VLM Bug Mar 2, 2026
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request fixes a bug where using a string-based reward function with VisionRLVRWorkflow would cause a crash. The fix correctly adds logic to dynamically import and wrap the reward function. However, this introduces code duplication from the parent RLVRWorkflow class. I've added a suggestion to refactor this duplicated logic to improve maintainability.

Comment on lines +106 to +108
if isinstance(self.reward_fn, str):
self.reward_fn = import_from_string(self.reward_fn)
self.async_reward_fn = AsyncRewardWrapper(self.reward_fn)

Severity: medium

This logic for lazily initializing the reward function is also present in the parent class RLVRWorkflow.arun_episode. This duplication could lead to maintenance issues if the logic needs to be updated in the future.

To improve maintainability and avoid duplication, consider refactoring this block into a protected helper method in the RLVRWorkflow base class and calling it from both arun_episode methods.

For example, you could add the following to areal/workflow/rlvr.py:

class RLVRWorkflow(RolloutWorkflow):
    # ...

    def _initialize_reward_fn(self):
        """Initializes reward_fn from string if necessary."""
        if isinstance(self.reward_fn, str):
            self.reward_fn = import_from_string(self.reward_fn)
            self.async_reward_fn = AsyncRewardWrapper(self.reward_fn)

Then, you could call self._initialize_reward_fn() at the beginning of arun_episode in both RLVRWorkflow and VisionRLVRWorkflow, which would remove the duplicated code.

@github-actions

This pull request has been automatically marked as stale because it has not had recent activity within the last 14 days.

Please add a comment or push new commits to keep it active.

Thank you for your contribution!

@github-actions github-actions bot added the stale label Mar 27, 2026