Conversation

Contributor

@lxasqjc lxasqjc commented Oct 7, 2025

  • Add interactive parameter to A1 class for human confirmation workflow
  • Implement _get_human_confirmation() method with approve/edit/reject options
  • Add plan editing capabilities with multi-line input support
  • Display interactive mode status in agent configuration
  • Maintain backward compatibility with non-interactive mode (default)
  • Add comprehensive documentation with usage examples and API reference

lxasqjc and others added 2 commits October 7, 2025 09:14
Contributor Author

lxasqjc commented Oct 7, 2025

🤝 Human-in-the-Loop Interactive Mode for Biomni Agents

Overview

This PR introduces human-in-the-loop functionality to Biomni agents, enabling interactive sessions where users can review, approve, edit, or reject agent-generated plans before execution. This feature provides users with fine-grained control over agent behavior while maintaining full backward compatibility.

🎯 Key Features

Interactive Mode

  • Human Confirmation Workflow: Users can approve, edit, or reject agent plans before execution
  • Plan Editing: Multi-line editing capabilities with real-time preview
  • User Control: Four options for each agent decision:
    1. ✅ Approve - Proceed with the current plan
    2. ✏️ Edit - Modify the plan before execution
    3. ❌ Reject - Ask the agent to regenerate the plan
    4. 🛑 Stop - Halt execution entirely

Backward Compatibility

  • Default Behavior: Interactive mode is disabled by default (interactive=False)
  • No Breaking Changes: Existing code continues to work without modifications
  • Optional Parameter: Users opt-in to interactive mode explicitly

🔧 Implementation Details

Core Changes

1. Enhanced A1 Class (biomni/agent/a1.py)

class A1:
    def __init__(
        self,
        path: str = "./data",
        interactive: bool = False,  # New parameter
        # ... other parameters
    ):
        ...

2. Human Confirmation Method

def _get_human_confirmation(self, plan: str, plan_type: str) -> tuple[bool, str]:
    """
    Get human confirmation for agent plans with editing capabilities.
    
    Returns:
        tuple[bool, str]: (approved, potentially_modified_plan)
    """
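The method body is not shown in this PR description; a minimal sketch of how such a confirmation loop might look is below. The standalone function form and the injectable `prompt` parameter are assumptions for illustration and testability, not part of Biomni's actual API:

```python
def get_human_confirmation(plan: str, plan_type: str, prompt=input) -> tuple[bool, str]:
    """Sketch of a human confirmation loop.

    `prompt` stands in for input() so the function can be driven in tests;
    the real implementation is a method on A1.
    """
    print(f"\n📋 Generated {plan_type}:\n{'-' * 40}\n{plan}\n{'-' * 40}")
    while True:
        choice = prompt("Your choice (1-4): ").strip()
        if choice == "1":  # approve: execute the plan unchanged
            return True, plan
        if choice == "2":  # edit: collect replacement lines until a blank line
            print("Enter your modifications (press Enter twice to finish):")
            lines = []
            while (line := prompt("")) != "":
                lines.append(line)
            # Whether edits replace or extend the plan is an implementation
            # choice; this sketch replaces it with the user's lines.
            return True, "\n".join(lines)
        if choice == "3":  # reject: signal the caller to regenerate
            return False, plan
        if choice == "4":  # stop: halt execution entirely
            raise KeyboardInterrupt("Execution stopped by user")
        print("Please enter a number from 1 to 4.")
```

The boolean in the return value tells the caller whether to execute; the string carries the (possibly edited) plan, matching the `tuple[bool, str]` contract above.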

3. Configuration Display

  • Interactive mode status shown in agent initialization
  • Clear visual indicators when human confirmation is required

📚 Usage Examples

Basic Interactive Session

from biomni.agent.a1 import A1

# Create an interactive agent
agent = A1(
    path="./data",
    interactive=True,  # Enable human-in-the-loop
    commercial_mode=True
)

# Agent will now ask for confirmation before executing plans
log, response = agent.go("Create a scatter plot of gene expression data")

Interactive Workflow Example

When interactive mode is enabled, users see:

============================================================
🤖 BIOMNI AGENT - CODE CONFIRMATION
============================================================

📋 Generated code:
----------------------------------------
import pandas as pd
import matplotlib.pyplot as plt

# Load gene expression data
data = pd.read_csv('gene_expression.csv')
plt.scatter(data['gene1'], data['gene2'])
plt.show()
----------------------------------------

🤔 What would you like to do?
  1. ✅ Approve and proceed
  2. ✏️  Edit the plan
  3. ❌ Reject and ask agent to regenerate
  4. 🛑 Stop execution

Your choice (1-4): 

Plan Editing Workflow

If the user chooses "Edit":

✏️  Edit mode - Current code:
----------------------------------------
import pandas as pd
import matplotlib.pyplot as plt

# Load gene expression data  
data = pd.read_csv('gene_expression.csv')
plt.scatter(data['gene1'], data['gene2'])
plt.show()
----------------------------------------

Enter your modifications (press Enter twice to finish):
# Add title and labels
plt.title('Gene Expression Correlation')
plt.xlabel('Gene 1 Expression')  
plt.ylabel('Gene 2 Expression')

✅ Plan updated! Proceeding with modified version...
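The transcript above suggests the user's new lines are appended to the generated code; a small helper illustrating that "press Enter twice to finish" flow is sketched below. The append semantics and the `read_line` parameter (a testable stand-in for `input()`) are assumptions based on the example output, not Biomni's actual implementation:

```python
def collect_edits(original: str, read_line) -> str:
    """Read extra lines until an empty line is entered, then append them to
    the original plan, mirroring the editing transcript above."""
    extra = []
    while (line := read_line()) != "":
        extra.append(line)
    return original + "\n" + "\n".join(extra) if extra else original
```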

Non-Interactive Mode (Default)

# Standard usage - no human confirmation required
agent = A1(path="./data")  # interactive=False by default
log, response = agent.go("Analyze protein interaction data")
# Executes automatically without user input

🧪 Testing

The implementation includes comprehensive testing covering:

  • Approval Workflow: Plans approved and executed unchanged
  • Rejection Workflow: Plans rejected and regeneration requested
  • Editing Workflow: Plans modified by user before execution
  • Backward Compatibility: Non-interactive mode works as before
  • Configuration Display: Interactive status properly shown

Test Results

🎯 Overall result: ALL TESTS PASSED

🎉 Human-in-the-loop functionality is working correctly!
   Users can now:
   • Approve agent plans before execution
   • Edit plans to customize behavior  
   • Reject plans they don't want executed

📖 Documentation

New Documentation Added

  • docs/HUMAN_IN_THE_LOOP.md: Comprehensive guide covering:
    • Feature overview and benefits
    • Installation and setup instructions
    • Detailed usage examples
    • API reference
    • Best practices and troubleshooting

API Reference

A1.__init__(
    interactive: bool = False,  # Enable human-in-the-loop mode
    # ... other parameters remain unchanged
)

A1._get_human_confirmation(
    plan: str,      # The generated plan to review
    plan_type: str  # Type of plan (e.g., "code", "analysis")
) -> tuple[bool, str]  # (approved, potentially_modified_plan)

🔄 Workflow Integration

Interactive Mode Flow

  1. Agent Generates Plan: Based on user query
  2. Human Review: User sees formatted plan with options
  3. User Decision: Approve, Edit, Reject, or Stop
  4. Plan Modification (if editing): Multi-line input with preview
  5. Execution: Proceed with approved/modified plan
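The five steps above can be sketched as a single driver loop. Function names and the `max_attempts` cap are illustrative assumptions, not Biomni internals:

```python
def run_with_oversight(query, generate_plan, confirm, execute, max_attempts=3):
    """Generate a plan, ask for human confirmation, and either execute the
    (possibly edited) plan or regenerate on rejection."""
    for attempt in range(max_attempts):
        plan = generate_plan(query, attempt)    # step 1: agent generates a plan
        approved, plan = confirm(plan, "code")  # steps 2-4: review / edit / reject
        if approved:
            return execute(plan)                # step 5: run the approved plan
    raise RuntimeError("Plan rejected too many times")
```

Rejection simply loops back to plan generation, so the agent's regeneration path needs no special state beyond the attempt counter.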

LangGraph Integration

The feature integrates seamlessly with the existing LangGraph StateGraph architecture, adding human confirmation nodes where appropriate without disrupting the agent's decision-making flow.

🎉 Benefits

For Researchers

  • Quality Control: Review agent plans before execution
  • Learning Tool: Understand agent reasoning and decision-making
  • Customization: Modify plans to match specific requirements

For Production Use

  • Risk Mitigation: Human oversight for critical operations
  • Compliance: Meet requirements for human-supervised AI systems
  • Flexibility: Balance automation with human control

For Development

  • Debugging: Step through agent logic interactively
  • Testing: Validate agent behavior with human feedback
  • Iteration: Refine prompts and logic based on interactive sessions

🔧 Configuration

Environment Setup

No additional dependencies required. The feature uses standard Python input/output for user interaction.

Integration Examples

# Research workflow with human oversight
research_agent = A1(
    path="./research_data",
    interactive=True,
    llm="gpt-4o",
    commercial_mode=False  # Full academic datasets
)

# Production workflow with commercial datasets  
production_agent = A1(
    path="./production_data", 
    interactive=True,
    commercial_mode=True,  # Commercial-licensed datasets only
    timeout_seconds=300
)

📝 Files Changed

  • biomni/agent/a1.py: Core interactive functionality implementation
  • docs/HUMAN_IN_THE_LOOP.md: Comprehensive documentation and examples

Ready for Review: This PR is fully tested and backward-compatible. The interactive mode provides powerful human oversight capabilities while maintaining the existing API for non-interactive use cases.

@kexinhuang12345
Collaborator

This is really cool - will test out and merge

