-
Notifications
You must be signed in to change notification settings - Fork 2.5k
fix: handle mixed type merge in usage entries #9121
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
fix: handle mixed type merge in usage entries #9121
Conversation
- Fix TypeError when calling len() on int values during recursive merging - Improve merge logic: when dict and int types conflict, use new value (right value) as override - Refactor code for better readability with clearer variable names and comments - Add comprehensive tests for mixed type merging scenarios (dict/None, dict/int) - Fix recursive merge parameter order bug
chenmoneygithub
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR! But same as @TomeHirata commented in the issue, I could not reproduce the issue.
Dropped a comment on the code change, but basically litellm shouldn't produce different format for the same key in the usage dict for the same model, which is why we made the strong assumption of format match in the code.
dspy/utils/usage_tracker.py
Outdated
|
|
||
| # Case 2: One is dict, the other is not (int/None) | ||
| # Use new value if it's not None (right value takes precedence), otherwise keep old value | ||
| elif old_is_dict != new_is_dict: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would this even happen? it's possible that one is dict and the other is None, but I doubt if they can be of different types. If that's the case, that implies an error on the litellm side, which should provide a unified format for the same model.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@chenmoneygithub Regarding this issue, I was also able to reproduce it in a GitHub Actions environment.
Just as a note, I’m using a DeepSeek model served through a third-party OpenAI-compatible API:
https://github.com/chizukicn/dspy-mini-reproduction/actions/runs/20153004759/job/57849614753?pr=3
|
@chenmoneygithub {
'openai/deepseek/deepseek-v3.2': [
{
'completion_tokens': 108,
'prompt_tokens': 741,
'total_tokens': 849,
'completion_tokens_details': None,
'prompt_tokens_details': None
},
{
'completion_tokens': 158,
'prompt_tokens': 454,
'total_tokens': 612,
'completion_tokens_details': {
'accepted_prediction_tokens': None,
'audio_tokens': None,
'reasoning_tokens': 0,
'rejected_prediction_tokens': None,
'text_tokens': None,
'image_tokens': None
},
'prompt_tokens_details': {
'audio_tokens': 0,
'cached_tokens': 128,
'text_tokens': None,
'image_tokens': None,
'cache_creation_input_tokens': 0,
'cache_read_input_tokens': 0
}
}
]
}After adding the following logic to simulate a first-time tool call failure: first_time = True
def search_weather(city: str):
global first_time
if first_time:
raise RuntimeError("First time")
first_time = False
return {
"city": city,
"weather": random.choice(["sunny", "cloudy", "rainy", "snowy"]),
"temperature": random.randint(0, 40),
"humidity": random.randint(0, 100),
"pressure": random.randint(900, 1100),
"wind_speed": random.randint(0, 100),
"wind_direction": random.choice(["N", "S", "E", "W"]),
"wind_gust": random.randint(0, 100),
"wind_gust_direction": random.choice(["N", "S", "E", "W"]),
"wind_gust_speed": random.randint(0, 100),
}The output of self.usage_data becomes: {
"openai/deepseek/deepseek-v3.2": [
{
"completion_tokens": 129,
"prompt_tokens": 975,
"total_tokens": 1104,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
{
"completion_tokens": 144,
"prompt_tokens": 1418,
"total_tokens": 1562,
"completion_tokens_details": null,
"prompt_tokens_details": {
"audio_tokens": 0,
"cached_tokens": 0,
"text_tokens": null,
"image_tokens": null,
"cache_creation_input_tokens": 0,
"cache_read_input_tokens": 0
}
},
{
"completion_tokens": 121,
"prompt_tokens": 1167,
"total_tokens": 1288,
"completion_tokens_details": {
"accepted_prediction_tokens": null,
"audio_tokens": null,
"reasoning_tokens": 0,
"rejected_prediction_tokens": null,
"text_tokens": null,
"image_tokens": null
},
"prompt_tokens_details": null
}
]
}
Notice that the array now contains three entries, and the first two entries have completion_tokens_details set to None. Looking at the _merge_usage_entries method: def _merge_usage_entries(self, usage_entry1: dict[str, Any] | None, usage_entry2: dict[str, Any] | None) -> dict[str, Any]:
if usage_entry1 is None or len(usage_entry1) == 0:
return dict(usage_entry2)
if usage_entry2 is None or len(usage_entry2) == 0:
return dict(usage_entry1)
result = dict(usage_entry2)
for k, v in usage_entry1.items():
current_v = result.get(k)
if isinstance(v, dict) or isinstance(current_v, dict):
result[k] = self._merge_usage_entries(current_v, v)
else:
result[k] = (current_v or 0) + (v or 0)
return resultAccording to this logic, the first two The new reproduction code can be found here: https://github.com/chizukicn/dspy-mini-reproduction/blob/main/main.py |
close #9120