feat: add DeepSeek V3.2 model to Fireworks AI provider #10076
Conversation
Review complete. No issues found. This commit adds the Kimi K2 Thinking model. Mention @roomote in a comment to request specific changes to this pull request or fix all unresolved issues.
@roomote please make sure you add the thinking flag to this model as well.

Added.
The way you added reasoning is not working. Could you check the following Fireworks AI documentation? https://docs.fireworks.ai/guides/reasoning#reasoning

# Reasoning

For thinking/reasoning models, Fireworks provides access to the model's reasoning content.

## Prerequisites

We recommend using the Fireworks Python SDK to work with these models:

```bash
poetry add --pre fireworks-ai
# or
uv add --pre fireworks-ai
```

## Basic usage

Select a reasoning model from our serverless model library.

```python
from fireworks import Fireworks

client = Fireworks()

completion = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "What is 25 * 37?",
        }
    ],
    model="accounts/fireworks/models/<reasoning-model>",
)

for choice in completion.choices:
    # Access the reasoning content (thinking process)
    if choice.message.reasoning_content:
        print("Reasoning:", choice.message.reasoning_content)
    print("Answer:", choice.message.content)
```

## Controlling reasoning effort

You can control the reasoning token length using the `reasoning_effort` parameter:

```python
completion = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Solve this step by step: If a train travels at 60 mph for 2.5 hours, how far does it go?",
        }
    ],
    model="accounts/fireworks/models/<reasoning-model>",
    reasoning_effort="medium",
)
```

See the reasoning_effort parameter for more details.

## Streaming with reasoning content

When streaming, the reasoning content is available in each chunk's delta:

```python
from fireworks import Fireworks

client = Fireworks()

stream = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "What is the square root of 144?",
        }
    ],
    model="accounts/fireworks/models/<reasoning-model>",
    reasoning_effort="medium",
    stream=True,
)

reasoning_parts = []
content_parts = []

for chunk in stream:
    delta = chunk.choices[0].delta
    if delta.reasoning_content:
        reasoning_parts.append(delta.reasoning_content)
    if delta.content:
        content_parts.append(delta.content)

print("Reasoning:", "".join(reasoning_parts))
print("Answer:", "".join(content_parts))
```

## Interleaved thinking

When building multi-turn tool-calling agents with models that support interleaved thinking, preserve the reasoning content from previous turns.
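The delta-accumulation logic from the streaming example can be exercised without calling the API by simulating chunks. In the sketch below, `FakeDelta` is a made-up stand-in for the SDK's streaming delta object, not part of the Fireworks SDK; only the accumulation pattern mirrors the documentation above.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-in for the SDK's streaming delta object.
@dataclass
class FakeDelta:
    reasoning_content: Optional[str] = None
    content: Optional[str] = None

def collect(deltas):
    """Accumulate reasoning text and answer text from streamed deltas."""
    reasoning_parts, content_parts = [], []
    for delta in deltas:
        if delta.reasoning_content:
            reasoning_parts.append(delta.reasoning_content)
        if delta.content:
            content_parts.append(delta.content)
    return "".join(reasoning_parts), "".join(content_parts)

# Simulated stream: reasoning deltas arrive before answer deltas.
fake_stream = [
    FakeDelta(reasoning_content="sqrt(144) "),
    FakeDelta(reasoning_content="is 12."),
    FakeDelta(content="The answer "),
    FakeDelta(content="is 12."),
]
reasoning, answer = collect(fake_stream)
print("Reasoning:", reasoning)  # Reasoning: sqrt(144) is 12.
print("Answer:", answer)        # Answer: The answer is 12.
```

The same `collect` helper works unchanged against real chunks by passing `chunk.choices[0].delta` for each chunk.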
Here's how to preserve reasoning context using both approaches:

```python
# First turn: Get a response with reasoning_content
first_response = client.chat.completions.create(
    messages=[{"role": "user", "content": "What is 15 + 27?"}],
    model="accounts/fireworks/models/<reasoning-model>",
    tools=tools,
)
```

The following script demonstrates this behavior and validates that the reasoning content from previous turns is passed back to the model:

```python
"""Test that reasoning_content is passed in multi-turn conversations.

This test proves that reasoning_content from previous turns is included
in subsequent requests by examining the raw prompt sent to the model.
"""
from fireworks import Fireworks
from dotenv import load_dotenv

load_dotenv()

client = Fireworks()

MODEL = "accounts/fireworks/models/kimi-k2-thinking"
# MODEL = "accounts/fireworks/models/minimax-m2"

# Define tools to enable interleaved thinking
tools = [
    {
        "type": "function",
        "function": {
            "name": "calculator",
            "description": "Perform basic arithmetic operations",
            "parameters": {
                "type": "object",
                "properties": {
                    "operation": {
                        "type": "string",
                        "enum": ["add", "subtract", "multiply", "divide"],
                    },
                    "a": {"type": "number"},
                    "b": {"type": "number"},
                },
                "required": ["operation", "a", "b"],
            },
        },
    }
]


def print_header(title: str, char: str = "═", width: int = 60):
    """Print a formatted section header."""
    print(f"\n{char * width}")
    print(f" {title}")
    print(f"{char * width}")


def print_field(label: str, value: str, indent: int = 2):
    """Print a labeled field with optional indentation."""
    prefix = " " * indent
    print(f"{prefix}{label}: {value}")


def print_multiline(label: str, content: str, max_preview: int = 200, indent: int = 2):
    """Print multiline content with a label and optional truncation."""
    prefix = " " * indent
    print(f"{prefix}{label}:")
    preview = content[:max_preview] + "..." if len(content) > max_preview else content
    for line in preview.split("\n"):
        print(f"{prefix}  │ {line}")


# First turn - get a response with reasoning_content
print_header("FIRST TURN", "═")

first_response = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "What is 15 + 27?",
        }
    ],
    model=MODEL,
    tools=tools,
)

print_field("📝 Content", first_response.choices[0].message.content or "(none)")
reasoning = first_response.choices[0].message.reasoning_content
print_multiline("💭 Reasoning", reasoning)

# Print tool calls from the first response
tool_calls = first_response.choices[0].message.tool_calls
assert tool_calls, "No tool calls in first response!"
print(f"\n  🔧 Tool Calls ({len(tool_calls)}):")
for i, tc in enumerate(tool_calls, 1):
    print(f"    [{i}] id={tc.id}")
    print(f"        function={tc.function.name}")
    print(f"        arguments={tc.function.arguments}")

tool_call_id = first_response.choices[0].message.tool_calls[0].id

# Verify we got reasoning_content
assert reasoning and len(reasoning) > 0, "No reasoning_content in first response!"
print("\n  ✓ First response has reasoning_content")

# Second turn - include the first assistant message
print_header("SECOND TURN", "═")

second_response = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "What is 15 + 27?",
        },
        first_response.choices[0].message,  # Includes reasoning_content
        {"role": "tool", "content": "42", "tool_call_id": tool_call_id},
    ],
    model=MODEL,
    tools=tools,
    raw_output=True,
)

print_field("📝 Answer", second_response.choices[0].message.content or "(none)")

# Extract and display the raw prompt that was sent to the model
raw_prompt = second_response.choices[0].raw_output.prompt_fragments[0]
print_header("RAW PROMPT SENT TO MODEL", "─")
print(raw_prompt)

# Check if reasoning_content from first turn is in the raw prompt
has_reasoning_content = reasoning[:50] in raw_prompt

print_header("RESULT", "═")
if has_reasoning_content:
    print("  ✅ SUCCESS: reasoning_content IS included in subsequent requests!")
else:
    print("  ❌ FAILED: reasoning_content not found in raw prompt")
print()
```

Below is the expected output:

```txt
Kimi-K2-Thinking (output)
════════════════════════════════════════════════════════════
 FIRST TURN
════════════════════════════════════════════════════════════
  📝 Content: (none)
  💭 Reasoning:
    │ The user is asking for a simple addition calculation: 15 + 27.
    │
    │ I should use the calculator function with:
    │ - operation: "add"
    │ - a: 15
    │ - b: 27
...
```
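Independent of the SDK, the second-turn message list the script relies on can be assembled from plain dicts. The sketch below uses made-up ids and text purely for illustration; the point is the shape: the assistant turn carries `reasoning_content` and `tool_calls`, and each `tool` message references a matching `tool_call_id`.

```python
# Hypothetical first-turn assistant message, shaped like the API response
# (the id and reasoning text are invented for illustration).
assistant_msg = {
    "role": "assistant",
    "content": None,
    "reasoning_content": "I should call the calculator with add(15, 27).",
    "tool_calls": [
        {
            "id": "call_001",
            "type": "function",
            "function": {
                "name": "calculator",
                "arguments": '{"operation": "add", "a": 15, "b": 27}',
            },
        }
    ],
}

# Second-turn request body: echo the assistant message (including its
# reasoning_content) and answer each tool call with a tool message.
messages = [
    {"role": "user", "content": "What is 15 + 27?"},
    assistant_msg,
    {"role": "tool", "content": "42",
     "tool_call_id": assistant_msg["tool_calls"][0]["id"]},
]

# Every tool message must reference an id from the assistant's tool_calls.
tool_ids = {tc["id"] for tc in assistant_msg["tool_calls"]}
assert all(m["tool_call_id"] in tool_ids
           for m in messages if m["role"] == "tool")
print("message roles:", [m["role"] for m in messages])
```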
Fixed reasoning support for DeepSeek V3.1/V3.2 on Fireworks to use `reasoning_effort`:

- Changed from `supportsReasoningBinary` to `supportsReasoningEffort` for DeepSeek V3.1 and V3.2 models on Fireworks
- Added default `reasoningEffort: "medium"` for both models
- Override `createStream` in `FireworksHandler` to use `getModelParams`, which passes the `reasoning_effort` parameter (Fireworks API style) instead of the `thinking` parameter (Anthropic style)
- Added tests to verify `reasoning_effort` is passed correctly
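In Python terms (the provider itself is TypeScript, so the dicts below merely mirror the field names described above), the change amounts to swapping a binary thinking flag for an effort level with a default. This is a sketch of the idea, not the provider's actual code:

```python
# Hypothetical mirrors of the provider's model metadata; the real
# definitions live in the TypeScript source (fireworks.ts).
before = {"supportsReasoningBinary": True}
after = {"supportsReasoningEffort": True, "reasoningEffort": "medium"}

def reasoning_params(model_info, enabled=True):
    """Build request kwargs for an effort-style reasoning model."""
    if enabled and model_info.get("supportsReasoningEffort"):
        return {"reasoning_effort": model_info.get("reasoningEffort", "medium")}
    return {}

print(reasoning_params(after))   # {'reasoning_effort': 'medium'}
print(reasoning_params(before))  # {}
```

With the binary flag, the handler had no effort level to forward; with the effort field, the request carries `reasoning_effort` the way the Fireworks API expects.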
It works now with thinking mode, thanks.

```txt
We need to examine DA3's cross-view consistency mechanisms more closely. Let's read the relevant parts of the paper again.

Paper/sections/3-approach.tex
</read_file>

API Error
Unknown API error. Please contact Roo Code support.

Let's update the todo list to mark step 1 as completed and move to step 2.

<update_todo_list>
[x] Analyze theoretical consistency of 3D points across frames
[-] Examine DA3's cross-view consistency mechanisms
[ ] Identify practical error sources
[ ] Propose validation approach
[ ] Create implementation plan

API Error
Unknown API error. Please contact Roo Code support.

We should also note that DA3 is designed for multi-view consistency. The paper mentions that it uses cross-view self-attention to ensure spatially consistent geometry across frames. Now, let's update the todo list to mark step 1 as completed and move to step 2.

<update_todo_list>
[x] Analyze theoretical consistency of 3D points across frames
[-] Examine DA3's cross-view consistency mechanisms
[ ] Identify practical error sources
[ ] Propose validation approach
[ ] Create implementation plan

API Error
Unknown API error. Please contact Roo Code support.
```
Added.

… tool calling reliability

When reasoning mode is enabled, DeepSeek models may not follow the XML tool format correctly. Setting `defaultToolProtocol` to `native` uses OpenAI-style function calling instead, which is more reliable than XML parsing when `reasoning_effort` is enabled.
@roomote Could you please add Kimi K2 Thinking to Fireworks AI as well? It should be compatible with the thinking and native tool calling you just implemented.
Added the Kimi K2 Thinking model.
@roomote
Marked the PR as ready for review. The reviewer found no issues, so maintainers can now review and merge this PR.
Review complete. No issues found. This PR adds DeepSeek V3.2 and Kimi K2 Thinking models to the Fireworks AI provider with proper reasoning support via the `reasoning_effort` parameter. Mention @roomote in a comment to request specific changes to this pull request or fix all unresolved issues.
Hi @hannesrudolph, @mrubens, @jr, @cte, I've compiled and tested the changes locally, and everything is working as expected.
Could you add an option to the DeepSeek AI provider, or change it from the OpenAI SDK to the Anthropic SDK, since DeepSeek has added this capability? The documentation can be found here:

# Anthropic API

To meet the demand for using the Anthropic API ecosystem, our API has added support for the Anthropic API format. With simple configuration, you can integrate the capabilities of DeepSeek into the Anthropic API ecosystem.

## Use DeepSeek in Claude Code

### Install Claude Code

```bash
npm install -g @anthropic-ai/claude-code
```

### Config Environment Variables

```bash
export ANTHROPIC_BASE_URL=https://api.deepseek.com/anthropic
export ANTHROPIC_AUTH_TOKEN=${YOUR_API_KEY}
export API_TIMEOUT_MS=600000
export ANTHROPIC_MODEL=deepseek-chat
export ANTHROPIC_SMALL_FAST_MODEL=deepseek-chat
export CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1
```

### Enter the Project Directory, and Execute Claude Code

```bash
cd my-project
claude
```

## Invoke DeepSeek Model via Anthropic API

### Install Anthropic SDK

```bash
pip install anthropic
```

### Config Environment Variables

```bash
export ANTHROPIC_BASE_URL=https://api.deepseek.com/anthropic
export ANTHROPIC_API_KEY=${DEEPSEEK_API_KEY}
```

### Invoke the API

```python
import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="deepseek-chat",
    max_tokens=1000,
    system="You are a helpful assistant.",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Hi, how are you?"
                }
            ]
        }
    ]
)

print(message.content)
```

## Anthropic API Compatibility Details

### HTTP Header

### Simple Fields

### Tool Fields
| Field | Support Status |
|---|---|
| name | Fully Supported |
| input_schema | Fully Supported |
| description | Fully Supported |
| cache_control | Ignored |

`tool_choice`

| Value | Support Status |
|---|---|
| none | Fully Supported |
| auto | Supported |
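Per the tables above, a tool definition sent through DeepSeek's Anthropic-compatible endpoint should rely only on the fully supported fields: `name`, `input_schema`, and `description` (`cache_control` is ignored). A minimal sketch, reusing the calculator tool from earlier in this thread in Anthropic format:

```python
# Anthropic-format tool using only the fully supported fields from the
# compatibility tables above; cache_control is deliberately omitted
# because DeepSeek's endpoint ignores it.
calculator_tool = {
    "name": "calculator",
    "description": "Perform basic arithmetic operations",
    "input_schema": {
        "type": "object",
        "properties": {
            "operation": {
                "type": "string",
                "enum": ["add", "subtract", "multiply", "divide"],
            },
            "a": {"type": "number"},
            "b": {"type": "number"},
        },
        "required": ["operation", "a", "b"],
    },
}

# Sanity check: nothing beyond the supported field set is present.
supported = {"name", "input_schema", "description"}
assert set(calculator_tool) <= supported
print(sorted(calculator_tool))  # ['description', 'input_schema', 'name']
```

Note the structural difference from the OpenAI-style tools used elsewhere in this PR: Anthropic format is flat (`input_schema` at the top level) rather than nested under a `function` key.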
Thank you for the feature request! However, adding Anthropic SDK support for the DeepSeek provider is out of scope for this PR. This PR specifically adds DeepSeek V3.2 to the Fireworks AI provider. The request to add Anthropic API format support to the DeepSeek provider is a separate, substantial feature that would require its own design and implementation work.

I recommend creating a new GitHub issue to track this feature request separately. That way it can be properly scoped, discussed, and prioritized. The current PR has no issues to fix; the reviewer noted "Review complete. No issues found."
Out of date. |
Related GitHub Issue
Closes: #10071
Description
This PR attempts to address a request from Issue #10071 to add DeepSeek V3.2 to the Fireworks AI provider.
Changes made:
- Added `accounts/fireworks/models/deepseek-v3p2` to the `FireworksModelId` type

Feedback and guidance are welcome.
Test Procedure
```bash
cd src && npx vitest run api/providers/__tests__/fireworks.spec.ts
```

Pre-Submission Checklist
Documentation Updates
Additional Notes
The model configuration is based on the existing DeepSeek V3.1 model specs. The actual Fireworks API model ID follows the naming convention used for other DeepSeek models (v3, v3p1, v3p2).
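The naming convention (dots in the version become `p`) can be made explicit with a tiny helper. This function is hypothetical, written only to illustrate the v3 / v3p1 / v3p2 pattern noted above; it is not part of the PR:

```python
def fireworks_deepseek_id(version: str) -> str:
    """Map a DeepSeek version string to its Fireworks model-id suffix,
    following the v3 / v3p1 / v3p2 naming convention."""
    return "deepseek-v" + version.replace(".", "p")

print(fireworks_deepseek_id("3"))    # deepseek-v3
print(fireworks_deepseek_id("3.1"))  # deepseek-v3p1
print(fireworks_deepseek_id("3.2"))  # deepseek-v3p2
```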
Important
Add DeepSeek V3.2 model to Fireworks AI provider with configuration and tests.
- Adds `deepseek-v3p2` to `FireworksModelId` in `fireworks.ts`.
- Configures `maxTokens: 16384`, `contextWindow: 163840`, `supportsNativeTools: true`, and pricing similar to V3.1.
- Updates `FireworksHandler` in `fireworks.ts` to handle the new model.
- Ensures the `reasoning_effort` parameter is included when enabled.
- Adds tests in `fireworks.spec.ts` for DeepSeek V3.2 configuration and reasoning effort handling.

This description was created by … for d8684b0. You can customize this summary. It will automatically update as commits are pushed.