[BUG] Ollama streaming adapter drops tool_calls emitted before the done chunk

### 📋 Prerequisites
- [x] Searched existing issues
- [x] Reproducible

### 🐛 Bug Description
The `KAgentOllamaLlm` streaming path in `kagent-adk/src/kagent/adk/models/_ollama.py` only reads `tool_calls` from the chunk where `chunk.done == True`. However, Ollama's `/api/chat` streaming protocol emits `tool_calls` in an **earlier** chunk and then sends a separate final chunk with `done=True, tool_calls=None, content=""`. As a result, when an Agent has `spec.declarative.stream: true` (the default), every tool call the model makes is silently discarded. The agent yields an `LlmResponse` with empty `content.parts: []`, no event is enqueued, and the A2A request hangs in a `dequeue_event` poll loop until the client times out.

### 🔄 Steps to Reproduce
1. Apply a `ModelConfig` pointing at any Ollama-hosted model with native tool calling (`llama3.2:3b`, `qwen2.5:3b`, etc.).
2. Apply a declarative `Agent` with `stream: true` and at least one MCP tool (e.g. the default `my-first-k8s-agent` with `k8s_get_resources`).
3. Send a prompt that should trigger a tool call ("any exciting events in my cluster recently?").
4. Observe: no reply is ever returned to the UI; Phoenix shows an `LlmResponse` with `parts: []` despite non-zero `eval_count`.

### 🔬 Direct evidence

Streaming Ollama with the same tool, hitting the upstream directly:
```
$ curl -s POST /api/chat -d '{"model":"llama3.2:3b","stream":true,"tools":[...],"messages":[...]}'
done=False content=''  tool_calls=[{'function': {'name': 'k8s_get_resources', 'arguments': {'resource_type': 'events'}}}]
done=True  content=''  tool_calls=None
```

The tool call arrives in the **non-final** chunk.

### 🩹 Code location

[`kagent-adk/src/kagent/adk/models/_ollama.py`](https://github.com/kagent-dev/kagent/blob/main/python/packages/kagent-adk/src/kagent/adk/models/_ollama.py) (streaming branch in `generate_content_async`):

```python
async for chunk in response:
    if chunk.message.content:
        aggregated_text += chunk.message.content
        yield LlmResponse(..., partial=True, ...)
    if chunk.done:
        final_parts = []
        if aggregated_text:
            final_parts.append(types.Part.from_text(text=aggregated_text))
        for tc in chunk.message.tool_calls or []:   # ← only the done chunk
            ...
```

Should accumulate `tool_calls` across all chunks:

```python
aggregated_tool_calls: list = []
async for chunk in response:
    if chunk.message.content:
        ...
    if chunk.message.tool_calls:
        aggregated_tool_calls.extend(chunk.message.tool_calls)
    if chunk.done:
        ...
        for tc in aggregated_tool_calls:
            ...
```

The non-streaming branch in the same function handles this correctly — it's only the streaming path that's broken.

### 🩺 Workaround
Set `spec.declarative.stream: false` on the Agent CR. The non-streaming path correctly emits `function_call` parts.

### 💻 Environment
- Chart: `kagent-0.9.4`
- App image: `cr.kagent.dev/kagent-dev/kagent/app:0.9.4`
- `kagent-adk`: 0.3.0
- Ollama backend: tested against `llama3.2:3b` and `gemma3n:e4b` aliases; reproducible with any tool-capable Ollama model
- Kubernetes: kind in devcontainer

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[BUG] Ollama streaming adapter drops tool_calls emitted before the done chunk #1922

📋 Prerequisites

🐛 Bug Description

🔄 Steps to Reproduce

🔬 Direct evidence

🩹 Code location

🩺 Workaround

💻 Environment

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

[BUG] Ollama streaming adapter drops tool_calls emitted before the done chunk #1922

Description

📋 Prerequisites

🐛 Bug Description

🔄 Steps to Reproduce

🔬 Direct evidence

🩹 Code location

🩺 Workaround

💻 Environment

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions