Skip to content

fix(weave): make ObjectRecord subscriptable for str.format templates#6896

Closed
rgao-coreweave wants to merge 2 commits into
masterfrom
fix/object-record-subscriptable
Closed

fix(weave): make ObjectRecord subscriptable for str.format templates#6896
rgao-coreweave wants to merge 2 commits into
masterfrom
fix/object-record-subscriptable

Conversation

@rgao-coreweave
Copy link
Copy Markdown
Contributor

@rgao-coreweave rgao-coreweave commented May 20, 2026

Background

The actors

ObjectRecord (weave/trace/object_record.py) is Weave's wrapper for "I deserialized a user object from the trace database but I don't have its original class available." When a user logs a call whose input is, say, a custom pydantic BaseModel, the class definition lives in their code — not in Weave's. So Weave stores the attributes, and on retrieval reconstructs a generic stand-in that hangs those attributes off itself via setattr. You can read fields with obj.answer, obj.score, etc. But it does not behave like a dict — there is no __getitem__.

Python's str.format has two ways to drill into a value:

"{x.field}".format(x=val)   # attribute access  → val.field
"{x[field]}".format(x=val)  # subscript access  → val["field"]

They look interchangeable. They are not. The second one calls __getitem__.

The collision

The scoring system lets users write prompt templates like:

Was this output correct?
Question: {inputs[question]}
Answer:   {output[answer]}

Then the worker fetches the call's inputs/output from the trace DB and runs:

prompt.format(**inputs_with_defaults)
  • If the user originally logged {"question": "..."} as a plain dict → roundtrips as a dict → inputs["question"] works.
  • If they logged a pydantic model or dataclass → roundtrips as an ObjectRecordinputs["question"] calls ObjectRecord.__getitem__("question") → no such method → TypeError: 'ObjectRecord' object is not subscriptable.

The user's template is unchanged. The user's intent is unchanged. The thing that changed is the runtime type of the value, which the user has zero control over — it's determined by Weave's serialization layer.

Why it surfaced

This was a regression flagged in Datadog with ~734 occurrences over the past week. First seen ~2 months ago at ae0963a, dormant, then resurging 5 days ago at 62e783a — likely a new scorer rollout or a change in how call inputs are unwrapped before reaching the formatter.

Stack trace observed in weave-worker:

File "/weave/src/workers/scoring_worker.py", line 1083, in _do_score_call
File "/weave/src/workers/scoring_worker.py", line 1274, in _do_score_call_llm_as_a_judge
    updated_scoring_prompt = _format_scoring_prompt(
        scorer.scoring_prompt, inputs_with_defaults
    )
File "/weave/src/workers/scoring_worker.py", line 1164, in _format_scoring_prompt
    scoring_prompt = scoring_prompt.format(**inputs_with_defaults)
TypeError: 'ObjectRecord' object is not subscriptable

The same vulnerable template.format(**inputs) pattern exists in the public SDK at:

  • weave/scorers/llm_as_a_judge_scorer.py:66LLMAsAJudgeScorer.score
  • weave/prompt/prompt.py:146format_message_with_template_vars (used by MessagesPrompt.format)

…so this isn't worker-only; any user code path that runs a prompt template over deserialized call inputs/output can hit it.

Fix

Add __getitem__ to ObjectRecord delegating to self.__dict__[key]. Now obj["field"] behaves the same as obj.field.

def __getitem__(self, key: str) -> Any:
    # Mirror attribute access so templates using `{x[field]}` subscript
    # syntax work the same as `{x.field}`. Without this, `str.format` on
    # an ObjectRecord raises `TypeError: 'ObjectRecord' object is not
    # subscriptable` — for example, when a scoring prompt formats over a
    # call's deserialized inputs/output.
    return self.__dict__[key]

Why this is the right level:

  • Purely additive. ObjectRecord had no __getitem__ previously, so no existing code path can change behavior — only previously-broken code starts working.
  • Single fix, three call sites. It repairs weave-worker's _format_scoring_prompt, LLMAsAJudgeScorer.score, and MessagesPrompt.format in one place. The worker installs the Weave SDK as a dependency, so on the next SDK release it picks up this fix with no worker-side change.
  • Recurses naturally. Nested ObjectRecord values inside an ObjectRecord work too — {x[user][name]} is two __getitem__ calls, both satisfied by the same one-line implementation.
  • Matches existing semantics. ObjectRecord.unwrap() already converts the record back to a plain dict, so treating it dict-like for read-only access aligns with the class's existing posture.

Test plan

New file tests/trace/test_object_record.py with one regression test, test_object_record_supports_subscript_for_str_format, which fails with the exact production TypeError before the patch and passes after. It covers:

  • Direct subscript access mirrors attribute access (rec["answer"] returns the stored attribute).
  • Missing keys raise KeyError (dict semantics), not TypeError.
  • Production scenario: "{out[answer]}".format(out=record).
  • Nested ObjectRecord subscripting ({x[inner][answer]}) recurses through __getitem__.

Regression check on existing tests that touch ObjectRecord:

  • tests/trace/test_object_record.py (1 new test)
  • tests/trace/test_serialize_encoders.py
  • tests/trace/test_deepcopy.py
  • tests/trace/test_llm_as_a_judge_scorer.py
  • 114 tests pass total, no regressions
  • uvx ruff check clean

🤖 Generated with Claude Code

@wandbot-3000
Copy link
Copy Markdown

wandbot-3000 Bot commented May 20, 2026

@codecov
Copy link
Copy Markdown

codecov Bot commented May 20, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@w-b-hivemind
Copy link
Copy Markdown

w-b-hivemind Bot commented May 20, 2026

HiveMind Sessions

4 sessions · 45m · $15

Session Agent Duration Tokens Cost Lines
PR Code Review Analysis and Fixes
6899d2a7-092a-4ce1-af5b-bc95069982ed
claude 15m 57.3K $4.77 +0 -0
Code Review Feedback Analysis and Fixes
17fcb230-4cb4-4a79-bfe1-0f9ab191912a
claude 18m 39.0K $4.87 +115 -37
Sync Branch to Master and Push
be731229-c9e8-43ab-907e-cabf949fed30
claude 1m 1.9K $0.28 +0 -0
Fix String Format Template Bug in Scoring Worker
506fece8-fe45-4461-b3b0-060a8f1e17a8
claude 10m 38.3K $4.65 +73 -34
Total 45m 136.5K $15 +188 -71

View all sessions in HiveMind →

Run claude --resume 6899d2a7-092a-4ce1-af5b-bc95069982ed to pickup where you left off.

rgao-coreweave and others added 2 commits May 20, 2026 06:58
ObjectRecord (the wrapper Weave produces when deserializing a user
object from the trace database) only supported attribute access. Python's
str.format resolves "{x[field]}" via __getitem__, so any prompt template
using subscript syntax against an ObjectRecord raised
"TypeError: 'ObjectRecord' object is not subscriptable" — observed in
weave-worker's scoring_worker._format_scoring_prompt, and reachable via
LLMAsAJudgeScorer.score and MessagesPrompt.format.

Add __getitem__ delegating to self.__dict__ so subscript access mirrors
attribute access. Purely additive — no existing path changes behavior.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Collapse four tests into one regression test that covers attribute-like
subscript lookup, KeyError on missing keys, the production str.format
scenario, and nested ObjectRecord subscripting.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@rgao-coreweave rgao-coreweave force-pushed the fix/object-record-subscriptable branch from c9dad8a to 021bfc1 Compare May 20, 2026 13:59
@rgao-coreweave rgao-coreweave marked this pull request as ready for review May 20, 2026 14:19
@rgao-coreweave rgao-coreweave requested a review from a team as a code owner May 20, 2026 14:19
# an ObjectRecord raises `TypeError: 'ObjectRecord' object is not
# subscriptable` — for example, when a scoring prompt formats over a
# call's deserialized inputs/output.
return self.__dict__[key]
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not look correct to me.

The intent of ObjectRecord was to provide the facade to an object uploaded to weave. Consider the failure mode:

class Something(BaseModel):
    a: int

s = Something(a=1)
s.a     # OK
s["a"]  # ERROR

ref = weave.publish(s)
s2 = ref.get()
s2.a    # OK
s2["a"] # ERROR

But after your change:

s2["a"] # OK

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair concern about the facade — s["a"] working on the proxy when it doesn't on the original is real asymmetry. Wanted to lay out the full picture so we can pick the right level.

Why this surfaced. ~734 errors/week in Datadog from weave-trace's scoring_worker._format_scoring_prompt, where scoring_prompt.format(**inputs_with_defaults) raises TypeError: 'ObjectRecord' object is not subscriptable whenever a logged input/output round-tripped as a pydantic model or dataclass (i.e. became an ObjectRecord). The same pattern exists in the SDK at weave/scorers/llm_as_a_judge_scorer.py:66 and weave/prompt/prompt.py:146, and in any user code doing template.format(**call_inputs_or_output).

Two ways to fix it:

(A) Class-level — what this PR does. Add __getitem__ to ObjectRecord delegating to self.__dict__. One-line behavioral addition; covers every callsite uniformly.

(B) Callsite-level. Revert this PR and add a string.Formatter subclass that makes {x[field]} fall back to getattr when subscripting raises TypeError. Apply at the SDK callsites. ObjectRecord stays a pure facade. Sketch:

import _string, string
class _TemplateFormatter(string.Formatter):
    def get_field(self, field_name, args, kwargs):
        first, rest = _string.formatter_field_name_split(field_name)
        obj = self.get_value(first, args, kwargs)
        for is_attr, key in rest:
            if is_attr:
                obj = getattr(obj, key)
            else:
                try:
                    obj = obj[key]
                except TypeError:
                    obj = getattr(obj, str(key))
        return obj, first

Tradeoffs:

(A) class-level (B) callsite-level
weave-trace worker MessagesPrompt.format path fixed via SDK fixed if worker imports new formatter
weave-trace worker bare scoring_prompt.format(**inputs) path (str template, scoring_worker.py:1123) fixed not fixed — needs separate worker patch
User code doing template.format(**call_inputs) fixed not fixed
ObjectRecord facade purity breaks (by one bit) preserved
Lines of code ~9 ~30 + tests, plus worker patch
Depends on private _string module no yes (stable but private)

On facade purity. The proxy is already lossy/asymmetric in both directions — weave_isinstance(obj, cls) recognizes an ObjectRecord by _class_name while plain isinstance(obj, cls) does not; ObjectRecord carries helpers (map_values, unwrap) the original lacks; the original's class-specific methods are gone on the proxy. Today the asymmetry runs both ways; this PR adds one more bit on the "proxy is more permissive" side.

On regression risk. Strictly additive — ObjectRecord had no __getitem__ previously, so no path that worked before changes behavior. I grepped: no try/except TypeError patterns around subscripting it, no Mapping duck-checks that would now wrongly include it (the ABC needs keys() too, which we don't add).

My recommendation: (A). The worker has a bare str.format() callsite on a raw string template that (B) can't reach without a separate worker patch, and the bug surface includes user code doing the same. Fixing on ObjectRecord is the single point that catches all of them.

Happy to switch to (B) if you'd rather, but flagging that it'll need a follow-up worker change too.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pulled in a second perspective via /talk-to-tim — a tool that emulates @tssweeney's engineering reasoning — to pressure-test my own framing before you respond. Sharing his read since it sharpens the argument better than mine did:

ObjectRecord is not a facade. it is a deserialized stand-in. it already has:

  • _class_name, _bases that the original doesn't
  • map_values, unwrap that the original doesn't
  • weave_isinstance returning true but isinstance returning false
  • no .model_dump() from the original

so the "facade purity" ship sailed a long time ago. the one bit of asymmetry [you're] pointing at is real but it's the least important one — it's the one that makes the user's reasonable code work.

and frankly — s["a"] failing on a pydantic model is a python limitation, not a feature we should be faithfully reproducing on the proxy. nobody is writing try: s["a"] except TypeError to detect pydantic vs dict. that pattern doesn't exist in real code.

And on the alternatives — including a third option I floated to him (calling unwrap() at the SDK callsite to convert ObjectRecord → dict before formatting):

every callsite that does template.format(**inputs) is now broken until someone remembers to route it through _TemplateFormatter. that's not a fix, that's a maintenance tax forever. and it leaks knowledge of ObjectRecord's quirks into every place that formats strings. that is exactly the "concept living in the wrong layer" thing — string formatting code should not need to know about weave's deserialization model.

[the unwrap-at-callsite option] is a regression, not a fix. {output.answer} worked before, breaks after. ObjectRecord's whole point is that it preserves attribute access for pydantic/dataclass roundtrips. Stripping it back to a dict at the callsite means anyone who wrote {output.answer} — which is the more natural Python idiom for objects — silently breaks.

His bottom line is the same as mine: ship the class-level fix. The reframe I'd land on is "ObjectRecord is the round-trip representation of user objects, not a behavioral facade." Under that framing, subscript access for stored attributes is just another way to read a field — there's no contract we owe the user to faithfully reproduce BaseModel.__getitem__ raising TypeError.

One thing he flagged for explicit decision before merge: __contains__ / len() / iter() will start hitting their default Python behavior once __getitem__ exists (e.g. "a" in rec will fall back to iterating). His vote — and mine — is keep it scoped: __getitem__ only, documented as "attribute access via subscript for template formatting," not signaling Mapping-ness. If you want me to add an explicit __contains__ raising or similar, happy to.

@rgao-coreweave
Copy link
Copy Markdown
Contributor Author

discussed the fix and agreed with andrew that sdk's facade is not the right place to fix this.

we should fix this in the callsite, (scoring worker), filed a ticket for triage.

closing the PR

@github-actions github-actions Bot locked and limited conversation to collaborators May 21, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants