feat(llmobs): prompt management SDK methods #18186
Add CRUD operations for LLM Observability prompt registry to the Python SDK.
list_prompts and list_prompt_versions only need DD_API_KEY since the backend uses ValidReportingAPIUser for read endpoints.
_request() passes body= to conn.request(), mock signature needed to match.
…[MLOB-7524] Handler expects filter[ml_app], SDK was sending ml_app.
…B-7524]
Plain JSON API returns bare arrays, not JSONAPI {"data": [...]} wrappers.
cfd1734 to 17efe7f
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 17efe7f82a
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
…ecycle [MLOB-7524] Both _fetch_from_registry and _request shared identical connection setup/teardown logic. Extract low-level _http_request(method, path, body, headers, timeout) -> (status, response_body) that handles transport only. Each caller keeps its own error handling strategy.
…-7524] Map HTTP status codes to exception classes via _STATUS_EXCEPTIONS dict instead of chained if/raise statements. Cleaner, data-driven, easier to extend.
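A minimal sketch of the data-driven mapping described in this commit. Only `PromptAPIError`, `PromptAuthError`, and the `_STATUS_EXCEPTIONS` name appear in this PR; `PromptNotFoundError`, the specific status codes, and the `raise_for_status` helper are assumptions made for illustration.

```python
class PromptAPIError(Exception):
    """Base class for prompt registry API errors."""

    def __init__(self, status: int, message: str = "") -> None:
        super().__init__(f"HTTP {status}: {message}")
        self.status = status


class PromptAuthError(PromptAPIError):
    """Named in the PR; the status codes it maps to are assumed here."""


class PromptNotFoundError(PromptAPIError):
    """Hypothetical class name, for illustration only."""


# Assumed contents: the PR does not show the real table.
_STATUS_EXCEPTIONS = {
    401: PromptAuthError,
    403: PromptAuthError,
    404: PromptNotFoundError,
}


def raise_for_status(status: int, message: str = "") -> None:
    """Raise the mapped exception for a non-2xx status (hypothetical helper)."""
    if 200 <= status < 300:
        return
    # Statuses without an entry fall back to the base class.
    raise _STATUS_EXCEPTIONS.get(status, PromptAPIError)(status, message)
```

Adding a new status-specific error then means adding one dict entry rather than another `if`/`raise` branch.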
create_prompt, create_prompt_version, update_prompt, and update_prompt_version now evict hot+warm cache entries so subsequent get_prompt calls reflect the latest state instead of serving stale data until TTL expiry.
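The evict-on-write behavior can be illustrated with a toy two-tier cache. This is a sketch, not the SDK's manager: `TinyPromptClient`, its `fetch`/`write` callables, and the dict standing in for the on-disk warm tier are all hypothetical.

```python
from typing import Callable, Dict


class TinyPromptClient:
    """Toy client with a hot (memory) and a warm (stand-in for disk) tier."""

    def __init__(
        self,
        fetch: Callable[[str], dict],
        write: Callable[[str, dict], None],
    ) -> None:
        self._fetch = fetch
        self._write = write
        self._hot: Dict[str, dict] = {}
        self._warm: Dict[str, dict] = {}

    def get_prompt(self, prompt_id: str) -> dict:
        if prompt_id in self._hot:
            return self._hot[prompt_id]
        if prompt_id in self._warm:
            # Promote a warm entry into the hot tier.
            self._hot[prompt_id] = self._warm[prompt_id]
            return self._hot[prompt_id]
        prompt = self._fetch(prompt_id)
        self._hot[prompt_id] = self._warm[prompt_id] = prompt
        return prompt

    def update_prompt(self, prompt_id: str, fields: dict) -> None:
        self._write(prompt_id, fields)
        # Evict both tiers so the next read goes back to the backend
        # instead of serving the pre-write copy until TTL expiry.
        self._hot.pop(prompt_id, None)
        self._warm.pop(prompt_id, None)
```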
…[MLOB-7524] The API expects a numeric auto-increment version number in the URL path (parsed via PathInt on the Go side). Changing from str to int makes the contract explicit and prevents misuse with user_version strings.
- Fix cache key parsing: use rsplit instead of partition to handle prompt_ids containing colons
- Wrap json.loads in _request with try-except for malformed 2xx responses
- Extract _evict_prompt_caches helper to deduplicate 5 eviction sites
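To see why the split direction matters, here is a small sketch that assumes a `prompt_id:label` key layout in which the label itself contains no colon. The layout and the `split_cache_key` helper are assumptions; the PR only states that `rsplit` replaced `partition`.

```python
from typing import Tuple


def split_cache_key(key: str) -> Tuple[str, str]:
    """Split an assumed 'prompt_id:label' key at the last colon."""
    parts = key.rsplit(":", 1)
    if len(parts) != 2:
        raise ValueError(f"cache key has no label separator: {key!r}")
    return parts[0], parts[1]


key = "team:greeting:production"

# partition() cuts at the first colon and truncates the prompt_id.
first_cut_id, _, first_cut_label = key.partition(":")

# rsplit() cuts at the last colon, so colons inside the id survive.
prompt_id, label = split_cache_key(key)
```

With the key above, `partition` yields the id `team` and the label `greeting:production`, while `rsplit` yields `team:greeting` and `production`.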
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: bffe1315a5
A read path such as get_prompt() or list_prompts() can build and cache the PromptManager before enable(app_key=...) runs. enable() updated LLMObs._app_key but left the cached manager untouched, so it kept its then-empty app key and every write API (create/update/delete) raised PromptAuthError even though a valid app key had been provided. Invalidate the cached manager in enable() when an app key is supplied so it rebuilds with the new key on next use.
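The stale-manager bug and its fix can be reproduced in miniature. The classes below are hypothetical stand-ins, not `LLMObs` or `PromptManager`; the sketch only assumes that the manager copies the app key at construction time and is built lazily.

```python
import threading
from typing import Optional


class Manager:
    """Stand-in for a manager that captures the app key when built."""

    def __init__(self, app_key: str) -> None:
        self.app_key = app_key


class Service:
    def __init__(self) -> None:
        self._app_key = ""
        self._manager: Optional[Manager] = None
        self._lock = threading.Lock()

    def _get_manager(self) -> Manager:
        with self._lock:
            if self._manager is None:
                # Built lazily, so a read before enable() captures "".
                self._manager = Manager(self._app_key)
            return self._manager

    def enable(self, app_key: Optional[str] = None) -> None:
        if app_key:
            with self._lock:
                self._app_key = app_key
                # Drop the cached manager so the next use rebuilds it
                # with the new key.
                self._manager = None
```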
- refresh_prompt: guard DD-API-KEY like get_prompt instead of relying on server reject
- clear_prompt_cache: snapshot prompt manager under lock to avoid TOCTOU null-deref vs enable()
- HotCache.evict_prompt: match exact prompt_id so colon-bearing ids no longer over-evict
- WarmCache: encode path segments with quote() (injective) so distinct ids no longer collide on the same cache file
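A sketch of the WarmCache point, assuming the segments are encoded with `urllib.parse.quote(..., safe="")`. The `cache_path` helper and the `safe=""` argument are assumptions; the PR only says the segments are `quote()`-encoded.

```python
from pathlib import Path
from urllib.parse import quote


def cache_path(root: Path, prompt_id: str, label: str) -> Path:
    """Build a per-prompt cache file path from encoded segments.

    With safe="", "/" becomes %2F and "%" becomes %25, so two different
    inputs can never encode to the same segment. Note that quote() never
    escapes ".", so "." and ".." would need separate handling.
    """
    return root / quote(prompt_id, safe="") / f"{quote(label, safe='')}.json"


root = Path("prompts")
a = cache_path(root, "a/b", "prod")    # prompts/a%2Fb/prod.json
b = cache_path(root, "a", "b/prod")    # prompts/a/b%2Fprod.json
c = cache_path(root, "a%2Fb", "prod")  # prompts/a%252Fb/prod.json
```

All three inputs land on different files, including the id that already contains a literal `%2F`.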
…label param Labels are deployment markers (typically DD_ENV values), and the registry/DB already store and match them as arbitrary strings. Drop the PromptLabel = Literal["development","production"] alias entirely and type label/labels as str / list[str] directly. Deprecate the read 'label' parameter of get_prompt()/refresh_prompt() via debtcollector: the going-forward mechanism is to set DD_ENV and let the SDK resolve the version for that environment. The write 'labels' param (create/version) is unchanged.
3224971 to 2a56014
emmettbutler left a comment
release note approved
deprecations:
  - |
    LLM Observability: The ``label`` parameter of ``LLMObs.get_prompt()`` and ``LLMObs.refresh_prompt()`` is
    deprecated. Set ``DD_ENV`` instead; the prompt version is resolved for that environment.
Suggested change:
- deprecated. Set ``DD_ENV`` instead; the prompt version is resolved for that environment.
+ deprecated. Set ``DD_ENV`` instead; the prompt version is resolved for that environment. #18186
Also, when will the deprecated parameter be removed?
The backend list route does not yet honor filter[ml_app] (the fix is split into a separate dd-source bugfix PR). Remove the SDK ml_app parameter so it does not advertise a silent no-op; it will be re-added once the backend filter lands.
MockHTTPResponse.read() only handled dict and str bodies; a list body (e.g. the
list_prompts response) fell through to b"", so _request saw an empty body and
returned {}. test_request_normalizes_backend_id_key[response1] (list_prompts)
asserted a list and failed. Serialize any non-str, non-None body as JSON.
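A sketch of the mock fix described above. Only `MockHTTPResponse.read()` is named in the commit; the constructor signature and attribute names are assumptions.

```python
import json
from typing import Any


class MockHTTPResponse:
    """Test double for an HTTP response (constructor shape is assumed)."""

    def __init__(self, status: int, body: Any = None) -> None:
        self.status = status
        self._body = body

    def read(self) -> bytes:
        if self._body is None:
            return b""
        if isinstance(self._body, str):
            return self._body.encode("utf-8")
        # Dicts, lists, and any other JSON-serializable value take this
        # path, so a bare-array list_prompts response is no longer dropped.
        return json.dumps(self._body).encode("utf-8")
```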
Description
Adds prompt management methods to the LLMObs Python SDK, calling the new public API endpoints from MLOB-7523.
New public API
Not covered:
GET /prompts/{prompt_id}/versions/{version} is intentionally left out. Prefer to add it once #18127 (hybrid prompt delivery) lands, since it changes the get_prompt signature and it's trivial.
New types (ddtrace/llmobs/types.py)
- ChatMessage (TypedDict) - template message format (role + content only)
- PromptResponse (TypedDict) - returned by create/update/list prompt operations
- PromptVersionResponse (TypedDict) - returned by create/update/list version operations
- DeletedPromptResponse (TypedDict) - returned by delete
Exception hierarchy
Each method documents which exceptions it can raise.
Cache changes (cache.py)
- ~/.cache/datadog/llmobs/prompts/{prompt_id}/{label}.json, with quote()-encoded segments so distinct ids/labels never collide on the same file
- evict_prompt(prompt_id) - uses shutil.rmtree on the prompt directory
Files changed
- types.py: ChatMessage, 3 response TypedDicts, 6 exception classes (PromptAPIError base + 5 status-specific)
- _llmobs.py / manager.py: str / list[str]
- cache.py: quote()-encoded segments, evict_prompt method
- manager.py: _request helper, 7 CRUD methods, API key validation on get_prompt and refresh_prompt
- _llmobs.py: clear_prompt_cache reads the manager under its lock
- test_prompts.py
E2E validation (staging) - 13/13 pass
Tested against datad0g.com using the SDK. Save the script below, install the branch via an editable local checkout, replace keys with your own staging credentials, and run with python test_sdk.py.
- create_prompt()
- create_prompt()
- list_prompts()
- create_prompt_version()
- list_prompt_versions()
- update_prompt()
- update_prompt_version()
- get_prompt()
- get_prompt(label="development")
- update_prompt() with no fields
- delete_prompt()
- get_prompt()
- delete_prompt()
Expected output