Description
For large result sets, some triplestores stream their results (e.g. QLever, MillenniumDB). For CONSTRUCT/DESCRIBE this is easy to handle: you can just ask for N-Triples, which can be parsed line by line.
But for SELECT they stream one huge application/sparql-results+json object. Parsing this object incrementally is not clean and requires hacks.
Streaming results from an HTTP GET/POST request has become quite common lately, especially with LLM streaming responses. These all use a format, dubbed "OpenAI-compatible APIs", that looks like Server-Sent Events (https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/Using_server-sent_events) but uses POST instead of GET, because the request carries a complex JSON body. The browser EventSource API defined by the SSE specification only supports GET, which is unfortunately disconnected from the reality of the complex and long queries required on the modern web.
We could define a similar streaming format for SPARQL queries, where bindings are streamed chunk by chunk, for example something along these lines:
event: sparql_results_head
data: {"vars": ["s", "p", "o"]}

event: sparql_results_binding
data: {"s": {"type": "uri", "value": "http://purl.uniprot.org/uniprot/O13227#SIP556AB4700293B270"}, "p": {"type": "uri", "value": "http://purl.uniprot.org/core/certain"}, "o": {"datatype": "http://www.w3.org/2001/XMLSchema#boolean", "type": "literal", "value": "false"}}

event: sparql_results_binding
data: {"s": {"type": "uri", "value": "http://purl.uniprot.org/uniprot/O82081#SIPB8F70E02AB555160"}, "p": {"type": "uri", "value": "http://purl.uniprot.org/core/certain"}, "o": {"datatype": "http://www.w3.org/2001/XMLSchema#boolean", "type": "literal", "value": "false"}}
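To illustrate why such a format would be easier to consume than one large JSON object, here is a rough client-side sketch in Python. It assumes the event names proposed above, that each `data:` payload is valid JSON (double-quoted), and that events are separated by a blank line as in text/event-stream. The function name `iter_sparql_events` is hypothetical and not part of any existing library.

```python
import json
from typing import Iterable, Iterator, List, Tuple


def iter_sparql_events(lines: Iterable[str]) -> Iterator[Tuple[str, dict]]:
    """Yield (event_name, payload) pairs from text/event-stream style lines.

    Assumption: payloads are JSON and events end with a blank line.
    """
    event = "message"  # default event name in the SSE format
    data_parts: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_parts.append(line[len("data:"):].strip())
        elif line == "" and data_parts:
            # A blank line closes the current event.
            yield event, json.loads("\n".join(data_parts))
            event, data_parts = "message", []
    if data_parts:
        # Flush a trailing event that was not followed by a blank line.
        yield event, json.loads("\n".join(data_parts))


# Illustrative input, shortened from the example above.
sample = [
    "event: sparql_results_head",
    'data: {"vars": ["s", "p", "o"]}',
    "",
    "event: sparql_results_binding",
    'data: {"s": {"type": "uri", "value": "http://purl.uniprot.org/uniprot/O13227#SIP556AB4700293B270"}}',
    "",
]
events = list(iter_sparql_events(sample))
for name, payload in events:
    print(name, payload)
```

Each binding can be handled as soon as its event arrives, so the client never has to hold or incrementally parse the whole result set.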
Server-sent events use the text/event-stream media type. We could reuse it, or define a dedicated media type such as application/sparql-results-stream+json (I'm not sure what the convention is there).
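On the server side, producing this format from an iterator of bindings could look roughly like the sketch below. This is only an illustration under assumptions: the function name `to_event_stream` and the `MEDIA_TYPE` constant are hypothetical, the event names are the ones proposed in this issue, and the example IRI is made up.

```python
import json
from typing import Dict, Iterable, Iterator

# Assumption: reuse the existing SSE media type; a dedicated one such as
# application/sparql-results-stream+json would be a drop-in alternative here.
MEDIA_TYPE = "text/event-stream"


def to_event_stream(variables: Iterable[str], bindings: Iterable[Dict]) -> Iterator[str]:
    """Yield one text chunk per event: the head first, then one per binding."""
    head = json.dumps({"vars": list(variables)})
    yield "event: sparql_results_head\ndata: " + head + "\n\n"
    for binding in bindings:
        # json.dumps emits no raw newlines by default, so one data: line suffices.
        yield "event: sparql_results_binding\ndata: " + json.dumps(binding) + "\n\n"


# Illustrative usage with a made-up binding.
chunks = list(
    to_event_stream(["s"], [{"s": {"type": "uri", "value": "http://example.org/a"}}])
)
print(MEDIA_TYPE)
print("".join(chunks), end="")
```

Because the function is a generator, an HTTP framework could send each chunk as soon as the triplestore produces the corresponding binding.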
Are there any discussions already around standardizing streaming in SPARQL 1.2? I could not find anything.