Standardizing a format to stream results from a SPARQL query #323

@vemonet

For large result sets, some triplestores stream their results (e.g. QLever, MillenniumDB). For CONSTRUCT/DESCRIBE this is easy to handle: you can just ask for N-Triples, which is line-based, so each triple can be parsed as soon as its line arrives.

But for SELECT they stream a single huge application/sparql-results+json object. Parsing this object incrementally is not clean and requires hacks.

Streaming results from an HTTP GET/POST request has become quite common lately, especially with LLM streaming responses. These all use a format, dubbed "OpenAI-compatible APIs", that looks like Server-Sent Events (https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/Using_server-sent_events) but is requested with POST instead of GET, because the client sends a complex JSON body. The browser EventSource API defined alongside SSE only supports GET, which is unfortunately disconnected from the reality of the complex and long queries required on the modern web.

We could describe a similar streaming format for SPARQL queries, where we stream bindings chunk by chunk, for example something along the lines of:

event: sparql_results_head
data: {"vars": ["s", "p", "o"]}

event: sparql_results_binding
data: {"s": {"type": "uri", "value": "http://purl.uniprot.org/uniprot/O13227#SIP556AB4700293B270"}, "p": {"type": "uri", "value": "http://purl.uniprot.org/core/certain"}, "o": {"datatype": "http://www.w3.org/2001/XMLSchema#boolean", "type": "literal", "value": "false"}}

event: sparql_results_binding
data: {"s": {"type": "uri", "value": "http://purl.uniprot.org/uniprot/O82081#SIPB8F70E02AB555160"}, "p": {"type": "uri", "value": "http://purl.uniprot.org/core/certain"}, "o": {"datatype": "http://www.w3.org/2001/XMLSchema#boolean", "type": "literal", "value": "false"}}
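
To illustrate how little client code such a format would need, here is a minimal Python sketch of a consumer. It is not an existing API: the function name `iter_sparql_events` is hypothetical, the event names are taken from the example above, and it assumes each `data:` payload is valid JSON (double-quoted strings). It takes any iterator of decoded text lines, such as the lines of an HTTP response body.

```python
import json
from typing import Any, Iterable, Iterator, List, Tuple


def iter_sparql_events(lines: Iterable[str]) -> Iterator[Tuple[str, Any]]:
    """Yield (event_name, payload) pairs from an SSE-style stream of text lines.

    Hypothetical helper: assumes every "data:" payload is valid JSON.
    """
    event = "message"  # SSE default event name when no "event:" field is given
    data_parts: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_parts:
                yield event, json.loads("\n".join(data_parts))
            event, data_parts = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # a single leading space after the colon is dropped
            if field == "event":
                event = value
            elif field == "data":
                data_parts.append(value)
    # Lenient choice for this sketch: flush a final event even if the
    # stream ends without a trailing blank line.
    if data_parts:
        yield event, json.loads("\n".join(data_parts))


# Shortened sample in the proposed format (one binding, only ?p bound).
sample = [
    "event: sparql_results_head",
    'data: {"vars": ["s", "p", "o"]}',
    "",
    "event: sparql_results_binding",
    'data: {"p": {"type": "uri", "value": "http://purl.uniprot.org/core/certain"}}',
    "",
]
events = list(iter_sparql_events(sample))
```

Each binding is available as soon as its blank-line terminator arrives, so the client never has to hold or incrementally parse one large JSON document.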

Event streams conventionally use the text/event-stream MIME type. We could reuse it, or define a dedicated MIME type such as application/sparql-results-stream+json (not sure what the convention is there).

Are there any discussions already about standardizing streaming in SPARQL 1.2? I could not find any.
