Scan API and Engine Integrations #44

Open
gatesn wants to merge 1 commit into develop from ngates/scan-api

gatesn commented Apr 8, 2026

This RFC looks at how we can integrate more deeply with query engine internals such as scheduling, threading models, and buffer pools.

Signed-off-by: Nicholas Gates <nick@nickgates.com>

- The Scan API is not itself a full relational query engine.
- `LayoutReader` should not grow unknown-cardinality operator semantics.
- Vortex should not require a specific Rust async runtime such as Tokio.

It's weird that we would go through all of this and still assume Tokio, but I haven't read all of it yet.

Contributor Author

We already don't assume Tokio; it just continues to be an explicit goal.

The double negation here implies the opposite? You want the goal to be that the runtime doesn't assume Tokio? Maybe I am reading too much into random AI-generated strings.


- the host may provide a CPU scheduler
- Vortex may use it for bounded split-local CPU work
- Vortex must not assume ownership of the whole query runtime

What does this statement mean in practice? I think there's intent behind it, but I fail to understand what it means concretely.

I'd guess this is mainly about resource use, e.g. spawning threads, but also about Unix process ownership, e.g. Vortex should never crash the host. @gatesn, correct me if you had something else in mind.

> Vortex should never crash the host.

Error handling might deserve a small section in this PR. I briefly talked about this with @myrrc, but I think we'll need a panic handler (which the host can perhaps configure) to ensure that we never crash a host.

- split lookahead policy
- efficient materialization of output batches

### What `Partitioning` Means

Words are hard: partitioning usually means some arrangement of data, which this is not about. But maybe this is Partitioning and the other thing is Arrangement.


Correctness is more important than maximal pushdown.

## Ordering, Limits, and Future Dynamic Filters

You should mention Partitioning here (or, as I redefined it, Arrangement). It's a superset of ordering.

0ax1 left a comment

Just a thought: it may be worthwhile clauding some ASCII diagrams to illustrate some of the aspects.
