
Async support in pupyarrow #66

Draft
jpc wants to merge 2 commits into jpc/audio-refactor from jpc/async-pupyarrow

jpc (Member) commented Apr 14, 2026

Focus:

  • Improve read performance for S3 shards by merging IO requests and issuing them concurrently, using a custom IO scheduler with async read requests.
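
A minimal sketch of the two ideas in this bullet (merging IO, then issuing the merged requests concurrently), written with plain `asyncio`. This is not the scheduler from this PR: `coalesce_ranges`, `fetch_ranges`, `max_gap`, `max_concurrency`, and the `fetch(start, end)` coroutine are all hypothetical names chosen for illustration, and ranges are assumed to be half-open `[start, end)` byte offsets.

```python
import asyncio
from typing import Awaitable, Callable

Range = tuple[int, int]  # assumed convention: half-open [start, end)


def coalesce_ranges(ranges: list[Range], max_gap: int = 0) -> list[Range]:
    """Merge ranges that overlap or lie within max_gap bytes of each other."""
    merged: list[Range] = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1] + max_gap:
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


async def fetch_ranges(
    fetch: Callable[[int, int], Awaitable[bytes]],  # hypothetical async range reader
    ranges: list[Range],
    max_gap: int = 0,
    max_concurrency: int = 8,
) -> dict[Range, bytes]:
    """Coalesce the requested ranges, then fetch the merged ranges concurrently."""
    merged = coalesce_ranges(ranges, max_gap)
    sem = asyncio.Semaphore(max_concurrency)  # cap the number of in-flight requests

    async def fetch_one(r: Range) -> bytes:
        async with sem:
            return await fetch(*r)

    blobs = await asyncio.gather(*(fetch_one(r) for r in merged))
    return dict(zip(merged, blobs))
```

With a non-zero `max_gap`, nearby ranges are fetched as one larger request, which trades a few wasted bytes for fewer round trips. Whether that trade pays off depends on request latency and shard layout.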

jpc force-pushed the jpc/async-pupyarrow branch 2 times, most recently from 1d19787 to 364d0b6, April 14, 2026 08:44
jpc and others added 2 commits May 17, 2026 07:50
BlockCache: sorted interval cache that merges overlapping byte ranges.
Designed for caching S3 range-read results.
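
For readers unfamiliar with the idea, here is a rough sketch of a sorted interval cache that merges overlapping (and touching) byte ranges. It is an illustration only and not the `BlockCache` code in this commit: the class name `BlockCacheSketch` and the `insert`/`get` method names and signatures are assumptions.

```python
import bisect
from typing import Optional


class BlockCacheSketch:
    """Toy interval cache: non-overlapping blocks kept sorted by start offset."""

    def __init__(self) -> None:
        self._starts: list[int] = []    # sorted start offsets
        self._blocks: list[bytes] = []  # block i covers [start, start + len)

    def insert(self, start: int, data: bytes) -> None:
        if not data:
            return
        end = start + len(data)
        # Index of the first existing block that touches or overlaps the new data.
        lo = bisect.bisect_left(self._starts, start)
        if lo > 0 and self._starts[lo - 1] + len(self._blocks[lo - 1]) >= start:
            lo -= 1
        # One past the last existing block that starts at or before `end`.
        hi = bisect.bisect_right(self._starts, end)
        if lo < hi:
            new_start = min(start, self._starts[lo])
            new_end = max(end, self._starts[hi - 1] + len(self._blocks[hi - 1]))
            buf = bytearray(new_end - new_start)
            for s, b in zip(self._starts[lo:hi], self._blocks[lo:hi]):
                buf[s - new_start : s - new_start + len(b)] = b
            buf[start - new_start : end - new_start] = data  # newest data wins
            start, data = new_start, bytes(buf)
        # Replace the merged blocks (possibly none) with the single new block.
        self._starts[lo:hi] = [start]
        self._blocks[lo:hi] = [data]

    def get(self, start: int, end: int) -> Optional[bytes]:
        """Return bytes for [start, end) if fully cached, else None."""
        i = bisect.bisect_right(self._starts, start) - 1
        if i < 0:
            return None
        s, b = self._starts[i], self._blocks[i]
        if end > s + len(b):
            return None
        return b[start - s : end - s]
```

Because touching blocks are merged on insert, any fully cached range lives inside a single block, so `get` needs only one binary search.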

LazyBuffer:
- enable_cache(readahead): activate block cache with configurable readahead
- prepopulate(ranges): pre-fetch byte ranges into cache (sync)
- async_prepopulate(ranges): parallel async variant
- read_range() checks cache before hitting reader, fetches with
  readahead on miss
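
The behaviour described in the list above can be sketched roughly as follows. This is a toy stand-in, not the `LazyBuffer` from this commit: the constructor arguments, the `(start, end)` signature of `read_range`, the `reader(start, end)` callable, and the use of `asyncio.to_thread` for the parallel variant are all assumptions, and the cache here is a plain unsorted list rather than a merging block cache.

```python
import asyncio
from typing import Callable, Optional

Reader = Callable[[int, int], bytes]  # hypothetical: returns bytes for [start, end)


class LazyBufferSketch:
    def __init__(self, reader: Reader, size: int) -> None:
        self._reader = reader
        self._size = size
        self._readahead = 0
        self._cache: Optional[list[tuple[int, bytes]]] = None  # None means caching is off

    def enable_cache(self, readahead: int = 0) -> None:
        """Turn caching on; misses will fetch `readahead` extra bytes."""
        self._readahead = readahead
        if self._cache is None:
            self._cache = []

    def _lookup(self, start: int, end: int) -> Optional[bytes]:
        for s, b in self._cache or ():
            if s <= start and end <= s + len(b):
                return b[start - s : end - s]
        return None

    def prepopulate(self, ranges: list[tuple[int, int]]) -> None:
        """Fetch each range sequentially and store it in the cache."""
        self.enable_cache(self._readahead)
        for start, end in ranges:
            self._cache.append((start, self._reader(start, end)))

    async def async_prepopulate(self, ranges: list[tuple[int, int]]) -> None:
        """Fetch all ranges in parallel (here: the sync reader run in worker threads)."""
        self.enable_cache(self._readahead)
        blobs = await asyncio.gather(
            *(asyncio.to_thread(self._reader, s, e) for s, e in ranges)
        )
        self._cache.extend((s, b) for (s, _), b in zip(ranges, blobs))

    def read_range(self, start: int, end: int) -> bytes:
        if self._cache is None:
            return self._reader(start, end)
        hit = self._lookup(start, end)
        if hit is not None:
            return hit
        # Cache miss: over-fetch by the readahead amount, clamped to the buffer size.
        fetch_end = min(self._size, end + self._readahead)
        data = self._reader(start, fetch_end)
        self._cache.append((start, data))
        return data[: end - start]
```

With readahead enabled, a small read followed by a read of the next few bytes results in a single call to the reader, which is the main benefit when each call is an S3 range request.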

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
jpc force-pushed the jpc/async-pupyarrow branch from 364d0b6 to 1f82a7a, May 17, 2026 18:57
jpc changed the base branch from main to jpc/audio-refactor, May 18, 2026 18:30