Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
111 changes: 111 additions & 0 deletions docs/superpowers/specs/2026-03-27-wire-type-oid-fix-design.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,111 @@
# Wire Protocol Type OID Fix

**Issue**: [#68](https://github.com/erans/pgsqlite/issues/68) - Wire protocol: type oid is incorrect for integer and float types
**Date**: 2026-03-27
**Scope**: Fix fallback type OID inference for pre-existing SQLite tables that lack `__pgsqlite_schema` metadata

## Problem

When returning query results, pgsqlite sends OID 25 (TEXT) for integer, bigint, smallint, boolean, and float columns. This happens because the type inference chain defaults to TEXT when `__pgsqlite_schema` has no entry for a column.

Tables created through pgsqlite's `CREATE TABLE` already work correctly — they populate `__pgsqlite_schema` with PG type names. But pre-existing SQLite tables (or tables created by other tools) have no metadata, so every column gets OID 25.

PostgreSQL client libraries use the wire OID to deserialize values into the correct host language type (e.g., Python `int`, Go `int64`). With OID 25, all values arrive as strings.

## Root Cause

Multiple code paths default to OID 25 when `__pgsqlite_schema` has no entry:

- **executor.rs line 1249**: final fallback in FieldDescription builder
- **extended.rs lines 851, 855, 899, 903**: Parse-time `inferred_types` construction — when `get_schema_type_with_session` returns `None` for source table.column lookups
- **extended.rs lines 4487-4511**: Execute-time `schema_types` population — same `get_schema_type_with_session` fallback

The codebase already has `sqlite_type_info.rs` which contains a complete `PRAGMA table_info` → PG OID mapping, but it's never called from the main query paths.

## Design

### New DbHandler method

**File**: `src/session/db_handler.rs`

Add `get_column_types_from_pragma` — uses `with_session_connection` to run `PRAGMA table_info(<table>)` and returns `HashMap<String, String>` mapping column names to PG type name strings. Internally calls `sqlite_type_info::sqlite_type_to_pg_type_name` to convert SQLite declared types to PG type names.

```rust
pub async fn get_column_types_from_pragma(
&self,
session_id: &Uuid,
table_name: &str,
) -> Result<HashMap<String, String>, PgSqliteError>
```

Returns PG type name strings (e.g., `"integer"`, `"double precision"`, `"boolean"`) so callers can insert them directly into `schema_types` HashMaps.

### New helper function

**File**: `src/types/sqlite_type_info.rs`

Add `sqlite_type_to_pg_type_name` that maps SQLite declared types to PG type name strings. Uses the same logic as the existing `sqlite_type_to_pg_oid` but returns `&'static str`:

- `"INTEGER"` → `"integer"`
- `"INT8"` / `"BIGINT"` → `"bigint"`
- `"INT2"` / `"SMALLINT"` → `"smallint"`
- `"REAL"` / `"FLOAT"` / `"DOUBLE"` → `"double precision"`
- `"BOOLEAN"` / `"BOOL"` → `"boolean"`
- `"TEXT"` / `"VARCHAR"` / `"CHAR"` → `"text"`
- `"BLOB"` → `"bytea"`
- `"DATE"` → `"date"`
- `"TIMESTAMP"` → `"timestamp"`
- `"NUMERIC"` / `"DECIMAL"` → `"numeric"`
- Default → `"text"`

### Simple query protocol fix

**File**: `src/query/executor.rs`

In the `schema_types` population loop (lines 1088-1155), after attempting `get_schema_type_with_session` for each column and finding no match, fall back to PRAGMA-based lookup. Fetch PRAGMA results once per table (before the column loop) and cache in a local `HashMap`. For each column without schema metadata, look up its type in the PRAGMA cache and insert the PG type name into `schema_types`.

This ensures the final fallback at line 1249 is rarely reached — only for computed expressions or columns not in any table.

### Extended query protocol fix

**File**: `src/query/extended.rs`

Three locations need the PRAGMA fallback:

**1. Parse-time `schema_types` population (lines 563-640):** After `get_schema_type_with_session` returns `None`, fall back to PRAGMA lookup. Same pattern as executor.rs.

**2. Parse-time `inferred_types` construction (lines 654-912):** This is the critical path for clients using the extended protocol (e.g., psycopg3). The fallback chain at lines 848-903 calls `get_schema_type_with_session` multiple times via alias resolution. At each `Ok(None)` branch (lines 851, 899) and `Err` branch (lines 855, 903), instead of pushing `PgType::Text.to_oid()`, call `get_column_types_from_pragma` and use `pg_type_string_to_oid` to get the correct OID.

To avoid calling PRAGMA per-column, fetch the PRAGMA results once per table when the table name is first resolved (line 837 or 886) and cache in a local variable.

**3. Execute-time `schema_types` population (lines 4487-4511):** Same pattern as #1 — after `get_schema_type_with_session` returns `None`, fall back to PRAGMA lookup.

The downstream consumers (lines 2799, 2832, 3783 in `send_select_response` and `encode_row`) receive `field_types` from the Parse or Execute paths above. Once those paths produce correct OIDs, no changes are needed at these downstream locations.

## Files Changed

| File | Change |
|------|--------|
| `src/session/db_handler.rs` | Add `get_column_types_from_pragma` method |
| `src/types/sqlite_type_info.rs` | Add `sqlite_type_to_pg_type_name` function + tests |
| `src/query/executor.rs` | Add PRAGMA fallback in schema_types population loop |
| `src/query/extended.rs` | Add PRAGMA fallback in 3 locations: schema_types (2) + inferred_types (1) |

## Testing

- Pre-existing SQLite table with INTEGER column → OID 23 (int4)
- Pre-existing SQLite table with REAL column → OID 701 (float8)
- Pre-existing SQLite table with BOOLEAN column → OID 16 (bool)
- Pre-existing SQLite table with TEXT column → OID 25 (text)
- Pre-existing SQLite table with BLOB column → OID 17 (bytea)
- Pre-existing SQLite table with BIGINT column → OID 20 (int8)
- Pre-existing SQLite table with SMALLINT column → OID 21 (int2)
- Tables created via pgsqlite `CREATE TABLE` still work correctly (no regression)
- `sqlite_type_to_pg_type_name` unit tests for all type mappings
- `get_column_types_from_pragma` unit test with a test table

## Known Limitations

- Computed expressions without a table (e.g., `SELECT 1+2`) still default to TEXT — would require value-based inference, which is out of scope
- SQLite's type affinity system means `INTEGER` column could contain any type at runtime; we trust the declared type
- `PRAGMA table_info` doesn't distinguish between `INT4` and `INT8` when the declared type is just `INTEGER` — defaults to `int4`
12 changes: 11 additions & 1 deletion src/query/executor.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1087,7 +1087,10 @@ impl QueryExecutor {

if let Some(ref table) = table_name {
debug!("Type inference: Found table name '{}', looking up schema for {} columns", table, response.columns.len());


// Pre-fetch PRAGMA table_info as fallback for tables without __pgsqlite_schema
let pragma_types = db.get_column_types_from_pragma(&session.id, table).await.unwrap_or_default();

// Extract column mappings from query if possible
let column_mappings = extract_column_mappings_from_query(query, table);

Expand Down Expand Up @@ -1161,6 +1164,13 @@ impl QueryExecutor {

debug!("Type inference: No schema type found for '{}.{}'", table, col_name);
}
// PRAGMA table_info fallback for pre-existing SQLite tables
if !schema_types.contains_key(col_name) {
if let Some(pg_type) = pragma_types.get(col_name) {
debug!("Type inference: Found PRAGMA type for '{}.{}' -> {}", table, col_name, pg_type);
schema_types.insert(col_name.clone(), pg_type.clone());
}
}
}
} else {
debug!("Type inference: No table name extracted from query, using fallback logic");
Expand Down
65 changes: 53 additions & 12 deletions src/query/extended.rs
Original file line number Diff line number Diff line change
Expand Up @@ -563,6 +563,7 @@ impl ExtendedQueryHandler {
let mut schema_types = std::collections::HashMap::new();
if let Some(ref table) = table_name {
info!("PARSE: Fetching schema types for table '{}'", table);
let pragma_types = db.get_column_types_from_pragma(&session.id, table).await.unwrap_or_default();
// For aliased columns, try to find the source column
for col_name in &response.columns {
// First try direct lookup
Expand Down Expand Up @@ -646,13 +647,25 @@ impl ExtendedQueryHandler {
}
}
}
// PRAGMA table_info fallback
if !schema_types.contains_key(col_name) {
if let Some(pg_type) = pragma_types.get(col_name) {
info!("PARSE: Found PRAGMA type for '{}.{}' -> {}", table, col_name, pg_type);
schema_types.insert(col_name.clone(), pg_type.clone());
}
}
}
}

// Try to infer types from the first row if available
// We need to handle this asynchronously for schema lookup
let mut inferred_types = Vec::new();

let pragma_types_for_inferred = if let Some(ref table) = table_name {
db.get_column_types_from_pragma(&session.id, table).await.unwrap_or_default()
} else {
std::collections::HashMap::new()
};

for (i, col_name) in response.columns.iter().enumerate() {
info!("PARSE: Processing column {}: '{}'", i, col_name);
let inferred_type = {
Expand Down Expand Up @@ -846,13 +859,23 @@ impl ExtendedQueryHandler {
inferred_types.push(type_oid);
}
Ok(None) => {
info!("Column '{}': no schema type found for '{}.{}', defaulting to text",
col_name, source_table, source_col);
inferred_types.push(PgType::Text.to_oid());
if let Some(pg_type) = pragma_types_for_inferred.get(col_name.as_str()) {
let type_oid = crate::types::SchemaTypeMapper::pg_type_string_to_oid(pg_type);
info!("Column '{}': resolved type from PRAGMA -> {} (OID {})", col_name, pg_type, type_oid);
inferred_types.push(type_oid);
} else {
info!("Column '{}': no schema type found for '{}.{}', defaulting to text",
col_name, source_table, source_col);
inferred_types.push(PgType::Text.to_oid());
}
}
Err(_) => {
// Schema lookup error, defaulting to text
inferred_types.push(PgType::Text.to_oid());
if let Some(pg_type) = pragma_types_for_inferred.get(col_name.as_str()) {
let type_oid = crate::types::SchemaTypeMapper::pg_type_string_to_oid(pg_type);
inferred_types.push(type_oid);
} else {
inferred_types.push(PgType::Text.to_oid());
}
}
}
} else {
Expand Down Expand Up @@ -894,13 +917,23 @@ impl ExtendedQueryHandler {
inferred_types.push(type_oid);
}
Ok(None) => {
info!("Column '{}': no schema type found for '{}.{}', defaulting to text",
col_name, table_name, col_name);
inferred_types.push(PgType::Text.to_oid());
if let Some(pg_type) = pragma_types_for_inferred.get(col_name.as_str()) {
let type_oid = crate::types::SchemaTypeMapper::pg_type_string_to_oid(pg_type);
info!("Column '{}': resolved type from PRAGMA -> {} (OID {})", col_name, pg_type, type_oid);
inferred_types.push(type_oid);
} else {
info!("Column '{}': no schema type found for '{}.{}', defaulting to text",
col_name, table_name, col_name);
inferred_types.push(PgType::Text.to_oid());
}
}
Err(_) => {
// Schema lookup error, defaulting to text
inferred_types.push(PgType::Text.to_oid());
if let Some(pg_type) = pragma_types_for_inferred.get(col_name.as_str()) {
let type_oid = crate::types::SchemaTypeMapper::pg_type_string_to_oid(pg_type);
inferred_types.push(type_oid);
} else {
inferred_types.push(PgType::Text.to_oid());
}
}
}
} else {
Expand Down Expand Up @@ -4486,6 +4519,7 @@ impl ExtendedQueryHandler {
// Pre-fetch schema types for all columns if we have a table name
let mut schema_types = std::collections::HashMap::new();
if let Some(ref table) = table_name {
let pragma_types = db.get_column_types_from_pragma(&session.id, table).await.unwrap_or_default();
for col_name in &response.columns {
// Try direct lookup first
if let Ok(Some(pg_type)) = db.get_schema_type_with_session(&session.id, table, col_name).await {
Expand All @@ -4508,6 +4542,13 @@ impl ExtendedQueryHandler {
}
}
}
// PRAGMA table_info fallback
if !schema_types.contains_key(col_name) {
if let Some(pg_type) = pragma_types.get(col_name) {
info!("Found PRAGMA type for '{}.{}' -> {}", table, col_name, pg_type);
schema_types.insert(col_name.clone(), pg_type.clone());
}
}
}
}

Expand Down
33 changes: 32 additions & 1 deletion src/session/db_handler.rs
Original file line number Diff line number Diff line change
Expand Up @@ -2270,7 +2270,38 @@ impl DbHandler {
}
}
}


/// Get column types from PRAGMA table_info for tables without __pgsqlite_schema metadata.
/// Returns a HashMap mapping column names to PostgreSQL type name strings.
pub async fn get_column_types_from_pragma(
&self,
session_id: &Uuid,
table_name: &str,
) -> Result<std::collections::HashMap<String, String>, PgSqliteError> {
let table_name = table_name.to_string();
self.with_session_connection(session_id, move |conn| {
let mut result = std::collections::HashMap::new();
let query = format!("PRAGMA table_info(\"{}\")", table_name);
let mut stmt = conn.prepare(&query)?;
let mut rows = stmt.query_map([], |row| {
let col_name: String = row.get(1)?;
let col_type: String = row.get(2)?;
Ok((col_name, col_type))
})?;
while let Some(Ok((col_name, col_type))) = rows.next() {
if col_type.is_empty() {
continue;
}
let pg_type_name = crate::types::sqlite_type_info::sqlite_type_to_pg_type_name(&col_type);
if pg_type_name == "text" {
continue;
}
result.insert(col_name, pg_type_name.to_string());
}
Ok(result)
}).await
}

/// Try fast path execution with parameters
pub async fn try_execute_fast_path_with_params(
&self,
Expand Down
Loading
Loading