from_immunarch() fails with DuckDB conversion error when numeric columns contain NA values #447

@giuliabriss

❓ Questions and Help

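When I use from_immunarch() to convert an immdata object (created with repLoad()) to the new ImmunData format, the function fails with a DuckDB conversion error whenever numeric columns such as V.end, D.start, D.end, or J.start contain NA values. Judging by the message below, the NA entries reach DuckDB as the literal string "NA" in a column that was auto-detected as BIGINT.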

Error Message

Error in `compute_parquet()`:
! {"exception_type":"Conversion","exception_message":"Failed to write '..../output/annotations.parquet': 
CSV Error on Line: 23064
Error when converting column \"J.start\". Could not convert string \"NA\" to 'BIGINT'

Column J.start is being converted as type BIGINT
This type was auto-detected from the CSV file.
Possible solutions:
* Override the type for this column manually by setting the type explicitly, e.g., types={'J.start': 'VARCHAR'}
* Set the sample size to a larger value to enable the auto-detection to scan more values, e.g., sample_size=-1
* Use a COPY statement to automatically derive types from an existing table.
* Check whether the null string value is set correctly (e.g., nullstr = 'N/A')
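
The failure mode described in the message can be illustrated with a small stand-alone sketch. It is a simplified imitation written with the Python standard library only, not DuckDB's actual CSV reader and not the immunarch code: the helper names `infer_type` and `read_column`, the sample data, and the way `sample_size` and `nullstr` are handled here are all assumptions made for illustration. The idea is that a column type guessed from the first few rows breaks on a later "NA", unless "NA" is declared as the null string or the whole column is scanned.

```python
import csv
import io

# Hypothetical sample: a numeric column whose third row holds the string "NA".
DATA = "J.start\n10\n12\nNA\n"


def infer_type(values):
    """Return 'BIGINT' if every sampled value parses as an integer, else 'VARCHAR'."""
    try:
        for value in values:
            int(value)
    except ValueError:
        return "VARCHAR"
    return "BIGINT"


def read_column(text, column, sample_size=None, nullstr=None):
    """Read one CSV column, guessing its type from the first `sample_size` rows.

    sample_size=None scans every row. Values equal to `nullstr` become None
    before the type is guessed or any conversion is attempted.
    """
    rows = csv.DictReader(io.StringIO(text))
    cleaned = [None if row[column] == nullstr else row[column] for row in rows]
    sample = [v for v in cleaned[:sample_size] if v is not None]
    if infer_type(sample) == "BIGINT":
        # Raises ValueError if a later row holds a non-numeric string.
        return [None if v is None else int(v) for v in cleaned]
    return cleaned


# Type guessed from the first two rows only: the later "NA" cannot be converted.
try:
    read_column(DATA, "J.start", sample_size=2)
except ValueError as err:
    print("conversion failed:", err)

# Declaring "NA" as the null string keeps the column numeric.
print(read_column(DATA, "J.start", sample_size=2, nullstr="NA"))

# Scanning the whole column falls back to text instead of failing.
print(read_column(DATA, "J.start", sample_size=None))
```

In this sketch the second call corresponds to the "null string" suggestion in the message and the third to the "larger sample size" suggestion. Whether from_immunarch() exposes either option is not something the error output shows.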

Could you please help me with this?

Thank you!

Labels: bug