arrow-csv does not safely escape forward slashes #9657

@rtyler

Describe the bug

arrow-csv writes backslash (\) characters from Utf8 columns to the output as a bare \. Less robust CSV parsers, such as some written in C/C++, interpret that as the start of a string escape sequence and corrupt the output stream.
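
To illustrate the failure mode, here is a minimal sketch of how an escape-aware reader could mangle such a field. The function `unescape_c_style` is hypothetical and only stands in for the kind of parser described above; it is not part of arrow-csv or of any particular C/C++ library.

```rust
/// Hypothetical stand-in for a reader that treats `\` as an escape
/// character inside a field. Not part of arrow-csv.
fn unescape_c_style(field: &str) -> String {
    let mut out = String::with_capacity(field.len());
    let mut chars = field.chars();
    while let Some(c) = chars.next() {
        if c != '\\' {
            out.push(c);
            continue;
        }
        // A backslash consumes the next character as an escape code.
        match chars.next() {
            Some('n') => out.push('\n'),
            Some('t') => out.push('\t'),
            Some('\\') => out.push('\\'),
            // Unknown escape: the backslash itself is dropped.
            Some(other) => out.push(other),
            // A trailing backslash is silently lost.
            None => {}
        }
    }
    out
}

fn main() {
    // Value as stored in a Utf8 column and written verbatim to the CSV.
    let written = r"C:\temp\new";
    let read_back = unescape_c_style(written);
    // The reader no longer sees the value that was written.
    assert_ne!(read_back, written);
    println!("{:?} -> {:?}", written, read_back);
}
```

With a reader like this, the `\t` and `\n` pairs inside the path turn into a tab and a newline, so the field no longer round-trips.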

To Reproduce

Expected behavior

Arguably those bad CSV parsers should be less bad, but IMHO it's a safe operation to convert \ to \\ in the output stream out of an abundance of caution.
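
The conversion itself can be sketched with the standard library alone. The helper name `escape_backslashes` is an assumption made for illustration; the patch under "Additional context" inlines the equivalent `str::replace` call in the writer instead.

```rust
/// Doubles every backslash in a field. Hypothetical helper that mirrors
/// the `str::replace` call in the patch.
fn escape_backslashes(field: &str) -> String {
    field.replace('\\', "\\\\")
}

fn main() {
    // A lone backslash becomes two, matching the expectation in the patch's test.
    assert_eq!(escape_backslashes(r"\"), r"\\");
    // Backslashes inside longer values are doubled in place.
    assert_eq!(escape_backslashes(r"C:\temp"), r"C:\\temp");
    // Values without backslashes are unchanged.
    assert_eq!(escape_backslashes("world"), "world");
}
```

One trade-off worth noting: a reader that does not treat \ as an escape character will see the doubled backslashes literally, so this is only lossless for consumers that unescape them.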

Additional context

From 2a7615200965a68c4808efe021b0414e6e155135 Mon Sep 17 00:00:00 2001
From: "R. Tyler Croy" <rtyler@brokenco.de>
Date: Thu, 2 Apr 2026 18:24:19 +0000
Subject: [PATCH] chore: properly escape forward slashes in CSV output of
 strings

Signed-off-by: R. Tyler Croy <rtyler@brokenco.de>
---
 arrow-csv/src/writer.rs | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/arrow-csv/src/writer.rs b/arrow-csv/src/writer.rs
index c38d1cdec33..8c7f50b3ca8 100644
--- a/arrow-csv/src/writer.rs
+++ b/arrow-csv/src/writer.rs
@@ -293,6 +293,13 @@ impl<W: Write> Writer<W> {
                     ))
                 })?;

+                let data_type = batch.schema().field(col_idx).data_type().clone();
+
+                if data_type == DataType::Utf8 || data_type == DataType::LargeUtf8 {
+                    // Double each backslash so escape-aware parsers read the value back verbatim
+                    buffer = str::replace(&buffer, "\\", "\\\\");
+                }
+
                 let field_bytes =
                     self.get_trimmed_field_bytes(&buffer, batch.column(col_idx).data_type());
                 byte_record.push_field(field_bytes);
@@ -1358,4 +1365,28 @@ sed do eiusmod tempor,-556132.25,1,,2019-04-18T02:45:55.555,23:46:03,foo
             write_quote_style_with_null(&batch, QuoteStyle::Always, "NULL")
         );
     }
+
+    #[test]
+    fn test_write_with_forward_slashes() {
+        let schema = Schema::new(vec![
+            Field::new("text", DataType::Utf8, true),
+            Field::new("number", DataType::Int32, true),
+        ]);
+
+        let text = StringArray::from(vec![Some(r"\"), None, Some("world")]);
+        let number = Int32Array::from(vec![Some(1), Some(2), None]);
+
+        let batch =
+            RecordBatch::try_new(Arc::new(schema), vec![Arc::new(text), Arc::new(number)]).unwrap();
+
+        // Test with QuoteStyle::Always
+        assert_eq!(
+            r#""text","number"
+"\\","1"
+"","2"
+"world",""
+"#,
+            write_quote_style(&batch, QuoteStyle::Always)
+        );
+    }
 }
--
2.43.0
