
fix(csv-stringify): double regexp-metacharacter escape/quote literally #494

Merged
wdavidw merged 2 commits into adaltas:master from spokodev:fix/csv-stringify-regex-metachar-escape on Jul 2, 2026

spokodev commented Jul 1, 2026

Contributor

A field that must be quoted and contains the configured escape or quote character is corrupted, or throws, whenever that character is a regexp metacharacter.

Steps to reproduce

The escape and quote options are documented as accepting any single character. When a quoted field contains the escape or quote character, the doubling is done with new RegExp(escape, "g") / new RegExp(quote, "g"), which interpolates the user-supplied character straight into a pattern.

import { stringify } from "csv-stringify/sync";

// escape "|": the pattern /|/g matches the empty string everywhere
stringify([["a|b,c"]], { escape: "|", eof: false });
//   actual:   "||a|||||b||,||c||"
//   expected: "a||b,c"

// escape ".": the pattern /./g matches every character
stringify([["a.b,c"]], { escape: ".", eof: false });
//   actual:   ".........." (every character, not just ".", is replaced by "..")
//   expected: "a..b,c"

// escape "*": the pattern /*/g is invalid
stringify([["a*b,c"]], { escape: "*", eof: false });
//   actual:   throws SyntaxError: Nothing to repeat
//   expected: "a**b,c"

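^ and $ act as anchors: the pattern matches a position at the start or end of the field rather than the literal character, so occurrences inside the field are never doubled and those characters also fail to round-trip.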
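The anchor case can be checked without the library. The snippet below is a minimal sketch that applies the same `replace` call the old code used to a bare string; the variable names are illustrative only and not part of csv-stringify.

```javascript
// Pre-fix style doubling applied directly to a string (illustration only).
// "^" as a pattern matches the position at the start of the input,
// not the literal caret inside the field.
const caretResult = "a^b,c".replace(new RegExp("^", "g"), "^" + "^");

// "$" as a pattern matches the position at the end of the input. The
// replacement "$$" is itself special and means one literal "$".
const dollarResult = "a$b,c".replace(new RegExp("$", "g"), "$" + "$");

console.log(caretResult);  // ^^a^b,c  (inner "^" left undoubled)
console.log(dollarResult); // a$b,c$   (inner "$" left undoubled)
```

In both cases the character inside the field is left as a single occurrence, which is what breaks the round-trip.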

Root cause

The escape and quote doubling built a RegExp from the user-configured character:

const regexp = escape === "\\" ? new RegExp(escape + escape, "g") : new RegExp(escape, "g");
value = value.replace(regexp, escape + escape);
// and
const regexp = new RegExp(quote, "g");
value = value.replace(regexp, escape + quote);

Any regexp metacharacter is interpreted as regex syntax rather than a literal.
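To see the root cause in isolation, the escape-doubling line can be wrapped in a small function and run against the three inputs from the reproduction. This is a sketch: `doubleWithRegExp` is a hypothetical name introduced here, not a function in csv-stringify, and it covers only the escape doubling, not quoting.

```javascript
// Hypothetical wrapper around the pre-fix escape doubling.
function doubleWithRegExp(value, escape) {
  const regexp =
    escape === "\\" ? new RegExp(escape + escape, "g") : new RegExp(escape, "g");
  return value.replace(regexp, escape + escape);
}

// "|" is an alternation of two empty patterns, so it matches at every position.
const pipeResult = doubleWithRegExp("a|b,c", "|");

// "." matches any character, so each of the five characters becomes "..".
const dotResult = doubleWithRegExp("a.b,c", ".");

// "*" on its own is not a valid pattern, so the RegExp constructor throws.
let starError = null;
try {
  doubleWithRegExp("a*b,c", "*");
} catch (err) {
  starError = err;
}

console.log(pipeResult);                       // ||a|||||b||,||c||
console.log(dotResult);                        // ..........
console.log(starError instanceof SyntaxError); // true
```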

Fix

Replace the RegExp doubling with a literal replaceAll using a function replacer:

value = value.replaceAll(escape, () => escape + escape);
value = value.replaceAll(quote, () => escape + quote);

The function replacer is required, not a plain replacement string: a $ in the replacement string is special ($&, $$, etc.), so escape: "$" would still be mangled by a string replacement. Returning the replacement from a function bypasses both the pattern-side and replacement-side interpretation. replaceAll with a string pattern matches the character literally, so the previous \\ special case is no longer needed.
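The difference between a replacement string and a function replacer shows up with `$` as the escape character. The following is a small standalone sketch using only `String.prototype.replaceAll`; the variable names are illustrative.

```javascript
const value = "a$b,c";
const escape = "$";

// Replacement string: "$$" is a replacement pattern meaning one literal "$",
// so every "$" is replaced by a single "$" and nothing is doubled.
const withString = value.replaceAll(escape, escape + escape);

// Function replacer: the returned string is inserted verbatim.
const withFunction = value.replaceAll(escape, () => escape + escape);

// A backslash needs no special case once the pattern is a plain string.
const withBackslash = "a\\b,c".replaceAll("\\", () => "\\" + "\\");

console.log(withString);    // a$b,c
console.log(withFunction);  // a$$b,c
console.log(withBackslash); // a\\b,c
```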

Authority

The round-trip invariant parse(stringify(x, opts), opts) === x (RFC 4180 section 2, rule 7, generalized to configurable escape/quote) must hold for any single-character escape/quote. Before the fix it was broken for regexp metacharacters (garbage output, or a thrown SyntaxError); after the fix it holds.
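The invariant can be illustrated on a single field without csv-parse. The sketch below assumes two hypothetical helpers, `quoteField` and `unquoteField`, written only for this example: the first applies the literal doubling from the fix, the second is a simplified inverse and not the library's parser.

```javascript
// Hypothetical helper: quote one field using the literal doubling from the fix.
function quoteField(value, { quote = '"', escape = '"' } = {}) {
  let out = value;
  // Double the escape first so escapes added for quotes are not doubled again.
  if (escape !== quote) out = out.replaceAll(escape, () => escape + escape);
  out = out.replaceAll(quote, () => escape + quote);
  return quote + out + quote;
}

// Hypothetical helper: simplified inverse for a single quoted field.
function unquoteField(text, { quote = '"', escape = '"' } = {}) {
  const inner = text.slice(quote.length, text.length - quote.length);
  let out = "";
  for (let i = 0; i < inner.length; i++) {
    const next = inner[i + 1];
    if (inner[i] === escape && (next === escape || next === quote)) {
      out += next; // escaped escape or escaped quote: keep the second char
      i++;
    } else {
      out += inner[i];
    }
  }
  return out;
}

// Check the round trip for a range of regexp metacharacters used as escape.
const metachars = ["|", ".", "*", "+", "?", "$", "^", "\\", "(", "["];
const failures = metachars.filter((escape) => {
  const original = `a${escape}b,c"d`;
  const opts = { quote: '"', escape };
  return unquoteField(quoteField(original, opts), opts) !== original;
});

console.log(failures); // []
```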

Tests and suite status

Added cases in test/option.escape.js covering |, ., *, $ as escape and ., | as quote. They fail on the current code (garbage output and a thrown SyntaxError) and pass with the fix.

Full csv-stringify suite: 203 passing / 1 pending / 3 failing before, 205 passing / 1 pending / 1 failing after. The remaining failure is unrelated and pre-existing on a clean tree (api.callback "catch error in end handler, see #386"), where Node v26 reports RangeError: Invalid string length instead of the expected ERR_STRING_TOO_LONG message.

spokodev and others added 2 commits July 1, 2026 22:41
The escape and quote characters are configurable to any single character,
but when a quoted field contained the escape or quote char, the code doubled
occurrences via `new RegExp(escape, "g")` / `new RegExp(quote, "g")`. That
interpolates the user char into a pattern, so metacharacters misbehave:
`|` matches the empty string at every position and `.` matches every character,
so the doubled char is injected throughout; `*`/`+`/`?` throw "Nothing to repeat";
`^`/`$` act as anchors and never match the literal char.

Replace the RegExp doubling with a literal `replaceAll` using a function
replacer, so neither the pattern nor a `$` in the replacement is interpreted.
This restores the round-trip invariant parse(stringify(x)) === x for any
configured escape/quote character.
@wdavidw wdavidw merged commit db852d1 into adaltas:master Jul 2, 2026
4 checks passed

wdavidw commented Jul 2, 2026

Member

Thank you for the fix


2 participants