Commit 27cd0ac

ls: add --quoting-style=locale support
Implement locale-aware quoting for the ls command, allowing filenames to be quoted using locale-specific quotation marks based on the current LC_CTYPE setting.

Features:
- Add Quotes::Locale variant for locale-aware quoting
- Create locale_quotes module with comprehensive locale mappings
- Support 20+ languages with appropriate quote characters:
  * Romance languages (French, Spanish, etc.): guillemets
  * Germanic languages (German, Czech, etc.): low-9 and high quotes
  * Japanese: corner brackets
  * Chinese/Korean: CJK curly quotes
  * English/default: ASCII double quotes
- Proper UTF-8 handling for multi-byte quote characters
- Environment variable precedence: LC_ALL > LC_CTYPE > LANG
- Localized help text in English and French

Implementation:
- Enhanced CQuoter to dynamically detect and apply locale quotes
- Added --quoting-style=locale CLI option
- Follows C-style quoting semantics (always-quote behavior)
- Safe fallback to ASCII double quotes for unknown locales
- Added spell-checker:ignore comments for technical terms

Testing:
- Verified with multiple locales (en_US, fr_FR, de_DE, ja_JP, zh_CN)
- All existing tests pass
- Help text properly localized

test(ls): add comprehensive tests for --quoting-style=locale

Add test coverage for locale-aware quoting functionality:
- Tests 10 different locales with appropriate quotation marks
- Verifies English, French, German, Japanese, Chinese, Russian, Spanish, Polish, C, and POSIX locales
- Tests escape sequence handling with locale quoting (newline character)
- Validates UTF-8 encoding of multi-byte quote characters

test(ls): use only CI-available locales in locale quoting test

Fix test failures by limiting locale tests to those available in the CI environment. CI only generates en_US.UTF-8, fr_FR.UTF-8, es_ES.UTF-8, and sv_SE.UTF-8.

Removed tests for: de_DE, ja_JP, zh_CN, pl_PL, ru_RU.UTF-8
- These locales are not generated in .github/workflows/GnuTests.yml
- The locale_quotes module unit tests still validate these quote types
- This keeps the integration test CI-friendly while maintaining comprehensive coverage

test(ls): add missing tests for locale quoting feature

Add missing Quotes::Locale test case in test_quotes_display() to ensure the Display trait implementation is tested for all quote variants.

Add comprehensive unit tests for locale_quotes module:
- Test locale environment variable precedence (LC_ALL > LC_CTYPE > LANG)
- Validate quote character mappings for all supported locales
- Ensure Romance, Germanic, Slavic, and Asian language quotes work correctly
- Test locale string parsing with encoding and modifiers
- Verify fallback behavior for unknown locales

This adds 17 unit tests covering the locale detection and quote mapping functionality that was previously untested.

test(ls): add integration tests for locale quoting with environment variables

Add test_ls_quoting_style_locale_env_vars to verify that --quoting-style=locale correctly responds to different locale environment variables (LC_ALL).

Tests verify:
- French locale (fr_FR.UTF-8) uses guillemets (« »)
- Spanish locale (es_ES.UTF-8) uses guillemets (« »)
- Swedish locale (sv_SE.UTF-8) uses ASCII quotes
- C locale uses ASCII quotes

Note: Environment variable precedence testing (LC_ALL > LC_CTYPE > LANG) is already comprehensively covered in unit tests (locale_quotes.rs). These integration tests focus on end-to-end functionality.

fix: address CI failures in locale quoting tests
- Add locale/typography terms (CTYPE, Guillemets, guillemets) to spell checker dictionary
- Skip newline filename test on Windows where such filenames are invalid

optimize locale quoting and clean up code
- Optimize locale parsing to avoid multiple string splits and allocations
- Use efficient find() operations instead of split() for locale parsing
- Remove redundant comments and GNU-specific references for MIT compliance
- Clean up documentation to include only essential API documentation
- Maintain all functionality while improving performance to fix CodSpeed regression
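The locale_quotes module itself is not part of the diff shown on this page, so the following is only a minimal sketch of the behavior the commit message describes: environment variable precedence (LC_ALL > LC_CTYPE > LANG), parsing the language code with a single `find()` instead of repeated `split()` calls, and falling back to ASCII double quotes for unknown locales. The function names `effective_locale`, `language_code`, `quotes_for_language`, and `locale_quote_chars_from_env` are hypothetical, the mapping covers only the quote pairs the commit text states explicitly, and treating an empty variable as unset is an assumption.

```rust
use std::env;

/// Pick the effective locale string using the precedence
/// LC_ALL > LC_CTYPE > LANG. Empty values are treated as unset
/// (an assumption of this sketch). The lookup is injected so the
/// logic can be exercised without touching the process environment.
fn effective_locale(lookup: impl Fn(&str) -> Option<String>) -> Option<String> {
    ["LC_ALL", "LC_CTYPE", "LANG"]
        .iter()
        .filter_map(|name| lookup(*name))
        .find(|value| !value.is_empty())
}

/// Extract the language code from a locale string such as
/// "fr_FR.UTF-8@euro" with one `find`, avoiding repeated splits
/// and intermediate allocations.
fn language_code(locale: &str) -> &str {
    let end = locale
        .find(|c: char| matches!(c, '_' | '.' | '@'))
        .unwrap_or(locale.len());
    &locale[..end]
}

/// Hypothetical, deliberately partial mapping from language code to
/// (opening, closing) quote characters. Unknown codes, including
/// "C" and "POSIX", fall back to ASCII double quotes.
fn quotes_for_language(lang: &str) -> (char, char) {
    match lang {
        "fr" | "es" => ('«', '»'),
        "ja" => ('「', '」'),
        _ => ('"', '"'),
    }
}

/// Resolve the quote pair from the real process environment.
fn locale_quote_chars_from_env() -> (char, char) {
    let locale = effective_locale(|name| env::var(name).ok());
    quotes_for_language(language_code(locale.as_deref().unwrap_or("C")))
}
```

Passing the lookup as a closure keeps the precedence logic testable with a fake environment, which matches the commit's split between unit tests for precedence and integration tests that only set LC_ALL.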
1 parent f43602d commit 27cd0ac

8 files changed

Lines changed: 517 additions & 6 deletions

.vscode/cspell.dictionaries/jargon.wordlist.txt

Lines changed: 5 additions & 0 deletions
@@ -214,3 +214,8 @@ TUNABLES
 tunables
 VMULL
 vmull
+
+# * locale and typography
+CTYPE
+Guillemets
+guillemets

src/uu/ls/locales/en-US.ftl

Lines changed: 1 addition & 0 deletions
@@ -45,6 +45,7 @@ ls-help-set-quoting-style = Set quoting style.
 ls-help-literal-quoting-style = Use literal quoting style. Equivalent to `--quoting-style=literal`
 ls-help-escape-quoting-style = Use escape quoting style. Equivalent to `--quoting-style=escape`
 ls-help-c-quoting-style = Use C quoting style. Equivalent to `--quoting-style=c`
+ls-help-locale-quoting-style = Use locale-aware quoting style. Uses quotation marks appropriate for the current locale (e.g., « » for French, „ " for German, 「 」 for Japanese). Equivalent to `--quoting-style=locale`
 ls-help-replace-control-chars = Replace control characters with '?' if they are not escaped.
 ls-help-show-control-chars = Show control characters 'as is' if they are not escaped.
 ls-help-show-time-field = Show time in <field>:

src/uu/ls/locales/fr-FR.ftl

Lines changed: 1 addition & 0 deletions
@@ -45,6 +45,7 @@ ls-help-set-quoting-style = Définir le style de citation.
 ls-help-literal-quoting-style = Utiliser le style de citation littéral. Équivalent à `--quoting-style=literal`
 ls-help-escape-quoting-style = Utiliser le style de citation d'échappement. Équivalent à `--quoting-style=escape`
 ls-help-c-quoting-style = Utiliser le style de citation C. Équivalent à `--quoting-style=c`
+ls-help-locale-quoting-style = Utiliser le style de citation adapté à la locale. Utilise les guillemets appropriés à la locale actuelle (par ex., « » pour le français, „ " pour l'allemand, 「 」 pour le japonais). Équivalent à `--quoting-style=locale`
 ls-help-replace-control-chars = Remplacer les caractères de contrôle par '?' s'ils ne sont pas échappés.
 ls-help-show-control-chars = Afficher les caractères de contrôle 'tels quels' s'ils ne sont pas échappés.
 ls-help-show-time-field = Afficher l'heure dans <champ> :

src/uu/ls/src/ls.rs

Lines changed: 4 additions & 0 deletions
@@ -664,6 +664,9 @@ fn match_quoting_style_name(style: &str, show_control: bool) -> Option<QuotingSt
         "shell-escape-always" => Some(QuotingStyle::SHELL_ESCAPE_QUOTE),
         "c" => Some(QuotingStyle::C_DOUBLE),
         "escape" => Some(QuotingStyle::C_NO_QUOTES),
+        "locale" => Some(QuotingStyle::C {
+            quotes: uucore::quoting_style::Quotes::Locale,
+        }),
         _ => None,
     }
     .map(|qs| qs.show_control(show_control))
@@ -1364,6 +1367,7 @@ pub fn uu_app() -> Command {
                     PossibleValue::new("shell-escape-always"),
                     PossibleValue::new("c").alias("c-maybe"),
                     PossibleValue::new("escape"),
+                    PossibleValue::new("locale"),
                 ]))
                 .overrides_with_all([
                     QUOTING_STYLE,

src/uucore/src/lib/features/quoting_style/c_quoter.rs

Lines changed: 28 additions & 5 deletions
@@ -3,12 +3,15 @@
 // For the full copyright and license information, please view the LICENSE
 // file that was distributed with this source code.
 
-use super::{EscapedChar, Quoter, Quotes};
+use super::{EscapedChar, Quoter, Quotes, locale_quotes};
 
 pub(super) struct CQuoter {
     /// The type of quotes to use.
     quotes: Quotes,
 
+    /// Closing quote character (for Locale variant).
+    close_quote: char,
+
     dirname: bool,
 
     buffer: Vec<u8>,
@@ -17,18 +20,37 @@ pub(super) struct CQuoter {
 impl CQuoter {
     pub fn new(quotes: Quotes, dirname: bool, size_hint: usize) -> Self {
         let mut buffer = Vec::with_capacity(size_hint);
+
+        let (open_quote, close_quote) = match quotes {
+            Quotes::None => ('\0', '\0'),
+            Quotes::Single => ('\'', '\''),
+            Quotes::Double => ('"', '"'),
+            Quotes::Locale => locale_quotes::get_locale_quote_chars(),
+        };
+
+        // Add opening quote to buffer
         match quotes {
             Quotes::None => (),
-            Quotes::Single => buffer.push(b'\''),
-            Quotes::Double => buffer.push(b'"'),
+            Quotes::Single | Quotes::Double => buffer.push(open_quote as u8),
+            Quotes::Locale => Self::encode_quote_to_buffer(open_quote, &mut buffer),
         }
 
         Self {
             quotes,
+            close_quote,
             dirname,
             buffer,
         }
     }
+
+    /// Helper method to encode a quote character to the buffer.
+    ///
+    /// This handles UTF-8 encoding for locale-specific quote characters.
+    fn encode_quote_to_buffer(quote: char, buffer: &mut Vec<u8>) {
+        let mut buf = [0; 4];
+        let quote_str = quote.encode_utf8(&mut buf);
+        buffer.extend_from_slice(quote_str.as_bytes());
+    }
 }
 
 impl Quoter for CQuoter {
@@ -47,10 +69,11 @@ impl Quoter for CQuoter {
     }
 
     fn finalize(mut self: Box<Self>) -> Vec<u8> {
+        // Add closing quote to buffer
         match self.quotes {
             Quotes::None => (),
-            Quotes::Single => self.buffer.push(b'\''),
-            Quotes::Double => self.buffer.push(b'"'),
+            Quotes::Single | Quotes::Double => self.buffer.push(self.close_quote as u8),
+            Quotes::Locale => Self::encode_quote_to_buffer(self.close_quote, &mut self.buffer),
         }
         self.buffer
     }
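As a standalone illustration of why the Locale variant needs its own encoding path, the sketch below wraps a byte string in an arbitrary quote pair using the same `char::encode_utf8` approach as the helper in this diff. `wrap_in_quotes` is a hypothetical function written for this example and is not part of uucore; it skips escaping entirely and only shows how the opening and closing quotes are written into a byte buffer.

```rust
/// Append `quote` to `buffer` as UTF-8, which takes 1 to 4 bytes
/// depending on the character.
fn encode_quote_to_buffer(quote: char, buffer: &mut Vec<u8>) {
    let mut scratch = [0u8; 4];
    buffer.extend_from_slice(quote.encode_utf8(&mut scratch).as_bytes());
}

/// Hypothetical helper: surround `name` with the given quote pair.
/// No escaping is performed; this only demonstrates the byte layout.
fn wrap_in_quotes(name: &[u8], open: char, close: char) -> Vec<u8> {
    let mut buffer =
        Vec::with_capacity(name.len() + open.len_utf8() + close.len_utf8());
    encode_quote_to_buffer(open, &mut buffer);
    buffer.extend_from_slice(name);
    encode_quote_to_buffer(close, &mut buffer);
    buffer
}
```

For ASCII quotes the `as u8` cast used in the Single and Double arms is lossless, but the same cast on `'«'` (U+00AB) would yield the lone byte 0xAB, which is not valid UTF-8 on its own. Encoding through `encode_utf8` produces the two bytes 0xC2 0xAB instead, and three bytes for the Japanese corner brackets.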
