Skip to content

Clarify decimal datatype constraints and simplify validation#135

Merged
david-waltermire merged 6 commits intodevelopfrom
feature/adjust-decimal-type
Dec 14, 2025
Merged

Clarify decimal datatype constraints and simplify validation#135
david-waltermire merged 6 commits intodevelopfrom
feature/adjust-decimal-type

Conversation

@david-waltermire
Copy link
Copy Markdown
Contributor

@david-waltermire david-waltermire commented Dec 7, 2025

Summary

  • Clarifies the decimal datatype constraints with explicit rules for leading signs, leading zeros, and exponential notation
  • Updates the XML Schema to use xs:double base type with pattern -?(0|[1-9][0-9]*)(\.[0-9]+)?([eE][+-]?[0-9]+)?
  • Removes redundant JSON pattern since type: "number" provides validation
  • Updates documentation to match schema changes
  • Adds comprehensive test cases for decimal validation

Decimal Datatype Rules

  • No leading + - Values like +12.34 are invalid
  • Leading zeros restricted - 01.23 is invalid; 0 and 0.5 are valid
  • Exponential notation supported - 1e3, 1E+4, -2.5e-10 are all valid
  • Whitespace normalized - Leading/trailing whitespace is collapsed per XML Schema rules

Files changed

  • schema/json/metaschema-datatypes.json - Updated description to reflect constraints
  • schema/xml/metaschema-datatypes.xsd - Changed base type to xs:double, updated pattern and documentation
  • website/content/specification/datatypes.md - Updated documentation to match schema changes
  • test/decimal-type/ - Added comprehensive test suite with valid/invalid cases

Test plan

  • Verify decimal values are validated correctly in XML
  • Verify decimal values are validated correctly in JSON
  • Verify documentation renders correctly on website

Summary by CodeRabbit

  • New Features

    • Decimal datatype now allows optional exponential notation (e.g., 1.5e-3).
  • Bug Fixes

    • Decimal validation rejects leading plus signs and disallows improper leading zeros.
  • Documentation

    • Specification pages updated to clarify decimal format, whitespace and leading-zero rules.
  • Tests

    • Added comprehensive test suites and fixtures covering valid and invalid decimal representations.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Dec 7, 2025

📝 Walkthrough

Walkthrough

JSON decimal pattern removed and description updated; XML DecimalDatatype changed from xs:decimal to xs:double with a numeric pattern and expanded docs; a test module with schemas, test data, a Java XML validator, and build config was added.

Changes

Cohort / File(s) Summary
Schema — JSON & XML
schema/json/metaschema-datatypes.json, schema/xml/metaschema-datatypes.xsd
JSON: removed pattern from DecimalDatatype, kept type: "number", and updated description to mention optional exponent and disallowed leading '+'. XML: changed DecimalDatatype base from xs:decimal to xs:double, expanded annotation, and replaced pattern with `-?(0
Documentation
website/content/specification/datatypes.md
Updated decimal description to include optional exponential notation, disallowed leading '+', and clarified leading-zero rules.
Test harness & build
test/decimal-type/README.md, test/decimal-type/XmlValidator.java, test/decimal-type/pom.xml, test/decimal-type/.gitignore
Added test README, a CLI Java XmlValidator (with public static void main), a Maven pom.xml configured to run it, and .gitignore for Java build artifacts.
Test schemas
test/decimal-type/decimal-test.json, test/decimal-type/decimal-test.xsd
Added JSON and XML test schemas describing decimal-tests and test-case; XML value uses restricted xs:double plus the numeric pattern and documentation.
JSON test data
test/decimal-type/json-valid-cases.json, test/decimal-type/json-invalid-cases.json, test/decimal-type/json-test-string-value.json
Added valid and invalid JSON test cases covering integers, decimals, exponential notation, leading plus/zeros, whitespace, and a string-with-whitespace case.
XML test data
test/decimal-type/xml-valid-cases.xml, test/decimal-type/xml-invalid-cases.xml, test/decimal-type/xml-test-leading-plus.xml, test/decimal-type/xml-test-leading-whitespace.xml, test/decimal-type/xml-test-leading-zero.xml, test/decimal-type/xml-test-no-leading-digit.xml, test/decimal-type/xml-test-trailing-decimal.xml
Added XML test suite and individual test files exercising leading plus, leading zeros, missing leading digit, trailing decimal point, whitespace handling, and exponent cases.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~30 minutes

  • Verify XML regex enforces: no leading '+', allows 0 and 0.x, disallows other leading zeros, supports optional fractional and exponent parts.
  • Confirm switching xs:decimalxs:double aligns with intended whitespace handling and numeric interpretation.
  • Ensure JSON schema removal of pattern and test expectations are consistent (JSON numbers vs. string representations).
  • Review XmlValidator.java for secure parser configuration and CLI behavior; confirm pom.xml exec plugin mainClass and source settings.

Possibly related PRs

Suggested reviewers

  • aj-stein

Poem

🐇 I hopped through digits, dot, and tiny e,

No leading plus, no sneaky zeroes for me.
Schemas and tests lined snug in a row,
XML learned the pattern, JSON lets numbers flow,
A tiny rabbit cheers the decimals' new show ✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title 'Clarify decimal datatype constraints and simplify validation' directly and accurately reflects the main changes: clarifying decimal constraints and removing redundant validation (pattern removal in JSON).
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feature/adjust-decimal-type

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
website/content/specification/datatypes.md (1)

316-333: Decimal docs align with schemas; consider clarifying whitespace and JSON enforcement

The decimal section here correctly reflects the schema changes: the XML pattern -?[0-9]+(\.[0-9]+)? matches the XSD, and the JSON example with only "type": "number" matches the JSON schema.

Two implementation-level nuances aren’t obvious from the text, though:

  • For XML, xs:decimal collapses whitespace, so leading/trailing spaces are normalized away before the pattern is applied rather than causing validation failure. Values like " 123 " will still be accepted as 123 even though the prose currently says “no leading or trailing whitespace are allowed.” (herongyang.com)
  • For JSON, the schema example cannot prevent exponent notation (1e2, 1E-3); "type": "number" validates the numeric value, not its lexical spelling. The “no exponential notation” rule is therefore a semantic constraint on producers/consumers rather than something enforced by the JSON Schema alone.

If you want the spec text to match actual validator behavior more precisely, you could adjust the decimal paragraph along these lines:

-A real number expressed using a whole and optional fractional part separated by a period. No leading '+', no exponential notation, and no leading or trailing whitespace are allowed.
+A real number expressed using a whole and optional fractional part separated by a period. No leading '+' and no exponential notation are allowed. In XML, any leading or trailing whitespace is normalized away before validation, and producers SHOULD avoid relying on it.
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d0c1d85 and cfee656.

📒 Files selected for processing (3)
  • schema/json/metaschema-datatypes.json (1 hunks)
  • schema/xml/metaschema-datatypes.xsd (1 hunks)
  • website/content/specification/datatypes.md (2 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2024-11-14T17:07:03.586Z
Learnt from: david-waltermire
Repo: metaschema-framework/metaschema-java PR: 245
File: core/src/main/java/gov/nist/secauto/metaschema/core/metapath/item/atomic/IIPv4AddressItem.java:66-73
Timestamp: 2024-11-14T17:07:03.586Z
Learning: In the Metaschema Java codebase, differences in casting patterns across atomic type implementations are intentional and required; any differences in approach are significant and necessary.

Applied to files:

  • schema/xml/metaschema-datatypes.xsd
🔇 Additional comments (1)
schema/json/metaschema-datatypes.json (1)

45-46: Decimal JSON schema: pattern removal is correct; “no exponent” is semantic only

Dropping the pattern and relying on "type": "number" is the right move here, since JSON Schema patterns only apply to strings and would not have constrained numeric instances anyway. The new description clearly states the intended constraints (no leading +, no exponential notation).

Be aware though that typical JSON Schema validators will still accept values written with exponent notation (e.g., 1e2) as valid numbers; the schema cannot distinguish those from fixed-point decimals. If you need to enforce the “no exponent” rule for JSON instances, you’ll need an additional validation layer or a string-based representation for decimals rather than relying solely on this schema.

Comment thread schema/xml/metaschema-datatypes.xsd
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (2)
test/decimal-type/json-valid-cases.json (1)

43-59: Exponential notation allowed despite PR objectives.

Similar to the XML test file, this JSON test file includes exponential notation as valid (lines 43-59), contradicting the PR objectives which state exponential notation should be explicitly disallowed.

test/decimal-type/decimal-test.xsd (1)

17-26: XML Schema allows exponential notation despite PR objectives.

The documentation (lines 17-22) correctly describes that "optional exponential notation" is allowed, and the pattern on line 25 includes ([eE][+-]?[0-9]+)? to support this. However, this contradicts the PR objectives which state exponential notation should be explicitly disallowed.

The pattern implementation itself appears correct for the documented rules (no leading '+', proper leading-zero handling, optional exponent), but conflicts with the stated PR goals.

🧹 Nitpick comments (6)
schema/xml/metaschema-datatypes.xsd (1)

69-80: Confirm xs:double backing type and whitespace behavior are intentional

The new documentation and pattern look self-consistent (optional exponent, no leading +, and the leading‑zero rule are all correctly enforced by the regex). However, switching the restriction base from xs:decimal to xs:double changes the value semantics to IEEE‑754 doubles and also means leading/trailing whitespace in XML will be collapsed and accepted before pattern matching.

If the intent is a true decimal type (e.g., for money or other exact values) or to reject surrounding whitespace at the XML level, it may be worth double‑checking that xs:double is the right base and, if so, calling out the double/whitespace semantics explicitly in the prose.

website/content/specification/syntax/datatypes.md (1)

316-335: Decimal docs align with code/tests but conflict with stated PR objectives

This section correctly mirrors the XSD and tests: exponent is optional, no leading +, and leading zeros are restricted, with JSON relying on type: "number" for syntax (including exponent support).

The PR objectives you provided still say decimal should disallow exponential notation and show a pattern without the exponent part. To avoid confusion for implementers and reviewers, it would be good to reconcile that description with the current design (either by updating the PR summary or adjusting the docs/schema if exponent was not meant to be allowed). You might also consider adding a short note here about JSON/XML whitespace normalization around numeric values, since that is now part of the test expectations.

test/decimal-type/README.md (1)

10-46: Clarify JSON whitespace example vs string test case

The behavior matrix and notes correctly state that JSON and XML tolerate whitespace around numeric values (JSON parsing and xs:double whitespace collapse), while json-test-string-value.json asserts that the string " 12.34 " is invalid under type: "number".

To avoid confusion, consider adding a brief note near the “Leading/trailing whitespace” row that the JSON ✓ refers to whitespace surrounding a numeric token (e.g., a JSON text like ␣12.34␣), and that the separate string test case covers " 12.34 " as a string value being rejected by type: "number".

test/decimal-type/xml-invalid-cases.xml (1)

1-26: Tighten wording in xml-invalid-cases catalog

The comment header and filename frame this as “invalid decimal test cases,” but the list also includes xml-test-leading-whitespace.xml as a valid case and the placeholder test is expected="valid". To reduce reader confusion, consider tweaking the header (e.g., “edge decimal test cases for XML validation”) or explicitly splitting the bullets into “invalid” vs “valid-but-notable” subsections.

test/decimal-type/json-invalid-cases.json (1)

1-47: Consider using string values for JSON invalid test cases.

The current approach documents invalid cases in _invalid_cases_documentation because they cannot be represented as JSON numbers. However, you could test these scenarios by using string values and validating that they fail type: "number" validation (as mentioned in lines 38 and 44).

For example:

{
  "id": "json-leading-plus-string",
  "description": "Leading plus as string should fail number validation",
  "expected": "invalid",
  "value": "+12.34"
}

This would make the test cases executable rather than purely documentary.

test/decimal-type/XmlValidator.java (1)

24-27: Consider more specific exception handling for better diagnostics.

The current code catches all exceptions generically, which works but provides limited diagnostic information. For a test utility, distinguishing between different error types could be helpful:

  • SAXException - validation errors (schema violations)
  • IOException - file access errors
  • SAXParseException - XML parsing errors

This would allow more specific error messages and potentially different exit codes for different failure types.

-        } catch (Exception e) {
-            System.out.println(xmlPath + " is INVALID: " + e.getMessage());
-            System.exit(1);
+        } catch (org.xml.sax.SAXException e) {
+            System.out.println(xmlPath + " is INVALID (validation error): " + e.getMessage());
+            System.exit(1);
+        } catch (java.io.IOException e) {
+            System.out.println("Error reading files: " + e.getMessage());
+            System.exit(2);
         }

However, the current implementation is acceptable for a simple test utility.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cfee656 and 71c731f.

⛔ Files ignored due to path filters (1)
  • test/decimal-type/XmlValidator.class is excluded by !**/*.class
📒 Files selected for processing (17)
  • schema/xml/metaschema-datatypes.xsd (1 hunks)
  • test/decimal-type/README.md (1 hunks)
  • test/decimal-type/XmlValidator.java (1 hunks)
  • test/decimal-type/decimal-test.json (1 hunks)
  • test/decimal-type/decimal-test.xsd (1 hunks)
  • test/decimal-type/json-invalid-cases.json (1 hunks)
  • test/decimal-type/json-test-string-value.json (1 hunks)
  • test/decimal-type/json-valid-cases.json (1 hunks)
  • test/decimal-type/xml-invalid-cases.xml (1 hunks)
  • test/decimal-type/xml-test-leading-plus.xml (1 hunks)
  • test/decimal-type/xml-test-leading-whitespace.xml (1 hunks)
  • test/decimal-type/xml-test-leading-zero.xml (1 hunks)
  • test/decimal-type/xml-test-no-leading-digit.xml (1 hunks)
  • test/decimal-type/xml-test-trailing-decimal.xml (1 hunks)
  • test/decimal-type/xml-valid-cases.xml (1 hunks)
  • website/content/specification/datatypes.md (2 hunks)
  • website/content/specification/syntax/datatypes.md (2 hunks)
✅ Files skipped from review due to trivial changes (2)
  • test/decimal-type/xml-test-leading-whitespace.xml
  • test/decimal-type/xml-test-leading-plus.xml
🚧 Files skipped from review as they are similar to previous changes (1)
  • website/content/specification/datatypes.md
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: david-waltermire
Repo: metaschema-framework/metaschema-java PR: 245
File: core/src/main/java/gov/nist/secauto/metaschema/core/metapath/item/atomic/IIPv4AddressItem.java:66-73
Timestamp: 2024-11-14T17:07:03.586Z
Learning: In the Metaschema Java codebase, differences in casting patterns across atomic type implementations are intentional and required; any differences in approach are significant and necessary.
📚 Learning: 2024-11-14T17:07:03.586Z
Learnt from: david-waltermire
Repo: metaschema-framework/metaschema-java PR: 245
File: core/src/main/java/gov/nist/secauto/metaschema/core/metapath/item/atomic/IIPv4AddressItem.java:66-73
Timestamp: 2024-11-14T17:07:03.586Z
Learning: In the Metaschema Java codebase, differences in casting patterns across atomic type implementations are intentional and required; any differences in approach are significant and necessary.

Applied to files:

  • schema/xml/metaschema-datatypes.xsd
🔇 Additional comments (5)
test/decimal-type/xml-test-no-leading-digit.xml (1)

1-7: No-leading-digit XML test matches DecimalDatatype constraints

.47 is correctly marked as expected="invalid" given the pattern requires a leading digit before the decimal point. This test nicely exercises that edge case.

test/decimal-type/xml-test-trailing-decimal.xml (1)

1-7: Trailing-decimal XML test correctly marked invalid

123. is invalid under the DecimalDatatype pattern (requires digits after .), so this focused test case aligns with the schema rules.

test/decimal-type/xml-test-leading-zero.xml (1)

1-7: Leading-zero XML test aligns with documented rule

01.23 is correctly expected to be invalid, matching the pattern and the “no leading zeros except 0 or <1” rule in the documentation.

test/decimal-type/json-test-string-value.json (1)

1-12: JSON string-with-whitespace test correctly defends type:number semantics

This test case cleanly verifies that a string " 12.34 " is rejected when the schema uses type: "number", complementing the README’s note about whitespace around numeric tokens. Looks good as a guard against accidental string coercion in test harnesses.

test/decimal-type/XmlValidator.java (1)

8-29: LGTM!

The XmlValidator utility is straightforward and fit for purpose as a test validation tool. The implementation correctly:

  • Validates command-line arguments
  • Creates an XML Schema validator from the XSD file
  • Validates the XML document
  • Reports results with appropriate exit codes

Comment thread test/decimal-type/decimal-test.json
Comment thread test/decimal-type/json-valid-cases.json
Comment thread test/decimal-type/xml-valid-cases.xml
@david-waltermire david-waltermire changed the title Clarify decimal datatype constraints and simplify validation [WIP] Clarify decimal datatype constraints and simplify validation Dec 9, 2025
@david-waltermire david-waltermire marked this pull request as ready for review December 13, 2025 15:18
@david-waltermire
Copy link
Copy Markdown
Contributor Author

There is some support for this modified approach in usnistgov/metaschema#795 and no objections so far. I'd like to get this in for Metaschema 1.0.0.rc1 and for the updated liboscal-java and oscal-cli release for OSCAL 1.2.0. If there is a desire to go a different direction, then this can be addressed in a Metaschema 1.0.0.rc2 or 1.0.0 final release.

@david-waltermire david-waltermire changed the title [WIP] Clarify decimal datatype constraints and simplify validation Clarify decimal datatype constraints and simplify validation Dec 13, 2025
- Update description to explicitly disallow leading '+' and exponential notation
- Simplify XML pattern to explicitly match decimal format: -?[0-9]+(\.[0-9]+)?
- Remove redundant JSON pattern since type: "number" provides validation
… notation

- Change base type from xs:decimal to xs:double to support exponential notation
- Update pattern to -?(0|[1-9][0-9]*)(\.[0-9]+)?([eE][+-]?[0-9]+)?
- Add test schemas and content examples for decimal validation
- Update specification documentation to reflect new pattern and behavior
- Achieve 8/8 alignment between JSON and XML validation behavior
- Update JSON schema description to allow exponential notation (aligns
  with XML schema which supports optional exponential notation)
- Update test schema description in decimal-test.json to match
- Clarify misleading test case description for leading zeros
@david-waltermire david-waltermire force-pushed the feature/adjust-decimal-type branch from 03f0d2b to 962d4c2 Compare December 13, 2025 21:18
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
test/decimal-type/XmlValidator.java (1)

18-27: Consider adding file existence checks for clearer error messages.

Currently, if a file path is invalid, the exception message may be less user-friendly. This is optional for a test utility.

         try {
+            File schemaFile = new File(schemaPath);
+            File xmlFile = new File(xmlPath);
+            if (!schemaFile.exists()) {
+                System.out.println("Schema file not found: " + schemaPath);
+                System.exit(1);
+            }
+            if (!xmlFile.exists()) {
+                System.out.println("XML file not found: " + xmlPath);
+                System.exit(1);
+            }
             SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
-            Schema schema = factory.newSchema(new File(schemaPath));
+            Schema schema = factory.newSchema(schemaFile);
             Validator validator = schema.newValidator();
-            validator.validate(new StreamSource(new File(xmlPath)));
+            validator.validate(new StreamSource(xmlFile));
             System.out.println(xmlPath + " is VALID");
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 03f0d2b and 962d4c2.

⛔ Files ignored due to path filters (1)
  • test/decimal-type/XmlValidator.class is excluded by !**/*.class
📒 Files selected for processing (17)
  • schema/json/metaschema-datatypes.json (1 hunks)
  • schema/xml/metaschema-datatypes.xsd (1 hunks)
  • test/decimal-type/README.md (1 hunks)
  • test/decimal-type/XmlValidator.java (1 hunks)
  • test/decimal-type/decimal-test.json (1 hunks)
  • test/decimal-type/decimal-test.xsd (1 hunks)
  • test/decimal-type/json-invalid-cases.json (1 hunks)
  • test/decimal-type/json-test-string-value.json (1 hunks)
  • test/decimal-type/json-valid-cases.json (1 hunks)
  • test/decimal-type/xml-invalid-cases.xml (1 hunks)
  • test/decimal-type/xml-test-leading-plus.xml (1 hunks)
  • test/decimal-type/xml-test-leading-whitespace.xml (1 hunks)
  • test/decimal-type/xml-test-leading-zero.xml (1 hunks)
  • test/decimal-type/xml-test-no-leading-digit.xml (1 hunks)
  • test/decimal-type/xml-test-trailing-decimal.xml (1 hunks)
  • test/decimal-type/xml-valid-cases.xml (1 hunks)
  • website/content/specification/datatypes.md (2 hunks)
✅ Files skipped from review due to trivial changes (1)
  • test/decimal-type/xml-invalid-cases.xml
🚧 Files skipped from review as they are similar to previous changes (8)
  • test/decimal-type/json-test-string-value.json
  • website/content/specification/datatypes.md
  • test/decimal-type/xml-valid-cases.xml
  • test/decimal-type/xml-test-leading-whitespace.xml
  • test/decimal-type/json-valid-cases.json
  • test/decimal-type/decimal-test.json
  • test/decimal-type/json-invalid-cases.json
  • schema/json/metaschema-datatypes.json
🧰 Additional context used
📓 Path-based instructions (1)
schema/**/*.{json,xsd}

📄 CodeRabbit inference engine (CLAUDE.md)

Regex patterns in datatype definitions must be valid for both ECMA-262 (JSON Schema) and XML Schema dialects. Both support Unicode property escapes (\p{L}, \p{N})

Files:

  • schema/xml/metaschema-datatypes.xsd
🧠 Learnings (8)
📓 Common learnings
Learnt from: david-waltermire
Repo: metaschema-framework/metaschema PR: 135
File: test/decimal-type/xml-valid-cases.xml:37-52
Timestamp: 2025-12-08T22:31:21.314Z
Learning: The decimal datatype in the metaschema framework supports exponential notation (both lowercase 'e' and uppercase 'E', with optional +/- signs in the exponent). Both XML and JSON schemas should allow exponential notation for decimal values.
Learnt from: CR
Repo: metaschema-framework/metaschema PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-13T21:16:12.270Z
Learning: Applies to schema/**/*.{json,xsd} : Regex patterns in datatype definitions must be valid for both ECMA-262 (JSON Schema) and XML Schema dialects. Both support Unicode property escapes (`\p{L}`, `\p{N}`)
Learnt from: CR
Repo: metaschema-framework/metaschema PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-13T21:16:12.270Z
Learning: When adding or modifying datatypes, update `schema/json/metaschema-datatypes.json`, `schema/xml/metaschema-datatypes.xsd`, and `website/content/specification/datatypes.md` in sync
Learnt from: CR
Repo: metaschema-framework/metaschema PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-13T21:16:12.270Z
Learning: Keep `schema/json/`, `schema/xml/`, and `website/content/specification/datatypes.md` in sync when updating schema definitions
📚 Learning: 2025-12-08T22:31:15.322Z
Learnt from: david-waltermire
Repo: metaschema-framework/metaschema PR: 135
File: test/decimal-type/xml-valid-cases.xml:37-52
Timestamp: 2025-12-08T22:31:15.322Z
Learning: Ensure that the decimal datatype in the metaschema framework supports exponential notation in both lower-case 'e' and upper-case 'E', with optional +/- signs in the exponent, and that both XML and JSON schemas allow this notation. Update tests and schema validations accordingly.

Applied to files:

  • test/decimal-type/xml-test-no-leading-digit.xml
  • test/decimal-type/xml-test-trailing-decimal.xml
  • test/decimal-type/xml-test-leading-plus.xml
  • test/decimal-type/xml-test-leading-zero.xml
📚 Learning: 2025-12-08T22:31:21.314Z
Learnt from: david-waltermire
Repo: metaschema-framework/metaschema PR: 135
File: test/decimal-type/xml-valid-cases.xml:37-52
Timestamp: 2025-12-08T22:31:21.314Z
Learning: The decimal datatype in the metaschema framework supports exponential notation (both lowercase 'e' and uppercase 'E', with optional +/- signs in the exponent). Both XML and JSON schemas should allow exponential notation for decimal values.

Applied to files:

  • schema/xml/metaschema-datatypes.xsd
  • test/decimal-type/README.md
  • test/decimal-type/decimal-test.xsd
📚 Learning: 2025-12-13T21:16:12.270Z
Learnt from: CR
Repo: metaschema-framework/metaschema PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-13T21:16:12.270Z
Learning: When adding or modifying datatypes, update `schema/json/metaschema-datatypes.json`, `schema/xml/metaschema-datatypes.xsd`, and `website/content/specification/datatypes.md` in sync

Applied to files:

  • schema/xml/metaschema-datatypes.xsd
  • test/decimal-type/README.md
📚 Learning: 2025-12-13T21:16:12.270Z
Learnt from: CR
Repo: metaschema-framework/metaschema PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-13T21:16:12.270Z
Learning: The data type documentation in `website/content/specification/datatypes.md` must stay in sync with schema definitions in `schema/json/metaschema-datatypes.json` and `schema/xml/metaschema-datatypes.xsd`

Applied to files:

  • schema/xml/metaschema-datatypes.xsd
  • test/decimal-type/README.md
📚 Learning: 2024-11-14T17:07:03.586Z
Learnt from: david-waltermire
Repo: metaschema-framework/metaschema-java PR: 245
File: core/src/main/java/gov/nist/secauto/metaschema/core/metapath/item/atomic/IIPv4AddressItem.java:66-73
Timestamp: 2024-11-14T17:07:03.586Z
Learning: In the Metaschema Java codebase, differences in casting patterns across atomic type implementations are intentional and required; any differences in approach are significant and necessary.

Applied to files:

  • schema/xml/metaschema-datatypes.xsd
📚 Learning: 2025-12-13T21:16:12.270Z
Learnt from: CR
Repo: metaschema-framework/metaschema PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-13T21:16:12.270Z
Learning: Applies to schema/**/*.{json,xsd} : Regex patterns in datatype definitions must be valid for both ECMA-262 (JSON Schema) and XML Schema dialects. Both support Unicode property escapes (`\p{L}`, `\p{N}`)

Applied to files:

  • schema/xml/metaschema-datatypes.xsd
  • test/decimal-type/README.md
📚 Learning: 2025-12-13T21:16:12.270Z
Learnt from: CR
Repo: metaschema-framework/metaschema PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-13T21:16:12.270Z
Learning: Keep `schema/json/`, `schema/xml/`, and `website/content/specification/datatypes.md` in sync when updating schema definitions

Applied to files:

  • test/decimal-type/README.md
🔇 Additional comments (8)
test/decimal-type/xml-test-trailing-decimal.xml (1)

1-8: LGTM!

The test case correctly validates that a trailing decimal point (123.) is invalid per the decimal pattern. The pattern (\.[0-9]+)? requires at least one digit after the decimal point when present.

test/decimal-type/README.md (1)

1-57: LGTM!

The README provides comprehensive documentation of the decimal datatype test cases. The behavior matrix clearly shows the expected validation outcomes, and the explanation of whitespace handling correctly notes that both XML Schema and JSON normalize whitespace around numeric values.

schema/xml/metaschema-datatypes.xsd (1)

69-81: LGTM!

The DecimalDatatype changes are well-implemented:

  • Base type xs:double correctly supports exponential notation
  • Pattern -?(0|[1-9][0-9]*)(\.[0-9]+)?([eE][+-]?[0-9]+)? properly enforces: no leading +, no invalid leading zeros, required digit after decimal point, and optional exponential notation
  • Documentation accurately describes the constraints without the previous inaccurate whitespace claim

Based on learnings, the exponential notation support aligns with the framework requirements.

test/decimal-type/decimal-test.xsd (1)

1-46: LGTM!

The test schema is well-structured and maintains consistency with the main metaschema-datatypes.xsd:

  • Pattern and base type (xs:double) match exactly
  • Documentation mirrors the main schema
  • Test case structure with id, description, and expected attributes provides clear test semantics
test/decimal-type/XmlValidator.java (1)

8-28: LGTM!

The XmlValidator utility is a straightforward tool for testing XML validation against XSD schemas. The implementation correctly:

  • Validates argument count with appropriate exit code
  • Uses standard JAXP APIs for schema validation
  • Provides clear success/failure output
test/decimal-type/xml-test-leading-plus.xml (1)

1-8: Test file structure is sound.

The XML test case correctly validates that a leading plus sign (+12.34) is rejected under the updated decimal constraints.

test/decimal-type/xml-test-no-leading-digit.xml (1)

1-8: Test file validates no-leading-digit constraint correctly.

The test case properly rejects decimals without a leading digit (.47), which aligns with the explicit digit requirement in the updated decimal format.

test/decimal-type/xml-test-leading-zero.xml (1)

1-8: Test file correctly validates leading-zero constraint.

The test case properly rejects decimals with leading zeros (01.23), which aligns with the no-leading-zero rule in the updated decimal format.

Copy link
Copy Markdown

@aj-stein aj-stein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fantastic except for one small bit, you want the compiled .class file in there?

Comment thread test/decimal-type/XmlValidator.class Outdated
- Remove XmlValidator.class (compiled artifacts should not be committed)
- Add pom.xml for building and running the XmlValidator
- Add .gitignore to prevent future class file commits
- Update README with Maven build/run instructions
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
test/decimal-type/pom.xml (2)

22-22: Clarify sourceDirectory and use standard Maven layout.

Setting sourceDirectory to ${project.basedir} is non-standard; Maven conventions prefer src/main/java. This approach works for a simple test utility but may cause confusion. If this flat structure is intentional for the test suite, document it clearly.


39-39: Qualify the mainClass with its package name.

Specifying XmlValidator without a package assumes the class is in the default package, which is not best practice in Java. Use the fully qualified class name (e.g., gov.nist.secauto.metaschema.test.XmlValidator) to be explicit.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 962d4c2 and 8fd3316.

📒 Files selected for processing (3)
  • test/decimal-type/.gitignore (1 hunks)
  • test/decimal-type/README.md (1 hunks)
  • test/decimal-type/pom.xml (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • test/decimal-type/README.md
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: david-waltermire
Repo: metaschema-framework/metaschema PR: 135
File: test/decimal-type/xml-valid-cases.xml:37-52
Timestamp: 2025-12-08T22:31:21.314Z
Learning: The decimal datatype in the metaschema framework supports exponential notation (both lowercase 'e' and uppercase 'E', with optional +/- signs in the exponent). Both XML and JSON schemas should allow exponential notation for decimal values.
Learnt from: CR
Repo: metaschema-framework/metaschema PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-13T21:16:12.270Z
Learning: When adding or modifying datatypes, update `schema/json/metaschema-datatypes.json`, `schema/xml/metaschema-datatypes.xsd`, and `website/content/specification/datatypes.md` in sync
Learnt from: CR
Repo: metaschema-framework/metaschema PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-13T21:16:12.270Z
Learning: Applies to schema/**/*.{json,xsd} : Regex patterns in datatype definitions must be valid for both ECMA-262 (JSON Schema) and XML Schema dialects. Both support Unicode property escapes (`\p{L}`, `\p{N}`)
Learnt from: CR
Repo: metaschema-framework/metaschema PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-13T21:16:12.270Z
Learning: Keep `schema/json/`, `schema/xml/`, and `website/content/specification/datatypes.md` in sync when updating schema definitions
📚 Learning: 2025-12-08T22:31:15.322Z
Learnt from: david-waltermire
Repo: metaschema-framework/metaschema PR: 135
File: test/decimal-type/xml-valid-cases.xml:37-52
Timestamp: 2025-12-08T22:31:15.322Z
Learning: Ensure that the decimal datatype in the metaschema framework supports exponential notation in both lower-case 'e' and upper-case 'E', with optional +/- signs in the exponent, and that both XML and JSON schemas allow this notation. Update tests and schema validations accordingly.

Applied to files:

  • test/decimal-type/pom.xml
🔇 Additional comments (3)
test/decimal-type/.gitignore (1)

1-5: LGTM!

Standard Maven/Java .gitignore patterns. No issues.

test/decimal-type/pom.xml (2)

1-43: No action required—the decimal pattern correctly supports exponential notation as indicated in learnings.

The verification confirms that both schema implementations correctly support exponential notation for the decimal datatype:

  • XML Schema pattern: -?(0|[1-9][0-9]*)(\.[0-9]+)?([eE][+-]?[0-9]+)? explicitly includes the exponent group ([eE][+-]?[0-9]+)?, permitting lowercase or uppercase 'e'/'E' with optional '+'/'-' signs.
  • JSON Schema: Uses native "type": "number", which natively supports exponential notation, with the description explicitly stating "optional exponential notation."

The PR objectives cited in the original comment appear to have been incomplete or misquoted. The actual implementation aligns with the learnings and correctly implements decimal exponential notation support.


29-31: No additional dependencies required.

The code uses only javax.xml.* classes from Java's standard library (available in JDK 11+). XmlValidator.java performs XML schema validation using SchemaFactory, Schema, and Validator—all built-in standard library classes. No external XML or JSON parsing dependencies need to be declared in pom.xml.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
test/decimal-type/XmlValidator.java (1)

9-13: Use stderr + better diagnostics for CI failures.
Right now failures only print e.getMessage(); in practice, schema validation errors can be opaque without location/cause. Consider printing e (or stack trace) to stderr.

 } catch (Exception e) {
-    System.err.println(xmlPath + " is INVALID: " + e.getMessage());
+    System.err.println(xmlPath + " is INVALID: " + e.getMessage());
+    e.printStackTrace(System.err);
     System.exit(1);
 }

Also applies to: 36-39

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8fd3316 and edb9c6c.

📒 Files selected for processing (1)
  • test/decimal-type/XmlValidator.java (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: david-waltermire
Repo: metaschema-framework/metaschema PR: 135
File: test/decimal-type/xml-valid-cases.xml:37-52
Timestamp: 2025-12-08T22:31:21.314Z
Learning: The decimal datatype in the metaschema framework supports exponential notation (both lowercase 'e' and uppercase 'E', with optional +/- signs in the exponent). Both XML and JSON schemas should allow exponential notation for decimal values.
Learnt from: CR
Repo: metaschema-framework/metaschema PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-13T21:16:12.270Z
Learning: When adding or modifying datatypes, update `schema/json/metaschema-datatypes.json`, `schema/xml/metaschema-datatypes.xsd`, and `website/content/specification/datatypes.md` in sync
Learnt from: CR
Repo: metaschema-framework/metaschema PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-13T21:16:12.270Z
Learning: Applies to schema/**/*.{json,xsd} : Regex patterns in datatype definitions must be valid for both ECMA-262 (JSON Schema) and XML Schema dialects. Both support Unicode property escapes (`\p{L}`, `\p{N}`)
Learnt from: CR
Repo: metaschema-framework/metaschema PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-13T21:16:12.270Z
Learning: Keep `schema/json/`, `schema/xml/`, and `website/content/specification/datatypes.md` in sync when updating schema definitions
Learnt from: CR
Repo: metaschema-framework/metaschema PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-13T21:16:12.270Z
Learning: The data type documentation in `website/content/specification/datatypes.md` must stay in sync with schema definitions in `schema/json/metaschema-datatypes.json` and `schema/xml/metaschema-datatypes.xsd`
Learnt from: david-waltermire
Repo: metaschema-framework/metaschema-java PR: 245
File: core/src/main/java/gov/nist/secauto/metaschema/core/metapath/item/atomic/IIPv4AddressItem.java:66-73
Timestamp: 2024-11-14T17:07:03.586Z
Learning: In the Metaschema Java codebase, differences in casting patterns across atomic type implementations are intentional and required; any differences in approach are significant and necessary.
🔇 Additional comments (1)
test/decimal-type/XmlValidator.java (1)

10-28: Argument/file checks are clear and test-friendly.
Usage + missing-file checks are straightforward and the exit code behavior matches typical test harness expectations.

Comment thread test/decimal-type/XmlValidator.java
Add security measures to prevent XML External Entity (XXE) and
SSRF-style attacks:
- Enable FEATURE_SECURE_PROCESSING on SchemaFactory
- Disable ACCESS_EXTERNAL_DTD and ACCESS_EXTERNAL_SCHEMA on both
  SchemaFactory and Validator
- Gracefully handle implementations that don't support these properties
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
test/decimal-type/XmlValidator.java (1)

12-30: Consider using System.err for error messages.

Lines 13, 24, and 28 use System.out for error/usage messages, while lines 42, 52, and 58 use System.err. For consistency and conventional practice, all error messages should go to System.err.

Apply this diff to make error output consistent:

     if (args.length < 2) {
-        System.out.println("Usage: java XmlValidator <schema.xsd> <document.xml>");
+        System.err.println("Usage: java XmlValidator <schema.xsd> <document.xml>");
         System.exit(1);
     }
     if (!schemaFile.exists()) {
-        System.out.println("Schema file not found: " + schemaPath);
+        System.err.println("Schema file not found: " + schemaPath);
         System.exit(1);
     }
     if (!xmlFile.exists()) {
-        System.out.println("XML file not found: " + xmlPath);
+        System.err.println("XML file not found: " + xmlPath);
         System.exit(1);
     }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between edb9c6c and 9dc8197.

📒 Files selected for processing (1)
  • test/decimal-type/XmlValidator.java (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: david-waltermire
Repo: metaschema-framework/metaschema PR: 135
File: test/decimal-type/xml-valid-cases.xml:37-52
Timestamp: 2025-12-08T22:31:21.314Z
Learning: The decimal datatype in the metaschema framework supports exponential notation (both lowercase 'e' and uppercase 'E', with optional +/- signs in the exponent). Both XML and JSON schemas should allow exponential notation for decimal values.
Learnt from: CR
Repo: metaschema-framework/metaschema PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-13T21:16:12.270Z
Learning: When adding or modifying datatypes, update `schema/json/metaschema-datatypes.json`, `schema/xml/metaschema-datatypes.xsd`, and `website/content/specification/datatypes.md` in sync
Learnt from: CR
Repo: metaschema-framework/metaschema PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-13T21:16:12.270Z
Learning: Applies to schema/**/*.{json,xsd} : Regex patterns in datatype definitions must be valid for both ECMA-262 (JSON Schema) and XML Schema dialects. Both support Unicode property escapes (`\p{L}`, `\p{N}`)
Learnt from: CR
Repo: metaschema-framework/metaschema PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-13T21:16:12.270Z
Learning: Keep `schema/json/`, `schema/xml/`, and `website/content/specification/datatypes.md` in sync when updating schema definitions
Learnt from: CR
Repo: metaschema-framework/metaschema PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-13T21:16:12.270Z
Learning: The data type documentation in `website/content/specification/datatypes.md` must stay in sync with schema definitions in `schema/json/metaschema-datatypes.json` and `schema/xml/metaschema-datatypes.xsd`
🔇 Additional comments (3)
test/decimal-type/XmlValidator.java (3)

32-43: Excellent security hardening!

The XXE/SSRF protections are properly implemented with FEATURE_SECURE_PROCESSING and external access restrictions on both DTD and schema. The graceful handling of unsupported properties ensures compatibility across different JAXP implementations.


45-53: Defense-in-depth approach is solid.

Applying external access restrictions to both the SchemaFactory and the Validator ensures comprehensive protection against external entity resolution at multiple layers.


55-56: Validation logic is correct.

The validation executes properly and reports success. Note that the StreamSource isn't explicitly closed, but this is acceptable for a short-lived CLI tool that exits immediately after validation.

@david-waltermire david-waltermire merged commit 868f1ef into develop Dec 14, 2025
7 checks passed
@github-project-automation github-project-automation bot moved this from In review to Done in Spec and Tooling Work Board Dec 14, 2025
@david-waltermire david-waltermire deleted the feature/adjust-decimal-type branch December 14, 2025 12:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

2 participants