Skip to content

Conversation

@bradcypert
Copy link
Owner

Implements matches_regex and not_matches_regex assertion types in parser.zig to allow for regex-based assertions.

Changes

  • Added matches_regex and not_matches_regex to AssertionType enum
  • Added string parsing support for both regex types
  • Implemented basic regex matcher with support for common patterns
  • Added assertion checking logic for both regex types
  • Added comprehensive test case

Closes #10

Generated with Claude Code

- Add matches_regex and not_matches_regex to AssertionType enum in parser.zig
- Add string parsing support for both regex types in fromString method
- Implement basic regex matcher with support for:
  - .* (any characters)
  - ^ (start anchor) and $ (end anchor)
  - . (any single character)
  - [abc] (character classes)
  - Literal text matching
- Add assertion checking logic for both regex types in assertion_checker.zig
- Add comprehensive test case for regex functionality

Co-authored-by: Brad <[email protected]>
@claude
Copy link
Contributor

claude bot commented Jul 18, 2025

Claude finished @bradcypert's task —— View job


Code Review Complete

Todo List:

  • Read repository setup instructions (CLAUDE.md) - No CLAUDE.md found
  • Analyze src/httpfile/assertion_checker.zig changes
  • Analyze src/httpfile/parser.zig changes
  • Review code quality and best practices
  • Check for potential bugs and edge cases
  • Evaluate performance considerations
  • Assess security concerns
  • Review test coverage
  • Provide comprehensive feedback

Overall Assessment

This PR implements a functional regex matching system for HTTP assertions. The implementation is clean and integrates well with the existing codebase, but there are several important considerations for a production system.

👍 Strengths

  • Clean Integration: The new assertion types follow existing patterns perfectly
  • Comprehensive Test: Good test coverage with realistic scenarios across status, body, and headers
  • Consistent Error Handling: Follows the established error patterns with descriptive messages
  • Memory Management: Proper handling of allocator usage consistent with existing code

⚠️ Critical Security Concerns

ReDoS (Regular Expression Denial of Service) Vulnerability

The custom regex implementation in src/httpfile/assertion_checker.zig:57-151 is vulnerable to catastrophic backtracking attacks:

// This pattern could cause exponential time complexity:
// "a*a*a*a*a*a*a*a*a*b" against "aaaaaaaaaa"

Recommendation: Consider using a finite automaton approach or limiting pattern complexity.

Stack Overflow Risk

The recursive matchesRegexAt function (src/httpfile/assertion_checker.zig:89) could cause stack overflow with deeply nested patterns:

// Deep recursion in lines 100, 108 could overflow stack
if (matchesRegexAt(text, pattern[p_pos..], t_pos)) |end_pos| {

🐛 Potential Bugs

1. Character Class Limitations

The character class implementation (src/httpfile/assertion_checker.zig:120-133) is overly simplified:

  • No support for ranges like [a-z] or [0-9]
  • No negation support [^abc]
  • No escape sequence handling within classes

2. Missing Escape Sequences

The implementation doesn't support common regex escape sequences like \d, \w, \s, \n, etc.

3. Edge Case in Anchor Handling

In src/httpfile/assertion_checker.zig:66, there's a potential issue:

if (ends_with_anchor and actual_pattern.len > 0) actual_pattern = actual_pattern[0..actual_pattern.len - 1];

This could incorrectly strip the last character if it's not actually an anchor.

🚀 Performance Considerations

Exponential Time Complexity

The backtracking algorithm can have O(2^n) time complexity in worst cases. Consider patterns like (a*)*b against strings without 'b'.

Memory Usage

Multiple string slicing operations could be optimized for large texts.

🧪 Test Coverage Recommendations

While the existing test is good, consider adding tests for:

// Edge cases that should be tested:
test "regex edge cases" {
    // Empty patterns
    // Invalid patterns  
    // Patterns with only anchors "^$"
    // Complex nested patterns
    // Very long input strings
    // Patterns with special characters that aren't implemented
}

📝 Minor Code Quality Issues

Documentation

Consider adding function documentation for the regex functions:

/// Matches text against a simple regex pattern
/// Supports: ., *, [], ^, $
/// Does not support: +, ?, {n,m}, \d, \w, \s
fn matchesRegex(text: []const u8, pattern: []const u8) bool {

Error Handling

Consider more specific error types for regex parsing failures.

Recommendation

The implementation works for basic use cases, but for production use consider:

  1. Security: Implement pattern validation and recursion limits
  2. Robustness: Add support for more regex features or clearly document limitations
  3. Performance: Consider using a more efficient algorithm for complex patterns
  4. Testing: Add more edge case tests

The code quality is good and follows project conventions well. The main concerns are around the security and robustness of the regex implementation itself.


Copy link
Owner Author

@bradcypert bradcypert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@claude we shouldn't hand roll out own regex implementation. It's unfortunate that zig doesn't offer this in its standard library, let's use a 3rd party one for now.

@claude
Copy link
Contributor

claude bot commented Jul 26, 2025

Claude finished @bradcypert's task —— View job


✅ Task Complete

Successfully replaced the hand-rolled regex implementation with a 3rd party library as requested.

Todo List:

  • Read current hand-rolled regex implementation
  • Research available 3rd party Zig regex libraries
  • Check project build configuration and dependency management
  • Add zig-regex dependency to build.zig.zon
  • Update build.zig to include regex module
  • Replace custom regex functions with library calls
  • Test the changes with existing test cases
  • Commit and push the updated implementation

Changes Made:

🔧 Dependencies:

  • Added zig-regex by tiehuis to build.zig.zon with specific commit hash
  • Updated build.zig to include regex module in dependencies

🚀 Code Improvements:

  • Replaced 94-line hand-rolled regex implementation with clean 10-line library call
  • Removed complex matchesRegex() and matchesRegexAt() functions
  • New implementation uses regex.compile() and match() from zig-regex library
  • Added proper memory management with ArenaAllocator

🛡️ Security Benefits:

  • Eliminated potential ReDoS (Regular Expression Denial of Service) vulnerabilities
  • Removed recursive function that could cause stack overflow
  • No more exponential time complexity issues with complex patterns

📈 Maintainability:

  • Much simpler and cleaner codebase
  • Leverages well-tested, community-maintained regex library
  • Easier to debug and extend regex functionality

The implementation maintains the same API (matchesRegex(text, pattern) -> bool) so all existing tests and usage remain compatible.

Commit: e8e05db - Replace hand-rolled regex implementation with zig-regex library


- Add zig-regex dependency to build.zig.zon and build.zig
- Replace custom matchesRegex() and matchesRegexAt() functions with library call
- Simplify implementation from 94 lines to 10 lines
- Improve security by removing potential ReDoS vulnerabilities
- Use proper memory management with ArenaAllocator

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Brad <[email protected]>
@bradcypert bradcypert merged commit eba5871 into main Jul 26, 2025
0 of 3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Implement matches_regex assertion type in parser.zig

2 participants