fix(api): enforce configurable upload size limit #231
📝 Walkthrough

This PR implements configurable file upload size limits to prevent denial-of-service attacks via large uploads. It adds a new UPLOAD_MAX_BYTES environment variable parsed into the application settings, enforces the limit using chunked reading with early rejection (HTTP 413) in the /api/scan/upload endpoint, and provides test coverage across development and production deployment configurations.

Changes: Upload Size Limits for DoS Prevention
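The defensive env-var parsing described above can be sketched as a standalone function. This is a hedged illustration: the function name `parse_upload_max_bytes` and constant `_DEFAULT_UPLOAD_MAX_BYTES` are hypothetical stand-ins, not necessarily the names used in `config.py`.

```python
import os

_DEFAULT_UPLOAD_MAX_BYTES = 104857600  # 100 MiB

def parse_upload_max_bytes(env=None):
    """Parse UPLOAD_MAX_BYTES defensively, falling back to the default
    on missing, non-integer, or non-positive values."""
    env = os.environ if env is None else env
    raw = env.get("UPLOAD_MAX_BYTES", str(_DEFAULT_UPLOAD_MAX_BYTES))
    try:
        value = int(raw)
    except ValueError:
        return _DEFAULT_UPLOAD_MAX_BYTES
    return value if value > 0 else _DEFAULT_UPLOAD_MAX_BYTES
```

Accepting an optional mapping instead of reading `os.environ` directly makes the fallback behavior easy to unit-test without monkeypatching.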
Sequence Diagram

```mermaid
sequenceDiagram
    participant Client
    participant ScanUploadEndpoint as /api/scan/upload
    participant BoundedReader as _read_upload_with_limit
    participant UploadFile
    participant Settings
    Client->>ScanUploadEndpoint: POST file
    ScanUploadEndpoint->>Settings: fetch upload_max_bytes
    ScanUploadEndpoint->>BoundedReader: call with file + max_bytes
    loop Read chunks
        BoundedReader->>UploadFile: read(chunk_size)
        UploadFile-->>BoundedReader: chunk bytes
        BoundedReader->>BoundedReader: accumulate size
        alt Size exceeded
            BoundedReader-->>ScanUploadEndpoint: HTTPException 413
            ScanUploadEndpoint-->>Client: 413 + error detail
        else Size OK
            BoundedReader->>BoundedReader: continue
        end
    end
    BoundedReader-->>ScanUploadEndpoint: full content
    ScanUploadEndpoint->>ScanUploadEndpoint: process upload normally
```
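The bounded-read loop in the diagram can be sketched as a synchronous stand-in. Note the assumptions: the real helper is async, operates on Starlette's `UploadFile`, and raises FastAPI's `HTTPException(status_code=413)`; `UploadTooLarge` here is a hypothetical substitute so the sketch stays self-contained.

```python
import io

class UploadTooLarge(Exception):
    """Stand-in for FastAPI's HTTPException(status_code=413)."""

def read_with_limit(stream, max_bytes, chunk_size=1024 * 1024):
    """Accumulate chunks, rejecting as soon as the total exceeds max_bytes."""
    chunks, total = [], 0
    while True:
        # Never request more than one byte past the limit in a single read.
        chunk = stream.read(min(chunk_size, max_bytes + 1))
        if not chunk:
            break
        total += len(chunk)
        if total > max_bytes:
            # Early rejection: stop before buffering the rest of the payload.
            raise UploadTooLarge(f"File too large (> {max_bytes} bytes)")
        chunks.append(chunk)
    return b"".join(chunks)
```

A payload of exactly `max_bytes` is accepted; one extra byte triggers the rejection, which is the boundary the PR's tests cover.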
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks: ✅ 5 passed
Force-pushed 21cbdee to 3d95862
🧹 Nitpick comments (2)
src/extension_shield/core/config.py (1)
213-222: ⚡ Quick win: Consider extracting the hardcoded default to a module constant.

The default value `104857600` is repeated three times (lines 215, 219, 221). While the defensive parsing logic is sound (it handles ValueError and validates > 0), extracting the default to a named constant would improve maintainability.

♻️ Suggested refactor
```diff
+# Upload and extraction limits (DoS protection)
+_DEFAULT_UPLOAD_MAX_BYTES = 104857600  # 100 MiB
+
 @lru_cache(maxsize=1)
 def get_settings() -> Settings:
     """ ... """
     ...
     # Upload and zip-bomb protection: reject large uploads before saving/analysis,
     # then bound extracted archive complexity separately.
-    _upload_max_bytes = os.environ.get("UPLOAD_MAX_BYTES", "104857600")  # 100 MiB
+    _upload_max_bytes = os.environ.get("UPLOAD_MAX_BYTES", str(_DEFAULT_UPLOAD_MAX_BYTES))
     try:
         upload_max_bytes = int(_upload_max_bytes)
     except ValueError:
-        upload_max_bytes = 104857600
+        upload_max_bytes = _DEFAULT_UPLOAD_MAX_BYTES
     if upload_max_bytes <= 0:
-        upload_max_bytes = 104857600
+        upload_max_bytes = _DEFAULT_UPLOAD_MAX_BYTES
```

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `src/extension_shield/core/config.py` around lines 213-222: Extract the repeated literal 104857600 into a named module-level constant (e.g., DEFAULT_UPLOAD_MAX_BYTES) and use that constant in the upload parsing logic: replace the three occurrences of the literal in the _upload_max_bytes default, the except ValueError fallback for upload_max_bytes, and the non-positive check (currently operating on upload_max_bytes) with DEFAULT_UPLOAD_MAX_BYTES; ensure the new constant is defined near the top of the module and referenced by the existing variables _upload_max_bytes and upload_max_bytes to preserve the current parsing/validation behavior.

src/extension_shield/api/main.py (1)
180-199: 💤 Low value: Consider optimizing the chunk size calculation to minimize excess reads.

The bounded reader correctly enforces upload limits and rejects oversized payloads before they're saved. However, line 187 reads `min(chunk_size, max_bytes + 1)` on every iteration, which means the function may read up to `max_bytes + chunk_size` bytes (e.g., 101 MiB for a 100 MiB limit) before detecting an oversized upload.

For the typical use case (100 MiB limit, 1 MiB chunks), this overhead is acceptable and the implementation is correct. However, reading `min(chunk_size, max_bytes - total + 1)` would ensure you never read more than `max_bytes + 1` bytes total, reducing memory pressure for edge cases.

♻️ Potential optimization
```diff
 async def _read_upload_with_limit(file: UploadFile, max_bytes: int) -> bytes:
     """Read an UploadFile without accepting payloads larger than max_bytes."""
     chunk_size = 1024 * 1024
     chunks = []
     total = 0
     while True:
-        chunk = await file.read(min(chunk_size, max_bytes + 1))
+        remaining = max_bytes - total + 1
+        chunk = await file.read(min(chunk_size, remaining))
         if not chunk:
             break
         total += len(chunk)
         if total > max_bytes:
             raise HTTPException(
                 status_code=413,
                 detail=f"File too large. Maximum size is {_format_byte_limit(max_bytes)}.",
             )
         chunks.append(chunk)
     return b"".join(chunks)
```

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `src/extension_shield/api/main.py` around lines 180-199: In _read_upload_with_limit, avoid over-reading by changing the read length calculation: instead of `await file.read(min(chunk_size, max_bytes + 1))`, compute the remaining allowance using `max_bytes - total` and call `await file.read(min(chunk_size, max(0, max_bytes - total) + 1))` (or equivalent) so each iteration never requests more than the remaining bytes plus one; keep the existing total update, overflow check (if total > max_bytes -> HTTPException), and chunk append logic intact.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 087626a3-4ef3-4f92-8294-c3fd274957b6
📒 Files selected for processing (7)
- .env.example
- deploy/env.production.template
- deploy/railway.toml
- railway.toml
- src/extension_shield/api/main.py
- src/extension_shield/core/config.py
- tests/api/test_upload_limits.py
Reject oversized CRX/ZIP uploads with a configurable UPLOAD_MAX_BYTES limit before saving or analysis. Document the setting in deployment templates and add regression coverage for rejection, boundary acceptance, and invalid env fallback.
Force-pushed 3d95862 to 8cd4b37
Summary
Fixes #146.
The upload endpoint already had a hardcoded 100 MiB check, but it read the whole upload before applying that check. This keeps the same default limit and moves it into config so deployments can tune it.
Changes:
- New `UPLOAD_MAX_BYTES` setting, defaulting to `104857600` bytes (100 MiB);
- the upload endpoint now rejects with HTTP `413` once the configured limit is exceeded.

Validation
The warnings are dependency/deprecation warnings already present in the test environment.