
Conversation

@vershwal (Member) commented Dec 4, 2025

Note

Add comprehensive timing and progress logging to S3 storage uploads, including multipart, with chunk-sized stream reads.

  • Backend (storage):
    • Instrumentation: Add detailed timing/progress logging in core/server/adapters/storage/S3Storage.ts for save, readFileInChunks, uploadMultipart, and saveRaw.
      • Logs durations, sizes, part/chunk counts, and outcomes for S3 commands (PutObject, CreateMultipartUpload, UploadPart, CompleteMultipartUpload, AbortMultipartUpload).
      • Configure fs.createReadStream with highWaterMark equal to the multipart chunk size; track buffer concat/slice metrics (a sketch follows this list).
      • Standardize log prefixes to [S3Storage] and add start/completion logs for operations.
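
A minimal sketch of that chunk-sized read, assuming Node's fs module and a hypothetical 5MB chunk size (the adapter's actual constant may differ):

```ts
import fs from 'node:fs';

// Hypothetical value: the adapter's real multipart chunk size may differ.
const CHUNK_SIZE = 5 * 1024 * 1024;

async function readFileInChunks(filePath: string): Promise<void> {
    const start = Date.now();
    // highWaterMark = chunk size, so each read yields one multipart-sized buffer
    const stream = fs.createReadStream(filePath, {highWaterMark: CHUNK_SIZE});
    let chunks = 0;
    let bytes = 0;
    for await (const chunk of stream) {
        chunks += 1;
        bytes += (chunk as Buffer).length;
    }
    console.log(`[S3Storage] read completed: chunks=${chunks} bytes=${bytes} duration=${Date.now() - start}ms`);
}
```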

Written by Cursor Bugbot for commit a6ff96a. This will update automatically on new commits.

@coderabbitai bot (Contributor) commented Dec 4, 2025

Walkthrough

The pull request adds comprehensive runtime logging and timing instrumentation to the S3Storage.ts file across all major save and upload operations. Logging has been added to track execution flow and measure durations for save operations (both simple and multipart), file reading in chunks, multipart upload phases, and raw save operations. All log messages are standardized with a [S3Storage] prefix and contextual information. The existing control logic for choosing between multipart and simple uploads remains unchanged.
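
The start/duration pattern the walkthrough describes can be pictured with a small wrapper. This is an illustration only, assuming Ghost's @tryghost/logging module; the actual S3Storage.ts inlines the Date.now() calls rather than using a helper:

```ts
import logging from '@tryghost/logging';

// Illustrative wrapper mirroring the log format described above; the real
// adapter inlines these calls instead of using a helper like this.
async function timed<T>(label: string, fn: () => Promise<T>): Promise<T> {
    const start = Date.now();
    logging.info(`[S3Storage] ${label} started`);
    const result = await fn();
    logging.info(`[S3Storage] ${label} completed: duration=${Date.now() - start}ms`);
    return result;
}
```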

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

  • Verify consistency of [S3Storage] logging prefix across all new log statements
  • Confirm timing measurements (start/end markers) are correctly paired in all code paths
  • Ensure multipart and simple upload logging paths both capture appropriate metrics and events
  • Check that no unintended changes were introduced to the upload decision logic or control flow
  • Validate that error logging in abort scenarios includes updated messaging context

Pre-merge checks

❌ Failed checks (1 inconclusive)
  • Title check — inconclusive: The title 'High water mark test' does not clearly summarize the main change; the PR primarily adds comprehensive S3 logging instrumentation, not testing. Resolution: consider renaming the title to reflect the main change, e.g. 'Add comprehensive timing and progress logging to S3 storage uploads'.

✅ Passed checks (2 passed)
  • Description check: The description is directly related to the changeset, providing detailed context about the logging and instrumentation changes to S3Storage.ts.
  • Docstring coverage: No functions were found in the changed files, so the docstring coverage check was skipped.


@cursor bot left a comment


Referenced code (S3Storage.ts):

```ts
    Body: buffer
}));
const uploadDuration = Date.now() - uploadStart;
logging.info(`[S3Storage] saveRaw() completed: file=${key} duration=${uploadDuration}ms`);
```

Bug: Excessive debug logging left in production code

The PR titled "High water mark test" introduces extensive debug instrumentation that appears to be test code not intended for production. Over 20 logging.info() calls were added that log detailed timing metrics for every operation: individual stream chunks, buffer concatenations, buffer slices, file stats, and upload durations. Variables like bufferConcatTime, bufferSliceTime, bufferConcatCount, and yieldedChunks are tracked and logged at the info level. This verbose logging would generate significant log noise in production for every file upload.

Additional Locations (1)


@coderabbitai bot (Contributor) left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
ghost/core/core/server/adapters/storage/S3Storage.ts (1)

180-188: Rename uploadStart2 to a more descriptive name.

The variable name uploadStart2 is not descriptive. Consider renaming it to putObjectStart or s3UploadStart to clarify what operation is being timed.

Apply this diff:

```diff
-        const uploadStart2 = Date.now();
+        const putObjectStart = Date.now();
         await this.client.send(new PutObjectCommand({
             Bucket: this.bucket,
             Key: key,
             Body: body,
             ContentType: file.type
         }));
-        const uploadDuration = Date.now() - uploadStart2;
+        const uploadDuration = Date.now() - putObjectStart;
         logging.info(`[S3Storage] PutObjectCommand completed: file=${key} duration=${uploadDuration}ms`);
```
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9504e8d and a6ff96a.

📒 Files selected for processing (1)
  • ghost/core/core/server/adapters/storage/S3Storage.ts (5 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (17)
  • GitHub Check: E2E Tests (Ember 1/4)
  • GitHub Check: [Optional] E2E Tests (React 4/4)
  • GitHub Check: [Optional] E2E Tests (React 1/4)
  • GitHub Check: E2E Tests (Ember 3/4)
  • GitHub Check: E2E Tests (Ember 2/4)
  • GitHub Check: E2E Tests (Ember 4/4)
  • GitHub Check: [Optional] E2E Tests (React 3/4)
  • GitHub Check: [Optional] E2E Tests (React 2/4)
  • GitHub Check: Inspect Docker Image
  • GitHub Check: Acceptance tests (Node 22.18.0, mysql8)
  • GitHub Check: Acceptance tests (Node 22.18.0, sqlite3)
  • GitHub Check: Legacy tests (Node 22.18.0, sqlite3)
  • GitHub Check: Legacy tests (Node 22.18.0, mysql8)
  • GitHub Check: Unit tests (Node 22.18.0)
  • GitHub Check: Ghost-CLI tests
  • GitHub Check: Lint
  • GitHub Check: Cursor Bugbot
🔇 Additional comments (5)
ghost/core/core/server/adapters/storage/S3Storage.ts (5)

199-199: Explicit highWaterMark setting aligns with PR objectives.

Setting highWaterMark explicitly to match chunkSize ensures predictable stream behavior for testing multipart upload performance, which aligns with the PR title "High water mark test."


209-246: Verify intended use: logging volume may be excessive for production.

The per-chunk logging (especially line 220) will generate substantial log output for large files. For example, a 1GB file with 5MB chunks produces 200+ log entries from this method alone. This level of instrumentation is appropriate for testing and debugging but may overwhelm production logs and impact performance.

Confirm whether this instrumentation is:

  • Temporary, for high water mark testing/debugging
  • Intended for production, with specific performance monitoring requirements
  • Meant to be gated behind a debug flag or a reduced log level (a sketch follows below)
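
A minimal sketch of that gating, where S3_STORAGE_DEBUG is a hypothetical environment variable the adapter does not currently read:

```ts
import logging from '@tryghost/logging';

// Hypothetical flag: the adapter does not read this variable today.
const S3_DEBUG = process.env.S3_STORAGE_DEBUG === 'true';

function debugLog(message: string): void {
    if (S3_DEBUG) {
        logging.info(`[S3Storage] ${message}`);
    }
}

// Per-chunk call sites then become near-no-ops in production, e.g.:
// debugLog(`stream chunk received: size=${chunk.length}`);
```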

249-325: Well-structured multipart instrumentation.

The logging provides good visibility into multipart upload performance without being overly verbose. Each part logs once (rather than per stream chunk), and the summary metrics (total parts, durations) are useful for performance analysis.
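
Roughly, the one-log-per-part pattern looks like the sketch below; it assumes @aws-sdk/client-s3, Ghost's @tryghost/logging, and parameter names that need not match the actual file:

```ts
import {S3Client, UploadPartCommand} from '@aws-sdk/client-s3';
import logging from '@tryghost/logging';

// Sketch only: names and structure are assumptions, not the adapter's code.
async function uploadPartTimed(
    client: S3Client,
    bucket: string,
    key: string,
    uploadId: string,
    partNumber: number,
    chunk: Buffer
): Promise<string | undefined> {
    const partStart = Date.now();
    const {ETag} = await client.send(new UploadPartCommand({
        Bucket: bucket,
        Key: key,
        UploadId: uploadId,
        PartNumber: partNumber,
        Body: chunk
    }));
    // One log per part, not per stream chunk
    logging.info(`[S3Storage] UploadPart completed: part=${partNumber} size=${chunk.length} duration=${Date.now() - partStart}ms`);
    return ETag;
}
```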


350-358: LGTM!

The logging for saveRaw() is concise and provides useful timing information without excessive verbosity.


152-153: Consider the production impact of comprehensive instrumentation.

This PR adds extensive timing and logging throughout the S3 upload paths. While valuable for understanding performance characteristics (especially given the PR title "High water mark test"), consider:

  1. Performance overhead: Frequent Date.now() calls and logging operations add measurable overhead, especially in hot paths like per-chunk processing
  2. Log volume: Production deployments handling many large files could generate significant log data
  3. Configuration: Consider adding a feature flag or debug mode to control instrumentation verbosity (a sampling-based alternative is sketched below)

If this is temporary instrumentation for testing, consider documenting that intent in the PR description or code comments.
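
If the instrumentation is kept, one option besides a boolean flag is sampling the per-chunk logs; LOG_EVERY_N is a made-up constant, not something the adapter defines:

```ts
// Hypothetical sampling interval to bound log volume on large files.
const LOG_EVERY_N = 50;

let chunkCount = 0;
let bytesRead = 0;

function onChunk(chunk: Buffer): void {
    chunkCount += 1;
    bytesRead += chunk.length;
    // Emit a progress line only every Nth chunk
    if (chunkCount % LOG_EVERY_N === 0) {
        console.log(`[S3Storage] read progress: chunks=${chunkCount} bytes=${bytesRead}`);
    }
}
```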

