feat: add rules to block out most bot traffic #195

Merged: nikilok merged 2 commits into main from feat/block-semrush on Jun 27, 2026

@nikilok nikilok commented Jun 27, 2026

Owner

Summary by CodeRabbit

  • Chores
    • Updated site crawling rules to block a number of specific automated bots while keeping general access unchanged.
    • Preserved the existing sitemap reference for search engines.

vercel Bot commented Jun 27, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| learn-tanstack-start | Ready | Preview, Comment | Jun 27, 2026 6:18pm |

coderabbitai Bot commented Jun 27, 2026

Warning

Review limit reached

@nikilok, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 56 minutes and 50 seconds. Learn how PR review limits work.

Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file).

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits.

🚦 How do rate limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan review availability.

For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, additional reviews become available more gradually as earlier reviews age out of the rolling window.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: ead48280-2b7d-4582-91ab-335a4c4ca1df

📥 Commits

Reviewing files that changed from the base of the PR and between 5dca32e and a605a4d.

📒 Files selected for processing (1)
  • apps/web/public/robots.txt
📝 Walkthrough

`apps/web/public/robots.txt` gains explicit `Disallow: /` entries for SemrushBot, SiteAuditBot, several SemrushBot-* variants, SplitSignalBot, and RyteBot. The existing `User-agent: *` / `Allow: /` rule and sitemap entry are unchanged.

Crawler Blocking

| Layer / File(s) | Summary |
| --- | --- |
| Semrush bot disallow rules (`apps/web/public/robots.txt`) | New `User-agent` / `Disallow: /` blocks added for Semrush and related crawlers after the existing wildcard allow rule. |
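
To make the effect of these rules concrete, here is a minimal sketch that checks a robots.txt of this shape with Python's standard `urllib.robotparser`. The file contents are an assumption: this page does not show the actual `robots.txt`, so only the bot names named in the walkthrough are used, the unnamed SemrushBot-* variants are left out, and the sitemap URL and the helper names `build_parser` and `is_allowed` are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical reconstruction: the real file is not shown on this page.
# Bot names come from the walkthrough; the sitemap URL is a placeholder.
ROBOTS_TXT = """\
User-agent: *
Allow: /

# Block selected SEO crawlers
User-agent: SemrushBot
Disallow: /

User-agent: SiteAuditBot
Disallow: /

User-agent: SplitSignalBot
Disallow: /

User-agent: RyteBot
Disallow: /

Sitemap: https://example.com/sitemap.xml
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text from memory instead of fetching it over HTTP."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, user_agent: str, path: str = "/") -> bool:
    """Return True if the given user agent may fetch the given path."""
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    parser = build_parser(ROBOTS_TXT)
    for agent in ("Googlebot", "SemrushBot", "SiteAuditBot", "SplitSignalBot", "RyteBot"):
        print(agent, is_allowed(parser, agent))
```

In this sketch the named groups take effect even though they come after the wildcard group, because Python's parser consults the `User-agent: *` group only when no named group matches. Real crawlers may match user agents differently, and robots.txt is advisory, so this only checks what the file asks for, not what a bot will actually do.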

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

Poem

🐇 Hop hop, away you go,
No Semrush bots allowed to roam!
The / is blocked, the path is clear,
Only friendly crawlers welcome here.
This rabbit guards the garden gate! 🌿

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly matches the change: adding robots.txt rules to block most bot traffic. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
apps/web/public/robots.txt (1)

4-4: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Broaden comment to reflect all blocked bots.

The comment states "Block Semrush crawlers", but the list also includes SiteAuditBot (a user agent also used by other SEO platforms) and RyteBot (Ryte GmbH, not Semrush). Update the comment so it does not imply that every listed bot is Semrush-owned.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/web/public/robots.txt` at line 4, The comment in robots.txt is too
narrow because it implies all blocked crawlers are Semrush-owned, but the list
also includes unrelated bots like SiteAuditBot and RyteBot. Update the comment
near the robots directives to describe the broader set of blocked SEO crawlers,
using the existing bot names in the file as the reference point, so it
accurately reflects all entries without mentioning Semrush specifically.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 56b12ed4-c074-41cd-808d-c7efe138f935

📥 Commits

Reviewing files that changed from the base of the PR and between 6870727 and 5dca32e.

📒 Files selected for processing (1)
  • apps/web/public/robots.txt

@nikilok nikilok merged commit 8a2ffab into main Jun 27, 2026
4 of 5 checks passed
@nikilok nikilok deleted the feat/block-semrush branch June 27, 2026 18:18