Merged
70 changes: 53 additions & 17 deletions develop-docs/self-hosted/tasks.mdx
@@ -63,32 +63,49 @@ on "taskworker" topic in Kafka, the number of partitions should be evenly divisi
the number of `taskbroker` replicas that you're planning to scale to.
Then due to the limited capability of Docker Compose, you will need to manually
scale your `taskbroker` replicas. You can do this by adding more container
declarations to your `docker-compose.override.yml` file:
declarations to your `docker-compose.override.yml` file. Each replica reuses the
same `taskbroker/config.yml` that ships with self-hosted (the one the default
`taskbroker` service already mounts), so they consume the same topics and join
the same consumer groups — only the SQLite volume differs per replica:

```yaml
services:
taskbroker-beta:
restart: "unless-stopped"
image: "$TASKBROKER_IMAGE"
environment:
TASKBROKER_KAFKA_CLUSTER: "kafka:9092"
TASKBROKER_KAFKA_DEADLETTER_CLUSTER: "kafka:9092"
TASKBROKER_DB_PATH: "/opt/sqlite/taskbroker-activations-beta.sqlite"
TASKBROKER_KAFKA_CLUSTERS__DEFAULT__ADDRESS: "kafka:9092"
TASKBROKER_DB_PATH: "/opt/sqlite/taskbroker-activations.sqlite"
TASKBROKER_STATSD_ADDR: ${STATSD_ADDR:-127.0.0.1:8125}
command: /opt/taskbroker -c /etc/taskbroker/config.yml

Contributor


Bug: The documentation for scaling brokers is missing instructions to create the taskbroker/config.yml file, which will cause container startup failures for the taskbroker-beta and taskbroker-charlie services.
Severity: HIGH

Suggested Fix

Add a section that instructs users on how to create the taskbroker/config.yml file, including its necessary content. This should be similar to the instructions already provided for taskbroker/config.ingest.yml in the "Separate Ingest Workers" section.

Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.

Location: develop-docs/self-hosted/tasks.mdx#L77

Potential issue: The documentation for scaling brokers instructs users to configure the
`taskbroker-beta` and `taskbroker-charlie` services to use a configuration file at
`/etc/taskbroker/config.yml`. The services mount a local `./taskbroker` directory to
`/etc/taskbroker`. However, the documentation omits the step of creating the
`taskbroker/config.yml` file. Users following these instructions will encounter
container startup failures because the taskbroker binary is explicitly told to use a
config file that does not exist. This is inconsistent with the "Separate Ingest Workers"
section, which correctly documents the creation of its required configuration file.

Also affects:

  • develop-docs/self-hosted/tasks.mdx:93

Member Author


wrong, but it's a bit confusing without context, updated

volumes:
- sentry-taskbroker:/opt/sqlite
- sentry-taskbroker-beta:/opt/sqlite
- type: bind
read_only: true
source: ./taskbroker
target: /etc/taskbroker
depends_on:
- kafka
taskbroker-charlie:
restart: "unless-stopped"
image: "$TASKBROKER_IMAGE"
environment:
TASKBROKER_KAFKA_CLUSTER: "kafka:9092"
TASKBROKER_KAFKA_DEADLETTER_CLUSTER: "kafka:9092"
TASKBROKER_DB_PATH: "/opt/sqlite/taskbroker-activations-charlie.sqlite"
TASKBROKER_KAFKA_CLUSTERS__DEFAULT__ADDRESS: "kafka:9092"
TASKBROKER_DB_PATH: "/opt/sqlite/taskbroker-activations.sqlite"
TASKBROKER_STATSD_ADDR: ${STATSD_ADDR:-127.0.0.1:8125}
command: /opt/taskbroker -c /etc/taskbroker/config.yml
volumes:
- sentry-taskbroker:/opt/sqlite
- sentry-taskbroker-charlie:/opt/sqlite
- type: bind
read_only: true
source: ./taskbroker
target: /etc/taskbroker
depends_on:
- kafka

volumes:
sentry-taskbroker-beta: {}
sentry-taskbroker-charlie: {}
```
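
The partition rule stated at the top of this hunk (the partition count of the "taskworker" topic should be evenly divisible by the number of `taskbroker` replicas) can be checked before adding replicas. The following is a minimal sketch, not part of self-hosted: the function names and the example figure of 12 partitions are hypothetical, chosen only for illustration.

```python
def valid_replica_counts(partitions: int) -> list[int]:
    """Return every replica count that divides the partition count evenly."""
    if partitions < 1:
        raise ValueError("partitions must be a positive integer")
    return [n for n in range(1, partitions + 1) if partitions % n == 0]


def check_scaling_plan(partitions: int, replicas: int) -> bool:
    """Return True when the partitions split evenly across the replicas."""
    if replicas < 1:
        raise ValueError("replicas must be a positive integer")
    return partitions % replicas == 0


if __name__ == "__main__":
    # Hypothetical plan: 12 partitions shared by three brokers
    # (the default taskbroker plus taskbroker-beta and taskbroker-charlie).
    print(check_scaling_plan(12, 3))
    print(valid_replica_counts(12))
```

If the check fails for the replica count you want, either change the number of replicas or repartition the topic so the counts divide evenly.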

Note that each `taskbroker` replica needs its own SQLite database, to prevent
@@ -131,9 +148,9 @@ To achieve this work separation we need to make a few changes:
predefined topics in [`src/sentry/conf/types/kafka_definition.py`](https://github.com/getsentry/sentry/blob/master/src/sentry/conf/types/kafka_definition.py).
By default, any topics will automatically be created during the `./install.sh`
process.
2. Deploy the additional broker replicas. You can use the
`TASKBROKER_KAFKA_TOPIC` environment variable to define the topic a
taskbroker consumes from.
2. Deploy the additional broker replicas. Give each broker its own config file
declaring the topic it consumes under `kafka_topics`, and point the broker at
it with the `-c` flag.
3. Deploy additional workers that use the new brokers in their `rpc-host-list`
CLI flag.
4. Find the list of namespaces you want to shift to the new topic. The list of
@@ -151,7 +168,23 @@ To achieve this work separation we need to make a few changes:
Having a separate ingest `taskbroker` and `taskworker` is useful for high-throughput
installations: you can receive timely alerts without having to wait for
ingest-related tasks to finish. As an implementation of the above steps,
you need to add a few new containers on your `docker-compose.override.yml` file:
you need to create a config file for the ingest broker that declares the
`taskworker-ingest` topic. Create `taskbroker/config.ingest.yml`:

```yaml
kafka_deadletter_topic: taskworker-ingest-dlq

kafka_topics:
taskworker-ingest:
cluster: default
consumer_group: taskworker-ingest
taskworker-ingest-dlq:
cluster: default
consumer_group: taskworker-ingest
produce_only: true
```

Then add a few new containers to your `docker-compose.override.yml` file:

```yaml
# Copy `x-sentry_defaults` and `file_healthcheck_defaults` section from
@@ -162,13 +195,16 @@ services:
restart: "unless-stopped"
image: "$TASKBROKER_IMAGE"
environment:
TASKBROKER_KAFKA_TOPIC: "taskworker-ingest"
TASKBROKER_KAFKA_CONSUMER_GROUP: "taskworker-ingest"
TASKBROKER_KAFKA_CLUSTER: "kafka:9092"
TASKBROKER_KAFKA_DEADLETTER_CLUSTER: "kafka:9092"
TASKBROKER_KAFKA_CLUSTERS__DEFAULT__ADDRESS: "kafka:9092"
TASKBROKER_DB_PATH: "/opt/sqlite/taskbroker-activations-ingest.sqlite"
TASKBROKER_STATSD_ADDR: ${STATSD_ADDR:-127.0.0.1:8125}
command: /opt/taskbroker -c /etc/taskbroker/config.ingest.yml
volumes:
- sentry-taskbroker-ingest:/opt/sqlite
- type: bind
read_only: true
source: ./taskbroker
target: /etc/taskbroker
depends_on:
- kafka
taskworker-ingest: