---

copyright:

lastupdated: "2026-04-02"

keywords: mongodb, databases, monitoring, scaling, autoscaling, resources, troubleshooting

subcollection: databases-for-mongodb

---
{{site.data.keyword.attribute-definition-list}}
# Troubleshooting performance issues
{: #troubleshooting-performance}
Use this guide to help you identify and resolve performance issues in your {{site.data.keyword.databases-for-mongodb}} deployment running on {{site.data.keyword.cloud_notm}} and powered by MongoDB.
For more information about solving performance problems, see the following topics:
- IBM Cloud tools and diagnostic commands for troubleshooting performance
- Best practices for performance
- IBM Cloud Support integration
If your applications are experiencing slow responses, timeouts, or inconsistent database performance, consider the following steps and information.
## Symptoms
{: #troubleshooting-symptoms}
You might observe some of the following symptoms that indicate problems with performance:
- Increased application latency
- Slow query log entries
- High CPU or memory utilization
- Increased disk latency
- Replication lag
- Connection timeouts
Complete the following steps to determine the cause of the issues:
## Step 1: Review resource utilization
{: #troubleshooting-step1}
1. Log in to the {{site.data.keyword.cloud_notm}} console and navigate to your MongoDB deployment.
2. Review the Monitoring section for the following metrics:
   - CPU utilization
   - Memory usage
   - Disk IOPS and latency
   - Active connections
### What to look for
{: #troubleshooting-step1-symptoms}
- CPU consistently above 75%
- Memory consistently above 80%
- Disk latency increasing over time
- Connections approaching plan limits
### Actions
{: #troubleshooting-step1-actions}
- Increase storage or IOPS if disk latency is high.
- Review workload spikes in your application.
If resource usage remains elevated for sustained periods, consider scaling your deployment.
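As a rough illustration of "sustained" versus momentary elevation, the sketch below averages hypothetical CPU samples over a monitoring window. The sample values and the 75% threshold are assumptions for this example, not output of any {{site.data.keyword.cloud_notm}} API:

```javascript
// Hypothetical CPU utilization samples (percent), e.g. one per minute from monitoring.
const cpuSamples = [72, 78, 81, 79, 84, 80, 77, 83, 86, 82];

// A momentary spike is normal; a high *average* over the window suggests scaling.
const avg = cpuSamples.reduce((sum, v) => sum + v, 0) / cpuSamples.length;
const sustained = avg > 75; // threshold taken from the symptoms list above

console.log(`average CPU ${avg.toFixed(1)}% sustained=${sustained}`);
```
{: codeblock}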
## Step 2: Identify slow queries
{: #troubleshooting-step2}
Slow queries are one of the most common causes of degraded performance.
1. Enable profiling:

   ```
   db.setProfilingLevel(1, { slowms: 100 })
   ```
   {: codeblock}

2. Review recent slow operations:

   ```
   db.system.profile.find().sort({ ts: -1 }).limit(20)
   ```
   {: codeblock}

3. Analyze query execution:

   ```
   db.collection.find({ ... }).explain("executionStats")
   ```
   {: codeblock}
### What to look for
{: #troubleshooting-step2-symptoms}
- `COLLSCAN` (collection scan instead of index usage)
- High `totalDocsExamined` compared to `nReturned`
### Actions
{: #troubleshooting-step2-actions}
- Create appropriate indexes.
- Use compound indexes for multi-field queries.
- Ensure aggregation pipelines begin with `$match`.
- Avoid large `skip()` pagination.
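To see why large `skip()` values hurt, compare offset pagination with range (keyset) pagination. This sketch simulates both over an in-memory, `_id`-sorted array of hypothetical documents; against a real collection the range form becomes `find({ _id: { $gt: lastId } }).limit(n)`, which can seek through the `_id` index, while `skip()` must walk past every skipped document:

```javascript
const docs = Array.from({ length: 1000 }, (_, i) => ({ _id: i + 1 }));

// Offset pagination: the server still scans the 900 skipped documents.
const offsetPage = docs.slice(900, 910);

// Keyset pagination: seek directly past the last _id seen on the previous page.
const lastId = 900;
const keysetPage = docs.filter(d => d._id > lastId).slice(0, 10);

console.log(offsetPage[0]._id, keysetPage[0]._id); // both pages start at _id 901
```
{: codeblock}

Both approaches return the same page; only the amount of work the server does differs.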
## Step 3: Review connection usage
{: #troubleshooting-step3}
High or poorly managed connections can impact performance.
### Check connection statistics
{: #troubleshooting-step3-statistics}
```
db.serverStatus().connections
```
{: codeblock}
### Actions
{: #troubleshooting-step3-actions}
- Use connection pooling in your application.
- Avoid opening a new connection for each request.
- Close unused cursors.
Connection limits are determined by your deployment plan.
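The pooling advice above can be sketched with a toy pool that caps concurrent connections and reuses released ones. This is a simplified model for illustration only; real drivers, such as the Node.js driver, provide pooling for you via options like `maxPoolSize`:

```javascript
class ToyPool {
  constructor(max) { this.max = max; this.idle = []; this.created = 0; }
  acquire() {
    if (this.idle.length > 0) return this.idle.pop();              // reuse an idle connection
    if (this.created < this.max) { this.created++; return { id: this.created }; }
    throw new Error("pool exhausted: wait or raise the pool limit");
  }
  release(conn) { this.idle.push(conn); }
}

const pool = new ToyPool(2);
const a = pool.acquire();
pool.release(a);
const b = pool.acquire();           // reuses `a` instead of opening a new connection
console.log(pool.created, a === b); // 1 true
```
{: codeblock}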
## Step 4: Check replication lag
{: #troubleshooting-step4}
Replication lag can affect read performance and data freshness.
### Check replication status
{: #troubleshooting-step4-status}
```
rs.printSecondaryReplicationInfo()
```
{: codeblock}
### Common causes of lag
{: #troubleshooting-step4-lag}
- High write throughput
- Disk bottlenecks
- Network latency
### Actions
{: #troubleshooting-step4-actions}
- Scale storage performance.
- Review write concern settings.
- Scale to a higher plan if lag is persistent.
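Replication lag is essentially the difference between the last applied operation time on the primary and on a secondary. `rs.printSecondaryReplicationInfo()` reports this for you; the sketch below just shows the arithmetic on hypothetical optime dates:

```javascript
// Hypothetical last-applied optimes, as reported per member by rs.status().
const primaryOptime = new Date("2026-04-02T12:00:30Z");
const secondaryOptime = new Date("2026-04-02T12:00:12Z");

// Date subtraction yields milliseconds.
const lagSeconds = (primaryOptime - secondaryOptime) / 1000;
console.log(`replication lag: ${lagSeconds}s`); // 18s
```
{: codeblock}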
## Step 5: Evaluate sharding
{: #troubleshooting-step5}
You might need sharding in the following situations:
- Working set is greater than RAM
- Single-node IOPS maxed out even after scaling
- Horizontal write scaling is required
- Collections exceed 1–2 TB
For more information, see performance tuning and sharding.
If your deployment uses sharding, run:
```
sh.status()
```
{: codeblock}
### What to check
{: #troubleshooting-step5-check}
- Uneven chunk distribution
- Jumbo chunks
- Traffic concentrated on a single shard
### Actions
{: #troubleshooting-step5-actions}
- Review shard key selection.
- Avoid monotonically increasing shard keys.
- Consider hashed shard keys.
Improper shard key selection can significantly affect performance at scale.
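The shard-key advice can be illustrated by simulating where inserts land. With a monotonically increasing key under range sharding, every new document targets the chunk with the highest range, concentrating writes on one shard; hashing the same keys spreads them out. This toy model uses 4 hypothetical shards and a trivial multiplicative hash, not MongoDB's actual hash function:

```javascript
const NUM_SHARDS = 4;
const keys = Array.from({ length: 1000 }, (_, i) => i); // monotonically increasing ids

// Range sharding on an increasing key: all new writes hit the max-range chunk.
const rangeTarget = () => NUM_SHARDS - 1;

// Hashed sharding: a hash of the key decides the shard (toy hash for illustration).
const hashTarget = key => (key * 2654435761 % NUM_SHARDS + NUM_SHARDS) % NUM_SHARDS;

const rangeCounts = new Array(NUM_SHARDS).fill(0);
const hashCounts = new Array(NUM_SHARDS).fill(0);
for (const k of keys) { rangeCounts[rangeTarget()]++; hashCounts[hashTarget(k)]++; }

console.log("range:", rangeCounts);  // all 1000 writes land on one shard
console.log("hashed:", hashCounts);  // writes spread across shards
```
{: codeblock}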
## Step 6: Check disk fragmentation after large deletions
{: #troubleshooting-step6}
Deleting a significant percentage of data does not immediately reduce disk usage at the operating system level.
### Impacts
{: #troubleshooting-step6-impacts}
- Internal fragmentation
- High disk utilization
- Reduced performance
### Actions
{: #troubleshooting-step6-actions}
- Plan compaction operations carefully.
- Consider dump and restore for severe fragmentation.
- Keep disk utilization below 80–85%.
Schedule maintenance activities appropriately.
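One way to gauge fragmentation is to compare logical data size with on-disk storage size; in mongosh, `db.collection.stats()` reports these as `size` and `storageSize`. The arithmetic, on hypothetical numbers for a collection after a large deletion:

```javascript
// Hypothetical values from db.collection.stats() after a large deletion.
const size = 12 * 1024 ** 3;        // logical data: 12 GiB
const storageSize = 40 * 1024 ** 3; // on disk: 40 GiB

// The gap between on-disk and logical size approximates reclaimable space.
const fragmentation = 1 - size / storageSize;
console.log(`~${(fragmentation * 100).toFixed(0)}% of the file may be reclaimable`); // ~70%
```
{: codeblock}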
## Step 7: Investigate lock contention
{: #troubleshooting-step7}
Lock contention can severely impact concurrent operations and overall throughput.
1. Check global lock statistics:

   ```
   db.serverStatus().locks
   ```
   {: codeblock}

2. Check current operations for locks:

   ```
   db.currentOp({ $or: [ { waitingForLock: true }, { "locks.Global": "w" } ] })
   ```
   {: codeblock}

3. Analyze lock wait time:

   ```
   db.serverStatus().globalLock
   ```
   {: codeblock}
### What to look for
{: #troubleshooting-step7-symptoms}
- High `currentQueue` values (readers or writers).
- Operations with `waitingForLock: true`.
- Long-running operations holding locks.
- Index builds that block operations.
### Common causes
{: #troubleshooting-step7-causes}
- Long-running queries without proper indexes.
- Large write operations.
- Index builds on large collections.
- Administrative commands (`compact`, `repairDatabase`).
### Actions
{: #troubleshooting-step7-actions}
1. Kill long-running operations if necessary:

   ```
   db.killOp(opid)
   ```
   {: codeblock}

2. Build indexes in the background:

   ```
   db.collection.createIndex({ field: 1 }, { background: true })
   ```
   {: codeblock}

3. Break large operations into smaller batches.
4. Schedule maintenance operations during low-traffic periods.
5. Use read concern and write concern appropriately.
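Breaking a large write into batches keeps each operation's lock hold time short. A sketch of the batching logic on hypothetical `_id`s (the batch size and document set are assumptions; against a real collection each slice would become one `updateMany` or `bulkWrite`):

```javascript
const ids = Array.from({ length: 2500 }, (_, i) => i); // hypothetical _ids to update
const BATCH_SIZE = 1000;

const batches = [];
for (let i = 0; i < ids.length; i += BATCH_SIZE) {
  batches.push(ids.slice(i, i + BATCH_SIZE)); // each slice becomes one small write
}

console.log(batches.length, batches[2].length); // 3 500
```
{: codeblock}

Pausing briefly between batches also gives queued readers and the replication thread a chance to make progress.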
## Step 8: Analyze workload patterns
{: #troubleshooting-step8}
Understanding your workload patterns helps identify optimization opportunities.
1. Check operation counters:

   ```
   db.serverStatus().opcounters
   ```
   {: codeblock}

2. Analyze operations over time:

   ```
   db.serverStatus().opcountersRepl
   ```
   {: codeblock}

3. Identify hot collections:

   ```
   db.adminCommand({ top: 1 })
   ```
   {: codeblock}

4. Check the read ratio compared to the write ratio:

   ```
   var stats = db.serverStatus().opcounters;
   print("Read ratio: " + (stats.query + stats.getmore) / (stats.query + stats.getmore + stats.insert + stats.update + stats.delete));
   ```
   {: codeblock}
### What to look for
{: #troubleshooting-step8-symptoms}
- Disproportionate operations on specific collections
- High read-to-write or write-to-read ratios
- Sudden spikes in operation counts
- Time-based patterns (peak hours)
### Actions
{: #troubleshooting-step8-actions}
- Optimize frequently accessed collections first.
- Consider read replicas for read-heavy workloads.
- Use appropriate read preferences.
- Implement caching for frequently read data.
- Review indexing strategy for hot collections.
- Consider sharding for write-heavy collections.
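Because `opcounters` values are cumulative since server start, spikes show up in the *delta* between two samples rather than in a single reading. A sketch of turning two hypothetical samples into per-second rates:

```javascript
// Two hypothetical db.serverStatus().opcounters samples, taken 60 seconds apart.
const t0 = { insert: 10000, query: 500000, update: 20000 };
const t1 = { insert: 10600, query: 518000, update: 20300 };
const intervalSec = 60;

const rates = {};
for (const op of Object.keys(t0)) {
  rates[op] = (t1[op] - t0[op]) / intervalSec; // operations per second
}

console.log(rates); // { insert: 10, query: 300, update: 5 }
```
{: codeblock}

Sampling at a fixed interval and plotting these rates makes time-based patterns, such as peak hours, easy to spot.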
## Step 9: Check WiredTiger cache efficiency
{: #troubleshooting-step9}
The performance of MongoDB's WiredTiger storage engine depends heavily on how efficiently its cache is used.
1. Check WiredTiger cache statistics:

   ```
   db.serverStatus().wiredTiger.cache
   ```
   {: codeblock}

2. Review key metrics:

   ```
   var cache = db.serverStatus().wiredTiger.cache;
   print("Cache size: " + cache["bytes currently in the cache"]);
   print("Max cache size: " + cache["maximum bytes configured"]);
   print("Pages read into cache: " + cache["pages read into cache"]);
   print("Pages written from cache: " + cache["pages written from cache"]);
   print("Cache hit ratio: " + (1 - cache["pages read into cache"] / (cache["pages read into cache"] + cache["pages requested from the cache"])));
   ```
   {: codeblock}

3. Check for eviction pressure:

   ```
   db.serverStatus().wiredTiger.cache["pages evicted by application threads"]
   ```
   {: codeblock}
### What to look for
{: #troubleshooting-step9-symptoms}
- Cache hit ratio below 95%
- High eviction rates
- Cache size consistently at maximum
- Application threads performing evictions
### Check dirty cache size
{: #troubleshooting-step9-size}
```
db.serverStatus().wiredTiger.cache["tracked dirty bytes in the cache"]
```
{: codeblock}
### Actions
{: #troubleshooting-step9-actions}
- Scale to a plan with more memory if the cache is consistently full.
- Review and optimize indexes (remove unused indexes).
- Limit result set sizes in queries.
- Use projections to reduce document size.
- Consider archiving old data.
- Monitor working set size trends.
### Best practices
{: #troubleshooting-step9-best}
- By default, the WiredTiger cache size is the larger of 50% of (RAM - 1 GB) or 256 MB.
- Leave sufficient memory for other processes.
- Monitor swap usage, which should be minimal.
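A quick sanity check on cache sizing is to compare the working set (frequently accessed data plus indexes) against the configured cache. The figures below are assumptions for illustration; on a live deployment they would come from `db.stats()` and the WiredTiger cache metrics shown earlier in this step:

```javascript
const indexSizeBytes = 3 * 1024 ** 3; // total index size: 3 GiB (hypothetical)
const hotDataBytes = 5 * 1024 ** 3;   // frequently accessed documents: 5 GiB (hypothetical)
const cacheMaxBytes = 6 * 1024 ** 3;  // "maximum bytes configured": 6 GiB (hypothetical)

const workingSet = indexSizeBytes + hotDataBytes;
const fits = workingSet <= cacheMaxBytes;
console.log(fits ? "working set fits in cache" : "expect cache pressure: scale memory");
```
{: codeblock}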
## Step 10: Review write concern and read preference
{: #troubleshooting-step10}
Write concern and read preference settings significantly impact performance and consistency.
1. Check the current write concern:

   ```
   db.getWriteConcern()
   ```
   {: codeblock}

2. Check the replica set configuration:

   ```
   rs.conf()
   ```
   {: codeblock}

3. Review the write concern options:

   | Write concern | Durability | Performance | Use case |
   |---------------|------------|-------------|----------|
   | `w: 1` | Low | High | Non-critical data, high throughput |
   | `w: "majority"` | High | Medium | Default, balanced approach |
   | `w: <number>` | Medium-High | Medium-Low | Specific replica count |
   | `j: true` | Highest | Lowest | Critical data requiring journal sync |
   {: caption="Write concern options" caption-side="top"}

4. Review the read preference options:

   | Read preference | Consistency | Performance | Use case |
   |-----------------|-------------|-------------|----------|
   | `primary` | Highest | Medium | Default, strong consistency |
   | `primaryPreferred` | High | Medium-High | Fallback to secondary |
   | `secondary` | Eventual | High | Analytics, reporting |
   | `secondaryPreferred` | Eventual | High | Read scaling |
   | `nearest` | Eventual | Highest | Lowest latency |
   {: caption="Read preference options" caption-side="top"}

5. Check the read preference in your application:

   ```
   // Example in Node.js driver
   db.collection('users').find({}).readPreference('secondary')
   ```
   {: codeblock}
### What to look for
{: #troubleshooting-step10-symptoms}
- Overly strict write concerns for non-critical data
- Using the `primary` read preference when eventual consistency is acceptable
- Not leveraging secondaries for read-heavy workloads
### Actions
{: #troubleshooting-step10-actions}
- Use `w: 1` for high-throughput, non-critical writes.
- Use `w: "majority"` for important data (default).
- Use `secondary` or `secondaryPreferred` for analytics queries.
- Consider `nearest` for geographically distributed applications.
- Balance consistency requirements with performance needs.
- Test different configurations under load.
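The write concern table above can be expressed as a small helper that picks a setting per workload. The category names here are made up for this sketch; the `{ w: ..., j: ... }` objects are the standard MongoDB write-concern shape you would pass with a write operation:

```javascript
// Hypothetical workload categories mapped to write concerns from the table above.
function writeConcernFor(category) {
  switch (category) {
    case "metrics":  return { w: 1 };                    // high throughput, loss tolerable
    case "orders":   return { w: "majority" };           // balanced default
    case "payments": return { w: "majority", j: true };  // journal sync for critical data
    default:         return { w: "majority" };
  }
}

console.log(writeConcernFor("metrics").w, writeConcernFor("payments").j); // 1 true
```
{: codeblock}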
## Step 11: Account for backup and maintenance impact
{: #troubleshooting-step11}
Backup operations and maintenance tasks can temporarily affect performance.
### Check your backup schedule
{: #troubleshooting-step11-schedule}
{{site.data.keyword.databases-for-mongodb}} takes backups automatically. Check your backup schedule in the {{site.data.keyword.cloud_notm}} console under Backups.
### Check for backup-related operations
{: #troubleshooting-step11-backup}
```
db.currentOp({
  $or: [
    { op: "command", "command.backup": { $exists: true } },
    { desc: /^conn/ }
  ]
})
```
{: codeblock}
### What to look for
{: #troubleshooting-step11-symptoms}
- Performance degradation during backup windows
- Increased disk I/O during backups
- Replication lag during backups
### Actions
{: #troubleshooting-step11-actions}
- Monitor performance metrics during backup times.
- Consider scaling if backups consistently impact performance.
- Review backup retention policies.
- Plan for increased resource usage during restore operations.
### Best practices
{: #troubleshooting-step11-best}
- Schedule index builds during low-traffic periods.
- Use background index builds when possible.
- Monitor replication lag during maintenance.
- Test maintenance operations in non-production first.
- Coordinate with {{site.data.keyword.cloud_notm}} maintenance windows.