The Too Many Parts Problem
This guide is part of a collection of findings gathered from community meetups. For more real-world solutions and insights, you can browse by specific problem. Need more performance optimization tips? Check out the Performance Optimization community insights guide.
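Universal pain point: every insert into a MergeTree table creates at least one new data part, so small, frequent inserts produce parts faster than background merges can combine them. The resulting part explosion slows queries and can eventually cause inserts to be rejected.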
Recognize the Problem Early
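One way to catch part growth before inserts start failing is to watch the number of active parts per partition in the `system.parts` system table. The sketch below is a minimal illustration, not a definitive monitoring setup: the query string uses real `system.parts` columns, but the `partitions_at_risk` helper, the `PartitionParts` type, and the warning threshold of 100 are assumptions made for this example. The server-side limit is controlled by the MergeTree setting `parts_to_throw_insert`, whose default depends on the ClickHouse version, so pick a threshold comfortably below the value configured on your cluster.

```python
from typing import Iterable, List, NamedTuple, Tuple

# Query to run with any ClickHouse client: active parts per partition,
# busiest partitions first.
PARTS_PER_PARTITION_QUERY = """
SELECT database, table, partition, count() AS active_parts
FROM system.parts
WHERE active
GROUP BY database, table, partition
ORDER BY active_parts DESC
LIMIT 20
"""


class PartitionParts(NamedTuple):
    """One result row of PARTS_PER_PARTITION_QUERY (hypothetical helper type)."""
    database: str
    table: str
    partition: str
    active_parts: int


def partitions_at_risk(
    rows: Iterable[Tuple[str, str, str, int]],
    warn_threshold: int = 100,  # assumption: tune against parts_to_throw_insert
) -> List[PartitionParts]:
    """Return the partitions whose active part count has reached the threshold."""
    return [PartitionParts(*row) for row in rows if row[3] >= warn_threshold]
```

Feeding the query result into `partitions_at_risk` on a schedule gives an early warning signal; a partition that stays on the list usually points to inserts that are too small or too frequent.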
Proper Insert Batching
Community-proven batching strategy from production deployments:
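The usual pattern is to buffer rows on the client and flush them as one insert when the batch is large enough or old enough, whichever comes first. The following is a minimal sketch of that idea under stated assumptions: `InsertBatcher`, its parameter names, and the default limits (10,000 rows, 5 seconds) are hypothetical and chosen for illustration, and the `flush` callback stands in for whatever client call actually sends the rows to ClickHouse.

```python
import time
from typing import Any, Callable, List, Optional


class InsertBatcher:
    """Collects rows and hands them to `flush` as one batch (illustrative sketch)."""

    def __init__(
        self,
        flush: Callable[[List[Any]], None],  # e.g. a function that runs one INSERT
        max_rows: int = 10_000,              # assumption: flush at this many rows
        max_age_seconds: float = 5.0,        # assumption: flush once the batch is this old
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._flush = flush
        self._max_rows = max_rows
        self._max_age = max_age_seconds
        self._clock = clock
        self._rows: List[Any] = []
        self._first_row_at: Optional[float] = None

    def add(self, row: Any) -> None:
        """Buffer one row and flush if the size or age limit is reached."""
        if not self._rows:
            self._first_row_at = self._clock()
        self._rows.append(row)
        too_big = len(self._rows) >= self._max_rows
        too_old = self._clock() - self._first_row_at >= self._max_age
        if too_big or too_old:
            self.flush()

    def flush(self) -> None:
        """Send the buffered rows as a single batch; do nothing if empty."""
        if not self._rows:
            return
        rows, self._rows = self._rows, []
        self._first_row_at = None
        self._flush(rows)
```

Note that this sketch only checks the age limit when a new row arrives, so a quiet stream could leave rows buffered. A production version would also flush from a timer and on shutdown, and would need locking if several threads call `add`.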
Alternative: Async Inserts (ClickHouse 21.11+)
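When client-side batching is impractical, for example with many small independent producers, async inserts move the buffering to the server. They can be enabled per query with the `async_insert` setting, and `wait_for_async_insert = 1` makes the server acknowledge the insert only after the buffered data has been flushed to storage. The sketch below simply renders an `INSERT` statement prefix carrying those settings. The `async_insert_prefix` helper and the table and column names in the usage note are hypothetical, and the helper does no identifier escaping, so it is only suitable for trusted names.

```python
from typing import Dict, Sequence

# Real ClickHouse settings; the values shown are one reasonable choice, not the only one.
ASYNC_INSERT_SETTINGS: Dict[str, int] = {
    "async_insert": 1,           # buffer this insert on the server
    "wait_for_async_insert": 1,  # acknowledge only after the buffer is flushed
}


def async_insert_prefix(
    table: str,
    columns: Sequence[str],
    settings: Dict[str, int] = ASYNC_INSERT_SETTINGS,
) -> str:
    """Build 'INSERT INTO t (cols) SETTINGS ... VALUES' (hypothetical helper)."""
    cols = ", ".join(columns)
    opts = ", ".join(f"{name} = {value}" for name, value in settings.items())
    return f"INSERT INTO {table} ({cols}) SETTINGS {opts} VALUES"
```

For instance, `async_insert_prefix("events", ["ts", "message"])` yields a prefix to which the client appends its row values. The flush thresholds themselves are tuned with settings such as `async_insert_busy_timeout_ms` and `async_insert_max_data_size`, which can be added to the same dictionary.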
"We developed a function called async insert... this mechanism is straightforward similar to buffer table we insert to the server side and use some buffer to collect these inserts by default we have 16 threads to collect this buffer and if the buffer is large enough or reach timeout we will flush the buffer to the storage so a part will contain multiple inserts" - ClickHouse team explaining built-in solution
Video Sources
- Fast, Concurrent, and Consistent Asynchronous INSERTS in ClickHouse - ClickHouse team member explains async inserts and the too many parts problem
- Production ClickHouse at Scale - Real-world batching strategies from observability platforms