Reference
With its default configuration, Zabbix can comfortably monitor only about 10-20 machines, even on a server with 128 cores and 256 GB of memory. Monitoring more hosts requires configuration changes.
1. Configuration Files
Add the following to the server configuration file:
StartPollers=160
StartPollersUnreachable=80
StartTrappers=20
StartPingers=100
StartDiscoverers=120
CacheSize=1024M
StartDBSyncers=16
HistoryCacheSize=1024M
TrendCacheSize=1024M
HistoryTextCacheSize=512M
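Parameter names in the server configuration file appear to be matched case-sensitively, so a name such as Cachesize would not be recognized as CacheSize. The following is a minimal sketch of a check for that kind of slip. The helper name find_misspelled and the KNOWN_PARAMETERS set are hypothetical and cover only the parameters listed in this section; the script is not part of Zabbix.

```python
# Parameter names from this section, spelled the way the server expects them.
KNOWN_PARAMETERS = {
    "StartPollers", "StartPollersUnreachable", "StartTrappers", "StartPingers",
    "StartDiscoverers", "CacheSize", "StartDBSyncers", "HistoryCacheSize",
    "TrendCacheSize", "HistoryTextCacheSize",
}


def find_misspelled(config_text):
    """Return (line_number, name, suggestion) for names that differ only by case."""
    by_lower = {name.lower(): name for name in KNOWN_PARAMETERS}
    problems = []
    for number, line in enumerate(config_text.splitlines(), start=1):
        stripped = line.strip()
        # Skip blank lines, comment lines, and anything that is not key=value.
        if not stripped or stripped.startswith("#") or "=" not in stripped:
            continue
        name = stripped.split("=", 1)[0].strip()
        suggestion = by_lower.get(name.lower())
        if suggestion is not None and suggestion != name:
            problems.append((number, name, suggestion))
    return problems


sample = "StartPollers=160\nCachesize=1024M\nstartDBSyncers=16\n"
for number, name, suggestion in find_misspelled(sample):
    print(f"line {number}: {name} should be {suggestion}")
```

Running a check like this before restarting the server is cheaper than finding out from the server log that a setting was rejected or ignored.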
2. Database
If the database is on the same machine as Zabbix, connect through a local socket instead of TCP for improved speed. For Zabbix, the InnoDB engine is recommended, as it is about 1.5 times more efficient than other engines.
Split (partition) the large history tables (e.g., history, history_uint). Disable the internal housekeeper so that it no longer deletes historical data automatically at regular intervals; deleting large amounts of data row by row, especially from InnoDB tables, is slow and can cause problems.
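One common way to split the history tables is MySQL range partitioning on the clock column (a Unix timestamp), so that old data is removed by dropping a whole partition rather than deleting rows. The sketch below only generates the SQL statements as text. The function names, the one-partition-per-day layout, the pYYYYMMDD naming, and the use of UTC day boundaries are all assumptions chosen for illustration.

```python
from datetime import date, datetime, timedelta, timezone


def add_daily_partitions(table, first_day, days):
    """Build an ALTER TABLE statement that range-partitions `table` on clock, one partition per day."""
    parts = []
    for offset in range(days):
        day = first_day + timedelta(days=offset)
        # Each partition holds rows with clock < midnight (UTC, an assumption) of the following day.
        upper = datetime(day.year, day.month, day.day, tzinfo=timezone.utc) + timedelta(days=1)
        parts.append(f"  PARTITION p{day:%Y%m%d} VALUES LESS THAN ({int(upper.timestamp())})")
    return f"ALTER TABLE {table} PARTITION BY RANGE (clock) (\n" + ",\n".join(parts) + "\n);"


def drop_partition(table, day):
    """Build the statement that discards one day of data in a single operation."""
    return f"ALTER TABLE {table} DROP PARTITION p{day:%Y%m%d};"


print(add_daily_partitions("history", date(2024, 1, 1), 3))
print(drop_partition("history", date(2024, 1, 1)))
```

Review the generated statements and try them on a copy of the database first; partitioning a large existing table rewrites it and can take a long time.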
Optimize the database configuration file. Since all data is submitted to the database, it can become a bottleneck when monitoring many machines.
If the load is still too high, consider moving MySQL to a dedicated server and implementing read/write splitting, possibly using middleware.
3. Server-Side Configuration
1. Disable Server Housekeeping
Housekeeping is the mechanism that cleans up historical data. By default, it runs every hour, deleting old records beyond the configured retention period. For example, MySQL monitoring items might have a default retention of 90 days, so housekeeping would delete anything older than that. This frequent cleaning can sometimes cause errors like: Zabbix housekeeper processes more than 75% busy.
Two methods:
A: Edit the server configuration file and add or modify these two lines to run housekeeping once a day, deleting at most 500 rows per run:
# Frequency of Zabbix housekeeping, in hours
HousekeepingFrequency=24
# Maximum number of historical data rows to delete per run
MaxHousekeeperDelete=500
B: Disable housekeeping entirely and clean up old data manually.
For versions below 2.2, add this to the server configuration:
DisableHousekeeping=1
For versions 2.2 and above, change it via the web interface:
- Go to Administration -> General and select Housekeeping. Ensure the Enable internal housekeeping checkboxes are unchecked in both the history and trends sections.
Alternatively:
[Image: Housekeeping settings screenshot]
[Image: Another housekeeping settings screenshot]
2. Adjust Monitoring Items
Many monitoring items may be unnecessary or not currently useful. For instance, in a Redis monitoring template, items for subscription/publication (like pubsub) should be removed.
Prefer using numeric types for monitoring items; avoid character types when possible. Character types consume more storage space, are more complex to set up triggers for, and Zabbix processes numeric data more efficiently. If character-type items are essential for business needs, consider increasing the data collection interval to improve processing efficiency.
In triggers, functions like last() and nodata() are fastest, while min(), max(), and avg() are slowest. Always try to use faster functions. Also ensure correct logic when configuring triggers, as incorrect logic can lead to slow database queries.
Most monitoring items default to retaining 90 days or 1 week of historical data, with trend data kept for 365 days.
Often, trend data is sufficient. Retaining historical data for only 7 days is usually enough. For items with slow-changing values (e.g., disk space, file size), increase the collection interval to reduce load and save space.
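To see how much retention and interval matter, the storage can be estimated roughly as values per day times retention days times bytes per value. The sketch below assumes about 90 bytes per stored value, a commonly quoted rule of thumb that varies with the database and value type, and one trend row per item per hour. The item counts in the example are made up.

```python
BYTES_PER_VALUE = 90  # assumption: rough size of one stored row, varies by database and value type


def history_bytes(items, interval_seconds, retention_days, bytes_per_value=BYTES_PER_VALUE):
    """Estimate history storage: every item stores one value per collection interval."""
    values_per_day = items * 86400 / interval_seconds
    return values_per_day * retention_days * bytes_per_value


def trend_bytes(items, retention_days, bytes_per_value=BYTES_PER_VALUE):
    """Estimate trend storage: one aggregated row per item per hour."""
    return items * 24 * retention_days * bytes_per_value


GB = 1e9
items = 50_000  # hypothetical installation
print(f"history, 60 s interval, 90 days: {history_bytes(items, 60, 90) / GB:.1f} GB")
print(f"history, 60 s interval, 7 days:  {history_bytes(items, 60, 7) / GB:.1f} GB")
print(f"trends, 365 days:                {trend_bytes(items, 365) / GB:.1f} GB")
```

Under these assumptions, cutting history retention from 90 to 7 days shrinks the history tables by more than a factor of ten, while a full year of trends costs less than 7 days of history.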
3. Use Proxies for a Large Number of Hosts
If the number of hosts is too large, consider using proxies for distribution based on data center, business, or group. A zabbix_proxy can collect data and monitor on behalf of the server, but monitoring results are still sent to the server for aggregation. Proxies do not have a web interface.
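If there are too many machines, consider using active mode. By default, all items are passive: the agent listens on port 10050, and the server connects to it to fetch data. In active mode, the agent instead connects to the server (port 10051 by default), retrieves its list of items, and pushes the collected values, which takes the polling work off the server.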
Key actions:
- a) Reduce the history retention time.
- b) Increase the data collection interval (i.e., reduce frequency).
- c) Remove unnecessary monitoring items.
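The effect of actions b) and c) on server load can be gauged with the number of new values per second the server has to process, which is the sum of item count divided by collection interval over all items. A minimal sketch, with made-up item counts and a hypothetical helper name:

```python
def new_values_per_second(item_groups):
    """item_groups: iterable of (item_count, interval_seconds) pairs."""
    return sum(count / interval for count, interval in item_groups)


# Hypothetical installation before tuning.
before = [(30_000, 30), (20_000, 60)]
# After removing 15,000 unneeded items and collecting the slow-changing rest every 5 minutes.
after = [(20_000, 60), (15_000, 300)]

print(f"before: {new_values_per_second(before):.0f} values/s")
print(f"after:  {new_values_per_second(after):.0f} values/s")
```

The same figure also drives database write load, so lowering it helps the database as well as the server.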