SQL Execution and Memory Management
Q1: Why does a single SQL query's memory usage not respect the exec_mem_limit setting?
A1: The exec_mem_limit parameter caps the memory of each query fragment instance, not of the query as a whole. A single query plan is split into multiple fragment instances that run across one or more Backend (BE) nodes. As a result, the query's total memory consumption, both across the cluster and on any individual BE node, can exceed exec_mem_limit.
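To make the arithmetic concrete, here is a minimal sketch of the worst-case bound implied by the per-instance semantics described in A1. The BE names, instance counts, and the 2 GiB limit are hypothetical values chosen for illustration, and the function is not part of Doris.

```python
GIB = 1024 ** 3

def worst_case_query_memory(exec_mem_limit_bytes, instances_per_be):
    """Return (per-BE upper bounds, cluster-wide upper bound) in bytes.

    instances_per_be maps a BE name to the number of fragment instances
    of one query that run on that BE (hypothetical input).
    """
    per_be = {be: count * exec_mem_limit_bytes
              for be, count in instances_per_be.items()}
    return per_be, sum(per_be.values())

# Hypothetical plan: 10 fragment instances spread over 3 BE nodes.
per_be, total = worst_case_query_memory(2 * GIB, {"be1": 4, "be2": 4, "be3": 2})
print(per_be["be1"] // GIB, total // GIB)  # 8 20
```

With a 2 GiB limit, the query in this example could use up to 8 GiB on be1 alone and up to 20 GiB across the cluster, even though no single instance exceeds the limit.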
Data Import Issues
Q2: Error [E-124] Arithmetic overflow during data import
A2: This error typically arises from numeric overflow. To resolve it:
- Increase the decimal precision of fields involved in calculations.
- Upgrade to version 2.0 or later for better stability and fewer overflow issues.
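As an illustration of why more precision helps, the sketch below checks whether a value fits a DECIMAL(precision, scale) column. It uses the generic fixed-point rule (at most precision minus scale integer digits after rounding to the scale) and Python's default rounding, which is an assumption and may differ from the exact check Doris performs.

```python
from decimal import Decimal

def fits_decimal(value, precision, scale):
    """Return True if value fits DECIMAL(precision, scale) without overflow."""
    # Round to the target scale first, since rounding can add an integer digit.
    rounded = Decimal(value).quantize(Decimal(1).scaleb(-scale))
    # The integer part may use at most (precision - scale) digits.
    return abs(rounded) < Decimal(10) ** (precision - scale)

product = Decimal("12345.67") * 1000      # 12345670.00
print(fits_decimal(product, 9, 2))        # False: needs 8 integer digits, only 7 available
print(fits_decimal(product, 12, 2))       # True after widening the precision
```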
Q3: Error [E-238] too many segments in rowset during import
A3: Common in wide-table scenarios with large data volumes. Solution:
# Raise max_segment_num_per_rowset on a running BE (persist=true keeps the value across restarts); the same parameter can also be set in be.conf
curl -X POST "http://{be_ip}:8040/api/update_config?max_segment_num_per_rowset=3000&persist=true"
Refer to BE Configuration Documentation for further details.
Q4: Optimizing Java UDF memory consumption in Doris
A4: Consider these optimizations:
- Review and optimize the internal logic of the UDF implementation.
- Use static variable loading techniques as described in the UDF documentation.
Q5: Can observer nodes perform write operations in Doris?
A5: In Doris, Frontend (FE) nodes can be master, follower, or observer. Observer nodes are primarily used to enhance read performance by offloading query tasks. They do not participate in metadata writes or transaction coordination, so they do not handle write operations themselves. If a write issued through an observer succeeds, it is because the observer forwarded the operation to the master FE node rather than performing it directly.
Q6: Error [INTERNAL_ERROR] too many filtered rows during data import
A6: Usually caused by data quality issues. Examine the error URL returned by the system to identify problematic records. Common causes include field length mismatches, incorrect number of columns after splitting, or incompatible data formats.
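One of the causes above, an incorrect number of columns after splitting, can be checked before import. The following is a minimal sketch that assumes a delimited text file; the function name, the comma separator, and the sample rows are hypothetical, and quoted fields containing the separator are not handled.

```python
def find_bad_rows(lines, expected_columns, separator="\t"):
    """Return (line_number, column_count) for rows with the wrong column count."""
    bad = []
    for line_no, line in enumerate(lines, start=1):
        column_count = len(line.rstrip("\n").split(separator))
        if column_count != expected_columns:
            bad.append((line_no, column_count))
    return bad

sample = ["1,alice,30\n", "2,bob\n", "3,carol,41,extra\n"]
print(find_bad_rows(sample, expected_columns=3, separator=","))  # [(2, 2), (3, 4)]
```

The same function can be applied to an open file object, for example find_bad_rows(open(path), 3, ","), to locate the rows that the import would filter out.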
Q7: Flink-Doris-Connector does not write data but reports no errors
A7: The most likely cause is that checkpointing is disabled. The connector commits data when a checkpoint completes, so enable checkpointing in your Flink job configuration.
Operational Concerns
Q8: Configuring password strength validation in Doris
A8: Password complexity is controlled via the global variable validate_password_policy. By default, it is set to NONE/0, meaning no checks are enforced. Setting it to STRONG/2 requires passwords to contain at least three of the following: uppercase letters, lowercase letters, digits, and special characters, with a minimum length of 8 characters.
For more information, refer to the Authentication and Authorization Guide.
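The STRONG rule described in A8 can be sketched as a small client-side check, which is useful for validating passwords before sending them to Doris. This is an illustration, not the Doris implementation; in particular, treating string.punctuation as the set of special characters is an assumption.

```python
import string

def meets_strong_policy(password):
    """Check the rule from A8: length >= 8 and at least 3 of 4 character classes."""
    if len(password) < 8:
        return False
    classes = [
        any(c.isupper() for c in password),              # uppercase letters
        any(c.islower() for c in password),              # lowercase letters
        any(c.isdigit() for c in password),              # digits
        any(c in string.punctuation for c in password),  # special characters (assumed set)
    ]
    return sum(classes) >= 3

print(meets_strong_policy("Abcdef12"))  # True: uppercase, lowercase, digits
print(meets_strong_policy("abcd1234"))  # False: only two classes
```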
Q9: BE startup failure due to /proc/sys/vm/overcommit_memory
A9: This issue occurs when the kernel’s memory overcommit policy is misconfigured. Resolve it by setting the value to 1 (run as root):
echo 1 > /proc/sys/vm/overcommit_memory
Then restart the BE service. To keep the setting after a reboot, add vm.overcommit_memory=1 to /etc/sysctl.conf.
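A pre-start check can catch this before the BE fails. The sketch below reads the current policy and compares it with the value 1 recommended in A9; the helper names are hypothetical and the file exists only on Linux hosts.

```python
from pathlib import Path

OVERCOMMIT_PATH = Path("/proc/sys/vm/overcommit_memory")

def overcommit_ok(raw_value):
    """Return True when the raw file content holds the value 1."""
    return raw_value.strip() == "1"

def check_overcommit(path=OVERCOMMIT_PATH):
    """Return True/False for the current host, or None if the file is absent."""
    if not path.exists():
        return None  # not a Linux host, nothing to check
    return overcommit_ok(path.read_text())

print(check_overcommit())
```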
Disaster Recovery
Q10: How to implement disaster recovery in Doris?
A10: Two primary strategies are available:
- Data Backup: Regular backups allow cluster restoration in case of failures.
- Cross-Cluster Replication (CCR): Synchronizes data changes from a source cluster to a target cluster at the database or table level. Ideal for high availability, load isolation, and multi-site architectures such as "two cities, three centers".
For detailed setup instructions, see CCR Documentation.
About Apache Doris
Apache Doris is a high-performance, real-time analytical database built on a Massively Parallel Processing (MPP) architecture. It delivers sub-second query responses and supports both high-concurrency point queries and high-throughput complex analytics.
Engage with the community to share experiences or submit feedback for continuous improvement.