Accelerating Elasticsearch Indexing Through Gateway Optimization

Test Environment

- Primary cluster: http://10.0.1.2:9200, username: elastic, password: ***, 9 nodes, hardware specs: 12C 64GB (31GB JVM)
- Secondary cluster: http://10.0.1.15:9200, username: elastic, password: ***, 9 nodes, hardware specs: 12C 64GB (31GB JVM)
- Gateway server 1 (Public IP: 120.92.43.31, Internal IP: 192.168.0.24), hardware specs: 40C 256GB, 3.7T NVMe SSD
- Load testing server 1 (Internal IP: 10.0.0.117), hardware specs: 24C 48GB
- Load testing server 2 (Internal IP: 10.0.0.69), hardware specs: 24C 48GB

Test Overview

This test primarily evaluates the practical implementation of gateway indexing acceleration and assesses the hardware specifications required to achieve different performance levels, serving as a reference for production deployment configuration.

Scenario Description

The gateway improves overall cluster write throughput by splitting incoming bulk requests and regrouping them by target node, so that each request goes directly to the node that will actually process it.

Data Description

Using Nginx data auto-generated by Loadgen as an example, we compare the speed difference between direct Elasticsearch writes and gateway-accelerated Elasticsearch writes. The data sample format is as follows:
{
  "_index": "test-10",
  "_type": "_doc",
  "_id": "cak5emoke01flcq9q760",
  "_source": {
    "batch_number": "2328917",
    "id": "cak5emoke01flcq9r19g",
    "ip": "192.168.0.1",
    "message": "175.10.75.216 - webmaster [29/Jul/2020:17:01:26 +0800] \"GET /rest/system/status HTTP/1.1\" 200 1838 \"http://dl-console.elasticsearch.cn/\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.89 Safari/537.36\"",
    "now_local": "2022-06-14 17:39:39.420724895 +0800 CST",
    "now_unix": "1655199579",
    "random_no": "13",
    "routing_no": "cak5emoke01flcq9pvu0"
  }
}

Data Architecture

The gateway can locally compute, for each indexed document, its target storage location in the backend Elasticsearch cluster, enabling precise request routing. A single bulk request may contain data destined for multiple backend nodes. The bulk_reshuffle filter breaks ordinary bulk requests apart and reassembles them by target node or shard, so that Elasticsearch nodes no longer need to redistribute requests after receiving them. This reduces traffic and load inside the Elasticsearch cluster, avoids single-node bottlenecks, keeps processing balanced across data nodes, and raises the overall indexing throughput of the cluster. We test scenarios with both 3 shards and 30 shards.
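As a rough sketch of how this is wired up (the `bulk_reshuffle` filter name comes from the gateway's sample configs; option names such as `level` and the `primary` cluster reference are assumptions here and may differ between versions), a flow that reshuffles bulk traffic at node level could look like:

```yaml
flow:
  - name: async_bulk_flow
    filter:
      - bulk_reshuffle:
          when:
            contains:
              _ctx.request.path: /_bulk   # only touch bulk requests
          elasticsearch: primary          # backend cluster registered in gateway.yml
          level: node                     # regroup requests per target node (alternative: shard)
```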

Test Preparation

Deploying the Gateway Program

  1. System Tuning Refer to the documentation at: https://gateway.infinilabs.com/zh/docs/getting-started/optimization/
  2. Download the Program
    [root@iZbp1gxkifg8uetb33pvcoZ ~]# mkdir /opt/gateway
    [root@iZbp1gxkifg8uetb33pvcoZ ~]# cd /opt/gateway/
    [root@iZbp1gxkifg8uetb33pvcoZ gateway]# tar vxzf gateway-1.6.0_SNAPSHOT-649-linux-amd64.tar.gz
    gateway-linux-amd64
    gateway.yml
    sample-configs/
    sample-configs/elasticsearch-with-ldap.yml
    sample-configs/indices-replace.yml
    sample-configs/record_and_play.yml
    sample-configs/cross-cluster-search.yml
    sample-configs/kibana-proxy.yml
    sample-configs/elasticsearch-proxy.yml
    sample-configs/v8-bulk-indexing-compatibility.yml
    sample-configs/use_old_style_search_response.yml
    sample-configs/context-update.yml
    sample-configs/elasticsearch-route-by-index.yml
    sample-configs/hello_world.yml
    sample-configs/entry-with-tls.yml
    sample-configs/javascript.yml
    sample-configs/log4j-request-filter.yml
    sample-configs/request-filter.yml
    sample-configs/condition.yml
    sample-configs/cross-cluster-replication.yml
    sample-configs/secured-elasticsearch-proxy.yml
    sample-configs/fast-bulk-indexing.yml
    sample-configs/es_migration.yml
    sample-configs/index-docs-diff.yml
    sample-configs/rate-limiter.yml
    sample-configs/async-bulk-indexing.yml
    sample-configs/elasticssearch-request-logging.yml
    sample-configs/router_rules.yml
    sample-configs/auth.yml
    sample-configs/index-backup.yml
    
  3. Modify Configuration Copy the sample configuration provided by the gateway and modify it according to actual cluster information:
    [root@iZbp1gxkifg8uetb33pvcoZ gateway]# cp sample-configs/async-bulk-indexing.yml gateway.yml
    
    Modify the cluster registration information as needed. Also adjust the gateway listening port and TLS settings based on your requirements (if clients access ES via http:// protocol, set entry.tls.enabled to false). Different clusters can use different configurations, listening on different ports for separate business access.
  4. Start the Gateway Start the gateway with the configuration you just created:
    [root@iZbp1gxkifg8uetb33pvcoZ gateway]# ./gateway-linux-amd64 -config gateway.yml
    
       ___   _   _____  __  __    __  _
      / _ \ /_\ /__   \/__\/ / /\ \ \ \/\_\ /\_/
     / /_\///_\\  / /\/\_  \ \/  \/ /_\\\_ _/
    / /_\/\  _  \/ / //__   \  /\  /  _  \/ \\
    \____/\_/ \_/\/__ \__/    \/  \/\_/ \_/\_/
    
    [GATEWAY] A light-weight, powerful and high-performance elasticsearch gateway.
    [GATEWAY] 1.6.0_SNAPSHOT, 2022-05-18 11:09:54, 2023-12-31 10:10:10, 73408e82a0f96352075f4c7d2974fd274eeafe11
    [05-19 13:35:43] [INF] [app.go:174] initializing gateway.
    [05-19 13:35:43] [INF] [app.go:175] using config: /opt/gateway/gateway.yml.
    [05-19 13:35:43] [INF] [instance.go:72] workspace: /opt/gateway/data1/gateway/nodes/ca2tc22j7ad0gneois80
    [05-19 13:35:43] [INF] [app.go:283] gateway is up and running now.
    [05-19 13:35:50] [INF] [actions.go:358] elasticsearch [primary] is available
    [05-19 13:35:50] [INF] [api.go:262] api listen at: http://0.0.0.0:2900
    [05-19 13:35:50] [INF] [reverseproxy.go:261] elasticsearch [primary] hosts: [] => [192.168.0.19:9200]
    [05-19 13:35:50] [INF] [reverseproxy.go:261] elasticsearch [backup] hosts: [] => [xxxxxxxx-backup:9200]
    [05-19 13:35:50] [INF] [reverseproxy.go:261] elasticsearch [primary] hosts: [] => [192.168.0.19:9200]
    [05-19 13:35:50] [INF] [reverseproxy.go:261] elasticsearch [backup] hosts: [] => [xxxxxxxx-primary:9200]
    [05-19 13:35:50] [INF] [reverseproxy.go:261] elasticsearch [primary] hosts: [] => [192.168.0.19:9200]
    [05-19 13:35:50] [INF] [entry.go:322] entry [my_es_entry/] listen at: https://0.0.0.0:8000
    [05-19 13:35:50] [INF] [module.go:116] all modules are started
    
  5. Install as Service Quickly install the gateway as a system service:
    [root@iZbp1gxkifg8uetb33pvcpZ console]# ./gateway-linux-amd64 -service install
    Success
    [root@iZbp1gxkifg8uetb33pvcpZ console]# ./gateway-linux-amd64 -service start
    Success
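
The configuration edited in step 3 is, in stripped-down form, along these lines (key names mirror the shipped sample configs; treat this as a sketch rather than an authoritative schema, and check the bundled async-bulk-indexing.yml for the full router and flow sections):

```yaml
entry:
  - name: my_es_entry
    enabled: true
    router: my_router          # router/flow definitions omitted here
    network:
      binding: 0.0.0.0:8000
    tls:
      enabled: false           # false when clients connect via http://

elasticsearch:
  - name: primary
    enabled: true
    endpoints:
      - http://10.0.1.2:9200
    basic_auth:
      username: elastic
      password: xxxx
```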
    

Deploying the Management Console

To facilitate quick switching between multiple clusters, we use Console for management.
  1. Download and Install Simply extract the provided installation package to complete installation:
    [root@iZbp1gxkifg8uetb33pvcpZ console]# tar vxzf console-0.3.0_SNAPSHOT-596-linux-amd64.tar.gz
    console-linux-amd64
    console.yml
    
  2. Modify Configuration Use [http://10.0.1.2:9200](http://10.0.1.2:9200) as the Console system cluster to retain monitoring metrics and metadata information. Modify the configuration as follows:
    [root@iZbp1gxkifg8uetb33pvcpZ console]# cat console.yml
    
    elasticsearch:
      - name: default
        enabled: true
        monitored: false
        endpoint: http://10.0.1.2:9200
        basic_auth:
          username: elastic
          password: xxxxx
        discovery:
          enabled: false
     ...
    
  3. Start Service
    [root@iZbp1gxkifg8uetb33pvcpZ console]# ./console-linux-amd64 -service install
    Success
    [root@iZbp1gxkifg8uetb33pvcpZ console]# ./console-linux-amd64 -service start
    Success
    
  4. Access Console Access port 9000 on this host to open the Console backend, http://10.0.128.58:9000/#/cluster/overview. Open the [System][Cluster] menu to register the Elasticsearch clusters and gateway addresses to be managed.
  5. Register Gateway Open the gateway registration screen and enter the gateway's API address so the gateway can be managed from Console.

Testing the Gateway

To verify that the gateway is working properly, we run a quick check through Console. First, create an index through the gateway entry and write a document into it. Then check the data in the primary cluster, and after that the secondary cluster. Both clusters return the same data, which shows that the gateway configuration is working properly; verification is complete.
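A minimal check with curl might look like this (the index name `gateway-test` and the document are made up for illustration; the gateway entry listens on port 8000, the clusters on 9200):

```
[root@iZbp1gxkifg8uetb33pvcoZ ~]# curl -u elastic:xxxx -H 'Content-Type: application/json' \
    -XPOST "http://192.168.0.24:8000/gateway-test/_doc/1" -d '{"hello":"gateway"}'
[root@iZbp1gxkifg8uetb33pvcoZ ~]# curl -u elastic:xxxx "http://10.0.1.2:9200/gateway-test/_doc/1"
[root@iZbp1gxkifg8uetb33pvcoZ ~]# curl -u elastic:xxxx "http://10.0.1.15:9200/gateway-test/_doc/1"
```

If both GETs return the same `_source`, replication through the gateway is working.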

Installing Loadgen

The test machine also needs tuning. Refer to the gateway optimization instructions.
  1. On the test machine, download and install Loadgen:
    [root@vm10-0-0-69 opt]# tar vxzf loadgen-1.4.0_SNAPSHOT-50-linux-amd64.tar.gz
    
  2. Download an Nginx log sample and save it as `nginx.log`:
    [root@vm10-0-0-69 opt]# head nginx.log
    175.10.75.216 - - [28/Jul/2020:21:20:26 +0800] "GET / HTTP/1.1" 200 8676 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.89 Safari/537.36"
    175.10.75.216 - - [28/Jul/2020:21:20:26 +0800] "GET /vendor/bootstrap/css/bootstrap.css HTTP/1.1" 200 17235 "http://dl-console.elasticsearch.cn/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.89 Safari/537.36"
    175.10.75.216 - - [28/Jul/2020:21:20:26 +0800] "GET /vendor/daterangepicker/daterangepicker.css HTTP/1.1" 200 1700 "http://dl-console.elasticsearch.cn/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.89 Safari/537.36"
    175.10.75.216 - - [28/Jul/2020:21:20:26 +0800] "GET /vendor/fork-awesome/css/v5-compat.css HTTP/1.1" 200 2091 "http://dl-console.elasticsearch.cn/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.89 Safari/537.36"
    175.10.75.216 - - [28/Jul/2020:21:20:26 +0800] "GET /assets/font/raleway.css HTTP/1.1" 200 145 "http://dl-console.elasticsearch.cn/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.89 Safari/537.36"
    175.10.75.216 - - [28/Jul/2020:21:20:26 +0800] "GET /vendor/fork-awesome/css/fork-awesome.css HTTP/1.1" 200 8401 "http://dl-console.elasticsearch.cn/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.89 Safari/537.36"
    175.10.75.216 - - [28/Jul/2020:21:20:26 +0800] "GET /assets/css/overrides.css HTTP/1.1" 200 2524 "http://dl-console.elasticsearch.cn/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.89 Safari/537.36"
    175.10.75.216 - - [28/Jul/2020:21:20:26 +0800] "GET /assets/css/theme.css HTTP/1.1" 200 306 "http://dl-console.elasticsearch.cn/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.89 Safari/537.36"
    175.10.75.216 - - [28/Jul/2020:21:20:26 +0800] "GET /vendor/fancytree/css/ui.fancytree.css HTTP/1.1" 200 3456 "http://dl-console.elasticsearch.cn/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.89 Safari/537.36"
    175.10.75.216 - - [28/Jul/2020:21:20:26 +0800] "GET /syncthing/development/logbar.js HTTP/1.1" 200 486 "http://dl-console.elasticsearch.cn/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.89 Safari/537.36"
    
  3. Modify the Loadgen configuration file Modify the variables to point the message to the nginx log you just prepared, and update the ES address and authentication information. Loadgen will generate write requests randomly. The specific configuration is as follows:
    [root@vm10-0-0-117 opt]# cat loadgen.yml
    variables:
      - name: ip
        type: file
        path: dict/ip.txt
      - name: message
        type: file
        path: nginx.log
      - name: user
        type: file
        path: dict/user.txt
      - name: id
        type: sequence
      - name: uuid
        type: uuid
      - name: now_local
        type: now_local
      - name: now_utc
        type: now_utc
      - name: now_unix
        type: now_unix
      - name: suffix
        type: range
        from: 10
        to: 13
    requests:
      - request:
          method: POST
          runtime_variables:
            batch_no: id
          runtime_body_line_variables:
            routing_no: uuid
          basic_auth:
            username: elastic
            password: xxxx
          url: http://10.0.128.58:8000/_bulk
          body_repeat_times: 5000
          body: "{ \"create\" : { \"_index\" : \"test-$[[suffix]]\", \"_type\" : \"_doc\", \"_id\" : \"$[[uuid]]\" } }\n{ \"id\" : \"$[[uuid]]\", \"routing_no\" : \"$[[routing_no]]\", \"batch_number\" : \"$[[batch_no]]\", \"message\" : \"$[[message]]\", \"random_no\" : \"$[[suffix]]\", \"ip\" : \"$[[ip]]\", \"now_local\" : \"$[[now_local]]\", \"now_unix\" : \"$[[now_unix]]\" }\n"
    
  4. Start Loadgen for testing Specify the run duration with `-d` and the concurrency level with `-c`, and enable request compression:
    [root@vm10-0-0-117 opt]# ./loadgen-linux-amd64  -d 60000 -c 200 --compress
    
       __   ___  _      ___  ___   __    __
      / /  /___\/\_\    /   \/ _ \/__\/\ \ \
     / /  //  ///_\\  / /\ / /_\/_\ /  \/ /
    / /__/ \_//  _  \/ /_// /_\\//__/ /\  /
    \____|___/\_/ \_/___,'\____/\__/\_\ \/
    
    [LOADGEN] A http load generator and testing suit.
    [LOADGEN] 1.4.0_SNAPSHOT, 2022-06-01 09:58:17, 2023-12-31 10:10:10, b6a73e2434ac931d1d43bce78c0f7622a1d08b2e
    [06-14 18:47:29] [INF] [app.go:174] initializing loadgen.
    [06-14 18:47:29] [INF] [app.go:175] using config: /opt/loadgen.yml.
    [06-14 18:47:29] [INF] [module.go:116] all modules are started
    [06-14 18:47:30] [INF] [instance.go:72] workspace: /opt/data/loadgen/nodes/cajfdg0ke012ka748j30
    [06-14 18:47:30] [INF] [app.go:283] loadgen is up and running now.
    [06-14 18:47:30] [INF] [loader.go:320] warmup started
    [06-14 18:47:30] [INF] [loader.go:329] [POST] http://10.0.128.58:8000/_bulk - {"took":115,"errors":false,"items":[{"create":{"_index":"test-11","_type":"_doc","_id":"cak6eggke0184a2dcc70","_version":1,"result":"created","_shards":{"total":1,"successful":1,"failed":0},"_seq_no":39707421,"_primary_term":1,"status":201}},{"create":{"_i
    [06-14 18:47:30] [INF] [loader.go:330] status: 200, {"took":115,"errors":false,"items":[{"create":{"_index":"test-11","_type":"_doc","_id":"cak6eggke0184a2dcc70","_version":1,"result":"created","_shards":{"total":1,"successful":1,"failed":0},"_seq_no":39707421,"_primary_term":1,"status":201}},{"create":{"_i
    [06-14 18:47:30] [INF] [loader.go:338] warmup finished
    
Perform the same installation operations on another load testing machine, which won't be repeated here.

Testing Methodology

Preparing Template

Create a default index template to optimize write performance:
PUT _template/test
{
  "index_patterns": [
    "test*"
  ],
  "settings": {
    "index.translog.durability": "async",
    "refresh_interval": "-1",
    "number_of_shards": 3,
    "number_of_replicas": 0
  },
  "mappings": {
    "dynamic_templates": [
      {
        "strings": {
          "mapping": {
            "ignore_above": 256,
            "type": "keyword"
          },
          "match_mapping_type": "string"
        }
      }
    ]
  }
}
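
If you prefer the command line over Console's dev tools, the same template can be applied with curl (assuming the JSON body above is saved as `template.json`, a file name chosen here for illustration):

```
[root@vm10-0-0-117 opt]# curl -u elastic:xxxx -H 'Content-Type: application/json' \
    -XPUT "http://10.0.1.2:9200/_template/test" -d @template.json
```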

Starting Load Test

Execute the load testing tool on the load testing machines respectively:
[root@vm10-0-0-117 opt]# ./loadgen-linux-amd64  -d 60000 -c 200 --compress

Observing Throughput

Open the Console tool to observe the cluster's throughput. Open the monitoring menu and click the dropdown at the top to quickly switch between different clusters and view the primary cluster's throughput.
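
Besides Console, the write rate can be cross-checked from the cluster itself, e.g. by reading `_all.primaries.indexing.index_total` from `GET _stats/indexing` twice and dividing the difference by the sampling interval. A small helper (the readings below are illustrative, not measured values):

```python
def indexing_rate(index_total_t0: int, index_total_t1: int, interval_s: float) -> float:
    """Average events per second between two readings of indexing.index_total."""
    return (index_total_t1 - index_total_t0) / interval_s

# two illustrative readings taken 10 seconds apart
print(round(indexing_rate(39_700_000, 45_700_000, 10.0)))  # 600000, i.e. ~600k eps
```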

Limiting CPU

To test gateway performance under different CPU resources, we use taskset to pin the gateway process to a fixed set of CPU cores:
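
For example, to pin an already-running gateway to four cores, or to start it pinned to a single core (the core IDs are illustrative):

```
[root@iZbp1gxkifg8uetb33pvcoZ gateway]# taskset -cp 0-3 $(pidof gateway-linux-amd64)
[root@iZbp1gxkifg8uetb33pvcoZ gateway]# taskset -c 0 ./gateway-linux-amd64 -config gateway.yml
```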

Testing Process

Gateway Configuration

Direct ES Writing

Loadgen Configuration
[root@vm10-0-0-69 opt]# cat loadgen2.yml
statsd:
  enabled: false
  host: 192.168.3.98
  port: 8125
  namespace: loadgen.
variables:
  - name: ip
    type: file
    path: dict/ip.txt
  - name: message
    type: file
    path: nginx.log
  - name: user
    type: file
    path: dict/user.txt
  - name: id
    type: sequence
  - name: uuid
    type: uuid
  - name: now_local
    type: now_local
  - name: now_utc
    type: now_utc
  - name: now_unix
    type: now_unix
  - name: suffix
    type: range
    from: 10
    to: 13
requests:
  - request:
      method: POST
      runtime_variables:
        batch_no: id
      runtime_body_line_variables:
        routing_no: uuid
      basic_auth:
        username: elastic
        password: ####
      #url: http://localhost:8000/_search?q=$[[id]]
      url: http://10.0.1.2:9200/_bulk
      body_repeat_times: 10000
      body: "{ \"create\" : { \"_index\" : \"test-$[[suffix]]\", \"_type\" : \"_doc\", \"_id\" : \"$[[uuid]]\" } }\n{ \"id\" : \"$[[uuid]]\", \"routing_no\" : \"$[[routing_no]]\", \"message\" : \"$[[message]]\", \"batch_number\" : \"$[[batch_no]]\", \"random_no\" : \"$[[suffix]]\", \"ip\" : \"$[[ip]]\", \"now_local\" : \"$[[now_local]]\", \"now_unix\" : \"$[[now_unix]]\" }\n"
Second Loadgen Configuration:
[root@vm10-0-0-117 opt]# cat loadgen2.yml
statsd:
  enabled: false
  host: 192.168.3.98
  port: 8125
  namespace: loadgen.
variables:
  - name: ip
    type: file
    path: dict/ip.txt
  - name: message
    type: file
    path: nginx.log
  - name: user
    type: file
    path: dict/user.txt
  - name: id
    type: sequence
  - name: uuid
    type: uuid
  - name: now_local
    type: now_local
  - name: now_utc
    type: now_utc
  - name: now_unix
    type: now_unix
  - name: suffix
    type: range
    from: 10
    to: 13
requests:
  - request:
      method: POST
      runtime_variables:
        batch_no: id
      runtime_body_line_variables:
        routing_no: uuid
      basic_auth:
        username: elastic
        password: ####
      url: http://10.0.1.2:9200/_bulk
      body_repeat_times: 5000
      body: "{ \"create\" : { \"_index\" : \"test-$[[suffix]]\", \"_type\" : \"_doc\", \"_id\" : \"$[[uuid]]\" } }\n{ \"id\" : \"$[[uuid]]\", \"routing_no\" : \"$[[routing_no]]\", \"batch_number\" : \"$[[batch_no]]\", \"message\" : \"$[[message]]\", \"random_no\" : \"$[[suffix]]\", \"ip\" : \"$[[ip]]\", \"now_local\" : \"$[[now_local]]\", \"now_unix\" : \"$[[now_unix]]\" }\n"
Start load testing respectively:
[root@vm10-0-0-69 opt]# ./loadgen-linux-amd64  -c 100 -d 66000  -config loadgen2.yml
[root@vm10-0-0-117 opt]# ./loadgen-linux-amd64  -c 100 -d 66000  -config loadgen2.yml
Direct ES writing throughput stabilizes at ~600k eps, with 3 shards per index.

Gateway 1C

Testing with gateway mode, first with default 3 shards:

Gateway 2C

Gateway 4C

Gateway 6C

Gateway 8C

Set Loadgen concurrency to 200:
[root@vm10-0-0-117 opt]# ./loadgen-linux-amd64  -c 200 -d 66000  -config loadgen1.yml
Performance did not improve, and the gateway CPU was still not fully utilized.

Direct ES Writing - 30 Shards

Delete all and modify template to default 30 shards:
DELETE test-10
DELETE test-11
DELETE test-12
DELETE test-13
DELETE test-14
DELETE test-15

PUT _template/test
{
  "index_patterns": [
    "test*"
  ],
  "settings": {
    "index.translog.durability": "async",
    "refresh_interval": "-1",
    "number_of_shards": 30,
    "number_of_replicas": 0
  },
  "mappings": {
    "dynamic_templates": [
      {
        "strings": {
          "mapping": {
            "ignore_above": 256,
            "type": "keyword"
          },
          "match_mapping_type": "string"
        }
      }
    ]
  }
}
Continue load testing: with 30 shards per index, direct ES writing stabilizes at ~750k eps.

Gateway 1C - 30 Shards

Gateway 2C - 30 Shards

Gateway 4C - 30 Shards

Gateway 6C - 30 Shards

Gateway 8C - 30 Shards

Traffic and write volume are relatively large here, so we enable request compression, and additionally enable compression of messages persisted to disk. Note: enabling traffic or disk compression incurs additional overhead, and throughput decreases to some extent.

Gateway 12C - 30 Shards

After removing compression and expanding the CPU allocation to 12C, throughput was unchanged; the backend's limit had been reached.

Shard Level

Test Results

3 shards * 4 indices, direct ES writing: ~600k eps.

| Gateway CPU Cores | Throughput Capacity (events per second) | Notes |
|---|---|---|
| Gateway 1C | ~180k | |
| Gateway 2C | ~350k | |
| Gateway 4C | ~650k | |
| Gateway 6C | ~770k | |
| Gateway 8C | ~930k | Backend ES processing capacity nearly saturated |

30 shards * 4 indices, direct ES writing: ~750k eps.

| Gateway CPU Cores | Throughput Capacity (events per second) | Notes |
|---|---|---|
| Gateway 1C | ~200k | |
| Gateway 2C | ~400k | |
| Gateway 4C | ~760k | |
| Gateway 6C | ~1000k | Backend ES processing capacity nearly saturated |
| Gateway 8C | ~930k | Backend ES processing capacity nearly saturated |
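
Reading the two tables together, the gateway's best results can be compared with direct writing (a quick back-of-the-envelope calculation using the figures above):

```python
direct = {"3 shards": 600_000, "30 shards": 750_000}
gateway_best = {"3 shards": 930_000, "30 shards": 1_000_000}

for setup, base in direct.items():
    # ratio of best gateway throughput to direct-write throughput
    print(f"{setup}: {gateway_best[setup] / base:.2f}x of direct writing")
```

That is roughly a 1.55x gain with 3 shards and a 1.33x gain with 30 shards.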

Tags: elasticsearch Gateway indexing Performance Optimization

Posted on Thu, 07 May 2026 12:30:31 +0000 by hdpt00