Elasticsearch DSL Querying: A Practical Guide with Examples

Elasticsearch provides a powerful JSON-based query DSL (Domain-Specific Language) for executing searches. Understanding this query language is essential for anyone working with Elasticsearch, much like knowing SQL is necessary for relational databases.

Query DSL Structure

The query DSL consists of two main types of clauses:

  • Leaf Query Clauses: These search for specific values in particular fields, such as match, term, or range queries. They can be used independently.
  • Compound Query Clauses: These wrap other leaf or compound queries to combine multiple queries logically (using bool or dis_max) or modify their behavior (using constant_score).

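The behavior of query clauses differs based on whether they are used in a query context or a filter context. In a query context a clause answers "how well does this document match?" and contributes to the relevance score (_score). In a filter context it answers only "does this document match?", so no score is computed and the result can be cached.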
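To make the two contexts concrete, here is a minimal Python sketch that builds a request body as a plain dict. The function name and the choice of fields (name, stock) are assumptions for illustration. The body would be sent to the _search endpoint by whatever HTTP client you use.

```python
import json

def scored_and_filtered(text, min_stock):
    """Build a bool query that uses both contexts."""
    return {
        "query": {
            "bool": {
                # Query context: this clause contributes to _score.
                "must": [{"match": {"name": text}}],
                # Filter context: yes/no only, no scoring.
                "filter": [{"range": {"stock": {"gte": min_stock}}}],
            }
        }
    }

body = scored_and_filtered("wireless mouse", 1)
print(json.dumps(body, indent=2))
```

Conditions that only include or exclude documents (status flags, ranges, IDs) usually belong in filter, leaving must for the clauses that should influence ranking.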

Basic Operations

Creating Documents

Elasticsearch allows document creation with or without specifying an explicit ID. The request below supplies the ID (1) in the URL:

POST products/_doc/1
{
    "product_id": "P-9876",
    "sku": "ABC12345",
    "name": "wireless mouse",
    "stock": 50,
    "created_at": "2024-01-15 10:30:00"
}

When no ID is specified (POST products/_doc), Elasticsearch generates one automatically. You can inspect the whole index definition with GET products, or only its settings or mappings with GET products/_settings and GET products/_mapping.
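The difference between the two forms is only in the HTTP method and path. A small sketch, assuming a hypothetical helper named index_request:

```python
def index_request(index, doc_id=None):
    """Return the (method, path) pair for indexing one document."""
    if doc_id is None:
        # No ID in the path: Elasticsearch generates one.
        return ("POST", f"{index}/_doc")
    # Explicit ID: creates the document or replaces an existing one.
    return ("PUT", f"{index}/_doc/{doc_id}")

print(index_request("products"))
print(index_request("products", 1))
```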

Creating Indexes

While Elasticsearch can create indexes automatically when documents are indexed, explicit index creation provides better control over configuration and mapping. The request below uses the typeless mapping format of Elasticsearch 7 and later; on 6.x the properties block must be wrapped in a mapping type such as _doc:

PUT products
{
    "settings": {
        "number_of_shards": 5,
        "number_of_replicas": 2,
        "refresh_interval": "2s"
    },
    "mappings": {
        "properties": {
            "product_id": { "type": "keyword" },
            "sku": { "type": "keyword" },
            "name": { "type": "text" },
            "stock": { "type": "integer" },
            "price": { "type": "double" },
            "active": { "type": "boolean" },
            "created_at": {
                "type": "date",
                "format": "yyyy-MM-dd HH:mm:ss"
            }
        }
    }
}

Important parameters:

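  • number_of_shards: Fixed at index creation; changing it later requires reindexing or the shrink/split APIs
  • refresh_interval: How often newly indexed documents become searchable (default 1s); adjustable at any time for performance tuning
  • number_of_replicas: Adjustable at any time; should be at least 1 for production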

Additional field attributes:

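  • store: Controls whether the field value is stored separately from _source so it can be retrieved on its own (default false)
  • doc_values: Keeps a columnar copy of the field for sorting, aggregations, and scripting (enabled by default for most types, but not for text)
  • index: Determines whether the field is indexed and therefore searchable (default true)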
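As a rough illustration of how these attributes combine, the sketch below builds mapping fragments for keyword fields. The helper name and the field names (sku, internal_note) are hypothetical.

```python
def keyword_field(searchable=True, sortable=True, stored=False):
    """Build a keyword field mapping with explicit attribute values."""
    return {
        "type": "keyword",
        "index": searchable,     # False: the field cannot be queried
        "doc_values": sortable,  # False: no sorting or aggregations
        "store": stored,         # True: retrievable without _source
    }

properties = {
    "sku": keyword_field(),
    # Kept in _source for display only: not searchable, not sortable.
    "internal_note": keyword_field(searchable=False, sortable=False),
}
print(properties)
```

Turning off index or doc_values for fields that are only ever displayed is a common way to save disk space, at the cost of losing those capabilities.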

Updating Documents

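Updates can be performed using the document ID or through query-based updates. Indexing a document under an existing ID replaces the whole document, so every field must be sent again. To change only some fields, use the _update endpoint instead (POST products/_update/1 on Elasticsearch 7 and later), which accepts a partial document.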

POST products/_doc/1
{
    "product_id": "P-9876",
    "sku": "ABC12345",
    "name": "wireless mouse",
    "stock": 75,
    "created_at": "2024-01-15 10:30:00"
}

Query-based update:

POST products/_update_by_query
{
  "query": {
    "term": {
      "sku": "ABC12345"
    }
  },
  "script": {
    "source": "ctx._source['stock'] = 100"
  }
}
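When the new value varies between calls, it is generally better to pass it through the script's params than to embed it in the script source, because the script text then stays constant and can be compiled once. A sketch, assuming a hypothetical helper name:

```python
def set_stock_by_sku(sku, stock):
    """Build an _update_by_query body with a parameterized script."""
    return {
        "query": {"term": {"sku": sku}},
        "script": {
            "lang": "painless",
            # The source never changes; only params does.
            "source": "ctx._source.stock = params.stock",
            "params": {"stock": stock},
        },
    }

body = set_stock_by_sku("ABC12345", 100)
print(body)
```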

Deleting Operations

Delete by ID:

DELETE products/_doc/1

Delete by query:

POST products/_delete_by_query
{
  "query": {
    "term": {
      "sku": "ABC12345"
    }
  }
}

Delete a specific field from documents (with no query clause, the script runs against every document in the index):

POST products/_update_by_query
{
  "script": {
    "lang": "painless",
    "source": "ctx._source.remove('stock')"
  }
}

Query Examples

Match All

Match every document in every index of the cluster (only the first 10 hits are returned by default):

GET _search
{
  "query": {
    "match_all": {}
  }
}
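To read beyond the first hits, the request body can carry from and size. The sketch below turns a page number into those two values; the helper name is an assumption, and the 10,000 limit reflects the default index.max_result_window setting.

```python
MAX_RESULT_WINDOW = 10_000  # default index.max_result_window

def page_request(page, size=10):
    """Build a match_all body for a 1-based page number."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    start = (page - 1) * size
    if start + size > MAX_RESULT_WINDOW:
        raise ValueError("from + size exceeds the default result window")
    return {"query": {"match_all": {}}, "from": start, "size": size}

print(page_request(3, size=20))
```

For deeper pagination than the window allows, Elasticsearch offers search_after, which this sketch does not cover.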

Query all documents in a specific index:

GET products/_search

Retrieve a specific document by ID:

GET products/_doc/1

Term Query

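Use term queries for exact matches on numbers, dates, booleans, or keyword fields. The search value is not analyzed, so term is a poor fit for text fields, whose indexed tokens are typically lowercased and split into words: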

GET products/_search
{
  "query": {
    "term": {
      "sku": "ABC12345"
    }
  }
}

Multiple values using terms (similar to SQL IN):

GET products/_search
{
  "query": {
    "terms": {
      "product_id": ["P-9876", "P-5432", "P-1111"]
    }
  }
}

Range Query

For numeric or date range filtering:

GET products/_search
{
  "query": {
    "range": {
      "price": {
        "gte": 10,
        "lte": 100
      }
    }
  }
}
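Date ranges work the same way, as long as the values use the format declared in the mapping (yyyy-MM-dd HH:mm:ss for created_at in this guide). A sketch that builds a one-month range, with a hypothetical helper name:

```python
from datetime import datetime

# Python equivalent of the mapping format "yyyy-MM-dd HH:mm:ss"
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"

def month_range(year, month, field="created_at"):
    """Build a range query covering one calendar month."""
    start = datetime(year, month, 1)
    # First instant of the following month, rolling over in December.
    end = datetime(year + 1, 1, 1) if month == 12 else datetime(year, month + 1, 1)
    return {
        "query": {
            "range": {
                field: {
                    "gte": start.strftime(DATE_FORMAT),
                    "lt": end.strftime(DATE_FORMAT),
                }
            }
        }
    }

print(month_range(2024, 1))
```

Using gte with lt (rather than lte) avoids having to work out the last second of the month.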

Exists Query

Match documents in which a field has a non-null value:

GET products/_search
{
  "query": {
    "exists": {
      "field": "price"
    }
  }
}
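The opposite question, "which documents lack this field?", has no dedicated query; it is usually expressed by wrapping exists in a bool must_not. A minimal sketch with a hypothetical helper name:

```python
def missing_field(field):
    """Build a query matching documents that have no value for `field`."""
    return {
        "query": {
            "bool": {
                "must_not": [{"exists": {"field": field}}]
            }
        }
    }

print(missing_field("price"))
```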

Bool Query

Combine multiple conditions using boolean logic:

GET products/_search
{
  "query": {
    "bool": {
      "must": {
        "term": {
          "active": true
        }
      },
      "must_not": {
        "term": {
          "stock": 0
        }
      },
      "should": [
        {
          "term": {
            "name": "mouse"
          }
        },
        {
          "term": {
            "name": "keyboard"
          }
        }
      ]
    }
  }
}

Boolean operators:

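  • must: All conditions must match, and they contribute to the score (AND)
  • filter: All conditions must match, but they are not scored
  • must_not: None of the conditions may match (NOT)
  • should: Matching conditions raise the score (OR); at least one is required only when the query has no must or filter clause, or when minimum_should_match is set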
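In the query above, should sits next to must, so its clauses are optional and only affect ranking. If at least one of them has to match, minimum_should_match makes that explicit. A sketch, assuming a hypothetical helper name:

```python
def active_in_stock(names, min_match=1):
    """Bool query where at least `min_match` should clauses must match."""
    return {
        "query": {
            "bool": {
                "must": [{"term": {"active": True}}],
                "must_not": [{"term": {"stock": 0}}],
                "should": [{"term": {"name": n}} for n in names],
                # Without this, should is optional when must is present.
                "minimum_should_match": min_match,
            }
        }
    }

query = active_in_stock(["mouse", "keyboard"])
print(query)
```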

Wildcard Query

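Pattern matching similar to SQL LIKE, where * matches any sequence of characters and ? matches a single character. Patterns are compared against individual indexed terms, and a leading wildcard such as the one below forces a scan of many terms, so it can be slow on large indexes: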

GET products/_search
{
  "query": {
    "wildcard": {
      "name": "*wire*"
    }
  }
}

Regexp Query

Regular expression pattern matching. The pattern must match the entire term, not just part of it:

GET products/_search
{
  "query": {
    "regexp": {
      "sku": "ABC[0-9]+"
    }
  }
}
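Because the pattern is applied to the whole term, it behaves like a full match rather than a substring search. The sketch below imitates that locally with Python's re.fullmatch. This is only an analogy: Elasticsearch uses Lucene's regular expression syntax, which differs from Python's in places, although a simple pattern like this one reads the same in both.

```python
import re

def term_matches(pattern, term):
    """Approximate regexp-query semantics: the whole term must match."""
    return re.fullmatch(pattern, term) is not None

for sku in ["ABC12345", "XABC12345", "ABC"]:
    print(sku, term_matches("ABC[0-9]+", sku))
```

To match a term that merely contains the pattern, the regexp itself has to allow for the surrounding characters, for example with .* on either side.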

Data Types Reference

Core types:

  • text, keyword (string)
  • long, integer, short, byte (integers)
  • double, float (decimals)
  • boolean
  • date
  • binary

Complex types:

  • object (JSON objects)
  • nested (arrays of objects whose elements are queried independently of each other)

Special types:

  • geo_point (latitude/longitude)
  • geo_shape (complex shapes)
  • ip (IPv4/IPv6)
  • join (parent-child relationships)

For exact matching, sorting, and aggregations, prefer keyword over text. A text field is tokenized at index time for full-text search, which costs more to index and store, and by default it cannot be sorted or aggregated on.
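When one field needs both behaviors, a common approach is a multi-field mapping: the main field is text and a keyword sub-field sits beside it. The sketch below shows such a mapping and picks the right field path per operation. The sub-field name raw and the helper are assumptions for illustration.

```python
# Mapping fragment: "name" is analyzed, "name.raw" holds the exact value.
name_mapping = {
    "name": {
        "type": "text",
        "fields": {
            "raw": {"type": "keyword"}
        },
    }
}

def field_for(operation, field="name", sub_field="raw"):
    """Pick the field path that suits the operation."""
    if operation == "full_text":
        return field
    if operation in ("exact", "sort", "aggregate"):
        return f"{field}.{sub_field}"
    raise ValueError(f"unknown operation: {operation}")

print(field_for("full_text"))
print(field_for("sort"))
```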

Posted on Thu, 14 May 2026 16:35:41 +0000 by vchris