Implementing Field-Level Custom Dictionaries in Easysearch with IK Analyzer

Dictionary Storage Structure

The system stores dictionary entries in a default index named .analysis_ik. Users can substitute a custom index, provided it keeps the required document structure:

POST .custom_dict_index/_doc
{
  "dictionary_name": "tech_terms",
  "dictionary_category": "main_dicts",
  "entries": "Artificial Intelligence\nMachine Learning\nDeep Learning"
}

Key fields include:

  • entries: Contains the dictionary terms separated by newlines
  • dictionary_name: Identifier for the dictionary set
  • dictionary_category: Specifies the dictionary type (main, stopword, or quantifier); the examples here use main_dicts
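
As a rough illustration of the document structure above, the following Python sketch assembles a dictionary document from a list of terms. The helper name build_dictionary_doc is hypothetical and not part of Easysearch; it only joins terms with newlines the way the entries field expects, and the resulting JSON still has to be sent to the dictionary index by whatever HTTP client you use.

```python
import json


def build_dictionary_doc(name, terms, category="main_dicts"):
    """Return a JSON-ready body for one dictionary document.

    Hypothetical helper: field names follow the article, nothing else
    about the plugin is assumed.
    """
    cleaned = []
    for term in terms:
        term = term.strip()
        if not term:
            continue  # skip blank terms
        if "\n" in term:
            # A newline inside a term would split it into two entries.
            raise ValueError(f"term contains a newline: {term!r}")
        if term not in cleaned:
            cleaned.append(term)  # drop duplicates, keep first-seen order
    return {
        "dictionary_name": name,
        "dictionary_category": category,
        "entries": "\n".join(cleaned),
    }


doc = build_dictionary_doc(
    "tech_terms",
    ["Artificial Intelligence", "Machine Learning", " Deep Learning ", "Machine Learning"],
)
print(json.dumps(doc, indent=2))
```

Building the entries string in code avoids hand-editing escaped newlines when a dictionary grows beyond a few terms.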

Configuration Example

To apply custom dictionaries at the field level:

PUT tech_documents
{
  "settings": {
    "analysis": {
      "analyzer": {
        "tech_analyzer": {
          "tokenizer": "tech_tokenizer"
        }
      },
      "tokenizer": {
        "tech_tokenizer": {
          "type": "ik_max_word",
          "use_custom_dict": true,
          "include_defaults": false,
          "case_sensitive": false,
          "dict_reference": "tech_terms"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "description": {
        "type": "text",
        "analyzer": "tech_analyzer"
      }
    }
  }
}

Configuration parameters:

  • use_custom_dict: Enables/disables custom dictionary usage
  • include_defaults: Controls whether the custom dictionary is merged with the default dictionaries
  • case_sensitive: Determines case sensitivity in tokenization
  • dict_reference: Points to the dictionary name in the storage index
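
When several fields need different dictionaries, the index body above can be generated instead of copied. The sketch below is one possible way to do that in Python; build_index_body and its arguments are hypothetical names, and the tokenizer parameters are taken as-is from the configuration shown in this article.

```python
import json


def build_index_body(field, dict_reference, analyzer="tech_analyzer",
                     tokenizer="tech_tokenizer", tokenizer_type="ik_max_word",
                     include_defaults=False, case_sensitive=False):
    """Return an index-creation body that binds one text field to one dictionary.

    Hypothetical helper: it mirrors the article's example and makes no
    claim about other options the plugin may accept.
    """
    return {
        "settings": {
            "analysis": {
                "analyzer": {analyzer: {"tokenizer": tokenizer}},
                "tokenizer": {
                    tokenizer: {
                        "type": tokenizer_type,
                        "use_custom_dict": True,
                        "include_defaults": include_defaults,
                        "case_sensitive": case_sensitive,
                        "dict_reference": dict_reference,
                    }
                },
            }
        },
        "mappings": {
            "properties": {field: {"type": "text", "analyzer": analyzer}}
        },
    }


body = build_index_body("description", "tech_terms")
print(json.dumps(body, indent=2))
```

The printed JSON matches the PUT tech_documents body above, so it can be pasted into a console or sent with an HTTP client of your choice.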

Dictionary Updates

To update a dictionary, append a new document with the same dictionary_name to the storage index:

POST .analysis_ik/_doc
{
  "dictionary_name": "tech_terms",
  "dictionary_category": "main_dicts",
  "entries": "Neural Networks\nComputer Vision"
}

The analyzer automatically detects new entries by comparing timestamps, with updates processed at one-minute intervals.
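
To make the timestamp comparison more concrete, here is a small Python simulation of how such a refresh cycle could work. It is not the plugin's actual implementation: the DictionaryCache class and the timestamp field on each document are assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class DictionaryCache:
    """Toy in-memory model of a dictionary that is refreshed periodically."""
    name: str
    terms: set = field(default_factory=set)
    last_seen: float = 0.0  # timestamp of the newest document already loaded

    def refresh(self, documents):
        """Load documents newer than last_seen; return the number of new terms."""
        added = 0
        newest = self.last_seen
        for doc in documents:
            if doc["dictionary_name"] != self.name:
                continue  # belongs to another dictionary
            if doc["timestamp"] <= self.last_seen:
                continue  # already loaded in an earlier cycle
            for term in doc["entries"].split("\n"):
                if term and term not in self.terms:
                    self.terms.add(term)
                    added += 1
            newest = max(newest, doc["timestamp"])
        self.last_seen = newest
        return added


# "timestamp" is a hypothetical field standing in for whatever the index records.
stored = [{"dictionary_name": "tech_terms", "timestamp": 100.0,
           "entries": "Artificial Intelligence\nMachine Learning\nDeep Learning"}]
cache = DictionaryCache("tech_terms")
first = cache.refresh(stored)   # initial load

stored.append({"dictionary_name": "tech_terms", "timestamp": 160.0,
               "entries": "Neural Networks\nComputer Vision"})
second = cache.refresh(stored)  # next cycle picks up only the appended document
third = cache.refresh(stored)   # nothing new since the last cycle
print(first, second, third)
```

In a real deployment the refresh would run on the analyzer's own schedule (one minute, per the description above), so a newly appended term may not affect tokenization immediately.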

Testing the Configuration

Analyze sample text to verify the tokenization:

POST tech_documents/_analyze
{
  "field": "description",
  "text": "Artificial Intelligence and Neural Networks"
}
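
A quick way to check the result is to confirm that each dictionary term comes back as a single token. The Python sketch below assumes the response follows the Elasticsearch-style _analyze shape (a tokens array whose items have a token field); the missing_terms helper is hypothetical, and sample_response is hand-written to exercise the helper rather than captured from a real cluster.

```python
def missing_terms(analyze_response, expected_terms):
    """Return the expected terms that do not appear as whole tokens.

    Comparison is case-insensitive because the example tokenizer sets
    case_sensitive to false.
    """
    tokens = {t["token"].lower() for t in analyze_response.get("tokens", [])}
    return [term for term in expected_terms if term.lower() not in tokens]


# Hand-written illustration only; real output depends on the plugin and dictionary.
sample_response = {
    "tokens": [
        {"token": "artificial intelligence", "start_offset": 0, "end_offset": 23, "position": 0},
        {"token": "and", "start_offset": 24, "end_offset": 27, "position": 1},
        {"token": "neural networks", "start_offset": 28, "end_offset": 43, "position": 2},
    ]
}

found_all = missing_terms(sample_response, ["Artificial Intelligence", "Neural Networks"])
not_loaded = missing_terms(sample_response, ["Computer Vision"])
print(found_all, not_loaded)
```

If a term shows up in the missing list shortly after an update, it may simply not have been picked up by the next refresh cycle yet.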

Tags: Easysearch ik-analyzer text-analysis search-engine custom-dictionaries

Posted on Tue, 19 May 2026 07:04:14 +0000 by cdennste