Building a Domain-Specific RAG Assistant with Huixiangdou and InternLM

Retrieval-Augmented Generation Architecture

Retrieval-Augmented Generation (RAG) enhances generative models by dynamically fetching relevant context from external knowledge stores before synthesizing a response. This methodology addresses core limitations of standalone large language models, including factual hallucination, temporal knowledge decay, and opaque reasoning pathways. By decoupling knowledge storage from model parameters, RAG enables instant domain adaptation without gradient updates or retraining. The Huixiangdou framework leverages this architecture to rapidly construct specialized conversational agents.
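The retrieve-then-generate flow can be sketched in a few lines. The word-overlap scorer below is a toy stand-in for the embedding-based retrieval Huixiangdou actually performs, and all names here are illustrative:

```python
import re

def tokens(text):
    """Lowercased word set, used as a toy proxy for an embedding."""
    return set(re.findall(r"\w+", text.lower()))

def retrieve(query, documents, top_k=2):
    """Rank documents by word overlap with the query (stand-in scorer)."""
    q = tokens(query)
    ranked = sorted(documents, key=lambda d: len(q & tokens(d)), reverse=True)
    return ranked[:top_k]

def answer(query, documents):
    """Fetch context first, then assemble the prompt for the generator."""
    context = retrieve(query, documents)
    prompt = "Context:\n" + "\n".join(context) + "\n\nQuestion: " + query
    return prompt  # a real system forwards this prompt to the LLM

docs = [
    "Huixiangdou is a domain-specific assistant built on RAG.",
    "InternLM2 is a family of open large language models.",
    "The weather service reports daily forecasts.",
]
print(answer("What is Huixiangdou?", docs))
```

Because knowledge lives in the document list rather than in model weights, swapping the documents adapts the assistant without any retraining.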

Runtime Environment Preparation

Initialize an isolated execution workspace by duplicating the platform's base image and installing required libraries:

# Clone base environment to a dedicated workspace
studio-conda -o internlm-base -t RAG_Workspace

# Switch to the new environment
conda activate RAG_Workspace

# Install core dependencies
pip install protobuf==4.25.3 accelerate==0.28.0 aiohttp==3.9.3 auto-gptq==0.7.1 bcembedding==0.1.3 beautifulsoup4==4.8.2 einops==0.7.0 faiss-gpu==1.7.2 langchain==0.1.14 loguru==0.7.2 lxml_html_clean==0.1.0 openai==1.16.1 openpyxl==3.1.2 pandas==2.2.1 pydantic==2.6.4 pymupdf==1.24.1 python-docx==1.1.0 pytoml==0.1.21 readability-lxml==0.8.1 redis==5.0.3 requests==2.31.0 scikit-learn==1.4.1.post1 sentence_transformers==2.2.2 textract==1.6.5 tiktoken==0.6.0 transformers==4.39.3 transformers_stream_generator==0.0.5 unstructured==0.11.2

Confirm environment activation via conda env list. Ensure RAG_Workspace is marked as active before proceeding.

Model Linking and Resource Management

Avoid redundant downloads by creating symbolic links to pre-cached weights in the shared storage directory:

cd /root && mkdir -p models
cd /root/models

# Establish links for embedding and reranking engines
ln -s /root/share/new_models/maidalun1020/bce-embedding-base_v1 ./embedding_engine
ln -s /root/share/new_models/maidalun1020/bce-reranker-base_v1 ./reranker_engine

# Link the primary generative model
ln -s /root/share/new_models/Shanghai_AI_Laboratory/internlm2-chat-7b ./generative_llm

Project Cloning and Configuration

Fetch the Huixiangdou repository and lock to a stable release tag:

mkdir -p /root/code && cd /root/code
git clone https://github.com/internlm/huixiangdou && cd huixiangdou
git checkout 447c6f7e68a1657fce1c4f7c740ea1700bde0440

Update the central configuration file to reference the locally linked models. Modify the paths for the vectorization engine, reranker, and local LLM:

sed -i '6s#.*#embedding_model_path = "/root/models/embedding_engine"#' config.ini
sed -i '7s#.*#reranker_model_path = "/root/models/reranker_engine"#' config.ini
sed -i '29s#.*#local_llm_path = "/root/models/generative_llm"#' config.ini
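Before building the index, it is worth verifying that the three paths written into config.ini resolve to the symlinks created earlier. check_model_paths is a hypothetical helper for this check, not part of Huixiangdou:

```python
import os

def check_model_paths(paths):
    """Return every path that does not exist on disk."""
    return [p for p in paths if not os.path.exists(p)]

missing = check_model_paths([
    "/root/models/embedding_engine",
    "/root/models/reranker_engine",
    "/root/models/generative_llm",
])
if missing:
    print("Missing model paths:", missing)
else:
    print("All model paths resolve.")
```

Note that os.path.exists follows symlinks, so this also catches a link whose target was never populated.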

Vector Database Construction

Transform domain documentation into a searchable vector index. The pipeline utilizes LangChain for text segmentation and relies on the BCE bilingual models for semantic encoding.
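The segmentation step can be illustrated with a simplified fixed-size splitter. LangChain's splitters add smarter boundary handling (separator hierarchies, recursion), but the overlapping-window idea is the same; chunk_text is an illustrative stand-in, not the call Huixiangdou makes:

```python
def chunk_text(text, chunk_size=200, overlap=32):
    """Split text into overlapping fixed-size windows."""
    chunks = []
    step = chunk_size - overlap
    for start in range(0, max(len(text) - overlap, 1), step):
        chunks.append(text[start:start + chunk_size])
    return chunks

sample = "Huixiangdou builds a vector index from domain documentation. " * 20
pieces = chunk_text(sample)
print(len(pieces), "chunks; first chunk ends:", repr(pieces[0][-20:]))
```

The overlap keeps sentences that straddle a chunk boundary retrievable from at least one chunk.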

First, acquire the target documentation set:

cd /root/code/huixiangdou && mkdir -p repodir
git clone https://github.com/internlm/huixiangdou --depth=1 repodir/huixiangdou

The retrieval system maintains two intent filters: a positive corpus (topics to address) and a negative corpus (off-topic or casual prompts to discard). Backup the default positive list and inject domain-specific examples:

cd /root/code/huixiangdou
mv resource/good_questions.json resource/good_questions_bak.json

cat > resource/good_questions.json << 'EOF'
[
    "How to integrate mmyolo interfaces within mmpose?",
    "What is the workflow for behavior recognition after pose estimation in mmpose?",
    "How to switch the detection checkpoint from Faster R-CNN to YOLO in the topdown demo?",
    "What is Huixiangdou and how is it installed?",
    "Can Huixiangdou be deployed to WeChat or Feishu groups?",
    "How to configure the config.ini file?",
    "Which remote LLM models are supported?"
]
EOF
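The accept/reject decision driven by these two corpora can be sketched as follows. Jaccard word overlap stands in for the embedding similarity the real pipeline computes, and the corpora are abbreviated:

```python
import re

def tokens(t):
    return set(re.findall(r"\w+", t.lower()))

def jaccard(a, b):
    """Word-overlap similarity, a crude stand-in for embedding cosine."""
    union = tokens(a) | tokens(b)
    return len(tokens(a) & tokens(b)) / len(union) if union else 0.0

def should_answer(query, good, bad):
    """Accept only if the query sits closer to the positive corpus."""
    best_good = max(jaccard(query, q) for q in good)
    best_bad = max(jaccard(query, q) for q in bad)
    return best_good > best_bad

good = [
    "What is Huixiangdou and how is it installed?",
    "How to configure the config.ini file?",
]
bad = ["Hello, please introduce yourself"]

print(should_answer("How do I configure config.ini?", good, bad))  # on-topic
print(should_answer("Hello, please tell me a joke", good, bad))    # chit-chat
```

Queries that resemble the negative corpus more than the positive one are discarded before any retrieval or generation happens.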

Prepare a validation dataset to test the filtering logic:

cat > ./test_queries.json << 'EOF'
[
    "What is the primary function of Huixiangdou?",
    "Hello, please introduce yourself"
]
EOF

Execute the feature extraction pipeline to build the vector store:

cd /root/code/huixiangdou && mkdir -p workdir
python3 -m huixiangdou.service.feature_store --sample ./test_queries.json

During inference, the system computes cosine similarity between incoming prompts and both reference sets. High-confidence matches proceed to keyword extraction and top-K chunk retrieval. The aggregated context and original query are forwarded to the generative model for response synthesis.
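The ranking step described above reduces to cosine scoring over chunk vectors. The pure-Python sketch below mirrors what the BCE embeddings and the FAISS index do at much larger scale; the two-dimensional vectors are tiny illustrative examples:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query_vec, chunk_vecs, k=3):
    """Indices of the k chunks most cosine-similar to the query."""
    order = sorted(range(len(chunk_vecs)),
                   key=lambda i: cosine(query_vec, chunk_vecs[i]),
                   reverse=True)
    return order[:k]

query = [1.0, 0.0]
chunks = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
print(top_k(query, chunks, k=2))  # indices of the two closest chunks
```

The text of the winning chunks, concatenated with the original query, forms the prompt handed to the generative model.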

Execution and Verification

With the vector index populated, conduct a local inference test. Update the query array in the execution script:

cd /root/code/huixiangdou/
# Replace test prompts
sed -i '74s/.*/    user_queries = ["What is Huixiangdou?", "How to deploy to a WeChat group?", "What is the weather today?"]/' huixiangdou/main.py

# Launch in standalone mode
python3 -m huixiangdou.main --standalone

The engine will evaluate each prompt, suppress irrelevant queries, and produce precise, context-grounded answers for domain-specific inputs. The entire process operates on the base InternLM2-Chat-7B weights without additional fine-tuning.

Graphical Interface Deployment

For interactive usage, the Huixiangdou Web Demo on OpenXLab supports direct document ingestion and browser-based querying. Access the deployment portal and initialize a fresh knowledge repository with a custom identifier and access credential.

Upload source files in compatible formats (PDF, DOCX, Markdown, XLSX). Upon completion, the system automatically re-indexes the new content. Users can then interact with the assistant through the web UI, verifying its capacity to extract specific policy clauses, explain platform features, or guide users through registration and compliance steps. The interface displays real-time chunk retrieval and response generation, confirming the end-to-end RAG pipeline.

Tags: RAG LangChain VectorDatabase InternLM Huixiangdou

Posted on Fri, 08 May 2026 15:48:59 +0000 by nimbus