Prerequisites and Core Tools
To establish a robust development environment for an AI-driven question-answering system, several core components must be installed and configured correctly. This setup ensures isolation, reproducibility, and ease of management throughout the development lifecycle.
Python Runtime
The project requires Python version 3.10 or higher. It is recommended to download the official installer from the Python website. During installation, ensure that the option to add Python to the system PATH is selected to facilitate command-line access. Once installed, verify the installation by running python --version in the terminal.
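The same version check can be performed programmatically at application startup; a minimal standard-library sketch (the 3.10 floor mirrors the requirement above):

```python
import sys

def meets_minimum_python(minimum: tuple[int, int] = (3, 10)) -> bool:
    """Return True when the running interpreter is at least `minimum`."""
    return sys.version_info[:2] >= minimum

if __name__ == "__main__":
    # Fail fast with a clear message rather than cryptic errors later.
    if not meets_minimum_python():
        raise SystemExit("This project requires Python 3.10 or higher.")
```

Failing fast here is cheaper than debugging syntax errors from 3.10-only features (such as structural pattern matching) deep inside the application.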
Docker Desktop
Docker Desktop provides the necessary containerization engine to run dependencies such as vector databases in isolated environments. Download the appropriate installer for your operating system from the official Docker documentation site. After installation, a system reboot may be required to finalize network driver configurations. Verify the installation by executing docker version in the command prompt.
Visual Studio Code
Visual Studio Code serves as the primary integrated development environment (IDE). It supports extensive extensions for Python development and container management. Download the installer from the official VS Code website. While other AI-assisted editors exist, VS Code offers stable support for Dev Containers, which is critical for this workflow.
Infrastructure Orchestration
The backend infrastructure relies on a vector database to store embeddings. Milvus is selected for this purpose, managed via Docker Compose to handle its dependencies (etcd and MinIO) simultaneously.
Docker Compose Configuration
Create a file named docker-compose.yaml in the project root. This file defines the network topology and service configurations. The following configuration sets up the Milvus standalone mode along with its required storage and metadata services.
version: '3.8'
services:
  ai-engine:
    build:
      context: .
      dockerfile: Dockerfile
      target: development
      args:
        - BUILD_ENV=local
    environment:
      - LLM_API_KEY=${LLM_API_KEY}
    networks:
      - ai-network
  vec-metadata:
    container_name: vec-etcd
    image: quay.io/coreos/etcd:v3.5.5
    environment:
      - ETCD_AUTO_COMPACTION_MODE=revision
      - ETCD_AUTO_COMPACTION_RETENTION=1000
      - ETCD_QUOTA_BACKEND_BYTES=4294967296
    volumes:
      - ./data/etcd:/etcd
    command: etcd -advertise-client-urls=http://127.0.0.1:2379 -listen-client-urls http://0.0.0.0:2379 --data-dir /etcd
    networks:
      - ai-network
  vec-storage:
    container_name: vec-minio
    image: minio/minio:RELEASE.2023-03-20T20-16-18Z
    environment:
      MINIO_ACCESS_KEY: minioadmin
      MINIO_SECRET_KEY: minioadmin
    volumes:
      - ./data/minio:/minio_data
    command: minio server /minio_data
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:9000/minio/health/live"]
      interval: 30s
      timeout: 20s
      retries: 3
    networks:
      - ai-network
  vec-core:
    container_name: vec-milvus
    image: milvusdb/milvus:v2.3.4
    command: ["milvus", "run", "standalone"]
    environment:
      ETCD_ENDPOINTS: vec-metadata:2379
      MINIO_ADDRESS: vec-storage:9000
    volumes:
      - ./data/milvus:/var/lib/milvus
    ports:
      - "19530:19530"
    depends_on:
      - "vec-metadata"
      - "vec-storage"
    networks:
      - ai-network
networks:
  ai-network:
    driver: bridge
Application Container Definition
The Dockerfile defines the runtime environment for the application code. It utilizes a multi-stage build approach to manage dependencies via Poetry. The development target keeps the container running indefinitely to allow for interactive debugging.
FROM python:3.10-slim as base
WORKDIR /workspace
# Install system dependencies
RUN apt-get update && apt-get install -y curl
# Copy project files into the image
COPY . .
# Install Poetry
RUN pip install --no-cache-dir poetry==1.5.0
# Configure Poetry and install project dependencies
RUN poetry config virtualenvs.create false && \
poetry install --no-interaction --no-ansi --only main
# Development stage
FROM base as development
CMD ["sleep", "infinity"]
Dependency Management
Project dependencies are managed using Poetry. Create a pyproject.toml file to specify the required libraries, including the web framework, LLM client, and orchestration tools.
[tool.poetry]
name = "neural-qa"
version = "1.0.0"
description = "Generative AI Question Answering System"
authors = ["DevTeam"]
[tool.poetry.dependencies]
python = "^3.10"
fastapi = "^0.104.0"
ipykernel = "^6.26.0"
langchain = "^0.0.350"
openai = "^1.3.0"
[[tool.poetry.source]]
name = "pypi"
url = "https://pypi.org/simple/"
priority = "primary"
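The caret (`^`) constraints above pin each dependency below its next breaking release: the exclusive upper bound is obtained by incrementing the left-most non-zero version component. A small illustrative sketch of that rule (not Poetry's actual resolver):

```python
def caret_range(version: str) -> tuple[str, str]:
    """Compute (inclusive lower, exclusive upper) bounds for a '^version'
    constraint: the upper bound bumps the left-most non-zero component."""
    parts = [int(p) for p in version.split(".")]
    for i, value in enumerate(parts):
        if value != 0:
            upper = parts[:i] + [value + 1] + [0] * (len(parts) - i - 1)
            break
    else:  # all components are zero: bump the last one
        upper = parts[:-1] + [parts[-1] + 1]
    return version, ".".join(str(p) for p in upper)
```

For example, `^3.10` permits any interpreter from 3.10 up to (but excluding) 4.0, while `^0.104.0` stays below 0.105.0, since pre-1.0 releases treat minor bumps as breaking.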
Development Environment Integration
To streamline the workflow, Visual Studio Code can be configured to develop directly inside the Docker container. This ensures that the local development environment matches the production deployment exactly.
Dev Container Configuration
Create a devcontainer.json file inside a .devcontainer folder at the project root to instruct VS Code on how to attach to the Docker Compose services. This configuration forwards the necessary ports and installs recommended extensions automatically.
{
  "dockerComposeFile": "../docker-compose.yaml",
  "service": "ai-engine",
  "workspaceFolder": "/workspace",
  "shutdownAction": "stopCompose",
  "forwardPorts": [19530, 8000],
  "customizations": {
    "vscode": {
      "extensions": [
        "ms-python.python",
        "ms-python.vscode-pylance",
        "ms-toolsai.jupyter"
      ]
    }
  }
}
Environment Variables
Secure credentials should never be committed to version control. Create a .env file in the project root to store sensitive API keys. This file is referenced by Docker Compose during container startup.
LLM_API_KEY="sk-placeholder-key-do-not-commit"
Ensure that .env is listed in your .gitignore file to prevent accidental exposure. A comprehensive .gitignore should exclude Python bytecode, virtual environments, IDE settings, and local data volumes.
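Inside the container, the key passed through docker-compose.yaml is available as an ordinary environment variable. A minimal sketch of reading it defensively at startup (the helper name is illustrative, not part of any library):

```python
import os

def require_env(name: str) -> str:
    """Return the value of an environment variable, failing fast if unset."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; check your .env file.")
    return value

# Typical call at application startup:
# llm_api_key = require_env("LLM_API_KEY")
```

Failing at startup with a named variable is far easier to diagnose than an authentication error surfacing later from deep inside the LLM client.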
Launching the Environment
Once all configuration files are in place, open the project folder in Visual Studio Code. Open the Command Palette (Ctrl+Shift+P) and select "Dev Containers: Reopen in Container". VS Code will build the images and start the services defined in the Compose file.
After the container starts, open a new terminal within VS Code. This terminal operates inside the ai-engine container. You can verify connectivity to the vector database by checking the Docker Desktop dashboard or running health check commands against the exposed ports. The environment is now ready for code implementation and testing.
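One lightweight health check is simply confirming that the Milvus port accepts TCP connections; a standard-library sketch (the host and port match the Compose mapping above):

```python
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    status = "reachable" if port_open("localhost", 19530) else "unreachable"
    print(f"Milvus gRPC port 19530 is {status}")
```

For a fuller check, the pymilvus client can connect and query the server version, but that requires first adding pymilvus to the dependencies in pyproject.toml.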