Integrating DeepSeek with Spring AI for Enhanced Business Process Intelligence

Spring AI Integration Architecture

The integration uses Ollama as the deployment platform for the model, with Spring AI providing a standardized interface for interacting with it. Ollama also exposes an OpenAI-compatible API, so developers can use either Spring AI's Ollama client or its OpenAI client, depending on their requirements.

Ollama Configuration Setup

Begin by adding the Spring AI Ollama dependency to your project:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-ollama-spring-boot-starter</artifactId>
</dependency>

Configure the connection parameters in your application properties:

spring.ai.ollama.base-url=http://your-server-ip:6399
spring.ai.ollama.chat.options.model=deepseek-r1:7b

Replace your-server-ip with the external IP address of your HAI server. The 7b model variant is recommended over the 1.5b version because the larger model is more capable.
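Before starting the application, it can help to confirm that the configured base URL is reachable and that the model has been pulled. The following is a minimal sketch using only the JDK's HTTP client against Ollama's model listing endpoint (GET /api/tags). The class name OllamaHealthCheck and its helper methods are hypothetical, and the model lookup is a simple substring match rather than real JSON parsing.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper, not part of Spring AI or Ollama.
class OllamaHealthCheck {

    /** Builds the URI of Ollama's model listing endpoint from a base URL. */
    static URI tagsUri(String baseUrl) {
        String trimmed = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        return URI.create(trimmed + "/api/tags");
    }

    /** Naive check: does the response body mention the model name as a quoted string? */
    static boolean listsModel(String tagsJson, String model) {
        return tagsJson.contains("\"" + model + "\"");
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.out.println("usage: OllamaHealthCheck <base-url> <model>");
            return;
        }
        HttpRequest request = HttpRequest.newBuilder(tagsUri(args[0])).GET().build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println("HTTP status: " + response.statusCode());
        System.out.println("Model available: " + listsModel(response.body(), args[1]));
    }
}
```

Run it with the same values used in the properties file, for example http://your-server-ip:6399 and deepseek-r1:7b.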

Service Implementation

Create a REST endpoint to handle inference requests:

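// Assumes aiChatClient is a Spring AI ChatClient and chatSession is a ChatMemory instance.
@PostMapping("/inference")
ResponseEntity<InferenceResult> processUserQuery(@RequestParam("query") String userQuery) {
    String modelResponse = this.aiChatClient.prompt()
            .user(userQuery)
            // Attach conversation memory through an advisor. The constructor form matches the
            // milestone releases; Spring AI 1.0+ uses MessageChatMemoryAdvisor.builder(chatSession).build().
            .advisors(new MessageChatMemoryAdvisor(chatSession))
            .call()      // blocking: returns once the full response is available
            .content();  // response text as a String

    // Wrap the model output in the application's own response type.
    InferenceResult result = InferenceResult.builder()
            .type("text")
            .content(ResponseContent.builder().text(modelResponse).build())
            .build();
    return ResponseEntity.ok(result);
}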

This endpoint blocks until the model has generated the complete response. For real-time feedback, consider streaming the response to the client as it is generated.
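In Spring AI, the ChatClient offers a stream() counterpart to the blocking call, which returns the response as a reactive Flux. To show what streaming looks like underneath, here is a minimal sketch that talks to Ollama's /api/generate endpoint directly with the JDK's HTTP client and forwards each newline-delimited JSON chunk as soon as it arrives. The class OllamaStreamSketch, its methods, and the sample prompt are hypothetical, and the request body is built by hand with only basic escaping, so a real client should use a JSON library.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;

// Hypothetical sketch, not part of Spring AI or Ollama.
class OllamaStreamSketch {

    /** Builds a streaming request body; escapes only backslashes, quotes, and newlines. */
    static String requestBody(String model, String prompt) {
        String escaped = prompt.replace("\\", "\\\\")
                .replace("\"", "\\\"")
                .replace("\n", "\\n");
        return "{\"model\":\"" + model + "\",\"prompt\":\"" + escaped + "\",\"stream\":true}";
    }

    /** Passes each non-blank line to the callback as soon as it is read; returns the chunk count. */
    static int forwardChunks(BufferedReader reader, Consumer<String> onChunk) throws IOException {
        int count = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.isBlank()) {
                continue;
            }
            onChunk.accept(line); // one JSON object per line
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.out.println("usage: OllamaStreamSketch <base-url> <model>");
            return;
        }
        HttpRequest request = HttpRequest.newBuilder(URI.create(args[0] + "/api/generate"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(
                        requestBody(args[1], "Summarize the open purchase orders.")))
                .build();
        // ofInputStream() exposes the body while the server is still writing it.
        HttpResponse<InputStream> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofInputStream());
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(response.body(), StandardCharsets.UTF_8))) {
            forwardChunks(reader, System.out::println);
        }
    }
}
```

Each printed line is a small JSON object carrying a fragment of the answer, so a UI can render text progressively instead of waiting for the whole response.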

Model Selection Considerations

While the 7b model provides adequate performance, the 70b variant offers enhanced capabilities at the cost of increased computational requirements. Larger models demand significant storage capacity and GPU memory.

To deploy larger models, use the Ollama command-line interface:

ollama pull deepseek-r1:70b

Systems with limited storage (200GB baseline) may encounter constraints when handling larger models. For optimal performance with 70b models, ensure at least 32GB of GPU memory is available.
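A rough way to reason about these limits is to estimate the size of the model weights as parameter count times bits per weight, divided by eight. The sketch below does that arithmetic. The 4-bit quantization width is an assumption, the class ModelMemoryEstimate is hypothetical, and the estimate ignores runtime overhead such as the context cache, so actual requirements will be higher.

```java
// Hypothetical back-of-the-envelope estimator; figures are approximations only.
class ModelMemoryEstimate {

    /** Approximate weight size in GB (10^9 bytes): billions of parameters x bits per weight / 8. */
    static double weightGigabytes(double billionParams, double bitsPerWeight) {
        return billionParams * bitsPerWeight / 8.0;
    }

    public static void main(String[] args) {
        double assumedBitsPerWeight = 4.0; // assumption: 4-bit quantized weights
        String[] tags = {"deepseek-r1:1.5b", "deepseek-r1:7b", "deepseek-r1:70b"};
        double[] billions = {1.5, 7.0, 70.0};
        for (int i = 0; i < tags.length; i++) {
            System.out.printf("%s: about %.2f GB of weights%n",
                    tags[i], weightGigabytes(billions[i], assumedBitsPerWeight));
        }
    }
}
```

Under these assumptions the 70b weights alone come to roughly 35 GB, so the 32GB GPU figure above is best treated as a floor rather than a comfortable target.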

DeepSeek's reasoning capabilities, when integrated through Spring AI, provide valuable intelligence augmentation for business processes. The flexible model selection allows organizations to balance performance requirements against infrastructure constraints.

Tags: Spring AI Deepseek Ollama Model Inference Business Process Automation

Posted on Mon, 22 Jun 2026 18:48:16 +0000 by jrodd32