Introduction
In previous discussions, we explored the fundamentals of Spring AI and its underlying principles. For those still unfamiliar with these concepts, I recommend reviewing this comprehensive resource to solidify your understanding: https://www.cnblogs.com/guoxiaoyu/p/18441709
Today, we'll focus on practical AI implementation in real-world applications, particularly how Spring AI can help businesses optimize functionality and deepen AI integration with their operations. We'll approach this through the lens of intelligent agents—a technology that major tech companies are actively pursuing. This focus will help us better understand the practical application and effectiveness of these concepts.
In this guide, we'll walk through creating a locally deployable, stable agent solution. I'll detail each step to ensure everyone can follow along. Additionally, I'll analyze the strengths and weaknesses of this approach compared to existing agent solutions, helping you make informed decisions about your implementation choices.
Preparation
Spring AI integrates with APIs from many well-known providers. If you would normally use OpenAI's interface, Hunyuan's API is a viable alternative: it is compatible with OpenAI's interface specification, so you can call Hunyuan's large models directly through OpenAI's official SDK. This design greatly simplifies migration—you only need to replace the base_url and api_key with Hunyuan's configuration, without modifying your existing application code, to switch seamlessly to Hunyuan's models and leverage their capabilities.
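As an illustration, with Spring AI's OpenAI starter that swap boils down to a few properties (the endpoint is the one used later in this guide; the key is a placeholder you must fill in yourself):

```properties
# Point the OpenAI-compatible client at Hunyuan instead of OpenAI
spring.ai.openai.base-url=https://api.hunyuan.cloud.tencent.com
spring.ai.openai.api-key=<your-hunyuan-api-key>
# Choose a Hunyuan model; we use hunyuan-pro later in this guide
spring.ai.openai.chat.options.model=hunyuan-pro
```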
API Key Application
Rest assured, after thorough testing, all interfaces function correctly without any anomalies. You can apply for access through this link: Hunyuan API Application Portal
Ensure you apply for the necessary API Key under your personal account.
Please safeguard your API Key information, as it will be crucial for subsequent usage.
Integration Documentation
While reviewing the documentation isn't mandatory since we won't directly interface with Hunyuan's APIs, understanding the details can be beneficial. We'll use OpenAI-compatible interfaces within Spring AI, which simplifies our workflow. For those interested in exploring the API documentation further, you can find comprehensive instructions here: API Documentation
Please note that since agents need to call plugins or workflows during execution, only the following three models support function callbacks. This means other models lack this capability. Keep this in mind when selecting your model.
Currently, Hunyuan hasn't introduced prepaid large model resource packages—users can only pre-purchase concurrency packages. For billing details, please refer to the illustration below.
Project Configuration
We'll continue using our previous Spring AI demonstration project, making necessary modifications. The required Maven POM dependencies are as follows:
<dependency>
<groupId>org.springframework.ai</groupId>
<artifactId>spring-ai-openai-spring-boot-starter</artifactId>
</dependency>
In my initial configuration, I chose the functioncall model due to its lower cost. However, I later discovered that it doesn't perform well with system prompts, so I switched to the pro model. You can make your choice based on specific requirements and budget.
The functioncall model responds weakly to system prompts but excels at digesting function callback results. The pro model follows system prompts well, but by default it doesn't answer directly from callback results—it keeps emitting planner content instead of replying to the user. More on this later.
The application.properties file handles global configuration, which all ChatClient instances follow. A significant advantage is that developers don't need to modify code at the application level—simply changing dependencies in Maven's POM file allows easy switching between different AI large model providers. This flexibility enhances project maintainability and facilitates model replacement and upgrades.
Spring AI Agent Construction
Now, assuming you've completed all preparations, we can begin building our agent. First, we'll focus on customizing configuration parameters. As mentioned earlier, application.properties provides global settings for all ChatClient instances, but each model has specific domains and use cases. Therefore, we need to configure personalized settings for each interface to ensure model performance aligns with actual business requirements.
Custom Model Configuration
Standard Invocation
First, let's examine how code should be written in normal circumstances:
@PostMapping("/ai-function")
ChatDataPO functionGenerationByText(@RequestParam("userInput") String userInput) {
String content = this.myChatClientWithSystem
.prompt()
.system("You are Alex, a Java server-side developer dedicated to exploring the mysteries of AI technology. Passionate about technical exchange and sharing, and enthusiastic about the open-source community.")
.user(userInput)
.advisors(messageChatMemoryAdvisor)
.functions("CurrentWeather")
.call()
.content();
log.info("content: {}", content);
ChatDataPO chatDataPO = ChatDataPO.builder().code("text").data(ChildData.builder().text(content).build()).build();
return chatDataPO;
}
As shown, if we set a breakpoint before making the request, we can observe the chatOptions parameter, which represents our default configuration. Our primary goal is to explore how to effectively modify the chatOptions parameter before sending the request.
During prompt testing, we found that the functioncall model doesn't respond significantly to system prompts, failing to deliver expected performance. However, this model has a notable advantage—it supports function callbacks (explained in previous chapters) and is less expensive than the pro model, making it a more cost-effective choice in certain scenarios.
Custom Invocation
To make the model's responses better align with prompt requirements, we can configure the model individually. If you want to adjust a specific method rather than using global settings in application.properties, you can achieve this by modifying the relevant parameters. The specific configuration method is as follows:
// Omitted duplicate code
OpenAiChatOptions openAiChatOptions = OpenAiChatOptions.builder()
.withModel("hunyuan-pro").withTemperature(0.5f).build();
String content = this.myChatClientWithSystem
.prompt()
.system("You are Alex, a Java server-side developer dedicated to exploring the mysteries of AI technology. Passionate about technical exchange and sharing, and enthusiastic about the open-source community.")
.user(userInput)
.options(openAiChatOptions)
.advisors(messageChatMemoryAdvisor)
// Omitted duplicate code
}
Here, we simply need to configure the relevant options to complete the setup. Next, we can check the configuration at the breakpoint to ensure these settings have taken effect and are operating correctly.
Similarly, as mentioned earlier, the pro model we configured is more friendly to system prompts compared to the function-call model.
Reasoning Path
In most agents, these reasoning paths aren't displayed—only Baidu's agent system presents them to users. These paths are generated and returned by the large model, so I haven't made additional configurations here. We could choose to return these paths, with the relevant source code as follows:
private void writeWithMessageConverters(Object body, Type bodyType, ClientHttpRequest clientRequest)
throws IOException {
// Omitted code
for (HttpMessageConverter messageConverter : DefaultRestClient.this.messageConverters) {
if (messageConverter instanceof GenericHttpMessageConverter genericMessageConverter) {
if (genericMessageConverter.canWrite(bodyType, bodyClass, contentType)) {
logBody(body, contentType, genericMessageConverter);
genericMessageConverter.write(body, bodyType, contentType, clientRequest);
return;
}
}
if (messageConverter.canWrite(bodyClass, contentType)) {
logBody(body, contentType, messageConverter);
messageConverter.write(body, contentType, clientRequest);
return;
}
}
// Omitted code
}
As shown, we've only implemented simple printing without creating a message converter. Since our business system doesn't need to present this information to customers, we consider the current implementation sufficient to meet our needs.
The printed reasoning path information looks like this:
org.springframework.web.client.DefaultRestClient [453] -| Writing [ChatCompletionRequest[messages=[ChatCompletionMessage[... role=SYSTEM, name=null, toolCallId=null, toolCalls=null, refusal=null], ChatCompletionMessage[rawContent=What's the weather like in Changchun?, role=USER, name=null, toolCallId=null, toolCalls=null, refusal=null], ChatCompletionMessage[rawContent=Use the 'CurrentWeather' function to get weather information for Changchun. The user wants to know the current weather in Changchun. The user's request is about getting weather information for a specific location, which matches the 'CurrentWeather' function provided by the tool. ... (remaining content omitted)
Plugin Configuration
I previously explained in detail how to create custom plugins for agents in my videos. In this practice, we'll continue using the Baidu weather plugin to get real-time weather information. However, unlike before, we'll integrate this functionality into our Spring AI project.
Database Configuration
Each business system typically has its own database to better serve users. For demonstration purposes, we'll create a MySQL example that retrieves area code values and passes them to the API for calling. Through plugins, you can perform various database operations, but here we'll primarily focus on query demonstrations.
For this example, I'll continue using a Tencent Cloud Lightweight Application Server to host a standalone MySQL environment. Once the environment is up, we'll proceed with the subsequent operations. Please make sure all necessary configuration is completed beforehand so that the database queries and API calls go smoothly.
The relevant POM file dependencies are as follows:
<dependency>
<groupId>mysql</groupId>
<artifactId>mysql-connector-java</artifactId>
<!-- 8.x provides the com.mysql.cj.jdbc.Driver class configured below; 5.1.x only ships com.mysql.jdbc.Driver -->
<version>8.0.33</version>
</dependency>
<dependency>
<groupId>com.baomidou</groupId>
<!-- MyBatis-Plus starter (inferred from the QueryWrapper/Mapper usage below); pick the version matching your Spring Boot setup -->
<artifactId>mybatis-plus-boot-starter</artifactId>
</dependency>
The database connection configuration information is as follows:
spring.datasource.url=jdbc:mysql://ip:3306/agent?useSSL=false&serverTimezone=UTC&useUnicode=true&characterEncoding=utf-8
spring.datasource.username=agent
spring.datasource.password=password
spring.datasource.driver-class-name=com.mysql.cj.jdbc.Driver
Data Import
I've successfully completed importing data provided by Baidu Maps, as shown in the figure below:
Database Operations
Next, you can perform database operations directly inside the plugin class. The SearchHttpAK entity class can be copied as-is from the Java SDK provided by Baidu Maps, so I won't cover it further. Also note that areaInfoPOMapper must be registered as a Bean in the configuration class before it can be injected and used here.
@Slf4j // Lombok logger, assumed here since `log` is used below
public class WeatherDataService implements Function<WeatherRequest, WeatherResponse> {
private final AreaInfoMapper areaInfoMapper;
public WeatherDataService(AreaInfoMapper areaInfoMapper) {
this.areaInfoMapper = areaInfoMapper;
}
@JsonClassDescription("location: City address, for example: Changchun")
public record WeatherRequest(String location) {}
public record WeatherResponse(String weather) {}
public WeatherResponse apply(WeatherRequest request) {
SearchHttpAK snCal = new SearchHttpAK();
Map<String, String> params = new LinkedHashMap<>();
QueryWrapper<AreaInfoPO> queryWrapper = new QueryWrapper<>();
queryWrapper.like("city", request.location());
List<AreaInfoPO> areaInfoPOS = areaInfoMapper.selectList(queryWrapper);
String result = "";
try {
params.put("district_id", areaInfoPOS.get(0).getCityGeocode());
result = "Weather information has been retrieved. Please organize this information and respond to the user in a clear and understandable way: " + snCal.requestGetAKForPlugins(params);
log.info("result:{}", result);
} catch (Exception e) {
// Use this return with caution as it may cause infinite tool chain calls
// So please set the number of times or return an error directly
// result = "This call failed, please retry CurrentWeather!";
result = "The call failed this time!";
}
return new WeatherResponse(result);
}
}
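As a sketch of that wiring (a hedged example, not from the original project—the configuration class name, method name, and description text are my own assumptions; the bean name must match the string later passed to `.functions("CurrentWeather")`):

```java
@Configuration
public class AgentFunctionConfig {

    // Register the plugin as a function bean. Spring AI resolves the name passed
    // to .functions(...) against bean names, so the bean is named "CurrentWeather".
    @Bean("CurrentWeather")
    @Description("Get the current weather for a given Chinese city")
    public Function<WeatherDataService.WeatherRequest, WeatherDataService.WeatherResponse>
            currentWeather(AreaInfoMapper areaInfoMapper) {
        // areaInfoMapper is injected by Spring and handed to the plugin's constructor
        return new WeatherDataService(areaInfoMapper);
    }
}
```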
Whether the call succeeds or fails, avoid letting the large model decide on its own to invoke the tool again: that can send the program into an infinite loop of tool calls, affecting system stability and reliability. Make sure the flow is explicitly controlled once the operation ends so this situation cannot occur.
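To make that concrete, here is a minimal, framework-free sketch of a per-request call budget (the class and the limit are my own invention, not part of Spring AI): once the budget is exhausted, the guard returns a terminal message instead of one that invites the model to retry.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class ToolCallGuard {
    private static final String TERMINAL = "The call failed this time!";
    private final int maxCalls;
    private final AtomicInteger calls = new AtomicInteger();

    public ToolCallGuard(int maxCalls) {
        this.maxCalls = maxCalls;
    }

    // Runs the tool body at most maxCalls times per request; afterwards it
    // always returns a terminal message so the model has nothing to retry on.
    public String call(Supplier<String> body) {
        if (calls.incrementAndGet() > maxCalls) {
            return TERMINAL;
        }
        try {
            return body.get();
        } catch (Exception e) {
            return TERMINAL; // never answer "please retry": that can loop forever
        }
    }
}
```

A fresh guard would be created per user request and wrapped around the body of `apply(...)`, so even a flaky downstream API can't trap the model in a retry loop.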
Plugin Invocation
Through this approach, when we ask about Changchun's weather again, the large model can effectively use the data returned by the plugin to answer our question accurately and promptly.
As mentioned earlier, the Pro model is highly sensitive to system prompts. However, it's important to note that it doesn't directly optimize returned callback results.
To ensure the system's response meets expectations, it's recommended to use system prompts again for restriction and guidance. Through clear prompts, we can better control the model's output.
Please format the data returned by the tool and respond to the user's question in a friendly manner.
After optimization, the returned results are normal:
Workflow Configuration
I won't demonstrate workflows in Spring AI here. In fact, the business logic written in some of our plugins essentially constitutes a workflow logic framework. Next, I'll focus on explaining how to use third-party workflow tools to quickly meet business requirements.
Integrating Third-Party Workflows
When considering using Spring AI to implement agent functionality, we shouldn't easily abandon third-party visualization platforms. Integrating these third-party workflows can help us quickly achieve desired functionality, especially during development. Writing Java code is often tedious and complex, and a simple requirement might involve creating and maintaining multiple entity classes. In contrast, implementing simple business logic through third-party workflows can undoubtedly improve our development efficiency and reduce unnecessary workload.
Taking the Coze agent platform as an example, we can first focus on writing an efficient workflow. The main goal of this workflow is to provide users with comprehensive query services, including travel flights, train schedules, hotel reservations, and other information.
After applying for an API key, we'll proceed with integration work and carefully study the development documentation to ensure smooth implementation of required functionality.
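To give a feel for the integration, here is a sketch of building the workflow-run request body by hand, with no JSON library. The field names (`workflow_id`, `parameters`) reflect my reading of Coze's workflow-run API and should be verified against the current documentation; the sending step is only described in a comment.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CozePayload {

    // Builds the JSON body for a Coze workflow run. In a real call this body
    // would be POSTed with an "Authorization: Bearer <api-key>" header, e.g.
    // via java.net.http.HttpClient or okhttp.
    public static String buildBody(String workflowId, Map<String, String> parameters) {
        StringBuilder sb = new StringBuilder();
        sb.append("{\"workflow_id\":\"").append(workflowId).append("\",\"parameters\":{");
        boolean first = true;
        for (Map.Entry<String, String> e : parameters.entrySet()) {
            if (!first) sb.append(',');
            sb.append('\"').append(e.getKey()).append("\":\"").append(e.getValue()).append('\"');
            first = false;
        }
        return sb.append("}}").toString();
    }
}
```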
Workflow Plugins
Based on the above information, we can encapsulate workflow calls as plugins. In fact, for agent platforms, workflows and plugins essentially exist in the form of function calls, so converting workflows to plugins is relatively simple and direct.
public class TravelService implements Function<TravelRequest, TravelResponse> {
@JsonClassDescription("dep_city: Departure city address, for example Changchun; arr_city: Destination city, for example Beijing")
public record TravelRequest(String dep_city, String arr_city) {}
public record TravelResponse(String travel_info) {}
public TravelResponse apply(TravelRequest request) {
CozeWorkflow<TravelRequest> cozeWorkflow = new CozeWorkflow<>();
Map<String, String> params = new LinkedHashMap<>();
String result = "";
try {
// I've encapsulated the HTTP call here
result = cozeWorkflow.getCoze("7423018070586064915", request);
log.info("result:{}", result);
} catch (Exception e) {
result = "The call failed this time!";
}
return new TravelResponse(result);
}
}
Since our TravelRequest uses Java's record feature (previewed in Java 14, finalized in Java 16), which older versions of Fastjson can't serialize or deserialize, we must use a recent Fastjson version in the project. Incompatible older versions will cause the functionality to fail or not execute properly.
<dependency>
<groupId>com.alibaba</groupId>
<artifactId>fastjson</artifactId>
<version>2.0.31</version>
</dependency>
After configuration, if the Coze plugin runs normally, we can begin providing corresponding answers for the Hunyuan large model.
Workflow Invocation
We've successfully integrated this plugin into the request processing flow, with the specific implementation code as follows:
// Omitted duplicate code
.functions("CurrentWeather","TravelPlanning")
.call()
.content();
// Omitted duplicate code
Since the returned information is quite lengthy, the Hunyuan large model's response time typically extends significantly. In this case, our regular API calls might timeout, preventing successful acquisition of expected results. The specific error message is as follows:
I/O error on POST request for "https://api.hunyuan.cloud.tencent.com/v1/chat/completions": timeout
Retry Template Timeout Fix
We need to readjust our configuration. Initially I suspected the retryTemplate configuration mentioned earlier, but closer inspection showed that retryTemplate only governs retry behavior and has nothing to do with timeouts. Digging further into the source code, I found that the changes belong in the RestClientAutoConfiguration class.
Notably, the RestClientAutoConfiguration class provides options for customized configuration, allowing us to control request behavior more finely. Here's a source code example of this class, showing what specific adjustments we can make:
@Bean
@ConditionalOnMissingBean
RestClientBuilderConfigurer restClientBuilderConfigurer(ObjectProvider<RestClientCustomizer> customizerProvider) {
RestClientBuilderConfigurer configurer = new RestClientBuilderConfigurer();
configurer.setRestClientCustomizers(customizerProvider.orderedStream().toList());
return configurer;
}
@Bean
@Scope("prototype")
@ConditionalOnMissingBean
RestClient.Builder restClientBuilder(RestClientBuilderConfigurer restClientBuilderConfigurer) {
RestClient.Builder builder = RestClient.builder()
.requestFactory(ClientHttpRequestFactories.get(ClientHttpRequestFactorySettings.DEFAULTS));
return restClientBuilderConfigurer.configure(builder);
}
Therefore, we need to modify restClientBuilder. By default it is built from ClientHttpRequestFactorySettings.DEFAULTS, whose values are all null, so the underlying HTTP client's own defaults apply. ClientHttpRequestFactories picks whichever client library it finds on the classpath—in our case the okhttp client pulled in by the Coze workflow integration—so the timeout behavior ends up following okhttp's configuration.
To solve this problem, we can directly adjust ClientHttpRequestFactorySettings to set our desired timeout. The specific configuration adjustments are as follows:
@Bean
RestClient.Builder restClientBuilder(RestClientBuilderConfigurer restClientBuilderConfigurer) {
ClientHttpRequestFactorySettings defaultConfigurer = ClientHttpRequestFactorySettings.DEFAULTS
.withReadTimeout(Duration.ofMinutes(5))
.withConnectTimeout(Duration.ofSeconds(30));
RestClient.Builder builder = RestClient.builder()
.requestFactory(ClientHttpRequestFactories.get(defaultConfigurer));
return restClientBuilderConfigurer.configure(builder);
}
Note that the messageConverters involved in the reasoning-path logging earlier are also configured through this builder, so if you have specific needs you can customize them completely. The key code section below calls our custom method to implement that customized logic.
If you want to set other personalized configurations or information, you can refer to the following example for adjustments:
public RestClient.Builder configure(RestClient.Builder builder) {
applyCustomizers(builder);
return builder;
}
private void applyCustomizers(Builder builder) {
if (this.customizers != null) {
for (RestClientCustomizer customizer : this.customizers) {
customizer.customize(builder);
}
}
}
At this point, after a series of adjustments and configurations, we've successfully resolved the timeout issue. This means that when calling the Hunyuan model, we can now smoothly obtain returned results.
Private Knowledge Base
Since knowledge bases are a common and important feature of agents, we'll implement this part as well. Notably, Hunyuan's API also exposes an OpenAI-compatible embedding (vector) endpoint, which means we can directly build a knowledge base to enhance our agent's capabilities. With this implementation we not only avoid platform usage limits but also gain a high degree of customization to meet specific business needs.
More importantly, this design gives us complete control over our knowledge base usage—you don't need to worry about data leakage issues.
Vector Database Configuration
Next, we'll continue integrating Milvus, a vector database we've used before. Although Tencent Cloud also offers its own vector database solution, it hasn't yet been integrated into Spring AI. For demonstration and development purposes, we've decided to first use Milvus as our vector database.
To successfully complete this integration, we need to configure the corresponding dependencies as follows:
<dependency>
<groupId>org.springframework.ai</groupId>
<artifactId>spring-ai-milvus-store-spring-boot-starter</artifactId>
</dependency>
The required configuration file is as follows:
# Configure Milvus client host address
spring.ai.vectorstore.milvus.client.host=
# Configure Milvus client port
spring.ai.vectorstore.milvus.client.port=19530
# Configure Milvus database name
spring.ai.vectorstore.milvus.databaseName=
# Configure Milvus collection name
spring.ai.vectorstore.milvus.collectionName=
# Create the collection schema automatically if it doesn't exist (default is false)
spring.ai.vectorstore.milvus.initialize-schema=true
# Configure vector embedding dimensions
spring.ai.vectorstore.milvus.embeddingDimension=1024
# Configure index type
spring.ai.vectorstore.milvus.indexType=IVF_FLAT
# Configure distance metric type
spring.ai.vectorstore.milvus.metricType=COSINE
Tencent Hunyuan's embedding interface currently only supports input and model parameters, with model fixed as hunyuan-embedding and dimensions fixed at 1024.
spring.ai.openai.embedding.base-url=https://api.hunyuan.cloud.tencent.com
spring.ai.openai.embedding.options.model=hunyuan-embedding
spring.ai.openai.embedding.options.dimensions=1024
Here, we continue to use the Hunyuan large model API key we applied for, so no additional configuration is needed. It's worth emphasizing that proper configuration of these parameters is crucial. If not properly set, it will cause errors when the system makes calls.
Basic Operations
Most agent platforms fully open up their knowledge bases, allowing users to freely view, add, modify, and delete entries. Next, we'll demonstrate how to perform these operations:
@GetMapping("/ai/embedding")
public Map<String, EmbeddingResponse> embed(@RequestParam(value = "message", defaultValue = "Tell me a joke") String message) {
EmbeddingResponse embeddingResponse = this.embeddingModel.embedForResponse(List.of(message));
return Map.of("embedding", embeddingResponse);
}
@GetMapping("/ai/addKnowledge")
public boolean addKnowledge(@RequestParam(value = "meta-message") String message, @RequestParam(value = "vector-content") String content) {
String uuid = UUID.randomUUID().toString();
DocumentInfoPO documentInfoPO = new DocumentInfoPO();
documentInfoPO.setVectorId(uuid);
documentInfoPO.setMetaMessage(message);
documentInfoPO.setVectorContent(content);
documentInfoPOMapper.insert(documentInfoPO);
List<Document> documents = List.of(
new Document(uuid, content, Map.of("text", message)));
vectorStore.add(documents);
return true;
}
@GetMapping("/ai/selectKnowledge")
public List<Document> selectKnowledge(@RequestParam(value = "vector-content") String content) {
List<Document> result = vectorStore.similaritySearch(SearchRequest.query(content).withTopK(5).withSimilarityThreshold(0.9));
return result;
}
@GetMapping("/ai/deleteKnowledge")
public Boolean deleteKnowledge(@RequestParam(value = "vector-id") String id) {
Optional<Boolean> delete = vectorStore.delete(List.of(id));
return delete.get();
}
With that, the basic add, delete, and query operations are complete. The reason third-party agent platforms can offer a "modify" operation is that under the hood the data is deleted and reinserted—an unavoidable implementation, since everyone needs to update entries. Also note that vector databases generally don't support listing all data by default; that limitation is why we mirror records into a relational table, keeping the data browsable and manageable.
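The delete-then-reinsert update pattern can be sketched against a toy store (purely illustrative; the real code would call `vectorStore.delete` and `vectorStore.add` as shown in the endpoints above):

```java
import java.util.HashMap;
import java.util.Map;

// Toy illustration only: vector stores have no in-place update, so "modify"
// means deleting the old document and inserting a new one under the same id.
public class ToyVectorStore {
    private final Map<String, String> docs = new HashMap<>();

    public void add(String id, String content) { docs.put(id, content); }

    public boolean delete(String id) { return docs.remove(id) != null; }

    // Update = delete + reinsert, mirroring what agent platforms do internally.
    public void update(String id, String newContent) {
        delete(id);
        add(id, newContent);
    }

    public String get(String id) { return docs.get(id); }
}
```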
To better validate the effectiveness of this process, I called the interface in advance and uploaded some knowledge base data. Next, I'll demonstrate the query effect of this data.
This is the knowledge base information I just uploaded. To improve efficiency, I'll now directly demonstrate the application of the knowledge base's RAG (Retrieval-Augmented Generation) retrieval functionality in our agent.
Automatic Invocation
From what I've observed, agent platforms implement knowledge base retrieval in one of two ways: automatic invocation or on-demand invocation. Most platforms rely on automatic invocation, unless the lookup is written into a workflow (i.e., as one of our functions)—I won't cover that case here, since it works just like the plugins above. Today, I'll focus on how automatic invocation is implemented.
Automatic knowledge base invocation relies on the Advisor interface, specifically by constructing an additional prompt before each request. Currently, Spring AI has implemented long-term memory functionality, with its specific class being VectorStoreChatMemoryAdvisor. Therefore, we can directly reference this class's implementation approach to build a knowledge base automatic invocation system that meets our needs.
We can implement this as follows. Since our main goal is to provide reference information to the large model to better understand context, we can directly ignore the response enhancement part. This means we don't need to perform additional processing or optimization on the response content during this process. Here's the specific code example:
public class KnowledgePromptAdvisor implements RequestResponseAdvisor {
private final VectorStore vectorStore;
private static final String userTextAdvice = """
Please use the following reference information to answer the question. If no reference information is available, please answer directly.
---------------------
Reference information:
{memory}
---------------------
""";
public KnowledgePromptAdvisor(VectorStore vectorStore) {
this.vectorStore = vectorStore;
}
@Override
public AdvisedRequest adviseRequest(AdvisedRequest request, Map<String, Object> context) {
// 1. Add a knowledge base prompt
String advisedSystemText = request.userText() + System.lineSeparator() + this.userTextAdvice;
List<Document> documents = vectorStore.similaritySearch(request.userText());
// 2. Concatenate knowledge base data
String data = documents.stream().map(d -> d.getMetadata().get("text").toString()).collect(Collectors.joining(","));
Map<String, Object> advisedParams = new HashMap<>(request.userParams());
advisedParams.put("memory", data);
// 3. Assign prompt parameters
AdvisedRequest advisedRequest = AdvisedRequest.from(request)
.withSystemText(advisedSystemText)
.withSystemParams(advisedParams) // Knowledge base RAG retrieval data
.build();
return advisedRequest;
}
@Override
public ChatResponse adviseResponse(ChatResponse chatResponse, Map<String, Object> context) {
// No need to modify anything
return chatResponse;
}
@Override
public Flux<ChatResponse> adviseResponse(Flux<ChatResponse> fluxChatResponse, Map<String, Object> context) {
// No need to modify anything
return fluxChatResponse;
}
}
You need to pass the same VectorStore instance through constructor injection in the configuration class:
@Bean
KnowledgePromptAdvisor knowledgePromptAdvisor(VectorStore vectorStore) {
return new KnowledgePromptAdvisor(vectorStore);
}
Next, we just need to add the corresponding code or configuration in the request method to integrate the new functionality:
// Omitted duplicate code
.advisors(messageChatMemoryAdvisor, knowledgePromptAdvisor)
.functions("CurrentWeather","TravelPlanning")
.call()
.content();
// Omitted duplicate code
This is precisely the significant effect brought by automatic invocation—all operations are completely encapsulated, clear, and easy to understand.
Next, let's look at the second on-demand invocation approach, which is implemented using plugins (i.e., function callbacks). In this mode, the system can dynamically call corresponding plugins based on actual needs, providing flexible and efficient functionality support. We've already demonstrated two related plugins earlier, so we won't elaborate further here.
Production Deployment
I've decided not to deploy this separately on a server but instead expose the interface through local startup. Additionally, I've created a standalone page. Since this isn't the focus of this chapter, I won't explain frontend knowledge in detail.
To better showcase these contents, I've provided a demonstration video for your reference:
Pros and Cons Analysis
First, let's discuss some notable advantages of current major agent platforms:
- Visual Operations: These platforms provide intuitive visual interfaces, enabling even beginners to quickly develop agents suitable for their business needs, better meeting their specific requirements.
- Diverse Publishing Channels: Many platforms support multiple publishing channels, such as official accounts, which is very friendly to newcomers. In contrast, configuring server backends often requires professional knowledge, and these platforms significantly lower the entry barrier.
- Rich Plugin Stores: Regardless of the agent platform, plugin diversity is crucial. These platforms typically offer various plugins created officially and by developers, helping users extend functionality to meet different needs.
- Diverse Workflows: Workflow functionality is actually similar to plugins, just with different names. For external systems, these workflows are all integrated through API interfaces, enhancing interoperability and flexibility between systems.
Everything in the world has flaws, and agents are no exception. Even powerful platforms like Coze have some shortcomings. The following points are particularly evident:
- Function Exception Handling: When agents experience functional exceptions, even if you submit a ticket, customer service and technical personnel often resolve problems slowly. In this situation, you can only wait helplessly, unable to determine where the problem lies. If it's just an individual user's issue, you might not even receive feedback scheduling. With self-developed agents, when encountering errors, you can quickly locate problems, and regardless of requirements, you can fix and release new versions at any time.
- Knowledge Base Storage Limitations: Since these agents cater to a wide range of users, knowledge base storage quotas are often limited and may start charging in the future. Coze has gradually introduced different pricing standards with various fee structures that can be confusing. In this case, maintaining your own server is undoubtedly more cost-effective. Additionally, major cloud service providers and domestic database vendors are all promoting vector databases, often with preferential policies that are very attractive.
- Knowledge Base Material Optimization: Knowledge base management varies across major agent platforms, requiring users to spend time adapting to their operation methods. The advantage of maintaining your own vector database is that all additional metadata information can be freely configured, enabling information filtering according to specific business needs, better aligning with your own business standards. This flexibility is something other agent platforms cannot provide.
- Uncontrollable Costs: For enterprises, managing various costs controllably is crucial. However, agent platform charges often become uncontrollable as traffic increases, potentially leading to arbitrary billing that puts enterprises in a passive position. In contrast, when developing agents independently, you can freely switch models, with costs under your control—both server and large model expenses can be effectively managed.
- Limited Selection: Agent platforms are typically tied to their own companies, limiting user freedom. One day, a platform might decide to no longer support a particular large model, requiring all related workflows to be changed, as different large models have significant differences in response capabilities, forcing users to readapt.
- And so on...
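To make the metadata point above concrete, here is a minimal plain-Java sketch (all names are hypothetical, not Spring AI APIs) of the kind of free-form metadata filtering a self-hosted knowledge base allows: each chunk carries arbitrary key/value metadata, and retrieval can be restricted by any business-specific criterion before results reach the model.

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch: a knowledge-base chunk with free-form metadata.
record Chunk(String text, Map<String, String> metadata) {}

public class MetadataFilterDemo {

    // Keep only chunks whose metadata matches the given key/value,
    // e.g. restricting retrieval to one department's documents.
    static List<Chunk> filterBy(List<Chunk> chunks, String key, String value) {
        return chunks.stream()
                .filter(c -> value.equals(c.metadata().get(key)))
                .toList();
    }

    public static void main(String[] args) {
        List<Chunk> kb = List.of(
                new Chunk("Refund policy ...", Map.of("dept", "sales", "lang", "en")),
                new Chunk("VPN setup ...",     Map.of("dept", "it",    "lang", "en")));

        // Filter down to IT documents before handing them to the model.
        List<Chunk> hits = filterBy(kb, "dept", "it");
        System.out.println(hits.size() + " match: " + hits.get(0).text());
    }
}
```

A hosted platform fixes its metadata schema for everyone; here the `metadata` map and the filter predicate are entirely yours to define.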
Having said all this, it doesn't mean Spring AI will completely replace agent platforms in the future. After all, smaller clients typically lack the development and maintenance personnel to manage code. The likelier future is that the two complement each other: the development speed and capabilities of agent platforms can cover roughly 80% of business needs, consistent with the 80/20 rule large companies operate by, while the remaining 20% may need to be addressed by companies building their own agents, a proportion that could well grow.
Therefore, mastering the relevant technology is the most crucial factor for enterprises in this transformation. Technical capability makes an enterprise more flexible when choosing and using agent platforms, enabling customization and optimization for its specific needs. At the same time, I hope Hunyuan's large models will soon be officially supported in the Spring AI family, giving users even more choice and flexibility.
Conclusion
Today, we explored in depth the practical application of Spring AI in agent construction, particularly its value and effectiveness in enterprise environments. By gradually implementing a locally deployable agent solution, we not only demonstrated Spring AI's flexibility and ease of use but also highlighted its potential for promoting deep integration of AI technology with business operations.
The core value of agents lies in handling complex business requirements efficiently, and that ability rests on sound architectural design and technology selection. With Spring AI, we can flexibly call different APIs, whether the domestic Hunyuan API or other mainstream AI interfaces, and developers can switch between them quickly within a project, keeping the system maintainable and extensible. This not only improves development efficiency but also lets enterprises respond quickly to changing market demands by adjusting their technical route.
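In practice, switching providers in Spring AI is largely a configuration change. A sketch of what this might look like using Spring AI's OpenAI-compatible starter properties (the Hunyuan endpoint and model name below are placeholders, not verified values):

```yaml
spring:
  ai:
    openai:
      # Point the OpenAI-compatible client at a different provider,
      # e.g. Hunyuan, by swapping only base-url and api-key.
      base-url: https://api.hunyuan.example    # placeholder endpoint
      api-key: ${HUNYUAN_API_KEY}              # keep secrets out of source control
      chat:
        options:
          model: hunyuan-model-name            # placeholder model identifier
```

Because the application code talks to Spring AI's abstractions rather than a vendor SDK, this is the only place that changes when migrating between compatible providers.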
The personalized configuration and plugin calling covered along the way fully demonstrated how to combine traditional development models with modern AI technology. Through custom plugins and workflows, enterprises can design agents tailored to specific business needs, improving service quality and customer satisfaction. In a weather-query scenario, for example, the agent can not only fetch real-time data through an API but also combine it with information from its own database to deliver precise, personalized service. This deep functional integration simplifies the user's workflow and improves system response speed.
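The weather example above can be sketched as a plain-Java tool in the `java.util.function.Function` style that Spring AI's function calling builds on. Everything here is illustrative: the live API call is stubbed out, and a small map stands in for the business database.

```java
import java.util.Map;
import java.util.function.Function;

public class WeatherTool {
    record WeatherRequest(String city) {}
    record WeatherResponse(String city, String forecast, String note) {}

    // Stand-in for a real-time weather API call (stubbed for the sketch).
    static String fetchForecast(String city) {
        return "sunny, 24°C";
    }

    // Stand-in for business data the agent merges into the answer,
    // e.g. branch opening hours per city.
    static final Map<String, String> CITY_NOTES =
            Map.of("Shenzhen", "Our Shenzhen branch is open 9:00-18:00.");

    // The tool itself: combine live API data with database information.
    static final Function<WeatherRequest, WeatherResponse> WEATHER_FUNCTION = req ->
            new WeatherResponse(
                    req.city(),
                    fetchForecast(req.city()),
                    CITY_NOTES.getOrDefault(req.city(), "No branch info available."));

    public static void main(String[] args) {
        WeatherResponse r = WEATHER_FUNCTION.apply(new WeatherRequest("Shenzhen"));
        System.out.println(r);
    }
}
```

In a real Spring AI project this `Function` would be registered as a bean so the model can invoke it during a conversation; the merging of API data with internal records is the part that makes the agent's answer business-specific.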
Additionally, we covered the integration of private knowledge bases, emphasizing data security and autonomy. With vector databases like Milvus, enterprises can efficiently manage massive amounts of data while enhancing agent intelligence through embedding technology. This gives enterprises a more solid guarantee of information security and intellectual property protection, which is especially important given how rapidly information now moves.
In summary, this article is not just a walkthrough of the Spring AI agent construction process but also a deeper reflection on how enterprises can use this technology effectively to achieve business upgrades and transformation. Through our discussion, I hope to offer new perspectives and insights for your own agent development, helping you walk more steadily on the road to AI.