langchain4j-8 (AI Services)


Published 2025-04-14 17:42:31 · 521 views

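AI Services

So far, we have covered low-level components such as ChatLanguageModel, ChatMessage, and ChatMemory. Working at this level is very flexible and gives you total freedom, but it also forces you to write a lot of boilerplate code. LLM-powered applications usually need not one component but several working together (e.g., prompt templates, chat memory, LLMs, output parsers, and RAG components such as embedding models and stores), and they often involve multiple interactions, so orchestrating them all becomes even more cumbersome.

We want you to focus on business logic, not on low-level implementation details. For that reason, LangChain4j currently offers two high-level concepts that can help: AI Services and Chains.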

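Chains (legacy)

The concept of Chains originates from Python's LangChain (before the introduction of LCEL). The idea is to have a Chain for each common use case, such as a chatbot or RAG. Chains combine multiple low-level components and orchestrate the interactions between them. Their main problem is that they are too rigid if you need to customize something. LangChain4j implements only two Chains (ConversationalChain and ConversationalRetrievalChain), and we do not plan to add more at this time.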

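AI Services

We propose another solution called AI Services, tailored for Java. The idea is to hide the complexity of interacting with LLMs and other components behind a simple API.

This approach is very similar to Spring Data JPA or Retrofit: you declaratively define an interface with the desired API, and LangChain4j provides an object (a proxy) that implements this interface. You can think of an AI Service as a component of the service layer in your application. It provides AI services, hence the name.

AI Services handle the most common operations:

Formatting inputs for the LLM
Parsing outputs from the LLM

They also support more advanced features:

Chat memory
Tools
RAG

AI Services can be used to build stateful chatbots that support back-and-forth interactions, as well as to automate processes in which each call to the LLM is isolated.
Let's take a look at the simplest possible AI Service. After that, we will explore more complex examples.

Simplest AI Service

First, we define an interface with a single method, chat, which takes a String as input and returns a String.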

interface Assistant {
    String chat(String userMessage);
}

Then, we create the low-level components. These components will be used under the hood of our AI Service. In this case, we only need the ChatLanguageModel:

ChatLanguageModel model = OpenAiChatModel.builder()
    .apiKey(System.getenv("OPENAI_API_KEY"))
    .modelName(GPT_4_O_MINI)
    .build();

Finally, we can use the AiServices class to create an instance of our AI Service:

Assistant assistant = AiServices.create(Assistant.class, model);

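Note: In Quarkus and Spring Boot applications, the autoconfiguration takes care of creating the Assistant bean. This means you do not need to call AiServices.create(...); you can simply inject/autowire the Assistant wherever it is needed.

Now we can use the Assistant: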

String answer = assistant.chat("Hello");
System.out.println(answer); // Hello, how can I help you?

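How does it work?

You provide the Class of your interface to AiServices along with the low-level components, and AiServices creates a proxy object that implements this interface. Currently, it uses reflection, but we are considering alternatives as well. This proxy object handles all the conversions of inputs and outputs. In this case, the input is a single String, but we are using a ChatLanguageModel, which takes ChatMessage as input. So the AI Service automatically converts the String into a UserMessage and invokes the ChatLanguageModel. Since the return type of the chat method is String, the AiMessage returned by the ChatLanguageModel is converted into a String before it is returned from the chat method.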
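
To make the proxy mechanism more concrete, here is a minimal sketch built on the JDK's java.lang.reflect.Proxy. It is not LangChain4j's actual implementation: AssistantSketch, ProxySketch and the fakeModel function are hypothetical names introduced for illustration, and the fake model simply echoes its input instead of calling an LLM.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.function.Function;

// Hypothetical stand-in for the Assistant interface from the text.
interface AssistantSketch {
    String chat(String userMessage);
}

class ProxySketch {

    // Creates a runtime implementation of serviceInterface.
    // fakeModel is an assumption: a placeholder for a real chat model call.
    static <T> T create(Class<T> serviceInterface, Function<String, String> fakeModel) {
        InvocationHandler handler = (proxy, method, args) -> {
            // Methods inherited from Object are answered locally, not by the model.
            if (method.getDeclaringClass() == Object.class) {
                switch (method.getName()) {
                    case "toString": return "ProxySketch[" + serviceInterface.getSimpleName() + "]";
                    case "hashCode": return System.identityHashCode(proxy);
                    default:         return proxy == args[0]; // equals
                }
            }
            // Input conversion: the single String argument plays the role of the user message.
            String userText = (String) args[0];
            // "Model call", then output conversion back to the method's String return type.
            return fakeModel.apply(userText);
        };
        Object instance = Proxy.newProxyInstance(
                serviceInterface.getClassLoader(),
                new Class<?>[] {serviceInterface},
                handler);
        return serviceInterface.cast(instance);
    }

    public static void main(String[] args) {
        AssistantSketch assistant = create(AssistantSketch.class, text -> "echo: " + text);
        System.out.println(assistant.chat("Hello")); // prints "echo: Hello"
    }
}
```

The point of the sketch is only that the caller sees a plain Java interface, while every method call is intercepted in one place where input and output conversion can happen.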

AI Services in Quarkus Application

The LangChain4j Quarkus extension greatly simplifies using AI Services in Quarkus applications.
More information can be found here: https://docs.quarkiverse.io/quarkus-langchain4j/dev/ai-services.html

AI Services in Spring Boot Application

The LangChain4j Spring Boot starter greatly simplifies using AI Services in Spring Boot applications.
https://docs.langchain4j.dev/tutorials/spring-boot-integration/#spring-boot-starter-for-declarative-ai-services

@SystemMessage

Now, let's look at a more complex example. We will force the LLM to reply using slang 😉
This is usually achieved by providing instructions in the SystemMessage.

interface Friend {

    @SystemMessage("You are a good friend of mine. Answer using slang.")
    String chat(String userMessage);
}

Friend friend = AiServices.create(Friend.class, model);

String answer = friend.chat("Hello"); // Hey! What's up?

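In this example, we have added the @SystemMessage annotation with the system prompt template we want to use. Behind the scenes, it is converted into a SystemMessage and sent to the LLM along with the UserMessage.

@SystemMessage can also load a prompt template from resources: @SystemMessage(fromResource = "my-prompt-template.txt")

System Message Provider

System messages can also be defined dynamically with the system message provider: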

Friend friend = AiServices.builder(Friend.class)
    .chatLanguageModel(model)
    .systemMessageProvider(chatMemoryId -> "You are a good friend of mine. Answer using slang.")
    .build();

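As you can see, you can provide different system messages based on the chat memory ID (user or conversation).

@UserMessage

Now, let's assume the model we use does not support system messages, or we simply want to use the UserMessage for this purpose.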

interface Friend {

    @UserMessage("You are a good friend of mine. Answer using slang. {{it}}")
    String chat(String userMessage);
}

Friend friend = AiServices.create(Friend.class, model);

String answer = friend.chat("Hello"); // Hey! What's shakin'?

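We have replaced the @SystemMessage annotation with @UserMessage and specified a prompt template containing the variable it, which refers to the only method argument.

It is also possible to annotate the String userMessage parameter with @V and assign a custom name to the prompt template variable: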

interface Friend {

    @UserMessage("You are a good friend of mine. Answer using slang. {{message}}")
    String chat(@V("message") String userMessage);

}

Note: @V is not needed when using LangChain4j with Quarkus or Spring Boot. The annotation is required only when the -parameters option is not enabled during Java compilation.

@UserMessage can also load a prompt template from resources: @UserMessage(fromResource = "my-prompt-template.txt")
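
As a rough illustration of how {{it}} and named variables such as {{message}} end up in the final prompt, the sketch below substitutes values into a template string. It is an assumption about the general technique, not the library's own template engine; TemplateSketch and render are hypothetical names.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TemplateSketch {

    // Matches {{name}}, allowing optional spaces inside the braces.
    private static final Pattern VARIABLE = Pattern.compile("\\{\\{\\s*(\\w+)\\s*\\}\\}");

    // Replaces every {{name}} in the template with the value bound to that name.
    static String render(String template, Map<String, String> variables) {
        Matcher matcher = VARIABLE.matcher(template);
        StringBuilder out = new StringBuilder();
        while (matcher.find()) {
            String name = matcher.group(1);
            String value = variables.get(name);
            if (value == null) {
                throw new IllegalArgumentException("No value for template variable: " + name);
            }
            // quoteReplacement keeps '$' and '\' in user text from being treated as regex syntax.
            matcher.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String template = "You are a good friend of mine. Answer using slang. {{it}}";
        System.out.println(render(template, Map.of("it", "Hello")));
    }
}
```

In this sketch, a single unnamed argument would be bound under the key "it", while a parameter annotated with @V("message") would be bound under "message".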

Examples of valid AI Service methods

Below are some examples of valid AI Service methods.

// UserMessage
String chat(String userMessage);

String chat(@UserMessage String userMessage);

String chat(@UserMessage String userMessage, @V("country") String country); // userMessage contains "{{country}}" template variable

@UserMessage("What is the capital of Germany?")
String chat();

@UserMessage("What is the capital of {{it}}?")
String chat(String country);

@UserMessage("What is the capital of {{country}}?")
String chat(@V("country") String country);

@UserMessage("What is the {{something}} of {{country}}?")
String chat(@V("something") String something, @V("country") String country);

@UserMessage("What is the capital of {{country}}?")
String chat(String country); // this works only in Quarkus and Spring Boot applications

// SystemMessage and UserMessage
@SystemMessage("Given a name of a country, answer with a name of its capital")
String chat(String userMessage);

@SystemMessage("Given a name of a country, answer with a name of its capital")
String chat(@UserMessage String userMessage);

@SystemMessage("Given a name of a country, {{answerInstructions}}")
String chat(@V("answerInstructions") String answerInstructions, @UserMessage String userMessage);

@SystemMessage("Given a name of a country, answer with a name of its capital")
String chat(@UserMessage String userMessage, @V("country") String country); // userMessage contains "{{country}}" template variable

@SystemMessage("Given a name of a country, {{answerInstructions}}")
String chat(@V("answerInstructions") String answerInstructions, @UserMessage String userMessage, @V("country") String country); // userMessage contains "{{country}}" template variable

@SystemMessage("Given a name of a country, answer with a name of its capital")
@UserMessage("Germany")
String chat();

@SystemMessage("Given a name of a country, {{answerInstructions}}")
@UserMessage("Germany")
String chat(@V("answerInstructions") String answerInstructions);

@SystemMessage("Given a name of a country, answer with a name of its capital")
@UserMessage("{{it}}")
String chat(String country);

@SystemMessage("Given a name of a country, answer with a name of its capital")
@UserMessage("{{country}}")
String chat(@V("country") String country);

@SystemMessage("Given a name of a country, {{answerInstructions}}")
@UserMessage("{{country}}")
String chat(@V("answerInstructions") String answerInstructions, @V("country") String country);

Multimodality

AI Services do not currently support multimodality; please use the low-level API for that.
https://docs.langchain4j.dev/tutorials/chat-and-language-models/#multimodality

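Return Types

An AI Service method can return one of the following types:

String - in this case, the LLM-generated output is returned without any processing/parsing
Any type supported by Structured Outputs - in this case, the AI Service parses the LLM-generated output into the desired type before returning it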
https://docs.langchain4j.dev/tutorials/structured-outputs/#supported-types

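Any type can additionally be wrapped into a Result to obtain extra metadata about the AI Service invocation:

TokenUsage - the total number of tokens used during the AI Service invocation. If the AI Service made multiple calls to the LLM (e.g., because tools were executed), the token usage of all calls is summed.
Sources - the contents retrieved during RAG retrieval
Executed tools
FinishReason

An example: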

interface Assistant {
    
    @UserMessage("Generate an outline for the article on the following topic: {{it}}")
    Result<List<String>> generateOutlineFor(String topic);
}

Result<List<String>> result = assistant.generateOutlineFor("Java");

List<String> outline = result.content();
TokenUsage tokenUsage = result.tokenUsage();
List<Content> sources = result.sources();
List<ToolExecution> toolExecutions = result.toolExecutions();
FinishReason finishReason = result.finishReason();

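Structured Outputs

If you want to receive structured output from the LLM (e.g., a complex Java object, as opposed to unstructured text in a String), you can change the return type of your AI Service method from String to something else.

boolean as return type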

interface SentimentAnalyzer {

    @UserMessage("Does {{it}} have a positive sentiment?")
    boolean isPositive(String text);

}

SentimentAnalyzer sentimentAnalyzer = AiServices.create(SentimentAnalyzer.class, model);

boolean positive = sentimentAnalyzer.isPositive("It's wonderful!");
// true

Enum as return type

enum Priority {
    CRITICAL, HIGH, LOW
}

interface PriorityAnalyzer {
    
    @UserMessage("Analyze the priority of the following issue: {{it}}")
    Priority analyzePriority(String issueDescription);
}

PriorityAnalyzer priorityAnalyzer = AiServices.create(PriorityAnalyzer.class, model);

Priority priority = priorityAnalyzer.analyzePriority("The main payment gateway is down, and customers cannot process transactions.");
// CRITICAL
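
The source does not show how the raw LLM text is turned into an enum constant. The sketch below is one plausible approach, stated as an assumption rather than the library's behavior: normalize the text and match it against the enum's constant names. EnumParsingSketch and parse are hypothetical names.

```java
import java.util.Locale;

class EnumParsingSketch {

    // Same constants as the Priority enum in the example above.
    enum Priority { CRITICAL, HIGH, LOW }

    // Maps raw model output such as " critical\n" onto a constant of the given enum type.
    static <E extends Enum<E>> E parse(Class<E> enumType, String modelOutput) {
        String normalized = modelOutput.trim().toUpperCase(Locale.ROOT);
        for (E constant : enumType.getEnumConstants()) {
            if (constant.name().equals(normalized)) {
                return constant;
            }
        }
        // Failing loudly is safer than silently picking a default priority.
        throw new IllegalArgumentException("Cannot map model output to " + enumType.getSimpleName() + ": " + modelOutput);
    }

    public static void main(String[] args) {
        System.out.println(parse(Priority.class, " critical\n")); // prints "CRITICAL"
    }
}
```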

POJO as return type

class Person {

    @Description("first name of a person") // you can add an optional description to help an LLM have a better understanding
    String firstName;
    String lastName;
    LocalDate birthDate;
    Address address;
}

@Description("an address") // you can add an optional description to help an LLM have a better understanding
class Address {
    String street;
    Integer streetNumber;
    String city;
}

interface PersonExtractor {

    @UserMessage("Extract information about a person from {{it}}")
    Person extractPersonFrom(String text);
}

PersonExtractor personExtractor = AiServices.create(PersonExtractor.class, model);

String text = """
            In 1968, amidst the fading echoes of Independence Day,
            a child named John arrived under the calm evening sky.
            This newborn, bearing the surname Doe, marked the start of a new journey.
            He was welcomed into the world at 345 Whispering Pines Avenue
            a quaint street nestled in the heart of Springfield
            an abode that echoed with the gentle hum of suburban dreams and aspirations.
            """;

Person person = personExtractor.extractPersonFrom(text);

System.out.println(person); // Person { firstName = "John", lastName = "Doe", birthDate = 1968-07-04, address = Address { ... } }

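JSON mode

When extracting custom POJOs (actually JSON, which is then parsed into the POJO), it is recommended to enable "JSON mode" in the model configuration. This way, the LLM is forced to respond with valid JSON.

Note: JSON mode and tools/function calling are similar features, but they have different APIs and are used for different purposes.

JSON mode is useful when you always need a response from the LLM in a structured format (valid JSON). Additionally, no state/memory is usually required, so each interaction with the LLM is independent of the others. For instance, you might want to extract information from a text, such as the list of people mentioned in it, or convert a free-form product review into a structured form with fields such as String productName, Sentiment sentiment, List<String> claimedProblems, etc.

Tools/functions, on the other hand, are useful when the LLM should be able to perform some actions (e.g., look up a database, search the web, cancel a user's booking). In this case, a list of tools with their expected JSON schemas is provided to the LLM, and it autonomously decides whether to call any of them to satisfy the user's request.

Previously, function calling was often used for structured data extraction, but the JSON mode feature is now better suited for this purpose.

Here is how to enable JSON mode: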

// For newer models that support Structured Outputs (e.g., gpt-4o-mini, gpt-4o-2024-08-06):
OpenAiChatModel.builder()
    ...
    .supportedCapabilities(RESPONSE_FORMAT_JSON_SCHEMA)
    .strictJsonSchema(true)
    .build();
    
// For older models (e.g., gpt-3.5-turbo, gpt-4):
OpenAiChatModel.builder()
    ...
    .responseFormat("json_object")
    .build();

// For Azure OpenAI:
AzureOpenAiChatModel.builder()
    ...
    .responseFormat(new ChatCompletionsJsonResponseFormat())
    .build();

// For Vertex AI Gemini:
VertexAiGeminiChatModel.builder()
    ...
    .responseMimeType("application/json")
    .build();

// Or specify an explicit schema from a Java class:
VertexAiGeminiChatModel.builder()
    ...
    .responseSchema(SchemaHelper.fromClass(Person.class))
    .build();

// Or from a JSON schema:
VertexAiGeminiChatModel.builder()
    ...
    .responseSchema(Schema.builder()...build())
    .build();

// For Google AI Gemini:
GoogleAiGeminiChatModel.builder()
    ...
    .responseFormat(ResponseFormat.JSON)
    .build();

// Or specify an explicit schema from a Java class:
GoogleAiGeminiChatModel.builder()
    ...
    .responseFormat(ResponseFormat.builder()
        .type(JSON)
        .jsonSchema(JsonSchemas.jsonSchemaFrom(Person.class).get())
        .build())
    .build();

// Or from a JSON schema:
GoogleAiGeminiChatModel.builder()
    ...
    .responseFormat(ResponseFormat.builder()
        .type(JSON)
        .jsonSchema(JsonSchema.builder()...build())
        .build())
    .build();

// For Mistral AI:
MistralAiChatModel.builder()
    ...
    .responseFormat(MistralAiResponseFormatType.JSON_OBJECT)
    .build();

// For Ollama:
OllamaChatModel.builder()
    ...
    .responseFormat(JSON)
    .build();

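For other model providers: if the underlying model provider does not support JSON mode, prompt engineering is your best option. Also, try lowering the temperature for more deterministic output.

Streaming

An AI Service can stream the response token by token when the TokenStream return type is used: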

interface Assistant {

    TokenStream chat(String message);
}

StreamingChatLanguageModel model = OpenAiStreamingChatModel.builder()
    .apiKey(System.getenv("OPENAI_API_KEY"))
    .modelName(GPT_4_O_MINI)
    .build();

Assistant assistant = AiServices.create(Assistant.class, model);

TokenStream tokenStream = assistant.chat("Tell me a joke");

tokenStream.onPartialResponse((String partialResponse) -> System.out.println(partialResponse))
    .onRetrieved((List<Content> contents) -> System.out.println(contents))
    .onToolExecuted((ToolExecution toolExecution) -> System.out.println(toolExecution))
    .onCompleteResponse((ChatResponse response) -> System.out.println(response))
    .onError((Throwable error) -> error.printStackTrace())
    .start();

You can also use Flux<String> instead of TokenStream. To do so, add the langchain4j-reactor module as a dependency:

<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-reactor</artifactId>
    <version>1.0.0-beta3</version>
</dependency>

interface Assistant {

  Flux<String> chat(String message);
}

Example:
https://github.com/langchain4j/langchain4j-examples/blob/main/other-examples/src/main/java/ServiceWithStreamingExample.java

Chat Memory

An AI Service can use chat memory to "remember" previous interactions:

Assistant assistant = AiServices.builder(Assistant.class)
    .chatLanguageModel(model)
    .chatMemory(MessageWindowChatMemory.withMaxMessages(10))
    .build();

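In this scenario, the same ChatMemory instance is used for all invocations of the AI Service. However, this approach does not work if you have multiple users, because each user needs their own ChatMemory instance to maintain their individual conversation.

The solution to this problem is to use a ChatMemoryProvider: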


interface Assistant  {
    String chat(@MemoryId int memoryId, @UserMessage String message);
}

Assistant assistant = AiServices.builder(Assistant.class)
    .chatLanguageModel(model)
    .chatMemoryProvider(memoryId -> MessageWindowChatMemory.withMaxMessages(10))
    .build();

String answerToKlaus = assistant.chat(1, "Hello, my name is Klaus");
String answerToFrancine = assistant.chat(2, "Hello, my name is Francine");

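In this scenario, the ChatMemoryProvider provides two distinct ChatMemory instances, one for each memory ID.
When using ChatMemory this way, it is also important to evict the memories of conversations that are no longer needed, in order to avoid memory leaks. To make the chat memories used internally by an AI Service accessible, it is enough for the interface that defines it to extend ChatMemoryAccess.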

interface Assistant extends ChatMemoryAccess {
    String chat(@MemoryId int memoryId, @UserMessage String message);
}

This makes it possible both to access the ChatMemory instance of a single conversation and to remove it once the conversation has ended.

String answerToKlaus = assistant.chat(1, "Hello, my name is Klaus");
String answerToFrancine = assistant.chat(2, "Hello, my name is Francine");

List<ChatMessage> messagesWithKlaus = assistant.getChatMemory(1).messages();
boolean chatMemoryWithFrancineEvicted = assistant.evictChatMemory(2);

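Note that if an AI Service method has no parameter annotated with @MemoryId, the memoryId value passed to the ChatMemoryProvider defaults to the string "default".
Note also that an AI Service should not be called concurrently for the same @MemoryId, as this can corrupt the ChatMemory. Currently, AI Services do not implement any mechanism to prevent concurrent calls for the same @MemoryId.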
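
The per-conversation memory and eviction described above can be pictured with a small standalone sketch. It uses only the JDK and is an assumption about the general pattern, not LangChain4j's implementation: MemoryStoreSketch and WindowMemory are hypothetical names, and messages are plain Strings rather than ChatMessage objects.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class MemoryStoreSketch {

    // Sliding window: keeps only the most recent maxMessages entries.
    static class WindowMemory {
        private final int maxMessages;
        private final Deque<String> messages = new ArrayDeque<>();

        WindowMemory(int maxMessages) {
            this.maxMessages = maxMessages;
        }

        synchronized void add(String message) {
            messages.addLast(message);
            while (messages.size() > maxMessages) {
                messages.removeFirst(); // drop the oldest message
            }
        }

        synchronized List<String> messages() {
            return new ArrayList<>(messages); // defensive copy
        }
    }

    private final Map<Object, WindowMemory> memories = new ConcurrentHashMap<>();
    private final int maxMessages;

    MemoryStoreSketch(int maxMessages) {
        this.maxMessages = maxMessages;
    }

    // Plays the role of a memory provider: one memory per ID, created on first use.
    WindowMemory memoryFor(Object memoryId) {
        return memories.computeIfAbsent(memoryId, id -> new WindowMemory(maxMessages));
    }

    // Removes the memory of a finished conversation; returns true if something was removed.
    boolean evict(Object memoryId) {
        return memories.remove(memoryId) != null;
    }

    public static void main(String[] args) {
        MemoryStoreSketch store = new MemoryStoreSketch(10);
        store.memoryFor(1).add("Hello, my name is Klaus");
        store.memoryFor(2).add("Hello, my name is Francine");
        System.out.println(store.memoryFor(1).messages());
        System.out.println(store.evict(2));
    }
}
```

Without the evict step, the map would keep growing with every new conversation ID, which is the memory leak the text warns about.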

Tools (Function Calling)

An AI Service can be configured with tools that the LLM can use:

class Tools {
    
    @Tool
    int add(int a, int b) {
        return a + b;
    }

    @Tool
    int multiply(int a, int b) {
        return a * b;
    }
}

Assistant assistant = AiServices.builder(Assistant.class)
    .chatLanguageModel(model)
    .tools(new Tools())
    .build();

String answer = assistant.chat("What is 1+2 and 3*4?");

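In this scenario, the LLM requests execution of the add(1, 2) and multiply(3, 4) methods before providing its final answer. LangChain4j executes these methods automatically.
More details about tools can be found here: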
https://docs.langchain4j.dev/tutorials/tools/#high-level-tool-api
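
To give a feel for what "executing the methods automatically" can involve, the sketch below dispatches a tool request (a method name plus arguments) to an annotated method via reflection. It is only an illustration under stated assumptions: ToolSketch is a hypothetical annotation standing in for @Tool, and the real library additionally handles JSON arguments, tool specifications and the round trip with the LLM.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

class ToolDispatchSketch {

    // Hypothetical marker annotation, standing in for @Tool.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface ToolSketch {}

    static class Calculator {
        @ToolSketch
        public int add(int a, int b) {
            return a + b;
        }

        @ToolSketch
        public int multiply(int a, int b) {
            return a * b;
        }

        // Not annotated, so it must never be callable as a tool.
        public int secret() {
            return 42;
        }
    }

    // Finds an annotated method by name and argument count, then invokes it.
    static Object execute(Object toolObject, String toolName, Object... arguments) {
        for (Method method : toolObject.getClass().getDeclaredMethods()) {
            if (method.isAnnotationPresent(ToolSketch.class)
                    && method.getName().equals(toolName)
                    && method.getParameterCount() == arguments.length) {
                try {
                    return method.invoke(toolObject, arguments);
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException("Tool execution failed: " + toolName, e);
                }
            }
        }
        throw new IllegalArgumentException("Unknown tool: " + toolName);
    }

    public static void main(String[] args) {
        Calculator calculator = new Calculator();
        System.out.println(execute(calculator, "add", 1, 2));      // prints 3
        System.out.println(execute(calculator, "multiply", 3, 4)); // prints 12
    }
}
```

Restricting dispatch to annotated methods matters because the tool name comes from the model, which may request something unexpected.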

RAG

An AI Service can be configured with a ContentRetriever to enable naive RAG:


EmbeddingStore embeddingStore  = ...
EmbeddingModel embeddingModel = ...

ContentRetriever contentRetriever = new EmbeddingStoreContentRetriever(embeddingStore, embeddingModel);

Assistant assistant = AiServices.builder(Assistant.class)
    .chatLanguageModel(model)
    .contentRetriever(contentRetriever)
    .build();

Configuring a RetrievalAugmentor provides even more flexibility and enables advanced RAG capabilities such as query transformation and re-ranking:

RetrievalAugmentor retrievalAugmentor = DefaultRetrievalAugmentor.builder()
        .queryTransformer(...)
        .queryRouter(...)
        .contentAggregator(...)
        .contentInjector(...)
        .executor(...)
        .build();

Assistant assistant = AiServices.builder(Assistant.class)
    .chatLanguageModel(model)
    .retrievalAugmentor(retrievalAugmentor)
    .build();

More details about RAG can be found here:
https://docs.langchain4j.dev/tutorials/rag/

More RAG examples can be found here:
https://github.com/langchain4j/langchain4j-examples/tree/main/rag-examples/src/main/java

Auto-Moderation

Example: https://github.com/langchain4j/langchain4j-examples/blob/main/other-examples/src/main/java/ServiceWithAutoModerationExample.java

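Chaining multiple AI Services

The more complex the logic of your LLM-powered application becomes, the more important it is to break it down into smaller parts, as is common practice in software development.

For instance, stuffing a large number of instructions into the system prompt to account for every possible scenario is error-prone and inefficient. If there are too many instructions, LLMs may overlook some of them. The order in which the instructions are presented also matters, which makes the process even more challenging.

This principle also applies to tools, RAG, and model parameters such as temperature, maxTokens, etc.

Your chatbot probably does not need to be aware of every tool you have at all times. For example, when a user simply greets the chatbot or says goodbye, giving the LLM access to dozens or hundreds of tools is costly (each tool included in the LLM call consumes a significant number of tokens) and sometimes even dangerous, because it might lead to unintended results (LLMs can hallucinate or be manipulated into invoking a tool with unintended inputs).

Regarding RAG: similarly, it is sometimes necessary to provide the LLM with some context, but not always, because it incurs additional cost (more context = more tokens) and increases response time (more context = higher latency).

Regarding model parameters: in some situations you may need the LLM to be highly deterministic, so you would set a low temperature. In other cases you might opt for a higher temperature, and so on.

The point is that smaller, more specific components are easier and cheaper to develop, test, maintain, and understand.

Another aspect to consider involves two extremes:

Do you want your application to be highly deterministic, with the application controlling the flow and the LLM being just one of its components?
Or do you want the LLM to have complete autonomy and drive your application?

Or perhaps a mix of both, depending on the situation? All of these options are possible when you decompose your application into smaller, more manageable parts.

AI Services can be used as regular (deterministic) software components and can be combined with them:

You can call one AI Service after another (aka chaining).
You can use deterministic and LLM-powered if/else statements (AI Services can return a boolean).
You can use deterministic and LLM-powered switch statements (AI Services can return an enum).
You can use deterministic and LLM-powered for/while loops (AI Services can return int and other numeric types).
You can mock an AI Service (since it is an interface) in your unit tests.
You can integration test each AI Service in isolation.
You can evaluate each AI Service separately and find the optimal parameters for it.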
…

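Let's consider a simple example. I want to build a chatbot for my company. If a user greets the chatbot, I want it to respond with a predefined greeting without relying on an LLM to generate the greeting. If a user asks a question, I want the LLM to generate the response using the company's internal knowledge base (aka RAG).

Here is how this task can be decomposed into 2 separate AI Services: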

interface GreetingExpert {

    @UserMessage("Is the following text a greeting? Text: {{it}}")
    boolean isGreeting(String text);
}

interface ChatBot {

    @SystemMessage("You are a polite chatbot of a company called Miles of Smiles.")
    String reply(String userMessage);
}

class MilesOfSmiles {

    private final GreetingExpert greetingExpert;
    private final ChatBot chatBot;
    
    ...
    
    public String handle(String userMessage) {
        if (greetingExpert.isGreeting(userMessage)) {
            return "Greetings from Miles of Smiles! How can I make your day better?";
        } else {
            return chatBot.reply(userMessage);
        }
    }
}

GreetingExpert greetingExpert = AiServices.create(GreetingExpert.class, llama2);

ChatBot chatBot = AiServices.builder(ChatBot.class)
    .chatLanguageModel(gpt4)
    .contentRetriever(milesOfSmilesContentRetriever)
    .build();

MilesOfSmiles milesOfSmiles = new MilesOfSmiles(greetingExpert, chatBot);

String greeting = milesOfSmiles.handle("Hello");
System.out.println(greeting); // Greetings from Miles of Smiles! How can I make your day better?

String answer = milesOfSmiles.handle("Which services do you provide?");
System.out.println(answer); // At Miles of Smiles, we provide a wide range of services ...

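Notice how we used the cheaper Llama2 for the simple task of identifying whether the text is a greeting, and the more expensive GPT-4 with a content retriever (RAG) for the more complex task.
This is a very simple and somewhat naive example, but hopefully it demonstrates the idea.

Now I can mock both GreetingExpert and ChatBot and test MilesOfSmiles in isolation. I can also integration test GreetingExpert and ChatBot separately. I can evaluate them separately and find the optimal parameters for each subtask or, in the long run, even fine-tune a small specialized model for each specific subtask.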
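
Because both AI Services are plain interfaces with a single method, they can be stubbed with lambdas so that the routing logic is tested without calling any model. The sketch below re-declares the types from the example so that it is self-contained; the MilesOfSmiles constructor is an assumption, since the original elides it, and ChainingTestSketch is a hypothetical wrapper class.

```java
class ChainingTestSketch {

    interface GreetingExpert {
        boolean isGreeting(String text);
    }

    interface ChatBot {
        String reply(String userMessage);
    }

    static class MilesOfSmiles {
        private final GreetingExpert greetingExpert;
        private final ChatBot chatBot;

        // Assumed constructor: the original example does not show it.
        MilesOfSmiles(GreetingExpert greetingExpert, ChatBot chatBot) {
            this.greetingExpert = greetingExpert;
            this.chatBot = chatBot;
        }

        String handle(String userMessage) {
            if (greetingExpert.isGreeting(userMessage)) {
                return "Greetings from Miles of Smiles! How can I make your day better?";
            }
            return chatBot.reply(userMessage);
        }
    }

    public static void main(String[] args) {
        // Stubs replace the LLM-backed services, so no model is called here.
        GreetingExpert stubExpert = text -> text.equalsIgnoreCase("hello");
        ChatBot stubBot = message -> "stubbed reply to: " + message;

        MilesOfSmiles app = new MilesOfSmiles(stubExpert, stubBot);
        System.out.println(app.handle("Hello"));
        System.out.println(app.handle("Which services do you provide?"));
    }
}
```

A mocking library would work just as well; lambdas are used here only to keep the sketch free of dependencies.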

Testing

An example of integration tests for a customer support agent:
https://github.com/langchain4j/langchain4j-examples/blob/main/customer-support-agent-example/src/test/java/dev/langchain4j/example/CustomerSupportAgentIT.java