LangChain for Java as an Abstraction for Different Large Language Models

LangChain4j aims to streamline the integration of large language model (LLM) capabilities into Java applications by providing a unified API. This API supports various LLM providers, such as OpenAI, Mistral, and Google Vertex AI, and embedding stores like Pinecone and Vespa, so you do not need to learn and implement a separate API for each provider. In the last two chapters we wrote Java code to interact with the OpenAI APIs and with local LLMs running on Ollama; LangChain4j provides abstract interfaces for these and many more models. This flexibility allows developers to switch between different LLMs or embedding stores without rewriting their code. LangChain4j currently supports over 10 popular LLM providers and more than 15 embedding stores, playing a role similar to the one Hibernate plays for relational databases, but for LLMs and embedding stores.

The framework also offers a comprehensive toolbox that encapsulates the community’s collective experience in building LLM-powered applications over the past year. This toolbox includes low-level tools for prompt templating, memory management, and output parsing, as well as high-level patterns like Agents and Retrieval-Augmented Generation (RAG). LangChain4j provides interfaces and multiple ready-to-use implementations for each pattern and abstraction, based on proven techniques. This makes it suitable for a wide range of applications, from chatbots to complete RAG pipelines.

Figure: Architecture diagram

Why Use an Abstraction Layer?

In the previous two chapters we wrote raw HTTP client code to interact with the OpenAI and Ollama APIs. While this approach teaches us how LLM APIs work at the wire level, it has practical disadvantages. Each provider has its own JSON schema for requests and responses, its own authentication mechanism, and its own model naming conventions. If your project needs to support multiple providers, or if you want the freedom to switch providers when pricing or capabilities change, you face the burden of maintaining multiple low-level HTTP integrations.

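LangChain4j solves this problem by defining a single Java interface, ChatLanguageModel (named ChatModel in recent 1.x releases), that every provider module implements. Your application code calls model.chat(prompt), as we do later in this chapter (older releases name this method generate), regardless of whether the underlying model is OpenAI’s GPT, a local Ollama instance, or Google’s Gemini. Switching providers then means changing the model construction code and configuration, not rewriting the application.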
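
To make the idea concrete without requiring any library on the classpath, here is a minimal sketch of the same pattern using only the Java standard library. The ChatBackend interface and the two fake backends are hypothetical names invented for this illustration; they are not LangChain4j types, and the backends return canned text instead of calling a real API.

```java
public class ProviderSwapSketch {

    // Hypothetical stand-in for a provider-neutral chat interface.
    // This is NOT a LangChain4j type; it only mirrors the idea.
    public interface ChatBackend {
        String chat(String prompt);
    }

    // Fake "local" provider that echoes the prompt with a tag.
    public static class FakeLocalBackend implements ChatBackend {
        @Override
        public String chat(String prompt) {
            return "[local] " + prompt;
        }
    }

    // Fake "cloud" provider that echoes the prompt with a different tag.
    public static class FakeCloudBackend implements ChatBackend {
        @Override
        public String chat(String prompt) {
            return "[cloud] " + prompt;
        }
    }

    // Choosing a backend is a configuration decision made in one place.
    public static ChatBackend backendFor(String providerName) {
        switch (providerName) {
            case "local":
                return new FakeLocalBackend();
            case "cloud":
                return new FakeCloudBackend();
            default:
                throw new IllegalArgumentException("Unknown provider: " + providerName);
        }
    }

    // Application code depends only on the interface, never on a concrete provider.
    public static String translateToFrench(ChatBackend model, String text) {
        return model.chat("Translate to French: " + text);
    }

    public static void main(String[] args) {
        for (String provider : new String[] {"local", "cloud"}) {
            System.out.println(translateToFrench(backendFor(provider), "Hello"));
        }
    }
}
```

The translateToFrench method never changes when the provider does; only the string passed to backendFor changes. A real abstraction layer plays the role of ChatBackend and supplies the concrete implementations for you.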

For reference, the LangChain4j project documentation is available at https://docs.langchain4j.dev.

Maven Project Setup

The example project is in the directory source-code/langchain4j-ollama. The Maven POM file declares three dependencies: the LangChain4j core library, the Ollama provider module (which supplies the OllamaChatModel class imported by the code below), and the org.json library, which is available for JSON manipulation although the class listed below does not call it.

    <dependencies>
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j</artifactId>
            <version>0.30.0</version>
        </dependency>
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-ollama</artifactId>
            <version>0.30.0</version>
        </dependency>
        <dependency>
            <groupId>org.json</groupId>
            <artifactId>json</artifactId>
            <version>20240303</version>
        </dependency>
    </dependencies>

The key insight here is the langchain4j-ollama artifact. LangChain4j organizes provider support into separate modules so that your application only pulls in the dependencies for the providers you actually use. Ollama also exposes an OpenAI-compatible endpoint, so the langchain4j-open-ai module is an alternative way to reach a local Ollama server; in this chapter we use the dedicated Ollama module. Note that the chat(String) convenience method called later in this chapter is, to the best of our knowledge, only available in newer LangChain4j releases (older ones offer generate(String) instead), so choose a version for both LangChain4j artifacts that provides the method you call.

Implementation: The OllamaLlmLangChain4j Class

The main class provides a getCompletion method that wraps LangChain4j’s OllamaChatModel class (the Ollama implementation of the chat model interface), along with two utility methods for file I/O and prompt template variable substitution. Let us walk through the complete implementation:

    package com.markwatson.langchain4j_ollama;

    import dev.langchain4j.model.ollama.OllamaChatModel;

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.time.Duration;

    public class OllamaLlmLangChain4j {

        private static final String DEFAULT_BASE_URL = "http://localhost:11434";
        private static final Duration DEFAULT_TIMEOUT = Duration.ofSeconds(120);

        public static void main(String[] args) {
            String prompt = "Translate the following English text to French: 'Hello, how are you?'";
            try {
                String completion = getCompletion(prompt, "mistral");
                System.out.println("completion: " + completion);
            } catch (Exception e) {
                System.err.println("Error getting completion: " + e.getMessage());
            }
        }

        /**
         * Send a prompt to the named Ollama model and return the response text.
         */
        public static String getCompletion(String prompt, String modelName) {
            System.out.println("\n\n**********\n\nprompt: " + prompt + ", modelName: " + modelName);

            // Use OLLAMA_BASE_URL if it is set, otherwise the local default
            String baseUrl = System.getenv("OLLAMA_BASE_URL");
            if (baseUrl == null || baseUrl.isBlank()) {
                baseUrl = DEFAULT_BASE_URL;
            }

            // Configure the model client with the builder
            OllamaChatModel model = OllamaChatModel.builder()
                    .baseUrl(baseUrl)
                    .modelName(modelName)
                    .temperature(0.7)
                    .timeout(DEFAULT_TIMEOUT)
                    .build();

            // Blocking call that returns the model's reply as plain text
            String answer = model.chat(prompt);

            System.out.println(answer);
            return answer;
        }

        /**
         * Utilities for using the Ollama LLM APIs
         */

        // read the contents of a file path into a Java string
        public static String readFileToString(String filePath) throws IOException {
            return Files.readString(Path.of(filePath));
        }

        // replace every occurrence of varName (for example "{input_text}") with varValue
        public static String promptVar(String prompt, String varName, String varValue) {
            return prompt.replace(varName, varValue);
        }
    }

Understanding the Core API Call

The heart of this class is the getCompletion method. Notice how concise the LLM interaction is:

  1. We retrieve the optional OLLAMA_BASE_URL from the environment variables, defaulting to http://localhost:11434 if not specified.
  2. We construct an OllamaChatModel instance using its builder, specifying the baseUrl, modelName, temperature (0.7), and timeout. This replaces dozens of lines of manual HTTP connection setup, JSON payload building, and response parsing that we wrote by hand in the previous chapter.
  3. We call model.chat(prompt) which returns a plain Java String containing the model’s response.

Compare this with the raw HTTP approach from the OpenAI chapter, where we had to manually construct JSON request bodies, set HTTP headers, read input streams, and parse JSON responses. The LangChain4j abstraction reduces all of that ceremony to two statements: one builder chain that configures the model and one call to chat.
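
Step 1 above, the environment variable fallback, can be isolated into a small helper so that it is testable without changing the real environment. The following is a minimal sketch using only the standard library; the BaseUrlResolver class and the idea of passing the lookup in as a function are assumptions made for this illustration and are not part of the chapter's project.

```java
import java.util.function.Function;

public class BaseUrlResolver {

    public static final String DEFAULT_BASE_URL = "http://localhost:11434";

    // The environment lookup is passed in as a function (hypothetical design choice)
    // so that callers can substitute a fake environment.
    public static String resolveBaseUrl(Function<String, String> env) {
        String value = env.apply("OLLAMA_BASE_URL");
        // Same rule as getCompletion: missing or blank means "use the default"
        return (value == null || value.isBlank()) ? DEFAULT_BASE_URL : value;
    }

    public static void main(String[] args) {
        // In production the lookup is simply System.getenv
        System.out.println(resolveBaseUrl(System::getenv));
    }
}
```

The result of resolveBaseUrl would then be passed to the builder's baseUrl call exactly as the baseUrl variable is in getCompletion.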

Prompt Template Utilities

The utility methods readFileToString and promptVar implement a simple but effective prompt templating system. The promptVar method replaces a placeholder variable like {input_text} in a prompt template string with an actual value. This pattern is useful when you maintain a library of reusable prompt templates as text files, which is exactly what we do in the source-code/prompts directory.
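
When a template has several placeholders, one option is to fill them all in a single pass from a map. The sketch below is an illustration only: the PromptTemplateSketch class, its fill method, and the {language} placeholder are hypothetical and are not part of the chapter's project. It assumes placeholder names consist of letters, digits, and underscores, and it leaves unknown placeholders untouched.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PromptTemplateSketch {

    // Matches {name} where name is made of letters, digits, or underscores.
    // A JSON example such as { "name": ... } does not match, because the
    // character right after the brace is not a word character.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{(\\w+)\\}");

    // Replace each known placeholder in one pass; substituted values are not re-scanned.
    public static String fill(String template, Map<String, String> vars) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        return matcher.replaceAll(match -> {
            String value = vars.get(match.group(1));
            // Keep the original text when no value is supplied for this name
            String replacement = (value != null) ? value : match.group();
            // Escape '$' and '\' so the value is inserted literally
            return Matcher.quoteReplacement(replacement);
        });
    }

    public static void main(String[] args) {
        String template = "Summarize the following text in {language}: \"{input_text}\"\nOutput:";
        System.out.println(fill(template, Map.of(
                "language", "French",
                "input_text", "Jupiter is the fifth planet from the Sun.")));
    }
}
```

For a single variable, the chapter's promptVar method is simpler and does the same job; the map-based variant mainly helps once templates grow to two or more variables.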

Prompt Templates: Two-Shot Entity Extraction

One of the most effective techniques for getting consistent structured output from an LLM is few-shot prompting, where you provide the model with examples of the desired input-output format before presenting the actual task. Our two-shot extraction prompt template, stored in the file prompts/two-shot-2-var.txt, demonstrates this:

    Given the two examples below, extract the names,
    addresses, and email addresses of individuals mentioned
    later as Process Text. Format the extracted information
    in JSON, with keys for "name", "address", and "email".
    If any information is missing, use "null" for that field.
    Be very concise in your output by providing only the
    output JSON.

    Example 1:
    Text: "John Doe lives at 1234 Maple Street, Springfield.
    His email is johndoe@example.com."
    Output: 
    {
      "name": "John Doe",
      "address": "1234 Maple Street, Springfield",
      "email": "johndoe@example.com"
    }

    Example 2:
    Text: "Jane Smith has recently moved to 5678 Oak Avenue,
    Anytown. She hasn't updated her email yet."
    Output: 
    {
      "name": "Jane Smith",
      "address": "5678 Oak Avenue, Anytown",
      "email": null
    }

    Process Text: "{input_text}"
    Output:

The prompt begins with a clear task description specifying the desired output format (JSON with specific keys). Two worked examples follow, including one where a field is intentionally missing to show the model how to handle incomplete data. The placeholder {input_text} at the bottom is replaced at runtime by our promptVar utility method.

We also use a summarization prompt template stored in prompts/summarization_prompt.txt:

    Summarize the following text: "{input_text}"
    Output:

This is deliberately minimal. For summarization tasks an LLM typically needs very little instructional scaffolding beyond a clear directive.

Test Examples

The JUnit test class exercises three distinct use cases: translation, structured entity extraction using two-shot prompting, and summarization. The translation test sends its prompt directly, while the extraction and summarization tests each load a prompt template from disk and fill in its variable. All three send the completed prompt through the LangChain4j abstraction layer.

    package com.markwatson.langchain4j_ollama;

    import org.junit.jupiter.api.DisplayName;
    import org.junit.jupiter.api.Tag;
    import org.junit.jupiter.api.Test;

    import static org.junit.jupiter.api.Assertions.*;

    /**
     * Integration tests for OllamaLlmLangChain4j.
     * Requires a running Ollama server with the specified models pulled.
     */
    @Tag("integration")
    class OllamaLlmLangChain4jTest {

        @Test
        @DisplayName("Simple completion with Ollama model")
        void testCompletion() {
            String result = OllamaLlmLangChain4j.getCompletion(
                    "Translate the following English text to French: 'Hello, how are you?'",
                    "gemma3:1b");

            System.out.println("\n\n&&&&&&&&&&\n\ncompletion: " + result);
            assertNotNull(result, "Completion result should not be null");
            assertFalse(result.isBlank(), "Completion result should not be blank");
        }

        @Test
        @DisplayName("Two-shot template extraction")
        void testTwoShotTemplate() throws Exception {
            String inputText = "Mark Smith enjoys living in Berkeley California at 102 Dunston Street and use mjess@foobar.com for contacting him.";
            String prompt0 = OllamaLlmLangChain4j.readFileToString("../prompts/two-shot-2-var.txt");
            System.out.println("prompt0: " + prompt0);

            // Substitute the test sentence for the {input_text} placeholder
            String prompt = OllamaLlmLangChain4j.promptVar(prompt0, "{input_text}", inputText);
            System.out.println("prompt: " + prompt);

            String result = OllamaLlmLangChain4j.getCompletion(prompt, "gemma3:1b");
            System.out.println("two shot extraction completion: " + result);

            assertNotNull(result, "Two-shot extraction result should not be null");
            assertFalse(result.isBlank(), "Two-shot extraction result should not be blank");
        }

        @Test
        @DisplayName("Text summarization")
        void testSummarization() throws Exception {
            String inputText = "Jupiter is the fifth planet from the Sun and the largest in the Solar System. It is a gas giant with a mass one-thousandth that of the Sun, but two-and-a-half times that of all the other planets in the Solar System combined. Jupiter is one of the brightest objects visible to the naked eye in the night sky, and has been known to ancient civilizations since before recorded history. It is named after the Roman god Jupiter. When viewed from Earth, Jupiter can be bright enough for its reflected light to cast visible shadows, and is on average the third-brightest natural object in the night sky after the Moon and Venus.";
            String prompt0 = OllamaLlmLangChain4j.readFileToString("../prompts/summarization_prompt.txt");
            System.out.println("prompt0: " + prompt0);

            // Substitute the paragraph for the {input_text} placeholder
            String prompt = OllamaLlmLangChain4j.promptVar(prompt0, "{input_text}", inputText);
            System.out.println("prompt: " + prompt);

            String result = OllamaLlmLangChain4j.getCompletion(prompt, "gemma3:1b");
            System.out.println("summarization completion: " + result);

            assertNotNull(result, "Summarization result should not be null");
            assertFalse(result.isBlank(), "Summarization result should not be blank");
        }
    }

Test 1: Translation (testCompletion)

The simplest test sends a direct English-to-French translation request. This exercises the basic round-trip through the LangChain4j API without any prompt template processing.

Test 2: Two-Shot Entity Extraction (testTwoShotTemplate)

This test loads the two-shot prompt template from disk and substitutes the {input_text} variable with a test sentence containing a person’s name, address, and email. The model is expected to return a JSON object with the extracted fields.
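
The test only asserts that the result is non-blank. If you want a somewhat stronger check, one option is to pull the JSON object out of the response text first, since models sometimes wrap their JSON in extra prose or code-fence markers. The sketch below is an assumption-laden illustration using only the standard library: the JsonResponseSketch class and both of its methods are hypothetical helpers, and the key check is a crude substring test rather than real JSON parsing.

```java
public class JsonResponseSketch {

    // Return the text from the first '{' to the last '}', or null if there is none.
    // This does not validate the JSON; it only trims surrounding noise.
    public static String extractJsonObject(String response) {
        if (response == null) {
            return null;
        }
        int start = response.indexOf('{');
        int end = response.lastIndexOf('}');
        if (start < 0 || end < start) {
            return null;
        }
        return response.substring(start, end + 1);
    }

    // Crude check that each expected key appears as a quoted string in the JSON text.
    public static boolean mentionsKeys(String json, String... keys) {
        if (json == null) {
            return false;
        }
        for (String key : keys) {
            if (!json.contains("\"" + key + "\"")) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Simulated model reply wrapped in code-fence markers
        String reply = "```json\n{\"name\": \"Mark Smith\", \"address\": null, \"email\": null}\n```";
        String json = extractJsonObject(reply);
        System.out.println(json);
        System.out.println(mentionsKeys(json, "name", "address", "email"));
    }
}
```

The extracted string could then be handed to a JSON library such as org.json for real parsing, and a test could assert on individual field values instead of on non-blankness alone.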

Test 3: Summarization (testSummarization)

This test loads the summarization prompt template and substitutes a paragraph about the planet Jupiter. The model is expected to return a concise summary of the input text.

Running the Examples

The project includes a Makefile that runs all three tests:

    run:
    	mvn test -q # run test in quiet mode

Before running, ensure that you have Ollama running locally (ollama serve) and that you have pulled the required model:

    ollama pull gemma3:1b

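The example program output is a few hundred lines long because of the verbose prompt logging. Here is a small representative portion of the output showing the results of each test. The model name logged in this sample, llama3.2:latest, differs from the gemma3:1b used in the test listing above, which suggests the sample was captured from a run with a different model: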

    $ make run

    **********

    prompt: Translate the following English text to French:
      'Hello, how are you?', modelName: llama3.2:latest
    Bonjour, comment allez-vous ?

    &&&&&&&&&&

    completion: Bonjour, comment allez-vous ?

    **********

    prompt: Given the two examples below, extract the
      names, addresses, and email addresses ...
      Process Text: "Mark Smith enjoys living in Berkeley
      California at 102 Dunston Street and use
      mjess@foobar.com for contacting him."
      Output:
    {
      "name": "Mark Smith",
      "address": "102 Dunston Street, Berkeley, California",
      "email": "mjess@foobar.com"
    }

    two shot extraction completion: {
      "name": "Mark Smith",
      "address": "102 Dunston Street, Berkeley, California",
      "email": "mjess@foobar.com"
    }

    **********

    prompt: Summarize the following text:
      "Jupiter is the fifth planet from the Sun..."
      Output:
    Jupiter is the largest planet in our Solar System
    and the fifth from the Sun. A gas giant with a mass
    two-and-a-half times that of all other planets
    combined, it has been observed since ancient times
    and is named after the Roman god Jupiter.

    summarization completion: Jupiter is the largest
    planet in our Solar System ...

Wrap Up

The key takeaway from this chapter is the value of abstraction in AI application development. By using LangChain4j’s chat model abstraction, here through the OllamaChatModel class, we wrote a complete LLM client, including prompt template processing and three different NLP tasks, in under 70 lines of library code. The same application structure works with OpenAI’s cloud API, a local Ollama instance, or any other provider that LangChain4j supports, by changing the model construction code and adding the matching provider dependency.

The prompt template utilities we built here, while simple, demonstrate a pattern that scales well to production systems. Maintaining prompts as external text files with variable placeholders separates prompt engineering concerns from application logic and makes it easy for non-programmers to iterate on prompt design without modifying Java code.

In the next chapter we explore the AgentScope framework, which takes the concept of LLM abstraction further by adding agent-oriented programming patterns like reasoning loops and tool calling on top of the basic completion interface we used here.