Using Local LLMs with Ollama in Java Applications

Running Large Language Models (LLMs) locally with Ollama has practical advantages in accessibility, privacy, cost, and control, which the sections below describe. Ollama is similar in spirit to Docker, but for downloading, running, and managing LLMs on your local computer. Ollama was originally written to support Apple Silicon Macs, but now supports Intel Macs, Linux, and Windows.

Advantages of Using Local LLMs with Ollama

Accessibility and Ease of Use

Ollama makes sophisticated LLMs accessible to users of all technical backgrounds; you don’t need to be an AI expert to use it. Models are downloaded and run with simple text commands. For example, the command ollama run mistral downloads the model if it is not already present and starts an interactive text session, so it is straightforward for anyone to start using LLMs locally.

Privacy and Data Security

Running LLMs locally on your system via Ollama ensures that your data does not leave your device, which is crucial for maintaining privacy and security, especially when handling sensitive information. This setup prevents data from being sent to third-party servers, thus safeguarding it from potential misuse or breaches.

Cost-Effectiveness

Using Ollama to run LLMs locally eliminates the need for costly cloud computing resources. This can be particularly advantageous for users who require extensive use of LLMs, as it avoids the recurring costs associated with cloud services.

Customization and Control

Local deployment of LLMs through Ollama allows users to have greater control over the models and the computational environment. This includes the ability to choose which models to run and to configure settings to optimize performance according to specific hardware capabilities.

Applications of Local LLMs with Ollama

Personalized AI Applications

For hobbyists and personal use, Ollama allows the exploration of LLMs’ capabilities such as text generation, language translation, and more, all within the privacy of one’s own computer. This can be particularly appealing for those interested in building personalized AI tools or learning more about AI without making significant investments.

Development and Testing

Ollama is well-suited for developers who need to integrate LLMs into their applications but wish to do so in a controlled and cost-effective manner. It is particularly useful in development environments where frequent testing and iterations are required. The local setup allows for quick changes and testing without the need to interact with external servers.

Educational and Research Purposes

Educators and researchers can benefit from the local deployment of LLMs using Ollama. It provides a platform for experimenting with AI models without the need for extensive infrastructure, making it easier to teach AI concepts and conduct research in environments with limited resources.

In summary, using local LLMs with Ollama not only makes powerful AI tools more accessible and easier to use but also ensures privacy, reduces costs, and provides users with greater control over their AI applications. Whether for professional development, research, or personal use, Ollama offers a versatile and user-friendly platform for exploring the potential of LLMs locally.

Java Library to Use Ollama’s REST API

The library in the directory Java-AI-Book-Code/ollama-llm-client defines a class named OllamaLlmClient with a method getCompletion that sends a JSON payload to an Ollama server and reads the response. Here’s an explanation of what each significant part of the method does:

Build JSON request payload: Constructs a JSON object message containing prompt, model, and stream (set to false to receive the full response at once instead of a stream).
Prepare HTTP Request: Uses the Java 11 HttpRequest builder to configure a POST request to /api/generate with the JSON payload as the body publisher, sets the content type header, and sets a request timeout of 3 minutes.
Execute HTTP Request: Sends the request using a static shared HttpClient instance that is configured with a 10-second connect timeout.
Process Server Response: Parses the response body string as a JSONObject and extracts the value of the response key, which is returned.

In summary, this method sends a JSON payload containing a prompt and model name to a specified server endpoint, reads the JSON response from the server, extracts a specific field from the JSON response, and returns that field’s value.

package com.markwatson.ollama;

import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;

import org.json.JSONObject;

public class OllamaLlmClient {

    private static final String DEFAULT_BASE_URL = "http://localhost:11434";
    private static final Duration REQUEST_TIMEOUT = Duration.ofMinutes(3);

    // One shared client; connecting to the server may take at most 10 seconds
    private static final HttpClient HTTP_CLIENT = HttpClient.newBuilder()
            .connectTimeout(Duration.ofSeconds(10))
            .build();

    public static void main(String[] args) throws Exception {
        String prompt = "Translate the following English text to French: 'Hello, how are you?'";
        String completion = getCompletion(prompt, "mistral");
        System.out.println("completion: " + completion);
    }

    // Convenience overload that targets the default local Ollama server
    public static String getCompletion(String prompt, String modelName) throws IOException, InterruptedException {
        return getCompletion(prompt, modelName, DEFAULT_BASE_URL);
    }

    public static String getCompletion(String prompt, String modelName, String baseUrl)
            throws IOException, InterruptedException {
        System.out.println("prompt: " + prompt + ", modelName: " + modelName);

        // Build JSON request payload; stream=false asks for one complete response
        var message = new JSONObject();
        message.put("prompt", prompt);
        message.put("model", modelName);
        message.put("stream", false);

        // Prepare the POST request; the timeout covers the whole generation
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/api/generate"))
                .header("Content-Type", "application/json")
                .timeout(REQUEST_TIMEOUT)
                .POST(HttpRequest.BodyPublishers.ofString(message.toString()))
                .build();

        // Execute the request and print the raw JSON body for debugging
        HttpResponse<String> response = HTTP_CLIENT.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());

        // The generated text is stored under the "response" key
        var jsonObject = new JSONObject(response.body());
        return jsonObject.getString("response");
    }

    /*
     * Utilities for using the Ollama LLM APIs
     */

    // read the contents of a file path into a Java string
    public static String readFileToString(String filePath) throws IOException {
        return Files.readString(Path.of(filePath));
    }

    // replace every occurrence of substringToReplace with replacementString
    public static String replaceSubstring(String originalString, String substringToReplace, String replacementString) {
        return originalString.replace(substringToReplace, replacementString);
    }

    // fill a template variable such as "{input_text}" in a prompt
    public static String promptVar(String prompt0, String varName, String varValue) {
        return replaceSubstring(prompt0, varName, varValue);
    }
}
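The getCompletion method above parses the response body without first looking at the HTTP status. If the request fails, for example because the named model has not been pulled, the body is unlikely to contain a response key and the call to getString will throw. The following is a minimal sketch of a guard, not part of the book's library: the class name OllamaResponseCheck and the method requireSuccess are hypothetical, and the error body shown in main is only an assumed example of what a failure might look like.

```java
import java.io.IOException;

public class OllamaResponseCheck {

    // Hypothetical helper: returns the body unchanged when the status is 200,
    // otherwise throws so the caller never parses an error body as a completion.
    public static String requireSuccess(int statusCode, String body) throws IOException {
        if (statusCode != 200) {
            throw new IOException("Ollama request failed with HTTP " + statusCode + ": " + body);
        }
        return body;
    }

    public static void main(String[] args) {
        try {
            // A successful call passes the body through untouched
            System.out.println(requireSuccess(200, "{\"response\":\"Bonjour\",\"done\":true}"));
            // Assumed example of a failure: the status and body are illustrative only
            requireSuccess(404, "{\"error\":\"model not found\"}");
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Inside getCompletion, one way to use such a guard would be to pass response.statusCode() and response.body() through it before constructing the JSONObject.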

Example Using the Library

The Java library for getting local LLM text completions using Ollama includes a unit test with an example showing how to call the API:

String r = OllamaLlmClient.getCompletion(
        "Translate the following English text to French: 'Hello, how are you?'",
        "gemma3:1b");
System.out.println("completion: " + r);

The output looks like this (the echoed modelName shows that this sample run used the mistral model rather than gemma3:1b):

prompt: Translate the following English text to French: 'Hello, how are you?', modelName: mistral
completion:  In French, "Hello, how are you?" can be translated as "Bonjour, comment allez-vous?" or simply "Comment allez-vous?" depending on the context.

For reference, the JSON response object from the API call looks like this:

{"model":"mistral","created_at":"2024-05-05T19:38:26.893374Z","response":" In French, \"Hello, how are you?\" can be translated as \"Bonjour, comment allez-vous?\" or simply \"Comment allez-vous?\" depending on the context.","done":true,"context":[733,16289,28793, ...],"total_duration":1777944500,"load_duration":563601792,"prompt_eval_count":25,"prompt_eval_duration":133415000,"eval_count":41,"eval_duration":1079766000}
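The timing fields in this response can be turned into a rough throughput figure. Ollama reports durations in nanoseconds, so dividing eval_count by eval_duration and scaling by 10^9 gives generated tokens per second. The sketch below does that arithmetic with the numbers from the response above; the class and method names are hypothetical and not part of the book's library.

```java
public class OllamaThroughput {

    // Ollama reports durations in nanoseconds
    private static final double NANOS_PER_SECOND = 1_000_000_000.0;

    // Convert a nanosecond duration to seconds
    public static double seconds(long nanos) {
        return nanos / NANOS_PER_SECOND;
    }

    // Generated tokens per second: eval_count divided by eval_duration in seconds
    public static double tokensPerSecond(long evalCount, long evalDurationNanos) {
        if (evalDurationNanos <= 0) {
            throw new IllegalArgumentException("evalDurationNanos must be positive");
        }
        return evalCount / seconds(evalDurationNanos);
    }

    public static void main(String[] args) {
        // Values copied from the sample response: total_duration, eval_count, eval_duration
        System.out.printf("total time: %.2f s%n", seconds(1_777_944_500L));
        System.out.printf("generation speed: %.1f tokens/s%n", tokensPerSecond(41, 1_079_766_000L));
    }
}
```

For the sample response this works out to about 38 tokens per second, with roughly 1.8 seconds of total time, of which about 0.56 seconds was spent loading the model.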

Extraction of Facts and Relationships from Text Data

Traditional methods for extracting email addresses, names, addresses, etc. from text included the use of hand-crafted regular expressions and custom software. LLMs are text processing engines with knowledge of grammar, sentence structure, and some real world embedded knowledge. Using LLMs can reduce the development time of information extraction systems.
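For contrast, here is a minimal sketch of the traditional regular-expression approach, limited to email addresses. The class name and the pattern are illustrative assumptions; the pattern is deliberately simple and does not cover every valid address. Email addresses have a regular shape that a pattern can capture, while names and street addresses do not, which is where an LLM becomes attractive.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexEmailExtractor {

    // Simplified pattern: local part, "@", domain labels, and a final label of 2+ letters
    private static final Pattern EMAIL =
            Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}");

    // Return every substring of the text that matches the email pattern, in order
    public static List<String> extractEmails(String text) {
        List<String> found = new ArrayList<>();
        Matcher matcher = EMAIL.matcher(text);
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "John Doe lives at 1234 Maple Street, Springfield. "
                + "His email is johndoe@example.com.";
        System.out.println(extractEmails(text));
    }
}
```

Extending this approach to names and postal addresses would require many more hand-written rules, each tuned to a particular text format.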

There are sample text prompts in the directory Java-AI-Book-Code/prompts. We will use the file two-shot-2-var.txt, which is listed here:

Given the two examples below, extract the names, addresses, and email addresses of individuals mentioned later as Process Text. Format the extracted information in JSON, with keys for "name", "address", and "email". If any information is missing, use "null" for that field. Be very concise in your output by providing only the output JSON.

Example 1:
Text: "John Doe lives at 1234 Maple Street, Springfield. His email is johndoe@example.com."
Output:
{
  "name": "John Doe",
  "address": "1234 Maple Street, Springfield",
  "email": "johndoe@example.com"
}

Example 2:
Text: "Jane Smith has recently moved to 5678 Oak Avenue, Anytown. She hasn't updated her email yet."
Output:
{
  "name": "Jane Smith",
  "address": "5678 Oak Avenue, Anytown",
  "email": null
}

Process Text: "{input_text}"
Output:

The example code is the test method OllamaLlmClientTest.testTwoShotTemplate(), shown here:

String inputText = "Mark Johnson enjoys living in Berkeley California at 102 Dunston Street and use mjess@foobar.com for contacting him.";
String prompt0 = OllamaLlmClient.readFileToString("../prompts/two-shot-2-var.txt");
System.out.println("prompt0: " + prompt0);
String prompt = OllamaLlmClient.promptVar(prompt0, "{input_text}", inputText);
System.out.println("prompt: " + prompt);
String r = OllamaLlmClient.getCompletion(prompt, "gemma3:1b");
System.out.println("two shot extraction completion: " + r);

The output is (edited for brevity):

two shot extraction completion:
{
  "name": "Mark Johnson",
  "address": "102 Dunston Street, Berkeley, California",
  "email": "mjess@foobar.com"
}

Using LLMs to Summarize Text

LLMs bring a new level of ability to text summarization tasks. With their ability to process massive amounts of information and “understand” natural language, they’re able to capture the essence of lengthy documents and distill them into concise summaries.

Here is a listing of the prompt file summarization_prompt.txt:

Summarize the following text: "{input_text}"
Output:

The example code is in the test method OllamaLlmClientTest.testSummarization(), listed here:

String inputText = "Jupiter is the fifth planet from the Sun and the largest in the Solar System. It is a gas giant with a mass one-thousandth that of the Sun, but two-and-a-half times that of all the other planets in the Solar System combined. Jupiter is one of the brightest objects visible to the naked eye in the night sky, and has been known to ancient civilizations since before recorded history. It is named after the Roman god Jupiter. When viewed from Earth, Jupiter can be bright enough for its reflected light to cast visible shadows, and is on average the third-brightest natural object in the night sky after the Moon and Venus.";
String prompt0 = OllamaLlmClient.readFileToString("../prompts/summarization_prompt.txt");
System.out.println("prompt0: " + prompt0);
String prompt = OllamaLlmClient.promptVar(prompt0, "{input_text}", inputText);
System.out.println("prompt: " + prompt);
String r = OllamaLlmClient.getCompletion(prompt, "gemma3:1b");
System.out.println("summarization completion: " + r);

The output is (edited for brevity):

summarization completion:

Here is a summary of the text:

Jupiter is the 5th planet from the Sun and the largest gas giant in our Solar System, with a mass 1/1000 that of the Sun and 2.5 times that of all other planets combined. It's one of the brightest objects visible to the naked eye and has been known since ancient times. On average, it's the 3rd-brightest natural object in the night sky after the Moon and Venus.

Optional Practice Problems

Model Switching and Custom Host Configuration (Easy): Build on the core functionality in the OllamaLlmClient class. Write a runner program that prompts the user for a model name (e.g., gemma3:1b, mistral, or llama3) and a custom query, then passes those values to the OllamaLlmClient.getCompletion method. Try querying two different local models with the same prompt and compare the quality and structure of their outputs.

Multi-Variable Prompt Templates (Medium): The utility method OllamaLlmClient.promptVar replaces a single variable inside a prompt template. Often, you will need to replace multiple variables (e.g., {input_text}, {tone}, {language}). Create a new utility class or extend the client to support a method signature like:
```
public static String promptVars(String promptTemplate, Map<String, String> variables)
```
Test your method by loading the two-shot-2-var.txt template and substituting multiple variables dynamically.

Robust JSON Output Parsing (Medium): When running the test in OllamaLlmClientTest.testTwoShotTemplate, the returned response is a raw text representation of a JSON object. Create a parser method that accepts this raw String, strips away any Markdown wrappers (such as ```json and ```), parses the cleaned string into an org.json.JSONObject, and maps the fields into a Java record class called ContactInfo containing the name, address, and email properties. Handle null fields gracefully.

Advanced Ollama Request Options (Hard): Ollama’s /api/generate endpoint supports an optional "system" parameter (which sets a system prompt) and an optional "options" map (containing parameters like "temperature", "top_k", or "num_predict"). Modify the method signature in OllamaLlmClient to accept these configurations:
```
public static String getCompletion(
    String prompt,
    String modelName,
    String baseUrl,
    String systemPrompt,
    Map<String, Object> options
)
```
Construct the correct JSONObject to serialize these parameters, send them to Ollama, and write a test case that demonstrates the differences in output creativity by setting low and high "temperature" values.
