Using the Hugging Face Deep Learning Natural Language Processing APIs

The code for accessing the Hugging Face NLP APIs is similar to the code we used previously to access the OpenAI and Anthropic APIs.

History of Hugging Face and How They Differ from OpenAI and Anthropic

Hugging Face was founded in 2016 by French entrepreneurs Clement Delangue, Julien Chaumond, and Thomas Wolf. The company initially developed a chatbot application, but after open sourcing the model behind the chatbot it decided to apply its experience building a platform for running its own models to building a general-purpose platform for machine learning. The company also acquired Gradio, a software library used to make interactive browser demos of machine learning models. We won’t use Gradio here because it is a Python library and platform, but it is worth mentioning that I use Gradio when working on deep learning and LLM projects on Google’s Colab.

Compared with OpenAI and Anthropic, Hugging Face provides an open-source platform where the machine learning community can collaborate on models, datasets, and applications, and it supports users in hosting and sharing their own AI models. While OpenAI does allow developers to fine-tune models on its platform, I find that Hugging Face is designed more for development and collaboration. Anthropic does not currently support fine-tuning its models with your data.

Common Lisp Library for Hugging Face APIs

The following Common Lisp code is very similar to what we used in the last chapter to call OpenAI’s and Anthropic’s APIs.

The GitHub repository for this Quicklisp compatible library can be found here:

https://github.com/mark-watson/huggingface

(in-package #:huggingface)

;; Define the environment variable "HF_API_TOKEN" with the value
;; of your Hugging Face API key.

(defun huggingface-helper (curl-command)
  "Run CURL-COMMAND in a shell and return the JSON response decoded to Lisp data."
  (let ((response
          (uiop:run-program
           curl-command
           :output :string)))
    (with-input-from-string
        (s response)
      (let* ((json-as-list (json:decode-json s)))
        json-as-list))))

(defun summarize (some-text max-tokens)
  "Summarize SOME-TEXT with the facebook/bart-large-cnn model."
  (let* ((curl-command
          (concatenate
           'string
           "curl https://api-inference.huggingface.co/models/facebook/bart-large-cnn"
           " -H \"Content-Type: application/json\""
           " -H \"Authorization: Bearer " (uiop:getenv "HF_API_TOKEN") "\" "
           " -d '{\"inputs\": \"" some-text "\", \"max_length\": "
           (write-to-string max-tokens) " }'")))
    ;; The summary string is the value of the first pair in the first alist.
    (cdaar (huggingface-helper curl-command))))

(defun answer-question (question-text context-text)
  "Answer QUESTION-TEXT from CONTEXT-TEXT with the deepset/roberta-base-squad2 model."
  (let* ((curl-command
          (concatenate
           'string
           "curl https://api-inference.huggingface.co/models/deepset/roberta-base-squad2"
           " -H \"Content-Type: application/json\""
           " -H \"Authorization: Bearer " (uiop:getenv "HF_API_TOKEN") "\" "
           " -d '{\"question\": \"" question-text "\", \"context\": \""
           context-text "\" }'"))
         (answer (huggingface-helper curl-command)))
    ;; The answer string is the value of the last pair in the response.
    (cdar (last answer))))
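
The accessors cdaar and (cdar (last answer)) depend on the exact shape and key order of the decoded JSON. As a hedged sketch that is not part of the library, the following shows what those calls do on hand-written sample data, and how assoc could look values up by key instead. The sample lists are my assumptions about how cl-json decodes the two responses with its default settings, including the key names :SUMMARY--TEXT and :ANSWER; the numeric values are placeholders, not real API output.

```lisp
;; Assumed shape of a decoded summarization response: a list holding one
;; association list. The key name is an assumption about cl-json's default
;; key conversion.
(defparameter *sample-summary-response*
  '(((:summary--text . "Jupiter is the fifth planet from the Sun."))))

;; Assumed shape of a decoded question-answering response. The numbers are
;; placeholders, not real API output.
(defparameter *sample-answer-response*
  '((:score . 0.5) (:start . 0) (:end . 6) (:answer . "Greece")))

(defun summary-by-position (response)
  "What the library does: take the value of the first pair of the first alist."
  (cdaar response))

(defun answer-by-position (response)
  "What the library does: take the value of the last pair."
  (cdar (last response)))

(defun value-by-key (key alist)
  "Hypothetical helper: look a value up by key, independent of pair order."
  (cdr (assoc key alist)))
```

Positional access is compact but would return the wrong field if the service reordered its keys. A lookup such as (value-by-key :answer response) keeps working as long as the key name is unchanged.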

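Here are two examples using the huggingface library. In the second example, the answer is extracted from the context text we supply: the context says the 1992 Olympics were in Greece (they were actually held in Barcelona, Spain), so the model answers “Greece”.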

CL-USER>  (ql:quickload :huggingface)
To load "huggingface":
  Load 1 ASDF system:
    huggingface
; Loading "huggingface"

(:HUGGINGFACE)
CL-USER> (huggingface:summarize "Jupiter is the fifth planet from the Sun and the largest in the Solar System. It is a gas giant with a mass one-thousandth that of the Sun, but two-and-a-half times that of all the other planets in the Solar System combined. Jupiter is one of the brightest objects visible to the naked eye in the night sky, and has been known to ancient civilizations since before recorded history. It is named after the Roman god Jupiter.[19] When viewed from Earth, Jupiter can be bright enough for its reflected light to cast visible shadows,[20] and is on average the third-brightest natural object in the night sky after the Moon and Venus." 30)
"Jupiter is the fifth planet from the Sun and the largest in the Solar System. When viewed from Earth, Jupiter can be bright enough for its reflected light to cast visible shadows. It is on average the third-brightest natural object in the night sky after the Moon and Venus. It has been known to ancient civilizations since before recorded history."

"Jupiter is the fifth planet from the Sun and the largest in the Solar System. When viewed from Earth, Jupiter can be bright enough for its reflected light to cast visible shadows. It is on average the third-brightest natural object in the night sky after the Moon and Venus. It has been known to ancient civilizations since before recorded history."

CL-USER> (huggingface:answer-question "Where were the 1992 Olympics held?" "The 1992 Summer Games were the first since the end of the Cold War, and the first unaffected by boycotts since the 1972 Summer Games. The 1992 Olympics were in Greece. 1992 was also the first year South Africa was re-invited to the Olympic Games by the International Olympic Committee, after a 32-year ban from participating in international sport.")

"Greece"
CL-USER> 
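
Both example inputs happen to contain no double quotes or apostrophes. Because the library splices the text directly into a JSON string inside a single-quoted shell argument, input containing either character would produce a malformed command. The following is a minimal sketch of one way to guard against that. The helper names json-escape, shell-single-quote, and json-data-argument are hypothetical and not part of the library; the sketch assumes a POSIX shell and handles only backslashes, double quotes, and newlines, not other control characters.

```lisp
(defun json-escape (text)
  "Hypothetical helper: escape backslashes, double quotes, and newlines
so TEXT can sit between the double quotes of a JSON string."
  (with-output-to-string (out)
    (loop for ch across text
          do (case ch
               (#\\ (write-string "\\\\" out))
               (#\" (write-string "\\\"" out))
               (#\Newline (write-string "\\n" out))
               (t (write-char ch out))))))

(defun shell-single-quote (text)
  "Hypothetical helper: wrap TEXT in single quotes for a POSIX shell.
Each embedded single quote is closed, backslash-escaped, and reopened."
  (with-output-to-string (out)
    (write-char #\' out)
    (loop for ch across text
          do (if (char= ch #\')
                 (write-string "'\\''" out)
                 (write-char ch out)))
    (write-char #\' out)))

(defun json-data-argument (some-text)
  "Hypothetical helper: build a shell-safe value for curl's -d option."
  (shell-single-quote
   (concatenate 'string "{\"inputs\": \"" (json-escape some-text) "\"}")))
```

With helpers like these, the -d argument in summarize could be built by appending " -d " and (json-data-argument some-text) to the command string in place of the hand-written quoting.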

Hugging Face APIs Wrapup

I believe in supporting Hugging Face because they allow individual developers and smaller organizations to do meaningful custom work with LLMs. Hugging Face has emerged as an important public resource in the AI industry by providing an open platform for experimenting with LLMs, one where researchers and developers can collaborate and share models. I like Hugging Face’s commitment to open source principles. This is particularly important in an era of growing concern that a few large AI companies might monopolize AI resources.

By providing an open platform, Hugging Face acts as an antidote to this potential issue: access to cutting-edge AI technology is not restricted to a select few but is available to anyone with the interest and capability to use it. This democratization promotes diversity in AI development, encourages a more equitable distribution of AI benefits, and mitigates the risks of concentrating power in a few hands.