How to use Ollama with your university’s HPC cluster
1 Pre-requisites
- You have access to your university's HPC cluster. For instance, as a UCL member of staff, I can apply for an account to use the UCL HPC cluster through this link.
- You have a working knowledge of how to use the HPC cluster. For instance, you should be comfortable with using shell commands, submitting jobs, and managing files on the cluster. If you are new to the HPC cluster, you can refer to the UCL HPC documentation.
- You have a working knowledge of how to use Ollama, ideally in the form of container images.
- You have access to the university's HPC login node. You will likely need a VPN connection to access the HPC cluster from outside the university network.
2 Walkthrough of setting up Ollama on the UCL HPC cluster: The Container Image Method
In this first method, I will show you how to set up Ollama on the UCL HPC cluster using a container image. This is the most straightforward approach.

Why the container image method? Because different university HPC clusters have different configurations (different Linux distros, etc.), and a container image is the most portable way to deploy Ollama. You can think of the container image as a pre-packaged version of Ollama that bundles all the dependencies and configurations needed to run it on the cluster.
To be updated in the future: I will also show you how to set up Ollama on the UCL HPC cluster from source. That method is more flexible and allows you to customize the installation. However, my experience with it on UCL's HPC cluster has not been good: the cluster runs a very old Linux version, so some dependencies (especially `glibc`) are not compatible with the latest version of Ollama.
2.1 Step 1: Connect to the UCL HPC cluster
First, you would need to have access to UCL’s internal network. You can do this by connecting to the UCL VPN. Alternatively, if you are on campus, you can connect to the UCL network directly.
Then, you need to SSH into the UCL HPC cluster.¹ Depending on your OS, you can launch the terminal and type the following command. You can refer to this page for more information on how to connect to the UCL HPC cluster.

```bash
ssh UCL_ID@myriad.rc.ucl.ac.uk
```

- `ssh` is the command to initiate an SSH connection to a remote server.
- `UCL_ID` is your UCL user ID, e.g., `ucabxyz`.
- `myriad.rc.ucl.ac.uk` is the hostname of the UCL HPC cluster.
- `UCL_ID@myriad.rc.ucl.ac.uk` is the full address, meaning you are connecting to the `myriad.rc.ucl.ac.uk` server with the `UCL_ID` account.
You will be prompted to enter your password. You won't see the password as you type, but it is being entered. Press `Enter` after you have typed your password.
If this is your first time connecting to the UCL HPC cluster, you will be prompted to accept the RSA key fingerprint. You can type `yes` and press `Enter` to accept it.
Once you are connected to the UCL HPC cluster, you will see the cluster's welcome screen.
2.2 Step 2: Load the necessary modules
Once you are connected to the UCL HPC cluster, you need to load the necessary modules to use Ollama. In this guide, I'm using `apptainer` to pull the Ollama container image from Docker Hub. Apptainer is a tool for managing container images on HPC clusters; if you have heard of Docker before, Apptainer plays a similar role on HPC systems.

Because UCL's cluster has a module system, you need to load the `apptainer` module before you can use it. You can do this by typing the following command. For more information on how to use `apptainer` on UCL's cluster, you can refer to this page.
```bash
module load apptainer

# Create a directory to store the Ollama models
mkdir -p ~/Scratch/ollama/models
```
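You can quickly confirm the module loaded correctly before moving on; `module list` and the tool's version flag are standard checks:

```bash
# Confirm the module is loaded and apptainer is on your PATH
module list
apptainer --version
```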
- `mkdir -p` is a command to create a directory. The `-p` flag creates the parent directories if they don't exist.
- `~/Scratch/ollama/models` is the path where the Ollama models you download later will be stored. For UCL users, our disk on the cluster has a directory called `Scratch` for storing large files, so we will keep the Ollama models there.

Next, we need to set some environment variables so that Ollama inside the container behaves as expected. You can copy and paste the following commands into your terminal and hit `Enter` to set the environment variables.
```bash
# This is the path where the Ollama models you download later will be stored.
# Use $HOME rather than "~": tilde is not expanded inside double quotes.
export OLLAMA_MODELS="$HOME/Scratch/ollama/models"

# This sets the Ollama log level. Change it to "debug" to see more logs.
export OLLAMA_LOG_LEVEL="error"
```
- `export` is a command to set environment variables in the shell.
- `OLLAMA_MODELS` is the environment variable that sets the path to the Ollama models. You can change the path to your preferred location.
- `OLLAMA_LOG_LEVEL` is the environment variable that sets the log level of Ollama. You can change it to `debug` to see more logs; `error` will only show error logs.
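If you do not want to retype these exports in every session, you can append them to your shell startup file. A minimal sketch, assuming you use bash and the same paths as above:

```bash
# Persist the Ollama settings across sessions
cat >> ~/.bashrc <<'EOF'
export OLLAMA_MODELS="$HOME/Scratch/ollama/models"
export OLLAMA_LOG_LEVEL="error"
EOF
source ~/.bashrc
```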
2.3 Step 3: Pull the Ollama container image
In this tutorial, we will not use `apptainer build` to build the Ollama container image from scratch. Instead, we will pull the pre-built Ollama container image from Docker Hub.

The most recent version of the image is tagged as `ollama/ollama:latest`. You can type the following command to pull it from Docker Hub.
```bash
apptainer pull ollama-latest.sif docker://ollama/ollama:latest
```
After running the command, `apptainer` will pull the Ollama container image from Docker Hub, convert it, and save it as `ollama-latest.sif` in the current directory (internally, it downloads the image to a temporary location first).
If you would like to pull a specific version of the Ollama container image, you can specify the tag. For instance, to pull the image tagged `0.5.11` (the latest version as of Feb 16, 2025), you can type the following command.
```bash
apptainer pull ollama-0.5.11.sif docker://ollama/ollama:0.5.11
```
- `apptainer pull` is the command to pull the Ollama container image from Docker Hub.
- `ollama-0.5.11.sif` is the name under which the image will be saved in the current directory.
- `docker://ollama/ollama:0.5.11` is the address of the Ollama container image on Docker Hub. The image is tagged as `0.5.11`.
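Before moving on, you can sanity-check the download. A small sketch, assuming (as the later examples in this guide suggest) that commands run inside the image reach the `ollama` binary:

```bash
# Check the image file exists and report the bundled Ollama version
ls -lh ollama-0.5.11.sif
apptainer exec ollama-0.5.11.sif ollama --version
```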
2.4 Step 4: Run Ollama on the UCL HPC cluster
Once the image is pulled, you can run Ollama on the UCL HPC cluster by typing the following command.
```bash
apptainer run --nv ~/ollama-0.5.11.sif &
```
- `apptainer run` is the command to run the Ollama container image.
- `--nv` is a flag to enable GPU inference. If you don't have access to GPU inference, you can remove this flag.
- `~/ollama-0.5.11.sif` is the path to the Ollama container image you pulled earlier; adjust it if you saved the image somewhere other than your home directory.
- `&` runs the command in the background, so you can continue to use the terminal while Ollama is running.
Note that by default, if you are on the login node (you will see `userid@login` in your terminal prompt), you won't have access to GPU inference. Therefore, you can only run Ollama in CPU mode, and you will see the following warning message.
```
WARNING: Could not find any nv files on this host!
```
If you would like to run Ollama in GPU mode, you will either need to:

- request an interactive session on the GPU nodes (see the sketch below). Refer to the UCL HPC documentation on interactive sessions for more information.
- submit a GPU job to the GPU nodes. Refer to the UCL HPC documentation on GPU nodes for more information on how to submit a job to the GPU nodes.
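For the first option, a request for an interactive GPU session might look like the sketch below. The `-l gpu=1` resource flag matches the job script at the end of this guide, but the exact flags are an assumption; check the UCL HPC documentation for the current syntax.

```bash
# Request a 2-hour interactive session with one GPU (flags are indicative)
qrsh -l mem=16G,h_rt=2:00:0,gpu=1
```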
Now that the Ollama service is running in the background, you need to download some Ollama models to the cluster. For instance, you can type the following command to download the `qwen2.5:14b` model.
```bash
apptainer run --nv ~/ollama-0.5.11.sif pull qwen2.5:14b
```
- `apptainer run --nv ~/ollama-0.5.11.sif` can be thought of as the `ollama` command itself, if you are familiar with Ollama: arguments after the image name are passed to `ollama` inside the container.
- `pull qwen2.5:14b` downloads the `qwen2.5:14b` model to the cluster. The model will be saved in the `OLLAMA_MODELS` directory you set earlier. If you have tried Ollama on your laptop before, this is similar to running `ollama pull qwen2.5:14b` to download the model to your local machine.

Therefore, to enter chat mode with Ollama, you can type the following command:
```bash
apptainer run --nv ~/ollama-0.5.11.sif run qwen2.5:14b
```
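To leave the interactive chat session, type `/bye` at the prompt.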
You can test that Ollama is running by typing the following command. For how to use the API, you can refer to the Ollama API documentation.
```bash
curl http://localhost:11434/api/generate -d '{
  "model": "qwen2.5:14b",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
```
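You can also list the models the server currently has available, which is a handy way to confirm that the earlier `pull` succeeded:

```bash
# List the models available on this Ollama server
curl http://localhost:11434/api/tags
```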
2.5 Step 5: Call Ollama API with your preferred language
Since it’s very likely that you will use Ollama in your own programming environment, you can call Ollama with your preferred programming language. For instance, you can use R/Python to classify whether a Twitter text contains hate speech
You can also call Ollama with your preferred programming language. Say, You can use R/Python to classify whether a Twitter text contains hate speech.
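Whatever language you use, the script ultimately sends an HTTP request like the one below. A hedged sketch of a hate-speech classification call against the same `/api/generate` endpoint (the prompt and tweet are illustrative):

```bash
# Ask the model to classify a tweet; "stream": false returns one JSON reply
curl http://localhost:11434/api/generate -d '{
  "model": "qwen2.5:14b",
  "prompt": "Classify the following tweet as HATE or NOT_HATE. Reply with one word only.\nTweet: I love sunny days.",
  "stream": false
}'
```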
2.6 Conclusion
The above steps show a quick example of how to use Ollama on the UCL HPC cluster. You can replace the toy example with your own use case, and you can streamline the workflow with a shell script that automates loading the modules, pulling the Ollama container image, and running Ollama. Below is my example job script that does exactly that.
```bash
#!/bin/bash -l

# SGE directives: wall-clock limit, memory, one GPU, temp disk, job name,
# working directory, email notifications, and a task array of 3 tasks
#$ -l h_rt=48:00:0
#$ -l mem=32G
#$ -l gpu=1
#$ -l tmpfs=10G
#$ -N find_company_matches
#$ -wd ~/Scratch/Accounting-Marketing
#$ -m be
#$ -M wei.miao@ucl.ac.uk
#$ -t 1-3

# Set environment variables (use $HOME rather than "~", which is not
# expanded inside double quotes)
# source /shared/ucl/apps/bin/defmods
export OLLAMA_MODELS="$HOME/Scratch/ollama/models"
export R_LIBS_USER="$HOME/R/x86_64-pc-linux-gnu-library/4.4"
export GIN_MODE="release"
export OLLAMA_LOG_LEVEL="error"

# Load the R module and Apptainer
module -f unload compilers mpi gcc-libs
module load curl/7.86.0/gnu-4.9.2
module load r/4.4.2-openblas/gnu-10.2.0
module load apptainer

# Start the Ollama server in the background, pause briefly so it is up,
# then make sure the model is available
apptainer run --nv ~/ollama-0.5.11.sif &
sleep 10
apptainer run ~/ollama-0.5.11.sif pull qwen2.5:7b

export WORK_DIR="$HOME/Scratch"

# Run the R script from the job's temporary directory
cd $TMPDIR
R --no-save < $WORK_DIR/shell/find_company_name_matches.R > $JOB_NAME$SGE_TASK_ID.out

# Copy the output files back
tar zcvf $WORK_DIR/shell/files_from_job_$JOB_NAME$SGE_TASK_ID.tgz $TMPDIR
```
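You can then submit the script with the standard SGE commands (the filename below is illustrative):

```bash
qsub submit_ollama.sh   # submit the job; SGE prints the assigned job ID
qstat                   # check the status of your queued and running jobs
```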
Enjoy using Ollama on the UCL HPC cluster!
Footnotes

1. SSH stands for Secure Shell. It is widely used for remote login to computer systems. As long as you have your user ID and password, you will be able to SSH into the cluster.↩︎