Working with Ollama

Now that I’m working with Ollama, I needed to figure out how to locate the models on my storage medium and determine the amount of space they were occupying. This is especially important for the MacBook Air, which has a 256 GB drive that is already half full. My Linux computer, on the other hand, has a terabyte NVMe drive, so I’m not as worried about storage there. However, I still like to track where everything is and how much storage it uses. I wanted to compile a list of the downloaded models and learn how to delete them.

After Ollama is installed, you can get a list of the commands available to you by entering the following command in the terminal.

$ ollama help

Usage:
ollama [flags]
ollama [command]

Available Commands:
serve Start ollama
create Create a model from a Modelfile
show Show information for a model
run Run a model
pull Pull a model from a registry
push Push a model to a registry
list List models
ps List running models
cp Copy a model
rm Remove a model
help Help about any command

Flags:
-h, --help help for ollama
-v, --version Show version information


Check the version of ollama you are using with:

$ ollama -v

ollama version is 0.3.9


Downloading models is easy using ‘ollama pull’:

$ ollama pull gemma2

This downloads the model to your local computer. You can also trigger a download by running a model you haven’t pulled yet:

$ ollama run phi3

This command pulls the ‘phi3’ model down to your computer and then runs it. After I had tried a few models, I wanted to get a sense of how much space they were taking up on my system, so I used the ‘list’ (or ‘ls’) subcommand of the ‘ollama’ command.

$ ollama list
NAME            	ID          	SIZE  	MODIFIED     
llama3.1:latest 	f66fc8dc39ea	4.7 GB	2 hours ago 	
codegemma:latest	0c96700aaada	5.0 GB	43 hours ago	
phi3:latest     	4f2222927938	2.2 GB	47 hours ago	
gemma2:latest   	ff02c3702f32	5.4 GB	3 days ago  	
phi3:medium     	cf611a26b048	7.9 GB	3 days ago  	
gemma2:2b       	8ccf136fdd52	1.6 GB	3 days ago  
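
If you would rather see one total than add up the column by eye, a few lines of Python can sum the sizes that ‘ollama list’ reports. This is just a rough sketch of my own, not a built-in Ollama feature, and it assumes the output keeps the column layout shown above, with sizes printed like “4.7 GB”.

# Sum the sizes reported by 'ollama list' (assumes the column layout shown above).
import subprocess

listing = subprocess.run(["ollama", "list"], capture_output=True, text=True).stdout

total_gb = 0.0
for line in listing.splitlines()[1:]:        # skip the header row
    parts = line.split()
    if len(parts) < 4:
        continue
    size, unit = float(parts[2]), parts[3]   # e.g. 4.7 and "GB"
    total_gb += size / 1000 if unit == "MB" else size

print(f"Downloaded models are using roughly {total_gb:.1f} GB")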

Now I can see how much storage each model is using. Deleting a model is easy, and I decided to delete the oldest one first.

$ ollama rm gemma2
deleted 'gemma2'

Just to be sure, I ran ‘ollama ls’; sure enough, the model was gone.

NAME            	ID          	SIZE  	MODIFIED     
llama3.1:latest 	f66fc8dc39ea	4.7 GB	2 hours ago 	
codegemma:latest	0c96700aaada	5.0 GB	44 hours ago	
phi3:latest     	4f2222927938	2.2 GB	47 hours ago	
phi3:medium     	cf611a26b048	7.9 GB	3 days ago  	
gemma2:2b       	8ccf136fdd52	1.6 GB	3 days ago  

You can check which models are currently running on your machine with the following command:

$ ollama ps
NAME       	ID          	SIZE  	PROCESSOR	UNTIL              
phi3:latest	4f2222927938	6.0 GB	100% CPU 	4 minutes from now	

The ‘show’ command displays information about a model.

$ ollama show phi3
  Model                                 
  	arch            	phi3  	                
  	parameters      	3.8B  	                
  	quantization    	Q4_0  	                
  	context length  	131072	                
  	embedding length	3072  	                
  	                                      
  Parameters                            
  	stop	"<|end|>"      	                   
  	stop	"<|user|>"     	                   
  	stop	"<|assistant|>"	                   
  	                                      
  License                               
  	Microsoft.                          	  
  	Copyright (c) Microsoft Corporation.

Stop a model while it is generating a response by pressing Ctrl+C, and exit the interactive session by pressing Ctrl+D (or typing /bye). In conclusion, Ollama has transformed how I understand and interact with AI.

Breaking Free from the Cloud: Exploring the Benefits of Local, Open-Source AI with Ollama

Everywhere you look, someone is talking or writing about artificial intelligence. I have been keenly interested in the topic since my graduate school days in the 1990s. I have used ChatGPT, Microsoft Copilot, Claude, Stable Diffusion, and other AI software to experiment with how this technology works and to satisfy my innate curiosity. Recently, I discovered Ollama, an open-source tool for running large language models, including Meta’s Llama models, locally on Linux, macOS, and Microsoft Windows. There is a great deal of concern that when you use LLMs in the cloud, your data may be scraped and reused by one of the major technology companies. Ollama is open source under an MIT license, and because it runs locally, there is no danger that your work could end up in someone else’s LLM.

The Ollama website proclaims, “Get up and running with Large Language Models.” That invitation was all I needed to get started. Open a terminal on Linux and enter the following to install Ollama:

$ curl -fsSL https://ollama.com/install.sh | sh

The project lists all the models you can use, and I chose the first one on the list, Llama3.1. Installation is easy, and downloading the Llama3.1 model did not take long. I followed the instructions and, in the terminal, entered the following command:

$ ollama run llama3.1

The model began to download, which took a couple of minutes; your experience will vary depending on your hardware and internet connection. I have an Intel i7 with 64 GB of RAM and a robust internet connection. Once the model was downloaded, I was prompted to ‘talk’ with the LLM. I decided to ask a question about the history of my alma mater, St. Bonaventure University. I entered the following commands:

$ ollama run llama3.1
>>> What is the history of St. Bonaventure University?

The results were good but somewhat inaccurate. “St. Bonaventure University is a private Franciscan university located in Olean, New York. The institution was founded by the Diocese of Buffalo and has a rich history dating back to 1856.” St. Bonaventure is located near Olean, New York, and it is in the Diocese of Buffalo, but it was founded in 1858. I asked the model to name some famous St. Bonaventure alumni, and the inaccuracies there were comical: Bob Lanier was a famous alumnus, but Danny Ainge was not.

The results are rendered in Markdown, which is a real plus. I also knew that a GPU would generate results much more quickly, so I installed Ollama on my M2 MacBook Air. I followed the much easier directions there: download Ollama-darwin.zip, unzip the archive, and double-click the Ollama icon. The program installs itself in the MacBook’s Applications folder. When launched, it directs you to the Mac Terminal app, where you can enter the same commands I had used on my Linux computer.
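
The terminal prompt is not the only way to talk to a model, either. Ollama also serves a local HTTP API, by default at http://localhost:11434, so you can send the same kind of question from a script. Here is a minimal sketch that assumes that default endpoint and uses only the Python standard library; the model and prompt are simply the ones from my example above.

import json
import urllib.request

# Ask a locally running model a question through Ollama's HTTP API.
payload = {
    "model": "llama3.1",
    "prompt": "What is the history of St. Bonaventure University?",
    "stream": False,    # request a single JSON reply instead of a stream
}

request = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(request) as response:
    reply = json.loads(response.read())

print(reply["response"])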

Unsurprisingly, Ollama uses a great deal of processing power, and the load is much lighter on a computer with a GPU. My Intel NUC 11 is a very powerful desktop computer with a quad-core 11th Gen Intel Core i7-1165G7, 64 gigabytes of RAM, and a robust internet connection for downloading additional models. I posed similar questions to the Llama3.1 model, first on the Intel NUC running Linux and then on the M2 MacBook Air running macOS. You can see the CPU utilization on my Linux desktop below: it’s pegged, and the model’s output is slow, at roughly 50 words per minute. Contrast that with the M2 MacBook Air, whose GPU does the heavy lifting; CPU utilization stays around 6.9%, and the words appear faster than I can read them.

Screen picture by Don Watkins, CC BY-SA 4.0
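
Those words-per-minute figures are impressions rather than measurements. If you want an actual number to compare two machines, a short script can time a one-shot prompt. This is a rough, unscientific sketch of my own; it assumes ‘ollama run’ accepts the prompt as a final argument (printing the reply and exiting), and the total includes the time it takes to load the model.

# Time a one-shot prompt and report an approximate words-per-minute rate.
import subprocess
import time

prompt = "Summarize the history of the Franciscan order in three sentences."

start = time.perf_counter()
result = subprocess.run(
    ["ollama", "run", "llama3.1", prompt],
    capture_output=True, text=True,
)
elapsed_minutes = (time.perf_counter() - start) / 60

words = len(result.stdout.split())
print(f"{words} words in {elapsed_minutes:.2f} minutes "
      f"({words / elapsed_minutes:.0f} words per minute)")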

While Llama3.1 might not excel at recalling history, it does very well when asked to write Python code. I entered a prompt asking for Python code to draw a circle, without specifying how to accomplish the task. It produced the code shown below. I first had to install the ‘pygame’ module, which was not on my system.

$ sudo apt install python3-pygame

# Python Game Development

import pygame
from pygame.locals import *

# Initialize the pygame modules
pygame.init()

# Create a 640x480 size screen surface
screen = pygame.display.set_mode((640, 480))

# Define some colors for easy reference
WHITE = (255, 255, 255)
RED = (255, 0, 0)

while True:
    # Handle events
    for event in pygame.event.get():
        if event.type == QUIT or (event.type == KEYDOWN and event.key == K_ESCAPE):
            pygame.quit()
            quit()

    screen.fill(WHITE)  # Fill the background with white color

    # Drawing a circle on the screen at position (250, 200), radius 100
    pygame.draw.circle(screen, RED, (250, 200), 100)
    
    # Update the full display Surface to the screen
    pygame.display.flip()

I copied the code into VSCodium and ran it. You can see the results below.

Screen picture by Don Watkins, CC BY-SA 4.0

As I continue experimenting with Ollama and the open-source models it runs, I’m struck by the significance of this shift toward local, user-controlled AI. No longer are we forced to rely on cloud-based services that may collect our data without our knowledge or consent. With Ollama and similar projects, individuals can harness the power of language models while maintaining complete ownership of their work and personal information. This newfound autonomy is a crucial step forward for AI development, and I’m eager to see where it takes us.