Evaluating writing using open source artificial intelligence

In today’s digital age, writers seek tools that enhance their craft and provide real-time feedback and assistance. Enter Ollama – an open-source machine learning system designed to democratize accessibility for natural language processing tasks across a wide range of languages and scripts with ease. Coupled with the Phi3 model, this powerful duo promises unparalleled benefits in refining your writing style through sophisticated evaluations beyond grammar checking. This article will explore how Ollama, integrated with phi3’s innovative language comprehension and assessment approach, revolutionizes the writer’s journey toward excellence. So buckle up as we embark on a transformative exploration of your writing potential using these advanced AI tools!

I have been experimenting with Ollama and various models designed to work with it. In this article, I wanted to use Ollama and Phi3 to demonstrate the power of open source to revolutionize and evaluate writing. Both Ollama and Phi3 are open source, both have MIT licenses, and they work well together. You need to have Ollama installed on your computer. My daily driver is an Intel NUC 11 with an i7 processor running Linux Mint Cinnamon. Though this is a powerful computer, running a large language model takes more than a great CPU. A powerful GPU would make the process a little quicker.

First, I needed to make sure that Ollama was installed. If you are on Linux, you do that by executing the following code in a terminal.

$ curl -fsSL https://ollama.com/install.sh | sh

The installation is very straightforward and then you will need to download the Phi3 model. Enter the following command into your terminal.

$ ollama pull phi3

This pulls the Phi3 model down to your computer. You can check to make sure it is there by entering the following code.

$ ollama list

You should see a list of all the models loaded on your computer.

NAME            	ID          	SIZE  	MODIFIED       
phi3:latest     	4f2222927938	2.2 GB	56 minutes ago	
cooking:latest  	59c8496c62aa	4.7 GB	7 days ago    	
llama3.1:latest 	42182419e950	4.7 GB	7 days ago

Now I am ready to begin. I wanted Ollama and the Phi3 model to develop a rubric to evaluate my writing sample. I need to enter two commands. First, I need to get the LLM running, which will open a REPL (Read Only Print Loop) prompt very similar to a Python REPL.

ollama run phi3:latest
>>> create a rubric to score college level writing.

The model quickly created a rubric to score my writing some of which I have copied and pasted here:

“College Level Writing Rubric: Essay (20 points)

I. Introduction and Thesis Statement (4/20 = 2 points):

A student’s introduction must be clear, concise, engaging, and
well-structured. A strong thesis statement should identify the topic of
discussion clearly while presenting an argument or perspective that sets
up a logical progression throughout the essay. Students will lose one
point for each instance where these two components fail to establish
context effectively (e.g., unclear writing style, unconvincing tone).

1 Point – Clear and engaging introduction with well-structured paragraphs setting the stage of discussion or argumentation

2 Points – Introduction is structured clearly but fails in some areas such as clarity of language use or lacking an enticing hook to captivate readers’ interest. (Lose one point)

1 Point – Thesis statement presented, but lacks precision and clear articulation about the topic discussed; does not establish a logical thread for essay progression

2 Points – Clearly stated thesis that presents argument or perspective while setting up an effective structure to follow throughout. (Lose one point)

0 PointS – Introduction is unclear, disjointed language use fails in
providing context and lacks persuasive tone necessary; no clear indication
of what the essay will discuss

The rubric, over seven hundred words long, was generated in a few minutes. All text output from Ollama is in MarkDown. The rubric will be rendered much quicker if you are a developer using an Apple Silicon MacBook and an M2 processor. Though exceptionally powerful, the NUC 11 with the i7 lacks a GPU, which eases the strain on the CPU.

Now that the rubric has been created, I entered the following command in the Ollama REPL to evaluate an article I wrote for Opensource.com nearly ten years ago.

>>>Use the rubric to evaluate this article https://opensource.com/education/15
... /12/my-open-source-story-don-watkins

The process took a few minutes and provided an in-depth review of my article. The evaluation was very detailed and over nine hundred words long. I have pasted the first part of the review here.

“2 Points – Introduction engages with opening hook; however, lacks clarity in setting the stage and doesn’t align well with Watkins’ thesis statement (lose up to three points) 1 Point – Thesis statement present but vague or imprecise about what readers should expect throughout this article. Lacks clear alignment between I and II components of essay-like structure; no explicit roadmap provided for reader follow along (lose two maximum points); fails in captivating the audience right from introduction

0 PointS – Introduction lacks coherence, disjointed language use provides little context or interest to readers about open source contributions. No engaging hook presented nor clear alignment between I and II components of essay-like structure; does not provide explicit roadmap for reader follow along (lose one point)…”

Using large language models to assess writing could offer the subtlety writers require to enhance their writing. Are there potential issues? Will artificial intelligence models replace copywriters? What other implications might they have that change how we write and re-write?