| Title: | Computer Vision with Large Language Models |
|---|---|
| Description: | Make computer vision tasks approachable in R by leveraging Large Language Models. Provides fine-tuned prompts, boilerplate functions, and input/output helpers for common computer vision workflows, such as classifying and describing images. Functions are designed to take images as input and return structured data, helping users build practical applications with minimal code. |
| Authors: | Frank Hull [aut, cre, cph], Johannes Breuer [ctb] (ORCID: <https://orcid.org/0000-0001-5906-7873>), Jordi Rosell [ctb] (ORCID: <https://orcid.org/0000-0002-4349-1458>) |
| Maintainer: | Frank Hull <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.1.0.9000 |
| Built: | 2026-05-27 07:08:58 UTC |
| Source: | https://github.com/frankiethull/kuzco |
a minimal wrapper function that switches which provider is used by each llm_image_* function when the ellmer backend is selected. The ollamar backend only supports Ollama.
chat_ellmer(provider = "ollama")

| Argument | Description |
|---|---|
| provider | a provider, such as "ollama", "claude", or "github" |

which ellmer function (provider) to use for kuzco llm_image_* when backend is ellmer
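
As a rough illustration of the switching idea described above, the sketch below maps a provider string to the name of an ellmer chat function. The helper `provider_function_name()` is hypothetical and not part of kuzco, and the `chat_<provider>` naming pattern is an assumption about how such a lookup could work, not a description of kuzco's internals.

```r
# Hypothetical helper (not part of kuzco): build the name of the ellmer
# chat function that would correspond to a provider string.
# Assumption: providers follow a "chat_<provider>" naming pattern,
# as ellmer's chat_ollama() and chat_github() do.
provider_function_name <- function(provider = "ollama") {
  stopifnot(is.character(provider), length(provider) == 1L, nzchar(provider))
  paste0("chat_", tolower(provider))
}

provider_function_name("ollama")
provider_function_name("github")
```

Resolving a name like this to an actual function is left out on purpose, since it depends on which ellmer version is installed.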
edit a listed prompt installed with kuzco
edit_prompt(prompt)

| Argument | Description |
|---|---|
| prompt | a prompt from list_prompts() |

a prompt markdown file to edit
## Not run: edit_prompt("system-prompt-alt-text.md") ## End(Not run)
a simple Shiny app wrapper around kuzco that makes computer vision accessible to everyone. Built few-shot by Frank Hull with Shiny Assistant (https://gallery.shinyapps.io/assistant/)
kuzco_app()
a shiny app instance as a playground for local llms
## Not run: kuzco_app() ## End(Not run)
list prompts installed with kuzco
list_prompts()
a list of prompts stored within kuzco
list_prompts()
Image Alt Text using LLMs
llm_image_alt_text( llm_model = "qwen2.5vl", image = system.file("img/test_img.jpg", package = "kuzco"), backend = "ellmer", additional_prompt = "", provider = "ollama", language = "English", ... )

| Argument | Description |
|---|---|
| llm_model | a local LLM model either pulled from ollama or hosted |
| image | a path to a local image in jpeg, jpg, or png format |
| backend | either 'ellmer' or 'ollamar'; note that 'ollamar' suggests structured outputs while 'ellmer' enforces structured outputs |
| additional_prompt | text to append to the image prompt |
| provider | for backend = 'ellmer', the provider to use, such as "ollama", "claude", or "github" |
| language | a language to guide the LLM model outputs |
| ... | a pass-through for other generate args and model args such as temperature; set the temperature to 0 for more deterministic output |

a df with text
llm_image_alt_text( llm_model = "qwen2.5vl", image = system.file("img/test_img.jpg", package = "kuzco"), backend = 'ellmer', additional_prompt = "", provider = "ollama", language = "English" )
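
The image argument is documented as a local path to a jpeg, jpg, or png file. A small pre-flight check along the following lines can catch a wrong path or an unsupported extension before a model is called. The helper `is_supported_image()` is hypothetical and not part of kuzco; it only inspects the file extension and does not verify the file contents.

```r
# Hypothetical pre-flight check (not part of kuzco): confirm that a path
# points to an existing file with a jpeg, jpg, or png extension.
is_supported_image <- function(path) {
  ext <- tolower(tools::file_ext(path))
  file.exists(path) && ext %in% c("jpeg", "jpg", "png")
}

# Demonstrate with throwaway files in the session's temporary directory.
ok_path  <- tempfile(fileext = ".png")
bad_path <- tempfile(fileext = ".gif")
file.create(ok_path, bad_path)

is_supported_image(ok_path)                               # TRUE
is_supported_image(bad_path)                              # FALSE
is_supported_image(file.path(tempdir(), "missing.jpg"))   # FALSE
```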
Image Classification using LLMs
llm_image_classification( llm_model = "qwen2.5vl", image = system.file("img/test_img.jpg", package = "kuzco"), backend = "ellmer", additional_prompt = "", provider = "ollama", language = "English", ... )

| Argument | Description |
|---|---|
| llm_model | a local LLM model either pulled from ollama or hosted |
| image | a path to a local image in jpeg, jpg, or png format |
| backend | either 'ollamar' or 'ellmer'; note that 'ollamar' suggests structured outputs while 'ellmer' enforces structured outputs |
| additional_prompt | text to append to the image prompt |
| provider | for backend = 'ellmer', the provider to use, such as "ollama", "claude", or "github" |
| language | a language to guide the LLM model outputs |
| ... | a pass-through for other generate args and model args such as temperature |

a df with image_classification, primary_object, secondary_object, image_description, image_colors, image_proba_names, image_proba_values
llm_image_classification( llm_model = "qwen2.5vl", image = system.file("img/test_img.jpg", package = "kuzco"), backend = 'ellmer', additional_prompt = "", provider = "ollama", language = "English" )
Customized Vision using LLMs
llm_image_custom( llm_model = "qwen2.5vl", image = system.file("img/test_img.jpg", package = "kuzco"), backend = "ellmer", system_prompt = "You are a terse assistant in computer vision sentiment.", image_prompt = "return JSON describing image, do not include json or backticks", example_df = NULL, provider = "ollama", ... )

| Argument | Description |
|---|---|
| llm_model | a local LLM model either pulled from ollama or hosted |
| image | a path to a local image in jpeg, jpg, or png format |
| backend | either 'ollamar' or 'ellmer' |
| system_prompt | an overarching description of the assistant. The LLM should be told to return JSON; kuzco handles the conversions to and from JSON |
| image_prompt | anything you particularly want to remind the LLM about |
| example_df | an example data.frame showing the LLM what you want returned. Note that it will be converted to JSON for the LLM |
| provider | for backend = 'ellmer', the provider to use, such as "ollama", "claude", or "github" |
| ... | a pass-through for other generate args and model args such as temperature |

a customized return based on example_df for custom control
llm_image_custom( llm_model = "qwen2.5vl", image = system.file("img/test_img.jpg", package = "kuzco"), backend = "ellmer", system_prompt = "You are a terse assistant in computer vision sentiment.", image_prompt = "return JSON describing image, do not include json or backticks", example_df = NULL, provider = "ollama" )
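
The documented example leaves example_df as NULL. The sketch below shows one way an example data.frame might be built and passed in. The column names (`scene`, `mood`, `people_seen`), their values, and the wrapper `describe_scene()` are all hypothetical and chosen only for illustration. The kuzco call sits inside a function, so nothing is sent to a model until that function is invoked.

```r
# An example data.frame whose columns describe the fields we would like back.
# Column names and values are purely illustrative.
example_df <- data.frame(
  scene       = "beach",
  mood        = "calm",
  people_seen = 0L,
  stringsAsFactors = FALSE
)

str(example_df)

# Hypothetical wrapper: the call below only runs when describe_scene() is
# invoked, which requires kuzco and a reachable model.
describe_scene <- function(image) {
  kuzco::llm_image_custom(
    llm_model     = "qwen2.5vl",
    image         = image,
    backend       = "ellmer",
    system_prompt = "You are a terse assistant in computer vision. Return JSON.",
    image_prompt  = "return JSON describing image, do not include json or backticks",
    example_df    = example_df,
    provider      = "ollama"
  )
}
```

Keeping the example to a single row with one value per column gives the model a compact template to imitate.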
Image OCR for Text Extraction using LLMs
llm_image_extract_text( llm_model = "qwen2.5vl", image = system.file("img/text_img.jpg", package = "kuzco"), backend = "ellmer", additional_prompt = "", provider = "ollama", language = "English", ... )

| Argument | Description |
|---|---|
| llm_model | a local LLM model either pulled from ollama or hosted |
| image | a path to a local image in jpeg, jpg, or png format |
| backend | either 'ellmer' or 'ollamar'; note that 'ollamar' suggests structured outputs while 'ellmer' enforces structured outputs |
| additional_prompt | text to append to the image prompt |
| provider | for backend = 'ellmer', the provider to use, such as "ollama", "claude", or "github" |
| language | a language to guide the LLM model outputs |
| ... | a pass-through for other generate args and model args such as temperature; set the temperature to 0 for more deterministic output |

a df with text and a confidence score
llm_image_extract_text( llm_model = "qwen2.5vl", image = system.file("img/test_img.jpg", package = "kuzco"), backend = 'ellmer', additional_prompt = "", provider = "ollama", language = "English" )
Image Recognition using LLMs
llm_image_recognition( llm_model = "qwen2.5vl", image = system.file("img/test_img.jpg", package = "kuzco"), recognize_object = "face", backend = "ellmer", additional_prompt = "", provider = "ollama", language = "English", ... )

| Argument | Description |
|---|---|
| llm_model | a local LLM model either pulled from ollama or hosted |
| image | a path to a local image in jpeg, jpg, or png format |
| recognize_object | an item you want the LLM to look for |
| backend | either 'ollamar' or 'ellmer'; note that 'ollamar' suggests structured outputs while 'ellmer' enforces structured outputs |
| additional_prompt | text to append to the image prompt |
| provider | for backend = 'ellmer', the provider to use, such as "ollama", "claude", or "github" |
| language | a language to guide the LLM model outputs |
| ... | a pass-through for other generate args and model args such as temperature; set the temperature to 0 for more deterministic output |

a df with object_recognized, object_count, object_description, object_location
llm_image_recognition( llm_model = "qwen2.5vl", image = system.file("img/test_img.jpg", package = "kuzco"), recognize_object = "nose", backend = 'ellmer', additional_prompt = "", provider = "ollama", language = "English" )
Image Sentiment using LLMs
llm_image_sentiment( llm_model = "qwen2.5vl", image = system.file("img/test_img.jpg", package = "kuzco"), backend = "ellmer", additional_prompt = "", provider = "ollama", language = "English", ... )

| Argument | Description |
|---|---|
| llm_model | a local LLM model either pulled from ollama or hosted |
| image | a path to a local image in jpeg, jpg, or png format |
| backend | either 'ollamar' or 'ellmer'; note that 'ollamar' suggests structured outputs while 'ellmer' enforces structured outputs |
| additional_prompt | text to append to the image prompt |
| provider | for backend = 'ellmer', the provider to use, such as "ollama", "claude", or "github" |
| language | a language to guide the LLM model outputs |
| ... | a pass-through for other generate args and model args such as temperature; set the temperature to 0 for more deterministic output |

a df with image_sentiment, image_score, sentiment_description, image_keywords
llm_image_sentiment( llm_model = "qwen2.5vl", image = system.file("img/test_img.jpg", package = "kuzco"), backend = 'ellmer', additional_prompt = "", provider = "ollama", language = "English" )
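
Since the llm_image_* functions are documented as returning a df, results for several images can be stacked into one table. The sketch below shows one possible pattern. Both `run_over_images()` and the stand-in `fake_sentiment()` are hypothetical and not part of kuzco. The stand-in returns fixed values so the example runs without a model, and it assumes each call yields a one-row data.frame with the same columns.

```r
# Hypothetical helper (not part of kuzco): apply an image function to several
# paths and stack the results, recording which image each row came from.
run_over_images <- function(images, image_fn, ...) {
  rows <- lapply(images, function(path) {
    result <- image_fn(image = path, ...)
    data.frame(image = path, result, stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
}

# Stand-in for an llm_image_* call so the sketch runs without a model.
# The returned values are made up.
fake_sentiment <- function(image, ...) {
  data.frame(image_sentiment = "positive", image_score = 0.9)
}

batch <- run_over_images(c("a.jpg", "b.png"), fake_sentiment)
batch
```

With kuzco installed and a model available, the stand-in could be swapped for `kuzco::llm_image_sentiment`, with arguments such as `llm_model` passed through `...`.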
View Images quickly and easily
view_image(image = system.file("img/test_img.jpg", package = "kuzco"))

| Argument | Description |
|---|---|
| image | an image to view |

a plot of the image in a Plots pane
view_image(image = system.file("img/test_img.jpg", package = "kuzco"))
view llm results as a tidy great table
view_llm_results(llm_results)

| Argument | Description |
|---|---|
| llm_results | results from one of the llm_image_* functions |

a great table to view the results neatly
## Not run: view_llm_results(llm_image_alt_text()) ## End(Not run)