Using Local Models with Sortio

Learn how to set up local AI models with Sortio using Ollama or LM Studio.

Choosing a Local Engine

Sortio can run AI models entirely on your own machine. In Settings → AI, the provider choices are Sortio Cloud, Local Model, and Bring Your Own Key. When you choose Local Model, Sortio asks which engine you want to use:

Ollama

Command-line tool

A lightweight, terminal-based way to pull and run models. Best if you're comfortable with a quick command or two and want the simplest possible setup.

LM Studio

Desktop app with a model browser

A full graphical app for finding, downloading, and loading open-weight models with no terminal required. Best if you prefer a point-and-click experience.

Which should I use?

Both run models locally with identical privacy and offline behavior, both work with every Sortio feature, and with either engine nothing counts against your Sortio AI allowance. Choose Ollama if you like a simple command-line setup, or LM Studio if you'd rather browse and manage models in a desktop app. The Ollama sections below come first, followed by an LM Studio setup guide.

Installing Ollama

Sortio can use Ollama to run AI models locally on your computer. Follow these steps to install Ollama:

Installation Steps

Download Ollama

Visit ollama.com/download and select your platform (macOS, Windows, or Linux).

Install and Run

Follow the installation instructions for your operating system.

Verify Installation

Once installed, Ollama runs in the background, ready to serve models. To confirm, run ollama --version in a terminal; it should print the installed version.

Installing a Model

After installing Ollama, you'll need to download a model. Choose one based on your computer's capabilities:

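Recommended: Llama 3.3

Best performance for most file organization tasks. Llama 3.3 is a 70B-parameter model, so expect a large download and high memory use; check Hardware Requirements below before pulling it.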

Terminal
$ ollama pull llama3.3

Alternative: DeepSeek-R1

Another excellent option for file organization

Terminal
$ ollama pull deepseek-r1

For Limited Resources

If your computer has less RAM or processing power

Terminal
$ ollama pull llama3.2

Using with Sortio

Once you've installed Ollama and a model, connecting it with Sortio is simple:

1. Start Ollama

Ensure Ollama is running in the background on your computer.

2. Open Settings

In Sortio, go to Settings → AI, choose Local Model, then select Ollama as the engine.

3. Select Model

Choose your installed model from the dropdown menu.

Enhanced Privacy

Using local models ensures your files never leave your computer. All AI processing happens locally, providing complete privacy and offline functionality.

Using LM Studio

LM Studio is a free desktop app that runs open-weight models locally and exposes an OpenAI-compatible local server. If you'd rather browse and manage models in a graphical app instead of the command line, choose LM Studio as your Local Model engine in Sortio. The privacy story is identical to Ollama: everything runs on your machine, nothing is sent to Sortio or any cloud, nothing counts against your Sortio allowance, and it works fully offline. All downstream features (sorting, Spaces, and Automations/rules) work the same way.

Setting Up LM Studio

Install LM Studio

Download LM Studio from lmstudio.ai and install it for your platform (macOS, Windows, or Linux).

Download and Load a Model

Use the in-app model browser to download a chat model, then load it so it's ready to serve. Any chat model loaded in LM Studio works with Sortio.

Start the Local Server

Open the Developer tab in LM Studio and start its local server. By default it runs at http://localhost:1234.

Select LM Studio in Sortio

In Sortio go to Settings → AI → Local Model → LM Studio. Sortio auto-detects the running server and lists your loaded models. Pick the model you want to use.

Custom Server URL

If you run LM Studio's server on a non-default port, use the editable Server URL field on the LM Studio screen to point Sortio at the correct address. Most setups can leave it on the default http://localhost:1234.
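
If Sortio does not list any models, it can help to ask the LM Studio server directly what it is exposing. The sketch below assumes the server speaks the OpenAI-compatible API described above and therefore answers on /v1/models; the LMSTUDIO_URL variable and the models_endpoint helper are names made up for this example, not part of LM Studio or Sortio.

```shell
# Base URL of LM Studio's local server (assumed default from this guide).
# Override it if you changed the port, e.g. LMSTUDIO_URL=http://localhost:4321
LMSTUDIO_URL="${LMSTUDIO_URL:-http://localhost:1234}"

# Hypothetical helper: build the OpenAI-compatible "list models" endpoint
# from a base URL, tolerating a trailing slash.
models_endpoint() {
  printf '%s/v1/models\n' "${1%/}"
}

# Ask the server which models it exposes; print a hint if nothing answers.
curl -s --max-time 5 "$(models_endpoint "$LMSTUDIO_URL")" \
  || echo "No server answered at $LMSTUDIO_URL"
```

If the hint is printed instead of JSON, the server is probably not started or is listening on a different port; the address that works here is the one to enter in the Server URL field.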

Tip: Picking a Model

Smaller instruct models (for example Qwen2.5-7B-Instruct) tend to be faster and respond well for file organization. Large "reasoning" models are more capable but noticeably slower. Any chat model loaded in LM Studio will work, so you can experiment to find the right balance of speed and accuracy for your hardware.

Performance & Reliability

Sortio includes several features to ensure local model processing is fast and reliable:

Intelligent Batching

When sorting large numbers of files, Sortio automatically batches requests to your local model. This optimizes memory usage and ensures consistent performance even with hundreds of files.
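
To make the idea of batching concrete, here is a minimal sketch that splits a list of file names into fixed-size groups. It only illustrates the general technique; Sortio's actual batch size and logic are not documented here, and batch_files is a hypothetical helper name.

```shell
# Hypothetical helper: print the given file names in groups of SIZE,
# one group per line. Usage: batch_files SIZE FILE...
batch_files() {
  size=$1
  shift
  # Refuse a batch size below 1, which would never make progress.
  [ "$size" -ge 1 ] || return 1
  while [ "$#" -gt 0 ]; do
    batch=""
    i=0
    # Take up to SIZE names off the front of the argument list.
    while [ "$i" -lt "$size" ] && [ "$#" -gt 0 ]; do
      batch="${batch:+$batch }$1"
      shift
      i=$((i + 1))
    done
    printf '%s\n' "$batch"
  done
}

# Example: five files in batches of two give three lines,
# the last one holding the single leftover file.
batch_files 2 a.pdf b.jpg c.txt d.png e.doc
```

Each printed line stands for one request's worth of files, so the model only ever sees a bounded amount of input at a time.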

Automatic Retry Logic

If a request to Ollama fails due to a temporary issue (network hiccup, model busy, etc.), Sortio automatically retries with exponential backoff. This makes sorting more resilient without requiring manual intervention.
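
Exponential backoff simply means waiting twice as long after each failed attempt. The following is a generic sketch of that pattern, not Sortio's implementation; the retry_with_backoff name, the attempt limit, and the starting delay are all assumptions chosen for illustration.

```shell
# Hypothetical helper: retry a command, doubling the wait after each failure.
# Usage: retry_with_backoff MAX_ATTEMPTS BASE_DELAY_SECONDS COMMAND [ARGS...]
retry_with_backoff() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  while true; do
    "$@" && return 0              # success: stop retrying
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    sleep "$delay"
    delay=$((delay * 2))          # double the wait before the next try
    attempt=$((attempt + 1))
  done
}

# Example (not run here): up to 4 tries, waiting 1s, 2s, then 4s in between.
#   retry_with_backoff 4 1 curl -sf http://localhost:11434/api/tags
```

Doubling the delay gives a busy model time to finish its current work instead of being hit with immediate repeat requests.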

Clear Error Messages

When something goes wrong, Sortio provides actionable error messages that help you understand the issue. Whether Ollama isn't running, the model isn't loaded, or there's a configuration problem, you'll know exactly what to fix.

Tip: Model Performance

For the best balance of speed and accuracy, we recommend llama3.3 or deepseek-r1 models. If you're sorting very large batches of files, smaller models like llama3.2 may complete faster while still providing good results.

Troubleshooting Local Models

If you're experiencing issues with local models, use these diagnostic commands to identify and solve the problem.

Diagnostic Commands

Run these commands in your Terminal (macOS/Linux) or Command Prompt (Windows) to check if Ollama is working correctly.

1. Check if Ollama is Running

Terminal
$ curl http://localhost:11434/api/tags

Success: Returns JSON with your installed models

Failed: "Connection refused" means Ollama isn't running

2. List Installed Models

Terminal
$ ollama list

Success: Shows table of installed models with sizes

Empty: No models installed - run ollama pull llama3.2

3. Test Model Response

Terminal
$ ollama run llama3.2 "Say hello"

Success: Model responds with a greeting

Failed: Model may need to be re-downloaded

4. Start Ollama (if not running)

Terminal
$ ollama serve

This starts the Ollama server. Keep this terminal window open, or use the Ollama app which runs in the background.
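
If you would rather run the first checks in one go, the steps above can be combined into a short script. This is a convenience sketch for macOS/Linux shells; the OLLAMA_URL variable and the ollama_status helper are names invented for this example, and the address is the default used in step 1.

```shell
# Address of the Ollama API (the default used in step 1 above).
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"

# Hypothetical helper: print "running" if the API answers at the given
# base URL within 3 seconds, otherwise "not running".
ollama_status() {
  if curl -sf --max-time 3 "$1/api/tags" >/dev/null 2>&1; then
    echo "running"
  else
    echo "not running"
  fi
}

status=$(ollama_status "$OLLAMA_URL")
echo "Ollama at $OLLAMA_URL: $status"

# Only list models when the server is up; otherwise point at step 4.
if [ "$status" = "running" ]; then
  ollama list
else
  echo "Start it with: ollama serve"
fi
```

If the script reports "running" but the model list is empty, pull a model as described in Installing a Model and run it again.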

Hardware Requirements

  • Memory: 8GB RAM minimum, 16GB+ recommended for larger models
  • GPU: Dedicated GPU recommended for optimal performance
  • Storage: At least 10GB free space for model files
  • CPU: Modern multi-core processor (4+ cores recommended)

Recommended Models by Hardware

  • 8GB RAM: llama3.2:1b, llama3.2:3b, deepseek-r1:1.5b
  • 16GB RAM: llama3.2, deepseek-r1:8b
  • 24GB+ VRAM: llama3.3:70b, deepseek-r1:70b
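
As a quick way to apply the list above, the sketch below maps a memory size to one of the listed model tags. The suggest_model helper is hypothetical, it takes the RAM figure as an argument rather than detecting it, and it deliberately leaves out the 70b tier because that one depends on GPU memory rather than system RAM.

```shell
# Hypothetical helper: suggest a model tag from the list above,
# given system RAM in whole gigabytes. Usage: suggest_model RAM_GB
suggest_model() {
  ram_gb=$1
  if [ "$ram_gb" -ge 16 ]; then
    echo "deepseek-r1:8b"     # 16GB RAM tier
  else
    echo "llama3.2:3b"        # 8GB RAM tier
  fi
}

# Example (not run here): pull whatever fits a 16 GB machine.
#   ollama pull "$(suggest_model 16)"
suggest_model 8
```

Treat the result as a starting point; if responses are slow, step down to a smaller tag from the same list.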

Common Issues

The diagnostic commands above can help identify common issues. Here are additional troubleshooting steps for specific problems:

  • Ollama not starting: Check your system resources; Ollama requires at least 8GB of RAM to function properly.
  • Very slow responses: Your model may be too large for your hardware. Try a smaller model such as llama3.2:1b instead.
  • Permission errors: Make sure Sortio has network permissions to access localhost in your system's security settings.

Still Having Issues?

If you're still experiencing problems after running diagnostics and following the troubleshooting steps, contact our support team at marcus@getsortio.com.