Setup Guide

Last updated: March 15, 2026

MAUDE runs on Linux with CUDA. It was built for the NVIDIA DGX Spark (Grace Blackwell, 128 GB unified memory) but works on any Linux machine with an NVIDIA GPU and enough VRAM to run a local model.

Step 1

Prerequisites

Make sure you have the following installed:

Linux — Ubuntu 22.04+ or similar
NVIDIA GPU — with CUDA drivers installed and nvidia-smi working
Python 3.12+
Git, cmake, make, g++ — for building llama.cpp
tmux — background services run in tmux sessions
curl — health checks during startup
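The command-line tools in this list can be checked in one pass before continuing. The following is a minimal sketch, not part of MAUDE: the helper name `missing_tools` is hypothetical, and it only confirms that each command is on your PATH, not that the version is recent enough.

```python
import shutil

# Command names taken from the prerequisites list above.
REQUIRED_TOOLS = ["nvidia-smi", "git", "cmake", "make", "g++", "tmux", "curl"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the commands from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All prerequisite commands were found on PATH.")
```

An empty result means every command resolved; anything listed needs to be installed through your distribution's package manager.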

Step 2

Clone & Create Environment

git clone https://github.com/mboard8070/terminal-llm.git
cd terminal-llm
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
playwright install chromium
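If pip appears to install packages into the wrong location, it can help to confirm that the virtual environment is active and that the interpreter meets the 3.12 minimum. This is a small sketch with hypothetical helper names (`in_virtualenv`, `python_ok`); run it with the same `python3` you used to create the venv.

```python
import sys


def in_virtualenv():
    """True when running inside a venv.

    Inside a venv, sys.prefix points at the venv directory while
    sys.base_prefix still points at the base Python installation.
    """
    return sys.prefix != sys.base_prefix


def python_ok(minimum=(3, 12)):
    """True when the running interpreter is at least `minimum`."""
    return sys.version_info >= minimum


if __name__ == "__main__":
    print("virtualenv active:", in_virtualenv())
    print("Python 3.12+:", python_ok())
```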

Step 3

Build llama.cpp & Download Model

The setup script clones llama.cpp, builds it with CUDA support, and downloads the Nemotron 30B model (~38 GB).

./setup_local.sh

This can take a while: the model download alone is about 38 GB, and cmake may need to compile CUDA kernels for your GPU architecture.
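Because the model download is roughly 38 GB, it may be worth confirming free disk space before running the script. The sketch below makes two assumptions: that the model is stored on the same filesystem as the path you pass in (where `setup_local.sh` writes it is not stated here), and that 10 GB of extra headroom is enough for the llama.cpp build, which is an arbitrary figure.

```python
import shutil

MODEL_SIZE_GB = 38  # approximate download size quoted in this guide


def has_room(path=".", needed_gb=MODEL_SIZE_GB, headroom_gb=10):
    """True if the filesystem holding `path` has needed_gb + headroom_gb free."""
    free_gb = shutil.disk_usage(path).free / 1e9
    return free_gb >= needed_gb + headroom_gb


if __name__ == "__main__":
    print("Enough free space:", has_room("."))
```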

Step 4 (Optional)

Vision Model

If you want MAUDE to analyze images and screenshots, install Ollama and pull the LLaVA vision model:

curl -fsSL https://ollama.com/install.sh | sh
ollama pull llava:13b
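To confirm the pull succeeded, you can ask the local Ollama server which models it has. This sketch assumes Ollama is listening on its default address, `http://localhost:11434`, and uses its `/api/tags` endpoint, which lists local models; the helper names are hypothetical.

```python
import json
import urllib.request

OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"  # default Ollama address


def model_names(payload):
    """Extract model names from an /api/tags response body."""
    return [model["name"] for model in payload.get("models", [])]


def has_model(payload, name="llava:13b"):
    """True if `name` appears among the locally available models."""
    return name in model_names(payload)


if __name__ == "__main__":
    try:
        with urllib.request.urlopen(OLLAMA_TAGS_URL, timeout=3) as resp:
            payload = json.load(resp)
    except OSError as exc:  # URLError is a subclass of OSError
        print("Ollama is not reachable:", exc)
    else:
        print("llava:13b available:", has_model(payload))
```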

Step 5

API Keys

Create a variables.env file in the project root. You only need the keys for the providers you plan to use:

# Required for cloud models (at least one)
MISTRAL_API_KEY=your-key-here        # console.mistral.ai
CODESTRAL_API_KEY=your-key-here      # same Mistral console (separate key)

# Optional
CLAUDE_API_KEY=your-key-here         # console.anthropic.com
OPEN_ROUTER_API_KEY=your-key-here    # openrouter.ai/keys

Tip: You can also set keys from inside MAUDE with /keys set mistral YOUR_KEY. Keys are stored in ~/.config/maude/keys.json.
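A quick way to see which providers are actually configured is to parse the file and ignore any `your-key-here` placeholders. This is an illustrative sketch only: how MAUDE itself reads variables.env is not documented here, so the parsing rules below (in particular, stripping trailing `# ...` comments) are assumptions.

```python
from pathlib import Path

KEY_NAMES = [
    "MISTRAL_API_KEY",
    "CODESTRAL_API_KEY",
    "CLAUDE_API_KEY",
    "OPEN_ROUTER_API_KEY",
]
PLACEHOLDER = "your-key-here"


def parse_env(text):
    """Parse KEY=value lines, skipping blanks and comment lines."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        # Assumption: anything after " #" is a trailing comment, not key data.
        values[name.strip()] = value.split(" #", 1)[0].strip()
    return values


def configured_keys(text):
    """Return the known key names that hold a real (non-placeholder) value."""
    values = parse_env(text)
    return [k for k in KEY_NAMES if values.get(k) and values[k] != PLACEHOLDER]


if __name__ == "__main__":
    env_file = Path("variables.env")
    if env_file.is_file():
        print("Configured:", configured_keys(env_file.read_text()))
    else:
        print("variables.env not found in the current directory")
```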

Step 6 (Optional)

Remote Access with Tailscale

If you want to access MAUDE from other machines (Mac, PC, phone), install Tailscale on both the server and your client devices:

curl -fsSL https://tailscale.com/install.sh | sh
sudo tailscale up

For HTTPS access (required by the mobile app), generate SSL certificates and place them at certs/cert.pem and certs/key.pem. The gateway auto-enables HTTPS when these files exist.
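The rule described above (HTTPS is enabled when both files exist) can be mirrored in a few lines to predict what the gateway will do at startup. The function name is hypothetical, and the check only tests for file presence; it does not validate the certificate or key contents.

```python
from pathlib import Path


def https_expected(project_root="."):
    """True when both certs/cert.pem and certs/key.pem exist under project_root."""
    certs = Path(project_root) / "certs"
    return (certs / "cert.pem").is_file() and (certs / "key.pem").is_file()


if __name__ == "__main__":
    print("Gateway should enable HTTPS:", https_expected("."))
```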

Step 7

Launch

The single launcher starts all services and drops you into the MAUDE chat:

./maude

This starts:

Nemotron — local inference on port 30010
Gateway — model routing, tool execution, SSE streaming on ports 30000/30080
File Server — file transfers on port 30002
PersonaPlex — voice server on port 8998
MAUDE Chat — the terminal interface (runs in foreground)

# Other commands
./maude stop      # Stop all services
./maude status    # Show what's running
./maude -n        # Start services only (no TUI)
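If a service does not seem to respond, a plain TCP probe of the ports listed above shows which ones are accepting connections. This sketch is independent of `./maude status` and assumes the services listen on the local machine; it only tests that a port is open, not that the service behind it is healthy.

```python
import socket

# Ports as listed in this guide; labels are for display only.
SERVICES = [
    ("Nemotron", 30010),
    ("Gateway", 30000),
    ("Gateway", 30080),
    ("File Server", 30002),
    ("PersonaPlex", 8998),
]


def port_open(host, port, timeout=1.0):
    """True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for name, port in SERVICES:
        state = "up" if port_open("127.0.0.1", port) else "down"
        print(f"{name:12} port {port:<6} {state}")
```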

Step 8 (Optional)

Client Install (Mac / PC / Linux)

The lightweight CLI client connects to your MAUDE server over Tailscale. Install it with pip:

pip install maude-client
maude-client --server http://<your-tailscale-ip>:30080

The client supports all models, tool execution traces, conversation history, and cross-machine task dispatch.
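When filling in `<your-tailscale-ip>`, a common mistake is to paste a LAN address instead. Tailscale assigns IPv4 addresses from the 100.64.0.0/10 range, so a small check can catch that before you launch the client. The helper below is a hypothetical sketch; it does not apply if you connect by hostname rather than by IPv4 address.

```python
import ipaddress

TAILSCALE_NET = ipaddress.ip_network("100.64.0.0/10")


def server_url(ip, port=30080):
    """Build the --server URL, rejecting addresses outside the Tailscale range."""
    addr = ipaddress.ip_address(ip)  # raises ValueError on malformed input
    if addr not in TAILSCALE_NET:
        raise ValueError(f"{ip} is not in the Tailscale range {TAILSCALE_NET}")
    return f"http://{addr}:{port}"


if __name__ == "__main__":
    print(server_url("100.101.102.103"))
```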

Step 9 (Optional)

Google Workspace

To use Gmail, Drive, Sheets, Calendar, Contacts, Slides, and YouTube tools, you need a Google Cloud project with OAuth 2.0 credentials:

Create a project at console.cloud.google.com
Enable the Gmail, Drive, Sheets, Calendar, Slides, People (used for Contacts), and YouTube APIs
Create OAuth 2.0 credentials (Desktop application type)
Download credentials.json to the project root
On first use, MAUDE will open a browser for the OAuth consent flow
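A frequent cause of a failed consent flow is downloading credentials of the wrong type. Google's client secrets file for a Desktop application has a top-level `installed` key, while a Web application file has `web` instead. The sketch below checks which one you have; the function name is hypothetical, and whether MAUDE performs any similar check is not stated here.

```python
import json
from pathlib import Path


def credentials_kind(text):
    """Return "installed" (Desktop), "web", or None for a client secrets JSON string."""
    data = json.loads(text)
    for kind in ("installed", "web"):
        if kind in data:
            return kind
    return None


if __name__ == "__main__":
    path = Path("credentials.json")
    if not path.is_file():
        print("credentials.json not found in the current directory")
    else:
        kind = credentials_kind(path.read_text())
        print("OK (Desktop application)" if kind == "installed" else f"Unexpected type: {kind}")
```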

Verification

Verify Everything Works

Once MAUDE is running, try the following commands and prompts in the chat:

# Switch to the Mistral model
/model switch mistral

# Test tool execution
what files are in my home directory?

# Test web search
search for the latest NVIDIA news

# Check system status (run this in a separate shell, not in the chat)
./maude status

You should see tool call traces (╭─ tool_name / ╰─ result) as MAUDE executes commands on your behalf.