Initial commit: llama.cpp Server GUI

A professional PyQt6-based GUI for managing llama.cpp server instances.

Features:
- Server binary and model file selection
- Comprehensive server options (host, port, context, GPU layers, etc.)
- Start/Stop controls with non-blocking operations
- Real-time server log viewer
- Profile management (save/load/delete configurations)
- Configuration persistence
- System tray support
- Auto-start option

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
coremaven
2025-12-12 19:00:43 -05:00
commit 0356871946
4 changed files with 923 additions and 0 deletions

.gitignore (vendored, new file, 23 lines)

@@ -0,0 +1,23 @@
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
*.egg-info/
dist/
build/
# User configuration (don't upload personal settings)
.llama_server_gui_config.json
# IDE
.vscode/
.idea/
*.swp
*.swo
*~
# OS
.DS_Store
Thumbs.db

README.md (new file, 119 lines)

@@ -0,0 +1,119 @@
# llama.cpp Server GUI
A professional PyQt6-based graphical interface for managing llama.cpp server instances.
## Features
- **Server Binary Selection**: Browse and select your llama.cpp server binary
- **Model Selection**: Easy selection of GGUF model files
- **Comprehensive Options**: Configure host, port, context length, GPU layers, threads, batch size, and more
- **Start/Stop Controls**: Simple buttons to start and stop the server
- **Real-time Logs**: View server output and errors in real-time
- **Profile Management**: Save and load different configurations for different models/use cases
- **Configuration Persistence**: All settings are saved between sessions
- **System Tray Icon**: Minimize to tray to keep the server running in the background
- **Auto-start**: Option to automatically start the server when the GUI launches
## Requirements
- Python 3
- PyQt6
- llama.cpp server binary
## Installation
1. Install PyQt6:
```bash
sudo apt install python3-pyqt6
```
2. Make sure you have llama.cpp compiled with the server binary
## Usage
Run the application:
```bash
./llama_server_gui.py
```
Or:
```bash
python3 llama_server_gui.py
```
## Quick Start
1. **Select Server Binary**: Click "Browse..." in the "Server Binary" section and navigate to your llama.cpp server binary (e.g., `/home/xero110/dev/llama.cpp/build/bin/llama-server`)
2. **Select Model**: Click "Browse..." in the "Model Selection" section and choose your GGUF model file
3. **Configure Options**: Adjust the server options as needed:
- Host: IP address to bind to (default: 127.0.0.1)
- Port: Port number (default: 8080)
- Context Length: Maximum context size (default: 2048)
- GPU Layers (ngl): Number of layers to offload to GPU (default: 33)
- Threads: CPU threads to use (default: 8)
- Batch Size: Batch size for processing (default: 512)
- Additional Arguments: Any extra command-line arguments
4. **Start Server**: Click "Start Server"
5. **Save Profile**: Once you have a configuration you like, click "Save Profile" to save it for later use
## Profile Management
- **Save Profile**: Saves the current configuration with a custom name
- **Load Profile**: Select a profile from the dropdown and click "Load" to load its settings (profiles also auto-load when selected from dropdown)
- **Delete Profile**: Removes the selected profile
- **Auto-start**: Check this option to automatically start the server when the GUI launches
The GUI includes detailed logging in the log viewer at the bottom, showing when profiles are saved or loaded and which settings are being applied.
## System Tray
The application includes a system tray icon that allows you to:
- Show/hide the main window
- Start/stop the server from the tray menu
- Quit the application
When you close the window while the server is running, you can choose to:
- Minimize to tray (server keeps running)
- Stop server and quit
- Cancel the close operation
## Configuration File
Settings are stored in `~/.llama_server_gui_config.json`
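
The file is plain JSON with two top-level keys, `profiles` and `last_profile` (see `save_config` and `get_current_settings` in `llama_server_gui.py`). A sketch of the layout; the profile name and paths here are placeholders:

```python
import json

# Example of the structure the GUI writes to ~/.llama_server_gui_config.json.
# Keys mirror get_current_settings(); "Llama-3-8B" is a hypothetical profile.
example_config = {
    "last_profile": "Llama-3-8B",
    "profiles": {
        "Llama-3-8B": {
            "binary_path": "/path/to/llama-server",
            "model_path": "/path/to/model.gguf",
            "host": "127.0.0.1",
            "port": 8080,
            "context": 2048,
            "ngl": 33,
            "threads": 8,
            "batch": 512,
            "additional_args": "",
            "auto_start": False,
        }
    },
}

print(json.dumps(example_config, indent=2))
```

`inspect_config.py` in this repository pretty-prints the real file.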
## Common Server Options Explained
- **Context Length (-c)**: Maximum number of tokens the model can process at once. Larger values use more RAM/VRAM.
- **GPU Layers (-ngl)**: Number of model layers to offload to GPU. Higher = faster but uses more VRAM. Set to -1 for all layers.
- **Threads (-t)**: Number of CPU threads for processing. Usually set to your CPU core count or less.
- **Batch Size (-b)**: Number of tokens processed in parallel. Larger = faster but uses more memory.
- **Host**: Network interface to bind to. Use 127.0.0.1 for local-only access, or 0.0.0.0 to allow network access.
- **Port**: Network port for the server API.
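
Each option above maps onto a `llama-server` flag, assembled the way `start_server` in `llama_server_gui.py` does it. A minimal sketch of that mapping (paths and values are placeholders):

```python
def build_command(binary_path, model_path, settings):
    """Assemble the llama-server argument list the way the GUI does."""
    cmd = [
        binary_path,
        "-m", model_path,
        "--host", settings["host"],
        "--port", str(settings["port"]),
        "-c", str(settings["context"]),
        "-ngl", str(settings["ngl"]),
        "-t", str(settings["threads"]),
        "-b", str(settings["batch"]),
    ]
    # "Additional Arguments" are split on whitespace and appended verbatim
    if settings.get("additional_args"):
        cmd.extend(settings["additional_args"].split())
    return cmd

cmd = build_command(
    "/path/to/llama-server", "/path/to/model.gguf",
    {"host": "127.0.0.1", "port": 8080, "context": 2048,
     "ngl": 33, "threads": 8, "batch": 512, "additional_args": "--mlock"},
)
print(" ".join(cmd))
```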
## Tips
- For RTX 4070 (8GB VRAM): Start with ngl=33 and adjust based on your model size
- With 96GB RAM and i9 CPU: You can use high thread counts (16-24) and large context sizes
- Create different profiles for different models (e.g., "Llama-3-8B", "Mistral-7B", etc.)
- Use the system tray to keep the server running while working on other tasks
## Troubleshooting
**Server won't start:**
- Check that the binary path is correct and the file is executable
- Verify the model path is correct
- Check the logs for error messages
**Out of memory errors:**
- Reduce context length
- Reduce GPU layers (ngl)
- Use a smaller model
**Slow performance:**
- Increase GPU layers if you have VRAM available
- Adjust thread count
- Increase batch size (if you have memory available)
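
To verify the server is actually reachable (as opposed to merely started), you can poll its HTTP endpoint. A minimal sketch, assuming a recent llama.cpp build that exposes a `/health` route; older builds may not have it:

```python
from urllib.request import urlopen
from urllib.error import URLError

def health_url(host="127.0.0.1", port=8080):
    # Endpoint name is an assumption: recent llama-server builds expose /health
    return f"http://{host}:{port}/health"

def server_is_up(host="127.0.0.1", port=8080, timeout=2.0):
    """Return True if llama-server answers its health check."""
    try:
        with urlopen(health_url(host, port), timeout=timeout) as resp:
            return resp.status == 200
    except (URLError, OSError):
        # Connection refused, timeout, or DNS failure -> not reachable
        return False

if __name__ == "__main__":
    print("server up:", server_is_up())
```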

inspect_config.py (executable, new file, 30 lines)

@@ -0,0 +1,30 @@
#!/usr/bin/env python3
"""
Simple script to inspect the saved configuration file
"""
import json
from pathlib import Path

config_file = Path.home() / ".llama_server_gui_config.json"

if config_file.exists():
    with open(config_file, 'r') as f:
        config = json.load(f)
    print("=== Configuration File Contents ===")
    print(f"File: {config_file}")
    print(f"\nLast Profile: {config.get('last_profile', 'None')}")
    print(f"\nNumber of Profiles: {len(config.get('profiles', {}))}")
    profiles = config.get('profiles', {})
    if profiles:
        print("\n=== Profiles ===")
        for name, settings in profiles.items():
            print(f"\nProfile: {name}")
            for key, value in settings.items():
                print(f"  {key}: {value}")
    else:
        print("\nNo profiles saved yet")
else:
    print(f"Configuration file not found: {config_file}")

llama_server_gui.py (executable, new file, 751 lines)

@@ -0,0 +1,751 @@
#!/usr/bin/env python3
"""
llama.cpp Server GUI
A graphical interface for managing llama.cpp server instances
"""
import sys
import os
import json
import subprocess
from pathlib import Path

from PyQt6.QtWidgets import (
    QApplication, QMainWindow, QWidget, QVBoxLayout, QHBoxLayout,
    QPushButton, QLabel, QLineEdit, QTextEdit, QFileDialog,
    QGroupBox, QSpinBox, QComboBox, QCheckBox, QMessageBox,
    QSystemTrayIcon, QMenu
)
from PyQt6.QtCore import QThread, pyqtSignal, QTimer, Qt
from PyQt6.QtGui import QIcon, QAction
class ServerOutputReader(QThread):
    """Thread to read server output without blocking the GUI"""
    output_received = pyqtSignal(str)

    def __init__(self, process):
        super().__init__()
        self.process = process
        self.running = True

    def run(self):
        """Read output from the process"""
        import select
        while self.running:
            if self.process.poll() is not None:
                # Process has terminated, read any remaining output
                self.read_remaining_output()
                break
            # Use select to check if there's data available (non-blocking)
            try:
                readable, _, _ = select.select([self.process.stdout, self.process.stderr], [], [], 0.1)
                if self.process.stdout in readable:
                    output = self.process.stdout.readline()
                    if output:
                        self.output_received.emit(output.strip())
                if self.process.stderr in readable:
                    error = self.process.stderr.readline()
                    if error:
                        # Don't prefix with [ERROR] - llama.cpp uses stderr for normal logging
                        self.output_received.emit(error.strip())
            except (ValueError, OSError):
                # File descriptor closed
                break

    def read_remaining_output(self):
        """Read any remaining output after process termination"""
        try:
            # Read remaining stdout
            for line in self.process.stdout:
                if line:
                    self.output_received.emit(line.strip())
            # Read remaining stderr
            for line in self.process.stderr:
                if line:
                    # Don't prefix with [ERROR] - llama.cpp uses stderr for normal logging
                    self.output_received.emit(line.strip())
        except (ValueError, OSError):
            pass

    def stop(self):
        """Stop reading output"""
        self.running = False
class LlamaServerGUI(QMainWindow):
    def __init__(self):
        super().__init__()
        self.config_file = Path.home() / ".llama_server_gui_config.json"
        self.server_process = None
        self.output_reader = None
        self.config = self.load_config()
        self.init_ui()
        self.load_last_profile()
        # Auto-start if enabled
        if self.auto_start_checkbox.isChecked():
            QTimer.singleShot(500, self.start_server)

    def init_ui(self):
        """Initialize the user interface"""
        self.setWindowTitle("llama.cpp Server Manager")
        self.setMinimumSize(900, 700)
        # Central widget
        central_widget = QWidget()
        self.setCentralWidget(central_widget)
        main_layout = QVBoxLayout(central_widget)
        # Profile management
        profile_group = self.create_profile_section()
        main_layout.addWidget(profile_group)
        # Section 1: Server binary selection
        binary_group = self.create_binary_section()
        main_layout.addWidget(binary_group)
        # Section 2: Model selection
        model_group = self.create_model_section()
        main_layout.addWidget(model_group)
        # Section 3: Server options
        options_group = self.create_options_section()
        main_layout.addWidget(options_group)
        # Control buttons
        control_layout = self.create_control_buttons()
        main_layout.addLayout(control_layout)
        # Log viewer
        log_group = self.create_log_section()
        main_layout.addWidget(log_group)
        # System tray
        self.create_system_tray()
        self.update_button_states()
    def create_profile_section(self):
        """Create profile management section"""
        group = QGroupBox("Profile Management")
        layout = QHBoxLayout()
        layout.addWidget(QLabel("Profile:"))
        self.profile_combo = QComboBox()
        self.profile_combo.setMinimumWidth(200)
        self.profile_combo.currentTextChanged.connect(self.on_profile_selected)
        layout.addWidget(self.profile_combo)
        load_btn = QPushButton("Load")
        load_btn.clicked.connect(self.load_selected_profile)
        load_btn.setMaximumWidth(60)
        layout.addWidget(load_btn)
        save_btn = QPushButton("Save")
        save_btn.clicked.connect(self.save_current_profile)
        save_btn.setMaximumWidth(60)
        layout.addWidget(save_btn)
        delete_btn = QPushButton("Delete")
        delete_btn.clicked.connect(self.delete_profile)
        delete_btn.setMaximumWidth(80)
        layout.addWidget(delete_btn)
        layout.addStretch()
        self.auto_start_checkbox = QCheckBox("Auto-start on launch")
        layout.addWidget(self.auto_start_checkbox)
        group.setLayout(layout)
        self.update_profile_list()
        return group

    def create_binary_section(self):
        """Create server binary selection section"""
        group = QGroupBox("1. Server Binary")
        layout = QHBoxLayout()
        self.binary_path_edit = QLineEdit()
        self.binary_path_edit.setPlaceholderText("Path to llama.cpp server binary (e.g., llama-server)")
        layout.addWidget(self.binary_path_edit)
        browse_btn = QPushButton("Browse...")
        browse_btn.clicked.connect(self.browse_binary)
        layout.addWidget(browse_btn)
        group.setLayout(layout)
        return group

    def create_model_section(self):
        """Create model selection section"""
        group = QGroupBox("2. Model Selection")
        layout = QHBoxLayout()
        self.model_path_edit = QLineEdit()
        self.model_path_edit.setPlaceholderText("Path to model file (e.g., model.gguf)")
        layout.addWidget(self.model_path_edit)
        browse_btn = QPushButton("Browse...")
        browse_btn.clicked.connect(self.browse_model)
        layout.addWidget(browse_btn)
        group.setLayout(layout)
        return group
    def create_options_section(self):
        """Create server options section"""
        group = QGroupBox("3. Server Options")
        layout = QVBoxLayout()
        # Row 1: Host and Port
        row1 = QHBoxLayout()
        row1.addWidget(QLabel("Host:"))
        self.host_edit = QLineEdit("127.0.0.1")
        self.host_edit.setMaximumWidth(150)
        row1.addWidget(self.host_edit)
        row1.addWidget(QLabel("Port:"))
        self.port_spin = QSpinBox()
        self.port_spin.setRange(1, 65535)
        self.port_spin.setValue(8080)
        self.port_spin.setMaximumWidth(100)
        row1.addWidget(self.port_spin)
        row1.addStretch()
        layout.addLayout(row1)
        # Row 2: Context length and GPU layers
        row2 = QHBoxLayout()
        row2.addWidget(QLabel("Context Length:"))
        self.context_spin = QSpinBox()
        self.context_spin.setRange(128, 1048576)
        self.context_spin.setValue(2048)
        self.context_spin.setSingleStep(512)
        self.context_spin.setMaximumWidth(100)
        row2.addWidget(self.context_spin)
        row2.addWidget(QLabel("GPU Layers (ngl):"))
        self.ngl_spin = QSpinBox()
        self.ngl_spin.setRange(-1, 999)
        self.ngl_spin.setValue(33)
        self.ngl_spin.setMaximumWidth(100)
        row2.addWidget(self.ngl_spin)
        row2.addStretch()
        layout.addLayout(row2)
        # Row 3: Threads and batch size
        row3 = QHBoxLayout()
        row3.addWidget(QLabel("Threads:"))
        self.threads_spin = QSpinBox()
        self.threads_spin.setRange(1, 256)
        self.threads_spin.setValue(8)
        self.threads_spin.setMaximumWidth(100)
        row3.addWidget(self.threads_spin)
        row3.addWidget(QLabel("Batch Size:"))
        self.batch_spin = QSpinBox()
        self.batch_spin.setRange(1, 2048)
        self.batch_spin.setValue(512)
        self.batch_spin.setMaximumWidth(100)
        row3.addWidget(self.batch_spin)
        row3.addStretch()
        layout.addLayout(row3)
        # Row 4: Additional options
        row4 = QHBoxLayout()
        row4.addWidget(QLabel("Additional Arguments:"))
        self.additional_args_edit = QLineEdit()
        self.additional_args_edit.setPlaceholderText("e.g., --numa --mlock")
        row4.addWidget(self.additional_args_edit)
        layout.addLayout(row4)
        group.setLayout(layout)
        return group
    def create_control_buttons(self):
        """Create start/stop control buttons"""
        layout = QHBoxLayout()
        self.start_btn = QPushButton("Start Server")
        self.start_btn.clicked.connect(self.start_server)
        self.start_btn.setMinimumHeight(40)
        layout.addWidget(self.start_btn)
        self.stop_btn = QPushButton("Stop Server")
        self.stop_btn.clicked.connect(self.stop_server)
        self.stop_btn.setMinimumHeight(40)
        layout.addWidget(self.stop_btn)
        return layout

    def create_log_section(self):
        """Create log viewer section"""
        group = QGroupBox("Server Logs")
        layout = QVBoxLayout()
        self.log_text = QTextEdit()
        self.log_text.setReadOnly(True)
        self.log_text.setMinimumHeight(200)
        layout.addWidget(self.log_text)
        clear_btn = QPushButton("Clear Logs")
        clear_btn.clicked.connect(self.log_text.clear)
        layout.addWidget(clear_btn)
        group.setLayout(layout)
        return group
    def create_system_tray(self):
        """Create system tray icon"""
        self.tray_icon = QSystemTrayIcon(self)
        # Try to use a default icon, fallback if not available
        icon = QApplication.style().standardIcon(QApplication.style().StandardPixmap.SP_ComputerIcon)
        self.tray_icon.setIcon(icon)
        # Tray menu
        tray_menu = QMenu()
        show_action = QAction("Show", self)
        show_action.triggered.connect(self.show)
        tray_menu.addAction(show_action)
        tray_menu.addSeparator()
        start_action = QAction("Start Server", self)
        start_action.triggered.connect(self.start_server)
        tray_menu.addAction(start_action)
        stop_action = QAction("Stop Server", self)
        stop_action.triggered.connect(self.stop_server)
        tray_menu.addAction(stop_action)
        tray_menu.addSeparator()
        quit_action = QAction("Quit", self)
        quit_action.triggered.connect(self.quit_application)
        tray_menu.addAction(quit_action)
        self.tray_icon.setContextMenu(tray_menu)
        self.tray_icon.activated.connect(self.tray_icon_activated)
        self.tray_icon.show()

    def tray_icon_activated(self, reason):
        """Handle tray icon activation"""
        if reason == QSystemTrayIcon.ActivationReason.Trigger:
            if self.isVisible():
                self.hide()
            else:
                self.show()
                self.activateWindow()
    def browse_binary(self):
        """Browse for server binary"""
        default_path = "/home/xero110/dev/llama.cpp/build/bin"
        if os.path.exists(default_path):
            start_dir = default_path
        else:
            start_dir = str(Path.home())
        file_path, _ = QFileDialog.getOpenFileName(
            self,
            "Select llama.cpp Server Binary",
            start_dir,
            "Executable Files (*);;All Files (*)"
        )
        if file_path:
            self.binary_path_edit.setText(file_path)

    def browse_model(self):
        """Browse for model file"""
        start_dir = str(Path.home())
        file_path, _ = QFileDialog.getOpenFileName(
            self,
            "Select Model File",
            start_dir,
            "GGUF Files (*.gguf);;All Files (*)"
        )
        if file_path:
            self.model_path_edit.setText(file_path)
    def start_server(self):
        """Start the llama.cpp server"""
        binary_path = self.binary_path_edit.text().strip()
        model_path = self.model_path_edit.text().strip()
        if not binary_path:
            QMessageBox.warning(self, "Error", "Please select a server binary")
            return
        if not os.path.exists(binary_path):
            QMessageBox.warning(self, "Error", f"Server binary not found: {binary_path}")
            return
        if not model_path:
            QMessageBox.warning(self, "Error", "Please select a model file")
            return
        if not os.path.exists(model_path):
            QMessageBox.warning(self, "Error", f"Model file not found: {model_path}")
            return
        if self.server_process is not None and self.server_process.poll() is None:
            QMessageBox.warning(self, "Error", "Server is already running")
            return
        # Build command
        cmd = [
            binary_path,
            "-m", model_path,
            "--host", self.host_edit.text(),
            "--port", str(self.port_spin.value()),
            "-c", str(self.context_spin.value()),
            "-ngl", str(self.ngl_spin.value()),
            "-t", str(self.threads_spin.value()),
            "-b", str(self.batch_spin.value())
        ]
        # Add additional arguments
        additional_args = self.additional_args_edit.text().strip()
        if additional_args:
            cmd.extend(additional_args.split())
        self.log_text.append(f"Starting server with command:\n{' '.join(cmd)}\n")
        try:
            self.server_process = subprocess.Popen(
                cmd,
                stdout=subprocess.PIPE,
                stderr=subprocess.PIPE,
                text=True,
                bufsize=1
            )
            # Start output reader thread
            self.output_reader = ServerOutputReader(self.server_process)
            self.output_reader.output_received.connect(self.append_log)
            self.output_reader.start()
            self.log_text.append("Server started successfully!\n")
            self.update_button_states()
            self.tray_icon.showMessage(
                "llama.cpp Server",
                "Server started successfully",
                QSystemTrayIcon.MessageIcon.Information,
                2000
            )
        except Exception as e:
            QMessageBox.critical(self, "Error", f"Failed to start server:\n{str(e)}")
            self.log_text.append(f"Error starting server: {str(e)}\n")
    def stop_server(self):
        """Stop the llama.cpp server"""
        if not self.server_process or self.server_process.poll() is not None:
            QMessageBox.warning(self, "Error", "Server is not running")
            return
        self.log_text.append("Stopping server...\n")
        # Disable stop button while stopping
        self.stop_btn.setEnabled(False)
        # Stop the output reader thread
        if self.output_reader:
            self.output_reader.stop()
        # Terminate the process
        self.server_process.terminate()
        # Use a timer to check if process has stopped (non-blocking)
        self.stop_timer = QTimer()
        self.stop_timer.timeout.connect(self.check_server_stopped)
        self.stop_attempts = 0
        self.stop_timer.start(200)  # Check every 200ms

    def check_server_stopped(self):
        """Check if server has stopped (called by timer)"""
        if self.server_process.poll() is not None:
            # Process has terminated
            self.stop_timer.stop()
            self.cleanup_after_stop("Server stopped successfully!\n")
        else:
            self.stop_attempts += 1
            if self.stop_attempts >= 25:  # 25 * 200ms = 5 seconds
                # Force kill after 5 seconds
                self.log_text.append("Server not responding, forcing kill...\n")
                self.server_process.kill()
                self.stop_timer.stop()
                # Wait a bit more for kill to take effect
                QTimer.singleShot(500, lambda: self.cleanup_after_stop("Server killed (forced)\n"))

    def cleanup_after_stop(self, message):
        """Clean up after server has stopped"""
        self.log_text.append(message)
        # Wait for output reader thread to finish
        if self.output_reader:
            self.output_reader.wait(1000)  # Wait max 1 second
            self.output_reader = None
        self.server_process = None
        self.update_button_states()
        self.tray_icon.showMessage(
            "llama.cpp Server",
            "Server stopped",
            QSystemTrayIcon.MessageIcon.Information,
            2000
        )
    def append_log(self, text):
        """Append text to log viewer"""
        self.log_text.append(text)
        # Auto-scroll to bottom
        scrollbar = self.log_text.verticalScrollBar()
        scrollbar.setValue(scrollbar.maximum())

    def update_button_states(self):
        """Update button enabled/disabled states"""
        is_running = self.server_process is not None and self.server_process.poll() is None
        self.start_btn.setEnabled(not is_running)
        self.stop_btn.setEnabled(is_running)

    def get_current_settings(self):
        """Get current settings as dictionary"""
        return {
            "binary_path": self.binary_path_edit.text(),
            "model_path": self.model_path_edit.text(),
            "host": self.host_edit.text(),
            "port": self.port_spin.value(),
            "context": self.context_spin.value(),
            "ngl": self.ngl_spin.value(),
            "threads": self.threads_spin.value(),
            "batch": self.batch_spin.value(),
            "additional_args": self.additional_args_edit.text(),
            "auto_start": self.auto_start_checkbox.isChecked()
        }

    def apply_settings(self, settings):
        """Apply settings to UI"""
        self.log_text.append("Applying settings to UI fields...\n")
        self.binary_path_edit.setText(settings.get("binary_path", ""))
        self.model_path_edit.setText(settings.get("model_path", ""))
        self.host_edit.setText(settings.get("host", "127.0.0.1"))
        self.port_spin.setValue(settings.get("port", 8080))
        self.context_spin.setValue(settings.get("context", 2048))
        self.ngl_spin.setValue(settings.get("ngl", 33))
        self.threads_spin.setValue(settings.get("threads", 8))
        self.batch_spin.setValue(settings.get("batch", 512))
        self.additional_args_edit.setText(settings.get("additional_args", ""))
        self.auto_start_checkbox.setChecked(settings.get("auto_start", False))
        self.log_text.append("Settings applied to UI\n")
    def save_current_profile(self):
        """Save current settings as a profile"""
        from PyQt6.QtWidgets import QInputDialog
        current_name = self.profile_combo.currentText()
        profile_name, ok = QInputDialog.getText(
            self,
            "Save Profile",
            "Profile name:",
            text=current_name if current_name else ""
        )
        if ok and profile_name:
            # Get current settings from UI
            settings = self.get_current_settings()
            # Debug: log what we're saving
            self.log_text.append(f"Saving profile '{profile_name}' with settings:\n")
            self.log_text.append(f"  Binary: {settings['binary_path']}\n")
            self.log_text.append(f"  Model: {settings['model_path']}\n")
            self.log_text.append(f"  Port: {settings['port']}, Context: {settings['context']}, NGL: {settings['ngl']}\n")
            # Save to config
            self.config["profiles"][profile_name] = settings
            self.config["last_profile"] = profile_name
            self.save_config()
            # Update profile list and select the saved profile
            self.update_profile_list()
            # Block signals while selecting so the freshly saved profile
            # isn't immediately auto-loaded back over the UI
            self.profile_combo.blockSignals(True)
            self.profile_combo.setCurrentText(profile_name)
            self.profile_combo.blockSignals(False)
            # Show confirmation
            self.log_text.append(f"Profile '{profile_name}' saved successfully\n")

    def on_profile_selected(self, profile_name):
        """Called when profile selection changes in dropdown (auto-load)"""
        # Empty selections and unknown names are ignored
        if profile_name and profile_name in self.config["profiles"]:
            self.log_text.append(f"Auto-loading profile '{profile_name}'...\n")
            self.load_profile(profile_name)
    def load_selected_profile(self):
        """Load the currently selected profile (manual load via button)"""
        profile_name = self.profile_combo.currentText()
        if not profile_name:
            QMessageBox.warning(self, "No Profile Selected", "Please select a profile to load")
            return
        if profile_name not in self.config["profiles"]:
            QMessageBox.warning(self, "Profile Not Found", f"Profile '{profile_name}' not found")
            return
        self.log_text.append(f"Manually loading profile '{profile_name}'...\n")
        self.load_profile(profile_name)

    def load_profile(self, profile_name):
        """Load a profile"""
        if profile_name and profile_name in self.config["profiles"]:
            settings = self.config["profiles"][profile_name]
            # Debug: log what we're loading
            self.log_text.append(f"Loading profile '{profile_name}' with settings:\n")
            self.log_text.append(f"  Binary: {settings.get('binary_path', 'N/A')}\n")
            self.log_text.append(f"  Model: {settings.get('model_path', 'N/A')}\n")
            self.log_text.append(f"  Port: {settings.get('port', 'N/A')}, Context: {settings.get('context', 'N/A')}, NGL: {settings.get('ngl', 'N/A')}\n")
            self.apply_settings(settings)
            self.config["last_profile"] = profile_name
            self.save_config()
            self.log_text.append(f"Profile '{profile_name}' loaded successfully\n")

    def load_last_profile(self):
        """Load the last used profile"""
        last_profile = self.config.get("last_profile")
        if last_profile and last_profile in self.config["profiles"]:
            self.profile_combo.setCurrentText(last_profile)

    def delete_profile(self):
        """Delete the current profile"""
        profile_name = self.profile_combo.currentText()
        if not profile_name:
            return
        reply = QMessageBox.question(
            self,
            "Delete Profile",
            f"Are you sure you want to delete profile '{profile_name}'?",
            QMessageBox.StandardButton.Yes | QMessageBox.StandardButton.No
        )
        if reply == QMessageBox.StandardButton.Yes:
            del self.config["profiles"][profile_name]
            if self.config.get("last_profile") == profile_name:
                self.config["last_profile"] = None
            self.save_config()
            self.update_profile_list()

    def update_profile_list(self):
        """Update the profile combo box"""
        current = self.profile_combo.currentText()
        # Block signals to prevent triggering load_profile during update
        self.profile_combo.blockSignals(True)
        self.profile_combo.clear()
        self.profile_combo.addItems(sorted(self.config["profiles"].keys()))
        if current in self.config["profiles"]:
            self.profile_combo.setCurrentText(current)
        self.profile_combo.blockSignals(False)
    def load_config(self):
        """Load configuration from file"""
        if self.config_file.exists():
            try:
                with open(self.config_file, 'r') as f:
                    config = json.load(f)
                # Guard against hand-edited files missing the expected keys
                config.setdefault("profiles", {})
                config.setdefault("last_profile", None)
                return config
            except Exception as e:
                print(f"Error loading config: {e}")
        return {"profiles": {}, "last_profile": None}

    def save_config(self):
        """Save configuration to file"""
        try:
            with open(self.config_file, 'w') as f:
                json.dump(self.config, f, indent=2)
            # Debug: log what we saved
            if hasattr(self, 'log_text'):
                profile_count = len(self.config.get("profiles", {}))
                self.log_text.append(f"Config saved: {profile_count} profile(s)\n")
        except Exception as e:
            print(f"Error saving config: {e}")
            if hasattr(self, 'log_text'):
                self.log_text.append(f"Error saving config: {e}\n")
    def closeEvent(self, event):
        """Handle window close event"""
        if self.server_process is not None and self.server_process.poll() is None:
            reply = QMessageBox.question(
                self,
                "Server Running",
                "The server is still running. Do you want to:\n\n"
                "Yes - Minimize to tray\n"
                "No - Stop server and quit\n"
                "Cancel - Do nothing",
                QMessageBox.StandardButton.Yes |
                QMessageBox.StandardButton.No |
                QMessageBox.StandardButton.Cancel
            )
            if reply == QMessageBox.StandardButton.Yes:
                event.ignore()
                self.hide()
            elif reply == QMessageBox.StandardButton.No:
                # Force kill when closing (no need to wait gracefully)
                if self.output_reader:
                    self.output_reader.stop()
                self.server_process.kill()
                if self.output_reader:
                    self.output_reader.wait(1000)
                event.accept()
            else:
                event.ignore()
        else:
            event.accept()

    def quit_application(self):
        """Quit the application"""
        if self.server_process is not None and self.server_process.poll() is None:
            # Force kill when quitting (no need to wait gracefully)
            if self.output_reader:
                self.output_reader.stop()
            self.server_process.kill()
            if self.output_reader:
                self.output_reader.wait(1000)
        QApplication.quit()


def main():
    app = QApplication(sys.argv)
    app.setApplicationName("llama.cpp Server Manager")
    app.setQuitOnLastWindowClosed(False)
    window = LlamaServerGUI()
    window.show()
    sys.exit(app.exec())


if __name__ == "__main__":
    main()