Tool dossier

LLMKube

Kubernetes operator for llama.cpp-native LLM inference with GPU scheduling, Apple Silicon Metal support, and OpenAI-compatible API.

1 sources 40 stars Self-hosted Apache-2.0

Positioning

What this project is really offering

The goal here is to separate raw catalog facts from the sharper product shape users care about before they commit time.

About

Kubernetes operator for llama.cpp-native LLM inference with GPU scheduling, Apple Silicon Metal support, and OpenAI-compatible API.

Evidence

What backs up the editorial summary