LMOS PROJECT

Language Model Orchestration System

A modular, scalable AI system for deploying full-featured language models, transcription, embeddings, and reranking, with support for Docker and Kubernetes orchestration.

View on GitHub Join The Discord Read The Docs

The LMOS Ecosystem - One endpoint, Full Language Stack.

LMOS-Router

Central gateway with OpenAI-compatible endpoints

LLM-Runner

ZMQ-based LLM inference template

STT-Runner

Speech-to-text service

Embedding-Runner

Vector embedding service

Key Features

Built on OpenAI API Spec
Baked in Auth and Rate Limiting
Designed for Containerization
Modular and Ready for Scale

Built for developers, enabling faster feature delivery to users.

Pre-configured dev containers • Extensive documentation • Developers that enjoy communicating