Allma Studio

Technical Documentation

System Architecture

Deep dive into Allma Studio's technical implementation, from the React frontend to the FastAPI backend and local LLM integration via Ollama.

System Layers

Layered Architecture

A clean separation of concerns with five distinct layers

Frontend Layer

React 18 SPA · Vite Build · TailwindCSS · Axios HTTP

API Gateway

FastAPI · Uvicorn ASGI · CORS Protection · Rate Limiting

Orchestration

Request Router · RAG Service · Conversation Service · Document Service

Data Layer

ChromaDB Vectors · SQLite Sessions · Nomic Embeddings · Document Store

LLM Layer

Ollama Runtime · DeepSeek R1 · Gemma 2 · Qwen 2.5 Coder
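The gateway layer's rate limiting can be illustrated with a classic token bucket. This is a stdlib-only sketch; the class and parameter names are assumptions for illustration, not Allma Studio's actual implementation:

```python
import time

class TokenBucket:
    """Token-bucket rate limiter: refills `rate` tokens/sec, bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: int):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=5, capacity=10)  # ~5 requests/sec, bursts of 10
results = [bucket.allow() for _ in range(12)]
print(results.count(True))  # the burst of 10 passes; the last 2 are throttled
```

In a real gateway the bucket would be keyed per client (e.g. by IP or API key) and checked in request middleware before the orchestrator is invoked.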

Request Lifecycle

Data Flow Pipeline

User Query

Natural language input

API Gateway

Request validation

Orchestrator

Route to services

Vector Search

Find relevant docs

LLM Processing

Generate response

Streaming

Token-by-token output
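The six stages above can be sketched end-to-end in miniature. The retrieval step ranks toy embeddings by cosine similarity and the LLM step is stubbed with a fake token generator; every name and vector here is an assumption for illustration only:

```python
import math

def validate(query: str) -> str:
    # API Gateway: reject empty or oversized input.
    query = query.strip()
    if not query or len(query) > 4096:
        raise ValueError("invalid query")
    return query

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def vector_search(query_vec, store, k=1):
    # Data Layer: rank stored documents by similarity to the query embedding.
    ranked = sorted(store, key=lambda d: cosine(query_vec, d["vec"]), reverse=True)
    return [d["text"] for d in ranked[:k]]

def generate_stream(prompt: str):
    # LLM Layer stub: a real system would stream tokens from Ollama instead.
    for token in ["Local", " models", " keep", " data", " private."]:
        yield token

store = [
    {"text": "Ollama runs LLMs locally.", "vec": [0.9, 0.1, 0.0]},
    {"text": "Tailwind styles the UI.",   "vec": [0.0, 0.2, 0.9]},
]
query = validate("  How does local inference work?  ")
context = vector_search([1.0, 0.0, 0.1], store, k=1)   # toy query embedding
prompt = f"Context: {context[0]}\n\nQuestion: {query}"
answer = "".join(generate_stream(prompt))              # token-by-token output
print(answer)
```

The shape is the point: validation guards the gateway, retrieval injects grounding context into the prompt, and the response is assembled token by token exactly as the streaming stage delivers it.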

Technology Stack

Built With Modern Tools

Frontend

React 18.3.1 - UI Framework
Vite 5.2.11 - Build Tool
TailwindCSS 3.4.3 - Styling
Axios 1.7.2 - HTTP Client
React Markdown 9.0.1 - Markdown Rendering
Lucide React 0.378.0 - Icons

Backend

Python 3.11+ - Runtime
FastAPI 0.115.0 - Web Framework
Uvicorn 0.31.1 - ASGI Server
SQLAlchemy 2.0.36 - ORM
ChromaDB 0.5.17 - Vector DB
httpx 0.28.1 - Async HTTP

AI/ML

Ollama (latest) - Local LLM Runtime
Nomic Embed Text - Embeddings Model
DeepSeek R1 (5.2 GB) - Reasoning LLM
Gemma 2 (9B) - General LLM
Qwen 2.5 Coder - Code LLM
LLaMA 3.2 (2 GB) - Fast LLM
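Ollama streams generation results as newline-delimited JSON, one chunk per token, with a final `"done": true` marker. A stdlib-only parser over a canned sample stream (the sample payload is fabricated and abbreviated for illustration):

```python
import json

# Canned sample of an Ollama-style NDJSON stream (fields abbreviated).
raw_stream = b"""\
{"model": "gemma2", "response": "Hello", "done": false}
{"model": "gemma2", "response": " world", "done": false}
{"model": "gemma2", "response": "", "done": true}
"""

def iter_tokens(stream: bytes):
    # Yield each chunk's token text until the terminal done=true marker.
    for line in stream.splitlines():
        if not line.strip():
            continue
        chunk = json.loads(line)
        if chunk.get("done"):
            break
        yield chunk["response"]

text = "".join(iter_tokens(raw_stream))
print(text)  # Hello world
```

In the live system the same loop would run over an HTTP response body from httpx, forwarding each token to the frontend as it arrives rather than buffering the full reply.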

Infrastructure

Docker (latest) - Containerization
Docker Compose v2 - Orchestration
Kubernetes v1.28+ - Container Orchestration
Helm v3 - K8s Package Manager
GitHub Actions - CI/CD Automation
Vercel (Edge) - Frontend Hosting

Security & Privacy

Security First Design

Zero Telemetry

No data collection whatsoever

CORS Protection

Configurable cross-origin security

Rate Limiting

Built-in API throttling

Non-root Containers

Security-hardened Docker images