ThinkWatch Architecture
1. Overview
ThinkWatch is an enterprise AI API gateway and MCP (Model Context Protocol) management platform built in Rust. It acts as a secure gateway for all AI access within an organization — every request from developer tools, automated agents, and internal applications flows through a single controlled gateway before reaching upstream AI providers or MCP servers.
This architecture provides centralized:
- Authentication and authorization — API key validation, JWT-based sessions, OIDC SSO, and role-based access control.
- Cost management — per-key and per-team budgets, token-level usage tracking, and real-time cost calculation.
- Rate limiting — requests-per-minute (RPM) and tokens-per-minute (TPM) enforcement per API key.
- Audit logging — every request is recorded with full context for compliance and debugging.
- Model access governance — fine-grained permissions controlling which users, teams, and roles can access which AI models and MCP tools.
- Multi-provider abstraction — a unified API (OpenAI, Anthropic, and Responses formats) that proxies to OpenAI, Anthropic, Google, Azure OpenAI, AWS Bedrock, and custom providers with automatic request/response format translation.
2. System Architecture Diagram
Downstream Clients
+----------------+ +----------------+ +----------------+
| Claude Code | | Cursor | | Custom Agents |
| (MCP client) | | (OpenAI-compat)| | (API / MCP) |
+-------+--------+ +-------+--------+ +-------+--------+
| | |
| tw-xxx API key | JWT / API key |
+--------------------+--------------------+
|
+----------+-----------+
| Reverse Proxy |
| (nginx / traefik) |
+----+------------+----+
| |
+--------------+--+ +----+----------------+
| Gateway :3000 | | Console :3001 |
| +-----------+ | | +--------------+ |
| | /v1/* | | | | /api/* | |
| | AI Proxy | | | | Mgmt REST API| |
| +-----------+ | | +--------------+ |
| +-----------+ | | +--------------+ |
| | /mcp | | | | Web UI (SPA) | |
| | MCP Proxy | | | | React 19 | |
| +-----------+ | | +--------------+ |
| +-----------+ | +---------------------+
| +-----------+ |
| | /health/* | |
| | /metrics | |
| +-----------+ |
+---------+-------+
|
+-------------------+-------------------+
| | |
+-----+------+ +------+------+ +-------+-------+
| PostgreSQL | | Redis | | ClickHouse |
| users, keys | | rate limits | | audit logs |
| providers | | sessions | | |
| settings | | config sync | | |
| usage, RBAC | | OIDC state | +---------------+
+-------------+ +-------------+
+-------------------+-------------------+
| | |
+-----+------+ +------+------+ +-------+-------+
| OpenAI | | Anthropic | | Google |
| API | | API | | Gemini API |
+------------+ +-------------+ +---------------+
+-----+------+ +------+------+
| Azure | | AWS |
| OpenAI | | Bedrock |
+------------+ +-------------+
Upstream AI Providers
+---------------------------------------------+
| Upstream MCP Servers (Streamable HTTP) |
| e.g. filesystem, github, database, etc. |
+---------------------------------------------+
+---------------------------------------------+
| Zitadel (OIDC SSO Provider) |
| External identity provider for SSO login |
+---------------------------------------------+
3. Dual-Port Architecture
ThinkWatch binds two separate TCP listeners in a single process:
| Port | Name | Purpose | Audience |
|---|---|---|---|
| 3000 | Gateway | AI API proxy (/v1/chat/completions, /v1/messages, /v1/responses, /v1/models), MCP proxy (/mcp), health checks (/health/live, /health/ready), Prometheus metrics (/metrics) | AI clients, agents, monitoring |
| 3001 | Console | Management REST API (/api/*), Web UI | Administrators |
Why Two Ports?
Security isolation. The gateway port is the only port that should be exposed to the public network or developer workstations. It accepts API key authentication and serves the high-throughput proxy path. The console port serves the administrative interface and should be restricted to an internal network, VPN, or localhost.
Network topology recommendations:
- Production: Place a reverse proxy (nginx, traefik, or a cloud load balancer) in front of both ports. Expose only port 3000 to the public or developer network. Restrict port 3001 to an internal network segment or require VPN access.
- Kubernetes: Create two separate Ingress resources — a public Ingress for the gateway host and an internal Ingress (or no Ingress at all, using port-forward) for the console host.
- Development: Both ports are accessible on localhost. The React dev server (pnpm dev) proxies API calls to localhost:3001.
4. AI API Gateway Data Flow
When a client sends a request through the gateway (via /v1/chat/completions, /v1/messages, or /v1/responses), the following steps occur:
Client Gateway :3000 Upstream Provider
| | |
| POST /v1/chat/completions |
| Authorization: Bearer tw-xxxx |
|----------------------->| |
| | |
| 1. API Key Auth Middleware |
| - Extract key prefix (tw-xxxx) |
| - Hash key, lookup in PostgreSQL |
| - Validate: active, not expired, budget ok |
| - Attach user_id, team_id, scopes to request |
| | |
| 2. Model Router |
| - Extract model from request body |
| - Find provider + model in registry |
| - Check model_permissions for user/team/role |
| - Check allowed_models on the API key |
| | |
| 3. Request Transform |
| - If provider is Anthropic: -> Claude format |
| - If provider is Google: -> Gemini format |
| - If provider is Azure: -> Azure OpenAI format|
| - If provider is Bedrock: -> Converse API |
| - If provider is OpenAI: pass through |
| | |
| 4. Rate Limiter |
| - Check RPM counter in Redis |
| - Estimate tokens, check TPM counter |
| - If exceeded: return 429 Too Many Requests |
| | |
| 5. Upstream Proxy |
| - Build request with provider API key |
| (decrypted from providers.api_key_encrypted)|
| - Forward to provider base_url |
| |--- POST provider.base_url ---------->|
| | |
| |<--- SSE stream / JSON response ------|
| | |
| 6. Response Transform |
| - Convert upstream response to client format |
| - Bedrock: decode binary event-stream |
| - Stream SSE chunks back to client |
| | |
|<-- SSE stream ---------| |
| | |
| 7. Async Post-Processing (spawned task) |
| - Count input/output tokens (tiktoken) |
| - Calculate cost from model pricing |
| - Insert usage_record into PostgreSQL |
| - Push audit log to ClickHouse |
| - Forward to log forwarders (syslog/kafka/webhook) |
| - Update rate limit counters in Redis |
| | |
| 8. Access logging |
| - method, path, status, latency, and client |
| IP recorded to ClickHouse asynchronously |
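The RPM check in step 4 can be illustrated with a small in-memory sliding-window counter. This is only a sketch under stated assumptions: the gateway keeps its counters in Redis, and the `SlidingWindowLimiter` type, its fields, and the `check` method are hypothetical names chosen for illustration.

```rust
use std::collections::{HashMap, VecDeque};
use std::time::{Duration, Instant};

/// In-memory stand-in for the Redis-backed RPM counter (illustrative only).
struct SlidingWindowLimiter {
    limit: usize,
    window: Duration,
    hits: HashMap<String, VecDeque<Instant>>,
}

impl SlidingWindowLimiter {
    fn new(requests_per_minute: usize) -> Self {
        Self {
            limit: requests_per_minute,
            window: Duration::from_secs(60),
            hits: HashMap::new(),
        }
    }

    /// Returns true and records the request if the key is under its limit.
    /// A false result corresponds to answering 429 Too Many Requests.
    fn check(&mut self, api_key_id: &str, now: Instant) -> bool {
        let queue = self.hits.entry(api_key_id.to_string()).or_default();
        // Drop timestamps that have fallen out of the window.
        while let Some(&oldest) = queue.front() {
            if now.duration_since(oldest) >= self.window {
                queue.pop_front();
            } else {
                break;
            }
        }
        if queue.len() >= self.limit {
            return false;
        }
        queue.push_back(now);
        true
    }
}
```

Passing `now` explicitly keeps the sketch deterministic to test. A TPM check would follow the same shape, with each entry carrying an estimated token count instead of counting one per request.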
Key Design Decisions
- Streaming-first: The proxy uses eventsource-stream and async-stream to forward SSE chunks in real time with minimal buffering. For AWS Bedrock, native binary event-stream decoding is used. Token counting happens after the stream completes.
- Provider API keys are encrypted at rest using AES-256-GCM. The ENCRYPTION_KEY environment variable provides the 256-bit key. For AWS Bedrock, credentials (ACCESS_KEY_ID:SECRET_ACCESS_KEY) are encrypted with the same scheme and used for SigV4 request signing via the official aws-sigv4 crate.
- Multi-format support: The gateway accepts three API formats: OpenAI Chat Completions (/v1/chat/completions), Anthropic Messages (/v1/messages), and OpenAI Responses (/v1/responses). Regardless of the inbound format, the request is routed through the same model router and translated to the upstream provider's native format. This means a single tw- API key can be used with any client tool.
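To make the format translation concrete, here is a minimal sketch of one direction: an OpenAI-style chat request converted to the shape of an Anthropic Messages request, where the system prompt is a top-level field and `max_tokens` is required. The struct names, `to_anthropic`, and `DEFAULT_MAX_TOKENS` are assumptions for illustration and are not taken from the ThinkWatch source; a real translation also has to handle tools, images, and streaming options.

```rust
#[derive(Debug, Clone, PartialEq)]
struct ChatMessage {
    role: String,
    content: String,
}

/// Simplified OpenAI Chat Completions request (illustrative subset).
struct OpenAiChatRequest {
    model: String,
    messages: Vec<ChatMessage>,
    max_tokens: Option<u32>,
}

/// Simplified Anthropic Messages request (illustrative subset).
struct AnthropicRequest {
    model: String,
    system: Option<String>,
    messages: Vec<ChatMessage>,
    max_tokens: u32,
}

/// Hypothetical fallback used when the client does not set a limit.
const DEFAULT_MAX_TOKENS: u32 = 1024;

fn to_anthropic(req: OpenAiChatRequest) -> AnthropicRequest {
    let mut system_parts = Vec::new();
    let mut messages = Vec::new();
    for message in req.messages {
        // System messages move out of the message list into the `system` field.
        if message.role == "system" {
            system_parts.push(message.content);
        } else {
            messages.push(message);
        }
    }
    AnthropicRequest {
        model: req.model,
        system: if system_parts.is_empty() {
            None
        } else {
            Some(system_parts.join("\n\n"))
        },
        messages,
        max_tokens: req.max_tokens.unwrap_or(DEFAULT_MAX_TOKENS),
    }
}
```

The response path would apply the inverse mapping so that the client always sees the format it sent.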
5. MCP Gateway Data Flow
The MCP gateway allows clients (such as Claude Code or Cursor) to access remote MCP tool servers through a single authenticated endpoint:
MCP Client MCP Gateway :3000/mcp Upstream MCP Server
| | |
| POST /mcp | |
| Authorization: Bearer <JWT> |
| (or tw-xxx API key) | |
|-------------------------->| |
| | |
| 1. Authentication |
| - Validate JWT or API key |
| - Extract user identity + roles |
| | |
| 2. Session Management |
| - Mcp-Session-Id header check |
| - Create new session or resume existing |
| - Session state stored in Redis |
| | |
| 3. JSON-RPC Dispatch |
| Parse method: |
| - "initialize" -> return capabilities |
| - "tools/list" -> aggregate from registry |
| - "tools/call" -> route to server |
| | |
| 4. Tool Namespace Resolution (tools/call) |
| - Tool name: "github__create_issue" |
| - Split on "__" -> server="github", |
| tool="create_issue" |
| - Look up server in mcp_servers registry |
| | |
| 5. Access Control Check |
| - Check mcp_tool_permissions for |
| user/team/role against this tool |
| - Deny if not permitted |
| | |
| 6. Connection Pool -> Upstream |
| - Get/create connection from pool |
| - Forward JSON-RPC call to upstream server |
| |--- POST server.endpoint_url ---->|
| |<-- JSON-RPC response ------------|
| | |
| 7. Response Forwarding + Audit |
| - Return JSON-RPC response to client |
| - Log tool invocation to audit_logs |
| | |
|<-- JSON-RPC response -----| |
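Steps 3 and 4 can be sketched as a method dispatch followed by a split on the `__` separator. The `McpRoute` enum and both function names are hypothetical; the sketch also assumes the split happens at the first `__`, so server names must not contain the separator themselves.

```rust
/// Splits a namespaced tool name such as "github__create_issue" into
/// (server, tool). Returns None if the separator is missing or a side is empty.
fn split_tool_name(namespaced: &str) -> Option<(&str, &str)> {
    let (server, tool) = namespaced.split_once("__")?;
    if server.is_empty() || tool.is_empty() {
        return None;
    }
    Some((server, tool))
}

/// Where a JSON-RPC request should go (illustrative).
#[derive(Debug, PartialEq)]
enum McpRoute<'a> {
    Initialize,
    ListTools,
    CallTool { server: &'a str, tool: &'a str },
    Unknown,
}

/// Maps a JSON-RPC method (plus the tool name for tools/call) to a route.
fn route<'a>(method: &str, tool_name: Option<&'a str>) -> McpRoute<'a> {
    match method {
        "initialize" => McpRoute::Initialize,
        "tools/list" => McpRoute::ListTools,
        "tools/call" => match tool_name.and_then(|name| split_tool_name(name)) {
            Some((server, tool)) => McpRoute::CallTool { server, tool },
            None => McpRoute::Unknown,
        },
        _ => McpRoute::Unknown,
    }
}
```

The resolved `server` would then be looked up in the mcp_servers registry, and the access control check in step 5 would run before anything is forwarded upstream.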
Key Design Decisions
- Streamable HTTP transport is the primary MCP transport. The gateway connects to upstream servers via streamable_http.
- Tool aggregation: When a client calls tools/list, the gateway aggregates tools from all registered MCP servers that the user has access to, prefixing each tool name with its server name and __ to avoid collisions.
- Connection pooling reuses upstream MCP connections to avoid repeated handshakes.
- Health checking runs periodically (configurable via health_check_interval) to mark servers as healthy or unhealthy.
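The tool aggregation described above might look roughly like the following. The registry is modeled as a plain map from server name to tool names, and the permission check is passed in as a closure; both are simplifications, and `aggregate_tools` is a hypothetical name rather than the gateway's actual function.

```rust
use std::collections::BTreeMap;

/// Builds the tools/list result: every permitted tool, prefixed with its
/// server name and "__". A BTreeMap keeps the output order deterministic.
fn aggregate_tools(
    registry: &BTreeMap<String, Vec<String>>,
    is_permitted: impl Fn(&str, &str) -> bool,
) -> Vec<String> {
    let mut names = Vec::new();
    for (server, tools) in registry {
        for tool in tools {
            // Tools the caller may not use are left out of the listing.
            if is_permitted(server.as_str(), tool.as_str()) {
                names.push(format!("{server}__{tool}"));
            }
        }
    }
    names
}
```

Because the prefix is added here, two servers can expose a tool with the same name without the client seeing a collision.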
6. Crate Architecture
ThinkWatch is structured as a Cargo workspace with five crates:
crates/
server/ # Binary crate -- HTTP servers, routes, handlers, middleware
gateway/ # Library -- AI API proxy engine
mcp-gateway/ # Library -- MCP proxy engine
auth/ # Library -- authentication and authorization
common/ # Library -- shared infrastructure
server
The application entry point. Contains:
- main.rs: Initializes config, the database, and Redis; runs startup validation (dependency checks, JWT secret entropy); and starts both the gateway and console Axum servers.
- app.rs: Builds the Axum router trees for both ports.
- background_tasks/: Periodic background jobs:
  - api_key_lifecycle.rs: Runs hourly to enforce key rotation periods, inactivity timeouts, and expiry policies.
  - data_retention.rs: Runs daily to purge expired usage records, audit logs, and soft-deleted records past the 30-day retention window.
- handlers/: Request handlers organized by domain:
  - auth.rs, sso.rs: Login, registration, OIDC callbacks
  - api_keys.rs: CRUD for virtual API keys
  - providers.rs: AI provider and model management
  - mcp_servers.rs, mcp_tools.rs: MCP server registry management
  - analytics.rs, audit.rs: Usage dashboards, audit log queries
  - admin.rs: User management, role assignment, system settings
  - settings.rs: Dynamic configuration CRUD (GET/PATCH /api/admin/settings, category filtering)
  - setup.rs: First-run setup wizard (GET /api/setup/status, POST /api/setup/initialize)
  - health.rs: Health check endpoints (/health/live, /health/ready, /api/health)
  - metrics.rs: Prometheus metrics endpoint (GET /metrics)
- middleware/: Axum middleware layers:
  - api_key_auth.rs: Extracts and validates tw- API keys for gateway routes
  - auth_guard.rs: Validates JWT tokens for console routes
  - require_role.rs: RBAC enforcement middleware
gateway
The AI API proxy engine. Contains:
- providers/: Provider implementations behind a common trait:
  - traits.rs: AiProvider trait defining the proxy interface
  - openai.rs: OpenAI proxy (passthrough)
  - anthropic.rs: Anthropic Claude proxy with format translation
  - google.rs: Google Gemini proxy with format translation
  - azure.rs: Azure OpenAI proxy with api-key header auth and api_version query parameter
  - bedrock.rs: AWS Bedrock proxy with SigV4 signing (official aws-sigv4 crate), Converse API, and native binary event-stream streaming
  - custom.rs: Generic OpenAI-compatible provider proxy
- proxy.rs: Core proxy logic that receives the request, selects a provider, forwards the request, and returns the response.
- router.rs: Model-to-provider routing and permission checks.
- streaming.rs: SSE stream forwarding and chunk processing.
- transform/: Request/response format conversion between OpenAI, Anthropic, Google, Azure, and Bedrock formats.
- rate_limiter.rs: Redis-backed RPM/TPM rate limiting with sliding window counters.
- token_counter.rs: Token counting using tiktoken-rs for usage tracking and TPM enforcement.
- cost_tracker.rs: Real-time cost calculation based on model pricing from the database.
- circuit_breaker.rs: Three-state circuit breaker (Closed/Open/HalfOpen) per provider, with configurable failure_threshold and recovery_secs.
- retry.rs: Retry with exponential backoff for transient failures (NetworkError, UpstreamRateLimited), with configurable max_retries, initial_delay_ms, max_delay_ms, and jitter.
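As a rough illustration of the three-state circuit breaker listed above, the sketch below tracks consecutive failures and reopens after a recovery period. Only `failure_threshold`, `recovery_secs`, and the state names come from the text; the struct layout, method names, and the choice to pass `now` explicitly are assumptions made to keep the example testable.

```rust
use std::time::{Duration, Instant};

#[derive(Debug, Clone, Copy, PartialEq)]
enum State {
    Closed,
    Open { since: Instant },
    HalfOpen,
}

struct CircuitBreaker {
    failure_threshold: u32,
    recovery: Duration,
    consecutive_failures: u32,
    state: State,
}

impl CircuitBreaker {
    fn new(failure_threshold: u32, recovery_secs: u64) -> Self {
        Self {
            failure_threshold,
            recovery: Duration::from_secs(recovery_secs),
            consecutive_failures: 0,
            state: State::Closed,
        }
    }

    /// Returns true if a request may be sent to the provider at `now`.
    fn allow(&mut self, now: Instant) -> bool {
        match self.state {
            State::Closed | State::HalfOpen => true,
            State::Open { since } => {
                if now.duration_since(since) >= self.recovery {
                    // Recovery time has passed: let probe traffic through.
                    self.state = State::HalfOpen;
                    true
                } else {
                    false
                }
            }
        }
    }

    fn record_success(&mut self) {
        self.consecutive_failures = 0;
        self.state = State::Closed;
    }

    fn record_failure(&mut self, now: Instant) {
        match self.state {
            // A failed probe sends the breaker straight back to Open.
            State::HalfOpen => self.state = State::Open { since: now },
            State::Closed => {
                self.consecutive_failures += 1;
                if self.consecutive_failures >= self.failure_threshold {
                    self.state = State::Open { since: now };
                }
            }
            State::Open { .. } => {}
        }
    }
}
```

This simplified version lets every request through while HalfOpen; a production breaker would usually cap the number of concurrent probes. One breaker instance per provider matches the per-provider scope described above.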
mcp-gateway
The MCP proxy engine. Contains:
- proxy.rs: Core MCP proxy that receives JSON-RPC requests and dispatches them to the correct upstream server.
- registry.rs: In-memory registry of MCP servers and their tools, synced from PostgreSQL.
- pool.rs: Connection pool for upstream MCP server connections.
- session.rs: Session management for stateful MCP interactions (Redis-backed).
- access_control.rs: Permission checks for tool invocations against mcp_tool_permissions.
- health.rs: Periodic health checking of registered MCP servers.
- transport/: Transport layer implementations:
  - streamable_http.rs: Streamable HTTP transport client for upstream MCP servers.
auth
Authentication and authorization library. Contains:
- jwt.rs: JWT token creation and validation using jsonwebtoken.
- api_key.rs: API key generation (tw- prefixed), hashing, and validation.
- password.rs: Password hashing and verification using Argon2.
- oidc.rs: OpenID Connect client for SSO with Zitadel or any OIDC provider. Handles authorization URL generation, callback processing, and user provisioning.
- rbac.rs: Role-based access control: permission loading, role hierarchy, and authorization checks.
common
Shared infrastructure used by all other crates. Contains:
- config.rs: AppConfig struct loaded from environment variables.
- dynamic_config.rs: DynamicConfig system that loads settings from the system_settings database table. Supports multi-instance sync via Redis Pub/Sub and in-memory caching. Covers JWT TTLs, cache TTL, content filter patterns, PII patterns, budget thresholds, API key policies, and data retention settings.
- db.rs: PostgreSQL connection pool setup using sqlx.
- models/: Database model structs (one per domain entity): user.rs, team.rs, api_key.rs, provider.rs, mcp_server.rs, usage.rs, audit_log.rs.
- dto/: Data transfer objects for API request/response serialization.
- errors.rs: Unified error type with HTTP status code mapping.
- crypto.rs: AES-256-GCM encryption/decryption for provider API keys.
- audit.rs: Audit log writer (PostgreSQL + optional ClickHouse + optional syslog forwarding).
- validation.rs: Password complexity validation (8+ chars, uppercase, lowercase, digit).
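The password rule listed for validation.rs (8+ characters, uppercase, lowercase, digit) is simple enough to sketch directly. The function name, return type, and error messages below are illustrative assumptions, not the crate's actual API.

```rust
/// Checks a password against the complexity rule described above and
/// returns every rule it violates, so the UI can show them all at once.
fn validate_password(password: &str) -> Result<(), Vec<&'static str>> {
    let mut problems = Vec::new();
    if password.chars().count() < 8 {
        problems.push("must be at least 8 characters");
    }
    if !password.chars().any(|c| c.is_uppercase()) {
        problems.push("must contain an uppercase letter");
    }
    if !password.chars().any(|c| c.is_lowercase()) {
        problems.push("must contain a lowercase letter");
    }
    if !password.chars().any(|c| c.is_ascii_digit()) {
        problems.push("must contain a digit");
    }
    if problems.is_empty() {
        Ok(())
    } else {
        Err(problems)
    }
}
```

Returning the full list of violations, rather than stopping at the first, is a design choice made here for the example.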
7. Log Exploration Architecture
Log Storage Architecture
ThinkWatch stores six types of logs in ClickHouse, each in a dedicated table:
| Table | Purpose | TTL |
|---|---|---|
| audit_logs | Security audit trail (login, API key ops, settings changes) | 90 days |
| gateway_logs | AI API request logs (model, tokens, cost, latency) | 90 days |
| mcp_logs | MCP tool invocation logs (server, tool, duration, status) | 90 days |
| platform_logs | Platform management operations | 90 days |
| access_logs | HTTP access logs for both ports (method, path, status, latency) | 30 days |
| app_logs | Application runtime tracing (level, target, message, spans) | 30 days |
All tables use the MergeTree engine with monthly partitioning and TTL-based automatic cleanup. HTTP access logs and application traces are written asynchronously to avoid blocking request processing.
All log types are queryable from a single Log Explorer page in the admin console with structured query syntax.
8. Database Schema Overview
The database schema is defined across twelve migration files applied in order on startup:
001_init_users: User Accounts
| Table | Purpose |
|---|---|
| users | Core user accounts with email, display name, password hash (for local auth), and OIDC subject/issuer (for SSO). |
002_init_teams: Team Organization
| Table | Purpose |
|---|---|
| teams | Organizational teams with optional monthly budget limits. |
| team_members | Many-to-many join between users and teams, with a role field (owner/member). |
003_init_rbac: Role-Based Access Control
| Table | Purpose |
|---|---|
| roles | Named roles. Five system roles are seeded: super_admin, admin, team_manager, developer, viewer. |
| permissions | Fine-grained permissions as (resource, action) pairs. |
| role_permissions | Maps roles to their granted permissions. |
| user_roles | Assigns roles to users with an optional scope (global or team-scoped). |
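The relationship between these tables can be pictured as a lookup from a user's roles to (resource, action) pairs. The sketch below is a simplification that ignores role scope and hierarchy; `is_allowed` is a hypothetical name, and the specific grants used in any example data are invented, since the seeded permission set is not listed here.

```rust
use std::collections::HashMap;

/// A permission is a (resource, action) pair, as in the permissions table.
type Permission = (&'static str, &'static str);

/// Returns true if any of the user's roles grants (resource, action).
/// `role_permissions` plays the part of the role_permissions table.
fn is_allowed(
    user_roles: &[&str],
    role_permissions: &HashMap<String, Vec<Permission>>,
    resource: &str,
    action: &str,
) -> bool {
    for role in user_roles {
        if let Some(granted) = role_permissions.get(*role) {
            if granted.iter().any(|(r, a)| *r == resource && *a == action) {
                return true;
            }
        }
    }
    false
}
```

A team-scoped assignment from user_roles would add one more condition: the role only counts when the request targets that team.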
004_init_api_keys: Virtual API Keys
| Table | Purpose |
|---|---|
| api_keys | Virtual API keys (tw- prefixed) issued to users or teams. Each key stores: hashed key, allowed models, rate limits (RPM/TPM), monthly budget, expiration, and scopes. |
005_init_providers: AI Provider Configuration
| Table | Purpose |
|---|---|
| providers | Upstream AI provider configuration: name, type (openai/anthropic/google/azure/bedrock/custom), base URL, AES-encrypted API key, and optional config_json (e.g. api_version for Azure). |
| models | AI models registered under a provider, with input/output token pricing. |
| model_permissions | Access control rules for models, grantable by role, team, or individual user. |
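Given per-model input and output pricing, the cost of a request follows from its token counts. The sketch below assumes prices are expressed in USD per million tokens, which is a common convention but not something the schema description confirms; the struct, field, and function names are likewise hypothetical.

```rust
/// Pricing for one model (assumed unit: USD per million tokens).
struct ModelPricing {
    input_usd_per_million: f64,
    output_usd_per_million: f64,
}

/// Cost of a single request, computed from its final token counts.
fn request_cost_usd(pricing: &ModelPricing, input_tokens: u64, output_tokens: u64) -> f64 {
    let input = input_tokens as f64 * pricing.input_usd_per_million / 1_000_000.0;
    let output = output_tokens as f64 * pricing.output_usd_per_million / 1_000_000.0;
    input + output
}

/// Budget check: a key or team without a monthly budget is never blocked.
fn within_budget(spent_usd: f64, next_cost_usd: f64, monthly_budget_usd: Option<f64>) -> bool {
    match monthly_budget_usd {
        Some(limit) => spent_usd + next_cost_usd <= limit,
        None => true,
    }
}
```

For example, with made-up prices of 3 USD and 15 USD per million input and output tokens, a request with 1,000 input tokens and 500 output tokens costs 0.003 + 0.0075 = 0.0105 USD. Floating point is used for brevity; storing integer micro-dollars would avoid rounding drift in accumulated totals.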
006_init_mcp_servers: MCP Server Registry
| Table | Purpose |
|---|---|
| mcp_servers | Registered upstream MCP servers: endpoint URL, transport type, auth configuration, health status, and check interval. |
| mcp_tools | Tools discovered from each MCP server, with their JSON Schema input definitions. |
| mcp_tool_permissions | Access control rules for individual MCP tools, grantable by role, team, or user. |
007_init_usage_audit: Usage Tracking and Audit
| Table | Purpose |
|---|---|
| usage_records | Per-request usage data: API key, user, team, provider, model, token counts, cost in USD, latency, and HTTP status. Indexed by time and by user/team/key for dashboard queries. |
| audit_logs | Security audit trail: user, API key, action, resource, detail JSON, IP address, and user agent. |
| budget_alerts | Records of budget threshold notifications for teams and API keys. |
008-012: Additional Migrations
The remaining migrations add:
| Table / Change | Purpose |
|---|---|
| system_settings | Dynamic configuration key-value store with category, description, and type metadata. Queried by the DynamicConfig system and editable via the Admin Settings UI. |
| api_keys.rotation_period_days | Per-key automatic rotation interval. |
| api_keys.inactivity_timeout_days | Auto-disable key after N days of inactivity. |
| api_keys.last_used_at | Timestamp of last key usage, used by the inactivity timeout policy. |
| api_keys.grace_period_expires_at | Allows the old key to remain valid during rotation until the grace period expires. |
| users.deleted_at | Soft-delete column for user accounts. |
| providers.deleted_at | Soft-delete column for provider records. |
| api_keys.deleted_at | Soft-delete column for API keys. |
| New indexes | Performance indexes on hot query paths (usage records by time range, audit logs by user, API keys by team). |
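The lifecycle columns above suggest how an hourly policy check could decide what to do with a key. The sketch below is one possible reading, not the actual api_key_lifecycle.rs logic: timestamps are plain Unix seconds, inactivity is measured from creation when a key has never been used, and inactivity is checked before rotation. All type and function names are hypothetical.

```rust
const DAY_SECS: u64 = 86_400;

/// Subset of api_keys columns relevant to lifecycle policy (Unix seconds).
#[derive(Clone, Copy)]
struct ApiKey {
    created_at: u64,
    last_used_at: Option<u64>,
    rotation_period_days: Option<u64>,
    inactivity_timeout_days: Option<u64>,
    grace_period_expires_at: Option<u64>,
}

#[derive(Debug, PartialEq)]
enum LifecycleAction {
    Keep,
    Rotate,
    Disable,
}

fn lifecycle_action(key: &ApiKey, now: u64) -> LifecycleAction {
    if let Some(days) = key.inactivity_timeout_days {
        // Assumption: a never-used key counts as idle since its creation.
        let last_activity = key.last_used_at.unwrap_or(key.created_at);
        if now.saturating_sub(last_activity) >= days * DAY_SECS {
            return LifecycleAction::Disable;
        }
    }
    if let Some(days) = key.rotation_period_days {
        if now.saturating_sub(key.created_at) >= days * DAY_SECS {
            return LifecycleAction::Rotate;
        }
    }
    LifecycleAction::Keep
}

/// During rotation the old key stays valid until the grace period expires.
fn old_key_still_valid(key: &ApiKey, now: u64) -> bool {
    matches!(key.grace_period_expires_at, Some(expires) if now < expires)
}
```

Keeping the decision a pure function of the row and the current time makes the policy easy to unit test separately from the database access.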
9. Frontend Architecture
The web console is a single-page application located in the web/ directory.
Technology Stack
- React 19 with TypeScript
- shadcn/ui component library (Tailwind CSS-based)
- Simple pushState routing (no external router library)
- Fetch API for all HTTP requests, with JWT from localStorage
Page Structure
The UI consists of approximately 20 pages organized into seven groups:
| Group | Pages | Description |
|---|---|---|
| Setup | setup.tsx | First-run setup wizard (super_admin creation, site config, optional provider setup) |
| Auth | login.tsx | Login form (local + OIDC SSO) |
| Dashboard | dashboard.tsx | Overview: usage charts, cost summary, recent activity |
| Gateway | providers.tsx, models.tsx, api-keys.tsx, logs.tsx | AI provider management, model registry, API key CRUD, request logs |
| MCP | servers.tsx, tools.tsx, logs.tsx | MCP server registry, tool discovery/permissions, MCP request logs |
| Analytics | usage.tsx, costs.tsx, audit.tsx | Token usage dashboards, cost breakdowns, audit log search |
| Admin | users.tsx, roles.tsx, settings.tsx | User management, RBAC role configuration, system settings |
API Client Pattern
All pages communicate with the console server (/api/*) using a shared fetch wrapper that:
- Reads the JWT from localStorage.
- Attaches it as Authorization: Bearer <token>.
- Handles 401 responses by redirecting to the login page.
- Parses JSON responses and surfaces errors to the UI.
The web UI is built as a static SPA and served either by the development server (Vite with pnpm dev) or by an nginx container in production (see deploy/docker/Dockerfile.web and deploy/docker/nginx.conf).