AI Integration
ORIS AI Guide
Route AI requests through 12 LLM providers with automatic model selection, fallback, and cost optimization.
Quick Start
import { OrisAI } from '@oris/ai-client';

const oris = new OrisAI({
  baseUrl: 'https://api.meetoris.com',
  accessToken: jwt, // From ORIS Identity
  vertical: 'DE',   // Your product code
});

// Sync request
const result = await oris.reason({
  primitive: 'text.generate',
  messages: [{ role: 'user', content: 'Draft a thank you letter' }],
});

console.log(result.output); // AI-generated content
Streaming with SSE
const controller = oris.stream(
  {
    primitive: 'chat.assistant',
    messages: [{ role: 'user', content: 'Explain FCRA compliance' }],
  },
  {
    onChunk(content) { process.stdout.write(content); },
    onDone(meta) { console.log('Model:', meta.model_used); },
    onError(err) { console.error(err); },
  }
);

// Cancel if needed:
controller.abort();
API Endpoints
Prompt Registry
ORIS AI ships with versioned prompt templates per vertical. When you call a primitive, the router loads the active prompt for your vertical automatically. Platform-level defaults apply when no vertical-specific prompt exists.
Prompts are managed in the database (oris_ai.prompt_registry) and can be updated without redeployment. Each prompt has a system_prompt, user_prompt_template with variable placeholders, and optional output_schema for structured responses.
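A registry row and its template rendering might be modelled as sketched below. This is illustrative only: the PromptRecord fields beyond system_prompt, user_prompt_template, and output_schema, plus the helper names, are assumptions rather than the actual oris_ai.prompt_registry schema.

```typescript
interface PromptRecord {
  vertical: string | null;      // null = platform-level default
  primitive: string;
  version: number;
  system_prompt: string;
  user_prompt_template: string; // e.g. "Summarise {{document}} for {{audience}}"
  output_schema?: object;       // optional schema for structured responses
}

// Fill {{placeholders}} in the template with caller-supplied variables.
function renderPrompt(record: PromptRecord, vars: Record<string, string>): string {
  return record.user_prompt_template.replace(
    /\{\{(\w+)\}\}/g,
    (_, name) => vars[name] ?? `{{${name}}}`, // leave unknown placeholders intact
  );
}

// Prefer the vertical-specific prompt; fall back to the platform default.
function selectPrompt(
  prompts: PromptRecord[],
  vertical: string,
  primitive: string,
): PromptRecord | undefined {
  const candidates = prompts.filter((p) => p.primitive === primitive);
  return candidates.find((p) => p.vertical === vertical)
      ?? candidates.find((p) => p.vertical === null);
}
```

Because prompts live in the database, pushing a new version of a template takes effect on the next request without a redeploy.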
Credit System
Credits are weighted by task complexity, not 1:1 per call. Simple tasks (chat, classify) cost 1 credit. Standard tasks (generate, summarise) cost 2. Analysis costs 3. Document drafting costs 5. Cross-vertical queries cost 10.
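The weighting reduces to a lookup table. The weights below come from the text; the exact primitive identifiers in the mapping are assumptions for illustration.

```typescript
// Credit weights by task complexity (not 1:1 per call).
const CREDIT_WEIGHTS: Record<string, number> = {
  'chat.assistant': 1,        // simple
  'text.classify': 1,         // simple
  'text.generate': 2,         // standard
  'text.summarise': 2,        // standard
  'data.analyse': 3,          // analysis
  'document.draft': 5,        // document drafting
  'query.cross_vertical': 10, // cross-vertical query
};

function creditCost(primitive: string): number {
  const weight = CREDIT_WEIGHTS[primitive];
  if (weight === undefined) throw new Error(`Unknown primitive: ${primitive}`);
  return weight;
}
```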
Model Routing
The routing algorithm:
1. Extract primitive, ai_tier, and data_residency from the JWT.
2. Gate check (CROSS_VERTICAL_QUERY is blocked on starter).
3. Query candidate models from the routing matrix.
4. Filter by data residency if the payload contains PII.
5. Filter by context window.
6. Cost optimizer picks the cheapest model above the quality threshold.
7. Circuit breaker skips unhealthy providers.
8. Fallback chain tries the next candidate on failure.
9. Calculate billing (internal = cost x 1.15, external = published rate).
10. Dispatch and log.
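The filtering and selection steps can be condensed into a sketch like the following. All type names, the quality threshold, and the isHealthy hook are assumptions for illustration, not the production routing matrix.

```typescript
interface Candidate {
  provider: string;
  model: string;
  costPerCall: number;   // internal provider cost
  quality: number;       // 0..1 benchmark score
  contextWindow: number; // tokens
  region: string;        // data-residency region
}

interface RouteRequest {
  primitive: string;
  aiTier: 'starter' | 'pro';
  dataResidency?: string; // set when the payload contains PII
  promptTokens: number;
}

function route(
  req: RouteRequest,
  candidates: Candidate[],
  isHealthy: (provider: string) => boolean,
  qualityThreshold = 0.8,
): { candidate: Candidate; billedCost: number } {
  // Gate check: cross-vertical queries are blocked on the starter tier.
  if (req.primitive === 'query.cross_vertical' && req.aiTier === 'starter') {
    throw new Error('CROSS_VERTICAL_QUERY blocked on starter');
  }
  const viable = candidates
    // Data-residency filter applies only when PII is present.
    .filter((c) => !req.dataResidency || c.region === req.dataResidency)
    // Context-window filter.
    .filter((c) => c.contextWindow >= req.promptTokens)
    // Cost optimizer: keep models above the quality threshold, cheapest first.
    .filter((c) => c.quality >= qualityThreshold)
    .sort((a, b) => a.costPerCall - b.costPerCall);
  // Circuit breaker + fallback chain: first healthy candidate wins.
  const candidate = viable.find((c) => isHealthy(c.provider));
  if (!candidate) throw new Error('No healthy candidate model');
  // Internal billing: provider cost with a 15% markup.
  return { candidate, billedCost: candidate.costPerCall * 1.15 };
}
```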
Products never know which model answered; the fallback chain is invisible to callers. If Anthropic is down, the request silently routes to OpenAI or Google. Provider health is tracked in Redis with a 120-second recovery window.
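A minimal in-memory sketch of the circuit-breaker behaviour described above. ORIS keeps this state in Redis, so the Map here merely stands in for Redis keys; only the 120-second recovery window comes from the text, and the class and method names are assumptions.

```typescript
const RECOVERY_WINDOW_MS = 120_000; // 120-second recovery window

class CircuitBreaker {
  private trippedAt = new Map<string, number>();

  // Record a provider failure, tripping its breaker.
  recordFailure(provider: string, now = Date.now()): void {
    this.trippedAt.set(provider, now);
  }

  // Healthy if the provider never failed, or its recovery window has elapsed.
  isHealthy(provider: string, now = Date.now()): boolean {
    const tripped = this.trippedAt.get(provider);
    return tripped === undefined || now - tripped >= RECOVERY_WINDOW_MS;
  }
}
```

In production the timestamp would live in a Redis key with a TTL, so every router instance shares the same view of provider health.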