
# Abyss
A voice-first AI assistant that lives on your iPhone, learns you over time through a context graph, and reaches deeper tools through a permissioned macOS bridge — built for the way people actually work.
## The Problem
Today's AI assistants are stateless and siloed. Each conversation starts from scratch, with no memory of who you are, what you're working on, or what tools you actually need. They live inside browser tabs or chat windows — disconnected from the real workflows happening on your devices.
Abyss was built to close that gap: an assistant that accumulates context over time, speaks naturally through voice, and can reach into your desktop environment when you give it permission.
## System Architecture

Abyss is a distributed system spanning three layers connected over WebSocket:
### iOS App
The mobile client is built in SwiftUI with WhisperKit for on-device speech-to-text, ElevenLabs for text-to-speech, and OAuth for authenticating with external services. It connects to the backend over a persistent WebSocket and exchanges events bidirectionally.
Certain destructive actions — sending emails, deleting calendar events — require explicit iOS user confirmation before executing.
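The confirmation handshake can be pictured as a typed request/response pair over the WebSocket. The event names and fields below are illustrative assumptions, not the actual wire protocol:

```typescript
// Hypothetical event shapes for the iOS <-> Conductor WebSocket protocol.
// Names and fields are invented for illustration; the real protocol may differ.
type DestructiveAction = "email.send" | "email.reply" | "calendar.delete";

interface ConfirmationRequest {
  type: "confirm.request";
  requestId: string;
  action: DestructiveAction; // destructive actions always pass through here
  summary: string;           // human-readable description shown to the user
}

interface ConfirmationResponse {
  type: "confirm.response";
  requestId: string;
  approved: boolean;         // the tool call executes only when true
}

// Gate a destructive tool call on an explicit, matching user response.
function shouldExecute(req: ConfirmationRequest, res: ConfirmationResponse): boolean {
  return res.requestId === req.requestId && res.approved;
}
```

The key property is that approval is tied to a specific `requestId`, so a stale or mismatched confirmation can never authorize a different action.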
### Node.js Conductor Server (ECS Fargate)
The Conductor is the central orchestration layer. It routes between three LLM backends depending on the task:
- Bedrock Nova Lite / Pro — default text reasoning (Lite) and heavy tasks (Pro)
- Anthropic Claude — complex reasoning and tool-use chains
- Nova Sonic — real-time voice streaming
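A minimal sketch of what this routing might look like, assuming a task descriptor with a modality flag and a cheap complexity estimate. The model names mirror the backends above, but the descriptor shape and thresholds are invented for illustration:

```typescript
// Hypothetical routing policy for the Conductor. The Task fields and the
// 0.7 complexity threshold are assumptions, not the production logic.
type Model = "nova-lite" | "nova-pro" | "claude" | "nova-sonic";

interface Task {
  modality: "voice" | "text";
  needsToolChain: boolean; // multi-step tool-use expected
  complexity: number;      // 0..1, estimated by a cheap classifier
}

function routeModel(task: Task): Model {
  if (task.modality === "voice") return "nova-sonic"; // real-time streaming path
  if (task.needsToolChain) return "claude";           // complex tool-use chains
  return task.complexity > 0.7 ? "nova-pro" : "nova-lite";
}
```

Keeping the policy a pure function makes it trivial to unit-test and to tune as model pricing and capabilities change.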
The Conductor dispatches tool calls to three categories:
#### Server Tools
APIs that run server-side with OAuth credentials:
| Service | Capabilities |
|---|---|
| Gmail API | inbox, search, read, send*, reply* |
| Google Calendar API | list, create, update, delete* |
| Canvas LMS | todo, grades, assignments |
| Cursor Cloud Agents | spawn, status, cancel, followup |
| Brave Search | web search |
| GitHub OAuth | auth, repos |
\* = iOS user confirmation required
### macOS Bridge
A permissioned desktop agent connected through a Bridge Router. It exposes three tool modules, all confined to a workspace sandbox and protected by pairing security:
| Module | Capabilities |
|---|---|
| Command Executor | exec.run, fs.read, fs.search, fs.patch, git.* |
| Claude Code (Subprocess) | Bash, Read, Edit, Write, Glob, Grep |
| Nova Act (Python) | start, act, stop — drives Chrome for web automation |
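One way to picture the workspace sandbox: every filesystem-touching capability resolves its target path and rejects anything that escapes the paired workspace root. A simplified sketch of that check (the real bridge's enforcement is richer, e.g. per-module capability scoping):

```typescript
import * as path from "node:path";

// Reject any bridge filesystem request (fs.read, fs.patch, ...) whose path
// resolves outside the paired workspace root, including "../" traversal.
// Simplified illustration only; symlinks etc. need additional handling.
function isInsideSandbox(workspaceRoot: string, requested: string): boolean {
  const root = path.resolve(workspaceRoot);
  const target = path.resolve(root, requested);
  return target === root || target.startsWith(root + path.sep);
}
```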
## Data Flow
The end-to-end voice interaction follows this path:
- User speaks → WhisperKit transcribes on-device
- Transcript sent over WebSocket to the Conductor
- Conductor routes to LLM (Nova/Claude) for reasoning
- LLM dispatches tool calls to one of three categories:
  - Server Tools — Gmail, Calendar, Canvas, Search
  - iOS Tools — user confirmations, audio controls, preferences
  - Bridge Tools — file system, git, Claude Code, Nova Act
- Tool results feed back to LLM for continued reasoning
- Final response streamed via ElevenLabs / Nova Sonic → iOS → speaker
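The dispatch step above can be sketched as a lookup from tool name to category. The tool names echo the tables in this README, but the prefix-based mapping itself is an illustrative assumption, not the Conductor's actual registry:

```typescript
// Hypothetical tool-name dispatch: "gmail.search" -> server, "fs.read" -> bridge.
// The prefix table is invented for illustration.
type ToolCategory = "server" | "ios" | "bridge";

const categoryByPrefix: Record<string, ToolCategory> = {
  gmail: "server", calendar: "server", canvas: "server", search: "server",
  confirm: "ios", audio: "ios", prefs: "ios",
  exec: "bridge", fs: "bridge", git: "bridge", nova_act: "bridge",
};

function dispatchCategory(toolName: string): ToolCategory | undefined {
  const prefix = toolName.split(".")[0];
  return categoryByPrefix[prefix]; // undefined -> unknown tool, reject the call
}
```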
## AWS Infrastructure
The entire backend runs in AWS us-east-1:
### Compute
| Component | Details |
|---|---|
| Amazon ECR | abyss-server:latest (linux/amd64) |
| ECS Fargate | Cluster: abyss, Task: abyss-server, Node.js Conductor on port 8080 |
| Application Load Balancer | :8080 with WebSocket sticky sessions |
| Execution Role | abyss-ecs-execution-role (ECR + CloudWatch Logs) |
| Task Role | abyss-ecs-task-role (Bedrock + Neptune + S3 + KB) |
### AI / ML
| Service | Usage |
|---|---|
| Bedrock Converse API | Nova Lite (default), Nova Pro (heavy), Nova Sonic (voice), Titan Embed V2 |
| Bedrock Knowledge Bases | Memory retrieval, vector search |
| Bedrock Agent Runtime | KB ingestion, S3 data source |
### Storage & Graph
| Service | Usage |
|---|---|
| Amazon S3 | Memory documents (JSON) |
| Neptune Analytics | Context graph, OpenCypher queries, vector + keyword hybrid search |
| CloudWatch Logs | /ecs/abyss-server, structured JSON |
### External Services
## Context Graph
Every interaction builds a persistent knowledge graph in Neptune Analytics using OpenCypher queries. Entities, preferences, projects, and relationships are extracted and stored as graph nodes and edges. Over time, the assistant develops a rich model of who you are and what you care about — enabling responses that are genuinely personalized rather than generically helpful.
Memory documents are stored as JSON in S3 and indexed through Bedrock Knowledge Bases for vector search. Neptune Analytics provides hybrid vector + keyword search for retrieval, combining semantic similarity with structured graph traversal.
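As a concrete illustration, an extraction step might upsert an entity and a relationship with an openCypher `MERGE` so repeated mentions never create duplicate nodes. The labels, properties, and helper below are assumptions for the sake of the example, not the actual graph schema:

```typescript
// Hypothetical context-graph upsert. Node labels and properties are
// illustrative; Neptune Analytics runs openCypher queries of this shape.
interface GraphQuery {
  query: string;
  params: Record<string, string>;
}

function upsertPreference(userId: string, topic: string): GraphQuery {
  return {
    query: `
      MERGE (u:User {id: $userId})
      MERGE (t:Topic {name: $topic})
      MERGE (u)-[:PREFERS]->(t)
    `,
    params: { userId, topic },
  };
}
```

Parameterizing the query (rather than interpolating strings) keeps user-derived text from being injected into the Cypher itself.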
## Voice Pipeline
Voice is the primary input modality — not an afterthought bolted onto a chat interface. WhisperKit runs on-device for low-latency speech-to-text without a network round trip. Output is synthesized through ElevenLabs or Nova Sonic depending on the interaction mode, streamed back to the iOS client over WebSocket, and played through the speaker.
The system is designed for the conversational cadence of real speech: interruptions, corrections, and follow-ups all work naturally.
## Challenges & Solutions
- Context graph scalability: As the knowledge graph grows, query latency can degrade. We use selective subgraph retrieval based on embedding similarity to keep context injection fast and relevant.
- Voice latency budget: End-to-end voice interaction needs to feel conversational (<2s). On-device WhisperKit handles recognition without a network round trip, and Bedrock streaming keeps generation responsive.
- Desktop trust model: Giving an AI access to your filesystem is a security minefield. The workspace sandbox with pairing security ensures every bridge action requires explicit authorization, with granular capability scoping per module.
- WebSocket reliability: Persistent WebSocket connections over mobile networks are fragile. The ALB uses sticky sessions to maintain connection affinity, and the iOS client handles reconnection transparently.
- Multi-LLM routing: Different tasks demand different models. The Conductor dynamically routes between Nova Lite (fast/cheap), Nova Pro (complex), Claude (deep reasoning), and Nova Sonic (voice) based on task characteristics.
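For the WebSocket-reliability point, the client-side reconnect loop can be sketched as exponential backoff with a cap. The constants here are illustrative, not what the iOS client actually uses, and jitter is omitted to keep the function deterministic:

```typescript
// Exponential backoff with a cap, written as a pure function so the schedule
// is easy to test. Base and cap values are illustrative assumptions.
function reconnectDelayMs(attempt: number, baseMs = 500, capMs = 30_000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}
```

In practice a small random jitter is usually added to each delay so that many clients dropped by the same network event do not all reconnect in lockstep.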