
# Abyss
A voice-first AI assistant that lives on your iPhone, learns you over time through a context graph, and reaches deeper tools through a permissioned macOS bridge — built for the way people actually work.
## The Problem
Today's AI assistants are stateless and siloed. Each conversation starts from scratch, with no memory of who you are, what you're working on, or what tools you actually need. They live inside browser tabs or chat windows — disconnected from the real workflows happening on your devices.
Abyss was built to close that gap: an assistant that accumulates context over time, speaks naturally through voice, and can reach into your desktop environment when you give it permission.
## System Architecture

Abyss is a distributed system spanning three layers connected over WebSocket:
### iOS App
The mobile client is built in SwiftUI with WhisperKit for on-device speech-to-text, ElevenLabs for text-to-speech, and OAuth for authenticating with external services. It connects to the backend over a persistent WebSocket and exchanges events bidirectionally.
Certain destructive actions — sending emails, deleting calendar events — require explicit iOS user confirmation before executing.
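The confirmation handshake can be pictured as a typed request/response pair over the WebSocket. The event names and fields below are illustrative assumptions, not the actual wire protocol:

```typescript
// Hypothetical event shapes for the iOS <-> Conductor WebSocket protocol.
// Names and fields are invented for illustration; the real protocol may differ.
type DestructiveAction = "email.send" | "email.reply" | "calendar.delete";

interface ConfirmationRequest {
  type: "confirm.request";
  requestId: string;
  action: DestructiveAction; // destructive actions always pass through here
  summary: string;           // human-readable description shown to the user
}

interface ConfirmationResponse {
  type: "confirm.response";
  requestId: string;
  approved: boolean;         // the tool call executes only when true
}

// Gate a destructive tool call on an explicit, matching user response.
function shouldExecute(req: ConfirmationRequest, res: ConfirmationResponse): boolean {
  return res.requestId === req.requestId && res.approved;
}
```

The key property is that approval is tied to a specific `requestId`, so a stale or mismatched confirmation can never authorize a different action.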
### Node.js Conductor Server (ECS Fargate)
The Conductor is the central orchestration layer. It routes between three LLM backends depending on the task:
- Bedrock Nova Lite / Pro — default text reasoning (Lite) and heavy tasks (Pro)
- Anthropic Claude — complex reasoning and tool-use chains
- Nova Sonic — real-time voice streaming
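A minimal sketch of what this routing might look like, assuming a task descriptor with a modality flag and a cheap complexity estimate. The model names mirror the backends above, but the descriptor shape and thresholds are invented for illustration:

```typescript
// Hypothetical routing policy for the Conductor. The Task fields and the
// 0.7 complexity threshold are assumptions, not the production logic.
type Model = "nova-lite" | "nova-pro" | "claude" | "nova-sonic";

interface Task {
  modality: "voice" | "text";
  needsToolChain: boolean; // multi-step tool-use expected
  complexity: number;      // 0..1, estimated by a cheap classifier
}

function routeModel(task: Task): Model {
  if (task.modality === "voice") return "nova-sonic"; // real-time streaming path
  if (task.needsToolChain) return "claude";           // complex tool-use chains
  return task.complexity > 0.7 ? "nova-pro" : "nova-lite";
}
```

Keeping the policy a pure function makes it trivial to unit-test and to tune as model pricing and capabilities change.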
The Conductor dispatches tool calls to three categories:
#### Server Tools
APIs that run server-side with OAuth credentials:
| Service | Capabilities |
|---|---|
| Gmail API | inbox, search, read, send*, reply* |
| Google Calendar API | list, create, update, delete* |
| Canvas LMS | todo, grades, assignments |
| Cursor Cloud Agents | spawn, status, cancel, followup |
| Brave Search | web search |
| GitHub OAuth | auth, repos |
\* = iOS user confirmation required
### macOS Bridge
A permissioned desktop agent connected through a Bridge Router. It exposes three tool modules, all confined to a workspace sandbox and protected by pairing security:
| Module | Capabilities |
|---|---|
| Command Executor | exec.run, fs.read, fs.search, fs.patch, git.* |
| Claude Code (Subprocess) | Bash, Read, Edit, Write, Glob, Grep |
| Nova Act (Python) | start, act, stop — drives Chrome for web automation |
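One way to picture the workspace sandbox: every filesystem-touching capability resolves its target path and rejects anything that escapes the paired workspace root. A simplified sketch of that check (the real bridge's enforcement is richer, e.g. per-module capability scoping):

```typescript
import * as path from "node:path";

// Reject any bridge filesystem request (fs.read, fs.patch, ...) whose path
// resolves outside the paired workspace root, including "../" traversal.
// Simplified illustration only; symlinks etc. need additional handling.
function isInsideSandbox(workspaceRoot: string, requested: string): boolean {
  const root = path.resolve(workspaceRoot);
  const target = path.resolve(root, requested);
  return target === root || target.startsWith(root + path.sep);
}
```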
## Data Flow
The end-to-end voice interaction follows this path:
- User speaks → WhisperKit transcribes on-device
- Transcript sent over WebSocket to the Conductor
- Conductor routes to LLM (Nova/Claude) for reasoning
- LLM dispatches tool calls to one of three categories:
  - Server Tools — Gmail, Calendar, Canvas, Search
  - iOS Tools — user confirmations, audio controls, preferences
  - Bridge Tools — file system, git, Claude Code, Nova Act
- Tool results feed back to LLM for continued reasoning
- Final response streamed via ElevenLabs / Nova Sonic → iOS → speaker
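The dispatch step above can be sketched as a lookup from tool name to category. The tool names echo the tables in this README, but the prefix-based mapping itself is an illustrative assumption, not the Conductor's actual registry:

```typescript
// Hypothetical tool-name dispatch: "gmail.search" -> server, "fs.read" -> bridge.
// The prefix table is invented for illustration.
type ToolCategory = "server" | "ios" | "bridge";

const categoryByPrefix: Record<string, ToolCategory> = {
  gmail: "server", calendar: "server", canvas: "server", search: "server",
  confirm: "ios", audio: "ios", prefs: "ios",
  exec: "bridge", fs: "bridge", git: "bridge", nova_act: "bridge",
};

function dispatchCategory(toolName: string): ToolCategory | undefined {
  const prefix = toolName.split(".")[0];
  return categoryByPrefix[prefix]; // undefined -> unknown tool, reject the call
}
```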
## AWS Infrastructure
The entire backend runs in AWS us-east-1:
### Compute
| Component | Details |
|---|---|
| Amazon ECR | abyss-server:latest (linux/amd64) |
| ECS Fargate | Cluster: abyss, Task: abyss-server, Node.js Conductor on port 8080 |
| Application Load Balancer | :8080 with WebSocket sticky sessions |
| Execution Role | abyss-ecs-execution-role (ECR + CloudWatch Logs) |
| Task Role | abyss-ecs-task-role (Bedrock + Neptune + S3 + KB) |
### AI / ML
| Service | Usage |
|---|---|
| Bedrock Converse API | Nova Lite (default), Nova Pro (heavy), Nova Sonic (voice), Titan Embed V2 |
| Bedrock Knowledge Bases | Memory retrieval, vector search |
| Bedrock Agent Runtime | KB ingestion, S3 data source |
### Storage & Graph
| Service | Usage |
|---|---|
| Amazon S3 | Memory documents (JSON) |
| Neptune Analytics | Context graph, OpenCypher queries, vector + keyword hybrid search |
| CloudWatch Logs | /ecs/abyss-server, structured JSON |
### External Services
## Context Graph
Every interaction builds a persistent knowledge graph in Neptune Analytics using OpenCypher queries. Entities, preferences, projects, and relationships are extracted and stored as graph nodes and edges. Over time, the assistant develops a rich model of who you are and what you care about — enabling responses that are genuinely personalized rather than generically helpful.
Memory documents are stored as JSON in S3 and indexed through Bedrock Knowledge Bases for vector search. Neptune Analytics provides hybrid vector + keyword search for retrieval, combining semantic similarity with structured graph traversal.
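As a concrete illustration, an extraction step might upsert an entity and a relationship with an openCypher `MERGE` so repeated mentions never create duplicate nodes. The labels, properties, and helper below are assumptions for the sake of the example, not the actual graph schema:

```typescript
// Hypothetical context-graph upsert. Node labels and properties are
// illustrative; Neptune Analytics runs openCypher queries of this shape.
interface GraphQuery {
  query: string;
  params: Record<string, string>;
}

function upsertPreference(userId: string, topic: string): GraphQuery {
  return {
    query: `
      MERGE (u:User {id: $userId})
      MERGE (t:Topic {name: $topic})
      MERGE (u)-[:PREFERS]->(t)
    `,
    params: { userId, topic },
  };
}
```

Parameterizing the query (rather than interpolating strings) keeps user-derived text from being injected into the Cypher itself.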
## Voice Pipeline
Voice is the primary input modality — not an afterthought bolted onto a chat interface. WhisperKit runs on-device for low-latency speech-to-text without a network round trip. Output is synthesized through ElevenLabs or Nova Sonic depending on the interaction mode, streamed back to the iOS client over WebSocket, and played through the speaker.
The system is designed for the conversational cadence of real speech: interruptions, corrections, and follow-ups all work naturally.
## Challenges & Solutions
- Context graph scalability: As the knowledge graph grows, query latency can degrade. We use selective subgraph retrieval based on embedding similarity to keep context injection fast and relevant.
- Voice latency budget: End-to-end voice interaction needs to feel conversational (<2s). On-device WhisperKit handles recognition without a network round trip, and Bedrock streaming keeps generation responsive.
- Desktop trust model: Giving an AI access to your filesystem is a security minefield. The workspace sandbox with pairing security ensures every bridge action requires explicit authorization, with granular capability scoping per module.
- WebSocket reliability: Persistent WebSocket connections over mobile networks are fragile. The ALB uses sticky sessions to maintain connection affinity, and the iOS client handles reconnection transparently.
- Multi-LLM routing: Different tasks demand different models. The Conductor dynamically routes between Nova Lite (fast/cheap), Nova Pro (complex), Claude (deep reasoning), and Nova Sonic (voice) based on task characteristics.
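For the WebSocket-reliability point, the client-side reconnect loop can be sketched as exponential backoff with a cap. The constants here are illustrative, not what the iOS client actually uses, and jitter is omitted to keep the function deterministic:

```typescript
// Exponential backoff with a cap, written as a pure function so the schedule
// is easy to test. Base and cap values are illustrative assumptions.
function reconnectDelayMs(attempt: number, baseMs = 500, capMs = 30_000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}
```

In practice a small random jitter is usually added to each delay so that many clients dropped by the same network event do not all reconnect in lockstep.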