DementiaGuide AI

A modern iOS mobile application that acts as a digital library for dementia care information. Users can ask questions through text or voice and receive responses through a real-time 3D avatar with lip-sync driven by ElevenLabs character-level alignment.

Application Workflow

RAG Pipeline Workflow

Video Walkthrough

Screen.Recording.2026-04-29.at.1.24.15.PM.mov

Overview

DementiaGuide AI is designed for caregivers, family members, and healthcare professionals. The app provides evidence-based dementia care guidance through a calm, accessible, and emotionally supportive interface. The AI avatar — Aria — is a VRM model rendered in real time with natural speech, multi-shape lip-sync driven by ElevenLabs character-level alignment, and expressive idle animations.

Tech Stack

Layer	Technology
Framework	React Native (Expo SDK 54)
Navigation	React Navigation 7 (Bottom Tabs + Native Stack)
AI / RAG	OpenAI `gpt-4o-mini` + `text-embedding-3-small`
STT	OpenAI Whisper (`whisper-1`) via `expo-av` audio recording
TTS	ElevenLabs `eleven_turbo_v2_5` (primary) · OpenAI `tts-1` (fallback)
Lip Sync	ElevenLabs character-level alignment → viseme timeline → 5 VRM blend shapes
Avatar	VRM 3D model via Three.js r180 + `@pixiv/three-vrm` in a WebView
Animations	React Native Animated API
Gradients	expo-linear-gradient
Audio	expo-av · Web Audio API (WebView)
Haptics	expo-haptics
Safe Area	react-native-safe-area-context
Storage	`@react-native-async-storage/async-storage` · `expo-secure-store`

Screens

Screen	Description
Home	Avatar hero card, quick question chips, text/voice entry, navigation grid
Chat	iMessage-style conversation, typing indicator, clickable source links
Library	Searchable knowledge base across 6 dementia-care categories with article detail view
Voice	Full-screen voice UI — records via Whisper STT, streams LLM response, plays avatar speech sentence-by-sentence with lip sync
Settings	Accessibility controls — text size, contrast, audio, subtitles, haptics, privacy

Project Structure

DementiaGuideAi/
├── App.js
├── babel.config.js
├── app.json                          # Expo config
├── scripts/
│   └── test-responses.mjs            # CLI tool to test RAG output against sample questions
└── src/
    ├── navigation/
    │   └── AppNavigator.js           # Bottom tab + stack navigator
    ├── screens/
    │   ├── HomeScreen.js
    │   ├── ChatScreen.js             # GiftedChat UI, calls openaiService, shows sources
    │   ├── LibraryScreen.js
    │   ├── ArticleDetailScreen.js    # Full article view from Library
    │   ├── VoiceScreen.js            # Voice conversation UI (Whisper → LLM → TTS → avatar)
    │   └── ProfileScreen.js          # AI configuration (API keys, privacy controls)
    ├── components/
    │   ├── AvatarVRM.js              # VRM avatar in WebView (Three.js + viseme lip sync)
    │   ├── Avatar.js                 # Legacy animated avatar (idle/listening/speaking)
    │   ├── MessageCard.js            # Chat bubble with sources and actions
    │   ├── CategoryCard.js           # Library category row
    │   └── VoiceWaveform.js          # 9-bar animated waveform
    ├── hooks/
    │   └── useAvatarConversation.js  # Voice pipeline orchestration (STT → LLM stream → TTS queue → playback)
    ├── lib/
    │   ├── tts/
    │   │   ├── ttsService.js         # TTS provider selection (ElevenLabs primary, OpenAI fallback)
    │   │   └── elevenLabsService.js  # ElevenLabs API wrapper (audio + character alignment)
    │   └── lipsync/
    │       ├── createVisemeTimeline.js  # Converts ElevenLabs alignment → viseme frame sequence
    │       └── phonemeMap.js            # Character → VRM viseme mapping
    ├── constants/
    │   ├── colors.js
    │   ├── typography.js
    │   └── data.js                   # Categories, resources, sample messages
    ├── data/
    │   └── knowledgeBase.js          # 42 dementia care knowledge chunks (7 per category)
    └── services/
        ├── openaiService.js          # Full RAG pipeline (embeddings, semantic search, streaming chat, Whisper STT)
        ├── aceService.js             # NVIDIA ACE stub (used by VoiceScreen mock)
        └── knowledgeService.js       # Knowledge base search (used by LibraryScreen)

Getting Started

Prerequisites

Node.js 20+
Expo CLI
Xcode (for iOS Simulator) or Expo Go on a physical device
An OpenAI API key
An ElevenLabs API key (optional — enables vowel-accurate lip sync; falls back to amplitude-based sync without it)

Install

git clone <repo-url>
cd DementiaGuideAi
npm install

Run

# iOS Simulator
npx expo start --ios

# Android
npx expo start --android

# Clear Metro cache if needed
npx expo start --ios --clear

API Key Setup

Enter your API keys in the app under Settings → AI Configuration:

OpenAI key — required for chat, STT (Whisper), and fallback TTS
ElevenLabs key — optional; enables the full viseme lip sync pipeline

Both keys are stored on the device via expo-secure-store. They are sent only as credentials on requests to OpenAI and ElevenLabs respectively.
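
Because the ElevenLabs key is optional, speech synthesis needs a provider choice at call time. The sketch below is one way this could look, not the code in ttsService.js: `synthesizeSpeech` and the injected `providers` object are hypothetical names, and falling back to OpenAI when an ElevenLabs request fails (rather than only when the key is missing) is an assumption.

```javascript
// Hypothetical provider selection: try ElevenLabs when a key is set, else use OpenAI.
// `providers.elevenLabs` and `providers.openai` are injected async functions
// (assumed shapes) so the selection logic can be exercised without network access.
async function synthesizeSpeech(text, { elevenLabsKey, openaiKey }, providers) {
  if (elevenLabsKey) {
    try {
      const { audio, alignment } = await providers.elevenLabs(text, elevenLabsKey);
      return { provider: "elevenlabs", audio, alignment };
    } catch (error) {
      // Assumed behaviour: fall through to OpenAI instead of failing the whole turn.
    }
  }
  if (!openaiKey) {
    throw new Error("No usable TTS provider: an OpenAI key is required");
  }
  const { audio } = await providers.openai(text, openaiKey);
  // No alignment data here, so the caller would use amplitude-based lip sync.
  return { provider: "openai", audio, alignment: null };
}
```

A `null` alignment in the result is the signal to play the audio without a viseme timeline.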

Voice Conversation Pipeline

The Voice screen runs a fully pipelined conversation flow managed by useAvatarConversation.js:

[Microphone] → expo-av recording
     ↓
[Whisper STT] → transcribed text
     ↓
[OpenAI gpt-4o-mini stream] → tokens arrive incrementally and are grouped into sentences
     ↓
[ElevenLabs TTS] ← fires immediately per sentence, in parallel
     ↓
[Viseme timeline] ← character alignment → mouth shape keyframes
     ↓
[AvatarVRM WebView] → plays audio + drives 5 blend shapes in real time

Each sentence is sent to TTS as soon as it completes in the LLM stream, so the avatar begins speaking the first sentence while later sentences are still being generated.
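
To illustrate the sentence-by-sentence hand-off, here is a minimal sketch of a buffer that collects streamed text and emits each sentence as soon as it is complete. `createSentenceBuffer` is a hypothetical helper, not a function from useAvatarConversation.js, and the boundary rule (`.`, `!`, or `?` followed by whitespace) is an assumption that will split abbreviations such as "Dr. Smith".

```javascript
// Hypothetical helper: buffers streamed LLM text and emits whole sentences.
function createSentenceBuffer(onSentence) {
  let buffer = "";
  // Assumed rule: a sentence ends at ., !, or ? followed by whitespace.
  const boundary = /([.!?]+)\s+/;

  return {
    // Call with each streamed chunk of text.
    push(chunk) {
      buffer += chunk;
      let match;
      while ((match = boundary.exec(buffer)) !== null) {
        const end = match.index + match[1].length;
        const sentence = buffer.slice(0, end).trim();
        buffer = buffer.slice(match.index + match[0].length);
        if (sentence) onSentence(sentence);
      }
    },
    // Call once when the stream ends to emit any trailing text.
    flush() {
      const rest = buffer.trim();
      buffer = "";
      if (rest) onSentence(rest);
    },
  };
}
```

In a pipeline like the one above, `onSentence` would start a TTS request immediately and push the resulting promise onto a playback queue, so synthesis runs in parallel while audio still plays in order.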

Avatar (AvatarVRM)

The avatar is a .vrm model rendered inside a React Native WebView using Three.js and @pixiv/three-vrm. All animation runs in the embedded browser context and communicates back to React Native via postMessage.

State machine: idle → listening → thinking → speaking

Each state drives:

Body bob and sway amplitude
Head look-around frequency and range
Thinking gaze bias (up-right)
Breathing depth on spine/chest bones
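
As a rough illustration of how the state could be derived and used, the sketch below resolves one state from the `isListening`, `isThinking`, and `isSpeaking` props and looks up motion parameters for it. The priority order, the parameter names, and every numeric value are assumptions made for the example, not values from AvatarVRM.js.

```javascript
// Hypothetical per-state motion parameters. All numbers are illustrative only.
const STATE_PARAMS = {
  idle:      { bobAmplitude: 0.004, lookAroundHz: 0.15, breathingDepth: 1.0, gazeBias: { x: 0, y: 0 } },
  listening: { bobAmplitude: 0.002, lookAroundHz: 0.05, breathingDepth: 0.8, gazeBias: { x: 0, y: 0 } },
  // Positive x/y stands in for the up-right thinking gaze.
  thinking:  { bobAmplitude: 0.003, lookAroundHz: 0.10, breathingDepth: 0.9, gazeBias: { x: 0.3, y: 0.2 } },
  speaking:  { bobAmplitude: 0.006, lookAroundHz: 0.20, breathingDepth: 1.1, gazeBias: { x: 0, y: 0 } },
};

// Assumed priority when several flags are set: speaking > thinking > listening > idle.
function resolveAvatarState({ isListening, isThinking, isSpeaking }) {
  if (isSpeaking) return "speaking";
  if (isThinking) return "thinking";
  if (isListening) return "listening";
  return "idle";
}

// Look up the motion parameters the render loop would read each frame.
function paramsForProps(props) {
  return STATE_PARAMS[resolveAvatarState(props)];
}
```

Keeping the parameters in one table means a new state only needs a new row rather than new branches in the animation loop.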

Lip sync — ElevenLabs viseme path (primary)

ElevenLabs returns character-level timestamps alongside the audio. These are converted into a viseme frame sequence by createVisemeTimeline.js, mapping characters to one of five VRM blend shapes: aa (open), ih (smile-open), ou (round), ee (wide), oh (rounded-open). During playback, the WebView tracks AudioContext.currentTime each frame, binary-searches the viseme timeline, and cross-fades between the active and next frame over the final 20% of each frame's duration.
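
The sketch below shows one way the two steps described above could fit together: building a timeline from character timings, then sampling it at a playback time with a binary search and a cross-fade over the final 20% of a frame. It is not the code in createVisemeTimeline.js. The `CHAR_TO_VISEME` table, the input shape (`characters`, `startTimes`, `endTimes` in seconds), and the function names are all assumptions.

```javascript
// Hypothetical character → viseme table; the real mapping lives in phonemeMap.js.
const CHAR_TO_VISEME = { a: "aa", i: "ih", u: "ou", w: "ou", e: "ee", y: "ee", o: "oh" };

// Build [{ viseme, start, end }] frames from character-level timings.
// Assumed input: parallel arrays of characters and start/end times in seconds.
function createVisemeTimeline({ characters, startTimes, endTimes }) {
  const frames = [];
  characters.forEach((ch, i) => {
    const viseme = CHAR_TO_VISEME[ch.toLowerCase()];
    if (!viseme) return; // unmapped characters leave a gap (mouth relaxed)
    const last = frames[frames.length - 1];
    if (last && last.viseme === viseme && last.end === startTimes[i]) {
      last.end = endTimes[i]; // merge back-to-back identical shapes
    } else {
      frames.push({ viseme, start: startTimes[i], end: endTimes[i] });
    }
  });
  return frames;
}

// Binary search for the frame containing time t; returns -1 when none is active.
function findFrameIndex(frames, t) {
  let lo = 0;
  let hi = frames.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    if (t < frames[mid].start) hi = mid - 1;
    else if (t >= frames[mid].end) lo = mid + 1;
    else return mid;
  }
  return -1;
}

// Blend-shape weights at time t, cross-fading into the next frame
// over the final fadeFraction of the active frame's duration.
function sampleVisemes(frames, t, fadeFraction = 0.2) {
  const weights = { aa: 0, ih: 0, ou: 0, ee: 0, oh: 0 };
  const i = findFrameIndex(frames, t);
  if (i === -1) return weights;
  const frame = frames[i];
  const next = frames[i + 1];
  const fadeStart = frame.end - (frame.end - frame.start) * fadeFraction;
  if (t <= fadeStart || !next) {
    weights[frame.viseme] = 1;
    return weights;
  }
  const k = (t - fadeStart) / (frame.end - fadeStart); // 0 → 1 across the fade
  weights[frame.viseme] += 1 - k;
  weights[next.viseme] += k;
  return weights;
}
```

In the WebView, `sampleVisemes` would be called once per animation frame with `AudioContext.currentTime` minus the playback start time, and the returned weights applied to the VRM expressions.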

Lip sync — RMS fallback path (OpenAI TTS or no ElevenLabs key)

When no alignment data is available, a Web Audio AnalyserNode measures RMS amplitude per frame and maps it to the aa blend shape, producing open/close jaw movement that tracks the audio loudness.
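
A minimal sketch of the amplitude path, assuming time-domain samples in the range [-1, 1]: compute RMS for the current block, scale it, and smooth it into an `aa` weight. The `gain` and `smoothing` values and the function names are assumptions chosen for illustration.

```javascript
// Root-mean-square amplitude of one block of time-domain samples in [-1, 1].
function computeRms(samples) {
  if (samples.length === 0) return 0;
  let sumSquares = 0;
  for (let i = 0; i < samples.length; i++) {
    sumSquares += samples[i] * samples[i];
  }
  return Math.sqrt(sumSquares / samples.length);
}

// Returns an update function that maps each block's RMS to an "aa" weight in [0, 1].
// gain and smoothing are assumed tuning values, not taken from the app.
function createJawTracker({ gain = 4, smoothing = 0.5 } = {}) {
  let weight = 0;
  return function update(samples) {
    const target = Math.min(1, computeRms(samples) * gain);
    // Exponential smoothing keeps the jaw from snapping open and shut.
    weight += (target - weight) * (1 - smoothing);
    return weight;
  };
}
```

In a browser context the samples would typically come from `analyser.getFloatTimeDomainData(buffer)` on each animation frame, with the returned weight written to the `aa` blend shape.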

Recovery: If the WebGL context is lost (iOS background eviction, Android process kill), the WebView automatically remounts.

Custom VRM model: Pass a modelUrl prop to AvatarVRM to use any publicly hosted .vrm file.

<AvatarVRM
  ref={avatarRef}
  modelUrl="https://example.com/your-model.vrm"
  isListening={listening}
  isSpeaking={speaking}
  isThinking={thinking}
  width={300}
  height={420}
/>

// Play TTS audio with viseme lip sync (ElevenLabs path)
await avatarRef.current.playAudio({ audio: base64DataUri, visemeTimeline });

// Play TTS audio with RMS fallback
await avatarRef.current.playAudio(base64DataUri);

// Stop early
avatarRef.current.stopAudio();

RAG Pipeline

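The chat is powered by a RAG pipeline in src/services/openaiService.js that runs on the device with no backend server: retrieval happens locally over cached embeddings, while embedding and chat requests go to the OpenAI API.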

Setting	Value
Embedding model	`text-embedding-3-small` (1536 dims)
Chat model	`gpt-4o-mini`
Context window	Last 6 messages
Retrieval	Top-5 chunks, min similarity 0.25
Embedding cache	AsyncStorage key `kb_embeddings_v2`
Message history	AsyncStorage key `chat_messages_v1` (max 100)

The knowledge base (src/data/knowledgeBase.js) contains 42 curated dementia care chunks across 6 categories: caregiving, clinical, behavioral best practices, home safety, wellbeing, and communication.
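
The retrieval settings in the table (top 5 chunks, minimum similarity 0.25) can be sketched as a scored filter over the embedded chunks. This is an illustration, not the code in openaiService.js: cosine similarity as the metric, the `embedding` field on each chunk, and the function names are assumptions.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every chunk, drop those below the threshold, and keep the best topK.
// Each chunk is assumed to carry a precomputed `embedding` array.
function retrieveChunks(queryEmbedding, chunks, { topK = 5, minSimilarity = 0.25 } = {}) {
  return chunks
    .map((chunk) => ({ ...chunk, score: cosineSimilarity(queryEmbedding, chunk.embedding) }))
    .filter((chunk) => chunk.score >= minSimilarity)
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}
```

With only 42 chunks, a linear scan like this is cheap enough to run on every question, so no vector index is needed.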

Testing RAG output

OPENAI_API_KEY=sk-... node scripts/test-responses.mjs

Runs a set of sample questions through the full pipeline and prints each response alongside the retrieved knowledge base chunks and their similarity scores. Edit the QUESTIONS array in the script to test specific queries.

Design System

Token	Value	Use
Primary	`#4A7C8E`	Buttons, links, user bubbles
Secondary	`#7FB5A0`	Accents, success states
Accent	`#E8956D`	Warnings, speaking state
Background	`#F7F5F2`	App background
Surface	`#FFFFFF`	Cards, nav bar
Text Primary	`#1E2D3D`	Body and headings

Accessibility:

Minimum 44×44pt tap targets
accessibilityLabel and accessibilityRole on all interactive elements
Configurable text size (small / medium / large)
High contrast mode toggle
Subtitle and audio toggles for avatar responses
Haptic feedback toggle

Disclaimer

DementiaGuide AI provides information for general guidance only. It is not a substitute for professional medical advice, diagnosis, or treatment. Always consult a qualified healthcare provider for dementia-related concerns.
