A modern iOS mobile application that acts as a digital library for dementia care information. Users can ask questions through text or voice and receive responses through a real-time 3D avatar with lip-sync driven by ElevenLabs character-level alignment.
DementiaGuide AI is designed for caregivers, family members, and healthcare professionals. The app provides evidence-based dementia care guidance through a calm, accessible, and emotionally supportive interface. The AI avatar — Aria — is a VRM model rendered in real time with natural speech, multi-shape lip-sync driven by ElevenLabs character-level alignment, and expressive idle animations.
| Layer | Technology |
|---|---|
| Framework | React Native (Expo SDK 54) |
| Navigation | React Navigation 7 (Bottom Tabs + Native Stack) |
| AI / RAG | OpenAI gpt-4o-mini + text-embedding-3-small |
| STT | OpenAI Whisper (whisper-1) via expo-av audio recording |
| TTS | ElevenLabs eleven_turbo_v2_5 (primary) · OpenAI tts-1 (fallback) |
| Lip Sync | ElevenLabs character-level alignment → viseme timeline → 5 VRM blend shapes |
| Avatar | VRM 3D model via Three.js r180 + @pixiv/three-vrm in a WebView |
| Animations | React Native Animated API |
| Gradients | expo-linear-gradient |
| Audio | expo-av · Web Audio API (WebView) |
| Haptics | expo-haptics |
| Safe Area | react-native-safe-area-context |
| Storage | @react-native-async-storage/async-storage · expo-secure-store |
| Screen | Description |
|---|---|
| Home | Avatar hero card, quick question chips, text/voice entry, navigation grid |
| Chat | iMessage-style conversation, typing indicator, clickable source links |
| Library | Searchable knowledge base across 6 dementia-care categories with article detail view |
| Voice | Full-screen voice UI — records via Whisper STT, streams LLM response, plays avatar speech sentence-by-sentence with lip sync |
| Settings | Accessibility controls — text size, contrast, audio, subtitles, haptics, privacy |
DementiaGuideAi/
├── App.js
├── babel.config.js
├── app.json # Expo config
├── scripts/
│   └── test-responses.mjs # CLI tool to test RAG output against sample questions
└── src/
    ├── navigation/
    │   └── AppNavigator.js # Bottom tab + stack navigator
    ├── screens/
    │   ├── HomeScreen.js
    │   ├── ChatScreen.js # GiftedChat UI, calls openaiService, shows sources
    │   ├── LibraryScreen.js
    │   ├── ArticleDetailScreen.js # Full article view from Library
    │   ├── VoiceScreen.js # Voice conversation UI (Whisper → LLM → TTS → avatar)
    │   └── ProfileScreen.js # AI configuration (API keys, privacy controls)
    ├── components/
    │   ├── AvatarVRM.js # VRM avatar in WebView (Three.js + viseme lip sync)
    │   ├── Avatar.js # Legacy animated avatar (idle/listening/speaking)
    │   ├── MessageCard.js # Chat bubble with sources and actions
    │   ├── CategoryCard.js # Library category row
    │   └── VoiceWaveform.js # 9-bar animated waveform
    ├── hooks/
    │   └── useAvatarConversation.js # Voice pipeline orchestration (STT → LLM stream → TTS queue → playback)
    ├── lib/
    │   ├── tts/
    │   │   ├── ttsService.js # TTS provider selection (ElevenLabs primary, OpenAI fallback)
    │   │   └── elevenLabsService.js # ElevenLabs API wrapper (audio + character alignment)
    │   └── lipsync/
    │       ├── createVisemeTimeline.js # Converts ElevenLabs alignment → viseme frame sequence
    │       └── phonemeMap.js # Character → VRM viseme mapping
    ├── constants/
    │   ├── colors.js
    │   ├── typography.js
    │   └── data.js # Categories, resources, sample messages
    ├── data/
    │   └── knowledgeBase.js # 42 dementia care knowledge chunks (7 per category)
    └── services/
        ├── openaiService.js # Full RAG pipeline (embeddings, semantic search, streaming chat, Whisper STT)
        ├── aceService.js # NVIDIA ACE stub (used by VoiceScreen mock)
        └── knowledgeService.js # Knowledge base search (used by LibraryScreen)
- Node.js 20+
- Expo CLI
- Xcode (for iOS Simulator) or Expo Go on a physical device
- An OpenAI API key
- An ElevenLabs API key (optional — enables vowel-accurate lip sync; falls back to amplitude-based sync without it)
git clone <repo-url>
cd DementiaGuideAi
npm install

# iOS Simulator
npx expo start --ios
# Android
npx expo start --android
# Clear Metro cache if needed
npx expo start --ios --clear

Enter your API keys in the app under Settings → AI Configuration:
- OpenAI key — required for chat, STT (Whisper), and fallback TTS
- ElevenLabs key — optional; enables the full viseme lip sync pipeline
Both keys are stored on the device via expo-secure-store and are sent only to their respective APIs (OpenAI and ElevenLabs) to authenticate requests.
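The optional ElevenLabs key implies some form of provider selection at synthesis time. The sketch below is an assumption about how that could look, not the actual contents of ttsService.js: `synthesizeSpeech`, `elevenLabsTts`, and `openAiTts` are hypothetical names, and falling back when an ElevenLabs request fails (rather than only when the key is missing) is a guess.

```javascript
// Hypothetical provider selection: ElevenLabs first, OpenAI tts-1 as fallback.
// `providers.elevenLabsTts` and `providers.openAiTts` are injected async
// functions (assumed names) so the logic can run without network access.
async function synthesizeSpeech(text, providers, keys) {
  if (keys.elevenLabs) {
    try {
      // Assumed to resolve to { audio, alignment } with character timings.
      const { audio, alignment } = await providers.elevenLabsTts(text, keys.elevenLabs);
      return { provider: "elevenlabs", audio, alignment };
    } catch (err) {
      // Assumption: any ElevenLabs failure falls through to OpenAI.
    }
  }
  const audio = await providers.openAiTts(text, keys.openAi);
  // No alignment data here, so the caller would use the RMS lip sync path.
  return { provider: "openai", audio, alignment: null };
}
```

Returning `alignment: null` on the fallback path gives the caller a single field to check when choosing between viseme and amplitude-based lip sync.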
The Voice screen runs a fully pipelined conversation flow managed by useAvatarConversation.js:
[Microphone] → expo-av recording
↓
[Whisper STT] → transcribed text
↓
[OpenAI gpt-4o-mini stream] → streamed tokens, grouped into sentences
↓
[ElevenLabs TTS] ← fires immediately per sentence, in parallel
↓
[Viseme timeline] ← character alignment → mouth shape keyframes
↓
[AvatarVRM WebView] → plays audio + drives 5 blend shapes in real time
Each sentence is sent to TTS as soon as it completes in the LLM stream — so the avatar begins speaking the first sentence while later sentences are still being generated.
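A minimal sketch of this pipelining, assuming a simple punctuation-based sentence boundary: `createSentenceBuffer`, `speakStream`, `synthesize`, and `play` are hypothetical names, not taken from useAvatarConversation.js. TTS requests start as soon as a sentence completes, while playback stays in sentence order.

```javascript
// Hypothetical helper: accumulates streamed tokens and emits complete sentences.
function createSentenceBuffer(onSentence) {
  let buffer = "";
  // Assumed boundary: ., ! or ? followed by whitespace. Abbreviations such
  // as "Dr. " would be split too early; a real splitter needs more care.
  const boundary = /[.!?]+\s+/;
  return {
    push(token) {
      buffer += token;
      let match;
      while ((match = boundary.exec(buffer)) !== null) {
        const end = match.index + match[0].length;
        const sentence = buffer.slice(0, end).trim();
        buffer = buffer.slice(end);
        if (sentence) onSentence(sentence);
      }
    },
    // Emits whatever is left once the stream has ended.
    flush() {
      const rest = buffer.trim();
      buffer = "";
      if (rest) onSentence(rest);
    },
  };
}

// Hypothetical pipeline: `tokens` is an (async) iterable of strings,
// `synthesize(sentence)` resolves to a clip, `play(clip)` resolves when done.
async function speakStream(tokens, synthesize, play) {
  let playback = Promise.resolve();
  const sentences = createSentenceBuffer((sentence) => {
    const clip = synthesize(sentence); // request starts immediately
    clip.catch(() => {}); // mark handled; the error still surfaces below
    playback = playback.then(async () => play(await clip)); // play in order
  });
  for await (const token of tokens) sentences.push(token);
  sentences.flush();
  await playback;
}
```

Chaining playback on a single promise keeps clips in order even when a later TTS request finishes before an earlier one.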
The avatar is a .vrm model rendered inside a React Native WebView using Three.js and @pixiv/three-vrm. All animation runs in the embedded browser context and communicates back to React Native via postMessage.
State machine: idle → listening → thinking → speaking
Each state drives:
- Body bob and sway amplitude
- Head look-around frequency and range
- Thinking gaze bias (up-right)
- Breathing depth on spine/chest bones
Lip sync — ElevenLabs viseme path (primary)
ElevenLabs returns character-level timestamps alongside the audio. These are converted into a viseme frame sequence by createVisemeTimeline.js, mapping characters to one of five VRM blend shapes: aa (open), ih (smile-open), ou (round), ee (wide), oh (rounded-open). During playback, the WebView tracks AudioContext.currentTime each frame, binary-searches the viseme timeline, and cross-fades between the active and next frame over the final 20% of each frame's duration.
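The timeline lookup described above can be sketched as follows. This is an illustration under stated assumptions, not the code in createVisemeTimeline.js or phonemeMap.js: the vowel-only `CHAR_TO_VISEME` table and the function names are hypothetical, and the alignment field names are those returned by the ElevenLabs with-timestamps endpoint.

```javascript
// Hypothetical character → viseme table; the real phonemeMap.js may differ.
const CHAR_TO_VISEME = { a: "aa", i: "ih", u: "ou", e: "ee", o: "oh" };

// Builds one frame per character from ElevenLabs alignment data.
function createVisemeTimeline(alignment) {
  const starts = alignment.character_start_times_seconds;
  const ends = alignment.character_end_times_seconds;
  return alignment.characters.map((ch, i) => ({
    viseme: CHAR_TO_VISEME[ch.toLowerCase()] ?? null, // null = mouth closed
    start: starts[i],
    end: ends[i],
  }));
}

// Binary search for the last frame whose start time is <= t; -1 if none.
function findFrameIndex(frames, t) {
  let lo = 0;
  let hi = frames.length - 1;
  let found = -1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    if (frames[mid].start <= t) {
      found = mid;
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  return found;
}

// Blend-shape weights at time t, cross-fading toward the next frame
// over the final 20% of the active frame's duration.
function visemeWeightsAt(frames, t) {
  const weights = { aa: 0, ih: 0, ou: 0, ee: 0, oh: 0 };
  const i = findFrameIndex(frames, t);
  if (i < 0 || t >= frames[i].end) return weights; // outside any frame
  const frame = frames[i];
  const next = frames[i + 1];
  const fadeStart = frame.start + 0.8 * (frame.end - frame.start);
  const blend = t > fadeStart ? (t - fadeStart) / (frame.end - fadeStart) : 0;
  if (frame.viseme) weights[frame.viseme] += 1 - blend;
  if (next && next.viseme) weights[next.viseme] += blend;
  return weights;
}
```

In the WebView, `t` would be derived from AudioContext.currentTime relative to the moment playback started, and the resulting weights applied to the VRM expressions each frame.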
Lip sync — RMS fallback path (OpenAI TTS or no ElevenLabs key)
When no alignment data is available, a Web Audio AnalyserNode measures RMS amplitude per frame and maps it to the aa blend shape, producing open/close jaw movement that tracks the audio loudness.
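A minimal sketch of the amplitude mapping, assuming time-domain samples in the range -1 to 1 such as those filled in by `AnalyserNode.getFloatTimeDomainData`. The function name `rmsToMouthOpen` and the `gain` value are hypothetical; the app's actual scaling is not documented here.

```javascript
// Maps the RMS amplitude of one analysis window to an `aa` weight in [0, 1].
// `gain` is an assumed tuning constant, since speech RMS is usually well below 1.
function rmsToMouthOpen(samples, gain = 4) {
  if (samples.length === 0) return 0;
  let sumSquares = 0;
  for (let i = 0; i < samples.length; i++) {
    sumSquares += samples[i] * samples[i];
  }
  const rms = Math.sqrt(sumSquares / samples.length);
  return Math.min(1, rms * gain);
}

// Assumed per-frame usage inside the WebView render loop:
//   const buffer = new Float32Array(analyser.fftSize);
//   analyser.getFloatTimeDomainData(buffer);
//   const aaWeight = rmsToMouthOpen(buffer);
```

Clamping to 1 keeps loud passages from overdriving the blend shape; some smoothing between frames would likely be needed to avoid jitter.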
Recovery: If the WebGL context is lost (iOS background eviction, Android process kill), the WebView automatically remounts.
Custom VRM model: Pass a modelUrl prop to AvatarVRM to use any publicly hosted .vrm file.
<AvatarVRM
ref={avatarRef}
modelUrl="https://example.com/your-model.vrm"
isListening={listening}
isSpeaking={speaking}
isThinking={thinking}
width={300}
height={420}
/>
// Play TTS audio with viseme lip sync (ElevenLabs path)
await avatarRef.current.playAudio({ audio: base64DataUri, visemeTimeline });
// Play TTS audio with RMS fallback
await avatarRef.current.playAudio(base64DataUri);
// Stop early
avatarRef.current.stopAudio();

The chat is powered by a fully client-side RAG pipeline in src/services/openaiService.js.
| Setting | Value |
|---|---|
| Embedding model | text-embedding-3-small (1536 dims) |
| Chat model | gpt-4o-mini |
| Context window | Last 6 messages |
| Retrieval | Top-5 chunks, min similarity 0.25 |
| Embedding cache | AsyncStorage key kb_embeddings_v2 |
| Message history | AsyncStorage key chat_messages_v1 (max 100) |
The knowledge base (src/data/knowledgeBase.js) contains 42 curated dementia care chunks across 6 categories: caregiving, clinical, behavioral best practices, home safety, wellbeing, and communication.
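The retrieval settings in the table (top-5 chunks, minimum similarity 0.25) can be illustrated with a small cosine-similarity ranking. This is a sketch, not the code in openaiService.js: the function names and the `{ id, embedding }` chunk shape are assumptions, and the use of cosine similarity as the scoring metric is also assumed.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Hypothetical retrieval step: score every chunk against the query embedding,
// drop weak matches, and keep the best `topK` in descending score order.
function retrieveChunks(queryEmbedding, chunks, options = {}) {
  const { topK = 5, minSimilarity = 0.25 } = options;
  return chunks
    .map((chunk) => ({
      chunk,
      score: cosineSimilarity(queryEmbedding, chunk.embedding),
    }))
    .filter((result) => result.score >= minSimilarity)
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}
```

With 42 chunks, a linear scan like this is cheap enough to run on-device for every query, so no vector index is needed.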
OPENAI_API_KEY=sk-... node scripts/test-responses.mjs

Runs a set of sample questions through the full pipeline and prints each response alongside the retrieved knowledge base chunks and their similarity scores. Edit the QUESTIONS array in the script to test specific queries.
| Token | Value | Use |
|---|---|---|
| Primary | #4A7C8E | Buttons, links, user bubbles |
| Secondary | #7FB5A0 | Accents, success states |
| Accent | #E8956D | Warnings, speaking state |
| Background | #F7F5F2 | App background |
| Surface | #FFFFFF | Cards, nav bar |
| Text Primary | #1E2D3D | Body and headings |
Accessibility:
- Minimum 44×44pt tap targets
- accessibilityLabel and accessibilityRole on all interactive elements
- Configurable text size (small / medium / large)
- High contrast mode toggle
- Subtitle and audio toggles for avatar responses
- Haptic feedback toggle
DementiaGuide AI provides information for general guidance only. It is not a substitute for professional medical advice, diagnosis, or treatment. Always consult a qualified healthcare provider for dementia-related concerns.
Private — all rights reserved.
