AmicoScript is a local-first audio transcription tool powered by Whisper.
It provides:
- audio transcription
- optional speaker diarization
- transcript management and search
- export in multiple formats
Typical usage:
- Upload an audio file (or batch of files)
- Start a transcription job
- Monitor progress
- Retrieve the result
- Export or edit the transcript
GET /api/models
Returns available Whisper models.
GET /api/settings
Retrieve saved settings (e.g., Hugging Face token)
POST /api/settings
Save settings
POST /api/transcribe
Upload an audio file and start a transcription job.
Response:
{
"job_id": "string"
}GET /api/jobs/{id}/stream
Server-Sent Events (SSE) stream for real-time progress updates.
POST /api/jobs/{id}/cancel
Cancels a running transcription job.
GET /api/jobs/{id}/result
Returns the full transcription result in JSON format.
GET /api/jobs/{id}/export/{fmt}
Download transcript in one of the following formats:
- json
- srt
- txt
- md
Speaker diarization uses pyannote and requires:
- A Hugging Face account
- Acceptance of model licenses
- A valid
hf_token
Add your token via the settings endpoint or UI.
- Backend: Python + FastAPI
- Frontend: Static HTML served by FastAPI
- Processing: Background threads for transcription
- In-memory job state
- Temporary audio files (auto-deleted after ~1 hour)
To enable GPU acceleration:
- Use a CUDA-enabled PyTorch base image
- Update the Dockerfile accordingly
- Enable GPU support in docker-compose
- All processing is local
- No audio data is uploaded externally
- Performance depends on hardware and selected model
AmicoScript can now call a locally hosted LLM (e.g. Ollama or any service implementing a compatible /v1/chat/completions API) to produce higher-level analyses from transcripts. This includes:
- Summaries: concise meeting summaries highlighting topics and decisions.
- Action items: extracted tasks, owners, and deadlines where present.
- Translations: translate the full transcript into a target language using the LLM.
- Custom prompts: run arbitrary instructions against the transcript.
To use the AI analysis features, you need a compatible LLM service running locally. Ollama is the easiest option and is free.
macOS:
- Download from ollama.com
- Move Ollama.app to
/Applicationsand run it - The Ollama service will start automatically in the background
Windows:
- Download the installer from ollama.com
- Run the installer and follow the prompts
- Ollama runs as a background service automatically
Linux:
curl -fsSL https://ollama.ai/install.sh | shThen start the service:
ollama serveOnce Ollama is running, pull a model to download it locally:
ollama pull mistral # Fast, good for summaries
ollama pull neural-chat # Smaller, lighter weight
ollama pull llama2 # More capable, larger (~4GB)First pull takes time (model download), but subsequent loads are instant.
Check that Ollama is running at http://localhost:11434:
curl http://localhost:11434/api/tagsYou should see a JSON list of your downloaded models.
- Open AmicoScript and go to LLM Settings (sidebar)
- Set:
- Base URL:
http://localhost:11434(default) - Model Name: your chosen model (e.g.,
mistral) - API Key: leave blank (Ollama doesn't require one)
- Base URL:
- Click Test Connection to verify
Done! You can now use AI analysis features.
If running AmicoScript in Docker and Ollama on your host machine, use http://host.docker.internal:11434 as the base URL instead.
Key implementation notes
- Settings: LLM configuration is persisted to the same settings store used for HF tokens. The UI exposes a
LLM Settingspanel (base URL, model name, API key). - Backend endpoints:
GET /api/llm/settings— returns the current LLM configuration.POST /api/llm/settings— save LLM settings (llm_base_url,llm_model_name,llm_api_key).POST /api/llm/test-connection— quick connectivity test to the configured LLM.GET /api/llm/models— list models exposed by the LLM server (if supported).POST /api/llm/models/pull— fire-and-forget model pull (useful for Ollama's/api/pull).POST /api/recordings/{recording_id}/analyses— create a new analysis job for a recording.GET /api/recordings/{recording_id}/analyses— list past analyses for a recording.GET /api/recordings/{recording_id}/analyses/{analysis_id}— fetch a specific analysis result.
Streaming and SSE
Analyses execute as background jobs and stream incremental results to the client via the existing SSE job stream: GET /api/jobs/{job_id}/stream. The frontend subscribes to that stream and appends partial deltas as they arrive.
Example: start an analysis (curl)
curl -X POST "http://localhost:8002/api/recordings/<RECORDING_ID>/analyses" \
-F analysis_type=summary \
-F output_language=EnglishExample: test LLM connection (curl)
curl -X POST "http://localhost:8002/api/llm/test-connection"Notes & references
- Ollama HTTP API (example server): https://docs.ollama.com/
- SSE (EventSource) streaming pattern: https://developer.mozilla.org/en-US/docs/Web/API/EventSource
- Settings location on disk:
~/.amicoscript/settings.json(containsllm_base_url,llm_model_name,llm_api_key,hf_token, etc.)
Docker tip: when running the app in Docker and your LLM server runs on the host, use http://host.docker.internal:11434 as the base URL.
Running tests
Install pytest (if not already installed):
python -m pip install pytestRun the test suite:
pytest -q