AI-powered document translator that preserves Word formatting, images, and layout. Built with Flask and Google Gemini API.
- Translates .docx files while preserving:
  - Bold and italic formatting
  - Images and graphics
  - Tables and headers/footers
  - Tab spacing and alignment
- Web-based interface with drag-and-drop upload
- Automatic file cleanup
- Error recovery with checkpoint system
- Progress tracking support
pip install -r requirements.txt

Copy the example env file and add your API key:

cp .env.example .env

Then edit .env and set your Gemini API key:
GEMINI_API_KEY=your_google_gemini_api_key_here
TARGET_LANGUAGE=German
BATCH_SIZE=25
BATCH_DELAY=0.5
MAX_FILE_SIZE_MB=10
FILE_CLEANUP_HOURS=1

Important: Get your Gemini API key from https://aistudio.google.com/app/apikey
Option A: Python directly
python app.py

Option B: Windows batch file

StartTranslator.bat

The server will start at http://127.0.0.1:5000
- Open http://127.0.0.1:5000 in your browser
- Drag and drop a .docx file (or click to select)
- Click "Translate & Download"
- Wait 10-30 seconds (depending on document size)
- Translated document will download automatically
Edit the .env file to customize:
| Variable | Description | Default |
|---|---|---|
| `GEMINI_API_KEY` | Your Google Gemini API key | (required) |
| `TARGET_LANGUAGE` | Translation target language | German |
| `BATCH_SIZE` | Segments per API call (1-100) | 25 |
| `BATCH_DELAY` | Delay between batches (seconds) | 0.5 |
| `MAX_FILE_SIZE_MB` | Maximum upload size (MB) | 10 |
| `FILE_CLEANUP_HOURS` | Auto-delete files after this many hours | 1 |
| `FLASK_DEBUG` | Enable Flask debug mode | True |
| `FLASK_PORT` | Server port | 5000 |
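
As an illustration of how the variables in the table might be read and validated, here is a minimal sketch. It is not the project's actual `config.py`: the `Settings` class and `load_settings` function are hypothetical names, and only the variable names, defaults, and the 1-100 range for `BATCH_SIZE` come from the table above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the documented .env variables."""
    gemini_api_key: str
    target_language: str
    batch_size: int
    batch_delay: float
    max_file_size_mb: int
    file_cleanup_hours: int
    flask_debug: bool
    flask_port: int


def load_settings(env=None):
    """Build Settings from a mapping of environment variables.

    Falls back to the defaults listed in the table when a variable is unset.
    """
    env = os.environ if env is None else env

    api_key = env.get("GEMINI_API_KEY", "").strip()
    if not api_key:
        # Mirrors the error named in the troubleshooting notes.
        raise ValueError("GEMINI_API_KEY not found")

    batch_size = int(env.get("BATCH_SIZE", 25))
    if not 1 <= batch_size <= 100:
        raise ValueError("BATCH_SIZE must be between 1 and 100")

    # Assumption: any of "1", "true", "yes" (case-insensitive) enables debug.
    debug = str(env.get("FLASK_DEBUG", "True")).lower() in ("1", "true", "yes")

    return Settings(
        gemini_api_key=api_key,
        target_language=env.get("TARGET_LANGUAGE", "German"),
        batch_size=batch_size,
        batch_delay=float(env.get("BATCH_DELAY", 0.5)),
        max_file_size_mb=int(env.get("MAX_FILE_SIZE_MB", 10)),
        file_cleanup_hours=int(env.get("FILE_CLEANUP_HOURS", 1)),
        flask_debug=debug,
        flask_port=int(env.get("FLASK_PORT", 5000)),
    )
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise in tests, for example `load_settings({"GEMINI_API_KEY": "k", "TARGET_LANGUAGE": "French"})`.
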
Currently configured for German, but you can change TARGET_LANGUAGE to:
- French
- Spanish
- Italian
- Portuguese
- Dutch
- Any language supported by Google Gemini
If translation fails mid-process:
- Checkpoint files (`.checkpoint`, `.tmp`) are created automatically
- Re-running translation will resume from the last successful batch
- Checkpoint files are cleaned up after successful completion
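
The resume behavior described above can be sketched roughly as follows. This is an assumption-laden illustration, not the code in `translator_core.py`: the function name, the JSON checkpoint format, and the `translate_batch` callback are all hypothetical. Only the `.checkpoint` and `.tmp` suffixes, the batch-wise processing, and the cleanup on success come from this README.

```python
import json
import os
import time


def translate_with_checkpoint(segments, translate_batch, checkpoint_path,
                              batch_size=25, batch_delay=0.5):
    """Translate segments batch by batch, resuming from a checkpoint if present.

    translate_batch is a hypothetical callable: list[str] -> list[str].
    """
    done = []
    if os.path.exists(checkpoint_path):
        # A previous run stopped early: reload what it already translated.
        with open(checkpoint_path, encoding="utf-8") as fh:
            done = json.load(fh)

    # The checkpoint is only written after a full batch succeeds,
    # so len(done) is always a safe place to resume from.
    for start in range(len(done), len(segments), batch_size):
        batch = segments[start:start + batch_size]
        done.extend(translate_batch(batch))

        # Write to a .tmp file first, then swap it in, so a crash during
        # the write cannot leave a half-written checkpoint behind.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as fh:
            json.dump(done, fh)
        os.replace(tmp_path, checkpoint_path)

        if batch_delay > 0:
            time.sleep(batch_delay)  # pause between API calls

    # Successful completion: the checkpoint is no longer needed.
    if os.path.exists(checkpoint_path):
        os.remove(checkpoint_path)
    return done
```

If `translate_batch` raises partway through, the checkpoint from the last successful batch stays on disk, and calling the function again with the same `checkpoint_path` continues from that point.
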
AutomatedDocxTranslator/
├── app.py # Flask web application
├── translator_core.py # Core translation logic
├── config.py # Centralized configuration
├── .env # Environment variables (not in git)
├── requirements.txt # Python dependencies
├── templates/
│ └── index.html # Web UI
├── validators.py # Document validation
├── tests/ # Unit tests
├── uploads/ # Temporary uploaded files
└── downloads/ # Temporary translated files
- Never commit the `.env` file to version control (it is already in `.gitignore`)
- API keys are loaded from environment variables only
- Uploaded files are automatically deleted after configured hours
- Maximum file size is enforced (default 10MB)
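
To make the file size rule concrete, here is a small sketch of an upload check. The `validate_upload` helper and its messages are hypothetical and are not taken from `validators.py`; the `.docx` restriction and the 10 MB default are the only parts drawn from this README.

```python
import os


def validate_upload(filename, size_bytes, max_file_size_mb=10):
    """Return an error message for a rejected upload, or None if it is acceptable."""
    # Compare the extension case-insensitively so "Report.DOCX" is accepted.
    extension = os.path.splitext(filename)[1].lower()
    if extension != ".docx":
        return "Only .docx files are supported"

    # MAX_FILE_SIZE_MB is expressed in megabytes; convert to bytes to compare.
    if size_bytes > max_file_size_mb * 1024 * 1024:
        return f"File exceeds the {max_file_size_mb} MB limit"

    return None
```

Flask can also cap the request body size through its `MAX_CONTENT_LENGTH` config key. Whether this project relies on that setting is not stated here.
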
"GEMINI_API_KEY not found" error:
- Make sure the `.env` file exists in the project root
- Check that `GEMINI_API_KEY` is set correctly
Translation fails mid-process:
- Check your internet connection
- Verify API key is valid
- Check checkpoint files (`.checkpoint`) for resume capability
Files not cleaning up:
- Verify `FILE_CLEANUP_HOURS` in `.env`
- Check console for `[CLEANUP]` messages
- Scheduler runs every N hours (not immediately)
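
When debugging cleanup, it can help to picture what an age-based sweep looks like. The sketch below is one plausible shape for it, not the project's scheduler: the `cleanup_old_files` function is hypothetical, and the exact wording of the log line is an assumption. Only the `[CLEANUP]` prefix and the hours-based threshold come from this README.

```python
import os
import time


def cleanup_old_files(folder, max_age_hours, now=None):
    """Delete regular files in folder whose modification time is older than the threshold.

    Returns the sorted names of the files that were removed.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_hours * 3600  # hours -> seconds

    removed = []
    for name in os.listdir(folder):
        path = os.path.join(folder, name)
        # Skip subdirectories; only plain files are candidates.
        if os.path.isfile(path) and os.path.getmtime(path) < cutoff:
            os.remove(path)
            removed.append(name)

    if removed:
        # Assumed log format; only the [CLEANUP] prefix is documented.
        print(f"[CLEANUP] Removed {len(removed)} file(s) from {folder}")
    return sorted(removed)
```

A sweep like this only removes files that are already past the threshold when it runs, which is consistent with files lingering for a while after upload.
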
pytest tests/ -v

MIT License