11. Deployment Checklist
11.1. Phase 1: Development (Week 1-4)
Set up Epitran + PanPhon in Pitt environment.
Implement IPA normalisation library with tests.
Design and create toponym_index schema in Elasticsearch.
Create ipa_index template.
Build proof-of-concept with 10k toponyms (single language).
Benchmark query latency and accuracy.
Test ONNX model export and Django integration.
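The IPA normalisation item above can be illustrated with a minimal sketch. The rule set here (decompose to NFD, drop stress marks, syllable breaks and tie bars, map ASCII lookalikes to their IPA code points) is an assumption for illustration, not the project's actual rules, and the name normalise_ipa is hypothetical.

```python
import unicodedata

# Hypothetical rule set: characters dropped during normalisation.
_STRIP = {
    "\u02c8",  # primary stress mark
    "\u02cc",  # secondary stress mark
    ".",       # syllable break
    "\u0361",  # tie bar above
    "\u035c",  # tie bar below
}

# Hypothetical substitutions for ASCII lookalikes of IPA symbols.
_REPLACE = {
    "g": "\u0261",  # ASCII g -> IPA script g
    ":": "\u02d0",  # colon -> IPA length mark
}


def normalise_ipa(text: str) -> str:
    """Return a canonical form of an IPA string under the assumed rules."""
    # NFD keeps diacritics as separate code points (an assumption about
    # what the downstream feature extraction prefers).
    decomposed = unicodedata.normalize("NFD", text.strip())
    out = []
    for ch in decomposed:
        if ch in _STRIP or ch.isspace():
            continue
        out.append(_REPLACE.get(ch, ch))
    return "".join(out)
```

Keeping the rules in two small tables makes each one easy to cover with a unit test, which is what the checklist item asks for.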
11.2. Phase 2: Initial Migration (Week 5-8)
Extract all WHG toponyms from place_index to Pitt.
Create toponym_index and populate from existing place data.
Generate IPA for all toponyms (batch processing).
Create initial rule-based embeddings (PanPhon).
Bulk push to Elasticsearch staging environment.
Validate data integrity (checksums, counts).
Test query pipeline with production-scale data.
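The integrity validation step (checksums, counts) could be approached along the following lines. This is a sketch under assumptions: records are taken to be JSON-serialisable dictionaries, and the helper name summarise is hypothetical. It produces a record count plus an order-independent digest, so the source extract and the staging index can be compared even when documents come back in a different order.

```python
import hashlib
import json
from typing import Iterable, Mapping, Tuple

_MODULUS = 2 ** 256  # keeps the combined digest at a fixed width


def summarise(records: Iterable[Mapping]) -> Tuple[int, str]:
    """Return (count, digest) for a stream of records, ignoring order."""
    count = 0
    combined = 0
    for rec in records:
        # Canonical JSON so that key order does not change the hash.
        canonical = json.dumps(
            rec, sort_keys=True, ensure_ascii=False, separators=(",", ":")
        )
        digest = hashlib.sha256(canonical.encode("utf-8")).digest()
        # Modular addition is commutative, so record order is irrelevant;
        # unlike XOR, duplicated records do not cancel each other out.
        combined = (combined + int.from_bytes(digest, "big")) % _MODULUS
        count += 1
    return count, f"{combined:064x}"
```

Running summarise over the extract and over a scroll of the staging index, then comparing the two tuples, gives a quick sanity check. It is meant to catch accidental loss or corruption, not deliberate tampering.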
11.3. Phase 3: Model Training (Week 9-12)
Construct training dataset (positive/negative pairs).
Train Siamese BiLSTM (baseline 64-dim).
Evaluate on held-out test set.
Select optimal embedding dimension.
Export model to ONNX format.
Test inference model performance in Django.
Re-embed all IPA entries with trained model.
Deploy to staging with version v1.
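Constructing the positive/negative pairs for the Siamese model might look like the sketch below. It assumes that positives are IPA variants attested for the same place and that negatives are drawn at random from other places; the checklist does not specify the sampling strategy, so both the strategy and the name build_pairs are hypothetical.

```python
import itertools
import random
from typing import Dict, List, Tuple

# (ipa_a, ipa_b, label) where label 1 = same place, 0 = different place
Pair = Tuple[str, str, int]


def build_pairs(
    variants_by_place: Dict[str, List[str]],
    negatives_per_positive: int = 1,
    seed: int = 0,
) -> List[Pair]:
    """Build labelled training pairs from place_id -> IPA variants."""
    rng = random.Random(seed)  # seeded for reproducible datasets
    place_ids = sorted(variants_by_place)
    pairs: List[Pair] = []
    for pid in place_ids:
        # Candidate places to sample negatives from.
        others = [p for p in place_ids if p != pid and variants_by_place[p]]
        for a, b in itertools.combinations(variants_by_place[pid], 2):
            pairs.append((a, b, 1))
            if not others:
                continue
            for _ in range(negatives_per_positive):
                other = rng.choice(others)
                pairs.append((a, rng.choice(variants_by_place[other]), 0))
    return pairs
```

Random negatives are the simplest choice. Sampling phonetically similar names from different places as hard negatives is a common refinement, but it is not shown here.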
11.4. Phase 4: Production Rollout (Week 13-16)
Deploy inference model to production Django servers.
Deploy updated indices to production Elasticsearch.
Enable query pipeline with feature flag (10% traffic).
Monitor for errors and performance issues.
Gradually increase traffic (to 50%, then 100%).
Document operational runbooks.
Train support team on new search capabilities.
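The percentage-based feature flag in this phase can be sketched as deterministic hash bucketing. The function name in_rollout, the salt value, and the choice of a session or user identifier as the key are all assumptions for illustration; a real deployment might use an existing feature-flag package instead.

```python
import hashlib


def in_rollout(key: str, percent: int, salt: str = "phonetic-search") -> bool:
    """Return True if `key` falls inside the first `percent` of 100 buckets."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    # Hashing salt + key gives a stable, roughly uniform bucket per key.
    digest = hashlib.sha256(f"{salt}:{key}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    return bucket < percent
```

Because the bucket depends only on the key and the salt, a user admitted at 10% stays admitted at 50% and 100%, so raising the percentage never switches the new pipeline off for someone who already had it.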
11.5. Phase 5: Continuous Improvement (Ongoing)
Monthly embedding refresh cycle.
Quarterly model retraining with new data.
User feedback analysis and training data enrichment.
Performance optimisation (cache tuning, index settings).
Model distillation for faster inference if needed.