14. Summary
This architecture provides a scalable, multilingual, phonetic-aware search system fully integrated with the existing WHG infrastructure. It leverages:
Epitran + PanPhon for linguistically grounded IPA generation.
Pitt CRC for heavy compute (model training, bulk processing).
DigitalOcean for responsive, real-time search.
Versioned embeddings for safe, iterative improvement.
Multiple fallback paths for robustness.
The design prioritises operational stability, data integrity, and incremental deployment while enabling multilingual phonetic search for historical place research.
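As a rough illustration of how versioned embeddings and multiple fallback paths can combine, the sketch below tries a list of search strategies in order and returns the first non-empty result. It is not the WHG implementation: the function names (`search_with_fallback`, `phonetic_v2`, `phonetic_v1`, `plain_text`), the toy place list, and the lookup table are all hypothetical, and real strategies would query an embedding index or search backend instead.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical strategy type: takes a query, returns matching place names.
SearchFn = Callable[[str], List[str]]

PLACES = ["Köln", "Cologne", "Colonia Agrippina"]  # toy data, for illustration only


def phonetic_v2(query: str) -> List[str]:
    # Stand-in for the newest embedding version; simulates an unavailable index.
    raise RuntimeError("embedding index v2 unavailable")


def phonetic_v1(query: str) -> List[str]:
    # Stand-in for an older embedding version, faked here with a lookup table.
    return {"koln": ["Köln"]}.get(query.lower(), [])


def plain_text(query: str) -> List[str]:
    # Last resort: case-insensitive substring match on the place names.
    return [p for p in PLACES if query.lower() in p.lower()]


def search_with_fallback(
    query: str, strategies: Sequence[Tuple[str, SearchFn]]
) -> Tuple[Optional[str], List[str]]:
    """Try each strategy in order; return (strategy name, results) for the first hit."""
    for name, strategy in strategies:
        try:
            results = strategy(query)
        except Exception:
            # A failing backend should not take the whole search down.
            continue
        if results:
            return name, results
    return None, []


STRATEGIES = [
    ("phonetic_v2", phonetic_v2),
    ("phonetic_v1", phonetic_v1),
    ("plain_text", plain_text),
]

print(search_with_fallback("koln", STRATEGIES))
print(search_with_fallback("Colo", STRATEGIES))
```

Returning the strategy name alongside the results makes it possible to log which path served a query, which helps when comparing embedding versions during incremental rollout.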
15. References
WHG Place Discussion #81 - Detailed phonetic search proposal
Epitran Documentation - G2P library
PanPhon Documentation - Phonetic feature vectors