8. Advantages of This Architecture
Minimal infrastructure changes: Only adds two new Elasticsearch indices (toponym_index, ipa_index) alongside the existing place_index.
Resource isolation: Heavy compute (IPA generation, model training) is offloaded to Pitt CRC.
Global deduplication: ipa_index eliminates redundant storage and computation.
Versioned embeddings: Safe model updates with rollback capability.
Graceful degradation: Multiple fallback paths ensure search always returns results.
Scalable: Handles millions of toponyms without DigitalOcean resource pressure.
Maintainable: Clear separation of concerns (online vs. offline processing).
Real-time inference: On-the-fly embedding generation enables immediate query response without pre-indexing all possible queries.
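
The graceful-degradation point above can be illustrated with a minimal sketch of an ordered fallback chain. This is not the project's actual implementation: the function `search_with_fallbacks` and the three strategy names (`phonetic_search`, `fuzzy_text_search`, `exact_text_search`) are hypothetical stand-ins, and the real fallback paths are whatever the architecture defines elsewhere. The sketch only shows the general pattern of trying the preferred path first and falling through on errors or empty results.

```python
from typing import Callable, Iterable, List, Optional, Tuple

SearchFn = Callable[[str], List[str]]


def search_with_fallbacks(
    query: str, strategies: Iterable[Tuple[str, SearchFn]]
) -> Tuple[Optional[str], List[str]]:
    """Try each strategy in order; return the first non-empty result set."""
    for name, strategy in strategies:
        try:
            hits = strategy(query)
        except Exception:
            # A failing backend (e.g. an unavailable model) should not
            # fail the whole request; move on to the next strategy.
            continue
        if hits:
            return name, hits
    return None, []


# Hypothetical stand-ins for real search backends (assumptions, not project code).
def phonetic_search(query: str) -> List[str]:
    raise RuntimeError("embedding model unavailable")


def fuzzy_text_search(query: str) -> List[str]:
    return []  # simulate "no matches"


def exact_text_search(query: str) -> List[str]:
    return [f"place:{query.lower()}"]


strategies = [
    ("phonetic", phonetic_search),
    ("fuzzy", fuzzy_text_search),
    ("exact", exact_text_search),
]

used, hits = search_with_fallbacks("Pittsburgh", strategies)
print(used, hits)  # exact ['place:pittsburgh']
```

Keeping the strategies in an ordered list makes the degradation order explicit and easy to change without touching the calling code.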