AI models for natural language can enable fundamentally new user experiences in OpenSearch. However, turning demos into production-ready features is challenging, because AI output must be reliable and accurate. This talk addresses these challenges in the context of building new AI-powered features for query generation and data exploration in OpenSearch. We’ll examine some risks of language models, suggest mitigations, and share our experience building a reliable query generator.