An open-source, all-in-one vector database for building flexible, scalable, and future-proof AI applications
Traditional lexical search, based on term frequency models like BM25, is widely used and effective for many search applications. However, lexical search techniques require significant investment in time and expertise to tune them to account for the meaning or relevance of the terms searched. Today, more and more developers want to embed semantic understanding into their search applications. Enter machine learning embedding models that can encode the meaning and context of documents, images, and audio into vectors for similarity search. These embedded meanings can, in turn, be searched using the k-nearest neighbors (k-NN) functionality provided by OpenSearch.
Using OpenSearch as a vector database brings together the power of traditional search, analytics, and vector search in one complete package. OpenSearch’s vector database capabilities can accelerate artificial intelligence (AI) application development by reducing the effort for builders to operationalize, manage, and integrate AI-generated assets. Bring your models, vectors, and metadata into OpenSearch to power vector, lexical, and hybrid search and analytics, with performance and scalability built in.
What is a vector database?
Information comes in many forms: unstructured data, like text documents, rich media, and audio, and structured data, like geospatial coordinates, tables, and graphs. Innovations in AI have enabled the use of models, or embeddings, to encode all types of data into vectors. These vectors are data points in a high-dimensional space that capture the meaning and context of an asset, allowing search tools to find similar assets by searching for neighboring data points.
Vector databases allow you to store and index vectors and metadata, unlocking the ability to use low-latency queries to discover assets by degree of similarity. Typically powered by k-NN indexes built using algorithms like Hierarchical Navigable Small Worlds (HNSW) and Inverted File (IVF) System, vector databases augment k-NN functionality by providing a foundation for applications like data management, fault tolerance, resource access controls, and a query engine.
OpenSearch provides an integrated vector database that can support AI systems by serving as a knowledge base. This benefits AI applications like generative AI and natural language search by providing a long-term memory of AI-generated outputs. These outputs can be used to enhance information retrieval and analytics, improve efficiency and stability, and give generative AI models a broader and deeper pool of data from which to draw more accurate and truthful responses to queries.