Amazon Nova 2 Multimodal Embeddings with Amazon S3 Vectors and AWS Java SDK – Part 1 Introduction

Introduction

Throughout this series, we’ll use Amazon Nova 2 Multimodal Embeddings to create embeddings and store them in Amazon S3 Vectors. In this article, we’ll introduce both services and give an overview of their key features. In the following parts, we’ll create and store text and image embeddings first, and then continue with audio and video. After that, we’ll run similarity searches for a query against the embeddings stored in S3 Vectors. The examples use the AWS Java SDK, because I see lots of examples in Python and not so many in Java, while Java’s popularity in the ML/AI space keeps growing. You can find the code examples in my GitHub repository amazon-nova-2-multimodal-embeddings. Please give it a star if you like it, and follow me on GitHub for more examples.

Introduction to Amazon Nova 2 Multimodal Embeddings

Amazon Nova Multimodal Embeddings is a multimodal embeddings model for agentic RAG and semantic search applications. It supports text, documents, images, video, and audio through a single model, enabling cross-modal retrieval. Nova Multimodal Embeddings maps each of these content types into a unified semantic space. This enables you to conduct unimodal, cross-modal, and multimodal vector operations.

When we pass a piece of content through Nova embeddings, the model converts that content into a universal numerical format. We refer to this format as a vector. A vector is a list of numerical values that we can use for various search functionalities. Vectors for similar content lie closer together than vectors for less similar content.
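To make "closer" concrete, here is a minimal sketch of cosine similarity, one common way to measure how close two vectors are. The class name and the 4-dimensional toy vectors are made up for illustration; they are not real Nova output, and real embeddings have hundreds or thousands of dimensions.

```java
// Minimal sketch: cosine similarity between two embedding vectors.
// The vectors in main are invented toy values, not output of the Nova model.
final class CosineSimilaritySketch {

    /** Returns the cosine of the angle between a and b (1.0 means same direction). */
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same dimension");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            throw new IllegalArgumentException("Cosine similarity is undefined for a zero vector");
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        float[] cat = {0.9f, 0.1f, 0.0f, 0.2f};
        float[] kitten = {0.8f, 0.2f, 0.1f, 0.3f};
        float[] invoice = {0.0f, 0.9f, 0.8f, 0.1f};

        // The "cat"/"kitten" pair should score noticeably higher than "cat"/"invoice".
        System.out.println("cat vs kitten:  " + cosine(cat, kitten));
        System.out.println("cat vs invoice: " + cosine(cat, invoice));
    }
}
```

A score near 1 means the two vectors point in almost the same direction, while a score near 0 means they are unrelated.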

Key Features of Nova Multimodal Embeddings

Key features of the Nova Multimodal Embeddings are:

Unified semantic space: Supports text, image, document image, video, and audio in a unified semantic space. The maximum context length is 8K tokens, or 30 seconds for video and 30 seconds for audio.
Synchronous and asynchronous APIs: You can invoke the model through either a synchronous or an asynchronous API.
Large file segmentation: The async API makes it easy to work with large inputs by providing API-built segmentation for long text, video, and audio, controlled by user-defined parameters. The model generates a single embedding for each segment.
Video with audio: Process video with audio simultaneously. Specify if you would like a single embedding representing both modalities or two separate embeddings.
Embedding purpose: Optimize your embeddings depending on the intended downstream application (retrieval/RAG/Search, classification, clustering).
Dimension sizes: Four dimension sizes (3072, 1024, 384, and 256) let you trade off embedding accuracy against vector storage cost.
Input methods: Pass content to be embedded by specifying an S3 URI or inline as a base64 encoding.
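To get a feeling for the storage side of the dimension trade-off, here is a rough back-of-the-envelope sketch. It assumes each vector component is stored as a 32-bit float (4 bytes) and ignores keys, metadata, and any index overhead, so treat the numbers as a lower bound for raw vector data and not as a pricing calculation. The class and method names are hypothetical.

```java
// Rough sketch: raw size of embedding data for each supported dimension size.
// Assumption: 4 bytes (float32) per component; keys, metadata, and index
// overhead are deliberately ignored.
final class EmbeddingStorageSketch {

    static final int[] SUPPORTED_DIMENSIONS = {3072, 1024, 384, 256};
    static final int BYTES_PER_COMPONENT = Float.BYTES; // 4 bytes for float32

    /** Returns the raw size in bytes of vectorCount vectors with the given dimension. */
    static long rawBytes(int dimension, long vectorCount) {
        boolean supported = false;
        for (int d : SUPPORTED_DIMENSIONS) {
            if (d == dimension) {
                supported = true;
                break;
            }
        }
        if (!supported) {
            throw new IllegalArgumentException("Unsupported dimension: " + dimension);
        }
        return (long) dimension * BYTES_PER_COMPONENT * vectorCount;
    }

    public static void main(String[] args) {
        long vectorCount = 1_000_000L; // hypothetical corpus size
        for (int dimension : SUPPORTED_DIMENSIONS) {
            double mib = rawBytes(dimension, vectorCount) / (1024.0 * 1024.0);
            System.out.println(String.format("%4d dimensions: %.1f MiB", dimension, mib));
        }
    }
}
```

Under these assumptions, a 256-dimensional vector needs one twelfth of the raw space of a 3072-dimensional one, which is the cost side of the accuracy trade-off.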

Introduction to Amazon S3 Vectors

Amazon S3 Vectors delivers purpose-built, cost-optimized vector storage for AI agents, inference, RAG, and semantic search. S3 Vectors is designed to provide the same elasticity, durability, and availability as Amazon S3 and delivers sub-second latency for infrequent queries and as low as 100 milliseconds for more frequent queries. You get a dedicated set of API operations to store, access, and query vector data without provisioning any infrastructure. S3 Vectors consists of several key components that work together:

Vector buckets – A new bucket type that’s purpose-built to store and query vectors.
Vector indexes – Within a vector bucket, you organize your vector data in vector indexes and run similarity queries against them.
Vectors – You store vectors in your vector index. For similarity search and AI applications, vectors are created as vector embeddings, which are numerical representations that preserve semantic relationships between content (such as text, images, or audio). With that, similar items are positioned closer together. S3 Vectors can perform similarity searches based on semantic meaning rather than exact matching by comparing how close vectors are to each other mathematically. When adding vector data to a vector index, you can also attach metadata for future filtering queries based on a set of conditions (for example, timestamps, categories, and user preferences).
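The relationship between these three components can be pictured with a small in-memory model. This is only a mental model written in plain Java; it is not the S3 Vectors API, and all class, method, bucket, and index names are hypothetical. It also assumes that an index accepts vectors of a single fixed dimension and that writing an existing key overwrites the stored vector.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// In-memory stand-in for the bucket -> index -> vector hierarchy.
// This is NOT the S3 Vectors API; all names are made up for illustration.
final class VectorHierarchySketch {

    /** A stored vector: a key, the embedding values, and optional metadata. */
    static final class StoredVector {
        final String key;
        final float[] data;
        final Map<String, Object> metadata;

        StoredVector(String key, float[] data, Map<String, Object> metadata) {
            this.key = key;
            this.data = data;
            this.metadata = metadata;
        }
    }

    /** A vector index holds vectors of one fixed dimension (an assumption of this sketch). */
    static final class VectorIndex {
        final String name;
        final int dimension;
        final Map<String, StoredVector> vectors = new LinkedHashMap<>();

        VectorIndex(String name, int dimension) {
            this.name = name;
            this.dimension = dimension;
        }

        void put(StoredVector vector) {
            if (vector.data.length != dimension) {
                throw new IllegalArgumentException(
                        "Expected " + dimension + " dimensions but got " + vector.data.length);
            }
            vectors.put(vector.key, vector); // in this sketch, a repeated key overwrites
        }
    }

    /** A vector bucket groups vector indexes by name. */
    static final class VectorBucket {
        final String name;
        final Map<String, VectorIndex> indexes = new LinkedHashMap<>();

        VectorBucket(String name) {
            this.name = name;
        }

        VectorIndex createIndex(String indexName, int dimension) {
            VectorIndex index = new VectorIndex(indexName, dimension);
            if (indexes.putIfAbsent(indexName, index) != null) {
                throw new IllegalStateException("Index already exists: " + indexName);
            }
            return index;
        }
    }

    public static void main(String[] args) {
        VectorBucket bucket = new VectorBucket("demo-vector-bucket");
        VectorIndex index = bucket.createIndex("demo-index", 4);
        index.put(new StoredVector("doc-1", new float[] {0.9f, 0.1f, 0.0f, 0.2f},
                Map.of("genre", "pets", "year", 2024)));
        System.out.println(index.vectors.size() + " vector(s) in " + bucket.name + "/" + index.name);
    }
}
```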

Key Features of S3 Vectors

Purpose-built storage for vectors

S3 Vectors is the first purpose-built object storage in the cloud to store and query vectors. Vector buckets are designed to provide cost-effective, elastic, and durable storage for vector data.

Vector embeddings are transforming how customers use and retrieve their unstructured data, from detecting similarities across medical images and finding anomalies in thousands of hours of video footage to navigating through large code bases and identifying the most relevant case law for a given legal matter. These emerging applications rely on embedding models to encode the semantic meaning of data (for example, text, images, video, code) as numerical vector embeddings.

Within a vector bucket, you organize your vector data within vector indexes, without provisioning infrastructure. As you write, update, and delete vectors over time, S3 Vectors automatically optimizes the vector data to achieve the best possible price performance for vector storage, even as the data sets scale and evolve.

Similarity queries, metadata filtering, and access management

With S3 Vectors, you can perform queries to find the most similar vectors to a query vector. The response time is sub-second for infrequent queries and as low as 100 milliseconds for more frequent queries. S3 Vectors is ideal for workloads where queries are less frequent.
You can attach metadata (for example, year, author, genre, and location) as key-value pairs to your vectors. By default, you can filter all metadata unless you explicitly specify it as non-filterable. You can use filterable metadata to filter your query results based on specific attributes, enhancing the relevance of your queries. Vector indexes support string, number, boolean, and list types of metadata.
You can manage access for resources in vector buckets with IAM and Service Control Policies in AWS Organizations. S3 Vectors uses a different service namespace than Amazon S3: the s3vectors namespace. Therefore, you can design policies specifically for the S3 Vectors service and its resources. You can design policies to grant access to individual vector indexes, all vector indexes within a vector bucket, or all vector buckets in an account.
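The idea behind a similarity query with a metadata filter can be sketched in a few lines of plain Java. The sketch below keeps only the vectors whose metadata passes the filter and returns the keys of the top-K nearest ones by cosine distance. It is an exact brute-force search over an in-memory list, written only to illustrate the concept; it says nothing about how S3 Vectors implements queries or how its filter syntax looks, and all names and toy vectors are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Concept sketch: top-K similarity query with a metadata filter, done by brute force.
// This is NOT the S3 Vectors API or its filter syntax; all names are made up.
final class FilteredQuerySketch {

    static final class Item {
        final String key;
        final float[] data;
        final Map<String, Object> metadata;

        Item(String key, float[] data, Map<String, Object> metadata) {
            this.key = key;
            this.data = data;
            this.metadata = metadata;
        }
    }

    /** Cosine distance: 0 for the same direction, larger for less similar vectors. */
    static double cosineDistance(float[] a, float[] b) {
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        return 1.0 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns the keys of the topK nearest items whose metadata passes the filter. */
    static List<String> query(List<Item> items, float[] queryVector, int topK,
                              Predicate<Map<String, Object>> filter) {
        List<Item> candidates = new ArrayList<>();
        for (Item item : items) {
            if (filter.test(item.metadata)) {
                candidates.add(item);
            }
        }
        // Nearest first: smallest cosine distance to the query vector.
        candidates.sort(Comparator.comparingDouble((Item item) -> cosineDistance(queryVector, item.data)));
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < Math.min(topK, candidates.size()); i++) {
            keys.add(candidates.get(i).key);
        }
        return keys;
    }

    public static void main(String[] args) {
        List<Item> items = List.of(
                new Item("cat-photo", new float[] {0.9f, 0.1f, 0.0f, 0.2f}, Map.of("genre", "pets", "year", 2024)),
                new Item("kitten-video", new float[] {0.8f, 0.2f, 0.1f, 0.3f}, Map.of("genre", "pets", "year", 2023)),
                new Item("invoice-scan", new float[] {0.0f, 0.9f, 0.8f, 0.1f}, Map.of("genre", "finance", "year", 2024)));

        float[] queryVector = {0.9f, 0.1f, 0.0f, 0.2f};
        // Only consider vectors tagged with genre "pets", then take the two nearest.
        System.out.println(query(items, queryVector, 2, metadata -> "pets".equals(metadata.get("genre"))));
    }
}
```

Filtering before ranking keeps irrelevant items out of the result even when their vectors happen to be close to the query.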

Integration with AWS services

S3 Vectors integrates with other AWS services to enhance your vector processing capabilities:

Amazon OpenSearch Service: Optimize vector storage costs while continuing to use OpenSearch API operations. This is ideal for workloads that need advanced search functionality, such as hybrid search, aggregations, advanced filtering, and faceted search. You can also export a snapshot of an S3 vector index to Amazon OpenSearch Serverless for high QPS and low latency vector search.
Amazon Bedrock Knowledge Bases: Select a vector index in S3 Vectors as your vector store to save on storage costs for retrieval augmented generation (RAG) applications.
Amazon Bedrock in SageMaker Unified Studio: Develop and test knowledge bases using S3 Vectors as your vector store.

Conclusion

In this part of the series, we explained the goal of the series and introduced Amazon Nova 2 Multimodal Embeddings and Amazon S3 Vectors. In the next part, we’ll cover creating and storing text and image embeddings.