Imagine you have a massive collection of objects stored in S3: everything from logs and media files to documents and backups. Traditionally, you’d store these files with some metadata like tags or object properties. But here’s the challenge: when you need to find objects based on specific metadata criteria (e.g., all files tagged “project=Q1”), you have to list and scan the whole bucket or build and maintain your own index.
That’s where S3 Metadata (Preview) steps in.
What Is S3 Metadata (Preview)?
Think of S3 Metadata as a searchable database layer directly on top of your S3 buckets. It’s like giving your objects a new superpower: queryable metadata! Now, instead of iterating through endless lists of objects to find what you’re looking for, you can query metadata attributes—like tags, object keys, or custom metadata—using familiar tools such as Amazon Athena or AWS SDKs.
In simple terms, it turns your S3 bucket into a structured dataset, enabling fast, serverless queries to locate files or manage data more effectively.
S3 Table Bucket
But it gets better! Amazon is also introducing S3 table buckets, a special type of S3 bucket purpose-built for storing tabular data in Apache Iceberg format. When you create a metadata configuration, S3 Metadata writes the metadata of your objects to a table in a table bucket and keeps it updated automatically as objects are added, changed, or removed, so you can query it in near real time with tools like Athena and Lake Formation. This isn’t just an add-on; it fundamentally changes how you think about object storage.
For instance, let’s say you’re managing a massive data lake for analytics. Using an S3 Table Bucket, you could run SQL-like queries on your bucket to filter objects based on metadata—no additional indexing or pipelines required. It’s storage and search, reimagined.
Here is the step-by-step process for querying the data.
Step-1: Create an S3 bucket, go to its Metadata-preview tab, and click ‘Create metadata configuration’.
Step-2: Click on ‘Create table bucket’
Step-3: Create a table bucket.
Step-4: Now come back to the S3 ‘Create metadata configuration’ page and click on ‘Browse S3’. You will find your recently created table bucket there. Select it and click on ‘Choose’.
Step-5: Metadata configuration created successfully.
Step-6: Now upload some objects to the S3 bucket.
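If you would rather script this step than use the console, a minimal sketch with boto3 might look like the following. It attaches a tag and some custom metadata at upload time so there is something to query later. The helper name upload_with_metadata, the bucket name, the key, and the tag and metadata values are all hypothetical placeholders. The S3 client is passed in as an argument, so the function itself only needs the standard library.

```python
from urllib.parse import urlencode


def upload_with_metadata(s3_client, bucket, key, body, tags=None, user_metadata=None):
    """Upload one object, optionally with tags and user-defined metadata.

    s3_client is expected to be a boto3 S3 client, for example
    boto3.client("s3") (assumes boto3 is installed and credentials are set up).
    """
    kwargs = {"Bucket": bucket, "Key": key, "Body": body}
    if tags:
        # put_object expects the tag set encoded as URL query parameters,
        # e.g. "project=Q1&team=analytics".
        kwargs["Tagging"] = urlencode(tags)
    if user_metadata:
        # User-defined metadata is a plain dict of string keys and values.
        kwargs["Metadata"] = dict(user_metadata)
    return s3_client.put_object(**kwargs)


# Example call (placeholder names, not run here):
#   import boto3
#   upload_with_metadata(
#       boto3.client("s3"), "my-source-bucket", "reports/q1.csv", b"a,b\n1,2\n",
#       tags={"project": "Q1"}, user_metadata={"owner": "rudra"},
#   )
```

Objects uploaded this way should show up in the metadata table with their tags and custom metadata once S3 Metadata has processed the change.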
Step-7: Before querying it, grant AWS Lake Formation permissions on the metadata table.
Go to the AWS Lake Formation console, click on ‘Catalogs’ in the left panel, and click on ‘s3tablescatalog’. There you will find your recently created table bucket. Click on it.
Step-8: Now click on ‘Tables’ in the left panel and, from the ‘Choose Catalog’ dropdown, choose your table bucket.
Step-9: Click on the table name, then click on ‘Actions’ and choose ‘Grant’.
Step-10: Choose the user, select the table permissions to give, and click on ‘Grant’.
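The same grant can be sketched in code. The example below builds the request for the Lake Formation grant_permissions call and sends it through a client passed in by the caller. Treat it as a sketch under assumptions: the helper names are hypothetical, the account ID and principal ARN are placeholders, and the catalog ID format "<account id>:s3tablescatalog/<table bucket name>" is an assumption you should verify against your own Lake Formation catalog.

```python
def build_table_grant(account_id, table_bucket, database, table, principal_arn,
                      permissions=("SELECT", "DESCRIBE")):
    """Build keyword arguments for lakeformation.grant_permissions."""
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {
            "Table": {
                # Assumption: table bucket catalogs are addressed as
                # "<account id>:s3tablescatalog/<table bucket name>".
                "CatalogId": f"{account_id}:s3tablescatalog/{table_bucket}",
                "DatabaseName": database,
                "Name": table,
            }
        },
        "Permissions": list(permissions),
    }


def grant_table_access(lakeformation_client, **grant_args):
    """Send the grant using a boto3 Lake Formation client.

    lakeformation_client is expected to be boto3.client("lakeformation").
    """
    request = build_table_grant(**grant_args)
    lakeformation_client.grant_permissions(**request)
    return request


# Example call (placeholder values, not run here):
#   import boto3
#   grant_table_access(
#       boto3.client("lakeformation"),
#       account_id="111122223333",
#       table_bucket="rudra-first-tablebucket",
#       database="aws_s3_metadata",
#       table="s3metadata_metadata_testing_rudra",
#       principal_arn="arn:aws:iam::111122223333:user/example-user",
#   )
```

Keeping the request builder separate from the call makes it easy to print and review the grant before sending it.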
After granting the permissions, you can go to Athena to query the table.
Step-11: Go to Athena, choose your data source, catalog, and database from the left panel, and write the query.
SELECT *
FROM "s3tablescatalog/rudra-first-tablebucket"."aws_s3_metadata"."s3metadata_metadata_testing_rudra"
LIMIT 5;
The table location format is "<catalog name>/<table bucket name>"."<database name>"."<table name>", written with straight double quotes.
Find the catalog name under Catalogs in the AWS Lake Formation console.
Find the database name and table name under Tables in the AWS Lake Formation console.
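Because the three-part name is easy to mistype, here is a small sketch that assembles it from its parts. The helper names quote_identifier and metadata_table_ref are hypothetical; the values are the ones used in the query above.

```python
def quote_identifier(name):
    """Wrap an identifier in double quotes, doubling any embedded double quote."""
    return '"' + name.replace('"', '""') + '"'


def metadata_table_ref(catalog, table_bucket, database, table):
    """Return "<catalog>/<table bucket>"."<database>"."<table>"."""
    return ".".join([
        quote_identifier(f"{catalog}/{table_bucket}"),
        quote_identifier(database),
        quote_identifier(table),
    ])


table_ref = metadata_table_ref(
    "s3tablescatalog",
    "rudra-first-tablebucket",
    "aws_s3_metadata",
    "s3metadata_metadata_testing_rudra",
)
query = f"SELECT *\nFROM {table_ref}\nLIMIT 5;"
print(query)
```

The printed text can be pasted straight into the Athena query editor.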
The query should now return rows from your metadata table.
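To go beyond SELECT *, the sketch below builds a query for the “project=Q1” case from the introduction and runs it through the Athena API. It rests on a few assumptions: the metadata table exposes the columns key, size, last_modified_date and a map column object_tags (check these against your own table), the Athena client is a boto3 client passed in by the caller, and the results location is a placeholder S3 path. The function names are hypothetical.

```python
import time


def tagged_objects_query(table_ref, tag_key, tag_value, limit=100):
    """Build a query for objects that carry a given tag.

    Assumption: the table has `key`, `size`, `last_modified_date`
    and a map column `object_tags`.
    """
    # Double any single quotes so the values are safe inside SQL string literals.
    k = tag_key.replace("'", "''")
    v = tag_value.replace("'", "''")
    return (
        f"SELECT key, size, last_modified_date\n"
        f"FROM {table_ref}\n"
        # element_at returns NULL (rather than failing) when the tag is absent.
        f"WHERE element_at(object_tags, '{k}') = '{v}'\n"
        f"LIMIT {int(limit)}"
    )


def run_athena_query(athena_client, sql, output_location, poll_seconds=1.0):
    """Start a query, wait for it to finish, and return rows as lists of strings."""
    started = athena_client.start_query_execution(
        QueryString=sql,
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]

    while True:
        execution = athena_client.get_query_execution(QueryExecutionId=query_id)
        status = execution["QueryExecution"]["Status"]
        if status["State"] in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_seconds)

    if status["State"] != "SUCCEEDED":
        reason = status.get("StateChangeReason", "no reason given")
        raise RuntimeError(f"Query {status['State']}: {reason}")

    # Only the first page of results is fetched in this sketch; for SELECT
    # queries the first row holds the column names.
    result = athena_client.get_query_results(QueryExecutionId=query_id)
    return [
        [cell.get("VarCharValue") for cell in row["Data"]]
        for row in result["ResultSet"]["Rows"]
    ]


# Example call (placeholder values, not run here):
#   import boto3
#   ref = '"s3tablescatalog/rudra-first-tablebucket"."aws_s3_metadata"."s3metadata_metadata_testing_rudra"'
#   rows = run_athena_query(
#       boto3.client("athena"),
#       tagged_objects_query(ref, "project", "Q1"),
#       "s3://my-athena-results-bucket/queries/",
#   )
```

For large result sets you would page through get_query_results with its NextToken value instead of reading only the first page.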
Step-12: Storing and Querying CSV Files in Amazon S3 Using Athena.
Inside the bucket, create a folder (e.g., data/) and upload your CSV files there. Keep all related CSVs in this folder for easier table creation.
Create a Table: Use the following SQL to define a table for your CSV data:
CREATE EXTERNAL TABLE my_csv_table ( column1 STRING, column2 INT, column3 FLOAT, column4 STRING )
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE
LOCATION 's3://csv-data-bucket/data/';
Replace column1, column2, etc., with the names and data types of your CSV columns. Make sure the S3 location (LOCATION) matches the folder where you uploaded your files.
After creating the table, you can query the CSV data directly:
SELECT * FROM my_csv_table LIMIT 10;
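If you have several CSV layouts, the CREATE EXTERNAL TABLE statement can be generated instead of typed by hand. The sketch below is one way to do that; the function name csv_table_ddl is hypothetical, and the table name, columns, and location are the placeholder values from the statement above. It also optionally adds the 'skip.header.line.count' table property, on the assumption that your CSV files start with a header row.

```python
def csv_table_ddl(table_name, columns, location, skip_header=True):
    """Build a CREATE EXTERNAL TABLE statement for comma-delimited files.

    columns is a list of (column name, Athena data type) pairs.
    """
    column_lines = ",\n".join(f"  {name} {sql_type}" for name, sql_type in columns)
    statement = (
        f"CREATE EXTERNAL TABLE {table_name} (\n"
        f"{column_lines}\n"
        f")\n"
        f"ROW FORMAT DELIMITED FIELDS TERMINATED BY ','\n"
        f"STORED AS TEXTFILE\n"
        f"LOCATION '{location}'"
    )
    if skip_header:
        # Assumption: each file begins with one header line that should be ignored.
        statement += "\nTBLPROPERTIES ('skip.header.line.count'='1')"
    return statement + ";"


ddl = csv_table_ddl(
    "my_csv_table",
    [("column1", "STRING"), ("column2", "INT"), ("column3", "FLOAT"), ("column4", "STRING")],
    "s3://csv-data-bucket/data/",
)
print(ddl)
```

Pass skip_header=False if your files contain data rows only.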
Conclusion:
Amazon’s introduction of S3 Metadata (Preview) and S3 Table Buckets is a game-changer for anyone leveraging S3 for data storage and analytics. By transforming metadata into a queryable layer, Amazon is turning S3 into more than just a storage solution—it’s becoming an integral part of your data strategy. For those already using services like AWS Lake Formation and Amazon Athena, these features offer a seamless way to enhance your workflows, enabling faster insights and more efficient data management without complex pipelines.
As these features evolve, they promise to unlock even greater possibilities, bridging the gap between raw object storage and sophisticated data querying. The future of S3 isn’t just about storing data—it’s about making that data accessible, actionable, and insightful. Whether you’re managing a data lake, running analytics, or simply searching for a smarter way to organize your files, S3 Metadata and S3 Table Buckets are tools you don’t want to miss.





