Apache Ozone – A Multi-Protocol Aware Storage System

By neub9

Title: Apache Ozone: Simplifying the Management of Modern Data Architectures

Date: November 07, 2023

Reading Time: 5 minutes

Modern data architectures must handle a growing variety of data, from structured to unstructured. Data professionals work with file formats such as ORC, Parquet, Avro, and CSV, as well as table formats such as Apache Iceberg. Apache Ozone has emerged as a popular cloud-native storage system designed to meet the performance demands of these architectures.

Apache Ozone is a highly scalable, high-performance distributed object store that offers flexible bucket layouts and multi-protocol support. It supports both the Amazon S3 API and the Hadoop FileSystem API, so different types of data can be stored and accessed through whichever protocol suits the application. A previous blog post discussed the bucket layouts available in Ozone. This article gives Ozone administrators and application developers guidance on choosing the right bucket layout for each application.

The two new bucket layouts in Ozone, File System Optimized (FSO) and Object Store (OBS), provide unified and optimized storage of, and access to, files, directories, and objects. Together they let a single Ozone cluster serve as both a Hadoop Compatible File System (HCFS) and an object store.
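As an illustration of how an administrator might select a layout, the sketch below builds (but does not run) `ozone sh bucket create` command lines for each layout. The helper name `bucket_create_command` and the volume and bucket names are hypothetical. The layout values `FILE_SYSTEM_OPTIMIZED` and `OBJECT_STORE` are the ones the Ozone shell accepts for its `--layout` option, but check the documentation for your Ozone version before relying on the exact flags.

```python
import shlex

# Short aliases (hypothetical) mapped to the layout names used by the Ozone shell.
LAYOUTS = {"fso": "FILE_SYSTEM_OPTIMIZED", "obs": "OBJECT_STORE"}


def bucket_create_command(volume: str, bucket: str, layout: str) -> list[str]:
    """Build an `ozone sh bucket create` argument list without executing it."""
    if layout not in LAYOUTS:
        raise ValueError(f"unknown layout {layout!r}; expected one of {sorted(LAYOUTS)}")
    return ["ozone", "sh", "bucket", "create",
            f"/{volume}/{bucket}", "--layout", LAYOUTS[layout]]


if __name__ == "__main__":
    # Example volume and bucket names are placeholders.
    for vol, bkt, layout in [("analytics", "warehouse", "fso"), ("apps", "media", "obs")]:
        print(shlex.join(bucket_create_command(vol, bkt, layout)))
```

Returning an argument list rather than a single string keeps the command safe to hand to `subprocess.run` later without shell quoting issues.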

Interoperability between the file system and S3 APIs is a key feature of Ozone. Users can store data in Apache Ozone once and access it through multiple protocols, including ofs, S3, and o3. Traditional analytics applications and cloud-native workloads can therefore work against the same data.
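To make the multi-protocol idea concrete, here is a minimal sketch that derives the addresses one key might have under each protocol. It assumes the common conventions that an ofs path has the form `ofs://<om-service>/<volume>/<bucket>/<key>`, that the S3 Gateway exposes buckets from the `s3v` volume by default, and that the Ozone shell addresses keys as `/<volume>/<bucket>/<key>`. The service ID `ozone1`, the bucket name, and the function names are hypothetical.

```python
S3_DEFAULT_VOLUME = "s3v"  # assumption: default volume served by the S3 Gateway


def ofs_uri(om_service: str, volume: str, bucket: str, key: str) -> str:
    """Hadoop-compatible file system address for a key."""
    return f"ofs://{om_service}/{volume}/{bucket}/{key.lstrip('/')}"


def s3a_uri(bucket: str, key: str) -> str:
    """Address for the same key when read through an S3 client such as s3a."""
    return f"s3a://{bucket}/{key.lstrip('/')}"


def shell_path(volume: str, bucket: str, key: str) -> str:
    """Address as passed to the Ozone shell (the o3 protocol)."""
    return f"/{volume}/{bucket}/{key.lstrip('/')}"


if __name__ == "__main__":
    key = "sales/2023/part-0000.parquet"
    print(ofs_uri("ozone1", S3_DEFAULT_VOLUME, "warehouse", key))
    print(s3a_uri("warehouse", key))
    print(shell_path(S3_DEFAULT_VOLUME, "warehouse", key))
```

All three addresses refer to one stored object; only the access path differs.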

This interoperability between the file system and object store APIs makes hybrid cloud use cases easier to implement. For example, data can be ingested through the S3 interface into FSO buckets and then analyzed with low latency over the ofs protocol. Data can also be kept on-premises for security and compliance while still being accessed through a cloud-compatible API.
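The ingest-then-analyze pattern relies on S3 keys being interpreted as paths. The sketch below shows, under the assumption that an FSO bucket treats `/` in a key as a directory separator, which directories a file system client would expect to see after an object is written through S3. The function name and the sample key are hypothetical, and this models the naming convention only, not Ozone's internal behavior.

```python
from pathlib import PurePosixPath


def parent_directories(key: str) -> list[str]:
    """Return the directories implied by a '/'-delimited key, shallowest first."""
    parents = list(PurePosixPath(key.lstrip("/")).parents)
    # `parents` runs from deepest to '.', so reverse it and drop the '.' entry.
    return [str(p) for p in reversed(parents) if str(p) != "."]


if __name__ == "__main__":
    for directory in parent_directories("landing/2023/11/events.parquet"):
        print(directory)
```

A file system client listing the bucket would then be able to navigate these directories level by level, which is what lets HDFS-style analytics engines read data that arrived through S3.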

The choice between FSO and OBS bucket layouts depends on the workload. FSO buckets suit analytics services built for HDFS, because a hierarchical namespace makes directory operations such as rename and delete faster and more consistent. OBS buckets suit cloud-native applications built for S3, because they provide strict S3 compatibility and are well suited to storing unstructured data.
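Why a hierarchical layout helps directory operations can be seen in a toy model. The sketch below is not Ozone's metadata implementation; it only contrasts a flat key map, where renaming a "directory" means rewriting every key that shares the prefix, with a nested tree, where the same rename changes one entry. All names are hypothetical.

```python
def rename_prefix_flat(store: dict[str, bytes], src: str, dst: str) -> int:
    """Rename every key under `src/` to `dst/`; return how many entries changed."""
    moved = [k for k in store if k.startswith(src + "/")]
    for k in moved:
        store[dst + k[len(src):]] = store.pop(k)
    return len(moved)


def rename_dir_tree(root: dict, src: str, dst: str) -> int:
    """Rename a top-level directory in a nested-dict tree; one entry changes."""
    root[dst] = root.pop(src)
    return 1


if __name__ == "__main__":
    flat = {"tmp/a": b"1", "tmp/b": b"2", "tmp/sub/c": b"3", "keep/d": b"4"}
    tree = {"tmp": {"a": b"1", "b": b"2", "sub": {"c": b"3"}}, "keep": {"d": b"4"}}
    print("flat entries rewritten:", rename_prefix_flat(flat, "tmp", "final"))
    print("tree entries rewritten:", rename_dir_tree(tree, "tmp", "final"))
```

In the flat model the work grows with the number of keys under the prefix, and a failure partway through leaves some keys renamed and others not. Job-commit patterns in HDFS-style engines, which rename whole output directories, are one reason this difference matters.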

In conclusion, bucket layouts are a powerful feature of Apache Ozone that let it serve as both an object store and a Hadoop Compatible File System. Selecting the right bucket layout for each workload is crucial, and our Professional Services, Support, and Engineering teams are available to help you optimize your data architecture. To learn more about how Apache Ozone powers data science, or about Cloudera on private cloud, reach out to your Cloudera account team or get in touch with us.

References:
1. https://blog.cloudera.com/apache-ozone-a-high-performance-object-store-for-cdp-private-cloud/
2. https://blog.cloudera.com/a-flexible-and-efficient-storage-system-for-diverse-workloads/
