SAP Data Lake – Evolution and Architecture

In April 2020, SAP launched the SAP HANA Data Lake to extend its data storage capabilities. The goal was to offer customers storage options at very affordable rates.

The package combines the SAP HANA native storage extension with the SAP data lake. This cloud-based relational data lake, part of the SAP IQ ecosystem, offers features on par with leading services in this field such as Microsoft Azure and Amazon Simple Storage Service (S3).

Architecture of the SAP Data Lake
The SAP data lake architecture resembles a pyramid whose top, middle, and bottom segments each have their own storage characteristics.

The top of the pyramid holds the data that is critical to an organization. This data is frequently accessed and processed for operational requirements, so the cost of storing it is the highest in the SAP data lake.

The middle of the pyramid stores data that is accessed less often but is still too important to delete. Storage here is cheaper and not as high-performing as the top tier.

At the bottom of the pyramid lies data that is rarely used and would have been deleted in older systems to free up storage space. In the SAP data lake, this data is stored at very low cost.
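
The three segments can be pictured as a simple rule that maps how often data is read to a tier. The snippet below is only an illustrative sketch: the `Tier` names, the `assign_tier` function, and the access thresholds are hypothetical and are not taken from SAP's product or documentation.

```python
from enum import Enum


class Tier(Enum):
    HOT = "hot"    # top of the pyramid: critical, frequently accessed
    WARM = "warm"  # middle: accessed less often, still worth keeping
    COLD = "cold"  # bottom: rarely used, stored at very low cost


# Hypothetical thresholds, chosen only for illustration.
HOT_MIN_READS_PER_MONTH = 100
WARM_MIN_READS_PER_MONTH = 1


def assign_tier(reads_per_month: float) -> Tier:
    """Map an access frequency to a pyramid tier (illustrative rule)."""
    if reads_per_month >= HOT_MIN_READS_PER_MONTH:
        return Tier.HOT
    if reads_per_month >= WARM_MIN_READS_PER_MONTH:
        return Tier.WARM
    return Tier.COLD
```

Under these assumed thresholds, a table read 500 times a month would land in the hot tier, one read 5 times a month in the warm tier, and one that is never read in the cold tier.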

In short, the SAP data lake is an optimized data storage service that supports data through its full life cycle, from hot to warm to cold. This data tiering significantly lowers storage charges because the full volume of stored data is not billed at a single flat rate.
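
To see why tiering lowers the bill, here is a rough back-of-the-envelope comparison in Python. All prices and volumes are made-up placeholders, not SAP rates, and the flat rate is assumed to equal the hot-tier price.

```python
# Hypothetical monthly prices per terabyte (placeholders, not SAP pricing).
PRICE_PER_TB = {"hot": 100.0, "warm": 30.0, "cold": 5.0}


def tiered_cost(volumes_tb: dict) -> float:
    """Bill each tier's volume at that tier's own rate."""
    return sum(PRICE_PER_TB[tier] * tb for tier, tb in volumes_tb.items())


def flat_cost(volumes_tb: dict) -> float:
    """Bill the full volume at a single flat rate (assumed: the hot rate)."""
    return PRICE_PER_TB["hot"] * sum(volumes_tb.values())


# Hypothetical data volumes per tier, in terabytes.
volumes = {"hot": 2.0, "warm": 10.0, "cold": 50.0}

print(tiered_cost(volumes))  # 750.0
print(flat_cost(volumes))    # 6200.0
```

With these assumed numbers, most of the data sits in the cheap cold tier, so the tiered bill is a fraction of what a single flat rate would cost.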
