The Data Lake Success Story
Cloud-based archival of legacy data saves $0.5 million for a manufacturing company
Big Data, Data Lake, Cloud
- Enhanced process accuracy of 98%
- 90% saving in effort of document classification
- $1 million saving in process cost over 2 years
Our client is a global leader in the paper manufacturing industry, with multiple plants in India, operations in the US, and a large dealer network across the globe.
The company's IT infrastructure has changed considerably over time. From the early 2000s, the company used Oracle ERP as the backbone of its business operations, and later replaced it with SAP. However, business users still needed access to legacy data in Oracle ERP for reporting and legal compliance, so access to Oracle ERP continued even after business operations had shifted to SAP.
Ellicium implemented a data lake solution that moved reporting out of Oracle ERP and helped the company retire the Oracle ERP system.
Continued access to Oracle ERP posed the following challenges:
- High cost of maintaining unsupported legacy hardware and a very old version of the legacy ERP.
- High license cost of keeping the ERP running for reporting purposes only.
- Complex data structures in the legacy ERP, which made self-service reporting impossible.
Ellicium's big data architects designed an architecture that catered to the company's business and technology needs. Google BigQuery was chosen as the platform for the data lake because of its ease of management, compatibility with ANSI SQL, and cost-effectiveness. The 23,000 tables in the ERP were exported to Google Cloud Storage and loaded into BigQuery. The data lake contained raw data as well as cleaned, summarized data ready for consumption. For ease of access, Pentaho was used as the reporting platform.
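The export-and-load step described above can be sketched as follows. This is a minimal illustration, not the actual migration code: it assumes each table was exported as CSV files with a header row, and the bucket, project, and dataset names, the one-folder-per-table layout, and the `plan_load` and `run_load` helpers are all hypothetical. The load itself uses the `google-cloud-bigquery` client library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoadPlan:
    """Where one exported ERP table lives and where it should land."""
    source_uri: str  # wildcard URI of the exported files in Cloud Storage
    table_id: str    # fully qualified BigQuery table: project.dataset.table


def plan_load(table_name: str, bucket: str, project: str, dataset: str) -> LoadPlan:
    """Map an ERP table name to its export location and target table.

    Assumption: exports are stored as gs://<bucket>/<table>/*.csv.
    """
    name = table_name.strip().lower()
    return LoadPlan(
        source_uri=f"gs://{bucket}/{name}/*.csv",
        table_id=f"{project}.{dataset}.{name}",
    )


def run_load(plan: LoadPlan) -> int:
    """Load one table into BigQuery and return the resulting row count.

    Needs the google-cloud-bigquery package and valid credentials, so the
    import is kept local and plan_load stays usable without them.
    """
    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV,
        skip_leading_rows=1,  # assumed header row in each export file
        autodetect=True,      # infer the schema, since tables are loaded as-is
        write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE,
    )
    job = client.load_table_from_uri(
        plan.source_uri, plan.table_id, job_config=job_config
    )
    job.result()  # block until the load job finishes
    return client.get_table(plan.table_id).num_rows
```

For thousands of tables, the same two helpers could be driven from a list of table names, for example `run_load(plan_load(name, "legacy-erp-export", "my-project", "erp_raw"))` for each name, where those three values are placeholders.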
The data lake held the organization's entire business data, consisting of:
- 23,000 Oracle ERP tables migrated as-is into the data lake.
- About 300 summary tables for ease of reporting.
- Lookup tables created from spreadsheets provided by business users.
The solution delivered the following benefits:
- Total cost of ownership of legacy data reduced by 90%.
- Manpower requirement for supporting legacy data reduced by 75%.
- The company could retire Oracle ERP, resulting in a total cost saving of $0.5 million over 2 years.
- Time to access critical reports came down from an average of 6 hours to 15 minutes, thanks to the redesigned reporting system in the data lake.
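The summary tables mentioned earlier are one plausible contributor to the faster reports, since a report can read a small precomputed table instead of aggregating raw ERP data each time. The sketch below shows the idea only. It uses Python's built-in `sqlite3` as a local stand-in for BigQuery, and the `raw_invoices` table, its columns, and the sample rows are invented for illustration rather than taken from the actual ERP schema.

```python
import sqlite3


def build_monthly_sales_summary(conn: sqlite3.Connection) -> None:
    """Rebuild a per-plant, per-month summary table from the raw invoices."""
    conn.execute("DROP TABLE IF EXISTS sales_monthly_summary")
    conn.execute(
        """
        CREATE TABLE sales_monthly_summary AS
        SELECT plant_code,
               substr(invoice_date, 1, 7) AS invoice_month,
               COUNT(*)                   AS invoice_count,
               SUM(amount)                AS total_amount
        FROM raw_invoices
        GROUP BY plant_code, substr(invoice_date, 1, 7)
        """
    )


def demo() -> list:
    """Create a tiny raw table, summarize it, and return the summary rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE raw_invoices ("
        "invoice_id INTEGER, plant_code TEXT, invoice_date TEXT, amount REAL)"
    )
    # Hypothetical sample rows; dates are ISO strings (YYYY-MM-DD).
    conn.executemany(
        "INSERT INTO raw_invoices VALUES (?, ?, ?, ?)",
        [
            (1, "P01", "2009-01-05", 100.0),
            (2, "P01", "2009-01-20", 250.0),
            (3, "P01", "2009-02-02", 75.0),
            (4, "P02", "2009-01-11", 40.0),
        ],
    )
    build_monthly_sales_summary(conn)
    rows = conn.execute(
        "SELECT plant_code, invoice_month, invoice_count, total_amount "
        "FROM sales_monthly_summary ORDER BY plant_code, invoice_month"
    ).fetchall()
    conn.close()
    return rows
```

A reporting tool such as Pentaho would then query `sales_monthly_summary` directly. In BigQuery the same pattern would be written in its own SQL dialect, with date functions in place of the string slicing used here.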
Overall, the data lake on the cloud helped the organization manage its legacy data cost-effectively while meeting compliance requirements efficiently.