Big Data Articles From Ellicium

In our endeavor to share our experience and passion for Big Data, we have been publishing articles on Big Data topics regularly since the beginning of Ellicium. For our readers, here is a compilation of all our Big Data articles.

This list covers a wide range of topics, from ensuring a successful Big Data proof of concept to Hadoop security. This could be your one-stop URL for all things Big Data. We will be updating this list regularly with new articles. We really hope this list helps you!

Ensuring a successful Big Data Proof of Concept

A common factor in all successful Big Data Proof of Concept (POC) to Production journeys has been ‘planning well even before the POC’.

Read here: http://bit.ly/2qPyCcT 

What 5 things to consider before streaming data into Hadoop?

Streaming data into Hadoop? Read this article from Ellicium Solutions to learn 5 things to consider before you start.

Read here: http://bit.ly/2qiNgGy 

Enemy #1 for Hadoop adoption – Bad Data Quality

While implementing streaming data and IoT solutions on the HBase platform, I observed a consistent pattern of data quality issues. Read how to fight them.

Read here: http://bit.ly/2qYyO7Z 

Why you need Big Data to get actionable customer insights?  

How to employ Big Data Technologies to get meaningful customer insights? Read this Big Data blog post by Ellicium Solutions.

Read here: http://bit.ly/2rTiPbq

How I saved 200 hours of Ellicium’s Recruitment Team Using Hadoop!   

This is the story of how I saved 200 hours for our recruitment team using Hadoop and other Big Data technologies.

Read here: http://bit.ly/2SdM1Wn

How does Cloudera and Hortonworks merger signify the obituary of Hadoop? 

Cloudera and Hortonworks have decided to unite. What does it mean for Hadoop? Read this article to know what will happen to Hadoop!

Read here: http://bit.ly/2Nglsz5

Testing dashboard to track the progress of your Big Data projects   

Why a testing dashboard for Big Data projects? Big Data projects are complex and lengthy. Rigorous tracking and analysis are prime requirements for these projects.

Read here: http://bit.ly/2qbgueC

5 Data Warehouse implementation mistakes to avoid in Big Data Projects   

Data warehouse implementations are tricky. Know 5 data warehouse implementation mistakes you should avoid along with a list of common aspects.

Read here: http://bit.ly/2qOZPfJ

10 factors to consider when selecting a Visualization tool for Hadoop data  

Selecting a Big Data visualization tool has its own dynamics. You need to consider many factors when selecting a Hadoop data visualization tool.

Read here: http://bit.ly/2qVrH2h

How We Resolved Data Processing Challenges On The Hadoop Data Lake 

Read how we resolved a data processing challenge on the Hadoop data lake for a leading insurance company dealing in specialty insurance.

Read here: http://bit.ly/2GQcohI

Why we stored insurance policy snapshots on HBase in JSON format?  

Recently we worked to build a data lake on Big Data for the insurance industry. This article explains why we stored insurance policy snapshots on HBase in JSON format.

Read here: http://bit.ly/2IPGMw2

Testing Methodology For Big Data Project   

To overcome the lack of documentation and processes for testing Big Data applications, I tried to define a testing process based on my experience.

Read here: http://bit.ly/2PDMNfy

Things to do to get the best performance from HBase   

How to make HBase efficient and work as per your requirements? This article is the result of several implementations that we have done over time.

Read here: http://bit.ly/2vs8Dtl

How we resolved dncp_block_verification log file issue in HDFS for CDH 5.3.0, saving critical time!

This is how we resolved dncp_block_verification log file issue in HDFS for CDH 5.3.0, saving critical time!

Read here: http://bit.ly/2J4m4rC

What if your HBase seems to be working but is still not working!  

What if your HBase seems to be working but is actually not working! This is how we solved the problem of log splitting.

Read here: http://bit.ly/2WdzXHt

Data Security: Empowered With Data Masking And Data Encryption  

Data security is the practice of keeping data protected from corruption and unauthorized access. This article is about data masking and data encryption.

Read here: http://bit.ly/2npzcf2

Hadoop Security: Prime Areas To Focus On   

Hadoop Security can be beefed up by focusing on some prime areas. We applied the same for our Hadoop project and got great results.

Read here: http://bit.ly/2AKzVgn

Hadoop Security: Prime Areas To Focus On – Part 2   

This is Part 2 of Hadoop Security: Prime Areas To Focus On. It covers the user authorization and data governance areas of Hadoop security.

Read here: http://bit.ly/2BF2QkU

ORC Vs Parquet Vs Avro: How to select a right file format for Hive?  

ORC Vs Parquet Vs Avro? Which one is the best of the lot? People working in Hive often ask this question. Read this article for the answer.

Read here: http://bit.ly/2AqVF40

HBase performance tuning (Part 1) – Bloom filters on HBase

How the effective application of Bloom filters in HBase increases performance.

Read here: http://bit.ly/2VRQNyX

HBase performance tuning (Part 2) – Importance of compression

An article where we share our experience of optimizing HBase design for managing streaming time-series data.

Read here: http://bit.ly/2LEGOZQ