To share our experience and passion for Big Data, we have been publishing articles on Big Data topics regularly since the beginning of Ellicium. For our readers, here is a compilation of all our Big Data articles.
This list covers a wide range of topics, from ensuring a successful Big Data proof of concept to Hadoop security. This could be your one-stop URL for all things Big Data. We will be updating this list regularly with new articles. We really hope this list helps you!
Ensuring a successful Big Data Proof of Concept
A common factor in all successful Big Data Proof of Concept (POC) to Production journeys has been ‘planning well even before the POC’.
Read here: http://bit.ly/2qPyCcT
What 5 things to consider before streaming data into Hadoop?
Streaming data into Hadoop? Read this article from Ellicium Solutions to learn the 5 things to consider before you do.
Read here: http://bit.ly/2qiNgGy
Enemy #1 for Hadoop adoption – Bad Data Quality
While implementing streaming data and IoT solutions on the HBase platform, I observed a consistent pattern of data quality issues. Read how to fight them.
Read here: http://bit.ly/2qYyO7Z
Why you need Big Data to get actionable customer insights?
How can you employ Big Data technologies to get meaningful customer insights? Read this Big Data blog post by Ellicium Solutions.
Read here: http://bit.ly/2rTiPbq
How I saved 200 hours of Ellicium’s Recruitment Team Using Hadoop!
This is the story of how I saved 200 hours for our recruitment team using Hadoop and other Big Data technologies.
Read here: http://bit.ly/2SdM1Wn
How does Cloudera and Hortonworks merger signify the obituary of Hadoop?
Cloudera and Hortonworks have decided to unite. What does it mean for Hadoop? Read this article to know what will happen to Hadoop!
Read here: http://bit.ly/2Nglsz5
Testing dashboard to track the progress of your Big Data projects
Why use a testing dashboard for Big Data projects? Big Data projects are complex and lengthy, so rigorous tracking and analysis are a prime requirement for them.
Read here: http://bit.ly/2qbgueC
5 Data Warehouse implementation mistakes to avoid in Big Data Projects
Data warehouse implementations are tricky. Learn the 5 data warehouse implementation mistakes you should avoid, along with a list of common aspects.
Read here: http://bit.ly/2qOZPfJ
10 factors to consider when selecting a Visualization tool for Hadoop data
Selecting a Big Data visualization tool has its own dynamics. You need to consider many factors when selecting a visualization tool for Hadoop data.
Read here: http://bit.ly/2qVrH2h
How We Resolved Data Processing Challenges On The Hadoop Data Lake
Read how we resolved data processing challenges on the Hadoop data lake for a leading insurance company dealing in specialty insurance.
Read here: http://bit.ly/2GQcohI
Why we stored insurance policy snapshots on HBase in JSON format?
Recently we worked to build a data lake on Big Data for the insurance industry. This article explains why we stored insurance policy snapshots on HBase in JSON format.
Read here: http://bit.ly/2IPGMw2
Testing Methodology For Big Data Project
To overcome the lack of documentation and processes for testing Big Data applications, I tried to define a testing process based on my experience.
Read here: http://bit.ly/2PDMNfy
Things to do to get the best performance from HBase
How do you make HBase efficient and work as per your requirements? This article is the result of several implementations that we have done over time.
Read here: http://bit.ly/2vs8Dtl
How we resolved dncp_block_verification log file issue in HDFS for CDH 5.3.0, saving critical time!
This is how we resolved the dncp_block_verification log file issue in HDFS for CDH 5.3.0, saving critical time!
Read here: http://bit.ly/2J4m4rC
What if your HBase seems to be working but is still not working!
What if your HBase seems to be working but is actually not working! This is how we solved the problem of log splitting.
Read here: http://bit.ly/2WdzXHt
Data Security: Empowered With Data Masking And Data Encryption
Data security is the practice of keeping data protected from corruption and unauthorized access. This article is about data masking and data encryption.
Read here: http://bit.ly/2npzcf2
Hadoop Security: Prime Areas To Focus On
Hadoop Security can be beefed up by focusing on some prime areas. We applied the same for our Hadoop project and got great results.
Read here: http://bit.ly/2AKzVgn
Hadoop Security: Prime Areas To Focus On – Part 2
Part 2 of Hadoop Security: Prime Areas To Focus On. This article talks about the user authorization and data governance areas of Hadoop security.
Read here: http://bit.ly/2BF2QkU
ORC Vs Parquet Vs Avro: How to select a right file format for Hive?
ORC Vs Parquet Vs Avro? Which one is the best of the lot? People working with Hive ask this question often. Read this article for the answer.
Read here: http://bit.ly/2AqVF40
HBase performance tuning (Part 1) – Bloom filters on HBase
How effective application of Bloom filters in HBase increases performance.
Read here: http://bit.ly/2VRQNyX
HBase performance tuning (Part 2) – Importance of compression
An article where we share our experience of optimizing HBase design for managing streaming, time-series data.
Read here: http://bit.ly/2LEGOZQ