Text Analytics: Superpower to harness unstructured Big Data

The first message over the internet (then the ARPANET) was sent on 29 October 1969. It involved UCLA professor Leonard Kleinrock, his student and programmer Charley Kline, and Bill Duvall at Stanford Research Institute. They intended to send “login” as the message, but the system crashed after “lo” had been typed. So “lo” was the first message sent over the internet. By 2019, the number of emails sent over the internet each day is expected to reach 246 billion. We have come a long way!

2.4 million emails are sent every second

Computers are shrinking and the internet is growing, generating data everywhere. Every second, 43,281 GB of internet traffic flows across the network. By 2020, the digital universe will reach 44 ZB (44 trillion gigabytes), and up to 90% of that data will be unstructured. Much of it is textual. Sounds exciting? But without analytics, this data is just data. Hence, the onus is on text analytics to extract value from it.

Defining text analytics

Text analytics is the practice of semi-automatically aggregating and exploring textual data to obtain new insights by combining technology, industry knowledge, and practices that drive business outcomes. It draws on a set of linguistic, analytical and predictive techniques.

History, Current State and Massive Future Opportunities

The first definition of business intelligence (BI), in H.P. Luhn’s October 1958 IBM Journal article ‘A Business Intelligence System’, focused on text analytics. But as BI emerged as a practice in the 80s and 90s, the focus shifted to numerical data stored in relational databases, because processing text in documents was hard back then. Since it first emerged as “text mining” in the 1990s, text analytics has evolved a lot and continues to evolve.

Though text analytics is an old problem, it has only recently started gaining significance, owing to the exponential growth in unstructured data. Combined with structured data analysis, text analytics can be a game changer for businesses. It can go beyond numbers and unveil unseen insights. For example, analysing the email exchanges between customers and the support department can help a business improve its service, while also helping to predict customer mood.
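To make the support-email example concrete, here is a minimal sketch of lexicon-based mood scoring. The word lists, function names and sample emails are invented for illustration; a production system would rely on a much larger curated lexicon or a trained model.

```python
import re

# Hypothetical mini-lexicon, invented for this sketch only.
POSITIVE = {"thanks", "great", "resolved", "helpful", "happy"}
NEGATIVE = {"angry", "broken", "delay", "refund", "unacceptable", "worst"}


def mood_score(email_text):
    """Return (number of positive words) minus (number of negative words)."""
    words = re.findall(r"[a-z']+", email_text.lower())
    positives = sum(word in POSITIVE for word in words)
    negatives = sum(word in NEGATIVE for word in words)
    return positives - negatives


def mood_label(email_text):
    """Map the numeric score to a coarse mood label."""
    score = mood_score(email_text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    emails = [
        "Thanks, the issue was resolved quickly. Great support!",
        "This is unacceptable. The product is broken and I want a refund.",
    ]
    for text in emails:
        print(mood_label(text), "|", text)
```

Counting words is crude (it ignores negation and sarcasm, for instance), but it shows how text from support conversations can be turned into a signal that sits alongside structured data.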

According to Allied Market Research, the text analytics market is expected to reach $6.5 billion by 2020. According to Forrester’s recent Big Data Text Analytics Platforms report, the market is dominated by leaders such as IBM, Clarabridge, SAS, HPE, Attivio, Digital Reasoning and Linguamatics. IBM and Clarabridge lead with a large market share thanks to their broader feature offerings. Going by the same report, major companies are making clear moves that underline the importance of text analytics. For example, IBM is pushing its Watson platform hard while also making some significant acquisitions. At the moment, the market leaders appear to be well ahead of the rest, thanks to full-featured platforms and the acquisition of rivals. At the same time, some serious open source tools are making their presence felt.

[Figure: Text analytics market 2016. Image source: Forrester Wave™ Big Data Text Analytics Platforms, Q2 2016]

During my research, I found that the following are the ideal features of a text analytics platform:

Scalability (huge amounts of data from multiple sources)
Flexibility (integration of multiple text mining functions)
Efficiency (targeting relevant data)
Quality (accuracy of results)
Low cost (time, cost, development)

From the user’s perspective, the following features are desirable:

Ease of use
Intuitiveness
Good interface
Accuracy of results (customizable tool)
Guidance through prebuilt categories

The following broad features should be present in a text analytics product:

Document Classification 
Concept Extraction 
Information Extraction 
Information Retrieval   
Web Mining  
Sentiment Analysis   
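
As a rough illustration of the simplest of these features, the sketch below uses plain term-frequency counting as a crude stand-in for concept extraction. The stop-word list, the function name `top_terms` and the sample feedback are assumptions made for this example; real products use far richer linguistic processing.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are far longer).
STOP_WORDS = {
    "the", "a", "an", "and", "or", "of", "to", "in",
    "is", "was", "for", "on", "it", "my", "i",
}


def top_terms(documents, n=3):
    """Return the n most frequent non-stop-words across all documents."""
    counts = Counter()
    for doc in documents:
        for word in re.findall(r"[a-z]+", doc.lower()):
            if word not in STOP_WORDS:
                counts[word] += 1
    return counts.most_common(n)


if __name__ == "__main__":
    feedback = [
        "The delivery was late",
        "Late delivery and damaged box",
        "Delivery is late again",
    ]
    # The most frequent terms hint at what customers are talking about.
    print(top_terms(feedback, n=2))
```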

Text Analytics: Superpower

By 2020, Retail, FMCG, BFSI and Healthcare will be the major adopters of text analytics platforms. They will use them mainly for customer relationship management, brand reputation and competitive intelligence.

As mentioned before, most unstructured data is textual. As per the accepted maxim, organizations analyse only about 20% of their data, and that portion is structured. Imagine the possibilities if organizations analysed the remaining 80%, which is unstructured and largely textual. Text analytics can empower organizations to get deeper and more significant insights. Be it connecting social media listening with customer experience or extracting cancer-related information from pathology or radiology reports, text analytics can unleash the potential of the available data.

Text analytics can be used to learn more about customer behaviour. This intelligence, in turn, helps significantly in improving customer loyalty and increasing sales. Just recently, Facebook announced the availability of Topic Data, which uses text analytics to reveal what audiences are saying on Facebook about events, brands, subjects and activities.


At Ellicium, our unstructured data analytics tool ‘Gadfly’ helped a leading LPO automate its document classification work using text analytics principles. The solution is expected to save 85% of the human effort and result in multi-million-dollar cost savings over a three-year period.
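
For readers curious what automated document classification can look like in its simplest form, here is a generic sketch of a multinomial Naive Bayes classifier with add-one smoothing. It is not a description of how Gadfly works; the class name, labels and training sentences are all hypothetical and chosen only to keep the example self-contained.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesClassifier:
    """Minimal multinomial Naive Bayes with Laplace (add-one) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of documents
        self.vocabulary = set()

    def train(self, labelled_docs):
        for text, label in labelled_docs:
            self.doc_counts[label] += 1
            for word in tokenize(text):
                self.word_counts[label][word] += 1
                self.vocabulary.add(word)

    def classify(self, text):
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            # Start from the log prior of the label.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                # Add-one smoothing keeps unseen words from zeroing the score.
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


if __name__ == "__main__":
    # Hypothetical training data, invented for illustration.
    training = [
        ("This lease agreement is between the landlord and the tenant", "contract"),
        ("The parties agree to the terms of this agreement", "contract"),
        ("Invoice for services rendered, payment due in 30 days", "invoice"),
        ("Please remit payment for the attached invoice", "invoice"),
    ]
    classifier = NaiveBayesClassifier()
    classifier.train(training)
    print(classifier.classify("payment due for invoice number 42"))
```

With only four training sentences this is a toy, but the same idea, trained on thousands of labelled documents, is one common baseline for sorting documents into categories automatically.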

Social media is another huge opportunity for businesses to employ text analytics: they can listen to customers’ conversations in real time and act on them right away.

With a variety of tools available in the market, any business, small or big, can start using text analytics and capitalize on it.

At Ellicium, we are happy to help businesses understand how they can leverage text analytics. We will soon run an introductory workshop on text analytics. You can learn more about ‘Gadfly’, our platform for analysing unstructured data, here: https://www.ellicium.com/unstructured-data-analytics You can schedule a demo with us here: https://www.ellicium.com/get-a-demo

Ellicium’s Gadfly: Analytics platform for unstructured data