Text Analytics is a Catalyst Zia AI-driven service that processes textual content to draw structured and quantitative information from it. Text Analytics enables you to obtain useful insights, patterns, and trends from any textual content, which helps you automate the analysis of crucial data and make informed business decisions. It can be widely used in applications that involve R&D, analytics, knowledge management, customer support services, and more.
Catalyst Text Analytics comprises three major features:
- Sentiment Analysis: Recognizes the tone and sentiment of a text as Positive, Negative, or Neutral
- Named Entity Recognition (NER): Extracts key words from a text and groups them into various pre-determined categories
- Keyword Extraction: Extracts the highlights of a text, such as the key words and key phrases in it
Catalyst Text Analytics delivers the results of any or all three of these features, based on your request. The responses of Sentiment Analysis and NER also include confidence scores that inform you of their accuracy. You can code your Catalyst application to process this information or handle it in any way you require.
The Catalyst console provides an integrated platform where you can test Text Analytics with a selection of sample text. The console delivers a JSON response, as well as a multifaceted visual response that enables you to understand the results easily.
SDK and API for Text Analytics
Catalyst provides Text Analytics for the Java and Node.js platforms. You can obtain sample code templates for these platforms from the Catalyst console or from the SDK help, and easily integrate Text Analytics in your web or mobile application. Catalyst also provides individual APIs for each Text Analytics feature. For more information and code samples on working in these programming environments, refer to the Java SDK, Node.js SDK, and API help pages.
To learn about using Text Analytics in the development and production environments, visit the Environments help page.
Before you learn about testing Text Analytics from the Catalyst console, let's discuss the Text Analytics features in detail.
Sentiment Analysis is a common deep learning tool that performs contextual opinion mining of unstructured text, to extract subjective information about the underlying sentiments in it. It applies general text analysis, natural language processing, computational linguistics, and other machine learning techniques.
The text is generally broken down into its component structures, and weighted sentiment scores are assigned to each entity. This helps the AI determine the overall sentiment of the text, based on a cumulative analysis of the sentiments recognized in each entity.
The Zia Sentiment Analysis model analyzes the polarity of the sentiments in text, and categorizes it as positive, negative, or neutral.
Zia determines these sentiments for each sentence in a text after analyzing them one after the other, and then predicts if the overall text is positive, negative, or neutral, based on the sentiments of each sentence.
Zia Sentiment Analysis returns the analysis of each sentence, as well as an analysis of the overall text as the response. The response also returns the confidence scores for each sentence and the overall text, to showcase the accuracy of the analysis.
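The idea of deriving an overall sentiment from per-sentence results can be illustrated with a toy sketch. This is not Zia's actual model, which is not described here. It assumes a hypothetical per-sentence shape of { sentiment, confidence } and uses a simple confidence-weighted vote, purely to show how sentence-level results might roll up into one label.

```javascript
// Toy illustration only: NOT Zia's real algorithm.
// Assumed (hypothetical) input shape: [{ sentiment, confidence }, ...]
// where sentiment is "positive", "negative", or "neutral",
// and confidence is a number between 0 and 1.
function overallSentiment(sentenceResults) {
  const totals = { positive: 0, negative: 0, neutral: 0 };
  for (const { sentiment, confidence } of sentenceResults) {
    if (!(sentiment in totals)) continue; // ignore unknown labels
    totals[sentiment] += confidence; // weight each vote by its confidence
  }

  // Pick the label with the largest accumulated weight.
  let best = "neutral";
  for (const label of Object.keys(totals)) {
    if (totals[label] > totals[best]) best = label;
  }

  const sum = totals.positive + totals.negative + totals.neutral;
  return { sentiment: best, confidence: sum === 0 ? 0 : totals[best] / sum };
}

const sample = [
  { sentiment: "positive", confidence: 0.9 },
  { sentiment: "negative", confidence: 0.6 },
  { sentiment: "positive", confidence: 0.7 },
];
console.log(overallSentiment(sample).sentiment); // "positive"
```

In practice you would read the overall sentiment directly from the Zia response rather than recompute it. The sketch only makes the sentence-to-text relationship concrete.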
The response and confidence score formats differ based on the response types:
- Visual response: The visual response generated when you test Text Analytics in the console displays the sentiment of each sentence in the text, and the overall sentiment of the text in a comprehensive and clear manner. The confidence score of each analysis is presented as a percentage value in the visual response.
- JSON response: You can obtain a JSON response for Text Analytics both through the API and while testing it in the console. The console generates both types of responses. The JSON response also delivers individual sentiments of each sentence and the overall sentiment. However, the confidence score in the JSON response is presented in the range of 0 to 1.
You can view a complete sample JSON response from the API documentation.
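If you want your application to show Sentiment Analysis confidence scores the way the console's visual response does, you can convert the 0 to 1 value from the JSON response into a percentage. The helper below is a minimal sketch. The function name toPercent is hypothetical and is not part of the Catalyst SDK.

```javascript
// Convert a confidence score in the range 0 to 1 into a percentage string.
// toPercent is a hypothetical helper, not a Catalyst SDK function.
function toPercent(score, digits = 1) {
  if (!Number.isFinite(score) || score < 0 || score > 1) {
    throw new RangeError("confidence score must be a number between 0 and 1");
  }
  return (score * 100).toFixed(digits) + "%";
}

console.log(toPercent(0.87)); // "87.0%"
console.log(toPercent(1, 0)); // "100%"
```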
Sentiment Analysis is crucial for businesses and brands that place a huge value on understanding customer experiences. Sentiment Analysis can be implemented for the following purposes:
- Gauging and monitoring sentiments expressed by customers in social media platforms and opinion surveys
- Automating the analysis of customer feedback and reviews, and reducing the manual workload involved
- Identifying and addressing critical situations in real-time, by the automated monitoring of negative sentiments
- Upholding brand reputation and integrity by delivering quick and thorough customer support, and taking strategic actions
- Implementing a centralized, organized, and unbiased system to arrive at decisions on subjective matters
- Enabling data analysts to conduct nuanced market research and deliver actionable data
- Making informed business decisions and providing tailor-made services based on the analyzed sentiments and perceptions
Named Entity Recognition is a subset of information extraction that identifies named entities in an unstructured text, and classifies them into pre-determined categories. These categories represent real-world objects such as people, places, or organizations. An entity that contains common traits with a category is grouped into it.
NER draws on natural language processing and artificial intelligence. NER models are trained rigorously on large volumes of training data to detect and categorize entities with precision. The Zia NER model is efficient and reliable, and can extract and group entities with high accuracy.
Zia NER can recognize and label entities that fit into the following pre-defined categories:
That is, it can determine a word in a piece of text to be the name of an organization or a person, and add it to the appropriate category.
NER returns the recognized entities and the category each one belongs to, along with a confidence score for each classification that indicates its accuracy.
The response and confidence score formats differ based on the response types:
- Visual response: The visual response generated when you test Text Analytics in the console displays a list of the recognized entities and their categories. The confidence score of each classification is presented as a percentage value in the visual response.
- JSON response: You can obtain a JSON response for Text Analytics both through the API and while testing it in the console. The console generates both types of responses. The JSON response also delivers the entities and their categories, with additional information about the position of the entity in the text. The confidence score is presented as a percentage value here as well.
You can view a complete sample JSON response from the API documentation.
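Once you have the NER results, a common next step is to group the entities by category and drop low-confidence matches. The sketch below assumes a simplified, hypothetical entity shape ({ entity, category, confidence, start, end }) and hypothetical category labels. Check the API documentation for the real field names before adapting it.

```javascript
// Group recognized entities by category, skipping low-confidence results.
// The entity shape and category labels here are assumptions for illustration.
// confidence is treated as a percentage, as described above.
function groupByCategory(entities, minConfidence = 0) {
  const groups = new Map();
  for (const e of entities) {
    if (e.confidence < minConfidence) continue;
    if (!groups.has(e.category)) groups.set(e.category, []);
    groups.get(e.category).push(e.entity);
  }
  return groups;
}

const entities = [
  { entity: "Zoho", category: "Organization", confidence: 98, start: 0, end: 4 },
  { entity: "Chennai", category: "Location", confidence: 91, start: 23, end: 30 },
  { entity: "Acme", category: "Organization", confidence: 42, start: 40, end: 44 },
];

const grouped = groupByCategory(entities, 50);
console.log(Object.fromEntries(grouped));
// { Organization: [ 'Zoho' ], Location: [ 'Chennai' ] }
```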
NER is highly useful in identifying key elements in large datasets, and conveying their subject matter by grouping relevant or similar information together. NER can be implemented in the following scenarios:
- Content classification and clustering applications that require efficient, quick, and real-time management of multiple content categories
- Summarizing information in content like resumes, manuals, news, and scientific papers, and grouping crucial details together
- Customer support applications that categorize complaints or requests based on departments, filter priority words, or pinpoint recurring problems
- Applications in data science and analytics sectors that require extraction and organization of large-scale data, or identification of common themes and trends
- Optimizing search and recommendation engines using tags pertaining to entity classifications, and enhancing user experience
- Transforming unstructured data into quantitative and meaningful information
Keyword Extraction is a text analysis technique that involves extracting important and relevant terms from a piece of text, which provides an abstraction of the whole text. It also works on the principles of text mining, information retrieval, and natural language processing.
Keyword Extraction is similar to the other Text Analytics features in that it analyzes human language and becomes more precise with further training on rich data sets. Its approaches range from simple statistical methods, such as word frequencies and collocations, to advanced machine learning techniques.
Zia Keyword Extraction groups the extracted terms into two categories: Keywords and Keyphrases. These highlights deliver a concise summary of the text and provide valuable insights into its topic.
Zia Keyword Extraction returns the response in the same format whether you test it in the console or call it through the API. Both the visual and JSON responses contain an array of Keywords and another array of Keyphrases, which include the extracted terms in them. A confidence score is not returned for this feature.
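Because the result is simply two arrays of terms, checking it for terms you care about is straightforward. The sketch below assumes a hypothetical object with keywords and keyphrases arrays. The actual property names in the JSON response may differ, so treat them as placeholders.

```javascript
// Return the watched terms that appear among the extracted keywords or keyphrases.
// The { keywords, keyphrases } property names are assumptions for illustration.
function findWatchedTerms(extraction, watchList) {
  const extracted = [...extraction.keywords, ...extraction.keyphrases].map((t) =>
    t.toLowerCase()
  );
  // Case-insensitive exact match against each extracted term.
  return watchList.filter((term) => extracted.includes(term.toLowerCase()));
}

const extraction = {
  keywords: ["Refund", "delay"],
  keyphrases: ["order status", "customer support"],
};

const hits = findWatchedTerms(extraction, ["refund", "Order Status", "warranty"]);
console.log(hits); // [ 'refund', 'Order Status' ]
```

Your application could then trigger an action, such as routing a ticket, whenever the returned list is not empty.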
Keyword Extraction is a highly useful feature if you want to skim through long textual content and just obtain the essential information and action items from it. It enables you to identify the subject matter and the gist of the text at a glance, and save valuable time.
Keyword Extraction can be used for the following purposes:
- Implementation in online libraries that return key highlights of reading materials as search results
- Automating data retrieval, indexing, and organization in large datasets like online forums, customer feedback, news reports, and more
- Querying for the presence of particular keywords and keyphrases from extracted material, and performing necessary actions accordingly
- Implementation in applications that perform real-time analysis and generate automated responses based on the extracted key terms
- Obtaining quick and in-depth insights from a huge number of source materials, and saving effort and time in manual research and processing
- Reducing inconsistencies in manual information retrieval, since Keyword Extraction functions on pre-defined parameters
- Integration of Multiple Features
Text Analytics is a combination of three different features that work distinctly to deliver individual, drilled-down results. The integration of these features enables you to process the same content for three different purposes, and obtain multitudes of insights on it, with a single request. This helps you gain layers of insights on your source material, and strategize your actions accordingly.
- Highly Scalable and Efficient
Zia Text Analytics can process large volumes of data in parallel, and generate results in a fast and effective manner. Catalyst ensures a high throughput of data transmission and a minimal latency in serving requests. The quick response time enhances your application's performance, and enables you to deliver instant solutions.
- Advanced and Trained AI
Zia is an AI-driven assistant that undergoes repeated systematic training to generate results with high accuracy and a low error margin. The AI is trained using various machine learning techniques to perform complex analysis and computations. The training model is rigorous, which means it studies and analyzes large volumes of data, and this ensures that the results generated are precise, accurate, and reliable.
- Seamless Integration
You can easily implement Text Analytics in your application without having to worry about the underlying logic or the backend set-up, by incorporating the ready-made code templates provided for the Java and Node.js platforms. Catalyst saves you the effort of customizing the code for your needs, by auto-populating the code with relevant project and environment details.
- Testing in the Console
The testing feature in the console enables you to verify the efficiency of Text Analytics. You can process your own textual content and view both the JSON response and visual results. This allows you to get an idea about the format and accuracy of the response that will be generated when you implement it in your application.
The Implementation section explains the process of testing Text Analytics in the Catalyst console. Refer to the Java SDK, Node.js SDK, and API help pages for help on implementing Text Analytics in your code.
By default, the Text Analytics component in the console processes your text and delivers results for all three features: Sentiment Analysis, NER, and Keyword Extraction. You can view individual responses for each feature, and a common JSON response for all three features.
You can access Text Analytics for your project in the Catalyst console in the following way:
- Navigate to Zia Services under Discover. Scroll down to Text Analytics and click Access Now.
- Click Try a Demo in the Text Analytics feature page.
This will open the Text Analytics page.
You can test Text Analytics by either selecting an existing sample text, or by providing your own text.
You can pass a block of text of up to 1500 characters to process for Text Analytics.
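If your content is longer than this limit, one option is to split it into smaller blocks before processing. The sketch below is one possible approach, not a Catalyst feature. It splits on sentence boundaries where it can, and hard-splits any single sentence that exceeds the limit. The function name splitForTextAnalytics is hypothetical.

```javascript
const MAX_CHARS = 1500; // the limit stated above

// Split text into chunks of at most maxChars, preferring sentence boundaries.
// splitForTextAnalytics is a hypothetical helper, not part of the Catalyst SDK.
function splitForTextAnalytics(text, maxChars = MAX_CHARS) {
  // Break after sentence-ending punctuation that is followed by whitespace.
  const sentences = text.split(/(?<=[.!?])\s+/);
  const chunks = [];
  let current = "";

  for (const sentence of sentences) {
    let rest = sentence;
    // A single sentence longer than the limit is cut into fixed-size pieces.
    while (rest.length > maxChars) {
      if (current) {
        chunks.push(current);
        current = "";
      }
      chunks.push(rest.slice(0, maxChars));
      rest = rest.slice(maxChars);
    }
    if (!rest) continue;

    const candidate = current ? current + " " + rest : rest;
    if (candidate.length <= maxChars) {
      current = candidate; // the sentence still fits in the current chunk
    } else {
      chunks.push(current); // start a new chunk with this sentence
      current = rest;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}

console.log(splitForTextAnalytics("One. Two. Three.", 10)); // [ 'One. Two.', 'Three.' ]
```

Note that sentiment and entity results computed per chunk are not equivalent to results for the whole text, so split only where that trade-off is acceptable.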
To process an existing sample text and obtain the results:
- Click Get Sample Text in the window.
- Select a sample text of your choice from the ones shown in the window.
- You can additionally provide optional keywords in the text for Sentiment Analysis. This will enable Sentiment Analysis to process only those sentences that contain these keywords, and determine their sentiments. Other sentences will be ignored.
Enter the keywords, then click Proceed.
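To get a feel for which sentences the optional keywords would select, you can reproduce the filtering locally. The sketch below is only an approximation: it assumes case-insensitive substring matching and a simple punctuation-based sentence split, and Zia's own matching rules may differ.

```javascript
// Return only the sentences that contain at least one of the keywords.
// Matching rules (case-insensitive substring) are an assumption, not Zia's documented behavior.
function sentencesWithKeywords(text, keywords) {
  const lowered = keywords.map((k) => k.toLowerCase());
  return text
    .split(/(?<=[.!?])\s+/) // naive sentence split
    .filter(
      (s) => s.length > 0 && lowered.some((k) => s.toLowerCase().includes(k))
    );
}

const feedback =
  "The delivery was late. The product works great! Support never replied.";
console.log(sentencesWithKeywords(feedback, ["delivery", "support"]));
// [ 'The delivery was late.', 'Support never replied.' ]
```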
Catalyst will process the text for all three features and display the visual and JSON responses for them. Let's discuss the response of each feature.
Sentiment Analysis:
The results of Sentiment Analysis contain the sentiments recognized in each sentence of the text. As mentioned earlier, the visual response presents the confidence score of each sentiment as a percentage.
You can navigate to a sentence either by clicking on it from the text window, or by clicking the arrow marks in the result window displaying each sentence.
The overall sentiment and accuracy of the entire text are also displayed in the results window.
If you provided any optional keywords for Sentiment Analysis, you can also view all the keywords from the dropdown above, and select one to process the text for. The sentiments will be analyzed only in the sentences that contain that keyword.
You can click Edit Keyword to modify a keyword.
Named Entity Recognition:
The results of Named Entity Recognition contain the list of all entities recognized in the text, and the categories they are classified into, along with the confidence score of each classification in percentages.
You can navigate between the entities by clicking on the recognized entity, indicated in grey, in the text window, or by clicking the entity displayed in the results section.
You can also filter the results by a specific entity category using the filter drop-down above. Select a category to view the entities that are grouped into it.
Keyword Extraction:
The results section of Keyword Extraction displays a list of keywords and keyphrases extracted from the text that provide highlights of the content.
You can navigate between the keywords and keyphrases by clicking on them in the text window, or from the results section.
The results section also contains options at the bottom for you to edit the text that you want to process, or select a different existing sample text. Clicking Edit Text will open a text window, where you can modify the text or enter your own. This is discussed at the end of this section.
You can view the JSON response of Text Analytics by clicking View Response.
This will open a window with the JSON response that includes the results of all three features.
You can check a full sample JSON response of each feature from the API documentation.
You can provide your own text for Text Analytics processing in the following way:
- Click Enter Text from the Text Analytics page.
- Enter your own text in the window, then click Confirm.
- Enter the optional keywords for Sentiment Analysis if needed, then click Proceed. Refer to the previous section on processing a sample text for details about this.
Catalyst will process your text for all three features, and display the visual and JSON responses for them, in the same manner as discussed in the sample text section.
You can check the results for all the features, and the JSON response, in the same way.
As mentioned in the introduction, you can implement Text Analytics in your Catalyst application in the Java and Node.js platforms. The console provides the code templates for both these platforms, below the test window.
Click the required tab and copy the code using the copy icon. You can include this code in your web or mobile application's source.
In the Java code, the text to be processed for all three Text Analytics features is passed as a JSON Array to getTextAnalytics(), along with the optional keywords to perform Sentiment Analysis on. The response returns the results of each Text Analytics feature in their own format. Refer to the Java SDK documentation for more details on the Java code.
The Node.js code references the zia component instance and also passes the text to be processed to getTextAnalytics(). You can pass the optional keywords to perform Sentiment Analysis on, in a similar way. You can write your own processing logic and error logic. The response returns the results of each Text Analytics feature in their own format. Refer to the Node.js SDK documentation for more details on the Node.js code.
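The shape of such a call, with your own processing and error logic around it, might look like the sketch below. It is not the console's code template. It assumes that getTextAnalytics() accepts the text as an array followed by an optional keyword array and returns a promise, and it uses a stand-in object (fakeZia) in place of the real zia component instance so that it runs without the SDK. Copy the actual template from the console for production use.

```javascript
// Wrap the Text Analytics call with input validation and error handling.
// `zia` is the zia component instance; how it is obtained comes from the
// console's code template and is not shown here.
async function analyzeText(zia, text, keywords = []) {
  if (typeof text !== "string" || text.trim() === "") {
    throw new TypeError("text must be a non-empty string");
  }
  try {
    // Assumption: text is passed as an array, keywords are optional,
    // and the call returns a promise.
    return await zia.getTextAnalytics([text], keywords);
  } catch (err) {
    // Replace with your own error logic (retry, alerting, fallback, ...).
    console.error("Text Analytics request failed:", err.message);
    throw err;
  }
}

// Stand-in for the real component so this sketch is runnable on its own.
const fakeZia = {
  calls: [],
  async getTextAnalytics(texts, keywords) {
    this.calls.push({ texts, keywords });
    return { received: texts.length }; // placeholder, not the real response format
  },
};

analyzeText(fakeZia, "Great service!", ["service"]).then((result) =>
  console.log(result)
);
```

Passing the component in as a parameter keeps the wrapper easy to test, since a stand-in can be substituted for the real instance.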