Object recognition is a Catalyst Zia AI-driven technology that recognizes and identifies objects in an image. It is an AI program in the field of computer vision that uses pre-trained 3D models, component identification, edge detection algorithms, and other technologies to recognize specific entities semantically and episodically. In other words, the AI detects and labels specific objects, such as a tree or a person, from an image.
Object Recognition can identify 80 different kinds of common objects in an image. Zia provides information about each identified object's type, its position in the image, and a confidence score indicating the accuracy of each recognition in the response. You can implement Zia Object recognition in applications that perform image classification, object detection, or object localization.
Catalyst provides Zia Object Recognition in the Java and Node.js SDK packages, and you can integrate it in your Catalyst web or Android application. The Catalyst console provides easy access to code templates for these environments that you can implement in your application's code.
You can also test Object Recognition by uploading sample images in the console and viewing the recognized objects, to get a better idea of Zia's accuracy and the response format.
You can refer to the Java SDK documentation and Node.js SDK documentation for code samples of Object Recognition. Refer to the API documentation to learn about the API available for Object Recognition.
Before you learn about the use cases and implementation of Object Recognition, it's important to understand its fundamental concepts in detail.
Zia's object recognition process can be broken down into various stages:
- Object detection: Object detection involves detecting instances of semantic objects in images. Along with object localization, object detection can not only detect the most obvious object in the image, but can detect and localize multiple objects in the same image.
- Object localization: Object localization draws bounding boxes around the object after it has been detected. A bounding box is the coordinates of a rectangular border that encloses the detected object. This helps Zia map the location of the object in the image.
- Image classification:
Image classification is part of deep learning, a subset of machine learning, which classifies detected objects in an image into categories. Zia can identify object classes such as animals, everyday items, food, sports equipment, and vehicles in an image. When an object is assigned to a class based on its visual content, Zia runs further algorithms to accurately recognize the object and deliver the results.
- Object recognition:
Zia breaks down the object class further to perform semantic recognition of specific entities. By combining object detection, object localization, and image classification, the specific object is recognized and produced in the output. Zia can recognize 80 kinds of specific object types, some of which include: person, car, dog, chair, traffic light, knife, umbrella, cellphone, book, cake, baseball bat, laptop, aeroplane, stop_sign, parking meter.
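The bounding box produced in the localization stage above is a set of rectangular coordinates, which your application can use to derive the object's size and position. The following is a minimal sketch; the coordinate shape used here (`topLeft`/`bottomRight` corner pairs) is an assumption for illustration only, so check the API documentation for the exact response structure.

```javascript
// Sketch: deriving dimensions from a bounding box of a detected object.
// The input shape ({ topLeft: [x1, y1], bottomRight: [x2, y2] }) is an
// illustrative assumption, not the documented response format.
function boxDimensions(box) {
  const [x1, y1] = box.topLeft;
  const [x2, y2] = box.bottomRight;
  return {
    width: x2 - x1,
    height: y2 - y1,
    center: [(x1 + x2) / 2, (y1 + y2) / 2],
  };
}

const dims = boxDimensions({ topLeft: [40, 60], bottomRight: [240, 360] });
console.log(dims); // { width: 200, height: 300, center: [ 140, 210 ] }
```

A mapping like this is how an application can overlay labels on the original image at each recognized object's location.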
This entire process involves dividing and conquering, comparing the detected objects with the representations stored in the AI's memory, and repeatedly hypothesizing and testing. Since an object can appear in different poses, contexts, arbitrary positions, and orientations, Zia undergoes constant training to generate accurate results.
Zia Object Recognition detects and recognizes objects by analyzing image files. Object Recognition supports the following input file formats: .jpg/.jpeg and .png.
You could provide a space for the user to upload the image file from the device's memory to the Catalyst application. You can also code the Catalyst application to use the end user device's camera to capture a photo and process the image as the input file.
You can check the request format of the API from the API documentation.
The user must follow these guidelines while providing the input, for better results:
- Avoid providing blurred or corrupted images.
- Object Recognition can recognize the objects in an image better if they are clear, visible, and distinct.
- The file size must not exceed 10 MB.
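The format and size limits above can be enforced in your application before the image is ever sent for processing. A minimal validation sketch follows; the function name and return shape are illustrative, but the limits match the guidelines stated here.

```javascript
// Sketch: validating an input image against the documented limits
// (.jpg/.jpeg/.png format, at most 10 MB) before submitting it to
// Zia Object Recognition. Names here are illustrative.
const MAX_BYTES = 10 * 1024 * 1024; // 10 MB
const ALLOWED_EXTENSIONS = [".jpg", ".jpeg", ".png"];

function validateInput(fileName, sizeInBytes) {
  const dot = fileName.lastIndexOf(".");
  const ext = dot === -1 ? "" : fileName.slice(dot).toLowerCase();
  if (!ALLOWED_EXTENSIONS.includes(ext)) {
    return { ok: false, reason: `unsupported format: ${ext || "none"}` };
  }
  if (sizeInBytes > MAX_BYTES) {
    return { ok: false, reason: "file exceeds 10 MB" };
  }
  return { ok: true };
}

console.log(validateInput("crowd.png", 2 * 1024 * 1024)); // { ok: true }
console.log(validateInput("scan.gif", 1024).ok);          // false
```

Rejecting invalid files client-side avoids a wasted round trip and gives the end user immediate feedback.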
Zia Object Recognition returns the response in the following ways:
- In the Console
When you upload a sample image in the console, it will return the results in two response formats:
- Document response: The textual response shows the object types of all the recognized objects in the image, with the confidence level of each recognition as percentage values.
- JSON response: The JSON response contains the object type, the coordinates of the object in the image, and the confidence score as a value between 0 and 1 for each object that was recognized in the image. The confidence score can be equated to percentage values as follows:
| Confidence Level (percentage) | Confidence Score (value between 0 and 1) |
| --- | --- |
| 0-9 | 0.0 |
| 10-19 | 0.0 |
| 20-29 | 0.0 |
| 30-39 | 0.0 |
| 40-49 | 0.01 |
| 50-59 | 0.12 |
| 60-69 | 0.23 |
| >70 | 0.63 |
- Using the SDKs
When you send an image file using an API request, you will receive a JSON response containing the results in the format specified above.
You can check the JSON response format from the API documentation.
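Once the JSON response is parsed, a common next step is to keep only the detections above a chosen confidence threshold. The sketch below assumes a simplified response shape with `object_type` and `confidence` fields, modeled on the fields described above; verify the exact key names against the API documentation.

```javascript
// Sketch: filtering a parsed Object Recognition response by confidence.
// The detection shape ({ object_type, confidence }) is an assumption
// modeled on the fields described in this document.
function confidentObjects(detections, threshold = 0.5) {
  return detections
    .filter((d) => d.confidence >= threshold)
    .map((d) => d.object_type);
}

const sample = [
  { object_type: "person", confidence: 0.91 },
  { object_type: "dog", confidence: 0.34 },
  { object_type: "car", confidence: 0.78 },
];
console.log(confidentObjects(sample)); // [ 'person', 'car' ]
```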
- Highly Accurate Results
Zia undergoes repeated systematic training to generate results with higher accuracy and a lower error margin. The AI is trained using various machine learning algorithms to perform complex computations and analysis. The training model is highly vigorous, which means it studies and analyzes large volumes of data, and this ensures that the results generated are precise, accurate, and reliable.
- Confidence Score for Each Recognized Object
The confidence score provided for each recognized object helps the user verify the level of accuracy of the result. The end user can analyze the confidence score and make informed decisions. The confidence score also helps them decide on providing better quality input for more accurate results.
- Rapid Performance
Object Recognition generates results almost instantaneously when the image is uploaded. Catalyst ensures a high throughput of data transmission and a minimal latency in serving requests. The fast response time enhances your application's performance, and provides a satisfying experience for the end user.
- Seamless Integration
You can easily implement Object Recognition in your application without having to learn the complex processing of the machine learning algorithms or the backend set-up. You can implement the ready-made code templates provided for the Java and Node.js platforms in any of your Catalyst applications that require Object Recognition.
- Testing in the Console
The testing feature in the console enables you to verify the efficiency of Object Recognition. You can upload sample images and view the results. This allows you to get an idea about the format and accuracy of the response that will be generated when you implement it in your application.
Object Recognition is increasingly being used in a wide range of applications. The following are some use cases for Zia Object Recognition:
- A real-time people tracking application uses Zia Object Recognition to count the crowd from live images captured through surveillance cameras or drone cameras. The cameras capture and produce images of the crowd inside a mall, a store, an event, or a festival, and the application processes them using Catalyst. Combined with Zia Face Analytics, this application detects the gender and age of the people in the venue to analyze the demographics of the visitors and the pattern of their visits, and to determine the success of the event.
- A traffic monitoring application implements Zia Object Recognition to analyze the images captured in traffic cameras to identify offenders of traffic rules and parking rules, and instances of road accidents. Since Zia can recognize stop signs, traffic signals, parking meters, common vehicles, and people, the application is coded to quickly identify these objects, analyze their positions in the image, and issue alerts for specific violations.
Some other examples where Zia Object Recognition can be implemented include:
- An application that processes images captured in motion tracking cameras to determine the movements of sports balls, skateboards, Frisbees, and surfboards using their positions in the images
- An application that processes the images captured on surveillance cameras installed on forest roads that detects the presence and movements of animals using motion detection, and recognizes individual species from the images
- An application that processes images captured in crime scenes to detect and label objects in them, assisting in automating and streamlining forensic analysis
- An application that processes images captured in retail outlets to determine the popularity and customer interest of specific products based on the count of people in specific sections of the store
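The retail use case above reduces to counting detections of a given type per image frame. A minimal sketch of that counting step is shown below; the detection shape is an illustrative assumption, as in the earlier examples.

```javascript
// Sketch: counting detected objects of one type, as in the retail
// footfall use case above. The detection shape is illustrative.
function countType(detections, type) {
  return detections.filter((d) => d.object_type === type).length;
}

const frame = [
  { object_type: "person", confidence: 0.88 },
  { object_type: "person", confidence: 0.72 },
  { object_type: "chair", confidence: 0.65 },
];
console.log(countType(frame, "person")); // 2
```

Aggregating these counts over frames from each store section yields the popularity metrics the use case describes.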
This section only covers working with Object Recognition in the Catalyst console. Refer to the SDK and API documentation sections for implementing Object Recognition in your application's code.
As mentioned earlier, you can access the code templates that will enable you to integrate Object Recognition in your Catalyst application from the console, and also test the feature by uploading images and obtaining the results.
To access Object Recognition in your Catalyst console:
- Navigate to Zia Services under Discover, then click Access Now on the Object Recognition window.
- Click Try a Demo in the Object Recognition feature page.
This will open the Object Recognition feature.
You can test Object Recognition by either selecting a sample image from Catalyst or by uploading your own image.
To scan a sample image and recognize the objects:
- Click Select a Sample Image in the box.
- Select an image from the samples provided.
Object Recognition will scan the image, and list the recognized objects in the image along with the confidence level of each object in percentage values under the Result section.
The colors in the response bars indicate the confidence percentage range of each recognition: red: 0-30%, orange: 30-80%, green: 80-100%.
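If your application mirrors this presentation, the color bands above translate directly into a small mapping function, sketched here (the function name is illustrative):

```javascript
// Sketch: mapping a confidence percentage to the console's color bands
// described above (red: 0-30%, orange: 30-80%, green: 80-100%).
function bandColor(percentage) {
  if (percentage < 30) return "red";
  if (percentage < 80) return "orange";
  return "green";
}

console.log(bandColor(25)); // red
console.log(bandColor(55)); // orange
console.log(bandColor(92)); // green
```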
You can use the arrows to view all the recognized objects.
Click View Response to view the JSON response. The JSON response provides the coordinates of each recognized object, their type, and the confidence score of the recognition as a value between 0 and 1.
You can refer to the API documentation to view a complete sample JSON response structure.
To upload your own image and test Object Recognition:
- Click Upload under the Result section.
If you're opening Object Recognition after you have closed it, click Browse Files in this box.
- Upload a file from your local system.
Note: The file must be in .jpg/.jpeg, or .png format. The file size must not exceed 10 MB.
The console will scan the image and display the recognized objects.
You can check the JSON response in a similar way.
You can access the code templates from the section below the test window. Click either the Java SDK or NodeJS SDK tab, and copy the code using the copy icon. You can then paste this code wherever you require in your web or Android application's code.
You can process the input file as a new File object in Java. The ZCObjectDetectionData class contains the detectObjects() method that detects objects in the input image file. The getObjectType(), getConfidence(), and getObjectPoints() methods return the object type, the confidence value, and the object's coordinates, respectively.
In Node.js, you can pass the input image file to the detectObject() method.