Introduction
The Microsoft Cognitive Services Computer Vision APIs are cloud-based REST services hosted on Azure. They give developers access to advanced machine learning algorithms that process images and return the analysis in JSON format. With them, developers can build intelligence into their applications to enable natural, contextual interactions. The APIs can be used to describe an image, recognize celebrities, read text from images, generate thumbnails, and tag images according to their content.
Using the Computer Vision APIs, a developer can perform the following tasks, among others:
- Analyze information about the visual content found in an image
- Categorize images
- Generate thumbnails
- Identify the type and quality of images
- Recognize celebrities
- Detect human faces in an image
- Recognize and read text present in an image
- Flag adult content
- Utilize optical character recognition to identify printed text found in images
How to Register and Get the API Key
To consume the Vision API, the first step is to obtain the API keys from Microsoft Cognitive Services, which is deployed in the Azure Cloud. Currently, a free plan that limits calls to 5000 transactions per month is available.
Open the preceding link and create a new account by clicking the Create button.
Figure 1: Creating a new account
During the new Cognitive Account creation process, you have to accept the Cognitive Service terms and conditions.
Figure 2: Accepting the account terms and conditions
Sign in using your existing Microsoft account. You can also use your Facebook, LinkedIn, or GitHub account to create an account.
Figure 3: Logging in to your account
After completing the sign-up process, locate the Computer Vision section, where two keys are provided to you. Pass one of these keys with each service call to execute the transaction.
Figure 4: Receiving the two keys
Computer Vision API Details
The Computer Vision API is currently available in the following Azure regions:
- West US: westus.api.cognitive.microsoft.com
- East US 2: eastus2.api.cognitive.microsoft.com
- West Central US: westcentralus.api.cognitive.microsoft.com
- West Europe: westeurope.api.cognitive.microsoft.com
- Southeast Asia: southeastasia.api.cognitive.microsoft.com
A developer can either (1) upload an image or (2) specify an image URL in the HTTP POST request. An optional parameter lets the developer choose which features to return. Supported image formats are JPEG, PNG, GIF, and BMP. The maximum image size is 4 MB, and the image dimensions should be greater than 50 x 50 pixels.
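The limits above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the Computer Vision API: the ImageConstraints class and its method names are hypothetical, the extension list is an assumed mapping of the four supported formats, and 4 MB is interpreted as 4 * 1024 * 1024 bytes.

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical helper: checks an image against the limits stated in the text.
public static class ImageConstraints
{
    // Assumed file extensions for the supported formats (JPEG, PNG, GIF, BMP).
    private static readonly string[] SupportedExtensions =
        { ".jpg", ".jpeg", ".png", ".gif", ".bmp" };

    // 4 MB upper bound, interpreted here as 4 * 1024 * 1024 bytes (an assumption).
    public const long MaxBytes = 4L * 1024 * 1024;

    // True when the file name ends in one of the assumed extensions (case-insensitive).
    public static bool HasSupportedExtension(string path)
    {
        string extension = Path.GetExtension(path) ?? string.Empty;
        return SupportedExtensions.Contains(extension.ToLowerInvariant());
    }

    // True when the byte count is non-zero and does not exceed the 4 MB limit.
    public static bool IsWithinSizeLimit(long lengthInBytes)
    {
        return lengthInBytes > 0 && lengthInBytes <= MaxBytes;
    }

    // Mirrors the "greater than 50 x 50 pixels" rule; the caller supplies the dimensions.
    public static bool HasValidDimensions(int width, int height)
    {
        return width > 50 && height > 50;
    }
}
```

For a local file, the size check could be called as ImageConstraints.IsWithinSizeLimit(new FileInfo(path).Length). The service remains the final authority on what it accepts.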
A successful request returns a JSON response with status code 200. A failed request returns an error code such as 400, 401, 415, or 500.
The Computer Vision API HTTP POST request format is as follows.
https://[location].api.cognitive.microsoft.com/vision/v1.0/analyze[?visualFeatures][&details][&language][&subscription-key=<Your subscription key>]
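The request format above can be assembled in code. The following is a minimal sketch under stated assumptions: the AnalyzeUriBuilder class is hypothetical, several visual features are assumed to be passed as one comma-separated value, and the optional details parameter is left out for brevity. The subscription key is not placed in the URL here because it can be sent in a request header instead.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: builds the analyze URI from its documented parts.
public static class AnalyzeUriBuilder
{
    public static string Build(string location, IEnumerable<string> visualFeatures, string language = null)
    {
        if (string.IsNullOrWhiteSpace(location))
        {
            throw new ArgumentException("A location such as \"westus\" is required.", "location");
        }

        // Escape each feature name, then join with commas (assumed list separator).
        var features = new List<string>();
        if (visualFeatures != null)
        {
            foreach (string feature in visualFeatures)
            {
                features.Add(Uri.EscapeDataString(feature));
            }
        }

        // Collect only the query parameters that were actually supplied.
        var query = new List<string>();
        if (features.Count > 0)
        {
            query.Add("visualFeatures=" + string.Join(",", features));
        }
        if (!string.IsNullOrWhiteSpace(language))
        {
            query.Add("language=" + Uri.EscapeDataString(language));
        }

        string uri = "https://" + location + ".api.cognitive.microsoft.com/vision/v1.0/analyze";
        return query.Count > 0 ? uri + "?" + string.Join("&", query) : uri;
    }
}
```

Calling AnalyzeUriBuilder.Build("westus", new[] { "Categories" }, "en") produces the same URL that the sample request in this section uses.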
Following is an example of an HTTP POST request with a valid subscription key.
POST https://westus.api.cognitive.microsoft.com/vision/v1.0/analyze?visualFeatures=Categories&language=en HTTP/1.1
Content-Type: application/json
Host: westus.api.cognitive.microsoft.com
Ocp-Apim-Subscription-Key: ################################

{"url":"https://media.licdn.com/mpr/mpr/shrinknp_200_200/building.Jpeg"}
Editor’s Note: The “#” signs were used to replace “hard spaces” in the original text.
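The URL-based request shown above can be composed in C# with HttpRequestMessage. This is a minimal sketch: the UrlAnalysisRequest class is hypothetical, and the JSON body is built by hand with only basic escaping, which is assumed to be sufficient for ordinary image URLs.

```csharp
using System;
using System.Net.Http;
using System.Text;

// Hypothetical helper: builds a POST request that asks the API to analyze an image by URL.
public static class UrlAnalysisRequest
{
    public static HttpRequestMessage Create(string analyzeUri, string subscriptionKey, string imageUrl)
    {
        var request = new HttpRequestMessage(HttpMethod.Post, analyzeUri);

        // The subscription key travels in this header, as in the sample request.
        request.Headers.Add("Ocp-Apim-Subscription-Key", subscriptionKey);

        // Body of the form {"url":"..."} sent as application/json.
        string body = "{\"url\":\"" + EscapeJson(imageUrl) + "\"}";
        request.Content = new StringContent(body, Encoding.UTF8, "application/json");
        return request;
    }

    // Escapes backslashes and double quotes only (a simplification).
    private static string EscapeJson(string value)
    {
        return value.Replace("\\", "\\\\").Replace("\"", "\\\"");
    }
}
```

The resulting message can be sent with an HttpClient instance, for example with await client.SendAsync(request).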
Here is a sample successful JSON response with success code 200.
{
  "categories": [
    {
      "name": "building_",
      "score": 0.31640625,
      "detail": {
        "landmarks": [
          { "name": "Colosseum", "confidence": 0.944500566 }
        ]
      }
    },
    { "name": "others_", "score": 0.00390625 },
    { "name": "outdoor_", "score": 0.04296875 }
  ],
  "tags": [
    { "name": "building", "confidence": 0.99887830018997192 },
    { "name": "outdoor", "confidence": 0.97255456447601318 }
  ],
  "description": {
    "tags": [
      "building", "outdoor", "front", "sitting", "large", "old", "standing",
      "table", "top", "train", "bridge", "city", "group", "white", "man",
      "clock", "walking", "people", "parked", "track", "castle", "sheep",
      "riding", "tower", "street", "tall"
    ],
    "captions": [
      { "text": "a group of people in front of a building", "confidence": 0.84632025454882787 }
    ]
  },
  "requestId": "7e38d717-52b0-4947-ae9a-2210ee036dbd",
  "metadata": { "width": 600, "height": 399, "format": "Jpeg" },
  "faces": [],
  "color": {
    "dominantColorForeground": "Grey",
    "dominantColorBackground": "White",
    "dominantColors": [ "Grey", "White" ],
    "accentColor": "486A83",
    "isBWImg": false
  },
  "imageType": { "clipArtType": 0, "lineDrawingType": 0 }
}
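A response shaped like the sample above can be read with a JSON parser. The following is a minimal sketch that assumes the System.Text.Json library is available (it ships with current .NET versions and is a NuGet package for .NET Framework 4.6.1 and later). The AnalysisResponseReader class and its methods are hypothetical; only the property names come from the sample response.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper: pulls a caption and confident tags out of an analysis response.
public static class AnalysisResponseReader
{
    // Returns the text of the first caption, or null when the response has none.
    public static string FirstCaption(string json)
    {
        using (JsonDocument document = JsonDocument.Parse(json))
        {
            JsonElement description, captions, text;
            if (document.RootElement.TryGetProperty("description", out description) &&
                description.TryGetProperty("captions", out captions) &&
                captions.ValueKind == JsonValueKind.Array &&
                captions.GetArrayLength() > 0 &&
                captions[0].TryGetProperty("text", out text))
            {
                return text.GetString();
            }
            return null;
        }
    }

    // Returns the names of all tags whose confidence is at least minConfidence.
    public static List<string> TagsAbove(string json, double minConfidence)
    {
        var names = new List<string>();
        using (JsonDocument document = JsonDocument.Parse(json))
        {
            JsonElement tags;
            if (!document.RootElement.TryGetProperty("tags", out tags) ||
                tags.ValueKind != JsonValueKind.Array)
            {
                return names;
            }
            foreach (JsonElement tag in tags.EnumerateArray())
            {
                if (tag.GetProperty("confidence").GetDouble() >= minConfidence)
                {
                    names.Add(tag.GetProperty("name").GetString());
                }
            }
        }
        return names;
    }
}
```

Any JSON library would serve equally well here. Newtonsoft.Json, for instance, is a common choice on .NET Framework projects.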
Following is an example of a failed response with error code 401, returned because an invalid subscription key was passed in the request.
apim-request-id: 1bed0251-5c8d-4bc0-8cc9-797ecabd14d2
Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
x-content-type-options: nosniff
Date: Sun, 21 May 2017 06:03:13 GMT
WWW-Authenticate: AzureApiManagementKey realm="https://westus.api.cognitive.microsoft.com/vision/v1.0",name="Ocp-Apim-Subscription-Key",type="header"
Content-Length: 143
Content-Type: application/json

{ "statusCode": 401, "message": "Access denied due to invalid subscription key. Make sure to provide a valid key for an active subscription." }
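Client code usually branches on the status code before it touches the body. The following is a minimal sketch of that step: the VisionStatus class is hypothetical, and the descriptions are the general HTTP meanings of each code rather than wording taken from the service documentation.

```csharp
using System.Net;

// Hypothetical helper: turns the status codes mentioned in this section into short hints.
public static class VisionStatus
{
    public static string Describe(HttpStatusCode statusCode)
    {
        switch ((int)statusCode)
        {
            case 200:
                return "OK: the body contains the analysis JSON.";
            case 400:
                return "Bad Request: check the image, its size, and the request parameters.";
            case 401:
                return "Unauthorized: check the Ocp-Apim-Subscription-Key value.";
            case 415:
                return "Unsupported Media Type: check the Content-Type header.";
            case 500:
                return "Internal Server Error: the service failed; retrying later may help.";
            default:
                return "Unexpected status code " + (int)statusCode + ".";
        }
    }
}
```

With an HttpResponseMessage named response, the hint could be printed with Console.WriteLine(VisionStatus.Describe(response.StatusCode)).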
Programmatically Analyze an Image with the Vision API Using C#
The following C# console application demonstrates how to retrieve image features, such as image properties, tags, and a description, in JSON format from a selected image using the Cognitive Services Computer Vision API.
The following tools/software are required to develop the console application.
- Windows 8 or higher version
- Free Visual Studio 2015 Community Edition
- Cognitive Service Computer Vision API key
Step 1
Open Visual Studio 2015 -> Start -> New Project -> Select Templates (under Visual C# -> Console Application) -> Blank Application -> Give a suitable name to your app (ComputerVisionAPI) -> OK. See Figure 5.
Figure 5: Starting a new console application
Step 2
Add the following namespaces in the Program.cs file.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Configuration;
Step 3
Rename Program.cs to ComputerVisionAPI.cs. Accordingly, change the name of the class to ComputerVisionAPI.
Step 4
Add the app.config file to the project and update the app settings with the following keys. Provide the path of a local image. Update the subscription key value with the key you generated during the registration process. The request parameters and API URI values change depending on the image operation. Note that the ampersand in the request parameters must be written as &amp; inside the XML file.
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <startup>
    <supportedRuntime version="v4.0" sku=".NETFramework,Version=v4.6.1"/>
  </startup>
  <appSettings>
    <add key="ImagePath" value="C:\Users\Tapas\Desktop\Docs\BB Backup\SampleImage.jpg"/>
    <add key="RequestParameters" value="visualFeatures=Categories&amp;language=en"/>
    <add key="APIuri" value="https://westus.api.cognitive.microsoft.com/vision/v1.0/analyze?"/>
    <add key="Subscription-Key" value="13hc77781f8f6cc9b5fcdd72a8df7156"/>
    <!-- This example uploads the image bytes, so it uses the content type "application/octet-stream". The other content types you can use are "application/json" and "multipart/form-data". -->
    <add key="Contenttypes" value="application/octet-stream"/>
  </appSettings>
</configuration>
Step 5
The following static methods will return key values from the app.config file.
static string Subscriptionkey()
{
    return System.Configuration.ConfigurationManager.AppSettings["Subscription-Key"];
}

static string RequestParameters()
{
    return System.Configuration.ConfigurationManager.AppSettings["RequestParameters"];
}

static string ReadImagePath()
{
    return System.Configuration.ConfigurationManager.AppSettings["ImagePath"];
}

static string ReadURI()
{
    return System.Configuration.ConfigurationManager.AppSettings["APIuri"];
}

static string Contenttypes()
{
    return System.Configuration.ConfigurationManager.AppSettings["Contenttypes"];
}
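The methods above return null when a key is missing from app.config, and the failure then surfaces later as a less obvious error. The following is a minimal sketch of a stricter lookup. The RequiredSettings class is hypothetical, and it accepts any NameValueCollection so that it can be exercised without a configuration file.

```csharp
using System;
using System.Collections.Specialized;

// Hypothetical helper: reads a setting and fails fast when it is missing or blank.
public static class RequiredSettings
{
    public static string Get(NameValueCollection settings, string key)
    {
        if (settings == null)
        {
            throw new ArgumentNullException("settings");
        }

        // The indexer returns null when the key is absent.
        string value = settings[key];
        if (string.IsNullOrWhiteSpace(value))
        {
            throw new InvalidOperationException("Missing app setting '" + key + "'.");
        }
        return value;
    }
}
```

In the console application, this could be called as RequiredSettings.Get(ConfigurationManager.AppSettings, "Subscription-Key"), because ConfigurationManager.AppSettings is itself a NameValueCollection.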
Step 6
For image processing and calling the API, write the following static functions in the ComputerVisionAPI class.
static byte[] GetImageAsByteArray(string ImagePath)
{
    // File.ReadAllBytes opens the file, reads it fully, and closes it again.
    return File.ReadAllBytes(ImagePath);
}

/// <summary>
/// Use the following function to fetch all image-related details
/// </summary>
/// <param name="ImagePath"></param>
static void GetImgeDetails(string ImagePath)
{
    using (var ComputerVisionAPIclient = new HttpClient())
    {
        // Request headers - replace this example key with your valid subscription key. I have added that in App.config
        ComputerVisionAPIclient.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", Subscriptionkey());

        // Request parameters.
        string requestParameters = RequestParameters();
        string APIuri = ReadURI() + requestParameters;

        // Request body.
        byte[] ImagebyteData = GetImageAsByteArray(ImagePath);

        // Block until the call completes so that Main() does not return before the response arrives.
        ImgeAnalysis(ImagebyteData, APIuri, ComputerVisionAPIclient).GetAwaiter().GetResult();
    }
}

/// <summary>
/// The following function calls the computer vision API and
/// displays the response in the Console
/// </summary>
/// <param name="ImagebyteData"></param>
/// <param name="APIuri"></param>
/// <param name="ComputerVisionAPIclient"></param>
static async System.Threading.Tasks.Task ImgeAnalysis(byte[] ImagebyteData, string APIuri, HttpClient ComputerVisionAPIclient)
{
    using (var Imagecontent = new ByteArrayContent(ImagebyteData))
    {
        Imagecontent.Headers.ContentType = new MediaTypeHeaderValue(Contenttypes());
        HttpResponseMessage APIresponse = await ComputerVisionAPIclient.PostAsync(APIuri, Imagecontent);

        // Print the status line and headers, followed by the JSON body.
        Console.WriteLine(APIresponse);
        Console.WriteLine(await APIresponse.Content.ReadAsStringAsync());
    }

    // Keep the console window open until a key is pressed.
    Console.Read();
}
Step 7
Finally, call the GetImgeDetails function to get image details from Main().
static void Main(string[] args)
{
    GetImgeDetails(ReadImagePath());
}
Figure 6 shows the C# console application.
Figure 6: The application running as a C# console application
Successful execution of the program generates JSON output like that shown in the Computer Vision API Details section. A developer can then write code to parse the JSON response and build an intelligent app on top of it.
Conclusion
The Computer Vision API addresses the problem of recognizing objects inside an image. Currently, the API recognizes about 2000 distinct objects and groups them into 87 categories. In this article of the Cognitive API series, you have learned what the Computer Vision API is and what it offers you as a developer. You will get a closer look at the other APIs, with code walkthroughs, in my upcoming posts.