Enabling telemetry for custom models in Intel DevCloud for the Edge




Edge computing has seen exponential growth in the last few years, yet developers still face challenges implementing AI and edge software solutions. Here’s how to enable telemetry metrics and make data-driven decisions to determine the best hardware for your solution.

While edge computing has seen exponential growth in the last few years, developers are experiencing issues in implementing AI and edge software solutions. Intel DevCloud for the Edge addresses these challenges by providing users with a remote development environment that includes the tools needed to determine the optimal hardware configuration for a given solution.

Intel DevCloud for the Edge allows users to develop, prototype, and experiment with AI workloads for computer vision. Developers can run AI applications remotely on a wide range of the latest Intel hardware and get immediate access to the latest Intel Distribution of OpenVINO toolkit. It also offers application-specific performance benchmarks on various CPU, GPU, and VPU combinations, as well as FPGAs. Lastly, DevCloud for the Edge provides telemetry metrics, including data about the intensity and conditions of use of a computing device.

This article is a how-to for users who want to enable telemetry metrics for their application and make data-driven decisions to determine the best hardware for their solution.

To get started with Intel DevCloud for the Edge, take a look at Get Started on the Intel DevCloud for the Edge.

1.1 Introduction

This sample application shows what is needed to enable the telemetry dashboard built into the Intel DevCloud for the Edge environment. Specifically, it covers the steps to collect metrics for a custom model, supported by the Intel Distribution of OpenVINO toolkit for inference, with image or video input data. Once inferencing with your model is complete, you can compare the telemetry metrics available for the compute nodes.

This is the resulting telemetry dashboard output using this sample application for person detection on an Intel GPU.


(Source: Intel)

The dashboard shows details of the application run during a given job, e.g. average inference time (ms), inference count, and target hardware. It also includes the following metrics: frames per second, inference times, CPU/GPU usage during inferencing, average CPU/GPU temperature, and memory usage during inferencing.

By the end of this tutorial you will be able to produce this system-level data and dashboard for your custom model, learn about your model’s performance on different Intel hardware, and determine the best hardware for your solution.

1.2 Overview

The overall workflow for Intel DevCloud for the Edge sample is as follows:


(Source: Intel)

  1. Register for Intel DevCloud for the Edge
  2. Launch and open a Jupyter Notebook
  3. Develop models and send jobs to the job queue with the target hardware specified
  4. Access metrics/results from the Jupyter Notebook
  5. View telemetry via the Grafana dashboard

We will now go over the key concepts needed for this article. It is recommended that you read the following from Get Started on the Intel DevCloud for the Edge.

1.2.1 Intel Distribution of OpenVINO Toolkit

The OpenVINO toolkit is a robust toolkit for rapidly developing applications that extend computer vision and non-vision workloads across Intel hardware, maximizing efficiency using the latest generations of artificial neural networks.

1.2.2 Model Optimizer

Model Optimizer works with a network model trained in a supported deep learning framework. It creates an Intermediate Representation (IR) of the network, which the Inference Engine can read, load, and use for inference. The IR consists of a pair of files: an .xml file (network topology information) and a .bin file (weights and biases binary data).

For the complete API Reference, see Inference Engine Python* API Reference

1.2.3 Inference Engine

The Inference Engine includes plugin libraries for all supported Intel hardware and allows a trained model to be loaded. To do so, the application tells the Inference Engine which hardware to target, and the respective plugin library for that device is used. The Inference Engine uses blobs for all data representations, which capture the input and output data of the model. The Inference Engine API is used to load the plugin, read the model Intermediate Representation, load the model into the plugin, and process the output.
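
As a rough illustration of the flow just described (the full version used by this sample appears in section 1.3.4), a minimal Inference Engine sketch might look like the following; the model file names and target device here are placeholders:

from openvino.inference_engine import IECore
import numpy as np

ie = IECore()                                                  # create the Inference Engine instance
net = ie.read_network(model="model.xml", weights="model.bin")  # read the IR pair
input_blob = next(iter(net.input_info))                        # name of the input blob
out_blob = next(iter(net.outputs))                             # name of the output blob
exec_net = ie.load_network(network=net, device_name="CPU")     # load the model onto the target device

n, c, h, w = net.input_info[input_blob].input_data.shape       # model input dimensions
dummy_input = np.zeros((n, c, h, w), dtype=np.float32)         # placeholder input data
result = exec_net.infer(inputs={input_blob: dummy_input})[out_blob]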

1.2.4 Intel OpenVINO Metrics Writer (installed on DevCloud environment)

Intel DevCloud for the Edge has a preinstalled Python package for collecting OpenVINO application metrics. The applicationMetricWriter module needs to be imported, and two functions are called to capture information about the model and the inference times. With both functions called correctly, the collected system-level data is sent to the telemetry Grafana dashboard.

Note: this package is not public and can only be used in the Intel DevCloud for the Edge environment.

1.3 Sample Application

Let’s get started enabling the telemetry dashboard via a Jupyter Notebook in the DevCloud environment.


These are the tasks that will be performed:

  • Import all custom model files (TensorFlow, Kaldi, ONNX, etc.)
  • Use the Model Optimizer to create the model Intermediate Representation (IR) files (.xml and .bin, i.e. topology and weights) in the necessary precisions; see the Model Optimizer documentation for the models that are compatible with the Inference Engine
  • Create the job file (.sh) used to submit running inference on compute nodes
  • Enable telemetry using Application Metrics Writer
  • Submit jobs for different compute nodes and monitor the job status until complete (submitting a job will call the bash and custom python file)
  • Display model metrics on the Telemetry Dashboard

These are the sample files used to enable the telemetry metrics.

  • custom_model_telemetry.ipynb – Jupyter Notebook
  • custom_telem_enable.py – Python code for the custom model application
  • custom_telem_enable.sh – Bash job file
  • conf.txt – Specifies the video files or image folder to be used
  • path to test videos ex: /dir/dir/ex.mp4
  • path to test images directory ex: /dir/dir/testImg

1.3.1 Imports

You must upload all custom model files into the DevCloud environment.

Python imports necessary in order to run the sample application via Jupyter* notebook custom_model_telemetry.ipynb:

import matplotlib.pyplot as plt
import os
import time
import sys
from qarpo.demoutils import *
from IPython.display import HTML, display

1.3.2 Convert Custom Model to Intermediate Representation

The Intel Distribution of OpenVINO toolkit includes the Model Optimizer used to convert and optimize trained models into the IR model files, and the Inference Engine then uses the IR model files to run inference on hardware devices. The IR model files can be created from custom trained models from popular frameworks (Caffe, TensorFlow, MXNet, ONNX, and Kaldi).

For any specifications for conversion, refer to How to Convert a Model to Intermediate Representation (IR). To learn about the supported frameworks and topologies, refer to Supported Framework Layers.

To convert your custom model you may use the following, depending on the framework:

Go to the <INSTALL_DIR>/deployment_tools/model_optimizer directory.

Caffe Model

Use the mo.py script to convert a model, passing the path to the input .caffemodel file:

python3 mo.py --input_model <INPUT_MODEL>.caffemodel

TensorFlow

Use the mo_tf.py script to convert a model, passing the path to the input .pb file:

python3 mo_tf.py --input_model <INPUT_MODEL>.pb

MXNet

To convert an MXNet* model contained in model-file-symbol.json and model-file-0000.params, run the Model Optimizer launch script mo_mxnet.py, specifying a path to the input model file:

python3 mo_mxnet.py --input_model model-file-0000.params

Kaldi

Use the mo.py script to convert a model, passing the path to the input .nnet or .mdl file:

python3 mo.py --input_model <INPUT_MODEL>.nnet

ONNX

Use the mo.py script to convert a model, passing the path to the input .onnx file:

python3 mo.py --input_model <INPUT_MODEL>.onnx

Jupyter* Notebook – Python example with other parameters

The input arguments are as follows:

!mo.py \
    --input_model raw_models/public/mobilenet-ssd/mobilenet-ssd.caffemodel \
    --data_type <data_type> \
    -o models/mobilenet-ssd/ \
    --scale 256 \
    --mean_values [127,127,127]

NOTE: Some models will require manipulation in the above scripts to specify conversion parameters. To learn about when you need to use these parameters, refer to Converting a Model Using General Conversion Parameters.

1.3.3 Code Specifics

The following is the code needed to generate data and enable the telemetry dashboard. We will initialize values and arrays, and create classes and functions to loop through all input data, store the input data in arrays, and parse input arguments from the bash file. Each code snippet is labeled according to whether it belongs to the Jupyter* notebook or the custom Python file.

1.3.3.1 Code Setup

Import Python modules for custom_telem_enable.py:

from __future__ import print_function
import sys
import os
from argparse import ArgumentParser
import cv2
import numpy
import time
import datetime
import collections
import threading
import math
from openvino.inference_engine import IECore
from pathlib import Path
from qarpo.demoutils import progressUpdate
from PIL import Image
import glob
import applicationMetricWriter
 

1.3.3.2 Constants

Constants for custom_telem_enable.py include placeholder values for cpu extension, window names, and thresholds.

CPU_EXTENSION = ''
STATS_WINDOW_NAME = 'Statistics'
CAM_WINDOW_NAME_TEMPLATE = 'inference_output_Video_{}_{}'
FRAME_THRESHOLD = 5
WINDOW_COLUMNS = 3
LOOP_VIDEO = False

1.3.3.3 Globals

Globals for custom_telem_enable.py include placeholder values for the model files as well as the image and video frame arrays.

model_xml = ''
model_bin = ''
videoCaps = []
frames = 0
frameNames = []
numVids = 20000

1.3.3.4 Conf.txt

We set the path to where the input data is located in conf.txt, which is created in the Jupyter* notebook custom_model_telemetry.ipynb.

For images, enter the path of the images directory:

%%writefile conf.txt
/data/reference-sample-data/python-classification/TEST

For videos, enter the paths to the videos:

%%writefile conf.txt
/data/reference-sample-data/python-sample/people-detection.mp4
/data/reference-sample-data/python-sample/one-by-one-person-detection.mp4

1.3.3.5 Set NumRequests

In the Jupyter* notebook custom_model_telemetry.ipynb, we set the NumRequests_* variables to the maximum number of inference requests for each hardware target. This helps improve performance on each device.

# Set maximum number of inference requests for CPU
NumRequests_CPU = 2
print(f"Number of inference requests for CPU set to:{NumRequests_CPU}")
# Set maximum number of inference requests for GPU
NumRequests_GPU = 4
print(f"Number of inference requests for GPU set to:{NumRequests_GPU}")
# Set maximum number of inference requests for NCS2
NumRequests_NCS2 = 4
print(f"Number of inference requests for NCS2 set to:{NumRequests_NCS2}")
# Set maximum number of inference requests for FPGA
NumRequests_FPGA = 4
print(f"Number of inference requests for FPGA set to:{NumRequests_FPGA}")
# Set maximum number of inference requests for HDDL-R
NumRequests_HDDLR = 128
print(f"Number of inference requests for HDDL-R set to:{NumRequests_HDDLR}")

1.3.3.6 Classes

Classes for custom_telem_enable.py:

The VideoCap and FrameInfo classes are created to help initialize the frame-by-frame information captured from videos. The objective is to read and initialize all video frames so that we can run inferencing correctly.

class FrameInfo:
    def __init__(self, frameNo=None, count=None, timestamp=None):
        self.frameNo = frameNo
        self.count = count
        self.timestamp = timestamp

class VideoCap:
    def __init__(self, cap, cap_name, is_cam):
        ''' Initialize the captured frames in videos '''
        self.cap = cap
        self.cap_name = cap_name
        self.is_cam = is_cam
        self.cur_frame = {}
        self.initial_w = 0
        self.initial_h = 0
        self.frames = 0
        self.cur_frame_count = 0
        self.total_count = 0
        self.last_correct_count = 0
        self.candidate_count = 0
        self.candidate_confidence = 0
        self.closed = False
        self.countAtFrame = []
        self.video = None
        self.rate = 0
        self.start_time = {}
        
        if not is_cam:
            self.fps = self.cap.get(cv2.CAP_PROP_FPS)
            self.length = self.cap.get(cv2.CAP_PROP_FRAME_COUNT)
        else:
            self.fps = 0
        
        self.videoName = cap_name + ".mp4"

    def init_vw(self, h, w, fps):
        self.video = cv2.VideoWriter(os.path.join(output_dir, self.videoName), cv2.VideoWriter_fourcc(*"avc1"), fps, (w, h), True) 
        if not self.video.isOpened():
            print ("Could not open for write" + self.videoName)
            sys.exit(1)

1.3.3.7 Functions

Functions for custom_telem_enable.py:

The functions below parse the arguments passed from the bash file (section 1.3.6.1) to custom_telem_enable.py and store them in global variables, arrange the windows of video frames, parse conf.txt, and save the images or videos into an array.

def env_parser():
    ''' Parse env values and store to global '''
    global TARGET_DEVICE, numVids, LOOP_VIDEO
    if 'DEVICE' in os.environ:
        TARGET_DEVICE = os.environ['DEVICE']

    if 'LOOP' in os.environ:
        lp = os.environ['LOOP']
        if lp == "true":
            LOOP_VIDEO = True
        if lp == "false":
            LOOP_VIDEO = False

    if 'NUM_VIDEOS' in os.environ:
        numVids = int(os.environ['NUM_VIDEOS'])

def args_parser():
    ''' Parse arguments from the bash file and store to globals '''
    parser = ArgumentParser()
    parser.add_argument("-d", "--device",
                        help="Specify the target device to infer on; CPU, GPU or MYRIAD is acceptable. Application "
                             "will look for a suitable plugin for device specified (CPU by default)", type=str)
    parser.add_argument("-m", "--model", help="Path to an .xml file with a trained model's weights.", required=True, type=str)
    parser.add_argument("-e", "--cpu_extension",
                        help="MKLDNN (CPU)-targeted custom layers.Absolute path to a shared library with the kernels "
                             "impl.", type=str, default=None)
    parser.add_argument("-lp", "--loop", help = "Loops video to mimic continous input", type = str, default = None)
    parser.add_argument("-c", "--config_file", help = "Path to config file", type = str, default = None)
    parser.add_argument("-n", "--num_videos", help = "Number of videos to process", type = int, default = None)
    parser.add_argument("-nr", "--num_requests", help = "Number of inference requests running in parallel", type = int, default = None)
    parser.add_argument("-o", "--output_dir", help = "Path to output directory", type = str, default = None)
    parser.add_argument("-inp", "--input_type", help = "Input Type either Video or Images", type = str, default = None)

    global model_xml, model_bin, device, CPU_EXTENSION, LOOP_VIDEO, config_file, num_videos, output_dir, num_infer_requests,input_type

    args = parser.parse_args()
    if args.model:
        model_xml = args.model
        model_bin = os.path.splitext(model_xml)[0] + ".bin"
    if args.device:
        device = args.device
    if args.cpu_extension:
        CPU_EXTENSION = args.cpu_extension
    if args.loop:
        lp = args.loop
        if lp == "true":
            LOOP_VIDEO = True
        if lp == "false":
            LOOP_VIDEO = False
    if args.config_file:
        config_file = args.config_file
    if args.num_videos:
        num_videos = args.num_videos
    if args.num_requests:
        num_infer_requests = args.num_requests
    if args.output_dir:
        output_dir = args.output_dir
    if args.input_type:
        input_type = args.input_type


def parse_conf_file(job_id):
    """ Parses the configuration file. Reads videoCaps and images and stored to an array """
    with open(config_file, 'r') as f:
        cnt = 0
        for idx, item in enumerate(f.read().splitlines()):
            # for input type video, save videos to array
            if input_type == 'V':
                if cnt < num_videos:
                    split = item.split()
                    if split[0].isdigit():
                        videoCap = VideoCap(cv2.VideoCapture(int(split[0])), CAM_WINDOW_NAME_TEMPLATE.format(job_id, idx), True)
                    else:
                        if os.path.isfile(split[0]) :
                            videoCap = VideoCap(cv2.VideoCapture(split[0]), CAM_WINDOW_NAME_TEMPLATE.format(job_id, idx), False)

                        else:
                            print ("Couldn't find " + split[0])
                            sys.exit(3)
                    videoCaps.append(videoCap)
                    cnt += 1
                else:
                    break
            # for input type image, retrieve all images in folder and save to list
            elif(input_type == 'I'):
                split = item.split()
                global image_files
                image_files = glob.glob(os.path.join(split[0], "*","*.jpg"))
                image_files.extend(glob.glob(os.path.join(split[0], "*", "*.JPG")))
            else:
                print("Input type not compatabile")

    for vc in videoCaps:
        if not vc.cap.isOpened():
            print ("Could not open for reading " + vc.cap_name)
            sys.exit(2)

def arrange_windows(width, height):
    """ Arranges the windows for videos so they are not overlapping. Also starts the display threads """
    spacer = 25
    cols = 0
    rows = 0

    # Arrange video windows
    for idx in range(len(videoCaps)):
        if(cols == WINDOW_COLUMNS):
            cols = 0
            rows += 1
        cv2.namedWindow(CAM_WINDOW_NAME_TEMPLATE.format("", idx), cv2.WINDOW_AUTOSIZE)
        cv2.moveWindow(CAM_WINDOW_NAME_TEMPLATE.format("", idx), (spacer + width) * cols, (spacer + height) * rows)
        cols += 1

    # Arrange statistics window
    if(cols == WINDOW_COLUMNS):
        cols = 0
        rows += 1
    cv2.namedWindow(STATS_WINDOW_NAME, cv2.WINDOW_AUTOSIZE)
    cv2.moveWindow(STATS_WINDOW_NAME, (spacer + width) * cols, (spacer + height) * rows)

1.3.4 Inference Engine

We create an Inference Engine instance (IECore) to run inferencing on the model, then read the model network from the Intermediate Representation (IR) files. Once read, the network is loaded into the plugin, which creates an executable network used during inferencing. The input and output blobs of the model are stored for later use. Lastly, the model information, including the input batch size, number of input channels, and input height and width, is stored in the variables n, c, h, w, respectively. For the complete API reference, see the Inference Engine Python* API Reference.

Main function for custom_telem_enable.py, which creates the Inference Engine instance, determines the model blobs, and loads the model into the Inference Engine.

def main():
    # Plugin initialization for specified device and load extensions library
    global rolling_log, job_id
    job_id = os.environ['PBS_JOBID']
    # Args Parser functions
    env_parser()
    args_parser()
    parse_conf_file(job_id)
    
    # create Inference Engine instance
    ie = IECore()
    if CPU_EXTENSION and 'CPU' in device:
        ie.add_extension(CPU_EXTENSION, "CPU")

    # Read IR files
    print("Reading IR...")
    net = ie.read_network(model=model_xml, weights=model_bin)
    # store name of input and output blobs
    assert (len(net.input_info.keys()) == 1 or len(net.input_info.keys()) == 2),    "Sample supports topologies only with 1 or 2 inputs"
    for blob_name in net.input_info:
        if len(net.input_info[blob_name].input_data.shape) == 4:
            input_blob = blob_name
        elif len(net.input_info[blob_name].input_data.shape) == 2:
            img_info_input_blob = blob_name
        else:
            print("topology length not accepted")

    input_blob = next(iter(net.inputs))
    out_blob = next(iter(net.outputs))

    # load the model into the Inference Engine for our device
    print("Loading IR to the plugin...")
    exec_net = ie.load_network(network=net, num_requests=num_infer_requests, device_name=device)
   
    # Input type of video or Image
    if input_type == 'V':
        isVideoInput(net,exec_net, input_blob, out_blob)
    else:
        isImageInput(net,exec_net, input_blob, out_blob)  
if __name__ == '__main__':
    sys.exit(main() or 0)

1.3.5 Run Telemetry Metrics

Now it’s time to run telemetry metrics on your model.

1.3.5.1 OpenVINO Metrics Writer

A Python package for collecting the OpenVINO application metrics is preinstalled in the Intel DevCloud for the Edge environment. You will need to import applicationMetricWriter and invoke two functions that capture information about the model as well as the inference times. The functions are called as follows:

import applicationMetricWriter
applicationMetricWriter.send_inference_time(milliseconds)
applicationMetricWriter.send_application_metrics(model_xml, device)

send_inference_time requires the time it takes to run inference on one video frame/image, in milliseconds, and should be called for every frame/image of the test input. send_application_metrics requires the Inference Engine compatible .xml model schema and the device on which the job is run (i.e. CPU, GPU, etc.). With both functions called correctly, the needed data is sent to the telemetry Grafana dashboard.
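
Putting the two calls together, a minimal synchronous loop that times each inference and reports it could look like the sketch below. Here exec_net, input_blob, model_xml, and device stand in for the objects and variables created elsewhere in this application, and preprocessed_inputs is a hypothetical iterable of prepared frames or images:

import time
import applicationMetricWriter

for in_frame in preprocessed_inputs:                 # placeholder iterable of NCHW input arrays
    start = time.time()
    exec_net.infer(inputs={input_blob: in_frame})    # synchronous inference on one frame/image
    # send_inference_time() expects milliseconds and is called once per frame/image
    applicationMetricWriter.send_inference_time((time.time() - start) * 1000)

# send_application_metrics() is called once, after all inferences are complete
applicationMetricWriter.send_application_metrics(model_xml, device)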

1.3.5.2 Video Input

If the data to test are videos, we loop through each frame of the videos in order to send the inference time to the dashboard. We use the loaded model and submit frames to the Inference Engine one by one, recording the inference time for each. Once inferencing is complete, we call the application metrics writer to send the model data to the telemetry dashboard.

Video input function for custom_telem_enable.py:

def isVideoInput(net,exec_net, input_blob, out_blob):
    ''' Function that runs for video input in conf file. '''
    
    # read the input's dimensions: n=batch size, c=number of channels, h=height, w=width
    n, c, h, w = net.input_info[input_blob].input_data.shape
    
    # retrieve minimun frames per second and length of videos
    minFPS = min([i.cap.get(cv2.CAP_PROP_FPS) for i in videoCaps])
    minlength = min([i.cap.get(cv2.CAP_PROP_FRAME_COUNT) for i in videoCaps])
    
    # capture the rate for all videos
    for vc in videoCaps:
        vc.rate = int(math.ceil(vc.length/minlength))
    waitTime = int(round(1000 / minFPS / len(videoCaps))) # wait time in ms between showing frames
    
    # open videos to write stats
    frames_sum = 0
    for vc in videoCaps:
        vc.init_vw(h, w, minFPS)
        frames_sum += vc.length
    statsWidth = w if w > 345 else 345
    statsHeight = h if h > (len(videoCaps) * 20 + 15) else (len(videoCaps) * 20 + 15)
    statsVideo = cv2.VideoWriter(os.path.join(output_dir,f'Statistics_{job_id}.mp4'), cv2.VideoWriter_fourcc(*"avc1"), minFPS, (statsWidth, statsHeight), True)    
    if not statsVideo.isOpened():
        print ("Couldn't open stats video for writing")
        sys.exit(4)

    # Init a rolling log to store events
    rolling_log_size = int((h - 15) / 20)
    rolling_log = collections.deque(maxlen=rolling_log_size)

    # Start with async mode enabled
    is_async_mode = True
    no_more_data = False
    # frames submitted to inference engine
    frame_count = 0
    progress_file_path = os.path.join(output_dir, f'i_progress_{job_id}.txt')
    infer_start_time = time.time()
    current_inference = 0
    previous_inference = 1 - num_infer_requests
    videoCapResult = {}
    infer_requests = exec_net.requests
    frame_count = 0
    
    f_proc = 0
    
#Start while loop
    while True:
        # If all video captures are closed stop the loop
        if False not in [videoCap.closed for videoCap in videoCaps]:
            print("All videos completed")
            no_more_data = True
            break
   
        no_more_data = False
        # loop over all video captures
        for idx, videoCapInfer in enumerate(videoCaps):
            # read the next frame
            if not videoCapInfer.closed:
                 vfps = int(round(videoCapInfer.cap.get(cv2.CAP_PROP_FPS)))
                 for i in range(videoCapInfer.rate):
                     ret, frame = videoCapInfer.cap.read()
                     videoCapInfer.cur_frame_count += 1
                     # If the read failed close the program
                     if not ret:
                         videoCapInfer.closed = True
                         break
                     frame_count += 1
                     f_proc += 1

                     if videoCapInfer.closed:
                         print("Video {0} is done".format(idx))
                         print("Video has  {0} frames ".format(videoCapInfer.length))
                         break

                     # Copy the current frame for later use
                     videoCapInfer.cur_frame[current_inference] = frame.copy()
                     videoCapInfer.initial_w = int(videoCapInfer.cap.get(3))
                     videoCapInfer.initial_h = int(videoCapInfer.cap.get(4))
                     # Resize and change the data layout so it is compatible
                     in_frame = cv2.resize(frame, (w, h))
                     in_frame = in_frame.transpose((2, 0, 1))  # Change data layout from HWC to CHW
                     in_frame = in_frame.reshape((n, c, h, w))

                     inf_start = time.time()
                     if is_async_mode:
                         exec_net.start_async(request_id=current_inference, inputs={input_blob: in_frame})
                         # Async enabled and only one video capture
                         if(len(videoCaps) == 1):
                             videoCapResult = videoCapInfer
                             videoCapInfer.start_time[0]= time.time()
                         # Async enabled and more than one video capture
                         else:
                             # Get previous index
                             videoCapResult[current_inference] = videoCapInfer
                             videoCapInfer.start_time[current_inference] = time.time()
                     else:
                         # Async disabled
                         exec_net.start_async(request_id=current_inference, inputs={input_blob: in_frame})
                         videoCapResult = videoCapInfer
                         
                     if previous_inference >= 0:
                        # Number of videos is 1
                        if(num_videos==1):
                            
                            status = exec_net.requests[previous_inference].wait(-1)
                            if status != 0:
                                raise Exception("Infer request not completed successfully")
                            # Parse inference results
                            vidcap = videoCapResult
                            current_count=0
                            det_time = time.time() - vidcap.start_time[0]
                            # Call OpenVINO Metrics Writer to send inference times 
                            applicationMetricWriter.send_inference_time(det_time*1000)                      
                            res = exec_net.requests[previous_inference].outputs[out_blob]         
                        else:
                            # More than 1 video
                            status = exec_net.requests[previous_inference].wait(-1)
                            if status == 0:
                                res = exec_net.requests[previous_inference].outputs[out_blob]
                                vidcap = videoCapResult[previous_inference]
                                res_frame = vidcap.cur_frame[previous_inference]
                                end_time = time.time()
                                current_count = 0
                                infer_duration = end_time - vidcap.start_time[previous_inference]
                                # Call OpenVINO Metrics Writer to send inference times 
                                applicationMetricWriter.send_inference_time(infer_duration*1000)                        
                                
                                res_frame = cv2.resize(res_frame, (w, h))
          
                        vidcap.frames+=1
                     # Progress Tracker with time and frame information used for slide bar
                     if frame_count%10 == 0: 
                         progressUpdate(progress_file_path, time.time()-infer_start_time, frame_count, frames_sum) 
                     current_inference += 1
                     if current_inference >= num_infer_requests:
                         current_inference = 0

                     previous_inference += 1
                     if previous_inference >= num_infer_requests:
                         previous_inference = 0

            # Loop video if LOOP_VIDEO = True and input isn't live from USB camera
            if LOOP_VIDEO and not videoCapInfer.is_cam:
                vfps = int(round(videoCapInfer.cap.get(cv2.CAP_PROP_FPS)))
                # If a video capture has ended restart it
                if (videoCapInfer.cur_frame_count > videoCapInfer.cap.get(cv2.CAP_PROP_FRAME_COUNT) - int(round(vfps / minFPS))):
                    videoCapInfer.cur_frame_count = 0
                    videoCapInfer.cap.set(cv2.CAP_PROP_POS_FRAMES, 0)
        
            if no_more_data:
                break
#End of while loop--------------------
    progressUpdate(progress_file_path, time.time()-infer_start_time, frames_sum, frames_sum) 
    t2 = time.time()-infer_start_time
    print(f"total processed frames = {f_proc}")
    for videos in videoCaps:
        print(videos.closed)
        print("Frames processed {}".format(videos.cur_frame_count))
        print("Frames count {}".format(videos.length))
        videos.video.release()
        videos.cap.release()

    print("End loop")
    print("Total time {0}".format(t2))
    print("Total frame count {0}".format(frame_count))
    print("fps {0}".format(frame_count/t2))
    with open(os.path.join(output_dir, f'stats_{job_id}.txt'), 'w') as f:
        f.write('{} \n'.format(round(t2)))
        f.write('{} \n'.format(f_proc))
    # Call OpenVINO Metrics Writer for model info
    applicationMetricWriter.send_application_metrics(model_xml, device)

1.3.5.3 Image Input

If the data to test are images, we loop through each image and send the time it takes to perform inferencing on it. Once inferencing is complete, we call the application metrics writer to send the model data to the telemetry dashboard.

Image input function for custom_telem_enable.py:

def isImageInput(net,exec_net, input_blob, out_blob):
    ''' Function that runs for image input in conf file '''
    infer_file = os.path.join(output_dir,'i_progress_'+str(job_id)+'.txt')
    infer_time = []
    correct = 0; error = 0
    t0 = time.time()
    # read the input's dimensions: n=batch size, c=number of channels, h=height, w=width
    n, c, h, w = net.input_info[input_blob].input_data.shape
       
    for i in range(0, len(image_files), n):
        images = numpy.ndarray(shape=(n, c, h, w))
        for j in range(n):
            input = image_files[i + j]
            image = cv2.imread(input, 0) # Read image as greyscale
            if image.shape[-2:] != (h, w):
                image = cv2.resize(image, (w, h))

            # Normalize to keep data between 0 - 1
            image = (numpy.array(image) - 0) / 255.0
            # Change data layout from HWC to CHW
            image = image.reshape((1, 1, h, w))    
            images[j] = image
        try:
            t0P = time.time()
            result = exec_net.infer(inputs={input_blob: images})
            infer_time.append((time.time()-t0P)*1000)
            
            # Call OpenVINO Metrics Writer to send inference times 
            applicationMetricWriter.send_inference_time((time.time()-t0P)*1000)
            
            # Progress Tracker with time and frame information used for slide bar
            if i%10 == 0: 
                progressUpdate(infer_file, time.time()-t0, i+1, len(image_files))
                
            result = result[out_blob]
        except Exception as e:
            print("Exception Occurred when trying to run inference")
    # Call OpenVINO Metrics Writer for model info
    applicationMetricWriter.send_application_metrics(model_xml, device)

1.3.6 Submit Job

1.3.6.1 Create Bash file

We run inference on several different edge compute nodes present in the Intel DevCloud for the Edge. Work is sent to these nodes by submitting the corresponding non-interactive jobs into a queue. For each job, we will specify the type of the edge compute server that must be allocated for the job.

The job file is a Bash script that serves as a wrapper around the Python executable of our application that will be executed directly on the edge compute node. One purpose of the job file is to simplify running an application on different compute nodes.

The job file will be submitted as if it were run from the command line using the following format:

custom_telem_enable.sh <output_directory> <device> <fp_precision> <num_videos> <num_requests> <input_type>

Where the job file input arguments are (an example invocation follows this list):

  • <output_directory> – Output directory to use to store output files
  • <device> – Hardware device to use (e.g. CPU, GPU, etc.)
  • <fp_precision> – Which floating point precision inference model to use (FP32 or FP16)
  • <num_videos> – Number of input videos to process from the configuration file
  • <num_requests> – Number of inference requests to run in parallel
  • <input_type> – Indicates the input type as Image or Video (I or V respectively)
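
For example, matching the qsub calls in section 1.3.6.3, a submission for the CPU with FP16 precision, one video, two inference requests, and video input would pass the arguments in this order:

custom_telem_enable.sh results/core CPU FP16 1 2 V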

Based on the input arguments, the job file will do the following:

  • Change to the working directory PBS_O_WORKDIR where this Jupyter* Notebook and other files appear on the compute node
  • Create the <output_directory>
  • Choose the appropriate inference model IR file for the specified <fp_precision>
  • Run the application Python executable with the appropriate command line arguments

The following is the custom_telem_enable.sh which is created in the Jupyter* notebook custom_model_telemetry.ipynb.

%%writefile custom_telem_enable.sh
# Store input arguments:      
OUTPUT_FILE=$1
DEVICE=$2
FP_MODEL=$3
NUM_VIDEOS=$4
NUM_REQ=$5
INPUT_TYPE=$6

# The default path for the job is the user's home directory,
cd $PBS_O_WORKDIR

# Make sure that the output directory exists.
mkdir -p $OUTPUT_FILE

# Check for special setup steps depending upon device to be used
if [ "$DEVICE" = "HETERO:FPGA,CPU" ]; then
    # Environment variables and compilation for edge compute nodes with FPGAs - Updated for OpenVINO 2020.3
    export AOCL_BOARD_PACKAGE_ROOT=/opt/intel/openvino/bitstreams/a10_vision_design_sg2_bitstreams/BSP/a10_1150_sg2
    source /opt/altera/aocl-pro-rte/aclrte-linux64/init_opencl.sh
    aocl program acl0 /opt/intel/openvino/bitstreams/a10_vision_design_sg2_bitstreams/2020-3_PL2_FP16_MobileNet_Clamp.aocx
    export CL_CONTEXT_COMPILER_MODE_INTELFPGA=3
fi
# Set inference model IR files using specified precision
#insert  path to your XML file below
MODELPATH=models/mobilenet-ssd/FP16/mobilenet-ssd.xml
#Run the custom telem code
python3 custom_telem_enable.py -d $DEVICE \
                               -m $MODELPATH \
                               -o $OUTPUT_FILE \
                               -nr $NUM_REQ \
                               -c conf.txt \
                               -n $NUM_VIDEOS \
                               -inp $INPUT_TYPE

echo "job submitted"

1.3.6.2 Job Request

Now that we have the job script, we can submit jobs to edge compute nodes in the Intel DevCloud for the Edge. To submit a job, the qsub command is used with the following format:

qsub <job_file> -N <JobName> -l <nodes> -F "<job_file_arguments>"

We pass the job file plus three options of the qsub command to send a job to different compute nodes:

  • <job_file> – This is the job file we created in the previous step
  • -N <JobName> – Sets a name specific to the job
  • -l <nodes> – Specifies the number and the type of nodes using the format nodes=<node_count>:<property>[:<property>…]
  • -F "<job_file_arguments>" – String containing the input arguments described in the previous step to use when running the job file

To see the available types of nodes on the Intel DevCloud for the Edge, run the following in the Jupyter* notebook custom_model_telemetry.ipynb:

!pbsnodes | grep compnode | awk '{print $3}' | sort | uniq -c

1.3.6.3 Submit Job to an edge compute node (with target hardware)

The following sends a job to a compute node; the output is the JobID for the submitted job, which can be used to track the status of the job. Feel free to run on the different hardware listed by the pbsnodes command in the section above (change the nodes= portion of the code).

Run the following in the Jupyter* notebook custom_model_telemetry.ipynb:

#Submit job to the queue for Images
job_id_core =!qsub custom_telem_enable.sh -l nodes=1:idc001skl -F "results/core CPU FP16 1 {NumRequests_CPU} I" -N custom_telem

# Submit job to the queue for Videos
job_id_core =!qsub custom_telem_enable.sh -l nodes=1:idc001skl -F "results/core CPU FP16 1 {NumRequests_CPU} V" -N custom_telem

print(job_id_core[0])
#Progress indicators
if job_id_core:
    progressIndicator('results/core', f'i_progress_{job_id_core[0]}.txt', "Inference", 0, 100)
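
If you prefer to wait for the job to finish before opening the dashboard, one possible approach (a sketch only, assuming the standard PBS qstat command available on DevCloud) is to poll the queue from the notebook:

import subprocess
import time

def wait_for_job(job_id):
    ''' Poll qstat until the submitted job no longer appears in the queue '''
    short_id = job_id.split('.')[0]
    while True:
        result = subprocess.run(['qstat', job_id], capture_output=True, text=True)
        if short_id not in result.stdout:   # job has left the queue (finished or aborted)
            print(f"Job {job_id} is no longer in the queue")
            break
        time.sleep(10)                      # check again in 10 seconds

wait_for_job(job_id_core[0])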

1.4 Display Telemetry Dashboard

Once your submitted jobs are completed, run the code below to view telemetry dashboards containing performance metrics for your model and target hardware.

The following is located in the Jupyter* notebook custom_model_telemetry.ipynb.

link_t = " Click here to view telemetry dashboard of the last job ran on Intel Core i5-6500TE"
result_file = "https://devcloud.intel.com/edge/metrics/d/" + job_id_core[0].split('.')[0]
html = HTML(link_t.format(href=result_file))
display(html) 

A link will be generated that takes you to the Grafana dashboard. Using this sample application, you are now able to display the telemetry metrics for a custom model.

1.5 More Information

For more information take a look at the following:


Daya Kulkarni is a software development engineer with the Intel DevCloud team enabling telemetry for edge software solutions. She holds a Master of Science in Computer Science from Arizona State University.





