Chapter 5. XGen Results
Location of Outputs
Every time XGen runs, it saves its outputs into a folder in the workplace. If the user does not designate a workplace path, either through the program's interface or the configuration file, XGen saves the outputs to its default workplace path, which is a folder under XGen's Output folder: inside the XGen container, this folder is ~/output, which usually corresponds to Output/ on the host. Once the execution completes, XGen prints the output path on the screen.
Note:
- By default, XGen creates a new folder in the workplace named with the model name and the current timestamp (in YYYYMMDDHHMMSS format). For example, the default workplace folder for running a model named DigitNet_MNIST on August 15, 2023 at 3:32:25 AM would be located at ~/output/DigitNet_MNIST/20230815033225
- In compatibility testing mode, the default workplace folder is located in /tmp and is automatically deleted after the test is completed.
Outputs
The essential outputs of XGen are as follows:
- summarized_data.csv: the summary file, which contains the meta information of every run that shares the same configuration, including the performance of the models found in those runs. The essential columns of the file are listed below; some columns (e.g., power, memory size) in the file are reserved fields for future use:
  - Onnx_path: the produced AI model in ONNX format
  - pth_path: the produced AI model parameters (i.e., weights) in PyTorch format
  - quality score: the quality score of the produced AI model
  - model_size: the total number of parameters in the produced AI model
  - flops: the total number of floating-point operations (FLOPs) in one inference of the produced AI model
  - prune_rates: the value of the parameter prune_rate in the configuration
  - global_sparsity: the average sparsity (percentage of zero weights) of the entire network
  - global_sparsity_prune_layer: the average sparsity of the pruned layers
  - device: the IDs of the selected devices
  - compiled_file_path: the path that holds the final AI model and code
  - latency: the latency of the model on the slowest selected device
- xgen-training.log: the log file of the run. When filing a bug report, the user should include this log file in the report.
- a *.onnx file: the produced AI model in ONNX format
- a *.pth file: the produced AI model parameters (i.e., weights) in PyTorch format
- *compiled_files folder: a folder containing the final model and code files, ready to be integrated into mobile apps. This folder is produced only if at least one device was selected in the run; if XGen does not know the target device, it does not compile the models. Hint: users may call XGen later to compile a previously generated model. See Using the Compiler and Testing Components Only.
After using XGen, the contents of the *compiled_files folder are all that users need to integrate the AI model into their apps. The folder's contents depend on the backend mode used in XGen.
For Android devices, XGen's backend includes two modes of code generation: DeepOpt and Fallback. DeepOpt conducts deeper optimizations and often produces better results than Fallback, but it does not yet support some operators that Fallback does. (Appendix I lists the operators supported by each mode.) XGen tries both modes and automatically gives the better one to the user as the final output, without user involvement. The outputs of the two modes consist of different sets of files, but they are used in the same way in an app. This chapter first explains the outputs of each mode, and then explains how to use those outputs in an app. For iOS devices, there is only one mode of code generation, which is the same as the Fallback mode for Android.
DeepOpt Output for Android
The output of DeepOpt consists of six files in the output directory. The directory structure is as follows:
output folder/android
├── include/xgen.h # XGen model deployment interface
├── include/xgen_data.h # DNN model weights file
├── include/xgen_pb.h # DNN model structure file
├── lib/libxgen.so # XGen library
├── model/*.pb # DNN model structure file
└── model/*.data # DNN model weights file
Fallback Output for Android
The output of Fallback consists of three files in the output directory. The directory structure is as follows:
output folder/android
├── include/xgen.h # XGen model deployment interface
├── lib/libxgen.so # XGen library
└── model/*.fallback # DNN model file (including both structure and weights)
Output for iOS
The output for iOS consists of five files in the output directory. The directory structure is as follows:
output folder/ios
├── benchmark_data
│ ├── benchmark_params.json # Configuration file for benchmark settings
│ └── *.fallback # DNN model file (including both structure and weights)
├── benchmark_data.zip # Compressed file for folder benchmark_data
├── benchmark_output.json # Output from the benchmark execution
└── xgen.framework # XGen iOS library
Usage of the Results
The code generated by XGen can be easily integrated into Android or iOS apps. Integration can be static, where the AI model becomes part of the app package, or dynamic, where the AI model is loaded into the app during execution. The latter approach helps minimize the static size of the app package.
Both integration methods are demonstrated in a demo app for both Android and iOS:
- Android: Click on this link to see the code for the demo app. The localDynamicDeepOpt_XGen-Auto branch of the git repository demonstrates the static approach using the DeepOpt outputs, while the remoteDynamicDeepOpt branch demonstrates the dynamic approach.
- iOS (static and dynamic integration): Click on this link to see the code for the demo app. The main branch of the git repository demonstrates the static approach, while the download-model branch demonstrates the dynamic approach.
The following section introduces the complete API for using the output of XGen. The XGen APIs can be found in the xgen.h file, which contains detailed descriptions. Below is a complete list of all the APIs.
XGEN_EXPORT XGenHandle *XGenInit(const void *model_data, size_t model_size);
XGEN_EXPORT XGenHandle *XGenInitWithData(const void *model_data,
size_t model_size_in_bytes,
const void *extra_data,
size_t data_size_in_bytes);
XGEN_EXPORT XGenHandle *XGenInitWithPower(const void *model_data,
size_t model_size_in_bytes,
const void *extra_data,
size_t data_size_in_bytes,
XGenPowerPolicy policy);
XGEN_EXPORT XGenHandle *XGenInitWithFiles(const char *model_file,
const char *data_file,
XGenPowerPolicy policy = XGenPowerDefault);
XGEN_EXPORT XGenStatus XGenRun(XGenHandle *handle);
XGEN_EXPORT void XGenShutdown(XGenHandle *handle);
XGEN_EXPORT size_t XGenGetNumInputTensors(const XGenHandle *handle);
XGEN_EXPORT size_t XGenGetNumOutputTensors(const XGenHandle *handle);
XGEN_EXPORT XGenTensor *XGenGetInputTensor(XGenHandle *handle,
size_t tensor_index);
XGEN_EXPORT XGenTensor *XGenGetOutputTensor(XGenHandle *handle,
size_t tensor_index);
XGEN_EXPORT void XGenCopyBufferToTensor(XGenTensor *input_tensor,
const void *input_data,
size_t input_size_in_bytes);
XGEN_EXPORT void XGenCopyTensorToBuffer(const XGenTensor *output_tensor,
void *output_data,
size_t output_size_in_bytes);
XGEN_EXPORT XGenType XGenGetTensorType(const XGenTensor *tensor);
XGEN_EXPORT int32_t XGenGetTensorNumDims(const XGenTensor *tensor);
XGEN_EXPORT int32_t XGenGetTensorDim(const XGenTensor *tensor,
int32_t dim_index);
XGEN_EXPORT size_t XGenGetTensorSizeInBytes(const XGenTensor *tensor);
XGEN_EXPORT void *XGenGetTensorData(const XGenTensor *tensor);
XGEN_EXPORT const char *XGenGetTensorName(const XGenTensor *tensor);
Use in an Android application
There are five steps to follow when using an XGen-generated AI model in an Android application:
- Initialization: Use one of the XGenInit functions to initialize the model.
- Copy Input Data: Use XGenCopyBufferToTensor to copy the input data into the input tensor.
- Perform Prediction: Run the model by calling XGenRun.
- Copy Output Data: Once the prediction is complete, use XGenCopyTensorToBuffer to copy the output data from the output tensor.
- Release Resources: Finally, release all resources by calling XGenShutdown.
Additionally, a set of APIs is available to query various tensor attributes, such as the tensor name, type, and size, as illustrated in the sketch below.
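The following is a minimal sketch of how these query APIs can be combined. It assumes a valid XGenHandle *h obtained from one of the XGenInit functions; PrintInputTensorInfo is a hypothetical helper name, not part of the XGen API.

#include <cstdio>
#include <cstdint>
#include "xgen.h"

// Print the name, type, byte size, and shape of every input tensor.
void PrintInputTensorInfo(XGenHandle *h) {
  size_t num_inputs = XGenGetNumInputTensors(h);
  for (size_t i = 0; i < num_inputs; ++i) {
    XGenTensor *t = XGenGetInputTensor(h, i);
    printf("input %zu: name=%s type=%d bytes=%zu dims=[",
           i, XGenGetTensorName(t), (int)XGenGetTensorType(t),
           XGenGetTensorSizeInBytes(t));
    int32_t num_dims = XGenGetTensorNumDims(t);
    for (int32_t d = 0; d < num_dims; ++d) {
      printf(d > 0 ? ",%d" : "%d", XGenGetTensorDim(t, d));
    }
    printf("]\n");
  }
}

The same pattern works for output tensors via XGenGetNumOutputTensors and XGenGetOutputTensor.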
Using the output of DeepOpt
If the XGen output directory (i.e., the *compiled_files folder mentioned in the XGen outputs section) includes xgen_pb.h and xgen_data.h, the model was generated in the DeepOpt mode. Follow the example below or the demo app link to deploy it.
Suppose that example_pb and example_pb_len are the first and second variable names in xgen_pb.h, and example_data and example_data_len are the first and second variable names in xgen_data.h.
#include "xgen_data.h"
#include "xgen_pb.h"
#include "xgen.h"
int main() {
//Initialization
XGenHandle *h = XGenInitWithData(example_pb,
example_pb_len,
example_data,
example_data_len);
size_t num_input_tensors = XGenGetNumInputTensors(h);
for (size_t i = 0; i < num_input_tensors; ++i) {
XGenTensor *t = XGenGetInputTensor(h, i);
size_t input_size = XGenGetTensorSizeInBytes(t);
float *buffer = ... // initialize buffer with your input data
XGenCopyBufferToTensor(t, buffer, input_size);//copy data
}
//Run inference
XGenRun(h);
size_t num_output_tensors = XGenGetNumOutputTensors(h);
float **buffer = new float*[num_output_tensors];
for (size_t i = 0; i < num_output_tensors; ++i) {
XGenTensor *t = XGenGetOutputTensor(h, i);
size_t output_size = XGenGetTensorSizeInBytes(t);
buffer[i] = new float[(output_size + sizeof(float) - 1) / sizeof(float)];
XGenCopyTensorToBuffer(t, buffer[i], output_size);//copy output data
}
// Use model output here.
//Release resources
XGenShutdown(h);
return 0;
}
XGenInitWithPower can be used to fine-tune power usage for a better power-speed trade-off. The power policy can be set to one of four predefined settings, as follows:
typedef enum XGenPowerPolicy {
/* No power policy. Use whatever system default configuration is. */
XGenPowerNone = 0,
/* XGen default power policy is used with XGenInitWithData */
XGenPowerDefault = 1,
/* Performance policy uses less power than the default but still performs
well. */
XGenPowerPerformance = 2,
/* Power saving policy uses even less power than performance policy. */
XGenPowerSave = 3,
} XGenPowerPolicy;
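For example, reusing the example_pb, example_pb_len, example_data, and example_data_len variables from the DeepOpt example above, a sketch of an initialization that favors power saving looks like this:

// Initialize XGen with the power-saving policy to trade some speed for
// lower power consumption.
XGenHandle *h = XGenInitWithPower(example_pb,
                                  example_pb_len,
                                  example_data,
                                  example_data_len,
                                  XGenPowerSave);

The rest of the deployment steps (copying inputs, running inference, copying outputs, shutting down) are unchanged.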
In addition to using XGenInitWithData as in the example above, you can also initialize XGen by using XGenInitWithFiles, which takes in *.pb and *.data files, as follows:
XGenHandle *h = XGenInitWithFiles("wdsr.pb", "wdsr.data");
Using the output of Fallback
The usage of AI models in the Fallback mode is the same as in the DeepOpt mode, except that you should use XGenInitWithFiles for initialization instead of XGenInitWithData. Use the following example, where example_model_file is the path of the *.fallback file located in its model folder:
XGenHandle *h = XGenInitWithFiles(example_model_file, NULL);
Use in an iOS application
Using the output of XGen in an iOS application is similar to using the Fallback output in an Android application, except that the model file *.fallback is located in the benchmark_data folder. Users can refer to this link for the code of a demo app.
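As a minimal sketch (model_path here is a hypothetical variable that the app resolves at run time to the filesystem path of the *.fallback file shipped in benchmark_data, e.g., from the app bundle), the initialization mirrors the Android Fallback case:

// model_path: path to the *.fallback file from the benchmark_data folder,
// resolved by the app at run time (hypothetical variable).
XGenHandle *h = XGenInitWithFiles(model_path, NULL);

All subsequent steps (copying inputs, running inference, copying outputs, shutting down) follow the same API calls shown in the Android examples above.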