Chapter 5. XGen Results
Location of Outputs
Every time XGen runs, it saves its outputs into a folder in the workplace. If the user does not designate a workplace path, either through the program's interface or the configuration file, XGen saves the outputs to its default workplace path, which is a folder under XGen's Output folder: inside the XGen container, this folder is ~/output, which usually corresponds to Output/ on the host. Once the execution completes, XGen prints the output path on the screen.
Note:
- By default, XGen creates a new folder in the workplace named with the model name and the current timestamp (in YYYYMMDDHHMMSS format). For example, the default workplace folder for running a model named DigitNet_MNIST on August 15, 2023 at 3:32:25 AM would be located at ~/output/DigitNet_MNIST/20230815033225
- In compatibility testing mode, the default workplace folder is located in /tmp and is automatically deleted after the test is completed.
Outputs
The essential outputs of XGen are as follows:
- summarized_data.csv: the summary file, which contains the meta information of every run that shares the same configuration, including the performance of the models found in those runs. The essential columns of the file are listed below; some columns (e.g., power, memory size) in the file are reserved fields for future use:
  - Onnx_path: the produced AI model in ONNX format
  - pth_path: the produced AI model parameters (i.e., weights) in PyTorch format
  - quality score: the quality score of the produced AI model
  - model_size: the total number of parameters in the produced AI model
  - flops: the total number of floating-point operations (FLOPs) in one inference of the produced AI model
  - prune_rates: the value of the parameter prune_rate in the configuration
  - global_sparsity: the average sparsity (percentage of zero weights) of the entire network
  - global_sparsity_prune_layer: the average sparsity of the pruned layers
  - device: the IDs of the selected devices
  - compiled_file_path: the path that holds the final AI model and code
  - latency: the latency of the model on the slowest selected device
- xgen-training.log: the log file of the run. When filing a bug report, the user should include this log file in the report.
- a *.onnx file: the produced AI model in ONNX format
- a *.pth file: the produced AI model parameters (i.e., weights) in PyTorch format
- *compiled_files folder: a folder containing the final model and code files, ready to be integrated into mobile apps. This folder is produced only if at least one device was selected in the run; if XGen does not know the target device, it does not compile the models. Hint: users may call XGen later to compile a previously generated model. See Using the Compiler and Testing Components Only.
After using XGen, the contents of the *compiled_files folder are all that users need to integrate the AI model into their apps. The folder's contents depend on the backend mode used in XGen.
For Android devices, XGen's backend includes two modes of code generation: DeepOpt and Fallback. DeepOpt conducts deeper optimizations and often produces better results than Fallback, but it does not yet support some operators that Fallback does. (Appendix I lists the operators supported by each mode.) XGen tries both modes and automatically gives the better one to the user as the final output, without user involvement. The outputs of the two modes consist of different sets of files, but they are used in the same way in an app. This chapter first explains the outputs of each mode, and then explains how to use those outputs in an app. For iOS devices, there is only one mode of code generation, which is the same as the Fallback mode for Android.
DeepOpt Output for Android
The output of DeepOpt consists of six files in the output directory. The directory structure is as follows:
output folder/android
├── include/xgen.h # XGen model deployment interface
├── include/xgen_data.h # DNN model weights file
├── include/xgen_pb.h # DNN model structure file
├── lib/libxgen.so # XGen library
├── model/*.pb # DNN model structure file
└── model/*.data # DNN model weights file
Fallback Output for Android
The output of Fallback consists of three files in the output directory. The directory structure is as follows:
output folder/android
├── include/xgen.h # XGen model deployment interface
├── lib/libxgen.so # XGen library
└── model/*.fallback # DNN model file (including both structure and weights)
Output for iOS
The output for iOS consists of five files in the output directory. The directory structure is as follows:
output folder/ios
├── benchmark_data
│ ├── benchmark_params.json # Configuration file for benchmark settings
│ └── *.fallback # DNN model file (including both structure and weights)
├── benchmark_data.zip # Compressed file for folder benchmark_data
├── benchmark_output.json # Output from the benchmark execution
└── xgen.framework # XGen iOS library
Usage of the Results
The code generated by XGen can be easily integrated into Android or iOS apps. Integration can be static, where the AI model becomes part of the app package, or dynamic, where the AI model is loaded into the app during execution. The latter approach helps minimize the static size of the app package.
Both integration methods are demonstrated in a demo app for both Android and iOS:
- Android: Click on this link to see the code for the demo app. The localDynamicDeepOpt_XGen-Auto branch of the git repository demonstrates the static approach using the DeepOpt outputs, while the remoteDynamicDeepOpt branch demonstrates the dynamic approach.
- iOS (static and dynamic integration): Click on this link to see the code for the demo app. The main branch of the git repository demonstrates the static approach, while the download-model branch demonstrates the dynamic approach.
The following section introduces the complete API for using the output of XGen. The XGen APIs can be found in the xgen.h file, which contains detailed descriptions. Below is a complete list of all the APIs.
XGEN_EXPORT XGenHandle *XGenInit(const void *model_data, size_t model_size);
XGEN_EXPORT XGenHandle *XGenInitWithData(const void *model_data,
size_t model_size_in_bytes,
const void *extra_data,
size_t data_size_in_bytes);
XGEN_EXPORT XGenHandle *XGenInitWithPower(const void *model_data,
size_t model_size_in_bytes,
const void *extra_data,
size_t data_size_in_bytes,
XGenPowerPolicy policy);
XGEN_EXPORT XGenHandle *XGenInitWithFiles(const char *model_file,
const char *data_file,
XGenPowerPolicy policy = XGenPowerDefault);
XGEN_EXPORT XGenStatus XGenRun(XGenHandle *handle);
XGEN_EXPORT void XGenShutdown(XGenHandle *handle);
XGEN_EXPORT size_t XGenGetNumInputTensors(const XGenHandle *handle);
XGEN_EXPORT size_t XGenGetNumOutputTensors(const XGenHandle *handle);
XGEN_EXPORT XGenTensor *XGenGetInputTensor(XGenHandle *handle,
size_t tensor_index);
XGEN_EXPORT XGenTensor *XGenGetOutputTensor(XGenHandle *handle,
size_t tensor_index);
XGEN_EXPORT void XGenCopyBufferToTensor(XGenTensor *input_tensor,
const void *input_data,
size_t input_size_in_bytes);
XGEN_EXPORT void XGenCopyTensorToBuffer(const XGenTensor *output_tensor,
void *output_data,
size_t output_size_in_bytes);
XGEN_EXPORT XGenType XGenGetTensorType(const XGenTensor *tensor);
XGEN_EXPORT int32_t XGenGetTensorNumDims(const XGenTensor *tensor);
XGEN_EXPORT int32_t XGenGetTensorDim(const XGenTensor *tensor,
int32_t dim_index);
XGEN_EXPORT size_t XGenGetTensorSizeInBytes(const XGenTensor *tensor);
XGEN_EXPORT void *XGenGetTensorData(const XGenTensor *tensor);
XGEN_EXPORT const char *XGenGetTensorName(const XGenTensor *tensor);
Use in an Android application
There are five steps to follow when using an XGen-generated AI model in an Android application:
- Initialization: Use one of the XGenInit functions to initialize the model.
- Copy Input Data: Use XGenCopyBufferToTensor to copy the input data into the input tensor.
- Perform Prediction: Run the model by calling XGenRun.
- Copy Output Data: Once the prediction is complete, use XGenCopyTensorToBuffer to copy the output data from the output tensor.
- Release Resources: Finally, release all resources by calling XGenShutdown.
Additionally, a set of APIs is available to query various tensor attributes, such as the tensor name, type, and size, as illustrated in the sketch below.
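The following is a minimal sketch of how these query APIs can be combined. It assumes a valid XGenHandle *h obtained from one of the XGenInit functions; PrintInputTensorInfo is a hypothetical helper name, not part of the XGen API.

#include <cstdio>
#include <cstdint>
#include "xgen.h"

// Print the name, type, byte size, and shape of every input tensor.
void PrintInputTensorInfo(XGenHandle *h) {
  size_t num_inputs = XGenGetNumInputTensors(h);
  for (size_t i = 0; i < num_inputs; ++i) {
    XGenTensor *t = XGenGetInputTensor(h, i);
    printf("input %zu: name=%s type=%d bytes=%zu dims=[",
           i, XGenGetTensorName(t), (int)XGenGetTensorType(t),
           XGenGetTensorSizeInBytes(t));
    int32_t num_dims = XGenGetTensorNumDims(t);
    for (int32_t d = 0; d < num_dims; ++d) {
      printf(d > 0 ? ",%d" : "%d", XGenGetTensorDim(t, d));
    }
    printf("]\n");
  }
}

The same pattern works for output tensors via XGenGetNumOutputTensors and XGenGetOutputTensor.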
Using the output of DeepOpt
If the XGen output directory (i.e., the *compiled_files folder mentioned in the XGen outputs section) includes xgen_pb.h and xgen_data.h, the model was generated in the DeepOpt mode. Follow the example below or the demo app link to deploy it.
Suppose that example_pb and example_pb_len are the first and second variable names in xgen_pb.h, and example_data and example_data_len are the first and second variable names in xgen_data.h.
#include "xgen_data.h"
#include "xgen_pb.h"
#include "xgen.h"
int main() {
//Initialization
XGenHandle *h = XGenInitWithData(example_pb,
example_pb_len,
example_data,
example_data_len);
size_t num_input_tensors = XGenGetNumInputTensors(h);
for (size_t i = 0; i < num_input_tensors; ++i) {
XGenTensor *t = XGenGetInputTensor(h, i);
size_t input_size = XGenGetTensorSizeInBytes(t);
float *buffer = ... // initialize buffer with your input data
XGenCopyBufferToTensor(t, buffer, input_size);//copy data
}
//Run inference
XGenRun(h);
size_t num_output_tensors = XGenGetNumOutputTensors(h);
float **buffer = new float*[num_output_tensors];
for (size_t i = 0; i < num_output_tensors; ++i) {
XGenTensor *t = XGenGetOutputTensor(h, i);
size_t output_size = XGenGetTensorSizeInBytes(t);
buffer[i] = new float[(output_size + sizeof(float) - 1) / sizeof(float)];
XGenCopyTensorToBuffer(t, buffer[i], output_size);//copy output data
}
// Use model output here.
//Release resources
XGenShutdown(h);
return 0;
}
XGenInitWithPower can be used to fine-tune power usage for a better power-speed trade-off. The power policy can be set to one of four predefined settings, as follows:
typedef enum XGenPowerPolicy {
/* No power policy. Use whatever system default configuration is. */
XGenPowerNone = 0,
/* XGen default power policy is used with XGenInitWithData */
XGenPowerDefault = 1,
/* Performance policy uses less power than the default but still performs
well. */
XGenPowerPerformance = 2,
/* Power saving policy uses even less power than performance policy. */
XGenPowerSave = 3,
} XGenPowerPolicy;
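For example, reusing the example_pb, example_pb_len, example_data, and example_data_len variables from the DeepOpt example above, a sketch of an initialization that favors power saving looks like this:

// Initialize XGen with the power-saving policy to trade some speed for
// lower power consumption.
XGenHandle *h = XGenInitWithPower(example_pb,
                                  example_pb_len,
                                  example_data,
                                  example_data_len,
                                  XGenPowerSave);

The rest of the deployment steps (copying inputs, running inference, copying outputs, shutting down) are unchanged.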
In addition to using XGenInitWithData as in the example above, you can also initialize XGen by using XGenInitWithFiles, which takes in *.pb and *.data files, as follows:
XGenHandle *h = XGenInitWithFiles("wdsr.pb", "wdsr.data");
Using the output of Fallback
The usage of AI models in the Fallback mode is the same as in the DeepOpt mode, except that you should use XGenInitWithFiles for initialization instead of XGenInitWithData. Use the following example, where example_model_file is the path of the *.fallback file located in its model folder:
XGenHandle *h = XGenInitWithFiles(example_model_file, NULL);
Use in an iOS application
Using the output of XGen in an iOS application is similar to using the Fallback output in an Android application, except that the model file *.fallback is located in the benchmark_data folder. Users can refer to this link for the code of a demo app.
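As a minimal sketch (model_path here is a hypothetical variable that the app resolves at run time to the filesystem path of the *.fallback file shipped in benchmark_data, e.g., from the app bundle), the initialization mirrors the Android Fallback case:

// model_path: path to the *.fallback file from the benchmark_data folder,
// resolved by the app at run time (hypothetical variable).
XGenHandle *h = XGenInitWithFiles(model_path, NULL);

All subsequent steps (copying inputs, running inference, copying outputs, shutting down) follow the same API calls shown in the Android examples above.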