Chapter 1. Introduction

About

CoCoPIE XGen is a tool for optimizing AI models for end devices. It was created to address a problem that AI application developers often encounter: when they try to deploy a trained AI model on a device, they find that the model is too slow, too large, or too power-hungry for practical use. The problem is most serious on mobile and embedded devices, such as smartphones, Raspberry Pi boards, and IoT devices. This version focuses on smartphones that run Android or iOS. For other kinds of devices (e.g., Raspberry Pi or RISC-V) and non-16-bit precisions, please contact CoCoPIE Inc.

CoCoPIE XGen solves this deployment problem with an easy-to-use tool that automatically optimizes AI models. It turns an input AI model into directly deployable code that can be up to tens of times smaller and faster while retaining satisfactory accuracy. By removing this key barrier and making more AI tasks feasible on end devices, XGen lets developers capitalize on the large mobile AI market that had previously been out of reach.

CoCoPIE now provides XGen as an on-premise software tool that is installed on users' own machines, which avoids concerns about data and model security. This version is a single-node version that runs on a single machine. For the distributed version that runs on computer clusters, please contact CoCoPIE Inc.

Benefits from XGen

Shortened time-to-market of AI solutions

With the automated AI optimization by XGen, the process of making an AI model deployable in mobile apps can be shortened from several months to several days.

Lower cost

The greatly improved productivity can substantially reduce the cost of deploying an AI solution, as well as the cost of maintaining and upgrading it. In addition, because its optimizations make it possible to run some cloud-based AI tasks on end devices, XGen can help businesses lower their cloud costs.

Superior performance and compact models

CoCoPIE's world-leading co-optimization technology enables XGen to generate AI models and code that are several times smaller, faster, and more power-efficient than those produced by existing tools, while maintaining the same level of accuracy.

Expanded business opportunities

By enabling many AI tasks to run in real time on end devices, XGen opens up business opportunities that were previously blocked by limited on-device AI capabilities.

Where XGen Fits in the AI Workflow

Users can use XGen to easily obtain a DNN model and code that meet their AI needs. XGen supports the generation of efficient AI models on either public datasets or users' own datasets.

The high-level view of XGen is shown below. This version of XGen runs on a single computer connected to one or more real devices. It receives users' requirements and other inputs, optimizes the AI model of interest, and outputs the optimized model and code. During optimization, it keeps on-device speed testing in the loop so that the performance of the generated model is measured on actual hardware rather than estimated. It includes some predefined AI models but also works on users' own AI models. It reuses the models produced in previous runs to save time, but does not depend on them to function.

XGen high-level view

Specifically, there are two main usage scenarios as follows.

Scenario I: The user needs a common AI capability that meets certain requirements (e.g., a certain speed on some common devices with a certain accuracy and size). One of the base AI models already included in XGen suits the general task, but it is too large or too slow and needs to be adapted to work well on the user's data.

Usage I: From the model set included in XGen, the user selects the base model that fits the needed AI capability and inputs the requirements; XGen optimizes the model and generates a model and code that meet the requirements.

Scenario II: The user needs an AI capability, but the base AI models already included in XGen do not fit that type of task. The user has their own model and training script, but the model is too large or too slow.

Usage II: The user follows the guidelines in the Usage to make the training script compatible with XGen's requirements and gives the training script and other inputs to XGen; XGen automatically generates a model that satisfies the user's requirements.

Besides these main usage scenarios, XGen can also be used for some smaller tasks, such as training an AI model, assessing the quality and speed of an AI model, testing the compatibility of an AI training model with XGen, managing AI models, and so on.
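
To make the idea of requirement-driven optimization with devices in the loop more concrete, the following is a minimal Python sketch. It is not XGen's API or its actual search procedure: the names `Requirements`, `Candidate`, `select_model`, and `measure_latency_ms`, as well as all numbers, are hypothetical and exist only for illustration. The sketch assumes that a set of optimized candidate models is available and that a callback can time a candidate on a connected device.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class Requirements:
    """User requirements (hypothetical fields, not XGen's input format)."""
    max_latency_ms: float
    min_accuracy: float
    max_size_mb: float


@dataclass(frozen=True)
class Candidate:
    """One optimized model variant considered during the search (hypothetical)."""
    name: str
    accuracy: float
    size_mb: float


def select_model(
    candidates: Iterable[Candidate],
    requirements: Requirements,
    measure_latency_ms: Callable[[Candidate], float],
) -> Optional[Candidate]:
    """Return the fastest candidate that meets every requirement, or None.

    measure_latency_ms stands in for a timing run on a connected device.
    """
    best = None
    best_latency = float("inf")
    for candidate in candidates:
        # Cheap checks first, so the device is only used for viable candidates.
        if candidate.accuracy < requirements.min_accuracy:
            continue
        if candidate.size_mb > requirements.max_size_mb:
            continue
        latency = measure_latency_ms(candidate)
        if latency <= requirements.max_latency_ms and latency < best_latency:
            best, best_latency = candidate, latency
    return best


# Made-up timings that replace a real on-device measurement in this sketch.
FAKE_DEVICE_TIMINGS_MS = {"dense": 120.0, "pruned-a": 35.0, "pruned-b": 22.0}

candidates = [
    Candidate("dense", accuracy=0.76, size_mb=98.0),
    Candidate("pruned-a", accuracy=0.75, size_mb=21.0),
    Candidate("pruned-b", accuracy=0.71, size_mb=12.0),
]
requirements = Requirements(max_latency_ms=40.0, min_accuracy=0.74, max_size_mb=30.0)

chosen = select_model(candidates, requirements, lambda c: FAKE_DEVICE_TIMINGS_MS[c.name])
print(chosen.name if chosen else "no candidate meets the requirements")
```

In this toy run, the dense model is rejected for its size and the smallest model for its accuracy, so the middle candidate is selected. The point of the sketch is only that latency comes from a measurement callback rather than from an estimate, which mirrors the role that real devices play in the workflow described above.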

General Features

Support for a broad range of AI models (e.g., CNNs, RNNs, Transformers) and AI tasks (CV, NLP, etc.)
Free of security and privacy concerns as an on-premise tool
Superior speed & accuracy
Single-click solution as well as fully-customizable usage
Flexible support of users' DNN models and data
Co-optimizing DNN models and code through the world-leading technology
Proprietary cutting-edge DNN pruning methods
Novel effective AI compilation and code generation
Reliable quality and speed assessment with actual devices in the loop
Flexible power control APIs to ensure power-efficient AI
Generated code works on all mainstream Android and iOS devices; support for other devices can be added separately

Features Specific to this Version

This version of XGen is a single-node version; it runs on a single GNU/Linux machine and can use one or more GPUs installed in that machine.
The generated AI model uses 16-bit precision and runs on Android 9 or later and iOS 11 or later.
When XGen is used to optimize a customer's own AI model, a training script written in PyTorch must be provided.
It supports multiple users by allowing multiple XGen instance containers to be launched on one machine.

For support of special processors or other kinds of devices (e.g., Raspberry Pi or RISC-V), or of non-16-bit precisions (e.g., via quantization), please contact CoCoPIE Inc.