July updates
Updated compiler tools, API libraries, and trained models
The Coral Team
July 26, 2021
We're excited to share the following updates for the Coral platform tools and APIs.
Edge TPU Compiler v16
The latest Edge TPU Compiler adds support for new graph operations:
- LSTM (unidirectional only; TF Lite currently does not support bidirectional
  LSTM or any customized LSTM ops; also read about the new
  `append_recurrent_links` tool below for other RNN options). Try it out with
  this LSTM time series Colab.
- Reduce-max
- Reduce-min
- Rsqrt
- Squared difference
- Transpose
We've also added some features to improve the compiler experience and success rate:
- A new "delegate search" feature will repeatedly search your graph for a
  successful Edge TPU delegate. For example, if the compiler first fails to
  compile the model due to an unexpected error in an op, it attempts to
  compile again by incrementally stepping backward along the graph and
  stopping the compilation at an earlier point. You can enable this with the
  new `--search_delegate` option, and you can specify the step size with the
  new `--delegate_search_step` option.
- A new `--timeout_sec` option allows you to specify a timeout for the
  compiler. (Default is 180 seconds.)
- More graceful compilation failures (fewer crashes with no error messages).
- Support for partitioning SSD models (although you really should use the
  updated profiling-based partitioner described next).
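The backward-stepping search is easy to picture with a toy sketch in plain
Python. This only illustrates the idea, not the compiler's actual algorithm;
`find_delegate_end` and `compiles` are made-up names:

```python
# Toy illustration of the "delegate search" idea: step the delegate's end
# point backward through the graph until a prefix of ops compiles.

def find_delegate_end(ops, compiles, step=1):
    """Return the largest prefix length of `ops` for which `compiles`
    succeeds, stepping backward by `step` ops after each failure."""
    end = len(ops)
    while end > 0:
        if compiles(ops[:end]):
            return end
        end -= step  # analogous to the delegate search step size
    return 0

# Suppose the op at index 5 triggers an unexpected compiler error, so only
# the first 5 ops can go into the Edge TPU delegate.
ops = ["conv", "conv", "pool", "conv", "pool", "weird_op", "fc"]
compiles = lambda prefix: "weird_op" not in prefix
print(find_delegate_end(ops, compiles))  # 5
```

The trade-off is compile time: each backward step is another full compilation
attempt, which is why the step size is configurable.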
For more information about the new compiler flags, see the Edge TPU Compiler guide.
To get the latest compiler, see the installation guide (available only for Debian Linux).
Updated profiling-based partitioner
We first released this tool in November to help improve throughput in a pipelined model by segmenting the model based on segment latencies rather than parameter data sizes. With this update, we've made the following changes:
- Added support for SSD models, and other models that have large CPU segments
  and/or graph branches. (This also requires v16 of the Edge TPU Compiler.)
- By default, it now enables the compiler's new `search_delegate` option
  (mentioned above).
- Renamed the executable to `partition_with_profiling`.
- Added flags:
  - `delegate_search_step`: Same as the `delegate_search_step` option added in
    v16 of the Edge TPU Compiler.
  - `partition_search_step`: Similar to the `delegate_search_step` option, but
    applied to the search for each segment (rather than the entire graph's
    delegate).
  - `initial_lower_bound_ns` and `initial_upper_bound_ns`: The known
    smallest/largest latency among your model's segments. If not specified,
    these are calculated by the tool by benchmarking the heuristic-based
    segments from the Edge TPU Compiler.
For more detail, read about the profiling-based partitioner.
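To see what segmenting by latency (rather than by parameter sizes) buys you,
here is a toy balanced-partition sketch in plain Python. It only illustrates
the idea; the real tool benchmarks actual segment latencies on the device, and
`partition_by_latency` is a made-up name:

```python
# Toy sketch of latency-based partitioning: split a sequence of ops into
# contiguous segments so the slowest segment is as fast as possible, since
# pipeline throughput is limited by the slowest segment.

def partition_by_latency(latencies_ns, num_segments):
    """Binary-search the smallest per-segment latency budget that still
    fits the ops into `num_segments` contiguous segments."""
    def segments_needed(budget):
        count, current = 1, 0
        for t in latencies_ns:
            if current + t > budget:
                count, current = count + 1, 0
            current += t
        return count

    lo, hi = max(latencies_ns), sum(latencies_ns)
    while lo < hi:
        mid = (lo + hi) // 2
        if segments_needed(mid) <= num_segments:
            hi = mid
        else:
            lo = mid + 1
    return lo  # best achievable worst-segment latency

# Four ops pipelined across two Edge TPUs: the best split is [7, 2] | [5, 4].
print(partition_by_latency([7, 2, 5, 4], 2))  # 9
```

A parameter-size-based split of the same model could easily land on
[7] | [2, 5, 4], capping throughput at the 11 ns segment instead of 9 ns.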
New tool to build RNN models
We've created a new tool called `append_recurrent_links`, which helps you
create recurrent networks for the Edge TPU with one or more hidden saved
states. Without this tool (and when not using the TF LSTM op), creating your
own recurrent network that can compile for the Edge TPU requires that your
model output the saved state, and your application must then pass that saved
state back into your model with each iteration. By instead passing such a
model (already compiled for the Edge TPU) to `append_recurrent_links`, you can
make that saved state hidden again, so your application code can focus on the
final output.
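For context, this is the loop your application has to write by hand when the
saved state is exposed as a model output. A minimal NumPy sketch, where
`recurrent_step` is an illustrative stand-in for one invocation of the
compiled model (the state-update math is made up):

```python
import numpy as np

# Without append_recurrent_links, a compiled recurrent model must expose its
# saved state as an extra output, and the application loops that state back
# in as an extra input on every step.

def recurrent_step(x, state):
    """One inference step: returns (output, new_state)."""
    new_state = np.tanh(x + 0.5 * state)   # toy hidden-state update
    output = float(new_state.sum())        # the output the app cares about
    return output, new_state

state = np.zeros(3)                        # initial saved state
for x in [np.ones(3), np.ones(3) * 2]:
    output, state = recurrent_step(x, state)  # loop the state back by hand
print(output)
```

With `append_recurrent_links`, the state-carrying lines disappear from the
application and only the final output remains visible.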
New tool to compile models with a huge FC layer
If you have a model with a huge fully-connected layer, the Edge TPU Compiler
previously might have cut that layer from the Edge TPU delegate and instead
executed it on the CPU, due to the size of the weights applied to that layer.
The new `split_fc` tool divides that layer's weights matrix into smaller
blocks using block-wise matrix multiplication (you can control the ratio of
the split operation). The `split_fc` tool outputs a new `.tflite` file that
you can pass to the Edge TPU Compiler, and the compiled output will then
include the fully-connected layer in the Edge TPU delegate.
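The identity that makes this safe is plain block-wise matrix multiplication:
splitting the weights matrix into blocks and concatenating the partial
products gives the same result as one large product. A small NumPy check of
that identity (illustration only; the real tool rewrites the `.tflite` graph):

```python
import numpy as np

# A fully-connected layer computes y = x @ W. Splitting W into column blocks
# and computing each product separately yields the same y, but with smaller
# weight matrices per operation.

rng = np.random.default_rng(0)
x = rng.standard_normal((1, 8))
W = rng.standard_normal((8, 6))        # the "huge" FC weights matrix

W1, W2 = np.hsplit(W, 2)               # two smaller column blocks
y_split = np.concatenate([x @ W1, x @ W2], axis=1)

print(np.allclose(y_split, x @ W))  # True
```

Because each block is smaller, the compiler can fit the blocks' weights in the
Edge TPU delegate where the single large matrix did not.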
PyCoral API 2.0
Back in November, we introduced the PyCoral API to simplify development and add features on top of the TensorFlow Lite API. With this update, we're not changing a lot, but some of the APIs have changed their behavior in ways that may break your existing code, so we bumped the major version.
The API changes you'll find are the following:
- Improved error reporting for `PipelinedModelRunner`. It now prints error
  messages originating from the TensorFlow Lite runtime.
- Code-breaking API changes:
  - `PipelinedModelRunner.push()` now requires a dictionary for the
    `input_tensors` (instead of a list), so that each input tensor provides a
    corresponding tensor name as the dictionary key. This method is also now
    void instead of returning a bool; it raises `RuntimeError` if the push
    fails.
  - Similarly, `PipelinedModelRunner.pop()` now returns a dictionary instead
    of a list, and it also may raise `RuntimeError`.
- Updated APIs:
  - `make_interpreter()` now accepts an optional `delegate` argument to
    specify the Edge TPU delegate object you want to use.
  - `get_objects()` now supports SSD models with different orders in the
    output tensor.
- New APIs:
  - `utils.edgetpu.set_verbosity()` prints logs related to each Edge TPU.
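The code-breaking `push()`/`pop()` changes look like this in practice. The
sketch below uses a hypothetical `MockPipelinedRunner` stand-in, since the
real `PipelinedModelRunner` requires Edge TPU hardware; only the method
contract (dict in, dict out, `RuntimeError` on failure) mirrors PyCoral 2.0:

```python
# Mock stand-in for pycoral.pipeline.PipelinedModelRunner, showing the 2.0
# contract: push() takes a dict keyed by tensor name and raises RuntimeError
# on failure; pop() returns a dict keyed by tensor name.

class MockPipelinedRunner:
    def __init__(self):
        self._queue = []

    def push(self, input_tensors):
        """2.0 behavior: dict in, None out, RuntimeError on failure."""
        if not isinstance(input_tensors, dict):
            raise RuntimeError("input_tensors must map tensor names to values")
        self._queue.append(input_tensors)

    def pop(self):
        """2.0 behavior: returns a dict of output tensors by name."""
        inputs = self._queue.pop(0)
        return {"detection_out": sum(inputs.values())}  # fake "inference"

runner = MockPipelinedRunner()
runner.push({"image_in": 41, "scale_in": 1})   # 1.x code passed a list here
result = runner.pop()
print(result["detection_out"])  # 42
```

When migrating 1.x code, wrap `push()` calls in a `try`/`except RuntimeError`
instead of checking a boolean return value.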
We've also added official support for Python 3.9.
To install or update the PyCoral library, use the following commands:
On Debian Linux systems including the Coral Dev Board and Raspberry Pi (first, be sure you installed our package repos):
sudo apt-get update
sudo apt-get install python3-pycoral
On other systems (including Mac and Windows):
python3 -m pip install --extra-index-url=https://google-coral.github.io/py-repo/ pycoral==2.0.0
Updated libcoral API
We've updated our C++ library with some similar changes:
- Updated `PipelinedModelRunner`:
  - `Push()` and `Pop()` now return either `absl::OkStatus` or
    `absl::InternalError`, instead of true/false.
  - Added a tensor `name` field to `PipelineTensor`.
  - `GetInputTensorNames()` and `GetInputTensor()` moved to `tflite_utils.h`.
- `GetDetectionResults()` now supports SSD models with different orders in the
  output tensor.
To build your project with the latest libcoral API, see the libcoral GitHub readme.
New pre-trained models
You can now find the following trained models at coral.ai/models:
- Image classification:
  - Popular US products (100,000 supermarket products)
- Object detection:
  - EfficientDet-Lite in 5 sizes (COCO dataset)
  - TF2-based SSD/FPN MobileNet V1 (COCO dataset)
  - TF2-based SSD MobileNet V2 (COCO dataset)
- Semantic segmentation:
  - EdgeTPU-DeepLab-slim (Cityscapes dataset)
- Pose estimation:
That's all we have for you now. But if you can spare two minutes, please fill out this developer survey. Thank you!