How to generate RISC-V vector extension assembly from TensorFlow

In this tutorial, we will see how to generate RISC-V vector extension assembly from TensorFlow via XLA.

What is XLA?

In TensorFlow, graph execution means that tensor computations are executed as a TensorFlow graph, sometimes referred to as a tf.Graph or simply a “graph”. Computations are described using a data-flow-like model, and the computations can be mapped onto different hardware like CPUs and GPUs.

XLA (Accelerated Linear Algebra) is a domain-specific compiler for linear algebra that can accelerate TensorFlow models with potentially no source code changes.

The XLA compiler provides the following advantages:

Fused pipeline operations to reduce memory overhead
Memory usage analysis to eliminate intermediate buffer usage
Fusing of operations/kernels into a single low-level op to match the performance of custom-tuned low-level operations

With tf.function, TensorFlow runs in graph mode and generates an intermediate graph object. XLA later parses and optimizes this graph, producing HLO (XLA's intermediate representation), and from the HLO we can emit LLVM IR code.

Getting started with XLA

Test Workload

By default, TensorFlow 2.x runs code in eager mode. To run in graph mode, we use the @tf.function decorator so that the whole function is compiled, optimized, and run as a single computational graph.

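import tensorflow as tf

# jit_compile=True compiles the function with XLA (older releases: experimental_compile=True)
@tf.function(jit_compile=True)
def myworkload(a, b):
    return tf.add(a, b)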
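To check that the workload really goes through XLA, it can help to call it once and look at the HLO it produces. The following is a minimal sketch, assuming a TensorFlow 2.x release that accepts jit_compile=True and provides experimental_get_compiler_ir on compiled functions; the input values are made up for illustration.

```python
import tensorflow as tf


# Same workload as above, repeated so this snippet runs on its own.
@tf.function(jit_compile=True)
def myworkload(a, b):
    return tf.add(a, b)


# Illustrative inputs (not from the original tutorial).
a = tf.constant([1.0, 2.0])
b = tf.constant([3.0, 4.0])

# The first call traces the function and compiles it with XLA.
result = myworkload(a, b)
print(result.numpy())

# experimental_get_compiler_ir returns a callable; the stage argument
# selects which intermediate representation to return as text.
hlo_text = myworkload.experimental_get_compiler_ir(a, b)(stage="hlo")
print(hlo_text)
```

Passing stage="optimized_hlo" instead should show the HLO after XLA's optimization passes, which is closer to what ends up being lowered to LLVM IR.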

Inspecting compiled programs

You can inspect the compiled programs in text format, or in HTML format where you can see the computational graph visually, by setting a few environment variables before launching the workload:

XLA_FLAGS="--xla_dump_to=/tmp/generated" TF_XLA_FLAGS="--tf_xla_auto_jit=2" my/tensorflow/program
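If you prefer to launch the workload from Python rather than from the shell, the same environment variables can be set programmatically. This is only a sketch: build_xla_env and run_workload are hypothetical helper names, the workload is assumed to be a Python script, and the optional --xla_dump_hlo_as_html flag is an addition that is not part of the command above.

```python
import os
import subprocess
import sys


def build_xla_env(dump_dir, dump_html=False, base_env=None):
    """Return a copy of the environment with the XLA dump flags set."""
    env = dict(os.environ if base_env is None else base_env)
    xla_flags = [f"--xla_dump_to={dump_dir}"]
    if dump_html:
        # Optional extra (assumption, not in the original command):
        # also request an HTML rendering of the HLO graph.
        xla_flags.append("--xla_dump_hlo_as_html")
    env["XLA_FLAGS"] = " ".join(xla_flags)
    # Auto-clustering level 2, as in the shell command above.
    env["TF_XLA_FLAGS"] = "--tf_xla_auto_jit=2"
    return env


def run_workload(script, dump_dir="/tmp/generated"):
    """Run a TensorFlow script (hypothetical path) with dumping enabled."""
    env = build_xla_env(dump_dir)
    return subprocess.run([sys.executable, script], env=env, check=True)
```

For example, run_workload("my_workload.py") would run that (hypothetical) script and leave the dumped files under /tmp/generated.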

Generate RISC-V vector extension assembly

Select your compiled workload. Inside the dump folder specified when launching the workload (/tmp/generated in the example above), you will find files with the .ll extension, which are LLVM IR files. We are interested in the modules whose names end with .ir-with-opt.ll.

This command lists all the attributes available when generating RISC-V assembly code from LLVM IR:

$ llc -march=riscv64 -mattr=help

To produce the RISC-V vector code:

find ./ -type f -name "*ir-with-opt.ll" -exec llc "{}" -march=riscv64 -mattr=+m,+v,+zve32x,+zvl1024b - riscv \;

Where:

+m: integer multiplication and division
+v: the vector extension
+zve32x: one of the five standard extensions defined to provide varying degrees of vector support for embedded processors
+zvl1024b: specifies that the minimum VLEN is 1024 bits
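The same search-and-compile step can be scripted, which makes it easier to control where the assembly ends up. The sketch below is one possible way to do it, not the only one: find_ir_files, llc_command and compile_all are hypothetical helper names, llc is assumed to be on your PATH, and the -o option is added here so that each module gets a .s file next to its .ll file.

```python
import subprocess
from pathlib import Path

# Target attributes taken from the list above.
MATTR = "+m,+v,+zve32x,+zvl1024b"


def find_ir_files(dump_dir):
    """Return all optimized LLVM IR modules under dump_dir, sorted by path."""
    return sorted(Path(dump_dir).rglob("*ir-with-opt.ll"))


def llc_command(ir_file, llc="llc"):
    """Build the llc command line for one IR file, writing a sibling .s file."""
    ir_file = Path(ir_file)
    asm_file = ir_file.with_suffix(".s")
    return [
        llc,
        str(ir_file),
        "-march=riscv64",
        f"-mattr={MATTR}",
        "-o",
        str(asm_file),
    ]


def compile_all(dump_dir, llc="llc"):
    """Run llc on every optimized module and return the assembly paths."""
    outputs = []
    for ir_file in find_ir_files(dump_dir):
        cmd = llc_command(ir_file, llc)
        # check=True raises if llc reports an error for this module.
        subprocess.run(cmd, check=True)
        outputs.append(cmd[-1])
    return outputs
```

Calling compile_all("/tmp/generated") would then produce one .s file per optimized module; opening one of them and looking for vector instructions such as vsetvli is a quick way to confirm that the vector extension was actually used.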