Systolic arrays are a highly special-purpose paradigm designed for regular computations such as matrix multiplication and convolution.
They are used in TPUs.

Differences from pipelining:

  • systolic arrays can do multidimensional array computations
  • each PE (processing element) can execute a “kernel”, not just one instruction

The challenge is to orchestrate memory accesses so that the data is ready when each PE needs it:

  • the input data needs to be carefully ordered and timed as it is placed onto the array's inputs
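
To make the input placement concrete, here is a minimal Python sketch of a 2D systolic array computing C = A × B. It assumes an output-stationary design (each PE keeps one element of C) in which row i of A enters from the left delayed by i cycles and column j of B enters from the top delayed by j cycles. This is one common textbook arrangement, not necessarily the one used in any particular chip; the function name systolic_matmul and the register names are made up for illustration.

```python
def systolic_matmul(A, B):
    """Cycle-by-cycle simulation of an output-stationary systolic array.

    Assumed layout: PE(i, j) accumulates C[i][j]; A values move right,
    B values move down, one PE per cycle.
    """
    n, m, p = len(A), len(B), len(B[0])
    acc = [[0] * p for _ in range(n)]    # accumulator inside each PE
    a_reg = [[0] * p for _ in range(n)]  # A value currently held by each PE
    b_reg = [[0] * p for _ in range(n)]  # B value currently held by each PE

    # PE(n-1, p-1) sees its last operand pair at cycle (m-1)+(n-1)+(p-1).
    for t in range(n + m + p - 2):
        # Shift A one PE to the right (back to front, so nothing is overwritten).
        for i in range(n):
            for j in range(p - 1, 0, -1):
                a_reg[i][j] = a_reg[i][j - 1]
        # Shift B one PE down.
        for j in range(p):
            for i in range(n - 1, 0, -1):
                b_reg[i][j] = b_reg[i - 1][j]
        # Feed the skewed inputs at the edges; zeros act as bubbles.
        for i in range(n):
            k = t - i  # row i is delayed by i cycles
            a_reg[i][0] = A[i][k] if 0 <= k < m else 0
        for j in range(p):
            k = t - j  # column j is delayed by j cycles
            b_reg[0][j] = B[k][j] if 0 <= k < m else 0
        # Every PE performs one multiply-accumulate per cycle.
        for i in range(n):
            for j in range(p):
                acc[i][j] += a_reg[i][j] * b_reg[i][j]
    return acc


print(systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

With this skew, PE(i, j) receives A[i][k] and B[k][j] in the same cycle (k = t - i - j), which is exactly the pair it needs. Without the delays the operands would meet at the wrong PEs, which is why the input placement matters.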

We can also arrange the PEs in other structures (triangular, hex cells, etc…) for different tasks.

Pros:

  • efficient use of limited memory bandwidth (a value fetched from memory is reused by several PEs)
  • specialised: very efficient when the computation fits the PE organization/functions

Cons:

  • specialised, so not generally applicable

19.2 General Systolic Computational Model

We can chain a series of “more general” PEs one after another.
This allows less specialised computations to run.

Basically, instead of writing back to memory after each instruction as a normal CPU does, each PE passes its output directly to the next one.
As a result, less energy is wasted on data movement.
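
The forwarding idea can be sketched in Python as a linear chain of PEs with one latch per PE. This is a simplified model under stated assumptions: each PE applies one kernel per cycle, only the first PE reads from memory, and only the last PE writes to memory. The names run_chain and kernels are hypothetical and used only for illustration.

```python
def run_chain(kernels, inputs):
    """Stream inputs through a chain of PEs, one kernel per PE.

    Returns (outputs, memory_reads, memory_writes). Intermediate results
    live only in the latches between PEs and never touch memory.
    """
    latches = [None] * len(kernels)  # output latch of each PE (None = bubble)
    outputs = []
    mem_reads = mem_writes = 0

    # Feed every input, then feed bubbles until the chain has drained.
    for x in list(inputs) + [None] * len(kernels):
        # The last PE's latched result is the only value written to memory.
        if latches[-1] is not None:
            outputs.append(latches[-1])
            mem_writes += 1
        # Each PE consumes its predecessor's latch (processed back to front).
        for s in range(len(kernels) - 1, 0, -1):
            prev = latches[s - 1]
            latches[s] = kernels[s](prev) if prev is not None else None
        # The first PE is the only one that reads from memory.
        if x is not None:
            mem_reads += 1
            latches[0] = kernels[0](x)
        else:
            latches[0] = None
    return outputs, mem_reads, mem_writes


kernels = [lambda v: v + 1, lambda v: v * 2, lambda v: v - 3]
print(run_chain(kernels, [1, 2, 3]))
```

In this model, three inputs passing through three PEs cost three memory reads and three writes in total. A scheme that wrote every intermediate result back to memory would, under the same assumptions, need one read and one write per kernel per input.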

This model needs a specialised compiler to optimise code for it.