SPI+DMA Tutorial (for Maple, Leaflabs)

This is a little tutorial about using DMA to speed up SPI communication on the Maple (Leaflabs).

About SPI

SPI is a hierarchical, synchronous communication protocol for electronic devices. It uses four pins (clock, output, input and chip selector) to transfer data between two or more devices. Communication is centralized: there is one master (usually a microcontroller) and one or more slaves (peripherals such as SD cards, sensors and displays, or even other microcontrollers).
The function of each pin is pretty straightforward. The clock pin is the coordination mechanism: it tells the devices when to read and write data. The output pin is called MOSI (master output, slave input) and carries data from the master to the slave. The input pin is called MISO (master input, slave output) and carries data from the slave to the master. Finally, the chip selector selects the slave device that we want to communicate with. Usually, pulling a device's chip selector pin low tells that device that we want to exchange data with it. This lets all connected devices share the same clock, MISO and MOSI lines while each one keeps its own independent chip selector.

For more detailed info on SPI you can check the Wikipedia article.

About Maple and SPI (Hardware SPI)

The ARM microcontroller in the Maple has two built-in SPI ports. The pin maps for the different Maple boards are listed below:

Maple v5 / Maple RET6

    Pin 10 - SPI1 Chip selector
    Pin 11 - SPI1 MOSI
    Pin 12 - SPI1 MISO
    Pin 13 - SPI1 Clock

    Pin 31 - SPI2 Chip selector
    Pin 32 - SPI2 Clock
    Pin 33 - SPI2 MISO
    Pin 34 - SPI2 MOSI

Maple Mini

    Pin 4 - SPI1 MOSI
    Pin 5 - SPI1 MISO
    Pin 6 - SPI1 Clock
    Pin 7 - SPI1 Chip selector

    Pin 28 - SPI2 MOSI
    Pin 29 - SPI2 MISO
    Pin 30 - SPI2 Clock
    Pin 31 - SPI2 Chip selector

To use these ports in your Maple sketch you need to declare a HardwareSPI object in your code. This object takes care of setting up the ports for communication and lets you benefit from SPI with four little commands:

void begin(SPIFrequency frequency, uint32 bitOrder, uint32 mode) - configures and starts the SPI port. The parameters set the frequency, that is, how fast we want the communication to be (from 18 MHz down to 140.625 kHz); the bit order, least-significant bit first (LSBFIRST) or most-significant bit first (MSBFIRST); and the synchronization mode. To configure these parameters properly, check the data sheet of your device.

void write(byte data) - sends a byte from the master to the slave.

byte read() - reads a byte sent from the slave to the master.

byte transfer(byte data) - sends a byte from the master to the slave and reads the response of the slave.

At this point we can write a little sketch that implements SPI communication between the Maple and another device. The code is pretty straightforward:

// Code developed by Pol Pla i Conesa

// declaration of the HardwareSPI object; the number in parentheses
// indicates which SPI port we want to use (SPI1 or SPI2)
HardwareSPI spi(1);

void setup(){
  //initialize SPI1 to 18MHz, most-significant bit first and sync mode 0
  spi.begin(SPI_18MHZ, MSBFIRST, 0);

  //create an array of bytes we want to send
  byte bytesToSend[512];
  //fill the array with data (put your own)
  for(int i=0; i<512; i++) {
    bytesToSend[i] = 0xFF;
  }

  //we create a buffer for receiving the responses
  byte bytesReceived[512];

  //we send the array over spi and receive the response (using transfer)
  for(int i=0; i<512; i++){
    bytesReceived[i] = spi.transfer(bytesToSend[i]);
  }
  //at this point we have successfully transmitted the data and got the responses
}

void loop(){
}

Speed: the problem

If speed is not crucial to your application, the code above will do the trick. However, speed-sensitive projects might not get satisfactory results with the plain HardwareSPI object. Although the configuration lets us set the port to 18 MHz, the code above only reaches a fraction of that speed because of a series of overheads that slow down the SPI port. I did a little experiment measuring how long it took to transfer 100 KB over SPI with the different frequency configurations, and obtained the following results:

Frequency      Time                     Throughput
18 MHz         348425 microseconds      0.28 MBytes/s
9 MHz          348424 microseconds      0.28 MBytes/s
4.5 MHz        412350 microseconds      0.24 MBytes/s
2.25 MHz       571545 microseconds      0.17 MBytes/s
1.125 MHz      923728 microseconds      0.1 MBytes/s
562.5 kHz      1593889 microseconds     0.06 MBytes/s
281.25 kHz     2904446 microseconds     0.03 MBytes/s

As you can see, although the clock frequency halves with every row, the transfer time does not come close to doubling until we get down to around 1 MHz. In the extreme case, there is no difference at all between the 18 MHz and 9 MHz configurations. This is rather disappointing; however, there is a workaround that gets much better performance out of SPI: using DMA (thanks Robodude666 for suggesting it).

What is DMA?

DMA stands for 'Direct Memory Access'. It is a feature of microcontrollers that lets the hardware move data between memory and peripherals directly, in parallel to your code. What this means, effectively, is that you can tell your micro to send something over SPI and forget about it. This not only speeds up SPI (since the transfer is completely hardware operated) but also lets you run code while the data is being transmitted. Please note that, although this tutorial focuses on SPI, DMA in the Maple can be used with other peripherals besides SPI.

For more info on DMA check out the Wikipedia entry.

DMA in the Maple

The Maple boards have either 1 (Mini and v5) or 2 (RET6) DMA controllers. Each controller has a series of channels: DMA1 has 7 and DMA2 has 5. Each channel gives DMA access to different peripherals. You can check the full list in the documentation (linked below), but for SPI we are only going to use DMA1. Channel 2 has SPI1 RX, channel 3 has SPI1 TX, channel 4 has SPI2 RX and channel 5 has SPI2 TX.

How does DMA work?

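There is a series of things that we need to do to use SPI with DMA. Here is the ordered list of steps that you have to follow to successfully use the SPI+DMA combo in your sketch (the numbers match the comments in the code further down):

    1. Import "dma.h".
    2. Create a HardwareSPI object.
    3. Init the HardwareSPI object.
    4. Initialize DMA.
    5. Enable DMA to use SPI communication, both TX (output) and RX (input).
    6. Setup a DMA transfer (for TX, RX or both).
    7. Attach an interrupt to the transfer.
    8. Setup the priority for the DMA transfer.
    9. Setup the number of bytes that we are going to transfer.
    10. Enable DMA to start transmitting.
    11. Disable DMA when we are done.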

This article on the Maple Wiki explains in greater detail how to use DMA with your board. Also, the Leaflabs documentation has an entry that describes the dma.h API.

Now we are all set to start coding. We will replicate the HardwareSPI example, but this time using DMA and achieving enormous speed :)

// Code developed by Pol Pla i Conesa

// 1. Import "dma.h"
#include <dma.h>

// 2. Create a HardwareSPI object.
HardwareSPI spi(1);

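// this function is the one that gets called when the
// transmission finishes (we have to check if the transfer
// was successful or if it generated an error)

void DMAEvent(){

  //we get the DMA event for the channel the interrupt is attached to
  dma_irq_cause event = dma_get_irq_cause(DMA1, DMA_CH2);

  switch(event) {
    //the event indicates that the transfer was successfully completed
    case DMA_TRANSFER_COMPLETE:
      //11. Disable DMA when we are done
      dma_disable(DMA1, DMA_CH2);
      dma_disable(DMA1, DMA_CH3);
      SerialUSB.println("Done transferring");
      break;
    //the event indicates that there was an error transmitting
    case DMA_TRANSFER_ERROR:
      //11. Disable DMA when we are done
      dma_disable(DMA1, DMA_CH2);
      dma_disable(DMA1, DMA_CH3);
      break;
    //any other event is ignored
    default:
      break;
  }
}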

void setup(){

  // 3. Init the HardwareSPI object.
  spi.begin(SPI_18MHZ, MSBFIRST, 0);

  // 4. Initialize DMA.
  dma_init(DMA1);

  // 5. Enable DMA to use SPI communication; both TX (output) and RX (input).
  spi_rx_dma_enable(SPI1);
  spi_tx_dma_enable(SPI1);

  // create an array of bytes we want to send (static, because the DMA
  // keeps using the buffers after setup() has returned)
  static byte bytesToSend[512];
  //fill the array with data (put your own)
  for(int i=0; i<512; i++) {
    bytesToSend[i] = 0xFF;
  }

  // we create a buffer for receiving the responses
  static byte bytesReceived[512];

  // 6. Setup a DMA transfer (for both TX and RX). If we only want
  // to read (RX) or write (TX) it is fine to just setup one. In this
  // case we want to transfer (write and get a response) so we do both.
  // Parameters:
  //  - DMA
  //  - DMA channel
  //  - Memory register for SPI
  //  - The size of the SPI memory register
  //  - The buffer we want to copy things to or transmit things from
  //  - The unit size of that buffer
  //  - Flags (see the Maple DMA Wiki page for more info on flags)
  // RX: channel 2 copies from the SPI data register into bytesReceived
  dma_setup_transfer(DMA1, DMA_CH2, &SPI1->regs->DR, DMA_SIZE_8BITS,
                     bytesReceived, DMA_SIZE_8BITS, (DMA_MINC_MODE | DMA_TRNS_CMPLT | DMA_TRNS_ERR));
  // TX: channel 3 copies from bytesToSend into the SPI data register
  dma_setup_transfer(DMA1, DMA_CH3, &SPI1->regs->DR, DMA_SIZE_8BITS,
                     bytesToSend, DMA_SIZE_8BITS, (DMA_MINC_MODE | DMA_CIRC_MODE | DMA_FROM_MEM));

  // 7. Attach an interrupt to the transfer. Note that we need to add
  // the interrupt flags in step 6 (DMA_TRNS_CMPLT and DMA_TRNS_ERR).
  // Also, we only attach it for one of the transfers since they are
  // going to finish at the same time because they are in sync.
  dma_attach_interrupt(DMA1, DMA_CH2, DMAEvent);

  // 8. Setup the priority for the DMA transfer.
  dma_set_priority(DMA1, DMA_CH2, DMA_PRIORITY_VERY_HIGH);
  dma_set_priority(DMA1, DMA_CH3, DMA_PRIORITY_VERY_HIGH);

  // 9. Setup the number of bytes that we are going to transfer.
  dma_set_num_transfers(DMA1, DMA_CH2, 512);
  dma_set_num_transfers(DMA1, DMA_CH3, 512);

  // 10. Enable DMA to start transmitting. When the transmission
  // finishes the event will be triggered and we will jump to
  // function DMAEvent.
  dma_enable(DMA1, DMA_CH2);
  dma_enable(DMA1, DMA_CH3);
}

void loop(){
}

Creative Commons License