Skip to content

Conversation

@sfe-SparkFro
Copy link
Contributor

@sfe-SparkFro sfe-SparkFro commented Dec 16, 2025

Summary

Fixes #18471

Testing

Before this change, the reproduction code in #18471 freezes within ~100 loop iterations. After this change, the reproduction code does not freeze (at least, not within 10k loop iterations) an OSError is raised instead of freezing.

Trade-offs and Alternatives

I don't know the original motivation for using DMA channels for large SPI transfers, but one downside of this change is that long interrupts could hold up SPI transfers. I don't know how problematic that really is, but users should be writing their interrupts to be efficient anyways.

If a user really want to use DMA channels for large SPI transfers, they can easily do this with the existing MicroPython API. Something like this should work (I've not tested it myself):

src_data = bytearray(1024)
DATA_REQUEST_INDEX = 24

spi = machine.SPI(0)

dma = rp2.DMA()
ctrl = dma.pack_ctrl(size=0, inc_write=False, treq_sel=DATA_REQUEST_INDEX)

dma.configure(
    read=src_data,
    write=spi,
    count=len(src_data),
    ctrl=ctrl,
    trigger=True
)

An alternative solution could be to enable to the user to select whether they want to use the DMA for large transfers, either when the SPI constructor is called, or when the read/write functions are called.

@github-actions
Copy link

github-actions bot commented Dec 16, 2025

Code size report:

Reference:  mpy-cross/mpconfigport: Explicitly disable native code loading. [8fb848f]
Comparison: rp2/machine_spi: Improve RX overrun handling. [merge of 95a4749]
  mpy-cross:    +0 +0.000% 
   bare-arm:    +0 +0.000% 
minimal x86:    +0 +0.000% 
   unix x64:    +0 +0.000% standard
      stm32:    +0 +0.000% PYBV10
     mimxrt:    +0 +0.000% TEENSY40
        rp2:   +80 +0.009% RPI_PICO_W
       samd:    +0 +0.000% ADAFRUIT_ITSYBITSY_M4_EXPRESS
  qemu rv32:    +0 +0.000% VIRT_RV32

@peterhinch
Copy link
Contributor

Is this a bug or a known limitation? If the bus arbitration for PSRAM can fail irretrievably in the face of concurrent access by two DMA channels, the fault will eventually occur in user code. In this case disabling DMA in the SPI driver is a workround rather than a fix. If, rather than a bug, there is an inherent limit on DMA data rates it should be documented.

@sfe-SparkFro
Copy link
Contributor Author

Force pushed 403b109, which only disables DMA transfers if PSRAM is enabled. @dpgeorge Please have a look. Haven't tested yet, don't have hardware with me right now.

Is this a bug or a known limitation?

The fact that DMA transfers to/from PSRAM can stall other DMA channels is a known limitation. Section 4.4.3 of the RP2350 datasheet states this pretty clearly in the second paragraph:

...QMI serial transfers force lengthy bus stalls on the DMA... stalling the DMA prevents any other active DMA channels from making progress during this time, slowing overall DMA throughput.

However, the current MicroPython SPI implementation results in a bug where the processor can get stuck in an infinite loop, because it does nothing to consider the above limitation. IMO, it should at least do something to guarantee that it can't get stuck in an infinite loop.

In this case disabling DMA in the SPI driver is a workround rather than a fix.

Yes, outright disabling DMA transfers is certainly a workaround. Perhaps an alternative fix could be something like the following?

         dma_start_channel_mask((1u << chan_rx) | (1u << chan_tx));
-        dma_channel_wait_for_finish_blocking(chan_rx);
         dma_channel_wait_for_finish_blocking(chan_tx);

+        // Wait for SPI to finish, and all data to be received by RX DMA
+        while (spi_is_busy(self->spi_inst) || spi_is_readable(self->spi_inst)) {
+        }
+
+        // If RX DMA still has transfers pending, raise an error
+        if (dma_channel_hw_addr(chan_rx)->transfer_count) {
+            mp_raise_OSError(MP_EIO);
+        }
    }

@nickovs
Copy link
Contributor

nickovs commented Dec 30, 2025

Since the cause is a limitation of the hardware, and the bug in the code is the result of the code not properly handling the implications of that hardware limitation, I don't think that killing all DMA support for SPI as a workaround is appropriate.

I would favour your later proposal of having the code raise an error if the RX DMA doesn't complete. That way as long as the user doesn't set up their code in a way that overwhelms the SPRAM bus they can still have the benefits of using DMA for their SPI transfers.

@nickovs
Copy link
Contributor

nickovs commented Dec 30, 2025

@sfe-SparkFro On a related note, I just submitted PR #18622 to add support for the DMA controller's pacing timers. At the moment if you uses the rp2.DMA() class for a memory-to-memory move it will always run at full speed but with this new code you could update your failing example in #18471 to something like this:

...
dma_psram = rp2.DMA()
dma_psram_timer = rp2.DMATimer()
# Limit the transfer rate to 20 million transfers per second
dma_psram_timer.freq = 20_000_000
dma_psram_ctrl = dma_psram.pack_ctrl(size = 0, treq_sel=dma_psram_timer.treq_sel)
...

This way the SRAM to PSRAM transfer will be rate limited and the user can arrange not to swamp the bus.

It would be instructive if you could test this on a board with PSRAM to confirm that it can be used to fix the problem (but I still think the SPI code should be updated to raise an error rather than stall if the RX DMA doesn't complete).

Fixes bug where SPI can freeze if another DMA channel is
tranferring to/from PSRAM.

Signed-off-by: Dryw Wade <[email protected]>
@sfe-SparkFro sfe-SparkFro changed the title rp2/machine_spi: Don't use DMA channels for SPI transfers. rp2/machine_spi: Detect RX overrun when using DMA transfers. Dec 30, 2025
@sfe-SparkFro
Copy link
Contributor Author

See 30fb190. Now this PR continues to use DMA transfers when possible, but can gracefully detect the SPI RX FIFO overrun condition and raise an OSError if that occurs.

I would like to see a solution that allows the user to force software transfers to guarantee this OSError never gets raised. Perhaps it can be added as an argument in the constructor? Eg. machine.SPI(dma=false). Can save for a different PR, let's just get this fix through 🙂

Need to clear the overrun flag.
RX overruns are not a problem if only writing.

Signed-off-by: Dryw Wade <[email protected]>
@sfe-SparkFro
Copy link
Contributor Author

sfe-SparkFro commented Dec 30, 2025

Couple improvements in 95a4749. Forgot to clear the overrun flag, and realized that an OSError doesn't need to be raised if write_only == true since RX data is being ignored anyways. This PR passes all my tests now!

Also, with my test code, I realized RX overruns occur much more frequently when transferring from PSRAM to SRAM, almost every loop iteration:

dma_psram.config(
    read = psram_buffer,
    write = sram_buffer,
    ...
)

Can also easily test SPI read versus write by changing spi.write(sram_buffer) to spi.readinto(sram_buffer), which highlights that the OSError does not get raised with writes (but I confirmed overruns were still happening by adding a mp_printf()).

@sfe-SparkFro
Copy link
Contributor Author

It would be instructive if you could test this on a board with PSRAM to confirm that it can be used to fix the problem

Thanks! Just tested it out. Pacing the PSRAM DMA at 20 million transfers per second does not help (my test code typically freezes on the first loop iteration now that I'm transferring from PSRAM to SRAM). If I drop it to 2 million, then it does help, but does not fix the problem (first run froze after ~600 loop iterations).

Maybe it can be fixed by further reducing the transfer rate, but that would really start throttling the DMA, and I'm not even sure that pacing the PSRAM DMA transfer can be guaranteed to fix this. When the DMA encounters a bus stall during a PSRAM transfer, the amount of time its stalled is variable. I'm guessing it's possible for a single very long bus stall to cause an RX FIFO overrun, in which case, no amount of DMA pacing fixes the problem. Makes it a lot less likely, but not guaranteed. Or at least, I'm guessing that's the case, but even if not, you're sacrificing DMA throughput. Which is why I'd like an option to force software SPI transfers, because it's guaranteed to not have overruns while allowing the PSRAM DMA channel to transfer as fast as the XIP subsystem allows.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

rp2: Max baudrate SPI can freeze if another DMA channel transfers to/from PSRAM

4 participants