From 3387d751d93673e09ad1933c0bb00e6c36d6e68e Mon Sep 17 00:00:00 2001 From: "Michael (XIAO Xufeng)" Date: Tue, 23 Oct 2018 16:57:32 +0800 Subject: [PATCH] spi: fix the crash when callbacks are not in the IRAM Introduced in 9c23b8e5 and 4f87a62f. To get higher speed, menuconfig options are added to put ISR and other functions into the IRAM. The interrupt flag ESP_INTR_FLAG_IRAM is also mistakenly set when the ISR is put into the IRAM. However callbacks, which are wrote by the user, are called in the master and slave ISR. The user may not be aware of that these callbacks are not disabled during flash operations. Any cache miss during flash operation will cause panic. Essentially IRAM functions and intrrupt flag ESP_INTR_FLAG_IRAM are different, the latter means not disabling the ISR during flash operations. New bus_config flag intr_flags is offered to help set the interrupt attribute, including priority level, SHARED, IRAM (not disabled during flash operations). It introduced a small BREAK to IDFv3.1 (but the same as IDFv3.0) that the user has to manually set IRAM flag now (therefore he's aware of the IRAM thing) to void the ISR being disabled during flash operations. --- components/driver/Kconfig | 6 ++-- components/driver/include/driver/spi_common.h | 5 +++ components/driver/include/driver/spi_master.h | 22 ++++++++++-- components/driver/include/driver/spi_slave.h | 22 ++++++++++-- components/driver/spi_master.c | 9 ++--- components/driver/spi_slave.c | 12 ++++--- .../api-reference/peripherals/spi_master.rst | 35 ++++++++++++++----- 7 files changed, 87 insertions(+), 24 deletions(-) diff --git a/components/driver/Kconfig b/components/driver/Kconfig index bff45a46b..7a091351d 100644 --- a/components/driver/Kconfig +++ b/components/driver/Kconfig @@ -41,8 +41,10 @@ config SPI_MASTER_ISR_IN_IRAM bool "Place SPI master ISR function into IRAM" default y help - Place the SPI master ISR in to IRAM to avoid possibly cache miss, or - being disabled during flash writing access. + Place the SPI master ISR in to IRAM to avoid possible cache miss. + + Also you can forbid the ISR being disabled during flash writing + access, by add ESP_INTR_FLAG_IRAM when initializing the driver. endmenu # SPI Master Configuration diff --git a/components/driver/include/driver/spi_common.h b/components/driver/include/driver/spi_common.h index cc029db2b..5f23bec9d 100644 --- a/components/driver/include/driver/spi_common.h +++ b/components/driver/include/driver/spi_common.h @@ -58,6 +58,11 @@ typedef struct { int quadhd_io_num; ///< GPIO pin for HD (HolD) signal which is used as D3 in 4-bit communication modes, or -1 if not used. int max_transfer_sz; ///< Maximum transfer size, in bytes. Defaults to 4094 if 0. uint32_t flags; ///< Abilities of bus to be checked by the driver. Or-ed value of ``SPICOMMON_BUSFLAG_*`` flags. + int intr_flags; /**< Interrupt flag for the bus to set the priority, and IRAM attribute, see + * ``esp_intr_alloc.h``. Note that the EDGE, INTRDISABLED attribute are ignored + * by the driver. Note that if ESP_INTR_FLAG_IRAM is set, ALL the callbacks of + * the driver, and their callee functions, should be put in the IRAM. + */ } spi_bus_config_t; diff --git a/components/driver/include/driver/spi_master.h b/components/driver/include/driver/spi_master.h index 4e64c404e..a83a8fd90 100644 --- a/components/driver/include/driver/spi_master.h +++ b/components/driver/include/driver/spi_master.h @@ -79,8 +79,26 @@ typedef struct { int spics_io_num; ///< CS GPIO pin for this device, or -1 if not used uint32_t flags; ///< Bitwise OR of SPI_DEVICE_* flags int queue_size; ///< Transaction queue size. This sets how many transactions can be 'in the air' (queued using spi_device_queue_trans but not yet finished using spi_device_get_trans_result) at the same time - transaction_cb_t pre_cb; ///< Callback to be called before a transmission is started. This callback is called within interrupt context. - transaction_cb_t post_cb; ///< Callback to be called after a transmission has completed. This callback is called within interrupt context. + transaction_cb_t pre_cb; /**< Callback to be called before a transmission is started. + * + * This callback is called within interrupt + * context should be in IRAM for best + * performance, see "Transferring Speed" + * section in the SPI Master documentation for + * full details. If not, the callback may crash + * during flash operation when the driver is + * initialized with ESP_INTR_FLAG_IRAM. + */ + transaction_cb_t post_cb; /**< Callback to be called after a transmission has completed. + * + * This callback is called within interrupt + * context should be in IRAM for best + * performance, see "Transferring Speed" + * section in the SPI Master documentation for + * full details. If not, the callback may crash + * during flash operation when the driver is + * initialized with ESP_INTR_FLAG_IRAM. + */ } spi_device_interface_config_t; diff --git a/components/driver/include/driver/spi_slave.h b/components/driver/include/driver/spi_slave.h index 9ad4b9126..546f3b673 100644 --- a/components/driver/include/driver/spi_slave.h +++ b/components/driver/include/driver/spi_slave.h @@ -44,8 +44,26 @@ typedef struct { uint32_t flags; ///< Bitwise OR of SPI_SLAVE_* flags int queue_size; ///< Transaction queue size. This sets how many transactions can be 'in the air' (queued using spi_slave_queue_trans but not yet finished using spi_slave_get_trans_result) at the same time uint8_t mode; ///< SPI mode (0-3) - slave_transaction_cb_t post_setup_cb; ///< Callback called after the SPI registers are loaded with new data - slave_transaction_cb_t post_trans_cb; ///< Callback called after a transaction is done + slave_transaction_cb_t post_setup_cb; /**< Callback called after the SPI registers are loaded with new data. + * + * This callback is called within interrupt + * context should be in IRAM for best + * performance, see "Transferring Speed" + * section in the SPI Master documentation for + * full details. If not, the callback may crash + * during flash operation when the driver is + * initialized with ESP_INTR_FLAG_IRAM. + */ + slave_transaction_cb_t post_trans_cb; /**< Callback called after a transaction is done. + * + * This callback is called within interrupt + * context should be in IRAM for best + * performance, see "Transferring Speed" + * section in the SPI Master documentation for + * full details. If not, the callback may crash + * during flash operation when the driver is + * initialized with ESP_INTR_FLAG_IRAM. + */ } spi_slave_interface_config_t; /** diff --git a/components/driver/spi_master.c b/components/driver/spi_master.c index 87ef444c2..800d65ef3 100644 --- a/components/driver/spi_master.c +++ b/components/driver/spi_master.c @@ -141,6 +141,10 @@ esp_err_t spi_bus_initialize(spi_host_device_t host, const spi_bus_config_t *bus SPI_CHECK(host>=SPI_HOST && host<=VSPI_HOST, "invalid host", ESP_ERR_INVALID_ARG); SPI_CHECK( dma_chan >= 0 && dma_chan <= 2, "invalid dma channel", ESP_ERR_INVALID_ARG ); + SPI_CHECK((bus_config->intr_flags & (ESP_INTR_FLAG_HIGH|ESP_INTR_FLAG_EDGE|ESP_INTR_FLAG_INTRDISABLED))==0, "intr flag not allowed", ESP_ERR_INVALID_ARG); +#ifndef CONFIG_SPI_MASTER_ISR_IN_IRAM + SPI_CHECK((bus_config->intr_flags & ESP_INTR_FLAG_IRAM)==0, "ESP_INTR_FLAG_IRAM should be disabled when CONFIG_SPI_MASTER_ISR_IN_IRAM is not set.", ESP_ERR_INVALID_ARG); +#endif spi_chan_claimed=spicommon_periph_claim(host); SPI_CHECK(spi_chan_claimed, "host already in use", ESP_ERR_INVALID_STATE); @@ -191,10 +195,7 @@ esp_err_t spi_bus_initialize(spi_host_device_t host, const spi_bus_config_t *bus } } - int flags = ESP_INTR_FLAG_INTRDISABLED; -#ifdef CONFIG_SPI_MASTER_ISR_IN_IRAM - flags |= ESP_INTR_FLAG_IRAM; -#endif + int flags = bus_config->intr_flags | ESP_INTR_FLAG_INTRDISABLED; err = esp_intr_alloc(spicommon_irqsource_for_host(host), flags, spi_intr, (void*)spihost[host], &spihost[host]->intr); if (err != ESP_OK) { ret = err; diff --git a/components/driver/spi_slave.c b/components/driver/spi_slave.c index 3d62d8248..332977377 100644 --- a/components/driver/spi_slave.c +++ b/components/driver/spi_slave.c @@ -98,10 +98,11 @@ esp_err_t spi_slave_initialize(spi_host_device_t host, const spi_bus_config_t *b //We only support HSPI/VSPI, period. SPI_CHECK(VALID_HOST(host), "invalid host", ESP_ERR_INVALID_ARG); SPI_CHECK( dma_chan >= 0 && dma_chan <= 2, "invalid dma channel", ESP_ERR_INVALID_ARG ); + SPI_CHECK((bus_config->intr_flags & (ESP_INTR_FLAG_HIGH|ESP_INTR_FLAG_EDGE|ESP_INTR_FLAG_INTRDISABLED))==0, "intr flag not allowed", ESP_ERR_INVALID_ARG); spi_chan_claimed=spicommon_periph_claim(host); SPI_CHECK(spi_chan_claimed, "host already in use", ESP_ERR_INVALID_STATE); - + if ( dma_chan != 0 ) { dma_chan_claimed=spicommon_dma_chan_claim(dma_chan); if ( !dma_chan_claimed ) { @@ -163,7 +164,8 @@ esp_err_t spi_slave_initialize(spi_host_device_t host, const spi_bus_config_t *b goto cleanup; } - err = esp_intr_alloc(spicommon_irqsource_for_host(host), ESP_INTR_FLAG_INTRDISABLED, spi_intr, (void *)spihost[host], &spihost[host]->intr); + int flags = bus_config->intr_flags | ESP_INTR_FLAG_INTRDISABLED; + err = esp_intr_alloc(spicommon_irqsource_for_host(host), flags, spi_intr, (void *)spihost[host], &spihost[host]->intr); if (err != ESP_OK) { ret = err; goto cleanup; @@ -280,9 +282,9 @@ esp_err_t spi_slave_queue_trans(spi_host_device_t host, const spi_slave_transact BaseType_t r; SPI_CHECK(VALID_HOST(host), "invalid host", ESP_ERR_INVALID_ARG); SPI_CHECK(spihost[host], "host not slave", ESP_ERR_INVALID_ARG); - SPI_CHECK(spihost[host]->dma_chan == 0 || trans_desc->tx_buffer==NULL || esp_ptr_dma_capable(trans_desc->tx_buffer), + SPI_CHECK(spihost[host]->dma_chan == 0 || trans_desc->tx_buffer==NULL || esp_ptr_dma_capable(trans_desc->tx_buffer), "txdata not in DMA-capable memory", ESP_ERR_INVALID_ARG); - SPI_CHECK(spihost[host]->dma_chan == 0 || trans_desc->rx_buffer==NULL || esp_ptr_dma_capable(trans_desc->rx_buffer), + SPI_CHECK(spihost[host]->dma_chan == 0 || trans_desc->rx_buffer==NULL || esp_ptr_dma_capable(trans_desc->rx_buffer), "rxdata not in DMA-capable memory", ESP_ERR_INVALID_ARG); SPI_CHECK(trans_desc->length <= spihost[host]->max_transfer_sz * 8, "data transfer > host maximum", ESP_ERR_INVALID_ARG); @@ -370,7 +372,7 @@ static void IRAM_ATTR spi_intr(void *arg) if (host->dma_chan != 0) freeze_cs(host); //when data of cur_trans->length are all sent, the slv_rdata_bit - //will be the length sent-1 (i.e. cur_trans->length-1 ), otherwise + //will be the length sent-1 (i.e. cur_trans->length-1 ), otherwise //the length sent. host->cur_trans->trans_len = host->hw->slv_rd_bit.slv_rdata_bit; if (host->cur_trans->trans_len == host->cur_trans->length - 1) { diff --git a/docs/en/api-reference/peripherals/spi_master.rst b/docs/en/api-reference/peripherals/spi_master.rst index 3a526fbd5..aad19ff22 100644 --- a/docs/en/api-reference/peripherals/spi_master.rst +++ b/docs/en/api-reference/peripherals/spi_master.rst @@ -180,7 +180,8 @@ Speed and Timing Considerations Transferring speed ^^^^^^^^^^^^^^^^^^ -There're two factors limiting the transferring speed: (1) The transaction interval, (2) The SPI clock frequency used. +There're three factors limiting the transferring speed: (1) The transaction interval, (2) The SPI clock frequency used. +(3) The cache miss of SPI functions including callbacks. When large transactions are used, the clock frequency determines the transferring speed; while the interval effects the speed a lot if small transactions are used. @@ -211,6 +212,13 @@ speed a lot if small transactions are used. 2. SPI clock frequency: Each byte transferred takes 8 times of the clock period *8/fspi*. If the clock frequency is too high, some functions may be limited to use. See :ref:`timing_considerations`. + 3. The cache miss: the default config puts only the ISR into the IRAM. + Other SPI related functions including the driver itself and the callback + may suffer from the cache miss and wait for some time while reading code + from the flash. Select :envvar:`CONFIG_SPI_MASTER_IN_IRAM` to put the whole + SPI driver into IRAM, and put the entire callback(s) and its callee + functions into IRAM to prevent this. + For a normal transaction, the overall cost is *20+8n/Fspi[MHz]* [us] for n bytes tranferred in one transaction. Hence the transferring speed is : *n/(20+8n/Fspi)*. Example of transferring speed under 8MHz clock speed: @@ -234,6 +242,15 @@ clock speed: When the length of transaction is short, the cost of transaction interval is really high. Please try to squash data into one transaction if possible to get higher transfer speed. +BTW, the ISR is disabled during flash operation by default. To keep sending +transactions during flash operations, enable +:envvar:`CONFIG_SPI_MASTER_ISR_IN_IRAM` and set :cpp:class:`ESP_INTR_FLAG_IRAM` +in the ``intr_flags`` member of :cpp:class:`spi_bus_config_t`. Then all the +transactions queued before the flash operations will be handled by the ISR +continuously during flash operation. Note that the callback of each devices, +and their callee functions, should be in the IRAM in this case, or your +callback will crash due to cache miss. + .. _timing_considerations: Timing considerations @@ -286,10 +303,10 @@ And if the host only writes, the *dummy bit workaround* is not used and the freq | 40 | 80 | +-------------------+------------------+ -The spi master driver can work even if the *input delay* in the ``spi_device_interface_config_t`` is set to 0. -However, setting a accurate value helps to: (1) calculate the frequency limit in full duplex mode, and (2) compensate -the timing correctly by dummy bits in half duplex mode. You may find the maximum data valid time after the launch edge -of SPI clocks in the AC characteristics chapter of the device specifications, or measure the time on a oscilloscope or +The spi master driver can work even if the *input delay* in the ``spi_device_interface_config_t`` is set to 0. +However, setting a accurate value helps to: (1) calculate the frequency limit in full duplex mode, and (2) compensate +the timing correctly by dummy bits in half duplex mode. You may find the maximum data valid time after the launch edge +of SPI clocks in the AC characteristics chapter of the device specifications, or measure the time on a oscilloscope or logic analyzer. .. wavedrom don't support rendering pdflatex till now(1.3.1), so we use the png here @@ -327,18 +344,18 @@ Some typical delays are shown in the following table: | chip, 12.5ns sample delay included. | +---------------------------------------+ -The MISO path delay(tv), consists of slave *input delay* and master *GPIO matrix delay*, finally determines the -frequency limit, above which the full duplex mode will not work, or dummy bits are used in the half duplex mode. The +The MISO path delay(tv), consists of slave *input delay* and master *GPIO matrix delay*, finally determines the +frequency limit, above which the full duplex mode will not work, or dummy bits are used in the half duplex mode. The frequency limit is: *Freq limit[MHz] = 80 / (floor(MISO delay[ns]/12.5) + 1)* -The figure below shows the relations of frequency limit against the input delay. 2 extra apb clocks should be counted +The figure below shows the relations of frequency limit against the input delay. 2 extra apb clocks should be counted into the MISO delay if the GPIO matrix in the master is used. .. image:: /../_static/spi_master_freq_tv.png -Corresponding frequency limit for different devices with different *input delay* are shown in the following +Corresponding frequency limit for different devices with different *input delay* are shown in the following table: +--------+------------------+----------------------+-------------------+