Floating Point Fun on Cortex-M Processors

Original link: https://danielmangum.com/posts/floating-point-cortex-m/

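## Navigating Floating Point ABIs on Arm MCUs

This post follows up on an earlier one that used the PSA Crypto API on two microcontrollers, the nRF52840 and the ESP32-S3, and focuses on floating point handling on the Arm Cortex-M4 in the nRF52840. A common linker error comes from mixing different floating point Application Binary Interfaces (ABIs): `soft`, `softfp`, and `hard`.

Arm defines these ABIs according to whether floating point instructions may be used and how arguments are passed to functions (in integer registers or in floating point registers). `hard` passes arguments in FPU registers for speed. `softfp` may use FPU instructions but passes arguments in integer registers. `soft` passes arguments in integer registers and emulates floating point operations in software. Mixing `hard` with `soft` or `softfp` causes linker errors, because the ABI is recorded in attributes in each object file.

The nRF52840 has an FPU, but defaults to the `soft` ABI unless it is explicitly configured with `CONFIG_FPU=y` in Zephyr, which enables `hard` or `softfp`. The post details how to observe the effect of each ABI in compiled code, showing argument passing and instruction usage.

Finally, it explores enabling the FPU dynamically, "on the fly", as an alternative to keeping it always enabled. It highlights the trade-offs and a power optimization use case, but warns that this should not be implemented lightly.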


Original article

In my recent post on the PSA Crypto API, I demonstrated the use of the API on two different MCUs: the nRF52840 and the ESP32-S3. In the case of the former, the ECDSA signature operation was eventually executed in a closed source library that manages communication between the Arm Cortex-M4 processor and the Arm TrustZone CryptoCell 310 security subsystem. Readers who ventured down the rabbit hole of links in the post may have noticed that there are variants of the nrf_cc310_mbedcrypto libraries for hard-float and soft-float. If you have ever hit an error from your linker of the following style, you know exactly why.

ld.bfd: error: X uses VFP register arguments, Y does not
ld.bfd: failed to merge target specific data of file

Arm defines three floating point Application Binary Interface (ABI) options, which are controlled by the -mfloat-abi compiler flag.

  • soft: Soft ABI without FPU hardware: All floating-point operations are handled by the runtime library functions. Values are passed through the integer register bank.
  • softfp: Soft ABI with FPU hardware: This allows the compiled code to generate code that directly accesses the FPU. But, if a calculation needs to use a runtime library function, a soft-float calling convention is used. Values are passed through the integer register bank.
  • hard: Hard ABI: This allows the compiled code to generate code that directly accesses the FPU and to use FPU-specific calling conventions when calling runtime library functions.
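One way to check which of the three options a given translation unit was actually built with is to inspect the compiler's predefined macros. The following is a minimal sketch, not something from the toolchain itself: the function name float_abi_name is hypothetical, and it assumes a GCC or Clang style compiler that defines __arm__ for 32-bit Arm targets along with the ACLE macros __ARM_PCS_VFP (hard-float calling convention) and __ARM_FP (hardware floating point available).

```c
#include <stdio.h>

/*
 * Hypothetical helper: report which floating point ABI this translation
 * unit was compiled for, based on predefined macros.
 * Assumes __arm__, __ARM_PCS_VFP, and __ARM_FP behave as described in ACLE.
 */
static const char *float_abi_name(void)
{
#if !defined(__arm__)
    return "not-arm";   /* e.g. when compiled for a host PC */
#elif defined(__ARM_PCS_VFP)
    return "hard";      /* FP arguments are passed in FP registers */
#elif defined(__ARM_FP)
    return "softfp";    /* FP instructions allowed, integer-register arguments */
#else
    return "soft";      /* no FP instructions; operations go to the runtime library */
#endif
}
```

Calling float_abi_name() from code built with -mfloat-abi=hard, -mfloat-abi=softfp, and -mfloat-abi=soft should, under these assumptions, return a different string in each case. Built for a non-Arm host it simply returns "not-arm".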

Arm, like most Instruction Set Architectures (ISAs), passes arguments to subroutines in general purpose registers (GPRs), specifically r0-r3. When the number or size of arguments exceeds the available GPRs, the remaining arguments are “spilled” to the stack, where they can be accessed by the callee. However, when a processor includes a Floating Point Unit (FPU) (more specifically for Armv7-M processors, the CP10 and CP11 coprocessors), and thus the floating point extension, there is an additional register bank with 32 floating point registers (s0-s31).

Side note: you may see the term Vector Floating Point (VFP) when referring to floating point on Armv7-M processors, such as the Cortex-M4. The reference manual explains why this is the case: “In the ARMv7-A and ARMv7-R architecture profiles, floating point instructions are called VFP instructions and have mnemonics starting with V. Because ARM assembler is highly consistent across architecture versions and profiles, ARMv7-M retains these mnemonics, but normally describes the instructions as floating point instructions, or FP instructions.”

When using the hard ABI, the s0-s15 registers can be used for passing arguments to subroutines. The use of hard also indicates that floating point instructions (load and store, register transfer, data processing) may be used within routines.
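The difference in argument passing can be sketched as a small model. This is a deliberate simplification rather than a statement of the full procedure call standard: it assumes a function whose arguments are all single-precision floats, ignores doubles, structs, and mixed integer and float signatures, and uses hypothetical names (float_abi, float_arg_location) that do not come from any toolchain.

```c
#include <stdio.h>

/* Hypothetical names for illustration only. */
enum float_abi { ABI_SOFT, ABI_SOFTFP, ABI_HARD };

/*
 * Describe where the index-th argument (0-based) of a function taking only
 * single-precision float arguments would be passed in this simplified model.
 * Stack locations are given as byte offsets into the stack argument area.
 */
static void float_arg_location(enum float_abi abi, unsigned index,
                               char *out, size_t out_len)
{
    /* hard: s0-s15 carry arguments; soft and softfp: r0-r3 carry them. */
    unsigned regs = (abi == ABI_HARD) ? 16u : 4u;
    char bank = (abi == ABI_HARD) ? 's' : 'r';

    if (index < regs) {
        snprintf(out, out_len, "%c%u", bank, index);
    } else {
        /* Remaining arguments spill to the stack, 4 bytes per float. */
        snprintf(out, out_len, "stack+%u", (index - regs) * 4u);
    }
}
```

In this model the fifth float argument (index 4) lands in s4 under hard, but is already spilled to the stack under soft and softfp, which is one reason hard can be faster for float-heavy call chains.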

When using softfp, floating point instructions are allowed within routines, but arguments cannot be passed in floating point registers. soft uses the same calling convention as softfp, and is thus compatible, but does not allow for the use of floating point instructions. When floating point operations are performed without support for floating point instructions, they must be emulated in software. When you see the error described at the beginning of this post, you are mixing soft/softfp with hard, which the linker will refuse. It is able to determine the ABI of an object file being linked by looking at the Arm attributes section, which differs for each variant. For example, on the nRF52840, the attributes appear as follows (extracted via readelf).

hard

Attribute Section: aeabi
File Attributes
  Tag_CPU_name: "7E-M"
  Tag_CPU_arch: v7E-M
  Tag_CPU_arch_profile: Microcontroller
  Tag_THUMB_ISA_use: Thumb-2
  Tag_FP_arch: VFPv4-D16
  Tag_ABI_PCS_wchar_t: 4
  Tag_ABI_FP_denormal: Needed
  Tag_ABI_FP_exceptions: Needed
  Tag_ABI_FP_number_model: IEEE 754
  Tag_ABI_align_needed: 8-byte
  Tag_ABI_align_preserved: 8-byte, except leaf SP
  Tag_ABI_enum_size: small
  Tag_ABI_HardFP_use: SP only
  Tag_ABI_VFP_args: VFP registers
  Tag_ABI_optimization_goals: Aggressive Speed
  Tag_CPU_unaligned_access: v6

softfp

Attribute Section: aeabi
File Attributes
  Tag_CPU_name: "7E-M"
  Tag_CPU_arch: v7E-M
  Tag_CPU_arch_profile: Microcontroller
  Tag_THUMB_ISA_use: Thumb-2
  Tag_FP_arch: VFPv4-D16
  Tag_ABI_PCS_wchar_t: 4
  Tag_ABI_FP_rounding: Needed
  Tag_ABI_FP_denormal: Needed
  Tag_ABI_FP_exceptions: Needed
  Tag_ABI_FP_user_exceptions: Needed
  Tag_ABI_FP_number_model: IEEE 754
  Tag_ABI_align_needed: 8-byte
  Tag_ABI_enum_size: small
  Tag_ABI_HardFP_use: SP only
  Tag_ABI_optimization_goals: Aggressive Size
  Tag_CPU_unaligned_access: v6
  Tag_ABI_FP_16bit_format: IEEE 754

soft

Attribute Section: aeabi
File Attributes
  Tag_CPU_name: "7E-M"
  Tag_CPU_arch: v7E-M
  Tag_CPU_arch_profile: Microcontroller
  Tag_THUMB_ISA_use: Thumb-2
  Tag_ABI_PCS_wchar_t: 4
  Tag_ABI_FP_denormal: Needed
  Tag_ABI_FP_exceptions: Needed
  Tag_ABI_FP_number_model: IEEE 754
  Tag_ABI_align_needed: 8-byte
  Tag_ABI_align_preserved: 8-byte, except leaf SP
  Tag_ABI_enum_size: small
  Tag_ABI_optimization_goals: Aggressive Speed
  Tag_CPU_unaligned_access: v6
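Of the attributes listed above, the one most directly tied to the linker error is Tag_ABI_VFP_args, which appears only in the hard listing. The sketch below models the compatibility decision in a simplified way. It is not the linker's actual source: the enum and function names are hypothetical, and the numeric values reflect my understanding of the Arm ABI addenda, under which an absent tag is treated as 0.

```c
#include <stdbool.h>

/* Hypothetical names; values assumed from the Arm ABI addenda. */
enum vfp_args {
    VFP_ARGS_BASE = 0,       /* FP arguments in integer registers (assumed default when the tag is absent) */
    VFP_ARGS_VFP = 1,        /* FP arguments in VFP registers ("VFP registers" in the hard listing) */
    VFP_ARGS_TOOLCHAIN = 2,  /* toolchain-specific convention */
    VFP_ARGS_COMPATIBLE = 3  /* code that works with either convention */
};

/*
 * Simplified model: two object files can be linked if they record the same
 * argument passing convention, or if either one is marked as compatible
 * with both conventions.
 */
static bool vfp_args_can_link(enum vfp_args a, enum vfp_args b)
{
    if (a == VFP_ARGS_COMPATIBLE || b == VFP_ARGS_COMPATIBLE) {
        return true;
    }
    return a == b;
}
```

Under this model, soft and softfp objects (both VFP_ARGS_BASE) link together, while pairing either with a hard object (VFP_ARGS_VFP) is rejected, matching the "X uses VFP register arguments, Y does not" error.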

Floating Point ABIs in Practice