Feature
|
Architecture versions
|
Description
|
More information
|
AArch64
|
Armv8.0-A
Armv9.0-A
|
AArch64 is the 64-bit execution environment for the Arm architecture.
AArch64 provides:
- Large physical and virtual address spaces
- Large register file of 64-bit registers
- Automatic signalling of events for power-efficient, high-performance spinlocks
- Efficient cache management
- Load-Acquire and Store-Release instructions designed for the C++11, C11, and Java memory models
|
Learn the architecture: Guides for A-profile
|
AArch32
|
Armv8.0-A
Armv9.0-A (EL0 only)
|
The 32-bit execution environment for the Arm architecture. Provides compatibility with Armv7-A and earlier.
|
Learn the architecture: Guides for A-profile
|
Virtualization
|
Armv8.0-A
Armv9.0-A
|
Support for hypervisors and virtualization
|
Learn the architecture: AArch64 Virtualization
|
TrustZone
|
Armv8.0-A
Armv9.0-A
|
TrustZone offers an efficient, system-wide approach to security with hardware-enforced isolation built into the CPU.
|
Learn the architecture: TrustZone for AArch64
|
Realm Management Extension (RME)
|
Armv9-A
|
The Realm Management Extension (RME) builds on TrustZone, with the following features:
- Two additional security states
- Two additional physical address spaces
- The ability to dynamically move resources between security states
These features enable the Arm Confidential Compute Architecture (Arm CCA) and Dynamic TrustZone.
|
Arm Confidential Compute Architecture
|
Hardware-accelerated cryptography
|
Armv8.0-8.2-A
Armv9.0-A
|
Provides 3× to 10× better software encryption performance. This is useful for small-granule encryption and decryption, where the data is too small to offload efficiently to a hardware accelerator, for example HTTPS.
|
Learn the architecture: AArch64 Instruction Set Architecture
|
Neon
|
Armv8.0-A
Armv9.0-A
|
Neon technology is a packed SIMD architecture. Neon registers are treated as vectors of elements of the same data type, and Neon instructions operate on multiple elements simultaneously. The technology supports multiple data types, including floating-point and integer types.
|
Neon programmer's guides for Armv8-A
|
Virtualization Host Extension (VHE)
|
Armv8.1-A
Armv9.0-A
|
These enhancements improve the performance of Type 2 hypervisors by reducing the software overhead associated with transitioning between the Host and Guest operating systems. The extensions allow the Host OS to execute at EL2, as opposed to EL1, without substantial modification.
|
Learn the architecture: AArch64 Virtualization
|
Privileged Access Never (PAN)
|
Armv8.1-A
Armv9.0-A
|
PAN allows kernels to prevent privileged accesses to unprivileged memory locations, providing increased robustness.
|
Learn the architecture: AArch64 memory model
|
Statistical Profiling Extension (SPE)
|
Armv8.2-A
Armv9.0-A
|
A sample criterion is set on an instruction or micro-operation basis, and then sampled at regular intervals. Each sample then gathers context associated with that sample into a profiling record, with only one record ever being compiled at any given time. Analyzing large working sets of samples can provide considerable insight into software execution and its associated performance when sampling continuously on systems running large workloads over extended periods of time.
|
Statistical Profiling Extension for Armv8-A
|
Scalable Vector Extension (SVE)
|
Armv8.2-A
|
SVE provides support for SIMD with variable vector lengths. SVE enables a vector-length-agnostic coding style, in which code does not need to be rewritten or recompiled because it adapts dynamically to the implemented vector length. The SVE architecture allows implementations with a vector length of up to 2048 bits, where the vector length must be a multiple of 128 bits. SVE also supports code written for a fixed vector length.
|
SVE programming examples
|
Pointer authentication
|
Armv8.3-A
Armv9.0-A
|
Computer attacks are becoming more sophisticated. Examples include exploit mechanisms such as the use of gadgets in Return-Oriented Programming (ROP) and Jump-Oriented Programming (JOP). To mitigate such exploits, Armv8.3-A introduces a feature that authenticates the contents of a register before it is used as the address for an indirect branch or data reference. For address authentication, the functionality uses the upper bits of a 64-bit address value, which are normally associated with sign extension of the address space. This allows the introduction of a Pointer Authentication Code (PAC) as a new field within the upper bits of the value.
|
Code reuse attacks: the compiler story
|
Nested Virtualization
|
Armv8.3-A
Armv8.4-A
Armv9.0-A
|
There is growing interest in cloud computing, and particular interest in an increasingly common use case in which a user rents a virtual machine from an infrastructure as a service (IaaS) provider. Nested virtualization is an attractive proposition when the workload intended to run on this virtual machine itself includes a hypervisor.
|
Learn the architecture: AArch64 Virtualization
|
Memory Tagging Extension (MTE)
|
Armv8.5-A
Armv9.0-A
|
Memory tagging enables developers to identify memory safety violations in their programs.
|
Memory Tagging Extension: Enhancing memory safety through architecture
Memory Tagging Extension Whitepaper
|
Branch Target Identification (BTI)
|
Armv8.5-A
Armv9.0-A
|
BTI allows software to identify valid targets for indirect branches. BTI complements the support for Pointer authentication, providing a defence against JOP techniques.
|
Code reuse attacks: the compiler story
|
GEneral Matrix Multiply (GEMM)
|
Armv8.6-A
Armv9.1-A
|
Adds new Advanced SIMD (Neon) and SVE instructions to accelerate matrix operations, greatly reducing the number of memory accesses required.
|
Developments in the Arm A-Profile Architecture: Armv8.6-A
|
BFloat16
|
Armv8.6-A
Armv9.1-A
|
Adds support in Advanced SIMD (Neon) and SVE for the BFloat16 (BF16) data type. BF16 has recently emerged as a format tailored specifically to high-performance processing of neural networks.
|
BFloat16 processing for Neural Networks on Armv8-A
|
High precision timers
|
Armv8.6-A
Armv9.1-A
|
The Generic Timer frequency is increased to a new standard of 1 GHz.
|
Arm A-Profile architecture developments 2018: Armv8.5-A
|
64-byte loads and stores
|
Armv8.7-A
Armv9.2-A
|
A growing trend in enterprise systems is the introduction of accelerators that can be accessed using 64-byte atomic loads or stores. These are used to add items to queues and can, in some cases, signal success or failure of the enqueue operation.
|
Arm A-Profile architecture developments 2020
|
Scalable Vector Extension v2 (SVE2)
|
Armv9.0-A
|
SVE2 is a superset of the Armv8-A SVE, with expanded functionality. The SVE2 instruction set adds thorough fixed-point arithmetic support.
|
Arm A-Profile architecture developments 2020
|
Transactional Memory Extension (TME)
|
Armv9.0-A
|
The Transactional Memory Extension brings Hardware Transactional Memory (HTM) support to the Arm architecture. Transactional memory addresses the difficulty of writing highly concurrent, multi-threaded programs: by reducing serialization due to lock contention, it allows coarse-grained, thread-level parallelism to scale better with the number of CPUs.
|
New technologies for the Arm A-profile architecture
|
Branch Record Buffer Extension (BRBE)
|
Armv9.2-A
|
The Branch Record Buffer Extension (BRBE) captures a recent sequence of branches in an easily consumable format. This information can be used for debugging or fed into profiling tools for hot-spot analysis and AutoFDO.
|
Available Q3 - Q4 2021
|
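
The SVE row above states that the vector length is a multiple of 128 bits, up to 2048 bits, and that vector-length-agnostic code adapts to whatever length is implemented. The following is a minimal Python sketch of that idea, not real SVE code: the function names (`valid_sve_vector_lengths`, `lanes_per_vector`, `vla_add`) are hypothetical, and list slicing stands in for the way real SVE hardware handles a partial final vector.

```python
def valid_sve_vector_lengths(max_bits=2048, granule_bits=128):
    """Return every vector length (in bits) allowed by the rule in the table:
    a multiple of 128 bits, up to 2048 bits."""
    return list(range(granule_bits, max_bits + 1, granule_bits))


def lanes_per_vector(vector_bits, element_bits):
    """Number of elements of the given width that fit in one vector."""
    if vector_bits % element_bits != 0:
        raise ValueError("element width must divide the vector length")
    return vector_bits // element_bits


def vla_add(a, b, vector_bits, element_bits=32):
    """Element-wise add written in a vector-length-agnostic style.

    The loop never hard-codes the number of lanes; it asks for it at run
    time, so the same code works for any valid vector length.
    """
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    lanes = lanes_per_vector(vector_bits, element_bits)
    out = []
    for start in range(0, len(a), lanes):
        # The last chunk may be shorter than one full vector.
        # Slicing models that here (an assumption of this sketch).
        chunk_a = a[start:start + lanes]
        chunk_b = b[start:start + lanes]
        out.extend(x + y for x, y in zip(chunk_a, chunk_b))
    return out


if __name__ == "__main__":
    data_a = list(range(10))
    data_b = [10 * x for x in range(10)]
    for bits in (128, 512, 2048):
        print(bits, vla_add(data_a, data_b, bits))
```

The result is the same for every vector length; only the number of loop iterations changes, which is the property that lets vector-length-agnostic code run unmodified on different implementations.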