In this chapter, we will perform true parallel computing. The Neon coprocessor shares a lot of functionality with the FPU from Chapter 12, “Floating-Point Operations,” but can perform several operations at once. For example, a single instruction can carry out four 32-bit floating-point operations at the same time. The type of parallel processing performed by the Neon coprocessor is single instruction, multiple data (SIMD). In SIMD processing, each instruction issued executes on multiple data items in parallel.
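To make the idea of four operations per instruction concrete, here is a minimal sketch in C rather than assembly. It assumes a compiler that provides the `arm_neon.h` intrinsics when targeting a Neon-capable ARM processor, and it falls back to an ordinary loop elsewhere. The function name `add4` is hypothetical and chosen only for illustration.

```c
#include <stddef.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: adds four pairs of 32-bit floats.
 * On a Neon-capable target, the four additions are expressed as
 * one vector add; elsewhere they are done one element at a time. */
void add4(const float a[4], const float b[4], float out[4])
{
#if defined(__ARM_NEON)
    float32x4_t va = vld1q_f32(a);      /* load four floats into one 128-bit register */
    float32x4_t vb = vld1q_f32(b);      /* load the other four floats */
    vst1q_f32(out, vaddq_f32(va, vb));  /* one vector add covers all four lanes */
#else
    /* Scalar fallback: four separate additions. */
    for (size_t i = 0; i < 4; i++)
        out[i] = a[i] + b[i];
#endif
}
```

Calling `add4` with `{1, 2, 3, 4}` and `{10, 20, 30, 40}` fills `out` with the four element-wise sums on either path. The difference is that the Neon path expresses the work as a single vector addition, where the fallback needs four scalar ones.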
We’ll examine how to arrange data, so we can operate on it in ...