DSP libraries boost Intel's Core i7 performance
Story, September 10, 2010
No longer overshadowed by the high volumes of fixed-point applications in digital and mobile telephony, floating-point DSP has moved back to center stage to boost Core i7 performance.
An insatiable demand for more performance continues to drive technology and market growth in all forms of image and video processing for simulation, visualization, gaming, modeling, manufacturing, and medical applications. Military and security applications, typified by urban and guerrilla warfare, require higher resolutions, better image enhancement, and faster threat analysis and results dissemination, specifically from ground-based and airborne electro-optical sensor systems. In addition, there is a demand for continuous performance growth for military sensor systems such as radar, sonar, all forms of signals intelligence, software radio, and multisensor fusion. These applications all require optimal floating-point processing capability, often packaged to meet the most demanding Size, Weight, and Power (SWaP) criteria.
Integrated vector processor
Freescale Semiconductor’s AltiVec broke new ground in the late 1990s for embedded developers by offering floating-point performance comparable to, if not better than, dedicated DSP engines but with the ease of use, performance, and development support of general-purpose Power Architecture processor devices. Recently, Freescale Semiconductor’s 8641D/8640D dual-core devices have become almost de facto standards for complex, multicomputing military sensor-processing systems.
Those advantages, pioneered by Freescale Semiconductor, have focused attention on Intel’s latest multicore embeddable processors with Streaming SIMD Extensions (SSE) for many sensor-development projects. SSE, also adopted by AMD, provides a level of functionality equivalent to AltiVec, with 128-bit vector processing capability integrated into a number of processor families, such as Intel’s Core 2 Duo and Core i7 (Arrandale). The Arrandale Core i7 has dual cores running at up to 2.53 GHz, each with its own SSE unit, L1 data and instruction caches, and a combined instruction-and-data L2 cache. The L3 cache with its DDR2/3 memory interface is common to both cores. SSE provides sixteen 128-bit XMM registers in 64-bit mode (eight in 32-bit mode) and a rich instruction set extension to manipulate floating-point vectors and packed floating-point operands, plus logical and cache-control instructions such as prefetch hints and non-temporal stores. Support for multithreading makes fuller use of the XMM registers and cache resources.
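As a minimal sketch of what 128-bit vector processing means in practice, the C fragment below multiplies two arrays of single-precision values four at a time using SSE intrinsics from `<xmmintrin.h>`. The function name `vmul_f32` is a hypothetical choice for illustration and is not taken from any library mentioned in this article; the code falls back to a plain scalar loop when the compiler does not target SSE.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__SSE__)
#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_loadu_ps, _mm_mul_ps */
#endif

/* Element-wise product: dst[i] = a[i] * b[i] for n single-precision values.
 * vmul_f32 is a hypothetical name used only for this illustration. */
void vmul_f32(float *dst, const float *a, const float *b, size_t n)
{
    assert(dst != NULL && a != NULL && b != NULL);
    size_t i = 0;
#if defined(__SSE__)
    /* One 128-bit XMM register holds four 32-bit floats. */
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);  /* unaligned load of 4 floats */
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(dst + i, _mm_mul_ps(va, vb));
    }
#endif
    /* Scalar tail, and the full fallback when SSE is not targeted. */
    for (; i < n; ++i)
        dst[i] = a[i] * b[i];
}
```

Unaligned loads and stores are used so the sketch works with any buffer; the aligned variants may be preferable when buffers are known to sit on 16-byte boundaries.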
Renewed market commitment
Military projects are characterized by long gestation periods. Once a capability or technology has been proven, it can have a long “shelf” life prior to a rapid deployment cycle to meet an urgent operational need to counter a particular new threat. COTS embedded computing vendors recognize this by offering compatibility and software migration road maps for their products over many generations. A renewed commitment by Intel to this COTS market model for longevity of supply and ease of migration through successive generations is already ensuring that SSE gains a significant market share of new sensor-systems development.
DSP function libraries
In practical terms, military sensor systems such as radar and signals intelligence will continue to employ multiple computing nodes interconnected by high-speed fabrics to achieve the DSP performance needed, whichever processor vendor is used. These systems have complex architectures for data flow and processing throughput that must be supported during development with visualization and modeling tools and DSP function libraries, plus system-level profiling and debugging tools. Typical of these is GE Intelligent Platforms’ AXIS multicomputing development environment, recently enhanced with open-standard Vector Signal and Image Processing Library (VSIPL) Core 1.0-compliant libraries for SSE version 3 and upward. The library offers more than 600 DSP functions called through a common Application Programming Interface (API), with both an instrumented development version and an optimized deployable version for Intel-based SBCs such as the rugged VMEbus VR12 depicted in Figure 1.
Figure 1: The VR12 Core i7-based SBC from GE Intelligent Platforms
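One way a single API can serve both an instrumented development build and an optimized deployable build is to select the behavior at compile time. The sketch below is purely illustrative: `dsp_vadd_f`, `dsp_vadd_stats`, and the `DSP_INSTRUMENTED` macro are hypothetical names, and nothing here is taken from the AXIS environment or the VSIPL specification.

```c
#include <assert.h>
#include <stddef.h>

#ifdef DSP_INSTRUMENTED
/* Development build only: count calls and elements processed
 * (hypothetical bookkeeping, for illustration). */
typedef struct {
    unsigned long calls;
    unsigned long elements;
} dsp_stats;

static dsp_stats dsp_vadd_stats = {0, 0};
#endif

/* r[i] = a[i] + b[i]; the signature is identical in both builds. */
void dsp_vadd_f(const float *a, const float *b, float *r, size_t n)
{
#ifdef DSP_INSTRUMENTED
    assert(a != NULL && b != NULL && r != NULL);  /* argument checking */
    dsp_vadd_stats.calls += 1;
    dsp_vadd_stats.elements += (unsigned long)n;
#endif
    for (size_t i = 0; i < n; ++i)
        r[i] = a[i] + b[i];
}
```

Application code calls `dsp_vadd_f` the same way in either build; compiling with `-DDSP_INSTRUMENTED` adds the checks and counters, while the default build keeps only the arithmetic loop.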
No longer overshadowed by the high volumes of fixed-point applications in digital and mobile telephony, floating-point DSP has moved back to center stage. Imaging and video processing are major growth opportunities, serving the needs of much broader markets.
The successor to SSE has already been announced. Advanced Vector Extensions (AVX) will be available on Intel’s Sandy Bridge multicore processor family and beyond, providing 256-bit SIMD processing. Because each 256-bit register holds twice as many floating-point elements as a 128-bit XMM register (eight single-precision values rather than four), big performance gains over SSE are anticipated, and optimized VSIPL support for AVX will play a large part in achieving these goals. SSE combined with AVX is but one architectural direction designers can adopt for image processing. Companies such as GE Intelligent Platforms have also integrated math libraries within their multicomputing development environments for the 100+ stream processors of CUDA-capable General-Purpose Graphics Processing Units (GPGPUs), rounding out a wealth of support for new DSP development projects.
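To illustrate the wider AVX registers, the following sketch computes y = alpha * x + y eight single-precision values at a time with AVX intrinsics from `<immintrin.h>`. It assumes a compiler option such as `-mavx` to enable the vector path and otherwise uses the scalar loop; `saxpy_f32` is a hypothetical name for this example only.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__AVX__)
#include <immintrin.h>  /* AVX intrinsics: __m256, _mm256_loadu_ps, ... */
#endif

/* Scaled accumulate: y[i] = alpha * x[i] + y[i] for n floats.
 * saxpy_f32 is a hypothetical name used only for this illustration. */
void saxpy_f32(float alpha, const float *x, float *y, size_t n)
{
    assert(x != NULL && y != NULL);
    size_t i = 0;
#if defined(__AVX__)
    /* One 256-bit register holds eight 32-bit floats. */
    const __m256 valpha = _mm256_set1_ps(alpha);  /* broadcast alpha */
    for (; i + 8 <= n; i += 8) {
        __m256 vx = _mm256_loadu_ps(x + i);
        __m256 vy = _mm256_loadu_ps(y + i);
        vy = _mm256_add_ps(_mm256_mul_ps(valpha, vx), vy);
        _mm256_storeu_ps(y + i, vy);
    }
#endif
    /* Scalar tail, and the full fallback when AVX is not enabled. */
    for (; i < n; ++i)
        y[i] = alpha * x[i] + y[i];
}
```

Compared with a 128-bit SSE loop over the same data, the vector loop here needs half as many iterations, which is where the anticipated gain comes from; actual speedup would depend on memory bandwidth and the surrounding code.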
To learn more, e-mail Duncan at [email protected].