Rightsizing processor performance for today’s DSP applications
July 22, 2022
For rugged DSP [digital signal processing] processor cards, it is important to have a balance between processor performance, memory bandwidth, I/O bandwidth, and ruggedization. Deficiencies in any of these attributes will limit the achievable performance. Due to limited real estate available on 3U OpenVPX boards, designers and users must make tradeoffs on which dimensions to maximize and/or minimize.
Processing requirements for DSP algorithms vary from case to case, but in general, the capability delivered within a card's SWaP [size, weight, and power] budget should be maximized for its typical operating constraints. Excess capability that sits idle gains the user nothing. You don’t put a BOSS 302 engine in a Ford Pinto! Likewise, the performance of the processor must be balanced with the environment it operates in and the I/O bandwidth it can support.
The DSP system designer must ascertain how much processor performance is needed for their application: Is multithreaded or single-threaded performance more important? How many GFLOPS or MIPS are needed? How much memory bandwidth will the processor need? Is memory bandwidth or memory capacity more important? How much I/O bandwidth is needed?
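To illustrate the last of these questions, the I/O bandwidth a sensor stream demands can be roughly estimated from its sample rate, sample width, and channel count. The sketch below is only a back-of-envelope aid; the function name and the example figures are assumptions chosen for illustration and do not describe any particular system.

```python
def ingest_gbps(sample_rate_hz: float, bits_per_sample: int, channels: int) -> float:
    """Raw payload rate in gigabits per second, ignoring protocol overhead."""
    return sample_rate_hz * bits_per_sample * channels / 1e9


# Hypothetical example: 4 channels of 16-bit samples at 250 MS/s each.
rate = ingest_gbps(250e6, 16, 4)
print(f"Required ingest rate: {rate:.1f} Gb/s")
```

Real links lose some capacity to framing and protocol overhead, so an estimate like this is best treated as a lower bound on the bandwidth to provision.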
When evaluating DSP applications, both the processing portion and the amount of data being ingested or generated must be considered. Memory bandwidth requirements, which can vary greatly from application to application, become critical when blocks of data cannot be fully processed in internal cache. Intel server-class products have long had more than two memory banks to support their higher core counts; Intel brought this same capability to the embedded world with the Ice Lake D processor to ensure that enough memory bandwidth is available to feed multiple cores running at higher data rates. The extra memory banks can provide 50% to 100% more memory bandwidth, which is key to fully utilizing the processor cores.
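One common way to reason about this balance is a roofline-style estimate, in which attainable throughput is capped either by the processor's peak compute rate or by memory bandwidth multiplied by the algorithm's arithmetic intensity (FLOPs per byte moved). The roofline model is a general back-of-envelope technique rather than anything specific to this processor, and every number below is a hypothetical placeholder, not a measured or published figure.

```python
def attainable_gflops(peak_gflops: float, mem_bw_gbs: float, flops_per_byte: float) -> float:
    """Roofline bound: the lower of the compute ceiling and the memory ceiling."""
    return min(peak_gflops, mem_bw_gbs * flops_per_byte)


# Hypothetical values, for illustration only.
PEAK_GFLOPS = 500.0   # assumed compute ceiling
BASE_BW_GBS = 40.0    # assumed baseline memory bandwidth
INTENSITY = 2.0       # assumed FLOPs per byte for a memory-bound kernel

for extra in (0.0, 0.5, 1.0):  # 0%, 50%, and 100% more memory bandwidth
    bw = BASE_BW_GBS * (1 + extra)
    gflops = attainable_gflops(PEAK_GFLOPS, bw, INTENSITY)
    print(f"+{extra:.0%} bandwidth: {gflops:.0f} GFLOPS attainable")
```

For a memory-bound kernel such as this assumed one, attainable throughput scales directly with bandwidth, while the compute ceiling goes unused; a compute-bound kernel would see no benefit from the extra bandwidth.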
Ideally, a systems engineer will perform model-based systems engineering (MBSE) analysis before selecting a DSP module. This provides them with detailed knowledge of the parameters and characteristics their DSP engine will need to run a particular ISR/EW [intelligence, surveillance, reconnaissance/electronic warfare] application. As MBSE approaches become increasingly common, systems engineers will arrive at module selection with an ever more precise understanding of their requirements.
Today’s more densely integrated and capable processors bring a commensurate rise in heat dissipation. The price paid for maximum processing power is often more heat than the system can manage, resulting in a wasted investment since the device can’t run at full capacity in rugged applications. Instead of automatically going for the most GFLOPS and MIPS horsepower, it pays to rightsize the DSP module decision. Designers should evaluate whether a device’s performance is overkill for their application, whether there is sufficient and optimized memory and I/O bandwidth available to support the performance they are paying for, and whether there is a way to cool the module at the desired performance level. Otherwise, the processor may throttle (slow its clock speed) and provide much less performance than is expected on paper.
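The effect of throttling on delivered performance can be sketched with a deliberately crude model. It assumes that sustained clock speed, and therefore throughput, scales linearly with the fraction of the device's rated power that the cooling solution can remove. Real processors behave nonlinearly, so this is only an illustration; the function names, device figures, and cooling budget are all hypothetical.

```python
def sustained_fraction(rated_power_w: float, cooling_w: float) -> float:
    """Fraction of rated clock sustainable, assuming power scales linearly with clock."""
    return min(1.0, cooling_w / rated_power_w)


def delivered_gflops(peak_gflops: float, rated_power_w: float, cooling_w: float) -> float:
    """On-paper performance derated by the assumed thermal throttling."""
    return peak_gflops * sustained_fraction(rated_power_w, cooling_w)


COOLING_W = 75.0  # hypothetical heat the chassis can remove per module

# Hypothetical devices: (name, on-paper GFLOPS, rated power in watts)
candidates = [("mid-range", 400.0, 70.0), ("high-end", 600.0, 115.0)]

for name, peak, power in candidates:
    got = delivered_gflops(peak, power, COOLING_W)
    print(f"{name}: {peak:.0f} GFLOPS on paper, about {got:.0f} GFLOPS sustained")
```

Under these assumed numbers, the hotter device delivers no more sustained throughput than the cooler one, which is the rightsizing argument in miniature.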
An example of a DSP module designed to balance the range of variables that a system engineer must consider is Curtiss-Wright’s CHAMP-XD3, which uses the 10-core version of the Intel Ice Lake D processor (Figure 1). The 3U OpenVPX module, which is aligned to the SOSA Technical Standard 1.0, is designed to support near-maximum utilization of the processor across its target temperature range. It’s optimized to take full advantage of the rich set of processing and I/O capabilities built into the Ice Lake D processor. For example, the processor supports the maximum number of memory banks, providing over 50% more bandwidth than prior generations. The board supports the SOSA payload profile, which has a 40 GbE data plane plus up to 16 lanes of PCIe for an ultrawide data path to FPGA and GPGPU cards. While a higher-end version of this processor exists, which tops out at 115 W, using that device in the typical rugged defense environment would likely yield minimal, if any, additional performance because of the intense throttling that would occur.
Selecting the right DSP engine for an application is not as simple as designing in the highest-performance processor available in a 3U OpenVPX form factor. Thermal challenges and power requirements for the highest-end processors can make it difficult for applications to take full advantage of the extra performance that a higher-core-count, but much hotter, device can deliver. In addition, sufficient I/O and memory bandwidth must be available to keep the processor engine well-fed.
Denis Smetana is a senior product manager for FPGA and DSP products for Curtiss-Wright Defense Solutions.
Curtiss-Wright Defense Solutions https://www.curtisswrightds.com/