
EETE SEP 2014

extreme of the parallel spectrum, any failure to extract maximum parallelism is more crippling than on other devices. The OpenCL standard solves many of these problems by allowing the programmer to explicitly specify and control parallelism. It matches the highly parallel nature of FPGAs more naturally than do sequential programs described in pure C.
OpenCL applications consist of two parts. The OpenCL host program is a pure software routine written in standard C/C++ that runs on any sort of microprocessor. That processor may be, for example, an embedded soft processor in an FPGA, a hard ARM processor, or an external x86 processor. At a certain point during the execution of this host routine, there is likely to be a function that is computationally expensive and can benefit from acceleration on a more parallel device: a CPU, GPU, FPGA, etc. This function to be accelerated is referred to as an OpenCL kernel. Kernels are written in standard C; however, they are annotated with constructs to specify parallelism and memory hierarchy.
The example shown in figure 2 performs the vector addition of two arrays, a and b, and writes the results to an output array, answer. Parallel threads operate on each element of the vector, so the result is computed much more quickly when the kernel is accelerated by a device that offers massive amounts of fine-grained parallelism, such as an FPGA.
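Figure 2 itself is not reproduced in this text, so the following is only a minimal sketch of what such a vector-addition kernel might look like, not the article's own listing. The kernel name vector_add and the helper names vector_add_work_item and launch_vector_add are hypothetical. The OpenCL C source is held in a string, as a host program would typically hold it, and a plain-C stand-in runs the same per-element work sequentially so the idea can be tried without an OpenCL runtime.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed OpenCL C source for the kernel (hypothetical name "vector_add").
   __kernel and __global are the annotations for parallelism and memory
   hierarchy; get_global_id(0) returns the index of the current work-item. */
const char *const kernel_source =
    "__kernel void vector_add(__global const float *a,\n"
    "                         __global const float *b,\n"
    "                         __global float *answer)\n"
    "{\n"
    "    int gid = get_global_id(0);\n"
    "    answer[gid] = a[gid] + b[gid];\n"
    "}\n";

/* Plain-C stand-in for one work-item; gid plays the role of get_global_id(0). */
void vector_add_work_item(const float *a, const float *b,
                          float *answer, size_t gid)
{
    answer[gid] = a[gid] + b[gid];
}

/* Sequential stand-in for launching n work-items. On a parallel device the
   iterations of this loop would run concurrently, one per vector element. */
void launch_vector_add(const float *a, const float *b,
                       float *answer, size_t n)
{
    assert(a != NULL && b != NULL && answer != NULL);
    for (size_t gid = 0; gid < n; ++gid) {
        vector_add_work_item(a, b, answer, gid);
    }
}
```

Because each work-item touches only its own element, no iteration depends on another, which is what lets a device execute them all in parallel.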
The host program has access to standard OpenCL APIs that allow it to transfer data to the FPGA, invoke the kernel on the FPGA, and return the resulting data. In FPGAs, kernel functions can be transformed into dedicated, deeply pipelined hardware circuits that are inherently multithreaded using the concept of pipeline parallelism. Each of these pipelines can be replicated many times to provide even more parallelism than is possible with a single pipeline.
Implementing the OpenCL Standard on an FPGA
Creating FPGA designs from an OpenCL description offers several advantages over traditional methodologies based on HDL design. Development for software-programmable devices typically follows the flow of conceiving an idea, coding the algorithm in a high-level language such as C, and then using an automatic compiler to create the instruction stream. The Altera SDK for OpenCL provides a design environment to easily implement OpenCL applications on FPGAs (see figure 3).
Fig. 3: Altera SDK for OpenCL overview.
This approach can be contrasted with traditional FPGA-based design methodologies, which require the designer to create cycle-by-cycle descriptions of the hardware used to implement the algorithm. The traditional flow involves creating datapaths, building state machines to control those datapaths, connecting to low-level IP cores using system-level tools, and handling timing closure, since external interfaces impose fixed constraints that must be met. The Altera SDK for OpenCL performs all of these steps automatically, allowing designers to focus on defining their algorithm rather than on the tedious details of hardware design. Designing in this way also makes it easy to migrate to new FPGAs that offer better performance and higher capacities, because the OpenCL compiler will transform the same high-level description into pipelines that take advantage of the new devices.
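To put the pipeline parallelism and pipeline replication described above in rough numbers, here is an idealized cycle-count sketch. It assumes a pipeline that accepts one new work-item per clock cycle with no stalls, and the function names and figures are illustrative assumptions rather than measurements of any real device or of the Altera compiler's output.

```c
#include <assert.h>

/* Cycles to process n work-items strictly one after another, when each
   work-item occupies the datapath for `depth` cycles. */
unsigned long sequential_cycles(unsigned long n, unsigned long depth)
{
    return n * depth;
}

/* Cycles for `copies` replicated pipelines of `depth` stages each.
   Idealized assumption: every pipeline accepts one new work-item per cycle,
   and the work is split evenly across the copies. */
unsigned long pipelined_cycles(unsigned long n, unsigned long depth,
                               unsigned long copies)
{
    assert(copies > 0);
    if (n == 0) {
        return 0;
    }
    /* Ceiling division: work-items handled by the busiest pipeline. */
    unsigned long per_pipeline = (n + copies - 1) / copies;
    /* The first result appears after `depth` cycles; one more follows
       on each subsequent cycle. */
    return depth + per_pipeline - 1;
}
```

Under these assumptions, 1000 work-items through a 10-stage datapath take 10000 cycles sequentially, 1009 cycles in a single pipeline, and 259 cycles across four replicated pipelines.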
Using the OpenCL standard on an FPGA may offer significantly higher performance at much lower power than is available today from other hardware architectures (CPUs, GPUs, etc.). In addition, an FPGA-based heterogeneous system (CPU + FPGA) using the OpenCL standard has a significant time-to-market advantage over traditional FPGA development using lower-level hardware description languages (HDLs) such as Verilog or VHDL.
www.electronics-eetimes.com Electronic Engineering Times Europe September 2014

