
EETE OCTOBER 2012

DESIGN & PRODUCTS DIGITAL SIGNAL PROCESSING

Fig. 2: SC3900 complex multiplication with rounding to 20-bit. It enables flexibility that can boost the performance when lower precision is required.

The following code demonstrates the use of the new SC3900 complex dot-product instruction, which did not exist in previous-generation cores. The operation of two MPYCX.2X instructions can be combined using an MPYCXD.PP.S.2X instruction. The mnemonic of this instruction is as follows:
• MPY is the base, which is a multiply operation
• CXD indicates a complex dot product is involved, with 40-bit inputs
• PP specifies a positive operation with 40-bit values
• S indicates the output is saturated to 32-bit
• 2X describes the output-lane size, which is two 40-bit results

The SC3900 also supports complex multiply-accumulate (MAC) operations. Note that in order to perform the same operation on the SC3850, six separate instructions were required.

Complex 32x16 multiplication (i.e., mixed-precision multiplication) is somewhat more complicated. As with 16x16 multiplication, one operand is a 16-bit fractional complex number in a packed complex format. The other operand is a 32-bit fractional complex number, which is placed in two registers: one register holds the 32-bit real portion and the other register holds the 32-bit imaginary portion. The result is placed in two registers in 40-bit precision.

The MACCXM.R.2X instruction performs this complex multiplication, rounding, and accumulation. The mnemonic of this instruction is as follows:
• MAC is the base, which is a multiply-accumulate operation
• CXM indicates a mixed-precision complex multiply is involved, with 40-bit inputs
• R indicates rounding
• 2X describes the output-lane size, which is two 40-bit results

The throughput is four complex MACs/cycle.

Example 3. SC3900 32x16 complex multiply-accumulate code. In this example, b is the 32-bit complex input.
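The arithmetic behind the two instruction families described above can be sketched in plain Python as a bit-accurate-style reference model. This is only an illustration under stated assumptions: the function names are hypothetical, operands are assumed to be Q15 and Q31 fractional values held as (real, imaginary) integer pairs, and the rounding mode (round half up) and bit alignment are assumptions rather than the documented behavior of MPYCXD.PP.S.2X or MACCXM.R.2X.

```python
def sat(x, bits):
    """Saturate an integer to the signed two's-complement range of `bits` bits."""
    hi = (1 << (bits - 1)) - 1
    lo = -(1 << (bits - 1))
    return max(lo, min(hi, x))


def cmul_q15(a, b):
    """Complex multiply of two Q15 pairs (re, im); result is Q31-scaled.

    Q15 * Q15 gives Q30, so one left shift aligns the product to Q31
    (assumed fractional-multiply convention).
    """
    ar, ai = a
    br, bi = b
    re = (ar * br - ai * bi) << 1
    im = (ar * bi + ai * br) << 1
    return re, im


def cdot2_q15_sat32(a0, b0, a1, b1):
    """Two 16x16 complex multiplies summed (a0*b0 + a1*b1), saturated to 32 bits."""
    r0, i0 = cmul_q15(a0, b0)
    r1, i1 = cmul_q15(a1, b1)
    return sat(r0 + r1, 32), sat(i0 + i1, 32)


def cmac_32x16_round(acc, b, c):
    """Mixed-precision complex MAC: acc + b * c with rounding.

    b is a Q31 pair (the 32-bit complex input), c is a Q15 pair.
    The Q46 products are rounded back to Q31 (round half up, an assumption)
    and accumulated with saturation to 40 bits.
    """
    br, bi = b
    cr, ci = c
    re = br * cr - bi * ci          # Q46
    im = br * ci + bi * cr          # Q46
    half_lsb = 1 << 14              # half of the Q31 LSB, expressed in Q46
    re = (re + half_lsb) >> 15      # back to Q31
    im = (im + half_lsb) >> 15
    return sat(acc[0] + re, 40), sat(acc[1] + im, 40)
```

For instance, with 0.5 represented as 0x4000 in Q15 and 1 << 30 in Q31, cmac_32x16_round((0, 0), (1 << 30, 0), (0x4000, 0)) yields (1 << 29, 0), which is 0.25 in Q31. A model of this kind is mainly useful for checking results produced by the hardware instructions against a slow but readable reference.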
Example 1. SC3900 complex dot product code. The SC3900 performs two complex 16x16 multiplies and a complex subtraction in a single instruction.

Example 2. SC3850 complex multiply code. The SC3850 can perform two complex 16x16 multiplies per cycle.

Complex matrix inversion definition
The inverse of a square matrix A, sometimes called a reciprocal matrix, is a matrix A^{-1} such that A A^{-1} = I, where I is the identity matrix.

Several methods exist to invert a matrix, such as Gauss-Jordan elimination, lower-upper (LU) decomposition, the cofactor method, and others.

Gauss-Jordan elimination finds the inverse matrix by solving a system of linear equations. A good explanation of how this algorithm works can be found in Numerical Recipes in C. In this method, the choice of a good pivot is critical. This requires that all values of a specific column be tested against each other. Therefore, it is not well suited to symmetric parallel code.

A description of LU decomposition can be found in Numerical Recipes in C as well. This method decomposes a block matrix into a lower block triangular matrix L and an upper block triangular matrix U. It is useful for matrices larger than 4x4, but it is less efficient for our 4x4 matrix.

Cofactor method
We chose the cofactor method because it is suitable for symmetric parallel code and it is efficient for matrices up to 4x4. This inversion method uses the following formula:
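As a rough illustration of the cofactor approach, assuming the standard adjugate identity A^{-1} = adj(A) / det(A), where adj(A) is the transpose of the cofactor matrix, a floating-point sketch in Python might look as follows. The function names are hypothetical, and this is a readable reference model rather than the fixed-point SC3900 implementation the article describes.

```python
def minor(m, i, j):
    """Return matrix m with row i and column j removed."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]


def det(m):
    """Determinant by cofactor (Laplace) expansion along the first row."""
    if not m:
        return 1                      # determinant of the empty matrix
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j))
               for j in range(len(m)))


def inverse_cofactor(m):
    """Invert a small square (possibly complex) matrix via adj(m) / det(m)."""
    n = len(m)
    d = det(m)
    if d == 0:
        raise ZeroDivisionError("matrix is singular")
    # Entry (i, j) of the inverse is cofactor (j, i) divided by the determinant;
    # swapping the indices performs the transpose that forms the adjugate.
    return [[(-1) ** (i + j) * det(minor(m, j, i)) / d for j in range(n)]
            for i in range(n)]
```

Every entry of the result is computed by the same sequence of multiplies and adds, with no data-dependent pivot search, which is consistent with the article's remark that the method suits symmetric parallel code. The cost of the recursive expansion grows factorially with the matrix size, so a sketch like this is only sensible for small matrices such as the 4x4 case discussed here.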

