ARM7 LPC2214 SPECTRUM ANALYSIS
Interesting project. Would you please tell us a little more about it?
Have you heard of the book "ARM System Developer's Guide"?
If so, refer to section 8.5.1.2 of that book.
Thanks for your attention to my problem. My project is voice recognition using an ARM7 LPC2214. I want to know the algorithm for computing the FFT of the discrete voice signal from the LPC's ADC, and the algorithm for an IIR digital filter, in C or ASM code. The result is then compared with data stored in EEPROM.
Thanks for any advice that helps me complete my project.
HAPPY NEW YEAR
from chat box:
I have read about the FFT and done it in MATLAB, but I can't implement the FFT algorithm on my LPC2214 because I don't know how to use imaginary numbers on my microcontroller. As far as I know, it doesn't support imaginary numbers. Can you help me implement the FFT algorithm on an ARM LPC2xxx? Thanks for any advice.
HAPPY NEW YEAR
[ Edited Sat Dec 29 2007, 11:46 pm ]
This example implements a 16-bit radix-4 FFT for any ARMv4 architecture processor. We
assume that the number of points is N = 4^b. If N is an odd power of two, then you will need
to alter the routine to start with a radix-2 stage, or a radix-8 stage, rather than the radix-4
stage we show.
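A caller may want to check the power-of-four requirement before invoking the routine. The helper below is a minimal sketch; the name is_power_of_four is hypothetical and does not come from the book.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 when n = 4^b for some b >= 0, else 0.
 * A power of two has exactly one bit set; a power of four has that
 * bit in an even position (mask 0x55555555). */
static int is_power_of_four(uint32_t n)
{
    return n != 0u
        && (n & (n - 1u)) == 0u
        && (n & 0x55555555u) != 0u;
}
```

With this check, 16, 64 and 256 are accepted, while 32 and 128 (odd powers of two) are rejected and would need the radix-2 or radix-8 first stage mentioned above.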
The code uses a trick to perform a complex multiply using only three real multiplies. If
a + ib is a complex data item, and c + is is a complex coefficient, then

    (a + ib)(c − is) = [(b − a)s + a(c + s)] + i[(b − a)s + b(c − s)]    (8.49)
    (a + ib)(c + is) = [(a − b)s + a(c − s)] + i[(a − b)s + b(c + s)]    (8.50)

When c + is = e^(2πi/N), these are the complex multiplies required for the forward and
inverse transform radix-4 butterflies, respectively. Given inputs c − s, s, c + s, a, b, you can
calculate either of the above using a subtract, multiply, and two multiply accumulates. In
the coefficient lookup table we store (c − s, s) and calculate c + s on the fly. We can use the
same table for both forward and inverse transforms.
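For readers working in C, the three-multiply trick can be sketched as follows. A complex number is simply a pair of integers (real part, imaginary part), so no special processor support is needed. The type cplx32 and the function names are hypothetical; the arithmetic follows equations (8.49) and (8.50), taking the stored pair (c − s, s) as input. With Q14 coefficients the results carry an extra factor of 2^14, which the caller is assumed to shift out.

```c
#include <stdint.h>

/* Hypothetical complex type: a pair of integers. */
typedef struct { int32_t re, im; } cplx32;

/* (a + ib)(c - is), equation (8.49), from the stored pair (c - s, s). */
static cplx32 cmul3_conj(int32_t a, int32_t b, int32_t c_minus_s, int32_t s)
{
    int32_t c_plus_s = c_minus_s + 2 * s;  /* c + s, computed on the fly */
    int32_t t = (b - a) * s;               /* multiply 1 */
    cplx32 r;
    r.re = t + a * c_plus_s;               /* multiply 2: a*c + b*s */
    r.im = t + b * c_minus_s;              /* multiply 3: b*c - a*s */
    return r;
}

/* (a + ib)(c + is), equation (8.50), from the same stored pair. */
static cplx32 cmul3(int32_t a, int32_t b, int32_t c_minus_s, int32_t s)
{
    int32_t c_plus_s = c_minus_s + 2 * s;
    int32_t t = (a - b) * s;               /* multiply 1 */
    cplx32 r;
    r.re = t + a * c_minus_s;              /* multiply 2: a*c - b*s */
    r.im = t + b * c_plus_s;               /* multiply 3: b*c + a*s */
    return r;
}
```

As a quick check with small integers, a + ib = 3 + 4i and c + is = 5 + 2i give (c − s, s) = (3, 2); cmul3_conj then yields 23 + 14i and cmul3 yields 7 + 26i, matching the ordinary four-multiply products.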
Use the following code to perform the radix-4 transform on ARMv4. The number of
points N must be a power of four. The algorithm actually calculates DFT_N(x)/N, the extra
scaling by N preventing overflow. The algorithm uses the C_FFT4 and load-store macros
defined previously.

        ; Complex conjugate multiply a=(xr+i*xi)*(cr-i*ci)
        ; x = xr + i*xi
        ; w = (cr-ci) + i*ci
        MACRO
        C_MUL9m $a, $x, $w
        SUB     t1, $x._i, $x._r            ; (xi-xr)
        MUL     t0, t1, $w._i               ; (xi-xr)*ci
        ADD     t1, $w._r, $w._i, LSL#1     ; (cr+ci)
        MLA     $a._i, $x._i, $w._r, t0     ; xi*cr-xr*ci
        MLA     $a._r, $x._r, t1, t0        ; xr*cr+xi*ci
        MEND

y       RN 0    ; output complex array y[]
c       RN 0    ; coefficient array
x       RN 1    ; input complex array x[]
N       RN 2    ; number of samples (a power of 2)
S       RN 2    ; the number of blocks
R       RN 3    ; the number of samples in each block
x0_r    RN 4    ; data register (real part)
x0_i    RN 5    ; data register (complex part)
x1_r    RN 6
x1_i    RN 7
x2_r    RN 8
x2_i    RN 9
x3_r    RN 10
x3_i    RN 11
y3_r    RN x3_i
y3_i    RN x3_r
t0      RN 12   ; scratch register
t1      RN 14

        ; void fft_16_arm9m(short *y, short *x, unsigned int N)
fft_16_arm9m
        STMFD   sp!, {r4-r11, lr}
        MOV     t0, #0                      ; bit reversed counter
first_stage_arm9m
        ; first stage load and bit reverse
        ADD     t1, x, t0, LSL#2
        C_LDR   x0, t1, N
        C_LDR   x2, t1, N
        C_LDR   x1, t1, N
        C_LDR   x3, t1, N
        C_FFT4  0
        C_STR   x0, y, #4
        C_STR   x1, y, #4
        C_STR   x2, y, #4
        C_STR   y3, y, #4
        EOR     t0, t0, N, LSR#3            ; increment third bit
        TST     t0, N, LSR#3                ; from the top
        BNE     first_stage_arm9m
        EOR     t0, t0, N, LSR#4            ; increment fourth bit
        TST     t0, N, LSR#4                ; from the top
        BNE     first_stage_arm9m
        MOV     t1, N, LSR#5                ; increment fifth
bit_reversed_count_arm9m                    ; bits downward
        EOR     t0, t0, t1
        TST     t0, t1
        BNE     first_stage_arm9m
        MOVS    t1, t1, LSR#1
        BNE     bit_reversed_count_arm9m
        ; finished the first stage
        SUB     x, y, N, LSL#2              ; x = working buffer
        MOV     R, #16
        MOVS    S, N, LSR#4
        LDMEQFD sp!, {r4-r11, pc}
        ADR     c, fft_table_arm9m
next_stage_arm9m
        ; S = the number of blocks
        ; R = the number of samples in each block
        STMFD   sp!, {x, S}
        ADD     t0, R, R, LSL#1
        ADD     x, x, t0
        SUB     S, S, #1 << 16
next_block_arm9m
        ADD     S, S, R, LSL#(16-2)
next_butterfly_arm9m
        ; S=((number butterflies left-1) << 16)
        ;  + (number of blocks left)
        C_LDR   x0, x, -R
        C_LDR   x3, c, #4
        C_MUL9m x3, x0, x3
        C_LDR   x0, x, -R
        C_LDR   x2, c, #4
        C_MUL9m x2, x0, x2
        C_LDR   x0, x, -R
        C_LDR   x1, c, #4
        C_MUL9m x1, x0, x1
        C_LDR   x0, x, #0
        C_FFT4  14                          ; coefficients are Q14
        C_STR   x0, x, R
        C_STR   x1, x, R
        C_STR   x2, x, R
        C_STR   y3, x, #4
        SUBS    S, S, #1 << 16
        BGE     next_butterfly_arm9m
        ADD     t0, R, R, LSL#1
        ADD     x, x, t0
        SUB     S, S, #1
        MOVS    t1, S, LSL#16
        SUBNE   c, c, t0
        BNE     next_block_arm9m
        LDMFD   sp!, {x, S}
        MOV     R, R, LSL#2                 ; quadruple block size
        MOVS    S, S, LSR#2                 ; quarter number of blocks
        BNE     next_stage_arm9m
        LDMFD   sp!, {r4-r11, pc}

fft_table_arm9m
        ; FFT twiddle table of triplets E(3t), E(t), E(2t)
        ; Where E(t)=(cos(t)-sin(t))+i*sin(t) at Q14
        ; N=16 t=2*PI*k/N for k=0,1,2,..,N/4-1
        DCW 0x4000,0x0000, 0x4000,0x0000, 0x4000,0x0000
        DCW 0xdd5d,0x3b21, 0x22a3,0x187e, 0x0000,0x2d41
        DCW 0xa57e,0x2d41, 0x0000,0x2d41, 0xc000,0x4000
        DCW 0xdd5d,0xe782, 0xdd5d,0x3b21, 0xa57e,0x2d41
        ; N=64 t=2*PI*k/N for k=0,1,2,..,N/4-1
        DCW 0x4000,0x0000, 0x4000,0x0000, 0x4000,0x0000
        DCW 0x2aaa,0x1294, 0x396b,0x0646, 0x3249,0x0c7c
        DCW 0x11a8,0x238e, 0x3249,0x0c7c, 0x22a3,0x187e
        DCW 0xf721,0x3179, 0x2aaa,0x1294, 0x11a8,0x238e
        DCW 0xdd5d,0x3b21, 0x22a3,0x187e, 0x0000,0x2d41
        DCW 0xc695,0x3fb1, 0x1a46,0x1e2b, 0xee58,0x3537
        DCW 0xb4be,0x3ec5, 0x11a8,0x238e, 0xdd5d,0x3b21
        DCW 0xa963,0x3871, 0x08df,0x289a, 0xcdb7,0x3ec5
        DCW 0xa57e,0x2d41, 0x0000,0x2d41, 0xc000,0x4000
        DCW 0xa963,0x1e2b, 0xf721,0x3179, 0xb4be,0x3ec5
        DCW 0xb4be,0x0c7c, 0xee58,0x3537, 0xac61,0x3b21
        DCW 0xc695,0xf9ba, 0xe5ba,0x3871, 0xa73b,0x3537
        DCW 0xdd5d,0xe782, 0xdd5d,0x3b21, 0xa57e,0x2d41
        DCW 0xf721,0xd766, 0xd556,0x3d3f, 0xa73b,0x238e
        DCW 0x11a8,0xcac9, 0xcdb7,0x3ec5, 0xac61,0x187e
        DCW 0x2aaa,0xc2c1, 0xc695,0x3fb1, 0xb4be,0x0c7c
        ; N=256 t=2*PI*k/N for k=0,1,2,..,N/4-1
        ;... continue as necessary ...
The code is in two parts. The first stage does not require any complex multiplies. We
read the data in a bit-reversed order from the source array x, and then we apply the radix-4
butterfly and write to the destination array y. We perform the remaining stages in place in
the destination buffer.
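The first stage needs no complex multiplies because a 4-point DFT only uses the factors 1, −i, −1 and i, and multiplying by ±i just swaps the real and imaginary parts with a sign change. The C sketch below illustrates this. It is an independent illustration rather than a translation of the C_FFT4 macro (which is not shown here), the names are hypothetical, and it omits the scaling that the assembly routine applies.

```c
#include <stdint.h>

/* Hypothetical complex type: a pair of integers. */
typedef struct { int32_t re, im; } cplx32;

/* In-place 4-point forward DFT: v[k] = sum over n of v[n] * (-i)^(n*k).
 * Only additions and subtractions are used; no scaling is applied. */
static void radix4_first_stage(cplx32 v[4])
{
    cplx32 s02 = { v[0].re + v[2].re, v[0].im + v[2].im };
    cplx32 d02 = { v[0].re - v[2].re, v[0].im - v[2].im };
    cplx32 s13 = { v[1].re + v[3].re, v[1].im + v[3].im };
    cplx32 d13 = { v[1].re - v[3].re, v[1].im - v[3].im };

    v[0].re = s02.re + s13.re;  v[0].im = s02.im + s13.im;
    /* d02 - i*d13, using -i*(p + iq) = q - ip */
    v[1].re = d02.re + d13.im;  v[1].im = d02.im - d13.re;
    v[2].re = s02.re - s13.re;  v[2].im = s02.im - s13.im;
    /* d02 + i*d13, using i*(p + iq) = -q + ip */
    v[3].re = d02.re - d13.im;  v[3].im = d02.im + d13.re;
}
```

For the input {4, 4i, 0, 0} this produces {4+4i, 8, 4−4i, 0}.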
It is possible to implement the FFT without bit reversal, by alternating between the
source and destination buffers at each stage. However, this requires more registers in the
general stage loop, and there are none available. The bit-reversed increment is very cheap,
costing less than 1.5 cycles per input sample in total.
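The bit-reversed increment follows the same EOR/TST pattern as the assembly: toggle the top bit, and if it went from 1 to 0, carry into the next bit down. A minimal C sketch of the idea, with a hypothetical function name and a generic top_bit parameter in place of the N-derived shifts used in the listing:

```c
/* Hypothetical helper: advance a bit-reversed counter by one.
 * top_bit is the most significant bit of the counter (e.g. 4 for 3 bits). */
static unsigned bitrev_increment(unsigned count, unsigned top_bit)
{
    unsigned bit = top_bit;

    while (bit != 0u) {
        count ^= bit;          /* toggle the current bit (EOR) */
        if (count & bit) {     /* bit went 0 -> 1: no carry (TST, BNE) */
            break;
        }
        bit >>= 1;             /* bit went 1 -> 0: carry downward */
    }
    return count;
}
```

Starting from 0 with top_bit = 4, repeated calls visit 4, 2, 6, 1, 5, 3, 7 and then wrap back to 0, which is the bit-reversed order for eight items.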