Communiation Group S Chen Communiation Group S Chen Stohasti Least-Symbol-Error-Rate Adaptive Motivations Equalization for Pulse-Amplitude Modulation Sheng Cheny, Bernard Mulgrewz and Ljaos Hanzoy Equalization topi is well researhed, and a variety of solutions exists. BUT y Department of Eletronis and Computer Siene University of Southampton, Southampton SO17 1BJ, U.K. sqes.soton.a.uk lhes.soton.a.uk For high-level modulation, MAP/MLSE sequene detetor too omplex Even MAP or Bayesian symbol-detetor too omplex z Department of Eletronis and Eletrial Engineering University of Edinburgh, Edinburgh EH9 3JL, U.K. Bernie.Mulgrewee.ed.a.uk Aordable: linear equalizer and deision feedbak equalizer Classially, MMSE solution is regarded as optimum MMSE would be optimum only if equalizer soft output were Gaussian Presented at ICASSP2002, May 13-17, 2002, Orlando, USA Generally, equalizer soft output has a non-Gaussian distribution Support of U.K. Royal Aademy of Engineering under an international travel grant (IJB/AH/ITG 02-011) is gratefully aknowledged ? Adopting to non-Gaussian nature leads to optimal MSER solution for linear equalizer and DFE 1 2 Communiation Group S Chen Communiation Group Some Previous Works A Toy Example MBER equalization almost as old as adaptive equalizer Yao, IEEE Shamash & Yao, ICC'74 Chen et Yeh & Barry, ICC'97, ICC'98?, IEEE Trans. Information Theory , al. ICC'96 , IEE Chen & Mulgrew, IEE S Chen Two-tap hannel 1:0 + 0:5z 1 with 4-PAM and SNR= 35 dB 1972 Two-tap m = 2 linear equaliser with deision delay d = 0 Pro. Communiations Trans. Communiations Pro. Communiations Mulgrew & Chen, IEEE Symp. 1998 ASSPCC Normalized MMSE: w 2000 T MMSE = [0:9285 0:3713℄ with log10(SER) = 2:7593 log10(SER) 0 -1 -2 -3 -4 -5 -6 -7 -8 0.2 0 1999? MSER ( > 0): 2000, Signal Proessing 2001 w = [0:8957 0:4447℄ with log10(SER) = 7:1566 T MSER -0.2 -0.2 0 -0.4 0.2 0.4 w[0] -0.6 0.6 w[1] -0.8 0.8 1 -1 MSER solutions form a half line, origin is singular point ?: for multi-level PAM shemes 3 4 Communiation Group S Chen PAM Channel Model Communiation Group S Chen Express equaliser output y(k) = w (r(k) + n(k)) = y(k) + e(k) Channel of length n T h X1 n r(k) = h ? e(k): Gaussian with zero mean and variane w w2 4 ? y(k) 2 Y = fy = w r ; 1 q N g, whih an be divided into L subsets T h s(k n i) + n(k) i q i=0 4 s(k) 2 S = fs = 2l l L Y 1; 1 l Lg q s s q l T d) + d X 0 1 m+n s(k 2 i) + e(k) i 6 h i =d m 1 Optimal deision making ℄T , and deision delay d r(k) = r(k) + n(k) = Hs(k) + n(k) As s(k) 2 fs ; 1 q N g where N = L , 4 r(k) 2 R = fr = Hs ; 1 q N g m+n h 4 = fyq y(k) = s(k T T s T y(k) = w r(k) m + 1)℄ , w = [w0 w q 2 Yjs(k d) = s g; 1 l L Let ombined impulse response = w H = [ ℄. Then l Linear equaliser of order m r(k) = [r(k) r(k T s^(k 1 q 8 s; > > < s1; d) = > > : if y (k) (s1 + 1)d ; if (sl 1)d < y (k) (sl + 1)d for l = 2; L 1; if y (k) > (sL 1)d : l s ; L s 5 Communiation Group S Chen 6 Communiation Group S Chen Two Useful Properties Shifting: Y +1 = Y l l PDF of y (k) + 2d Symmetry: distribution of Y l cd s1 SER Expression p (x) = p is symmetri around dsl. ... ... cd sl cd (sl −1) y cd sL y 1 2n p 0 2 1 ( ) 1 XX C B x y exp A N 2 2 w w ww T s l Nsb L i n l=1 i=1 where Nsb = Ns=L is number of points in Yl and yi(l) 2 Yl. Utilizing shifting and symmetri properties, SER of equaliser w is: X Q(g P (w) = N =1 Nsb cd (sl+1) l;i E sb ? For linear equaliser to work, Y , 1 l L, must be linearly separable This is not guaranteed ? In DFE, linear separability is guaranteed g 7 l;i w ( )= w ( )) i where Q is usual Q-funtion, = 2(L l T 1)=L, and y( ) (s p w w l d i n l 1) T 8 Communiation Group S Chen Communiation Group S Chen MSER Solution Blok Adaptation MSER solution is dened as: w MSER Identify hannel ! P (w) ! optimisation Alternatively, kernel density or Parzen window estimate approah E w = arg min PE ( ) w w An estimated PDF of py (x) Gradient of PE ( ) 1 p 2 w w N rP (w) = p E T n ( yi (s (l ) d sb X N ( yi exp i=1 1)) l sb (l) (s 1)) 2 2 w w d n w r (l ) ! 2 1)hd + (sl l T p^ (x) = y T n X exp k=1 n l (x y (k))2 22nwT w K ! K : sample length, and : radius parameter. From p^ (x), estimated SER ! ww Computation is on single subset Y , and further simpliation by using Y with s = 1 Use simplied onjugated gradient algorithm with reseting searh diretion to negative i T p2 pw w K1 1 l l P^ (w) = E T X Q(^ g (w)) K =1 K k k where g^ (w) = gradient every I iterations As SER is invariant to a positive saling of w, it is omputationally advantageous to normalize weight vetor to w w = 1. y y(k) ^ (s(k p w d w ^ , and h ^ an estimate for the d-th olumn h =w h k ^ d T d d) 1) T n d d of H 9 Communiation Group S Chen Sample-by-Sample Adaptation: LSER Single-sample estimate of py (x) p^ (x; k) = y 2 T n n (y (k) n r(k) (y (k) ^ (k)(s(k d ^ (s(k 22 d d) 1))2 ! 1))w(k) (s(k d) (x 1 exp p~y (x; k) = p 2n T n d) S Chen Single-sample estimate of py (x) ! T w(k + 1) = w(k) + p2 exp Communiation Group Sample-by-Sample Adaptation: ALSER p2 1pw w exp (x2 wy(kw)) With a re-saling after eah update to ensure w w = 1, and using instantaneous stohasti gradient, ! LSER: 2 10 Using instantaneous stohasti gradient, w(k + 1) = w(k) + p2 r(k) ^ d (k) 1)h y(k))2 22 ! n ! ALSER: (y (k) exp n ^ (s(k 22 d n (s(k d) ^ d(k) 1)h d) 1))2 ! ? No need for normalization to simplify omplexity w(k + 1) = pw (kw(+k 1)+w1)(k + 1) ? Although using rather than n T n pw w appears to involve more approximation, ALSER T seems to work well | not restrit to unit length makes it easier to onverge to a MSER Adaptive gain and width n need to be set appropriately 11 12 Communiation Group S Chen Communiation Group ? Feedbak lter oeÆients do not disappear. They have been set to their optimal values. As s (k) 2 fs ; 1 q N g with N = L +1 Extension to DFE f \Linear-ombiner" DFE: T b h h b n s b 1 2 f q f l q l Lower bound SER for DFE T w under assumption of orret symbol feedbak P (w) = b E Dene translated observation spae r0 (k) =4 r(k) H ^s (k) = ~r(k) + n(k) 2 l ? Y~ are always linearly separable. All results of linear equaliser are appliable. n )℄ , b X Q(~ g (w)) N =1 N g~ (w) = b DFE beomes a \linear equaliser": T 0 r (k) = y~(k) + e(k) y~ (l) i 2 Y~ , and N l f sb f sb l;i f sb l;i y(k) = w f g whih an be partitioned into L subsets ~ s(k d) = s g; 1 l L Y~ =4 fy~ 2 Yj T q h s (k) = [s(k d 1) s(k d r(k) = H s (k) + H ^s (k) + n(k) N g 4 y~(k) 2 Y~ = fy~ = w ~r ; 1 q N T b T Under assumption ^b (k) = d f 4 d 1) s^(k d n )℄ and b = [b1 b ℄ Choose d = n 1, m = n and n = n 1 Dene s (k) = [s(k) s(k d)℄ and partition H = [H1 j H2℄ where ^b (k) = [^ s(k f f T b s f;q ~ = f~rq = H1sf;q ; 1 q ~r(k) 2 R y(k) = w r(k) + b ^s (k) T S Chen y~( ) l i i (s p w w d l 1) T n = Nf =L = Ld is number of points in Y~l 13 Communiation Group S Chen 14 Communiation Group S Chen Distribution of Subset Y~5 (s5 = 1), 64 points, SNR=34 dB An 8-PAM DFE Example Weight vetor has been normalized to a unit length, a point plotted as a unit impulse. 0 0 :3 + 1 :0 z 1 with 8-PAM 0 :3 z DFE: m = 3, d = 2, n b 2 =2 decision threshold 1.5 (a) MMSE -4 Distribution Channel: log10(Symbol Error Rate) -2 Comparison 2 MMSE MSER 1 0.5 -6 0 0 -8 0.5 1 1.5 2 1.5 2 y(k) -10 2 -12 1.5 decision threshold (b) MSER -14 Distribution Lower-Bound SER 1 0.5 -16 20 25 30 35 SNR (dB) 40 45 15 0 0 0.5 1 y(k) 16 Communiation Group S Chen Conditional PDF given s(k d) = 1, SNR=34 dB normalized w = [ 0:0578 0:2085 0:9763℄, w =[ T 0:2365 0:7946 0:5592℄ T MMSE MSER Communiation Group Learning Curves of LSER Averaged Over 100 Runs, SNR=34 dB Initial weight: (a) decision threshold 1 0.5 0 0.5 1 1 y 1.5 2 2.5 conditional pdf , (b) [ 0:01 0:01 0:01℄T Weight normalization applied 1e+0 1e-1 MMSE 1e-5 1e-6 1e-7 1e-8 MSER decision threshold decision direct 1e-2 1e-3 training 1e-4 1e-6 1e-7 MSER 1e-9 0 0.6 0.4 2000 4000 6000 Sample 8000 10000 0 (a) 0.2 0 -0.5 0 0.5 1 1.5 2 200 400 600 Sample Initial value is ritial for onvergene, MMSE not neessarily good initial hoie Communiation Group S Chen Learning Curves of ALSER Averaged Over 100 Runs, SNR=34 dB MMSE , (b) [ 0:01 0:01 0:01℄T 1e-4 Weight normalization not applied 1e-1 Symbol Error Rate Symbol Error Rate MMSE 1e-6 1e-7 1e-8 decision direct 1e-2 1e-9 training 1e-4 2000 4000 6000 Sample 8000 10000 MMSE 1e-5 1e-6 1e-7 MSER 1e-9 0 Communiation Group 0 200 (a) 400 600 Sample S Chen Only ZF, equaliser output is Gaussian with noise enhanement MMSE generally non-optimal and tries to t parameters to non-Gaussian PDF in a way so that it looks as losely as possible to a Gaussian one Non-Gaussian approah leads naturally to MSER For high-level PAM modulation shemes, MSER equalisation solution has being derived Eetive sample-by-sample adaptation has been developed Unlike MSE surfae whih is quadrati, SER surfae is highly omplex Initial equaliser weight values an ritially inuene onvergene speed ALSER is partiular promising: simpler omputation 1e-3 1e-8 MSER 18 Conlusions 1e+0 1e-5 1000 (b) 17 w 800 In (a) training and deision direted indistinguishable, in (b) dashed urve: after 200-sample training, swithed to deision-direted with s^(k d) substituting s(k d) 2.5 y Initial weight: (a) MMSE 1e-5 1e-8 1e-9 0.8 (b) MSER MMSE 1e-4 1.5 0 -0.5 w Symbol Error Rate (a) MMSE decision threshold 2 Symbol Error Rate conditional pdf 3 2.5 S Chen 800 1000 (b) In (a) training and deision direted indistinguishable, in (b) dashed urve: after 200-sample training, swithed to deision-direted with s^(k d) substituting s(k d) Compared with LSER, no performane degradation, muh simpler 19 20