ORSAY
N° d’ordre : 7774
UNIVERSITÉ PARIS–SUD
THÈSE
présentée pour obtenir
LE GRADE DE DOCTEUR EN SCIENCES
DE L’UNIVERSITÉ PARIS XI ORSAY
Spécialité : Automatique et Traitement du Signal
par
Mohamed SAHMOUDI
PROCESSUS ALPHA-STABLES POUR LA SÉPARATION
ET L’ESTIMATION ROBUSTES DES SIGNAUX
NON-GAUSSIENS ET/OU NON-STATIONNAIRES
Soutenue le 13 décembre 2004 devant le jury composé de :
Rapporteurs :
Eric Moreau, Professeur, Université de Toulon
Jean-Yves Tourneret, Professeur, INP, Toulouse
Examinateurs :
Jean-Pierre Delmas, Professeur, INT, Evry
Ali M.-Djafari, Directeur de Recherche, LSS, CNRS
Jean Christophe Pesquet, Professeur, Université Marne la Vallée
Encadrant :
Karim Abed-Meraim, Enseignant-Chercheur, Telecom Paris
Directeur :
Messaoud Benidir, Professeur, Université Paris XI Orsay
EN TON NOM ET POUR TOI MON DIEU,
L'OMNISCIENT ET L'OMNIPOTENT
SBD sur toi Ahmad, mon modèle idéal ...
À ma famille,
À toi ”Knikina”... ♥
REMERCIEMENTS
J’aime autant le chemin parcouru que l’arrivée au but ; je considère ce manuscrit comme un mémoire de thèse de doctorat, mais aussi comme une belle histoire
à raconter, une histoire d’idées et de personnes que j’espère vous faire un peu
connaître.
Je suis profondément reconnaissant aux chercheurs qui ont partagé avec moi,
non seulement les résultats de leurs travaux, mais aussi leur contexte humain.
Faire justice à tous ceux qui ont contribué à l’élaboration de ma thèse est particulièrement délicat. Je n’ai pas pu, dans ce cadre limité, mentionner tous les
chercheurs qui font partie de cette histoire et qui méritent d’y trouver leur nom.
J’espère qu’ils me le pardonneront.
Je n’aurais jamais pu accomplir ce travail sans aide. Et par ordre chronologique :
Je tiens d’abord à remercier toute ma famille qui a supporté toutes les difficultés morales et matérielles pour me soutenir tout au long de mes études
supérieures. Mon père Mohammadine, ma mère Habiba, ma grande soeur Fatima et son petit Fouad, ma petite soeur Saida et mon petit frère Hafid, je les
garde bien au chaud, dans mon coeur. Ils savent combien ils comptent pour moi...
Je suis aussi particulièrement redevable à mon oncle Elhoucine Sahmoudi. Sa
confiance et son soutien m’ont beaucoup aidé à démarrer mes études de troisième
cycle.
Je remercie M. Paul Deheuvels, directeur du laboratoire de statistiques LSTA
de l’université Pierre et Marie Curie - Paris 6, et tous les enseignants du DEA de
statistique de Paris VI pour leur qualité d’enseignement et leur encadrement.
Je suis aussi redevable à Hervé Monod, chercheur au laboratoire de Biométrie
de l’INRA de Jouy-en-Josas, pour m’avoir encadré pendant mon stage de DEA
et permis de réaliser de belles applications de la théorie dans son domaine d’Agronomie. Également, je remercie le professeur Y. Kutoyants de l’université du
Mans pour son encadrement lors de mon mémoire théorique du DEA sous sa
direction.
Je tiens à saluer toute ”la bande” d’Antony, mes amis de la résidence universitaire Jean-Zay, qui ont constitué ma deuxième famille pendant 4 ans. En
particulier, Abdelillah Sahmoudi, Zirari et sa petite famille, Ajinou, Sabri, Elhattab, Rekik, Eljazouli, Halmi, Sbai, Brich, Bajdouri, ...
Je remercie également mon directeur de thèse Prof. Messaoud Benidir, Professeur à l’université Paris-Sud d’Orsay, pour m’avoir donné l’occasion d’entrer
dans le monde fascinant du traitement du signal. Sa confiance et son soutien
m’ont beaucoup aidé à accomplir ce travail. Il a su guider ce travail avec sagesse,
tout en me laissant une grande liberté.
Je tiens à témoigner publiquement à Dr. Karim Abed-Meraim, mon directeur
scientifique de recherche, toute la reconnaissance que je lui dois. Ce dernier a
suscité, développé, puis accompagné mes premiers pas dans le domaine du traitement du signal avec une grande patience et avec une pédagogie extraordinaire.
Le soutien moral, matériel et intellectuel de mon encadrant Dr. K. Abed-Meraim,
fut essentiel.
Non seulement il m’a fourni une aide indispensable à l’avancement de mes travaux de recherches, mais aussi, quand j’étais dans certaine situation familiale ou
financière délicate, il a su, avec un instinct infaillible, localiser mon laxisme et m’a
aidé avec ses sages conseils prodigués à reprendre le chemin. Grand professeur et
source inépuisable de nouvelles idées, il restera mon mentor,....
Mes remerciements s’adressent aussi à M. Henri Maitre, chef du département
TSI de l’ENST, de m’avoir accepté au sein de son département.
Aux membres du TREX Electronique de l’Ecole Polytechnique qui m’ont
accueilli comme moniteur, plus particulièrement Yvan Bonnassieux et Stéphane
Mallat avec qui j’ai partagé le grand plaisir d’enseigner à l’X.
Aux membres du CEREMADE de l’université Dauphine qui m’ont accueilli
comme ATER, plus particulièrement M. Bellec, C. Pardoux, C. Robert avec qui
j’ai partagé mon goût pour l’enseignement des statistiques et des mathématiques.
Je remercie sincèrement les professeurs Eric Moreau et Jean-Yves Tourneret
d’avoir accepté la lourde tâche de rapporteurs malgré le temps très restreint que
je leur ai laissé. Par leurs questions et remarques constructives, ils m’ont été
d’une aide précieuse et m’ont permis d’améliorer de manière significative certaines parties de mon manuscrit.
J’aimerais également remercier les professeurs Jean-Pierre Delmas, Ali Mohammad-Djafari et Jean Christophe Pesquet pour avoir accepté de juger ces quelques
années de travail en participant au jury de ma thèse et pour l’intérêt qu’ils ont
bien voulu porter à mon travail.
D’autres chercheurs ont répondu à ma demande de coopération avec une
générosité extraordinaire. B. Boashash (Australie), B. Barkat (Singapore), A.
Belouchrani (Algérie) et M. Taqqu (Boston, USA) pendant leurs séjours sabbatiques au département TSI de l’ENST. L.J. Stanković (Montenegro), A. Hero
(USA), J.-F. Cardoso (France) et J. Chambers (UK) dans différentes occasions.
Ils ont partagé avec moi leur érudition. J’admire et j’apprécie non seulement
leur compétence professionnelle, mais aussi l’ingéniosité qu’ils ont mise en œuvre
pour m’expliquer certains concepts techniques et la courtoisie avec laquelle ils
ont épargné mon amour-propre de la recherche scientifique.
Je remercie également Philippe Ciblat de Comelec-ENST, Marc Lavielle
et Estelle Kuhn de l’équipe de modélisation statistique de Paris-Sud pour les nombreuses discussions scientifiques que nous avons eues lors de nos réunions dans le
cadre du projet MathSTIC.
Aux membres du LSS de Supélec et TSI de Telecom Paris et particulièrement
à Naji, Gazzah, Snoussi, Sayadi, Belkacemi, Djeddi, Khanfouci, Mohammadpour,
Hallouli, Thomas, Mouhouche, Djalil, Trung, Berriche, Souidene et Robert, j’exprime ma plus grande gratitude. L’ambiance cosmopolite qui caractérise les deux
laboratoires m’a donné envie de suivre la recherche sans frontières...
Le plaisir que j’ai eu à écrire ce rapport est largement dû à la bonté de plusieurs personnes et je ne saurais assez les remercier...
Je garde le meilleur pour la fin, ma douce et tendre Nassera. Tu m’as redonné
confiance au moment où j’en avais le plus besoin, tu m’as permis de continuer ce
travail sans jamais abandonner, tu as supporté avec une grande sagesse et une
grande patience que je travaille le week-end et que je rentre tard le soir. Pour tout
cela et bien plus encore, je ne te remercierai jamais assez... un grand merci à toi
”Knikina” !
L’auteur
Mohamed Sahmoudi
Table des matières

Dédicaces
Remerciements
Table des Matières
Liste des Figures
Liste des Tableaux
Résumé
Publications de l’Auteur
Notations et Abréviations

1 Introduction
  1.1 Motivations
    1.1.1 Non-gaussianité
    1.1.2 Non-stationnarité
    1.1.3 Robustesse
  1.2 Position du Problème
    1.2.1 Séparation de sources impulsives à variance infinie
    1.2.2 Estimation de signaux FM multicomposantes dans un environnement impulsif
  1.3 Objectifs et Contributions
  1.4 Organisation du Document

I Outils pour le Traitement des Signaux non-Gaussiens et/ou non-Stationnaires

2 Distributions Non-Gaussiennes à Queues Lourdes
  2.1 Bref Historique
  2.2 Lois Stables Univariées
    2.2.1 Lois indéfiniment divisibles
    2.2.2 Deux définitions équivalentes des distributions α-stables
    2.2.3 Stabilité de quelques lois usuelles
    2.2.4 Propriétés des lois stables
    2.2.5 Moments fractionnaires d’ordre inférieur
    2.2.6 Simulation des lois stables
  2.3 Inférence Statistique des Lois Stables
    2.3.1 Tests de la variance
    2.3.2 Estimation des paramètres des lois α-stables
  2.4 Lois Stables Multivariées
    2.4.1 Définition et propriétés
    2.4.2 Moments des lois stables multivariées
    2.4.3 Vecteur aléatoire α-sous-gaussien
  2.5 Mesure de Dépendance des v.a.r. α-Stables
    2.5.1 Covariation
    2.5.2 Métrique de covariation
    2.5.3 Coefficient de covariation
    2.5.4 Codifférence
    2.5.5 Coefficient de covariation symétrique
    2.5.6 Estimation des coefficients de covariation
  2.6 Représentation Analytique des PDF α-Stables
    2.6.1 Développement en séries entières
    2.6.2 Développement asymptotique
    2.6.3 Approximation par un mélange fini
  2.7 Autres Distributions à Queues Lourdes
    2.7.1 Loi gaussienne généralisée
    2.7.2 Loi normale inverse gaussienne
    2.7.3 Loi t-Student
  2.8 Conclusion

3 Robust Estimation
  3.1 Robustness
  3.2 M-Estimation
    3.2.1 Minimax M-estimate of location estimator
    3.2.2 Influence Function
    3.2.3 M-Estimation of a deterministic signal parameter
    3.2.4 Theoretical performance
    3.2.5 Minimax optimal cost function
  3.3 Concluding Remarks

4 Time–Frequency Concepts
  4.1 Need of Time-Frequency Representation
  4.2 Nonstationarity and FM Signals
  4.3 The STFT, SPEC, WVD, and Quadratic TFD
  4.4 Reduced Interference Distributions
  4.5 The WVD and Ambiguity Function
  4.6 Relationships Among Dual Domains
  4.7 Time–Frequency Signal Synthesis
  4.8 IF Estimation
  4.9 Engineering Applications of Time–Frequency Methods
  4.10 Concluding Remarks

II Blind Separation of Impulsive Sources with Infinite Variances

5 State of the Art of BSS
  5.1 Introduction
    5.1.1 What is blind source separation (BSS) ?
    5.1.2 Brief history of BSS
    5.1.3 Statistical information for BSS
  5.2 Linear Instantaneous Mixtures
    5.2.1 Separability and indeterminacies
    5.2.2 How to find the independent components
  5.3 Basic BSS Methods
    5.3.1 BSS by minimization of mutual information
    5.3.2 BSS by maximization of non-gaussianity
    5.3.3 BSS by maximum likelihood estimation
    5.3.4 BSS by algebraic tensorial methods
    5.3.5 BSS by non-linear decorrelation
    5.3.6 BSS using geometrical concepts
    5.3.7 Source separation using Bayesian framework
    5.3.8 BSS using time structure
  5.4 BSS of Impulsive Heavy-Tailed Sources
    5.4.1 Why heavy-tailed α-stable distributions ?
    5.4.2 Existing BSS methods for heavy-tailed signals
  5.5 Conclusion & Future Research

6 Minimum Dispersion Approach
  6.1 Introduction
    6.1.1 The failure of second and higher-order methods
    6.1.2 Fractional lower-order statistics (FLOS) theory
  6.2 Source Separation Procedure
    6.2.1 Whitening by normalized covariance matrix
    6.2.2 Minimum dispersion criterion
    6.2.3 Separation algorithm : Jacobi implementation
  6.3 Performance Evaluation & Comparison
    6.3.1 Generalized rejection level index
    6.3.2 Experimental results
  6.4 Concluding Remarks

7 Sub- and Super-Additivity based Contrast Functions
  7.1 BSS Using Contrast Functions
  7.2 On contrast functions
  7.3 Orthogonality constraint
  7.4 Sub-Additivity based Contrast Functions
    7.4.1 Lp-norm contrast functions ; p ≥ 1
    7.4.2 Alpha-stable scale contrast function
  7.5 Super-Additivity based Contrast Functions
    7.5.1 Dispersion contrast function
  7.6 Jacobi-Gradient Algorithm for Prewhitened BSS
  7.7 Concluding Remarks

8 Normalized HOS-based Approaches
  8.1 Introduction
  8.2 Normalized Statistics of Heavy-Tailed Mixtures
    8.2.1 Normalized moments
    8.2.2 Normalized second and fourth order cumulants
  8.3 Normalized Tensorial BSS Methods
    8.3.1 Separation algorithms
    8.3.2 Performance evaluation & comparison
  8.4 Normalized Non-linear Decorrelation BSS Methods
    8.4.1 Robust composite criterion for source separation
    8.4.2 Iterative quasi-Newton implementation
    8.4.3 Performance evaluation & comparison
  8.5 Concluding Remarks

9 A Semi-Parametric Maximum Likelihood Approach
  9.1 The Likelihood of the BSS Model
    9.1.1 Derivation of the likelihood
    9.1.2 Sources density estimation
    9.1.3 Optimization via the EM algorithm
  9.2 Semi-Parametric Source Separation
    9.2.1 Noisy linear instantaneous mixtures
    9.2.2 The proposed approach
    9.2.3 Density estimation by B-spline approximations
    9.2.4 The SAEM algorithm
  9.3 Performance evaluation & comparison
    9.3.1 Some existing BSS methods
    9.3.2 Parametric versus semi-parametric approaches
    9.3.3 Computer simulation experiments
  9.4 Concluding Remarks

III Separation and Estimation of Multicomponent FM Signals affected by Heavy-Tailed Noise

10 State of the Art
  10.1 Modern Spectral Analysis Approaches
  10.2 Time-Frequency Analysis Approaches
    10.2.1 IF estimation using time-frequency methods
    10.2.2 Analysis of noisy multicomponent signals
  10.3 Robust time-frequency analysis
  10.4 Concluding Remarks

11 Robust Parametric Approaches
  11.1 Introduction-Problem Statement
  11.2 Polynomial-Phase Transform of FM Signals
  11.3 IF Estimation Procedure of FM Signals
  11.4 Robust Subspace Estimation
    11.4.1 TRUNC-MUSIC algorithm
    11.4.2 FLOS-MUSIC algorithm
    11.4.3 ROCOV-MUSIC algorithm
  11.5 Performance Evaluation & Comparison
    11.5.1 Mixture of sinusoidal component
    11.5.2 Mixture of two chirps
  11.6 Concluding Remarks

12 Robust Time-Frequency Approaches
  12.1 Introduction-Problem Statement
  12.2 Failure of Standard TFD in Impulsive Noise
    12.2.1 Effect of impulsive spike noise on TFD
    12.2.2 Effect of impulsive α-stable noise on TFD
    12.2.3 The need of robust TFD in Gaussian environment
  12.3 Pre-processing Techniques based Approach
    12.3.1 Exponential compressor filter
    12.3.2 Huber filter
  12.4 Robust Time-Frequency Approach
    12.4.1 Optimal TFD kernel in α-stable noise
    12.4.2 A new robust quadratic time-frequency distribution
  12.5 IF Estimation & Component Separation
  12.6 Performance Evaluation & Comparison
  12.7 Concluding Remarks

13 Conclusions et Perspectives
  13.1 Conclusion Générale
  13.2 Perspectives

Références Bibliographiques
Table des figures

1.1 Réalisations d’un signal Gaussien et celles d’un signal α-stable. Figures (c) et (d) : lorsque la taille de l’échantillon est relativement petite, les deux réalisations de la loi gaussienne et de la loi α-stable sont semblables. Figures (a) et (b) : lorsque la taille de l’échantillon est relativement grande, les deux réalisations diffèrent clairement.
1.2 Exemples de signaux non-stationnaires. (a–c) représentent des signaux d’applications de la vie réelle par la B-distribution : (a) pour un signal de baleines, (b) pour un signal électroencéphalogramme et (c) pour un signal de chauve-souris.
2.1 Réalisations de signaux α-stables pour différentes valeurs de α.
2.2 Densités de probabilité α-stables pour différentes valeurs de α.
2.3 Les queues de la densité de probabilité α-stable pour différentes valeurs de α.
2.4 La densité de probabilité de la loi NIG(α, 0, 1, 0) pour différentes valeurs de α.
4.1 (a) Time-domain and (b) frequency-domain representations of an LFM signal. It shows clearly the inherent limitation of classical representations of a non-stationary signal.
4.2 A TF representation of the LFM signal in 4.1.
4.3 Examples of nonstationary signals. An engineering application is shown in (a) for a linear FM signal (plotted using the Wigner–Ville distribution). Real-life applications are shown in (b–d) for a whale signal, an electroencephalogram signal, and a bat signal, respectively (all plotted using the B distribution).
4.4 Quadratic representations corresponding to the WVD. Wz(t, f), Az(τ, ν), Kz(t, τ) and Dz(ν, f) are respectively the WVD, AF, time–lag signal kernel and the Doppler–frequency signal kernel of the analytic signal z(t).
4.5 Dual domains of general signal quadratic representations. γ(t, f), Γ(τ, ν), G(t, τ) and G(ν, f) are the TFD time–frequency, Doppler–lag, time–lag and Doppler–frequency kernels, respectively. ρz(t, f) and Az(τ, ν) are the general quadratic TFD and the GAF of the analytic signal z(t).
5.1 Signal model for the blind source separation problem.
5.2 Order of statistics in blind source separation.
6.1 Extraction of 3 α-stable sources from 3 observations where α = 0.5 and N = 10000.
6.2 Generalized mean rejection level versus α where N = 1000.
6.3 Generalized mean rejection level versus the estimation error Δα.
6.4 Generalized mean rejection level versus sample size N.
6.5 Generalized mean rejection level versus sample size for α = 1.5.
6.6 Generalized mean rejection level versus the additive noise power for α = 1.5.
8.1 Generalized mean rejection level versus the noise power.
8.2 Generalized mean rejection level versus the sample size.
8.3 Generalized mean rejection level versus the sample size.
8.4 Mean rejection level versus the noise power with T = 1000.
9.1 Consistency of different BSS algorithms. The sample sizes were 1000 for case (1) and 5000 for case (2).
9.2 The performance index versus noise level.
9.3 The performance index versus sample size.
11.1 The MSE versus the noise dispersion in dB, N = 1000.
11.2 The MSE versus the sample size, γ = 0.1.
11.3 The MSE versus the sample size, γ = 0.1.
11.4 The MSE versus the noise dispersion in dB, N = 1000.
12.1 The nonlinear law of the compressor used in the pre-processing stage.
12.2 Compression of a linear FM signal in impulsive noise using different values of β.
12.3 The standard MBD of the multi-component signal test.
12.4 The Robust-MBD of the multi-component signal test.
12.5 The NMSE versus sample size : a comparative study.
12.6 NMSE of IF estimates, corresponding to the HAF, r-PWVD and the R-MBD for a noisy two-component chirp signal.
12.7 Normalized MSE of the various phase parameters versus sample size, γ = 0.1.
12.8 Normalized MSE of the various phase parameters versus noise dispersion in dB, N = 1000.
Liste des tableaux

2.1 La moyenne et la variance des lois α-stables pour différentes valeurs de α.
2.2 Test graphique de la variance en utilisant la variance empirique.
2.3 Test graphique de la queue d’une distribution par la méthode dite ”log-log”.
2.4 Valeurs optimales de K en fonction de n et de α.
2.5 Approximation de la PDF SαS par le modèle de mélange de gaussiennes et affinage de l’approximation par l’algorithme EM.
4.1 Some common TFD and their kernels.
6.1 The principal steps of the proposed minimum dispersion (MD) algorithm.
8.1 The principal steps of the proposed Robust-JADE algorithm.
8.2 The principal steps of the proposed Robust-EASI algorithm.
11.1 The proposed frequency estimation TRUNC-MUSIC algorithm.
11.2 The proposed robust frequency estimation FLOS-MUSIC algorithm.
11.3 The proposed robust covariance estimation ROCOV algorithm.
11.4 The proposed frequency estimation ROCOV-MUSIC algorithm.
12.1 Computation procedure of the Robust-MBD.
12.2 Component separation procedure for the proposed algorithm.
RÉSUMÉ
L’objectif principal de ce travail de thèse est de développer de nouvelles
techniques robustes pour le traitement des signaux non-gaussiens et/ou nonstationnaires dans des environnements impulsifs. Plus précisément, le travail
de cette thèse de doctorat se situe au carrefour des deux problématiques suivantes :
I- Séparation aveugle de mélanges linéaires de sources impulsives : Ce
problème a été peu étudié pour certains cas statistiquement ardus. En effet,
lorsque les sources sont modélisées par des lois α-stables, les méthodes classiques ne s’appliquent plus, car la densité de probabilité n’a pas d’expression
analytique explicite et les moments d’ordre 2 ou d’ordre supérieur à 2 sont
infinis. Dans ce cas, nous avons introduit quatre approches originales :
– Une approche basée sur le critère de dispersion minimum qui consiste à
minimiser la somme des dispersions des observations blanchies. L’étape
de pré-blanchiment des observations est basée sur une nouvelle matrice
de covariance normalisée que nous avons introduite.
– Une deuxième approche basée sur l’idée des statistiques normalisées
que nous avons introduite pour adapter les méthodes existantes basées
sur les statistiques d’ordre deux ou d’ordre supérieur.
– Une troisième approche en utilisant des fonctions de contrastes, sous
contrainte d’orthogonalité, basées sur des fonctionnelles sous- ou suradditives. En particulier, nous avons proposé un critère qui consiste
à minimiser la somme des normes Lp (p ≥ 1) des observations pour
séparer des sources qui peuvent être éventuellement à variance infinie.
– Une quatrième approche de structure semi-paramétrique. Dans cette
méthode nous formulons le problème de séparation de source sous forme
d’un problème d’estimation par le principe du maximum de vraisemblance. Par la suite, nous combinons une version stochastique de l’algorithme EM et l’approximation des PDF α-stables par les fonctions logspline afin d’estimer la PDF et la matrice du mélange simultanément.
II- Estimation de signaux FM non-stationnaires multicomposantes dans
un environnement impulsif : La littérature reste relativement pauvre dans
le cas multicomposantes en présence de bruit impulsif α-stable. Pour contribuer à la résolution de ce problème, nous avons proposé des méthodes paramétriques et d’autres non-paramétriques basées sur l’analyse temps-fréquence :
– Méthodes paramétriques : Nous commençons par ramener le problème
à celui de l’estimation de signaux harmoniques noyés dans un bruit
impulsif grâce à une transformée polynomiale du signal. Une méthode
haute résolution (de type MUSIC) est alors appliquée au signal ainsi
transformé pour l’estimation des paramètres de la phase. Trois cas de
figures sont considérés et comparés : (i) Celui de l’application directe
de l’algorithme MUSIC au signal harmonique tronqué ; (ii) celui de
l’application de l’algorithme MUSIC à l’estimée robuste de la fonction
de covariance du signal harmonique et (iii) celui de l’application de
MUSIC à la covariation généralisée du signal.
– Méthodes non-paramétriques : Dans une première approche, nous avons
appliqué la procédure de robustesse au sens minimax d’Huber contre
l’effet du bruit impulsif sous forme d’une étape de pré-traitement par
deux techniques différentes à savoir : (i) technique de compression
des amplitudes par un filtre non-linéaire de type |x|^β, 0 < β < 1, et
(ii) la technique de troncature du signal en amplitude. Par la suite
nous représentons le signal dans le plan temps-fréquence en utilisant
des transformées quadratiques adéquates au cas multicomposantes et
un algorithme de séparation pour extraire les composantes et estimer
leurs fréquences instantanées. Par contre, dans la deuxième approche,
nous avons appliqué la méthode M-estimation robuste directement à la
transformée temps-fréquence quadratique pour définir une transformée
robuste à l’effet du bruit impulsif et des termes croisés d’un signal multicomposantes.
Finalement, une étude numérique vient compléter les résultats théoriques
et permet de comparer nos approches à d’autres méthodes existantes dans la
littérature.
Publications
1- Articles de journaux
1. M. Sahmoudi, H. Monod, D. Makowski et D. Wallach,” Optimal experimental designs for estimating model parameters, applied to yield response to
nitrogen models,” Agronomie, vol. 22, pp. 229–238, 2002.
2. M. Sahmoudi, K. Abed-Meraim and M. Benidir, “Blind Separation of Impulsive alpha-stable Sources Using a Minimum Dispersion Criterion”, in IEEE
Signal Processing Letters Journal, vol. 12, No.4, April, 2005.
3. M. Sahmoudi and K. Abed-Meraim ”Blind Separation of Instantaneous Mixtures of Impulsive α-Stable Sources based on Fractional Lower-Order Statistics”, submitted for publication in IEEE Transaction on Signal Processing.
4. M. Sahmoudi and K. Abed-Meraim ”Blind Separation of Heavy-Tailed Sources
Using Normalized Statistics”, submitted for publication in IEEE Transaction
on Signal Processing.
5. M. Sahmoudi, K. Abed-Meraim and B. Barkat ”Robust Estimation of Multicomponent Non-Stationary FM Signals in Heavy-Tailed Noise”, submitted
for publication in IEEE Transaction on Signal Processing.
2- Articles de conférences
1. M. Benidir and A. Ouldali and M. Sahmoudi, ”Performances Analysis For
The HAF-Estimator For A Time-Varying Amplitude Phase-Modulated Signals,” in Proceeding of CA 2002, IASTED International Conference in
Control and Applications, Cancun, Mexico, May 20-22, 2002.
2. M. Sahmoudi, K. Abed-Meraim and M. Benidir, ” Blind Separation of
Alpha-Stable Sources : A new Fractional Lower-Order Moments (FLOM)
Approach”, in Proceeding of the ISSPIT02 ; the IEEE International Symposium on Signal Processing and Information Technology, Marrakech, Morocco, December 18-21, 2002.
3. M. Sahmoudi, K. Abed-Meraim and M. Benidir, ”Estimation de Signaux
Chirp Multicomposantes Affecte par un Bruit Impulsif Alpha-stable”, in
Proceeding of GRETSI Paris, France, Septembre 2003.
4. M. Sahmoudi, K. Abed-Meraim and M. Benidir, ”Blind Separation of Instantaneous Mixtures of Impulsive alpha-stable Sources”, in Proceeding of the
IEEE International Symposium on Signal and Image Processing and Analysis, Rome, Italy, September 2003.
5. M. Sahmoudi, K. Abed-Meraim, N. Linh-Trung, V. Sucic, F. Tupin and B. Boashash, ”An Image and Time-frequency Processing Methods for Blind Separation of Non-stationary Sources”, in Proc. of Journées d’Etude sur les Méthodes pour les Signaux Complexes en Traitement d’Image, INRIA Recquencourt, Paris France, 9-10 décembre 2003.
6. M. Sahmoudi and K. Abed-Meraim, ”Multicomponent Chirp Interference Estimation For Communication Systems In Impulsive alpha-stable noise Environment”, in Proceeding of the IEEE International Symposium on Control, Communications and Signal Processing, Hammamet, Tunisia, Mars 2004.
7. M. Sahmoudi and K. Abed-Meraim, ”Robust IF Estimation of Multicomponent FM Signals Affected by Heavy-Tailed Noise Using TFD”, Int. Colloquium of Modelization, Stochastic and Statistics MSS-2004, Alger, Algeria, Avril 2004.
8. M. Sahmoudi, K. Abed-Meraim and B. Barkat, ”IF Estimation of Multicomponent Chirp Signal in Impulsive alpha-stable noise Environment Using Parametric and Non-Parametric Approaches”, in Proceeding of EUSIPCO 2004, 12th European Signal Processing Conference, Vienna, Austria, September 2004.
9. M. Sahmoudi, K. Abed-Meraim and M. Benidir, ”Blind Separation of Heavy-Tailed Signals Using Normalized Statistics”, in Proceeding of ICA 2004, 5th International Conference on Independent Component Analysis and Blind Source Separation, Granada, Spain, September 22-24, 2004.
10. M. Sahmoudi and K. Abed-Meraim, ”Robust Blind Separation Algorithms for Heavy-Tailed Sources”, to appear in Proceeding of ISSPIT’2004 ; the fourth IEEE Symposium on Signal Processing and Information Technology, Rome, Italy, December 18-21 2004.
11. M. Sahmoudi, K. Abed-Meraim, M. Lavielle, E. Kuhn and Ph. Ciblat, ”Blind Source Separation Using a Semi-Parametric Approach with Application to Heavy-Tailed Signals”, submitted to EUSIPCO’2005, Turkey, Sep. 2005.
12. M. Sahmoudi and K. Abed-Meraim, ”A Robust Time-frequency Distribution for the Analysis of Multicomponent Non-stationary FM Signals Affected by Impulsive α-stable Noise”, submitted to SSP’2005, Bordeaux, France, July 2005.
13. M. Sahmoudi and K. Abed-Meraim, ”Blind Sources Separation Using Contrast Functions based on Some sub- and super- Additive Functionals”, submitted to ISSPA’2005, Sydney, Australia, Sept. 2005.
Notations et abréviations
Tout au long de ce document, les notations et les abréviations classiques suivantes seront utilisées :
diag(a1, · · · , an) : matrice diagonale d’éléments diagonaux a1, · · · , an
i.i.d. : indépendants et identiquement distribués
v.a. : variable aléatoire
v.a.r. : variable aléatoire réelle
BD : B distribution
EEG : electroencephalogram
EVD : eigenvalue decomposition
FM : frequency–modulated
FT : Fourier transform
FLOS : fractional lower-order statistics
HOS : higher–order statistics
IF : instantaneous frequency
IFT : inverse Fourier transform
LFM : linear frequency–modulated
MBD : modified B distribution
PDF : probability density function
SNR : signal–to–noise ratio
SOS : second–order statistics
TFSP : time–frequency signal processing
TF : time–frequency
TFD : time–frequency distribution
WVD : Wigner–Ville distribution
Chapitre 1
Introduction
Ce chapitre introductif a une double finalité. La première est de préciser le cadre de la thèse et les deux problèmes qu’elle a tenté de résoudre, tandis que la seconde consiste à présenter les principales contributions de ce travail en indiquant le fil directeur reliant ses deux parties.
1.1 Motivations
Ce travail trouve son origine et sa motivation dans le besoin croissant de caractériser, d’analyser et de traiter des signaux non-stationnaires [Suppappola(2003)]
et/ou non-gaussiens [Wegman et al.(1989)], [Kassam(1995)].
Le développement des méthodes de traitement du signal a donné naissance à un
ensemble de techniques dont l’objectif principal est d’éclairer une situation d’application donnée. Avec la complexification des situations réelles (par exemple rupture
de transmission [Tourneret(1998)], phénomènes impulsifs [Nikias et Shao(1995)],
panne de capteurs [Kassam(1995)], canal non stationnaire [Ikram et al.(1998)], signal non stationnaire, effet Doppler, besoin d’instruments de mesure de plus en
plus fins, etc.), les outils de traitement du signal se spécialisent et deviennent
moins flexibles : pour s’adapter à une situation particulière, les procédures d’étude
et d’analyse doivent être modifiées fréquemment.
Lors de sa pratique professionnelle, le traiteur du signal a l’occasion de constater
qu’il est parfois très éloigné du cadre théorique strict dans lequel certains outils
ou méthodes de traitement du signal fonctionnent. Il est confronté à des données
manquantes, erronées, incomplètes, tronquées, l’hypothèse de normalité n’est pas
vérifiée, l’hypothèse de stationnarité n’est pas vérifiée,..etc. Face à ces difficultés,
il ne dispose en général que de sa propre expérience et, guidé également par son intuition, il essaye de ”façonner” de façon empirique des outils adaptés au problème
qui lui est soumis.
Le traiteur du signal se trouve donc souvent dans l’obligation de choisir entre plusieurs clés pour ouvrir une serrure, dont aucune ne correspond exactement à la
serrure en question. Afin de le guider, la statistique mathématique a établi les propriétés de telle ou telle méthode dans un contexte bien spécifié, en général décrit
par un modèle probabiliste donné. Cette modélisation n’est qu’une représentation
quelque peu simplifiée de la réalité du phénomène étudié. En effet, le recours à la
loi normale n’est parfois que la conséquence d’un acte de foi ou la reconnaissance
de l’impossibilité de trouver le ”vrai” mécanisme probabiliste engendrant les observations. En plus, on rencontre des systèmes non-stationnaires de façon quasi
permanente dans la nature du fait de la dynamique et de l’évolution rapide des
systèmes étudiés.
Plusieurs époques ont été distinguées dans l’évolution chronologique des méthodes
de traitement du signal. La présente phase ne peut être résumée si simplement,
aussi préférons-nous parler de l’époque ”des méthodes statistiques fondées sur peu
d’hypothèses”. Les applications du traitement du signal conduisent donc tout naturellement à l’étude des signaux non-gaussiens, des signaux non-stationnaires et
de la robustesse des méthodes de traitement dans ces deux classes de signaux.
1.1.1 Non-gaussianité
En effet, pendant très longtemps, le développement des méthodes statistiques
et l’étude de leurs propriétés ont été fondés essentiellement sur la gaussianité1 de
la famille de lois. Cela transparaît clairement, par exemple, dans toute l’approche
de R. Fisher, et dans la méthode des moindres carrés. Néanmoins, le choix d’un
modèle statistique régi par la loi normale relève plus de l’acte de foi que d’une
réflexion rigoureuse.
Sur le plan théorique, le développement récent de la statistique mathématique est
dominé par la recherche de solutions dans un contexte où la validité d’un modèle
n’est pas assurée, et où seront faites des hypothèses limitées sur la loi de probabilité.
Sur le plan pratique, dans de nombreux problèmes de communication, tels que la
transmission sur le réseau électrique, les communications HF ou bien les communications sous-marines [Grigoriu(1995)], [Kassam(1995)], [Nikias et Shao(1995)],
l’hypothèse classique sur la nature gaussienne du bruit, justifiée par le biais du
théorème central-limite, n’est plus valide. En effet, dans de tels systèmes, des
bruits à faible probabilité d’apparition mais à très fortes amplitudes dits de nature
impulsive, ou des problèmes de discontinuité du comportement du bruit (problème
de rupture) interviennent et ne peuvent plus être représentés par des lois gaussiennes. De tels phénomènes peuvent en fait être modélisés à l’aide de distributions non-gaussiennes à décroissance algébrique, c’est-à-dire en x^{-α} avec 0 < α < 2 [Nikias et Shao(1995)], [Ilow(1995)], [Kuruoglu(1998)], ayant ainsi le même comportement que les distributions α-stables. C’est la raison pour laquelle nous nous intéressons à des modèles de distributions généraux incluant certes les modèles gaussiens mais aussi des lois de type queues lourdes.
1. Pour faire référence à Carl Friedrich Gauss : né en 1777 à Brunswick en Allemagne, il devient très rapidement un astronome et mathématicien renommé, si bien qu’actuellement il est toujours considéré comme l’un des plus grands mathématiciens de tous les temps, au même titre qu’Archimède et Newton. Ses contributions à la science et en particulier à la statistique sont de très grande importance. On lui doit notamment la méthode des moindres carrés et le développement de la loi normale pour les problèmes d’erreurs de mesure.
Pour illustration, le modèle gaussien est bien approprié dans le cas des données à bande limitée, tandis que dans le cas des données à large bande, un modèle stable à variance infinie doit être utilisé, comme illustré par la figure 1.1.
[Figure 1.1 : quatre panneaux temporels comparant un signal gaussien G(t) et un signal α-stable SαS(t) avec α = 1.2, pour une grande taille d’échantillon (a, b) et une petite taille d’échantillon (c, d).]

Fig. 1.1 : Réalisations d’un signal Gaussien et celles d’un signal α-stable.
• Figures (c) et (d) : lorsque la taille de l’échantillon est relativement petite, les deux réalisations de la loi gaussienne et de la loi α-stable sont semblables.
• Figures (a) et (b) : lorsque la taille de l’échantillon est relativement grande, les deux réalisations diffèrent clairement.
1.1.2 Non-stationnarité
De tous les outils dont on peut disposer en traitement du signal, l’analyse
spectrale est certainement l’un des plus importants. Les raisons de son excellence
sont évidemment à chercher dans la relative universalité du concept majeur sur
lequel elle repose : celui de fréquence. Que ce soit dans des domaines s’intéressant à
des ondes physiques (acoustique, vibrations, géophysique, optique,...) ou reposant
sur certaines périodicités d’événements (économie, biologie, astronomie,...) une
description fréquentielle est souvent à la base d’une plus grande intelligence des
phénomènes mis en jeu, en fournissant un complément indispensable à la seule description temporelle (sortie de capteur ou suite d’événements), qui est généralement
première pour l’analyse. Si l’on ajoute à cela que l’approche fréquentielle s’accommode aussi d’une transposition au traitement spatial (imagerie acoustique, radioastronomie,...), on comprend qu’un très grand nombre d’études aient été et soient
encore consacrées à l’analyse spectrale. On dispose ainsi aujourd’hui d’un arsenal
de méthodes dont, au moins pour les plus simples et les plus robustes (et donc les
plus éprouvées), les propriétés sont bien connues. À ces méthodes s’ajoutent une
batterie d’algorithmes, de logiciels, de processus, voire d’appareils, autant d’éléments
assurant à l’analyse spectrale une place de choix dans la vie quotidienne des laboratoires.
Cependant, c’est l’expérience de cette même vie quotidienne qui nous contraint à
fixer des limites de validité, mais surtout à présenter des objections de principe, à
la notion classique... La non-stationnarité est une non-propriété. Pour la définir,
on va d’abord expliquer ce qu’est la stationnarité [Flandrin(1993)]. La notion de
stationnarité est reliée naturellement à celles de régime établi, de stabilité temporelle. La définition utilisée en théorie du signal formalise d’une certaine manière ces
idées. Si on parle de signaux certains, on pourra dire d’eux qu’ils sont stationnaires
s’ils peuvent se décomposer en une somme d’ondes sinusoïdales éternelles (on retrouve ici les modes de Fourier du physicien). Les signaux aléatoires stationnaires
sont ceux pour lesquels il n’existe pas d’origine temporelle. Par conséquent, leurs
propriétés statistiques (leurs moments) ne varient pas au cours du temps.
Est non-stationnaire, tout ce qui n’est pas stationnaire : les transitoires, i.e.,
lorsqu’on n’est pas encore parvenu à un régime permanent (par ex. dans une voiture, phase d’accélération avant d’atteindre une vitesse stable), les ruptures, i.e,
les modifications brutales et intempestives d’amplitude (par ex. : dans une voiture,
panne de moteur, coup de frein brutal). La classe des signaux non-stationnaires
comprend une grande variété de signaux comme la sous classe des signaux modulés
en fréquence appelés signaux FM [Amin(1992)], [Cohen(1995)] et en particulier les
signaux à phase polynomiale que l’on rencontre souvent en télécommunications, notamment dans les signaux de type radar ou sonar [Ouldali(1999)], [Boashash(2002)].
Pour le traitement des signaux non stationnaires, au-delà des méthodes spectrales ”classiques” adaptées aux situations stationnaires, les années quatre-vingt
ont vu le développement d’un grand nombre d’approches ”modernes” qui ont
toutes un point commun : la prise en compte explicite du temps comme paramètre
de description. Dans un contexte d’analyse spectrale, ceci a conduit naturellement
au concept d’analyse temps-fréquence et à ses représentations et/ou modélisation
associées.
L’intensification des travaux sur le sujet et leur floraison dans des directions souvent différentes a certainement rendu assez difficile notre tâche de choix et puis
d’utilisation des méthodes existantes.
La figure 1.2 représente quelques signaux réels dans le plan temps-fréquence. Contrairement aux deux représentations classiques (temporelle et fréquentielle) d’un signal, on y voit clairement l’évolution de la fréquence au cours du temps, d’où le besoin d’une telle représentation pour l’analyse des signaux non-stationnaires.
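Pour fixer les idées (esquisse indicative, hors manuscrit original), le spectrogramme suivant d’un signal FM linéaire montre comment une représentation temps-fréquence révèle l’évolution de la fréquence au cours du temps ; les fonctions scipy.signal.chirp et scipy.signal.spectrogram sont utilisées ici à titre de simple illustration, les paramètres numériques étant arbitraires.

```python
import numpy as np
from scipy import signal

fs = 1000.0                                   # fréquence d'échantillonnage (Hz)
t = np.arange(0.0, 2.0, 1.0 / fs)
x = signal.chirp(t, f0=50.0, t1=2.0, f1=300.0, method="linear")  # signal FM linéaire

# Représentation temps-fréquence élémentaire : spectrogramme (module carré de la TFCT).
f, tau, Sxx = signal.spectrogram(x, fs=fs, nperseg=128, noverlap=96)

# La crête du spectrogramme suit la fréquence instantanée (de 50 Hz à 300 Hz),
# information invisible sur le seul spectre global du signal.
fi_estimee = f[np.argmax(Sxx, axis=0)]
print("IF estimée au début :", fi_estimee[:3], "Hz")
print("IF estimée à la fin :", fi_estimee[-3:], "Hz")
```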
1.1.3 Robustesse
Le XIX-ème siècle a vu un long débat sur le traitement des points aberrants
(”outliers”), et apparaîtront très tôt dans la littérature statistique des références à
cette dérive particulière des hypothèses de base. Le terme ”robuste” a été cité pour
la première fois dans un article de G. Box [Box(1953)] sur l’estimation de variance
dans le cas non-gaussien, au sens de résistance à une déviation par rapport à la
loi normale. Par la suite, de nombreux auteurs se sont intéressés aux propriétés de
certaines alternatives aux estimateurs classiques, dans le cadre de lois contaminées,
ou de mélanges de lois : on dit qu’une loi P est contaminée par une loi Q au taux ε, ε ∈ [0, 1], si la loi des observations est (1 − ε)P + εQ.
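Pour illustrer ce modèle de contamination (esquisse indicative, hors manuscrit original), le petit exemple suivant compare la moyenne empirique à un M-estimateur de position de Huber calculé par moindres carrés repondérés, sur des données (1 − ε)N(0, 1) + εN(0, 100) ; la fonction huber_location et les constantes (k = 1.345, ε = 0.1) sont des choix d’illustration et ne correspondent pas à un algorithme particulier du manuscrit.

```python
import numpy as np

def huber_location(x, k=1.345, n_iter=50):
    """M-estimateur de position de Huber par moindres carrés repondérés."""
    mu = np.median(x)                          # initialisation robuste
    s = np.median(np.abs(x - mu)) / 0.6745     # échelle robuste (MAD normalisée)
    for _ in range(n_iter):
        r = (x - mu) / s
        w = np.ones_like(r)                    # poids de Huber : 1 si |r| <= k,
        gros = np.abs(r) > k                   # k/|r| sinon
        w[gros] = k / np.abs(r[gros])
        mu = np.sum(w * x) / np.sum(w)
    return mu

rng = np.random.default_rng(1)
n, eps = 1000, 0.1                             # taux de contamination epsilon
contamine = rng.uniform(size=n) < eps
x = np.where(contamine,
             rng.normal(0.0, 10.0, size=n),    # loi contaminante Q = N(0, 100)
             rng.normal(0.0, 1.0, size=n))     # loi nominale P = N(0, 1)

print("moyenne empirique     :", x.mean())     # sensible aux valeurs aberrantes
print("M-estimateur de Huber :", huber_location(x))
```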
Diverses définitions du concept de robustesse ont été avancées dans la littérature
statistique [Launer et Wilkinson(1979)], [Huber(1981)], [Leroy(1987)]. Lorsque P
désigne la loi du modèle statistique, une procédure a été qualifiée de robuste si :
– elle admet une grande efficacité absolue pour toutes les alternatives à P ;
– elle admet une grande efficacité absolue sur un ensemble bien spécifié de lois ;
– elle est peu sensible à l’abandon des hypothèses statistiques sur lesquelles
elle est fondée ;
– la loi de la statistique sur laquelle est basée cette procédure doit ”peu varier”
lorsque P est soumise à de petites altérations.
Ces différentes définitions ne fournissent pas une vue exhaustive de la question,
mais proviennent toutes du même esprit. Ainsi P. Huber, dans [Huber(1972)],
écrit : ”la robustesse est une sorte d’assurance : je suis prêt à payer une perte
d’efficacité de 5 à 10 % par rapport au modèle idéal pour me protéger de mauvais
effets de petites déviations de celui-ci : je serai bien sûr heureux que ma procédure
statistique fonctionne bien sous de gros écarts, mais je n’y prête pas réellement
attention car faire de l’inférence à partir d’un modèle aussi faux n’a que peu de
signification concrète”.
En conclusion, nous retiendrons la définition suivante du concept de robustesse :
une procédure statistique sera robuste si ses performances sont peu modifiées par
de faibles modifications des hypothèses statistiques sur lesquelles elle est fondée
comme par exemple la loi P , modèle des observations. Cette approche est celle
de [Huber(1981)] et [Hampel et al.(1986)]. Cette définition de la robustesse sera
précisée -dans chacune des deux parties de cette thèse- sur deux points :
– quelles performances de la procédure faut-il retenir ?
– qu’est-ce qu’une faible modification du modèle de base ?
Plus généralement, ce problème consiste à se placer non pas dans le cadre d’une
seule loi de probabilité, mais dans une vaste classe de lois. Cette approche permet
de répondre à de nombreuses questions auxquelles le traitement du signal classique
n’apporte de solutions que dans un contexte gaussien [Lecoutre et Tassi(1980)].
Par exemple, la séparation des composantes de sources dans un environnement non-gaussien impulsif [Sahmoudi et al.(2004a)], [Zhang et Kassam(2004)], l’estimation des paramètres d’un signal éventuellement non-stationnaire noyé dans un bruit non-gaussien [Friedmann et al.(2000)], [Sahmoudi et al.(2004b)] et la détection multi-utilisateurs dans un environnement non-gaussien [Poor et Tanda(2002)].
1.2 Position du Problème
Réduire l’effet du bruit additif et séparer des mélanges de sources sont deux
problèmes fondamentaux et récurrents dans la plupart des applications en traitement du signal et de l’image [Kay(1998b)], [Hyvarinen et al.(2001)]. Ce sont d’autre part deux problèmes théoriques centraux en théorie statistique, que ce soit pour
des objectifs d’estimation ou de détection [Kay(1998a)], [Kay(1998b)].
Le cas où le signal, bruit ou signal source, est de nature impulsive s’avère
particulièrement intéressant à la fois sur le plan théorique comme l’inférence statistique des processus α-stable [Samorodnitsky et Taqqu(1994)] et sur le plan pratique comme la réduction de l’effet du bruit atmosphérique en communications HF
et l’effet des valeurs aberrantes sur les procédures de traitement statistique d’un
signal observé [Nikias et Shao(1995)], [Kassam(1995)]. C’est aussi un cas mal ou
peu étudié dans la littérature du traitement de signal relativement au cas standard
où le signal est supposé de loi gaussienne.
Le but de ce travail est de développer des techniques d’estimation et de séparation
dans des milieux présentant des phénomènes impulsifs se caractérisant par des
processus admettant des distributions à décroissance lente, appelées également, à
queues lourdes et en particulier les distributions α-stables [Nikias et Shao(1995)],
[Adler et al.(1998)] .
1.2.1 Séparation de sources impulsives à variance infinie
La séparation aveugle de sources est une technique de traitement des signaux
(ou images) multicapteurs dans laquelle on postule qu’une séquence d’observations
x(t), t = 1, · · · , T est modélisée par
x(t) = A s(t) + b(t),   t = 1, . . . , T        (1.1)
où A est une matrice m × n à rang plein, s(t) est un n-vecteur de sources dont les
composantes sont indépendantes et b(t) représente un éventuel bruit additif. La
séparation aveugle de sources, ou encore l’analyse en composantes indépendantes,
est un problème qui consiste à retrouver des signaux sources s(t) statistiquement
indépendants à partir de leurs observations x(t) (leurs mélanges) reçus sur le réseau
de capteurs et cela sans connaissance a priori de la structure des mélanges ou des
signaux sources [Hyvarinen et al.(2001)]. La séparation de sources intervient dans
des applications diverses telles que la localisation et la poursuite de cibles en radar et sonar, la séparation de locuteurs (problème dit de “cocktail party”), la
détection et la séparation dans les systèmes de communication à accès multiple,
l’analyse en composante indépendante de signaux biomédicaux (e.g., EEG, ECG
et fMRI), etc. [Cichocki et Amari(2002)].
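À titre d’illustration (esquisse indicative, hors manuscrit original), le modèle (1.1) peut être simulé comme suit avec des sources SαS indépendantes générées par la méthode de Chambers-Mallows-Stuck ; la matrice A, le niveau de bruit et les dimensions sont des valeurs d’exemple.

```python
import numpy as np

rng = np.random.default_rng(2)
n_sources, n_capteurs, T = 3, 4, 5000

def tirages_sas(alpha, taille):
    """Tirages SaS (beta = 0) par la méthode de Chambers-Mallows-Stuck."""
    v = rng.uniform(-np.pi / 2, np.pi / 2, size=taille)
    w = rng.exponential(1.0, size=taille)
    return (np.sin(alpha * v) / np.cos(v) ** (1.0 / alpha)
            * (np.cos((1.0 - alpha) * v) / w) ** ((1.0 - alpha) / alpha))

# Sources impulsives indépendantes s(t) (ici alpha = 1.5) et mélange A de rang plein.
S = np.vstack([tirages_sas(1.5, T) for _ in range(n_sources)])   # n x T
A = rng.standard_normal((n_capteurs, n_sources))                 # m x n (rang plein p.s.)
B = 0.01 * rng.standard_normal((n_capteurs, T))                  # bruit additif b(t)
X = A @ S + B                                                     # x(t) = A s(t) + b(t)

print("dimensions des observations :", X.shape)                  # (m, T)
```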
Ce problème a été largement étudié et de nombreuses solutions ont été proposées. Il s’agit de méthodes mettant en oeuvre la minimisation d’un critère
de séparation ; certaines sont algébriques et font appel à des statistiques d’ordre
deux et/ou d’ordre supérieur [Cichocki et Amari(2002)], [Hyvarinen et al.(2001)].
D’autres utilisent des outils d’optimisation comme les algorithmes adaptatifs ou de
type bloc basés sur une décomposition parcimonieuse. D’autres encore exploitent
l’indépendance statistique des sources par le biais du principe du maximum de
vraisemblance ou encore en utilisant la théorie de l’information (principe de ”l’infomax”) [Cichocki et Amari(2002)], [Hyvarinen et al.(2001)].
Le problème de séparation d’un mélange linéaire instantané est arrivé à une certaine maturité, mais il reste peu étudié pour certains cas statistiquement ardus. Lorsque des observations présentent des changements brusques traduisant l’apparition d’événements significatifs modélisés par des lois α-stables, les méthodes classiques ne s’appliquent plus ou sont mal adaptées. En effet, malgré leurs différences, il s’avère que la plupart de ces méthodes utilisent les statistiques d’ordre deux et/ou d’ordre supérieur ou la densité de probabilité des sources, ce qui est indéfini dans le cas des sources α-stables. L’objectif de la première partie de cette thèse est de
proposer des méthodes statistiques pour la séparation de sources impulsives de
modèle α-stable. Nous nous focalisons ici sur les deux points suivants :
• Signaux sources impulsifs : Si les sources s(t) sont impulsives c’est-à-dire telles
que les probabilités des valeurs extrêmes ne sont pas négligeables, le modèle
de séparation de sources peut être plus réaliste en considérant une distribution à queue lourde comme les lois α-stables (0 < α < 2) pour modéliser les
sources. Il s’agit d’une famille paramétrique de distributions de probabilité
très flexible pour prendre en compte les caractéristiques statistiques (caractère exponentiel, symétrie, dispersion et position ) de la distribution des
observations des phénomènes à grandes variations d’échelle [Kidmose(2001)],
[Shereshevski(2002)]. Dans cette partie, on traitera l’exploitation des statistiques fractionnaires d’ordre inférieur (FLOS) et l’adaptation des méthodes existantes pour séparer des mélanges α-stables (une courte illustration numérique des FLOS est esquissée juste après cette liste).
• Généralisation : On s’intéressera également à la généralisation de l’utilisation
des FLOS dans le problème de la séparation d’autres classes de sources.
Cela nous a permis d’introduire des techniques de séparation de sources
fondamentalement différentes de celles existantes qui ne font intervenir que
les statistiques d’ordre deux (SOS) ou d’ordres supérieurs (HOS).
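L’idée des FLOS évoquée ci-dessus peut être illustrée numériquement (esquisse indicative, hors manuscrit original) : pour des tirages SαS avec α < 2, le moment empirique d’ordre 2 ne se stabilise pas quand la taille d’échantillon augmente, alors qu’un moment fractionnaire E|X|^p avec p < α converge. Le générateur et les valeurs (α = 1.5, p = 0.8) sont des choix d’exemple et ne correspondent pas aux estimateurs spécifiques développés dans la suite du document.

```python
import numpy as np

def tirages_sas(alpha, taille, rng):
    """Tirages SaS (beta = 0) par la méthode de Chambers-Mallows-Stuck."""
    v = rng.uniform(-np.pi / 2, np.pi / 2, size=taille)
    w = rng.exponential(1.0, size=taille)
    return (np.sin(alpha * v) / np.cos(v) ** (1.0 / alpha)
            * (np.cos((1.0 - alpha) * v) / w) ** ((1.0 - alpha) / alpha))

rng = np.random.default_rng(3)
alpha, p = 1.5, 0.8                 # p < alpha : le moment fractionnaire E|X|^p existe
for T in (10**3, 10**4, 10**5):
    x = tirages_sas(alpha, T, rng)
    print(f"T = {T:6d}   moment d'ordre 2 : {np.mean(x**2):12.2f}"
          f"   FLOM d'ordre {p} : {np.mean(np.abs(x)**p):6.3f}")

# Le moment d'ordre 2 empirique fluctue fortement et tend à croître (variance
# infinie pour alpha < 2), alors que le moment fractionnaire d'ordre p < alpha
# se stabilise autour d'une valeur finie.
```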
1.2.2 Estimation de signaux FM multicomposantes dans un environnement impulsif
Dans cette deuxième partie de la thèse, nous traitons les signaux multicomposantes affectés par un bruit additif non-gaussien de nature impulsive. Un signal FM est dit multicomposantes si sa représentation temps-fréquence présente des crêtes multiples dans le plan temps-fréquence. Analytiquement, un signal est dit multicomposantes si on peut l’écrire comme une somme de signaux monocomposantes.
Le signal FM multicomposantes bruité considéré dans cette partie est donné par
le modèle suivant
x(t) = s(t) + z(t)                              (1.2)
     = Σ_{i=1}^{M} s_i(t) + z(t)                (1.3)
avec
– s_i(t) : désigne la i-ème composante du signal x(t). Elle est de la forme s_i(t) = a_i(t) e^{jφ_i(t)} et elle est supposée à une seule crête seulement, ou une seule courbe continue, dans le plan temps-fréquence.
– a_i(t) : désigne l’amplitude de la i-ème composante s_i(t) du signal x(t).
– φ_i(t) : désigne la phase de la i-ème composante s_i(t) du signal x(t).
Lorsque la phase φ_i(t) est un polynôme de degré I, on dit que le signal s_i(t) est un signal FM à phase polynomiale. Dans ce cas
s_i(t) = a_i(t) exp( j Σ_{k=0}^{I} b_{i,k} t^k )        (1.4)
– z(t) : désigne le bruit impulsif, modélisé par des lois à queues lourdes (”heavy-tailed”). À titre d’exemple de ce genre de lois de probabilité, que l’on utilisera pour valider nos approches, nous considérons la famille des lois α-stables avec α < 2 [Samorodnitsky et Taqqu(1994)] et la famille des densités de probabilité des lois gaussiennes généralisées [Kay(1998a)].
Les signaux FM et en particulier les signaux à phase polynomiale (SPP) se rencontrent souvent en télécommunications, notamment dans les signaux de type radar ou sonar [Cohen(1995)], [Suppappola(2003)]. Ces signaux modélisent une vaste
gamme de signaux non-stationnaires puisqu’ils ont des caractéristiques fréquentielles
qui évoluent continuellement au cours du temps avec des vitesses de variation qui
peuvent être importantes [Boashash(2002)].
Nous nous intéressons au problème d’estimation de la fréquence instantanée de
chaque composante si (t) du signal FM (12.1), définie par [Boashash(1992a)]
4
IFi (t) =
1 dφi (t)
2π dt
(1.5)
Plusieurs solutions existent déjà dans la littérature dans les cas mono-composante
et multi-composantes en présence d’un bruit gaussien [Francos et Friedlander(1995)],
[Francos et Porat(1999)], [Ouldali(1999)], [Davy et al.(2002)]. Étant donné le caractère fortement non-stationnaire, les signaux à phase polynomiale ne peuvent
être traités par des techniques développées sous l’hypothèse stationnaire comme le
périodogramme, la méthode de Prony, Music,...D’autres part les techniques adaptatives basées sur l’hypothèse de stationnarité locale du signal à l’intérieur de la
M. Sahmoudi © Processus Alpha-Stables pour la Séparation et l’Estimation Robustes des
Signaux non-Gaussiens et/ou non-Stationnaires
1.3 Objectifs et Contributions
11
fenêtre d’analyse sont peu efficaces pour étudier ces signaux dont la fréquence instantanée peut évoluer rapidement. Par la suite l’analyse de ces signaux nécessite
une approche qui prend en compte explicitement ce caractère non-stationnaire.
C’est pourquoi l’analyse conjointe en temps et en fréquence a été introduite.
En dépit de l’intérêt que suscitent les signaux FM non-stationnaires en traitement du signal et de la théorie qui s’est développée depuis une trentaine d’années,
il reste bien des problèmes à résoudre surtout en ce qui concerne le cas multicomposante dans un bruit non-gaussien. Notre travail de recherche, dans cette deuxième
partie, s’est alors axé dans deux directions :
• Cas multi-composantes : on ne trouve dans la littérature que peu de méthodes
efficaces d’analyse dans le cas multi-composantes c’est-à-dire constituées de
somme des signaux mono-composante. En effet, les techniques existantes
sont souvent des adaptations ou des extensions des méthodes de traitement
des signaux mono-composante.
• Bruit impulsif : Supposer que le bruit b(t) est de loi Gaussienne dans la majorité des algorithmes d’estimation de la fréquence instantanée qui existent
dans la littérature peut se révéler dramatique pour certaines applications
dans lesquelles le bruit peut être impulsif, ou constitué de sources nuisibles
que l’on ne cherche pas à estimer. Une alternative à ce problème est de
modéliser le bruit par les distributions α-stable permettant de prendre en
compte une structure non-gaussienne du signal bruit [Cappé et al.(2002)].
La difficulté majeure réside dans la détermination de la contribution de chaque
composante au niveau des observations du signal et dans la réduction de l’effet du
bruit impulsif.
1.3
Objectifs et Contributions
L’objectif principal est d’utiliser des théories et techniques existantes et de
développer de nouvelles techniques pour le traitement des signaux de nature nongaussienne (impulsive), et/ou non-stationnaire. Plus précisément, le travail de cette
thèse de doctorat se situe au carrefour des deux grandes problématiques suivantes
dans le contexte d’environnement impulsif (bruit ou signaux sources) :
[A]- Séparation aveugle des mélanges linéaires instantanés des sources
impulsives
Ce problème a été peu étudié pour certains cas statistiquement ardus. En effet,
lorsque les sources sont modélisées par des lois α-stables, les méthodes classiques
ne s’appliquent plus, car la densité de probabilité n’a pas d’expression analytique
explicite et les moments d’ordre supérieur ou égal à 2 sont infinis. Dans ce cas,
nous avons introduit quatres approches originales :
– Une approche basée sur le critère de dispersion minimum qui consiste à
minimiser la somme des dispersions des observations blanchies. L’étape de
M. Sahmoudi © Processus Alpha-Stables pour la Séparation et l’Estimation Robustes des
Signaux non-Gaussiens et/ou non-Stationnaires
12
Introduction
pré-blanchiement des observations est basée sur une nouvelle matrice de
covariance normalisée que nous avons introduite.
– Une deuxième approche basée sur l’idée des statistiques normalisées a été
proposée pour adapter les méthodes existantes basées sur les statistiques
d’ordre deux ou d’ordre supérieur.
– Une troisième approche en utilisant des fonctions de contrastes, sous contrainte
d’orthogonalité, basées sur des fonctionnelles sous- ou sur-additives. En particulier, nous proposons un critère qui consiste à minimiser la somme des
normes Lp des observations après une étape de blanchiment pour séparer
des sources éventuellement à variance infinie.
– Une quatrième approche de structure semi-pramétrique. Dans cette méthode
nous formulons le problème de séparation de source sous forme d’un problème
d’estimation par le principe du maximum de vraisemblance. Par la suite,
nous combinons une version stochastique de l’algorithme2 EM et l’approximation des PDF α-stables par les fonctions log-spline afin d’estimer la PDF
et la matrice du mélange.
[B]- Estimation de signaux FM multicomposantes dans un environnement impulsif
La littérature reste relativement pauvre dans le cas multicomposante et en
particulier dans le cas de bruit impulsif α-stable. Pour contribuer à la résolution
de ce problème, nous avons proposé des méthodes paramétriques et d’autres nonparamétriques basées sur l’analyse temps-fréquence.
– Méthodes paramétriques : Nous commençons par ramener le problème à celui
de l’estimation de signaux harmoniques noyés dans un bruit impulsif grâce
à une transformée polynomiale du signal. Une méthode haute résolution
(MUSIC) est alors appliquée au signal ainsi transformé pour l’estimation des
paramètres. Trois cas de figures sont considérés : (i) Celui de l’application
directe de l’algorithme MUSIC au signal harmonique tronqué en amplitude ;
(ii) celui de l’application de l’algorithme MUSIC à l’estimée robuste de la
fonction de covariance du signal harmonique et (iii) celui de l’application de
MUSIC à la covariation généralisée du signal.
– Méthodes non-paramétriques : Dans une première approche, nous avons appliqué la procédure de robustesse au sens minimax d’Huber contre l’effet du
bruit impulsif sous forme d’une étape de pré-traitement par deux techniques
différentes à savoir : (i) technique de compression des amplitudes par un
filtre non-linéaire de type |x|β ; 0 < β < 1 et (ii) la technique de troncature du signal à partir d’une large valeur. Par la suite nous représentons le
signal dans le plan temps-fréquence en utilisant des transformées quadratiques adéquates au cas multicomposantes et un algorithme de type ad-hoc
pour extraire les composantes et estimer leurs fréquences instantanées. Par
contre dans la deuxième, nous avons combiné l’approche de robustesse Mestimation avec les transformées temps-fréquence quadratiques pour définir
2
Le terme algorithme vient de la prononciation latin du nom de Abu Ja’far Muhammad Ibn Mus Al-Khawarizmi, mathématicien arabe du XI-ème siècle vivant à Bagdad et
précurseur de l’algébre [Khawarizmi(ecle)]
M. Sahmoudi © Processus Alpha-Stables pour la Séparation et l’Estimation Robustes des
Signaux non-Gaussiens et/ou non-Stationnaires
1.4 Organisation du Document
13
des transformées robustes à l’effet du bruit impulsif et des termes croisés
d’un signal multicomposantes.
Finalement, une étude numérique vient compléter les résultats théoriques et permet de comparer nos approches à d’autres méthodes existantes dans la littérature.
Notons aussi que les deux problématiques traitées dans cette thèse sont très
riches et attire de plus en plus les spécialistes du traitement du signal. Nous citons
par exemple une nouvelle contribution qui traite le problème de la séparation
aveugle des mélanges convolutifs des signaux FM [Castella et al.(2004)], ce qui est
une combinaison des deux problèmes abordé dans ce travail.
1.4
Organisation du Document
Conscient que l’aspect abstrait des probabilités et statistiques rebute beaucoup
de traiteur du signal, nous avons voulu presenter un exposé vivant, clair et illustré
par de nombreux exemples, figures et shémas.
Ce document est constitué de la présente introduction, de trois parties illustrant
les différents aspects de nos travaux et d’une conclusion. Nous avons ajouté, en
début de chaque chapitre, une introduction détaillant, plus encore, le contexte et
les enjeux de la partie traité dans le dit chapitre, ainsi que les travaux effectués.
Chaque chapitre se termine par une étude de robustesse des contributions, de leurs
performances et des éventuels prolongements que l’on pourrait envisager de donner
à cette méthode. De plus, les tables de matières accompagnent les trois parties de
la thèse.
Plus précisement, ce rapport de thése est organisé comme suit :
Introduction
• Chapitre 1 : Présente les motivations et l’originalité de ce travail de thèse,
précise le cadre technique des problèmes posés et résume nos contributions principales.
Première partie : Préliminaires
• Chapitre 2–4 : Constituée de trois chapitres, réunissent les notions utiles pour
la suite de la famille α-stables des distributions de probabilités non-Gaussiennes,
d’estimation robuste ainsi que l’outil temps-fréquence pour l’analyse des signaux
non-stationnaires. Le lecteur y trouvera toutes les définitions, théorèmes et formules qu’il doit savoir pour la compréhension du manuscrit.
Deuxième partie : Contributions novatrices en séparation aveugle de
sources impulsives de modèle alpha-stable
• Chapitre 5 : Une présentation générale de la séparation aveugle de sources
ainsi que les grands principes des méthodes existantes sont rappelés. Par
M. Sahmoudi © Processus Alpha-Stables pour la Séparation et l’Estimation Robustes des
Signaux non-Gaussiens et/ou non-Stationnaires
14
Introduction
la suite, nous précisons le problème que l’on aborde : la séparation aveugle
d’un mélange instantané linéaire de sources impulsives modélisées par des
distributions alpha-stables.
•
•
•
•
Ensuite, nous introduisons trois nouvelles approches pour le cas des sources
impulsives de modèle α-stable :
Chapitre 6 : Approches basées sur les moments statistiques fractionnaires
d’ordre inférieure. Nous proposons une fonction de contraste basée sur le
critère du dispersion minimum.
Chapitre 7 : Approche de séparation par des fonctions de contrastes sous
contrainte d’orthogonalité. Nous proposons dans ce chapitre deux classes
de fonctions de contrastes basées sur des fonctionnelles sous- ou sur- additives. Des exemples pratiques de fonctions de contrastes sont introduits
pour application aux sources à queue lourde. En particulier, nous proposons
la fonction de contraste qui consiste à minimiser la somme des normes Lp
des observations.
Chapitre 8 : Approche basée sur les statistiques normalisés. Dans ce chapitre nous introduisons des statistiques normalisées dans le but de pouvoir
appliquer correctement les méthodes de séparation de sources basées sur
l’existence des statistiques d’ordre deux et d’ordre supérieur.
Chapitre 9 : Approche semi-paramétrique du principe du maximum de vraisemblance basée sur la combinaison d’une version stochastique de l’algorithme EM et d’une technique d’approximation des densités α-stable par les
fonctions log-splines.
Troisième partie : Contributions novatrices en séparation et éstimation
des signaux FM non-stationnaires afféctés par un bruit impulsif
• Chapitre 10 : Nous commençons cette partie par une présentation générale
des grandes approches paramétriques et non-paramétriques temps-fréquence
existantes dans la littérature.
• Chapitre 11 : Dans ce chapitre, nous présentons trois approches paramétriques
robustes à l’effet du bruit alpha-stable pour l’estimation des signaux FM à
phase polynomiale multi-composantes.
• Chapitre 12 : Dans ce chapitre, nous introduisons deux approches non paramétriques basées sur la représentation temps-fréquence des signaux FM
non-stationnaires considérés dans un environnement impulsif de modèle alphastable.
Conclusion et Perspectives
• Chapitre 13 : A la fin de ce manuscrit une conclusion vient résumer les apports
essentiels du présent travail ainsi que les directions futures de recherche qu’on
envisage.
M. Sahmoudi © Processus Alpha-Stables pour la Séparation et l’Estimation Robustes des
Signaux non-Gaussiens et/ou non-Stationnaires
1.4 Organisation du Document
Fs=1Hz N=7000
15
WHALE SIGNAL
Time−res=120
7000
6000
Time (secs)
5000
4000
3000
2000
1000
0.05
0.1
0.15
0.2
0.25
0.3
Frequency (Hz)
0.35
0.4
0.45
7
8
9
0.5
(a) Signal de Baleines
Fs=20Hz N=600
b−Distribution
Time−res=5
30
25
Time (seconds)
20
15
10
5
1
2
3
4
5
6
Frequency (Hz)
10
(b) Signal Electroenphalogram
Fs=1Hz N=400
BAT SIGNAL
Time−res=8
400
350
300
Time (secs)
250
200
150
100
50
0.05
0.1
0.15
0.2
0.25
0.3
Frequency (Hz)
0.35
0.4
0.45
0.5
(c) Signal de Chauve-Souris
Fig. 1.2: Exemples de signaux non-stationnaires.
(a–c) Représentent des signaux d’applications de la vie réelle par la Bdistribution : (a) pour un signal de Baleines, (b) pour un signal electroencephalogram et (c) pour un signal de Chauve-souris.
M. Sahmoudi © Processus Alpha-Stables pour la Séparation et l’Estimation Robustes des
Signaux non-Gaussiens et/ou non-Stationnaires
16
Introduction
M. Sahmoudi © Processus Alpha-Stables pour la Séparation et l’Estimation Robustes des
Signaux non-Gaussiens et/ou non-Stationnaires
Première partie
Outils pour le Traitement des Signaux
non-Gaussiens et/ou non-Stationnaires
L’objectif de cette partie est d’introduire un certain nombre de
concepts en statistiques et en traitement du signal qui ont servi
comme outils de base pour achever le travail de cette thèse, et
qui seront fréquemment utilisés par la suite dans ce document.
Chapitre 2
Distributions à Queues Lourdes
La proprièté de stabilité, le théorème de la limite centrale et la caractérisation
parfaite par les moments d’ordre un (la moyenne) et d’ordre deux (la variance ou
la covariance) sont des propriètés qui font de la loi gaussienne une des lois les plus
utilisées en modélisation statistique. Cependant, bien que les calculs d’inférence
statistique soient simples, l’hypothèse de gaussianité s’avère trop restrictive en particulier dans certains domaines pour lesquels il faut prendre en compte une plus
grande variabilité des données. Dans le cadre des distributions non-gaussiennes à
variance infinie sont apparues les lois α-stables , dont le moment d’ordre 2 est infini
dès que α est strictement inférieur à 2. Ces lois sont utilisées dans de nombreux
domaines tels que les télécommunications [Bestravos et al.(1998)], le traitement
du signal [Nikias et Shao(1995)] et la finance [Bassi et al.(1998)], [Rachev(2003)]
etc... Elles font partie de la classe des lois de probabilités non-gaussiennes à queue
lourde1 qui englobent d’autres modèles existant dans la litérature et qui ont attiré
l’attention de beaucoup de chercheurs en statistique et en traitement du signal. Le
but de ce chapitre n’est pas de faire une description exhaustive des modèles nongaussiens, il s’agit seulement d’introduire ceux qui sont particulièrement adaptés
à la modélisation des phénomènes impulsifs. On présente plus en détail la famille
des distributions α-stables. Le seul fait que les lois stables ont une queue de type
lourde ou bien asymptotiquement parétienne (pour faire référence à la loi de Pareto) ne suffit pas pour justifier leur importance. Il existe deux raisons profondes, la
première provient d’un théorème que nous verrons dans ce chapitre dit théorème
central limite généralisé et qui accorde bien le statut ”lois limite” aux lois αstables. La deuxième raison provient de la proprièté de stabilité qui affirme que
toute combinaison linéaire de v.a.r. α-stables est aussi de loi α-stable.
1
Formellement, une v.a.r. a une queue lourde si elle a une queue algébrique : il existent
c, α > 0 tel que P r(| X |> x) ∼ cx−α , quand x → ∞.
20
Distributions Non-Gaussiennes à Queues Lourdes
Aprés un bref rappel historique sur les lois stables, leurs distributions univariées sont définies et diverses propriétés sont présentées dans un premier temps.
Puis, sont abordés le problème du test d’une variance finie ou infinie ainsi que
l’estimation des deux paramètres caractérisant une loi symétrique α-stable. Dans
une seconde section, le cas multivarié est traité. Certains concepts de mesure de
dépendance des v.a.r. α-stables telles que la covariation, le coefficient de covariation, le coefficient de covariation symétrique et la codifférence sont introduits ainsi
que leurs propriétés. On termine l’étude des lois α-stables par la présentation de
quelques techniques d’approximations analytiques de leur densité de probabilité.
Enfin, on présente d’autres modèles non-gaussiens à queue algébrique largement
utilisés pour la modélisation des signaux impulsifs.
2.1
Bref Historique
Au cours des développements historiques en astronomie au 18-ème siècle, Gauss
a introduit sa méthode d’estimation par le critére du moindre carré et insista sur
l’importance de la loi qui porte actuellement son nom [Gauss(1963)]. Suivi par les
développements de la théorie des séries de Fourier, Laplace et Poisson tentent de
trouver l’expression analytique de la transformée de Fourier (TF) d’une densité
de probabilité (PDF) et lancent alors la théorie des fonctions caractéristiques sur
la bonne voie. Laplace, en particulier, a souligné le fait que la densité de Gauss
et sa TF ont la même expression analytique. Son étudiant Cauchy étend l’analyse de Laplace et
R considèrenla TF d’une fonction de ”Gauss généralisée” de la
1 ∞
forme fn (x) = π 0 exp(−ct ) cos(tx)dt, en replaçant 2 par n. Il n’a pas réussi a
résoudre le problème mais quand il a considéré le cas n = 1, autre que la loi de
Gauss, il a obtenu la fameuse loi de Cauchy f1 (x) = π(c2C+x2 ) . En remplaçant l’entier naturel n par le réel α on obtient la fameuse famille fα des densités α-stables.
Cependant, à l’époque on ne savait pas qu’il s’agit d’une densité de probabilité
et c’est seulement après les travaux de Pólya et Bernstein que la famille fα est
devenu officiellement une classe de PDF pour 0 < α ≤ 2 [Janicki et Weron(1994)].
En 1925, le mathématicien Français Lévy, en étudiant le théorème limite centrale,
confirme que lorsqu’on relâche la condition de variance finie, la loi limite est une loi
stable [Lévy(1925)]. Motivé par ce dernier résultat Lévy établit la TF de toutes les
distributions α-stable, ce qui lui attribue l’originalité de la théorie des lois stables.
Plus tard en 1937, Lévy a introduit une nouvelle approche pour le traitement des
lois stables qui est celle des distributions infiniment divisibles.
D’autres mathématiciens ont contribué plus tard à l’étude approfondie des lois
stables, notablement de Doblin (1939) en utilisant les fonctions à variations regulière, de Gnedenko et Kolmogorov et de [Zolotarev(1966)]. Quelques années plus
tard, [Fama et Roll(1968)] donnent les premières tabulations des lois symétriques
α-stables (SαS), ce qui va permettre de concevoir les premiers estimateurs de
ces lois. Plus tard, les efforts des statisticiens sont focalisés sur l’estimation de
l’exposant caractéristique α qui caractérise la loi et qui détermine si la loi est
de variance finie ou infinie. [Fama et Roll(1971)] ont utilisé les quantiles pour esM. Sahmoudi © Processus Alpha-Stables pour la Séparation et l’Estimation Robustes des
Signaux non-Gaussiens et/ou non-Stationnaires
2.1 Bref Historique
21
timer le paramètre α ce qui permis aux premiers tests d’apparaı̂tre du modèle
i.i.d. α-stable. De nouvelles techniques d’estimation basées sur la fonction caractéristique vont apparaı̂tre dans les années 80 comme par exemple la méthode de
[Koutrouvelis(1980)] qui semble être la meilleure méthode selon plusieurs études
faites par [Akgiray et Lamoureux(1989)] et [Walter(1994)].
Simultanément, des générateurs de variables aléatoires stables sont conçus
par [Chambers et al.(1976)], dont les algorithmes permettent une amélioration
des possibilités de simulation des situations réelles comme par exemple sur les
marchés financiers ou le bruit télephonique. Suivit par les travaux de Paulauskas
dans le cas multivatriés, [Cambanis et Miller(1981)] ont établit la théorie des processus linéaires de lois stables, [Samorodnitsky et Taqqu(1994)] ont développé la
régression linéaire et non-linéaire des distributions α-stables et l’étude des processus stochastiques stables dans [Janicki et Weron(1994)].
Malgrés cette longue histoire de recherche scientifique, les lois α-stables n’attirent
que peu d’attention des chercheurs en sciences appliquées :
– En Astronomie : La première application des distributions α-stables est
apparue avant Lévy dans le domaine de l’astronomie, quand Holtsmark a
montré que la force gravitationnelle exercé par le système stellaire sur un
point de l’univers a une distribution α-stable d’indice α = 3/2.
– En Finance : Si on regarde par exemple les courbes boursières représentant
l’évolution du prix d’un titre au cours du temps, des périodes hautes s’altérnent
à des périodes basses et ainsi de suite. De plus, des fluctuations et des
périodes irrégulières peuvent être observées. Mandelbrot s’appuie alors sur
la loi de Pareto pour mettre en évidence un nouveau modèle de variation
des prix, appelé lois α-stables. [Mandelbrot(1963)] confirme que son modèle
décrit de façon réaliste la variation des prix pratiqués sur certaines bourses
des valeurs. Par la suite, [Fama(1965)] valide le modèle des lois α-stables
sur le prix du marché des actions. A la fin des années 80, plusieurs travaux
semblent rejeter le modèle i.i.d. α-stable en se retournant vers la remise en
question de l’hypothèse d’indépendance ce qui a conduit à la découverte des
lois d’échelle ou lois à longue dépendance.
– En Télécommunications : Les premiers travaux effectués pour l’application
des lois α-stables en traitement du signal ont vu le jour durant les années
70 par trois chercheurs des laboratoires de BELL (Chambers, Mallow et
Stuck) en prouvant que le modèle α-stable est bien adéquat pour modéliser
le bruit des lignes téléphoniques. Ils ont conduit une série de travaux qui
ont abouti a plusieurs résultats de références comme le critère de dispersion
minimum , filtrage de Kalman des processus α-stable et l’analyse de plusieurs algorithmes d’estimation et de détection dans un bruit non-gaussien
[Stuck et Kleiner(1974), Stuck(1977)]. En 1993, Shao et Nikias ont publié
dans IEEE Magazine un article qui a initialisé la méthodologie de traitement du signal dans un environnement α-stable. Plus tard, l’intérêt à ce
thème devient publique et plus de 120 articles de revue et de conférence sont
apparus en plusieurs applications de ce modèle. D’autres applications sont
beaucoup plus récentes, en internet par exemple le temps d’apparition d’une
M. Sahmoudi © Processus Alpha-Stables pour la Séparation et l’Estimation Robustes des
Signaux non-Gaussiens et/ou non-Stationnaires
22
Distributions Non-Gaussiennes à Queues Lourdes
page web est très variable, ce qui rappelle certains modèles à variance infinie.
Dans ce context, [Adler et al.(1998)] donnent divers exemples d’application
des lois à queues lourdes et en particulier les distributions α-stables. Par
ailleurs, en 1999 une conférence internationale sur le sujet ”Applications of
Heavy-Tailed Distributions in Statistics, Engineering and Economics” était
organisée. Quelques mois plus tard, durant la conférence ”IEEE Higher Order Statistics Workshop”, une session spéciale était consacré au sujet. En
2000, la conférence ICASSP aussi consacre une session spéciale au sujet.
Récemment en 2002, un numéro spécial du journal ”Signal Processing” est
dédié aux modèles à queue lourdes et leurs applications en radar, images,
video et en analyse des données télétrafiques (No. 82, 2002).
– D’une Manière Générale : Notons toutefois que même si le modèle i.i.d.
α-stable n’est pas toujours approprié, il représente un bon compromis entre
exactitude de modélisation et compléxité d’inférence statistique. Plusieurs
livres sont consacrés à ces lois : [Zolotarev(1986)] qui a étudié les lois αstables dans le contexte univarié ; [Samorodnitsky et Taqqu(1994)] qui ont
étudié de manière approfondie beaucoup de propriétés de ces lois dans le
cas univarié comme dans le cas multivarié, [Nikias et Shao(1995)] qui ont
appliqué ces lois dans le domaine du traitement du signal et [Nolan(2004)]
pour une étude de point du vue implémantation et modélisation des données.
En dépit de l’intérêt que représente cette famille de distributions, il reste bien
beaucoup de questions à creuser surtout dans le cas multivarié. Notre travail de
recherche s’est alors axé dans le traitement des signaux impulsifs modélisés par
des lois α-stables.
2.2
Lois Stables Univariées
2.2.1
Lois indéfiniment divisibles
Avant de définir les lois α-stables, nous allons introduire une famille de lois
plus générale : les lois indéfiniment divisibles. C’est à partir de ces lois que sera
précisée la forme de la fonction caractéristique des lois stables. L’importance de
telles lois réside dans la solution du problème suivant :
Déterminer la classe des distributions qui s’expriment comme limite d’une somme de n
variables aléatoires réelles (v.a.r.) indépendantes et identiquement distribuées (i.i.d.) ?
Pour résoudre le problème, introduisons alors la définition suivante.
Définition 2.1. Une v.a.r. X a une distribution indéfiniment divisible si et seulement si
d
∀n, ∃X1 , · · · , Xn indépendantes et de même loi telles que X = X1 + · · · + Xn
d
où = signifie l’égalité en distribution.
Il faut noter que les v.a.r. Xi n’ont pas forcèment la même loi que X mais elles
appartiennent à la même classe de distributions.
M. Sahmoudi © Processus Alpha-Stables pour la Séparation et l’Estimation Robustes des
Signaux non-Gaussiens et/ou non-Stationnaires
2.2 Lois Stables Univariées
23
La classe des v.a.r. indéfiniment divisible permet de résoudre le problème cidessus. En effet, on a le théorème suivant.
Théorème 2.1. Une v.a.r. X est la limite d’une somme de v.a.r. i.i.d. si et
seulement si X est indéfiniment divisible.
Pour la démonstration voir [Shiryayev(1984), page 336].
Remarque 2.1. Une des caractérisations des lois indéfiniment divisibles est que
leur fonction caractéristique peut s’exprimer comme puissance n-ème d’une autre
fonction caractéristique.
Théorème 2.2 (Levy-Khinchin). Si X a une distribution indéfiniment divisible,
alors sa fonction caractéristique s’écrit
¾
½
Z +∞ itx
e − 1 − it sin x
M
(dx)
ΦX (t) = exp iµt +
x2
−∞
où µ est un réel et M est une mesure qui attribue une masse finie à tout intervalle
fini et telle que les deux intégrales suivantes
Z +∞
Z −x
+
−2
−
M (x) =
y M (dy) et M (−x) =
y −2 M (dy)
x
−∞
sont convergentes pour tout x > 0.
Pour la démonstration voir [Feller(1971), page 554].
Pour se rapprocher du théorème de la limite centrale et afin d’obtenir une
forme explicite de la fonction caractéristique, nous allons introduire la famille des
distributions α-stable.
2.2.2
Deux définitions équivalentes des distributions α-stables
Définition 2.2 (Propriété de Stabilité). La distribution d’une v.a.r. X est stable
si pour tout suite ak ; k ∈ IN∗ de nombres réels et toute famille X1 , · · · , Xk i.i.d.
de même loi que X, il existe ck > 0 et bk , deux réels, tels que
d
a1 X1 + · · · + ak Xk = ck X + bk
Lorsque bk = 0, on parle de distribution strictement stable.
Théorème 2.3. Pour toute v.a. stable X, il existe une constante α, 0 < α ≤ 2,
telle que la constante ck vérifie :
cαk = aα1 + · · · + aαk
Le nombre α est appelé exposant caractéristique ou bien indice de stabilité.
Dans le cas k = 2, la démonstration est détaillée dans [Samorodnitsky et Taqqu(1994)].
La généralisation au cas k ∈ IN∗ est évidente.
M. Sahmoudi © Processus Alpha-Stables pour la Séparation et l’Estimation Robustes des
Signaux non-Gaussiens et/ou non-Stationnaires
24
Distributions Non-Gaussiennes à Queues Lourdes
Proposition 2.1. Si X est stable alors, X est indéfiniment divisible.
Preuve
X − bn
On considère les v.a. Yj = jan n , j = 1 · · · n
Les v.a. Yj sont indépendantes car les Xj le sont. On peut écrire
Y1 + · · · + Yn =
X1 + · · · + Xn
bn
−
an
an
d
d
comme X1 + · · · + Xn = an X + bn , d’où Y1 + · · · + Yn = X.
¥
Théorème 2.4 (Théorème Central Limite Généralisé). Sans limitation de variance finie, pour toute suite de variables aléatoires i.i.d. X1 , · · · , Xn , suite an de
nombre réels positifs et suite bn nombres réels, la somme normalisée
(X1 + · · · + Xn )
+ bn
an
converge en distribution vers une variable stable.
La démonstration est détaillée dans [Shiryayev(1984), page 338].
On peut également définir les distributions α-stables à partir de leurs fonctions
caractéristiques.
Définition 2.3. : Fonction caractéristique des lois stables (Levy-Khinchin)
Si X a une distribution stable, alors sa fonction caractéristique s’écrit :
Φ(t) = exp{iat − γ | t |α [1 + jβsign(t)ω(t, α)]}
(2.1)
où
½
ω(t, α) =
tan απ
, si α 6= 1
2
2/π log | t | , si α = 1
(2.2)
Une loi stable notée Sα (a, β, γ) est caractérisée par quatres paramètres :
– α : l’exposant caractéristique, 0 < α ≤ 2. Il caractérise les queues de
distribution en mesurant leurs épaisseurs. C’est pourquoi on parle des distributions α-stable à queues lourdes ou à queue épaisse . Quand α est proche
de 2, la probabilité d’observer des valeurs de la variable aléatoire loin de
la position centrale est faible. Une valeur proche de 0 de l’indice α signifie
que la masse de la queue a une probabilité considérable. La valeur α = 2
correspond à la loi normale (loi de Gauss) pour toute valeur de β, alors que
α = 1, β = 0 correspond à la loi de Cauchy ;
– a : paramètre de position. Il mesure la tendance centrale de la distribution. Lorsque α > 1, a représente la moyenne et si 0 < α < 1, alors a
représente la médiane ;
– γ : la dispersion, mesure la dispersion de la distribution autour du paramètre de position a. Lorsque α = 2, la variance existe et γ = 12 V ar(X) ;
M. Sahmoudi © Processus Alpha-Stables pour la Séparation et l’Estimation Robustes des
Signaux non-Gaussiens et/ou non-Stationnaires
2.2 Lois Stables Univariées
25
– β : paramètre de symétrie, −1 ≤ β ≤ 1. Si β = 0, la loi est symétrique
par rapport au paramètre de position a, de fonction caractéristique φα (t) =
exp{iat−γ|t|α }. Dans ce cas la loi de probabilité est dite α-stable symétrique
ou tout simplement SαS. Les distributions α-stable symétrique représente
une sous classe importante des distributions α-stable. Par exemple la loi de
Gauss et la loi de Cauchy sont des lois SαS.
Par convention, une loi α-stable est dite standard si a = 0 et γ = 1. Enfin, reste à
noter aussi qu’il est assez courant dans la littérature de remplacer la dispersion γ
par σ α et d’appeler σ paramètre d’échelle.
Pour donner une comparaison à la loi de Gauss, nous présentons dans la figure
2.1 des réalisations de variables aléatoires i.i.d. symétriques α-stables d’exposants
α = 0.1, 0.5, 0.8, 1, 1.5 et une réalisation gaussienne. On remarque que plus α
est petit, plus la variable est impulsive.
26
6
4
x 10
5
α=0.1
α=0.5
SαS(t)
SαS(t)
4
2
0
−2
0
50
100
150
−500
100
150
200
100
150
200
0
−200
0
50
100
4
α=1.5
150
200
α=2
2
10
0
−10
50
200
SαS(t)
SαS(t)
20
0
α=1
0
50
−5
400
α=0.8
0
0
−10
200
SαS(t)
SαS(t)
500
−1000
x 10
0
−2
0
50
100
Temps t
150
200
−4
0
50
100
Temps t
150
200
Fig. 2.1: Réalisations de signaux α-stables pour différentes valeurs de α.
2.2.3
Stabilité de quelques lois usuelles
Proposition 2.2 (loi de Gauss). La loi Gauss N (m, σ 2 ) est une loi indéfiniment
divisible et α-stable de paramètre α = 2.
M. Sahmoudi © Processus Alpha-Stables pour la Séparation et l’Estimation Robustes des
Signaux non-Gaussiens et/ou non-Stationnaires
26
Distributions Non-Gaussiennes à Queues Lourdes
Preuve
– Indéfiniment divisible : sa fonction caractéristique s’écrit
½
¾
t2 σ 2
Φα (t) = exp imt −
2
¾¸n
·
½
2
m
2σ
2
= exp i t − t
n
n
comme puissance nème de la fonction caractérististique d’une loi
σ2
normale N ( m
n , n ).
– Stabilité : la loi N (m, σ 2 ) est une loi S2 (m, β, σ 2 /2). Réciproquement,
une loi S2 (a, β, γ) est une loi normale N (a, 2γ).
¥
Proposition 2.3 ( Loi de Cauchy). La loi de Cauchy C(a) est une loi indéfiniment
divisible et α-stable de paramètre α = 1.
Preuve
– Indéfiniment divisible : sa fonction caractéristique s’écrit
Φα (t) = exp(−a|t|)
h
a in
= exp(− |t|)
n
comme puissance nème de la fonction caractérististique d’une loi de cauchy
C( na ).
γ
– Stabilité : la loi de Cauchy généralisée de densité f (x) = π1 γ 2 +(x−m)
2 est une
loi S1 (m, 0, γ).
¥
Proposition 2.4 (Loi de Poisson). La loi de Poisson P(λ) est une loi indéfiniment
divisible mais n’est pas stable.
Preuve
– Indéfiniment divisible : la fonction caractéristique de P(λ) s’écrit
1
(1 − itλ )r
!n
Ã
1
=
r
(1 − itλ ) n )
Φα (t) =
comme puissance nème da la fonction
– P(λ) n’est pas stable : nous proposons une demonstration par l’absurde. On
considère deux v.a. X1 et X2 de poisson, s’ils sont stables alors il existe a > 0
d
et b tels que X1 + X2 = aX1 + b.
½
IE(X1 + X2 )
= IE(aX1 + b)
⇒
V ar(X1 + X2 ) = V ar(aX1 + b)
M. Sahmoudi © Processus Alpha-Stables pour la Séparation et l’Estimation Robustes des
Signaux non-Gaussiens et/ou non-Stationnaires
2.2 Lois Stables Univariées
½
⇒
2λ = aλ + b
⇒
2λ = a2 λ
27
½
b = (2 −
a
=
√
2)λ √
2
les v.a. X1 et X2 ne prennent que des
√ valeurs dans
√ IN et donc X1 + X2 aussi,
cela entraı̂ne une contradiction car 2X1 + (2 − 2)λ n’est pas forcèment à
valeurs dans IN.
¥
2.2.4
Propriétés des lois stables
Dans cette partie, les propriétés les plus importantes des lois α-stables seront
présentées.
[A]- Densité de probabilité
Pour les v.a. α-stables, il n’existe pas une expression explicite de la densité de
probabilité (PDF) dans le cas géneral. Cependant on peut obtenir une expression
sous forme d’une intégrale de la PDF à l’aide de la transformée de Fourier inverse
de la fonction caractéristique
f (x; α, β) =
=
Z +∞
1
exp(−itx)Φα (t)dt
2π −∞
Z
1 +∞
exp(−tα ) cos[xt + βtα ω(t, α)]dt
π 0
Quand la distribution représentée par cette densité est symétrique (β = 0)
autour de zéro (a = 0), la fonction caractéristique est une fonction réelle et paire,
ce qui permet de simplifier l’expression de la densité de probabilité
Z
1 +∞
exp(−γ|t|α ) cos(tx)dt
f (x; α, β) =
π 0
Proposition 2.5 (Propriétés de la densité).
1. La densité de probabilité vérifie : f (x; α, β) = f (−x; α, −β).
2. La densité de probabilité d’une distribution α-stable est une fonction bornée.
3. La densité de probabilité d’une distribution α-stable est de classe C ∞ .
Pour la démonstration, voir [Zolotarev(1986)].
La forme explicite de la densité des lois α-stables n’existe que dans les trois
cas importants suivants :
1. La loi de Gauss S2 (a, 0, γ) :
½
¾
1
(x − a)2
√
α = 2, β = 0 =⇒ f (x; 2, 0) =
exp −
4γ
4πγ
M. Sahmoudi © Processus Alpha-Stables pour la Séparation et l’Estimation Robustes des
Signaux non-Gaussiens et/ou non-Stationnaires
28
Distributions Non-Gaussiennes à Queues Lourdes
2. La loi de Cauchy S1 (a, 0, γ) :
α = 1, β = 0 =⇒ f (x; 1, 0) =
π(γ 2
γ
+ (x − a)2 )
3. La loi de Lévy S 1 (a, 1, γ) :
2
½
¾
1
1
1
γ
γ2
α = , β = 1 =⇒ f (x; , 1) = √
−
3 exp
2
2
2(x − a)
2π (x − a) 2
qui est concentrée sur [a, ∞).
0.7
α=0.5
α=1
α=1.5
α=2
0.6
Densité de probabilité p(x)
0.5
0.4
0.3
0.2
0.1
0
−6
−4
−2
0
x
2
4
6
Fig. 2.2: Densité de probabilité α-stables pour différentes valeurs de α.
[B]- Propriétés algébriques
Proposition 2.6. Soit X1 ∼ Sα (a1 , β1 , γ1 ) et X2 ∼ Sα (a2 , β2 , γ2 ) deux v.a. αstables et indépendantes, alors X1 + X2 ∼ Sα (a, β, γ) tels que a = a1 + a2 , β =
β1 γ1 +β2 γ2
et γ = γ1 + γ2 .
γ1 +γ2
Proposition 2.7. Soit X ∼ Sα (a, β, γ) une v.a. α-stable et c une constante réelle,
alors X + c ∼ Sα (a + c, β, γ)
M. Sahmoudi © Processus Alpha-Stables pour la Séparation et l’Estimation Robustes des
Signaux non-Gaussiens et/ou non-Stationnaires
2.2 Lois Stables Univariées
29
Proposition 2.8. Soit X ∼ Sα (a, β, γ) une v.a. α-stable et h une constante réelle
non nulle, alors
hX ∼ Sα (ha, sign(h)β, |h|α γ)
si α 6= 1
1
2
α
hX ∼ S1 (ha − π h ln |h|γ α β, sign(h)β, |h| γ) si α = 1
Pour la démonstration, voir [Samorodnitsky et Taqqu(1994)].
[C]- Comportement queues lourdes
Définition 2.4. La loi de probabilité d’une v.a.r. est dite à queue lourde d’indice
α s’il existe un nombre α ∈]0, 2[ et une fonction h à variation lente, c’est-à-dire
h(bx)
lim
= 1 pour tout b ∈ IR+ tels que :
x→+∞ h(x)
IP(X ≥ x) = x−α h(x)
(2.3)
Proposition 2.9. Soit X une v.a.r. de loi Sα (a, β, γ) avec 0 < α < 2, alors on a
les deux résultats suivants

1+β


lim tα IP(X > t) = Cα
γ,

 t→+∞
2
(2.4)


1
−
β

 lim tα IP(X < −t) = Cα
γ
t→+∞
2
où Cα est une constante qui ne dépend que de α :
µZ
Cα =
∞
−α
x
¶−1 ½
sin xdx
=
0
2/π
1−α
Γ(2−α)cos(πα/2) ,
si α = 1,
si α =
6 1
Pour la démonstration voir [Samorodnitsky et Taqqu(1994), page 16].
D’aprés cette propriété (2.4) par passage à la limite quand x tend vers +∞, on
remarque que les lois α-stables sont asymptotiquement à queue lourde.
Pour une meilleure illustration des densités α-stables, nous avons présenté les
courbes de leurs densités de probabilité et de leurs queues pour différentes valeur
de α dans la figure 2.2 et la figure 2.3 respectivement. Ces figures montrent l’effet
de l’exposant caractéristique α. Nous remarquons que plus α est petit, plus la
densité est impulsive et sa queue est lourde.
[D]- Propriété de mélange
Théorème 2.5. (Théoème de mélange d’échelles)
Soit x ∼ Sαx (0, γx , 0) avec 0 < αx < 2 et soit 0 < αz ³< αx . Soit y une v. a.
´
αx /αz , 0
z
))
totalement ”skewed” de distribution alpha-stable Sαz /αx −1, (cos( πα
2αx
et indépendant de x. Alors
z = y1/αx x ∼ Sαz (0, γx , 0).
(2.5)
M. Sahmoudi © Processus Alpha-Stables pour la Séparation et l’Estimation Robustes des
Signaux non-Gaussiens et/ou non-Stationnaires
30
Distributions Non-Gaussiennes à Queues Lourdes
0.04
α=0.5
α=1.0
α=1.5
α=2.0
0.035
La queue de la densité de probabilité
0.03
0.025
0.02
0.015
0.01
0.005
0
2
2.5
3
3.5
4
x
4.5
5
5.5
6
Fig. 2.3: Les queues de la densité de probabilité α-stable pour différentes
valeurs de α.
Pour la démonstration voir [Samorodnitsky et Taqqu(1994)] et [Feller(1971)].
Ce théorème nous permet d’écrir une v.a. SαS comme produit de deux v.a. αstable dont l’une est totalement ”skewed”.
Corollaire 2.1. Soient x un
de loi normal
³ vecteur
´ N (0, 2γx ) et y une v.a. positive
¢
¡
παz 2/αz
de loi α-stable ; y ∼ Sαz /2 −1, cos( 4 )
, 0 et indépendant de x. Alors
z = y1/2 x ∼ Sαz (0, γx , 0)
(2.6)
Ce cas spécial du théorème 2.5 montre qu’une v.a. SαS peut être représentée
comme produit d’une v.a. gaussienne et d’une v.a. α-stable positive. Cette proprièté montre que les lois SαS sont des distributions gaussiennes conditionnelles
[Papoulis(1991)].
2.2.5
Moments fractionnaires d’ordre inférieur
[A]- Moments fractionnaires d’ordre positif
Même si les moments du second ordre d’une v.a. SαS avec 0 < α < 2 n’existent
pas, les moments d’ordre inférirur à α existent et s’appellent les moments fractionM. Sahmoudi © Processus Alpha-Stables pour la Séparation et l’Estimation Robustes des
Signaux non-Gaussiens et/ou non-Stationnaires
2.2 Lois Stables Univariées
31
naires d’ordre inférieur (FLOM). La proposition suivante donne l’expression des
FLOM en fonction de la dispersion γ et de l’exposant caractéristique α.
Proposition 2.10. Soit X une v.a. Sα (0, β, γ) ; de paramètre de position nul et
de dispersion γ. Alors
– Si α = 2 : ∀p ≥ 0, IE|X|p < +∞
– Si α < 2 :
½
p
C(α, β, p)γ α 0 < p < α,
IE|X|p =
(2.7)
+∞
p ≥ α.
¡
¢p
¡
¢¢
¡
2p−1 Γ(1−p/α)
2α cos p arctan β tan απ
où C(α, β, p) = R +∞
1 + β 2 tan2 απ
2
−p−1
2
α
2
p
0
u
sin udu
et Γ(.) représente la fonction gamma.
Ce résultat important a été démontré par Zolotarev en utilisant la transformée de
Mellin-Stieljes [Zolotarev(1986)]. Dans [Cambanis et Miller(1981)], le même résultat
a été retrouvé en utilisant une proprièté de la fonction caractéristique. Un résultat
similaire est vrai dans le cas des v.a. stables complexes [Masry et Cambanis(1984)].
[B]- Moments fractionnaires d’ordre négatif
Dans [Ma et Nikias(1995a)], les auteurs ont démontré que les v.a.r. SαS ont
aussi des moments finis d’ordre négatif !. Ce résultat surprenant pour les lois αstables symétriques SαS est présenté dans la proposition suivante.
Proposition 2.11. Soit X une v.a.r. SαS de paramètre de position nul et de
dispersion γ. Alors la formule unifiée pour ces moments d’ordre positif et d’ordre
négatif est
p
(2.8)
IE(|X|p ) = C(p, α)γ α pour tout −1 < p < α.
avec C(p, α) = 2p+1
2.2.6
p
)Γ( 1+p
)
Γ(− α
2
√
α πΓ(− p2 )
.
Simulation des lois stables
[A]- Sources de codes
Pour simuler les lois stables, Chambers et al. ont publié le premier programme
en langage FORTRAN dans [Chambers et al.(1976)]. Le même code était amelioré par
Chambers et J. Nolan est publié dans le livre [Samorodnitsky et Taqqu(1994)]. Il
existe aussi une fonction rstab dans la bibliothèque du logiciel S-PLUS. Pour un
programme MATLAB, on peut consulter la page web du professeur John NOLAN.
[B]- Quelques exemples
Nous avons simulé 5000 réalisations de lois SαS pour différentes valeurs de α.
Le tableau suivant (Tableau 2.1) représente la moyenne et la variance empirique
des 5000 réalisations. Ces résultats confirment l’équation sur le calcul des moments.
En effet, lorsque α décroı̂t vers 1, la variance diverge et lorsque α devient plus petit
que 1, c’est la moyenne qui commence à diverger.
M. Sahmoudi © Processus Alpha-Stables pour la Séparation et l’Estimation Robustes des
Signaux non-Gaussiens et/ou non-Stationnaires
32
Distributions Non-Gaussiennes à Queues Lourdes
Valeur de α
IE(X)
V ar(X)
0.5
5324.87
3323423.23
0.9
27.12
3312198.76
1
-0.48
2171.12
1.2
0.01
152.13
1.5
0.03
36.76
1.7
2
0.02 0.02
6.27 2.12
Tab. 2.1 – La moyenne et la variance des lois α-stables pour différentes
valeurs de α
2.3
Inférence Statistique des Lois Stables
La première tâche du traiteur de signal est consacrée à l’étude de la modélisation
des données par des loi de probabilités. En particulier, dans cette section nous
étudierons l’adéquation de la famille des lois α-stable pour cette modélisation
[Nolan(2004)]. Plusieurs niveaux de tests et de validation sont possibles pour
différentes classes de signaux (e.g., images biomédicales, images astronomiques,
signaux EEG, etc). Dans un premier temps, on pourra tester si la distribution des
données est à queue lourde en utilisant l’histogramme de la loi normale. Si l’hypothèse de normalité est violée, on testera si la variance des données est infinie en
utilisant le test de convergence des variances [Adler et al.(1998)]. Ainsi, on pourra
conclure si les données sont dans le domaine d’attraction par l’estimation du coefficient exponentiel α directement à partir des données et en utilisant la méthode
dite “stabilized p-p plots” [Michael(1983)] .
2.3.1
Tests de la variance
Nous allons présenter deux méthodes graphiques pour tester si la distribution
de nos observations est à variance finie ou infinie.
[A]- Test graphique de la convergence de la variance empirique
La stratégie qui semble la plus simple pour tester si la variance est finie ou pas
c’est de faire augmenter la taille de l’échantillon et calculer la variance empirique
correspondant. Plus précisément, on propose l’algorithme résumé dans le tableau
2.2. Si les observations ont une loi à variance finie, lorsqu’on fait augmenter la
taille N des observations, la variance doit converger vers une valeur finie. Dans
le cas contraire, si les observations proviennent d’une loi à variance infinie, un
comportement de divergence doit être observé.
[B]- Test graphique de la queue
L’idée principale de ce deuxième test est basée sur le comportement asympto1+β
tique ”queue lourde” des lois α-stable lim tα IP(X > t) = Cα
γ. Alors, cela
t→+∞
2
implique
d log F̄ (x)
∼ −α; x → +∞
(2.9)
d log x
M. Sahmoudi © Processus Alpha-Stables pour la Séparation et l’Estimation Robustes des
Signaux non-Gaussiens et/ou non-Stationnaires
2.3 Inférence Statistique des Lois Stables
33
'
$
Test graphique de la variance
N
X
1
Step 1. Calcul de la moyenne empirique X̄ =
Xi
N
i=1
Step 2. Calcul de la variance empirique
2
σ̂N
=
1
N
N
X
(Xi − X̄)2
i=1
2
)
(N, σ̂N
Step
3. Visualisation de la courbe
assez grands.
pour des
N
&
%
Tab. 2.2 – Test graphique de la variance en utilisant la variance empirique.
où F̄ (x) = IP(X > x) est le complémentaire de la fonction de répartition F . Nous
résumons l’algorithme dans le tableau 2.3.
'
$
Test graphique log-log de la queue
Step 1. Calcul du logarithme de la queue
4
q(x) = log(
N
1 X
1I|Xi |>x )
N i=1
Step 2. Visualisation de la courbe
assez grands.
(log x, q(x))
pour des
x
&
Tab. 2.3 – Test graphique de la queue d’une distribution par la méthode dite
”log-log”.
Si la variance de la loi de distributions des données est finie la pente de la
courbe doit converger vers une valeur finie [Adler et al.(1998)].
2.3.2
Estimation des paramètres des lois α-stables
La plupart des algorithmes de traitement du signal utilisant des lois α-stables
exigent l’estimation a priori des paramètres de la distribution α-stable et en particulier l’exposant caractéristique α. D’où l’importance d’avoir des techniques efficaces d’estimation des paramètres de la loi. Pour une loi α-stable symétrique
SαS, les paramètres de la distribution à estimer sont l’exposant caractéristique α
et la dispersion γ. De nombreuses méthodes ont été proposées dans la littérature :
maximum de vraisemblance [DuMouchel(1973), Bodenschatz et Nikias(1999)], utilisation des fractiles de la distribution [Fama et Roll(1968)], utilisation de la fonction caractéristique [Koutrouvelis(1980)], utilisation des moments fractionnaires
d’ordre inférieur positifs et négatifs [Ma et Nikias(1995b)], utilisation des moments
logarithmiques de la loi SαS [Ma et Nikias(1995b)], utilisation de la fonction de
M. Sahmoudi © Processus Alpha-Stables pour la Séparation et l’Estimation Robustes des
Signaux non-Gaussiens et/ou non-Stationnaires
%
34
Distributions Non-Gaussiennes à Queues Lourdes
répartion dans [Maymon et al.(2000)] et la généralisation des méthodes existantes
au cas d’une loi α-stable non symétrique [Kuruoglu(2001)]. Dans cette partie, nous
allons discuter la méthode du maximum de vraissemblance et détailler la méthode
basée sur la fonction caractéristique.
[A]- Médode du maximum de vraisemblance
Cette approche largement utilisée en statistique souffre d’une difficulté majeure
dans le cas des distributions α-stable, à savoir le manque d’expression analytique
de la PDF. Malgré cela, [DuMouchel(1973)] a dévoloppé une telle approche dans
ce contexte. D’autres chercheurs ont utilisé des techniques de Monte Carlo ou
des approximations pour approcher les intégrales de l’expression de la densité
[Nolan(2004)]. Cependant, toutes ces méthodes nécessitent une grande complexité
de calcul. De plus, il n’existe aucune étude de convergence de cette approche dans
la littérature.
[B]- Méthode de régression basée sur la fonction caractéristique
Pour une v.a.r SαS, l’expression de la fonction caractéristique est donnée par
ϕX (t) = exp{−γ | t |α }
Ce qui entraı̂ne
| ϕX (t) |2 = exp{−2γ | t |α }
£
¤
log − log | ϕX (t) |2 = log 2γ + α log | t | .
£
¤
On pose yk = log − log | ϕX (t) |2 , λ = log 2γ, ωk = log | tk | ; l’égalité précédente
implique que yk = λ + αωk . Si on pose
¡
¢
ŷk = log − log | ϕ̂X (tk ) |2
où
"
#2 " n
#2 
n


X
X
1
| ϕ̂X (tk ) |2 = 2
cos(tk xi ) +
sin(tk xi )

n 
i=1
i=1
On peut alors proposer comme modèle linéaire suivant
Ŷ = λ + αW + ε
Or la partie imaginaire de la fonction caractéristique est nulle, on a alors l’estimateur de la fonction caractéristique donnée par :
2
| ϕˆX (tk ) | =
µ Pn
i=1 cos(tk xi )
n
¶2
.
(2.10)
En ce qui concerne le choix des tk , ainsi que le choix de K par rapport à n,
on suit la démarche décrite dans [Koutrouvelis(1980)], c’est-à-dire : quelque soit
k ∈ [1, K], tk = πk
25 et le paramètre K est choisi suivant le tableau 2.4 ci-dessous.
M. Sahmoudi © Processus Alpha-Stables pour la Séparation et l’Estimation Robustes des
Signaux non-Gaussiens et/ou non-Stationnaires
2.4 Lois Stables Multivariées
α
n
200
800
1600
35
0.3
0.5 0.7
0.9 1.1
1.3
1.5 1.7
1.9
134
124
118
86
68
56
28
22
18
22
16
14
11
11
11
9
9
10
30
24
20
24
18
15
10
10
10
Tab. 2.4 – Valeurs optimales de K en fonction de n et de α
– Estimation
du paramètre α : par régression linéaire en choisissant les ωk tel
P
que K
ω
= 0, on obtient
k
k=1
PK
α̂ = Pk=1
K
ωk ŷk
(2.11)
2
k=1 ωk
– Estimation de P
la dispersion γ : De même, par régression linéaire et le choix
des ωk tel que K
k=1 ωk = 0, on obtient
1
γ̂ = exp
2
2.4
Ã
K
1 X
ŷk
K
!
(2.12)
k=1
Lois Stables Multivariées
2.4.1
Définition et propriétés
Définition 2.5. Le vecteur aléatoire X = (X1 , · · · , Xd ) est dit α-stable dans IRd
si pour toute suite de nombre positifs a1 , · · · , ak , il existe un nombre positif ck et
un vecteur D(k) ∈ IRd tels que
a1 X(1) + · · · + ak X(k) = ck X + D(k)
d
(2.13)
où X(1) , · · · , X(k) sont des copies indépendantes de X [Samorodnitsky et Taqqu(1994)].
Lorsque D(k) est le vecteur nul, on parle de loi strictement alpha-stable.
Proposition 2.12. Si X est un vecteur α-stable, alors toute combinaison linéaire
des composantes de X est une v.a.r. α-stable.
Preuve P
Soit Y = ni=1 λi Xi une combinaison linéaire des composantes de X.
M. Sahmoudi © Processus Alpha-Stables pour la Séparation et l’Estimation Robustes des
Signaux non-Gaussiens et/ou non-Stationnaires
36
Distributions Non-Gaussiennes à Queues Lourdes
Considérons Y1 , · · · , Y2 des copies de Y .
Y1 + · · · + Yk
d
=
n
X
(1)
λi Xi
+ ··· +
i=1
d
=
n
X
n
X
(k)
λi Xi
i=1
³
´
(1)
(k)
λi Xi + · · · + Xi
i=1
d
=
n
X
´
³
(k)
λi ck Xi + Di
i=1
d
=
ck
n
X
λi Xi +
j=1
d
=
n
X
(k)
λi Di
i=1
ck Y + bk .
¥
Contrairement au cas mono-variable, la fonction caractéristique d’un vecteur stable
multi-variable n’a pas d’expression explicite en t.
Définition 2.6. Une fonction caractéristique d’une v.a. de dimension n est dite
α-stable si elle s’écrit sous la forme
½
T
T
exp(jt
¡ Ta − tR At), T n−1 α
¢ si α = 2;
Φ(t) =
exp jt a − S n−1 |t S
| µ(dS n−1 ) + jβα (t) , si 0 < α < 2.
(2.14)
où
–
R
½
απ
T n−1 |α sign(tT S n−1 )µ(dS n−1 ), si α 6= 1, 0 < α < 2
tan(
)
2
S n−1 |t S
βα = R
T
n−1
log |tT S n−1 |µ(dS n−1 ),
si α = 1.
S n−1 t S
(2.15)
n−1
– S
est la sphère unité de dimension n,
– a, t ∈ IRn ,
– µ(.) est la mesure spectrale de la sphére unité 2 ,
– A est une matrice symétrique, semi-définie positive.
Notons que le cas α = 2 correspond a une distribution gaussienne multivariée
de moyenne a et de matrice de covariance 2A. Notons aussi qu´à l’exception de
ce dernier cas α = 2, les distribution stables multivariées sont déterminés par le
vecteur a ∈ IRn , un scalaire 0 < α < 2 et une mesure finie µ(dS n−1 ) sur la sphère
unité S n−1 .
Définition 2.7. Un vecteur x est dit de distribution α-stable symétrique (SαS)
si x est un vecteur α-stable et si les distributions de −x et x sont identiques.
Théorème 2.6. Soit x un vecteur α-stable, on a les résultats suivants.
2
C’est une mesure sur l’ensemble des boréliens de la sphére unité
M. Sahmoudi © Processus Alpha-Stables pour la Séparation et l’Estimation Robustes des
Signaux non-Gaussiens et/ou non-Stationnaires
2.4 Lois Stables Multivariées
37
1. Si toute combinaison linéaire des composantes de x a une loi symétrique
α-stable, alors x est un vecteur SαS.
2. Si toute combinaison linéaire des composantes de x a une distribution αstable, avec un indice de stabilité α ≥ 1, alors x est un vecteur α-stable.
La démonstration est détaillée dans [Samorodnitsky et Taqqu(1994), page 59].
Proposition 2.13. Soit A une matrice de type m × n et x un vecteur SαS de
dimension n, alors y = Ax est un vecteur SαS de dimension m.
Preuve
D’après le théorème 2.6, il suffit de montrer que toute combinaison
linéaire des composantes de y a une distribution SαS. En effet, soient
b1 , · · · , bm , m réels et x un vecteur SαS, alors nous avons
m
X
bj Yj ∼ SαS ⇐⇒ bt y ∼ SαS
j=1
=⇒ bt Ax ∼ SαS =⇒
Ãm
n
X
X
j=1
!
bi aij
Xj ∼ SαS.
i=1
Or le vecteur x a une distribution SαS, alors la dernière combinaison
ci-dessus a une distribution SαS. Par conséquent, le vecteur y est un
vecteur SαS.
¥
Remarque 2.2. Blanchiment des SαS
Il est montré dans plusieurs ouvrages de traitement statistique du signal qu’on peut
blanchir tout vecteur de distribution Gaussienne. Précisément, si x est un vecteur
Gaussien alors on peut l’écrire sous la forme
x = Ay
où A est une matrice constante et y est un vecteur Gaussien à composantes
indépendantes. Cependant, dans le cas des lois stables, la représentation de deux
variables stables de même exposant caractéristique α, 0 < α < 2, comme combinaison linéaire d’un nombre fini de variables stables indépendantes est impossible en
général [Schilder(1970)]. Ce résultat remarquable nous impose de faire attention
lors de la généralisation de certaines propriètés des lois Gaussiennes au cas des
lois stables au sens de Lévy.
2.4.2
Moments des lois stables multivariées
Le calcul des moments des lois stables multivariées découle de celui des lois
stables univariées.
Théorème 2.7.
M. Sahmoudi © Processus Alpha-Stables pour la Séparation et l’Estimation Robustes des
Signaux non-Gaussiens et/ou non-Stationnaires
38
Distributions Non-Gaussiennes à Queues Lourdes
1. Si X1 , · · · , Xn sont des des v.a.r. α-stables et indépendantes, alors
IE (|X1 |p1 · · · |Xn |pn ) < ∞ si et seulement si pi < α; i = 1, · · · , n
2. Si X1 , · · · , Xn sont des v.a.r. dépendantes et conjointement α-stables, alors
IE (|X1 |p1 · · · |Xn |pn ) < ∞ si et seulement si 0 < p1 + · · · + pn < α
Cette condition est trés faible et souvent réalisée dans la pratique. Pour plus
de détails voir [Miller(1978)].
Dans sa forme générale, une distribution alpha-stable multi-variée reste difficile à
exploiter dans la pratique du traitement du signal. Cependant il existe quelques
sous-classes des distributions α-stables multi-variées avec une expression simplifiée
de la fonction caractéristique. Une telle classe est celle des distributions sousGaussiennes dont la description est presentée ci-dessous.
2.4.3
Vecteur aléatoire α-sous-gaussien
Définition 2.8. La fonction caractéristique des distributions α-sous-Gaussiennes
est donnée par
¶
µ
1 T
def
α/2
Φ(t) = exp − (t Rt)
2
où R est une matrice définie positive. Cette sous-classe est souvent noté par α −
SG(R) [Cambanis et Miller(1981)].
Un vecteur aléatoire de distribution α − SG(R) peut se décomposer comme
produit (ou mélange) d’un vecteur aléatoire α-stable et d’un vecteur gaussien.
Proposition 2.14. Soit x un vecteur α-stable ; x ∈ SG(R), alors
1
x = η2y
avec η une variable aléatoire positive α2 − stable et y un vecteur Gaussien de
moyenne nulle et de covariance R. En plus, η et y sont indépendantes.
Pour la démonstration voir [Cambanis et Miller(1981)].
³
¡ ¢2/α ´
Dans la proposition précédente η ∼ Sα/2 a = 0, β = 1, γ = cos πα
. Alors
4
on peut la voir comme extension du résultat de mélange des SαS.
Comme on peut le voir de la formulation de la définition par la fonction
carctéristique ci-dessus, les paramètres β et γ ne sont plus indépendants. Leurs
valeurs peuvent être déterminées en utilisant la fonction caractéristique et la
mesure spectrale. En effet, contrairement aux distributions alpha-stables monovariables qui forment une classe paramétrique, les distribution stables multi-variées
forment une classe non-paramétrique. Pour plus de détails sur cette classe le lecteur
intéressé peut consulter [Samorodnitsky et Taqqu(1994)].
M. Sahmoudi © Processus Alpha-Stables pour la Séparation et l’Estimation Robustes des
Signaux non-Gaussiens et/ou non-Stationnaires
2.5 Mesure de Dépendance des v.a.r. α-Stables
2.5
39
Mesure de Dépendance des v.a.r. α-Stables
Le coefficient de corrélation est la mesure classique de dépendance (à l’ordre
2) entre deux v.a.r. X1 et X2 de variances finies. Cependant, pour les lois alphastables, les moments d’ordre p avec p ≥ α et 0 < α < 2 sont infinis et en particulier
la variance. Par conséquent, le coefficient de corrélation n’est plus valable en tant
que mesure de dépendance. Dans ce cas, d’autres mesures existent dans la littératue
utilisant les moments fractionnaires d’ordre inférieur à α comme la covariation, la
codifférence et le coefficient de covariation symétrique et d’autres basées sur les
rangs ou sur les densités de probabilités. Dans cette section, nous allons présenter
les plus connues et les plus utilisés pour mettre en évidence certaines particularités
surprenantes des lois stables concernant la structure de dépendance.
2.5.1
Covariation
Nous supposons dans cette partie que 1 < α ≤ 2.
Définition 2.9. Soit (X1 , X2 ) un vecteur SαS avec α strictement supérieur à 1,
la covariation de X1 sur X2 est définie par la quantité
[X1, X2]_α = ∫_{S_1} x1 x2^{<α−1>} dµ_{S_1}(x1, x2)    (2.16)
où S_1 est la sphère unité, µ_{S_1} est la mesure spectrale et <·> désigne la notation suivante : x^{<a>} = sign(x) |x|^a.
Nous présentons une autre définition de la covariation équivalente à la précédente
qui permettra de démontrer facilement plusieurs propriétés.
Proposition 2.15. Soit (X1 , X2 ) un vecteur SαS avec α strictement supérieur à
1, la covariation de X1 sur X2 peut s'écrire
[X1, X2]_α = (1/α) ∂γ(θ1, θ2)/∂θ1 |_{θ1=0, θ2=1}    (2.17)
où γ(θ1, θ2) est le paramètre de dispersion de la variable aléatoire θ1 X1 + θ2 X2.
Preuve
La démonstration se fait aisément en se rappelant que
γ(θ1, θ2) = ∫_{S_1} |θ1 x1 + θ2 x2|^α dµ_{S_1}(x1, x2).
Proposition 2.16.
1. Dans le cas Gaussien (α = 2), la covariation est identique à la moitié de la
covariance.
(X, Y) ∼ SαS =⇒ [X, Y]_2 = (1/2) Cov(X, Y)
2. Si X et Y sont deux v.a.r. indépendantes et conjointement SαS alors,
[X, Y ]α = 0.
3. La covariation [X, Y]_α est linéaire en X, ou bien linéaire à gauche, c'est-à-dire que si (X1, X2, Y) est un vecteur SαS, alors
[a1 X1 + a2 X2 , Y ]α = a1 [X1 , Y ]α + a2 [X2 , Y ]α
pour toutes constantes réelles a1 et a2 .
4. En général, [X, Y]_α n'est pas linéaire à droite, c'est-à-dire par rapport à Y, mais elle possède la propriété de pseudo-linéarité suivante : si (X, Y1, Y2) est un vecteur SαS et que Y1 et Y2 sont indépendantes, alors
[X, b1 Y1 + b2 Y2]_α = b1^{<α−1>} [X, Y1]_α + b2^{<α−1>} [X, Y2]_α
pour toutes constantes réelles b1 et b2.
Les démonstrations sont détaillées dans [Samorodnitsky et Taqqu(1994)].
2.5.2 Métrique de covariation
Définition 2.10. Soit X une v.a.r. SαS de dispersion γ et de paramètre de location a = 0. La norme de X est définie par
‖X‖_α = γ          si 0 < α < 1
‖X‖_α = γ^{1/α}    si 1 ≤ α ≤ 2    (2.18)
Alors, la norme kXkα est une quantité liée directement à la dispersion γ et
détermine la distribution de X via la fonction caractéristique.
Définition 2.11. Si X et Y sont deux v.a.r. conjointement α-stable, la distance
entre X et Y est définie par
dα (X, Y ) = kX − Y kα
(2.19)
En combinant les deux équations (6.3) et (2.7), on peut facilement remarquer
que la distance dα mesure le p-ème moment de la différence des deux v.a.r. Dans
le cas α = 2, cette distance est identique à la moitié de la variance de la différence
des deux v.a.r. Notons aussi que la convergence en distance dα est équivalente à
la convergence en probabilité [Cambanis et Miller(1981)].
Il est connu dans la théorie des statistiques d'ordre deux que l'espace des v.a.r. d'un processus aléatoire à variance finie est un espace de Hilbert. Cependant, ce n'est pas le cas pour les v.a.r. α-stables, mais il existe un résultat similaire. En effet, si l'on considère un processus α-stable X(t), t ∈ T, alors l'ensemble des combinaisons linéaires des variables aléatoires X(t) forme un espace linéaire noté l(X(t), t ∈ T). Dans cet espace, toutes les v.a.r. sont conjointement α-stables de même exposant caractéristique α [Cambanis et Miller(1981)]. Le théorème suivant
précise la structure de l’espace linéaire des v.a.r. SαS.
Théorème 2.8.
– Pour tout 0 < α ≤ 2, la distance dα définit une métrique sur l’espace
l(X(t), t ∈ T ).
– Particulièrement pour 1 ≤ α ≤ 2, k.kα est une norme définie sur l’espace
l(X(t), t ∈ T ).
Preuve
Il faut et il suffit de vérifier les trois axiomes d’une norme.
1. Soit X une v.a.r. SαS, alors
‖X‖_α = 0 ⇐⇒ γ_X = 0 =⇒ ϕ_X(t) = 1 =⇒ X = 0 p.s.
2. Soit λ un scalaire réel et X une v.a.r. SαS de dispersion γ. D'après la proposition 2.8, la dispersion de λX est |λ|^α γ. On a donc
‖λX‖_α = ( |λ|^α γ )^{1/α} = |λ| γ^{1/α} = |λ| ‖X‖_α
3. Si X1 et X2 sont deux v.a.r. conjointement SαS de mesure spectrale µ, alors
‖X1 + X2‖_α = γ_{X1+X2}^{1/α}
            = ( ∫_S |x1 + x2|^α µ(dS) )^{1/α}
            ≤ ( ∫_S |x1|^α µ(dS) )^{1/α} + ( ∫_S |x2|^α µ(dS) )^{1/α}
            = γ_{X1}^{1/α} + γ_{X2}^{1/α}
            = ‖X1‖_α + ‖X2‖_α.
Alors ‖·‖_α définit bien une norme sur l'espace vectoriel des vecteurs SαS.
¥
La difficulté fondamentale en traitement des signaux alpha-stables par les statistiques fractionnaires d’ordre inférieur est que la théorie des espaces d’Hilbert
n’est pas valide dans ce cas : l’espace linéaire des processus alpha-stables est
un espace de Banach pour 1 ≤ α ≤ 2 mais seulement un espace métrique pour
0 < α < 1.
2.5.3 Coefficient de covariation
Dans cette partie, (X, Y ) est un vecteur SαS avec α > 1.
Définition 2.12. Le coefficient de covariation de X sur Y est défini par
λ_{X,Y} = [X, Y]_α / [Y, Y]_α    (2.20)
où [X, Y]_α est la covariation entre X et Y.
Ces définitions de covariation et de coefficient de covariation ne sont pas très faciles à utiliser en pratique puisqu’elles utilisent la mesure spectrale. Heureusement,
on peut connecter la covariation et le coefficient de covariation avec les moments
fractionnaires d’ordre strictement inférieur à α.
Théorème 2.9. Soient X et Y deux v.a.r. conjointement SαS avec 1 < α ≤ 2.
Notons la dispersion de Y par γY , alors
– Covariation :
[X, Y]_α = ( IE(X Y^{<p−1>}) / IE(|Y|^p) ) γ_Y ,   1 ≤ p < α    (2.21)
– Coefficient de covariation :
λ_{X,Y} = IE(X Y^{<p−1>}) / IE(|Y|^p) ,   1 ≤ p < α    (2.22)
Les moments fractionnaires d’ordre inférieur dépendent de la loi α-stable qui
dépend directement de α. Cela implique que la covariation et le coefficient de
covariation dépendent de α.
Proposition 2.17. Soit X une v.a.r. α-stable d'exposant caractéristique α et de dispersion γ_X. Alors la dispersion de X peut être exprimée sous la forme
γ_X = ∫_{S_1} |x|^α dµ_{S_1}(x) = [X, X]_α    (2.23)
Preuve
Il suffit de combiner les deux équations (2.16) et (2.21).
¥
Proposition 2.18.
1. Soit (X, Y) un vecteur SαS, alors on a
λ_{aX,bY} = (a/b) λ_{X,Y}
pour tout couple (a, b) ∈ IR × IR*.
2. Soit (X, Y, Z) un vecteur SαS, alors on a
λX+Y,Z = λX,Z + λY,Z
3. Le coefficient de covariation entre X et Y n’est pas symétrique et n’est pas
borné.
Preuve
1. D’aprés la définition du coefficient de covariation, on a
λaX,bY
=
=
[aX, bY ]α
ab<α−1> [X, Y ]α
=
[bY, bY ]α
| b |α [Y, Y ]α
a [X, Y ]α
a
= λX,Y .
b [Y, Y ]α
b
2. D’aprés la linéarité à gauche de λX,Y on peut écrire
λX+Y,Z
[X + Y, Z]α
[X, Z]α + [Y, Z]α
=
[Z, Z]α
[Z, Z]α
[X, Z]α
[Y, Z]α
=
+
[Z, Z]α
[Z, Z]α
= λX,Z + λY,Z .
=
3. Il suffit de prendre X = cY avec c ≠ ±1 et de voir que λ_{X,Y} = c et que λ_{Y,X} = 1/c. Alors λ_{X,Y} ≠ λ_{Y,X}.
Par le même exemple, on peut conclure que le coefficient de covariation λ_{X,Y} = c n'est pas borné.
¥
2.5.4 Codifférence
Nous supposons dans cette partie que 0 < α ≤ 2.
Comme le coefficient de covariation, la codifférence est une autre quantité qui
permet de mesurer la dépendance entre deux v.a.r. SαS.
Définition 2.13. La codifférence entre X et Y est définie par
τ_{X,Y} = ‖X‖_α^α + ‖Y‖_α^α − ‖X − Y‖_α^α    (2.24)
où ‖·‖_α est la norme de covariation introduite précédemment.
Proposition 2.19.
1. La codifférence est symétrique : τX,Y = τY,X .
2. Si α = 2, comme le coefficient de covariation, la codifférence est liée à la
covariance :
τX,Y = Cov(X, Y )
Preuve
1. Pour montrer la symétrie de la codifférence, il suffit de montrer
que
kX − Y kαα = kY − Xkαα .
Or k.kαα est une norme et donc pour toute v.a.r. X SαS, on a
kXkαα = k − Xkαα ce qui achève la preuve.
2. On a vu que [X, Y]_2 = (1/2) Cov(X, Y) et que ‖X‖_α^α = [X, X]_α. Ce qui entraîne que
‖X‖_2^2 = (1/2) Cov(X, X) = (1/2) Var(X)
et donc
τ_{X,Y} = (1/2) Var(X) + (1/2) Var(Y) − (1/2) Var(X − Y).
Or Var(X − Y) = Var(X) + Var(Y) − 2 Cov(X, Y), ce qui donne le résultat souhaité, soit τ_{X,Y} = Cov(X, Y).
¥
2.5.5 Coefficient de covariation symétrique
Définition 2.14. (Garel et al., 2004) Soit (X, Y) un couple aléatoire réel SαS. Le coefficient de covariation symétrique entre X et Y est défini par
Corr_α(X, Y) = λ_{X,Y} λ_{Y,X} = ( [X, Y]_α [Y, X]_α ) / ( [X, X]_α [Y, Y]_α )
On obtient alors le résultat suivant qui corrige les inconvénients du coefficient
de covariation.
Proposition 2.20. Soit (X, Y ) un couple aléatoire réel SαS. Nous avons les
propriétés suivantes
1. Corrα (X, Y ) = Corrα (Y, X) et |Corrα (X, Y )| ≤ 1.
2. Si X et Y sont deux v.a.r. SαS indépendantes, alors Corrα (X, Y ) = 0.
2.5.6 Estimation des coefficients de covariation
Proposition 2.21. (Samorodnitsky et Taqqu, 1994 ; d’Estampes, 2003)
Soit (X, Y ) un couple aléatoire réel SαS où α > 1, nous avons pour tout 1 ≤ p < α,
λ_{X,Y} = [X, Y]_α / [Y, Y]_α = IE( X Y^{<p−1>} ) / IE( |Y|^p ).
Soit (X_1, ..., X_n) (resp. (Y_1, ..., Y_n)) un n-échantillon de même loi que X (resp. Y). En prenant p = 1 dans l'équation précédente, on peut construire un estimateur de λ_{X,Y}, à savoir
λ̂_{X,Y} = Σ_{i=1}^n X_i sign(Y_i) / Σ_{i=1}^n |Y_i| .
Pour estimer Corr_α(X, Y), nous utilisons alors la quantité suivante
Ĉorr_α(X, Y) = ( Σ_{i=1}^n X_i sign(Y_i) / Σ_{i=1}^n |Y_i| ) · ( Σ_{i=1}^n Y_i sign(X_i) / Σ_{i=1}^n |X_i| )
qui est le produit de l'estimateur du coefficient de covariation λ_{X,Y} par l'estimateur du coefficient de covariation λ_{Y,X}.
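À titre d'illustration (hors manuscrit), voici une esquisse minimale en Python de ces deux estimateurs empiriques, sous l'hypothèse p = 1 ; les noms de fonctions sont choisis librement.

    import numpy as np

    def lambda_hat(x, y):
        # Estimateur du coefficient de covariation lambda_{X,Y} (p = 1)
        return np.sum(x * np.sign(y)) / np.sum(np.abs(y))

    def corr_alpha_hat(x, y):
        # Estimateur du coefficient de covariation symetrique Corr_alpha(X, Y)
        return lambda_hat(x, y) * lambda_hat(y, x)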
2.6 Représentation Analytique des PDF α-Stables
2.6.1 Développement en séries entières
A l’exception des trois lois particulières, lois de Gauss, loi de Cauchy et la loi
de Lévy, la PDF des distributions α-stables n’a pas d’expression analytique exacte.
Cependant, il existe un développement en série entière de celle ci. Par exemple,
le développement en série entière de la PDF d’une distribution α-stable standard
SαS, est donné par [Samorodnitsky et Taqqu(1994)] :

f_α(x) = (1/π) Σ_{k=1}^∞ ( (−1)^{k−1} / k! ) Γ(αk + 1) sin(kαπ/2) / ( x |x|^{αk} ),   si 0 < α < 1
f_α(x) = (1/(πα)) Σ_{k=0}^∞ ( (−1)^k / (2k)! ) Γ( (2k+1)/α ) x^{2k},   si 1 ≤ α ≤ 2
(2.25)
Vu que ces sommes regroupent un nombre infini de termes, il est difficile de les
utiliser dans la pratique.
2.6.2 Développement asymptotique
Pour les distributions SαS avec α > 1, il existe un développement asymptotique de la densité de probabilité proposé dans [Bergstrom(1952)].
f_α(x) = (1/(πα)) Σ_{k=0}^n ( (−1)^k / (2k)! ) Γ( (2k+1)/α ) x^{2k} + O(|x|^{2n+1}),   quand |x| → 0    (2.26)
et
f_α(x) = −(1/π) Σ_{k=1}^n ( (−1)^k / k! ) Γ(αk + 1) sin(kαπ/2) / |x|^{αk+1} + O(|x|^{−α(n+1)−1}),   quand |x| → ∞    (2.27)
Le calcul de la série asymptotique pour de larges valeurs de n pose des problèmes de calcul au niveau de la fonction gamma. Ces difficultés peuvent être réduites en suivant la procédure proposée dans [Nikias et Shao(1995), page 17].
2.6.3 Approximation par un mélange fini
[A]- Approximation par un mélange fini de gaussiennes
Dans cette section, nous considérons une v.a.r. gaussienne X, une v.a.r. Y α-stable et la v.a. Z = Y^{1/2} X, de loi α-stable selon le corollaire 2.1, et on présente la méthode d'approximation des PDF SαS par un mélange de gaussiennes introduite par [Kuruoglu(1998)]. On peut déduire l'expression de la densité de Z par la propriété de marginalisation des PDF
f_Z(z) = ∫_{−∞}^{+∞} f_{Z|V}(z | v) f_V(v) J(z, v) dv    (2.28)
où f_Z(·) et f_V(·) représentent les densités de Z et de V = Y^{1/2} respectivement, et J(z, v) représente le jacobien de Z par rapport à V. Or X est une v.a.r. gaussienne ; alors, pour une réalisation V = v, f_{Z|V} est conditionnellement distribuée selon la loi gaussienne. On peut alors réexprimer l'équation (2.28) sous la forme
f_Z(z) = (1/√(2π)) ∫_{−∞}^{+∞} exp( −z²/(2γv²) ) f_V(v) v^{−1} dv    (2.29)
Cette densité est appelée mélange d'échelles de la loi normale et la fonction h(v) = f_V(v) est dite fonction de mélange. La fonction de mélange est la densité de la v.a.r. V = Y^{1/2}, dont l'expression est obtenue grâce à la formule suivante.
Théorème 2.10. Soit V = T(Y), où T représente une transformation inversible. Alors
f_V(v) = f_Y(T^{−1}(v)) | dT^{−1}(v)/dv |    (2.30)
ou, plus simplement,
f_V(v) = f_Y(y) | dy/dv |    (2.31)
Pour le cas spécial que nous avons considéré ici, cette relation se réduit à
f_V(v) = f_Y(y) · 2v = 2v f_Y(v²).    (2.32)
Notons que la décomposition en mélange d'échelles gaussiennes est une propriété
bien étudiée dans la littérature [Andrews(1974)].
L’échantillonnage de fZ (z) de l’équation (2.28) sur un ensemble fini de N
points permet d’obtenir une approximation de la PDF SαS par un mélange fini
de densités gaussiennes :
PN
f(α,a,0,γ) (z) ≈
2
exp(− (z−a)
)fV (vj )
2γvj2
P
√
2πγ N
j=1 fV (vj )
1
j=1 vj
(2.33)
Pour une bonne approximation, on doit prendre N assez grand ce qui va rendre
le calcul assez complexe. Pour réduire cette complexité, [Kuruoglu(1998)] propose
d’utiliser un certain nombre de composantes et puis de raffiner l’approximation
en utilisant l’algorithme EM [Dempster(1977)]. Cette procédure permet alors d’estimer la densité SαS, nous résumons les étapes essentielles dans le tableau 2.5
suivant.
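À titre d'illustration (hors manuscrit), voici une esquisse Python des étapes 1 à 4 du tableau 2.5 (sans l'affinage EM) : la densité stable positive f_Y est évaluée ici avec scipy.stats.levy_stable, dont la paramétrisation et l'échelle exactes sont des hypothèses à vérifier.

    import numpy as np
    from scipy.stats import levy_stable, norm

    def sas_pdf_mixture(z, alpha, a=0.0, gamma=1.0, N=50, v_max=10.0):
        # Approximation de la PDF SaS par un melange fini de N gaussiennes (0 < alpha < 2)
        v = np.linspace(0.05, v_max, N)            # points d'echantillonnage de V = Y^{1/2}
        fY = levy_stable.pdf(v**2, alpha / 2.0, 1.0,
                             scale=np.cos(np.pi * alpha / 4.0) ** (2.0 / alpha))
        w = 2.0 * v * fY                           # fonction de melange f_V(v) = 2 v f_Y(v^2)
        w = w / w.sum()                            # poids normalises du melange
        z = np.atleast_1d(z)[:, None]
        comps = norm.pdf(z, loc=a, scale=np.sqrt(gamma) * v)   # composantes N(a, gamma v_j^2)
        return (comps * w).sum(axis=1)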
[B]- Approximation par un mélange fini de Pearson
Pour le cas d’une densité α-stable de paramètres β = +1 et α < 1, une approximation par un mélange fini de densités de Pearson qui sont des PDF α-stables
d’indice α = 1/2 été proposé récemment dans [Kuruoglu(2003)]. Notons que l’auteur suit la même démarche pour le cas de mélange de gaussiennes ci-dessus en
prenant αx = 1/2 au lieu de αx = 2 dans le théorème des mélange d’échelles des
lois α-stables (théorème 2.5).
2.7 Autres Distributions à Queues Lourdes
Dans cette section, nous introduisons d’autres classes de distributions à queues
lourdes. La première est celle des lois gaussiennes généralisées (GG) et la deuxième
classe est celle des lois appelées lois normales inverses gaussiennes (NIG).
2.7.1 Loi gaussienne généralisée
Une généralisation des lois de Gauss et de Laplace est donnée par le modèle
des lois gaussiennes généralisées. La distribution de ce modèle est décrite par une
Estimation de la PDF SαS
Step 1. Initialisation : étant donné les paramètres de la PDF SαS désirée, on génère la fonction caractéristique ϕ_Y(·) d'une v.a.r. Y stable positive de paramètres ( α/2, β = −1, a = 0, γ = (cos(πα/4))^{2/α} ).
Step 2. Évaluation de la PDF stable positive f_Y en N points : en appliquant la FFT (transformée de Fourier rapide) à la fonction caractéristique ϕ_Y(·) générée dans l'étape précédente, où N représente le nombre de composantes gaussiennes dans le mélange.
Step 3. Évaluation de la fonction de mélange f_V : c'est la densité de la v.a.r. V = Y^{1/2}, donnée par
f_V(v) = 2 v f_Y(v²)    (2.34)
Step 4. Approximation analytique de la PDF SαS : par substitution de l'équation (2.34) dans l'équation (2.33) :
f_{(α,a,0,γ)}(z) = [ Σ_{j=1}^N exp( −(z−a)²/(2γv_j²) ) f_Y(v_j²) ] / [ √(2πγ) Σ_{j=1}^N v_j f_Y(v_j²) ]    (2.35)
Step 5. Affinage de l'approximation par l'algorithme EM : nous cherchons à estimer un mélange de gaussiennes de la forme
f_{(α,a,0,γ)}(z) = Σ_{j=1}^N p_j G(z/j)    (2.36)
où les p_j sont les fréquences de pondération telles que Σ_{j=1}^N p_j = 1 et les G(z/j) sont des PDF gaussiennes. On considère M observations (z_m, m = 1, · · · , M) comme variables cachées et on applique l'algorithme EM, qui consiste à initialiser l'algorithme par une première estimation de G(z_m/j) et des p_j, puis à alterner les deux étapes « Expectation » et « Maximisation » [Dempster(1977)].
Tab. 2.5 – Approximation de la PDF SαS par le modèle de mélange de gaussiennes et affinage de l'approximation par l'algorithme EM.
densité de type exponentielle de la forme :
f_α(x) = c exp( −| x/σ |^α )    (2.37)
où c = α / ( 2σ Γ(1/α) ) et Γ(·) est la fonction gamma. Le paramètre σ > 0 représente le paramètre d'échelle de la distribution et α > 0 est le paramètre qui caractérise l'impulsivité. Notons que pour α = 2, f_α(x) est gaussienne, alors que α = 1 correspond à la loi de Laplace. Conceptuellement, plus α est petit, plus la distribution est impulsive.
Cette classe de PDF est utilisée depuis longtemps. Les références les plus anciennes à ma connaissance sont [Subbotin(1923)] et [Frechet(1924)]. En raison de
leur simplicité dans les calculs mathématiques, elles sont largement exploitées dans
les applications du traitement du signal [Kay(1998a)] pour modéliser plusieurs processus, qui sont observés dans des domaines variés dont le traitement de la parole,
l’audio ou le signal vidéo, l’image, la turbulence et les systèmes multi-utilisateurs
[Zoubir et Brcich(2002)].
Notons que les moments de ce type de v.a. sont finis et calculables analytiquement, par opposition à d'autres PDF à queues lourdes comme les lois α-stables présentées au début de ce chapitre.
Proposition 2.22. L'expression analytique des moments d'ordre k est donnée par :
IE(X^k) = 0   si k est impair,
IE(X^k) = ( 2c / (α σ^{−k−1}) ) Γ( (k+1)/α )   si k est pair.    (2.38)
Preuve
– Si k est impair, IE(X^k) = 0, car la fonction x^k exp(−|x/σ|^α) est impaire.
– Si k est pair, on a :
IE(X^k) = ∫_{−∞}^{+∞} c x^k exp(−|x/σ|^α) dx
        = 2 ∫_0^{+∞} c x^k exp(−(x/σ)^α) dx
        = ( 2c / (α σ^{−k−1}) ) ∫_0^{+∞} y^{(k+1)/α − 1} exp(−y) dy
        = ( 2c / (α σ^{−k−1}) ) Γ( (k+1)/α ).
¥
Dans la proposition suivante, nous présentons le comportement de la loi gaussienne
généralisée pour différentes valeurs de α.
Proposition 2.23. (Comportement de la loi GG pour différentes valeurs de α)
– Si α = 2, la loi gaussienne généralisée correspond à la loi de Gauss standard.
– Si α > 2, la queue de la loi gaussienne généralisée est moins lourde que
celle de la loi de Gauss standard, c’est-à-dire que la PDF tend vers 0 plus
rapidement que la PDF de Gauss.
– Si α tend vers +∞, la loi gaussienne généralisée converge vers la loi uniforme.
– Si 0 < α < 2, la queue de la loi gaussienne généralisée est de nature impulsive.
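À titre d'illustration (hors manuscrit), une esquisse Python minimale de la densité (2.37), utile pour visualiser le comportement décrit dans la proposition 2.23 :

    import numpy as np
    from scipy.special import gamma as Gamma

    def gg_pdf(x, alpha, sigma=1.0):
        # Densite gaussienne generalisee : f(x) = c exp(-|x/sigma|^alpha)
        c = alpha / (2.0 * sigma * Gamma(1.0 / alpha))
        return c * np.exp(-np.abs(x / sigma) ** alpha)

    # alpha = 2 : Gauss ; alpha = 1 : Laplace ; alpha < 1 : forme tres piquee.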
Malgré le succès relatif de la famille des lois gaussiennes généralisées, elle présente quelques limitations. En effet, lorsque α < 1, la forme très piquée (« peaky shape ») de la distribution n'est pas appropriée à certaines situations pratiques de bruit. De plus, la minimisation de la fonction de coût basée sur la norme Lp reste un problème majeur.
On peut noter également la décroissance exponentielle de la queue contrairement
au comportement algébrique de la queue des processus impulsifs rencontrés dans
plusieurs applications [Nikias et Shao(1995)].
2.7.2 Loi normale inverse gaussienne
La famille des lois normales inverses gaussiennes (NIG) est une sous-classe universelle des distributions hyperboliques généralisées. Le travail pionnier sur les lois NIG a été introduit par Barndorff-Nielsen en 1977 puis en 1995. D'autres références récentes existent dans la littérature, comme par exemple [Barndorff(1998)]. Contrairement aux lois SαS, la densité des distributions NIG a une expression explicite.
[A]- Définition
Définition 2.15. Une v.a.r. X est de loi NIG si sa densité de probabilité est de la forme
f_X(x) = (αδ/π) exp( δ√(α² − β²) − βµ ) exp(βx) K_1( α√(δ² + (x − µ)²) ) / √(δ² + (x − µ)²)    (2.39)
où µ ∈ IR, δ > 0, 0 ≤ |β| ≤ α, et K_1 est la fonction de Bessel modifiée de seconde espèce d'indice 1.
Une v.a.r. X de loi N IG est notée X ∼ N IG(α, β, δ, µ). Une loi N IG est
paramétrée par quatre paramètres α, β, δ et µ, et ces paramètres ont la même
interprétation que ceux des lois α-stables : α détermine le comportement de la
queue, plus α est petit plus la queue est lourde ; β est un paramètre de symétrie,
β = 0 donne une densité symétrique, β > 0 implique que la densité est étalée
vers la droite, β < 0 implique que la densité est étalée vers la gauche ; δ est un
paramètre d’échelle et µ est un paramètre de position. Pour illustrer l’allure des
lois N IG nous avons tracé la PDF pour plusieurs valeurs de α dans la figure 2.4.
Définition 2.16. La fonction caractéristique d’une v.a.r. N IG est donnée par
[Barndorff-Nielsen(1997)].
ϕ_X(t) = exp{ δ√(α² − β²) − δ√(α² − (β + jt)²) + jµt }    (2.40)
Fig. 2.4 : La densité de probabilité de la loi NIG(α, 0, 1, 0) pour différentes valeurs de α (α = 2, 1, 0.5 et 0.001).
[B]- Propriétés
– Indéfiniment divisible : d'après la forme exponentielle de la fonction caractéristique des lois NIG, on peut l'exprimer facilement sous forme d'une puissance d'une autre fonction caractéristique d'une loi NIG ; par conséquent, les lois NIG sont indéfiniment divisibles. Cette propriété signifie que si X_1, · · · , X_N sont des v.a.r. indépendantes et si X_i ∼ NIG(α, β, δ_i, µ_i), alors la somme S = Σ_{i=1}^N X_i est aussi de loi NIG. De plus, nous avons S ∼ NIG(α, β, δ, µ) avec δ = Σ_{i=1}^N δ_i et µ = Σ_{i=1}^N µ_i. Cette propriété est similaire à la propriété qui caractérise la classe des lois α-stables. Cependant, la distribution NIG n'est pas stable : une façon de le voir est que le paramètre δ diverge pour une somme infinie normalisée de v.a.r. NIG.
– Contient les lois de Gauss et de Cauchy : d'après la forme de la fonction caractéristique, on peut remarquer facilement que les lois de Cauchy et de Gauss apparaissent comme des cas spéciaux des lois NIG. En effet, la loi de Gauss représente le cas limite β = 0, α → ∞ avec σ² = δ/α ; et NIG(0, 0, δ, µ) correspond à la loi de Cauchy.
– Comportement asymptotique de la queue : dans [Hanssen et Oigard(2001)], les auteurs ont démontré que le comportement asymptotique de la PDF NIG est
donné par
lim_{|x|→∞} f(x) ∝ |x|^{−3/2} exp(βx − α|x|),   si α ≠ 0
lim_{|x|→∞} f(x) ∝ |x|^{−2},                    si α → 0    (2.41)
On voit alors que, pour α ≠ 0, le comportement asymptotique des PDF NIG combine une décroissance algébrique et une décroissance exponentielle, dont le terme exponentiel est déterminé par les deux paramètres α et β. Quand α → 0, la PDF NIG s'approche de celle de Cauchy, et donc le comportement asymptotique de la queue s'approche aussi de celui de la queue de Cauchy.
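À titre d'illustration (hors manuscrit), une esquisse Python de la densité (2.39), avec la fonction de Bessel K_1 de scipy (on suppose α > |β| et δ > 0) :

    import numpy as np
    from scipy.special import k1   # Bessel modifiee de seconde espece, ordre 1

    def nig_pdf(x, alpha, beta, delta, mu):
        # Densite NIG(alpha, beta, delta, mu) selon (2.39)
        q = np.sqrt(delta**2 + (x - mu)**2)
        return (alpha * delta / np.pi) * np.exp(delta * np.sqrt(alpha**2 - beta**2)
                                                + beta * (x - mu)) * k1(alpha * q) / q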
2.7.3 Loi t-Student
Définition 2.17. (William Sealy Gosset, 1908)
Introduite par Gosset en 1908, la PDF d'une distribution t-Student est paramétrisée sous la forme
T_α(x) = c ( 1 + x²/α )^{−(α+1)/2}    (2.42)
où
c = Γ( (α+1)/2 ) / ( √(απ) Γ(α/2) ).
Il s'agit d'une densité symétrique par rapport à l'axe des ordonnées.
Définition 2.18. (Ronald Aylmer Fisher, 1925)
Fisher s'intéressa aux travaux de Gosset. Il lui écrivit en 1912 pour lui proposer une démonstration géométrique de la loi de Student et pour introduire la notion de degré de liberté. Il publia notamment un article en 1925 dans lequel il définit la loi de Student comme le rapport de deux v.a.r. indépendantes U et Y suivant respectivement une loi N(0, 1) et une loi χ²(α) :
T_α = U / √(Y/α) = √α · U / √Y    (2.43)
On dit que le quotient T_α suit une loi de t-Student (ou tout simplement : loi de Student)3 à α degrés de liberté.
Proposition 2.24. (Propriétés de la loi de Student)
– L’espérance : De l’expression de la PDF ci-dessus, on peut déduire qu’une
v.a.r. de loi t-Student est centrée et de moyenne nulle.
– La variance : lorsque α ≤ 2, la loi de Student n'admet pas de variance finie. Si α > 2, le calcul de la variance donne α/(α − 2).
3. Student était le pseudonyme choisi par le statisticien William Sealy Gosset (1876-1937). Il fut l'un des premiers statisticiens du monde de l'entreprise, consacrant sa carrière à l'industrie agro-alimentaire, au sein de laquelle il a toujours été reconnu à la fois comme industriel et comme scientifique. Très associé au monde universitaire, il a largement contribué au développement scientifique de cette période.
– Queue algébrique : Il est facile de constater, d’après la forme de la PDF,
que la loi de Student est de queue algébrique d’indice α. Plus α devient petit,
plus la queue devient lourde.
– Cas extrêmes :
1. Lorsque α → ∞, la distribution de Student est équivalente à la distribution de Gauss.
2. Lorsque α → 0, la distribution devient très impulsive.
– Cas particulier : Lorsque α = 1, le modèle correspond à celui de Cauchy.
La famille des lois t-Student a été introduite en traitement du signal pour la première fois par Hall en 1966, comme modèle empirique pour le bruit atmosphérique en communication radio [Hall(1966)]. Néanmoins, on trouve bien avant ce modèle dans la littérature des statistiques mathématiques, indexé par un entier k au lieu du réel α. Il se peut que Hall l'ait généralisé en remplaçant k par α.
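À titre d'illustration (hors manuscrit), une petite vérification numérique de la lourdeur algébrique de la queue de Student par rapport à la queue gaussienne :

    import numpy as np
    from scipy.stats import t, norm

    x = 6.0
    for df in (1, 3, 30):
        # P(|T| > x) decroit approximativement en x^{-df}, bien plus lentement
        # que la queue gaussienne
        print(df, 2 * t.sf(x, df), 2 * norm.sf(x))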
2.8 Conclusion
Malgré le rôle important des lois α-stables dans la modélisation des signaux à
densité de probabilité à queue lourde, elles présentent des limites ; disons plutôt qu'elles ouvrent plusieurs questions pour surmonter les difficultés rencontrées lors de
l’inférence statistique en l’absence d’une expression explicite de la densité et en
l’absence des moments de second ordre et d’ordre supérieur.
Pour contribuer à la résolution de certaines questions relatives à la séparation
de sources impulsives de distributions α-stables et d’estimation d’un signal noyé
dans un bruit impulsif de modèle α-stable, nous allons proposer dans les chapitres
suivants de nouvelles approches.
En effet, nous allons utiliser les moments d’ordre inférieur, introduire des statistiques normalisées et approcher la densité de probabilité par la famille des fonctions
log-splines pour pouvoir manipuler les observations de lois α-stables.
Chapitre 3
Robust Estimation
The robust minimax approach is an alternative to the conventional maximum
likelihood (ML) that overcomes the ML estimate sensitivity and improves the efficiency in an environment with unknown heavy-tailed distribution [Huber(1981)].
In this chapter, we provide a brief background on the fundamental concepts of
robust estimation that will be used in the second part of this thesis.
3.1 Robustness
The term ”robust” was coined in statistics by G.E.P. Box in 1953. Various
definitions of greater or lesser mathematical rigor are possible for the term. However, in general, referring to a statistical estimator, it means ”insensitive to small
departures from the idealized assumptions for which the estimator is optimized”
[Hampel et al.(1986), Huber(1972)]. The word ”small” can have two different interpretations, both important : either fractionally small departures for all data
points, or else fractionally large departures for a small number of data points. It is
the latter interpretation, leading to the notion of outlier points, that is generally
the most stressful for statistical procedures.
Roughly speaking, robustness means insensitivity to gross measurement errors,
and errors in the specification of parametric models. For example, consider the estimation of the mean from 100 measurements. Assume that all measurements (but
one) are distributed between -1 and 1, while one of the measurements has the
value 1000. Using the simple estimator of the mean given by the sample average,
the estimator gives a value that is not far from the value 10. Thus, the single,
probably erroneous, measurement of 1000 had a very strong influence on the estimator. The problem here is that the average corresponds to minimization of the
squared distance of measurements from the estimate. The square function implies
that measurements far away dominate.
To get a good estimator in presence of outliers, statisticians have developed
various sorts of robust statistical estimators. Many, if not most, can be grouped in
one of three categories.
• M-estimates : follow from maximum-likelihood arguments ; they are usually the most relevant class for model-fitting, that is, estimation of parameters. We therefore consider these estimates in some detail below.
• L-estimates : are linear combinations of order statistics. These are most
applicable to estimation of central value and central tendency. Two typical
L-estimates will give the general idea. They are (i) the median, and (ii) Tukey’s trimean, defined as the weighted average of the first, second, and third
quartile points in a distribution, with weights 1/4, 1/2, and 1/4, respectively.
• R-estimates : are estimates based on rank tests. For example, the equality
or inequality of two distributions can be estimated by the Wilcoxon test
of computing the mean rank of one distribution in a combined sample of
both distributions. The Kolmogorov-Smirnov statistic and the Spearman
rank-order correlation coefficient are R-estimates in essence, if not always
by formal definition [Huber(1981)].
Some other kinds of robust techniques can be found in the field of optimal control and filtering rather than in the mathematical statistics literature.
3.2 M-Estimation
Huber (1964) proposes a generalization of the least squares principle for constructing estimators of (principally) location parameters. Suppose, on the basic model,
that the sample comes from a distribution with distribution function F (x − θ). It
is the location parameter, θ, which we wish to estimate. We might estimate θ by
Tn = Tn (x1 , x2 , . . . , xn ) chosen to minimize
Σ_{j=1}^n ρ(x_j − T_n)    (3.1)
where ρ is some real valued non-constant function. As special cases we note
that ρ(t) = t2 yields the sample mean, ρ(t) = |t| yields the sample median, whilst
ρ(t) = − log f (t) yields the maximum likelihood estimator (where f (x) is the
density function under the basic model when θ = 0). If ρ is continuous with
derivative ψ, equivalently we estimate θ by Tn satisfying
Σ_{j=1}^n ψ(x_j − T_n) = 0.    (3.2)
Such an estimator is called a maximum likelihood type estimator, or M-estimator.
If ρ is convex, then ( 3.1) and (3.2) are equivalent ; otherwise, (3.2) is still very
useful in searching for the solution to (3.1). Usually we restrict attention to convex
ρ, so that ψ is monotone and Tn unique. Under quite general conditions Tn can
be shown to have desirable properties as an estimator. If ρ is convex Tn is unique,
translation invariant, consistent, and asymptotically normal [Huber(1972)]. The
choice of ρ that leads to an optimal robust estimator of θ is now discussed. One
particular estimator with desirable properties of robustness arises from the Huber
function
ρ_H(t) = (1/2) t²          if |t| ≤ k,
ρ_H(t) = k|t| − (1/2) k²   if |t| > k    (3.3)
for a suitable choice of k. It turns out that the estimator Tn is equivalent to
the sample mean of a sample in which all observations xj such that | xj − Tn |> k
are replaced by Tn − k or Tn + k, whichever is the closer.
Another M-estimator, with
ρ(t) = (1/2) t²    if |t| ≤ η,
ρ(t) = (1/2) η²    if |t| > η    (3.4)
can be similarly interpreted as a trimmed mean. Tn is now the sample mean of those
observations xj satisfying | xj −Tn |< η. This extends the modified trimming above
from rejection of a single extreme value to rejection of all sample values whose
residuals about Tn are sufficiently large in absolute value. See [Huber(1981)] for
details.
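As a minimal sketch (not from the thesis), the Huber M-estimate of location can be computed by iteratively reweighted averaging ; the tuning constant k = 1.345 and the MAD-based scale are common but illustrative choices.

    import numpy as np

    def huber_psi(t, k=1.345):
        return np.clip(t, -k, k)

    def m_estimate_location(x, k=1.345, n_iter=50):
        theta = np.median(x)                                     # robust starting point
        scale = np.median(np.abs(x - theta)) / 0.6745 + 1e-12    # MAD scale estimate
        for _ in range(n_iter):
            r = (x - theta) / scale
            w = np.ones_like(r)
            nz = r != 0
            w[nz] = huber_psi(r[nz], k) / r[nz]                  # weights psi(r)/r
            theta = np.sum(w * x) / np.sum(w)                    # weighted mean update
        return theta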
Standard cost functions
– Normal criterion : for a Gaussian distribution, the ML estimation leads to
ρ(x) = x²/2 ;   ψ(x) = x    (3.5)
Σ_{i=1}^N (x_i − θ) = 0 yields θ̂ = x̄ = (1/N) Σ_{i=1}^N x_i
– Double exponential criterion : for a double exponential (Laplace) distribution, the score function is given by
ρ(x) = |x| ;   ψ(x) = −1 if x < 0, +1 if x > 0    (3.6)
Σ_{i=1}^N ψ(x_i − θ) = 0 yields θ̂ = sample median
– Maximum likelihood criterion : the choice of ρ(x) = − log f(x) (where f represents the observation PDF) gives the ordinary maximum likelihood estimate.
When the basic model involves a scale parameter, so that the distribution function is of the form F[(x − θ)/σ], modified forms of the M-estimator have been proposed.
The estimator of θ is a solution Tn of an equation of the type
Σ_{j=1}^n ψ[ (x_j − T_n)/σ̂ ] = 0    (3.7)
where the scale parameter estimator σ̂ is robust for σ and is estimated either
independently by some suitable scheme or simultaneously with θ by joint solution
of (3.7).
3.2.1 Minimax M-estimate of location estimator
In this section, we consider a robust estimation in a minimax sense based on
Huber’s minimax M-estimator [Huber(1981)]. Huber considered the robust location estimator problem. Suppose we have one-dimensional (1-D) i.i.d. observations
x1 , x2 , . . . , xn . The observations belong to some sample space X , which is a subset
of the real line IR. A parametric model consists of a family of probability distributions F_θ (or equivalently a family of PDFs f_θ) on the sample space, where the unknown parameter θ belongs to some parameter space Θ. When estimating location in the model X = IR, F_θ(x) = F(x − θ), the M-estimator is determined by a
ψ− function of the type ψ(x, θ) = ψ(x − θ), i.e., the M-estimate of the location
parameter θ is given by the solution to the equation
Σ_{i=1}^n ψ(x_i − θ) = 0.    (3.8)
Assume that the sample distribution belongs to the set of ε-contaminated Gaussian models given by :
P_ε = { (1 − ε) N(0, ν²) + ε H ; H is a symmetric distribution }    (3.9)
where 0 < ε < 1 is fixed, and ν² is the variance of the nominal Gaussian
distribution. It can be shown that, within mild regularity, the asymptotic variance
of an M-estimator of the location θ defined by (3.8) at a distribution F ∈ P_ε is given by [Huber(1981)]
V(ψ; F) = ∫ ψ² dF / ( ∫ ψ' dF )²    (3.10)
Huber’s idea was to minimize the maximal asymptotic variance over P² , that is,
to find an M-estimator ψ0 that satisfies
sup V (ψ0 ; F ) = inf sup V (ψ; F ).
F ∈P²
ψ F ∈P²
(3.11)
This is achieved by finding the least favorable distribution F0 , i.e., the distribution
that minimizes the Fisher information
I(F) = ∫ ( F''/F' )² dF    (3.12)
over all F ∈ P_ε. Then ψ_0 = −F_0''/F_0' is the maximum likelihood estimator for this least favorable distribution. Using the above concepts of minimax robustness, Huber showed that the Fisher information is minimized by
f_0(x) = ( (1−ε)/(√(2π) ν) ) exp( −x²/(2ν²) ),    for |x| ≤ kν²
f_0(x) = ( (1−ε)/(√(2π) ν) ) exp( k²ν²/2 − k|x| ),    for |x| > kν²    (3.13)
where k, ε, and ν are connected through
φ(kν)/(kν) − Q(kν) = ε / ( 2(1 − ε) )    (3.14)
where
φ(x) = (1/√(2π)) e^{−x²/2}
and
Q(t) = (1/√(2π)) ∫_t^∞ e^{−x²/2} dx.
The corresponding minimax M-estimator is then determined by the Huber penalty function and its derivative, given by
ρ_H(x) = x²/2 if |x| ≤ K ;   ρ_H(x) = K|x| − K²/2 if |x| > K
ψ_H(x) = x if |x| ≤ K ;   ψ_H(x) = K sign(x) if |x| > K    (3.15)
and Σ_{i=1}^N ψ_H(x_i − θ) = 0 is solved by numerical methods.
These are the ρ and ψ functions associated with a function which is “normal” in
the middle with “double exponential” tails. The constant K regulates the degree
of robustness ; good choices for K are between 1 and 2 times the standard deviation
of the observations. The corresponding M-estimator is the minimax solution.
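As a minimal sketch (not from the thesis), equation (3.14) can be solved numerically for k given the contamination fraction ε (taking ν = 1) :

    import numpy as np
    from scipy.optimize import brentq
    from scipy.stats import norm

    def huber_k(eps):
        # phi(k)/k - Q(k) = eps / (2(1 - eps)), cf. (3.14) with nu = 1
        f = lambda k: norm.pdf(k) / k - norm.sf(k) - eps / (2.0 * (1.0 - eps))
        return brentq(f, 1e-6, 10.0)

    print(huber_k(0.05))   # approximately 1.4 for 5% contamination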
3.2.2 Influence Function
The influence function (IF) introduced in [Hampel et al.(1986)], is an important tool used to study robust estimators. It measures the influence of a vanishingly
small contamination of the underlying distribution on the estimator. It is assumed
that the estimator can be defined as a functional T operating on the empirical
distribution function Fn , T = T (Fn ) and that the estimator is consistent as
n → +∞, i.e., T (F ) = limn→+∞ T (Fn ), where F is the underlying distribution.
The influence function is defined as
IF(x; T, F) = lim_{t→0} ( T[(1 − t)F + t∆_x] − T(F) ) / t    (3.16)
where ∆x is the distribution that puts a unit mass at x. Roughly speaking, the
influence function IF (x; T, F ) is the first derivative of the statistic T at an underlying distribution F and at the coordinate x.
The influence function measures the effect of a deviation from the assumed distribution on a descriptive statistic T, in other words its robustness. The utility of the influence function is that it allows us to calculate the asymptotic covariance
of the M-estimates using the formula [Huber(1981)],
Cov{T(F_n), T(F)} = ∫ IF(x; T, F) IF(x; T, F)^T dF    (3.17)
One can proceed to calculate IF (x; T, F ) and Cov {T (Fn ), T (F )} for the given
signal model.
3.2.3 M-Estimation of a deterministic signal parameter
Consider the general signal in noise model given in (12.1) : x(t) = s(t, θ) + z(t)
where z(t) is i.i.d. noise and the signal, s(t), is parameterized by θ = (θ1 , · · · , θM )T ,
(.)T denoting transposition. The aim is to estimate θ from N observations x(t), t =
1, · · · , N . Given the noise density, f (z), one obtains the ML solution as
θ̂ = arg min_θ Σ_{t=1}^N ρ{ x(t) − s(t, θ) }    (3.18)
where ρ(x) = − log f (x). Alternatively, we can solve the M coupled equations
Σ_{t=1}^N ψ{ x(t) − s(t, θ) } ∂s(t, θ)/∂θ = 0    (3.19)
where ψ(x) = −f 0 (x)/f (x) is the location score function of f (x). It is clear that
without a priori knowledge of f (x), the estimation of θ cannot be optimal. Huber
considered estimation in the presence of outliers or impulsive noise and proposed the concept of M-estimation [Huber(1981)]. In an M-estimation framework
− log f (x) is replaced with a similarly behaved function, ρ(x), chosen to confer
robustness on the estimator under deviations from a nominal density. Thus, an M-estimate for θ can be obtained as a solution of the optimization problem given in equation (3.18) or by solving the M coupled equations
Σ_{t=1}^N ϕ{ x(t) − s(t, θ) } ∂s(t, θ)/∂θ = 0    (3.20)
where ϕ(x) = ρ'(x). When f(x) is unknown, one is unsure of how close ϕ(x) is to ψ(x).
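As a minimal sketch (not from the thesis), an M-estimate of a deterministic signal parameter can be obtained with a Huber-type loss ; here the amplitude of a known-frequency sinusoid is fitted in heavy-tailed noise, and the noise model, f_scale and f0 values are illustrative assumptions.

    import numpy as np
    from scipy.optimize import least_squares

    rng = np.random.default_rng(0)
    t, f0 = np.arange(200), 0.12                    # samples and known normalized frequency
    x = 1.0 * np.cos(2 * np.pi * f0 * t) + 0.5 * rng.standard_t(df=1.5, size=t.size)

    def residuals(theta):
        # residuals x(t) - s(t, theta) for s(t, theta) = theta[0] * cos(2 pi f0 t)
        return x - theta[0] * np.cos(2 * np.pi * f0 * t)

    fit = least_squares(residuals, x0=[0.5], loss='huber', f_scale=0.5)
    print(fit.x)   # amplitude estimate, expected to stay near 1 despite the outliers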
3.2.4 Theoretical performance
Let F be the distribution of the noise and Fn its empirical counterpart from a
sample of size n. Then an estimate of θ can be defined in terms of a functional T
operating on Fn , T (Fn ), while the true parameters are obtained as T (F ). Under
some mild conditions such as IE[ϕ(x)] = 0, M-estimates possess desirable properties
such as consistency and asymptotic normality [Hampel et al.(1986), Huber(1981)].
Herein we assume a symmetric noise density and antisymmetric ϕ to ensure this
condition is met.
• Asymptotic covariance : using the influence function concept, it is proved in [Brcich et Zoubir(2002)] that the asymptotic covariance of the estimation errors of θ has the form
Cov{T(F_n), T(F)} = ( E[ϕ²(x)] / E[ϕ'(x)]² ) ( Σ_{n=1}^N Λ_n Λ_n^T )^{−1}    (3.21)
where ϕ(x) = ρ'(x) and Λ_n is the gradient of s_n(θ). Then the only degree of freedom at our disposal for minimizing the asymptotic covariance is through appropriate choice of ϕ(x).
• Asymptotic normality : define Cov{T(F_n), T(F)} to be the asymptotic variance ; then, as n → ∞,
n^{1/2} ( T(F_n) − T(F) ) → N( 0, Cov{T(F_n), T(F)} ) in distribution.    (3.22)
• Consistency : let F belong to a family of distributions F ; then T(F_n) converges in probability to T(F) as n → ∞,
IP{ |T(F_n) − T(F)| > ε } → 0 as n → ∞,   F ∈ F    (3.23)
for any ε > 0.
3.2.5 Minimax optimal cost function
Let the noise distribution f be known incompletely ; what is known is only
that it belongs to a certain class P. Applying to our M-estimator the Cramer-Rao
inequality, under certain regularity assumptions, gives
Cov{T(F_n), T(F)} ≥ A(Λ_n) I(f)^{−1}    (3.24)
where I(f ) is the Fisher information and A(Λn ) is a matrix depending only on
Λn . The worst distribution is naturally the one for which the right-hand part in
(3.24) is maximal, or I(f ) is minimal. In other words, the robust Huber’s minimax
estimator over P is defined as in the ML method by equation (3.18) with the loss
function
ρ∗ (z) = − ln(f ∗ (z))
(3.25)
where f ∗ (z) is selected such that the information on the parameter contained
therein is minimal, i.e. a solution of the problem
f*(z) = arg min_{f∈P} I(f)    (3.26)
where I(f) = ∫ (f'(z))²/f(z) dz denotes the Fisher information. We call the M-estimator robust if the loss function ρ is given according to (3.25) and (3.26).
This approach consists in considering the worst case (among f ∈ P), corresponding to the PDF giving the minimum Fisher information value. Solving the worst case ensures robustness (good estimation performance) if the considered signal PDF belongs to P. It is emphasized that the robustness property of the estimator depends on how the class P is defined. Thus, in order to obtain the robust minimax estimator, first an appropriate class P should be defined, and after that the loss function ρ is given by (3.25) and (3.26).
3.3 Concluding Remarks
M-estimation is an alternative approach for robust estimation that is used to
implement sub-optimal estimators which are robust to changes in the underlying
distribution.
Since impulsive noise is present in communications channels, the M-estimation of signal parameters in the additive noise model becomes an important issue. The approach to robust estimation taken in the second part of this thesis follows the M-estimation concept of robust statistics, except that the density function is modelled as an α-stable PDF and is estimated from the observations.
However, many questions remain open for serious discussions such as the choice
of the so called score function ϕ in the case of α-stable noise model. The second
part of this thesis investigates these difficulties and proposes some solutions in the
context of a multicomponent non-stationary FM signal.
Chapitre 4
Time-Frequency Concepts
Time–frequency signal processing (TFSP) represents a set of effective methods,
techniques and algorithms used for analysis and processing of non-stationary signals, as found in a wide range of applications including telecommunications, radar
and biomedical engineering. TFSP is a natural extension of both the time domain
and the frequency domain processing, that involves representing signals in a two–
dimensional space, and so reveals “complete” information about the signal. Such
a representation is intended to provide a distribution of signal energy versus time
and frequency simultaneously. More details and advances of TFSP can be found in
[Cohen(1995), Flandrin(1998), Hlawatsch(1998), Boashash(2002)]. This chapter,
therefore, provides a brief background on the fundamental concepts of TFSP that
will be used in the second part of this thesis.
4.1 Need of Time-Frequency Representation
The two classical representations of a signal s(t) are the time-domain representation and the frequency-domain representation S(f) = FT{s(t)}, where FT stands for the Fourier transform. Each classical representation of the signal s(t) is non-localized with respect to the excluded variable. Consequently, such representations
are not suitable for signals with time-varying spectral contents (non-stationary
signals). For non-stationary signals, an indication as to how the frequency content
of the signal changes with time, is needed.
The magnitude spectrum (frequency representation) of a signal gives no indication as to how the frequency content of the signal changes with time, which is important information when one deals with FM signals. Time–frequency signal processing,
being a natural extension of both the time domain and the frequency domain pro-
cessing, preserves and reveals this information about the signal. TFSP involves,
and is intended to provide a distribution of signal energy versus both time and
frequency. For this reason, the TF representation is commonly referred to as a
TFD [Boashash(2002)].
In order to see the inherent limitations of the classical representations of a nonstationary signal, consider a linear frequency modulated (LFM) signal with length
N = 128 and sampling frequency fs = 1 Hz. Its frequency increases linearly from
0.1 to 0.4 Hz. Figure 4.1 shows different representations of this signal. The time
representation of the LFM signal gives no indication about the frequency content
of the signal, neither does the spectrum of the signal as to how the spectrum
of the signal changes with time. This example shows more clearly why classical
representations are inadequate for non-stationary signals.
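As a minimal sketch (not from the thesis), the LFM test signal described above and its spectrogram can be generated as follows ; the window length and overlap are illustrative choices.

    import numpy as np
    from scipy.signal import spectrogram

    fs, N = 1.0, 128
    t = np.arange(N) / fs
    f0, f1 = 0.1, 0.4
    a = (f1 - f0) / (N / fs)                             # sweep rate
    s = np.cos(2 * np.pi * (f0 * t + 0.5 * a * t**2))    # IF = f0 + a*t

    f, tt, Sxx = spectrogram(s, fs=fs, nperseg=32, noverlap=28)
    # Sxx[i, j] ~ signal energy around frequency f[i] and time tt[j];
    # the dominant ridge follows the linearly increasing IF.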
Fig. 4.1 : (a) Time-domain and (b) frequency-domain representations of an LFM signal. It shows clearly the inherent limitation of classical representations of a non-stationary signal.
To overcome the inadequacies of classical representations of a non-stationary signal, which were exposed by the above example, we desire a representation in
the two-dimensional (t, f) space. Such a representation is called a TFD. As an illustration, Figure 4.2 shows one particular TF representation of the LFM signal of Figure 4.1 using the Wigner-Ville distribution (WVD).
The representation in Figure 4.2 not only shows the start and stop times and
the frequency range of the LFM signal, but also clearly shows the variation in
frequency with time. The latter feature, which shows at a glance the frequency at
a given time or the time at which a given frequency is present, is missing from the
conventional signal representations in Figure 4.1.
The use of a TFD for a particular signal inevitably depends on the nature of the
signal (whether it is mono- or multi-component) and the properties that the TFD is expected to satisfy. A set of properties a TFD needs to satisfy is reported in [?]. In [Boashash et Sucic(2003)], Boashash et al. give a subset of those properties
which are more important in most practical applications.
Fig. 4.2 : A TF representation of the LFM signal in Figure 4.1 (fs = 1 Hz, N = 128).
4.2 Nonstationarity and FM Signals
We recall now some important definitions.
Définition 4.1 (Analytic signal [Boashash(2002)]).
Let s(t) be a real FM signal of the general form :
s(t) = A(t) · cos[θ(t)],
(4.1)
with the assumption that the spectra of the amplitude A(t) and phase θ(t) are
separated (nonoverlapped) in frequency, i.e. the signal approaches a narrowband
condition [Boashash(1992a)].
Let H[·] denote the Hilbert transform of the signal, such that
H[s(t)] = s(t) * (1/(πt)) = (1/π) p.v.{ ∫_{−∞}^{∞} s(τ)/(t − τ) dτ }
where p.v.{·} is the Cauchy principal value of the improper integral, given in this case by
lim_{δ→0} [ ∫_{−∞}^{t−δ} s(τ)/(t − τ) dτ + ∫_{t+δ}^{∞} s(τ)/(t − τ) dτ ]    (4.2)
A signal z(t) defined as
z(t) = s(t) + j H[s(t)] ≈ A(t) e^{jθ(t)}    (4.3)
is called the analytic signal of the real signal s(t). The approximation is valid for the above narrowband condition.
The definition of the analytic signal is important to define the IF of signal s(t).
Définition 4.2 (Instantaneous frequency [Boashash(2002)]).
Let z(t) be an analytic signal given in the form
z(t) = A_z(t) e^{jθ_z(t)}    (4.4)
The instantaneous frequency of the signal z(t) is then defined as
f_in(t) = (1/(2π)) dθ_z(t)/dt    (4.5)
The IF, fin (t), presents a measure of the localization in time of “that” frequency at time t. In this sense, a signal is said to be nonstationary if its IF varies
in time. We can observe in Figure 4.3 the TV behavior of an engineering signal
(linear FM signal, used in radar and military applications) and real–life signals
(whale song, electroencephalogram signal, bat signal).
Note that Definition 4.2 is applicable to monocomponent signals only, such as the signal illustrated in Figure 4.3(a). When more than one "ridge" appears in the signal TF representation, the signal is said to be multicomponent, e.g. the signals in Figure 4.3(b–d). The importance of the IF and its applications is presented by Boashash in [Boashash(1992a), Boashash(1992b), Boashash(1992c)].
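As a minimal sketch (not from the thesis), the analytic signal and the IF of (4.5) can be estimated numerically with scipy.signal.hilbert and a phase difference :

    import numpy as np
    from scipy.signal import hilbert

    fs = 1.0
    t = np.arange(128) / fs
    s = np.cos(2 * np.pi * (0.1 * t + 0.5 * (0.3 / 128) * t**2))   # LFM, IF from 0.1 to 0.4 Hz

    z = hilbert(s)                                # z(t) = s(t) + j H[s(t)], cf. (4.3)
    phase = np.unwrap(np.angle(z))
    f_inst = np.diff(phase) / (2 * np.pi) * fs    # discrete version of (4.5)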
The nonstationarity can also be expressed in the common sense of random
process as shown in [Boashash et Sucic(2002)]. Let z(t) be a complex signal of
which the autocorrelation function is defined as :
R_z(t, τ) = E{ z(t + τ/2) z*(t − τ/2) }    (4.6)
If Rz (t, τ ) only depends on the time–lag τ , which is the difference in time between
t1 = t+τ /2 and t2 = t−τ /2, the signal s(t) is said to be wide–sense stationary (we
only consider the second–order moment). On the other hand, when this condition is not satisfied, s(t) is said to be nonstationary : the autocorrelation function R_z(t, τ) then depends on both the time and the time–lag.
Fig. 4.3 : Examples of nonstationary signals. An engineering application is shown in (a) for a linear FM signal (plotted using the Wigner–Ville distribution). Real–life applications are shown in (b–d) for a whale signal, an electroencephalogram signal, and a bat signal, respectively (all plotted using the B distribution).
Définition 4.3 (Linear FM signal [Boashash(2002)]).
Consider the typical FM transmission in communications systems ; a narrowband FM signal is commonly defined as [Boashash(1992a)] :
s(t) = A(t) · cos( 2πf_c t + 2π ∫_{−∞}^t m(τ) dτ ).    (4.7)
When m(t) is a linear function of t, i.e. m(t) = αt, the signal s(t) is called a linear frequency–modulated (LFM) signal. In addition, if A(t) is a rectangular function, the signal is called a "chirp". A chirp signal, with duration T and bandwidth B, can be expressed as [Rihaczek(1985)] :
s_chirp(t) = rect_T(t) cos[ 2π( f_c t + (α/2) t² ) ]
The analytic signal associated with s_chirp(t) is then given by
z_LFM(t) = rect_T(t) e^{jθ(t)} = rect_T(t) e^{j2π( f_c t + (α/2) t² )}    (4.8)
and its IF is
f_in^chirp(t) = (1/(2π)) dθ(t)/dt = f_c + αt.    (4.9)
The chirp signal defined in (4.8) is of practical importance. It is the basic signal used in radar applications, and can be easily generated [Rihaczek(1985)]. It
is also used in military communication applications where the chirp is sent out as
a hostile signal to destroy other communications [Proakis(1995), Milstein(1988),
Amin(1997)]. In this thesis, we will refer to the chirp signal as an LFM signal (i.e.
the rectangular amplitude is implicit).
Based on the above concepts of analytic signal and instantaneous frequency for nonstationary signals, we now see how they fit into the fundamentals of TFSP.
4.3 The STFT, SPEC, WVD, and Quadratic TFD
To study the spectral properties of the signal at time t, an intuitive approach is
to, first, take a slice of the signal by applying a moving window centered at time t
to the signal, and then calculate the magnitude spectrum of the windowed signal.
Consider a signal s(τ ) and a real, even window h(τ ), whose FTs are S(f ) and
H(f ) respectively. To obtain a localized spectrum of s(τ ) at time τ = t, multiply
the signal by the window h(τ ) centred at time τ = t, obtaining
sh (t, τ ) = s(τ )h(τ − t),
(4.10)
and then take the FT w.r.t. τ , obtaining
S_h(t, f) = FT_{τ→f} { s(τ) h(τ − t) }    (4.11)
S_h(t, f) is called the short-time Fourier transform (STFT). The squared magnitude of the STFT, denoted by ρ_spec(t, f), is called the spectrogram (SPEC) [Boashash(1992c), Cohen(1995)]. It is mathematically expressed as
ρ_spec(t, f) = |S_h(t, f)|² = | ∫_{−∞}^{∞} s(τ) h(τ − t) e^{−j2πfτ} dτ |².    (4.12)
where S(t, f ) is the STFT. By varying t, one can obtain the spectral density as a
function of t.
The SPEC is a simple, popular and robust method for the analysis of nonstationary signals. It is a proper energy distribution in the sense that it is positive. On
the other hand, the SPEC has an inherent limitation : the frequency resolution is
dependent on the length (and the type) of the analysis window ; too short windows
cause a decrease in frequency resolution, and too long windows cause a decrease in
time resolution ; hence there is an inherent trade–off between time and frequency resolution in the SPEC for a particular window.
It was argued that, since a signal has a spectral structure at any given time, there should exist the notion of an "instantaneous spectrum" which has the physical attributes of an energy density. Based on this argument, the WVD was derived ; it is defined for an analytic signal z(t) as [Boashash(1992c)]
W_z(t, f) = ∫_{−∞}^{∞} z(t + τ/2) z*(t − τ/2) e^{−j2πfτ} dτ.    (4.13)
It can be observed from (4.13) that the WVD is the Fourier transform (FT)1 of K_z(t, τ) from τ to f, where
K_z(t, τ) = z(t + τ/2) z*(t − τ/2)    (4.14)
is called the time–lag signal kernel.
The WVD is the most widely studied TFD. It achieves maximum energy
concentration in the TF plane about the IF for LFM signals [Cohen(1995)]. However, it is in general non–positive and it introduces cross–terms when multiple
frequency laws (e.g. two LFM components) exist in the signals.
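As a minimal sketch (not from the thesis), the WVD of (4.13) can be evaluated on a discrete analytic signal directly from the time–lag kernel (4.14) ; normalization and frequency-scaling conventions are simplified here.

    import numpy as np

    def wvd(z):
        # Discrete Wigner-Ville distribution; rows = time, columns = frequency bins
        N = len(z)
        W = np.zeros((N, N))
        k = np.arange(N)
        for n in range(N):
            m_max = min(n, N - 1 - n)                  # admissible lags at time n
            lags = np.arange(-m_max, m_max + 1)
            K = z[n + lags] * np.conj(z[n - lags])     # K_z(n, m) = z(n+m) z*(n-m)
            W[n, :] = np.real(np.exp(-2j * np.pi * np.outer(k, lags) / N) @ K)
        return W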
A general class of quadratic TFD can be obtained by smoothing/filtering the WVD in t and f, and is expressed as [Cohen(1966)]
ρ_z(t, f) = ∫∫∫_{−∞}^{∞} e^{j2πν(u−t)} Γ(τ, ν) z(u + τ/2) z*(u − τ/2) e^{−j2πfτ} dν du dτ    (4.15)
where Γ(τ, ν) is a two–dimensional function in the Doppler–lag domain (τ, ν), called the TFD Doppler–lag kernel. The kernel determines the TFD and its properties. We can obtain a TFD with certain desired properties by properly constraining the Γ(τ, ν) function. Table 4.1 lists some common TFD and their corresponding Doppler–lag kernels.
Equation (4.15) can be simplified as [Boashash(2002)] :
ρ_z(t, f) = γ(t, f) **_{t,f} W_z(t, f).    (4.16)
The notation **_{t,f} in (4.16) represents a convolution in both the t and f directions, and γ(t, f) is the time–frequency kernel obtained through a dFT operation on Γ(τ, ν) as :
γ(t, f) = ∫∫_{−∞}^{∞} Γ(τ, ν) e^{−j2πfτ} e^{+j2πtν} dτ dν
Remark 4.1. Convention of dFT and dIFT operations : a dFT operation, transforming a function of two variables (t, f ) to another function of (τ, ν), contains
one FT operation from t to ν and one IFT operation from f to τ , and the FT
and IFT are interchangeable ; inversely, a dIFT operation, transforming a function of two variables (τ, ν) back to (t, f ), contains one IFT operation from ν to
t and one FT operation from τ to f , and these IFT and FT operations are also
interchangeable.
4.4
Reduced Interference Distributions
The problem of cross–terms introduced by the WVD when applied to a multicomponent signal can be dealt with by selecting a suitable kernel Γ(τ, ν) which minimizes the cross–terms effectively. The TFD corresponding to such kernels are known as reduced interference distributions (RID). Examples of time–frequency RID include the CWD [Choi et Williams(1989)], the BJD [Cohen(1966)], the cone–shaped ZAMD [Zhao et al.(1990)] and the MBD [Hussain(2002)], defined in Table 4.1. The RID may be applied in situations where there are simultaneously a number of signals of interest which need to be separated.

1. Convention of FT and IFT operations : an FT operation will transform a function either from the t to the ν domain, or from the τ to the f domain ; inversely, an IFT operation goes either from ν back to t, or from f back to τ.
Tab. 4.1: Some common TFD, their Doppler–lag kernels g(ν, τ) and the resulting distributions ρz(t, f).

WVD : g(ν, τ) = 1 ;
   ρz(t, f) = ∫ z(t + τ/2) z∗(t − τ/2) e^{−j2πfτ} dτ

SPEC : g(ν, τ) = ∫ h(u + τ/2) h∗(u − τ/2) e^{−j2πνu} du ;
   ρz(t, f) = | ∫ e^{−j2πfτ} z(τ) h(τ − t) dτ |²

CWD : g(ν, τ) = e^{−ν²τ²/σ} ;
   ρz(t, f) = ∫∫ √(πσ/τ²) e^{−π²σ(u−t)²/τ²} z(u + τ/2) z∗(u − τ/2) e^{−j2πfτ} du dτ

BJD : g(ν, τ) = sin(πντ)/(πντ) ;
   ρz(t, f) = ∫ (1/|τ|) ∫_{t−|τ|/2}^{t+|τ|/2} z(u + τ/2) z∗(u − τ/2) du e^{−j2πfτ} dτ

ZAMD : g(ν, τ) = h(τ)|τ| sin(πντ)/(πντ) ;
   ρz(t, f) = ∫ h(τ) ∫_{t−|τ|/2}^{t+|τ|/2} z(u + τ/2) z∗(u − τ/2) du e^{−j2πfτ} dτ

MBD : g(ν, τ) = |Γ(α + jπν)|²/Γ²(α), α ∈ IR+ ;
   ρz(t, f) = ∫ [ Γ(2α) / (2^{2α−1} Γ²(α) cosh^{2α}(t)) ] ⋆_t [ z(t + τ/2) z∗(t − τ/2) ] e^{−j2πfτ} dτ

4.5 The WVD and Ambiguity Function
By taking the dFT of the WVD, we obtain the symmetrical AF, also called
Sussman AF
Az(τ, ν) = FT_{t→ν} FT⁻¹_{f→τ} {Wz(t, f)}
         = ∫∫_{−∞}^{∞} Wz(t, f) e^{j2πfτ} e^{−j2πνt} dt df
         = ∫_{−∞}^{∞} Kz(t, τ) e^{−j2πνt} dt.   (4.17)
Slightly different definitions of the AF have been used by different authors ; however, they are all related to the symmetrical form Az(τ, ν) [Matz et Hlawatsch(1998b)].
A nonstationary signal, therefore, can be analyzed in either the time–frequency
domain (t, f ) or the ambiguity domain (τ, ν), also called Doppler–lag domain.
There, also, exists a relationship between the WVD and the AF via the Radon
transform [Jain(1989)], that is, the FT of the Radon–transformed WVD yields the
AF in polar coordinates [Ristic(1995)].
The concept of AF has been used as a very effective tool in the design of radar
signals [Boashash(1992c), Cook et Bernfeld(1993)]. This function is a basic tool of modern radar technology.
4.6 Relationships Among Dual Domains
The relationship between dual domain pairs, time–frequency and Doppler–
delay, and, time–delay and Doppler–frequency, can be represented as in Figure 4.4
[Boashash(2002)] through FT and IFT operations with respect to the corresponding variables. Each arrow in Figure 4.4 represents an FT from one variable to the other ; the inverse direction represents an IFT operation.
Fig. 4.4: Quadratic representations corresponding to the WVD.
Wz (t, f ), Az (τ, ν), Kz (t, τ ) and Dz (ν, f ) are respectively the WVD, AF, time–
lag signal kernel and the Doppler–frequency signal kernel of the analytic
signal z(t).
Moreover, for the general quadratic class of TFD in (4.15), the above relationship is illustrated in Figure 4.5 [Boashash(1992c), Boashash(2002)], where Az(τ, ν) denotes the generalized ambiguity function (GAF).
Fig. 4.5: Dual domains of general signal quadratic representations.
γ(t, f ), Γ (τ, ν), G(t, τ ) and G(ν, f ) are the TFD time–frequency, Doppler–lag,
time–lag and Doppler–frequency kernel, respectively. ρz (t, f ) and Az (τ, ν) are
the general quadratic TFD and the GAF of the analytic signal z(t).
Note that there is a strong coherence between quadratic TF signal representations and LTV systems [Matz et Hlawatsch(1998b), Hlawatsch et Matz(2000)]. A quadratic time-frequency analysis of an LTV system can be based on the linear relation between the WVD [Mecklenbräuker et Hlawatsch(1997)] or modified WVD [Hlawatsch et Matz(2000)] of both the input and the output. This
input-output relationship can be, in general, described as TFD [Gaarder(1968),
Altes(1980), Flandrin(1988a), Nguyen et al.(2001b)]
E{ρx(t, f)} = ρs(t, f) ⋆⋆ Ψh(t, f)   (4.18)
where ρs (t, f ) is a TFD of the input s(t) ; Ψh (t, f ) is the scattering function which
is related to the random LTV channel impulse response h(t, ν) ; and E {ρx (t, f )}
is the expected value of a TFD of the output x(t).
4.7
Time–Frequency Signal Synthesis
In contrast to TF signal analysis, whereby analysis algorithms are used to analyze the time-varying frequency behavior of signals, TF signal synthesis algorithms are used to synthesize, or estimate, signals from their TFD. Mathematically, assuming that z(t) is a signal of interest with ρz(t, f) being its TFD in the general quadratic class, the synthesis problem can be formulated as : find the analytic signal ẑ(t) whose TFD estimate, ρẑ(t, f), best approximates ρz(t, f). Consequently, ẑ(t) gives
the best estimate of z(t). Seminal to the problem of TF signal synthesis is the
algorithm in [Boudreaux-Bartels et Marks(1986)] using WVD. The basis for the
solution is the inversion property of the WVD [Boashash(1992c)]
z(t) = (1/z∗(0)) ∫_{−∞}^{∞} Wz(t/2, f) e^{j2πft} df   (4.19)
implying that the signal may be reconstructed to within a complex exponential constant e^{jα} = z∗(0)/|z(0)|, provided |z(0)| ≠ 0. Other time-frequency synthesis algorithms can be found in [Boashash(1991), McHale et Boudreaux-Bartels(1993),
Wood et Barry(1994), Hlawatsch et Krattenthaler(1997), Francos et Porat(1999)].
4.8
IF Estimation
There are two major existing approaches for IF estimation using TFD. The
first is built on the first–order moment of TFD [Boashash(1991)]. The first–order
moment of the WVD yields the IF [White et Boashash(1988), Boashash(1991)],
while others yield approximations of the IF [Boashash(1992c)]. However, this approach fails for multicomponent signals due to the presence of cross–terms.
The second approach is built on the fact that all TFD have peaks around the IF laws of signals. The peaks of the WVD were used for IF estimation and applied to many problems [Boashash(1992c)]. For better performance at low SNR, the XWVD was proposed [Boashash et O’Shea(1993)]. Other algorithms of TFD–based peak estimation can be found, for example, in [Boashash(1992c),
Stankovic et Katkovnik(1998), Katkovnik et Stankovic(1998)]. Like the first approach, this approach also suffers from the presence of cross–terms in multicomponent signals which results in poor estimation.
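The peak-based approach reduces to a one-line estimator ; the sketch below (added for illustration, reusing the spectrogram matrix computed in the earlier sketch, though any other discrete TFD would do) picks, at each time instant, the frequency bin where the distribution is maximal.

import numpy as np

def if_estimate_from_peak(tfd, fs, nfft):
    # tfd[k, m]: TFD value at time index k and frequency bin m (e.g. a spectrogram).
    # The IF estimate at each time is the frequency of the peak along the frequency axis.
    peak_bins = np.argmax(tfd[:, :nfft // 2], axis=1)   # keep positive frequencies only
    return peak_bins * fs / nfft                        # convert bin index to Hz

# e.g. f_hat = if_estimate_from_peak(rho, fs=1000.0, nfft=128) for the spectrogram sketch above.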
Motivated by the desire to design high-resolution RID, the B-Distribution (BD) was proposed in [Barkat(2000)], and the MBD was developed in [Hussain(2002)], both with adaptive algorithms for IF estimation of multicomponent signals.
4.9 Engineering Applications of Time–Frequency Methods
This section describes a representative selection of existing time-frequency signal processing applications, encompassing telecommunications, radar, sonar, power generation, image quality, and biomedical engineering.
– Time-Frequency Methods in Communications : Telecommunications
is one of the key industries where time-frequency methods are already playing
an important role. [Barbarossa et Scaglione(1999b)] investigate the problem of optimal precoding and channel capacity for transmission over linear time-varying (LTV) channels in wireless communications, where the multipath channels are underspread with finite Doppler and delay spreads.
In modern communication systems, a number of users can share the same communication channel via multiple access (MA) ; common examples are FDMA, TDMA, and CDMA [Rappaport(1996)]. The potential demand for
wireless communications combined with restricted availability of the radio
frequency spectrum has motivated intense research into bandwidth-efficient
multiple-access schemes. CDMA has received particular attention amongst other schemes. Issues such as designing and assigning the spreading codes in CDMA and multiple-access interference have become major concerns in research activities, and a number of approaches to them are based on time-frequency concepts [Crespo et al.(1995), Haas et Belfiore(1997), Joshi et Morris(1998)]. One of the main objectives of the third-generation mobile and personal telecommunication systems is to provide a wide range of services with different bit rates [Swarts et al.(1999)] ; a new approach called time-frequency slicing (TFS) was therefore proposed for multirate access in [Karol et al.(1997)].
– Time-Frequency Methods in Radar : Time-frequency methodologies
have already made significant inroads in this field. A baseband Doppler
radar return from a helicopter target is an example of a persistent nonstationary signal. A linear time-frequency representation provides a high
resolution suitable for preserving the full dynamic range of such complicated
signals [Marple S.L.(2001)].
– Time-Frequency Methods in Biomedical Engineering : An example
of time-frequency methodology used for the detection of seizures in recorded
EEG signals is proposed in [Celka et al.(2001)]. The techniques used are
adapted to the case of newborn EEGs, which exhibit some well defined
features in the time-frequency domain that allow an efficient discrimination
between abnormal EEGs and background. Another TF approach to newborn
EEG seizure detection is described in [H. Hassanpour et Boashash(2003)].
– Other Applications : There are a number of applications that could not
be included in the chapters for obvious space reasons.
4.10
Concluding Remarks
Time-frequency signal analysis (TFSA) is a collection of theory and algorithms
used for analysis and processing of non-stationary signals, as found in a wide range
of applications.
In this chapter, the main elements of TFSA have been summarized. Concisely presented, this tutorial introduction to TFSA is accessible to anyone who has taken a first course in signal processing. However, the expert reader can find more detailed references and real-life applications in the signal processing literature, such as [Boashash(2002), Cohen(1995)].
Part II
Separation of Impulsive Sources with Infinite Variance

In the first chapter of this part (Chapter 5), we recall the main existing principles and methods of source separation. Then, in the following four chapters (Chapters 6–9), we present our novel approaches to the blind separation of linear instantaneous mixtures of impulsive alpha-stable sources.
Chapter 5
State of the Art of BSS
Blind source separation (BSS) or independent component analysis (ICA) is a
method for finding underlying factors or components from multivariate statistical
data. What distinguishes BSS from other methods is that it looks for components
that are both statistically independent and non-Gaussian. In the second part of this
thesis, we will focus on the BSS of linear instantaneous mixtures. This BSS problem
has the advantages of simplicity and generality since the statistical principles used
in this context can be applied also to solve the convolutive mixing problem. In
this chapter, we briefly introduce the basic concepts and estimation principles of BSS. According to the different types of source statistical information, our contributions in this part are divided into four further chapters. The first one is devoted to BSS methods using fractional lower-order statistics (FLOS). We will pay particular attention to this chapter1 and give a general framework for separation methods using FLOS. In a second contribution to BSS, we give a theoretical procedure for constructing contrast functions using sub- or super-additive functionals. The third contribution is devoted to BSS methods based on some normalized HOS, while the fourth one is devoted to a semi-parametric maximum likelihood approach coupling a stochastic version of the EM algorithm with the use of log-spline functions to approximate the sources' PDFs.
1. To the best of our knowledge, there exists no BSS procedure based on FLOS, whilst many are based on HOS and SOS.
5.1 Introduction
5.1.1 What is blind source separation (BSS) ?
Blind source separation (BSS) is a fundamental problem in signal processing
that is sometimes known under different names : blind array processing, independent component analysis, waveform preserving estimation, etc. In all these
instances, the underlying model is that of m statistically independent signals
s(t) = (s1(t), · · · , sm(t))T whose n mixtures y(t) = (y1(t), · · · , yn(t))T are observed, possibly in a noisy environment w(t), as shown in Fig. 5.1. BSS addresses the problem of separating or, ideally, reconstructing the unknown source signals from an observable mixture. The term blind refers to the fact that the source signals and the way they are mixed are unknown. The mixtures of the source signals are termed the observable signals, and the model of the mixing of the source signals is referred to as the mixing system A. The separated signals are obtained from the observable signals by means of a separation system B ; in Figure 5.1 the signal model is depicted as a block diagram.
Fig. 5.1: Signal model for the blind source separation problem (sources s(t), mixing system A, noise w(t), observations x(t), separating system B).

BSS can have many applications in areas involving the processing of multi-sensor signals. Examples include : source localization and tracking by radar and
sonar devices ; speaker separation (cocktail party problem) ; multiuser detection
in communication systems ; medical signal processing, e.g., separation of EEG or
ECG signals ; industrial problems such as fault detection ; extraction of meaningful
features from data, etc.
This area has been very active over the last two decades. Surprisingly, this seemingly impossible problem has elegant solutions that depend on the nature of the
mixtures and the nature of the sources' statistical information [Hyvarinen et al.(2001)].
5.1.2
Brief history of BSS
The problem of blind source separation (BSS) was first introduced by J. Hérault, C. Jutten, and B. Ans [Hérault et Ans(1984)], [Hérault et al.(1985)] for linear instantaneous mixtures. Many researchers were then attracted
by the subject, and many other works appeared. More precisely, all through the 1980s, BSS was mostly known among French researchers, with limited influence internationally. The few BSS papers of that period were presentations at international neural-network conferences in the mid-1980s. At that time, another related and attractive field was higher-order spectral analysis, on which the first international workshop was organized in 1989. At this workshop, early papers on ICA were given by [Cardoso(1989a)] and P. Comon [Comon(1989)]. Cardoso used algebraic methods, especially higher-order cumulant tensors, which eventually led to the JADE algorithm [Cardoso et Souloumiac(1993)]. The use of fourth-order cumulants had been proposed earlier by [Lacoume et Ruiz(1988)].
The work of the scientists of the 1980s was extended by, among others, A. Cichocki and R. Unbehauen, who were the first to propose one of the presently most popular ICA algorithms [Cichocki et al.(1994)], [Cichocki et Unbehauen(1996)]. Several other papers on ICA and BSS were published in the early 1990s [Jutten(2000)]. However, until the mid-1990s, BSS remained a rather small and narrow research effort. Several algorithms were proposed that worked, usually on somewhat restricted problems, but it was not until later that their rigorous connections to statistical optimization criteria were exposed.
BSS attained wider attention and growing interest after the publication of the infomax-principle-based approach [Bell et Sejnowski(1995)] in the mid-1990s. This algorithm was further refined by S.-I. Amari and his co-workers using the natural gradient [Amari et al.(1996)], and its fundamental connections to maximum likelihood estimation, as well as to the Cichocki-Unbehauen algorithm, were established. A couple of years later, A. Hyvarinen and E. Oja presented the fixed-point or FastICA algorithm [Hyvarinen(1999)], which has contributed to the application of ICA to large-scale problems thanks to its computational efficiency.
Since the mid-1990s, there has been a growing wave of papers, workshops, and special sessions devoted to ICA. Indeed, many approaches for performing BSS have been proposed from different viewpoints. Using second-order statistics, the popular SOBI algorithm was introduced in [Belouchrani et al.(1997a)] for spatially correlated sources. The same methodology was generalized to cyclostationary sources in [Abed-Meraim et al.(2001)]. A useful generalization of the fourth-order cumulant based methods was proposed in [Pesquet et Moreau(2001)], [Moreau(2001)]. The convolutive model has been addressed in [Castella et Pesquet(2004)]. Motivated by the useful incorporation of prior information about the data into the BSS framework, the Bayesian approach was introduced in [Djafari(1999)]. Other researchers, considering some prior information about the propagation system in a semi-blind model, have also contributed to the BSS field. For example, in [Davy et al.(2002)] the Bayesian approach was coupled with MCMC techniques to estimate chirp signals. Before that, a polynomial approach was proposed in [Benidir(1997)]. The non-linear mixture BSS model was investigated early on in [Abed-Meraim et al.(1996)] and [Krob et Benidir(1993)]. Since 1999, an international workshop on ICA and BSS has gathered, every year, more than 150 researchers working on blind signal separation, and has contributed to the transformation of BSS into an established and mature field of research.
As an extension of the instantaneous mixtures, other models of signal mixtures
have been considered in the literature of signal processing. More precisely, we can
distinguish three classes of mixtures :
[C]- Linear instantaneous mixtures
This model is commonplace in the field of narrow band array processing where
the transfer function between sources and sensors is given by a constant matrix
A (i.e., involves no delays or frequency distortion) called the ‘array matrix’ or
the ‘mixing matrix’. Many array processing techniques rely on the modelling of A
[Krim et Viberg(1996)] : each column of A is assumed to depend on a small number
of parameters. This information may be provided either by physical modelling (for
example, when the array geometry is known and when the sources are in the farfield of the array) or, more likely, by direct array calibration. In many circumstances
however, this information is not available or is not reliable. Blind source separation techniques address the issue of identifying A and/or retrieving the source signals without resorting to any a priori information about the mixing matrix A : they exploit only the information carried by the received signals themselves, hence the term blind. The performance of such a blind technique, by its very nature, is essentially unaffected by potential errors in the propagation model or in array calibration (this is obviously not the case for parametric array processing techniques). Of course, the lack of information on the structure of A must be compensated for by some additional assumptions on the source signals, as will be shown next.
[A]- Non-linear mixtures
In the basic signal model of BSS, an unknown linear mixing process is often assumed. However, this model fails as soon as the linear approximation of the physical
phenomenon is not valid. This is the case for example when the signal is received
at an array of sensors with non-linear characteristics. Some particular non-linear
models have been thoroughly studied in the literature such as the post non-linear
model where the mixing process is a cascade of a linear mixture and a componentwise non-linear transform [Taleb(1999)], and the linear quadratic model where
the observations are quadratic functions of the sources [Krob et Benidir(1993)],
[Abed-Meraim et al.(1996)] and [Taleb(1999)]. The general non-linear problem is
still largely unsolved except for some tentative solutions based on neural networks
using self-organizing feature maps [Taleb(1999)] or information preserving nonlinear maps [Taleb(1999)], [Hyvarinen et al.(2001)].
[B]- Linear convolutive mixtures
Many real-world communication systems involve source signals that are delayed and attenuated by different amounts on their way to the different sensors
(receivers), as well as multipath propagation. Moreover, the multipath can be diffuse with long delay spread causing intersymbol interference and resulting in a
situation termed ‘linear convolutive mixing’. Mathematically, the mixing is described by a matrix of linear filters operating on the sources. Although not completely
solved, this problem is much better known than the non-linear mixing one. The
first research works focused on the case where the mixing is square (i.e., the number of inputs equals the number of outputs), for which a multitude of solutions have been given using neural networks, independent component analysis (ICA), or information-theoretic approaches [Hyvarinen et al.(2001)]. Interestingly, by stacking successive observations into a single vector, the convolutive mixtures can be expressed as instantaneous mixtures (with a full column rank mixing matrix if there are more outputs than inputs). Thus, BSS solutions for instantaneous mixtures can be
adapted to solve the convolutive mixture problem, e.g., [Mansour et al.(2000a)],
[Babie-Zadeh(2002)], [Castella et al.(2004)], [Castella et Pesquet(2004)].
A good source with historical accounts and a more complete list of references is [Jutten(2000)], a good overview of the statistical principles of BSS is [Cardoso(1998)], and an elegant overview paper is [Mansour et al.(2000b)]. There is still much work left to do. For example, we still do not have an adequate explanation for why ICA converges for so many problems, almost always to the same solutions, even when the signals were not derived from independent sources ! Other serious problems, such as the under-determined mixture case, non-stationary sources, heavy-tailed sources, non-linear mixtures and dependent sources, remain open to research efforts.
5.1.3
Statistical information for BSS
Statistical moments of signals provide a rich source of information. The whole spectrum of statistical moments runs from order 0 to order ∞ (see Figure 5.2). The oldest traditional signal separation methods utilize only second-order moments, such as PCA-type methods [Belouchrani et al.(1997a)]. All through the 1990s, BSS methods were extended to make wide use of higher-order statistical moments [Nikias et Petropulu(1994)], [Comon(1994)], [Cardoso et Souloumiac(1993)]. More recently, fractional lower-order statistical signal processing techniques extract useful information from pth-order statistics with −1 < p < 2. In this thesis, we will show that blind source separation based on stable models can be adequately performed using fractional lower-order moments, i.e., moments of order less than 2.
Fig. 5.2: Order of statistics in blind source separation : lower–order fractional moment theory (orders below 2), second–order moment theory (order 2), and higher–order moment theory (orders above 2).
Thus, the sources statistical information can be of three types :
[A]- Higher order statistical information
For non-Gaussian independent sources, higher-order statistics (HOS) can be used to achieve BSS. The first HOS-BSS approach traces back to the pioneering adaptive algorithm of [Hérault et Ans(1984)], [Hérault et al.(1985)]. This method does not use HOS explicitly but tries to equalize the channel by minimizing a cost function that implicitly contains the information of the higher-order moments of the output. Alternative batch algorithms, which explicitly use higher-order cumulants, have been developed later ; see, for instance, [Cardoso(1991)], [Comon(1994)]. Other HOS-based solutions include separation by maximum likelihood (ML), separation by neural networks, separation by contrast functions, separation by information-theoretic criteria, etc.
[B]- Second order statistical information
Second-order statistics were used early on in blind equalization [Delmas et al.(2000)], in DOA estimation [Delmas(2004)] and in many other classical signal processing methods related to estimation and detection. When the data show some kind of temporal dependency, alternative BSS methods can be developed based on second-order statistics [Abed-Meraim et al.(1997b)]. SOS-based methods are expected to be more robust to poor signal-to-noise ratios and short data sizes [Gazzah et Abed-Meraim(2003)]. BSS is feasible based on spatial correlation matrices [Belouchrani et al.(1997a)], [Mansour et al.(2000a)]. These matrices exhibit a simple structure which allows straightforward blind identification procedures based on eigendecomposition. For example, an algorithm using second-order cyclostationary statistics was introduced in [Abed-Meraim et al.(2001)].
[C]- Fractional lower order statistical information
It is known that, for a non-Gaussian stable distribution with characteristic exponent α, only moments of order less than α are finite. In particular, the secondorder moment of a stable distribution with α < 2 does not exist, making the use
of covariance as a measure of correlation meaningless. Similarly, many standard
signal processing tools (e.g., spectral analysis and all higher-order techniques) that
are based on the assumption of finite variance will be considerably weakened and
may, in fact, give misleading results.
Recall that the stable distribution is best used to model signals and noise that exhibit an impulsive nature. This type of signal tends to produce outliers. Although SOS- and HOS-based BSS methods usually lead to analytically tractable results, they are no longer appropriate for impulsive non-Gaussian signals. It has been
demonstrated many times in the literature that second and higher-order estimates
can deteriorate dramatically when only a small proportion of extreme observations
is present in the data.
The absence of a finite SOS and HOS does not mean, however, that there are no
other adequate measures of independence of stable random variables. As it will be
shown later in this thesis, the dispersion of a stable random variable plays a role
analogous to the SOS. Despite the aforementioned difficulties, significant progress
has been made in developing a linear estimation theory for stable processes over
the past thirty years.
In this thesis, we introduce a new class of source separation methods based on the use of fractional lower-order statistics (FLOS), i.e., statistics of order less
than 2.
5.2
Linear Instantaneous Mixtures
Consider m mutually independent signals whose n ≥ m linear combinations
are observed in noise :
x(t) = y(t) + w(t) = As(t) + w(t)
(5.1)
where s(t) = [s1 (t), · · · , sm (t)]T is the real source vector, w(t) = [w1 (t), · · · , wn (t)]T
is the real noise vector, and A is the n × m full rank mixing matrix.
The purpose of blind source separation is to find a separating matrix, i.e., an m × n matrix B such that z(t) = Bx(t) is an estimate of the source signals.
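A minimal numerical version of the model (5.1), added here only to fix the notation (the dimensions, mixing matrix and noise level are arbitrary choices) :

import numpy as np

rng = np.random.default_rng(0)
m, n, T = 3, 4, 10000                     # m sources, n >= m sensors, T samples

s = rng.laplace(size=(m, T))              # mutually independent non-Gaussian sources
A = rng.standard_normal((n, m))           # unknown full-rank n x m mixing matrix
w = 0.01 * rng.standard_normal((n, T))    # additive sensor noise
x = A @ s + w                             # observations, Eq. (5.1)

# A separating matrix B (m x n) is any matrix such that z = B @ x recovers the sources up to
# permutation and scaling; in the noiseless case B = np.linalg.pinv(A) would do, but A is unknown.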
5.2.1
Separability and indeterminacies
When the sources are white stationary processes, then their separation can be
achieved under the following conditions.
Theorem 5.1. If there is at most one Gaussian source, then the independence of
the components of y implies BA = PΛ, where P and Λ represent a permutation
and a diagonal matrix, respectively.
In other words, linear instantaneous mixtures are separable, up to permutation and scale indeterminacies, provided that there is at most one Gaussian source.
We will not prove the identifiability of the BSS model here, since the proof is quite complicated ; see Comon's paper [Comon(1994)]. Next, we develop a constructive, non-rigorous discussion of identifiability.
[A]- Separability of instantaneous linear mixtures model
To make sure that the basic BSS model given in (5.1) can be estimated, we
have to make certain assumptions :
1. The sources s(t) are at each time instant mutually independent : This is the
principle on which ICA rests. Surprisingly, not much more than this assumption is needed to ascertain that the model can be estimated. This is why BSS
is such a powerful technique with applications in many different areas.
Basically, r.v.s Y1 and Y2 are said to be independent if information on the
value of Y1 (of Y2 ) does not give any information on the value of Y2 (of
Y1 ). Technically, independence can be defined by the PDFs. Let us denote
p(y1 , y2 ) the joint PDF of Y1 and Y2 , and by pi (yi ) the marginal PDF of Yi
for i = 1, 2. Then we say that Y1 and Y2 are independent if the joint PDF is
decomposable in the following way :
p(y1 , y2 ) = p1 (y1 )p2 (y2 )
(5.2)
2. At most one source has a Gaussian distribution : Whitening also helps us understand why Gaussian variables are forbidden in BSS. Assume that the joint distribution of two sources s1 and s2 is Gaussian. This means that their joint PDF is given by

p(s1, s2) = (1/2π) exp(−(s1² + s2²)/2) = (1/2π) exp(−‖s‖²/2)   (5.3)
Now, assume that the mixing matrix A is orthogonal. For example, we could assume that this is so because the data has been whitened. Using the classic formula for transforming PDFs, and noting that A⁻¹ = Aᵀ holds for an orthogonal matrix, we get the joint density of the mixtures x1 and x2 as

p(x1, x2) = |det(Aᵀ)| (1/2π) exp(−‖Aᵀx‖²/2)   (5.4)
Due to the orthogonality of A, we have ‖Aᵀx‖² = ‖x‖², |det(A)| = 1 and Aᵀ is also orthogonal. Thus we have

p(x1, x2) = (1/2π) exp(−‖x‖²/2) = p(s1, s2)   (5.5)
and we see that the orthogonal mixing matrix does not change the PDF,
since it does not appear in this PDF at all. The original and mixed distributions are identical. Therefore, there is no way we could infer the mixing matrix from the mixtures.
The phenomenon that the orthogonal mixing matrix cannot be estimated
for Gaussian variables is related to the property that uncorrelated jointly
Gaussian variables are necessarily independent. Thus, the information on the independence of the components does not get us any further than whitening. Hence, in the case of Gaussian independent components, we can only
estimate the BSS model up to an orthogonal transformation. In other words,
the matrix A is not identifiable for Gaussian independent components. With
Gaussian variables, all we can do is whiten the data.
What happens if we try to estimate the BSS model and some of the components are Gaussian, some non-Gaussian ? In this case, we can estimate all
the non-Gaussian components, but the Gaussian components cannot be separated from each other. In other words, some of the estimated components
will be arbitrary linear combinations of the Gaussian components. Actually,
this means that in the case of just one Gaussian source, we can estimate
the model, because the single Gaussian component does not have any other Gaussian component that it could be mixed with.
3. The number of sensors is greater than or equal to the number of sources n ≥ m :
This assumption is needed to make the mixing matrix A a full rank matrix.
Then, after estimating the matrix A, we can compute its (pseudo-)inverse, say B, and obtain the independent components in the noiseless case simply as s = Bx.
[B]- Indeterminacies in the instantaneous linear mixtures model
In the BSS model (5.1), it is easy to see that the following two ambiguities or indeterminacies will necessarily hold. First, there is no way of knowing the original
labelling of the sources, hence any permutation of the outputs is also a satisfactory
solution, i.e., if z(t) is a solution then Pz(t) is also a solution for any permutation
matrix P. Choosing a labelling of the outputs can only be done with some extra
knowledge of the system. The second ambiguity is that exchanging a fixed scalar
factor between a source signal and the corresponding column of A does not affect
the observations as is shown by the following relation :
x(t) = As(t) + w(t) = Σ_{p=1}^{m} (ap/λp) λp sp(t) + w(t)   (5.6)
where λp is an arbitrary real factor and ap denotes the p-th column of A.
It follows that the best that one can do is to determine B (or equivalently the
matrix A) up to a permutation and scaling of its columns [Mansour et al.(2000b)].
Therefore, B is said to be a separating matrix if
By(t) = PΛs(t)
where P is a permutation matrix and Λ a non-singular diagonal matrix. Similarly,
blind identification of A is understood as the determination of a matrix equal to
A up to a permutation matrix and a non-singular diagonal matrix.
Many authors take advantage of the scaling indetermination by assuming, without
any loss of generality, that the source signals have unit variance, so that the dynamic range of the sources is accounted for by the magnitude of the corresponding
columns of A. Other normalization strategies exist such as normalizing the diagonal entries of A (respectively B) to unity.
5.2.2
How to find the independent components
It may be very surprising that the independent components can be estimated
from linear mixtures with no more assumptions than their independence. In this
chapter, we will try to explain briefly why and how this is possible.
[A]- Uncorrelatedness is not enough
The first thing to note is that independence is a much stronger property than
uncorrelatedness. Considering the BSS problem, we could actually find many different uncorrelated representations of the signals that would not be independent
and would not separate the sources. Uncorrelatedness in itself is not enough to separate the components. This is also the reason why principal component analysis
(PCA) or factor analysis cannot separate the signals : they give components that
are uncorrelated, but little more. In fact, by using the well-known decorrelation
methods, we can transform any linear mixture of the independent components into
uncorrelated components, in which case the mixing is orthogonal. Thus, the trick
in BSS is to estimate the orthogonal transformation that is left after decorrelation.
This is something that classic methods cannot estimate because they are based on
essentially the same covariance information as decorrelation. In the following, we
consider a couple of more sophisticated and popular procedures for estimating ICA.
[B]- Nonlinear decorrelation is the basic ICA method
One way of stating how independence is stronger than uncorrelatedness is to
say that independence implies nonlinear uncorrelatedness : If s1 and s2 are independent, then any nonlinear transformation g(s1 ) and h(s2 ) are uncorrelated.2 In
contrast, for two r.v. that are merely uncorrelated, such nonlinear transformations
do not have zero covariance in general. Thus, we could attempt to perform BSS by
a stronger form of decorrelation, by finding a representation where the yi are uncorrelated even after some nonlinear transformations. This gives a simple principle
of estimating the separating matrix B :
BSS approach 1 : Nonlinear decorrelation. Find the matrix B so that
for any i ≠ j, the components yi and yj are uncorrelated, and the transformed components g(yi) and h(yj) are uncorrelated, where g and h are
some suitable nonlinear functions.
This is a valid approach to estimating ICA : If the nonlinearities are properly chosen, the method does find the independent components. Although this principle is
very intuitive, it leaves open an important question : How should the nonlinearities
g and h be chosen ? Answers to this question can be found by using principles from
estimation theory and information theory. Estimation theory provides the most
classic method of estimating any statistical model : the maximum likelihood method. Information theory provides exact measures of independence, such as mutual
information. Using either one of these theories, we can determine the nonlinear
functions g and h in a satisfactory way.
[C]- Independent components are the maximally non-gaussian components
Another very intuitive and important principle of ICA estimation is maximum
non-gaussianity. The idea is that, according to the central limit theorem, sums of non-gaussian r.v.s are closer to Gaussian than the original ones. Therefore, if we take a linear combination y = Σ_i bi xi of the observed mixture variables, this will be maximally non-gaussian if it equals one of the independent components. This
is because if it were a real mixture of two or more components, it would be closer
to a gaussian distribution, due to the central limit theorem. Thus, the principle
can be stated as follows :
2. In the sense that their correlation is zero, i.e. IE[g(s1)h(s2)] = 0.
BSS approach 2 : Maximum non-Gaussianity. Find the local maxima of non-gaussianity of a linear combination y = Σ_i bi xi under the constraint that the variance of y is constant. Each local maximum gives one independent component.
To measure non-gaussianity in practice, we could use, for example, the kurtosis. Recall that the kurtosis is a normalized higher-order cumulant ; cumulants are generalizations of the variance using higher-order polynomials. Cumulants have interesting algebraic and statistical properties, which is why they play an important part in the theory of BSS. An interesting point is that this principle of maximum non-gaussianity shows the very close connection between BSS and an independently developed technique of robust statistics called projection pursuit. In projection pursuit, we are actually looking for maximally non-gaussian linear combinations, which are used for visualization and other purposes. Thus, the BSS problem can be interpreted as a search for projection pursuit directions.
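As a concrete illustration of this principle (a sketch under the assumption of whitened, zero-mean data, not the thesis's own algorithm), one can extract a single maximally non-Gaussian component with the classical kurtosis-based fixed-point update b ← E{z (bᵀz)³} − 3b followed by renormalization ; further components would be obtained by deflation, i.e. by orthogonalizing each new vector against the ones already found.

import numpy as np

def one_unit_kurtosis_ica(z, n_iter=100, seed=0):
    # z: whitened observations of shape (d, T) (zero mean, identity covariance).
    # Returns a unit vector b such that y = b^T z is a local maximizer of |kurtosis|.
    rng = np.random.default_rng(seed)
    d, T = z.shape
    b = rng.standard_normal(d)
    b /= np.linalg.norm(b)
    for _ in range(n_iter):
        y = b @ z                                   # current component estimate, shape (T,)
        b_new = (z * y ** 3).mean(axis=1) - 3 * b   # sample version of E{z y^3} - 3 b
        b_new /= np.linalg.norm(b_new)
        if abs(abs(b_new @ b) - 1) < 1e-10:         # converged (up to a sign flip)
            return b_new
        b = b_new
    return b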
[D]- Important role of numerical techniques :
In addition to the estimation principle, one has to find efficient algorithms for
implementing the computations needed. Thus, numerical algorithms are an integral
part of BSS methods. The numerical methods are typically based on optimization
of some objective functions. The basic optimization method is the gradient method.
For example, a well known fixed-point algorithm called FastICA has been tailored
to exploit the particular structure of the ICA problem.
5.3 Basic BSS Methods
5.3.1 BSS by minimization of mutual information
An important approach for blind source separation, inspired by information
theory, is the minimization of mutual information. The motivation of this approach
is that it may not be very realistic in many cases to assume that the data follows
the BSS model. Therefore, we would like to present here an approach that does not
assume anything about the data. What we want to have is a general measure of the dependence of the components of a random vector. Using such a measure, we could define BSS as a linear decomposition that minimizes that dependence measure. We
recall here very briefly the basic definitions of information theory. The differential
entropy H of a random vector y = (y1 , · · · , yn )T with density p(y) is defined as :
H(y) ≜ −IE{log p(y)}   (5.7)

A normalized version of entropy is given by the negentropy J, which is defined as follows

J ≜ H(yGauss) − H(y)   (5.8)
where yGauss is a Gaussian random vector of the same covariance or correlation
matrix as y. Negentropy is always non-negative, and is equal to zero only for
Gaussian random vectors. Mutual information I between n r.v.s yi , i = 1, · · · , n is
defined as follows
I(y1, · · · , yn) = Σ_{i=1}^{n} H(yi) − H(y)   (5.9)
Mutual information can also be expressed as the Kullback-Leibler divergence of py(y) and ∏_i pyi(yi) :

I(y) ≜ ∫ py(y) log [ py(y) / ∏_i pyi(yi) ] dy   (5.10)
[A]- Mutual information as a measure of dependence
From the well known properties of the Kullback-Leibler divergence, I(y) is always non-negative, and is zero if and only if py(y) = ∏_i pyi(yi), that is, if y1, · · · , yn are independent [Cover et Thomas(1991)]. Consequently, I(y) is a measure of dependence, or contrast function, and source separation algorithms can be designed based on its minimization.
[B]- Mutual information as maximum likelihood estimation
Mutual information (MI) and likelihood are intimately connected. Indeed, it
was shown that the minimization of the mutual information is asymptotically a
Maximum Likelihood (ML) estimation of the sources [Taleb(1999)]. More and more
connections exist between MI and ML approaches in practice because we do not
know the distributions of the sources. For example, the need of approximation of
MI can use the ML estimation. Consequently, many recent works are based on this
criterion [Pham(1999)].
[C]- Mutual information as maximization of non-gaussianity
To state the idea, suppose that the whitening z = Wx has been done, and
hence a unitary matrix U must be estimated to achieve independent outputs. Now
from y = Uz we have py(y) = pz(z)/|det(U)| = pz(z). Consequently, H(y) = H(z) and I(y) = Σ_i H(yi) − H(z). Since H(z) does not depend on U, minimizing I(y) with respect to U is equivalent to minimizing the sum of the marginal entropies.
Moreover, −H(yi ) can be seen as the Kullback-Leibler divergence between the
density of yi and a zero-mean unit-variance Gaussian density (up to a constant
term). This leads us to the conclusion that U must be estimated so as to produce outputs that are as non-Gaussian as possible. This fact has a nice intuitive interpretation :
from the central limit theorem we know that the mixing tends to gaussianize the
observations, and hence the separating system should go to the opposite direction.
A well-known algorithm based on the non-gaussianity of the outputs is FastICA [Hyvarinen(1999)], [FastICA(1998)], which uses negentropy as a measure of non-gaussianity.
[D]- Algorithms for minimization of mutual information
To use MI in practice, we need some method of estimating or approximating
it from real data. We recall that there are many mutual information approximation techniques. The cumulant-based approximation was proposed in [Jones et Sibson(1987)], and it is almost identical to the one proposed in [Comon(1994)]. Approximations of entropy using nonpolynomial functions were introduced in [Hyvarinen(1998)], and they are closely related to the measures of non-gaussianity that have been proposed in the projection pursuit literature ; see, e.g., [Cook et al.(1993)].
5.3.2
BSS by maximization of non-gaussianity
Non-gaussianity is actually of paramount importance in blind source separation. Without non-gaussianity the separation is not possible at all, as shown above.
An important class of source separation algorithms is based on the non-gaussianity
of the outputs [Hyvarinen(1999)], [FastICA(1998)]. As a first practical measure of
non-gaussianity, the fourth-order cumulant, or kurtosis, was introduced. Practical algorithms were derived using the gradient and fixed-point methods. However,
kurtosis has some drawbacks in practice, when its value has to be estimated from
a measured sample. The main problem is that kurtosis can be very sensitive to
outliers. In other words, kurtosis is not a robust measure of non-gaussianity. To
mitigate this problem, the negentropy was proposed as an important measure of
non-gaussianity. Its properties are in many ways opposite to those of kurtosis : It
is robust but computationally complicated. Furthermore, computationally simple
approximations of negentropy that more or less combine the good properties of both measures have been introduced in different papers in the BSS literature (for more details
refer to [Hyvarinen et al.(2001), chapter 8]).
5.3.3
BSS by maximum likelihood estimation
A very popular approach for estimating the independent component analysis
model is maximum likelihood (ML) estimation. Maximum likelihood estimation
is a fundamental method of statistical estimation ; a short introduction will be
provided in chapter 8. One interpretation of ML estimation is that we take those
parameter values as estimates that give the highest probability for the observations.
To perform maximum likelihood estimation in practice, we need an algorithm to
perform the numerical maximization of likelihood. For that, we distinguish two
cases :
• Sources PDF’s are known : If the densities of the independent components
are known in advance, a very simple gradient algorithm can be derived.
To speed up convergence, the natural gradient version and especially the
FastICA fixed-point algorithm can be used that maximize the likelihood
faster and more reliably.
• Sources PDF’s are unknown : If the densities of the independent components
are not known, the situation is somewhat more complicated. Fortunately,
however, it is enough to use a very rough density approximation, as we will do in chapter 8 using the family of log-spline functions. The choice of
the density can then be based on whether the independent components are sub- or super-Gaussian. Such an estimate can be simply
added to the gradient methods, and it is automatically done in FastICA.
This is also the approach we have used throughout this thesis in the noisy case
(see chapter 8) as a semi-parametric maximum likelihood approach.
5.3.4
BSS by algebraic tensorial methods
One approach for the estimation of independent component analysis consists of using higher-order cumulant tensors. Tensors can be considered as generalizations of matrices, or linear operators. Cumulant tensors are then generalizations of the covariance matrix. The covariance matrix is the second-order cumulant tensor, and the fourth-order tensor is defined by the fourth-order cumulants Cum(xi, xj, xk, xl).
We can use the eigenvalue decomposition of the covariance matrix to whiten the
data [Abed-Meraim et Hua(1997)]. This means that we transform the data so that
second-order correlations are zero.
As a generalization of this principle, we can use the fourth-order cross-cumulant
tensor to make the fourth-order cumulants zero, or at least as small as possible.
This kind of higher-order decorrelation gives one of the most popular methods for
blind source separation [Cardoso et Comon(1996)]. Joint approximate diagonalization of eigenvalue decomposition is one method in this category that has been successfully used in low-dimensional problems [Cardoso et Souloumiac(1993)]. In the
special case of distinct kurtoses, a computationally very simple method (FOBI) can
be devised. An accessible and fundamental paper is [Cardoso(1999)] that also introduces sophisticated modifications of the previously proposed tensorial methods.
A more interesting generalization is given in [Moreau(2001)]. The tensor-based methods, however, have become less popular recently [Belouchrani et al.(2001)]. This
is because methods that use the whole EVD like JADE are restricted, for computational reasons, to small dimensions. Moreover , they have statistical properties
inferior to those methods using non-polynomial cumulant or maximum likelihood.
We shall consider this approach in more details in chapter 7. Indeed, we propose
in this thesis a normalized version of this class of methods using some normalized
second-order and fourth-order cumulants tensors to separate heavy-tailed signals
[Sahmoudi et al.(2004a)].
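As a concrete instance of this family (an illustrative sketch, not the algorithm proposed in this thesis), the FOBI method mentioned above needs only one extra eigendecomposition after whitening, and recovers the rotation when all source kurtoses are distinct :

import numpy as np

def fobi(x):
    # x: observations of shape (n, T). Returns source estimates up to permutation and scaling.
    x = x - x.mean(axis=1, keepdims=True)
    # 1) Whitening through the EVD of the covariance matrix (second-order decorrelation).
    d, E = np.linalg.eigh(np.cov(x))
    z = (E @ np.diag(1.0 / np.sqrt(d)) @ E.T) @ x
    # 2) EVD of the fourth-order weighted covariance E{ ||z||^2 z z^T }; its eigenvectors
    #    give the remaining rotation when the source kurtoses are all distinct.
    C = (z * np.sum(z ** 2, axis=0)) @ z.T / z.shape[1]
    _, U = np.linalg.eigh(C)
    return U.T @ z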
5.3.5
BSS by non-linear decorrelation
This approach is the early research effort in BSS that was successfully used by
Jutten, Hérault, and Ans to solve the first ICA problems. A good review of this
class of techniques can be found in [Jutten(2000)]. Today, this work is mainly of
historical interest, because there exist several more efficient algorithms for BSS.
Nonlinear decorrelation can be seen as an extension of second-order methods.
Independent sources can in some cases be found as nonlinearly uncorrelated linear combinations. The nonlinear functions used in this approach introduce higher order statistics into the solution method, making blind source separation
possible. In [Cichocki et Unbehauen(1996)], one of the most popular learning algorithms was introduced as an extension of the first source separation algorithm [Hérault et Ans(1984)]. Another well-known algorithm is the equivariant adaptive separation via independence (EASI) algorithm, based on nonlinear decorrelation [Cardoso et Lahed(1996)]. In [Amari et Cardoso(1997)], a different framework based on estimating functions was introduced. Other somewhat related methods
were proposed in the blind source separation literature [Cichocki et Amari(2002)].
We will recall this approach and give more detail in chapter 7 as well as a normalized version, based on some normalized statistics, of this category of methods
to separate heavy-tailed sources [Sahmoudi et Abed-Meraim(2004b)].
5.3.6
BSS using geometrical concepts
Another method for BSS is the geometric approach [Mansour et al.(2001),
Mansour et al.(2002a), Babaie-Zadeh et al.(2004)]. This approach, which holds essentially for two sources and two sensors, is based on a geometrical interpretation
of the independence of two random variables. To state the idea more clearly, suppose that the marginal PDF’s of the sources s1 and s2 are non-zero only within the
intervals M1 ≤ s1 ≤ M2 and N1 ≤ s2 ≤ N2 . Then, from the independence of s1 and
s2 , we have ps1 s2 (s1 , s2 ) = ps1 (s1 )ps2 (s2 ), and hence the support of ps1 s2 (s1 , s2 ) will
be the rectangular region {(s1 , s2 )|M1 ≤ s1 ≤ M2 , N1 ≤ s2 ≤ N2 }. In other words,
the scatter plot of the source samples forms a rectangular region in the (s1 , s2 )
plane. The linear mapping x = As transforms this region into a parallelogram
region. Without loss of generality, one can write
A =
[ 1  a ]
[ b  1 ]
then it can be seen that the slopes of the borders of the scatter plot of the observations will be b and 1/a. Hence estimating the mixing matrix A is equivalent to
estimating the slopes of the borders of this parallelogram.
5.3.7
Source separation using Bayesian framework
Throughout our work so far we have assumed that there is no information
available about the true parameter beyond that provided by the data. However,
there are situations in which most statisticians would agree that more can be said.
Technically, there is a substantial number of statisticians in the Bayesian School
who feel that it is always reasonable, and indeed necessary, to think of the true
value of the parameter θ as being the realization of a random variable θ with a
known distribution. This distribution does not always correspond to an experiment
that is physically realizable but rather is thought of as a measure of the beliefs of
the experimenter concerning the true value of θ before he or she takes any data.
To describe the Bayesian approach to source separation, let us write Bayes’
theorem in the case of a source separation problem [Knuth(1999)]
P(A, s(t)|x(t), I) = P(x(t)|A, s(t), I) P(A, s(t)|I) / P(x(t)|I)   (5.11)
where I represents any prior information. We can rewrite the equation as a proportionality and equate the inverse of the prior probability of the data P (x(t)|I)
to the implicit proportionality constant
P (A, s(t)|x(t), I) ∝ P (x(t)|A, s(t), I) P (A, s(t)|I)
(5.12)
The probability on the left-hand side of Equation (5.12) is referred to as the posterior probability. It represents the probability that the given model accurately describes the physical situation. The first term on the right-hand side is the likelihood of the data given the model. It describes the degree of accuracy with which we believe the model can predict the data. The final term on the right is the prior probability of the model, also called the prior. This prior represents the degree to which we
believe the model to be correct based only on our prior information about the
problem. It is through the assignment of the likelihood and priors that we express
all of our knowledge about the particular source separation problem.
If the linear mixture is relatively noise-free, the aim now becomes to estimate a separating matrix B that optimizes the posterior probability of the model and to estimate the source signals by applying the separation matrix to the recorded data.
The Bayesian methodology has several advantages. The most important advantage is the fact that all of the prior knowledge about a specific problem is expressed in terms of prior probabilities that must be evaluated. This provides one
with the means to incorporate any additional relevant information into a problem
[Djafari(1999)], [Snoussi et M.-Djafari(2000)].
Finally, I want to refer the French reader to Snoussi’s thesis [Snoussi(2003)] as one
of the best references, to my knowledge, about this class of methods.
5.3.8
BSS using time structure
In many applications, the source signals represent temporally correlated (colored) random processes referred to as colored time signals or time series. In that
case, they may contain much more structure than white random processes. This
additional information can actually make the estimation of the BSS model possible
in cases where the basic BSS methods cannot estimate it. For that, we should make
some assumptions on the time structure of the sources that allow for their separation. These assumptions are alternatives to the assumption of non-gaussianity.
[A]- Separation by autocovariances
The simplest form of time structure is given by autocovariances, that is, covariances between the values of the signal at different time instants : Cov[xi(t), xi(t − τ)] where τ is some lag constant. If the data have time dependencies, the autocovariances are often different from zero. In addition to the autocovariances of one signal, we also need covariances between two signals : Cov[xi(t), xj(t − τ)] where i ≠ j.
All these statistics for a given time lag can be grouped together in the time-lagged
covariance matrix
C_x^τ def= E{x(t) x(t − τ)^T}    (5.13)
The key point here is that the information in a time-lagged covariance matrix C_x^τ (also called a cross-correlation matrix) can be used instead of the higher-order information
[Molgedey et Schuster(1994)]. What we do is to find a matrix B so that in addition
to making the instantaneous cross-correlation of y(t) = Bx(t) go to zero, the lagged
covariances are made zero as well :
E{y_i(t) y_j(t − τ)} = 0, for all i ≠ j and all τ    (5.14)
The motivation for this is that, for the sources s_i(t), the lagged cross-covariances are all zero due to independence. Using these lagged covariances, we get enough extra information to estimate the sources, under certain conditions such as the assumption that any two sources have different spectral shapes [Belouchrani et al.(1997a)].
No higher-order information is then needed. Using this approach, we have a simple
algorithm, called AMUSE [Tong et al.(1991)], for estimating the separating matrix
B from whitened data :
1. Whiten the data x to obtain z.
2. Compute the eigenvalue decomposition of C̄_z^τ def= (1/2)[C_τ + C_τ^T], where C_τ = E{z(t) z(t − τ)^T} is the time-lagged covariance matrix, for some lag τ.
3. The rows of the separating matrix B are given by the eigenvectors.
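To make these three steps concrete, here is a minimal Python sketch of AMUSE, assuming the whitened data z are stored as an m × N array and a single lag τ is used; the function name and interface are illustrative, not part of the original algorithm description.

```python
# Minimal AMUSE sketch (illustrative; assumes z is whitened data of shape (m, N))
import numpy as np

def amuse(z, tau=1):
    """Estimate a separating matrix B from whitened data z using one time lag."""
    m, N = z.shape
    # Time-lagged covariance C_tau = E{z(t) z(t - tau)^T}, estimated by time averaging
    C_tau = z[:, tau:] @ z[:, :-tau].T / (N - tau)
    # Symmetrized lagged covariance, as in step 2
    C_bar = 0.5 * (C_tau + C_tau.T)
    # Eigenvectors give the rows of the separating matrix B (step 3)
    _, eigvecs = np.linalg.eigh(C_bar)
    B = eigvecs.T
    return B, B @ z   # separating matrix and estimated sources
```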
An essentially similar algorithm was proposed in [Tong et al.(1991)]. An extension of the AMUSE method that improves its performance is to consider several time lags τ instead of a single one. It is then enough that the source autocovariances differ for at least one of these lags, so the choice of τ becomes a less serious problem. The principle consists in simultaneously diagonalizing all the corresponding lagged covariance matrices. The algorithm called SOBI (second-order blind identification) [Belouchrani et al.(1997a)] is based on these principles, and so is TDSEP [Ziehe et Müller(1998)].
[B]- Separation by non-stationarity of variances
If it is assumed that the sources are non-stationary, then we can divide the signals into short windows, as a piecewise decomposition of the signal, and consider the covariances in each one,
E_{t∈T_k}{y_i(t) y_j(t)}    (5.15)
where T_k = (kT, (k + 1)T].
Then, using a joint diagonalization of the covariance matrices over the different segments, we can separate the non-stationary sources [Pham et Cardoso(2001)].
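As an illustration of this idea, the following sketch uses only two segments, in which case the joint diagonalization reduces to a single generalized eigendecomposition; the full method of [Pham et Cardoso(2001)] jointly diagonalizes all the segment covariances, so this is a simplified stand-in with illustrative names.

```python
# Simplified sketch of separation by non-stationarity of variances:
# with only two segments, joint diagonalization reduces to a generalized
# eigenvalue problem (the full method jointly diagonalizes all segments).
import numpy as np
from scipy.linalg import eigh

def separate_two_segments(x):
    """x: observations of shape (m, N); returns a separating matrix B."""
    m, N = x.shape
    x1, x2 = x[:, : N // 2], x[:, N // 2 :]
    R1 = x1 @ x1.T / x1.shape[1]      # covariance on the first window
    R2 = x2 @ x2.T / x2.shape[1]      # covariance on the second window
    # Solve R2 v = lambda R1 v; the eigenvectors diagonalize both covariances
    _, V = eigh(R2, R1)
    return V.T                         # rows of B
```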
5.4 BSS of Impulsive Heavy-Tailed Sources
5.4.1 Why heavy-tailed α-stable distributions ?
The emphasis in this thesis is on a class of signals that are heavy-tailed. Heavy-tailed signals are likely to exhibit large observations and often have an impulsive nature, and it turns out that a broad class of real-life signals is heavy-tailed. The term heavy tail refers to the fact that the probability density functions of such signals have relatively large mass in the tails.
This section provides a motivation for this part of the thesis and a brief discussion of the fundamental problems in blind signal separation. This part of my work is partly motivated by the lack of a strong theoretical basis for the blind separation of heavy-tailed signals in the signal processing literature. Furthermore, the motivation for considering heavy-tailed distributions in this thesis is that many real-world signals turn out to follow heavy-tailed laws [Adler et al.(1998)].
The main difference between the non-Gaussian stable distribution and the
Gaussian distribution is that the tails of the stable density are heavier than those
of the Gaussian density. In addition, the stable distribution is very flexible as a
modeling tool in that it has a parameter α (0 < α ≤ 2), called the characteristic exponent, that controls the heaviness of its tails. A small positive value of
α indicates severe impulsiveness, while a value of α close to 2 indicates a more
Gaussian type of behavior. Stable distributions obey the Generalized Central Limit Theorem (GCLT), which states that if the (suitably normalized) sum of i.i.d. random variables, with or without finite variance, converges in distribution as the number of terms increases, then the limit distribution must be stable [Samorodnitsky et Taqqu(1994)].
Thus, non-Gaussian stable distributions arise as sums of random variables in the
same way as the Gaussian distribution. Another defining feature of the stable
distribution is the so-called stability property, which says that the sum of two
independent stable random variables with the same characteristic exponent is
again stable and has the same characteristic exponent. For these reasons, statisticians [Samorodnitsky et Taqqu(1994)], economists [Rachev(2003)], signal processing and communications engineers [Nikias et Shao(1995)], and other scientists
engaged in a variety of disciplines have embraced alpha-stable processes as the
model of choice for heavy-tailed data.
5.4.2 Existing BSS methods for heavy-tailed signals
A common characteristic property of many heavy-tailed distributions, such as the α-stable family, is the nonexistence of finite second- or higher-order moments. Most well-known source separation methods [Hyvarinen et al.(2001)] are based on second- or higher-order statistics of the observations and are therefore inadequate for handling heavy-tailed sources. In that case, fractional lower-order theory can be used for stable signal separation. Only a limited literature has been dedicated to the BSS of impulsive signals. In [Shereshevski et al.(2001)], the authors proposed the RQML algorithm, based on the idea of setting the observed signals to zero whenever they are larger (in absolute value) than some threshold K. Recall that RQML is the restricted quasi-maximum likelihood approach, introduced as an extension of Pham's popular quasi-maximum likelihood approach to the α-stable source case. Other solutions exist in the literature based on the spectral measure [Kidmose(2001)], order statistics [Shereshevski et al.(2001)] and the characteristic function [Eriksson et Koivanen(2003)].
Recently, in [Chen et Bickel(2004)], a new method based on a consistent prewhitening step was proposed. The authors use the characteristic-function-based contrast function proposed in [Kagan et al.(1973)] to solve the source separation problem and show that this approach can be consistent even when some hidden sources do not have finite second moments. However, consistent performance is guaranteed only in the following two cases : first, when at most one source component has an infinite second moment, and second, when there are only two alpha-stable sources.
In this thesis, we introduce some new methods for α-stable source separation from their observed linear mixtures using the minimum dispersion criterion
[Sahmoudi et al.(2003a)], contrast functions [Sahmoudi(2005)], normalized statistics [Sahmoudi et al.(2004a), Sahmoudi et Abed-Meraim(2004b)] and the maximum likelihood [M. Sahmoudi et al.(2005)].
5.5 Conclusion & Future Research
In this chapter, the fundamental methods of BSS have been presented. Although several limitations and assumptions impede the use of BSS methods, it seems appropriate to conjecture that these algorithms and methods are useful tools with many
potential applications where many second-order statistical methods reach their limits. Several researchers believe that these techniques will have a huge impact on
engineering methods and industrial applications.
It is interesting to note that there are many issues subject to further investigation.
– Underdetermined BSS : Having more sources than sensors is of theoretical
and practical interest.
– Noisy BSS : Much more work needs to be done to determine the effect
of noise on performance. Sparse representation and independent factorial
analysis are very promising ideas.
– Non-stationarity problem : time-frequency analysis and unsupervised classification are two promising approaches in this context.
– BSS for data mining and data warehouse : Data mining, the extraction of
hidden predictive information from large databases, is a powerful new technology with great potential to help companies focus on the most important
information in their data warehouses. The goal is to find a subset of a collection of documents relevant to a user’s information request. We believe that
the BSS model makes it possible to extend this formulation to broader unsupervised classification problems.
– Blind separation of heavy-tailed signals : some standard BSS methods cannot work in this case, and others are not mathematically justified because heavy-tailed distributions have neither finite SOS nor HOS. The goal of half of this thesis is to investigate this problem.
Chapitre 6
Minimum Dispersion Approach
6.1 Introduction
This chapter introduces a new Blind Source Separation (BSS) approach for extracting impulsive source signals from their observed mixtures. The impulsive, or heavy-tailed, signals are modeled as real-valued symmetric alpha-stable (SαS) processes characterized by infinite second- and higher-order moments. A new whitening procedure based on a normalized covariance matrix is introduced. The proposed approach uses the minimum dispersion (MD) criterion as a measure of sparseness and independence of the data. We show that the proposed method is robust, in the sense of being insensitive to possible variations in the underlying form of the sampling distribution. Algorithm derivation, discussion and simulation results are provided to illustrate the good performance of the proposed approach. In particular, the new method has been compared with three of the most popular BSS algorithms : JADE [Cardoso et Souloumiac(1993)], EASI [Cardoso et Lahed(1996)] and RQML [Pham et Garrat(1997)].
6.1.1 The failure of second and higher-order methods
From a signal processing point of view, the adoption of a stable model for signal or noise has important consequences. Second-order stationary processes have historically been the main subject of study in statistical signal processing. Second-order-based estimation techniques are commonly recognized as the natural tools to be used in the presence of Gaussian noise. Research efforts on higher-order statistics (HOS) have led to the development of improved estimation algorithms for non-Gaussian environments, but this work has been based on the assumption that second-order and higher-order statistics of the processes exist
and are finite [Nikias et Petropulu(1994)]. Important non-Gaussian impulsive processes can be efficiently modeled by heavy-tailed processes with infinite variance, for which neither the classical second-order theory nor the theory of HOS is useful [Nikias et Shao(1995)].
It has been shown repeatedly in the literature that infinite-variance processes that
appear in practice are well modeled by probability distributions with algebraic
tails, i.e., random variables for which
IP(|X| > x) ∼ c x^{−α}    (6.1)
for some fixed c, α > 0.
Algebraic-tailed r.v.s exhibit finite absolute moments for orders less than α ; i.e.,
IE|X|^p < ∞,  for p < α    (6.2)
Conversely, if p ≥ α, the absolute moments become infinite, and thus unsuitable
for statistical analysis. When α < 2, the processes present infinite variance, and
the standard second or higher-order statistics cannot be successfully applied.
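A quick numerical check of (6.1)-(6.2), assuming SciPy's levy_stable generator for SαS samples: for p < α the sample p-th absolute moment stabilizes as N grows, whereas the sample variance keeps growing. The parameter values below are arbitrary illustrations.

```python
# Numerical illustration of (6.1)-(6.2): for SaS samples with alpha < 2,
# sample moments of order p < alpha stabilize while the sample variance keeps growing.
import numpy as np
from scipy.stats import levy_stable

rng = np.random.default_rng(0)
alpha = 1.5
for N in (10**3, 10**4, 10**5):
    x = levy_stable.rvs(alpha, beta=0.0, size=N, random_state=rng)
    flom = np.mean(np.abs(x) ** 1.0)        # p = 1 < alpha: stabilizes
    var = np.mean(x ** 2)                   # p = 2 >= alpha: grows with N
    print(N, round(flom, 3), round(var, 1))
```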
6.1.2 Fractional lower-order statistics (FLOS) theory
Alternative attempts to characterize the behavior of impulsive signals have
relied on fractional lower-order statistics (FLOS) in the context of non-Gaussian
α-stable distribution (α < 2). It has been shown that FLOS give robust measures of
impulsive processes’ characteristics [Ma et Nikias(1995a)], [Nikias et Shao(1995)].
For a zero location alpha-stable r.v. X with dispersion γ, the norm of X is defined
as
||X||_α = γ  if 0 < α < 1,   and   ||X||_α = γ^{1/α}  if 1 ≤ α ≤ 2.    (6.3)
Hence, the norm ||X||_α is a scaled version of the dispersion γ. If X and Y are jointly alpha-stable, the distance between X and Y is defined as
d_α(X, Y) = ||X − Y||_α    (6.4)
Combining these equations with the fact that IE|X|^p ∝ γ^{p/α} for 0 < p < α, it is easy
to see that the p-th order moment of the difference between two alpha-stable r.v.s
is a measure of the distance dα between these two r.v.s. In addition, all fractional
lower-order moments of an alpha-stable r.v. are equivalent, i.e. p-th and q-th order
moments differ by a constant factor independent of the r.v. as long as p, q < α.
Furthermore, it was shown in [Schilder(1970)] that for 1 ≤ α ≤ 2, k.kα is a norm
in the linear space of alpha-stable processes. Our proposed blind source separation
methodology presented in this chapter uses the notion of fractional lower-order
moments to achieve robust signal reconstruction.
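As a practical illustration, the FLOM relation IE|X|^p ∝ γ^{p/α} suggests the following crude dispersion proxy; the proportionality constant is deliberately dropped (it cancels in the ratio-type and sum-type criteria used later), so this is only a sketch and not the consistent estimator of [Tsihrintzis et Nikias(1996)].

```python
# FLOM-based dispersion proxy: for a SaS r.v., IE|X|^p is proportional to
# gamma^{p/alpha} for 0 < p < alpha, so (IE|X|^p)^{alpha/p} is proportional to gamma.
# The multiplicative constant is dropped here; it cancels in MD-type criteria.
import numpy as np

def dispersion_proxy(x, alpha, p=None):
    """Return a quantity proportional to the dispersion of the samples x."""
    if p is None:
        p = alpha / 2.0                  # any 0 < p < alpha works
    flom = np.mean(np.abs(x) ** p)       # fractional lower-order moment
    return flom ** (alpha / p)
```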
6.2 Source Separation Procedure
6.2.1 Whitening by normalized covariance matrix
The first step consists of whitening the observations (orthogonalizing the mixing matrix A). For finite-variance signals, the whitening matrix W is computed as the inverse square root of the signal covariance matrix. At first glance, this
should not be applied to α-stable sources. However, we prove in the following that
a properly normalized covariance matrix converges to a finite matrix with the appropriate structure when the sample size N tends to infinity. More specifically, we
propose and prove the following results.
Theorem 6.1. Let X1 and X2 be two SαS variables of dispersions γ1 and γ2 and PDFs f1(.) and f2(.), respectively. Then, we have lim_{N→∞} ÎE|X1|² / ÎE|X2|² = γ1/γ2, where ÎE denotes the time-averaging operator ÎE[g(X)] = (1/N) Σ_{t=1}^{N} g[X(t)].
Proof
Let T be an arbitrary positive constant and 1I{|X|≤T } the indicator
function which equals 1 if |X| ≤ T and 0 otherwise. Then, due to the
ergodicity of X1 and X2 , we have
ÎE[X1² 1I_{|X1|≤T}] / ÎE[X2² 1I_{|X2|≤T}] = [ (1/N) Σ_{t=1}^{N} x1(t)² 1I_{|X1|≤T} ] / [ (1/N) Σ_{t=1}^{N} x2(t)² 1I_{|X2|≤T} ]  −→_{N→∞}  IE[X1² 1I_{|X1|≤T}] / IE[X2² 1I_{|X2|≤T}]    (6.5)
Due to the symmetry of the α-stable PDFs, the right-hand-side term can be expressed as
IE[X1² 1I_{|X1|≤T}] / IE[X2² 1I_{|X2|≤T}] = ∫_{−T}^{T} |x|² f1(x) dx / ∫_{−T}^{T} |u|² f2(u) du = ∫_{0}^{T} x² f1(x) dx / ∫_{0}^{T} u² f2(u) du    (6.6)
Using integration by parts and the fact that (1 − Φ(x)) ∼ (C_α/2) γ x^{−α} as x → ∞ for any SαS distribution function Φ, we obtain that as T → ∞, the above ratio is equivalent to
C_α γ1 ( [x^{2−α}]_{0}^{T} − 2 ∫_{0}^{T} x^{1−α} dx ) / ( C_α γ2 ( [u^{2−α}]_{0}^{T} − 2 ∫_{0}^{T} u^{1−α} du ) )  −→_{T→∞}  γ1/γ2    (6.7)
Thus, from equations (6.5), (6.6) and (6.7), the ratio ÎE[X1²] / ÎE[X2²] converges asymptotically to γ1/γ2. ¥
Theorem 6.2. Let x = As be a data vector of an α-stable process mixture and R̂ def= (1/N) Σ_{t=1}^{N} x(t) x(t)^T its sample covariance matrix. Then the normalized covariance matrix of x defined by
R̄ def= R̂ / Trace(R̂)    (6.8)
converges asymptotically to the finite matrix ADA^T, where D is the positive diagonal matrix D = diag(d1, · · ·, dm) with d_i = γ_i / ( Σ_{j=1}^{m} γ_j ||a_j||² ), where γ_i is the dispersion of the i-th source signal and ||.|| denotes the Euclidean norm.
Proof
We have, clearly,
R̄ = R̂ / Trace(R̂) = ( Σ_{i=1}^{m} ÎE[s_i(t)²] a_i a_i^T ) / ( Σ_{j=1}^{m} ÎE[s_j(t)²] ||a_j||² ) = Σ_{i=1}^{m} [ ÎE[s_i(t)²] / ( Σ_{j=1}^{m} ÎE[s_j(t)²] ||a_j||² ) ] a_i a_i^T    (6.9)
Using theorem 6.1, we see that
ÎE[s_i(t)²] / ( Σ_{j=1}^{m} ÎE[s_j(t)²] ||a_j||² )  −→_{N→∞}  d_i = γ_i / ( Σ_{j=1}^{m} γ_j ||a_j||² )    (6.10)
Then, from equations (6.9) and (6.10), R̄ −→_{N→∞} Σ_{i=1}^{m} d_i a_i a_i^T = ADA^T. ¥
Proposition 6.1. Let R̄ be the normalized covariance matrix of the considered α-stable mixture, defined above in (6.8). Then the inverse square root matrix of R̄ is a data whitening matrix.
Proof
Theorem 6.2 means that the normalized covariance matrix R̄ has the appropriate structure to compute a whitening matrix. Indeed, the whitening matrix can be obtained from the eigendecomposition of R̄ = UΣU^T as W = Σ_s^{−1/2} U_s^T, where Σ_s (resp. U_s) corresponds to the diagonal (resp. orthogonal) matrix of the m largest eigenvalues (resp. eigenvectors) of R̄. Then, we can write I = WRW^T = WADA^T W^T = (WAD^{1/2})(WAD^{1/2})^T. Recall that, without loss of generality, A can be replaced by AD^{1/2} (D being a positive diagonal matrix) because of the scaling indeterminacy. We can see that W transforms AD^{1/2} (i.e. the mixing matrix) into an orthogonal matrix. ¥
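A minimal sketch of this whitening step (Theorem 6.2 and Proposition 6.1), assuming the observations are stored as an n × N array; the function and variable names are illustrative.

```python
# Whitening by the normalized covariance matrix (Theorem 6.2 / Proposition 6.1):
# the sample covariance is divided by its trace, then W = Sigma^{-1/2} U^T.
import numpy as np

def whiten_normalized(x, m):
    """x: observations (n, N); m: number of sources. Returns W and whitened data."""
    n, N = x.shape
    R = x @ x.T / N
    R_bar = R / np.trace(R)                       # normalized covariance matrix
    eigvals, U = np.linalg.eigh(R_bar)
    idx = np.argsort(eigvals)[::-1][:m]           # m largest eigenpairs
    W = np.diag(eigvals[idx] ** -0.5) @ U[:, idx].T
    return W, W @ x
```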
6.2.2 Minimum dispersion criterion
[A]- Minimum dispersion criterion in signal processing
The minimum dispersion (MD) criterion is a common tool in linear theory of
stable processes as the dispersion of a stable r.v. plays a role analogous to the
variance. For example, the larger the dispersion of a stable distribution, the more
it spreads around its median. Hence, the minimum dispersion criterion becomes
a natural and mathematically meaningful choice as a measure of optimality in
signal processing problems based on stable models. By minimizing the error dispersion, we minimize the average magnitude of estimation errors. Furthermore, it
M. Sahmoudi © Processus Alpha-Stables pour la Séparation et l’Estimation Robustes des
Signaux non-Gaussiens et/ou non-Stationnaires
6.2 Source Separation Procedure
99
has been shown that minimizing the dispersion is also equivalent to minimizing
the probability of large estimation errors. Hence, the minimum dispersion criterion
is well justified under the stable model assumption. It is a direct generalization
of the minimum mean-squared error criterion and relatively simple to calculate
[Nikias et Shao(1995)].
Minimizing the dispersion is also equivalent to minimizing the fractional lower-order moments of estimation errors that measure the Lp distance between an
estimate and its true value, for 0 < p < α ≤ 2. This result is not surprising since
the Lp norms for p < 2 are well known for their robustness against outliers such
as those that may be described by the stable law. It is also known that all the
lower-order moments of a stable r.v. are equivalent, i.e. any two of the lower-order
moments differ by a fixed constant that is independent of the r.v. itself. A common
choice is the L1 norm, which is sometimes very convenient.
Stable signal processing based on fractional lower-order moments will inevitably introduce nonlinearity to even linear problems. The basic reason for the non-linearity
is that we have to solve linear estimation problems in Banach or metric spaces instead of Hilbert spaces. It is well known that, while the linear space generated
by a Gaussian process is a Hilbert space, the linear space of a stable process is
a Banach space when 1 ≤ α < 2 and only a metric space when 0 < α < 1
[Cambanis et Miller(1981)]. Banach or metric spaces do not have as nice properties and structures as Hilbert spaces for linear estimation problems.
[B]- Minimum dispersion criterion for BSS
Let z(t) = Bx̄(t), where B is unitary, x̄ denotes the whitened data, i.e. x̄ = Wx, and B is a separating matrix to be estimated. Let us consider the global MD criterion given by the sum of the dispersions of all entries of z, i.e.
J(B) def= Σ_{i=1}^{m} γ_{z_i}    (6.11)
where γ_{z_i} denotes the dispersion of z_i(t), the i-th entry of z(t).
In this chapter we prove that the MD criterion defines a contrast function in the
sense that the global minimization of the objective function given in (6.11) leads to
a separating solution. The p-th order moment of an α-stable r.v. and its dispersion
are related through only a constant (see property 2.7). Therefore, the MD criterion is equivalent to least lp -norm estimation where 0 < p < α. Although the most
widely used contrast functions for BSS are based on the second and fourth-order cumulants [Cichocki et Amari(2002)], we believe however that there are good reasons
to extend the class of contrast functions from cumulants to fractional moments, as
we argue next. Mutual information (MI) is usually chosen to measure the degree
of independence. Because the direct estimation of MI is very difficult, one can then
derive approximative contrast functions, often based on cumulant expansions of
the densities. However, one can approximate the Shannon entropy (that is closely
related to the MI) using the lp -norm concept ([Karvanen et Cichocki(2003)]) and
hence use it to approximate the MI. For example, in [Hyvarinen(1999)] the author
uses the lp -norm concept to approximate the MI and then to find the optimal
M. Sahmoudi © Processus Alpha-Stables pour la Séparation et l’Estimation Robustes des
Signaux non-Gaussiens et/ou non-Stationnaires
100
Minimum Dispersion Approach
contrast function for the exponential power family of densities f_p(x) = k1 exp(k2 |x|^p).
Thus we propose the MD criterion or equivalently the FLOM based criterion (2.7)
for measuring independence of alpha-stable distributed data. We should also note
here that the lp -norm is commonly used as a measure of sparseness of signals
[Karvanen et Cichocki(2003)]. This leads to the use of the MD criterion as a measure of sparseness, which has been demonstrated to be a powerful concept in BSS
[Cichocki et Amari(2002)], [Karvanen et Cichocki(2003)]. Consequently, the MD
criterion can be used as a cost function to achieve the BSS as shown by the following result.
Theorem 6.3. The minimum dispersion criterion
J(B) def= Σ_{i=1}^{m} γ_{z_i}    (6.12)
is a contrast function under orthogonality constraint for separating an instantaneous mixture of alpha-stable sources.
Proof
Note that z(t) is an orthogonal mixture of the sources and can be written as z(t) = Cs(t), with C def= BWA orthogonal. Here we prove that the MD criterion J(B) reaches its minimum value in the set of orthogonal matrices if and only if BW (W is the whitening matrix) is a separating matrix, or equivalently if and only if C is a generalized permutation matrix (i.e. a permutation matrix times a non-singular diagonal matrix). Indeed, using properties 1 and 2 of SαS processes presented in section I, one can write :
J(B) = Σ_{i=1}^{m} ( Σ_{j=1}^{m} |C_ij|^α γ_{s_j} )    (6.13)
= Σ_{j=1}^{m} ( Σ_{i=1}^{m} |C_ij|^α ) γ_{s_j}    (6.14)
= Σ_{j=1}^{m} a_j γ_{s_j}    (6.15)
with a_j def= Σ_{i=1}^{m} |C_ij|^α and C_ij being the (i, j)-th entry of C. Now, since a_j and γ_{s_j} are positive, minimizing J(B) is equivalent to minimizing all the a_j coefficients. Let us prove that the coefficients a_j satisfy a_j ≥ 1 ∀j, and a_j = 1 if and only if C is a generalized permutation matrix. Since C is unitary (which implies that |C_ij| ≤ 1) and α < 2, we have |C_ij|^α ≥ |C_ij|². Therefore a_j = Σ_{i=1}^{m} |C_ij|^α ≥ Σ_{i=1}^{m} |C_ij|² = 1. The equality holds if and only if ∀i, |C_ij|^α = |C_ij|², or equivalently if C_ij = 0 or |C_ij| = 1. C being unitary, the latter is satisfied if and only if ∀j ∃ i_j such that |C_{i_j j}| = 1 and C_ij = 0 ∀ i ≠ i_j.
¥
The proposed method requires little or no a priori knowledge of the input signals. The dispersion as well as the characteristic exponent α are estimated according to [Tsihrintzis et Nikias(1996)], where the proposed estimator is proved to
be consistent and asymptotically normal. This estimator is based on the theory of
fractional lower order moments of the SαS distributions.
6.2.3 Separation algorithm : Jacobi implementation
Theorem 6.3 proves that, under an orthogonal transform, the signal has minimum dispersion if its entries are mutually independent. The problem now is to minimize a cost function under an orthogonality constraint. Different approaches exist to solve this constrained optimization problem. We chose here to estimate B as a product of Givens rotations according to
B = Π_{#sweeps} Π_{1≤p<q≤m} Ω_pq(θ)    (6.16)
where Ω_pq(θ) is the elementary Givens rotation, defined as the orthogonal matrix whose diagonal elements are all 1 except for the two elements c = cos(θ) in rows (and columns) p and q. Likewise, all off-diagonal elements of Ω_pq(θ) are 0 except for the two elements s = sin(θ) and −s at positions (p, q) and (q, p), respectively. The minimization of J(Ω_pq(θ)) is done numerically by searching θ on a fine grid over [0, π/2]¹. The so-called MD algorithm can be summarized as in Table 6.1.
Minimum Dispersion Algorithm
Step 1. Whitening transform.
Step 2. Sweep. For all pairs 1 ≤ p < q ≤ m, do
– Compute the Givens angle 0 ≤ θ̂_pq < π/2 that maximizes the pairwise independence of z_p and z_q by minimizing the global dispersion J(Ω_pq(θ)).
– If θ̂_pq > θ̂_min (a), rotate the pair accordingly.
– If no pair has been rotated in the previous sweep, end (b). Otherwise perform another sweep.
(a) The constant θ̂_min is a threshold value that defines the minimum rotation angle that is significant in estimating B.
(b) In our simulations, we used an angle grid resolution of π/100 ; the same value is used for the threshold.
Tab. 6.1 – The principal steps of the proposed minimum dispersion (MD) algorithm.
¹ Here, we consider [0, π/2] instead of [0, π] because Ω_pq(θ + π/2) is equal to Ω_pq(θ) up to a generalized permutation matrix.
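The following sketch illustrates Table 6.1, assuming whitened data and reusing the dispersion_proxy sketch of Section 6.1.2 as a surrogate for the dispersion; the grid resolution and threshold follow the values quoted above, and the implementation details are illustrative rather than the exact code used in the simulations.

```python
# Sketch of the MD algorithm of Table 6.1: Givens sweeps over all pairs (p, q),
# each angle chosen on a grid over [0, pi/2) to minimize the sum of dispersion proxies.
# dispersion_proxy: FLOM-based proxy sketched in Section 6.1.2.
import numpy as np

def md_separation(z, alpha, n_sweeps=5, grid=np.linspace(0.0, np.pi / 2, 50),
                  theta_min=np.pi / 100):
    """z: whitened data (m, N). Returns the estimated orthogonal matrix B and outputs."""
    m, _ = z.shape
    B = np.eye(m)
    for _ in range(n_sweeps):
        rotated = False
        for p in range(m - 1):
            for q in range(p + 1, m):
                pair = z[[p, q], :]
                costs = []
                for theta in grid:
                    c, s = np.cos(theta), np.sin(theta)
                    y = np.array([[c, s], [-s, c]]) @ pair
                    costs.append(dispersion_proxy(y[0], alpha)
                                 + dispersion_proxy(y[1], alpha))
                theta_hat = grid[int(np.argmin(costs))]
                if theta_hat > theta_min:
                    rotated = True
                    c, s = np.cos(theta_hat), np.sin(theta_hat)
                    G = np.eye(m)
                    G[p, p] = G[q, q] = c
                    G[p, q], G[q, p] = s, -s
                    z = G @ z                  # rotate the current outputs
                    B = G @ B                  # accumulate the rotations
        if not rotated:
            break
    return B, z
```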
6.3 Performance Evaluation & Comparison
This section examines the statistical performances of our MD-based separation procedure. The numerical results presented below have been obtained in the
following setting. The source signals are i.i.d. impulsive standard SαS (µ = 0
and γ = 1). The number of sources is m = 3 and the number of observations is
n = 3. The statistics are evaluated over 100 Monte Carlo runs and the mixing
matrix as well as the sources are generated randomly at each run. The performance of our MD method is compared to three widely used BSS algorithms ;
JADE [Cardoso et Souloumiac(1993)], EASI [Cardoso et Lahed(1996)] and RQML
[Shereshevski et al.(2001)]. To measure the quality of source separation, we use
the generalized rejection level criterion defined below.
6.3.1 Generalized rejection level index
To evaluate the performance of the separation method, we propose to define the rejection level Iperf as the mean value of the interference signal dispersion over the desired signal dispersion. This criterion generalizes the existing one [Cichocki et Amari(2002)] based on signal powers (for SαS processes, the variance, i.e. the power, is replaced by the dispersion), which represents the mean value of the interference-to-signal ratio. If source k is the desired signal, the related generalized rejection level would be :
I_k def= γ( Σ_{l≠k} C_kl s_l ) / γ( C_kk s_k ) = Σ_{l≠k} |C_kl|^α γ_l / ( |C_kk|^α γ_k )    (6.17)
where γ(x) denotes the dispersion of a SαS RV x. Therefore, the averaged rejection
level is given by
Iperf = (1/m) Σ_{i=1}^{m} I_i = (1/m) Σ_{i=1}^{m} Σ_{j≠i} |C_ij|^α γ_j / ( |C_ii|^α γ_i ).    (6.18)
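In simulations, where the global matrix C = BWA and the source dispersions are known, (6.17)-(6.18) can be computed directly; a short sketch with illustrative names:

```python
# Computing the generalized rejection level (6.17)-(6.18) from the global matrix
# C = B W A and the source dispersions gamma (known in simulation).
import numpy as np

def rejection_level(C, gamma, alpha):
    """C: (m, m) global matrix; gamma: array of m source dispersions."""
    m = C.shape[0]
    num = np.abs(C) ** alpha * np.asarray(gamma)   # entry (i, j) = |C_ij|^alpha * gamma_j
    I = 0.0
    for i in range(m):
        I += (num[i].sum() - num[i, i]) / num[i, i]
    return I / m                                   # averaged rejection level I_perf
```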
6.3.2 Experimental results
• First experiment
In Figure 6.1, we present an example of separation of highly impulsive sources
(α = 0.5) mixed by a random 3 × 3 matrix A. It appears that the proposed
algorithm achieves very good separation quality.
• Second experiment
In Figure 6.2, the mean rejection level of the MD algorithm versus the characteristic exponent is plotted. The sample size is set to N = 1000. It appears
that the parameter α is of crucial importance as it has a major influence on the
separation performance. Two important features are observed : the mean rejection level increases when the sources are very impulsive (α close to zero) or when they are close to the Gaussian case (α close to two). In the latter case (i.e. α = 2), the source separation is not possible.
Fig. 6.1 – Extraction of 3 α-stable sources from 3 observations where α = 0.5 and N = 10000 (the figure shows the three sources, the three mixtures and the three estimated signals, amplitude versus time).
Fig. 6.2 – Generalized mean rejection level (in dB) versus α, where N = 1000 (curves for EASI, JADE, RQML and MD).
• Third experiment
In Figure 6.3, the simulation study shows that estimation errors of the characteristic exponent α of the source distribution have little influence on the performance of the algorithm.
Fig. 6.3 – Generalized mean rejection level (in dB) versus the estimation error ∆α of the characteristic exponent.
• Fourth experiment
In Figure 6.4, for our proposed MD algorithm, two different scenarios lead to similar performances. In the first scenario, we consider a mixture of three α-stable sources with the same characteristic exponent α = 1.5 ; in the second one, we wrongly assume three SαS sources with α = 1.5 while, in reality, the sources are SαS with different characteristic exponents α1 = 1.5, α2 = 1 (Cauchy pdf) and α3 = 2 (Gaussian pdf). It can be observed that the algorithm can separate the sources from their mixtures even though we deviate from the assumptions under which it is derived. Consequently, the MD algorithm is robust to possible source modelization errors.
Fig. 6.4 – Generalized mean rejection level versus sample size N, for two MD scenarios : sources with the same characteristic exponent α = 1.5, and sources with different characteristic exponents α1 = 1, α2 = 1.5 and α3 = 2.
• Fifth experiment
Figure 6.5 shows the performance obtained by each of the four BSS algorithms
as a function of the sample size N in the case of α = 1.5.
One can observe that good performances are reached by the MD algorithm for
relatively small/medium sample sizes.
This figure also demonstrates that EASI fails to separate α-stable signals and that JADE is sub-optimal in this context. This is due to the fact that EASI and JADE are not specifically designed for heavy-tailed signals.
Comparing MD and RQML, we can observe a certain performance gain in favor of the MD algorithm. This is due to the fact that, in the RQML procedure, truncating the observations created by large source signal values is not optimal, because these observations can be very informative.
Fig. 6.5 – Generalized mean rejection level versus sample size for α = 1.5 (curves for EASI, JADE, RQML and MD).
• Sixth experiment
In the sixth experiment, we consider the case where the observation is corrupted by an additive white Gaussian noise. The mean rejection level versus noise power is depicted in Figure 6.6 for α = 1.5 and N = 1000. In this experiment, the noise level σ² is varied between 0 dB and −30 dB. As can be seen, the performances degrade significantly when the noise power is high. This might be explained by the fact that the theory does not take additive noise into consideration. Improving robustness against noise is still an open problem under investigation. It can be seen from Figure 6.6, however, that the proposed MD method has reliable performance and outperforms the RQML algorithm in low or moderate noise power situations.
Fig. 6.6 – Generalized mean rejection level (in dB) versus the additive noise power for α = 1.5 (curves for EASI, JADE, RQML and MD).
6.4 Concluding Remarks
We have introduced a two-step procedure for α-stable source separation. A first generalized whitening step orthogonalizes the mixing matrix using a normalized covariance matrix of the observations. In the second step, the remaining orthogonal matrix is estimated by minimizing a global dispersion criterion. The proposed method is robust to modelization errors of the source pdf. Numerical examples are presented to illustrate the effectiveness of the proposed method, which is shown to perform better than the RQML method. Moreover, they confirm that
existing BSS methods, which are not designed specifically to handle impulsive
signals, fail to provide good separation quality.
Chapitre 7
Sub- and Super- Additivity based Contrast Functions
In this chapter, we introduce a generalization of our previous contribution.
Indeed, we provide a systematic method to construct contrast functions through
the use of sub- and super-additive functionals1 . Some practical examples of useful
contrast functions are introduced and discussed.
7.1 BSS Using Contrast Functions
In this chapter, we consider the mixture model given by x = As where A is an
unknown n × m mixing matrix, x denotes the observation vector and s represents
the source vector. The separation problem consists of finding a separating matrix B such that the components of y = Bx are independent. Note that throughout this chapter we consider BSS under the orthogonality constraint (assuming implicitly that a whitening step has already been performed). Thus, in the rest of this chapter we
suppose that B is an orthogonal matrix.
7.2 On contrast functions
The concept of contrast function for source separation was first presented in [Comon(1994)]. A contrast function for source separation is a real-valued function of the distribution of a random vector which is minimized (or maximized) when the source separation is achieved. To characterize mathematically a contrast function, we use the following definition.
¹ These arguments follow the same procedure as in [Sahmoudi et al.(2005)]. Furthermore, inspired by the proof that the minimum dispersion is a contrast function, this chapter is a direct generalization of the previous one.
Définition 7.1. A functional F is a contrast function if and only if it satisfies
the two requirements :
R1. F (Cs) ≥ F(s) for any independent random vector s and any invertible
matrix C.
R2. The equality F (Cs) = F(s) holds if and only if C = PD, where P and
D are a permutation and a diagonal matrix, respectively.
Thus, we define a contrast function, as in [Comon(1994)], as a functional of the distribution of Bx, or also of B, which attains its minimum (or maximum) when separation is achieved. Intuitively, a measure of dependence between the components of Bx would be a contrast, but there may be others. It should be noted that the construction of a contrast function is only a first step toward a separation procedure. The contrasts proposed here are theoretical functionals because they depend on the distribution of the reconstructed sources, which is unknown. To obtain a usable contrast, this distribution, or in fact certain functionals of it, must be estimated from the data. We will not consider here this problem, nor that of constructing a good algorithm for minimizing the resulting empirical contrast. It should, however, be pointed out that the ease of the estimation and of the minimization algorithm, both in terms of implementation and computational cost, should be taken into account, besides performance considerations, in assessing the final separation method. As existing examples of contrast functions, it
has been shown in [Comon(1994)] and [Cardoso(1998)] that the sum of the 4-th
order cross-cumulants of the components is a contrast function. Other contrast
functions can be found in [Moreau et Macchi(1996)], [Moreau et Pesquet(1997)],
[Moreau et Stoll(1999)], [Cardoso(1999)], [Pham(2000)], [Adib et al.(2002)].
Note that the ideas of this chapter are inspired by the projection pursuit methodology described in [Huber(1985)]. In this paper, Huber used sub- and super-additive functionals (under other additional assumptions) to define a test statistic for normality. Similarly, we use these classes of functionals to define an index of non-gaussianity. Thus, minimizing the proposed criteria may be viewed as maximizing the non-gaussianity of the observations.
Remark 7.1. Some heuristic arguments that sub- and super-additive functionals go together with non-gaussianity measures are as follows :
– The cumulants, which are widely used as measures of non-gaussianity, are additive (both sub- and super-additive) functionals.
– The exponential Shannon entropy defined by
H(x) = exp{ − ∫ log(f) f dx }    (7.1)
where f is the PDF of x, which is commonly used as a non-gaussianity measure, is super-additive. For a proof, see [Blachman(1965)].
Définition 7.2. A functional F of the distribution of a random variable X, denoted by F(X), is said to be scale equivariant if
F(aX) = |a|F(X)
(7.2)
for any real number a.
Note that if F is scale equivariant, then |F| is also scale equivariant. Hence,
we can without loss of generality assume in this work F ≥ 0.
7.3 Orthogonality constraint
Principal Component Analysis (PCA), or whitening, consists of transforming the observation vector into decorrelated outputs. However, it is well known that PCA alone is not sufficient for separating the sources. To see this, consider a square BSS model (i.e. n = m). For estimating the n × n matrix A, taking into account n scale ambiguities, we must determine n(n − 1) unknown coefficients. The second-order decorrelation constraints give n(n − 1)/2 equations, which is not sufficient for determining A. This also shows that Gaussian sources cannot be separated : they are completely characterized by their first- and second-order statistics.
It is interesting to note that second-order independence (whitening) solves the BSS problem up to an orthogonal transformation. To see this, consider the factorization B = UW of the separating matrix, where W is the spatial whitening matrix of the observations and U is an orthogonal transformation. In other words, for z = Wx, we suppose that IE{zz^T} = I without loss of generality. Now, since the outputs are independent, from
IE{yy^T} = U IE{zz^T} U^T = I,    (7.3)
we deduce that UU^T = I ; estimating the orthogonal matrix U is the second half of the BSS job.
Thus one can say that whitening solves half of the problem of BSS. Because whitening is a very simple and standard procedure, much simpler than any BSS algorithm, it is a good idea to reduce the complexity of the problem this way. The
remaining half of the parameters has to be estimated by some other method. This
fact shows that for finding the other required equations, other information must
be used such as HOS and FLOS.
Even in cases where whitening is not explicitly required, it is recommended,
since it reduces the number of free parameters and considerably increases the
performance of the methods, especially with high-dimensional data.
7.4 Sub-Additivity based Contrast Functions
Définition 7.3. A functional F of the distribution of a random variable X, denoted by F(X), is said to be sub-additive if
F(X + Y ) ≤ F(X) + F(Y )
(7.4)
for any two independent random variables X and Y .
Définition 7.4. A functional F of the distribution of a random variable X, denoted by F(X), is said to be σ-sub-additive if
F σ (X + Y ) ≤ F σ (X) + F σ (Y )
(7.5)
for any two independent random variables X and Y .
Theorem 7.1. Let suppose that F is a σ-sub-additive and scale equivariant functional and that the mixing matrix A is orthogonal. Let σ be a real number such
that σ ≥ 2. Then, the following objective function
C(B) = − Σ_{i=1}^{n} F^σ(y_i), where y = Bx    (7.6)
is a contrast function for blind separation of linear instantaneous mixtures under
the orthogonality constraint of the mixing matrix B.
Proof
Let us write C def= BA. Then, letting C_ij be the general element of C and s_j the components of s, one has
y_i = Σ_{j=1}^{m} C_ij s_j.    (7.7)
Hence, using the scale equivariance and the sub-additivity properties of F, we have
F^σ( Σ_{j} C_ij s_j ) ≤ ( Σ_{j=1}^{m} |C_ij| F(s_j) )^σ.    (7.8)
Write Σ_{j=1}^{m} |C_ij| F(s_j) = ( Σ_{j=1}^{m} F(s_j) ) ( Σ_{j=1}^{m} [F(s_j) / Σ_{j=1}^{m} F(s_j)] |C_ij| ), and use the convexity of the function x ↦ x^σ for a real number σ ≥ 2 ; then we have :
F^σ(y_i) = F^σ( Σ_{j} C_ij s_j )    (7.9)
≤ ( Σ_{j=1}^{m} F(s_j) )^σ Σ_{j=1}^{m} [F(s_j) / Σ_{j=1}^{m} F(s_j)] |C_ij|^σ    (7.10)
≤ ( Σ_{j=1}^{m} F(s_j) )^{σ−1} Σ_{j=1}^{m} |C_ij|^σ F(s_j)    (7.11)
≤ Υ Σ_{j=1}^{m} |C_ij|² F(s_j),    (7.12)
where Υ is the constant quantity ( Σ_{j=1}^{m} F(s_j) )^{σ−1}. Summing the above inequalities and using the orthogonality constraint |C_ij|² ≤ Σ_{i=1}^{n} |C_ij|² = 1, one gets
Σ_{i=1}^{n} F^σ(y_i) ≤ Υ Σ_{i=1}^{n} Σ_{j=1}^{m} |C_ij|² F(s_j) = Υ Σ_{j=1}^{m} ( Σ_{i=1}^{n} |C_ij|² ) F(s_j) = Υ Σ_{j=1}^{m} F(s_j).    (7.13)
Clearly the equality is attained if C is a generalized² permutation matrix. This proves that C(B) = Σ_{i=1}^{n} F^σ(y_i) is a contrast function.
¥
Thus one needs only to find a sub-additive scale equivariant functional F.
7.4.1 Lp-norm contrast functions ; p ≥ 1
Let F(Y) def= ||Y||_p = (IE|Y|^p)^{1/p} = ( ∫ |y|^p f_y(y) dy )^{1/p}, the L_p norm of the random variable Y, where f_y(.) denotes the density function of Y.
Note that by the second and third axioms of a norm definition, F(Y) = ||Y||_p is sub-additive and scale equivariant. Thus, the L_p-norm criterion
C_p(B) = Σ_{i=1}^{n} ||y_i||_p^σ,  with σ ≥ 2    (7.14)
is a contrast function that can separate sub- and super-Gaussian sources. Indeed, it is worth emphasizing that the existence of fractional lower-order moments implies that the L_p-norm contrast function can separate heavy-tailed α-stable signals by choosing 1 ≤ p < α. For example, one can choose σ = 2p.
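An empirical version of (7.14) can be sketched as follows, with the Lp norm replaced by a sample moment; the choice p = 1 and σ = 2p mirrors the suggestions above, and the names are illustrative.

```python
# Empirical version of the Lp-norm contrast (7.14): sum over outputs of
# ||y_i||_p^sigma, with the Lp norm estimated by a sample moment.
import numpy as np

def lp_contrast(y, p=1.0, sigma=2.0):
    """y: separated outputs of shape (n, N); returns the empirical criterion (7.14)."""
    lp_norms = np.mean(np.abs(y) ** p, axis=1) ** (1.0 / p)   # ||y_i||_p estimates
    return np.sum(lp_norms ** sigma)
```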
7.4.2 Alpha-stable scale contrast function
Let us consider a mixture of α-stable sources with the same characteristic exponent α and dispersion γ. The scale parameter of an alpha-stable distribution is defined by S = γ^{1/α}. We recall that S(aY) = γ^{1/α}(aY) = a γ^{1/α}(Y) = a S(Y). Then, the α-stable scale functional F(Y) def= γ^{1/α} is scale equivariant. Note that the scale functional is also sub-additive. To prove this, let us consider two independent r.v.s X and Y ; then we have
S(X + Y) = (γ_{X+Y})^{1/α}    (7.15)
= (γ_X + γ_Y)^{1/α}    (7.16)
≤ (γ_X)^{1/α} + (γ_Y)^{1/α}.    (7.17)
The last inequality, for α ≥ 1, follows from the fact that for non-negative numbers u, v, r, with r ≤ 1, one has (u + v)^r ≤ u^r + v^r, because
(u + v)^r − u^r = ∫_{u}^{u+v} r t^{r−1} dt = ∫_{0}^{v} r (t + u)^{r−1} dt ≤ ∫_{0}^{v} r t^{r−1} dt = v^r.
² We mean by generalized permutation matrix any matrix DP, where P is a permutation matrix and D is a diagonal matrix.
From theorem 7.1, the sum of the scales of all the outputs of the BSS model,
C(B) = Σ_{i=1}^{n} S_{y_i}^σ,  with σ ≥ 2,    (7.18)
defines a contrast function. Thus, this is another contrast function that can separate linear alpha-stable mixtures. In the algorithm derivation step, one can estimate S_{y_i} using one of the existing methods to estimate the dispersion γ_{y_i}.
Remark 7.2. The Lp-norm contrast function can separate a mixture of heavy-tailed and non-heavy-tailed sources. This robustness property follows from the fact that the fractional lower-order statistics needed for the empirical computation are always defined for any r.v., whose distribution need not be alpha-stable ; in contrast, the alpha-stable scale contrast function is restricted to sources with alpha-stable distributions.
7.5 Super-Additivity based Contrast Functions
Définition 7.5. A functional G of the distribution of a random variable X, denoted
by G(X), is said to be super-additive if
G(X + Y ) ≥ G(X) + G(Y )
(7.19)
for any two independent random variables X and Y .
Définition 7.6. A functional G of the distribution of a random variable X, denoted
by G(X), is said to be σ-super-additive if
G σ (X + Y ) ≥ G σ (X) + G σ (Y )
(7.20)
for any two independent random variables X and Y .
Theorem 7.2. Suppose that G is a σ-super-additive and scale equivariant functional, that the mixing matrix A is orthogonal and that σ is a real number such
that σ < 2. Then, the following objective functions
C(B) = − Σ_{i=1}^{n} G^σ(y_i), where y = Bx    (7.21)
are contrast functions for blind separation of linear instantaneous mixtures under
the orthogonality constraint of the mixing matrix B.
Proof
– To prove the first contrast function requirement, remember that, with the same notations as in the proof of theorem 7.1, one has y_i = Σ_{j=1}^{m} C_ij s_j. Using the σ-super-additivity and the scale equivariance properties of the functional G, we have
−G^σ(y_i) ≤ − Σ_{j=1}^{m} G^σ(C_ij s_j)    (7.22)
≤ − Σ_{j=1}^{m} |C_ij|^σ G^σ(s_j)    (7.23)
Summing this quantity over all the output components and using the fact that |C_ij|^σ ≥ |C_ij|² since σ < 2, and |C_ij|² ≤ Σ_{i=1}^{n} |C_ij|² = 1 due to the orthogonality constraint, one gets
− Σ_{i=1}^{n} G^σ(y_i) ≤ − Σ_{i=1}^{n} Σ_{j=1}^{m} |C_ij|^σ G^σ(s_j)    (7.25)
≤ − Σ_{j=1}^{m} ( Σ_{i=1}^{n} |C_ij|² ) G^σ(s_j)    (7.26)
= − Σ_{j=1}^{m} G^σ(s_j)    (7.27)
So
C(y) = C(Cs) ≤ C(s)    (7.28)
Thus, the requirement R1 is fulfilled.
– Finally, C(Cs) = C(s), or equivalently,
C(y) = − Σ_{i=1}^{n} G^σ( Σ_{j=1}^{m} C_ij s_j ) = − Σ_{j=1}^{n} G^σ(s_j)    (7.29)
requires that, ∀i, Σ_{j=1}^{m} C_ij s_j = s_j. It implies that the column C_{:j} has exactly one nonzero component C_{i_j j} = ±1. Since C is orthogonal, it means that C = DP, where D denotes a diagonal matrix with entries ±1, and P the permutation matrix associated to the permutation i(1), · · ·, i(n).
Clearly the equality in (7.28) is attained if C is a permutation matrix. This proves that C(B) = − Σ_{i=1}^{n} G^σ(y_i) is a contrast function.
¥
7.5.1 Dispersion contrast function
Here, we consider a linear mixture of α-stable signals with the same characteristic exponent α, dispersion γ and scale functional S = γ^{1/α} considered above.
We verified above that S is scale equivariant. It is also easy to see that S is α-additive, i.e. both α-sub- and α-super-additive :
S^α(X + Y) = ( γ^{1/α}(X + Y) )^α = γ(X) + γ(Y) = S^α(X) + S^α(Y)    (7.30)
Then, from theorem 7.2 the objective function
C(B) = Σ_{i=1}^{n} S^α(y_i) = Σ_{i=1}^{n} γ_{y_i}    (7.31)
is a contrast function for alpha-stable source separation.
Thus, we give another proof that the global minimum dispersion criterion, which we introduced in [Sahmoudi et al.(2003a)], is a contrast function.
7.6 Jacobi-Gradient Algorithm for Prewhitened BSS
As presented in the previous chapter, every orthogonal matrix can be parameterized in terms of Givens rotation angles, each of which defines a rotation in a
single plane of the high-dimensional vector space. Then, these individual rotations
can be cascaded to span the whole set of rotation matrices. Every rotation matrix
has a unique set of Givens rotation angles that characterize it. In n-dimensions, a
Givens rotation matrix in the plane formed by the i-th and j-th axes is denoted by
Ωij , and is given as was presented in [Sahmoudi et al.(2005)]. A rotation matrix is
then formed from these sparse matrices according to
B = Π_{p=1}^{m−1} Π_{q=p+1}^{m} Ω_pq    (7.32)
The multiplication order can always be from the left or from the right. It is not
crucial to the generality of this formula as long as we maintain the same order
when taking the derivative of the matrix with respect to a rotation angle.
[A]- Optimization & algorithm
Our aim is to solve the previously mentioned constrained optimization problem, which becomes unconstrained if Givens angles are used. Let θ_kl, k = 1, · · ·, m − 1, l = 1, · · ·, m, be the Givens rotation angles that form our parameter vector Θ. To derive a simple and fast algorithm, we propose here to combine the Jacobi-like decomposition into Givens rotations and the gradient algorithm, using a numerical computation for searching θ. The so-called Jacobi-Gradient algorithm can be summarized as follows :
Jacobi-Gradient Algorithm
Step 1. Initialize the Givens angles randomly.
Step 2. Estimate robustly from the data, especially if the sources are in a noisy environment, the source statistics used in the contrast function.
Step 3. Calculate the gradient of the cost function with respect to the Givens angles, ∂C(B)/∂θ_kl.
Step 4. Update the Givens angles using gradient ascent : θ(k + 1) = θ(k) + η ∂C(B)/∂θ.
Step 5. Go back to step 3 and continue until convergence.
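A sketch of these steps is given below; since the contrast is left generic, the gradient in step 3 is approximated by finite differences (the text above indeed mentions a numerical computation for θ), and the update is written as a descent on an empirical cost to be minimized. Names, step size and iteration count are illustrative assumptions.

```python
# Sketch of the Jacobi-Gradient steps above, with a finite-difference gradient of a
# generic empirical contrast 'cost(B @ z)'; the analytic gradient of a specific
# contrast can be substituted in step 3.
import numpy as np

def givens_product(thetas, m):
    """Build B as the ordered product of Givens rotations (7.32)."""
    B = np.eye(m)
    k = 0
    for p in range(m - 1):
        for q in range(p + 1, m):
            G = np.eye(m)
            c, s = np.cos(thetas[k]), np.sin(thetas[k])
            G[p, p] = G[q, q] = c
            G[p, q], G[q, p] = s, -s
            B = B @ G
            k += 1
    return B

def jacobi_gradient(z, cost, eta=0.01, n_iter=200, eps=1e-4, seed=0):
    """z: whitened data (m, N); cost: callable evaluated on the outputs y = B @ z."""
    m = z.shape[0]
    rng = np.random.default_rng(seed)
    thetas = rng.uniform(0, np.pi / 2, size=m * (m - 1) // 2)   # step 1
    for _ in range(n_iter):                                      # steps 3-5
        grad = np.zeros_like(thetas)
        c0 = cost(givens_product(thetas, m) @ z)
        for k in range(len(thetas)):
            t = thetas.copy()
            t[k] += eps
            grad[k] = (cost(givens_product(t, m) @ z) - c0) / eps
        thetas -= eta * grad                                     # descent step
    return givens_product(thetas, m)
```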
[B]- Complexity
A key concern in many adaptive algorithms is the computational complexity. It is clear that if the multiplications in (7.32) are performed from the left, the first output is only affected by the Givens angles with indices θ1q, q = 2, · · ·, m, the second is affected by the angles θ1q, q = 2, · · ·, m, and θ2q, q = 3, · · ·, m, and so on. Thus, if we wish to extract the m source components, we only need to adapt the angles θij, i = 1, · · ·, m, j = i + 1, · · ·, m, which makes a total of m² − m(m + 1)/2 parameters, i.e. fewer than the m² parameters required in many Jacobi-like algorithms. But then, we will have to evaluate the sin and cos of all these parameters once. In addition, the necessary matrix-vector multiplications in the algorithm will be performed at each iteration, which amounts to O(m²).
7.7 Concluding Remarks
In this chapter some robust contrast functions have been introduced. A practical contrast function was derived for application to heavy-tailed sources.
Coupling Jacobi and Gradient optimization techniques, a nice implementation
was proposed for prewhitened BSS methods.
We note that this work was developed recently and, due to time limitations, we cannot present experimental results yet. We plan to extend this work using some results from the robust statistics of sub- and super-additive functionals of heavy-tailed random variables [Huber(1981)]. A performance analysis will also be carried out.
Chapitre 8
Normalized HOS-based Approaches
This chapter introduces a new approach for the blind separation (BS) of heavy
tailed signals that can be modeled by real-valued symmetric α-stable (SαS) processes. As the second and higher order moments of the latter are infinite, we propose to use normalized statistics of the observation to achieve the BS of the sources.
More precisely, we show that the considered normalized statistics are convergent
(i.e., take finite values) and have the appropriate structure that allows for the use
of standard tensorial BS as well as non-linear decorrelation techniques based on
second and higher order cumulants.
8.1 Introduction
By the generalized central limit theorem, the α-stable laws are the only class
of distributions that can be the limiting distribution for sums of i.i.d. random variables [Samorodnitsky et Taqqu(1994)]. Therefore, many signals are impulsive, either in nature or after certain pre-processing (e.g. a wavelet transform), and can be modeled as stable processes [Nikias et Shao(1995)], [Cappé et al.(2002)].
Unlike most statistical models, the α-stable distributions except the Gaussian have
infinite second and higher order moments. Consequently, standard blind source separation (BSS) methods would be inadequate in this case as most of them are
based on second or higher order statistics [Cichocki et Amari(2002)].
In this chapter, we propose a new approach for the BS of heavy tailed sources
using normalized statistics (NS). It is first shown that suitably normalized second- and fourth-order cumulants exist and have the appropriate structure for the BSS. This is a result similar to those of [Swami et Sadler(1998)] in the stable ARMA context. Then, for extracting α-stable source signals from their observed mixtures, one can use any standard procedure based on second- or fourth-order cumulants.
This BSS method has several advantages over the existing ones that are discussed
in the sequel. Simulation-based comparisons with the minimum dispersion (MD)
criterion based method in [Sahmoudi et al.(2005)] are also provided.
8.2 Normalized Statistics of Heavy-Tailed Mixtures
8.2.1 Normalized moments
Thanks to the algebraic tail-behavior, we demonstrate here that the ratio of
the k-th moments of two random SαS variables with α ≠ 2 converges to a finite
value (even though the moments themselves are infinite). More precisely, we have
the following theorem :
Theorem 8.1. Let X1 and X2 be two SαS variables of dispersions γ1 and γ2 and
PDFs f1 (.) and f2 (.), respectively. Then, for k ≥ α, we have
IE(|X1|^k) / IE(|X2|^k) def= lim_{T→∞} [ ∫_{−T}^{T} |x|^k f1(x) dx / ∫_{−T}^{T} |u|^k f2(u) du ] = γ1/γ2    (8.1)
Proof
Let R_k represent the above ratio ; then, due to the symmetric PDFs of X1 and X2, we have
R_k def= ∫_{−T}^{T} |x|^k f1(x) dx / ∫_{−T}^{T} |u|^k f2(u) du = ∫_{0}^{T} x^k f1(x) dx / ∫_{0}^{T} u^k f2(u) du    (8.2)
Using integration by parts, we get
R_k = ( [−x^k (1 − Φ1(x))]_{0}^{T} + k ∫_{0}^{T} x^{k−1} (1 − Φ1(x)) dx ) / ( [−u^k (1 − Φ2(u))]_{0}^{T} + k ∫_{0}^{T} u^{k−1} (1 − Φ2(u)) du )    (8.3)
where Φ(.) denotes the cumulative distribution function of the considered PDF. From the heavy-tail property (see chapter 2), we can observe that for any SαS cumulative function Φ, we have (1 − Φ(x)) ∼ (C_α/2) γ x^{−α} as x → ∞. Then, as T → ∞, R_k is equivalent to :
R_k ∼ C_α γ1 ( [−x^{k−α}]_{0}^{T} + k ∫_{0}^{T} x^{k−1−α} dx ) / ( C_α γ2 ( [−u^{k−α}]_{0}^{T} + k ∫_{0}^{T} u^{k−1−α} du ) )  −→  γ1/γ2
¥
Using a similar proof, one can demonstrate that the ratio of the square of the k-th
moment to the 2k-th moment of a random SαS variable (α ≠ 2) converges to zero
for k > α. More precisely, we have the following theorem :
Theorem 8.2. Let X be a SαS variable of dispersion γ and PDF f (.). Then, for
k > α, we have :
$$\frac{(\mathbb{E}|X|^{k})^{2}}{\mathbb{E}|X|^{2k}} \;\stackrel{\triangle}{=}\; \lim_{T\to\infty}\,\frac{\left(\int_{-T}^{T}|x|^{k}f(x)\,dx\right)^{2}}{\int_{-T}^{T}|x|^{2k}f(x)\,dx} \;=\; 0 \qquad (8.4)$$
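As a quick sanity check of Theorems 8.1 and 8.2, the short numerical sketch below (illustrative only, not part of the thesis) evaluates the truncated absolute moments by numerical integration in the Cauchy case α = 1, where the SαS density has the closed form f(x) = γ/(π(x² + γ²)) and the dispersion coincides with the scale γ; the ratio of the k-th truncated moments approaches γ₁/γ₂ while the normalized quantity of (8.4) shrinks toward zero.

```python
# Illustrative numerical check of Theorems 8.1 and 8.2 (not from the thesis),
# in the Cauchy case alpha = 1, where f_gamma(x) = gamma / (pi * (x^2 + gamma^2))
# and the dispersion coincides with the scale gamma.
import numpy as np

def truncated_abs_moment(gamma, k, T, n=2_000_001):
    """Rectangle-rule approximation of int_{-T}^{T} |x|^k f_gamma(x) dx."""
    x = np.linspace(-T, T, n)
    f = gamma / (np.pi * (x ** 2 + gamma ** 2))
    return np.sum(np.abs(x) ** k * f) * (x[1] - x[0])

g1, g2, k = 2.0, 0.5, 2          # k >= alpha = 1, so the moments themselves diverge
for T in (1e2, 1e4, 1e6):
    ratio = truncated_abs_moment(g1, k, T) / truncated_abs_moment(g2, k, T)
    q = truncated_abs_moment(g1, k, T) ** 2 / truncated_abs_moment(g1, 2 * k, T)
    print(f"T = {T:8.0f}   ratio -> {ratio:.4f} (limit {g1 / g2})   (8.4) quantity -> {q:.3g}")
```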
8.2.2 Normalized second and fourth order cumulants
Using the above results, we can now establish that the normalized covariance matrix of the mixture signal converges to a finite-valued matrix with the desired algebraic structure. We have the following result:
Theorem 8.3. Let x be an SαS vector given by x = As (s being a vector of
SαS independent random variables). Then the normalized covariance matrix of x
satisfies :
$$R(i,j) \;\stackrel{\triangle}{=}\; \frac{\mathrm{Cum}[x(i),x(j)]}{\sum_{k=1}^{n}\mathrm{Cum}[x(k),x(k)]} \;=\; \sum_{k=1}^{m} d_k\,a_k(i)\,a_k(j)$$
or equivalently: $R = ADA^{T}$ where $D = \mathrm{diag}(d_1,\cdots,d_m)$ and
$$d_i = \frac{\gamma_i}{\sum_{j=1}^{m}\gamma_j\,\|a_j\|^{2}}\,,$$
$a_j$ being the j-th column vector of A.
Similarly, the normalized quadri-covariance tensor [Cardoso(1991)] of the mixture signal converges to a finite valued tensor with the desired algebraic structure.
We have the following result :
Theorem 8.4. Let x be an SαS vector given by x = As (s being a vector of SαS
independent random variables). Then the normalized quadri-covariance tensor of
x satisfies :
$$Q(i,j,k,l) \;\stackrel{\triangle}{=}\; \frac{\mathrm{Cum}[x(i),x(j),x(k),x(l)]}{\sum_{r=1}^{n}\mathrm{Cum}[x(r),x(r),x(r),x(r)]} \;=\; \sum_{r=1}^{m}\kappa_r\,a_r(i)\,a_r(j)\,a_r(k)\,a_r(l)$$
where
$$\kappa_i = \frac{\gamma_i}{\sum_{j=1}^{m}\gamma_j\,\|a_j\|^{4}}\,.$$
8.3 Normalized Tensorial BSS Methods
8.3.1 Separation algorithms
Thanks to theorems 8.3 and 8.4, we can now use existing BSS methods based
on 2nd and 4th order cumulants, e.g. [Comon(1994)] for the ICA algorithm and
[Cardoso et Souloumiac(1993)] for the JADE algorithm. In this work, we have applied JADE to the normalized 2nd and 4th order cumulants of the observations. The so-called Robust-JADE¹ algorithm is summarized in Table 8.1.
Robust-JADE Algorithm
Step 1. Compute a whitening matrix Ŵ from the normalized sample covariance R̂x (estimated as the standard sample covariance matrix divided by its trace).
Step 2. Compute the most significant eigenpairs {λ̂r, M̂r ; 1 ≤ r ≤ m} from the normalized sample 4th-order cumulants of the whitened process z(t) ≜ Ŵx(t) (see [Cardoso et Souloumiac(1993)] for more details about the JADE algorithm).
Step 3. Jointly diagonalize the set {λ̂r M̂r ; 1 ≤ r ≤ m} by a unitary matrix Û.

Tab. 8.1 – The principal steps of the proposed Robust-JADE algorithm.
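As an illustration of Step 1, the sketch below (assuming real-valued data and NumPy; it is not the thesis implementation) computes a whitening matrix from the normalized sample covariance, i.e. the sample covariance divided by its trace, which remains finite for heavy-tailed observations.

```python
# Minimal sketch of Step 1 of Robust-JADE: whitening from the normalized
# sample covariance (sample covariance divided by its trace).
import numpy as np

def robust_whitening(x, m):
    """x: observations of shape (n, T); m: number of sources.
    Returns the whitening matrix W (m x n) and the whitened data z = W x."""
    T = x.shape[1]
    R = (x @ x.T) / T                      # standard sample covariance
    Rn = R / np.trace(R)                   # normalized sample covariance
    eigval, eigvec = np.linalg.eigh(Rn)
    idx = np.argsort(eigval)[::-1][:m]     # keep the m dominant eigenpairs
    W = np.diag(eigval[idx] ** -0.5) @ eigvec[:, idx].T
    return W, W @ x
```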
We provide here some remarks about the above separation method and discuss
certain advantages of the use of normalized statistics.
– Based on theorem 8.2, the normalized 4-th order cumulants are asymptotically equal to the normalized 4-th order moments of the SαS source mixture (recall that for a real-valued zero-mean random variable x, we have $\mathrm{cum}(x,x,x,x) = \mathbb{E}(x^4) - 3(\mathbb{E}(x^2))^2$). In other words, for SαS sources, one can replace the 4-th order cumulants by the 4-th order moments of the mixture signal.
– One major advantage of the proposed method compared to the FLOM-based methods is that no a priori knowledge or pre-estimation of the source PDF parameters (in particular, the characteristic exponent α) is required. Consequently, the normalized-statistics based method is robust to modeling errors with respect to the source PDF.
– In the case where the sources are non-impulsive, the proposed method coincides with the standard one (in our case, with the JADE method). Indeed,
because of the scaling indeterminacy, the normalization would have no effect
in this case.
– Another advantage of the NS-based method is that it can easily be extended to the case where the sources are of different types, i.e., sources with different characteristic exponents or non-impulsive sources in the presence of impulsive ones. That can be done, for example, by using the above NS-based method in conjunction with a deflation technique [Adib et al.(2002)].
¹ In fact, Robust-JADE is the same algorithm as JADE up to some multiplicative constants, which have no effect on the BSS. We refer to the resulting algorithm as Robust-JADE to express its validity for heavy-tailed sources.
Indeed, in that case, one can prove that the normalized statistics coincide with those of the mixture of the ‘most impulsive’ sources only (i.e. the ones with the smallest characteristic exponent), which can be estimated first and then removed (by deflation) to allow the estimation and separation of the other sources. This point is still under investigation and will be presented in detail in future work.
– In this chapter, we have established only the convergence of the ‘exact’ normalized statistics (expressed by the mathematical expectation). In fact, one can prove, along the same lines as [Swami et Sadler(1998)], that the sample estimates of the second and fourth order cumulants converge in probability to the exact normalized statistics given by theorems 8.3 and 8.4.
8.3.2 Performance evaluation & comparison
This section examines the statistical performance of the separation procedure. The numerical results presented below have been obtained in the following setting. The source signals are i.i.d. impulsive symmetric standard α-stable (β = 0, µ = 0 and γ = 1).
The number of sources is m = 3 and the number of observations is n = 4.
The statistics are evaluated over 100 Monte-Carlo runs and the mixing matrix is
generated randomly at each run.
• Performance index
To measure the quality of source separation, we use the generalized rejection level criterion defined as follows: if source k is the desired signal, the related generalized rejection level is:
$$I_k \;\stackrel{\mathrm{def}}{=}\; \frac{\gamma\!\left(\sum_{l\neq k} C_{kl}\,s_l\right)}{\gamma(C_{kk}\,s_k)} \;=\; \frac{\sum_{l\neq k}|C_{kl}|^{\alpha}\gamma_l}{|C_{kk}|^{\alpha}\gamma_k} \qquad (8.5)$$
where $\gamma(x)$ (resp. $\gamma_l$) denotes the dispersion of an SαS random variable x (resp. source $s_l$) and $C \stackrel{\mathrm{def}}{=} \hat{A}^{\#}A$. Therefore, the averaged rejection level is given by
$$I_{\mathrm{perf}} = \frac{1}{m}\sum_{i=1}^{m} I_i = \frac{1}{m}\sum_{i=1}^{m}\sum_{j\neq i}\frac{|C_{ij}|^{\alpha}\gamma_j}{|C_{ii}|^{\alpha}\gamma_i}\,.$$
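For reference, a small helper (illustrative, not the thesis code) that evaluates this averaged rejection level from the global matrix C, the characteristic exponent α and the vector of source dispersions:

```python
# Illustrative helper for the averaged generalized rejection level I_perf.
import numpy as np

def generalized_rejection_level(C, alpha, gamma):
    num = np.abs(np.asarray(C)) ** alpha * np.asarray(gamma, dtype=float)[None, :]  # |C_ij|^alpha * gamma_j
    diag = np.diag(num)                                  # |C_ii|^alpha * gamma_i
    I = (num.sum(axis=1) - diag) / diag                  # I_i = sum_{j != i} ... / ...
    return I.mean()                                      # I_perf = (1/m) * sum_i I_i
```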
The performance of the NS-based method (referred to as Robust-JADE) is compared with that of the MD method introduced in [Sahmoudi et al.(2005)].
• First experiment
Figure 8.1 presents the generalized mean rejection level versus the additive Gaussian noise power (N = 1000 and α = 1.5). We observe a certain performance gain in favor of the minimum dispersion (MD) algorithm.
Fig. 8.1: Generalized mean rejection level (in dB) versus the noise power (in dB), for the Robust-JADE and MD algorithms.
• Second experiment
Figure 8.2 presents the generalized mean rejection level versus the sample size (α = 1.5 and the mixture is noise-free). We observe a certain performance gain in favor of the Robust-JADE algorithm.
Fig. 8.2: Generalized mean rejection level (in dB) versus the sample size N, for the MD and Robust-JADE algorithms.
8.4 Normalized Non-linear Decorrelation BSS Methods
In this section, we focus on the use of normalized statistics (NS) of heavy-tailed sources for the BSS problem. In [Sahmoudi et al.(2004a)], the NS have been introduced for alpha-stable sources to justify the use of algebraic separation algorithms (JADE, SOBI, etc.) for achieving the BSS in the heavy-tailed case. Here, we propose to use the NS to robustify the class of non-linear decorrelation algorithms such as the EASI algorithm [Cardoso et Lahed(1996)]. Algorithm derivation, discussion and simulation results are provided to illustrate the usefulness of NS in that context. The new method is compared with two of the most popular BSS algorithms: EASI and the quasi maximum-likelihood algorithm [Pham et Garrat(1997)].
To deal with the particular BSS problem of heavy-tailed data, we proposed in [Sahmoudi et al.(2004a)] to use normalized second and higher order statistics, which are shown to take finite values and to have the appropriate structure on which the BSS can be based. In this section, we propose to use the normalized statistics to robustify, and adapt to the impulsive source case, the class of BSS algorithms using a composite (second and higher order) criterion. In particular, the EASI (equivariant adaptive separation via independence) algorithm proposed by Cardoso and Lahed in [Cardoso et Lahed(1996)] and its batch version IBSS [Belouchrani et al.(1997b)] have attracted a lot of interest in the Independent Component Analysis community. However, we show by both analytical studies and computer experiments that this algorithm fails to separate heavy-tailed sources such as alpha-stable signals, and divergence behaviors may be observed. We then introduce a robust EASI criterion based on the normalized statistics, which is shown to be effective for the BSS problem in the considered context. Algorithmic details and simulation results related to the iterative implementation, referred to as the Robust-EASI algorithm, are provided and discussed in this section.
A broad and increasingly important class of non-Gaussian phenomena encountered
in practice can be characterized as impulsive [Adler et al.(1998)]. It is for this type
of signals and noise that the heavy-tailed distributions provide a useful theoretical
tool. In this section we use the heavy-tailed behavior characterization of α-stable
distributions to achieve the normalization of high-order statistics of the considered
linear mixture. Let us recall this last property :
Property 8.1 : Heavy-tailed asymptotic behavior
Let X ∼ SαS be an α-stable r.v. with α < 2. Then:
$$P(X > x) \;\sim\; \gamma\,C_\alpha\,x^{-\alpha} \quad \text{as } x\to\infty$$
where $C_\alpha$ is a positive constant depending only on α.
Thus, α-stable distributions have inverse power, i.e. algebraic tails. In contrast,
the Gaussian distribution has exponential tails. This proves that the tails of stable
laws are much thicker than those of the Gaussian distribution. And the smaller
the value of α is, the thicker the tails. An important consequence of property 8.1
is the non-existence of the second and higher order moments of stable distributions,
except for the special case α = 2. However, thanks to the heavy-tailed behavior of
the sources s, the normalized covariance and fourth-order cumulants exist for x =
As. More precisely, we have established in [Sahmoudi et al.(2004a)] the following
result :
Theorem 8.5 : Normalized statistics of heavy-tailed mixtures
1. Let X be a heavy-tail distributed r.v. with index α. Then, for k > α, we have:
$$\lim_{T\to\infty}\frac{\big(\hat{\mathbb{E}}|X|^{k}\big)^{2}}{\hat{\mathbb{E}}|X|^{2k}} = 0$$
where $\hat{\mathbb{E}}$ denotes the time-averaging operator $\hat{\mathbb{E}}[g(X)] = \frac{1}{T}\sum_{t=1}^{T} g[X(t)]$.
2. Let $\hat{R}$ be the sample covariance matrix of the mixture signal in (9.6):
$$\hat{R} \;\stackrel{\mathrm{def}}{=}\; \frac{1}{T}\sum_{t=1}^{T} x(t)\,x(t)^{*}$$
Then, the normalized sample covariance matrix $\frac{\hat{R}}{\mathrm{Trace}(\hat{R})}$ converges when $T\to\infty$ to a finite-valued matrix of the form $ADA^{*}$, with D a positive diagonal matrix.
3. Let $\hat{\mathrm{Cum}}[x(i),x(j),x(k),x(l)]$ be the sample quadri-covariance tensor. Then the normalized sample quadri-covariance tensor
$$\frac{\hat{\mathrm{Cum}}[x(i),x(j),x(k),x(l)]}{\sum_{r=1}^{n}\hat{\mathrm{Cum}}[x(r),x(r),x(r),x(r)]}$$
of the mixture signal converges to a finite-valued tensor.
The consequence of this result is that many of the algorithms for source separation can be modified to be applicable to heavy-tailed sources.
8.4.1 Robust composite criterion for source separation
[A]- EASI family criterion
It was shown in [Cardoso et Lahed(1996), Belouchrani et al.(1997b)] that a general composite criterion for blind source separation can be defined as
$$C_g(B) = \mathbb{E}\{\,zz^{*} - I + [\,g(z)z^{*} - z\,g(z)^{*}\,]\,\}, \quad \text{with } z = Bx \qquad (8.6)$$
where $\mathbb{E}$ denotes the mathematical expectation, I is the identity matrix and $z^{*}$ denotes the (conjugate) transpose of the (complex) vector z. The non-linear function g is chosen such that, if z is a random vector with i.i.d. components, then $C_g(B)$ is the null matrix in the noiseless case:
$$B \text{ is a separating matrix} \;\Rightarrow\; C_g(B) = 0 \qquad (8.7)$$
where g is usually chosen as an odd non-linear function that preserves the phase of its argument, i.e. the i-th coordinate of g(z) is of the form $g_i(z) = g_i(z_i) = f_i(|z_i|^{2})\,z_i$ where $f_i$ is a real function.
[B]- Robust-EASI family criterion
In the case of heavy-tailed sources such as alpha-stable signals, which have infinite moments of order equal to or greater than two, the criterion $C_g(B)$ is inadequate and divergence behaviors may be observed, especially if the non-linearities in g are strongly increasing functions (like a cubic distortion for instance). For simplicity let us choose $g(z_i) = |z_i|^{2} z_i$.
• Note, for example, that even when B equals $A^{-1}$ and we have z = s, the quantity $C_g(B)$ in (8.7) does not converge to 0, since $\mathbb{E}\{|s_i|^{2}\}$ and $\mathbb{E}\{|s_i|^{4}\}$ are infinite, which undermines the validity of this separation procedure.
• In practice, for a finite sample size, the sample estimate of $C_g(B)$ always takes a finite value. However, for impulsive signals and large sample sizes, the second order term $\mathbb{E}\{zz^{*}\} - I$ in $C_g(B)$ becomes negligible compared to the higher order term $\mathbb{E}\{g(z)z^{*} - z\,g(z)^{*}\}$ (see part 1 of theorem 8.5). In that case, the whitening is not performed correctly and thus the algorithm fails to converge to the optimal solution.
To mitigate this difficulty, we propose to modify criterion (8.6) to ensure the convergence of the two terms
$$\mathbb{E}\{zz^{*}\} - I \quad \text{and} \quad \mathbb{E}\{g(z)z^{*} - z\,g(z)^{*}\}.$$
For that, we use the concept of normalized statistics. Hence, in this section we propose a robustified version of the EASI approach, obtained by modifying $C_g(B)$ into:
$$C_g(B) = \mathbb{E}\left\{\frac{zz^{*}}{\mathrm{Trace}(zz^{*})} - I + \frac{g(z)z^{*} - z\,g(z)^{*}}{\sum_{j=1}^{m}|z_j|^{4}}\right\}, \quad \text{with } z = Bx \qquad (8.8)$$
resulting in the so-called Robust-EASI family of source separation algorithms. This modification preserves the structure of the standard EASI and IBSS algorithms: the term
$$\mathbb{E}\left\{\frac{zz^{*}}{\mathrm{Trace}(zz^{*})} - I\right\}$$
in (8.8) has the effect of driving the diagonal elements of C = AB to one, while the other term
$$\mathbb{E}\left\{\frac{g(z)z^{*} - z\,g(z)^{*}}{\sum_{j=1}^{m}|z_j|^{4}}\right\}$$
in (8.8) drives the off-diagonal elements of C to zero.
8.4.2 Iterative quasi-Newton implementation
To solve (8.8), we propose to use a block technique based on the processing of T received samples, which consists of searching for the zeros of $\hat{C}_g(B)$, the sample
version of Cg (B) :
$$\hat{C}_g(B) \;\stackrel{\triangle}{=}\; \frac{1}{T}\sum_{t=1}^{T}\left\{\left[\frac{z(t)z(t)^{*}}{\sum_{j=1}^{m}|z_j(t)|^{2}} - I\right] + \left[\frac{g(z(t))z(t)^{*} - z(t)\,g(z(t))^{*}}{\sum_{j=1}^{m}|z_j(t)|^{4}}\right]\right\}, \quad \text{with } z = Bx \qquad (8.9)$$
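For concreteness, the following sketch (assuming real-valued data and the cubic non-linearity g(z_i) = |z_i|² z_i mentioned above; it is not the thesis code) evaluates the sample criterion (8.9) for a candidate separating matrix B:

```python
# Minimal sketch of the sample Robust-EASI criterion (8.9) for real-valued data.
import numpy as np

def robust_easi_criterion(B, x):
    """B: (m, n) separating matrix; x: (n, T) observations. Returns the m x m matrix (8.9)."""
    z = B @ x                                  # z(t) = B x(t), shape (m, T)
    g = (np.abs(z) ** 2) * z                   # g(z)_i = |z_i|^2 z_i
    s2 = np.sum(np.abs(z) ** 2, axis=0)        # sum_j |z_j(t)|^2
    s4 = np.sum(np.abs(z) ** 4, axis=0)        # sum_j |z_j(t)|^4
    m, T = z.shape
    term2 = (z / s2) @ z.T / T - np.eye(m)             # average of z z^T / sum|z_j|^2 - I
    term4 = ((g / s4) @ z.T - (z / s4) @ g.T) / T      # average of (g z^T - z g^T) / sum|z_j|^4
    return term2 + term4
```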
An approximate solution of $\hat{C}_g(B) = 0$ may be obtained by the Newton technique: $\hat{C}_g(B)$ is replaced by its first order approximation around some B so that the resulting linear equation can be solved exactly; solutions are then obtained iteratively in the form $B_{p+1} = (I + E_p)B_p$. At step p, a matrix $E_p$ is determined from a local linearization of $\hat{C}_g(B_{p+1})$. An explicit expression of $E_p$ is obtained under the additional assumption that $B_p$ is close to a separating matrix. This iterative implementation is summarized by the so-called Robust-EASI algorithm in Table 8.2.
Robust-EASI Algorithm
Step 1. Initialization: choose $B_0$ randomly and set $z(t) = B_0 x(t),\ t = 1,\cdots,T$.
Step 2. Computation of matrix E:
$$E_{ij} = \frac{\hat{\rho}_{ii}\,\hat{\kappa}_{ij} + \hat{\zeta}_{ji}^{*}\,(\hat{\delta}_{ij} - \hat{\rho}_{ij})}{\hat{\rho}_{ii}\,\hat{\zeta}_{ij} + \hat{\zeta}_{ji}^{*}\,\hat{\rho}_{jj}}\,, \quad i,j = 1,\cdots,m \qquad (8.10)$$
with
$$\hat{\rho}_{ij} = \hat{\mathbb{E}}\left\{\frac{z_i z_j^{*}}{\sum_{j=1}^{m}|z_j|^{2}}\right\}, \qquad \hat{\kappa}_{ij} = \hat{\mathbb{E}}\left\{\frac{z_i z_j^{*}\,[f_j(|z_j|^{2}) - f_i(|z_i|^{2})]}{\sum_{j=1}^{m}|z_j|^{4}}\right\},$$
$$\hat{\zeta}_{ij} = \hat{\mathbb{E}}\left\{\frac{|z_j|^{2}\,[f_i'(|z_i|^{2})\,|z_i|^{2} + f_i(|z_i|^{2}) - f_j(|z_j|^{2})]}{\sum_{j=1}^{m}|z_j|^{4}}\right\}.$$
Step 3. Update the estimated source signals: $z(t) \longleftarrow (I + E)\,z(t)$, for $t = 1,\cdots,T$.
Step 4. Check for convergence: if $\|E\| < \epsilon$ stop ($\epsilon$ is a small threshold), otherwise go back to step 2.

Tab. 8.2 – The principal steps of the proposed Robust-EASI algorithm.
8.4.3 Performance evaluation & comparison
In this section we compare our NS-based Robust-EASI method to two widely used BSS algorithms, EASI and RQML [Shereshevski et al.(2001)]. Recall
that RQML is the restricted quasi-maximum likelihood approach introduced as
an extension of the popular Pham’s quasi-maximum likelihood (QML) approach
[Pham et Garrat(1997)] to the α-stable source case. Two simulation examples with different types of distributions and a variety of sample sizes and noise powers are presented. All simulation results are averaged over 200 Monte-Carlo runs and the mixing matrix A is generated randomly at each run. To measure the quality of separation, we use the generalized rejection level criterion defined as follows [Sahmoudi et al.(2004a)] :
$$I_{\mathrm{perf}} = \frac{1}{m}\sum_{i=1}^{m} I_i = \frac{1}{m}\sum_{i=1}^{m}\sum_{j\neq i}\frac{|C_{ij}|^{\alpha}\gamma_j}{|C_{ii}|^{\alpha}\gamma_i} \qquad (8.11)$$
where $\gamma_l$ denotes the dispersion of source $s_l$ and $C \stackrel{\mathrm{def}}{=} BA$.
[A]- Experiment 1 : Alpha-Stable Mixture
In this experiment, mixtures of three heavy-tailed symmetric standard α-stable
(µ = 0 and γ = 1) signals with characteristic exponent α = 1.5 are considered.
The number of observations is n = 3 and the mixture is noise-free.
Figure 8.3 presents the generalized mean rejection level (8.11) versus the sample size, for each algorithm.
Fig. 8.3: Alpha-stable mixture: generalized mean rejection level ($I_{\mathrm{perf}}$, in dB) versus the sample size, for the Robust-EASI, EASI and RQML algorithms.
From these results, we can observe that EASI fails to separate α-stable mixtures. It can be seen that the newly proposed Robust-EASI correctly separates the α-stable mixtures even for short sample sizes and outperforms the RQML method.
[B]- Experiment 2 : Generalized Gaussian Mixture
The generalized Gaussian distribution has a density proportional to $\exp(-|x|^{p})$, p > 0. A value of p less than 2 gives a distribution suitable as an impulsive signal model. By varying p, a wide class of probability distributions can be characterized, including the uniform, Gaussian, Laplacian and other sub- and super-Gaussian densities. In this experiment, the three sources are impulsive with a generalized Gaussian distribution with p = 1.5. Three mixtures corrupted by additive white Gaussian noise are considered. We characterize the performance of each algorithm in terms of the signal rejection level. With $C \stackrel{\mathrm{def}}{=} BA$, the i-th estimated source is:
$$\hat{s}_i(t) = z_i(t) = \sum_{j=1}^{m} C_{ij}\,s_j(t)$$
which contains the j-th source signal at level $|C_{ij}|^{2}/|C_{ii}|^{2}$. Then, in this case the averaged rejection level is given by
$$I_{\mathrm{perf}} = \frac{1}{m}\sum_{i=1}^{m} I_i = \frac{1}{m}\sum_{i=1}^{m}\sum_{j\neq i}\frac{|C_{ij}|^{2}}{|C_{ii}|^{2}}\,. \qquad (8.12)$$
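As a side note, generalized Gaussian sources like those used in this experiment are easy to simulate; the sketch below (illustrative, not the thesis code) draws samples with density proportional to exp(−|x|^p) by combining a Gamma(1/p) variable with a random sign.

```python
# Illustrative generator of generalized Gaussian samples, density ~ exp(-|x|^p).
# If E ~ Gamma(1/p, 1) and S = +-1 with equal probability, then X = S * E^(1/p)
# has density p * exp(-|x|^p) / (2 * Gamma(1/p)).
import numpy as np

def generalized_gaussian(p, size, rng=None):
    rng = rng or np.random.default_rng()
    magnitude = rng.gamma(shape=1.0 / p, scale=1.0, size=size) ** (1.0 / p)
    sign = rng.choice([-1.0, 1.0], size=size)
    return sign * magnitude

sources = generalized_gaussian(p=1.5, size=(3, 1000))   # three impulsive sources, T = 1000
```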
Figure 8.4 presents the mean rejection level (8.12) versus the noise power, for each algorithm.
Even though the impulsive sources are not heavy-tailed, the Robust-EASI algorithm still largely outperforms the two other algorithms. This illustrates the fact that the proposed approach is quite general and can be applied to a larger class of source signal distributions, including of course the heavy-tailed ones.
Overall, the Robust-EASI method has reliable performance in all considered situations, whereas the other methods may fail, in particular if the underlying assumptions on the sources are not completely valid.
Fig. 8.4: Generalized Gaussian mixture: mean rejection level versus the noise power with T = 1000, for the RQML, EASI and Robust-EASI algorithms.
8.5 Concluding Remarks
In this chapter, two new NS-based blind separation methods for impulsive source signals with heavy-tailed distributions have been introduced:
• Robust-JADE algorithm: The normalized 2-nd and 4-th order cumulants of the mixture signal are shown to converge to finite-valued matrices with the appropriate algebraic structure that is traditionally exploited in many 2-nd and higher order statistics based BSS methods. The advantages of the proposed Robust-JADE method are discussed, and a simulation-based comparison with the MD method is provided to illustrate and assess its performance.
• Robust-EASI algorithm: We have proposed another approach using normalized statistics for heavy-tailed mixtures to improve the robustness of the EASI family of algorithms. A normalized criterion was derived and used for heavy-tailed source separation; it is solved using an efficient quasi-Newton iterative algorithm. Comparative simulations have been provided to illustrate the effectiveness of the Robust-EASI algorithm. More studies need to be done on the optimal choice of the non-linear function g for heavy-tailed sources. Note that the same methodology can be used to derive normalized versions of other existing non-linear decorrelation criteria that correctly separate heavy-tailed independent components.
Chapitre 9
A Semi-Parametric ML Approach
In this chapter, we propose a method for estimating, in a semi-parametric way, the density of the missing data in the blind source separation problem. We consider a log-spline model of fixed size and the maximum likelihood estimator of this density in the linear BSS problem. We thus obtain a log-spline density estimator, which can be approximated using a stochastic version of the expectation-maximization (EM) algorithm coupled with an MCMC method.
9.1 The Likelihood of the BSS Model
A very popular approach for estimating independent sources is the maximum
likelihood (ML) method. A short introduction was provided in chapter 5. In this
section, we show how to apply ML estimation to BSS.
9.1.1 Derivation of the likelihood
It is not difficult to derive the likelihood of the observation vector x in the noise-free BSS model. This likelihood is based on the well-known result on the density of a linear transform [Papoulis(1991)]. According to this formula, the density function $p_x$ of the mixture vector x = As can be formulated as
$$p_x(x) = |\det(B)|\,p_s(s) = |\det(B)|\prod_i p_i(s_i) \qquad (9.1)$$
where $B = A^{-1}$ and the $p_i$ denote the densities of the independent components. This can be expressed as a function of $B = (b_1,\cdots,b_n)^{T}$ and x, as follows:
$$p_x(x) = |\det(B)|\prod_i p_i\big(b_i^{T}x\big) \qquad (9.2)$$
Assume that we have T observations of x, denoted by $x(1),\cdots,x(T)$. Then the likelihood can be obtained as the product of this density evaluated at the T points:
$$L(B) = \prod_{t=1}^{T} |\det(B)| \prod_{i=1}^{n} p_i\big(b_i^{T}x(t)\big) \qquad (9.3)$$
Very often it is more practical to use the logarithm of the likelihood, since it is algebraically simpler. This makes no difference here, since the maximum of the logarithm is attained at the same point as the maximum of the likelihood. The log-likelihood is given by
$$\log L(B) = \sum_{t=1}^{T}\sum_{i=1}^{n} \log p_i\big(b_i^{T}x(t)\big) + T\log|\det(B)| \qquad (9.4)$$
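As a concrete illustration (not the thesis code), the log-likelihood (9.4) can be evaluated as follows, assuming the log-densities log p_i of the hypothesised components are available as user-supplied (here hypothetical) vectorized functions:

```python
# Minimal sketch of evaluating the BSS log-likelihood (9.4).
import numpy as np

def bss_log_likelihood(B, X, log_densities):
    """B: (n, n) candidate inverse mixing matrix; X: (n, T) observations;
    log_densities: list of n functions, log_densities[i](y) = log p_i(y) elementwise."""
    Y = B @ X                                                # y_i(t) = b_i^T x(t)
    T = X.shape[1]
    ll = sum(np.sum(log_densities[i](Y[i])) for i in range(B.shape[0]))
    return ll + T * np.log(np.abs(np.linalg.det(B)))
```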
9.1.2 Sources density estimation
In this work, we propose a new procedure based on the maximum likelihood approach to estimate the mixing matrix. However, there is another quantity to estimate in the BSS model: the density of the independent components. This makes the problem much more complicated, because the estimation of densities is, in general, a non-parametric problem. Non-parametric means that it cannot be reduced to the estimation of a finite parameter set; in fact, the number of parameters to be estimated is infinite, or very large. Thus the estimation of the BSS model also has a non-parametric part, which explains why the proposed method is called semi-parametric. Non-parametric pdf estimation is known to be a difficult problem. This is why we would like to avoid non-parametric density estimation in the BSS. There are two ways to avoid it:
• Parametric: First, in some cases we might know the densities of the independent components in advance, using some prior knowledge on the data at hand. Then the likelihood would really be a function of the mixing matrix only. If reasonably small errors in the specification of these prior densities have little influence on the estimator, this procedure will give reasonable results. In fact, it will be shown below, by computer simulation, that this is the case in an impulsive environment using the alpha-stable distributions.
• Semi-parametric: A second way to solve the problem of density estimation is to approximate the densities of the independent components by a family of densities that are specified by a limited number of parameters. If it is possible to use a very simple family of densities to estimate the BSS model, we will get a simple solution. Fortunately, this turns out to be the case using log-spline functions.
9.1.3 Optimization via the EM algorithm
If the likelihood of the observation can’t be maximized directly, it is possible to
do iterative maximization steps in order to approach the maximum. For example,
the expectation-maximization (EM) algorithm, proposed in [Dempster(1977)], is
a broadly applicable approach for iterative computation of maximum likelihood
estimates, useful in a variety of incomplete data (or partially observed data) statistical problems.
We briefly recall here the principle of the EM algorithm, which is a two-step iterative procedure. One iteration is composed of an E-step and an M-step. The E-step computes
$$Q(\theta \mid \theta_k) = \mathbb{E}\left\{\log f(x,s;\theta) \mid x;\theta_k\right\}$$
and the M-step determines $\theta_{k+1}$ as the maximizer of $Q(\theta \mid \theta_k)$.
Stochastic versions of EM have been introduced from different perspectives to deal with situations where the E-step is infeasible in closed form. This is often the case, because the expectation has no analytical form. In this case, Markov Chain Monte Carlo (MCMC) methods replace the E-step by a Monte Carlo approximation of the expectation based on a large number of independent simulations of the missing data [Meng et Rubin(1993)]. Another way to get around the difficulty of computing the expectation is also to use simulation, not for approximating some integral deriving from an expectation, but for getting some plausible numerical value standing for the missing data. One proposition was to simulate the missing data from the a posteriori distribution with the current values of the parameters [Diebolt et Celeux(1993)]. To be more efficient, the SAEM algorithm, proposed in [Delyon et al.(1999)], replaces the E-step not only by the simulation of the missing data, but by a stochastic approximation involving these simulated data. At iteration k, SAEM generates m(k) realizations $s^{k}(j)$ ($1 \le j \le m(k)$) from the a posteriori distribution, denoted $p(s \mid x;\theta_k)$, and updates $Q_{k-1}(\theta)$ according to
$$Q_k(\theta) = Q_{k-1}(\theta) + \gamma_k\left\{\frac{1}{m(k)}\sum_{j=1}^{m(k)}\log f\big(x, s^{k}(j);\theta\big) - Q_{k-1}(\theta)\right\} \qquad (9.5)$$
where $\gamma_k$ is a sequence of positive step-sizes decreasing to 0. Anyway, the use of simulated data for estimating parameters in missing data statistical problems is a powerful approach that has become popular since the 1990s. In this work we use this procedure for the blind source separation problem.
9.2 Semi-Parametric Source Separation
9.2.1 Noisy linear instantaneous mixtures
In this chapter, we consider the classical noisy linear BSS model with instantaneous mixtures given by:
$$x(t) = A\,s(t) + \varepsilon(t), \quad t = 1,\ldots,T \qquad (9.6)$$
where A is an $n \times m$ unknown full column rank mixing matrix. The sources $s_1(t),\cdots,s_m(t)$ are collected in an $m \times 1$ vector denoted s(t) and are assumed to be i.i.d. signals: the joint distribution density π is factorized as $\pi = \prod_{j=1}^{m}\pi_{s_j}$. The noise vector ε(t) (independent of s(t)) has independent components $\varepsilon_1(t),\ldots,\varepsilon_n(t)$
with zero mean and unknown variance $\sigma^2$. The goal of a BSS method is to find a separating matrix, i.e. an $m \times n$ matrix B such that the recovered sources Bx(t) are as independent as possible. In the noiseless case, (9.6) admits a unique solution $y(t) = Bx(t)$, up to scaling and permutation indeterminacies, such that $C \triangleq BA = P\Lambda$, where Λ is a diagonal scaling matrix and P is a permutation matrix (see [Hyvarinen et al.(2001)]). At most one source is allowed to be Gaussian to ensure identifiability. Another problem is that if one or more sources do not have finite second or higher moments (e.g. heavy-tailed distributions), then prewhitening or criterion optimization would cause a breakdown [Chen et Bickel(2004), Sahmoudi et al.(2004a), Sahmoudi et al.(2005)].
9.2.2 The proposed approach
Our purpose is to estimate by maximum likelihood the density π, the mixing matrix A and the noise variance $\sigma^2$. In this section, we present a semi-parametric method for BSS using maximum likelihood estimation in a log-spline model, in order to avoid any assumption on the source distribution. Nevertheless, we suppose that all sources are independent and have the same common distribution. We use log-spline models for two reasons: on the one hand, they have good functional approximation properties; on the other hand, they are well adapted to the implementation of the SAEM algorithm [Kuhn et Lavielle(2004)], which allows our estimator to be computed easily. Moreover, this estimation technique is inherently robust to outliers and impulsiveness effects. For this reason, we apply this method to impulsive random variables with possibly heavy-tailed distributions characterized by infinite second and higher order moments.
Any BSS problem can be seen as a usual missing data problem. Indeed, the observed data are the observations $\{x(t)\}_{1\le t\le T}$, whereas the random sources $\{s(t)\}_{1\le t\le T}$ are the unobserved data. Then, the complete data of the model are $\{x(t), s(t)\}_{1\le t\le T}$. We suppose that the unobserved sources are related to the observations through the density function h of x conditionally to s¹. Our purpose is to estimate the source density $\pi = \prod_{j=1}^{m}\pi_{s_j}$, the mixing matrix A and the noise variance $\sigma^2$. For that, we propose a semi-parametric approach which consists of combining the logspline model for source density approximation with a stochastic version of the EM algorithm. As mentioned above, logspline models are used both for their good functional approximation properties and because they are well adapted to the implementation of the SAEM (Stochastic Approximation version of the Expectation-Maximization) algorithm [Kuhn et Lavielle(2004)], which allows our estimator to be computed easily. Indeed, the first assumption of the SAEM algorithm used here is equivalent to supposing that the complete data likelihood f(x, s, η) belongs to the curved exponential family and can be written:
$$f(x, s, \eta) = \exp\left\{-\Psi(\eta) + \langle \tilde{S}(x,s), \Phi(\eta)\rangle\right\} \qquad (9.7)$$
¹ The distribution of x conditionally to s, denoted by h, corresponds in fact to the distribution of the additive noise in the BSS model (9.6), with the same variance and a non-zero mean value equal to As.
where $\langle\cdot,\cdot\rangle$ denotes the scalar product, η denotes the unknown global parameter vector to be estimated and $\tilde{S}(x,s)$ is known as the minimal sufficient statistic (MSS) of the complete model. In this case of unknown density functions following model (9.7), a good approximation which satisfies this latter condition is given by the logspline model. Moreover, it was shown that this estimation technique is inherently robust against outliers and impulsiveness effects [Takada(2001)]. For this reason, we apply this method to impulsive random variables with possibly heavy-tailed distributions characterized by infinite second and higher order moments. We now define precisely the logspline model that will be used.
9.2.3 Density estimation by B-spline approximations
In order to get a non-parametric estimate of the source density function π, we propose to use the logspline model. Let I = [a, b] where $-\infty < a < b < +\infty$ and consider a given knot sequence $\tau = (t_l)_{1\le l\le K+1}$ with $a = t_1$ and $b = t_{K+1}$. Consider now the space $S^{q,\tau}$ of spline functions of positive order q on I, namely piecewise polynomial functions of degree q − 1 associated with this knot sequence. Then the dimension of $S^{q,\tau}$ is equal to $J = q + K - 1$ and there exists a B-spline basis denoted $B_1,\cdots,B_J$ for $S^{q,\tau}$ [de Boor(1978)]. The logspline density estimation method models a log-density function as a spline function:
$$\forall s \in I, \quad \pi_\theta(s) = \exp\left(\sum_{j=1}^{J}\theta_j B_j(s) - c(\theta)\right) \qquad (9.8)$$
where
$$c(\theta) = \log\left(\int_I \exp\left(\sum_{j=1}^{J}\theta_j B_j(s)\right)ds\right)$$
is a normalization factor and $\theta = (\theta_1,\ldots,\theta_J) \in \mathbb{R}^{J}$. We choose the dimension J of the logspline model as a function of the sample size T such that $J = o(\sqrt{T})$ (see [Kuhn et Lavielle(2004)] for more details). We now define the observed log-likelihood corresponding to the logspline model of the observations as follows:
$$L_T(\theta) = \frac{1}{T}\sum_{t=1}^{T}\log\int_I h\big(x(t)\,|\,s\big)\,\pi_\theta(s)\,ds \qquad (9.9)$$
Then we consider the maximum likelihood estimator $\pi_{\hat{\theta}_{T,J}}$ of the density π in the logspline model, given by:
$$\hat{\theta}_{T,J} = \arg\max_{\theta\in\Theta_J} L_T(\theta) \qquad (9.10)$$
This family is not identifiable since, for all real a, we have $c(\theta + a) = c(\theta) + a$, implying that $\pi_{\theta+a} = \pi_\theta$. We systematically set $\theta_J = 0$ in order to get an identifiable family of log-density functions, and we denote by $\Theta_J$ the subspace of $\mathbb{R}^{J}$ composed of vectors having zero as last coordinate and by $M^{q,\tau}$ the set of associated densities, i.e. $\{\pi_\theta,\ \theta\in\Theta_J\}$. We briefly describe some properties of the B-splines detailed in de Boor's book [de Boor(1978)]:
– B-spline: For all $1 \le j \le J$, the function $B_j$ takes values in the interval [0, 1]. Moreover, we have $\sum_{j=1}^{J} B_j(s) = 1$ for all $s \in I$.
– Approximation property of the logspline model: We define $\delta_J = \inf_{\theta\in\Theta_J}\|\log f - \log\pi_\theta\|_\infty$. For a positive continuous density function f on I, $\delta_J$ tends to zero when J goes to infinity.
See [de Boor(1978)] for more details on the links between the convergence rate and the regularity of f. These particular properties of the logspline model suggest that $\pi_{\hat{\theta}_{T,J}}$ will have remarkable properties when T tends to infinity. We first explain how we compute this estimator in practice, simultaneously with the mixing matrix and the noise variance.
9.2.4 The SAEM algorithm
To compute the unknown parameters $\eta = (\theta^{T}, \mathrm{vec}(A)^{T}, \sigma^{2})^{T}$, we use the SAEM algorithm coupled with an MCMC (Markov Chain Monte-Carlo) procedure presented in [Kuhn et Lavielle(2004)]. Here we apply this algorithm for estimating the mixing matrix A and the variance $\sigma^2$, using the logspline model to approach the estimate $\pi_{\hat{\theta}_{T,J}}$. The complete log-likelihood corresponding to the logspline model has the following expression:
$$L_T^{\mathrm{com}}(\eta) = \frac{1}{T}\sum_{t=1}^{T}\log h\big(x(t)\,|\,s(t)\big) + \frac{1}{T}\sum_{t=1}^{T}\log\pi_\theta\big(s(t)\big) \qquad (9.11)$$
So we apply the SAEM algorithm to this parametric model in order to approach the estimator $\hat{\eta}_{T,J}$ of η that maximizes the observed log-likelihood. To bring out the minimal sufficient statistics of the model, we write the developed expression of the complete log-likelihood:
$$L_T^{\mathrm{com}}(\eta) = \frac{1}{T}\sum_{t=1}^{T}\log h\big(x(t)\,|\,s(t)\big) + \frac{1}{T}\sum_{t=1}^{T}\left[\sum_{j=1}^{J}\theta_j B_j\big(s(t)\big) - c(\theta)\right]$$
We choose as MSS $\tilde{S}(x,s) = \big(\frac{1}{T}\sum_{t=1}^{T} B_j(s_t),\ 1\le j\le J\big)$ and we implement the k-th iteration of the SAEM algorithm as:
• S-step: Generate a realization s′ using as proposal distribution the prior distribution $\pi_{\theta_k}$, and take $s^{k}$ equal to s′ or to $s^{k-1}$ according to the value of the acceptance probability.
• A-step: Update the minimal sufficient statistics $\tilde{S}_k$ according to the stochastic approximation:
$$\tilde{S}_k = \tilde{S}_{k-1} + \beta_{k-1}\left(\tilde{S}(x, s^{k}) - \tilde{S}_{k-1}\right) \qquad (9.12)$$
where $\beta_k$ is a positive step-size sequence decreasing to 0.
• M-step: Update $\eta_k$ by maximizing the complete log-likelihood of the model evaluated at the observations and at the current value of the minimal sufficient statistics.
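The sketch below (a simplified illustration, not the thesis code) shows how the S-step and A-step can be implemented for model (9.6) when the noise h is Gaussian and the prior π_θ is used as an independent Metropolis proposal, so that the acceptance ratio reduces to the likelihood ratio h(x|s′)/h(x|s^{k−1}); the callbacks `sample_prior` and `suff_stat` are hypothetical placeholders for the logspline prior sampler and the B-spline sufficient statistics.

```python
# Simplified sketch of one SAEM iteration (S-step + A-step) for model (9.6),
# assuming Gaussian noise h and the prior as an independent Metropolis proposal.
import numpy as np

rng = np.random.default_rng(0)

def log_h(x, s, A, sigma2):
    """log of the Gaussian conditional density h(x | s), per time sample."""
    r = x - A @ s
    return -0.5 * np.sum(r ** 2, axis=0) / sigma2 - 0.5 * x.shape[0] * np.log(2 * np.pi * sigma2)

def saem_iteration(x, s_prev, S_prev, A, sigma2, sample_prior, suff_stat, beta):
    """x: (n, T); s_prev: (m, T); sample_prior, suff_stat: hypothetical callbacks."""
    # S-step: propose s' from the prior, accept per sample with prob min(1, likelihood ratio)
    s_prop = sample_prior(s_prev.shape)
    log_ratio = log_h(x, s_prop, A, sigma2) - log_h(x, s_prev, A, sigma2)
    accept = np.log(rng.uniform(size=log_ratio.shape)) < log_ratio
    s_new = np.where(accept, s_prop, s_prev)
    # A-step: stochastic approximation update of the sufficient statistics, eq. (9.12)
    S_new = S_prev + beta * (suff_stat(x, s_new) - S_prev)
    # M-step (not shown): update eta = (theta, A, sigma2) by maximizing the
    # complete log-likelihood evaluated at S_new.
    return s_new, S_new
```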
This algorithm converges a.s. toward a local maximum of the log-likelihood of the
observations under very general regularity conditions (see [Kuhn et Lavielle(2004)]
for convergence results). In practice, the algorithm is easy to implement and has
a relatively low computational cost.
9.3 Performance evaluation & comparison
9.3.1 Some existing BSS methods
We briefly describe here three BSS approaches for comparison with the new
semi-parametric approach introduced above.
1) FastICA algorithm [Hyvarinen et al.(2001)].
Under the whitened zero-mean demixing model y = Wz, the FastICA algorithm finds the extrema of a generic cost function $\mathbb{E}\{G(w^{T}z)\}$, where $w^{T}$ is one of the rows of the demixing matrix W. The cost function can be, e.g., a normalized cumulant or an approximation of the marginal entropy, which is optimized in order to find maximally non-Gaussian projections $w^{T}z$. This algorithm faces three problems. First, some sources may not have zero means, in which case the mean values must be explicitly included in the analysis. Second, in FastICA, the derivative of the even function G is assumed to be an odd function. If this condition fails to be satisfied, FastICA as such may not work. Third, FastICA is not robust to heavy-tailed effects.
2) JADE algorithm [Cardoso et Souloumiac(1993)].
This algorithm operates on cumulants as a measure of independence. It seeks to approach independence through the maximization of higher order cumulants. However, one major weakness of this algorithm is that higher order cumulants are extremely vulnerable to outlier effects. Besides being sensitive to outliers, JADE also fails to separate certain source distributions, e.g. skewed zero-kurtosis signals generated by the power distribution. This is because, by relying only on the 4-th order cumulants, third order effects like the skewness are ignored.
3) Minimum Dispersion (MD) algorithm [Sahmoudi et al.(2005)].
This approach is a two-step parametric algorithm for heavy-tailed source separation.
Step 1: Robust whitening. In the case of α-stable signals, it is proven in [Sahmoudi et al.(2004a)] that the normalized covariance matrix of x, defined by $\hat{R}_x^{n} = \frac{\hat{R}_x}{\mathrm{Trace}(\hat{R}_x)}$ with $\hat{R}_x = \frac{1}{T}\sum_t x(t)x(t)^{T}$, converges asymptotically (i.e. when T tends to infinity) to the finite matrix $ADA^{T}$, where D is a positive diagonal matrix. Hence, the normalized covariance matrix has the appropriate structure and the whitening problem becomes standard.
Step 2: MD criterion. Let z(t) = Bx(t), where B is an orthogonal separating matrix to be estimated and x denotes the whitened data. It is shown in [Sahmoudi et al.(2004a)] that, under the orthogonality constraint, the MD criterion given by $J(B) = \sum_{i=1}^{m}\gamma_{z_i}$, where $\gamma_{z_i}$ denotes the dispersion of $z_i(t)$, the i-th entry of z(t), is a contrast function.
The essential limitation of this method is that it can be used only for heavy-tailed
sources with α-stable distribution.
9.3.2 Parametric versus semi-parametric approaches
The MD method is said to be parametric in the sense that it relies on a priori knowledge of the exact source pdf. In this case, we have a finite set of parameters to estimate. On the other hand, the SAEM method is said to be semi-parametric in the sense that the source pdf is unknown and needs to be estimated jointly with the desired parameters (i.e. the mixing matrix) [P. Bickel(1998)]. Clearly, estimating a pdf is a difficult problem, as the number of parameters to be estimated is infinite. In the semi-parametric approach, we estimate a limited number of parameters by replacing the estimation problem with an approximation one. The parametric approach is preferred whenever reliable a priori knowledge of the source pdf is available. In situations where the pdf is only partially or inaccurately known, semi-parametric methods should be used because of their robustness against modeling errors, as shown next by simulation results.
9.3.3 Computer simulation experiments
Here, we compare our proposed semi-parametric method SAEM to JADE, FastICA and to the parametric MD algorithm. In all simulation experiments the results are averaged over 100 runs and the mixing matrix A is generated randomly at each run. The step-size sequence $(\beta_k)$ used for SAEM was $\beta_k = 1/k$. For the choice of the size J of the logspline model in SAEM, we tested some values of J lower than 10, since we have at least 100 observations. The best estimation seems to be obtained for q = 4 and J = 5, so we retain these values for the following experiments. We choose the initial value $\theta_0$ such that the logspline density estimate is initialized with the uniform distribution on I = [−50, 50].
To measure the quality of separation, we will use Amari's error criterion as a performance index (PI), defined as
$$\mathrm{PI} = \sum_{i=1}^{m}\left(\sum_{j=1}^{m}\frac{|C_{i,j}|}{\max_k |C_{i,k}|} - 1\right) + \sum_{j=1}^{m}\left(\sum_{i=1}^{m}\frac{|C_{i,j}|}{\max_k |C_{k,j}|} - 1\right)$$
where $C = (C_{i,j})_{1\le i,j\le m} = BA$ is the global system.
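For reference, a compact helper (illustrative, not the thesis code) that computes this index from the global matrix C = BA:

```python
# Illustrative computation of Amari's performance index (PI) from C = B A.
import numpy as np

def amari_index(C):
    C = np.abs(np.asarray(C, dtype=float))
    row = (C / C.max(axis=1, keepdims=True)).sum(axis=1) - 1.0   # row-wise term
    col = (C / C.max(axis=0, keepdims=True)).sum(axis=0) - 1.0   # column-wise term
    return row.sum() + col.sum()
```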
• Experiment 1: Robustness against outliers.
First, we test the robustness against outliers. We mix two sources, one with a Gaussian distribution and the second with a uniform distribution, with randomly chosen mixing matrices. The data set contains 1000 points. Without outliers, the performances of SAEM, JADE and FastICA are all excellent (PI ≈ 0.05). To test for outlier robustness, we replace 50 data points with outliers, i.e. uniformly distributed data points within a disc of radius 500 around the origin (the norm of the original data points is roughly within the range from 0 to 100). As expected, SAEM still works fine. In fact, typically it does not even change its solution, because it simply
ignores the outliers in the B-spline adjustment stage. JADE and FastICA, however, produce arbitrary results because they employ higher-order statistics, which are highly sensitive to outliers.
• Experiment 2: Asymptotic consistency.
Figure 9.1 shows some simulation results in the case of three noiseless mixtures (n = 3 observations) of three sources (m = 3) with, respectively, a uniform distribution on [0, 1], a Gaussian distribution with zero mean and unit variance, and a standard SαS distribution with α = 1.5. To detect whether BSS algorithms can obtain consistent estimates in such a situation, the sample size was increased from (1): T = 1000 to (2): T = 5000. We compare SAEM with two other well-known BSS algorithms, JADE and FastICA. Similarly to [Chen et Bickel(2004)], we present boxplots based on quartiles to assess the consistency of our method.
Fig. 9.1: Consistency of different BSS algorithms (boxplots of the Amari error PI for FastICA, JADE and SAEM). The sample sizes were 1000 for case (1) and 5000 for case (2).
From the boxplots (Figure 9.1), we can see that as the sample size increases,
the estimation error (PI) for SAEM decreases more significantly toward zero than
for JADE and FastICA.
• Experiment 3: Robustness against impulsive noise.
In this experiment we add impulsive noise to the above mixtures (considered in experiment 2) according to $x(t) = As(t) + \sigma\,\epsilon(t)$, with ε(t) an n-dimensional Gaussian noise of unit variance. We track the evolution of the performance index as a function of the noise level σ for kurtotic (super-Gaussian) noise: we used multidimensional Gaussian noise whose absolute value is raised to the power of 5.
Fig. 9.2: The performance index versus noise level, for FastICA, JADE and SAEM.
Figure 9.2 shows that JADE and FastICA start to fail at a certain noise level, whereas SAEM continues to produce good BSS solutions. Note that we have chosen the median over 100 runs because the PI depends strongly on the actual realization of the noise.
• Experiment 4: Robustness against modeling errors.
Here, we consider m = 3 impulsive sources with a generalized Gaussian distribution of parameter p = 1.5 (i.e. the source pdf is proportional to $\exp(-|x|^{p})$). In that case, the signals have finite variance and n = 4 noise-free mixtures are considered.
Fig. 9.3: The performance index versus sample size, for MD and SAEM.
As can be observed from figure 9.3, the MD method fails to correctly separate the sources, as it relies on the SαS source pdf assumption, which is not verified in
this example. This illustrates the robustness of SAEM compared to the MD method with respect to pdf modeling errors.
9.4 Concluding Remarks
In this work, we developed a new semi-parametric BSS method using the SAEM algorithm. The proposed method is applied to the blind separation of noisy linear instantaneous mixtures of possibly heavy-tailed sources. The SAEM-based method is compared with the JADE, FastICA and minimum dispersion (MD) methods and shown to be more general (as it can be applied to a larger class of source signals and in different scenarios). The proposed SAEM algorithm outperforms JADE and FastICA in terms of consistency and robustness against outliers and impulsive noise, and outperforms the MD method in terms of robustness against modeling errors.
Troisième partie
Séparation et Estimation des
Signaux FM Multicomposantes
dans un Environnement Impulsif
In the first chapter of this part (chapter 10), we review the main principles and existing methods for the estimation of non-stationary FM signals. We then present our novel approaches in the presence of additive noise of impulsive nature, modeled by an α-stable distribution.
Chapitre 10
State of the Art
The last two decades in particular have witnessed a surge of interest in the analysis of time-varying or non-stationary processes. The beginning of the 80's saw efforts in various parts of the world toward developing spectral analysis techniques which would overcome the drawbacks of classical spectral analysis [Grenier(1984)], [Boashash(1991)]. These drawbacks arise largely from the fact that the Fourier transform signal characterization (upon which classical analysis is essentially based) assumes that the spectral characteristics of both signal and noise are time-invariant. When the important spectral features of the signal and/or noise are time-varying, the effect of Fourier analysis is to produce an averaged (smeared) spectral representation. One of the consequences of this smearing is a loss in frequency resolution. One can try to reduce this smearing by obtaining the spectral estimates over short time intervals, so that the spectral components do not vary too greatly within the window. However, the shortened observation windows produce smearing of a different kind, this time due to the uncertainty relationships of time- and band-limited signals. Research early in the 1980s focused on two directions: modern parametric spectral analysis and time-frequency analysis.
In this chapter, we present a brief state of the art of the non-stationary FM signal analysis problem, consisting of spotlights on a few important existing methods.
10.1 Modern Spectral Analysis Approaches
Parametric modeling of nonstationary signals received a great deal of attention in the eighties [Grenier(1984)]. The approaches usually developed represent these signals by AR or ARMA models with time-varying coefficients. The coefficients are then approximated on a basis of known time-varying functions, giving rise to a set of invariant parameters which are the coordinates of
the coefficients. This approach offers the advantage of leading to the same type of
identification procedures as for AR or ARMA models with constant parameters.
Another potential advantage of this kind of modeling is an improved accuracy of
parameter estimation methods applied to time-varying signals in comparison to
other estimation methods based upon the assumption that the signal is stationary
over a time interval. Several algorithms derived for stationary signals plus observation noise have been extended to the nonstationary case [Grenier(1984)]. However,
it appeared that in some cases the performance of the estimators was reduced in
the nonstationary case. For more details, one can find a good review of parametric or modern spectral analysis methods in [Grenier(1984)] and [Boashash(1992a)].
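To illustrate the basis-expansion idea described above, here is a small sketch (an illustration under simple assumptions — a TVAR(p) model whose coefficients are expanded on a polynomial time basis and estimated by ordinary least squares — not an implementation of the methods in the cited references):

```python
# Illustrative time-varying AR (TVAR) fit with a polynomial basis expansion of
# the coefficients: x(t) = sum_k a_k(t) x(t-k) + e(t), a_k(t) = sum_j c_kj (t/T)^j.
import numpy as np

def fit_tvar(x, p=2, n_basis=3):
    """Returns the estimated time-varying coefficients a_k(t), shape (p, T-p)."""
    x = np.asarray(x, dtype=float)
    T = len(x)
    t = np.arange(p, T)
    F = np.vstack([(t / T) ** j for j in range(n_basis)])              # basis functions f_j(t)
    lagged = np.vstack([x[p - k: T - k] for k in range(1, p + 1)])     # x(t-k), k = 1..p
    H = (lagged[:, None, :] * F[None, :, :]).reshape(p * n_basis, -1)  # regressors x(t-k) f_j(t)
    c, *_ = np.linalg.lstsq(H.T, x[p:], rcond=None)                    # invariant coordinates c_kj
    return c.reshape(p, n_basis) @ F                                   # a_k(t) reconstructed
```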
10.2 Time-Frequency Analysis Approaches
The long research effort on the Wigner-Ville Distribution (WVD) established that it was a means to attain good frequency localisation for rapidly time-varying signals [Boashash(1992c)]. This interest was fuelled by the discovery that it had a number of very attractive properties [Classen et Mecklenbrauker(1980)], as well as by the evidence that the technique could be put to good practical use. The advance of digital computers also aided its popularity, as the hitherto prohibitive task of computing a two-dimensional distribution came within practical reach. As the research in the area continued, the importance of the WVD for random signal analysis became apparent. In [Martin(1982)], the author showed that the WVD's expected value is simply the Fourier transform of the time-varying autocorrelation function. This gave the WVD an important interpretation as a time-varying Power Spectral Density (PSD), and sparked significant research efforts along this direction.
The WVD as an important time-varying filtering tool was also realised early. In [Boudreaux-Bartels et Marks(1986)], a simple algorithm is derived which consists of masking (filtering) the input signal and then performing a least-squares inversion of the WVD to recover the filtered signal. Many refinements, extensions and simplifications were developed to further this pioneering work on WVD-based time-varying filtering. Detection and estimation were other research areas which saw theoretical developments based on the WVD [Kay et Boudreaux-Bartels(1985)], [Boashash et Rodriguez(1984)]. One of the crucial factors motivating such interest was the fact that, since the WVD is a unitary (energy preserving) transform, many of the classical detection and estimation problem solutions had alternate implementations based on the WVD. The time-frequency nature of the implementation, however, allowed greater flexibility than did the classical ones.
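For readers who want to experiment, a minimal discrete pseudo Wigner-Ville computation can be sketched as follows (illustrative only, not taken from the cited references); it forms the instantaneous autocorrelation z(t+k) z*(t−k) of an analytic signal z and takes an FFT over the lag variable k:

```python
# Minimal sketch of a discrete pseudo Wigner-Ville distribution.
import numpy as np

def pseudo_wvd(z, n_freq=None):
    """z: analytic signal (complex, length T). Returns W of shape (n_freq, T)."""
    z = np.asarray(z, dtype=complex)
    T = len(z)
    n_freq = n_freq or T
    half = n_freq // 2
    W = np.zeros((n_freq, T))
    for t in range(T):
        kmax = min(t, T - 1 - t, half - 1)          # admissible lags around time t
        k = np.arange(-kmax, kmax + 1)
        r = z[t + k] * np.conj(z[t - k])            # instantaneous autocorrelation
        R = np.zeros(n_freq, dtype=complex)
        R[k % n_freq] = r                           # place lags on the FFT grid
        W[:, t] = np.real(np.fft.fft(R))            # bin m <-> frequency m/(2*n_freq) cycles/sample
    return W
```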
Despite all the advances made in the theory and application of the WVD to
so many areas of signal processing, it was generally accepted that the WVD had a
number of limitations. One of the main limitations was considered to be the nonlinear nature of the WVD. The WVD performs a bi-linear transformation of the
frequency components of a signal, a fact which is significant for both deterministic and random signals. For deterministic multicomponent signals, the bi-linearity
causes ”cross-terms” or ”artefacts” to occur between the true frequency components. This can often render the WVD almost impossible to interpret visually.
For random signals, the bi-linear transformation exaggerates the effects of noise
by creating cross-terms between all noise and signal components. At low signal to
noise ratio (SNR), where the noise term of the bi-linear kernel dominates, this effect can contribute to a very rapid degradation of performance. A second drawback
which was attributed to the WVD is its inherent bias towards infinite duration signals. Since it is essentially the Fourier transform of a bilinear kernel, it is ”tuned”
to the presence of infinite duration complex sinusoids in the kernel and hence to linear FM components in the signal itself. Practical signals are often highly localised in time, so that a simple Fourier transformation of the kernel does not provide
a very effective analysis of the data.
Much came of the efforts to overcome these drawbacks. Cohen had already paved
the way for reducing the non-linear effects of the WVD by his work in quantum
mechanics, in which he proposed a generalized class of ”smoothed” Wigner distributions [Cohen(1966)]. He showed that an infinite number of joint distributions
with useful properties could be produced by performing a 2D smoothing function
of the Wigner distribution, the particular distribution depending on the smoothing function used. Researchers then turned to 2D smoothing functions to reduce
the artefacts, the most popular smoothing functions initially being the 2D Gaussian function. Further impetus to the attempted reduction of artefacts came with
the understanding that in the ambiguity function domain, the cross-terms tended to be distant from the origin, while the auto terms passed through the origin
[Flandrin(1998)]. This was especially helpful since the WVD was known to be related to the ambiguity function by 2D Fourier transformation [Boashash(1991)].
2D Fourier inversion of isolated regions of the ambiguity function was then used
to effect the cross-term reduction.
Subsequently, greater refinements and purpose entered the design procedure for
these TFDs. Choi and Williams used the ambiguity domain to design their variable level smoothing function, so that artefacts could be reduced to a greater or
lesser extent, depending on the application [Choi et Williams(1989)]. Zhao, Atlas
and Marks designed kernels in which the artefacts folded back onto the auto-terms
[Zhao et al.(1990)]. The latter effect was desirable, so as to be able to obtain visually satisfying representations. In [Kootsookos et al.(1992)], the authors showed how one could vary the shape of the cross-terms by appropriate kernel design. In parallel with the developments in smoothing of the WVD, another approach was used to
nullify the troublesome non-linear effects in the WVD. This approach was based
on the fact that the cross WVD (XWVD), although being closely related to the
WVD and having many of its desirable properties, is a linear distribution in the
observed signal. Efforts were made, then, to use the XWVD instead of the WVD,
wherever possible.
The problems relating to the WVD's poor performance with short-duration signals were addressed in a number of different ways. Perhaps the first method proposed was to modify the WVD by performing the spectral estimation of the kernel function with a Mellin transform [Marinovic(1984)]. Another method put forward for better dealing with short-duration signals was to use autoregressive spectral estimators of the kernel, which could reliably be applied to short data sequences [Boashash(1991)].
The emphasis on time-varying spectral analysis which occurred during the 1980s
also led very naturally to a heightened awareness of instantaneous frequency. For analysts who were used to dealing with time-invariant systems, the simultaneous use of the words instantaneous and frequency contained an element of contradiction. Frequency is usually assigned to the eigenvalues of the system's eigenfunctions, and is only defined for persistent processes. It became clear that a better understanding of what was meant by "instantaneous frequency", and of how to estimate this important quantity, was needed.
10.2.1 IF estimation using time-frequency methods
Not surprisingly, then, much work did focus on the concepts underlying the IF
and its relationship to TFDs [Boashash(1991)]. A summary of the developments
may be found in [Boashash(1992a), Boashash(1992b)]. Further work concentrated
on techniques for estimating the IF, with a number of useful new algorithms being
developed. Various techniques had been devised over the years for the estimation
of IF, but many of them were developed in the communications area, and as such,
were suited more to communications signals than to those encountered generally
in signal processing environments. Several IF estimation techniques have been developed recently to allow for a broader signal model, or for greater robustness to
noise. This is because the instantaneous frequency is one of the most important
features of any signal. There are two major approaches for IF estimation of FM
signals : parametric and non-parametric.
The non-parametric approach is based on the use of time-frequency distributions. In summary, there are two major existing approaches for IF estimation using TFDs. The first is built on the first-order moment of the TFD [Boashash(1991)]. The first-order moment of the WVD yields the IF [White et Boashash(1988), Boashash(1991)], while others yield approximations of the IF [Boashash(1992c)]. However, this approach fails for multicomponent signals due to the presence of cross-terms. The second approach is built on the fact that all TFDs have peaks around the IF laws of the signals. The peaks of the WVD were used for IF estimation and applied to many problems [Boashash(1992c)]. For better performance at lower SNR, the XWVD was proposed [Boashash et O'Shea(1993)].
Other algorithms for TFD-based peak estimation can be found, for example, in [Boashash(1992c)], [Stankovic et Katkovnik(1998)], [Katkovnik et Stankovic(1998)], [Luigi et Moreau(2002a)], [Luigi et Moreau(2002b)]. Like the first approach, this approach also suffers from the presence of cross-terms in multicomponent signals, which results in poor estimation.
Motivated by the desire to design high-resolution reduced interference distributions (RIDs), the B-distribution (BD) was then proposed in [Barkat(2000)] and the modified B-distribution (MBD) was developed in [Hussain(2002)], both with adaptive algorithms for IF estimation of multicomponent signals.
10.2.2 Analysis of noisy multicomponent signals
There is a wide range of applications where we encounter signals comprised of I components with different IF laws f_i(t) and different envelopes a_i(t), in additive noise. It is often desired, from such an observed signal, to determine the
IF law of each component. This can be achieved by representing the observed signal z(t) in the time-frequency (t-f) domain and using time-frequency filtering methods to recover the individual components [Cohen(1995)], [Boashash(1991)]. Another approach involves extending parametric and non-parametric algorithms for IF estimation of monocomponent FM signals to the case of multicomponent signals and designing an algorithm that simultaneously tracks the various IF components of the observed signal [Peleg et Friedlander(1996)], [Hussain et Boashash(2002)]. Both approaches require the use of time-frequency distributions (TFDs) with very specific properties, such as high time-frequency localization of the instantaneous frequency components and strong reduction of cross-term interference.
In practice, the signal under consideration may be subjected to additive noise. In
general, and for various reasons, the additive noise is assumed to be Gaussian.
The analysis of non-stationary signals affected by additive Gaussian noise has
been addressed in several places [Friedlander et Francos(1995)], [Barkat(2000)],
[Barbarossa et Scaglione(2000)], [Barkat et Abed-Meraim(2004)], [Hussain(2002)].
However, in some situations, the assumption about the Gaussianity of the noise is
not valid and, therefore, alternative techniques are needed in this case.
10.3 Robust time-frequency analysis
In the presence of impulsive heavy-tailed noise, which is well modeled by the family of alpha-stable distributions, time-frequency representations are severely corrupted by impulse-related artifacts, which tend to obscure the essential details of the desired signal.
Recently, two novel techniques were proposed for the analysis of a monocomponent FM signal contaminated by additive noise having an unknown heavy-tailed distribution:
– First, robust time-frequency distributions were developed as a generalization of robust minimax M-estimates. In [Katkovnik(1998)], a robust periodogram was proposed for the analysis of a single tone affected by additive heavy-tailed noise. In [Katkovnik et al.(2002)], the authors used the so-called robust spectrogram and robust Wigner-Ville distribution (WVD), respectively, to address the problem of non-stationary signals embedded in heavy-tailed noise. In [Barkat et Stankovic(2004)], the authors extended the work proposed in [Katkovnik et al.(2002)] to design a robust polynomial WVD (PWVD). However, it is known that the spectrogram suffers from low resolution in the time-frequency domain, that the WVD suffers from the presence of artifacts for non-linearly frequency modulated signals, and that the PWVD suffers from the presence of cross-terms for multicomponent signals.
– Second, in [Griffith(1997)] the author used the fractional lower-order covariance, a correlation measure that is well-behaved in alpha-stable noise, to develop a set of robust time-frequency representations that offer significant improvements in performance over conventional quadratic time-frequency representations. However, the use of fractional lower-order statistics in a time-frequency distribution entails a high computational complexity. In addition, there is no consistent estimator of the covariation and lower-order covariance available in the literature for use in practice.
In this third part of the thesis, we propose two classes of robust time-frequency procedures to analyze multicomponent non-stationary signals in heavy-tailed noise under the α-stable model. The first one is based on a generalization of the work presented in [Barkat et Stankovic(2004)] to design a robust time-frequency distribution. The second one uses a preprocessing stage as a first step to mitigate the effect of the impulsive noise before the time-frequency IF estimation step.
10.4 Concluding Remarks
In this chapter, we described various approaches to IF estimation of non-stationary signals. Because of the discussed limitations of the existing methods, there has been great interest in developing alternatives, especially for multicomponent signals in impulsive noise environments.
Chapitre 11
Robust Parametric Approaches
In this chapter we address the problem of instantaneous frequency (IF) estimation of multicomponent nonstationary FM signals in an impulsive α-stable noise environment. Three parametric techniques are introduced using a two-step procedure. The first step consists of transforming the polynomial phase estimation problem into a frequency estimation one using a polynomial-phase transform (PPT). In the second step, we perform the frequency estimation by three robust versions of the MUSIC (MUltiple SIgnal Classification) algorithm using, respectively, truncated data (TRUNC-MUSIC), a robust covariance estimate (ROCOV-MUSIC) and a generalized covariation coefficients matrix whose entries are fractional lower-order statistics of the signal (FLOS-MUSIC). We illustrate and compare the proposed methods by simulation examples.
11.1 Introduction and Problem Statement
Many signals used in communications, radar, sonar and other man-made systems, as well as various natural signals, involve frequency modulation (FM) of a carrier. This model of FM signals was used in many references to define the notion of a multicomponent signal [Peleg et Friedlander(1996)], [Barbarossa(1995)]. Such complex signals can be affected by impulsive noise, which can be modeled correctly by α-stable processes. In this work, we parameterize the model of an FM signal by assuming that the phase of each component is a polynomial function of time.
Remark 11.1. We note that the estimation approach proposed in this work can be applied to multicomponent signals where the phase of some of the components is a continuous, but not necessarily polynomial, function of time. Indeed, via the Weierstrass approximation theorem we can approximate any continuous function by a polynomial one. More about this can be found in [Peleg et Friedlander(1995)].
Without loss of generality, and for simplicity of presentation, we focus in this chapter on the case of a quadratic phase. If the considered signal has a phase order higher than two, we can reduce the order of the signal by demodulation. If the estimate of the highest-order polynomial phase coefficient is accurate, the highest-order term is effectively removed, and we can proceed to use the same PPT/demodulation procedure to estimate the next phase parameter. This procedure is repeated until all the coefficients have been estimated.
Then, the signal model in the quadratic phase case is given by
x(t) = \sum_{i=1}^{I} s_i(t) + z_0(t) = \sum_{i=1}^{I} a_i(t) \cos\{\phi_i(t)\} + z_0(t)        (11.1)
where t = 0, ..., N − 1, φ_i(t) = 2π(f_i t + δ_i t²) + θ_i is the phase of the i-th so-called chirp component. The parameters f_i, δ_i, i = 1, ..., I are unknown real coefficients. The values {θ_i, i = 1, ..., I} are realizations of random variables, distributed uniformly and independently over [0, 2π). N is the sample size and I is the number of components of the observed signal. The amplitudes a_i(t) are assumed α-stable, independent from the noise term z_0(t), with location parameters a_i ≠ 0 and dispersions γ_i.
The random noise z_0(t) is modeled as a symmetric α-stable (SαS) process with zero location parameter.
Our primary interest is to estimate the instantaneous frequency IF_i of each signal component s_i, defined as
IF_i(t) \triangleq \frac{1}{2\pi} \frac{d\phi_i(t)}{dt} = f_i + 2\delta_i t        (11.2)
By decomposing a_i(t) = γ_i^{1/α} a_{i,0}(t) + a_i, where a_{i,0}(t) is a standard (zero location parameter and unit dispersion) α-stable process, we can re-write the signal expression as
x(t) = \sum_{i=1}^{I} a_i \cos\{\phi_i(t)\} + \underbrace{\sum_{i=1}^{I} \gamma_i^{1/\alpha} a_{i,0}(t) \cos\{\phi_i(t)\} + z_0(t)}_{z(t)}        (11.3)
     = \sum_{i=1}^{I} a_i \cos\{\phi_i(t)\} + z(t)        (11.4)
According to the stability property of α-stable laws [Nikias et Shao(1995)], z(t) is an α-stable process. Thus, the problem of estimating (IF_i)_{1≤i≤I} for the multicomponent chirp signal affected by multiplicative and additive α-stable noise is reduced to that of estimating (IF_i)_{1≤i≤I} for constant-amplitude chirp signals, i.e. signals having the same IF laws as the original ones, but affected by the additive noise only.
11.2 Polynomial-Phase Transform of FM Signals
Consider the polynomial phase estimation of the signal x(t) in Eq. (11.1). One possible solution to this problem is the maximum likelihood estimation algorithm. However, this estimation algorithm requires a large amount of computation. Indeed, the lack of an explicit expression for the α-stable noise PDF forces us to use existing approximations, which turn out to be very expensive numerically. Therefore, we propose to use a much simpler procedure, based on the polynomial-phase transform (PPT) [Sahmoudi et al.(2003b)].
The PPT is a tool for analyzing constant-amplitude polynomial-phase signals
[Peleg et Friedlander(1995)].
In the quadratic phase case, the PPT can simply be performed as:
y(t) = x(t + \tau)\, x(t)        (11.5)
     = \sum_{i=1}^{I_1} \frac{|a_i|^2}{2} \cos\{2\pi(2\tau\delta_i t) + \varphi_i\} + z_1(t)        (11.6)
where τ is the delay parameter (preferably chosen in [N/2, 2N/3]), φ_i = 2π(τ f_i + τ² δ_i) and z_1(t) is the noise-plus-interference term.¹
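As a minimal sketch of the transform (11.5), the product below maps each quadratic-phase component to a sinusoid at frequency 2τδ_i; the default delay (0.6 N, inside the suggested range [N/2, 2N/3]) and the simple truncation of the last τ samples are illustrative choices.

```python
import numpy as np

def ppt_quadratic(x, tau=None):
    """Polynomial-phase transform for quadratic phases: y(t) = x(t + tau) * x(t).
    Each chirp rate delta_i then shows up as a tone at frequency 2 * tau * delta_i."""
    N = len(x)
    if tau is None:
        tau = int(round(0.6 * N))      # a value inside the suggested range [N/2, 2N/3]
    y = x[tau:] * x[:N - tau]          # y(t) for t = 0, ..., N - tau - 1
    return y, tau
```

The frequencies of y(t) can then be estimated with any of the robust MUSIC variants of Section 11.4, from which the δ_i follow.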
11.3 IF Estimation Procedure of FM Signals
Now we apply one of the proposed algorithms of Section 11.4 to y(t) to estimate the parameters δ_i, i = 1, ..., I_1.² In order to estimate the parameters f_i, i = 1, ..., I, we consider the demodulation of the signal as follows: for i = 1, ..., I_1, we compute
x^{(i)}(t) = x_a(t) \exp(-j 2\pi \hat{\delta}_i t^2) \approx \sum_{k \in J_i} \exp\{j(2\pi f_k t + \theta_i)\} + w(t)
where J_i is the set of component indices with the same coefficient δ_i, \hat{\delta}_i is the estimate of δ_i, x_a(t) is the analytic signal of x(t), and w(t) represents noise plus interference. For each demodulated signal, we estimate the frequencies {f_k, k ∈ J_i} using one of the proposed algorithms (see Section 11.4) applied to the real part of the demodulated signal \Re\{x^{(i)}(t)\}. Note that it is not necessary to use a high-resolution method when J_i contains a single signal index.
¹ Note that z_1(t) is an impulsive noise but not necessarily SαS.
² We might have I_1 < I in the case where certain chirp components of the signal have the same phase coefficients δ_i but different coefficients f_i.
11.4 Robust Subspace Estimation
In this section, we address the frequency estimation problem for multicomponent sinusoidal signals observed in an impulsive noise environment, given by equation (11.1) with φ_i(t) = 2π f_i t + θ_i. We propose to apply the high-resolution subspace algorithm MUSIC (MUltiple SIgnal Classification) [Benidir(2002)] for the frequency estimation. As the performance of the standard MUSIC algorithm based on the sample covariance matrix degrades if the underlying noise is impulsive, we propose to apply MUSIC in the following three ways (a generic sketch of the MUSIC step shared by the three variants is given after the list):
1. In the first one, we apply MUSIC to the truncated harmonic signal.
2. In the second one, we apply MUSIC to the generalized covariation function
of the signal.
3. In the third one, we apply MUSIC to the minimax robust covariance estimate
of the harmonic signal.
11.4.1 TRUNC-MUSIC algorithm
In an α-stable environment, the use of the sample covariance is no longer appropriate for frequency estimation due to the infinite variance of the noise. To avoid this difficulty, we propose to truncate in amplitude the 'large-valued' observations that represent "large" impulsive noise realizations and to apply MUSIC to the finite covariance matrix of the truncated process. TRUNC-MUSIC (TRUNC stands for truncation) is summarized in Table 11.1.
TRUNC-MUSIC Algorithm
Step 1. Truncation constant choice: compute the histogram of the data and choose K such that [−K, K] contains 90% of the data.
Step 2. Pre-processing: truncate the signal according to
x̃(t) = x(t) if |x(t)| ≤ K,  and  x̃(t) = sign[x(t)] K if |x(t)| > K.
Step 3. Frequency estimation: apply the MUSIC algorithm to the covariance matrix of the truncated signal x̃(t).

Tab. 11.1 – The proposed frequency estimation TRUNC-MUSIC algorithm.
11.4.2 FLOS-MUSIC algorithm
In this section we propose to use the fractional lower order statistics (FLOS) of the signal for the frequency estimation. We consider an L × L generalized covariation coefficient (GCC) matrix Γ, whose (n, l)-th entry is given by:
\Gamma_{n,l} = \frac{[x(n), x(l)]_\alpha}{[x(l), x(l)]_\alpha} = \frac{E[x(n)\, x(l)^{\langle p-1 \rangle}]}{E[|x(l)|^p]},   1 ≤ p < α        (11.7)
where x^{\langle p-1 \rangle} = |x|^{p-1} sign(x). It has been shown in [Altinkaya et al.(2002)] that for a sinusoidal signal in α-stable noise, we have:
\Gamma_{n,l} = \sum_{i=1}^{I} \eta_i \cos\{2\pi f_i (n - l)\} + P_z \delta_n        (11.8)
where {η_i, i = 1, ..., I} are positive real constants depending on α and a_i, P_z is a real constant depending on the noise pdf and δ_n is the Kronecker coefficient.
Equation (11.8) shows that we can obtain the frequency estimates by applying the MUSIC algorithm to the GCC matrix Γ. In practice, we follow the procedure summarized in Table 11.2.
FLOS-MUSIC Algorithm
Step 1. Compute an estimate of Γ_{n,l} (for p = 1) using [Altinkaya et al.(2002)]
\hat{\Gamma}_{n,l} = \frac{\sum_{i=1}^{N-M+1} x(n+i-1)\, \mathrm{sign}(x(l+i-1))}{\sum_{i=1}^{N-M+1} |x(l+i-1)|}        (11.9)
Step 2. Apply MUSIC to the GCC matrix estimate [\hat{\Gamma}_{n,l}]_{1≤n,l≤L} for the frequency estimation.

Tab. 11.2 – The proposed robust frequency estimation FLOS-MUSIC algorithm.
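A minimal sketch of the estimator (11.9) for p = 1; the choice M = L (so that all indices stay in range) and the plain double loop are illustrative assumptions.

```python
import numpy as np

def gcc_matrix(x, L):
    """Estimate of the L x L generalized covariation coefficient matrix (p = 1),
    following Eq. (11.9): ratio of sum x(n+i) sign(x(l+i)) to sum |x(l+i)|."""
    N = len(x)
    M = L                                   # assumed snapshot length
    K = N - M + 1                           # number of terms in the sums
    Gamma = np.empty((L, L))
    for n in range(L):
        for l in range(L):
            num = np.sum(x[n:n + K] * np.sign(x[l:l + K]))
            den = np.sum(np.abs(x[l:l + K]))
            Gamma[n, l] = num / den
    return Gamma
```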
11.4.3 ROCOV-MUSIC algorithm
[A]- Robust estimation of the covariance.
Huber considered the parameter estimation problem in the presence of outliers or impulsive noise and proposed the M-estimation framework of robust statistics [Huber(1981)]. Here, we consider M-estimates for the signal auto-covariance function γ(k) ≜ E[x(t + k) x(t)]. Note that robust autocovariance estimation is equivalent to robust variance estimation according to
E(XY) = \frac{1}{4}\,[\mathrm{Var}(X + Y) - \mathrm{Var}(X - Y)]        (11.10)
where Var denotes the variance. Since an α-stable distribution has infinite variance, we propose to first truncate the observations using a large-valued constant K ≫ 1.
[Huber(1981)]
N −1
1 X
x2 (i)
u(d2i ) 2 − u(d2i ) = 0
(11.11)
N
σ
i=0
x2 (i)
σ2
where d2i =
is the Mahalanobis quadratic distance and u is a weighting function defined in IR+ .
The existence and uniqueness of the solution of Eq.(11.11) was shown in
[Huber(1981)] under mild assumptions about the weighting function such as boundedness and continuity.
This function is typically chosen such that observations coming from the tails of
the assumed contaminated distribution are down-weighted.
Here, we use the robust non-descending weighting function which is based on
Huber’s minimax function given by u(d) = ω(d)/d with :
ω(d) = min(d, k)        (11.12)
where k is a suitable constant [Huber(1981)]. We can compute the M-estimate of
the variance as a solution of the latter equation.
Then the needed covariance matrix is estimated through the auto-covariance coefficients as shown in Table 11.3.
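The sketch below illustrates the ROCOV idea with a simplified fixed-point reweighting: a Huber-weighted update of the variance, initialized by the sample variance, then plugged into identity (11.10) to obtain γ̂(k). The tuning constant k, the tolerance, and the weighted-mean form of the update are assumptions of this sketch rather than the exact sweep of Table 11.3.

```python
import numpy as np

def huber_weight(d, k=1.345):
    """Huber non-descending weight u(d) = min(d, k) / d, cf. Eq. (11.12)."""
    d = np.maximum(d, 1e-12)                   # avoid division by zero
    return np.minimum(d, k) / d

def robust_variance(x, k=1.345, n_iter=50, tol=1e-6):
    """Fixed-point Huber M-estimate of the variance of x (simplified ROCOV sweep)."""
    sigma2 = np.mean(x ** 2)                   # standard initialization (Step 1)
    for _ in range(n_iter):
        d = np.abs(x) / np.sqrt(sigma2)        # Mahalanobis distances d_i
        w = huber_weight(d, k)
        new_sigma2 = np.sum((w * x) ** 2) / np.sum(w ** 2)
        if abs(new_sigma2 - sigma2) <= tol * sigma2:
            return new_sigma2
        sigma2 = new_sigma2
    return sigma2

def robust_autocovariance(x, lag):
    """gamma(lag) via identity (11.10): E(XY) = [Var(X+Y) - Var(X-Y)] / 4."""
    X, Y = x[lag:], x[:len(x) - lag]
    return (robust_variance(X + Y) - robust_variance(X - Y)) / 4.0
```

The Toeplitz matrix built from γ̂(0), ..., γ̂(L−1) is then passed to MUSIC as in Table 11.4.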
[B]- Robust frequency estimation
Now, we apply the subspace approach to estimate the parameters of the sinusoidal signals. Thus, we outline the proposed algorithm ROCOV-MUSIC in
Table 11.4.
ROCOV Algorithm
Step 1. Initialize the ROCOV algorithm with the standard variance estimator σ_0^2 = \frac{1}{N} \sum_{i=0}^{N-1} x^2(i).
Step 2. Sweep: at the (j + 1)-th iteration compute
\sigma_{j+1}^2 = \frac{\sum_{i=0}^{N-1} \omega_{i,j}^2\, x^2(i)}{\sum_{i=0}^{N-1} \omega_{i,j}^2 - 1} ;   ω_{i,j} = u(d_{i,j}) = ω(d_{i,j})/d_{i,j},   d_{i,j}^2 = \frac{x^2(i)}{\sigma_j^2},
where ω is the Huber non-descending function given above in (11.12).
Step 3. According to Equation (11.10), compute the M-estimates γ̂(k), k = 0, ..., L − 1, using the M-estimator of the variance of [x(t + k) + x(t)] and [x(t + k) − x(t)] computed in Step 2 above.
Step 4. Stop the sweeps when the error is smaller than a given threshold ε.

Tab. 11.3 – The proposed robust covariance estimation ROCOV algorithm.
ROCOV-MUSIC Algorithm
Step 1. Compute the M-estimates γ̂(k), k = 0, ..., L − 1, using the so-called ROCOV algorithm summarized in Table 11.3.
Step 2. Apply MUSIC to the robust covariance matrix estimate Γ̂_x = Toeplitz[γ̂(k)_{0≤k≤L−1}] for the frequency estimation.

Tab. 11.4 – The proposed frequency estimation ROCOV-MUSIC algorithm.
11.5 Performance Evaluation & Comparison
11.5.1 Mixture of sinusoidal components
Here, we perform a simulation-based comparison of the proposed robust frequency estimation methods TRUNC-MUSIC, ROCOV-MUSIC and FLOM-MUSIC. For that, we consider three (I = 3) sinusoidal components with the same amplitude a_1 = a_2 = a_3 = 1 and frequencies f_1 = 0.1, f_2 = 0.3 and f_3 = 0.4.
We assume that the signal is affected by an impulsive noise with α-stable distribution. The characteristic exponent value is α = 1.5. We run 200 Monte-Carlo realizations to compute all considered statistics.
Figure 11.1 presents the mean square error (MSE) versus the noise dispersion in dB (the considered sample size is N = 1000). In Figure 11.2, we present the MSE versus the sample size. The noise dispersion considered here is γ = 0.1.
[Figure: comparison of the TRUNC-MUSIC, ROCOV-MUSIC and FLOM-MUSIC algorithms; MSE versus noise dispersion in dB.]
Fig. 11.1: The MSE versus the noise dispersion in dB, N=1000.
[Figure: comparison of the FLOM-MUSIC, TRUNC-MUSIC and ROCOV-MUSIC algorithms; MSE versus sample size.]
Fig. 11.2: The MSE versus the sample size, γ = 0.1.
These figures show the effectiveness of the proposed methods.
11.5.2 Mixture of two chirps
In this subsection, we conduct three experiments to illustrate the proposed procedure of IF estimation (or, equivalently, phase parameter estimation). In the first one we use the TRUNC-MUSIC algorithm in the second step of our proposed approach, as introduced in Section 11.3, while in the second one we use the ROCOV-MUSIC algorithm, and the FLOS-MUSIC algorithm in the third experiment.
For that, we consider a mixture of two linear FM³ components (I = 2) with the same amplitudes a_1 = a_2 = 1, frequencies f_1 = 0.05, f_2 = 0.3 and second-order parameters δ_1 = 0.0001 and δ_2 = 0.0003.
We suppose that the signal is affected by an impulsive noise with α-stable model (α = 1.5). We run 500 Monte-Carlo realizations to compute all evaluated statistics.
[Figure: four panels (Chirp 1 and Chirp 2) showing the MSE of f_1, f_2, δ_1 and δ_2 versus the sample size.]
Fig. 11.3: The MSE versus the sample size, γ = 0.1.
Figures 11.3 and 11.4 show the MSE of the estimated phase parameters versus the sample size and the noise dispersion, respectively.
The three proposed techniques are compared using the same legend as in the previous figures 11.1 and 11.2. These simulation examples show the effectiveness of the proposed methods in mitigating the impulsive noise. The comparative study clearly shows a certain advantage for the ROCOV-MUSIC-based procedure.
³ Linear FM (LFM) signals are also commonly called chirp signals.
[Figure: four panels (Chirp 1 and Chirp 2) showing the MSE of f_1, f_2, δ_1 and δ_2 versus the noise dispersion.]
Fig. 11.4: The MSE versus the noise dispersion in dB, N=1000.
11.6 Concluding Remarks
In this chapter, three two-step methods for IF estimation in heavy-tailed noise are introduced. The first step consists of transforming the polynomial phase estimation into a frequency estimation problem. The frequency estimation in the second step of the proposed parametric methods is based on the use of the subspace MUSIC algorithm, applied respectively to the amplitude-truncated signal, to the robust covariance matrix and to the generalized covariation coefficients matrix. Simulation results are presented to validate our IF estimation methods.
In the considered simulation context, the comparative study shows the superiority of the parametric method using the robust covariance estimation technique (ROCOV-MUSIC).
Chapitre 12
Robust Time-Frequency Approaches
As shown in this chapter, the conventional TFDs are quite sensitive to non-Gaussian noise, in particular to impulsive noise, in which case they produce poor estimation results. In order to get a good estimation performance in this context, we propose in a first approach a preprocessing stage of the signal to attenuate the impulsive noise effect before computing the signal TFD. In the second approach, we use robust statistics theory to define a new robust TFD, named the robust MB-distribution (MB-distribution: modified B-distribution). We show that the TFDs resulting from the two proposed approaches are able to reveal the instantaneous frequency of the noisy multicomponent signal in an accurate way.
12.1 Introduction and Problem Statement
This chapter is concerned with the analysis of multi-component FM signals,
corrupted by additive heavy-tailed noise. A multi-component signal means a signal
whose time-frequency representation presents multiple ridges in the time-frequency
plane.
• Signal model
Analytically, the noisy signal considered in this chapter is defined as
x(t) = s(t) + z(t) = \sum_{i=1}^{M} s_i(t) + z(t)        (12.1)
where each component s_i(t), of the form s_i(t) = a_i(t) e^{jφ_i(t)}, is assumed to have only one ridge, or one continuous curve, in the time-frequency plane. a_i(t) is the amplitude and φ_i(t) denotes the phase of the i-th component of the signal. The probability density function (PDF) of the random impulsive noise z(t) is modeled as a heavy-tailed distribution¹. Examples of this kind of distribution include α-stable laws with α < 2 and generalized Gaussian laws.
• Symmetric α-stable process (SαS): As presented in the first part of this thesis, the PDF of SαS processes does not have a closed form except for the cases α = 1 (Cauchy distribution), α = 2 (Gaussian distribution) and α = 1/2 (Lévy distribution). Due to their heavy tails, stable distributions do not have finite second or higher-order moments, except for the limiting case α = 2.
• Generalized Gaussian (GG) PDF: Another way to model impulsive noise processes is through the generalized Gaussian PDF given by f_α(x) = A exp(−b|x|^α), where 0 < α ≤ 2. For α = 2 we have the Gaussian distribution and for α = 1 we have the Laplacian distribution, which is known to be a good model for impulsive noise.
• Time-frequency analysis: Our primary interest, in this work, is to estimate the instantaneous frequency of each FM signal s_i(t) of (12.1), defined as
IF_i(t) \triangleq \frac{1}{2\pi} \frac{d\phi_i(t)}{dt}        (12.2)
Time-frequency analysis techniques are used here as they reveal the multicomponent nature of such signals. Ideally, for a given FM signal, the TFD is represented as a row of delta functions around the signal's instantaneous frequency. This property makes the peak of the TFD a very powerful tool as an IF estimator. However, quadratic TFDs of multi-component signals suffer from the presence of cross-terms, which can obscure the real features of interest in the signal. The properties of a quadratic TFD are completely determined by its kernel. This kernel should have the shape of a two-dimensional (2-D) low-pass filter to attenuate the cross-terms that exist away from the origin in the ambiguity domain and preserve the auto-terms that concentrate around the origin of this domain [Hussain et Boashash(2002)]. Considerable efforts have been made to define TFDs that reduce the effect of cross-terms while improving the time-frequency resolution (e.g., [Hussain et Boashash(2002), Barkat et Abed-Meraim(2004)]). This led to the so-called reduced interference distributions, which include the modified B-distribution (MBD), and to the signal-dependent optimal time-frequency representation. In this work, we have used the MBD [Hussain et Boashash(2002)] given by:
T(t, f) = \int\!\!\int_{-\infty}^{+\infty} G_{MB}^{\sigma}(t') \left[ x\!\left(t - t' + \frac{\tau}{2}\right) x^{*}\!\left(t - t' - \frac{\tau}{2}\right) \right] e^{-j 2\pi f \tau}\, dt'\, d\tau        (12.3)
where G_{MB}^{\sigma}(t') = \frac{k_\sigma}{\cosh(t')^{2\sigma}}, 0 ≤ σ ≤ 1 is a real parameter that controls the tradeoff between component resolution and cross-term suppression, and k_\sigma = \Gamma(2\sigma)/(2^{2\sigma-1}\Gamma^2(\sigma)) is the normalizing factor. The choice of the MBD stems from the fact that it presents a good performance in terms of resolution and cross-term suppression [Hussain et Boashash(2002)]. The effect of additive Gaussian
¹ For a complex-valued noise signal, we simply consider that z(t) = z_r(t) + j z_i(t), where z_r(t) and z_i(t) represent two independent heavy-tailed processes with the same pdf.
noise on the time-frequency representation is another consideration that has a direct influence on the instantaneous frequency estimation and is an important issue [Peleg et Friedlander(1996)], [Hussain et Boashash(2002)].
However, in many practical applications, especially in communications, signals are disturbed by impulsive noise due to the propagation environment or to large errors in collecting and recording the data. These noise processes are commonly modeled by heavy-tailed distributions [Nikias et Shao(1995)]. Since outliers or impulsive noise have an unusually great influence on standard IF estimators, robust procedures attempt to modify those schemes. Only a limited literature has been dedicated to the analysis of multi-component FM signals in impulsive noise. In [Sahmoudi et al.(2004b)], the authors propose a class of robust parametric methods to handle linear FM signals. In the same paper [Sahmoudi et al.(2004b)], a TFD-based technique has been proposed using a pre-processing stage to mitigate the impulsive noise effect. The other alternative, which is the focus of this chapter, is to apply the M-estimation principle in order to design TFDs that are robust with respect to impulsive noise. In [Katkovnik et al.(2003)] and [Barkat et Stankovic(2004)], the authors proposed the robust spectrogram and the robust polynomial Wigner-Ville distribution (PWVD), respectively. However, it is known that the spectrogram suffers from low resolution in the time-frequency domain, while the PWVD suffers from cross-terms for multi-component signals. In this chapter, we use the modified B-distribution [Hussain et Boashash(2002)] and M-estimation theory to design a new robust TFD, referred to as the robust modified B-distribution (R-MBD), which is used for the analysis of multi-component FM signals in heavy-tailed noise. We show that the proposed approach can solve problems that existing time-frequency distributions cannot.
12.2 Failure of Standard TFD in Impulsive Noise
12.2.1 Effect of impulsive spike noise on TFD
To examine the effect of additive impulsive noise on the time-frequency representation of a signal, it is useful to use a model which is simple and provides considerable insight into the nature of the artifacts that appear. We will carry out our analysis in discrete time. For a clear and simple illustration, let us consider the spike model for impulsive noise. The signal to be examined is
x(n) = s(n) + A \delta_K(n - n_0)        (12.4)
where δ_K(n) is the Kronecker delta function, A δ_K(n − n_0) represents the spike noise model and A ≫ E_s is its amplitude. E_s = \sum_n |s(n)|^2 is the energy of the signal
s(n). If we compute for example the WVD of x(n), we get
W_x(n, f) = 2 \sum_m \{ s(n+m) + A\,\delta_K(n+m-n_0) \} \{ s^*(n-m) + A^*\,\delta_K(n-m-n_0) \}\, e^{-j4\pi m f}
          = W_s(n, f) + 2A^* \sum_m s(n+m)\,\delta_K(n-m-n_0)\, e^{-j4\pi m f} + 2A \sum_m s^*(n-m)\,\delta_K(n+m-n_0)\, e^{-j4\pi m f} + 2|A|^2 \sum_m \delta_K(n+m-n_0)\,\delta_K(n-m-n_0)\, e^{-j4\pi m f}
          = W_s(n, f) + \mathrm{Real}\{ A^* s(2n-n_0)\, e^{-j4\pi(n-n_0) f} \} + 2|A|^2 \delta_K(n-n_0)
where Real(z) denotes the real part of the complex number z. The effect of this single impulse at n = n_0 is to place a very strong impulsive ridge with magnitude 2|A|² in the time-frequency plane, extending over all frequencies. In addition, there is a secondary artifact that is the result of the cross-product of the signal s(n) with the impulse. This artifact is a decimated copy of s(n) extending over all frequencies and modulated in the normalized frequency domain by the complex exponential term exp(−j4π(n − n_0)f). This additive cross-term oscillates more rapidly in the normalized frequency domain for values of n that are further removed from n_0.
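The effect derived above is easy to reproduce numerically; the following sketch computes a discrete WVD slice by slice for a chirp plus a single large spike, so that the impulsive ridge at n = n_0 and the oscillating cross-term become visible. The symmetric-lag implementation and the parameter values are illustrative.

```python
import numpy as np

def discrete_wvd(x, n_freq=256):
    """Discrete WVD W(n, f) = 2 sum_m x(n+m) x*(n-m) e^{-j 4 pi f m}, evaluated
    over the admissible symmetric lags m for each time index n."""
    N = len(x)
    W = np.zeros((N, n_freq))
    f = np.arange(n_freq) / (2.0 * n_freq)                 # grid f in [0, 0.5)
    for n in range(N):
        m_max = min(n, N - 1 - n)
        m = np.arange(-m_max, m_max + 1)
        kernel = x[n + m] * np.conj(x[n - m])              # bilinear kernel in lag
        W[n] = 2.0 * np.real(np.exp(-4j * np.pi * np.outer(f, m)) @ kernel)
    return W

# Linear FM plus a strong spike at n0 = 128: the spike spreads over all frequencies
n = np.arange(256)
x = np.exp(1j * 2 * np.pi * (0.1 * n + 5e-4 * n ** 2)).astype(complex)
x[128] += 30.0
W = discrete_wvd(x)
```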
12.2.2 Effect of impulsive α-stable noise on TFD
Because alpha-stable noise is impulsive, its effect on quadratic time-frequency representations (QTFRs) is different from the effect observed in the Gaussian case (α = 2). In the Gaussian case, the energy of the noise is uniformly spread over the time-frequency plane. This can be seen by examining the autocorrelation function of the observed signal x(n), given by
R_x(n, m) = E\{ x(n+m)\, x^*(n-m) \}        (12.5)
One can show that the time-frequency representation of a signal s(n) in additive alpha-stable noise is severely degraded. Indeed, let z(n) denote the α-stable additive noise with α < 2; then, for m ≠ 0,
R_x(n, m) = s(n+m)\, s^*(n-m) + s(n+m)\, E\{z^*(n-m)\}        (12.6)
          + s^*(n-m)\, E\{z(n+m)\} + E\{z(n+m)\, z^*(n-m)\},        (12.7)
and for m = 0,
R_x(n, 0) = |s(n)|^2 + s(n)\, E\{z^*(n)\} + s^*(n)\, E\{z(n)\} + E\{|z(n)|^2\}        (12.8)
Since E{z(n)} is infinite when α ≤ 1, all elements of the autocorrelation matrix are infinite. Also, since E{|z(n)|²} is infinite when α < 2, we have R_x(n, 0) = ∞ for all n. Thus the autocorrelation blows up for α < 2, making standard time-frequency representations useless for characterizing signals in impulsive environments.
12.2.3 The need of robust TFD in Gaussian environment
Here, we suppose that the noise z(n) is a complex Gaussian random process and we examine the instantaneous autocorrelation function of the observed signal x(n). One can write the noise PDF as
p_z(z) = \frac{1}{\pi\sigma^2} \exp\!\left( -\frac{|z|^2}{\sigma^2} \right)        (12.9)
We can express the instantaneous autocorrelation of the observed signal as
R_x(n, m) = s_x(n, m) + z_1(n, m) + z_2(n, m) + R_z(n, m)        (12.10)
Clearly, the term s_x(n, m) is deterministic, while the two terms z_1(n, m) and z_2(n, m) represent complex random variables. To analyse the instantaneous autocorrelation behavior, one must analyse the probability distribution of the final term R_z(n, m). Using the PDF formula for functions of random variables [Benidir(2002)], we have
p_{R_z} ∝ p_z(h^{-1}(y))        (12.11)
        ∝ \exp(-c\, |h^{-1}(y)|^2)        (12.12)
        ∝ \exp(-c\, |y|)        (12.13)
where h is the considered transform z ↦ h(z) = z z^*. Thus the instantaneous autocorrelation has a Laplace PDF, which has heavy tails. Hence, the computation of a QTFD of a signal in Gaussian noise generates an impulsive noise. Consequently, robust time-frequency analysis is necessary also in a Gaussian environment.
12.3 Pre-processing Techniques based Approach
The first step consists in reducing the impulsive noise amplitudes in order to
improve the quality of the TFD of the considered noisy signal. To do so, two
solutions might be suggested.
12.3.1 Exponential compressor filter
We propose here to pass the noisy signal through a nonlinear device that compresses the large amplitudes (i.e., reduces the dynamic range of the noisy signal) before further analysis [Barkat et Abed-Meraim(2003b)]. The output of the nonlinear device is expressed as
x̃(t) = ψ_β[x(t)] = |x(t)|^β sign[x(t)]
where 0 < β ≤ 1 is a real coefficient that controls the amount of compression applied to the input noisy signal x(t).
This technique is similar to that used in nonuniform quantization where a
totally different nonlinear law is used [Jayant et Noll(1984)]. A plot of this compressor law is displayed in Figure 12.1 for different values of β.
Observe that the compressor law is linear around the origin (i.e., for very small input values). The linearity and its corresponding interval range obviously depend on the value of β.
[Figure: compressor output versus compressor input for β = 0.1, 0.5 and 0.9.]
Fig. 12.1: The nonlinear law of the compressor used in the pre-processing stage.
The smaller β is, the smaller the linearity range. This means that for weak signals (i.e., when the noiseless signal amplitude is small enough compared to the noise spikes), and using an appropriate value of β, the compressor output signal may be approximated by a scaled version of the input noiseless signal embedded in a new additive noise whose variance is much smaller than the input noise variance. Figure 12.2 displays the time representation of a linear FM signal in impulsive noise compressed using β = 1 (i.e., no compression), β = 0.9, β = 0.5 and β = 0.2, respectively. If we assume the effect of the compressor on the desirable noiseless signal characteristics (i.e., its IF) to be negligible, then the achieved reduction in the noisy signal variance will yield better results in its analysis.
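A direct transcription of the compressor law ψ_β; the handling of complex-valued signals (compressing the modulus and keeping the phase) is an assumption added for completeness.

```python
import numpy as np

def exp_compressor(x, beta=0.2):
    """Nonlinear compressor x_tilde = |x|**beta * sign(x), with 0 < beta <= 1."""
    x = np.asarray(x)
    if np.iscomplexobj(x):
        # assumed extension to complex signals: compress the modulus, keep the phase
        return np.abs(x) ** beta * np.exp(1j * np.angle(x))
    return np.abs(x) ** beta * np.sign(x)
```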
[Figure: time-domain amplitude of a noisy linear FM signal for β = 1, 0.9, 0.5 and 0.2.]
Fig. 12.2: Compression of a linear FM signal in impulsive noise using different values of β.
12.3.2 Huber filter
We use here the Huber criterion to define the Huber filter, which truncates in amplitude the 'large-valued' observations that represent "large" impulsive noise realizations. For the choice of the truncation constant K, we propose to compute the histogram of the observations and choose K such that [−K, K] contains 90% of the data; then, the output of the Huber filter is expressed as:
x̃(t) = ψ_H[x(t)] = x(t) if |x(t)| ≤ K,  and  x̃(t) = ψ_H[x(t)] = sign[x(t)] K if |x(t)| > K.
The second step consists in applying the time-frequency analysis presented in Section 12.5 to the processed signal x̃(t) for the IF estimation problem.
12.4 Robust Time-Frequency Approach
In order to get a good TFD-based IF estimation performance in an impulsive environment, we use the robust statistics theory of M-estimation to define a new robust quadratic time-frequency distribution.
12.4.1 Optimal TFD kernel in α-stable noise
Recall that the fractional lower-order moments (FLOMs) of an alpha-stable random variable with zero location parameter and dispersion γ are given by E|X|^p = C(p, α)\, γ^{p/α} for 0 < p < α, where C(p, α) is a constant depending only on p and α. This tells us that the p-th order moment of an α-stable random variable and its dispersion are related through only a constant. Therefore, the MD criterion is equivalent to least Lp-norm estimation with 0 < p < α, and the estimates of a parameter θ can be obtained from equation (3.18) using the Lp-norm loss function
ρ_p(x) = |x|^p ;   ψ_p(x) = p\, \mathrm{sign}(x)\, |x|^{p-1}        (12.14)
(sign(x) ≜ x/|x|) as a tool of robust estimation, which appeared originally as a heuristic idea supported later by theoretical and experimental studies. In particular, for p = 1, the L1-norm criterion, referred to as the "modulus function", was used in [Katkovnik et al.(2003), Barkat et Stankovic(2004)] to define the robust periodogram and robust PWV distributions. It should be emphasized that the least Lp-norm estimates are not only optimal in an MD sense for α-stable data, but also optimal in the maximum likelihood sense for the family of generalized Gaussian distributions. Indeed, the ML estimator coincides with the Lp-norm criterion when p is chosen equal to the index of the generalized Gaussian PDF. In addition, applying equations (3.25) and (3.26) over the class of generalized Gaussian pdfs, we can easily show that the least Lp-norm estimate is also optimal in the robust minimax sense if we choose p as the smallest value in the considered set of p values. It is recognized that "outliers", which arise from heavy-tailed noise distributions or are simply bad points due to measurement errors, have an unusually large influence on standard estimators based on least squares. Accordingly, as mentioned previously, robust methods have been developed to modify least
squares schemes so that the outliers have much less influence on the final estimates.
One of the most satisfying robust procedures is that given by a modification of
the principle of maximum likelihood ; hence we proceed with that approach called
M-estimation [Huber(1981)].
12.4.2 A new robust quadratic time-frequency distribution
Let us consider the noisy signal (12.1) in discrete time, x(kT) = s(kT) + z(kT), where T is the sampling period. A standard time-frequency distribution, at a point (kT, f), can be shown to be a solution of the optimization problem [Katkovnik et al.(2003)]
\hat{B} = \arg\min_{B} J(kT, f, B)        (12.15)
where
J(kT, f, B) = \sum_{n=-N/2}^{N/2} w(nT)\, \rho[e(k, f, n)],   e(k, f, n) = G_x(kT, nT)\, e^{-j2\pi f nT} - B        (12.16)
where w(nT) is a window function, G_x(kT, nT) is the kernel of the considered quadratic time-frequency distribution of the FM signal x(kT), and B is an estimate of the expectation of the sample average of the quantity G_x(kT, nT) e^{-j2π f nT}. If we choose the loss function ρ(e) = |e|², we can show, by solving dJ(kT, f, B)/dB* = 0 for B, that the optimal solution corresponds to the standard TFD
B_x^s(kT, f) = \sum_{n=-N/2}^{N/2} \frac{w(nT)}{\sum_{n=-N/2}^{N/2} w(nT)}\, G_x(kT, nT)\, e^{-j\pi f nT}        (12.17)
Thus, for a weighted window, the standard TFD can be treated as an estimate of the mean, calculated over the set of complex-valued observations
G = \{ G_x(kT, nT)\, e^{-j\pi f nT} ;\ n \in [-N/2, N/2] \}
It has been shown that the optimal loss function ρ derived in Huber's minimax estimation theory (see Section 2) can be applied to the design of a new class of robust time-frequency distributions, inheriting the property of strong resistance to impulsive noise. In particular, some robust TFDs have been derived by using the absolute error loss function ρ(e) = |e| in (12.16) [Katkovnik et al.(2003)]. In this work, we propose to choose the loss function ρ in the criterion (12.16) as the Lp-norm criterion ρ(e) = |e|^p, where p < 2 is a parameter that controls the degree of the loss function. The choice of this criterion is well motivated in Section 12.4.1. In this work, we use the MBD to handle multi-component nonstationary FM signals given by model (12.1). However, similarly to the standard spectrogram, WVD and PWVD, the standard MB-distribution is not an adequate analysis tool in the presence of heavy-tailed noise. To mitigate this problem, we use the MB-distribution kernel given in Equation (12.3) and the Lp-norm loss function in the design of the proposed robust MBD to analyze FM signals affected
by impulsive noise. In this case, we find the optimal solution, labelled the robust modified B-distribution (R-MBD), to be
\frac{\partial}{\partial B^*} \left\{ \sum_{n=-N/2}^{N/2} w(nT)\, |G_x(kT, nT)\, e^{-j\pi f nT} - B|^p \right\} = 0        (12.18)
\Longleftrightarrow \sum_{n=-N/2}^{N/2} w(nT)\, (G_x(kT, nT)\, e^{-j\pi f nT} - B)\, |G_x(kT, nT)\, e^{-j\pi f nT} - B|^{p-2} = 0        (12.19)
\Longleftrightarrow B_x^r(kT, f) = \sum_{n=-N/2}^{N/2} \frac{d(k, f, n)}{D_0(kT, f)}\, G_x(kT, nT)\, e^{-j\pi f nT},        (12.20)
d(k, f, n) = w(nT)\, |G_x(kT, nT)\, e^{-j\pi f nT} - B_x^r(kT, f)|^{p-2},        (12.21)
D_0(kT, f) = \sum_{n=-N/2}^{N/2} d(k, f, n)        (12.22)
Since the quantity B_x^r(kT, f) appears on the right- as well as on the left-hand side of Equation (12.20), an iterative procedure is necessary in order to obtain the R-MBD. The robust-MBD algorithm is summarized in Table 12.1.
It was shown in [Kaluri et Arce(2000)] that the above iterative algorithm converges to a single (global) minimum under a good choice of the initial value.
Robust-MBD Computation
Step 1. Evaluate the standard MBD using equation (12.17).
Step 2. For initialization purposes, set the iteration index i = 0 and B_x^{r,0}(kT, f) = B_x^s(kT, f).
Step 3. Sweep: set i = i + 1 and do:
– compute d(k, f, n) and D_0(kT, f) using equations (12.21) and (12.22), respectively;
– compute the robust MBD for iteration i, B_x^{r,i}(kT, f), using Equation (12.20).
Step 4. If the relative absolute difference between two iterations is smaller than a fixed threshold ε, i.e.
|B_x^{r,i}(kT, f) − B_x^{r,i-1}(kT, f)| / |B_x^{r,i}(kT, f)| ≤ ε,
then stop the algorithm; otherwise go to Step 3.

Tab. 12.1 – Computation procedure of the Robust-MBD.
In our case, the choice of B_x^{r,0}(kT, f) = B_x^s(kT, f) satisfies the necessary condition for convergence.
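At a fixed (kT, f), the whole procedure of Table 12.1 reduces to an iteratively reweighted estimate of the 'mean' of the complex samples G_x(kT, nT) e^{-jπfnT}. The sketch below implements that single-point iteration; the vector of kernel samples is assumed to be precomputed, and p, the window and the threshold are illustrative.

```python
import numpy as np

def robust_point_estimate(g, w=None, p=1.0, eps=1e-3, max_iter=50):
    """Iteratively reweighted Lp estimate B of the samples g[n] = G_x(kT,nT) e^{-j pi f nT},
    following Eqs. (12.20)-(12.22); the standard (L2) value (12.17) is the starting point."""
    g = np.asarray(g, dtype=complex)
    w = np.ones(len(g)) if w is None else np.asarray(w, dtype=float)
    B = np.sum(w * g) / np.sum(w)                        # standard MBD value (Steps 1-2)
    for _ in range(max_iter):
        r = np.maximum(np.abs(g - B), 1e-12)             # residual magnitudes
        d = w * r ** (p - 2.0)                           # weights d(k, f, n) of (12.21)
        B_new = np.sum(d * g) / np.sum(d)                # update (12.20)
        if np.abs(B_new - B) <= eps * max(np.abs(B_new), 1e-12):
            return B_new
        B = B_new
    return B
```

Running this estimate at every time-frequency point, with the MBD kernel of (12.3) used for G_x, produces the R-MBD.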
12.5 IF Estimation & Component Separation
The proposed component separation algorithm consists of separating the signal components and estimating their respective IF laws from the signal TFD. In an impulsive environment, we propose to apply this algorithm (i) to the TFD of the pre-processed signal in the first procedure and (ii) to the robust MB-distribution of the noisy signal in the second procedure. The proposed component separation algorithm is illustrated in Table 12.2. The first step of the algorithm consists in noise thresholding to remove the undesired 'low' energy peaks in the time-frequency domain. This operation can be written as:
T_th(t, f) = T(t, f) if T(t, f) > ε,  and  T_th(t, f) = 0 otherwise,
where ε is a properly chosen threshold. In our simulations we used ε = 0.01 max_{(t,f)} T(t, f).
Assuming a 'clean' TFD, the IFs of the M components are estimated, at each time instant t, from the M peak positions of the TFD slice T_th(t, f). Let us observe that if, at a time instant t_0, two components are crossing, then the number of peaks (at this particular slice T(t_0, f)) is smaller than the total number of components M. For practical implementation reasons, we decide that a crossing occurs when the number of peaks is smaller than M over a fixed number of consecutive slices. In this case, we implement the following procedure:
1. Choose a particular maximum point location in the slice where the crossing occurs.
2. Measure all distances from this point to the peak locations of the previous slice (with no crossing).
3. Select the 2 smallest distances and add them.
4. Repeat Steps 1 to 3 for all other maximum point locations in the slice where the crossing occurred.
5. From the set of the smallest sums found above, the program selects the smallest value and the points associated with it. This yields the location where the crossing occurred and the 2 components involved in the crossing.
Then, we use a simple numerical permutation operation on the indices of the 2 components involved in the crossing. The details of the proposed separation technique are outlined in Table 12.2.
Time-Frequency based Component Separation Algorithm
1. Assign an index to each of the M components in an orderly manner.
2. For each time instant t (starting from t = 1), find the component frequencies as the peak positions of the TFD slice T(t, f).
3. Assign a peak to a particular component based on the smallest distance to the peaks of the previous slice T(t − 1, f) (the IFs are continuous functions of time). For the special case of a crossing point (see Step 4 for how to detect it and its corresponding components), we assign the peak to both crossing components.
4. If at a time instant t a crossing point exists (i.e., the number of peaks is smaller than the number of components), identify the crossing components using the smallest distance criterion by comparing the distances of the actual peaks to those of the previous slice.
5. Permute the indices of the corresponding crossing components.

Tab. 12.2 – Component separation procedure for the proposed algorithm.
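The core of Table 12.2 is a per-slice peak detection followed by nearest-neighbour assignment to the previous slice. The sketch below implements that basic association (Steps 1-3), with the noise thresholding of Section 12.5 but without the crossing-permutation refinement of Steps 4-5; the local-maximum detector and the greedy assignment are illustrative.

```python
import numpy as np

def track_if_laws(tfd, freqs, n_comp, thresh_ratio=0.01):
    """Track n_comp IF laws from a TFD matrix (rows = time slices) by peak picking
    and nearest-neighbour assignment to the previous slice."""
    T = np.where(tfd > thresh_ratio * tfd.max(), tfd, 0.0)     # noise thresholding
    tracks = np.full((T.shape[0], n_comp), np.nan)
    for t in range(T.shape[0]):
        s = T[t]
        # crude local maxima of the slice, strongest first, at most n_comp of them
        peaks = [i for i in range(1, len(s) - 1)
                 if s[i] > 0 and s[i] >= s[i - 1] and s[i] >= s[i + 1]]
        peaks = sorted(peaks, key=lambda i: s[i], reverse=True)[:n_comp]
        if t == 0:
            for c, i in enumerate(peaks):
                tracks[t, c] = freqs[i]
        else:
            used = set()
            for i in peaks:                                    # greedy nearest-neighbour
                order = np.argsort(np.abs(tracks[t - 1] - freqs[i]))
                c = next(int(c) for c in order if int(c) not in used)
                used.add(c)
                tracks[t, c] = freqs[i]
    return tracks
```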
12.6 Performance Evaluation & Comparison
The estimation performance is measured by the normalized MSE defined by
NMSE = \frac{1}{N_r} \sum_{r=1}^{N_r} \frac{\|\hat{\theta}_r - \theta\|^2}{\|\theta\|^2}
where θ is the considered parameter, \hat{\theta}_r is the estimate of θ at the r-th experiment, and N_r is the number of Monte-Carlo runs, chosen here equal to 500.
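For reference, the NMSE above amounts to the following short computation (array shapes are assumptions of the sketch).

```python
import numpy as np

def nmse(theta_hat, theta):
    """Normalized MSE over Monte-Carlo runs; theta_hat has shape (Nr, dim)."""
    theta = np.asarray(theta, dtype=float)
    err = np.sum((np.asarray(theta_hat, dtype=float) - theta) ** 2, axis=1)
    return np.mean(err) / np.sum(theta ** 2)
```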
[A]- First experiment
To check the validity and superiority of the proposed algorithm, we consider the time-frequency representation of a three-component FM signal corrupted by an impulsive noise modeled as a generalized Gaussian distribution with α = 1.5. The standard MBD, displayed in Fig. 12.3, yields a poor representation, while the R-MBD, displayed in Fig. 12.4, clearly reveals the features of the noisy signal. The superiority of the R-MBD over the standard MBD is obvious in this example.
Time−res=1
500
450
400
Time (seconds)
350
300
250
200
150
100
50
0.05
0.1
0.15
0.2
0.25
0.3
Frequency (Hz)
0.35
0.4
0.45
Fig. 12.3: The standard MBD of the multi-component signal test.
[Figure: time-frequency image (time in seconds versus frequency in Hz), Fs = 1 Hz, N = 512.]
Fig. 12.4: The Robust-MBD of the multi-component signal test.
[B]- Second experiment
In this experiment, we consider a discrete-time multicomponent FM signal consisting of two linear FM components embedded in additive impulsive noise,
x(n) = s_1(n) + s_2(n) + z(n),    n = 0, 1, ..., N − 1
where s_1(n) = exp{j2π(a_1 n + b_1 n²)} and s_2(n) = exp{j2π(a_2 n + b_2 n²)}. The noise z(n) is chosen to be α-stable with zero location parameter, characteristic exponent α = 1 and dispersion equal to γ = 1. The signals' IF coefficients are given by a_1 = 0.2, b_1 = 0.1 × 10⁻³, a_2 = 0.45 and b_2 = −1.5 × 10⁻³.
In the first step, we perform a pre-processing of the noisy signal to mitigate the impulsive noise, using the exponential compressor filter (with parameter β = 0.1) in the exp-TFD algorithm and the Huber filter in the Huber-TFD algorithm.
In the second step, we put the pre-processed signal x̃(n) through the proposed algorithm (we chose σ = 0.01 for the MB-distribution kernel) in order to extract the two respective components. The peaks of the extracted components (in the time-frequency domain) are then used to estimate the IFs of the chirps. We use a simple polynomial fit to obtain estimates of (a_1, b_1) from IF_1(n) and estimates of (a_2, b_2) from IF_2(n).
The same noisy signal x(n) is also put through the R-MBD (with β = 1) algorithm developed in this work, to validate this method and to compare it with the preprocessing-based methods. In Fig. 12.5, we display the NMSE of the R-MBD, exp-TFD and Huber-TFD versus the sample size.
[Figure: four panels (Chirp 1 and Chirp 2) showing the NMSE of b_1, b_2, a_1 and a_2 versus the sample size for R-MBD, exp-TFD and Huber-TFD.]
Fig. 12.5: The NMSE versus sample size: a comparative study.
These simulations confirm the effectiveness of the proposed algorithms and, at
least in this simulation context, the best results in terms of estimation accuracy
are obtained by the R-MBD algorithm (which is, on the other hand, the most
expensive one) followed by the exp-TFD method.
[C]- Third experiment
Here, we assess the statistical performance of the R-MBD-based IF estimator for multi-component FM signals. For that, let us consider two linear FM components embedded in additive impulsive α-stable noise z(t), modeled as x(t) = s_1(t) + s_2(t) + z(t) where s_1(t) = exp{j2π(a_1 t + b_1 t²)} and s_2(t) = exp{j2π(a_2 t + b_2 t²)}. The noise z(t) is chosen with zero location parameter, characteristic exponent α = 1 and dispersion γ. The signals' IF coefficients are given by a_1 = 0.2, b_1 = 0.1 × 10⁻³, a_2 = 0.45 and b_2 = −1.5 × 10⁻³. To validate the proposed method and to compare it with some existing methods, we implement the following procedure:
1. Compute the TFD of the two-component chirp signal in α-stable noise x(t) using the r-PWVD [Barkat et Stankovic(2004)] and the proposed R-MBD. For that, we choose σ = 0.01 for the MBD kernel and p = α/3 for the fractional Lp-norm loss function used to design the R-MBD. In the experiments, we fix the signal length equal to N = 501 and the window length, used in the r-PWVD implementation, equal to 101 samples.
2. Put the computed TFD matrix through the component separation algorithm in order to extract the two respective components. The peaks of the extracted components (in the time-frequency domain) are then used to estimate the IFs of the chirps.
3. Put the same noisy signal through one of the widely used IF estimation methods, namely the high-order ambiguity function (HAF) algorithm, to estimate the four chirp parameters a_1, b_1, a_2 and b_2 [Peleg et Friedlander(1996)].
4. For the HAF algorithm, use a simple polynomial fit to obtain estimates of IF_1(t) from (a_1, b_1) and estimates of IF_2(t) from (a_2, b_2).
In Fig. 12.6, we display the NMSE of the IF estimates versus the noise dispersion γ for HAF, r-PWVD and R-MBD. The accuracy and superiority of the R-MBD over both the r-PWVD and HAF algorithms is evident.
[Figure: two panels (Chirp 1 and Chirp 2) showing the NMSE of the IF estimates in dB versus −10 log10(γ) for HAF, r-PWVD and R-MBD.]
Fig. 12.6: NMSE of IF estimates, corresponding to the HAF, r-PWVD and the R-MBD for a noisy two-component chirp signal.
[D]- Fourth experiment
In this experiment, a comparative study of the previous IF estimation methods for a multicomponent chirp signal is addressed. For this purpose, we consider a mixture of two chirp components of the same amplitude a1 = a2 = 1, with f1 = 0.05, f2 = 0.3, δ1 = 0.0001 and δ2 = 0.0003, embedded in impulsive α-stable noise with characteristic exponent α = 1.
For the non-parametric TFD-based method, we use the compressing technique with parameter β = 0.1 (we chose σ = 0.01 for the MB-distribution kernel); a sketch of this preprocessing step is given below.
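A minimal sketch of this preprocessing stage is given below, assuming that the compression acts on the sample amplitude while preserving the phase and that the truncation threshold is set from the median amplitude; both choices are illustrative, not the exact rules used in the thesis.

```python
import numpy as np

def compress_amplitude(x, beta=0.1):
    # Amplitude compression |x|**beta; the phase of each sample is kept unchanged.
    mag = np.abs(x)
    phase = x / np.maximum(mag, 1e-12)
    return (mag ** beta) * phase

def clip_amplitude(x, k=3.0):
    # Huber-style truncation: samples whose amplitude exceeds k times the median
    # amplitude are scaled back to that threshold (threshold rule assumed here).
    mag = np.abs(x)
    thr = k * np.median(mag)
    return x * np.minimum(1.0, thr / np.maximum(mag, 1e-12))
```

The cleaned signal is then passed to the quadratic TFD (here the MB distribution with σ = 0.01) and to the component-extraction stage.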
Figures 12.7 and 12.8 represent the NMSE of the phase parameters versus
the sample size and the noise dispersion, respectively. In this simulation context,
the best results are obtained by the time-frequency based method followed by the
parametric method based on robust covariance estimation (ROCOV-MUSIC).
Fig. 12.7: Normalized MSE of the various phase parameters (f1, δ1, f2, δ2) versus sample size (0 to 1000), γ = 0.1, for the TFD, ROCOV-MUSIC, TRUNC-MUSIC and FLOS-MUSIC methods.
Fig. 12.8: Normalized MSE of the various phase parameters versus noise dispersion in dB, N = 1000.
12.7 Concluding Remarks
In this chapter, we proposed a new approach to the analysis of multicomponent non-stationary FM signals corrupted by additive heavy-tailed noise, based on robust statistics theory. Two different procedures were proposed:
• Robust preprocessing approach: in this part, a preprocessing stage based on the M-estimation idea has been proposed to clean the time-frequency image. This first step allows us to obtain a good time-frequency representation, which is essential for the second step of IF estimation.
• Robust time-frequency distribution approach: in this part, the fractional Lp-norm (0 < p < α) loss function has been used in the M-estimation framework to design a new robust TFD referred to as R-MBD. The proposed R-MBD is robust to the effect of heavy-tailed α-stable noise (a small numerical illustration of this loss function is given below).
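The small numerical illustration announced above shows why a fractional Lp loss with 0 < p < α is attractive: a location parameter estimated from Cauchy (α = 1) data by minimizing the Lp cost stays close to the true value, while the quadratic-loss estimate (the sample mean) does not. This generic M-estimation example only illustrates the loss function; it is not the R-MBD computation itself.

```python
import numpy as np

rng = np.random.default_rng(1)
data = 2.0 + rng.standard_cauchy(1000)      # true location 2.0 in Cauchy (alpha = 1) noise

p = 1.0 / 3.0                               # fractional exponent, 0 < p < alpha
grid = np.linspace(-10.0, 10.0, 4001)       # brute-force search, for illustration only
costs = np.array([np.sum(np.abs(data - theta) ** p) for theta in grid])

theta_lp = grid[np.argmin(costs)]           # fractional Lp M-estimate (robust)
theta_l2 = data.mean()                      # quadratic-loss estimate, often far off
print(theta_lp, theta_l2)
```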
Computer simulations confirm the effectiveness of the proposed algorithms and show that the best results in terms of estimation accuracy are obtained by the R-MBD based algorithm (which is, on the other hand, the most computationally expensive one), followed by the r-PWVD based method.
In the considered simulation context, the comparative study also shows the superiority of the non-parametric (TFD-based) method and of the parametric method using the robust covariance estimation technique (ROCOV-MUSIC).
Chapter 13
Conclusions and Perspectives
From infinite-variance statistics to source separation and to the processing of non-stationary signals, we wished to explore domains that are less familiar to practitioners but that can still reveal interesting theoretical features and lead to new applications.
13.1 General Conclusion
By way of a general conclusion, we attempt here to give an overall synthesis of the work carried out in this thesis.
Using probabilistic and statistical mathematical tools, we have tried to add new building blocks to two of the most important edifices of signal processing: the separation of non-Gaussian sources and the estimation of non-stationary signals. We have thus answered the questions raised at the beginning of this thesis work, even if only partially, since many promising directions have merely been sketched. Admittedly, an exhaustive study of the various uses of α-stable distributions in signal processing is a long-term undertaking, and the open questions always seem to grow in number.
This quest for robustness in signal processing led us to examine in detail the problems of source separation, especially when the sources are impulsive with infinite variance, and of the estimation of multi-component non-stationary signals in impulsive noise. These problems amount to answering the following questions:
1. Which existing methods, based on the assumption that second- and higher-order statistics exist, can still work in practice in the case of α-stable distributions?
2. How can this be justified mathematically once it has been demonstrated by simulations?
3. How can the methods that no longer apply in the case of impulsive α-stable sources be adapted and made robust?
4. How can fractional lower-order moments be used to separate this kind of sources, and how can their use be generalized to sources of unknown nature (impulsive or not)?
5. How can the effect of impulsive noise on the time-frequency representation of non-stationary signals be reduced?
6. Is it possible to define time-frequency distributions that are robust to the effect of impulsive noise?
7. Can we content ourselves with parametric estimation methods for multi-component non-stationary signals, and can they be made robust?
The methods developed in this thesis, and more particularly the use of fractional lower-order moments, seem to provide an interesting and promising direction. Let us also point out that the developments carried out rely solely on the statistical properties of α-stable probability laws. We presented these properties in detail, together with other properties that were not used here but could prove very useful, in a chapter devoted exclusively to stable laws and their use in signal processing.
[A]- Separation of impulsive sources
We attempted a complete study of the class of FLOS-based methods, covering several aspects usually addressed in classical source separation, including the problems of whitening, separation and optimization of a contrast function; only the asymptotic performance analysis was left aside, for lack of time.
We first proposed a separation criterion based on minimizing the sum of the dispersions of the observations. We showed that this minimum-dispersion (MD) criterion is a contrast function that makes it possible to separate sources with α-stable distributions, and we derived a Jacobi-type implementation of the proposed algorithm. More precisely, in order to optimize this cost function, expressed in terms of the separating matrix B under an orthogonality constraint, we decomposed this matrix into a product of Givens-Jacobi rotations, thereby reducing the matrix optimization problem to the optimization of a function of a single real variable θ (the rotation angle).
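To make the Givens-Jacobi idea concrete, the sketch below performs the angle search for a prewhitened two-channel mixture. The contrast used here is the sum of empirical p-th order moments of the outputs, taken as a proxy for the sum of dispersions, and the brute-force angle scan stands in for the actual Jacobi angle update; it is an illustration under these assumptions, not the algorithm of the thesis.

```python
import numpy as np

def md_contrast(y, p):
    # Empirical minimum-dispersion-type contrast: sum over the outputs of the
    # sample p-th absolute moment (0 < p < alpha), a proxy for the dispersion.
    return np.sum(np.mean(np.abs(y) ** p, axis=1))

def givens_separate_2x2(x_white, p=0.4, n_angles=720):
    # One Jacobi sweep for a prewhitened 2 x T mixture: scan the rotation angle
    # and keep the Givens rotation that minimizes the contrast.
    rot = lambda th: np.array([[np.cos(th), -np.sin(th)],
                               [np.sin(th),  np.cos(th)]])
    thetas = np.linspace(0.0, np.pi / 2, n_angles)
    best = min(thetas, key=lambda th: md_contrast(rot(th) @ x_white, p))
    B = rot(best)
    return B, B @ x_white
```

For more than two sources, the same rotation is swept over all pairs of channels, as in any Jacobi-type joint optimization.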
Afin d’évaluer les performances de la méthode MD nous avons défini un indice de
mesure de performance comme généralisation du rapport signal/interférencs utilisé
habituellement dans la séparation des signaux à variance finie. Nous avons ainsi
M. Sahmoudi © Processus Alpha-Stables pour la Séparation et l’Estimation Robustes des
Signaux non-Gaussiens et/ou non-Stationnaires
13.1 Conclusion Générale
181
conduit une série d’expériences de simulations pour comparer la méthode proposée
avec les méthodes classiques JADE, EASI et une méthode de type quasi-maximum
de vraisemblance proposée spécialement pour les sources α-stables, nommée RQLM
[Shereshevski et al.(2001)]. La méthode MD réalise les meilleures performances
dans tous les cas de figures considérés (avec bruit, sans bruit, petite et grande taille
d’échantillon,... ). Précisons, à ce propos, que la méthode MD présente une robustesse surprenante contre les erreurs d’estimation de l’exposant caractéristique α
des distribution α-stable. Ce comportement se retrouve également dans l’approche
suivante de Lp -norme, et s’explique par le fait que la modification de la puissance
α dans l’expression de la dispersion par une autre valeur α1 (dans le même interval
(0, 1] ou [1, 2) que α) définie aussi une fonction de contraste MD et donc permet
de séparer correctement les sources [Sahmoudi et al.(2005)].
Exploiting the proportionality relation between the dispersion of an α-stable random variable and its p-th order moment, we extended and generalized the minimum-dispersion criterion to separate linear mixtures of sources with unknown distributions (α-stable or not). This approach naturally connects with the sparse representation of sources through the Lp norm used in the literature. Under an orthogonality constraint, we showed that the criterion consisting in minimizing the sum of the Lp norms of the observations is a contrast function that correctly separates linear mixtures [Sahmoudi(2005)].
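The proportionality invoked here is the standard fractional lower-order moment identity for an SαS random variable X with dispersion γ (recalled from the FLOM literature rather than re-derived in this chapter):

E|X|^p = C(p, α) γ^{p/α},   0 < p < α,

where the constant C(p, α) depends only on p and α. Minimizing a sum of empirical p-th order moments of the outputs is therefore, up to this monotone relation, equivalent to minimizing the sum of their dispersions, which is what makes the Lp-norm criterion a natural generalization of the MD contrast.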
Still concerned with examining the existing source separation methods, we observed that the existing approaches can be divided into two classes: a first class of methods based on the tensor algebraic structure of the mixture (such as the JADE algorithm), which remains valid for the separation of heavy-tailed sources, in particular those with α-stable distributions, and a second class of methods based on various independence-measure criteria, which are unable to separate the sources considered in this work. The question we naturally raised after this classification is the following: how can one justify, from a mathematical point of view, the use of the algorithms with a robust algebraic structure, given that they are based on statistics that are "in principle" infinite, and how can the algorithms that are not robust be made so? To answer this question, we proposed an appropriate normalization of the second-order statistics (the covariance) and of the fourth-order cumulants. This normalization makes the new normalized statistics converge asymptotically towards tensors with the structure required for source separation. These statistics, built to validate the classical algebraic source separation approaches [Sahmoudi et al.(2004a)], rely on the heavy-tail property of stable laws. They also allowed us to introduce suitable normalizations into the source separation criteria based on non-linear decorrelation. These new contrast functions then become robust and valid in the case of heavy-tailed signals [Sahmoudi et Abed-Meraim(2004b)]. Nevertheless, if the independent components do not have this heavy-tail characterization, the normalization only amounts to a multiplication by a constant, which can only be beneficial for the convergence of certain algorithms, as in the case of the EASI algorithm for instance. These normalized statistics in fact define an entire class of techniques that is far from being fully exploited in this thesis.
Another fundamental approach in statistical estimation that attracted our attention for estimating the independent components of a linear mixture is the maximum-likelihood (ML) principle, which, as always, opens the way to several developments. Indeed, we proposed a semi-parametric version of the ML approach that combines a stochastic version of the EM algorithm with a technique for approximating the source densities by log-spline functions [M. Sahmoudi et al.(2005)]. The advantages of this method are appreciable in that no model of the source densities is needed, and in terms of robustness to possible source-modeling errors, since the densities are approximated and estimated directly from the observations.
[B]- Processing of multicomponent non-stationary signals
In the second part of this thesis, we addressed certain aspects of the analysis of non-stationary FM signals, considering mainly the case of a multi-component signal affected by impulsive noise. We treated the problem of instantaneous frequency estimation.
To this end, we proposed to use Huber's M-estimation method, which is robust to the effect of outliers (or impulsive noise) in the data and whose purpose is to provide estimators whose performance does not deteriorate too much in the presence of heavy-tailed non-Gaussian noise.
This environment led us to model the noise by an α-stable distribution, in conjunction with the M-estimation approach, first within a parametric procedure and then within a time-frequency analysis procedure.
– The first objective was the search for new robust parametric approaches in this particular case of non-Gaussian noise.
We start by reducing the problem to the estimation of harmonic signals buried in impulsive noise, by means of a polynomial transform of the signal. The high-resolution MUSIC method is then applied to the transformed signal in order to estimate the parameters. Three cases are considered, leading to three algorithms: (i) the direct application of the MUSIC algorithm to the truncated harmonic signal, giving the algorithm called TRUNC-MUSIC; (ii) the application of the MUSIC algorithm to a robust estimate of the covariance function of the harmonic signal, giving the algorithm called ROCOV-MUSIC; and (iii) the application of MUSIC to the generalized covariation of the signal, giving the algorithm called FLOS-MUSIC, since it is based on fractional lower-order statistics (FLOS).
The comparison results showed a certain superiority in favor of the ROCOV-MUSIC algorithm.
– The second objective of this part was the study of the influence of additive impulsive noise on non-parametric time-frequency estimation methods.
• Impulsive-noise preprocessing procedure: in a first approach, we applied Huber's minimax robustness procedure against the effect of impulsive noise in the form of a preprocessing step, using two different techniques, namely:
1. amplitude compression by a non-linear filter of the type |x|^β, 0 < β < 1, and
2. amplitude truncation (clipping) of the signal.
The signal is then represented in the time-frequency plane using quadratic transforms suited to the multicomponent case, together with a component-extraction algorithm, in order to estimate the instantaneous frequencies of the components.
• Procedure based on a robust time-frequency distribution: in the second approach, by contrast, we combined the M-estimation robustness approach with quadratic time-frequency transforms in order to define a class of transforms that are robust to the effect of impulsive noise and to the cross-terms of a multicomponent signal.
A comparative simulation study shows the advantage of the time-frequency methods over the previous parametric methods for the estimation of multi-component FM signals in the presence of impulsive noise. It should also be emphasized that robust time-frequency representations can serve other applications beyond the FM-signal estimation framework treated in this work.
13.2 Perspectives
Many questions remain open:
[A]- Separation of impulsive sources
• How can the proposed methods be improved by exploiting the performance analysis techniques available in the literature?
• A gradient-type implementation is entirely possible for optimizing the proposed minimum-dispersion criterion for the separation of α-stable sources.
• How do the proposed algorithms converge?
• How should the non-linearities be chosen with respect to the distributions of the α-stable sources in the approach based on non-linear decorrelation?
• How can the study carried out here for linear mixtures be extended to non-linear or convolutive mixtures?
• One problem not treated in this thesis is that of testing for the variance. Such a test would make it possible, before applying any algorithm, to know whether the mixture at hand is heavy-tailed with infinite variance or not. On this question, it should be noted that some work already exists in the probability and statistics literature and can be exploited in our source separation context.
• How can the three classes of statistics (second-order, higher-order and lower-order) be combined to define general source separation criteria? On this point, we plan to add a lower-order term to the separation criteria based on non-linear decorrelation, in order to reduce the effect of the possible "impulsiveness" of the sources.
• Generalization of the approaches based on normalized statistics to the case of α-stable sources with different characteristic exponents α. To this end, we plan to use a deflation procedure.
• Also exploit the logarithmic statistical moments defined from the characteristic function of the second kind.
• Dig deeper into this problem in the underdetermined case (more sources than sensors) by exploiting the sparse nature of impulsive sources.
• To address this last question, we are interested in source separation in the wavelet-transform domain. The aspects of particular interest to us are the impulsive character of the wavelet coefficients and the sparsity of the wavelet representation. This sparsity property has recently been used to separate more sources than sensors.
[B]- Processing of multicomponent non-stationary signals
• The theoretical performance analysis of the proposed approaches.
• The validation of the proposed methods through their application to real radar, sonar or biomedical signals.
• The theoretical study of the influence of random multiplicative noise.
• The exploration of statistical testing methods in the time-frequency plane for extracting the components of a non-stationary FM signal.
• The real-time implementation of the time-frequency algorithms to solve practical communication problems. This consists in developing new methods for managing services that are simultaneous in time and frequency in communication networks.
• A deeper analysis of the existing classification methods in the time-frequency plane, since this could solve several estimation and detection problems for non-stationary signals.
• A better exploitation of the probability distribution of the instantaneous frequency to improve time-frequency analysis.
In the end is my beginning!
T.S. Eliot
♦ merci, chokran, thank you ♦
Bibliography
[Abed-Meraim et Hua(1997)] Abed-Meraim, K. et Hua, Y. (1997). Joint Schur
decomposition : Algorithms and applications. In Proceeding of First International
Conference on Information, Communications and Signal Processing , (supplement
proceedings ; ICICS’97), Singapore.
[Abed-Meraim et al.(1996)] Abed-Meraim, K., Bellouchrani, A., et Hua, Y. (1996).
Blind identification of a linear-quadratic mixture of independent component based on
joint diagonalization procedure. In Proc. of ICASSP’1996, Atlanta, USA.
[Abed-Meraim et al.(1997a)] Abed-Meraim, K., Qiu, W., et Hua, Y. (1997a). Blind
system identification. Proceedings of the IEEE, 85(8), 1310–1322.
[Abed-Meraim et al.(1997b)] Abed-Meraim, K., Loubaton, P., et Moulines, E. (1997b). A
subspace algorithm for certain blind identification problems. IEEE Trans. on
Information Theory, 43(2), 499–511.
[Abed-Meraim et al.(2000)] Abed-Meraim, K., Hua, Y., et Ikram, M. Z. (2000). A fast
algorithm for conditional maximum likelihood blind identification of SIMO/MIMO
FIR systems. In Proc. EUSIPCO (invited paper).
[Abed-Meraim et al.(2001)] Abed-Meraim, K., Xiang, Y., Manton, J., et Hua, Y.
(2001). Blind source separation using second order cyclostationary statistics. IEEE
Transaction on Signal Processing, 49(4), 694–701.
[Abed-Meraim et al.(2003)] Abed-Meraim, K., Nguyen, L., Sucis, V., Tupin, F., et
Boashash, B. (2003). An image processing approach for underdetermined blind
separation of nonstationary sources. In Proceeding of Int. Symp. on Sig. and Image
Proc. and Analysis, Rome.
[Adib et al.(2002)] Adib, A., Moreau, E., et Aboutajdine, D. (2002). A combined
contrast and reference signal based blind source separation by a deflation approach. In
Proceedings of the 2nd IEEE International Symposium on Signal Processing and
Information Technology (ISSPIT’2002), Marrakesh, Morocco.
[Adjrad et al.(2003)] Adjrad, M., Belouchrani, A., et Abed-Meraim, K. (2003).
Parameter estimation of multicomponent polynomial phase signals impinging on a
multi-sensor array using extended kalman filter. In Proceeding of (ISSPIT’2003),
Darmstadt, Germany.
[Adler et al.(1998)] Adler, R., Feldman, R. E., et Taqqu, M. (1998). A Practical Guide
to Heavy Tails : Statistical Techniques and Applications. Birkhauser, Boston.
[Akay et Erözden(2004)] Akay, O. et Erözden, E. (2004). Use of fractional
autocorrelation in efficient detection of pulse compression radar signals. In IEEE First
International Symposium on Control, Communications and Signal Processing, pages
33 – 36.
[Akgiray et Lamoureux(1989)] Akgiray, V. et Lamoureux, C. (1989). Estimation of
stable-law parameters : a comparative study. Journal of Business & Economic
Statistics, 7, 85–93.
[Altes(1980)] Altes, R. A. (1980). Detection, estimation, and classification with
spectrograms. The Journal of the Acoustical Society of America, 67(4), 1232–1246.
[Altinkaya et al.(2002)] Altinkaya, M. A., Delic, H., Sankur, B., et Anarim, E. (2002).
Subspace-based frequency estimation of sinusoidal signals in alpha-stable noise. Signal
Processing, 82, 1807–1827.
[Amari(1998)] Amari, S.-I. (1998). Natural gradient works efficiently in learning. Neural
Computation, 10(2), 251–276.
[Amari et Cardoso(1997)] Amari, S.-I. et Cardoso, J.-F. (1997). Blind source
separation—semiparametric statistical approach. IEEE Trans. on Signal Processing,
45(11), 2692–2700.
[Amari et al.(1996)] Amari, S.-I., Cichocki, A., et Yang, H. (1996). A new learning
algorithm for blind source separation. In Advances in Neural Information Processing
Systems 8, pages 757–763. MIT Press.
[Ambike et Hatzinakos(1995)] Ambike, S. et Hatzinakos, D. (1995). A new filter for
highly impulsive α-stable noise. In IEEE Workshop on Nonlinear Signal and Image
Processing, Halkidiki, Greece.
[Amin(1992)] Amin, M. (1992). Time-Frequency Signal Analysis : Methods and
Applications. Longman-Chesire.
[Amin(1997)] Amin, M. G. (1997). Interference mitigation in spread spectrum
communication systems using time-frequency distributions. IEEE Transactions on
Signal Processing, 45(1), 90–101.
[Amin et Zhang(2000)] Amin, M. G. et Zhang, Y. (2000). Effects of cross-terms on the
performance of time–frequency MUSIC. In Proceedings of the 2000 IEEE Sensor
Array and Multichannel Signal Processing Workshop, pages 479–483.
[Amin et al.(1999)] Amin, M. G., Wang, C., et Lindsey, A. R. (1999). Optimum
interference excision in spread spectrum communications using open-loop adaptive
filters. IEEE Transactions on Signal Processing, 47(7), 1966–1976.
[Amin et al.(2000)] Amin, M. G., Belouchrani, A., et Zhang, Y. (2000). The spatial
ambiguity function and its applications. IEEE Signal Processing Letters, 7(6),
138–140.
[Andrews(1974)] Andrews, D. F. (1974). Scale mixtures of normal distributions. Journal
Royal Statistical Society, B 36, 99–102.
[Babaie-Zadeh et al.(2004)] Babaie-Zadeh, M., Mansour, A., Jutten, C., et Marvasti, F.
(2004). A geometric approach for separating several signals. In Fifth International
Symposium on Independent Component Analysis and Blind Signal Separation, pages
798–806, Granada, Spain.
[Babie-Zadeh(2002)] Babie-Zadeh, M. (2002). On Blind Source Separation in
Convolutive and Nonlinear Mixtures. Ph.D. thesis, INPG, Grenoble.
[Barbarossa(1995)] Barbarossa, S. (1995). Analysis of multicomponent LFM signals by a
combined Wigner-Hough transform. IEEE Transactions on Signal Processing, 43,
1511–1515.
[Barbarossa et Petrone(1997)] Barbarossa, S. et Petrone (1997). Analysis of polynomial
phase signals by an integrated generalized ambiguity function. IEEE Transaction on
Signal Processing, 45(2), 316–327.
[Barbarossa et Scaglione(1999a)] Barbarossa, S. et Scaglione, A. (1999a). Adaptive
time-varying cancellation of wideband interferences in spread-spectrum
communications based on time-frequency distributions. IEEE Transactions on Signal
Processing, 47(4), 957–965.
[Barbarossa et Scaglione(1999b)] Barbarossa, S. et Scaglione, A. (1999b). Optimal
precoding for transmissions over linear time-varying channels. In Seamless
Interconnection for Universal Services. GLOBECOM’99, volume 5, pages 2545–2549,
Piscataway, NJ.
[Barbarossa et Scaglione(2000)] Barbarossa, S. et Scaglione, A. (2000). Theoretical
bounds on the estimation and prediction of multipath time-varying channels. In
International Conference on Acoustics, Speech, and Signal Processing, ICASSP’2000,
volume 5, pages 2545–2548, Istanbul, Turkey.
[Barbarossa et al.(1997)] Barbarossa, S., Scaglione, A., Spalletta, S., et Votini, S. (1997).
Adaptive suppression of wideband interferences in spread-spectrum communications
using the Wigner-Hough transform. In International Conference on Acoustics, Speech,
and Signal Processing, ICASSP’97, volume 5, pages 3861–3864, California.
[Barkat(2000)] Barkat, B. (2000). Design, estimation, and performance of
time–frequency distributions. Ph.D. thesis, Queensland University of Technology,
Brisbane, Australia.
[Barkat(2001)] Barkat, B. (2001). Instantaneous frequency estimation of nonlinear
frequency–modulated signals in the presence of multiplicative and additive noise.
IEEE Transactions on Signal Processing, 49(10), 2214–2222.
[Barkat et Abed-Meraim(2003a)] Barkat, B. et Abed-Meraim, K. (2003a). Detection of
known FM signals in known heavy-tailed noise. In Proceeding of ISSPIT’2003,
Darmstadt, Germany.
[Barkat et Abed-Meraim(2003b)] Barkat, B. et Abed-Meraim, K. (2003b). An effective
technique for the IF estimation of FM signals in heavy-tailed noise. In Proceeding of
ISSPIT’2003, Germany.
[Barkat et Abed-Meraim(2004)] Barkat, B. et Abed-Meraim, K. (2004). Algorithms for
blind components separation and extraction from the time-frequency distribution of
their mixture. to appear in Journal of App. Sig. Proc.
[Barkat et Boashash(2001)] Barkat, B. et Boashash, B. (Oct. 2001). A high-resolution
quadratic time-frequency distribution for multicomponent signals analysis. IEEE
Transactions on Signal Processing, 49.
[Barkat et Stankovic(2004)] Barkat, B. et Stankovic, L. (2004). Analysis of polynomial
FM signals corrupted by heavy-tailed noise. Signal Processing, 84, 69–75.
[Barndorff(1998)] Barndorff-Nielsen, O. E. (1998). Processes of normal inverse Gaussian type.
Finance and Stochastics, 2, 41–68.
[Barndorff-Nielsen(1997)] Barndorff-Nielsen, O. E. (1997). Normal inverse Gaussian
distribution and stochastic volatility modelling. Scandinavian Journal of Statistics,
24, 1–13.
[Barros(2000)] Barros, A. K. (2000). The independence assumption : Dependent
component analysis. In M. Girolami, editor, Advances in Independent Component
Analysis, pages 63–71. Springer-Verlag.
[Bassi et al.(1998)] Bassi, F., Embrechts, P., et Kafetzaki, M. (1998). Risk management
and quantile estimation. In R. E. F. R. Adler et M. Taqqu, editors, A practical guide
to heavy tails, pages 111–130. Birkhauser, Boston.
[Bell et Sejnowski(1995)] Bell, A. et Sejnowski, T. (1995). An information-maximization
approach to blind separation and blind deconvolution. Neural Computation, 7,
1129–1159.
[Bell(2000)] Bell, A. J. (2000). Information theory, independent component analysis,
and applications. In S. Haykin, editor, Unsupervised Adaptive Filtering, Vol. I, pages
237–264. Wiley.
[Belouchrani(2001)] Belouchrani, A. (2001). Blind source separation : Concepts,
approaches and applications. In ISSPA’2001 Tutorial, Kuala–Lumpur, Malaysia.
[Belouchrani et Amin(2000)] Belouchrani, A. et Amin, M. (2000). Jammer mitigation in
spread spectrum communications using blind source separation. Signal Processing, 80,
724–729.
[Belouchrani et Amin(1996)] Belouchrani, A. et Amin, M. G. (1996). A new approach
for blind source separation using time-frequency distributions. In Proceedings SPIE
conference on Advanced algorithms and Architectures for Signal Processing, Denver,
Colorado.
[Belouchrani et Amin(1997)] Belouchrani, A. et Amin, M. G. (1997). Blind source
separation using time–frequency distributions : Algorithm and asymptotic
performance. In IEEE Proc. ICASSP’97, pages 3469–3472, Germany.
[Belouchrani et Amin(1998)] Belouchrani, A. et Amin, M. G. (1998). Blind source
separation based on time-frequency signal representations. IEEE Transactions on
Signal Processing, 46(11), 2888–2897.
[Belouchrani et Amin(1999a)] Belouchrani, A. et Amin, M. G. (1999a).
Time–frequency : MUSIC. IEEE Signal Processing Letters, 6(5), 109–110.
[Belouchrani et Amin(1999b)] Belouchrani, A. et Amin, M. G. (1999b). A two–sensor
array beamformer for direct sequence spread spectrum communications. IEEE
Transactions on Signal Processing, 47(8), 2191–2199.
[Belouchrani et Cardoso(1994)] Belouchrani, A. et Cardoso, J.-F. (1994). Maximum
likelihood source separation for discrete sources. In Proceedings EUSIPCO.
[Belouchrani et cardoso(1995)] Belouchrani, A. et cardoso, J.-F. (1995). Maximum
likelihood source separation by the expectation-maximization technique : deterministic
and stochastic implementation. In In Proceeding of NOLTA, pages 49–53.
[Belouchrani et al.(1997a)] Belouchrani, A., Abed-Meraim, K., Cardoso, J.-F., et
Moulines, E. (1997a). A blind source separation technique using second order
statistics. IEEE Trans. on Sig. Proc., pages 434–444.
[Belouchrani et al.(1997b)] Belouchrani, A., Abed-Meraim, K., et Cardoso, J.-F.
(1997b). An iterative blind source separation technique : Implementation and
performance. In Proceeding of International Conference on Information,
Communication and Signal Processing (ICICS’1997), Singapore.
[Belouchrani et al.(2001)] Belouchrani, A., Abed-Meraim, K., Amin, M. G., et Zoubir,
A. M. (2001). Joint anti-diagonalization for blind source separation. In International
Conference on Acoustics, Speech, and Signal Processing, ICASSP’2001, Salt Lake city,
Utah.
[Benidir(1994)] Benidir, M. (1994). Higher-Order Statistical Signal Processing, chapter
Theoretical foundations of higher-order statistical signal processing and polyspectra.
Longman Cheshire, Australia.
[Benidir(1997)] Benidir, M. (1997). Characterization of polynomial functions and
application to time-frequency analysis. IEEE Trans. On Signal Processing, 45(5),
1351–1354.
[Benidir(2002)] Benidir, M. (2002). Traitement du Signal, Tome 1. Dunod.
[Benidir(2003)] Benidir, M. (2003). Traitement du Signal, Tome 2. Dunod.
[Benidir et al.(2002)] Benidir, M., Ouldali, A., et Sahmoudi, M. (2002). Performances
analysis for the haf-estimator for a time-varying amplitude phase-modulated signals.
In The international IASTED Conference on Control and Applications (CA’2002),
Cancun, Mexico.
[Bergstrom(1952)] Bergstrom, H. (1952). On some expansions of stable distribution
functions. Arkiv Mathematik, 2, 375–378.
[Berlekamp(1968)] Berlekamp, E. R. (1968). Algebraic Coding Theory. McGraw-Hill,
New York.
[Bermond(2000)] Bermond, O. (2000). Statistical Methods for Blind Source Separation
(Méthodes statistiques pour la séparation de sources). Ph.D. thesis, ENST, Paris,
France.
[Besson et Castanié(1993)] Besson, O. et Castanié, F. (1993). On estimating the
frequency of a sinusoid in autoregressive multiplicative noise. Signal Processing, 30(1),
65–83.
[Besson et al.(1999)] Besson, O., Ghogho, N., et Swami, A. (1999). Parameter
estimation for random amplitude chirp signals. IEEE Transactions on Signal
Processing, 47(12), 3208–3219.
[Besson et al.(2000a)] Besson, O., Vincent, F., Stoica, P., et Gershman, A. B. (2000a).
Approximate maximum likelihood estimators for array processing in multiplicative
noise environments. IEEE Transactions on Signal Processing, 48(9), 2506–2518.
[Besson et al.(2000b)] Besson, O., Gini, F., Griffiths, H. D., et Lombardini, F. (2000b).
Estimating ocean surface velocity and coherence time using multichannel ATI-SAR
systems. Proceedings of the IEE : F, 147(6), 299–308.
[Bestravos et al.(1998)] Bestravos, A., Crovella, M., et Taqqu, M. (1998). Heavy-tailed
distributions in the world wide web. In R. E. F. R. Adler et M. Taqqu, editors, A
practical guide to heavy tails, pages 3–25. Birkhauser, Boston.
[Bhashyam et al.(2000)] Bhashyam, S., Sayeed, A. M., et Aazhang, B. (2000).
Time-selective signaling and reception for communication over multipath fading
channels. IEEE Transactions on Communications, 48(1), 83–94.
[Bircan et al.(1998)] Bircan, A., Tekinay, S., et Akansu, A. N. (1998). Time-frequency
and time-scale representation of wireless communication channels. In Proceedings of
the IEEE-SP International Symposium on Time-Frequency and Time-Scale Analysis,
pages 373–376, Pittsburgh, Pennsylvania, USA.
[Blachman(1965)] Blachman, N. M. (1965). The convolution inequality for entropy
powers. IEEE Transaction on Information Theory, 11, 267–271.
[Boashash(1991)] Boashash, B. (1991). Time-frequency signal analysis. In S. Haykin,
editor, Advances in Spectrum Analysis and Array Processing, volume I, chapter 9,
pages 418–517. Prentice-Hall, Englewood Cliffs, New Jersey.
[Boashash(1992a)] Boashash, B. (1992a). Estimating and interpreting the instantaneous
frequency of a signal- Part 1 : Fundamentals. Proceedings of the IEEE, 80(4), 519–538.
[Boashash(1992b)] Boashash, B. (1992b). Estimating and interpreting the instantaneous
frequency of a signal- Part 2 : Algorithms and applications. Proceedings of the IEEE,
80(4), 539–569.
[Boashash(1992c)] Boashash, B., editor (1992c). Time-Frequency Signal Analysis :
Methods and Applications. Longman Cheshire, Melbourne, Australia.
[Boashash(1993)] Boashash, B. (1993). Recent advances in non-stationary signal
analysis : time-varying higher order spectra and multilinear time-frequency signal
analysis. In Proceedings of the SPIE - The International Society for Optical
Engineering, volume 2027, pages 2–26.
[Boashash(1996)] Boashash, B. (1996). Time frequency signal analysis : Past, present
and future trends. In C. T. Leondes, editor, Control and Dynamic Systems,
volume 48, pages 1–69. Academic Press, San Diego.
[Boashash(2002)] Boashash, B. (2002). Time Frequency Signal Analysis and Processing.
Prentice–Hall.
[Boashash et Jones(1992)] Boashash, B. et Jones, G. (1992). Instantaneous frequency
and time-frequency distributions. In B. Boashash, editor, Time-Frequency Signal
Analysis, chapter 2, pages 43–73. Longman Cheshire, Melbourne, Australia.
[Boashash et O’Shea(1993)] Boashash, B. et O’Shea, P. (1993). Use of the cross
Wigner-Ville distribution for estimation of instantaneous frequency. IEEE
Transactions on Signal Processing, 41(3), 1439–1445.
[Boashash et O’Shea(1994)] Boashash, B. et O’Shea, P. (1994). Polynomial Wigner-Ville
distributions and their relationship to time-varying higher-order spectra. IEEE
Transactions on Signal Processing, 42, 216–220.
[Boashash et Ristic(1992)] Boashash, B. et Ristic, B. (1992). Robust radar algorithms.
Technical report, Signal Processing Research Centre, Queensland University of
Technology, Brisbane, Australia.
[Boashash et Ristic(1993a)] Boashash, B. et Ristic, B. (1993a). Analysis of FM signals
affected by Gaussian AM using reduced Wigner–Ville trispectrum. In International
Conference on Acoustics, Speech, and Signal Processing, ICASSP’93, volume IV,
pages 408–411, Minneapolis.
[Boashash et Ristic(1993b)] Boashash, B. et Ristic, B. (1993b). Application of cumulant
TVHOS to the analysis of composite FM signals in multiplicative and additive noise.
In F. T. Luk, editor, Proceedings of SPIE, Advanced Signal Processing Algorithms,
Architectures and Implementations, volume 2027, pages 245–255, San Diego.
[Boashash et Ristic(1993c)] Boashash, B. et Ristic, B. (1993c). Polynomial
time-frequency distributions and time-varying polyspectra. Technical report, Signal
Processing Research Centre, Queensland University of Technology, Brisbane,
Australia.
[Boashash et Ristic(1995)] Boashash, B. et Ristic, B. (1995). A time-frequency
perspective of higher-order spectra as a tool for non-stationary signal analysis. In
B. Boashash, E. J. Powers, et A. M. Zoubir, editors, Higher Order Statistical Signal
Processing, chapter 4, pages 111–149. Longman, Australia.
[Boashash et Ristic(1998)] Boashash, B. et Ristic, B. (1998). Polynomial time-frequency
distributions and time-varying higher order spectra : Application to the analysis of
multicomponent FM signals and to the treatment of multiplicative noise. Signal
Processing, 67, 1–23.
[Boashash et Rodriguez(1984)] Boashash, B. et Rodriguez, F. (1984). Recognition of
time-varying signals in the time-frequency domain by means of the Wigner distribution.
In Proc. of ICASSP’1984, San Diego, USA.
[Boashash et Sucic(2002)] Boashash, B. et Sucic, V. (2002). High performance
time–frequency distributions for practical applications. In L. Debnath, editor,
Wavelets and Signal Processing. Birkhauser, Boston, New York : Springer–Verlag.
[Boashash et Sucic(2003)] Boashash, B. et Sucic, V. (2003). Resolution measure criteria
for the objective assessment of the performance of quadratic time-frequency
distributions. IEEE Trans. on Signal Processing, 51(5), 1253–1263.
[Boashash et al.(1995)] Boashash, B., Powers, E. J., et Zoubir, A. M., editors (1995).
Higher Order Statistical Signal Processing. Longman, Australia.
[Bodenschatz et Nikias(1999)] Bodenschatz, J. S. et Nikias, C. L. (1999). Maximum
likelihood symmetric α-stable parameter estimation. Trans. on signal Processing,
47(5).
[Boscolo et al.(2004)] Boscolo, R., Pan, H., et Roychowdhury, V. P. (2004). Independent
Component Analysis Based on Nonparametric Density Estimation. IEEE Transaction
on Neural Networks, 15(1).
[Boudreaux-Bartels et Marks(1986)] Boudreaux-Bartels, G. F. et Marks, T. W. (1986).
Time-varying filtering and signal estimation using Wigner distributions. IEEE
Transactions on Acoustics, Speech, and Signal Processing, 34, 422–430.
[Box(1953)] Box, G. E. P. (1953). Non-normality and tests on variances. Biometrika,
(40), 318–335.
[Brcich et Zoubir(2002)] Brcich, R. F. et Zoubir, A. M. (2002). Robust estimation with
parametric score function estimation. In Proceedings of the ICASSP’2002 IEEE
Conference, pages 1149–1152.
[Cambanis et Miller(1981)] Cambanis, S. et Miller, G. (1981). Linear problems in pth
order and stable processes. SIAM J. Appl. Math., 41, 43–49.
[Cao et Murata(1999)] Cao, J. et Murata, N. (1999). A Stable and Robust ICA
Algorithm Based on T-Distribution and Generalized Gaussian Distribution Model. In
Proceedings of the IEEE Signal Processing Society Workshop on Neural Networks for
Signal Processing IX, pages 283 – 292.
[Cappé et al.(2002)] Cappé, O., Moulines, E., Pesquet, J.-C., Petropulu, A., et Yang, X.
(2002). Long-range dependence and heavy-tail modeling for teletraffic data. IEEE
Signal Processing Magazine, pages 14–27.
[Cardoso(1989a)] Cardoso, J.-F. (1989a). Blind identification of independent signals. In
Proc. Workshop on Higher-Order Spectral Analysis, Vail, Colorado.
[Cardoso(1989b)] Cardoso, J.-F. (1989b). Source separation using higher order
moments. In Proceeding of IEEE International Conference on Acoustics, Speech and
Signal Processing (ICASSP’89), pages 2109–2112, Glasgow, UK.
[Cardoso(1991)] Cardoso, J.-F. (1991). Super-symmetric decomposition of the
fourth-order cumulant tensor. Blind identification of more sources than sensors. In
Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP’97),
pages 3109–3112.
[Cardoso(1998)] Cardoso, J. F. (1998). Blind signal separation : statistical principles.
Proc. of the IEEE, 86(10), 2009–2025.
[Cardoso(1999)] Cardoso, J.-F. (1999). High-order contrasts for independent component
analysis. Neural Computation, 11(1), 157–192.
[Cardoso et Comon(1996)] Cardoso, J.-F. et Comon, P. (1996). Independent component
analysis, a survey of some algebraic methods. In Proc. ISCAS’96, volume 2, pages
93–96.
[Cardoso et Lahed(1996)] Cardoso, J. F. et Laheld, B. (1996). Equivariant adaptive
source separation. IEEE Transaction on Signal Processing, 44, 3017–3030.
[Cardoso et Souloumiac(1993)] Cardoso, J. F. et Souloumiac, A. (1993). Blind
beamforming for non-Gaussian signals. Radar and Signal Processing, IEE Proceedings
F.
[Castella et Pesquet(2004)] Castella, M. et Pesquet, J. C. (2004). An iterative source
separation method for convolutive mixtures of images. In Proceedings of the
International Conference on Independent Component Analysis (ICA’2004), pages
922–929.
[Castella et al.(2004)] Castella, M., Bianchi, P., Chevreuil, A., et Pesquet, J.-C. (2004).
Blind MIMO detection of convolutively mixed CPM sources. In Proceeding of
EUSIPCO’2004, Vienna, Austria.
[Celka et al.(2001)] Celka, P., Boashash, B., et Colditz, P. (2001). Preprocessing and
time-frequency analysis of newborn eeg seizures. IEEE Engineering in Medicine &
Biology Magazine, 20, 30–39.
[Chabert et al.(2003)] Chabert, M., Tourneret, J.-Y., et Coulon, M. (2003). Joint
detection of variance changes using hierarchical Bayesian analysis. In Proceeding of the
IEEE International workshop on Statistical Signal Processing, Saint-Louis, Missouri,
USA.
[Chambers et al.(1976)] Chambers, J. M., Mallows, C. L., et Stuck, B. W. (1976). A
method for simulating stable random variables. Journal of the American Statistical
Association, 71(354), 340–344.
[Chen et Bickel(2003)] Chen, A. et Bickel, P. J. (2003). Efficient Independent
Component Analysis. Department of Statistics, University of California, Berkeley,
Technical report 634.
[Chen et Bickel(2004)] Chen, A. et Bickel, P. J. (2004). Robustness of prewhitening
against heavy-tailed sources. In Proceeding of the Fifth International Conference on
Independent component Analysis and Blind Signal Separation (ICA’2004), Granada,
Spain.
[Choi et Williams(1989)] Choi, H. et Williams, W. (1989). Improved time–frequency
representation of multicomponent signals using exponential kernels. IEEE
Transactions on Signal Processing, 37(6), 862–871.
[Cichocki et Amari(2002)] Cichocki, A. et Amari, S. (2002). Adaptive Blind Signal and
Image Processing. John Wiley & Sons, Singapore.
[Cichocki et Unbehauen(1996)] Cichocki, A. et Unbehauen, R. (1996). Robust neural
networks with on-line learning for blind identification and blind separation of sources.
IEEE Trans. on Circuits and Systems, 43(11), 894–906.
[Cichocki et al.(1994)] Cichocki, A., Unbehauen, R., et Rummert, E. (1994). Robust
learning algorithm for blind separation of signals. Electronics Letters, 30(17),
1386–1387.
[Cichocki et al.(2004)] Cichocki, A., Li, Y., et ans S. I. Amari, P. G. (2004). Beyond
ICA : Robust sparse signal representation. In Proceedings of the IEEE International
Symposium on Circuits and Systems (ISCAS ’04), volume 5, pages 684 – 687.
[Classen et Mecklenbrauker(1980)] Classen, T. et Mecklenbrauker, W. (1980). The
Wigner distribution – Part 1. Philips Journal of Research, 35, 217–250.
[Cline et Brockwell(1985)] Cline, D. B. et Brockwell, P. (1985). Linear prediction of
ARMA processes with infinite variance. Stoch. Processes & Applications, 19, 281–296.
[Cohen(1966)] Cohen, L. (1966). Generalized phase-space distribution functions.
Journal of Mathematical Physics, 7(5), 781–786.
[Cohen(1992)] Cohen, L. (1992). What is a Multicomponent Signal.
[Cohen(1995)] Cohen, L. (1995). Time-frequency Analysis. Prentice-Hall.
[Comon(1989)] Comon, P. (1989). Separation of stochastic processes. In Proc. Workshop
on Higher-Order Spectral Analysis, pages 174 – 179, Vail, Colorado.
[Comon(1994)] Comon, P. (1994). Independent component analysis, a new concept.
Signal Processing, 36, 287–314.
[Cook et Bernfeld(1993)] Cook, C. E. et Bernfeld, M. (1993). Radar Signals : An
Introduction to Theory and Application. Artech House, Norwood, MA.
[Cook et al.(1993)] Cook, D., Buja, A., et Cabrera, J. (1993). Projection pursuit indexes
based on orthonormal function expansions. J. of Computational and Graphical
Statistics, 2(3), 225–250.
[Coulon et Tourneret(1999)] Coulon, M. et Tourneret, J. (1999). Multiple frequency
estimation in additive and multiplicative colored noises. In Proceeding of
ICASSP’1999, pages 1573–1576, Phoenix, USA.
[Cover et Thomas(1991)] Cover, T. M. et Thomas, J. A. (1991). Elements of
Information Theory. Wiely Series in Telecommunications.
[Crespo et al.(1995)] Crespo, P. M., Honig, M. L., et Salehi, J. A. (1995). Spread-time
code-division multiple access. IEEE Transactions on Communications, 43(6),
2139–2147.
[Davy et al.(2002)] Davy, M., Doncarli, C., et Tourneret, J.-Y. (2002). Classification of
chirp signals using hierarchical Bayesian learning and MCMC methods. IEEE Trans.
on Signal Proc., 50(2), 377 – 388.
[de Boor(1978)] de Boor, C. (1978). A practical guide to splines. Springer-Verlag, New
York, applied mathematical sciences edition.
[Delmas(2004)] Delmas, J. (2004). Asymptotically optimal estimation of DOA for
non-circular sources from second-order moments. IEEE Trans. on Signal Processing,
pages 1235–1245.
[Delmas(1997)] Delmas, J. P. (1997). An extension to the EM algorithm for exponential
family. IEEE Trans. on Signal Processing, 4(10), 2613–2615.
[Delmas et al.(2000)] Delmas, J. P., Gazzah, H., Liavas, A. P., et Regalia, P. A. (2000).
Statistical analysis of some second order methods for blind channel identification
equalization with respect to channel undermodeling. IEEE Trans. on Signal
Processing, 48(7), 1984–1998.
[Delyon et al.(1999)] Delyon, B., Lavielle, M., et Moulines, E. (1999). Convergence of a
stochastic approximation version of the EM algorithm. Ann. Statist., 27(1), 94–128.
[Dempster(1977)] Dempster, A. P., Laird, N. M., et Rubin, D. B. (1977). Maximum likelihood from incomplete data
via the EM algorithm. Journal of the Royal Statistical Society, Series B, 39, 1–38.
[d’Estamps(2003)] d’Estamps, L. (Oct. 2003). Traitement Statistique des Processus
Alpha-Stables : mesure de dépendance et identification des AR Stables. Ph.D. thesis,
Institut National Polytechnique de Toulouse, Toulouse, France.
[Diebolt et Celeux(1993)] Diebolt, J. et Celeux, G. (1993). Asymptotic properties of a
stochastic EM algorithm for estimating mixing proportions. Comm. Statist. Stochastic
Models, 9(4), 599–613.
[Djafari(1999)] Djafari, A. M. (1999). A Bayesian approach to source separation. In AIP
Conference Proceedings 567, Maximum Entropy and Bayesian Methods, pages
221–244, Boise, Idaho, USA.
[Djeddi et Benidir(2004)] Djeddi, M. et Benidir, M. (2004). Robust Polynomial
Wigner-Ville Distribution For The Analysis of Polynomial Phase Signals in α-Stable
Noise. In Proceedings of the IEEE Conference ICASSP’2004.
[Djuric et Kay(1990)] Djuric, P. C. et Kay, S. M. (1990). Parameter estimation of chirp
signals. IEEE Trans. Acoust., Speech, Signal Processing, 38(12), 2118–2126.
[DuMouchel(1973)] DuMouchel, W. H. (1973). On the asymptotic normality of the
maximum likelihood estimate when sampling from a stable distribution. Annals of
statistics, 1, 948–957.
[E. Moreau(1997)] Moreau, E. et Pesquet, J.-C. (1997). Independence/decorrelation measures
with applications to optimized orthonormal representations. In IEEE International
Conference on Acoustics, Speech, and Signal Processing (ICASSP’97), volume 5.
[El-Hassouni et Cherifi(2003)] El-Hassouni, M. et Cherifi, H. (2003). A 2-d Adaptive
Least lp -Norm Filter For Impulsive Noise Cancellation in Still Images. In Proceeding
of ISPA’2003, Paris, France.
[Elliott(1938)] Elliott, R. (1938). The wave principle. Collins, New York.
[Erdogmus et al.(2002)] Erdogmus, D., Rao, Y. N., Principe, J. C., Zaohao, J., et
Hild-II, K. E. (2002). Simultaneous extraction of principal components using givens
rotations and output variances. In ICASSP’2002, pages 1069–1072.
[Eriksson et Koivanen(2003)] Eriksson, J. et Koivanen, V. (2003).
Characteristic-function based independent component analysis. Signal Processing, 83,
2195–2208.
[et al(2004)] et al, R. B. (2004). Independent component analysis based on
nonparametric density estimation. IEEE Trans. on Neural Networks, 15(1).
[Even(2003)] Even, J. (Déc. 2003). Contributions a la Separation de Sources à l’aide de
Statistiques d’Ordre. Ph.D. thesis, Université Joseph Fourier Grenoble, Grenoble,
France.
[Fama(1965)] Fama, E. F. (1965). The behavior of stock market prices. Journal of
Business, 38, 34–105.
[Fama et Roll(1968)] Fama, E. F. et Roll, R. (1968). Some properties of symmetric
stable distributions. Journal of the American Statistical Association, 63, 817–836.
[Fama et Roll(1971)] Fama, E. F. et Roll, R. (1971). Parameter estimates for symmetric
stable distributions. Journal of the American Statistical Association, 66, 817–836.
[FastICA(1998)] FastICA (1998). The FastICA package for MATLAB. Available at
http://www.cis.hut.fi/projects/ica/fastica/.
[Feller(1966)] Feller, W. (1966). An Introduction to Probability Theory and its
Applications, volume 1. John Wiley.
[Feller(1971)] Feller, W. (1971). An introduction to probability theory and its
applications, Vol. II. John Wiley&Sons, 2nd edition.
[Fevotte et Doncarli(2004)] Fevotte, C. et Doncarli, C. (2004). Two contributions to
blind source separation using time-frequency distributions. IEEE Signal Processing
Letters, 11.
[Flandrin(1988a)] Flandrin, P. (1988a). A time-frequency formulation of optimum
detection. IEEE Transactions on Acoustics, Speech, and Signal Processing, 36(9), 1377–1384.
[Flandrin(1988b)] Flandrin, P. (1988b). A time-frequency formulation of optimum
detection. IEEE Transactions on Acoustics, Speech, and Signal Processing, 36(9),
1377–1384.
[Flandrin(1993)] Flandrin, P. (1993). Temps-fréquence. Hermes, Paris.
[Flandrin(1998)] Flandrin, P. (1998). Time-Frequency / Time-Scale Analysis, Volume
10. Academic Press.
[Fonollosa et Nikias(1994)] Fonollosa, J. R. et Nikias, C. L. (1994). Analysis of
finite-energy signals using higher-order moments-and spectra-based time-frequency
distributions. Signal Processing, 36, 315–328.
[Francos et Porat(1999)] Francos, A. et Porat, M. (1999). Analysis and synthesis of
multicomponent signals using positive time-frequency distributions. IEEE
Transactions on Signal Processing, 47(2), 493–504.
[Francos et Friedlander(1995)] Francos, J. et Friedlander, B. (1995). Bounds for
estimation of multicomponent signals with random amplitude and deterministic
phase. IEEE Transactions on Signal Processing, 43(5), 1161–1172.
[Frechet(1924)] Frechet, M. (1924). Sur la loi des erreurs d’observation. Matematicheskii
Sbornik, (32), 1–8.
[Freedman et Diaconis(1982)] Freedman, D. A. et Diaconis, P. (1982). On inconsistent
M-estimators. The Annals of Statistics, 10(2), 454–461.
[Friedlander et Francos(1995)] Friedlander, B. et Francos, J. (1995). Estimation of
amplitude and phase parameters of multicomponent signals. IEEE Transactions on
Signal Processing, 43(4), 917–926.
[Friedmann et al.(2000)] Friedmann, J., Messer, H., et Cardoso, J.-F. (April 2000).
Robust parameter estimation of a deterministic signal in impulsive noise. IEEE, Tr.
On Sig. Proc., 48(4).
[Gaarder(1968)] Gaarder, N. T. (1968). Scattering function estimation.
[Gallagher(2000)] Gallagher, C. M. (2000). Estimating the autocovariation from
stationary heavy-tailed data, with applications to time series. Rapport technique,
Clemson University.
[Gallagher(2001)] Gallagher, C. M. (2001). A method for fitting stable autoregressive
models using the autocovariation function. Statistics & Probability Letters, 53(4),
381–390.
[Gallagher(2002)] Gallagher, C. M. (2002). Testing for linear dependence in heavy-tailed
data. Communication in Statistics, Theory and Methods, 31(4), 611–623.
[Gauss(1963)] Gauss, C. F. (1963). Theory of Motion of the Heavenly Bodies. Dover,
New York.
[Gazzah et Abed-Meraim(2003)] Gazzah, H. et Abed-Meraim, K. (2003). Blind
SOS-based ZF equalization with controlled delay robust to order overestimation.
Journal of Applied Signal Processing (IEE JASP).
[Georgiadis(2000)] Georgiadis, A. (Sept. 2000). Adaptive Equalisation for Impulsive
Noise Environments. Ph.D. thesis, The University of Edinburgh, Edinburgh, UK.
[Ghogho et al.(1999)] Ghogho, M., Nandi, A. K., et Swami, A. (1999). Cramer-Rao
bounds and maximum likelihood estimation for random amplitude phase–modulated
signals. IEEE Transactions on Signal Processing, 47(11), 2905–2916.
[Ghogho et al.(2001)] Ghogho, M., Swami, A., et Durrani, T. S. (2001). Frequency
estimation in the presence of Doppler spread : performance analysis. IEEE
Transactions on Signal Processing, 49(4), 777–789.
[Gnedenko et Kolmogorov(1111)] Gnedenko, B. V. et Kolmogorov, A. N. (1111). Limit
Distributions for Sums of Independent Random Variables. Addison-Wesley.
[Godsil(1999)] Godsil, S. (1999). MCMC and EM-based methods for inference in
heavy-tailed processes with α-stable innovations. In Proceedings of the IEEE
Statistical Signal Processing Workshop.
[Gonin et Money(1985)] Gonin, R. et Money, A. H. (1985). Nonlinear lp -norm
estimation : Part 1. on the choice of the exponent, p, where the errors are additive.
Commun. Stat. Theory Methods A, 14, 827–840.
[Grenier(1984)] Grenier, Y. (1984). Modélisation de Signaux non Stationnaires. Ph.D.
thesis, Université Paris Sud.
[Griffith(1997)] Griffith, D. W. (1997). Robust Time-Frequency Representations for
Signals in Alpha-Stable Noise : Methods and Applications. Ph.D. thesis, Department
of Electrical Engineering, University of Delaware, Newark.
[Grigoriu(1995)] Grigoriu, M. (1995). Applied Non-Gaussian Processes. Prentice-Hall.
[H. Hassanpour et Boashash(2003)] H. Hassanpour, M. M. et Boashash, B. (2003).
Comparative performance of time-frequency based newborn EEG seizure detection
using spike signature. In ICASSP’2003, volume 2, pages 389–392.
[Haas et Belfiore(1997)] Haas, R. et Belfiore, J.-C. (1997). A time-frequency
well-localized pulse for multiple carrier transmission. Wireless Personal
Communications, 5, 1–18.
[Hall(1966)] Hall, H. M. (1966). A new model for impulsive phenomena : Application to
atmospheric-noise communication channels. Technical Report 3412-8, 7050-7, Stanford
Electronics Laboratories, Stanford University, Stanford, California. This paper
introduces the Student-t distribution.
[Hampel et al.(1986)] Hampel, F. R., Ronchetti, E., Rousseeuw, P. J., et Stahel, W. A.
(1986). Robust Statistics : The Approach Based on Influence Functions. Wiley.
[Hanssen et Oigard(2001)] Hanssen, A. et Oigard, T. A. (2001). The normal inverse
Gaussian distribution as a flexible model for heavy-tailed processes. In Proceeding of
NSIP.
[Hérault et Ans(1984)] Hérault, J. et Ans, B. (1984). Circuits neuronaux à synapses
modifiables : décodage de messages composites par apprentissage non supervisé. C.-R.
de l’Académie des Sciences, 299(III-13), 525–528.
[Hérault et al.(1985)] Hérault, J., Jutten, C., et Ans, B. (1985). Détection de grandeurs
primitives dans un message composite par une architecture de calcul neuromimétique
en apprentissage non supervisé. In Actes du Xème colloque GRETSI, pages
1017–1022, Nice, France.
[Hlawatsch(1998)] Hlawatsch, F. (1998). Time-Frequency Analysis and Synthesis of
Linear Signal Spaces : Time-Frequency Filters, Signal Detection and Estimation, and
Range-Doppler Estimation. Kluwer Academic Publishers, USA.
[Hlawatsch et Boudreaux-Bartels(1992)] Hlawatsch, F. et Boudreaux-Bartels, G. F.
(1992). Linear and quadratic time-frequency signal representations. IEEE Signal
Processing Magazine, 9(2), 21–67.
[Hlawatsch et Krattenthaler(1997)] Hlawatsch, F. et Krattenthaler, W. (1997). Signal
synthesis algorithms for bilinear time-frequency signal representations. In
W. Mecklenbräuker et F. Hlawatsch, editors, The Wigner Distribution - Theory and
Applications in Signal Processing, pages 135–209. Elsevier, Amsterdam, Netherlands.
[Hlawatsch et Matz(1998)] Hlawatsch, F. et Matz, G. (1998). Time-frequency signal
processing : A statistical perspective. In Proc. IEEE Workshop on Circuits, Systems
and Signal Processing, pages 207–219, Mierlo, The Netherlands.
[Hlawatsch et Matz(2000)] Hlawatsch, F. et Matz, G. (to appear in September 2000).
Quadratic time-frequency analysis of linear time-varying systems. In L. Debnath,
editor, Wavelet Transforms and Time-Frequency Signal Analysis, chapter 9.
Birkhäuser, Boston (MA).
[Hlawatsch et al.(2000)] Hlawatsch, F., Matz, G., Kirchauer, H., et Kozek, W. (2000).
Time-frequency formulation, design, and implementation of time-varying optimal
filters for signal estimation. IEEE Transactions on Signal Processing, 48. to appear.
[Huber(1972)] Huber, P. J. (1972). Robust statistics : A review. Ann. Math. Statist., 43,
1041–1067.
[Huber(1985)] Huber, P. (1985). Projection pursuit. The Annals of Statistics, 13(2),
435–475.
[Huber(1981)] Huber, P. J. (1981). Robust Statistics. Wiley, New York.
[Hussain(2002)] Hussain, Z. M. (2002). Adaptive instantaneous frequency estimation :
Techniques and algorithms. Ph.D. thesis, Queensland University of Technology,
Brisbane, Australia.
[Hussain et Boashash(2002)] Hussain, Z. M. et Boashash, B. (2002). Adaptive
instantaneous frequency estimation of multicomponent FM signals using quadratic
time-frequency distributions. IEEE Trans. on Signal Proc., pages 1866–1876.
[Hyvärinen(1997)] Hyvärinen, A. (1997). One-unit contrast functions for independent
component analysis : A statistical analysis. In Neural Networks for Signal Processing
VII (Proc. IEEE Workshop on Neural Networks for Signal Processing), pages
388–397, Amelia Island, Florida.
[Hyvarinen(1998)] Hyvarinen, A. (1998). New approximation of differential entropy for
independent component analysis and projection pursuit. In Advances in Neural
Information Processing Systems, 10, 273–279.
[Hyvarinen(1999)] Hyvarinen, A. (1999). Fast and robust fixed-point algorithms for
independent component analysis. IEEE Trans. on Neural Networks, 10(3), 626 – 634.
[Hyvarinen et al.(2001)] Hyvarinen, A., Karhunen, J., et Oja, E. (2001). Independent
Component Analysis. Wiley.
[Ichir et M.-Djafari(2003)] Ichir, M. et M.-Djafari, A. (2003). Bayesian wavelet based
signal and image separation. In AIP Conference Proceedings of Maxent23 ; Maximum
Entropy and Bayesian Inference Methods, pages 417–428, American Institute of
Physics, Jackson Hole, Wyoming, USA.
[Ikram et Zhou(2001)] Ikram, M. Z. et Zhou, G. T. (2001). Estimation of
multicomponent polynomial phase signals of mixed orders. Signal Processing, 81,
2293–2308.
[Ikram et al.(1996a)] Ikram, M. Z., Abed-Meraim, K., et Hua, Y. (1996a). Estimating
doppler parameters in SAR imaging for moving targets. In Proceeding of the IEEE
Nordic Signal Processing Symposium (NORSIG), pages 207–210, Espoo, Finlande.
[Ikram et al.(1996b)] Ikram, M. Z., Abed-Meraim, K., et Hua, Y. (1996b). Fast discrete
quadratic phase transform for estimating the parameters of chirp signals. In Proc. of
the 30th Asilomar Conference, CA, volume 1, pages 798–801.
[Ikram et al.(1996c)] Ikram, M. Z., Abed-Meraim, K., et Hua, Y. (1996c). An iterative
approach to the parametric estimation of chirp signals. In IEEE Region Ten
Conference, Perth, Australia, volume 2, pages 681–685.
[Ikram et al.(1997)] Ikram, M. Z., Abed-Meraim, K., et Hua, Y. (1997). Fast quadratic
phase transform for estimating the parameters of multicomponent chirp signals. DSP
Review Journal, pages 127–135.
[Ikram et al.(1998)] Ikram, M. Z., Belouchrani, A., Abed-Meraim, K., et Gesbert, D.
(1998). Parametric estimation and suppression of non-stationary interference in
spread spectrum communications. In Proc. of 32nd Asilomar Conference on Signals,
Systems and Computers, Pacific Grove, CA, pages 1401–1405.
[Ilow(1995)] Ilow, J. (1995). Signal Processing in α-stable Noise Environments : Noise
Modeling, Detection and Estimation. Ph.D. thesis, Dept. of Electrical and Computer
Engineering, University of Toronto, Toronto, Canada.
[Jain(1989)] Jain, A. K. (1989). Fundamentals of Digital Image Processing.
Prentice-Hall, Englewood Cliffs, New Jersey.
[Jakes(1974)] Jakes, W., editor (1974). Microwave Mobile Communications. IEEE Press.
[Janicki et Weron(1994)] Janicki, A. et Weron, A. (1994). Simulation and Chaotic
Behavior of α-Stable Stochastic Processes. Marcel Dekker, New York.
[Jayant et Noll(1984)] Jayant, N. et Noll, P. (1984). Digital Coding of Waveforms :
Principles and Applications to Speech and Video. Prentice-Hall.
[Jones et Sibson(1987)] Jones, M. C. et Sibson, R. (1987). What is projection pursuit ?
J. of the Royal Statistical Society, 150(A), 1–36.
[Joshi et Morris(1998)] Joshi, S. M. et Morris, J. M. (1998). Multiple access based on
Gabor transform. In Proceedings of the IEEE-SP International Symposium on
Time-Frequency and Time-Scale Analysis, pages 217–220, Pittsburgh, Pennsylvania,
USA. IEEE.
[Jutten(2000)] Jutten, C. (2000). Source separation : from dusk till dawn. In Proc. 2nd
Int. Workshop on Independent Component Analysis and Blind Source Separation
(ICA’2000), pages 15–26, Helsinki, Finland.
[Kagan et al.(1973)] Kagan, A., Linnik, Y., et Rao, C. (1973). Characterization
Problems in Mathematical Statistics. John Wiley & Sons, USA.
[Kalluri(1998)] Kalluri, S. (1998). Nonlinear Adaptive Optimization Algorithms for
Robust Signal Processing in Non-Gaussian Environments. Ph.D. thesis, Dept. of
Electrical Engineering, University of Delaware, Newark.
[Kalluri et Arce(2000)] Kalluri, S. et Arce, G. (2000). Fast algorithms for weighted
myriad computation by fixed-point search. IEEE Trans. on Signal Proc.
[Karol et al.(1997)] Karol, M. J., Haas, Z. J., Woodworth, C. B., et Gitlin, R. D. (1997).
Time-frequency-code slicing : efficiently allocating the communications spectrum to
multirate users. IEEE Transactions on Vehicular Technology, 46(4), 818–826.
[Karvanen et Cichocki(2003)] Karvanen, J. et Cichocki, A. (2003). Measuring sparseness
of noisy signals. In Proc. of the Conference ICA’2003, Japan.
[Kassam(1995)] Kassam, S. A. (1995). Signal Detection in Non-Gaussian Noise. John
Wiley & Sons, New York.
[Kassam et Poor(1985)] Kassam, S. A. et Poor, H. V. (1985). Robust techniques for signal
processing : A survey. Proceedings of the IEEE, 73(3), 433–481.
[Katkovnik(1998)] Katkovnik, V. (1998). Robust M-periodogram. IEEE Transactions on
Signal Processing, 46(11), 3104–3109.
[Katkovnik et Stankovic(1998)] Katkovnik, V. et Stankovic, L. J. (1998). Instantaneous
frequency estimation using the Wigner distribution with varying and data driven
window length. IEEE Transactions on Signal Processing, 46(9), 2315–2325.
[Katkovnik et al.(2002)] Katkovnik, V., Djurovic, I., et Stankovic, L. (2002).
Time-Frequency Signal Analysis, chapter Robust time-frequency representations.
Prentice-Hall.
[Katkovnik et al.(2003)] Katkovnik, V., Djurovic, I., et Stankovic, L. (2003). Robust
time-frequency representation. Elsevier, Oxford.
[Kay(1998a)] Kay, S. (1998a). Fundamentals of Statistical Signal Processing : Detection
Theory. Prentice-Hall, Englewood Cliffs.
[Kay(1998b)] Kay, S. (1998b). Fundamentals of Statistical Signal Processing :
Estimation Theory. Prentice-Hall, Englewood Cliffs.
[Kay(1993)] Kay, S. M. (1993). Fundamentals of Statistical Signal Processing :
Estimation Theory. A.V. Oppenheim, series editor, Prentice-Hall Signal Processing
Series. Prentice-Hall, Englewood Cliffs, New Jersey.
[Kay(1998c)] Kay, S. M. (1998c). Fundamentals of Statistical Signal Processing, Volume
II : Detection Theory. A.V. Oppenheim, series editor, Prentice-Hall Signal Processing
Series. Prentice-Hall.
[Kay et Boudreaux-Bartels(1985)] Kay, S. M. et Boudreaux-Bartels, G. F. (1985). On
the optimality of Wigner distribution for detection. In International Conference on
Acoustics, Speech, and Signal Processing, ICASSP’85, pages 1017–1019.
[Khawarizmi(IX-ème siècle)] Khawarizmi, M. I. M. (IX-ème siècle). The algebra of Mohammed
ben Musa. Edited and translated by Frederic Rosen. Georg Olms Verlag.
[Kidmose(2001)] Kidmose, P. (2001). Blind Separation of Heavy Tail Signals. Ph.D.
thesis, Technical University of Denmark, Lyngby, Denmark.
[Knuth(1999)] Knuth, K. H. (1999). A Bayesian approach to source separation. In
Proceeding of the first International Workshop on Independent Component Analysis
and Signal Separation (ICA’1999), pages 283–288, Aussois, France.
[Kootsookos et al.(1992)] Kootsookos, P., Lovell, B., et Boashash, B. (1992). A unified
approach to the STFT, TFDs, and instantaneous frequency. IEEE Transactions on
Signal Processing, 40, 1971 – 1982.
[Koutrouvelis(1980)] Koutrouvelis, I. A. (1980). Regression-type estimation of the
parameters of stable laws. Journal of the American Statistical Association, 75(372),
918–928.
[Krim et Viberg(1996)] Krim, H. et Viberg, M. (1996). Two decades of array signal
processing research : the parametric approach. IEEE Signal Processing Magazine,
13(4), 67 – 94.
[Krob et Benidir(1993)] Krob, M. et Benidir, M. (1993). Blind identification of a
linear-quadratic model using higher-order statistics. Minneapolis, USA.
[Kuelbs(1973)] Kuelbs, J. (1973). A representation theorem for symmetric stable
processes and stable measures. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete,
26.
[Kuhn et Lavielle(2004)] Kuhn, E. et Lavielle, M. (2004). Coupling a stochastic
approximation version of EM with a MCMC procedure. ESAIM Proba.& Stat., 8,
115–131.
[Kuruoglu(1998)] Kuruoglu, E. (1998). Signal Processing in α-stable Noise
Environments : A least lp Approach. Ph.D. thesis, University of Cambridge, UK.
[Kuruoglu(2001)] Kuruoglu, E. E. (2001). Density parameter estimation of skewed
α-stable distributions. Transaction on Signal Processing, 49(10).
[Kuruoglu(2002)] Kuruoglu, E. E. (2002). Nonlinear least lp -norm filters for nonlinear
autoregressive α-stable processes. Digital Signal Processing, 12, 119–142.
[Kuruoglu(2003)] Kuruoglu, E. E. (2003). Analytical representation for positive α-stable
densities. In Proceedings of ICASSP 2003 ; IEEE International Conference on
Acoustics, Speech, and Signal Processing, volume 6, pages 729–732.
[Lacoume et Ruiz(1988)] Lacoume, J.-L. et Ruiz, P. (1988). Sources identification : a
solution based on cumulants. In Proc. IEEE ASSP Workshop, Minneapolis,
Minnesota.
[Launer et Wilkinson(1979)] Launer, R. L. et Wilkinson, G. N., editors (1979).
Robustness in Statistics. Academic Press, The Army Research Office, Research
Triangle Park, North Carolina, USA. This book contains the proceedings of a
workshop.
[Lecoutre et Tassi(1980)] Lecoutre et Tassi, P. (1980). Statistique non paramétrique et
robustesse. statistica, Paris.
[Lee(1998a)] Lee, T.-W. (1998a). Independent Component Analysis : Theory and
Applications. Kluwer Academic, Boston/ Dordrecht/ London.
[Lee(2001)] Lee, T.-W. (2001). Independent Component Analysis : Theory and
Applications. Kluwer Academic Publishers, Boston.
[Lee et al.(1999)] Lee, T.-W., Lewicki, M. S., et Girolami, M. (1999). Blind source
separation of more sources than mixtures using overcomplete representations. Signal
Processing Letters, 6(4).
[Lee(1998b)] Lee, W. Y. (1998b). Mobile communications engineering. McGraw-Hill,
2nd. edition.
[Leroy(1987)] Rousseeuw, P. J. et Leroy, A. M. (1987). Robust Regression & Outlier
Detection. John Wiley & Sons.
[Lévy(1925)] Lévy, P. (1925). Calcul des Probabilités. Gauthier-Villars, Paris.
[Leyman et al.(2000)] Leyman, A. R., Kamran, Z. M., et Abed-Meraim, K. (2000).
Higher order time frequency based blind source separation technique. IEEE Signal
Processing Letters.
[Linh-Trung(2002)] Linh-Trung, N. (2002). Estimation and separation of LFM Signals
in wireless communication using time-frequency signal processing. Ph.D. thesis,
Queensland University of Technology, Brisbane, Australia.
[Luengo et al.(2003)] Luengo, D., Santamaria, I., Vielva, L., et Pantaleon, C. (2003).
Underdetermined blind separation of sparse sources with instantaneous and
convolutive mixtures. In Proceedings of the IEEE XIII-th Workshop on Neural
Networks for Signal Processing.
[Luigi et Moreau(2002a)] Luigi, C. D. et Moreau, E. (2002a). An iterative algorithm for
the estimation of linear frequency modulated signal parameters. IEEE Signal
Processing Letters, 9(4), 127–129.
[Luigi et Moreau(2002b)] Luigi, C. D. et Moreau, E. (2002b). Wigner-Ville and
polynomial Wigner-Ville transforms in the estimation of nonlinear FM signal
parameters. In Proceedings of IEEE International Conference on Acoustics, Speech
and Signal Processing, volume 2, pages 1433–1436, Orlando, Florida.
[Luo et al.(2004)] Luo, Y., Lambotharan, S., et Chambers, J. (2004). A new block based
time-frequency approach for underdetermined blind source separation. In Proceedings
of ICASSP ’04 ; IEEE International Conference on Acoustics, Speech, and Signal
Processing, volume 5, pages 537–540.
[M. Castella et Pesquet(2004)] Castella, M., Moreau, E., et Pesquet, J.-C. (2004). A quadratic
MISO contrast for blind equalization. In IEEE International Conference on Acoustics,
Speech and Signal Processing, ICASSP’2004, Montréal, Canada.
[M. Sahmoudi et al.(2005)] Sahmoudi, M., Abed-Meraim, K., Lavielle, M., Kuhn, E., et
Ciblat, P. (2005). Blind source separation using a semi-parametric approach with
application to heavy-tailed signals. Submitted to EUSIPCO’2005.
[Ma et Nikias(1995a)] Ma, X. et Nikias, C. L. (1995a). On blind channel identification
for impulsive signal environments. In Proc. of the Conference ICASSP’1995.
[Ma et Nikias(1995b)] Ma, X. et Nikias, C. L. (1995b). Parameter estimation and blind
channel identification in impulsive signal environments. Transaction on Signal
Processing, 43(12).
[Mandelbrot(1962)] Mandelbrot, B. (1962). Sur certains prix spéculatifs : faits
empiriques et modèle basé sur les processus stables additifs non gaussiens de Paul
Lévy. Comptes rendus à l’Académie des Sciences, 254, 3968–3970.
[Mandelbrot(1963)] Mandelbrot, B. (1963). The variation of certain speculative prices.
Journal of Business, 36, 394–419.
[Mansour et Ohnishi(2000)] Mansour, A. et Ohnishi, N. (2000). Discussion of simple
algorithms and methods to separate non-stationary signals. In Fourth IASTED
International Conference On Signal Processing and Communications (SPC 2000),
pages 78–85, Marbella, Spain.
[Mansour et al.(2000a)] Mansour, A., Jutten, C., et Loubaton, P. (2000a). Adaptive
subspace algorithm for blind separation of independent sources in convolutive mixture.
IEEE Trans. on Signal Processing, 48(2), 583–586.
[Mansour et al.(2000b)] Mansour, A., Barros, A. K., et Ohnishi, N. (2000b). Blind
separation of sources : Methods, assumptions and applications. Special Issue on
Digital Signal Processing in IEICE Transactions on Fundamentals of Electronics,
Communications and Computer Sciences, E83-A(8), 1498 – 1512.
[Mansour et al.(2001)] Mansour, A., Puntonet, C. G., et Ohnishi, N. (2001). A simple
ICA algorithm based on geometrical approach. In Sixth International Symposium on
Signal Processing and its Applications (ISSPA 2001), pages 9–12, Kuala-Lampur,
Malaysia.
[Mansour et al.(2002a)] Mansour, A., Ohnishi, N., et Puntonet, C. G. (2002a). Blind
multiuser separation of instantaneous mixture algorithm based on geometrical
concepts. Signal Processing, 82(8), 1155–1175.
[Mansour et al.(2002b)] Mansour, A., Kawamoto, M., et Ohnishi, N. (2002b). A survey
of the performance indexes of ICA algorithms. In 21st IASTED International
Conference on Modelling, Identification and Control (MIC 2002), pages 660 – 666,
Innsbruck, Austria.
[Marinovic(1984)] Marinovic, N. (1984). Time-Frequency Analysis. Ph.D. thesis.
[Marple S.L.(2001)] Marple, S. L., Jr. (2001). Large dynamic range time-frequency signal
analysis with application to helicopter Doppler radar data. In Sixth International
Symposium on Signal Processing and its Applications, volume 1, pages 260–263.
[Martin(1982)] Martin, W. (1982). Time-frequency analysis of random signals. In
Proceeding of ICASSP’1982, pages 1325–1328, Paris, France.
[Martin et Flandrin(1985)] Martin, W. et Flandrin, P. (1985). Wigner–Ville spectral
analysis of non–stationary signals. IEEE Transactions on Acoustics, Speech, and
Signal Processing, 33(6), 1461–1470.
[Masry et Cambanis(1984)] Masry, E. et Cambanis, S. (1984). Spectral density
estimation for stationary stable processes. Stochastic Processes and their Applications,
18, 1–31.
[Matz et Hlawatsch(1998a)] Matz, G. et Hlawatsch, F. (1998a). Extending the transfer
function calculus of time-varying linear systems : A generalized underspread theory. In
International Conference on Acoustics, Speech, and Signal Processing, ICASSP’98,
pages 2189–2192, Seattle, WA, USA. IEEE.
[Matz et Hlawatsch(1998b)] Matz, G. et Hlawatsch, F. (1998b). Time-frequency transfer
function calculus (symbolic calculus) of linear time-varying systems (linear operators)
based on a generalized underspread theory. Journal of Mathematical Physics, 39(8),
4041–4070.
[Matz et Hlawatsch(1999)] Matz, G. et Hlawatsch, F. (1999). Time-frequency subspace
detectors and application to knock detection. Int. J. Electron. Commun. (AEÜ),
53(6), 379–385.
[Matz et Hlawatsch(2003)] Matz, G. et Hlawatsch, F. (2003). Wigner distribution
(nearly) everywhere : time-frequency analysis of signals, systems, random processes,
signal spaces, and frames. Signal Processing (Elsevier), 83, 1355–1378.
[Matz et al.(1999)] Matz, G., Molisch, A. F., Steinbauer, M., Hlawatsch, F., Gaspard, I.,
et Artés, H. (1999). Bounds on the systematic measurement errors of channel
sounders for time-varying mobile radio channels. In Proc. IEEE VTC-99 Fall, pages
1465–1470, Amsterdam, Netherlands.
[Maymon et al.(2000)] Maymon, S., Friedmann, J., et Messer, H. (2000). A new method
for estimating parameters of a skewed alpha-stable distribution. In IEEE Conference.
[McCullagh(1987)] McCullagh, P. (1987). Tensor Methods in Statistics. Monographs on
Statistics and Probability, Chapman and Hall.
[McGillem et Cooper(1984)] McGillem, C. et Cooper, G. (1984). Continuous and
Discrete Signal and System Analysis. HRW Series in Electrical and Computer
Engineering. CBS Publishing Japan Ltd., 2nd edition.
[McGillem et Cooper(1991)] McGillem, C. et Cooper, G. (1991). Continuous and
Discrete Signal and System Analysis. HRW Series in Electrical and Computer
Engineering. Saunders College Publishing, 3rd edition.
[McHale et Boudreaux-Bartels(1993)] McHale, T. J. et Boudreaux-Bartels, G. F. (1993). An
algorithm for synthesizing signals from partial time-frequency models using the cross
Wigner distribution. IEEE Transactions on Signal Processing, 41(5), 1986–1990.
[Mecklenbräuker et Hlawatsch(1997)] Mecklenbräuker, W. et Hlawatsch, F., editors
(1997). The Wigner Distribution – Theory and Applications in Signal Processing.
Elsevier, Amsterdam, Netherlands.
[Meng et Rubin(1993)] Meng, X. L. et Rubin, D. B. (1993). Maximum likelihood
estimation via the ECM algorithm : a general framework. Biometrika, 80(2), 267–278.
[Michael(1983)] Michael, J. R. (1983). The stabilized probability plot. Biometrika, 70,
11–17.
[Middleton(1977)] Middleton, D. (1977). Statistical-physical models of electromagnetic
interference. IEEE Trans. on Electromagnetic Compatibility, EMC-19(3), 106–127.
[Miller(1978)] Miller, G. (1978). Properties of certain symmetric stable distributions.
Journal of Multivariate Analysis, 8(3), 346–360.
[Milstein(1988)] Milstein, L. B. (1988). Interference rejection techniques in spread
spectrum communications. Proceedings of the IEEE, pages 657–671.
[Mirza et Boyer(1993)] Mirza, M. J. et Boyer, K. L. (1993). Performance evaluation of a
class of M-estimators for surface parameter estimation in noisy range data. IEEE
Trans. on Robotics and Automation, 9(1), 75–85.
[Miskin(2000)] Miskin, J. (2000). Ensemble Learning for Independent Component
Analysis. Ph.D. thesis, Cambridge,
http://www.inference.phy.cam.ac.uk/jwm1003/.
[Molgedey et Schuster(1994)] Molgedey, L. et Schuster, H. G. (1994). Separation of a
mixture of independent signals using time delayed correlations. Physical Review
Letters, 72, 3634–3636.
[Moreau(2000)] Moreau, E. (2000). Joint-diagonalization of cumulant tensors and source
separation. In Proceedings of the 10th IEEE Signal Processing Workshop on Statistical
Signal and Array Processing (SSAP 2000), pages 339–343, Pocono Manor,
Pennsylvanie, USA.
[Moreau(2001)] Moreau, E. (2001). A generalization of joint-diagonalization criteria for
source separation. IEEE Transactions on Signal Processing, 49(3), 530–541.
[Moreau et Macchi(1996)] Moreau, E. et Macchi, O. (1996). High order contrasts for
self-adaptive source separation. International Journal of Adaptive Control and Signal
Processing, 10(1), 19–46.
[Moreau et Pesquet(1997)] Moreau, E. et Pesquet, J.-C. (1997). Generalized contrasts
for multichannel blind deconvolution of linear systems. IEEE Signal Processing
Letters, 4, 182–183.
[Moreau et Stoll(1999)] Moreau, E. et Stoll, B. (1999). An iterative block procedure for
the optimization of constrained contrast function. In Proceedings of the International
Conference on Independent Component Analysis (ICA’99), pages 59–64, Aussois,
France.
[Morelande et Zoubir(2002)] Morelande, M. R. et Zoubir, A. M. (2002). Model selection
of random amplitude polynomial phase signals. IEEE Transactions on Signal
Processing, 50(3), 578–589.
[Morelande et al.(2000)] Morelande, M. R., Barkat, B., et Zoubir, A. M. (2000).
Statistical performance comparison of a parametric and a non–parametric method for
IF estimation of random amplitude linear FM signals in additive noise. In Proceedings
of the Tenth IEEE Workshop on Statistical Signal and Array Processing, pages
262–266.
[Moussaoui et al.(2004)] Moussaoui, S., Brie, D., Caspary, O., et M.-Djafari, A. (2004).
A Bayesian method for positive source separation. In Proceeding of ICASSP ’2004,
volume 5.
[Nandi(1999)] Nandi, A. K., editor (1999). Blind Estimation Using Higher-Order
Statistics. Boston : Kluwer Academic Publishers.
[Nguyen et al.(2001a)] Nguyen, L., Belouchrani, A., Abed-Meraim, K., et Boashash, B.
(2001a). Separating more sources than sensors using time-frequency distributions. In
Proc. of Int. Symposium on Signal Processing and its Applications (ISSPA’2001),
pages 583–586, Malaysia.
[Nguyen et al.(2001b)] Nguyen, L.-T., Senadji, B., et Boashash, B. (2001b). Scattering
function and time-frequency signal processing. In International Conference on
Acoustics, Speech, and Signal Processing, ICASSP’2001, volume VI, pages 3597–3600,
Salt Lake city, Utah, USA.
[Nikias et Petropulu(1993)] Nikias, C. et Petropulu, A. (1993). Higher–Order Spectra
Analysis : A Nonlinear Signal Processing Framework. Prentice-Hall.
[Nikias et Petropulu(1994)] Nikias, C. L. et Petropulu, A. P. (1994). Higher-order
Spectra Analysis : A Nonlinear Signal Processing Framework. Prentice Hall, New York.
[Nikias et Shao(1995)] Nikias, C. L. et Shao, M. (1995). Signal Processing with
Alpha-Stable Distributions and Applications. John Wiley & Sons, New York.
[Nolan(2004)] Nolan, J. P. (2004). Stable Distributions - Models for Heavy Tailed Data.
Birkhauser, Boston.
[Nowicka(1997)] Nowicka, J. (1997). Asymptotic behavior of the covariation and the
codifference for ARMA models with stable innovations. Communications in Statistics.
Stochastic Models, 13(4), 673–685.
[Ouldali(1999)] Ouldali, A. (1999). Modélisation statistique et identification des signaux
FM à phase polynomiale. Ph.D. thesis, LSS, Supélec–Univ Paris XI, France.
[Ouldali et Benidir(1999)] Ouldali, A. et Benidir, M. (1999). Statistical analysis of
polynomial phase signals affected by multiplicative and additive noise. Signal
Processing, 42(19).
[P. Bickel(1998)] Bickel, P., et al. (1998). Efficient and Adaptive Estimation for
Semiparametric Models. Springer.
[P.-Y. Arquès(2000)] Arquès, P.-Y., Thirion-Moreau, N., et Moreau, E. (2000). Techniques de l’ingénieur,
Traité Mesure et Contrôle, volume RAB, chapter Les représentations temps-fréquences
linéaires et quadratiques en traitement du signal, pages 1–22. Techniques de
l’ingénieur.
[Papoulis(1991)] Papoulis, A. (1991). Probability, Random Variables, and Stochastic
Processes. McGraw-Hill.
[Peleg et Friedlander(1995)] Peleg, S. et Friedlander, B. (1995). The discrete
polynomial-phase transform. IEEE Transactions on Signal Processing, 43(8),
1901–1914.
[Peleg et Friedlander(1996)] Peleg, S. et Friedlander, B. (1996). Multicomponent signal
analysis using the polynomial-phase transform. IEEE Trans. On AES.
[Pesquet et Moreau(2001)] Pesquet, J.-C. et Moreau, E. (2001). Cumulant based
independence measures for linear mixtures. IEEE Transactions on Information
Theory, 47(5), 1947–1956.
[Pham(1999)] Pham, D. T. (1999). Mutual information approach to blind separation of
stationary sources.
[Pham(2000)] Pham, D. T. (2000). Blind separation of instantaneous mixture of sources
via order statistics. IEEE Transactions on Signal Processing, 48(2), 363–375.
[Pham et Cardoso(2001)] Pham, D. T. et Cardoso, J. F. (2001). Blind separation of
instantaneous mixtures of nonstationary sources. IEEE Transactions on Signal
Processing, 49(9), 1837–1848.
[Pham et Garat(1997)] Pham, D.-T. et Garat, P. (1997). Blind separation of a
mixture of independent sources through a quasi-maximum likelihood approach. IEEE
Transactions on Signal Processing, 45(7), 1712–1725.
[Piasco et al.(1995)] Piasco, J. M., Elkarkour, W., et Guglielmi, M. (1995). Identification
paramétrique de différents modèles d’un signal M.L.F. multicomposantes. In
Quinzième colloque GRETSI, pages 193–196, Juan les Pins.
[Poor et Tanda(2002)] Poor, H. et Tanda, M. (2002). Multiuser detection in flat fading
non-Gaussian channels. IEEE Transactions on Communications, 50(11), 1769–1777.
[Poor et Wornell(1998)] Poor, H. V. et Wornell, G. W., editors (1998). Wireless
Communications : Signal Processing Perspectives. Prentice-Hall, New Jersey.
[Proakis(1995)] Proakis, J. G. (1995). Digital Communications. McGraw–Hill, 3rd.
edition.
[Rachev(2003)] Rachev, S. T. (2003). Handbook of Heavy Tailed Distributions in
Finance. Elsevier, Amsterdam.
[Rai et Singh(2004)] Rai, C. S. et Singh, Y. (2004). Source distribution models for blind
source separation. Neurocomputing, 57, 501–505.
[Rappaport(1996)] Rappaport, T. S. (1996). Wireless Communications : Principles and
Practice. Prentice-Hall, New Jersey.
[Rihaczek(1985)] Rihaczek, A. (1985). Principles of High-Resolution Radar. Peninsula
Publishing.
[Ristic(1995)] Ristic, B. (1995). Some aspects of signal dependent and higher-order
time-frequency and time-scale analysis of non-stationary signals. Ph.D. thesis, Signal
Processing Research Centre, Queensland University of Technology, Brisbane,
Australia.
[Rupi et al.(2004)] Rupi, M., Tsakalides, P., Re, E. D., et Nikias, C. L. (2004). Constant
modulus blind equalization based on fractional lower-order statistics. Signal
Processing, 84, 881–894.
[Sahmoudi(2005)] Sahmoudi, M. (2005). Generalized contrast functions for blind source
separation with unknown number of sources. In IEEE Statistical Signal Processing
Workshop (SSP’2005) (submitted), Bordeaux, France.
[Sahmoudi et Abed-Meraim(2004a)] Sahmoudi, M. et Abed-Meraim, K. (2004a).
Multicomponent chirp interference estimation for communication systems in impulsive
alpha-stable noise environment. In Proceeding of the IEEE International Symposium
on Control, Communications and Signal Processing (ISCCSP’04), Hammamet,
Tunisia.
[Sahmoudi et Abed-Meraim(2004b)] Sahmoudi, M. et Abed-Meraim, K. (2004b). Robust
blind separation algorithms for heavy-tailed sources. In Proceeding of the IEEE
International Symposium on Signal Processing and Information Theory, Rome, Italy.
[Sahmoudi et al.(2002)] Sahmoudi, M., Abed-Meraim, K., et Benidir, M. (2002). Blind
separation of alpha-stable sources : A new fractional lower-order moments (FLOM)
approach. In Proceedings of the IEEE International Symposium on Signal Processing
and Information Theory (ISSPIT’2002).
[Sahmoudi et al.(2003a)] Sahmoudi, M., Abed-Meraim, K., et Benidir, M. (2003a). Blind
separation of instantaneous mixtures of impulsive α-stable sources. In Proceeding of
the IEEE International Symposium on Signal and Image Processing (ISPA’2003).
[Sahmoudi et al.(2003b)] Sahmoudi, M., Abed-Meraim, K., et Benidir, M. (2003b).
Estimation des signaux chirp multi-composantes affectés par un bruit impulsif
α-stable. In Proceeding of GRETSI’2003.
[Sahmoudi et al.(2004a)] Sahmoudi, M., Abed-Meraim, K., et Benidir, M. (2004a).
Blind separation of heavy-tailed signals using normalized statistics. In Proceeding of
ICA’2004, Granada, Spain.
[Sahmoudi et al.(2004b)] Sahmoudi, M., Abed-Meraim, K., et Barkat, B. (2004b). IF
estimation of multicomponent chirp signal in impulsive α-stable noise environment
using parametric and non-parametric approaches. In Proceedings of EUSIPCO’2004,
Austria.
[Sahmoudi et al.(2005)] Sahmoudi, M., Abed-Meraim, K., et Benidir, M. (2005). Blind
separation of impulsive alpha-stable sources using minimum dispersion criterion.
IEEE Signal Processing Letters.
[Samorodnitsky et Taqqu(1994)] Samorodnitsky, G. et Taqqu, M. (1994). Stable
Non-Gaussian Random Processes : Stochastic Models with Infinite Variance.
Chapman & Hall, New York.
[Sarni et al.(2001)] Sarni, Y., Sadoun, R., et Belouchrani, A. (2001). On the application
of chirp modulation in spread spectrum communication systems. In Proceedings of
ISSPA’2001 ; Sixth International Symposium on Signal Processing and its
Applications, volume 2, pages 501 – 504.
[Sayeed(1998)] Sayeed, A. M. (1998). Canonical time-frequency processing for
broadband signaling over dispersive channels. In Proceedings of the IEEE-SP
International Symposium on Time-Frequency and Time-Scale Analysis, pages
369–372, New York, USA. IEEE.
[Sayeed et al.(1998)] Sayeed, A. M., Sendonaris, A., et Aazhang, B. (1998). Multiuser
detection in fast-fading multipath environments. IEEE Journal on Selected Areas in
Communications, 16(9), 1691–1701.
[Schilder(1970)] Schilder, M. (1970). Some structure theorems for the symmetric stable
laws. Ann. Math. Statist., 41(2), 412–421.
[Senecal(2002)] Senecal, S. (2002). Méthodes de simulation Monte-Carlo par chaînes de
Markov pour l’estimation de modèle. Application en séparation de sources et en
égalisation. Ph.D. thesis, INPG, Grenoble.
[Sengupta et Burman(2003)] Sengupta, K. et Burman, P. (2003). Non-parametric
approach to ICA using Kernel Density Estimation. In Proceedings of IEEE
International Conference on Multimedia and Expo. ICME’03, volume 1, pages
749–752.
[Serfling(1980)] Serfling, R. J. (1980). Approximation Theorems of Mathematical
Statistics. Wiley.
[Shamsunder et al.(1995)] Shamsunder, S., Giannakis, G., et Friedlander, B. (1995).
Estimating random amplitude polynomial phase signals : a cyclostationary approach.
IEEE Trans. on Signal Processing, 43(2), 492–505.
[Shannon(1948a)] Shannon, C. E. (1948a). A mathematical theory of communication.
The Bell System Technical Journal, 27(3), 379–423.
[Shannon(1948b)] Shannon, C. E. (1948b). A mathematical theory of communication.
The Bell System Technical Journal, 27, 623–657.
[Shereshevski(2002)] Shereshevski, Y. (March 2002). Blind signal separation of heavy tail
sources. M.Sc. thesis, Tel Aviv University, Israel.
[Shereshevski et al.(2001)] Shereshevski, Y., Yeredor, A., et Messer, H. (2001).
Super-efficiency in blind signal separation of symmetric heavy-tailed sources. In
Proceedings of 11th IEEE Workshop on Statistical Signal Processing, pages 78 –81.
[Shi et al.(2004)] Shi, Z., Tang, H., Liu, W., et Tang, Y. (2004). Blind source separation
of more sources than mixtures using generalized exponential mixture models.
Neurocomputing, 61, 461–469.
[Shiryayev(1984)] Shiryayev, A. N. (1984). Probability. In Graduate Texts in
Mathematics, volume 95. Springer-Verlag.
[Snoussi(2003)] Snoussi, H. (2003). Approche Bayésienne en Séparation de Sources.
Applications en Imagerie. Ph.D. thesis, Université Paris-Sud Orsay, Paris.
[Snoussi et M.-Djafari(2000)] Snoussi, H. et M.-Djafari, A. (2000). Bayesian source
separation with mixture of Gaussians prior for sources and Gaussian prior for mixture
coefficients. In Proc. of MaxEnt ; Bayesian Inference and Maximum Entropy Methods,
pages 388–406, Gif-sur-Yvette, FRANCE.
[Snoussi et M.-Djafari(2004)] Snoussi, H. et M.-Djafari, A. (2004). Fast joint separation
and segmentation of mixed images. Journal of Electronic Imaging, 13(2), 349–361.
[Stankovic(1997)] Stankovic, L. (1997). S–class of time–frequency distributions. IEE
Proc. Vision, Image and Signal Processing, 144(2), 57–64.
[Stankovic et Stankovic(1993)] Stankovic, L. et Stankovic, S. (1993). Wigner
distribution of noisy signals. IEEE Transactions On Signal Processing, 41(2), 956–960.
[Stankovic et Katkovnik(1998)] Stankovic, L. J. et Katkovnik, V. (1998). Algorithm for
the instantaneous frequency estimation using the time–frequency distributions with
adaptive window length. IEEE Signal Processing Letters, 5(9).
[Stoll et Moreau(2000)] Stoll, B. et Moreau, E. (2000). A generalized ICA algorithm.
IEEE Signal Processing Letters, 7(4), 90–92.
[Stone(1990)] Stone, C. J. (1990). Large-sample inference for log-spline models. Ann.
Statist., 18(2), 717–741.
[Stuck(1977)] Stuck, B. W. (1977). Minimum error dispersion linear filtering of scalar
symmetric stable processes. IEEE Trans. on Automatic Control, (23), 507–509.
[Stuck et Kleiner(1974)] Stuck, B. W. et Kleiner, B. (1974). A statistical analysis of
telephone noise. Bell System Technical Journal, (53), 1263–1320.
[Subbotin(1923)] Subbotin, M. T. (1923). On the law of frequency of errors.
Mathematicheskii Sbornik, 31, 296–301.
[Sucic et al.(1999)] Sucic, V., Barkat, B., et Boashash, B. (1999). Performance
evaluation of the B distribution. In Proceedings of the Fifth International Symposium
on Signal Processing and its Applications, ISSPA’99, volume 1, pages 267–270,
Brisbane, Queensland, Australia.
[Suppappola(2003)] Suppappola, A. P., editor (2003). Applications in Time-Frequency
Signal Processing. CRC Press.
[Swami et Sadler(1998)] Swami, A. et Sadler, B. (1998). Parameter estimation for linear
alpha-stable processes. IEEE Signal Processing Letters, 5(2).
[Swarts et al.(1999)] Swarts, F., van Rooyen, P., Oppermann, I., et Lotter, M. P.,
editors (1999). CDMA Techniques for Third Generation Mobile Systems. Kluwer
Academic Publishers, Boston.
[Takada(2001)] Takada, T. (2001). Nonparametric density estimation : A comparative
study. Economics Bulletin, 3(16), 1–10.
[Taleb(1999)] Taleb, A. (1999). Séparation de Sources dans des Mélanges Non Linéaires.
Ph.D. thesis, INPG, Grenoble, France.
[Thirion-Moreau et al.(2004)] Thirion-Moreau, N., Fadili, E., et Moreau, E. (2004). A
sufficient condition for separation of deterministic signals based on spatial
time-frequency representation. In Proceedings of the International Conference on
Independent Component Analysis (ICA’2004), pages 366–373.
[Tong et al.(1991)] Tong, L., Liu, R.-W., Soon, V., et Huang, Y.-F. (1991).
Indeterminacy and identifiability of blind identification. IEEE Trans. on Circuits and
Systems, 38, 499–509.
[Tourneret(1998)] Tourneret, J. (1998). Detection and estimation of abrupt changes
contaminated by multiplicative Gaussian noise. Signal Processing, 68, 259–270.
[Tourneret et al.(2003a)] Tourneret, J. Y., Doisy, M., et Lavielle, M. (2003a). Bayesian
retrospective detection of multiple change-points corrupted by multiplicative noise.
Application to SAR image edge detection. Signal Processing, 83, 1871–1887.
[Tourneret et al.(2003b)] Tourneret, J.-Y., Suparman, S., et Doisy, M. (2003b).
Hierarchical Bayesian segmentation of signals corrupted by multiplicative noise. In
Proceeding of ICASSP’2003, pages 165–168, Hong-Kong, China.
[Tsakalides et Nikias(1996)] Tsakalides, P. et Nikias, C. (1996). The robust
covariation-based MUSIC (ROC-MUSIC) algorithm for bearing estimation in
impulsive noise environments. IEEE Trans. on Signal Processing, 44(7), 1623–1633.
[Tsihrintzis et Nikias(1996)] Tsihrintzis, G. et Nikias, C. (1996). Fast estimation of the
parameters of alpha-stable impulsive interference. IEEE Trans. on Signal Processing, 44(6).
[VanTrees(1968)] VanTrees, H. L. (1968). Detection, Estimation, and Modulation
Theory : Part I. John Wiley & Sons.
[VanTrees(1992)] VanTrees, H. L. (1992). Detection, Estimation, and Modulation
Theory. Radar-Sonar Signal Processing and Gaussian Signals in Noise. Krieger Pub.
Co., Malabar, Florida.
[Ville(1948)] Ville, J. (1948). Théorie et applications de la notion de signal analytique.
Cables et Transmissions, 2A(1), 61–74.
[Vincent(1995)] Vincent, I. (1995). Classification de Signaux non Stationnaires. Ph.D.
thesis, Université de Nantes/Ecole Centrale de Nantes.
[Walter(1994)] Walter, C. (1994). Les structures du hasard en économie : efficience des
marchés, lois stables et processus fractales. Ph.D. thesis, IEP Paris.
[Wang et al.(2002)] Wang, Y., Gao, L., Zhao, M., Chen, J., Zhang, Z., et Yao, Y. (2002).
Time-frequency code for multicarrier DS-CDMA systems. In Proceeding of the IEEE
55th Vehicular Technology Conference, volume 3, pages 1224–1227.
[Wegman et al.(1989)] Wegman, E. J., Schwartz, S. G., et Thomas, J. (1989). Topics in
non-Gaussian Signal Processing. Academic Press, New York.
[White et Boashash(1988)] White, L. V. et Boashash, B. (1988). On estimating the
instantaneous frequency of a Gaussian random signal by use of the Wigner–Ville
distribution. IEEE Transactions on Acoustics, Speech, and Signal Processing, 36(3),
417–420.
[Wood et Barry(1994)] Wood, J. C. et Barry, D. T. (1994). Linear signal synthesis using
the Radon-Wigner transform. IEEE Transactions on Signal Processing, 42(8),
2105–2111.
[Xueshi Yang et Pesquet(2001)] Yang, X., Petropulu, A. P., et Pesquet, J. C. (2001).
Estimating long-range dependence in impulsive traffic flows. In IEEE International
Conference on Acoustics, Speech, and Signal Processing (ICASSP ’01), pages 3413 –
3416.
[Zhang et Amin(2000)] Zhang, Y. et Amin, M. G. (2000). Blind separation of sources
based on their time-frequency signatures. In Proceedings. 2000 IEEE International
Conference on Acoustics, Speech, and Signal Processing, ICASSP ’00, volume 5,
Istanbul, Turkey.
[Zhang et Kassam(2004)] Zhang, Y. et Kassam, S. A. (2004). Robust rank-EASI
algorithm for blind source separation. IEE Proc. Commun., 151(1), 15–19.
[Zhang et al.(2001)] Zhang, Y., Ma, W., et Amin, M. G. (2001). Subspace analysis of
spatial time–frequency distribution matrices. IEEE Transactions on Signal
Processing, 49(4), 747–759.
[Zhao et al.(1990)] Zhao, Y., Atlas, L. E., et Marks, R. J. (1990). The use of
cone-shaped kernels for generalized time-frequency representations of nonstationary
signals. IEEE Trans. on Acoustics, Speech, and Signal Processing, 38(7), 1084–1091.
[Zhong et al.(2004a)] Zhong, M., Tang, H., et Tang, Y. (2004a).
Expectation-Maximization approaches to independent component analysis.
Neurocomputing, 61, 503–512.
[Zhong et al.(2004b)] Zhong, M.-J., Tang, H.-W., Chen, H.-J., et Tang, Y.-Y. (2004b).
An EM algorithm for learning sparse and overcomplete representations.
Neurocomputing, 57, 469–476.
[Zhou et Giannakis(1994a)] Zhou, G. et Giannakis, G. (1994a). Self coupled harmonics :
stationary and cyclostationary approaches. In International Conference on Acoustics,
Speech, and Signal Processing, ICASSP’94, volume 4, pages IV/153–156, Adelaide,
SA, Australia. IEEE.
[Zhou et Giannakis(1995)] Zhou, G. et Giannakis, G. (1995). Harmonics in Gaussian
multiplicative and additive noise : Cramer-Rao bounds. IEEE Trans. on Signal Proc.,
43(5), 1217–1231.
[Zhou et Giannakis(1996)] Zhou, G. et Giannakis, G. (1996). Polyspectral analysis of
mixed processes and coupled harmonics. IEEE Transactions on Information Theory,
42(3), 943–958.
[Zhou et Giannakis(1993)] Zhou, G. et Giannakis, G. B. (1993). Comparison of
higher-order and cyclic approaches for estimating random amplitude modulated
harmonics. In IEEE Signal Processing Workshop on Higher-Order Statistics, pages
225–229, South Lake Tahoe, CA, USA.
[Zhou et Giannakis(1994b)] Zhou, G. et Giannakis, G. B. (1994b). On estimating
random amplitude-modulated harmonics using higher order spectra. IEEE Journal of
Oceanic Engineering, 19(4), 529–539.
[Zhou et al.(1996)] Zhou, G., Giannakis, G., et Swami, A. (1996). On polynomial phase
signals with time-varying amplitudes. IEEE Trans. on Signal Proc., 44(4), 848–861.
[Ziehe et Müller(1998)] Ziehe, A. et Müller, K.-R. (1998). TDSEP—an efficient
algorithm for blind separation using time structure. In Proc. Int. Conf. on Artificial
Neural Networks (ICANN’98), pages 675–680, Skövde, Sweden.
[Zolotarev(1966)] Zolotarev, V. (1966). On representation of stable laws by integrals. In
Selected Translations in Mathematical Statistics and Probability, volume 6, pages 84–8.
American Mathematical Society.
[Zolotarev(1986)] Zolotarev, V. M. (1986). One-dimensional stable distributions. In
Translations of Mathematical Monographs, volume 65. American Mathematical
Society.
[Zoubir et Brcich(2002)] Zoubir, A. et Brcich, R. (2002). Multiuser detection in
non-Gaussian channels. Digital Signal Processing, 12, 262–273.
[Zoubir et Arnold(1996)] Zoubir, A. M. et Arnold, M. J. (1996). Testing Gaussianity
with the characteristic function : the i.i.d. case. Signal Processing, 53(2), 110–120.