# Linear discriminant analysis

## Basic idea

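LDA is a common approach to dimensionality reduction. Dimensionality reduction means reducing the number of random variables in the data at hand. As a simple example, imagine analysing 1,000 butterflies, where the data at hand describe ...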

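... so that the dimensionality of the data goes down; in everyday terms, the data become easier to read. In scientific research, or when training an AI, dimensionality reduction often makes the data easier to work with[1]. LDA is one technique used for dimensionality reduction, and it is a supervised method[2].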

## Basic model

LDA makes two assumptions: the variables are normally distributed, and the covariance matrices of the different groups are all the same.

### Between-class variance

${\displaystyle X=\{x_{1},x_{2},...,x_{N}\}}$

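The first step of LDA is to compute the so-called between-class variance (${\displaystyle S_{B}}$), which reflects how large the differences between the groups are. For group ${\displaystyle i}$, the between-class variance (${\displaystyle S_{Bi}}$) is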

${\displaystyle S_{Bi}=(m_{i}-m)^{2}=(W^{T}\mu _{i}-W^{T}\mu )^{2}}$ [Note 1]
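
As a rough illustration of the formula above, the following sketch computes the projected class means ${\displaystyle m_{i}=W^{T}\mu _{i}}$, the projected total mean ${\displaystyle m=W^{T}\mu }$, and the squared difference ${\displaystyle (m_{i}-m)^{2}}$ for each class. The toy data, the class labels `a` and `b`, and the projection vector `w` are made up for the example, and ${\displaystyle W}$ is assumed to be a single column vector. Only the Python standard library is used.

```python
# Toy data: two classes of 2-D points (hypothetical values, not from the article).
classes = {
    "a": [(4.0, 3.0), (5.0, 3.5), (6.0, 4.0)],
    "b": [(1.0, 2.0), (2.0, 1.0), (3.0, 1.5)],
}
w = (1.0, 0.0)  # assumed projection direction W (a single column vector)


def mean(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(len(points[0])))


def project(w, x):
    """W^T x for a single projection vector w."""
    return sum(wk * xk for wk, xk in zip(w, x))


all_points = [p for pts in classes.values() for p in pts]
m = project(w, mean(all_points))  # projected total mean, m = W^T mu

between = {}
for label, pts in classes.items():
    m_i = project(w, mean(pts))  # projected class mean, m_i = W^T mu_i
    between[label] = (m_i - m) ** 2  # (m_i - m)^2 for this class

print(between)
```

A larger value means that the projected mean of that class lies further from the projected mean of all the data along the chosen direction.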

### Within-class variance

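The within-class variance (${\displaystyle S_{Wj}}$) of class ${\displaystyle j}$ sums the squared distances between the projected samples of that class and the projected class mean ${\displaystyle m_{j}}$:

${\displaystyle S_{Wj}=\sum _{x_{i}\in \omega _{j},\,j=1,...,c}(W^{T}x_{i}-m_{j})^{2}}$ [Note 2]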
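
The sum above can be sketched in a few lines of Python. As before, the toy data, the class labels and the projection vector `w` are assumptions made for illustration, with ${\displaystyle W}$ taken to be a single column vector.

```python
# Toy data: two classes of 2-D points (hypothetical values, not from the article).
classes = {
    "a": [(4.0, 3.0), (5.0, 3.5), (6.0, 4.0)],
    "b": [(1.0, 2.0), (2.0, 1.0), (3.0, 1.5)],
}
w = (1.0, 0.0)  # assumed projection direction W (a single column vector)


def mean(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(len(points[0])))


def project(w, x):
    """W^T x for a single projection vector w."""
    return sum(wk * xk for wk, xk in zip(w, x))


within = {}
for label, pts in classes.items():
    m_j = project(w, mean(pts))  # projected class mean, m_j = W^T mu_j
    # Sum of squared distances between each projected sample and m_j.
    within[label] = sum((project(w, x) - m_j) ** 2 for x in pts)

total_within = sum(within.values())  # summed over all classes
print(within, total_within)
```

A small value means that the samples of a class stay close together after projection, which is what LDA wants alongside a large between-class variance.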

### Getting the result

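LDA then looks for the transformation matrix ${\displaystyle W}$ that maximises the ratio of between-class variance to within-class variance (Fisher's criterion):

${\displaystyle {\underset {W}{\operatorname {argmax} }}\,{\frac {W^{T}S_{B}W}{W^{T}S_{W}W}}}$

This can be reformulated as an eigenvalue problem:

${\displaystyle S_{W}W=\lambda (S_{B}W)}$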
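
For the special case of two classes, the direction that maximises the ratio above has a well-known closed form, ${\displaystyle w\propto S_{W}^{-1}(\mu _{a}-\mu _{b})}$. The sketch below checks this on toy data. It rests on several assumptions that are not in the article: the data values, the class labels `a` and `b`, taking ${\displaystyle S_{W}}$ as the sum of the per-class scatter matrices in the original space, and taking ${\displaystyle S_{B}=(\mu _{a}-\mu _{b})(\mu _{a}-\mu _{b})^{T}}$ (a common two-class convention that differs from class-size-weighted definitions only by a constant factor).

```python
# Toy data: two classes of 2-D points (hypothetical values, not from the article).
classes = {
    "a": [(4.0, 3.0), (5.0, 3.5), (6.0, 4.0)],
    "b": [(1.0, 2.0), (2.0, 1.0), (3.0, 1.5)],
}


def mean(points):
    """Mean of a list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def scatter(points, mu):
    """2x2 scatter matrix: sum of (x - mu)(x - mu)^T over the points."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = (p[0] - mu[0], p[1] - mu[1])
        for r in range(2):
            for c in range(2):
                s[r][c] += d[r] * d[c]
    return s


def matvec(a, v):
    """Product of a 2x2 matrix and a 2-vector."""
    return (a[0][0] * v[0] + a[0][1] * v[1], a[1][0] * v[0] + a[1][1] * v[1])


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]


def inverse_2x2(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]


mu_a, mu_b = mean(classes["a"]), mean(classes["b"])
diff = (mu_a[0] - mu_b[0], mu_a[1] - mu_b[1])

scatter_a, scatter_b = scatter(classes["a"], mu_a), scatter(classes["b"], mu_b)
s_within = [[scatter_a[r][c] + scatter_b[r][c] for c in range(2)] for r in range(2)]
s_between = [[diff[r] * diff[c] for c in range(2)] for r in range(2)]


def fisher_ratio(w):
    """(w^T S_B w) / (w^T S_W w) for a single direction w."""
    return dot(w, matvec(s_between, w)) / dot(w, matvec(s_within, w))


# Two-class closed form: w is proportional to S_W^{-1} (mu_a - mu_b).
w_opt = matvec(inverse_2x2(s_within), diff)

# With the equation written as S_W w = lambda * S_B w, the eigenvalue that
# belongs to w_opt is the reciprocal of the Fisher ratio at w_opt.
lam = 1.0 / fisher_ratio(w_opt)

print("w =", w_opt)
print("Fisher ratio at w:", fisher_ratio(w_opt))
print("Fisher ratio along the x-axis:", fisher_ratio((1.0, 0.0)))
```

Many texts write the same eigenvalue problem the other way round, as ${\displaystyle S_{B}w=\lambda S_{W}w}$; in that form the eigenvalue equals the Fisher ratio itself, so the best direction is the eigenvector with the largest eigenvalue. For more than two classes or more dimensions, a numerical eigen-solver would normally replace the hand-written 2×2 inverse used here.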

## Applications

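| Customer | Product | | |
|---|---|---|---|
| Customer 0001 | Product A | 4 | 3 |
| Customer 0002 | Product A | 5 | 3.5 |
| Customer 0003 | Product B | 1 | 2 |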

## Notes

1. Section 2.2 of Tharwat, A., Gaber, T., Ibrahim, A., & Hassanien, A. E. (2017) explains this equation in detail. The original text reads: "${\displaystyle m_{i}}$ represents the projection of the mean of the i-th class and it is calculated as follows, ${\displaystyle m_{i}=W^{T}\mu _{i}}$, where ${\displaystyle m}$ is the projection of the total mean of all classes and it is calculated as follows, ${\displaystyle m=W^{T}\mu }$, ${\displaystyle W}$ represents the transformation matrix of LDA, ${\displaystyle \mu _{i}}$ (1 × M) represents the mean of the i-th class."
2. For the meaning of these mathematical symbols, see the articles on summation and sets.

## Glossary

• 降維 / gong3 wai4 / dimension reduction
• 類別 / leoi6 bit6 / class
• 組間變異 / zou2 gaan1 bin3 ji6 / between-class variance
• 組內變異 / zou2 noi6 bin3 ji6 / within-class variance
• 費雪標準 / fai3 syut3 biu1 zeon2 / Fisher's criterion
• 特徵值 / dak6 zing1 zik6 / eigenvalue
• 聚類分析 / zeoi6 leoi6 fan1 sik1 / cluster analysis

## References

1. R. O. Duda, P. E. Hart, and D. G. Stork. Pattern classification. John Wiley & Sons, Second Edition, 2012.
2. Tharwat, A., Gaber, T., Ibrahim, A., & Hassanien, A. E. (2017). Linear discriminant analysis: A detailed tutorial. AI Communications, 30(2), 169-190. The first sentence introduces the concept of dimensionality reduction.
3. J. Ye, R. Janardan, and Q. Li. Two-dimensional linear discriminant analysis. In Proceedings of 17th Advances in Neural Information Processing Systems (NIPS), pages 1569-1576, 2004.
4. J. Lu, K. N. Plataniotis, and A. N. Venetsanopoulos. Face recognition using LDA-based algorithms. IEEE Transactions on Neural Networks, 14(1):195-200, 2003.
5. Bilgili, B., & Ozkul, E. (2015). Brand awareness, brand personality, brand loyalty and consumer satisfaction relations in brand positioning strategies (A Torku brand sample). Journal of Global Strategic Management, 9(2), 10-20460.
6. Alkarkhi, A. F., Wasin, N. A. N. M., Alqaraghuli, A. A., Yusup, Y., Easa, A. M., & Huda, N. (2017). An investigation of food quality and oil stability indices of Muruku by cluster analysis and discriminant analysis. Int. J. Adv. Sci. Eng. Inf. Technol, 7(6), 2279-2285.
7. Fitzpatrick, A. M., Teague, W. G., Meyers, D. A., Peters, S. P., Li, X., Li, H., ... & National Institutes of Health. (2011). Heterogeneity of severe asthma in childhood: confirmation by cluster analysis of children in the National Institutes of Health/National Heart, Lung, and Blood Institute Severe Asthma Research Program. Journal of allergy and clinical immunology, 127(2), 382-389. "To determine the strongest predictors of cluster assignment, stepwise discriminant analysis of the cluster variables was performed with the Fisher method..."