Lecture 10: Robust outlier detection with L0-SVDD
Stéphane Canu
stephane.canu@litislab.eu
Sao Paulo 2014
February 28, 2014
Roadmap
1 Robust outlier detection with L0-SVDD
L0 SVDD
4 iterations of Adaptive L0 SVDD
Recall SVDD



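$$
\left\{
\begin{array}{ll}
\min\limits_{R,\,c,\,\xi} & R + C \sum_{i=1}^{n} \xi_i \\
\text{with} & \|x_i - c\|^2 \le R + \xi_i, \quad i = 1, \dots, n \\
\text{and} & \xi_i \ge 0, \quad i = 1, \dots, n
\end{array}
\right.
\qquad (1)
$$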
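As a small illustration of problem (1), the sketch below evaluates the slacks and the objective for a fixed center and a fixed R. From the constraint, R plays the role of a squared radius. The function name `svdd_objective` and the toy data are assumptions made for this example; they do not come from the lecture.

```python
def svdd_objective(X, c, R, C):
    """Return (slacks, objective) of problem (1) for a fixed center c and squared radius R."""
    slacks = []
    for x in X:
        dist2 = sum((a - b) ** 2 for a, b in zip(x, c))  # ||x_i - c||^2
        slacks.append(max(0.0, dist2 - R))               # smallest feasible xi_i
    return slacks, R + C * sum(slacks)


# Hypothetical toy data: two nearby points and one far point.
X = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
slacks, objective = svdd_objective(X, c=(1.0, 0.0), R=1.0, C=0.5)
print(slacks, objective)
```

Only the far point receives a nonzero slack here, and its size dominates the objective, which is what makes the sum-of-slacks penalty sensitive to outliers.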
Stéphane Canu (INSA Rouen - LITIS) February 28, 2014 3 / 11
SVDD + outlier
[Figure panels: C = 1/16, C = 1/8, C = 1/4, C = 1/2]
Figure: Example of SVDD solutions with different C values, m = 0 (red) and
m = 5 (magenta). The circled data points represent support vectors for both m.
The L0 norm
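$$ \|\xi\|_0 \le t $$

$$
\left\{
\begin{array}{ll}
\min\limits_{c \in \mathbb{R}^p,\, R \in \mathbb{R},\, \xi \in \mathbb{R}^n} & R + C \, \|\xi\|_0 \\
\text{with} & \|x_i - c\|^2 \le R + \xi_i \\
 & \xi_i \ge 0, \quad i = 1, \dots, n
\end{array}
\right.
$$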
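To make the contrast with the sum of slacks concrete, here is a minimal sketch that counts the nonzero slacks. The helper `l0_norm`, its tolerance `tol`, and the two slack vectors are assumptions made for this illustration.

```python
def l0_norm(xi, tol=1e-12):
    """Number of entries of xi whose magnitude exceeds tol."""
    return sum(1 for s in xi if abs(s) > tol)


# Two hypothetical slack vectors: one mild violation, one extreme violation.
xi_small = [0.0, 0.0, 0.5]
xi_large = [0.0, 0.0, 80.0]

counts = (l0_norm(xi_small), l0_norm(xi_large))  # what the L0 penalty sees
sums = (sum(xi_small), sum(xi_large))            # what the penalty of problem (1) sees
print(counts, sums)
```

Both vectors have the same L0 norm, so the L0 penalty charges an outlier the same amount however far away it lies, while the sum of slacks grows with the distance.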
L0 relaxations
p norm
exponential
piecewise linear
log



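$$
\left\{
\begin{array}{ll}
\min\limits_{c \in \mathbb{R}^p,\, R \in \mathbb{R},\, \xi \in \mathbb{R}^n} & R + C \sum_{i=1}^{n} \log(\gamma + \xi_i) \\
\text{with} & \|x_i - c\|^2 \le R + \xi_i \\
 & \xi_i \ge 0, \quad i = 1, \dots, n.
\end{array}
\right.
$$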
DC programming
$\log(\gamma + t) = f(t) - g(t)$ with $f(t) = t$ and $g(t) = t - \log(\gamma + t)$,
both functions $f$ and $g$ being convex. The DC framework consists in
minimizing iteratively ($R$ plus a sum of) the following convex term:
$$
f(\xi) - g'(\xi^{old})\,\xi
= \xi - \left(1 - \frac{1}{\gamma + \xi^{old}}\right)\xi
= \frac{\xi}{\gamma + \xi^{old}},
$$
where $\xi_i^{old}$ denotes the solution at the previous iteration.
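The linearized term is, up to an additive constant, the tangent of the concave function $\log(\gamma + t)$ at the previous point. The sketch below checks numerically that this tangent lies above the log penalty and touches it at the previous point. The names `log_penalty` and `tangent_majorant` and the values of `gamma` and `t_old` are assumptions chosen for the illustration.

```python
import math


def log_penalty(t, gamma):
    """The log relaxation log(gamma + t)."""
    return math.log(gamma + t)


def tangent_majorant(t, t_old, gamma):
    """Tangent of log(gamma + t) at t_old; its slope is 1 / (gamma + t_old)."""
    return math.log(gamma + t_old) + (t - t_old) / (gamma + t_old)


gamma, t_old = 0.1, 2.0
grid = [0.05 * k for k in range(201)]  # t in [0, 10]
gaps = [tangent_majorant(t, t_old, gamma) - log_penalty(t, gamma) for t in grid]
print(min(gaps), tangent_majorant(t_old, t_old, gamma) - log_penalty(t_old, gamma))
```

Because the tangent is an upper bound that is tight at `t_old`, minimizing it cannot increase the log objective, which is the usual argument for the monotone behavior of such DC iterations.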
The DC idea applied to our L0 SVDD approximation consists in building a
sequence of solutions of the following adaptive SVDD:



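$$
\left\{
\begin{array}{ll}
\min\limits_{c \in \mathbb{R}^p,\, R \in \mathbb{R},\, \xi \in \mathbb{R}^n} & R + C \sum_{i=1}^{n} w_i \xi_i \\
\text{with} & \|x_i - c\|^2 \le R + \xi_i \\
 & \xi_i \ge 0, \quad i = 1, \dots, n
\end{array}
\right.
\qquad \text{with } w_i = \frac{1}{\gamma + \xi_i^{old}}.
$$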
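One way to read these weights: if the slacks stop changing, each term $w_i \xi_i = \xi_i / (\gamma + \xi_i)$ lies in [0, 1) and is close to 1 for any clearly positive slack, so the weighted sum roughly counts the outliers. The sketch below illustrates this; the helper `dc_weights` and the slack values are assumptions made for the example.

```python
def dc_weights(xi_old, gamma):
    """Weights w_i = 1 / (gamma + xi_i_old) of the adaptive SVDD."""
    return [1.0 / (gamma + s) for s in xi_old]


# Hypothetical slacks from a previous iteration: two inliers, two outliers.
xi_old = [0.0, 0.0, 2.0, 90.0]
w = dc_weights(xi_old, gamma=0.1)
weighted = [wi * s for wi, s in zip(w, xi_old)]
print(w, sum(weighted))
```

Inliers get the largest weight, 1/γ, so they are strongly discouraged from leaving the ball, while far outliers get a weight close to the inverse of their slack.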
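The KKT stationarity conditions give $c = \sum_{i=1}^{n} \alpha_i x_i$ and $\sum_{i=1}^{n} \alpha_i = 1$,
where the $\alpha_i$ are the Lagrange multipliers associated with the inequality
constraints $\|x_i - c\|^2 \le R + \xi_i$. The dual of this problem is
$$
\left\{
\begin{array}{ll}
\min\limits_{\alpha \in \mathbb{R}^n} & \alpha^\top X X^\top \alpha - \alpha^\top \mathrm{diag}(X X^\top) \\
\text{with} & \sum_{i=1}^{n} \alpha_i = 1, \quad 0 \le \alpha_i \le C w_i, \quad i = 1, \dots, n
\end{array}
\right.
\qquad (2)
$$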
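A minimal sketch of how the dual objective of (2) and the center can be evaluated for a given α, assuming the data points are stored as the rows of X. It uses the identity $\alpha^\top X X^\top \alpha = \|X^\top \alpha\|^2 = \|c\|^2$. The function name `dual_objective` and the two-point data set are assumptions made for the example.

```python
def dual_objective(X, alpha):
    """Return (value, c) with value = alpha' X X' alpha - alpha' diag(X X') and c = X' alpha."""
    dim = len(X[0])
    c = [sum(a * x[k] for a, x in zip(alpha, X)) for k in range(dim)]  # c = sum_i alpha_i x_i
    quad = sum(ck * ck for ck in c)                                    # alpha' X X' alpha = ||c||^2
    lin = sum(a * sum(v * v for v in x) for a, x in zip(alpha, X))     # sum_i alpha_i ||x_i||^2
    return quad - lin, c


# Hypothetical data: two points, equal multipliers.
X = [[0.0, 0.0], [2.0, 0.0]]
value, c = dual_objective(X, alpha=[0.5, 0.5])
print(value, c)
```

For these two points the center is the midpoint, and the negated dual value equals the squared radius of the smallest enclosing ball, as expected when no slack is needed.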
Algorithm 1: L0 SVDD for the linear kernel
Data: X, y, C, γ
Result: R, c, ξ, α
w_i = 1, i = 1, ..., n
while not converged do
    (α, λ) ← solve_QP(X, C, w)    % solve problem (2)
    c ← X^T α
    R ← λ + c^T c
    ξ_i ← max(0, ||x_i − c||^2 − R), i = 1, ..., n
    w_i ← 1/(γ + ξ_i), i = 1, ..., n
end
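The loop above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the lecture's implementation: the slides do not specify `solve_QP`, so `solve_qp` below is a stand-in based on pairwise coordinate descent (move mass between the two most violating coordinates), and λ is recovered from the gradient at the free multipliers. The unused input y is dropped, and the stopping rule (slacks no longer changing), the tolerances, and the toy data are also assumptions.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def center(X, alpha):
    """c = X^T alpha, with the data points stored as rows of X."""
    return [sum(a * x[k] for a, x in zip(alpha, X)) for k in range(len(X[0]))]


def solve_qp(X, upper, tol=1e-10, eps=1e-12, max_iter=100000):
    """Stand-in solver for problem (2) with bounds 0 <= alpha_i <= upper_i.
    Returns (alpha, lam), lam being the multiplier of sum(alpha) = 1."""
    if sum(upper) < 1.0:
        raise ValueError("infeasible: the upper bounds C * w_i sum to less than 1")
    n, sq = len(X), [dot(x, x) for x in X]

    def gradient(alpha):
        c = center(X, alpha)
        return [2.0 * dot(x, c) - s for x, s in zip(X, sq)]

    alpha, rest = [], 1.0
    for u in upper:  # greedy feasible starting point
        alpha.append(min(u, rest))
        rest -= alpha[-1]
    for _ in range(max_iter):
        grad = gradient(alpha)
        up = [k for k in range(n) if alpha[k] < upper[k] - eps]  # may increase
        down = [k for k in range(n) if alpha[k] > eps]           # may decrease
        if not up or not down:
            break
        i = min(up, key=lambda k: grad[k])
        j = max(down, key=lambda k: grad[k])
        gap = grad[j] - grad[i]
        if gap <= tol:  # no pair can decrease the objective any more
            break
        curv = 2.0 * (sq[i] + sq[j] - 2.0 * dot(X[i], X[j]))
        step = min(upper[i] - alpha[i], alpha[j])  # largest feasible move
        if curv > 0.0:
            step = min(step, gap / curv)           # exact line search
        alpha[i] += step
        alpha[j] -= step
    grad = gradient(alpha)
    free = [k for k in range(n) if eps < alpha[k] < upper[k] - eps]
    ref = free or [k for k in range(n) if alpha[k] > eps]
    lam = min(-grad[k] for k in ref)  # stationarity: grad_k + lam = 0 at free k
    return alpha, lam


def l0_svdd(X, C, gamma, max_iter=20, tol=1e-8):
    """Reweighted SVDD loop of Algorithm 1 (linear kernel)."""
    w, xi_old = [1.0] * len(X), None
    for _ in range(max_iter):
        alpha, lam = solve_qp(X, [C * wi for wi in w])
        c = center(X, alpha)
        R = lam + dot(c, c)
        xi = [max(0.0, sum((a - b) ** 2 for a, b in zip(x, c)) - R) for x in X]
        if xi_old is not None and max(abs(a - b) for a, b in zip(xi, xi_old)) < tol:
            break
        w = [1.0 / (gamma + s) for s in xi]
        xi_old = xi
    return R, c, xi, alpha


# Hypothetical 1-D toy data: three clustered points and one far point.
X = [[-1.0], [0.0], [1.0], [10.0]]
R, c, xi, alpha = l0_svdd(X, C=0.4, gamma=0.1)
print(R, c, xi)
```

On this toy set, a single pass (`max_iter=1`, that is the plain SVDD of problem (1)) should place the center near 3.6, pulled toward the far point, while the reweighted iterations should settle near 0.5. The point at −1 also keeps a positive slack there, so the result is better read as a stationary point of the reweighting scheme than as a guaranteed global optimum. Note that problem (2) is only feasible while the bounds C w_i sum to at least 1, which is why the sketch raises an error otherwise.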
