Widrow-Hoff Learning

Contents

ADALINE

ADALINE (ADAptive LInear NEuron)
A single-layer ADALINE
Adaline.png
Note that the activation function is the identity. With two inputs and one neuron, we can draw the separating hyperplane (the decision boundary):
separaAdaline.png
The output gives three cases: positive (corresponding to the perceptron's class 1), negative (corresponding to the perceptron's class 0), and a third case in which the output is exactly 0. The perceptron assigns this last case to class 1 (hardlims), so, following the same convention, we can assign it to the positive class in our ADALINE and thus replace a perceptron network with an ADALINE.
Paper.RNN.19.png
Paper.RNN.20.png
Paper.RNN.21.png
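The classification convention described above can be sketched in a few lines. This is only an illustration: the weights, bias and test inputs are hypothetical values chosen so that all three cases (positive, negative, exactly zero) appear.

```matlab
% Minimal sketch of using a single ADALINE as a classifier.
% W, b and P are hypothetical values, not taken from the figures.
W = [1 1];                 % weights of one neuron with two inputs
b = -1;                    % bias
P = [2 0 0.5;              % three test inputs, one per column
     2 0 0.5];
a = W*P + b;               % identity activation: the output is the net input
isPositive = (a >= 0);     % an output of exactly 0 goes to the positive class
disp([a; isPositive])
```

The decision boundary is the set of inputs where W*p + b = 0, so the third column of P lies exactly on it.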

Convergence

ConverADALINE.png

Example: Analytical solution

ejempoAdaline1.png
p1 = [1 -1 -1]';
t1 = -1;
p2 = [1 1 -1]';
t2 = 1;
R = (1/2) * p1 * p1' + (1/2) * p2 * p2'
R = 3×3
1 0 -1 0 1 0 -1 0 1
h = (1/2) * t1 * p1 + (1/2) * t2 * p2
h = 3×1
0 1 0
c = (1/2) * (-1)^2 + (1/2) * (1)^2
c = 1
syms w1 w2 w3
x = [w1 w2 w3]';
F(w1,w2,w3) = c -2*x'*h + x'*R*x % mean square error
F(w1, w2, w3) = 
(w2-1)^2 + (w1-w3)^2 % infinitely many minimizers: w2 = 1, w1 = w3
ans = 
E = gradient(F)==0
E(w1, w2, w3) = 
% Compute analytic solution of a symbolic equation
E
E(w1, w2, w3) = 
solution = solve(E,[w1,w2,w3]);
Warning: Unable to solve symbolically. Returning a numeric solution using vpasolve.
% Display symbolic solution returned by solve
displaySymSolution(solution);
solution = struct with fields:
w1: [1×1 sym] w2: [1×1 sym] w3: [1×1 sym]
where
xopt = inv(R)*h % R is singular, so there is no unique solution
Warning: Matrix is singular to working precision.
xopt = 3×1
NaN NaN NaN
eig(R)
ans = 3×1
0 1 2
tasa_estable_max = 1/max(eig(R)) % upper bound on the stable learning rate
tasa_estable_max = 0.5000
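Because R is singular here, inv(R) fails, but one of the infinitely many minimizers can still be picked out. The sketch below uses the pseudoinverse, which returns the minimum-norm solution; this is a substitute technique added for illustration, not a step from the original example.

```matlab
% Minimum-norm minimizer of F(x) = c - 2*x'*h + x'*R*x when R is singular.
R = [1 0 -1; 0 1 0; -1 0 1];   % correlation matrix from the example
h = [0; 1; 0];                 % cross-correlation vector from the example
xmin = pinv(R) * h;            % minimum-norm solution of R*x = h
grad = 2*R*xmin - 2*h;         % gradient of F at xmin, should vanish
disp(xmin')
disp(norm(grad))
```

The result satisfies w2 = 1 and w1 = w3 (with w1 = w3 = 0), consistent with the family of minimizers noted in the symbolic expression.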

Example: Numerical solution

ejempoAdaline1.png
%adaline(W0,b0,alpha, p,t,ba)
W0 = [0 0 0];
p1 = [1 -1 -1]';
t1 = -1;
p2 = [1 1 -1]';
t2 = 1;
p = [p1 p2];
t = [t1 t2];
%adaline(W0,b0,alpha, p,t,ba)
[W,b] = adaline_epoca(W0,[0],0.2,p,t)
W = 3×3
0 -0.4000 0.1600 0 0.4000 0.9600 0 0.4000 -0.1600
b = 0
[W,b] = adaline_epoca(W(:,end)',[0],0.2,p,t)
W = 3×3
0.1600 0.0160 -0.0384 0.9600 1.1040 1.0496 -0.1600 -0.0160 0.0384
b = 0
[W,b] = adaline_epoca(W(:,end)',[0],0.2,p,t)
W = 3×3
-0.0384 0.0122 0.0028 1.0496 0.9990 0.9897 0.0384 -0.0122 -0.0028
b = 0
[W,b] = adaline_epoca(W(:,end)',[0],0.2,p,t)
W = 3×3
0.0028 -0.0036 0.0009 0.9897 0.9961 1.0005 -0.0028 0.0036 -0.0009
b = 0
[W,b] = adaline_epoca(W(:,end)',[0],0.2,p,t)
W = 3×3
0.0009 0.0004 -0.0003 1.0005 1.0010 1.0003 -0.0009 -0.0004 0.0003
b = 0
[W,b] = adaline_epoca(W(:,end)',[0],0.2,p,t)
W = 3×3
-0.0003 0.0001 0.0000 1.0003 0.9999 0.9999 0.0003 -0.0001 -0.0000
b = 0
[W,b] = adaline_epoca(W(:,end)',[0],0.2,p,t) % effectively a neuron with a single nonzero weight (w2 = 1)
W = 3×3
0.0000 -0.0000 0.0000 0.9999 1.0000 1.0000 -0.0000 0.0000 -0.0000
b = 0
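The repeated calls above can also be written as one loop over epochs. The following is a self-contained sketch that inlines the LMS update instead of calling adaline_epoca; the number of epochs (50) is an arbitrary choice, and the bias is kept at zero as in the calls above.

```matlab
% Self-contained LMS sketch for the two-pattern example (no bias).
P = [1 -1 -1; 1 1 -1]';        % columns are p1 and p2
T = [-1 1];                    % targets t1 and t2
alpha = 0.2;                   % learning rate, below the bound 0.5
w = zeros(1, 3);               % initial weights
for epoch = 1:50               % number of epochs chosen arbitrarily
    for q = 1:size(P, 2)
        e = T(q) - w * P(:, q);            % error for pattern q
        w = w + 2 * alpha * e * P(:, q)';  % Widrow-Hoff (LMS) update
    end
end
disp(w)
```

Starting from zero weights, the iteration approaches [0 1 0], which is one of the minimizers w2 = 1, w1 = w3 of the analytical example.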

Adaptive Filtering

We define a new block, the tapped delay line, which represents the delays of the input signal:
delaytaped.png
An adaptive filter is:
filtro.png
It is a discrete-time LTI system whose parameters are adjusted according to the inputs currently presented, so that the resulting model can then be used on noisy inputs.
CancelacionRuido.png
filtro2in.png
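A tapped delay line followed by a linear neuron can be sketched as a matrix of delayed copies of the input multiplied by a weight row. The input sequence, the number of taps and the weights below are hypothetical values used only to show the mechanics.

```matlab
% Sketch of a tapped delay line feeding a linear neuron (no bias).
y = [1 2 3 4];                 % hypothetical input sequence y(0), ..., y(3)
nTaps = 3;                     % number of taps (assumed)
P = zeros(nTaps, numel(y));
for i = 1:nTaps
    P(i, i:end) = y(1:end-i+1);    % row i holds y(k-i+1); zero before k = 0
end
w = [2 -1 3];                  % hypothetical weights
a = w * P;                     % a(k) = sum over i of w(i) * y(k-i+1)
disp(P)
disp(a)
```

Each column of P is the input vector that the neuron sees at time k, so training the filter reduces to the same LMS rule applied to these columns.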

Example: Adaptive filter

clearvars
v = @(k) 1.2 * sin(2 * pi * k / 3);
m = @(k) 0.12 * sin(2 * pi * k / 3 + pi / 2);
k = 0:10;
figure
subplot(1,2,1)
stem(k,v(k))
subplot(1,2,2)
stem(k,m(k))
% Periodic signals (period 3): averaging over one period gives the expected value
Evto2 = mean(v(1:3).^2)
Evto2 = 0.7200
Evv1 = mean(v(1:3).*v(0:2))
Evv1 = -0.3600
R = [Evto2, Evv1; Evv1, Evto2]
R = 2×2
0.7200 -0.3600 -0.3600 0.7200
Emv = mean(m(1:3).*v(1:3)) % this is zero (up to round-off)
Emv = 1.1373e-17
Emv1 = mean(m(1:3).*v(0:2))
Emv1 = -0.0624
h = [0;Emv1]
h = 2×1
0 -0.0624
xopt = inv(R)*h
xopt = 2×1
-0.0577 -0.1155
[ved, D] = eig(R)
ved = 2×2
-0.7071 -0.7071 -0.7071 0.7071
D = 2×2
0.3600 0 0 1.0800
Esto2 = (1/0.4)*integral(@(s) s.^2,-0.2,0.2)
Esto2 = 0.0133
Em2 = mean(m(1:3).^2)
Em2 = 0.0072
c = Esto2 + Em2
c = 0.0205
syms w1 w2
x = [w1 w2]';
F(w1,w2) = vpa(c -2*x'*h + x'*R*x) % mean square error
F(w1, w2) = 
figure
fmesh(F,[-2 2 -2 2])
figure
fcontour(F,[-2 2 -2 2])
hold on
plot(xopt(1),xopt(2),'or')
%adaline(W0,b0,alpha, p,t,ba)
W0 = [0 -2];
muestras = 90;
n = 0:muestras;
p = [v(n);v(n-1)]
p = 2×91
0 1.0392 -1.0392 -0.0000 1.0392 -1.0392 -0.0000 1.0392 -1.0392 -0.0000 1.0392 -1.0392 -0.0000 1.0392 -1.0392 -0.0000 1.0392 -1.0392 -0.0000 1.0392 -1.0392 -0.0000 1.0392 -1.0392 -0.0000 1.0392 -1.0392 -0.0000 1.0392 -1.0392 -0.0000 1.0392 -1.0392 0.0000 1.0392 -1.0392 -0.0000 1.0392 -1.0392 -0.0000 1.0392 -1.0392 -0.0000 1.0392 -1.0392 0.0000 1.0392 -1.0392 -0.0000 1.0392 -1.0392 0 1.0392 -1.0392 -0.0000 1.0392 -1.0392 -0.0000 1.0392 -1.0392 -0.0000 1.0392 -1.0392 -0.0000 1.0392 -1.0392 -0.0000 1.0392 -1.0392 -0.0000 1.0392 -1.0392 -0.0000 1.0392 -1.0392 -0.0000 1.0392 -1.0392 -0.0000 1.0392 -1.0392 -0.0000 1.0392 -1.0392 0.0000 1.0392 -1.0392 -0.0000 1.0392 -1.0392 -0.0000 1.0392 -1.0392 -0.0000 1.0392 -1.0392 0.0000 1.0392 -1.0392 -0.0000
%rng(123) % fix the random seed for reproducibility
s = unifrnd(-0.2,0.2,1,muestras+1);
t = s + m(n);
alpha = 0.1;
%adaline(W0,b0,alpha, p,t,ba)
[W,b] = adaline_epoca(W0,[0],alpha,p,t)
W = 2×92
0 0 -0.0234 -0.3411 -0.3411 -0.3085 -0.4845 -0.4845 -0.4026 -0.4326 -0.4326 -0.3531 -0.3534 -0.3534 -0.3123 -0.3243 -0.3243 -0.2873 -0.2798 -0.2798 -0.2513 -0.2865 -0.2865 -0.2179 -0.2123 -0.2123 -0.2115 -0.2474 -0.2474 -0.1800 -0.1622 -0.1622 -0.1793 -0.1738 -0.1738 -0.1769 -0.1432 -0.1432 -0.1585 -0.1639 -0.1639 -0.1247 -0.1403 -0.1403 -0.1613 -0.1134 -0.1134 -0.0988 -0.1158 -0.1158 -2.0000 -1.5672 -1.5672 -1.2495 -1.0446 -1.0446 -0.8685 -0.7179 -0.7179 -0.6879 -0.5583 -0.5583 -0.5580 -0.4699 -0.4699 -0.4579 -0.3908 -0.3908 -0.3984 -0.3470 -0.3470 -0.3118 -0.3096 -0.3096 -0.3152 -0.2791 -0.2791 -0.2432 -0.2471 -0.2471 -0.2650 -0.2406 -0.2406 -0.2460 -0.2022 -0.2022 -0.2359 -0.2035 -0.2035 -0.1981 -0.1778 -0.1778 -0.1622 -0.1636 -0.1636 -0.2114 -0.1757 -0.1757 -0.1587 -0.1417
b = 0
plot(W(1,:),W(2,:))
figure
plot(n,s)
hold on
plot(n,t)
figure
plot(n,s)
salidas = sum(W(:,1:end-1).*p,1)'; % filter output a(k) = W(k)*p(k) for each sample, without building the full 91x91 product
size(salidas)
ans = 1×2
91 1
restaurada = t-salidas';
size(restaurada)
ans = 1×2
1 91
hold on
plot(n,restaurada)
mean((s(floor(muestras/2):end)-restaurada(floor(muestras/2):end)).^2)
ans = 0.0019
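As a sanity check on the analytic quantities of this example, the minimum of F can be evaluated at xopt. The sketch below recomputes R, h and c in closed form from the amplitudes 1.2 and 0.12, the phase shift pi/2 and the uniform range [-0.2, 0.2]; the closed-form expressions are my own restatement of the averages computed above.

```matlab
% Minimum mean square error for the noise-cancellation example.
Ev2  = 1.2^2 / 2;                          % E[v(k)^2]
Evv1 = Ev2 * cos(2*pi/3);                  % E[v(k) v(k-1)]
R = [Ev2 Evv1; Evv1 Ev2];
h = [0; -0.5 * 0.12 * 1.2 * sin(2*pi/3)];  % [E[m(k) v(k)]; E[m(k) v(k-1)]]
Es2 = 0.2^2 / 3;                           % power of s, uniform on [-0.2, 0.2]
c = Es2 + 0.12^2 / 2;                      % E[t^2] = E[s^2] + E[m^2]
xopt = R \ h;
Fmin = c - h' * xopt;                      % F(xopt) = c - h' * inv(R) * h
disp(xopt')
disp([Fmin Es2])
```

Under these assumptions Fmin equals the power of s: at the optimum the filter output reproduces the noise m, so the remaining error is the signal itself.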

Problem P10.1

P10_1.png
syms k y(k)
a(k) =[2 -1 3]*[y(k);y(k-1);y(k-2)]
a(k) = 
a(-1)
ans = 
i) The output is zero before k = 0
for i=0:5
[i,a(i)]
end
ans = 
ans = 
ans = 
ans = 
ans = 
ans = 
ii) Solution
y_0 = 5;
y_1 = -4;
a_0 = 2*y_0
a_0 = 10
a_1 = -y_0 + 2*y_1
a_1 = -13
a_2 = 3*y_0-y_1
a_2 = 19
a_3 = 3*y_1
a_3 = -12
iii) y(0) contributes to the output from k = 0 to k = 2
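The hand computation in part ii) can be cross-checked numerically, since the ADALINE filter with weights [2 -1 3] is a three-tap FIR filter. The sketch below uses the base MATLAB function filter; the finite sequence holds y(0) to y(5), with zeros after y(1) as in the values used above.

```matlab
% Numerical check of P10.1 ii) with an FIR filter.
w = [2 -1 3];                  % weights w11, w12, w13
y = [5 -4 0 0 0 0];            % y(0), y(1), ..., y(5)
a = filter(w, 1, y);           % a(k) = 2*y(k) - y(k-1) + 3*y(k-2)
disp(a)
```

The first four entries reproduce a_0 to a_3, and the output returns to zero from k = 4 on, which matches part iii): y(0) stops contributing after k = 2 and y(1) after k = 3.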

Problem P10.2

P10_2_1.png
P10_2_2.png
p1 = [1 1]';
p2 = [-1 -1]';
p3 = [2 2]';
P = [p1 p2 p3];
figure
plot(P(1,1:2),P(2,1:2), 'ok','MarkerSize',15, 'LineWidth',5)
hold on
plot(P(1,3),P(2,3),'ob','MarkerSize',15, 'LineWidth',5)
axis([-2 2.5 -2 2.5])
g = gca;
g.XAxisLocation = 'origin';
g.YAxisLocation = 'origin';
g.Box = 'off';
i) The problem is linearly separable, so it can be solved with an ADALINE.
ii) We find the equation of a separating line, here the one through (0, 3) and (3, 0) (there are infinitely many solutions).
syms p11 p12
p12(p11) =((0-3)/(3-0))*(p11 - 0) + 3
p12(p11) = 
hold on
fplot(p12, [-0.5,3])
As a reminder, the weight vector can be any positive multiple of the one indicated; if the opposite direction is chosen, the classification flips, with positives becoming negatives and vice versa.
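The line plotted above, p12 = -p11 + 3, can be rewritten as p11 + p12 - 3 = 0, which suggests W = [1 1] and b = -3 as one possible choice. The sketch below checks that this choice (an assumption, since any positive multiple works equally well) puts the two black points on the negative side and the blue point on the positive side.

```matlab
% Check one possible weight vector and bias for P10.2 ii).
P = [1 -1 2;                   % columns are p1, p2, p3
     1 -1 2];
W = [1 1];                     % normal vector of the line p11 + p12 - 3 = 0
b = -3;
a = W*P + b;                   % negative for p1 and p2, positive for p3
disp(a)
disp(-W*P - b)                 % opposite direction: all signs flip
```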
iii) The problem is not linearly separable.
iv) The problem is not linearly separable.
p1 = [1 1]';
p2 = [1 -1]';
p3 = [1 0]';
P = [p1 p2 p3];
figure
plot(P(1,1:2),P(2,1:2), 'ok','MarkerSize',15, 'LineWidth',5)
hold on
plot(P(1,3),P(2,3),'ob','MarkerSize',15, 'LineWidth',5)
axis([-2 2.5 -2 2.5])
g = gca;
g.XAxisLocation = 'origin';
g.YAxisLocation = 'origin';
g.Box = 'off';

Problem P10.3

P10_3.png
p1 = [1 1]';
t1 = 1;
p2 = [1 -1]';
t2 = -1;
c = (1/2)*t1^2 + (1/2)*t2^2
c = 1
h = (1/2)*t1*p1 + (1/2)*t2*p2
h = 2×1
0 1
R = (1/2)*p1*p1' + (1/2)*p2*p2'
R = 2×2
1 0 0 1
syms w11 w12
x = [w11 w12]';
F(w11,w12)= c - 2 * x' * h + x' * R * x
F(w11, w12) = 
figure
fmesh(F,[-3 3 -2 4])
figure
fcontour(F,[-3 3 -2 4])
axis('square')
xopt = inv(R)*h
xopt = 2×1
0 1
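For reference, the symbolic expression for F can be expanded by hand with the values computed above (c = 1, h = [0 1]', R equal to the identity). This is a short derivation added for illustration, writing w11 and w12 as w_{1,1} and w_{1,2}:

```latex
\begin{aligned}
F(\mathbf{x}) &= c - 2\,\mathbf{x}^{T}\mathbf{h} + \mathbf{x}^{T}\mathbf{R}\,\mathbf{x} \\
              &= 1 - 2\,w_{1,2} + w_{1,1}^{2} + w_{1,2}^{2} \\
              &= w_{1,1}^{2} + \left(w_{1,2} - 1\right)^{2}
\end{aligned}
```

The minimum is F = 0 at (w_{1,1}, w_{1,2}) = (0, 1), in agreement with xopt, and the contours are circles centered at that point because both eigenvalues of R are equal.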

Problem P10.4

P10_4.png
p1 = [1 1]';
t1 = 1;
p2 = [1 -1]';
t2 = -1;
figure
plot(p1(1,1),p1(2,1), 'ok','MarkerSize',15, 'LineWidth',5)
hold on
plot(p2(1,1),p2(2,1),'ob','MarkerSize',15, 'LineWidth',5)
axis([-2 2.5 -2 2.5])
g = gca;
g.XAxisLocation = 'origin';
g.YAxisLocation = 'origin';
g.Box = 'off';
W0 = [0 0];
p = [p1 p2];
t = [t1 t2];
alpha = 0.25;
%adaline(W0,b0,alpha, p,t,ba)
[W,b] = adaline_epoca(W0,[0],alpha,p,t)
W = 2×3
0 0.5000 0 0 0.5000 1.0000
b = 0
hold on
fimplicit(@(p1,p2) W(1,2)*p1 + W(2,2)*p2)
quiver(0,0,W(1,2),W(2,2),0, 'MaxHeadSize',0.5)
fimplicit(@(p1,p2) W(1,3)*p1 + W(2,3)*p2)
quiver(0,0,W(1,3),W(2,3),0, 'MaxHeadSize',0.5)

Problem P10.5

P10_5.png
val = eig(R)
val = 2×1
1 1
1/max(val)
ans = 1

Problem P10.6

P10_6_1.png
P10_6_2.png
% c = E(t^2) = E(y^2) = C(0)
c = 3
c = 3
% R = E(zz')
R = [3 -1;-1 3]
R = 2×2
3 -1 -1 3
% h = E(tz)
h = [-1 -1]'
h = 2×1
-1 -1
syms w11 w12
x = [w11 w12]';
F(w11,w12)= c - 2 * x' * h + x' * R * x
F(w11, w12) = 
figure
fmesh(F,[-3 3 -2 4])
figure
fcontour(F,[-4 4 -4 4])
axis('square')
xopt = inv(R) * h
xopt = 2×1
-0.5000 -0.5000
[V,D] = eig(2*R)
V = 2×2
-0.7071 -0.7071 -0.7071 0.7071
D = 2×2
4 0 0 8
hold on
quiver(-0.5,-0.5,V(1,1),V(2,1),0, 'MaxHeadSize',0.5) % first eigenvector (column 1 of V)
quiver(-0.5,-0.5,V(1,2),V(2,2),0, 'MaxHeadSize',0.5) % second eigenvector (column 2 of V)
ii) The maximum stable learning rate
1 / max(eig(R))
ans = 0.2500
iii) Estimate the coefficients of the process from the problem data (link)
W0 = [0.75 0];
Mdl = arima('Constant',0,'AR',{-0.5 -0.5},'Variance',2);
rng(5)
sim = simulate(Mdl,100000);
t = sim(3:end)';
p1 =sim(2:end-1)';
p2 = sim(1:end-2)';
p = [p1;p2];
alpha = 0.00001;
%adaline(W0,b0,alpha, p,t,ba)
[W,b] = adaline_epoca(W0,[0],alpha,p,t);
figure
fcontour(F,[-1 1 -1 1])
axis('square')
hold on
plot(W(1,:),W(2,:))
plot(-0.5,-0.5,'o')
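The same coefficients can also be estimated directly from sample correlations, without the LMS iteration and without the arima object. The sketch below is an alternative added for illustration: it generates the process with the base MATLAB function filter, assuming Gaussian white noise with variance 2, and solves the sample version of R*x = h. The seed and the number of samples are arbitrary choices.

```matlab
% Estimate the AR(2) coefficients from sample correlations.
rng(5)                                   % arbitrary seed for reproducibility
N = 100000;
noise = sqrt(2) * randn(1, N);           % white noise with variance 2 (assumed Gaussian)
y = filter(1, [1 0.5 0.5], noise);       % y(k) = -0.5*y(k-1) - 0.5*y(k-2) + noise(k)
t = y(3:end);                            % target: current sample
Z = [y(2:end-1); y(1:end-2)];            % inputs: the two previous samples
Rhat = (Z * Z') / numel(t);              % sample estimate of R = E(zz')
hhat = (Z * t') / numel(t);              % sample estimate of h = E(tz)
xhat = Rhat \ hhat;                      % should be close to [-0.5; -0.5]
disp(Rhat)
disp(xhat')
```

With this many samples, Rhat should be close to [3 -1; -1 3] and xhat close to xopt, which gives a quick reference point for judging how far the LMS trajectory has progressed.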

Problem P10.8

P10_8_1.png
P10_8_2.png
C1 = [1 1;1 2];
C2 = [2 2;-1 0];
C3 = [-1 -2;2 1];
C4 = [-1 -2;-1 -2];
figure
plot(C1(1,:),C1(2,:), 'bo', 'MarkerSize',20)
hold on
plot(C2(1,:),C2(2,:), 'gs','MarkerSize',20)
plot(C3(1,:),C3(2,:), 'ro', 'MarkerSize',30, 'LineWidth',5)
plot(C4(1,:),C4(2,:), 'ks','MarkerSize',30,'LineWidth',5)
axis([-3 3 -3 3])
g = gca;
g.XAxisLocation = 'origin';
g.YAxisLocation = 'origin';
g.Box = 'off';
legend("C1","C2","C3","C4")
legend('Location','eastoutside')
% as in the cited problem, we replace 0 with -1 for the negative classes
p1 = [1;1];
t1 = [-1;-1];
p2 = [1;2];
t2 = t1;
p3 = [2;-1];
t3 = [-1;1];
p4 = [2;0];
t4 = t3;
p5 = [-1;2];
t5 = [1;-1];
p6 = [-2;1];
t6 = t5;
p7 = [-1;-1];
t7 = [1;1];
p8 = [-2;-2];
t8 = t7;
W0 = [1 0;0 1];
b0 = [1;1];
p = [p1 p2 p3 p4 p5 p6 p7 p8];
t = [t1 t2 t3 t4 t5 t6 t7 t8];
alpha = 0.001; % Hagan suggests 0.04; it did not converge for me with that value
%adaline(W0,b0,alpha, p,t,ba)
[W,b] = adaline_epoca(W0,b0,alpha,p,t,1);
W(:,end-1:end)'
ans = 2×2
0.9371 -0.0116 0.0002 0.9446
W0 = W(:,end-1:end)';
b(end-1:end)'
ans = 2×1
0.9839 0.9802
b0 = b(end-1:end)';
epocas = 1000;
for i = 2:epocas
[W,b] = adaline_epoca(W0,b0,alpha,p,t,1);
W0 = W(:,end-1:end)';
b0 = b(end-1:end)';
end
Wf = W(:,end-1:end)'
Wf = 2×2
-0.5937 -0.0510 0.1680 -0.6659
bf = b(end-1:end)'
bf = 2×1
0.0133 0.1680
fimplicit(@(p1,p2) Wf(1,1)*p1 + Wf(1,2)*p2 + bf(1),[-3,3],'r')
fimplicit(@(p1,p2) Wf(2,1)*p1 + Wf(2,2)*p2 + bf(2),[-3,3],'k')
quiver(0,0.2,Wf(1,1),Wf(1,2),0, 'MaxHeadSize',0.5,"Color",'r','DisplayName',"_1w")
quiver(0,0.2,Wf(2,1),Wf(2,2),0, 'MaxHeadSize',0.5,"Color",'k','DisplayName',"_2w")
axis('square') % square axes
grid on
THINK ABOUT THE ANALYTICAL FORMULATION OF THIS PROBLEM
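One possible way to set up the analytical formulation is as a least-squares problem: append a constant 1 to each input so that the bias becomes an extra weight, and solve the normal equations for both neurons at once. The sketch below is an assumption about how to proceed, not necessarily the intended solution; Z and X are names introduced here.

```matlab
% Least-squares sketch for P10.8: solve X*(Z*Z') = T*Z' with X = [W b].
P = [1 1  2 2 -1 -2 -1 -2;     % input patterns p1, ..., p8 (one per column)
     1 2 -1 0  2  1 -1 -2];
T = [-1 -1 -1 -1  1  1 1 1;    % targets t1, ..., t8 (one per column)
     -1 -1  1  1 -1 -1 1 1];
Z = [P; ones(1, size(P, 2))];  % augmented inputs: the last row multiplies the bias
X = (T * Z') / (Z * Z');       % minimizer of the summed squared error
W = X(:, 1:2);
b = X(:, 3);
disp(W)
disp(b)
```

The values obtained this way are close to Wf and bf above, which suggests that the 1000 LMS epochs with alpha = 0.001 have nearly reached the minimum of the mean square error.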

Problem P10.9

P_10_9_1.png
P10_9_2.png
p1 = [1 -1 -1 -1 1 1 1 1 1 -1 -1 -1 -1 -1 -1 -1]';
t1 = 60;
p2 = [-1 -1 -1 -1 1 -1 -1 -1 1 1 1 1 1 -1 -1 -1]';
t2 = 60;
p3 = [1 1 1 1 1 -1 1 1 1 -1 1 1 -1 -1 -1 -1]';
t3 = 0;
p4 = [-1 -1 -1 -1 1 1 1 1 1 -1 1 1 1 -1 1 1]';
t4 = 0;
p5 = [1 1 1 1 1 1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1]';
t5 = -60;
p6 = [-1 -1 -1 -1 1 1 1 1 1 1 -1 -1 1 -1 -1 -1]';
t6 = -60;
p = [p1 p2 p3 p4 p5 p6];
t = [t1 t2 t3 t4 t5 t6];
alpha = 0.03;
W0 =zeros(1,16);
b0 = 0;
[W,b] = adaline_epoca(W0,b0,alpha,p,t,1);
W
W = 16×7
0 3.6000 0.2160 -1.0670 0.9512 -4.0244 2.3037 0 -3.6000 -6.9840 -8.2670 -6.2488 -11.2244 -4.8963 0 -3.6000 -6.9840 -8.2670 -6.2488 -11.2244 -4.8963 0 -3.6000 -6.9840 -8.2670 -6.2488 -11.2244 -4.8963 0 3.6000 6.9840 5.7010 3.6827 -1.2928 -7.6209 0 3.6000 0.2160 1.4990 -0.5192 -5.4947 -11.8228 0 3.6000 0.2160 -1.0670 -3.0853 1.8903 -4.4378 0 3.6000 0.2160 -1.0670 -3.0853 1.8903 -4.4378 0 3.6000 6.9840 5.7010 3.6827 -1.2928 -7.6209 0 -3.6000 -0.2160 1.0670 3.0853 8.0608 1.7327
b
b = 1×7
0 3.6000 6.9840 5.7010 3.6827 -1.2928 -7.6209
e = 0;
for i = 1:6
a=W(:,end)'*p(:,i)+b(end);
e = e + (t(i)-a)^2;
end
e
e = 1.7426e+04
sum((t-(W(:,end)'*p+b(end))).^2)
ans = 1.7426e+04
epocas = 100;
W0 = W(:,end)';
b0 = b(end)';
e = [sum(t.^2), e]; % e(1): error at the zero initial weights; e(2): error after the first epoch
for i = 3:epocas+1
[W,b] = adaline_epoca(W0,b0,alpha,p,t,1);
W0 = W(:,end)';
b0 = b(end)';
e(i) = sum((t-(W0*p+b0)).^2);
end
Wf = W(:,end)'
Wf = 1×16
23.1250 -25.0000 -25.0000 -25.0000 -5.6250 -21.2500 -7.5000 -7.5000 -5.6250 -13.7500 11.8750 11.8750 -23.1250 5.6250 -3.7500 -3.7500
bf = b(end)'
bf = -5.6250
e(end)
ans = 6.3109e-28
figure
plot(e)
xlabel("epochs")

App

Adaptive Noise Cancellation
nnd10nc
EEG Noise Cancellation
nnd10eeg
Linear Classification
nnd10lc

References

The material is taken from the book by Martin Hagan et al. (link)
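function [W,b] = adaline_epoca(W0,b0,alpha,p,t,ba)
% ADALINE_EPOCA  One epoch (one pass over the columns of p) of the LMS rule.
%   W0, b0 : initial weights and biases
%   alpha  : learning rate
%   p, t   : inputs and targets, one pattern per column
%   ba     : 1 enables the bias update (default 0: the bias stays at b0)
%   W, b   : histories of W' and b', one block appended per update
if nargin < 6
    ba = 0;
end
col = size(p,2);
W = W0';
b = b0';
for i = 1:col
    a = W0*p(:,i) + b0;               % linear output
    e = t(:,i) - a;                   % error
    Wn = W0 + 2*alpha*e*p(:,i)';      % LMS weight update
    if ba
        bn = b0 + 2*alpha*e;          % LMS bias update
        b = [b bn'];
        b0 = bn;
    end
    W = [W Wn'];
    W0 = Wn;
end
end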