Attention-based Model

黎浩然 / 13 November 2023 / Sequence Model, Machine Learning, Postgraduate

Attend to only part of the input object at each step.
In an RNN/LSTM, a larger memory implies more parameters.
In an attention-based model, increasing the memory size does not increase the number of parameters.
$z^0$ can be thought of as the initialized memory vector of an RNN.
Match can be the cosine similarity of $h^i$ and $z^j$, or a small neural network, as long as its output $\alpha^i_j$ is a scalar.

$$ \begin{equation} \alpha^i_j=\mathrm{match}(h^i,z^j) \end{equation} $$
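As a rough illustration of the equation above, here is a minimal Python sketch of the cosine-similarity variant of match. The names `cosine_match` and `attention_scores`, the toy vectors, and the choice to return 0 for a zero vector are all assumptions made for this example; the notes do not prescribe an implementation.

```python
import math
from typing import List, Sequence


def cosine_match(h: Sequence[float], z: Sequence[float]) -> float:
    """Return the cosine similarity of h and z as a single scalar."""
    dot = sum(a * b for a, b in zip(h, z))
    norm_h = math.sqrt(sum(a * a for a in h))
    norm_z = math.sqrt(sum(b * b for b in z))
    if norm_h == 0.0 or norm_z == 0.0:
        return 0.0  # assumption: treat a zero vector as "no match"
    return dot / (norm_h * norm_z)


def attention_scores(memory: List[Sequence[float]], z: Sequence[float]) -> List[float]:
    """Compute alpha^i_j = match(h^i, z^j) for every h^i in memory and one fixed z^j."""
    return [cosine_match(h, z) for h in memory]


# Hypothetical toy data: three memory vectors h^1..h^3 and an initial vector z^0.
memory = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
z0 = [1.0, 0.0]
scores = attention_scores(memory, z0)
```

This cosine version of match has no trainable parameters, so appending more vectors to `memory` only lengthens `scores`. That is one way to read the note that a larger memory does not mean more parameters. A small neural network could replace `cosine_match`, provided it still returns one scalar per pair.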
