(40 points) Backpropagation of matrix multiplication. The key operation in a transformer is computing the query-key product matrix.
Forward input: $Q \in \mathbb{R}^{n \times d}$, $K \in \mathbb{R}^{n \times d}$.
Forward output: $Z = QK^\top$, where $Z \in \mathbb{R}^{n \times n}$.
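Backward input:
$$
\frac{\partial J}{\partial Z} =
\begin{bmatrix}
\frac{\partial J}{\partial Z_{11}} & \frac{\partial J}{\partial Z_{12}} & \cdots & \frac{\partial J}{\partial Z_{1n}} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial J}{\partial Z_{n1}} & \frac{\partial J}{\partial Z_{n2}} & \cdots & \frac{\partial J}{\partial Z_{nn}}
\end{bmatrix}
$$
Find the backward outputs $\frac{\partial J}{\partial Q_{ij}}$ and $\frac{\partial J}{\partial K_{ij}}$ for $i = 1, \dots, n$ and $j = 1, \dots, d$.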
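One way to approach the question, assuming J is a scalar loss and the backward input is the full n-by-n matrix of partial derivatives with respect to Z: because Z_ik = sum_l Q_il K_kl, the chain rule gives

$$
\frac{\partial J}{\partial Q_{ij}} = \sum_{k=1}^{n} \frac{\partial J}{\partial Z_{ik}} K_{kj},
\qquad
\frac{\partial J}{\partial K_{ij}} = \sum_{k=1}^{n} \frac{\partial J}{\partial Z_{ki}} Q_{kj},
$$

which in matrix form reads $\frac{\partial J}{\partial Q} = \frac{\partial J}{\partial Z} K$ and $\frac{\partial J}{\partial K} = \left(\frac{\partial J}{\partial Z}\right)^{\top} Q$. The Python sketch below checks these two formulas against central finite differences. The helper names (`matmul`, `backward`, `loss`, `numeric_grad`, `check`) and the test loss J = sum_ij W_ij Z_ij are illustrative assumptions, not part of the original problem.

```python
import random

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]

def forward(Q, K):
    """Forward pass: Z = Q K^T, shape n x n."""
    return matmul(Q, transpose(K))

def backward(Q, K, dZ):
    """Given dZ = dJ/dZ (n x n), return (dJ/dQ, dJ/dK), both n x d."""
    dQ = matmul(dZ, K)             # dQ[i][j] = sum_k dZ[i][k] * K[k][j]
    dK = matmul(transpose(dZ), Q)  # dK[i][j] = sum_k dZ[k][i] * Q[k][j]
    return dQ, dK

def loss(Q, K, W):
    """Hypothetical scalar loss J = sum_ij W[i][j] * Z[i][j], so dJ/dZ = W."""
    Z = forward(Q, K)
    return sum(w * z for w_row, z_row in zip(W, Z) for w, z in zip(w_row, z_row))

def numeric_grad(f, M, eps=1e-6):
    """Central finite differences of f() with respect to each entry of M.

    M is perturbed in place and restored after each evaluation.
    """
    G = [[0.0] * len(M[0]) for _ in M]
    for i in range(len(M)):
        for j in range(len(M[0])):
            old = M[i][j]
            M[i][j] = old + eps
            up = f()
            M[i][j] = old - eps
            down = f()
            M[i][j] = old
            G[i][j] = (up - down) / (2 * eps)
    return G

def max_abs_diff(A, B):
    """Largest entrywise absolute difference between two matrices."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def check(n=3, d=2, seed=0):
    """Compare the analytic gradients with finite differences on random data."""
    rng = random.Random(seed)
    Q = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(n)]
    K = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(n)]
    W = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(n)]  # plays the role of dJ/dZ
    dQ, dK = backward(Q, K, W)
    num_dQ = numeric_grad(lambda: loss(Q, K, W), Q)
    num_dK = numeric_grad(lambda: loss(Q, K, W), K)
    return max_abs_diff(dQ, num_dQ), max_abs_diff(dK, num_dK)

if __name__ == "__main__":
    err_q, err_k = check()
    print("max |analytic - numeric| for dJ/dQ:", err_q)
    print("max |analytic - numeric| for dJ/dK:", err_k)
```

Because this test loss is linear in each individual entry of Q and K, the finite-difference estimates should agree with the analytic gradients up to floating-point rounding.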