I am working through Transformer_Captioning.ipynb in assignment3. After running the cell that tests MultiHeadAttention, I get incorrect results:
self_attn_output error: 0.449382070034207
masked_self_attn_output error: 1.0
attn_output error: 1.0
I even copied your MultiHeadAttention code, but I still get the same result:
self_attn_output error: 0.449382070034207
masked_self_attn_output error: 1.0
attn_output error: 1.0
I even downloaded your full assignment3 code, and it still produces the same output.
Is there anything else I missed?
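In case it helps narrow things down, this is the masked scaled dot-product attention I understand the layer should be computing. It is only a minimal sketch in my own words (the function name, shapes, and mask convention are my assumptions, not the assignment's reference code), which I used as a sanity check:

```python
import math
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, attn_mask=None):
    # q, k, v: (N, H, T, E/H); attn_mask: (T, T), 1 = attend, 0 = block
    # (shapes and mask convention are my assumption for this sketch)
    d = q.shape[-1]
    scores = torch.matmul(q, k.transpose(-2, -1)) / math.sqrt(d)
    if attn_mask is not None:
        # blocked positions get -inf so softmax gives them zero weight
        scores = scores.masked_fill(attn_mask == 0, float('-inf'))
    weights = F.softmax(scores, dim=-1)
    return torch.matmul(weights, v)

# quick shape check with random tensors and a causal mask
q = k = v = torch.randn(2, 4, 5, 8)
mask = torch.tril(torch.ones(5, 5))
out = scaled_dot_product_attention(q, k, v, mask)
print(out.shape)  # torch.Size([2, 4, 5, 8])
```

This sketch behaves as I expect on random inputs, which is why I suspect the mismatch is somewhere outside the attention math itself.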