This repository was archived by the owner on Oct 31, 2023. It is now read-only.

transformer multihead attention scaling layer error  #108

Description

@skswldndi

Hi. I think there is a problem in the transformer scaling layer.
When I run UNMT, I get an Exception error in NMT/src/modules/multihead_attention.py, line 97.

line 97 : q *= self.scaling
line 30 : self.scaling = self.head_dim ** -0.5

I could not find the cause, so I just changed my code to

line 97 : q = q / math.sqrt(self.head_dim)

and it worked.
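
For anyone checking whether this workaround changes the model's behaviour, here is a minimal sketch in plain Python comparing the two scalings. It assumes the original line 97 multiplies q by self.scaling; `head_dim = 64` and the sample values in `q` are hypothetical and not taken from the repository.

```python
import math

head_dim = 64  # hypothetical head size, chosen only for illustration

# Scaling factor as defined at line 30: head_dim raised to the power -0.5.
scaling = head_dim ** -0.5

# Stand-in for one row of the query tensor (hypothetical values).
q = [0.5, -1.0, 2.0]

# Original form: multiply by the precomputed factor.
scaled_original = [x * scaling for x in q]

# Workaround form: divide by the square root of head_dim.
scaled_workaround = [x / math.sqrt(head_dim) for x in q]

# The two forms should agree up to floating-point rounding.
matches = all(
    math.isclose(a, b, rel_tol=1e-12)
    for a, b in zip(scaled_original, scaled_workaround)
)
print(matches)
```

Since both forms produce the same values, the workaround should not change the attention scores. One difference is that the rewritten line builds a new value for q instead of updating it in place. Whether that is what avoids the exception is only a guess, since the error message is not included here.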
