Can you show how you evaluate the model's performance with 'attention_mask'?
According to this line:
https://github.com/kssteven418/LTP/blob/8ab31a623fb71c5f4f8208e878097f214484e848/src/transformers/models/ltp/modeling_ltp.py#L305C27-L305C27
the 'attention_mask' is never used outside the for loop.
So I think the attention mask is not used in the evaluation part, even though this mask should be needed for the labels.
Can you show me your evaluation process (with some tokens pruned)?
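
To make the question concrete, here is a minimal sketch of what I mean by using the mask during evaluation. It is plain Python with made-up values and is not taken from the LTP repository; `masked_accuracy` and the example batch are hypothetical, and it assumes one label per token and a 0/1 mask where 0 marks a padded or pruned position.

```python
def masked_accuracy(predictions, labels, attention_mask):
    """Accuracy over positions whose mask value is 1.

    Positions with mask value 0 (assumed here to be padded or pruned)
    are skipped and do not count toward the score.
    """
    correct = 0
    counted = 0
    for pred_row, label_row, mask_row in zip(predictions, labels, attention_mask):
        for pred, label, keep in zip(pred_row, label_row, mask_row):
            if keep:
                counted += 1
                correct += int(pred == label)
    # Guard against a batch in which every position is masked out.
    return correct / counted if counted else 0.0


# Hypothetical batch: 2 sequences of length 4.
predictions = [[1, 0, 2, 2], [0, 1, 1, 0]]
labels = [[1, 0, 1, 0], [0, 1, 0, 0]]
attention_mask = [[1, 1, 0, 0], [1, 1, 1, 0]]

# Ignoring the mask is the same as passing a mask of all ones.
all_ones = [[1] * len(row) for row in labels]

with_mask = masked_accuracy(predictions, labels, attention_mask)
without_mask = masked_accuracy(predictions, labels, all_ones)
print(with_mask, without_mask)
```

In this toy batch the two scores differ, which is why I would like to know where the mask enters your evaluation.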