I read in your paper that you apply bivariate Gumbel sampling, implemented through its generalized form, Gumbel-softmax.
Gumbel-softmax takes logits (log probabilities) as input, but your code passes the learned structure theta to it directly:
adj = gumbel_softmax(x, temperature=temp, hard=True) (line 234 of GTS/model/pytorch/model.py)
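To make the question concrete, here is a minimal sketch of my current guess. It assumes PyTorch's built-in torch.nn.functional.gumbel_softmax rather than the repository's own gumbel_softmax helper, and theta here is a hypothetical random tensor standing in for the learned structure. Since softmax is unchanged when a constant is added to every entry of a row, raw scores and their normalized log-probabilities should give the same sample under the same Gumbel noise:

```python
import torch
import torch.nn.functional as F

# Hypothetical stand-in for the learned structure theta:
# one row per node pair, two columns (edge / no edge), raw unnormalized scores.
torch.manual_seed(0)
theta = torch.randn(6, 2)

# Proper log-probabilities built from the same scores.
# Each row differs from theta only by a per-row constant.
log_probs = F.log_softmax(theta, dim=-1)


def sample(logits, seed):
    """Draw a hard Gumbel-softmax sample with reproducible Gumbel noise."""
    torch.manual_seed(seed)  # same seed and shape -> same noise in both calls
    return F.gumbel_softmax(logits, tau=0.5, hard=True)


adj_raw = sample(theta, seed=1)       # raw scores as input
adj_norm = sample(log_probs, seed=1)  # normalized log-probabilities as input

# Compare which category was selected in each row.
same_choice = torch.equal(adj_raw.argmax(dim=-1), adj_norm.argmax(dim=-1))
print(same_choice)
```

If this is right, theta would only need to be a set of unnormalized scores, not true log-probabilities. Is that the intended reading?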
Why does this work? Thank you.