Hi,
I found your work very interesting and helpful, but there seem to be two mistakes in lines 134-137 of SVGPVAE_model, where the KL term of the lower bound $\mathcal{L}^l_H$ is computed for the moving-ball experiment:
KL_term = 0.5*(K_mm_log_det - S_log_det - m +
               tf.trace(tf.matmul(K_mm_inv, A_hat)) +
               tf.reduce_sum(A_hat *
                             tf.linalg.matvec(K_mm_inv, A_hat)))
- When you compute the Mahalanobis distance, is A_hat supposed to be mu_hat? Should we also pass axis=-1 to tf.reduce_sum?
- Since tf.reduce_sum is called without the axis argument in lines 131-132, K_mm_log_det and S_log_det are both scalars. However, K_mm has shape [M, M] whereas A_hat has shape [35, M, M] (M is the number of inducing points and 35 is the number of videos). Therefore, we might need to retain the batch shape [35] for S_log_det; otherwise, the factor of 35 in front of K_mm_log_det is missing.
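For concreteness, here is a minimal NumPy sketch of the per-video KL I would expect. It assumes the variational posterior for video b is N(mu_hat[b], A_hat[b]) and the prior is N(0, K_mm), so that $\mathrm{KL}_b = \frac{1}{2}\left(\log|K_{mm}| - \log|\hat{A}_b| - m + \mathrm{tr}(K_{mm}^{-1}\hat{A}_b) + \hat{\mu}_b^\top K_{mm}^{-1}\hat{\mu}_b\right)$. The function name kl_per_video and the shape [35, M] for mu_hat are my own assumptions, and the sketch uses NumPy rather than the TensorFlow calls in the repository, so I may be misreading the intended computation.

```python
import numpy as np


def kl_per_video(mu_hat, A_hat, K_mm):
    """KL( N(mu_hat[b], A_hat[b]) || N(0, K_mm) ) for every batch element b.

    Assumed shapes (hypothetical, based on the description in this issue):
      mu_hat: [B, M], A_hat: [B, M, M], K_mm: [M, M].
    Returns an array of shape [B].
    """
    m = K_mm.shape[-1]
    K_mm_inv = np.linalg.inv(K_mm)

    # slogdet returns (sign, log|det|); keep the log-determinant only.
    K_mm_log_det = np.linalg.slogdet(K_mm)[1]   # scalar
    S_log_det = np.linalg.slogdet(A_hat)[1]     # shape [B], batch axis kept

    # tr(K_mm^{-1} A_hat[b]) for each b.
    trace_term = np.einsum("ij,bji->b", K_mm_inv, A_hat)

    # Mahalanobis term mu_hat[b]^T K_mm^{-1} mu_hat[b], using mu_hat, not A_hat.
    mahalanobis = np.einsum("bi,ij,bj->b", mu_hat, K_mm_inv, mu_hat)

    return 0.5 * (K_mm_log_det - S_log_det - m + trace_term + mahalanobis)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    B, M = 35, 5
    X = rng.normal(size=(M, M))
    K_mm = X @ X.T + M * np.eye(M)                     # symmetric positive definite
    Y = rng.normal(size=(B, M, M))
    A_hat = Y @ np.swapaxes(Y, -1, -2) + np.eye(M)     # batch of SPD matrices
    mu_hat = rng.normal(size=(B, M))

    kl = kl_per_video(mu_hat, A_hat, K_mm)
    print(kl.shape, kl.sum())
```

Summing the returned vector over the batch counts K_mm_log_det once per video, which is where the factor of 35 comes from.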
Could you please let me know if I am wrong? Thanks.