Error with nested calls to TaylorDiff #99

@landreman

Hi, I have the following nested AD problem that I'd like to get working with TaylorDiff. Start with a vector-valued function f(params, x) of a scalar x, take a high-order derivative with respect to x, and evaluate it at a specific value of x. Then apply a reduction to the result to obtain a scalar-valued function g(params). Finally, I want to evaluate the gradient dg/dparams. Example:

using TaylorDiff

# Arbitrary vector-valued test function of (params, x):
f(params, x) = [params[1] * x^3 + params[2], params[2] * sin(x - params[1]), sqrt(x + params[2])]

function g(params)
    closure(x) = f(params, x)
    some_x = 0.7
    d3f_dx3 = TaylorDiff.derivative(closure, some_x, Val(3))
    return sum(d3f_dx3)
end

some_params = [1.3, 2.1]

@show g(some_params)  # Fine, gives 6.095380076578732
TaylorDiff.derivative(g, some_params, [1.0, 0.0], Val(1))  # Directional derivative along [1, 0], i.e. first element of the gradient; this call throws

Result, using Julia 1.11.5 and TaylorDiff v0.3.3:

ERROR: MethodError: *(::TaylorScalar{Float64, 1}, ::TaylorScalar{Float64, 3}) is ambiguous.

Candidates:
  *(a::TaylorScalar, b::Number)
    @ TaylorDiff ~/.julia/packages/TaylorDiff/qw5aY/src/primitive.jl:119
  *(a::Number, b::TaylorScalar)
    @ TaylorDiff ~/.julia/packages/TaylorDiff/qw5aY/src/primitive.jl:114

Possible fix, define
  *(::TaylorScalar, ::TaylorScalar)

Stacktrace:
 [1] f(params::Vector{TaylorScalar{Float64, 1}}, x::TaylorScalar{Float64, 3})
   @ Main ./REPL[4]:1
 [2] (::var"#closure#1"{Vector{TaylorScalar{Float64, 1}}})(x::TaylorScalar{Float64, 3})
   @ Main ./REPL[5]:2
 [3] derivatives
   @ ~/.julia/packages/TaylorDiff/qw5aY/src/derivative.jl:41 [inlined]
 [4] derivative
   @ ~/.julia/packages/TaylorDiff/qw5aY/src/derivative.jl:16 [inlined]
 [5] g(params_in::Vector{TaylorScalar{Float64, 1}})
   @ Main ./REPL[5]:4
 [6] derivatives
   @ ~/.julia/packages/TaylorDiff/qw5aY/src/derivative.jl:41 [inlined]
 [7] derivative(f::Function, x::Vector{Float64}, l::Vector{Float64}, p::Val{1})
   @ TaylorDiff ~/.julia/packages/TaylorDiff/qw5aY/src/derivative.jl:17
 [8] top-level scope
   @ REPL[8]:1
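
The ambiguity itself can be reproduced without TaylorDiff. The sketch below uses a hypothetical type `Toy{N}` (not TaylorDiff's actual `TaylorScalar`) and assumes only what the error message shows: two mixed-type `*` methods and no method that covers two Taylor-like arguments of different order.

```julia
# Hypothetical stand-in for a Taylor number of order N; NOT TaylorDiff's type.
struct Toy{N} <: Number
    value::Float64
end

# Two mixed-type methods shaped like the candidates listed in the error.
Base.:*(a::Toy{N}, b::Number) where {N} = Toy{N}(a.value * b)
Base.:*(a::Number, b::Toy{N}) where {N} = Toy{N}(a * b.value)

# Toy times a plain number: only the first method is the most specific match.
scaled = Toy{1}(2.0) * 3.0

# Toy{1} times Toy{3}: both methods match and neither is more specific,
# so dispatch fails with an ambiguity MethodError.
ambiguous = try
    Toy{1}(2.0) * Toy{3}(3.0)
    false
catch err
    err isa MethodError
end
```

In the nested call, the outer derivative turns params into order-1 Taylor numbers and the inner derivative turns x into an order-3 Taylor number, so `params[1] * x^3` hits exactly this case. What a mixed-order product should compute is a design question for the library and is not addressed by this sketch.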

Any idea how this could be made to work?

Zygote-over-TaylorDiff does work for this problem, but @btime shows that ForwardDiff-over-ForwardDiff is much faster (probably because of the overhead of reverse mode). I therefore imagine that TaylorDiff-over-TaylorDiff (or ForwardDiff-over-TaylorDiff) could be faster still, since the inner derivative is high-order. Thanks.
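
For reference, a ForwardDiff-over-ForwardDiff baseline for the same problem could look roughly like the sketch below. The helper names `d1`, `d2`, `d3`, and `g_fd` are made up for illustration; this is one plausible way to write it, not necessarily the code that was benchmarked.

```julia
using ForwardDiff

# Same test function as in the example above.
f(params, x) = [params[1] * x^3 + params[2], params[2] * sin(x - params[1]), sqrt(x + params[2])]

# Third derivative with respect to the scalar x, built by nesting
# ForwardDiff.derivative three times (illustrative helper names).
d1(params, x) = ForwardDiff.derivative(t -> f(params, t), x)
d2(params, x) = ForwardDiff.derivative(t -> d1(params, t), x)
d3(params, x) = ForwardDiff.derivative(t -> d2(params, t), x)

# Scalar reduction, evaluated at the same x = 0.7 as in g.
g_fd(params) = sum(d3(params, 0.7))

some_params = [1.3, 2.1]
value = g_fd(some_params)                      # should agree with g(some_params)
grad = ForwardDiff.gradient(g_fd, some_params) # outer gradient over the nested inner derivatives
```

For this particular f the result can be checked by hand: the summed third derivative is 6*params[1] - params[2]*cos(x - params[1]) + (3/8)*(x + params[2])^(-5/2), so the first gradient element is 6 - params[2]*sin(x - params[1]).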
