Environment
- Qiskit version: 2.1.2
- Python version: 3.11.4
- Operating system: macOS-14.6.1-arm64-arm-64bit
What is happening?
I am transpiling a dynamic circuit that includes IfElseOp instructions. Each if_else is relatively simple: one branch does nothing and the other applies two swap gates. In the reproducer below, the swaps sit on the else branch, so they run when the measured control bit is 0.
When using the default pass manager, the transpiler produces an abnormally large and inefficient circuit. As a result, running the transpiled circuit on hardware takes an extremely long time.
The original dynamic circuit is:
The transpiled output is:
How can we reproduce the issue?
from qiskit import QuantumCircuit, ClassicalRegister, QuantumRegister
from qiskit.transpiler.preset_passmanagers import generate_preset_pass_manager
import qiskit.circuit.classical as qiskit_classical
from qiskit_ibm_runtime.fake_provider import FakeFez

backend = FakeFez()

def build_qpa_circuit(d: int = 2, N: int = 2, dynamic: bool = True) -> QuantumCircuit:
    """
    Build the QPA-style circuit for fixed d and N.
    - dynamic=True: uses mid-circuit measurement + if_test; swaps happen only on the else branch.
    - dynamic=False: no mid-circuit measurement/control flow; performs the 'else' swaps unconditionally.
    """
    qr = QuantumRegister(3 * d + 1, "q")
    if dynamic:
        cr = ClassicalRegister(1, "control")
        qc = QuantumCircuit(qr, cr)
    else:
        qc = QuantumCircuit(qr)

    for _ in range(N):
        qc.reset(0)
        qc.h(0)
        for k in range(d):
            qc.cswap(0, k + 1, k + d + 1)
        qc.h(0)
        if dynamic:
            qc.measure(0, qc.cregs[0][0])
            parity_control = qiskit_classical.expr.lift(qc.cregs[0][0])
            with qc.if_test(parity_control) as _else:
                pass
            with _else:
                for k in range(d):
                    qc.swap(k + d + 1, k + 2 * d + 1)
        else:  # Static: always do the 'else' swaps (no mid-circuit measure/control)
            for k in range(d):
                qc.swap(k + d + 1, k + 2 * d + 1)

    qc.measure_all()
    return qc

qc_dynamic = build_qpa_circuit(d=2, N=2, dynamic=True)
pm = generate_preset_pass_manager(backend=backend, seed_transpiler=42)
qc_dynamic_tr = pm.run(qc_dynamic)
qc_dynamic_tr.draw('mpl', fold=-1)
What should happen?
The transpiled circuit should be significantly shorter. From the output, it seems that the pass manager handles the static portions of the circuit well, but the transpilation of the dynamic portion (highlighted in the red box) is extremely inefficient, especially given that it is only meant to perform two swap operations.
For comparison, when using the equivalent static circuit (with both else branches explicitly included), the transpiler produces a much shorter and simpler result:
qc_static = build_qpa_circuit(d=2, N=2, dynamic=False)
qc_static_tr = pm.run(qc_static)
qc_static_tr.draw('mpl', fold=-1)
Any suggestions?
We can also use the static circuit to better understand how to improve the transpilation of the dynamic case. By taking the initial layout from the static circuit and passing it as input when transpiling the dynamic circuit, the resulting layout and routing are much more efficient:
initial_layout_custom = qc_static_tr.layout.initial_virtual_layout(filter_ancillas=True)
pm_custom = generate_preset_pass_manager(backend=backend, seed_transpiler=42, initial_layout=initial_layout_custom)
qc_dynamic_tr_custom = pm_custom.run(qc_dynamic)
qc_dynamic_tr_custom.draw('mpl', fold=-1)
This suggests that the transpiler is not handling the if_else control flow in a desirable way. While the static portions of the circuit are routed well, the dynamic portions appear to be treated as a single block rather than decomposed into their components. This raises the broader question of how dynamic circuits with if_else should ideally be transpiled. Because different branches can vary greatly in complexity, it is unclear how best to optimize across cases.
During this investigation, I also encountered another unexpected behavior. When computing the 2Q depth using depth(lambda x: x.operation.num_qubits == 2), the original dynamic circuit (before transpilation) reports a 2Q depth of 0, despite containing swap gates inside the dynamic section. It is unclear whether this is intended, but it highlights the difficulty in interpreting circuit depth for dynamic circuits.
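A possible explanation for the 2Q depth of 0, sketched as a toy model in plain Python. This is not Qiskit's implementation: `Op`, `depth`, `one_round`, and the `recurse` flag here are hypothetical names used only for illustration. The sketch assumes that the depth computation treats each if_else as one opaque instruction acting on every qubit its branches touch (four qubits for d=2) and does not look inside its blocks. Under that assumption, a `num_qubits == 2` filter matches nothing in the untranspiled circuit, since cswap acts on three qubits and the if_else on four.

```python
from dataclasses import dataclass, field


@dataclass
class Op:
    """Hypothetical stand-in for a circuit instruction (not a Qiskit class)."""
    name: str
    qubits: tuple                               # qubit indices the op acts on
    blocks: list = field(default_factory=list)  # branch bodies, each a list of Op


def depth(ops, keep, recurse=False):
    """Longest chain of ops accepted by `keep` (toy model).

    With recurse=False a control-flow op is a single opaque instruction.
    With recurse=True it costs as much as its deepest branch.
    """
    level = {}  # qubit index -> depth accumulated on that wire
    for op in ops:
        start = max((level.get(q, 0) for q in op.qubits), default=0)
        if recurse and op.blocks:
            cost = max(depth(block, keep, recurse=True) for block in op.blocks)
        else:
            cost = 1 if keep(op) else 0
        for q in op.qubits:
            level[q] = start + cost
    return max(level.values(), default=0)


def one_round(d=2):
    """One round of the circuit from this issue, in the toy representation."""
    swaps = [Op("swap", (k + d + 1, k + 2 * d + 1)) for k in range(d)]
    touched = tuple(range(d + 1, 3 * d + 1))  # qubits used by the else branch
    return (
        [Op("reset", (0,)), Op("h", (0,))]
        + [Op("cswap", (0, k + 1, k + d + 1)) for k in range(d)]
        + [Op("h", (0,)), Op("measure", (0,))]
        + [Op("if_else", touched, blocks=[[], swaps])]  # true branch is empty
    )


def is_2q(op):
    return len(op.qubits) == 2


circuit = one_round() + one_round()  # N = 2
flat_depth = depth(circuit, is_2q)
expanded_depth = depth(circuit, is_2q, recurse=True)
print(flat_depth, expanded_depth)  # 0 2
```

In this model the flat count is 0 because no top-level instruction acts on exactly two qubits, while counting each if_else as its deepest branch gives one layer of swaps per round.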
To gain some insight, I converted the transpiled circuits into equivalent static circuits (expanding the else branch) and collected their depths:
Sabre (default) results:
2Q depth (everything): 143
2Q depth (only dynamic part): 114
Custom mapping results (mapping from static circuit transpilation):
2Q depth (everything): 71
2Q depth (only dynamic part): 18
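These depth comparisons reinforce the visual evidence that the current routing of dynamic circuits is highly inefficient. Reusing the layout from the static transpilation roughly halves the total 2Q depth (143 to 71) and cuts the 2Q depth of the dynamic part by about a factor of six (114 to 18). Treating a dynamic circuit as its deepest static case therefore yields significantly better results than the default transpilation. While this may not be the ideal approach, it suggests that the current method is buggy or at least ineffective.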