Release 0.13.0
New features since last release
Catalyst now supports qml.specs, so users can use the qml.specs function to track the exact resources of programs compiled with qjit()! This new feature is currently only supported when using level="device". (#2033) (#2055)

This is made possible by leveraging resource-tracking capabilities using the null.qubit device under the hood, which gathers circuit information via mock execution. This makes getting exact resources from large circuits extremely performant. For example, the circuit below has 100 qubits and its device-level resources can be calculated in around 1 minute!

    from functools import partial

    import pennylane as qml

    gateset = {qml.H, qml.S, qml.CNOT, qml.T, qml.RX, qml.RY, qml.RZ}

    @qml.qjit
    @partial(qml.transforms.decompose, gate_set=gateset)
    @qml.qnode(qml.device("null.qubit", wires=100))
    def circuit():
        qml.QFT(wires=range(100))
        qml.Hadamard(wires=0)
        qml.CNOT(wires=[0, 1])
        qml.OutAdder(x_wires=range(10), y_wires=range(10, 20), output_wires=range(20, 31))
        return qml.expval(qml.Z(0) @ qml.Z(1))

    circ_specs = qml.specs(circuit, level="device")()

    >>> print(circ_specs['resources'])
    num_wires: 100
    num_gates: 138134
    depth: 90142
    shots: Shots(total=None)
    gate_types:
    {'CNOT': 55313, 'RZ': 82698, 'Hadamard': 123}
    gate_sizes:
    {2: 55313, 1: 82821}
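To make the fields in the output above concrete, here is a minimal pure-Python sketch of how gate counts, gate sizes, and depth can be tallied from a flat list of operations. It does not use Catalyst or the null.qubit device. The `tally_resources` helper and the `(name, wires)` operation format are hypothetical and exist only for illustration.

```python
from collections import Counter


def tally_resources(ops, num_wires):
    """Tally gate counts, gate sizes, and depth for a list of (name, wires) pairs."""
    gate_types = Counter()
    gate_sizes = Counter()
    # wire_depth[w] is the index of the last layer that touched wire w.
    wire_depth = [0] * num_wires
    for name, wires in ops:
        gate_types[name] += 1
        gate_sizes[len(wires)] += 1
        # A gate sits one layer after the deepest wire it acts on.
        layer = max(wire_depth[w] for w in wires) + 1
        for w in wires:
            wire_depth[w] = layer
    return {
        "num_wires": num_wires,
        "num_gates": sum(gate_types.values()),
        "depth": max(wire_depth, default=0),
        "gate_types": dict(gate_types),
        "gate_sizes": dict(gate_sizes),
    }


# Hypothetical four-gate program on three wires.
ops = [("Hadamard", [0]), ("CNOT", [0, 1]), ("RZ", [1]), ("Hadamard", [2])]
resources = tally_resources(ops, num_wires=3)
print(resources)
```

The same bookkeeping explains why the totals in the output above are consistent: num_gates is the sum over gate_types, and the gate_sizes entries regroup those counts by the number of wires each gate acts on.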
Note that there are certain limitations to specs support. For example, while loops might not terminate when executing on the null.qubit device due to the quantum execution being mocked out.

The graph-based decomposition system, enabled with the global toggle
qml.decomposition.enable_graph(), is now supported with Catalyst when PennyLane program capture is enabled (qml.capture.enable()). This makes custom decomposition rules compatible with qjit() and gives access to the many decomposition rules for templates and operators in PennyLane that have been added over the past few release cycles. (#1820) (#2099) (#2091) (#2029) (#2001) (#2115)

    from functools import partial

    import pennylane as qml

    qml.decomposition.enable_graph()
    qml.capture.enable()

    @qml.register_resources({qml.H: 2, qml.CZ: 1})
    def my_cnot1(wires):
        qml.H(wires=wires[1])
        qml.CZ(wires=wires)
        qml.H(wires=wires[1])

    @qml.qjit
    @partial(
        qml.transforms.decompose,
        gate_set={"H", "CZ", "GlobalPhase"},
        alt_decomps={qml.CNOT: [my_cnot1]},
    )
    @qml.qnode(qml.device("lightning.qubit", wires=2))
    def circuit():
        qml.H(0)
        qml.CNOT(wires=[0, 1])
        return qml.state()

    >>> circuit()
    Array([0.70710678+0.j, 0.        +0.j, 0.        +0.j, 0.70710678+0.j], dtype=complex128)
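The custom rule above relies on the identity CNOT = (I ⊗ H) · CZ · (I ⊗ H), with the Hadamards acting on the target wire. As a quick sanity check, the following standalone sketch verifies that identity with plain Python lists (no PennyLane or Catalyst). The `matmul` helper is defined here only for illustration, and basis states are ordered as |q0 q1>.

```python
import math


def matmul(a, b):
    """Multiply two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


s = 1 / math.sqrt(2)

# Hadamard on wire 1 (the target), identity on wire 0: I (x) H.
H_ON_WIRE_1 = [
    [s, s, 0, 0],
    [s, -s, 0, 0],
    [0, 0, s, s],
    [0, 0, s, -s],
]
CZ = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, -1],
]
# CNOT with control on wire 0 and target on wire 1.
CNOT = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]

# Same gate order as my_cnot1: H on the target, CZ, then H on the target again.
product = matmul(H_ON_WIRE_1, matmul(CZ, H_ON_WIRE_1))

matches_cnot = all(
    abs(product[i][j] - CNOT[i][j]) < 1e-12 for i in range(4) for j in range(4)
)
print(matches_cnot)
```

Because the product equals CNOT, replacing qml.CNOT with this rule leaves the Bell state returned by the circuit unchanged.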
Similar to PennyLane’s behaviour, this feature will fall back to the old system whenever the graph cannot find decomposition rules for all unsupported operators in the program, and a UserWarning is raised.

For more information, please consult the PennyLane decomposition module.
Catalyst now supports dynamic wire allocation with qml.allocate() and qml.deallocate() when program capture is enabled, unlocking qjit-able applications like decompositions of gates that require temporary auxiliary wires and logical patterns in subroutines that benefit from having dynamic wire management. (#2002) (#2075)

Two new functions, qml.allocate() and qml.deallocate(), have been added to PennyLane to support dynamic wire allocation. With Catalyst, these features can be accessed on lightning.qubit, lightning.kokkos, and lightning.gpu.

Dynamic wire allocation refers to the allocation of wires in the middle of a circuit, as opposed to the static allocation during device initialization. For example:

    import pennylane as qml
    from catalyst import qjit

    qml.capture.enable()

    @qjit
    @qml.qnode(qml.device("lightning.qubit", wires=2))  # 2 initial qubits
    def circuit():
        qml.X(0)  # |10>
        with qml.allocate(1) as q:  # |10> and |0>, 1 dynamically allocated qubit
            qml.X(q[0])  # |10> and |1>
            qml.CNOT(wires=[q[0], 1])  # |11> and |1>
        return qml.probs(wires=[0, 1])

    >>> print(circuit())
    [0. 0. 0. 1.]
In the above program, 2 qubits are allocated during device initialization, and 1 additional qubit is allocated inside the circuit with qml.allocate(1).

For more information on what qml.allocate() and qml.deallocate() do, please consult the PennyLane v0.43 release notes.

There are some notable differences between the behaviour of these features with qjit versus without. For details, please see the relevant sections in the Catalyst sharp bits page.

A new quantum compilation pass called
reduce_t_depth() has been added, which reduces the depth and count of non-Clifford Pauli product rotations (PPRs) in circuits. This compilation pass works by commuting non-Clifford PPRs (those requiring a T-state to implement) in adjacent layers and merging compatible ones. More details can be found in Figure 6 of A Game of Surface Codes. (#1975) (#2048) (#2085)

The impact of the reduce_t_depth() pass can be measured using ppm_specs() to compare the circuit depth before and after applying the pass. Consider the following circuit:

    import pennylane as qml
    from catalyst import qjit, measure
    from catalyst.passes import ppm_specs

    pips = [("pipe", ["enforce-runtime-invariants-pipeline"])]

    no_reduce_T = {
        "to_ppr": {},
        "commute_ppr": {},
        "merge_ppr_ppm": {},
    }

    reduce_T = {
        "to_ppr": {},
        "commute_ppr": {},
        "merge_ppr_ppm": {},
        "reduce_t_depth": {}
    }

    for pipeline in [reduce_T, no_reduce_T]:

        @qjit(pipelines=pips, target="mlir", circuit_transform_pipeline=pipeline)
        @qml.qnode(qml.device("null.qubit", wires=3))
        def circuit():
            n = 3
            for i in range(n):
                qml.H(wires=i)
                qml.S(wires=i)
                qml.CNOT(wires=[i, (i + 1) % n])
                qml.T(wires=i)
                qml.H(wires=i)
                qml.T(wires=i)

            return [measure(wires=i) for i in range(n)]

        print(ppm_specs(circuit))

    {'circuit_0': {'depth_pi8_ppr': 3, 'depth_ppm': 1, 'logical_qubits': 3, 'max_weight_pi8': 3, 'num_of_ppm': 3, 'pi8_ppr': 6}}
    {'circuit_0': {'depth_pi8_ppr': 4, 'depth_ppm': 1, 'logical_qubits': 3, 'max_weight_pi8': 3, 'num_of_ppm': 3, 'pi8_ppr': 6}}
After performing the to_ppr(), commute_ppr(), and merge_ppr_ppm() passes, the circuit has a non-Clifford PPR depth of four (depth_pi8_ppr). Subsequently applying the reduce_t_depth() pass moves PPRs around via commutation, resulting in a circuit with a smaller PPR depth of three.

Catalyst now handles more types of hybrid workflows by supporting returning classical and MCM values with the dynamic one-shot MCM method. (#2004) (#2090)
For example, the code below will generate 10 values, with an equal probability of 42 and 43 appearing.
    import pennylane as qml
    from catalyst import qjit, measure

    @qjit(autograph=True)
    @qml.qnode(qml.device("lightning.qubit", wires=1), mcm_method="one-shot", shots=10)
    def circuit():
        qml.Hadamard(wires=0)
        m = measure(0)
        if m:
            return 42, m
        else:
            return 43, m

    >>> print(circuit())
    (Array([42, 43, 42, 42, 43, 42, 42, 43, 42, 42], dtype=int64),
     Array([ True, False,  True,  True, False,  True,  True, False,  True,  True], dtype=bool))
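The shape of that result (one classical value and one MCM outcome per shot) can be mimicked with a purely classical toy model. The sketch below is not Catalyst's implementation. It assumes that a one-shot execution re-runs the whole program once per shot, and it models the measurement after the Hadamard as a fair coin. The `run_one_shot` helper is hypothetical.

```python
import random


def run_one_shot(shots, seed=0):
    """Re-run a toy 'Hadamard, measure, branch' program once per shot."""
    rng = random.Random(seed)
    values, mcms = [], []
    for _ in range(shots):
        # Measuring H|0> gives 0 or 1 with equal probability.
        m = rng.random() < 0.5
        # The classical return value depends on the branch taken in this shot.
        values.append(42 if m else 43)
        mcms.append(m)
    return values, mcms


values, mcms = run_one_shot(shots=10)
print(values)
print(mcms)
```

Each shot takes its own branch, so the two returned sequences always agree entry by entry: 42 wherever the measurement was True and 43 otherwise.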
The default mid-circuit measurement method in Catalyst has been changed from "single-branch-statistics" to "one-shot" when MCMs are present in the program, which provides a more sensible experience overall when using finite shots. (#2017) (#2019)

The main differentiator is that "one-shot" explores all branches of the decision tree when probabilistic elements are present in the program, such as mid-circuit measurements, device noise, or other sources of randomness. The cost is that simulation / device execution is repeated shots number of times.

Catalyst now provides native support for qml.SingleExcitation, qml.DoubleExcitation, and qml.PCPhase on compatible devices (e.g., Lightning simulators). This enhancement avoids unnecessary gate decomposition, leading to reduced compilation time and improved overall performance. (#1980) (#1987)
Improvements 🛠
Adjoint differentiation is used by default when executing on lightning devices, which significantly reduces gradient computation time. (#1961)
The ppm_specs() function now tracks the non-Clifford and Clifford PPR depth and the overall PPM depth. (#2014)

For example:

    import pennylane as qml
    from catalyst import qjit, measure
    from catalyst.passes import to_ppr, commute_ppr, reduce_t_depth, merge_ppr_ppm, ppm_specs

    pips = [("pipe", ["enforce-runtime-invariants-pipeline"])]

    circuit_transforms = {
        "to_ppr": {},
        "commute_ppr": {},
        "merge_ppr_ppm": {},
    }

    @qjit(pipelines=pips, target="mlir", circuit_transform_pipeline=circuit_transforms)
    @qml.qnode(qml.device("null.qubit", wires=3))
    def circuit():
        n = 3
        for i in range(n):
            qml.H(wires=i)
            qml.S(wires=i)
            qml.CNOT(wires=[i, (i + 1) % n])
            qml.T(wires=i)
            qml.H(wires=i)
            qml.T(wires=i)

        return [measure(wires=i) for i in range(n)]

    >>> print(ppm_specs(circuit))
    {'circuit_0': {'depth_pi8_ppr': 3, 'depth_ppm': 1, 'logical_qubits': 3, 'max_weight_pi8': 3, 'num_of_ppm': 3, 'pi8_ppr': 6}}
pennylane.QubitUnitary is no longer favoured in the decomposition of controlled operators when the operator is not natively supported by the device, but the device supports pennylane.QubitUnitary. Instead, conversion to pennylane.QubitUnitary only happens if the operator does not define another decomposition. The previous behaviour was the cause of performance issues when dealing with large controlled operators, as their matrix representation could be embedded as dense constant data into the program. The performance difference can span multiple orders of magnitude. (#2100)

Conditional operators, such as cond() or pennylane.cond(), now allow the target and branch functions to use arguments in their call signature. Previously, one had to supply all values via closure, but this is now done automatically under the hood. (#2096)

Improvements have been made to the catalyst.from_plxpr.from_plxpr feature set. (#1844) (#1850) (#1903) (#1896) (#1889) (#1973) (#1983) (#2041)

It now supports:

qml.adjoint and qml.ctrl operations and transforms,
operator arithmetic observables and qml.Hermitian observables,
qml.for_loop, qml.cond and qml.while_loop outside of QNodes,
qml.cond with elif branches,
dynamic-value shots and dynamically-settable shots,
and the qml.counts measurement process.
Parallelization is now considered in the IR. As part of that, Catalyst can represent parallel layers, compute depth, and optimize depth.
Two changes were made as part of this overall improvement to the IR:

A new pass, accessible with --partition-layers in the Catalyst CLI, has been added to group PPR and PPM operations into qec.layer operations based on qubit interactivity and commutativity, enabling circuit analysis and potential support for parallel execution. (#1951)

The qec.layer and qec.yield operations have been added to the QEC dialect to represent a group of QEC operations. The main use case is to analyze the depth of a circuit. Also, this is a preliminary step towards supporting parallel execution of QEC layers. (#1917)
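The idea of grouping operations into layers by commutativity can be illustrated with a small greedy sketch over Pauli words. This is not the algorithm used by --partition-layers, which works on qec.ppr and qec.ppm operations in the IR and also takes qubit interactivity into account. The sketch only assumes the standard fact that two Pauli words commute exactly when they carry different non-identity Paulis on an even number of positions. The helpers `paulis_commute` and `partition_layers` are hypothetical names.

```python
def paulis_commute(p, q):
    """Return True if two equal-length Pauli words (e.g. 'XIZ') commute."""
    # Count positions where both factors are non-identity and different.
    anticommuting = sum(1 for a, b in zip(p, q) if a != "I" and b != "I" and a != b)
    return anticommuting % 2 == 0


def partition_layers(pauli_words):
    """Greedily place each word in the earliest layer it can be commuted into."""
    layers = []
    for word in pauli_words:
        # Index of the last layer holding a word that blocks this one.
        last_blocking = -1
        for idx, layer in enumerate(layers):
            if any(not paulis_commute(word, other) for other in layer):
                last_blocking = idx
        target = last_blocking + 1
        if target == len(layers):
            layers.append([])
        layers[target].append(word)
    return layers


# Hypothetical sequence of Pauli product rotations on three qubits.
words = ["ZII", "IZI", "XII", "IIZ", "XXI"]
layers = partition_layers(words)
print(layers)
print("depth:", len(layers))
```

Every word ends up in the layer right after the last one that contains a non-commuting word, so all members of a layer commute with each other and the number of layers gives a depth estimate for the sequence.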
Utility functions for modifying an existing compilation pipeline have been added to the pipelines module. (#1941)

These functions provide a simple interface to insert passes and stages into a compilation pipeline. The available functions are insert_pass_after, insert_pass_before, insert_stage_after, and insert_stage_before. For example,

    >>> from catalyst.pipelines import insert_pass_after
    >>> pipeline = ["pass1", "pass2"]
    >>> insert_pass_after(pipeline, "new_pass", ref_pass="pass1")
    >>> pipeline
    ['pass1', 'new_pass', 'pass2']
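For readers who want to see the list manipulation spelled out, here is a minimal standalone sketch of the in-place behaviour shown above. The `sketch_` functions are hypothetical stand-ins written for this illustration, not the Catalyst implementations. In particular, how the real helpers treat a missing or repeated reference pass is not covered here.

```python
def sketch_insert_pass_after(pipeline, new_pass, ref_pass):
    """Insert new_pass right after the first occurrence of ref_pass, in place."""
    index = pipeline.index(ref_pass)  # raises ValueError if ref_pass is absent
    pipeline.insert(index + 1, new_pass)


def sketch_insert_pass_before(pipeline, new_pass, ref_pass):
    """Insert new_pass right before the first occurrence of ref_pass, in place."""
    index = pipeline.index(ref_pass)  # raises ValueError if ref_pass is absent
    pipeline.insert(index, new_pass)


pipeline = ["pass1", "pass2"]
sketch_insert_pass_after(pipeline, "new_pass", ref_pass="pass1")
sketch_insert_pass_before(pipeline, "first_pass", ref_pass="pass1")
print(pipeline)
```

Mutating the list in place mirrors the example above, where the pipeline variable is inspected after the call instead of a return value.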
A new pass called detensorize-function-boundary has been added, which removes scalar tensors across function boundaries and enables the symbol-dce pass to remove dead functions, reducing the number of instructions for compilation and thus improving performance. (#1904)

The error message for unsupported mid-circuit measurements in measurement processes when using mcm_method="single-branch-statistics" has been improved. (#2105)

Catalyst’s native control flow functions (for_loop(), while_loop() and cond()) now raise an error if used with PennyLane program capture (i.e., qml.capture.enable() is present). (#1945)

The Catalyst CLI now prints the Catalyst version when invoked with catalyst --version or quantum-opt --version. (#1922)

A runtime error is now raised when the qubits provided to a quantum gate are not distinct (i.e. overlap). (#2006)
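The distinctness condition itself is simple to state in code. The sketch below is a hypothetical Python rendering of such a check. The function name, exception type, and message are assumptions made for illustration and do not reflect how the Catalyst runtime reports the error.

```python
def check_distinct_wires(gate_name, wires):
    """Raise if the same wire appears more than once in a gate's wire list."""
    if len(set(wires)) != len(wires):
        raise ValueError(f"Gate {gate_name} received overlapping wires: {list(wires)}")


# Distinct wires pass silently.
check_distinct_wires("CNOT", [0, 1])

# Repeating a wire is rejected.
try:
    check_distinct_wires("CNOT", [0, 0])
    overlap_detected = False
except ValueError as err:
    overlap_detected = True
    print(err)
```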
The Pauli product optimization pass that commutes Clifford rotations (\(\frac{\pi}{4}\)) past non-Clifford rotations (\(\frac{\pi}{8}\)) now also supports \(\frac{\pi}{2}\) angles. (#1966)
The default value for the decompose_method parameter in the ppr_to_ppm() compilation pass is now "pauli-corrected", an improved decomposition of non-Clifford PPRs into two PPMs, instead of two PPMs and a Clifford correction. This decomposition is based on Figure 13(a) in arXiv:2211.15465. (#2043) (#2047)

In the Pauli-based compilation pipeline, identity operations (qml.Identity) are now accepted in the input program and converted to a corresponding PPR gate. Additionally, internal validation was improved across PPR/PPM passes. (#2058)

Using the keep_intermediate='pass' option now prints the whole module scope of a program to the intermediate files instead of just the pass scope. (#2051)
Breaking changes 💔
The get_ppm_specs function has been renamed to ppm_specs(). (#2031)

The shots property has been removed from OQDDevice. The number of shots for a QNode execution is now set directly on the QNode via qml.qnode(..., shots=N), or via the decorator qml.set_shots. (#1988)

The JAX version used by Catalyst has been updated to 0.6.2. (#1897)
(Device implementers only) The ReleaseAllQubits device interface function has been replaced with ReleaseQubits. (#1996)

Instead of releasing all currently active qubits, the new interface function ReleaseQubits explicitly takes in an array of qubit IDs to be released.

For devices without dynamic allocation support, this function is expected to succeed only if the ID array contains the same values as those produced by the initial AllocateQubits call; otherwise, the device is encouraged to raise an error.

(Compiler integrators only) The version of LLVM and Enzyme used by Catalyst has been updated and the mlir-hlo dependency has been replaced with stablehlo. (#1916) (#1921)

The LLVM version has been updated to commit f8cb798.
The stablehlo version has been updated to commit 69d6dae.
The Enzyme version has been updated to v0.0.186.
Deprecations 👋
Usage of the Device.shots property, along with setting device(..., shots=...), has been deprecated. Please set the shots at the QNode level with qml.qnode(..., shots=...) or using the decorator qml.set_shots. (#1952)
Bug fixes 🐛
Fixed an issue with PennyLane program capture and static argnums on the QNode where the same lowering was being used no matter if the static arguments changed. The lowering to MLIR is no longer cached if there are static argnums. (#2053)
Fixed a bug where applying a quantum transform after a QNode could produce incorrect results or errors in certain cases. This resolves issues related to transforms operating on QNodes with classical outputs and improves compatibility with measurement transforms. (#2081)
Fixed a bug with incorrect type promotion on conditional branches, which was giving inconsistent output types from qjit’d QNodes. (#1977)
Snake case keyword arguments supplied to apply_pass() are now correctly converted to the kebab case used for pass options in MLIR. (#1954)

For example:

    import catalyst
    import pennylane as qml
    from catalyst import qjit

    @qjit(target="mlir")
    @catalyst.passes.apply_pass("some-pass", "an-option", maxValue=1, multi_word_option=1)
    @qml.qnode(qml.device("null.qubit", wires=1))
    def example():
        return qml.state()
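The keyword handling described in this entry can be sketched in a few lines of plain Python. The `to_pass_options` helper is hypothetical and is not part of Catalyst. It assumes that positional strings act as boolean flags and that only underscores in keyword names are rewritten, so a camel-case name such as maxValue is left untouched.

```python
def to_pass_options(*flags, **valued_options):
    """Build a pass-option mapping, rewriting snake_case keys to kebab-case."""
    # Positional option names are treated as flags that are switched on.
    options = {flag: True for flag in flags}
    for key, value in valued_options.items():
        # Python identifiers cannot contain '-', so underscores stand in for them.
        options[key.replace("_", "-")] = value
    return options


options = to_pass_options("an-option", maxValue=1, multi_word_option=1)
print(options)
```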
The pass application instruction will look like the following in MLIR:

    %0 = transform.apply_registered_pass "some-pass" with options = {"an-option" = true, "maxValue" = 1 : i64, "multi-word-option" = 1 : i64}

Fixed incorrect handling of partitioned shots in the decomposition pass of measurements_from_samples. (#1981)

Fixed a compiler error that occurred when qml.prod was used together with other operator transforms (e.g., qml.adjoint) when Autograph was enabled. (#1910) (#2083)

A bug in the NullQubit::ReleaseQubit() method that prevented the deallocation of individual qubits on the "null.qubit" device has been fixed. (#1926)

Stacked Python decorators for built-in Catalyst passes are now applied in the correct order when PennyLane program capture is enabled. (#2027)
Various issues in the OQC device plugin have been fixed:
Fixed a mistake in the gate sequence generated by the ppr_to_ppm compilation pass when decompose_method="auto-corrected" is used. (#2043)

static_argnums is now correctly propagated when tracing the target functions of certain transformations and decorators, like the one used in the dynamic-one-shot MCM method. (#2056)

Fixed a bug where deallocating the auxiliary qubit in ppr_to_ppm with decompose_method="clifford-corrected" was deallocating the wrong auxiliary qubit. (#2039)
Internal changes ⚙️
The NullQubit device now provides the resource-tracking filename to allow for cleanup. (#1861)
The type of the number_original_arg attribute in CustomCallOp has been changed from a dense array to an integer. (#2022)

QregManager has been renamed to QubitHandler and has been extended to manage converting PLxPR wire indices into Catalyst JAXPR qubits. This is especially useful for lowering subroutines that take in qubits as arguments, like in decomposition rules. (#1820)

The error message for using a quantum subroutine that was defined outside of a QNode scope has been improved. (#1932)
The usage of qml.transforms.dynamic_one_shot.parse_native_mid_circuit_measurements in Catalyst’s dynamic_one_shot implementation was updated to use its new call signature. (#1953)

When capture is enabled with qml.capture.enable(), @qml.qjit(autograph=True) will use PennyLane’s autograph implementation instead of Catalyst’s. (#1960)

The extract_backend_info helper function for the QJITDevice no longer has a redundant capabilities argument. (#1956)

A warning is now raised when subroutines are used without PennyLane program capture enabled (qml.capture.enable()). (#1930)

Import paths for noise transforms have been updated from pennylane.transforms to pennylane.noise. (#1918) (#2020)

Conversion patterns for the single-qubit quantum.alloc_qb and quantum.dealloc_qb operations have been added for lowering to the LLVM dialect. These conversion patterns allow for execution of programs containing these operations. (#1920)

The default compilation pipeline is now available as catalyst.pipelines.default_pipeline(). The function catalyst.pipelines.get_stages() has also been removed, as it was not used and duplicated the CompileOptions.get_stages() method. (#1941)

A new built-in compilation pipeline for experimental MBQC workloads called catalyst.ftqc.mbqc_pipeline() has been added. (#1942)

The output of this function can be used directly as input to the pipelines argument of qjit(). For example:

    from catalyst.ftqc import mbqc_pipeline

    @qjit(pipelines=mbqc_pipeline())
    @qml.qnode(dev)
    def workload():
        ...
The mbqc.graph_state_prep operation has been added to the MBQC dialect. This operation prepares a graph state with arbitrary qubit connectivity, specified by an input adjacency-matrix operand, for use in MBQC workloads. (#1965)

catalyst.accelerate, catalyst.debug.callback, catalyst.pure_callback, catalyst.debug.print, and catalyst.debug.print_memref now work when PennyLane program capture is enabled with qml.capture.enable(). (#1902)

The merge rotation pass in Catalyst (merge_rotations()) now also considers qml.Rot and qml.CRot. (#1955)

Catalyst now supports array-backed registers, meaning that quantum.insert operations can be configured to allow for the insertion of a qubit into an arbitrary position within a register. (#2000)

This feature is disabled by default. To enable it, configure the pass pipeline to set the use-array-backed-registers option of the convert-quantum-to-llvm pass to true. For example:

    catalyst --tool=opt --pass-pipeline="builtin.module(convert-quantum-to-llvm{use-array-backed-registers=true})" <input file>

The NoMemoryEffect trait has been removed from the quantum.alloc operation, which allowed for supporting the dynamic wire allocation feature. (#2044)

Validation in the ppm_specs function has been improved to prevent unnecessary duplication in the pipeline configuration. (#2049)

A new compilation pass called ppr_to_mbqc() has been added to lower qec.ppr and qec.ppm instructions into MBQC-style instructions. (#2057)

This pass is part of a bottom-of-stack MBQC execution pathway, with a small separation between the PPR/PPM and MBQC layers to enable end-to-end compilation on a mocked backend.
    import pennylane as qml
    from catalyst import qjit, measure
    from catalyst.passes import ppr_to_mbqc, to_ppr

    pipeline = [("pipe", ["enforce-runtime-invariants-pipeline"])]

    @qjit(target="mlir", pipelines=pipeline)
    @ppr_to_mbqc
    @to_ppr
    @qml.qnode(qml.device("lightning.qubit", wires=2))
    def circuit():
        qml.CNOT(wires=[0, 1])
        qml.T(0)
        return measure(0)

    print(circuit.mlir_opt)

    ...
    %out_qubits = quantum.custom "Hadamard"() %2 : !quantum.bit
    %out_qubits_2:2 = quantum.custom "CNOT"() %out_qubits, %1 : !quantum.bit, !quantum.bit
    %out_qubits_3 = quantum.custom "RZ"(%cst_1) %out_qubits_2#1 : !quantum.bit
    %out_qubits_4:2 = quantum.custom "CNOT"() %out_qubits_2#0, %out_qubits_3 : !quantum.bit, !quantum.bit
    %out_qubits_5 = quantum.custom "Hadamard"() %out_qubits_4#0 : !quantum.bit
    %out_qubits_6 = quantum.custom "RZ"(%cst_0) %out_qubits_4#1 : !quantum.bit
    %out_qubits_7 = quantum.custom "Hadamard"() %out_qubits_5 : !quantum.bit
    %out_qubits_8 = quantum.custom "RZ"(%cst_0) %out_qubits_7 : !quantum.bit
    %out_qubits_9 = quantum.custom "Hadamard"() %out_qubits_8 : !quantum.bit
    %out_qubits_10 = quantum.custom "RZ"(%cst) %out_qubits_6 : !quantum.bit
    %mres, %out_qubit = quantum.measure %out_qubits_10 : i1, !quantum.bit
    ...
Note that in an MBQC gate set, the RotXZX gate cannot yet be executed on available backends.

A new JAX primitive qdealloc_qb_p is available for single qubit deallocations, which may be useful for the development of new features. (#2005)
Documentation 📝
Typos were fixed and supplemental information was added to the docstrings for ppm_compilation, to_ppr, commute_ppr, ppr_to_ppm, merge_ppr_ppm, and ppm_specs. (#2050)

The Catalyst Command Line Interface documentation incorrectly stated that the catalyst executable is available in the catalyst/bin/ directory relative to the environment’s installation directory when installed via pip. The documentation has been updated to point to the correct location, which is the bin/ directory relative to the environment’s installation directory. (#2030)

A handful of typos were fixed in the sharp bits page and transforms API. (#2046)
Links to demos were updated and corrected to point to relevant, up-to-date demos. (#2042)
Contributors ✍️
This release contains contributions from (in alphabetical order):
Ali Asadi, Joey Carter, Yushao Chen, Isaac De Vlugt, Sengthai Heng, David Ittah, Jeffrey Kam, Christina Lee, Joseph Lee, Andrija Paurevic, Justin Pickering, Ritu Thombre, Roberto Turrado, Paul Haochen Wang, Jake Zaia, Hongsheng Zheng.