{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Adjoint gradient computation with PennyLane and Amazon Braket"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Use Braket SDK Cost Tracking to estimate the cost to run this example\n",
    "from braket.tracking import Tracker\n",
    "t = Tracker().start()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This tutorial will show you how to compute gradients of free parameters in a quantum circuit using PennyLane and Amazon Braket. Check out the [example notebook](../../braket_features/Using_The_Adjoint_Gradient_Result_Type.ipynb) about the adjoint gradient method and [PennyLane's tutorial](https://pennylane.ai/qml/demos/tutorial_adjoint_diff.html) to learn the basics of the adjoint method. This tutorial builds on the [Quantum Chemistry with VQE example notebook](../3_Quantum_chemistry_with_VQE/3_Quantum_chemistry_with_VQE.ipynb) and the [PennyLane demo on adaptive circuits](https://pennylane.ai/qml/demos/tutorial_adaptive_circuits.html)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Brief introduction to the adjoint differentiation method"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Adjoint differentiation is a technique used to compute gradients of parametrized quantum circuits. It can be used when `shots=0` and is available on Amazon Braket's on-demand state vector simulator, SV1. The adjoint differentiation method allows you to compute the gradient of a circuit with `P` parameters in only 1+1 circuit executions (one forward and one backward pass, similar to backpropagation), as opposed to the parameter-shift or finite-difference methods, both of which require `2P` circuit executions for every gradient calculation. The adjoint method can lower the cost of running variational quantum workflows, especially for circuits with a large number of parameters."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<div class=\"alert alert-block alert-info\">\n",
    "<b>Note:</b> This notebook requires pennylane>=0.23.0, amazon-braket-sdk-python>=1.35.0, and amazon-braket-pennylane-plugin>=1.10.0\n",
    "</div>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Defining the problem"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Just as in the [PennyLane adaptive circuits demo](https://pennylane.ai/qml/demos/tutorial_adaptive_circuits.html), we will use PennyLane to construct a circuit which represents a VQE run on a molecular structure in order to calculate its groundstate energy. In the previous tutorial we looked at $\\mathrm{H}_2$, a relatively small and simple molecule of just two atoms and two electrons. In this tutorial, we will extend the methods of the previous tutorial to study a larger, more difficult to simulate molecule: $\\mathrm{CO}_2$, carbon dioxide. First, we import the necessary Python libraries:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pennylane as qml\n",
    "from pennylane import qchem\n",
    "from pennylane import numpy as np\n",
    "import time"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Again, we will represent the molecule we want to simulate using the `xyz` format. An `xyz` file with the structure of $\\mathrm{CO}_2$ is provided in this folder as `co2.xyz`. We'll also set the number of active electrons to simulate. Active electrons are those which can transition between energy levels through an excitation: single, double, or more. Including more active electrons will increase the number of qubits we need to simulate this molecule. In this tutorial, we will set the number of active electrons to `8` - less than the 22 total electrons present in $\\mathrm{CO}_2$. In practice, it's usually possible to get accurate results using far fewer active electrons than total electrons in the molecule, as most electrons do not participate in bonding.\n",
    "\n",
    "We'll use the `pyscf` method to build the Hamiltonian, which uses the OpenFermion-PySCF plugin to build a *non-differentiable* molecular Hamiltonian. Since we are interested in optimizing the *excitations*, and not the Hamiltonian itself, this is a reasonable choice that will lower the total number of derivatives we will need to compute, because the Hamiltonian itself will not be parametrized. For more information about this plugin, see [the PennyLane documentation](https://docs.pennylane.ai/en/stable/code/qml_qchem.html?highlight=pyscf#openfermion-pyscf-backend)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Number of qubits: 16\n"
     ]
    }
   ],
   "source": [
    "n_electrons = 8\n",
    "symbols, coordinates = qchem.read_structure('qchem/co2.xyz')\n",
    "# suppress a HDF5 warning\n",
    "import warnings\n",
    "with warnings.catch_warnings():\n",
    "    warnings.simplefilter('ignore')\n",
    "    H, qubits = qchem.molecular_hamiltonian(symbols,\n",
    "                                            coordinates,\n",
    "                                            method=\"pyscf\",\n",
    "                                            active_electrons=n_electrons,\n",
    "                                            name=\"co2\")\n",
    "print(f\"Number of qubits: {qubits}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We now define our initial ansatz, a Hartree-Fock state with all the single and double excitations."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Number of single excitations:  32\n",
      "Number of double excitations:  328\n"
     ]
    }
   ],
   "source": [
    "# Hartree-Fock state\n",
    "hf_state = qchem.hf_state(n_electrons, qubits)\n",
    "# generate single- and double-excitations\n",
    "singles, doubles = qchem.excitations(n_electrons, qubits)\n",
    "print(\"Number of single excitations: \", len(singles))\n",
    "print(\"Number of double excitations: \", len(doubles))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can immediately see that the number of excitations, especially doubles, is very large. However, many of these excitations will contribute little to the final energy we compute. We thus seek a way to \"filter\" the large number of unimportant excitations. This will not reduce the accuracy of our approach very much, but will allow us to arrive at a final energy must faster and with much less cost in compute time."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Adaptive VQE"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In general, we do not know which excitations are important (contribute a lot to the final energy) and which are not (contribute little). [Grimsley et al.](https://www.nature.com/articles/s41467-019-10988-2) developed the ADAPT-VQE algorithm, which allows us to perform this desired filtering. The steps to be followed are:\n",
    "  1. Compute derivatives with respect to all `doubles` excitations\n",
    "  2. Filter out all `doubles` with derivatives below some cutoff\n",
    "  3. Optimize the remaining `doubles` excitations\n",
    "  4. Compute derivatives with respect to all `singles` excitatations, keeping the filtered-and-optimized `doubles` fixed\n",
    "  5. Filter out all `singles` with derivatives below some cutoff\n",
    "  6. Optimize all remaining `singles` and `doubles` excitations\n",
    "  7. Compute the final energy"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Running adaptive VQE with adjoint differentiation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We'll need to set up the device to use with PennyLane. Because the qubit count is 16, this workflow is a good candidate for SV1, the Amazon Braket on-demand state vector simulator. SV1 now supports two gradient computation methods in `shots=0` (exact) mode: adjoint differentiation, available by setting `diff_method='device'`, and parameter shift, available by setting `diff_method='parameter-shift'`. As shown in [the adjoint gradient example notebook](../../braket_features/Using_The_Adjoint_Gradient_Result_Type.ipynb), the adjoint differentiation method is an execution-frugal way to compute gradients. When using `parameter-shift`, each partial derivative in the gradient requires *two* circuit executions to compute, but with the adjoint method we can compute *all* partial derivatives (and thus the entire gradient) with one circuit execution and the \"back-stepping\" procedure, which is similar in runtime. The adjoint method can deliver a quadratic speedup in the number of parameters, making it a great choice when the number of parameterized gates is large, as it is for our problem."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "# set the device and differentiation method\n",
    "device_arn = \"arn:aws:braket:::device/quantum-simulator/amazon/sv1\"\n",
    "dev = qml.device(\"braket.aws.qubit\", device_arn=device_arn, wires=qubits, shots=0)\n",
    "diff_method = 'device'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Step 1: Compute derivatives of all `doubles` excitations"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we define the circuit to implement step 1 of our algorithm above: computing derivatives of all `doubles` excitations."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "@qml.qnode(dev, diff_method=diff_method)\n",
    "def circuit_1(params, excitations):\n",
    "    qml.BasisState(hf_state, wires=H.wires)\n",
    "    for i, excitation in enumerate(excitations):\n",
    "        if len(excitation) == 4:\n",
    "            qml.DoubleExcitation(params[i], wires=excitation)\n",
    "        else:\n",
    "            qml.SingleExcitation(params[i], wires=excitation)\n",
    "    return qml.expval(H)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We initialize all the parameters as `0.0` to start and compute the derivatives of all double excitations:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<div class=\"alert alert-block alert-warning\">\n",
    "<b>Caution</b> This cell may take about 30s to run on SV1.\n",
    "</div>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Time to compute all double excitation derivatives with adjoint differentiation: 36.21157622337341\n"
     ]
    }
   ],
   "source": [
    "circuit_gradient = qml.grad(circuit_1, argnum=0)\n",
    "doubles_select = []\n",
    "params = [0.0] * len(doubles)\n",
    "\n",
    "adjoint_doubles_start = time.time()\n",
    "\n",
    "doubles_grads = circuit_gradient(params, excitations=doubles)\n",
    "\n",
    "adjoint_doubles_stop = time.time()\n",
    "adjoint_doubles_time = adjoint_doubles_stop - adjoint_doubles_start\n",
    "print(f\"Time to compute all double excitation derivatives with adjoint differentiation: {adjoint_doubles_time}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Step 2: Filter out all doubles with derivatives below some cutoff"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we can filter out all double excitations which contribute little as measured by their derivative. The choice of cutoff is arbitrary and higher cutoffs will improve performance, by removing more parameters that would have to be optimized, at the possible cost of accuracy. In this case we'll pick `1e-5` as a cutoff."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Total number of doubles 328\n",
      "Total number of selected doubles 84\n"
     ]
    }
   ],
   "source": [
    "doubles_select = [doubles[i] for i in range(len(doubles)) if abs(doubles_grads[i]) > 1.0e-5]\n",
    "print(f\"Total number of doubles {len(doubles)}\")\n",
    "print(f\"Total number of selected doubles {len(doubles_select)}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can see that the filtering procedure has dramatically lessened the number of double excitations, and thus the number of parameters we will need to optimize in later steps of the algorithm."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Step 3: Optimize the remaining `doubles` excitations"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we can optimize the parameters associated with the double excitations which do contribute significantly. We'll define fresh parameter values for this optimization procedure and run for 10 iterations with a step size of 0.4."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<div class=\"alert alert-block alert-warning\">\n",
    "<b>Caution</b> This cell may take about 5 minutes to run on SV1.\n",
    "</div>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Iteration 0 of doubles optimization.\n",
      "Iteration 1 of doubles optimization.\n",
      "Iteration 2 of doubles optimization.\n",
      "Iteration 3 of doubles optimization.\n",
      "Iteration 4 of doubles optimization.\n",
      "Iteration 5 of doubles optimization.\n",
      "Iteration 6 of doubles optimization.\n",
      "Iteration 7 of doubles optimization.\n",
      "Iteration 8 of doubles optimization.\n",
      "Iteration 9 of doubles optimization.\n"
     ]
    }
   ],
   "source": [
    "stepsize=0.4\n",
    "\n",
    "opt = qml.GradientDescentOptimizer(stepsize=stepsize)\n",
    "iterations = 10\n",
    "\n",
    "params_doubles = np.zeros(len(doubles_select), requires_grad=True)\n",
    "\n",
    "for n in range(iterations):\n",
    "    print(f\"Iteration {n} of doubles optimization.\")\n",
    "    params_doubles = opt.step(circuit_1, params_doubles, excitations=doubles_select)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Step 4: Compute derivatives with respect to all `singles` excitations, keeping the filtered-and-optimized `doubles` fixed"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "With the filtered doubles optimized, we are ready to compute the derivatives with respect to the single excitations given that the `doubles` are held fixed. We define a new circuit to use in the gradient computation:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "@qml.qnode(dev, diff_method=diff_method)\n",
    "def circuit_2(params, excitations, gates_select, params_select):\n",
    "    qml.BasisState(hf_state, wires=H.wires)\n",
    "\n",
    "    for i, gate in enumerate(gates_select):\n",
    "        if len(gate) == 4:\n",
    "            qml.DoubleExcitation(params_select[i], wires=gate)\n",
    "        elif len(gate) == 2:\n",
    "            qml.SingleExcitation(params_select[i], wires=gate)\n",
    "\n",
    "    for i, gate in enumerate(excitations):\n",
    "        if len(gate) == 4:\n",
    "            qml.DoubleExcitation(params[i], wires=gate)\n",
    "        elif len(gate) == 2:\n",
    "            qml.SingleExcitation(params[i], wires=gate)\n",
    "\n",
    "    return qml.expval(H)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here, `gates_select` and `params_select` refer to the `doubles` excitations we selected in previous steps. Again, we compute the derivatives of the cost function with respect to all single excitations using the adjoint differentiation method:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Time to compute all singles derivatives with adjoint differentiation: 25.915367364883423\n"
     ]
    }
   ],
   "source": [
    "circuit_gradient = qml.grad(circuit_2, argnum=0)\n",
    "params = [0.0] * len(singles)\n",
    "\n",
    "adjoint_singles_start = time.time()\n",
    "\n",
    "singles_grads = circuit_gradient(params, excitations=singles, gates_select=doubles_select, params_select=params_doubles)\n",
    "\n",
    "adjoint_singles_stop = time.time()\n",
    "adjoint_singles_time = adjoint_singles_stop - adjoint_singles_start\n",
    "print(f\"Time to compute all singles derivatives with adjoint differentiation: {adjoint_singles_time}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Step 5: Filter out all `singles` with derivatives below some cutoff"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "With these `grads`, we can filter out all single excitations with derivatives lower than some cutoff. We'll use `1e-5` as the cutoff again, keeping in mind that it is arbitrary."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Total number of singles 32\n",
      "Total number of selected singles 4\n"
     ]
    }
   ],
   "source": [
    "singles_select = [singles[i] for i in range(len(singles)) if abs(singles_grads[i]) > 1.0e-5]\n",
    "print(f\"Total number of singles {len(singles)}\")\n",
    "print(f\"Total number of selected singles {len(singles_select)}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Because the total number of single excitations is lower, the difference is not as dramatic as with the `doubles`. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Step 6: Optimize all remaining `singles` and `doubles` excitations"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now that we have the full set of filtered excitations, we can optimize them all together."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<div class=\"alert alert-block alert-info\">\n",
    "<b>Note</b> Because we are now optimizing <b>both</b> the single and double excitation parameters, we will use <code>circuit_1</code> to compute the cost.\n",
    "</div>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<div class=\"alert alert-block alert-warning\">\n",
    "<b>Caution</b> This cell may take about 5 minutes to run on SV1.\n",
    "</div>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Iteration 0 of full optimization.\n",
      "Iteration 1 of full optimization.\n",
      "Iteration 2 of full optimization.\n",
      "Iteration 3 of full optimization.\n",
      "Iteration 4 of full optimization.\n",
      "Iteration 5 of full optimization.\n",
      "Iteration 6 of full optimization.\n",
      "Iteration 7 of full optimization.\n",
      "Iteration 8 of full optimization.\n",
      "Iteration 9 of full optimization.\n"
     ]
    }
   ],
   "source": [
    "params = np.zeros(len(doubles_select + singles_select), requires_grad=True)\n",
    "gates_select = doubles_select + singles_select\n",
    "\n",
    "best_energy = 0.0\n",
    "for n in range(iterations):\n",
    "    print(f\"Iteration {n} of full optimization.\")\n",
    "    params, energy = opt.step_and_cost(circuit_1, params, excitations=gates_select)\n",
    "    best_energy=energy.numpy()\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Step 7: Determine the final energy"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Having completed the final round of optimization, we can print the final energy that the Adapt-VQE algorithm computed."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Best energy: -184.90548216687063\n"
     ]
    }
   ],
   "source": [
    "print(f\"Best energy: {best_energy}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Performance comparison"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Above, we mentioned that the adjoint differentiation method (accessible through the `diff_method='device'` argument to a PennyLane `qnode`) should be used in `shots=0` mode because it offers quadratic speedup over the `parameter-shift` method. However, `parameter-shift` is easily parallelizable using batch execution, which is supported by the Amazon Braket PennyLane plugin (see [the docs](https://amazon-braket-pennylane-plugin-python.readthedocs.io/en/latest/devices/braket_remote.html#enabling-the-parallel-execution-of-multiple-circuits) for more), while the adjoint method must run in a single task. So, is adjoint still worthwhile to use if we are willing to run `parameter-shift` in batch mode?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<div class=\"alert alert-block alert-warning\">\n",
    "<b>Caution</b> This section may take over an hour to run and incur significant charges if you are running on an Amazon on-demand simulator. The code which submits tasks has been commented out to prevent you from incurring unintended bills. To run the code in this section, you must first uncomment it. Otherwise, you can follow along and view the results at the end.\n",
    "</div>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For this comparison we will *not* compare the time to compute the full gradient for the circuit if there are more than 100 double excitations, as this can take a very significant amount of time. We in fact already know how long it took using the adjoint method, as we measured that above. Since each component of the gradient can be computed independently when using `parameter-shift` (this can also be done when using the adjoint method, but doing so would sacrifice the performance benefit), we can measure how long it takes to compute a fraction of the gradient and extrapolate."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<div class=\"alert alert-block alert-info\">\n",
    "<b>Note</b> The choice to compute at most 100 derivatives for the parameter shift comparison was made in the interests of minimizing runtime and billing. If you want a true like-to-like comparison, you can compute <b>all</b> derivatives of double excitations using <code>parameter-shift</code> rather than extrapolating.\n",
    "</div>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "diff_method = 'parameter-shift'\n",
    "doubles_count = min(35, len(doubles))\n",
    "doubles_ps = doubles[:doubles_count]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "First, we redefine `circuit_1` so that it uses the `parameter-shift` differentiation method:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [],
   "source": [
    "@qml.qnode(dev, diff_method=diff_method)\n",
    "def circuit_1_ps_serial(params, excitations):\n",
    "    qml.BasisState(hf_state, wires=H.wires)\n",
    "    for i, excitation in enumerate(excitations):\n",
    "        if len(excitation) == 4:\n",
    "            qml.DoubleExcitation(params[i], wires=excitation)\n",
    "        else:\n",
    "            qml.SingleExcitation(params[i], wires=excitation)\n",
    "    return qml.expval(H)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we compute the first `doubles_count` derivatives in `doubles` using `parameter-shift`, but *without* using batched execution on Amazon Braket:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Time to compute 35 double excitation derivatives using unbatched parameter shift: 3186.5143387317657\n"
     ]
    }
   ],
   "source": [
    "circuit_gradient = qml.grad(circuit_1_ps_serial, argnum=0)\n",
    "params = [0.0] * doubles_count\n",
    "\n",
    "doubles_ps_unbatched_start = time.time()\n",
    "\n",
    "unbatched_grads = circuit_gradient(params, excitations=doubles_ps)\n",
    "\n",
    "doubles_ps_unbatched_stop = time.time()\n",
    "doubles_ps_unbatched_time = doubles_ps_unbatched_stop - doubles_ps_unbatched_start\n",
    "print(f\"Time to compute {doubles_count} double excitation derivatives using unbatched parameter shift: {doubles_ps_unbatched_time}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can rerun the computation, now with batching turned on in the `qml.device`. Keep in mind that you must obey the per-region [concurrency limits](https://docs.aws.amazon.com/braket/latest/developerguide/braket-quotas.html) for Amazon on-demand simulators if you choose to set the [optional `max_parallel` argument](https://amazon-braket-pennylane-plugin-python.readthedocs.io/en/latest/devices/braket_remote.html#enabling-the-parallel-execution-of-multiple-circuits)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<div class=\"alert alert-block alert-info\">\n",
    "<b>Note</b> As discussed in the <a href=\"https://docs.aws.amazon.com/braket/latest/developerguide/braket-batching-tasks.html\">Amazon Braket documentation</a>, running a batch of tasks concurrently does <b>not</b> lower the total cost of the batch, only the wallclock time. Tasks will be executed concurrently on the on-demand simulator as opposed to one after another, as when running outside of batch mode, which lowers the time you must wait for the entire batch of tasks to complete.\n",
    "</div>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [],
   "source": [
    "dev_parallel = qml.device(\n",
    "    \"braket.aws.qubit\",\n",
    "    device_arn=\"arn:aws:braket:::device/quantum-simulator/amazon/sv1\",\n",
    "    wires=qubits,\n",
    "    shots=0,\n",
    "    parallel=True,\n",
    ")\n",
    "@qml.qnode(dev_parallel, diff_method=diff_method)\n",
    "def circuit_1_ps_parallel(params, excitations): # must redefine due to new device\n",
    "    qml.BasisState(hf_state, wires=H.wires)\n",
    "    for i, excitation in enumerate(excitations):\n",
    "        if len(excitation) == 4:\n",
    "            qml.DoubleExcitation(params[i], wires=excitation)\n",
    "        else:\n",
    "            qml.SingleExcitation(params[i], wires=excitation)\n",
    "    return qml.expval(H)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can rerun and time the computation, now with batching enabled:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Time to compute 35 double excitation derivatives using batched parameter shift: 1802.3958163261414\n"
     ]
    }
   ],
   "source": [
    "params = [0.0] * doubles_count\n",
    "circuit_gradient = qml.grad(circuit_1_ps_parallel, argnum=0)\n",
    "doubles_ps_batched_start = time.time()\n",
    "\n",
    "batched_grads = circuit_gradient(params, excitations=doubles_ps)\n",
    "\n",
    "doubles_ps_batched_stop = time.time()\n",
    "doubles_ps_batched_time = doubles_ps_batched_stop - doubles_ps_batched_start\n",
    "print(f\"Time to compute {doubles_count} double excitation derivatives using batched parameter shift: {doubles_ps_batched_time}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Extrapolated time to compute all doubles derivatives with unbatched parameter shift: 29862.191517257692\n"
     ]
    }
   ],
   "source": [
    "extrapolated_unbatched_ps_time = doubles_ps_unbatched_time * (len(doubles)/doubles_count)\n",
    "print(f\"Extrapolated time to compute all doubles derivatives with unbatched parameter shift: {extrapolated_unbatched_ps_time}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Extrapolated time to compute all doubles derivatives with batched parameter shift: 16891.023650142124\n"
     ]
    }
   ],
   "source": [
    "extrapolated_batched_ps_time = doubles_ps_batched_time * (len(doubles)/doubles_count)\n",
    "print(f\"Extrapolated time to compute all doubles derivatives with batched parameter shift: {extrapolated_batched_ps_time}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As expected, the batched execution is substantially faster. How would it compare with the walltime for computing the derivatives with the adjoint method? We can do a rough calculation and comparison:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Time to compute all doubles derivatives:\n",
      "Ratio of (extrapolated) unbatched parameter shift time to adjoint time: 824.6587039749627\n"
     ]
    }
   ],
   "source": [
    "adjoint_vs_unbatched_ps = extrapolated_unbatched_ps_time / adjoint_doubles_time\n",
    "print(f\"Time to compute all doubles derivatives:\\nRatio of (extrapolated) unbatched parameter shift time to adjoint time: {adjoint_vs_unbatched_ps}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Time to compute all doubles derivatives:\n",
      "Ratio of (extrapolated) batched parameter shift time to adjoint time: 466.453698285564\n"
     ]
    }
   ],
   "source": [
    "adjoint_vs_batched_ps = extrapolated_batched_ps_time / adjoint_doubles_time\n",
    "print(f\"Time to compute all doubles derivatives:\\nRatio of (extrapolated) batched parameter shift time to adjoint time: {adjoint_vs_batched_ps}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The adjoint differentiation method provides a substantial wallclock improvement over both batched and unbatched `parameter-shift`, even for the relatively small qubit count of this problem. Because of the quadratic scaling, we can expect the adjoint differentiation method's performance to be even more favorable for larger systems. For VQE problems like this one, this is particularly true, as the number of double excitations scales to the fourth power in the number of electrons, which is itself related to the number of qubits."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Correctness check"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Finally, just to reassure ourselves, we can check that the adjoint method and `parameter-shift` return identical results for the derivatives in `shots=0` mode."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[-0.044793067882551406, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -0.04479306788255156, 0.0, 0.0, 0.0, 0.0, -0.03548100882402628, 0.0, 0.0, -0.023288073123347845, 0.0, 0.0, 0.0, 0.0, 0.0, -0.012217382659441306, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]\n",
      "[-0.044793067882545376, -6.1928479400457334e-15, 2.6531566187835427e-15, -8.942675704972513e-16, 1.7746123057175047e-15, 5.901095647171372e-15, 1.1868639193156705e-15, -0.044793067882549185, 4.7342881085964644e-15, 8.820892689029992e-16, 1.4863490802086347e-15, 5.90638513669108e-15, -0.03548100882403037, -6.2279833078444984e-15, -2.6752696290439975e-15, -0.023288073123352977, 1.6921423417461336e-15, 2.670274324449899e-15, -2.815484849928097e-15, -2.0406466188365425e-17, -5.810940290979032e-15, -0.01221738265940523, 6.8565100157100086e-15, 5.280719959261674e-15, 6.1969905855345365e-15, -2.171061239200851e-15, 5.089076788104875e-16, -2.6180781927668725e-15, 5.3732805495666735e-15, 6.131155462712434e-16, 3.811084445308944e-15, -6.333324218826695e-15, -3.5324021440048863e-15, -4.675311186009487e-15, 5.167609105081913e-15]\n"
     ]
    }
   ],
   "source": [
    "adjoint_derivs = [d.numpy() for d in doubles_grads[:doubles_count]]\n",
    "unbatched_derivs = [d.numpy() for d in unbatched_grads[:doubles_count]]\n",
    "print(adjoint_derivs)\n",
    "print(unbatched_derivs)\n",
    "assert np.allclose(adjoint_derivs, unbatched_derivs)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[-0.044793067882545376, -6.1928479400457334e-15, 2.6531566187835427e-15, -8.942675704972513e-16, 1.7746123057175047e-15, 5.901095647171372e-15, 1.1868639193156705e-15, -0.044793067882549185, 4.7342881085964644e-15, 8.820892689029992e-16, 1.4863490802086347e-15, 5.90638513669108e-15, -0.03548100882403037, -6.2279833078444984e-15, -2.6752696290439975e-15, -0.023288073123352977, 1.6921423417461336e-15, 2.670274324449899e-15, -2.815484849928097e-15, -2.0406466188365425e-17, -5.810940290979032e-15, -0.01221738265940523, 6.8565100157100086e-15, 5.280719959261674e-15, 6.1969905855345365e-15, -2.171061239200851e-15, 5.089076788104875e-16, -2.6180781927668725e-15, 5.3732805495666735e-15, 6.131155462712434e-16, 3.811084445308944e-15, -6.333324218826695e-15, -3.5324021440048863e-15, -4.675311186009487e-15, 5.167609105081913e-15]\n"
     ]
    }
   ],
   "source": [
    "batched_derivs = [d.numpy() for d in batched_grads[:doubles_count]]\n",
    "print(batched_derivs)\n",
    "assert np.allclose(adjoint_derivs, batched_derivs)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<div class=\"alert alert-block alert-info\">\n",
    "<b>What's next?</b> Try running this workflow on different molecules, for example ammonia or carbon monoxide. Try splitting the adjoint gradient computations over multiple batches (you can compute slices of the <code>doubles</code> independently and submit the computations to run concurrently) and see if that can improve wallclock performance.\n",
    "</div>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Task Summary\n",
      "{'arn:aws:braket:::device/quantum-simulator/amazon/sv1': {'shots': 0, 'tasks': {'COMPLETED': 304}, 'execution_duration': datetime.timedelta(seconds=483, microseconds=156000), 'billed_execution_duration': datetime.timedelta(seconds=983, microseconds=345000)}}\n",
      "Note: Charges shown are estimates based on your Amazon Braket simulator and quantum processing unit (QPU) task usage. Estimated charges shown may differ from your actual charges. Estimated charges do not factor in any discounts or credits, and you may experience additional charges based on your use of other services such as Amazon Elastic Compute Cloud (Amazon EC2).\n",
      "Estimated cost to run this example: 1.229 USD\n"
     ]
    }
   ],
   "source": [
    "print(\"Task Summary\")\n",
    "print(t.quantum_tasks_statistics())\n",
    "print('Note: Charges shown are estimates based on your Amazon Braket simulator and quantum processing unit (QPU) task usage. Estimated charges shown may differ from your actual charges. Estimated charges do not factor in any discounts or credits, and you may experience additional charges based on your use of other services such as Amazon Elastic Compute Cloud (Amazon EC2).')\n",
    "print(f\"Estimated cost to run this example: {t.qpu_tasks_cost() + t.simulator_tasks_cost():.3f} USD\")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "conda_braket",
   "language": "python",
   "name": "conda_braket"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.15"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}