{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Lightweight python components" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The previous notebook showed how to build a pipeline from an existing container image. Lightweight python components do not require you to build a new container image for every code change. They are intended for fast iteration in a notebook environment.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Building a lightweight python component\n", "\n", "There are several requirements for the function:\n", "\n", "- The function should be stand-alone. It should not use any code declared outside of the function definition. Any imports should be added inside the main function. Any helper functions should also be defined inside the main function.\n", "- The function can only import packages that are available in the base image. If you need a package that is not available, try to find a container image that already includes it. (As a workaround, you can use the `subprocess` module to run `pip install` for the required package inside the function.)\n", "- If the function operates on numbers, the parameters need to have type hints. Supported types are [int, float, bool]. 
Everything else is passed as a string.\n", "- To build a component with multiple output values, use the `typing.NamedTuple` type hint syntax: `NamedTuple('MyFunctionOutputs', [('output_name_1', type), ('output_name_2', float)])`\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Import kfp library" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import kfp\n", "from kfp import components" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create a Python function to wrap your component\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#Define a Python function\n", "def add(a: float, b: float) -> float:\n", " '''Calculates the sum of two arguments'''\n", " return a + b" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Convert the function to a pipeline operation" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "add_op = components.func_to_container_op(add, base_image='python:3.6.8')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Define your pipeline as a Python function\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import kfp.dsl as dsl\n", "@dsl.pipeline(\n", " name='Calculation pipeline',\n", " description='A toy pipeline that performs arithmetic calculations.'\n", ")\n", "def calculate_sum_lightweight(\n", " a='2',\n", " b='7',\n", " c='17',\n", " d='4'\n", "):\n", " #Passing pipeline parameters as operation arguments\n", " add_task1 = add_op(a, b) #Returns a dsl.ContainerOp class instance. \n", " add_task2 = add_op(c, d) #Returns a dsl.ContainerOp class instance. 
\n", " \n", " #Passing task output references as operation arguments\n", " #For an operation with a single return value, the output reference can be accessed using `task.output` or `task.outputs['output_name']` syntax\n", " result_task = add_op(add_task1.output, add_task2.output)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Compile the pipeline\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "kfp.compiler.Compiler().compile(calculate_sum_lightweight, 'calculate-sum-pipeline-lightweight.zip')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Deploy the pipeline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Download `calculate-sum-pipeline-lightweight.zip` from your notebook server.\n", "![download-pipeline](./images/download_pipeline.jpg)\n", "\n", "Upload the generated `calculate-sum-pipeline-lightweight.zip` file through the Kubeflow Pipelines UI.\n", "\n", "Follow the [instructions](https://www.kubeflow.org/docs/gke/pipelines-tutorial/#run-the-pipeline) to run your pipeline." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.7" } }, "nbformat": 4, "nbformat_minor": 2 }