# Licensed to the Apache Software Foundation (ASF) under one # or more contributor license agreements. See the NOTICE file # distributed with this work for additional information # regarding copyright ownership. The ASF licenses this file # to you under the Apache License, Version 2.0 (the # "License"); you may not use this file except in compliance # with the License. You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, # software distributed under the License is distributed on an # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY # KIND, either express or implied. See the License for the # specific language governing permissions and limitations # under the License. """ Getting Starting using TVMC Python: a high-level API for TVM ============================================================= **Author**: `Jocelyn Shiue `_ Hi! Here we explain the scripting tool designed for the complete TVM beginner. 🙂 Before we get started let's get an example model if you don't already have one. Follow the steps to download a resnet model via the terminal: .. code-block:: python mkdir myscripts cd myscripts wget https://github.com/onnx/models/raw/main/vision/classification/resnet/model/resnet50-v2-7.onnx mv resnet50-v2-7.onnx my_model.onnx touch tvmcpythonintro.py Let's start editing the python file in your favorite text editor. """ ################################################################################ # Step 0: Imports # ~~~~~~~~~~~~~~~ # # .. code-block:: python # # from tvm.driver import tvmc # # ################################################################################ # Step 1: Load a model # ~~~~~~~~~~~~~~~~~~~~ # # Let's import our model into tvmc. This step converts a machine learning model from # a supported framework into TVM's high level graph representation language called Relay. # This is to have a unified starting point for all models in tvm. The frameworks we currently # support are: Keras, ONNX, Tensorflow, TFLite, and PyTorch. # # .. code-block:: python # # model = tvmc.load('my_model.onnx') #Step 1: Load # # If you'd like to see the Relay, you can run: # ``model.summary()`` # # All frameworks support overwriting the input shapes with a shape_dict argument. # For most frameworks this is optional, but for Pytorch this is necessary as # TVM cannot automatically search for it. # # .. code-block:: python # # #model = tvmc.load(my_model, shape_dict={'input1' : [1, 2, 3, 4], 'input2' : [1, 2, 3, 4]}) #Step 1: Load + shape_dict # # A suggested way to see the model's input/shape_dict is via `netron `_. After opening the model, # click the first node to see the name(s) and shape(s) in the inputs section. ################################################################################ # Step 2: Compile # ~~~~~~~~~~~~~~~ # # Now that our model is in Relay, our next step is to compile it to a desired # hardware to run on. We refer to this hardware as a target. This compilation process # translates the model from Relay into a lower-level language that the # target machine can understand. # # In order to compile a model a tvm.target string is required. # To learn more about tvm.targets and their options look at the `documentation `_. # Some examples include: # # 1. cuda (Nvidia GPU) # 2. llvm (CPU) # 3. llvm -mcpu=cascadelake (Intel CPU) # # .. code-block:: python # # package = tvmc.compile(model, target="llvm") #Step 2: Compile # # # The compilation step returns a package. # ################################################################################ # Step 3: Run # ~~~~~~~~~~~ # # The compiled package can now be run on the hardware target. The device # input options are: CPU, Cuda, CL, Metal, and Vulkan. # # .. code-block:: python # # result = tvmc.run(package, device="cpu") #Step 3: Run # # And you can print the results: # ``print(results)`` # ################################################################################ # Step 1.5: Tune [Optional & Recommended] # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # # Run speed can further be improved by tuning. This optional step uses # machine learning to look at each operation within a model (a function) and # tries to find a faster way to run it. We do this through a cost model, and # benchmarking possible schedules. # # The target is the same as compile. # # .. code-block:: python # # tvmc.tune(model, target="llvm") #Step 1.5: Optional Tune # # The terminal output should look like: # # .. code-block:: python # # [Task 1/13] Current/Best: 82.00/ 106.29 GFLOPS | Progress: (48/769) | 18.56 s # [Task 1/13] Current/Best: 54.47/ 113.50 GFLOPS | Progress: (240/769) | 85.36 s # ..... # # There may be UserWarnings that can be ignored. # This should make the end result faster, but it can take hours to tune. # # See the section 'Saving the Tuning Results' below. Be sure to pass the tuning # results into compile if you want the results to apply. # # .. code-block:: python # # #tvmc.compile(model, target="llvm", tuning_records = "records.log") #Step 2: Compile ################################################################################ # Save and then start the process in the terminal: # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # # .. code-block:: python # # python my_tvmc_script.py # # Note: Your fans may become very active # ################################################################################ # Example results: # ~~~~~~~~~~~~~~~~ # # .. code-block:: python # # Time elapsed for training: 18.99 s # Execution time summary: # mean (ms) max (ms) min (ms) std (ms) # 25.24 26.12 24.89 0.38 # # # Output Names: # ['output_0'] # ################################################################################ # Additional TVMC Functionalities # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # ################################################################################ # Saving the model # ~~~~~~~~~~~~~~~~ # # To make things faster for later, after loading the model (Step 1) save the Relay version. # The model will then appear where you saved it for later in the coverted syntax. # # .. code-block:: python # # model = tvmc.load('my_model.onnx') #Step 1: Load # model.save(desired_model_path) # # ################################################################################ # Saving the package # ~~~~~~~~~~~~~~~~~~ # # After the model has been compiled (Step 2) the package also is also saveable. # # .. code-block:: python # # tvmc.compile(model, target="llvm", package_path="whatever") # # new_package = tvmc.TVMCPackage(package_path="whatever") # result = tvmc.run(new_package) #Step 3: Run # # ################################################################################ # Using Autoscheduler # ~~~~~~~~~~~~~~~~~~~ # # Use the next generation of tvm to enable potentially faster run speed results. # The search space of the schedules is automatically generated unlike # previously where they needed to be hand written. (Learn more: # `1 `_, # `2 `_) # # .. code-block:: python # # tvmc.tune(model, target="llvm", enable_autoscheduler = True) # # ################################################################################ # Saving the tuning results # ~~~~~~~~~~~~~~~~~~~~~~~~~ # # The tuning results can be saved in a file for later reuse. # # Method 1: # .. code-block:: python # # log_file = "hello.json" # # # Run tuning # tvmc.tune(model, target="llvm",tuning_records=log_file) # # ... # # # Later run tuning and reuse tuning results # tvmc.tune(model, target="llvm",tuning_records=log_file) # # Method 2: # .. code-block:: python # # # Run tuning # tuning_records = tvmc.tune(model, target="llvm") # # ... # # # Later run tuning and reuse tuning results # tvmc.tune(model, target="llvm",tuning_records=tuning_records) # ################################################################################ # Tuning a more complex model: # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # # If you notice T's printing that look like ``.........T.T..T..T..T.T.T.T.T.T.`` # increase the searching time frame: # # .. code-block:: python # # tvmc.tune(model,trials=10000,timeout=10,) # ################################################################################ # Compiling a model for a remote device: # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # # A remote procedural call (RPC) is useful when you would like to compile for hardware # that is not on your local machine. The tvmc methods support this. # To set up the RPC server take a look at the 'Set up RPC Server on Device' # section in this `document `_. # # Within the TVMC Script include the following and adjust accordingly: # # .. code-block:: python # # tvmc.tune( # model, # target=target, # Compilation target as string // Device to compile for # target_host=target_host, # Host processor # hostname=host_ip_address, #The IP address of an RPC tracker, used when benchmarking remotely. # port=port_number, # The port of the RPC tracker to connect to. Defaults to 9090. # rpc_key=your_key, # The RPC tracker key of the target device. Required when rpc_tracker is provided # ) #