{ "cells": [ { "attachments": { "JupyterPCIntegration.png": { "image/png": "" } }, "cell_type": "markdown", "metadata": {}, "source": [ "# Setting up and Using a Python Client Library for the Slurm REST API\n", "\n", "This notebook describes how to set up a Python client library for the Slurm REST API. It has been tested with Slurm 20.11.7 and 20.11.8, which provide Slurm REST API v0.0.36. Additional details and a high-level overview are provided in this related blog post:\n", "https://aws.amazon.com/blogs/hpc/using-the-slurm-rest-api-to-integrate-with-distributed-architectures-on-aws/\n", "\n", "First, we will generate a Python module from the OpenAPI specification. Then we will run representative functions from the generated module, including viewing node information, submitting jobs, and viewing the job queue.\n", "\n", "This notebook is intended to follow the pcluster-athena++ notebook, which is also contained in this repository. That notebook automatically creates the infrastructure shown in the diagram below and uses custom functions from a separate helper module to interact with the REST API. This notebook instead works at a lower level, using only functions generated directly from the OpenAPI specification.\n", "\n", "\n", "![JupyterPCIntegration.png](attachment:JupyterPCIntegration.png)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Set-up\n", "\n", "We will use openapi-python-client, an OpenAPI client generator for Python, to create the client module. The source can be found at https://github.com/openapi-generators/openapi-python-client. Here, we will use pip to install it." 
] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: openapi-python-client in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (0.10.7)\n", "Requirement already satisfied: pyyaml<6.0.0,>=5.3.1 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from openapi-python-client) (5.4.1)\n", "Requirement already satisfied: importlib_metadata<5,>2 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from openapi-python-client) (3.7.0)\n", "Requirement already satisfied: attrs<22.0.0,>=21.0.0 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from openapi-python-client) (21.2.0)\n", "Requirement already satisfied: httpx<0.21.0,>=0.15.4 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from openapi-python-client) (0.20.0)\n", "Requirement already satisfied: pydantic<2.0.0,>=1.6.1 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from openapi-python-client) (1.8.2)\n", "Requirement already satisfied: typer<0.4,>=0.3 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from openapi-python-client) (0.3.2)\n", "Requirement already satisfied: typing-extensions in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from openapi-python-client) (3.10.0.2)\n", "Requirement already satisfied: isort<6.0.0,>=5.0.5 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from openapi-python-client) (5.7.0)\n", "Requirement already satisfied: python-dateutil<3.0.0,>=2.8.1 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from openapi-python-client) (2.8.1)\n", "Requirement already satisfied: shellingham<2.0.0,>=1.3.2 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from openapi-python-client) (1.4.0)\n", "Requirement already satisfied: black in 
/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from openapi-python-client) (20.8b1)\n", "Requirement already satisfied: jinja2<4.0.0,>=3.0.0 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from openapi-python-client) (3.0.2)\n", "Requirement already satisfied: autoflake<2.0,>=1.4 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from openapi-python-client) (1.4)\n", "Requirement already satisfied: pyflakes>=1.1.0 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from autoflake<2.0,>=1.4->openapi-python-client) (2.2.0)\n", "Requirement already satisfied: sniffio in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from httpx<0.21.0,>=0.15.4->openapi-python-client) (1.2.0)\n", "Requirement already satisfied: charset-normalizer in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from httpx<0.21.0,>=0.15.4->openapi-python-client) (2.0.6)\n", "Requirement already satisfied: httpcore<0.14.0,>=0.13.3 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from httpx<0.21.0,>=0.15.4->openapi-python-client) (0.13.7)\n", "Requirement already satisfied: async-generator in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from httpx<0.21.0,>=0.15.4->openapi-python-client) (1.10)\n", "Requirement already satisfied: certifi in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from httpx<0.21.0,>=0.15.4->openapi-python-client) (2021.5.30)\n", "Requirement already satisfied: rfc3986[idna2008]<2,>=1.3 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from httpx<0.21.0,>=0.15.4->openapi-python-client) (1.5.0)\n", "Requirement already satisfied: anyio==3.* in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from httpcore<0.14.0,>=0.13.3->httpx<0.21.0,>=0.15.4->openapi-python-client) (3.3.4)\n", "Requirement already satisfied: h11<0.13,>=0.11 in 
/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from httpcore<0.14.0,>=0.13.3->httpx<0.21.0,>=0.15.4->openapi-python-client) (0.12.0)\n", "Requirement already satisfied: idna>=2.8 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from anyio==3.*->httpcore<0.14.0,>=0.13.3->httpx<0.21.0,>=0.15.4->openapi-python-client) (3.1)\n", "Requirement already satisfied: dataclasses in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from anyio==3.*->httpcore<0.14.0,>=0.13.3->httpx<0.21.0,>=0.15.4->openapi-python-client) (0.8)\n", "Requirement already satisfied: contextvars>=2.1 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from sniffio->httpx<0.21.0,>=0.15.4->openapi-python-client) (2.4)\n", "Requirement already satisfied: immutables>=0.9 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from contextvars>=2.1->sniffio->httpx<0.21.0,>=0.15.4->openapi-python-client) (0.15)\n", "Requirement already satisfied: zipp>=0.5 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from importlib_metadata<5,>2->openapi-python-client) (3.4.0)\n", "Requirement already satisfied: MarkupSafe>=2.0 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from jinja2<4.0.0,>=3.0.0->openapi-python-client) (2.0.1)\n", "Requirement already satisfied: six>=1.5 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from python-dateutil<3.0.0,>=2.8.1->openapi-python-client) (1.15.0)\n", "Requirement already satisfied: click<7.2.0,>=7.1.1 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from typer<0.4,>=0.3->openapi-python-client) (7.1.2)\n", "Requirement already satisfied: appdirs in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from black->openapi-python-client) (1.4.4)\n", "Requirement already satisfied: mypy-extensions>=0.4.3 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from 
black->openapi-python-client) (0.4.3)\n", "Requirement already satisfied: typed-ast>=1.4.0 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from black->openapi-python-client) (1.4.2)\n", "Requirement already satisfied: regex>=2020.1.8 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from black->openapi-python-client) (2020.11.13)\n", "Requirement already satisfied: pathspec<1,>=0.6 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from black->openapi-python-client) (0.8.1)\n", "Requirement already satisfied: toml>=0.10.1 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from black->openapi-python-client) (0.10.2)\n", "\u001b[33mWARNING: You are using pip version 21.2.4; however, version 21.3.1 is available.\n", "You should consider upgrading via the '/home/ec2-user/anaconda3/envs/python3/bin/python -m pip install --upgrade pip' command.\u001b[0m\n" ] } ], "source": [ "!pip install openapi-python-client" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "import boto3\n", "import random\n", "import requests\n", "import json\n", "import pprint\n", "from botocore.exceptions import ClientError" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Required inputs\n", "* The name of the parallel cluster that was previously set up with the Slurm REST API using the pcluster-athena++ notebook\n", "* The region of that cluster" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "# unique name of the pcluster\n", "pcluster_name = 'myPC5c'\n", "region=\"us-east-1\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Helper Functions\n", "The three helper cells below are adapted from the pcluster-athena++ notebook. They obtain the API key, IP address, and S3 bucket from the previously created infrastructure. 
This information is needed to connect to the Slurm REST API.\n", "\n", "### Get the secret key from Secrets Manager" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Header information:\n", "{'X-SLURM-USER-NAME': 'slurm',\n", " 'X-SLURM-USER-TOKEN': 'eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJleHAiOjE2MzcxODIyMDEsImlhdCI6MTYzNzE4MDQwMSwic3VuIjoicm9vdCJ9.o0giOhxzlhgCi-2kLYecBqoTVbGaRtGgpU74PXvIY2c'}\n" ] } ], "source": [ "import base64  # needed below to decode binary secrets\n", "\n", "slurm_secret_name = \"slurm_token_{}\".format(pcluster_name)\n", "\n", "session = boto3.session.Session()\n", "###\n", "# Retrieve the slurm_token from Secrets Manager\n", "#\n", "def get_secret(session, slurm_secret_name, region):\n", "\n", "    # Create a Secrets Manager client\n", "    client = session.client(\n", "        service_name='secretsmanager',\n", "        region_name=region\n", "    )\n", "\n", "    # In this sample we only handle the specific exceptions for the 'GetSecretValue' API.\n", "    # See https://docs.aws.amazon.com/secretsmanager/latest/apireference/API_GetSecretValue.html\n", "    # We print the error and re-raise it so that a failed lookup is not silently ignored.\n", "\n", "    try:\n", "        get_secret_value_response = client.get_secret_value(SecretId=slurm_secret_name)\n", "    except ClientError as e:\n", "        print(\"Error\", e)\n", "        raise\n", "    else:\n", "        # Decrypts secret using the associated KMS CMK.\n", "        # Depending on whether the secret is a string or binary, one of these fields will be populated.\n", "        if 'SecretString' in get_secret_value_response:\n", "            secret = get_secret_value_response['SecretString']\n", "            return secret\n", "        else:\n", "            decoded_binary_secret = base64.b64decode(get_secret_value_response['SecretBinary'])\n", "            return decoded_binary_secret\n", "\n", "###\n", "# Retrieve the token and inject into the header for JWT auth\n", "#\n", "def update_header_token(session, slurm_secret_name, region):\n", "    # we use 'slurm' as the default user on head node for slurm commands\n", "    token = get_secret(session, slurm_secret_name, region)\n", "    post_headers = {'X-SLURM-USER-NAME':'slurm', 'X-SLURM-USER-TOKEN': token, 'Content-type': 'application/json', 'Accept': 'application/json'}\n", "    get_headers = {'X-SLURM-USER-NAME':'slurm', 'X-SLURM-USER-TOKEN': token, 'Content-type': 'application/x-www-form-urlencoded', 'Accept': 'application/json'}\n", "    return [post_headers, get_headers]\n", "\n", "_, get_headers = update_header_token(session, slurm_secret_name, region)\n", "# Keep only the two authentication headers\n", "get_headers.pop('Accept')\n", "get_headers.pop('Content-type')\n", "\n", "print(\"Header information:\")\n", "ppr = pprint.PrettyPrinter(depth=2, indent=1)\n", "ppr.pprint(get_headers)" ] }, 
{ "cell_type": "markdown", "metadata": {}, "source": [ "### Obtain the REST API Endpoint from the CloudFormation Stack" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Slurm REST endpoint is on 172.31.46.225\n" ] } ], "source": [ "cf_client = boto3.client('cloudformation')\n", "\n", "resp=cf_client.describe_stacks(StackName=pcluster_name)\n", "outputs=resp[\"Stacks\"][0][\"Outputs\"]\n", "\n", "slurm_host=''\n", "for o in outputs:\n", "    if o['OutputKey'] == 'HeadNodePrivateIP':\n", "        slurm_host = o['OutputValue']\n", "        print(\"Slurm REST endpoint is on \", slurm_host)\n", "        break\n", "\n", "slurm_rest='http://'+slurm_host+':8082'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### The S3 bucket name that was set up in pcluster-athena++" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "S3 bucket name is mypc5c-534950961494\n" ] } ], "source": [ "s3bucket=pcluster_name.lower() + \"-\" + boto3.client('sts').get_caller_identity().get('Account')\n", "print('S3 bucket name is ' + s3bucket)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## View and Save the OpenAPI Specification to a File" ] }, { 
"cell_type": "code", "execution_count": 8, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'components': {'schemas': {...}, 'securitySchemes': {...}},\n", " 'info': {'contact': {...},\n", " 'description': 'API to access and control Slurm.',\n", " 'license': {...},\n", " 'termsOfService': 'https://github.com/SchedMD/slurm/blob/master/DISCLAIMER',\n", " 'title': 'Slurm Rest API',\n", " 'version': '0.0.36'},\n", " 'openapi': '3.0.2',\n", " 'paths': {'/openapi': {...},\n", " '/openapi.json': {...},\n", " '/openapi.yaml': {...},\n", " '/openapi/v3': {...},\n", " '/slurm/v0.0.35/diag': {...},\n", " '/slurm/v0.0.35/job/submit': {...},\n", " '/slurm/v0.0.35/job/{job_id}': {...},\n", " '/slurm/v0.0.35/jobs': {...},\n", " '/slurm/v0.0.35/node/{node_name}': {...},\n", " '/slurm/v0.0.35/nodes': {...},\n", " '/slurm/v0.0.35/partition/{partition_name}': {...},\n", " '/slurm/v0.0.35/partitions': {...},\n", " '/slurm/v0.0.35/ping': {...},\n", " '/slurm/v0.0.36/diag': {...},\n", " '/slurm/v0.0.36/job/submit': {...},\n", " '/slurm/v0.0.36/job/{job_id}': {...},\n", " '/slurm/v0.0.36/jobs': {...},\n", " '/slurm/v0.0.36/node/{node_name}': {...},\n", " '/slurm/v0.0.36/nodes': {...},\n", " '/slurm/v0.0.36/partition/{partition_name}': {...},\n", " '/slurm/v0.0.36/partitions': {...},\n", " '/slurm/v0.0.36/ping': {...},\n", " '/slurmdb/v0.0.36/account/{account_name}': {...},\n", " '/slurmdb/v0.0.36/accounts': {...},\n", " '/slurmdb/v0.0.36/association': {...},\n", " '/slurmdb/v0.0.36/associations': {...},\n", " '/slurmdb/v0.0.36/cluster/{cluster_name}': {...},\n", " '/slurmdb/v0.0.36/clusters': {...},\n", " '/slurmdb/v0.0.36/config': {...},\n", " '/slurmdb/v0.0.36/diag': {...},\n", " '/slurmdb/v0.0.36/job/{job_id}': {...},\n", " '/slurmdb/v0.0.36/jobs': {...},\n", " '/slurmdb/v0.0.36/qos': {...},\n", " '/slurmdb/v0.0.36/qos/{qos_name}': {...},\n", " '/slurmdb/v0.0.36/tres': {...},\n", " '/slurmdb/v0.0.36/user/{user_name}': {...},\n", " 
'/slurmdb/v0.0.36/users': {...},\n", " '/slurmdb/v0.0.36/wckey/{wckey}': {...},\n", " '/slurmdb/v0.0.36/wckeys': {...}},\n", " 'security': [{...}],\n", " 'servers': [{...}],\n", " 'tags': [{...}, {...}]}\n" ] } ], "source": [ "rest_api = requests.get(slurm_rest+\"/openapi/v3\", headers=get_headers)\n", "\n", "if rest_api.status_code != 200:\n", "    # This means something went wrong.\n", "    print(\"Error\", rest_api.status_code)\n", "\n", "ppr = pprint.PrettyPrinter(depth=2, indent=1)\n", "ppr.pprint(rest_api.json())\n", "\n", "with open('slurmrestapi.json', 'w', encoding='utf-8') as f:\n", "    json.dump(rest_api.json(), f, ensure_ascii=False, indent=4)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Build client library module\n", "\n", "* Generate the client library module using the OpenAPI generator\n", "* Apply a hotfix to the generated module\n", "* Install the generated module with pip\n", "\n", "Note: the OpenAPI generator also creates a README.md file in the slurm-rest-api-client directory with details on usage of the Python client module.\n", "\n", "The hotfix patch is stored in this repository in the same directory as this Jupyter notebook and is reproduced here for completeness. It makes `from_dict` in `v0036_node_allocation.py` treat a string input as an empty dictionary instead of failing on `src_dict.copy()`:\n", "```\n", "diff -urwB slurm-rest-api-client/slurm_rest_api_client/models/v0036_node_allocation.py slurm-rest-api-client-patched/slurm_rest_api_client/models/v0036_node_allocation.py\n", "--- slurm-rest-api-client/slurm_rest_api_client/models/v0036_node_allocation.py 2021-11-17 18:58:51.510567435 +0000\n", "+++ slurm-rest-api-client-patched/slurm_rest_api_client/models/v0036_node_allocation.py 2021-11-03 22:10:51.405977583 +0000\n", "@@ -50,6 +50,8 @@\n", "\n", "     @classmethod\n", "     def from_dict(cls: Type[T], src_dict: Dict[str, Any]) -> T:\n", "+        if isinstance(src_dict, str):\n", "+            src_dict={}\n", "         d = src_dict.copy()\n", "         memory = d.pop(\"memory\", UNSET)\n", " \n", "```" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": 
"stdout", "output_type": "stream", "text": [ "Directory slurm-rest-api-client found, module already created\n", "patching file slurm-rest-api-client/slurm_rest_api_client/models/v0036_node_allocation.py\n", "Reversed (or previously applied) patch detected! Assume -R? [n] \n", "Apply anyway? [n] \n", "Skipping patch.\n", "1 out of 1 hunk ignored -- saving rejects to file slurm-rest-api-client/slurm_rest_api_client/models/v0036_node_allocation.py.rej\n", "Processing /home/ec2-user/SageMaker/aws-research-workshops/notebooks/parallelcluster/slurm-rest-api-client\n", " Installing build dependencies: started\n", " Installing build dependencies: finished with status 'done'\n", " Getting requirements to build wheel: started\n", " Getting requirements to build wheel: finished with status 'done'\n", " Preparing wheel metadata: started\n", " Preparing wheel metadata: finished with status 'done'\n", "Requirement already satisfied: python-dateutil<3.0.0,>=2.8.0 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from slurm-rest-api-client==0.0.36) (2.8.1)\n", "Requirement already satisfied: attrs<22.0.0,>=20.1.0 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from slurm-rest-api-client==0.0.36) (21.2.0)\n", "Requirement already satisfied: httpx<0.21.0,>=0.15.4 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from slurm-rest-api-client==0.0.36) (0.20.0)\n", "Requirement already satisfied: httpcore<0.14.0,>=0.13.3 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from httpx<0.21.0,>=0.15.4->slurm-rest-api-client==0.0.36) (0.13.7)\n", "Requirement already satisfied: sniffio in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from httpx<0.21.0,>=0.15.4->slurm-rest-api-client==0.0.36) (1.2.0)\n", "Requirement already satisfied: async-generator in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from httpx<0.21.0,>=0.15.4->slurm-rest-api-client==0.0.36) (1.10)\n", 
"Requirement already satisfied: certifi in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from httpx<0.21.0,>=0.15.4->slurm-rest-api-client==0.0.36) (2021.5.30)\n", "Requirement already satisfied: rfc3986[idna2008]<2,>=1.3 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from httpx<0.21.0,>=0.15.4->slurm-rest-api-client==0.0.36) (1.5.0)\n", "Requirement already satisfied: charset-normalizer in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from httpx<0.21.0,>=0.15.4->slurm-rest-api-client==0.0.36) (2.0.6)\n", "Requirement already satisfied: anyio==3.* in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from httpcore<0.14.0,>=0.13.3->httpx<0.21.0,>=0.15.4->slurm-rest-api-client==0.0.36) (3.3.4)\n", "Requirement already satisfied: h11<0.13,>=0.11 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from httpcore<0.14.0,>=0.13.3->httpx<0.21.0,>=0.15.4->slurm-rest-api-client==0.0.36) (0.12.0)\n", "Requirement already satisfied: idna>=2.8 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from anyio==3.*->httpcore<0.14.0,>=0.13.3->httpx<0.21.0,>=0.15.4->slurm-rest-api-client==0.0.36) (3.1)\n", "Requirement already satisfied: typing-extensions in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from anyio==3.*->httpcore<0.14.0,>=0.13.3->httpx<0.21.0,>=0.15.4->slurm-rest-api-client==0.0.36) (3.10.0.2)\n", "Requirement already satisfied: dataclasses in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from anyio==3.*->httpcore<0.14.0,>=0.13.3->httpx<0.21.0,>=0.15.4->slurm-rest-api-client==0.0.36) (0.8)\n", "Requirement already satisfied: contextvars>=2.1 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from sniffio->httpx<0.21.0,>=0.15.4->slurm-rest-api-client==0.0.36) (2.4)\n", "Requirement already satisfied: immutables>=0.9 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from 
contextvars>=2.1->sniffio->httpx<0.21.0,>=0.15.4->slurm-rest-api-client==0.0.36) (0.15)\n", "Requirement already satisfied: six>=1.5 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from python-dateutil<3.0.0,>=2.8.0->slurm-rest-api-client==0.0.36) (1.15.0)\n", "Building wheels for collected packages: slurm-rest-api-client\n", " Building wheel for slurm-rest-api-client (PEP 517): started\n", " Building wheel for slurm-rest-api-client (PEP 517): finished with status 'done'\n", " Created wheel for slurm-rest-api-client: filename=slurm_rest_api_client-0.0.36-py3-none-any.whl size=178319 sha256=57aa79de8ca67631bb0ec86706462657d22738efb8625773a7d0958e1a73d349\n", " Stored in directory: /home/ec2-user/.cache/pip/wheels/d1/9b/51/2e4d30fada72e87ac46cdb7b9686be85c236b2750065146c8a\n", "Successfully built slurm-rest-api-client\n", "Installing collected packages: slurm-rest-api-client\n", " Attempting uninstall: slurm-rest-api-client\n", " Found existing installation: slurm-rest-api-client 0.0.36\n", " Uninstalling slurm-rest-api-client-0.0.36:\n", " Successfully uninstalled slurm-rest-api-client-0.0.36\n", "Successfully installed slurm-rest-api-client-0.0.36\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " DEPRECATION: A future pip version will change local packages to be built in-place without first copying to a temporary directory. We recommend you use --use-feature=in-tree-build to test your packages with this new behavior before it becomes the default.\n", " pip 21.3 will remove support for this functionality. You can find discussion regarding this at https://github.com/pypa/pip/issues/7555.\n", "WARNING: You are using pip version 21.2.4; however, version 21.3.1 is available.\n", "You should consider upgrading via the '/home/ec2-user/anaconda3/envs/python3/bin/python -m pip install --upgrade pip' command.\n" ] } ], "source": [ "%%bash\n", "if [ ! 
-d slurm-rest-api-client ] \n", "then\n", "    openapi-python-client generate --path slurmrestapi.json\n", "else\n", "    echo \"Directory slurm-rest-api-client found, module already created\"\n", "fi\n", "\n", "if [ -d slurm-rest-api-client ] \n", "then\n", "    patch -p0 < slurmrestapi-20.11.7.patch\n", "    cd slurm-rest-api-client\n", "    python -m pip install .\n", "else\n", "    echo \"Directory slurm-rest-api-client not found, something must have gone wrong when building\"\n", "fi" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Example Usage of the Client API Created from the OpenAPI Specification" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Example 1: Querying Node Information\n", "\n", "First, we will show some standard queries to obtain information about the cluster nodes. The output of each client module function is an object containing data obtained from the API call. For the purpose of this demonstration, only selected portions of the output data are printed.\n", "\n", "In the cell below, you can:\n", "* Check whether the REST API responds with HTTP status 200\n", "* Get the status of the head node with the ping API\n", "* Get the names and states of the compute nodes using the get_nodes API" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Response is 200\n", "ip-172-31-46-225 responds UP\n", "big-dy-c5n2xlarge-1 state is idle\n", "big-dy-c5n2xlarge-2 state is idle\n", "big-dy-c5n2xlarge-3 state is idle\n", "big-dy-c5n2xlarge-4 state is idle\n", "big-dy-c5n2xlarge-5 state is idle\n", "big-dy-c5n2xlarge-6 state is idle\n", "big-dy-c5n2xlarge-7 state is idle\n", "big-dy-c5n2xlarge-8 state is idle\n", "big-dy-c5n2xlarge-9 state is idle\n", "big-dy-c5n2xlarge-10 state is idle\n", "small-dy-t2micro-1 state is idle\n", "small-dy-t2micro-2 state is idle\n", "small-dy-t2micro-3 state is idle\n", "small-dy-t2micro-4 state is idle\n", "small-dy-t2micro-5 state is 
idle\n", "small-dy-t2micro-6 state is idle\n", "small-dy-t2micro-7 state is idle\n", "small-dy-t2micro-8 state is idle\n", "small-dy-t2micro-9 state is idle\n", "small-dy-t2micro-10 state is idle\n" ] } ], "source": [ "import slurm_rest_api_client as slurm\n", "from slurm_rest_api_client.api.slurm import slurmctld_ping\n", "from slurm_rest_api_client.api.slurm import slurmctld_get_nodes\n", "\n", "client = slurm.Client(base_url=slurm_rest, headers=get_headers)\n", "\n", "response: slurm.types.Response[slurm.models.V0036Pings] = slurmctld_ping.sync_detailed(client=client)\n", "output: slurm.models.V0036Pings = slurmctld_ping.sync(client=client)\n", "\n", "# Compare the status code by value; 'is not' tests object identity and is unreliable for integers\n", "if response.status_code != 200:\n", "    print(\"The REST API returned a non-200 status code, try refreshing the JWT using the cell above\")\n", "else:\n", "    print(\"Response is 200\")\n", "if output is not None:\n", "    for ping in output.pings:\n", "        print(ping.hostname + \" responds \" + str(ping.ping))\n", "\n", "#response: slurm.types.Response[slurm.models.V0036NodesResponse] = slurmctld_get_nodes.sync_detailed(client=client)\n", "output: slurm.models.V0036NodesResponse = slurmctld_get_nodes.sync(client=client)\n", "\n", "if output is not None:\n", "    for node in output.nodes:\n", "        print(node.hostname + \" state is \" + str(node.state))\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Example 2: Submit jobs\n", "\n", "Let's submit a series of test jobs to demonstrate the submit job API. Each job will echo a phrase containing a random number generated in the Jupyter notebook to identify it uniquely, then sleep for a short time so that it can be queried in Example 3.\n", "\n", "We will then use S3 to transfer the job output file from the cluster to the notebook host, and print the downloaded job output to the notebook cell.\n", "\n", "You can use Example 3 to check the status of the job." 
] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Random integer submitted is 763516996\n", "test0 -- Job submitted: #113\n", "Random integer submitted is 724693664\n", "test1 -- Job submitted: #114\n", "Random integer submitted is 1933122879\n", "test2 -- Job submitted: #115\n", "Random integer submitted is 1182797700\n", "test3 -- Job submitted: #116\n", "Random integer submitted is 1377528152\n", "test4 -- Job submitted: #117\n" ] } ], "source": [ "from slurm_rest_api_client.api.slurm import slurmctld_submit_job\n", "\n", "njobs=5 # How many test jobs to submit\n", "tsleep=60 # Delay in seconds for each job to sit idle before completing\n", "\n", "for jobn in range(njobs):\n", "    randi=random.randint(1,2147483647)\n", "    print(\"Random integer submitted is \" + str(randi))\n", "    job_name=\"test\" + str(jobn)\n", "    job_dir=\"/shared/\"\n", "    job_spec={\n", "        \"job\": {\n", "            \"name\": job_name,\n", "            \"ntasks\": 2,\n", "            \"nodes\": 2,\n", "            \"partition\": \"small\",\n", "            \"current_working_directory\": job_dir,\n", "            \"standard_input\": \"/dev/null\",\n", "            \"standard_output\": job_dir + job_name + \".out\",\n", "            \"standard_error\": job_dir + job_name + \".err\",\n", "            \"environment\": {\"PATH\": \"/bin:/usr/bin/:/usr/local/bin/\",\"LD_LIBRARY_PATH\": \"/lib/:/lib64/:/usr/local/lib\"}\n", "        },\n", "        # The last script line uploads the job's stdout file to S3 once the job has finished\n", "        \"script\": \"#!/bin/bash\\n\" +\n", "                  \"echo I am from a Jupyter Notebook and Slurm Job $SLURM_JOB_ID\\n\" +\n", "                  \"sleep \" + str(tsleep) + \"\\n\" +\n", "                  \"echo My random integer is \" + str(randi) + \"\\n\" +\n", "                  \"aws s3 cp \" + job_name + \".out s3://\" + s3bucket + \"/\" + job_name + \".out\"\n", "    }\n", "\n", "    #response: slurm.types.Response[slurm.models.V0036JobSubmissionResponse] = slurmctld_submit_job.sync_detailed(client=client, json_body=job_spec)\n", "    output: slurm.models.V0036JobSubmissionResponse = slurmctld_submit_job.sync(client=client, json_body=job_spec)\n", "    if output is not None:\n", "        print(job_name + \" -- Job submitted: #\" + str(output.job_id))\n", "    else:\n", "        print(\"Job not submitted\")" ] }, 
{ "cell_type": "markdown", "metadata": {}, "source": [ "### We will now use S3 to download the set of job outputs and print the result\n", "\n", "This will download the stdout files from the test jobs. Each job uploads its output file to S3 as its last step, so the file will not be found until the job has completed. You can check the job status in Example 3." ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "test0.out -- I am from a Jupyter Notebook and Slurm Job 113\n", "\n", "test0.out -- My random integer is 763516996\n", "\n", "test1.out -- I am from a Jupyter Notebook and Slurm Job 114\n", "\n", "test1.out -- My random integer is 724693664\n", "\n", "test2.out -- I am from a Jupyter Notebook and Slurm Job 115\n", "\n", "test2.out -- My random integer is 1933122879\n", "\n", "test3.out -- I am from a Jupyter Notebook and Slurm Job 116\n", "\n", "test3.out -- My random integer is 1182797700\n", "\n", "test4.out -- I am from a Jupyter Notebook and Slurm Job 117\n", "\n", "test4.out -- My random integer is 1377528152\n", "\n" ] } ], "source": [ "s3 = boto3.client('s3')\n", "for jobn in range(njobs):\n", "    outfilename=\"test\" + str(jobn) + \".out\"\n", "    try:\n", "        s3.download_file(s3bucket, outfilename, outfilename)\n", "        with open(outfilename) as f:\n", "            for line in f.readlines():\n", "                print(outfilename + \" -- \" + str(line))\n", "    except ClientError as e:\n", "        # Typically raised when the output file has not been uploaded yet\n", "        print(outfilename + \" -- \" + str(e.response[\"Error\"][\"Message\"]))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Example 3: Get jobs (like squeue)\n", "\n", "The get_jobs API can be used to get information on the jobs in the queue." 
] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Job 108 is COMPLETED on small-dy-t2micro-1\n", "Job 109 is COMPLETED on small-dy-t2micro-3\n", "Job 110 is COMPLETED on small-dy-t2micro-5\n", "Job 111 is COMPLETED on small-dy-t2micro-7\n", "Job 112 is COMPLETED on small-dy-t2micro-9\n", "Job 113 is COMPLETED on small-dy-t2micro-1\n", "Job 114 is COMPLETED on small-dy-t2micro-3\n", "Job 115 is COMPLETED on small-dy-t2micro-5\n", "Job 116 is COMPLETED on small-dy-t2micro-7\n", "Job 117 is COMPLETED on small-dy-t2micro-9\n" ] } ], "source": [ "from slurm_rest_api_client.api.slurm import slurmctld_get_jobs\n", "\n", "output: slurm.models.V0036JobsResponse = slurmctld_get_jobs.sync(client=client)\n", "\n", "if output is not None:\n", "    for job in output.jobs:\n", "        # str() guards against fields that may not be set, e.g. batch_host for a job that has not started\n", "        print('Job ' + str(job.job_id) + \" is \" + str(job.job_state) + \" on \" + str(job.batch_host))" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.9" } }, "nbformat": 4, "nbformat_minor": 4 }