# EMR on EKS with MWAA

This example deploys a DAG that creates an EMR cluster and runs a Spark job.

## Prerequisites

Ensure that you have the following:

1. [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html)
2. [Make](https://www.gnu.org/software/make/)
3. A running [Amazon MWAA](https://aws.amazon.com/managed-workflows-for-apache-airflow/) environment
4. An S3 bucket for storing EMR data and scripts

_Note: If you do not have a running MWAA environment, deploy one from the root of the project using Terraform or CDK._

## Get started

Clone the repository, navigate into the example directory, and run `make`, passing the MWAA environment arguments:

```sh
cd blueprints/examples/EMR
make deploy mwaa_bucket={MWAA_BUCKET} mwaa_execution_role_name={MWAA_EXEC_ROLE} mwaa_env_name={MWAA_ENV_NAME} emr_data_bucket={EMR_DATA_BUCKET}
```

## Login to MWAA

Log in to your Amazon MWAA environment. You should see a DAG named `emr_sample`. Unpause the DAG and run it from the console.

## What does the makefile do?

1. Copies the DAG and Spark script to the S3 buckets
2. Attaches the `AmazonS3FullAccess` and `AmazonEMRFullAccessPolicy_v2` managed policies to the MWAA execution role
3. Creates a variable in MWAA for the data bucket

## Clean up

```sh
cd blueprints/examples/EMR
make undeploy mwaa_bucket={MWAA_BUCKET} mwaa_execution_role_name={MWAA_EXEC_ROLE} mwaa_env_name={MWAA_ENV_NAME} emr_data_bucket={EMR_DATA_BUCKET}
```
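
## Manual equivalent of the makefile steps

If you want to see roughly what the deploy step involves without reading the makefile, the sketch below prints AWS CLI commands that correspond to the three steps listed under "What does the makefile do?". It is an illustration, not a copy of the makefile: the local file paths (`dags/emr_sample.py`, `spark/job.py`), the S3 key prefixes, and the placeholder default values are assumptions, and the script only prints commands instead of running them.

```sh
#!/bin/sh
# Dry-run sketch: prints the AWS CLI calls that roughly match the makefile steps.
# Nothing is executed against AWS. File paths and S3 prefixes are hypothetical.
set -eu

# Placeholder defaults (assumptions); override through the environment.
MWAA_BUCKET="${MWAA_BUCKET:-my-mwaa-bucket}"
MWAA_EXEC_ROLE="${MWAA_EXEC_ROLE:-my-mwaa-exec-role}"
MWAA_ENV_NAME="${MWAA_ENV_NAME:-my-mwaa-env}"
EMR_DATA_BUCKET="${EMR_DATA_BUCKET:-my-emr-data-bucket}"

# Print a command line instead of running it.
show() {
  printf '%s\n' "$*"
}

print_plan() {
  # Step 1: copy the DAG and the Spark script to their buckets.
  show aws s3 cp dags/emr_sample.py "s3://${MWAA_BUCKET}/dags/emr_sample.py"
  show aws s3 cp spark/job.py "s3://${EMR_DATA_BUCKET}/scripts/job.py"

  # Step 2: attach the two AWS managed policies to the MWAA execution role.
  for policy in AmazonS3FullAccess AmazonEMRFullAccessPolicy_v2; do
    show aws iam attach-role-policy \
      --role-name "${MWAA_EXEC_ROLE}" \
      --policy-arn "arn:aws:iam::aws:policy/${policy}"
  done

  # Step 3: setting an Airflow variable goes through the Airflow CLI.
  # Requesting a CLI token is the first step; the token is then used to send
  # a "variables set" command to the environment (omitted in this sketch).
  show aws mwaa create-cli-token --name "${MWAA_ENV_NAME}"
}

print_plan
```

Run it with the same values you would pass to `make deploy`, for example `MWAA_BUCKET=... EMR_DATA_BUCKET=... sh plan.sh` (the script name is arbitrary), and compare the printed commands with the makefile before running any of them yourself.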