## Overview

This walkthrough provides a demonstration of App Mesh's metrics extension on ECS. The App Mesh metrics extension enhances Envoy proxies running in your mesh, allowing them to emit *additional* metrics that are aware of the resources you define in App Mesh. Metrics will include new dimensions such as `TargetVirtualService`, allowing you to aggregate and filter metrics by the virtual service your virtual nodes communicate with. *More details on the new metrics and dimensions below.*

By using this walkthrough you will:

1. Deploy an application that leverages all the top-level App Mesh resources (virtual nodes, virtual gateways, virtual routers, etc).
1. Observe in CloudWatch the kinds of metrics provided by Envoy without the metrics extension.
1. Enable the metrics extension to add more mesh-aware metrics.
1. See different ways to selectively filter metrics produced by Envoy using the CloudWatch agent or the AWS Distro for OpenTelemetry.
1. Use the metrics extension-provided metrics to easily generate CloudWatch dashboards.

At any time, you can delete everything generated by this walkthrough by following the `Cleaning Up` section at the bottom of this page.

## Prerequisites

* git (Or a way to download the files in this repository. Only the files in this directory are needed.)
* curl (Or a way to make repeated HTTP requests from your terminal.)
* The [AWS CLI](https://aws.amazon.com/cli/).
* [Docker](https://docs.docker.com/get-docker/). You will use Docker to generate the application images used in this walkthrough.

## Initial Configuration

1. Clone this repository and navigate to the `walkthroughs/howto-metrics-extension-ecs` directory.
1. Export your account id:
   ```sh
   export AWS_ACCOUNT_ID=
   ```
1. Export the AWS region you will deploy this application in:
   ```sh
   export AWS_DEFAULT_REGION=
   ```
1. Export the Envoy proxy image you will use.
   For the latest image see https://docs.aws.amazon.com/app-mesh/latest/userguide/envoy.html
   ```sh
   export ENVOY_IMAGE=
   ```

*Note: The metrics extension requires the App Mesh Envoy image with a minimum version of `v1.19.1.1`*

## Application Overview

In this walkthrough you will be deploying a simple ECS-based application called "The DJ App":

![Application Diagram](./images/djapp.png)

The application is composed of a network load balancer, a virtual gateway, and four microservices:

1. `djapp` The frontend service that handles requests from the gateway. It is the only member of the `djapp.howto-metrics-extension.local` virtual service.
1. `metal-v1` A member of the `metal.howto-metrics-extension.local` virtual service.
1. `metal-v2` The other member of the `metal.howto-metrics-extension.local` virtual service.
1. `jazz` The sole member of the `jazz.howto-metrics-extension.local` virtual service.

Traffic comes into the application via a Network Load Balancer that forwards traffic to the virtual gateway (named `gateway`). `gateway` acts as an ingress to our mesh and in turn forwards traffic to `djapp.howto-metrics-extension.local`. The `djapp` requests artist lists randomly from the `metal.howto-metrics-extension.local` and `jazz.howto-metrics-extension.local` virtual services and returns them to its callers.

* `jazz` will return jazz artists such as *Miles Davis* or *Astrud Gilberto*.
* `metal-v1` will return metal artists such as *Judas Priest* or *Megadeth*.
* `metal-v2` will return metal artists as well, but also includes the city where the artists come from.

The two members of the `metal.howto-metrics-extension.local` virtual service simulate a canary deployment or A/B test where there are multiple versions of a service deployed at the same time.
> The `metal.howto-metrics-extension.local` virtual service distributes traffic 60/40 to `metal-v1` and `metal-v2` using a virtual router with a single route:

```yaml
MetalRoute:
  Type: AWS::AppMesh::Route
  Properties:
    MeshName: !GetAtt Mesh.MeshName
    VirtualRouterName: !GetAtt MetalVirtualRouter.VirtualRouterName
    RouteName: metal
    Spec:
      HttpRoute:
        Match:
          Prefix: /
        Action:
          WeightedTargets:
            - VirtualNode: !GetAtt MetalV1VirtualNode.VirtualNodeName
              Weight: 60
            - VirtualNode: !GetAtt MetalV2VirtualNode.VirtualNodeName
              Weight: 40
```

Lastly, all metrics generated by the Envoy proxies will be sent to CloudWatch under the `howto-metrics-extension` namespace.

## Deploy the DJ App

To deploy the DJ App, run the deploy script:

```sh
./deploy.sh deploy djapp-v1
```

This will deploy the CloudFormation template `djapp-v1.yaml` located in the `deploy/` folder. The template contains the entirety of the DJ App outlined above.

If everything was successful, the script should output a line you can run in your shell to save the public address of the Network Load Balancer. It will look like:

```sh
export PUBLIC_ENDPOINT=...
```

Export the endpoint so you can easily make requests to the DJ App.

## Default Envoy Metrics

To see the service metrics generated by Envoy you'll want to drive some traffic to the DJ App. This loop will make a request to the DJ App every second and print an artist name:

```sh
while true; do curl $PUBLIC_ENDPOINT; sleep 1; echo; done
```

After sending traffic to the DJ App for a few minutes, you can navigate to the CloudWatch console.
![Envoy Metrics in CloudWatch-- names](./images/djapp-v1-cloudwatch-metric-dimensions.png)

Here you'll see **hundreds** of metrics being generated by Envoy broken down by a few different dimensions, including a few that give some insight as to which virtual node or virtual gateway generated them:

* `appmesh.mesh`
* `appmesh.virtual_node`
* `appmesh.virtual_gateway`

As an example, you can search for the metric `envoy_cluster_upstream_rq` in CloudWatch to see the number of requests made by Envoy to each of its upstream *clusters* (upstream clusters usually being other virtual nodes in your mesh). Alternatively, you can use the `metric.sh` script to get a snapshot of the metric for the last 15 minutes:

```sh
./metric.sh -m envoy_cluster_upstream_rq -x Sum -q "appmesh.mesh=howto-metrics-extension appmesh.virtual_node=djapp"
```

*Note: You can run `./metric.sh -h` for help on using the script.*

![Upstream request counts to metal-v2](./images/djapp-v1-cloudwatch-upstream-rq.png)

In the above CloudWatch screenshot this metric has the dimension `envoy.cluster_name` with a value of `cds_egress_howto-metrics-extension_metal-v2_http_8080`. This metric is for the `djapp` virtual node, showing the number of requests from `djapp` to `metal-v2`. Each backend virtual node will result in a unique metric value being emitted, though none of these metrics indicate the virtual service the backend represents. App Mesh only includes the source and destination virtual node names in the metric names it generates.

> To learn more about Envoy metrics and how they are named in App Mesh, see the Envoy metrics topic in the App Mesh User Guide: https://docs.aws.amazon.com/app-mesh/latest/userguide/envoy-metrics.html

## Enabling the Metrics Extension

Above, you saw that Envoy generates a large number of metrics with high cardinality.
The metrics are also not straightforward to relate back to App Mesh resources, and in the case of virtual services, there is no mechanism to know for sure which virtual service a given virtual node or virtual gateway is communicating with.

The goal of the metrics extension is to provide a small but still useful set of metrics that will give you insight into the behavior of your mesh in terms of the resources you define in App Mesh. At the time of writing, App Mesh has made available the first version (version `1`) of the metrics extension. Over time, new metrics and features will be made available.

You can opt into these additional metrics by adding an environment variable to your ECS tasks:

```sh
APPMESH_METRIC_EXTENSION_VERSION=1
```

The existing DJ App tasks look like this:

```yaml
DJAppTaskDef:
  Type: AWS::ECS::TaskDefinition
  Properties:
    ContainerDefinitions:
      - Name: envoy
        Environment:
          - Name: ENVOY_LOG_LEVEL
            Value: debug
          - Name: ENABLE_ENVOY_DOG_STATSD
            Value: 1
          - Name: ENABLE_ENVOY_STATS_TAGS
            Value: 1
          - Name: APPMESH_RESOURCE_ARN
            Value: !Ref DJAppVirtualNode
        # (...)
```

You'll next deploy an update to the DJ App that replaces `ENABLE_ENVOY_STATS_TAGS` with `APPMESH_METRIC_EXTENSION_VERSION`. The new option makes `ENABLE_ENVOY_STATS_TAGS` redundant as the metrics extension will implicitly add App Mesh dimensions to Envoy's metrics:

```yaml
DJAppTaskDef:
  Type: AWS::ECS::TaskDefinition
  Properties:
    ContainerDefinitions:
      - Name: envoy
        Environment:
          - Name: ENVOY_LOG_LEVEL
            Value: debug
          - Name: ENABLE_ENVOY_DOG_STATSD
            Value: 1
          - Name: APPMESH_METRIC_EXTENSION_VERSION
            Value: 1
          - Name: APPMESH_RESOURCE_ARN
            Value: !Ref DJAppVirtualNode
        # (...)
```

That is the only difference between `deploy/djapp-v1.yaml` and `deploy/djapp-v2.yaml`.
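To check the difference between the two templates yourself, a plain `diff` is enough. The following is a minimal sketch: `template_diff` is a hypothetical helper introduced only for this example, and it assumes you run it from the walkthrough directory.

```sh
# Hypothetical helper: print the lines that differ between two templates.
# diff exits non-zero when the files differ, so the exit status is ignored.
template_diff() {
  diff "$1" "$2" || true
}

# Only run the comparison when both templates are present.
if [ -f deploy/djapp-v1.yaml ] && [ -f deploy/djapp-v2.yaml ]; then
  template_diff deploy/djapp-v1.yaml deploy/djapp-v2.yaml
fi
```

Lines prefixed with `<` come from the first file and lines prefixed with `>` come from the second.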
Deploy it with the deploy script:

```sh
# Here we can set SKIP_IMAGES=1 to skip re-deploying our application images since they won't change
SKIP_IMAGES=1 ./deploy.sh deploy djapp-v2
```

Once the update is complete, you'll want to send traffic to the DJ App again:

```sh
while true; do curl $PUBLIC_ENDPOINT; sleep 1; echo; done
```

In CloudWatch you should now see new metrics starting to appear. The metrics provided by the metrics extension resemble those provided by Elastic Load Balancing:

*Note: Metrics exported to CloudWatch will have a prefix of `envoy.appmesh.` (or `envoy_appmesh_` depending on the metrics format you are using to export to CloudWatch). Sections below will cover ways to configure this.*

### Metrics Related to Inbound/Ingress Traffic

| Metric | Description | Dimensions |
| --------------------- | ------------------------------------------------------------- | ------------------------------------------ |
| ActiveConnectionCount | Number of active TCP connections. | Mesh, VirtualNode<br>Mesh, VirtualGateway |
| NewConnectionCount | Number of TCP connections created. | Mesh, VirtualNode<br>Mesh, VirtualGateway |
| ProcessedBytes | Total TCP bytes sent to and received from downstream clients. | Mesh, VirtualNode<br>Mesh, VirtualGateway |
| RequestCount | The number of processed HTTP requests. | Mesh, VirtualNode<br>Mesh, VirtualGateway |
| GrpcRequestCount | The number of processed gRPC requests. | Mesh, VirtualNode<br>Mesh, VirtualGateway |

### Metrics Related to Outbound/Egress Traffic

| Metric | Description | Dimensions |
| ------------------------- | --------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------- |
| TargetProcessedBytes | Total TCP bytes sent to and received from upstream targets. | Mesh, VirtualNode, TargetVirtualService, TargetVirtualNode<br>Mesh, VirtualGateway, TargetVirtualService, TargetVirtualNode |
| HTTPCode_Target_2XX_Count | The number of HTTP requests to an upstream target that resulted in a 2xx HTTP response. | Mesh, VirtualNode, TargetVirtualService, TargetVirtualNode<br>Mesh, VirtualGateway, TargetVirtualService, TargetVirtualNode |
| HTTPCode_Target_3XX_Count | The number of HTTP requests to an upstream target that resulted in a 3xx HTTP response. | Mesh, VirtualNode, TargetVirtualService, TargetVirtualNode<br>Mesh, VirtualGateway, TargetVirtualService, TargetVirtualNode |
| HTTPCode_Target_4XX_Count | The number of HTTP requests to an upstream target that resulted in a 4xx HTTP response. | Mesh, VirtualNode, TargetVirtualService, TargetVirtualNode<br>Mesh, VirtualGateway, TargetVirtualService, TargetVirtualNode |
| HTTPCode_Target_5XX_Count | The number of HTTP requests to an upstream target that resulted in a 5xx HTTP response. | Mesh, VirtualNode, TargetVirtualService, TargetVirtualNode<br>Mesh, VirtualGateway, TargetVirtualService, TargetVirtualNode |
| RequestCountPerTarget | The number of requests sent to an upstream target. | Mesh, VirtualNode, TargetVirtualService, TargetVirtualNode<br>Mesh, VirtualGateway, TargetVirtualService, TargetVirtualNode |
| TargetResponseTime | The time elapsed from when a request is made to an upstream target to when the full response is received. | Mesh, VirtualNode, TargetVirtualService, TargetVirtualNode<br>Mesh, VirtualGateway, TargetVirtualService, TargetVirtualNode |

![Upstream requests from djapp](./images/djapp-v2-cloudwatch-metrics.png)

In the above screenshot, metrics for the `djapp` virtual node now include the metrics listed above. All of Envoy's built-in metrics will also now include the `Mesh`, `VirtualNode`, and `VirtualGateway` dimensions.

You can use the `metric.sh` script included in this walkthrough to get snapshot images of the metrics in this mesh. To see the number of requests handled by the gateway, run:

```sh
./metric.sh -t VirtualGateway -n gateway -m envoy_appmesh_RequestCount -x Sum -q "metric_type=counter"
```

To see the distribution of requests from the `djapp` to the different virtual nodes of `metal.howto-metrics-extension.local`, run:

```sh
./metric.sh -t VirtualNode -n djapp -m envoy_appmesh_RequestCountPerTarget -x Sum -q "TargetVirtualService=metal.howto-metrics-extension.local metric_type=counter"
```

Or to see the latency between `djapp` and `jazz`, run:

```sh
./metric.sh -t VirtualNode -n djapp -m envoy_appmesh_TargetResponseTime -x Average -q "TargetVirtualNode=jazz metric_type=timing"
```

## Filtering Metrics with the CloudWatch Agent

In the previous section you observed that the App Mesh metrics extension provides a subset of useful metrics that help you see things such as traffic, latency, and error rates in the DJ App. However, these new metrics make some of Envoy's default metrics redundant, and many of Envoy's default metrics are likely not of use to the DJ App. Fortunately, there are ways to filter metrics locally in our services before sending them to CloudWatch. Here you will see a way of doing metric filtering using the CloudWatch agent.

> Note: Envoy has built-in support for filtering the metrics it emits, but this feature can lead to unwanted behavior, including an irrecoverable deadlock that prevents App Mesh from pushing new configuration to Envoy.
> You can track the status of this issue on Envoy's side here: https://github.com/envoyproxy/envoy/issues/8771
> Or you can watch our tracking issue here: https://github.com/aws/aws-app-mesh-roadmap/issues/283
> In the meantime, metric filtering is best done by a metrics daemon or sidecar.

Most App Mesh users that use the CloudWatch agent leverage the [StatsD](https://github.com/statsd/statsd) (or DogStatsD) protocol to send metrics from Envoy to the agent and ultimately to CloudWatch. While this integration is simple, it provides no mechanism today to filter metrics before sending them to CloudWatch.

The CloudWatch agent does however support scraping [Prometheus](https://prometheus.io/) metrics. Prometheus scrapers can be provided a *scrape configuration* that allows you to select the metrics you want to pull from Envoy and ultimately send to CloudWatch.

Right now the task definitions for all of the services in the DJ App are configured to use StatsD and send their metrics to a CloudWatch agent sidecar:

```yaml
DJAppTaskDef:
  Type: AWS::ECS::TaskDefinition
  Properties:
    ContainerDefinitions:
      - Name: envoy
        Environment:
          - Name: ENVOY_LOG_LEVEL
            Value: debug
          - Name: ENABLE_ENVOY_DOG_STATSD
            Value: 1
          - Name: APPMESH_METRIC_EXTENSION_VERSION
            Value: 1
          - Name: APPMESH_RESOURCE_ARN
            Value: !Ref DJAppVirtualNode
        # (...)
      - Name: cwagent
        Environment:
          - Name: CW_CONFIG_CONTENT
            Value: !Sub >
              {
                "metrics": {
                  "namespace": "${ProjectName}",
                  "metrics_collected": {
                    "statsd": {}
                  }
                }
              }
        # (...)
```

The `ENABLE_ENVOY_DOG_STATSD` option on the Envoy container tells Envoy to emit StatsD metrics. The `metrics_collected: { statsd: {} }` setting on the CloudWatch agent in turn configures the agent to consume the metrics that Envoy pushes to it.

The next update to the DJ App will instead deploy a single shared CloudWatch agent to scrape and filter Prometheus metrics from all of our services.
This agent will automatically discover ECS tasks in our cluster that have special Docker labels that tell the agent the location of Envoy's Prometheus metrics endpoint. To do this, the agent will need its own IAM roles that give it read-only access to ECS tasks and permission to send the metrics it finds to CloudWatch:

```yaml
CWAgentTaskIAMRole:
  Type: AWS::IAM::Role
  # (...)
CWAgentTaskExecutionIAMRole:
  Type: AWS::IAM::Role
  # (...)
```

Then each DJ App service will be updated to remove the CloudWatch agent sidecar and instead add Docker labels indicating Envoy's Prometheus metrics endpoint:

```yaml
DJAppTaskDef:
  Type: AWS::ECS::TaskDefinition
  Properties:
    ContainerDefinitions:
      - Name: envoy
        DockerLabels:
          ECS_PROMETHEUS_METRICS_PATH: /stats/prometheus
          ECS_PROMETHEUS_EXPORTER_PORT: 9901
        Environment:
          - Name: ENVOY_LOG_LEVEL
            Value: debug
          - Name: APPMESH_METRIC_EXTENSION_VERSION
            Value: 1
          - Name: APPMESH_RESOURCE_ARN
            Value: !Ref DJAppVirtualNode
        # (...)
```

Finally, a new ECS Service and Task for the CloudWatch agent is created:

```yaml
CWAgentTaskDef:
  Type: AWS::ECS::TaskDefinition
  Properties:
    ContainerDefinitions:
      - Name: cwagent
        Image: public.ecr.aws/cloudwatch-agent/cloudwatch-agent:latest
        Essential: true
        Environment:
          - Name: PROMETHEUS_CONFIG_CONTENT
            Value: |
              global:
                scrape_interval: 1m
                scrape_timeout: 10s
              scrape_configs:
                - job_name: cwagent-ecs-file-sd-config
                  sample_limit: 10000
                  file_sd_configs:
                    - files: [ /tmp/cwagent_ecs_auto_sd.yaml ]
                  metric_relabel_configs:
                    - source_labels: [ __name__ ]
                      regex: ^envoy_appmesh_.+$
                      action: keep
          - Name: CW_CONFIG_CONTENT
            Value: !Sub >
              {
                "logs": {
                  "force_flush_interval": 5,
                  "metrics_collected": {
                    "prometheus": {
                      "log_group_name": "${LogGroup}",
                      "prometheus_config_path": "env:PROMETHEUS_CONFIG_CONTENT",
                      "ecs_service_discovery": {
                        "sd_frequency": "1m",
                        "docker_label": {},
                        "sd_result_file": "/tmp/cwagent_ecs_auto_sd.yaml"
                      },
                      "emf_processor": {
                        "metric_namespace": "${ProjectName}",
                        "metric_declaration_dedup": true,
                        "metric_declaration": [
                          {
                            "source_labels": [ "container_name" ],
                            "label_matcher": "^envoy$",
                            "dimensions": [
                              ["Mesh", "VirtualNode"],
                              ["Mesh", "VirtualNode", "TargetVirtualNode"],
                              ["Mesh", "VirtualNode", "TargetVirtualNode", "TargetVirtualService"],
                              ["Mesh", "VirtualGateway"],
                              ["Mesh", "VirtualGateway", "TargetVirtualNode"],
                              ["Mesh", "VirtualGateway", "TargetVirtualNode", "TargetVirtualService"]
                            ],
                            "metric_selectors": [ "^.+$" ]
                          }
                        ]
                      }
                    }
                  }
                }
              }
        # (...)
```

`PROMETHEUS_CONFIG_CONTENT` holds a [Prometheus configuration file](https://prometheus.io/docs/prometheus/latest/configuration/configuration/). Prometheus' scrape configuration allows one to filter and manipulate metrics before sending them to CloudWatch in [Embedded Metrics Format](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch_Embedded_Metric_Format_Specification.html). In this example, all metrics that begin with `envoy_appmesh_` are collected from Envoy into the CloudWatch agent.

> It is worth noting that Prometheus automatically replaces all dot (`.`) characters in Envoy metric names and dimensions with underscores (`_`). The CloudWatch agent only performs this transformation on metric names.

The `metric_declaration` section in `CW_CONFIG_CONTENT` is where you can select which metrics and which dimensions you want to send to CloudWatch in Embedded Metrics Format. Here, all possible dimension combinations are selected, and all metric names are selected under `metric_selectors` with a wildcard regular expression. You can remove dimensions or further restrict metric names in this section if you choose.
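The `keep` rule in the Prometheus configuration above can be tried out locally before deploying. The following is only a rough sketch: `grep -E` stands in for the regular expression engine Prometheus uses, `to_prometheus_name`, `keep_appmesh_metrics`, and `sample_metrics` are hypothetical helpers, and the sample metric names are chosen purely for illustration.

```sh
# Hypothetical helper: mimic the dot-to-underscore conversion of metric names.
to_prometheus_name() {
  tr '.' '_'
}

# Hypothetical helper: approximate the `keep` relabel rule with grep -E.
keep_appmesh_metrics() {
  grep -E '^envoy_appmesh_.+$'
}

# Illustrative metric names, one per line.
sample_metrics() {
  printf '%s\n' \
    'envoy.appmesh.RequestCount' \
    'envoy.cluster.upstream_rq' \
    'envoy.appmesh.TargetResponseTime'
}

# Prints only the names that would survive the keep rule.
sample_metrics | to_prometheus_name | keep_appmesh_metrics
```

Swapping in a different pattern is a quick way to preview a stricter filter before changing the template.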
> To learn more about configuring the CloudWatch agent to scrape Prometheus metrics see: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/ContainerInsights-Prometheus-Setup-configure-ECS.html

Deploy the updated DJ App:

```sh
SKIP_IMAGES=1 ./deploy.sh deploy djapp-v3
```

When complete, generate traffic for a few minutes:

```sh
while true; do curl $PUBLIC_ENDPOINT; sleep 1; echo; done
```

At this point, none of Envoy's default metrics are being sent to CloudWatch. Additionally, only the metric dimensions we opted into above are being sent.

You can verify that Envoy's metrics are no longer being sent to CloudWatch by using the `metric.sh` script again:

```sh
./metric.sh -t VirtualNode -n djapp -m envoy_cluster_upstream_cx_total -x Sum -q "envoy.cluster_name=cds_egress_howto-metrics-extension_metal-v1_http_8080 metric_type=counter"
```

This graph is for the metric `envoy_cluster_upstream_cx_total`, the number of TCP connections from `djapp` to the `metal-v1` virtual node. Once the update has taken effect, no new data points should appear for it.

## (Optional) Adding Selected Envoy Metrics

The metrics extension metrics are not intended to replace all of the useful metrics provided by Envoy. Using scrape configuration, you can always opt in to sending select Envoy metrics to CloudWatch.
In this section you'll configure the CloudWatch agent to send the metric that counts the number of TCP connections between the `djapp` and its backend virtual nodes (`envoy_cluster_upstream_cx_total`):

```yaml
CWAgentTaskDef:
  Type: AWS::ECS::TaskDefinition
  Properties:
    ContainerDefinitions:
      - Name: cwagent
        Image: public.ecr.aws/cloudwatch-agent/cloudwatch-agent:latest
        Essential: true
        Environment:
          - Name: PROMETHEUS_CONFIG_CONTENT
            Value: !Sub |
              global:
                scrape_interval: 1m
                scrape_timeout: 10s
              scrape_configs:
                - job_name: cwagent-ecs-file-sd-config
                  sample_limit: 10000
                  file_sd_configs:
                    - files: [ /tmp/cwagent_ecs_auto_sd.yaml ]
                  metric_relabel_configs:
                    - source_labels: [ __name__ ]
                      regex: ^(envoy_appmesh_.+)|(envoy_cluster_upstream_cx_total)$
                      action: keep
                    - source_labels: [ envoy_cluster_name ]
                      target_label: TargetVirtualNode
                      regex: ^cds_egress_${Mesh.MeshName}_(.+)_(.+)_[0-9]+$
                      replacement: ${!1}
        # (...)
```

In addition to keeping the `envoy_cluster_upstream_cx_total` metric, there is an additional relabeling rule that extracts the `TargetVirtualNode` dimension from Envoy's existing `envoy_cluster_name` dimension. This method of extracting target virtual node names from `envoy_cluster_name` works for most Envoy metrics.

To deploy `djapp-v4` run:

```sh
SKIP_IMAGES=1 ./deploy.sh deploy djapp-v4
```

You can now observe that the Envoy metric has been restored and includes the `TargetVirtualNode` dimension:

```sh
./metric.sh -t VirtualNode -n djapp -m envoy_cluster_upstream_cx_total -x Sum -q "TargetVirtualNode=metal-v1"
```

## (Optional) Filtering Metrics with the AWS Distro for OpenTelemetry

As of the time of this writing, the CloudWatch agent does not support scraping histogram type metrics from a Prometheus client. Histogram metrics are used in Envoy to represent time durations. As a result, neither the metrics extension's `TargetResponseTime` metric nor any of Envoy's built-in histogram/timing metrics will be sent to CloudWatch.
> See: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/ContainerInsights-Prometheus-metrics-conversion.html for the current status of supported metric types.

In this section, you will replace the CloudWatch agent with the AWS OpenTelemetry Collector from the [AWS Distro for OpenTelemetry](https://aws.amazon.com/otel/). The OpenTelemetry collector supports many different metrics formats and can replace multiple existing metrics agents including the CloudWatch and AWS X-Ray agents.

To migrate to the OpenTelemetry collector, you'll start from the version of the DJ App where we used the CloudWatch agent as a sidecar on each of our services. This was the version of the DJ App in the template located at `deploy/djapp-v2.yaml`.

```yaml
DJAppTaskDef:
  Type: AWS::ECS::TaskDefinition
  Properties:
    ContainerDefinitions:
      - Name: otel-collector
        Image: public.ecr.aws/aws-observability/aws-otel-collector:latest
        Essential: true
        Environment:
          - Name: AOT_CONFIG_CONTENT
            Value: !Sub |
              extensions:
                health_check:
              receivers:
                statsd:
                  endpoint: 0.0.0.0:8125
                  aggregation_interval: 60s
              processors:
                batch/metrics:
                  timeout: 60s
                filter:
                  metrics:
                    include:
                      match_type: regexp
                      metric_names:
                        - ^envoy\.appmesh\..+$
              exporters:
                awsemf:
                  namespace: ${ProjectName}
                  log_group_name: ${LogGroup}
                  dimension_rollup_option: NoDimensionRollup
              service:
                pipelines:
                  metrics:
                    receivers: [statsd]
                    processors: [batch/metrics, filter]
                    exporters: [awsemf]
                extensions: [health_check]
        # (...)
```

While the OpenTelemetry collector also supports scraping Prometheus metrics from Envoy, you can continue to use StatsD metrics and filter using a `filter` processor. As in the above sections, the filter processor will configure the OpenTelemetry collector to send only metrics that start with `envoy.appmesh.` to CloudWatch.
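The pattern used by the `filter` processor can be previewed locally as well. This is a rough sketch rather than the collector's own behavior: `grep -E` stands in for the collector's regular expression engine, `otel_filter` is a hypothetical helper, and the sample names are illustrative only.

```sh
# Hypothetical helper: approximate the filter processor's include pattern.
otel_filter() {
  grep -E '^envoy\.appmesh\..+$'
}

# Three illustrative names: dotted extension metric, underscore variant,
# and a built-in Envoy metric.
printf '%s\n' \
  'envoy.appmesh.TargetResponseTime' \
  'envoy_appmesh_TargetResponseTime' \
  'envoy.cluster.upstream_cx_total' | otel_filter
```

Because the dots in the pattern are escaped, only the dotted `envoy.appmesh.` form matches. The underscore form used in the Prometheus-based sections would need a different pattern.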
> To learn more about OpenTelemetry processors see: https://opentelemetry.io/docs/collector/configuration/#processors

> To learn more about the CloudWatch EMF exporter see: https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/exporter/awsemfexporter/README.md

To deploy `djapp-v5` using the OpenTelemetry collector run:

```sh
SKIP_IMAGES=1 ./deploy.sh deploy djapp-v5
```

Using the same loop as above, generate some traffic so the new metrics will appear in CloudWatch:

```sh
while true; do curl $PUBLIC_ENDPOINT; sleep 1; echo; done
```

You can view the `TargetResponseTime` metric for calls from `djapp` to `metal.howto-metrics-extension.local`:

```sh
./metric.sh -t VirtualNode -n djapp -m envoy.appmesh.TargetResponseTime -x p50 -q "TargetVirtualService=metal.howto-metrics-extension.local"
```

## (Optional) Using the Metrics Extension to Simplify Creating CloudWatch Dashboards

The addition of the `TargetVirtualService` dimension makes creating rolled-up dashboards for different target services in a mesh much simpler. If you ran the `metric.sh` script you likely would have seen the CloudWatch metric widgets it was generating. For example:

```json
{
  "width": 1200,
  "height": 600,
  "period": 60,
  "start": "-PT15M",
  "end": "PT0H",
  "title": "djapp - envoy.appmesh.TargetResponseTime p50",
  "metrics": [
    [
      {
        "expression": "SEARCH('Namespace=\"howto-metrics-extension\" Mesh=\"howto-metrics-extension\" VirtualNode=\"djapp\" MetricName=\"envoy.appmesh.TargetResponseTime\" TargetVirtualService=metal.howto-metrics-extension.local', 'p50', 60)"
      }
    ]
  ]
}
```

By combining `SEARCH` expressions with different dimensions, you could create a dashboard for all of the metrics extension metrics for every service in a mesh.

In this final section you will deploy three CloudWatch dashboards:

1. `HowtoMetricsExtensionGateway` - The dashboard for the virtual gateway named `gateway`.
1. `HowtoMetricsExtensionDJApp` - The dashboard for the `djapp` virtual node.
1. `HowtoMetricsExtensionMetal` - The dashboard for the `metal-v1` and `metal-v2` virtual nodes.

The dashboards will be auto-generated using a custom CloudFormation resource that will describe a mesh and compute a CloudWatch dashboard for a given virtual gateway or a list of virtual nodes. The list of virtual nodes would be useful for creating a dashboard for all versioned deployments of a single service.

The CloudFormation template for each dashboard is shared under `deploy/dashboard-v1.yaml`. The dashboard generator Lambda Function is located under `src/dashboard-v1` and should already be in your account as an ECR image if you have deployed any of the `djapp` stacks above.

Deploy the dashboards using separate calls to `dashboard.sh`.

*Note: You can run `./dashboard.sh -h` for help on using the script.*

The CloudFormation stack `howto-metrics-extension-GatewayDashboard`:

```sh
./dashboard.sh \
    -c deploy \
    -s howto-metrics-extension-GatewayDashboard \
    -q howto-metrics-extension \
    -d HowToMetricsExtensionGateway \
    -i $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/howto-metrics-extension/dashboard-v1:latest \
    -m howto-metrics-extension \
    -g gateway
```

The CloudFormation stack `howto-metrics-extension-DJAppDashboard`:

```sh
./dashboard.sh \
    -c deploy \
    -s howto-metrics-extension-DJAppDashboard \
    -q howto-metrics-extension \
    -d HowToMetricsExtensionDJApp \
    -i $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/howto-metrics-extension/dashboard-v1:latest \
    -m howto-metrics-extension \
    -n djapp
```

The CloudFormation stack `howto-metrics-extension-MetalDashboard`:

```sh
./dashboard.sh \
    -c deploy \
    -s howto-metrics-extension-MetalDashboard \
    -q howto-metrics-extension \
    -d HowToMetricsExtensionMetal \
    -i $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/howto-metrics-extension/dashboard-v1:latest \
    -m howto-metrics-extension \
    -n metal-v1,metal-v2
```

![Generated dashboard in CloudWatch for gateway](./images/gateway-generated-dashboard.png)

Each dashboard deployed above should demonstrate the metrics extension metrics under different conditions. For example, the `gateway` virtual gateway only routes traffic to a single virtual service. The `djapp` virtual node demonstrates multiple backends with the `TargetVirtualService` dimension roll-up. And the combined dashboard for `metal-v1` and `metal-v2` shows the metrics available when your application has no backends.

To clean up the three dashboards you created, use the `dashboard.sh` script again:

```sh
./dashboard.sh -c delete -s howto-metrics-extension-GatewayDashboard
```

```sh
./dashboard.sh -c delete -s howto-metrics-extension-DJAppDashboard
```

```sh
./dashboard.sh -c delete -s howto-metrics-extension-MetalDashboard
```

## Cleaning Up

If you used the `dashboard.sh` script to generate any additional dashboards you can use the same script to delete them. All it does is delete the CloudFormation stack associated with the dashboard. For example, if you created your own `HowtoMetricsExtensionJazz` dashboard in the section above run:

```sh
./dashboard.sh -c delete -s howto-metrics-extension-JazzDashboard
```

To remove everything else you created during the walkthrough, use the deploy script:

```sh
./deploy.sh delete
```