# Application Overview This section aims to provide a high-level view of what components make up a self-hosted ConstructHub instance, and troubleshooting guidance where appropriate. ## High-Level Architecture The frontend of ConstructHub instances is a single-page web-app (developed in the [cdklabs/construct-hub-webapp] repository), served from an S3 bucket by a CloudFront Web Distribution. The CloudFront Web Distribution also serves objects from a second S3 bucket that is used by the backend application to store indexed package data that is presented by the fronted. The backend of ConstructHub instances is an event-driven, serverless application that performes the following tasks: 1. A *package source* implementation notifes the ConstructHub instance of new packages by sending messages to an ingestion SQS queue. 1. The *ingestion* function is triggered by the ingestion SQS queue, and verifies the new package version is compliant, before starting up the backend workflow. 1. The *backend workflow* is a StepFunctions State Machine that orchestrates the necessary steps to fully index a package into a ConstructHub instance. 1. The *prune* function enforces the configured deny-list and ensures previously indexed packages that are now part of the deny-list are removed from storage within an hour. 1. The *discovery canary*, if configured, is a Lambda function the will periodically validate the hub is able to discover new packages within a predefined SLA. [cdklabs/construct-hub-webapp]: https://github.com/cdklabs/construct-hub-webapp ### 1. Package Sources ConstructHub provides two package source implementations: `NpmJs` and `CodeArtifact`. * The `NpmJs` source interfaces with the `npmjs.com` CouchDB replica (which is at `replicate.npmjs.com/registry`) by following it's `_changes` stream in search of relevant packages. When such a package is identified, a stager function is invoked, which stages the package tarball into an S3 bucket then notifies the ConstructHub ingestion SQS queue. The CouchDB follower is scheduled to run every `5 minutes`, and stores the current CouchDB sequence ID in a specific object in the S3 bucket used for staging package tarballs. > Back-filling is automatic for the `NpmJs` source. Upon initial deployment, > it will start scanning the CouchDB `_changes` stream. Should there be a need > to re-run a backfill of this source, the transaction marker object in S3 can > be deleted to roll back to that initial transaction. The marker object is > linked from the backend dashboard. - A **high-severity** alarm triggers if the NpmJs Follower is not running at the scheduled cadence, or if it encounters failures for more than `15 minutes`. - A **high-severity** alarm triggers if the NpmJs Stager dead-letter queue is not empty. - Troubleshooting the NpmJs Follower can be done by inspecting its log traces in CloudWatch Logs, or by looking at service maps in the X-Ray console. - The NpmJs Follower produces a set of metrics that are automatically inserted in the *Backend Dashboard*, including the following: + `NpmJsChangeAge` shows how far behind the public `npmjs.com` registry the current CouchDB sequence ID is. In steady state (once the initial backfill has completed), this metric should always be below `5 minutes`. + `PackageVersionAge` is the amount of time elapsed between the publication of a package version in the public `npmjs.com` registry, and when that was signalled to the ingestion SQS queue. In steady state, this metric should always be below `5 minutes`. + `UnprocessableEntity` is the count of events received from the CouchDB instance that could not be processed. This metric is not emitted if no event was found unprocessable. The CloudWatch Logs for the NpmJs Follower will contain additional information about those events. * The `CodeArtifact` source leverages EventBridge events emitted by any CodeArtifact repository when packages it contains are modified (created, updated, deleted). It considers only events pertaining to `npm` packages published the specific CodeArtifact Repository that it is configured with. A Lambda Function verifies the package version from the event is eligible for ConstructHub (i.e: it is a `jsii` package, using an allowed license, etc...) before staging it in an S3 bucket, then notifying the ingestion SQS Queue. > No backfill provision is currently implemented for the `CodeArtifact` > source. If a ConstructHub instance is started off from a pre-existing > CodeArtifact repository, the operator should manually inject all relevant > packages from said repository into the ingestion queue. > > :construction: A managed back-fill procedure will be provided in the future. - A **high-severity** alarm triggers if the CodeArtifact Forwarder function encounters failures. - Troubleshooting the CodeArtifact Forwarder can be done by inspecting its log traces in CloudWatch Logs, or by looking at service maps in the X-Ray console. * Third party package-sources can also be used. Please refer to these sources' documentation for monitoring & troubleshooting guidance. ### 2. Ingestion The *ingestion* process is implemented by a Lambda Function triggered directly from the ingestion SQS queue. It performs the following steps: 1. Download the tarball from the S3 location indicated in the ingestion payload 1. Validate the input payload using the `integrity` checksum 1. Validate that it is eligible for indexing: - It contains a `.jsii` assembly document that is valid - It is released under an allowed license - Essential `.jsii` assembly corresponds to the `package.json` document - The package name must be identical - The package version must be identical - The license must be identical - The package version is not listed in the configured deny list 1. Attempt to identify a `LICENSE` file bundled in the package 1. Uploads the tarball to the package data S3 bucket 1. Creates the `manifest.json` object in the package data S3 bucket, containing: - The contents of the `LICENSE` file (if one was found) - The publication timestamp for the package version 1. Uploads the `.jsii` assembly to the package data S3 bucket as `assembly.json` 1. Triggers the *Backend Workflow* for the package version A **high-severity** alarm triggers if the *ingestion* function encounters failures, or if the ingestion SQS queue has messages older than `10 minutes` approximately. If the *ingestion* function fails for a particular queue message more than `5` times, that message will be moved into a dead-letter queue. A **high-severity** alarm triggers when the dead-letter queue is not empty. Troubleshooting the *ingestion* function can be done by inspecting its log traces in CloudWatch Logs, or by looking at service maps in the X-Ray console. The function also produces several CloudWatch metrics that are visible in the *Backend Dashboard*, including: - `InvalidTarball` is the count of package versions that were rejected due to having an invalid tarball (missing the `package.json` file or `.jsii` assembly). - `InvalidAssembly` is the count of package versions that were rejected due to containing an invalid `.jsii` assembly (in most cases, these are old packages that were built using a pre-1.0 release of `jsii` that is no longer supported) - `MismatchedIdentityRejections` is the count of package versions that were rejected due to differences between data in the `.jsii` assembly and `package.json` files. - `IneligibleLicense` is the count of package versions that were rejected due to using a license that is not in the configured license allow-list. - `FoundLicenseFile` is the count of ingested package versions for which a `LICENSE` file could be identified. ### 3. Backend Workflow The *Backend Workflow* is a StepFunctions State Machine that performs the following tasks: 1. Execute the documentation rendering process for each supported language (TypeScript, Python, ...) 1. If any documentation could be rendered, adds the package version to the `catalog.json` object (which is a no-op if the package version is not the *latest* known release of it's major line) When any step of the State Machine fails, a message is sent to a dead-letter queue. That message includes information about the failure that happened (in case multiple failures happened, only one cause will be represented), and information about the State Machine execution (which can be used to review the full execution log in the AWS Console, or using the StepFunctions API). Executions that successfully sent a message to the dead-letter queue will show as "success". Conversely, "failed" executions may not have a corresponding message in the dead-letter queue and must be troubleshooted starting from the failed execution instead. A **high-severity** alarm trigegrs if the State Machine dead-letter queue is not empty, or if any execution fails. Troubleshooting can be done by reviewing State Machine execution events in the StepFunctions console (or using the StepFunctions API), reviewing the log traces of each step in CloudWAtch Logs, or by looking service maps in the X-Ray console. Messages from the dead-letter queue can be fed back to the State Machine by using the "Redrive DLQ" Lambda Function, that is linked from the *Backend Dashboard*. ### 4. Deny List Processes Each ConstructHub instance can be configured with an optional set of deny-list rules, to prevent packgaes from being indexed in that instance. If a package was already indexed at the time it is added to the deny-list, all indexed assets for it will be deleted by a *prune* Lambda Function. A **high-severity** alarm triggers if the *prune* function does not run at the configured cadence, or if it encounters failures. Troubleshooting can be done by inspecting the log traces it produces in CloudWatch Logs, or by looking at service maps in the X-Ray console. The *prune* function emits a `Rules` CloudWatch Metric that indicates how many deny-list rules it is currently enforcing. This could match the amount of rules that were configured on the ConstructHub instance. ### 5. Discovery Canary The [discovery canary](../README.md#discovery-canary) is an optional canary that can be configured as part of the hub's deployment NpmJs package source. Its job is to continuously validate that the hub is able to discover and process packages in a timely manner. The canary monitors the availability of new versions of a designated package in the ConstructHub instance (by default, this is `construct-hub-probe`), and emits metrics that help understand how much time elapses between the package publication to npmjs.com (`construct-hub-probe` gets a new version approximately every 3 hours), and when those packages are available to browse in ConstructHub. A **high-severity** alarm triggers if the canary function is either malfunctioning or detects discovery SLA breaches. Troubleshooting these alarms is described in the operator runbook. - [`ConstructHub/Sources/NpmJs/Canary/SLA-Breached`](./operator-runbook.md#constructhubsourcesnpmjscanarysla-breached) ### 6. Feed Generation Construct hub generates RSS/ATOM feed when the package catalog gets updated. The feed generator looks at the latest 100 packages and generates the feed. If the construct hub is configured to generate release notes, then the generated feed will contain the change log for the packages where it can be generated. #### 6. Release Notes Fetcher Construct hub can be configured to generate release notes for the packages that are added to the catalog. The release notes are generated from Github when the release information is available. The generation of release notes looks for the following places in Github - Get the release notes from individual release from Github - Get the list of all the releases from Github and then match the release number - Get the changelog.md file and match the release number to generate the release notes Github APIs are rate-limited and for the construct hub to generate the release notes, it has to be configured with Github [Personal Access token](https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/creating-a-personal-access-token). The release notes generation process uses a step function to ensure that the rate limits are respected and will back up the request when the API service limits are hit ## Monitoring & Alarming Each ConstructHub instance comes with a set of CloudWatch dashboards that can be used to monitor the current state of the instance. The name of the backend dashboard can be configured using the `backendDashboardName` property of the `ConstructHub` construct: ```ts import { App, Stack } from '@aws-cdk/core'; import { ConstructHub } from 'construct-hub'; // The usual... you might have used `cdk init app` instead! const app = new App(); const stack = new Stack(app, 'StackName', { /* ... */ }); // Now to business! new ConstructHub(stack, 'ConstructHub', { backendDashboardName: 'ConstructHub-Backend' }); ``` This dashboard provides an overview of the most important process of the ConstructHub instance, and can provide insight into the cause of many problem. In addition to this, several alarms are automatically created by the `ConstructHub` construct, that aim to inform operators about any problem. By default no actions are configured on these alarms, but the `alarmActions` property can be used to specify `IAlarmAction` instances to be bound to each alarm: ```ts import { SnsAction } from '@aws-cdk/aws-cloudwatch-actions'; import { Topic } from '@aws-cdk/aws-sns'; import { App, Stack } from '@aws-cdk/core'; import { ConstructHub } from 'construct-hub'; // The usual... you might have used `cdk init app` instead! const app = new App(); const stack = new Stack(app, 'StackName', { /* ... */ }); // Now to business! const emergencyTopic = new Topic(stack, 'Emergencies', { /* ... */ }); const informationTopic = new Topic(stack, 'Information', { /* ... */ }); new ConstructHub(stack, 'ConstructHub', { alarmActions: { // This action triggers when immediate attention is needed! highSeverityAction: new SnsAction(emergencyTopic), // This action triggers with less urgent alarms. normalSeverityAction: new SnsAction(informationTopic), }, }); ```