# Aggregation-query-proxy There are use-cases when you need to aggregate bounded result sets for a short time period from DynamoDB or Keyspaces, for example, an hourly or daily report, including all hourly or daily sales. However, Amazon DynamoDB and Keyspaces do not support the commonly seen SQL aggregation constructs such as COUNT, SUM, MIN, MAX, and GROUP BY, as aggregation queries with unbound number of partitions might take unpredictable time to execute. Because of this constraint, it is better to preprocess the operational data to do the aggregation, and storage of the processed data in Amazon DynamoDB/Keyspaces. This pattern provides a solution by placing a scalable aggregation proxy (sidecar) between your application and DynamoDB/Keyspaces. The aggregation-query-proxy (AQP) consists of a scalable proxy layer that sits between your application and Amazon Keyspaces/DynamoDB. It provides intermediate aggregation logic which allows existing application to execute aggregation queries against Amazon DynamoDB/Keyspaces. The AQP converts the provided aggregation query (SQL-92) to a plain request (CQL/DDBPartiQL). After the plain response (json) has been received the AQP uses IonEngine to aggregate the plain response into the final result set in json format.  Create your yaml based on the template: `cp conf/keyspaces-aggregation-query-proxy.yaml.template conf/keyspaces-aggregation-query-proxy.yaml` ### Configure and build the app with Amazon Keyspaces Set ```dataBaseName``` to ```KEYSPACES``` Set ```pathToKeyspacesConfigFile``` to ```/usr/app``` #### Configure DataStax conf file Prepare DataStax java driver conf file ### Build the project `mvn install` `build.sh <AWS_ACCOUNT> <REGION>` ### Start a docker container with the app ```docker run -it -p 8080:8080 simple-aggregation-query-app``` ### Limitations As a best practice we recommend executing bounded Amazon Keyspaces (CQL) or DynamoDB (PartiQL) requests against the Aggregation Query Proxy. In all cases, avoid unbounded aggregations queries (without WHERE clause). Unbounded aggregation queries might lead to unpredictable execution time, high JVM memory pressure on AQP nodes (OOM), or high Amazon DynamoDB/Keyspaces RCUs consumption. ## Configure and build the app with Amazon DynamoDB Set ```dataBaseName``` to ```DYNAMODB``` Set ```dynamoRegion``` to ```us-east-1``` ### Build the project `mvn install` `build.sh <AWS_ACCOUNT> <REGION>` ### Start a docker container with the app ```docker run -it -p 8080:8080 --env AWS_REGION="us-east-1" --env AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY_ID --env AWS_SECRET_ACCESS_KEY=$AWS_SECRET_ACCESS_KEY --env AWS_SESSION_TOKEN=$AWS_SESSION_TOKEN simple-aggregation-query-app``` ### Let's execute a simple aggregation query against Amazon Keyspaces `AUTH_BASIC=$(echo -n large-query-app:your_secret | base64)` `http --follow --timeout 3600 GET 'http://0.0.0.0:8080/query-aggregation/select count(book_title) as books, award, avg(rank) as avg_rang from keyspaces_sample.keyspaces_sample_table GROUP BY award' Authorization:'Basic '$(AUTH_BASIC)` `HTTP/1.1 200 OK` `Cache-Control: no-transform, max-age=60` `Content-Encoding: gzip` `Content-Length: 195` `Content-Type: application/json` `Date: Wed, 04 May 2022 01:59:16 GMT` `Vary: Accept-Encoding` ```json { "response": [{"resultSet":[{"books":3,"award":"Kwesi Manu Prize","avg_rang":2e0}, {"books":3,"award":"Richard Roe","avg_rang":2e0}, {"books":3,"award":"Wolf","avg_rang":2e0}]}], "stats": { "elapsedTimeToAggregateDataInMs": 361, "elapsedTimeToRetrieveDataInMs": 120, "payloadSizeBytes": 626 } } ``` ### Let's execute a simple aggregation query against Amazon DynamoDB `http --auth-type basic -a large-query-app:secretEXAMPLE --follow --timeout 3600 GET 'http://0.0.0.0:8080/query-aggregation/select zipcode,pk,sum(amount) as total from "your_table" where pk in (%27ACCOUNT%23ACCOUNT40%23CUSTOMER%23CUSTOMER33%27, %27ACCOUNT%23ACCOUNT1%23CUSTOMER%23CUSTOMER61%27) group by zipcode, pk'` `HTTP/1.1 200 OK` `Cache-Control: no-transform, max-age=60` `Content-Encoding: gzip` `Content-Length: 229` `Content-Type: application/json` `Date: Tue, 14 Jun 2022 14:57:15 GMT` `Vary: Accept-Encoding` ```json { "response": [ { "resultSet": [ { "pk": "ACCOUNT#ACCOUNT1#CUSTOMER#CUSTOMER61", "total": 133.9894140356888, "zipcode": 74545 }, { "pk": "ACCOUNT#ACCOUNT40#CUSTOMER#CUSTOMER33", "total": 4321.055836431855, "zipcode": 56624 } ] } ], "stats": { "elapsedTimeToAggregateDataInMs": 1210, "elapsedTimeToRetrieveDataInMs": 1346, "payloadSizeBytes": 194 } } ``` ### Local DynamoDB for JUnit test Set ```localDDB``` to ```true``` to run all JUnit tests against DynamoDB ## License This project is licensed under the MIT-0