AWSTemplateFormatVersion: '2010-09-09'
Transform: 'AWS::Serverless-2016-10-31'
Description: 'AWS CloudFormation template for the Exploring Images on Social Media using Amazon Rekognition'
Metadata:
  AWS::CloudFormation::Interface:
    ParameterGroups:
      - Label:
          default: Application Settings
        Parameters:
          - pApplicationName
      - Label:
          default: Database Settings
        Parameters:
          - pDatabaseName
      - Label:
          default: Twitter Settings
        Parameters:
          - pTwitterTermList
          - pTwitterLanguages
          - pTwitterAuthConsumerKey
          - pTwitterAuthConsumerSecret
          - pTwitterAuthToken
          - pTwitterAuthTokenSecret
      - Label:
          default: VPC and Instance Settings
        Parameters:
          - pInstanceKeyName
          - pVpcCIDR
          - pPublicSubnet1CIDR
          - pPublicSubnet1IngressCIDR
      - Label:
          default: Lambda Settings
        Parameters:
          - pLambdaS3Bucket
          - pAnalyzeTweetsLambdaS3Key
          - pAddTriggerLambdaS3Key
    ParameterLabels:
      pInstanceKeyName:
        default: Instance Key
      pTwitterTermList:
        default: Twitter Term List
      pTwitterLanguages:
        default: Twitter Languages
      pTwitterAuthConsumerKey:
        default: Twitter Auth Consumer Key
      pTwitterAuthConsumerSecret:
        default: Twitter Auth Consumer Secret
      pTwitterAuthToken:
        default: Twitter Auth Token
      pTwitterAuthTokenSecret:
        default: Twitter Auth Token Secret
      pApplicationName:
        default: Application Name
      pVpcCIDR:
        default: VPC CIDR
      pPublicSubnet1CIDR:
        default: Public Subnet1 CIDR
      pPublicSubnet1IngressCIDR:
        default: Management IP Address
      pLambdaS3Bucket:
        default: Lambda S3 Bucket
      pAnalyzeTweetsLambdaS3Key:
        default: S3 Key of Analyze Tweets Lambda ZIP file
      pAddTriggerLambdaS3Key:
        default: S3 Key of Add Trigger Lambda ZIP file
      pDatabaseName:
        default: Database Name
Mappings:
  # This is just the Amazon Linux AMI:
  AmazonLinuxAMI:
    us-east-1: # Virginia
      AMI: ami-a4c7edb2
    us-east-2: # Ohio
      AMI: ami-8a7859ef
    us-west-1: # North California
      AMI: ami-327f5352
    us-west-2: # Oregon
      AMI: ami-6df1e514
    eu-west-1: # Ireland
      AMI: ami-d7b9a2b1
    eu-west-2: # London
      AMI: ami-ed100689
    eu-central-1: # Frankfurt
      AMI: ami-82be18ed
    sa-east-1: # Sao Paulo
      AMI: ami-87dab1eb
    ap-southeast-1: # Singapore
      AMI: ami-77af2014
    ap-southeast-2: # Sydney
      AMI: ami-10918173
    ap-northeast-1: # Tokyo
      AMI: ami-3bd3c45c
    ap-northeast-2: # Seoul
      AMI: ami-e21cc38c
    ca-central-1: # Canada
      AMI: ami-a7aa15c3
    ap-south-1: # Mumbai
      AMI: ami-47205e28
Parameters:
  pInstanceKeyName:
    Type: AWS::EC2::KeyPair::KeyName
    Description: "The name of the EC2 key pair to use for SSH access."
  pTwitterTermList:
    Description: List of terms for Twitter to listen for
    Type: String
    Default: "'cloud', 'computing'"
  pTwitterLanguages:
    Description: List of languages to use for the Twitter streaming reader
    Type: String
    Default: "'en'"
  pTwitterAuthConsumerKey:
    Description: Consumer key for accessing Twitter
    Type: String
  pTwitterAuthConsumerSecret:
    Description: Consumer secret for accessing Twitter
    Type: String
  pTwitterAuthToken:
    Description: Access token for calling Twitter
    Type: String
  pTwitterAuthTokenSecret:
    Description: Access token secret for calling Twitter
    Type: String
  pApplicationName:
    Description: Name of the application deployed for the Exploring Images on Social Media using Amazon Rekognition blog post
    Type: String
    Default: SocialMediaImageAnalytics
  pVpcCIDR:
    Description: Please enter the IP range (CIDR notation) for this VPC
    Type: String
    Default:
  pPublicSubnet1CIDR:
    Description: Please enter the IP range (CIDR notation) for the public subnet in the first Availability Zone
    Type: String
    Default:
  pLambdaS3Bucket:
    Description: Please enter the name of the S3 bucket containing the Lambda function ZIP file
    Type: String
    Default: aws-bigdata-blog
  pAnalyzeTweetsLambdaS3Key:
    Description: Please enter the S3 key of the Analyze Tweets Lambda function ZIP file
    Type: String
    Default: artifacts/exploring-images-on-social-media/
  pAddTriggerLambdaS3Key:
    Description: Please enter the S3 key of the Add Trigger Lambda function ZIP file
    Type: String
    Default: artifacts/exploring-images-on-social-media/
  pDatabaseName:
    Description: Name of the AWS Glue database containing tables for analyzing tweets
    Type: String
    Default: socialanalyticsblog
  pPublicSubnet1IngressCIDR:
    Description: Please enter your computer's IP address (CIDR notation) for SSH access to the public subnet
    Type: String
    Default: x.x.x.x/32
Resources:
  rAuthConsumerSecretManagerSecret:
    Type: AWS::SecretsManager::Secret
    Properties:
      Name: AuthConsumerSecretManagerSecret
      SecretString: !Ref pTwitterAuthConsumerSecret
  rAuthAccessTokenSecretManagerSecret:
    Type: AWS::SecretsManager::Secret
    Properties:
      Name: AuthAccessTokenSecretManagerSecret
      SecretString: !Ref pTwitterAuthTokenSecret
  rAuthConsumerManagerSecret:
    Type: AWS::SecretsManager::Secret
    Properties:
      Name: AuthConsumerManagerSecret
      SecretString: !Ref pTwitterAuthConsumerKey
  rAuthAccessTokenManagerSecret:
    Type: AWS::SecretsManager::Secret
    Properties:
      Name: AuthAccessTokenManagerSecret
      SecretString: !Ref pTwitterAuthToken
  rTweetsEC2SecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Security Group for EC2 Twitter Reader
      VpcId: !Ref rVPC
      Tags:
        - Key: Name
          Value: rTweetsEC2SecurityGroup
        - Key: ResourceGroup
          Value: CloudFormationResource
        - Key: Project
          Value: !Join ['-', [!Ref 'pApplicationName', !Ref 'AWS::Region']]
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: '22'
          ToPort: '22'
          CidrIp: !Ref pPublicSubnet1IngressCIDR
  rSocialMediaImageAnalyticsEC2Role:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service:
            Action: sts:AssumeRole
      Path: '/'
      Policies:
        - PolicyName: socialmedia-image-analytics-policy
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: Allow
                Action:
                  # - 'firehose:DeleteDeliveryStream'
                  - 'firehose:PutRecord'
                  - 'firehose:PutRecordBatch'
                  - 'firehose:UpdateDestination'
                Resource:
                  # - !Join ['', ['arn:aws:firehose:::deliverystream/', !Ref rIngestionFirehoseStream ]]
                  - !GetAtt rIngestionFirehoseStream.Arn
                  - !GetAtt rEntitiesFirehoseStream.Arn
                  - !GetAtt rSentimentFirehoseStream.Arn
                  - !GetAtt rRekognitionFirehoseStream.Arn
              - Effect: Allow
                Action:
                  - 'secretsmanager:GetSecretValue'
                Resource:
                  - !Ref rAuthConsumerSecretManagerSecret
                  - !Ref rAuthAccessTokenSecretManagerSecret
                  - !Ref rAuthConsumerManagerSecret
                  - !Ref rAuthAccessTokenManagerSecret
  rVPC:
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: !Ref pVpcCIDR
      EnableDnsSupport: true
      EnableDnsHostnames: true
      Tags:
        - Key: Name
          Value: !Ref pApplicationName
  rInternetGateway:
    Type: AWS::EC2::InternetGateway
    Properties:
      Tags:
        - Key: Name
          Value: !Ref pApplicationName
  rInternetGatewayAttachment:
    Type: AWS::EC2::VPCGatewayAttachment
    Properties:
      InternetGatewayId: !Ref rInternetGateway
      VpcId: !Ref rVPC
  rPublicSubnet1:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref rVPC
      AvailabilityZone: !Select [ 0, !GetAZs ]
      CidrBlock: !Ref pPublicSubnet1CIDR
      MapPublicIpOnLaunch: true
      Tags:
        - Key: Name
          Value: !Sub ${pApplicationName} Public Subnet (AZ1)
  rPublicRouteTable:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref rVPC
      Tags:
        - Key: Name
          Value: !Sub ${pApplicationName} Public Routes
  rDefaultPublicRoute:
    Type: AWS::EC2::Route
    DependsOn: rInternetGatewayAttachment
    Properties:
      RouteTableId: !Ref rPublicRouteTable
      DestinationCidrBlock:
      GatewayId: !Ref rInternetGateway
  rPublicSubnet1RouteTableAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      RouteTableId: !Ref rPublicRouteTable
      SubnetId: !Ref rPublicSubnet1
  rTweetsBucket:
    Type: AWS::S3::Bucket
  rTwitterStreamingInstanceProfile:
    Type: AWS::IAM::InstanceProfile
    Properties:
      Path: '/'
      Roles:
        - !Ref rSocialMediaImageAnalyticsEC2Role
  rTwitterStreamingReaderServer:
    Type: AWS::EC2::Instance
    Properties:
      ImageId: !FindInMap [ AmazonLinuxAMI, !Ref 'AWS::Region', AMI] # Lookup the AMI in the region map
      InstanceType: t2.medium # Any size is fine
      KeyName: !Ref pInstanceKeyName # Use the keypair from the input parameters
      SecurityGroupIds:
        - !Ref rTweetsEC2SecurityGroup
      IamInstanceProfile: !Ref rTwitterStreamingInstanceProfile
      SubnetId: !Ref rPublicSubnet1
      Tags:
        - Key: Name
          Value: !Join ['-', [!Ref 'pApplicationName', 'DeployGroup', !Ref 'AWS::Region']]
        - Key: Project
          Value: !Join ['-', [!Ref 'pApplicationName', !Ref 'AWS::Region']]
      UserData:
        Fn::Base64:
          Fn::Sub:
            - |
              #!/bin/bash -ex
              set -e

              sleep 60

              yum clean all

              yum update -y

              yum -y install nodejs npm --enablerepo=epel

              npm config set registry
              npm install node-sass request@2.81.0

              echo "var twitter_config = module.exports = {
              twitter: {
                  consumer_key: '`aws secretsmanager get-secret-value --secret-id AuthConsumerManagerSecret --query SecretString --output text --region ${Region}`',
                  consumer_secret: '`aws secretsmanager get-secret-value --secret-id AuthConsumerSecretManagerSecret --query SecretString --output text --region ${Region}`',
                  access_token: '`aws secretsmanager get-secret-value --secret-id AuthAccessTokenManagerSecret --query SecretString --output text --region ${Region}`',
                  access_token_secret: '`aws secretsmanager get-secret-value --secret-id AuthAccessTokenSecretManagerSecret --query SecretString --output text --region ${Region}`'
                },
              topics: [${TwitterTerms}],
              languages: [${pTwitterLanguages}],
              kinesis_delivery: '${KinesisIngestionFirehose}'
              }" > /home/ec2-user/twitter_reader_config.js

              wget
              tar --warning=no-unknown-keyword -xf SocialAnalyticsReader.tar -C /home/ec2-user/

              echo "Service started."
            - TwitterTerms: !Ref pTwitterTermList
              pTwitterLanguages: !Ref pTwitterLanguages
              Region: !Ref 'AWS::Region'
              KinesisIngestionFirehose: !Ref rIngestionFirehoseStream
  rIngestionFirehoseStream:
    Type: AWS::KinesisFirehose::DeliveryStream
    Properties:
      ExtendedS3DestinationConfiguration:
        BucketARN: !Join ['', ['arn:aws:s3:::', !Ref rTweetsBucket]]
        BufferingHints:
          IntervalInSeconds: 60
          SizeInMBs: 5
        Prefix: 'raw/'
        CompressionFormat: 'UNCOMPRESSED'
        RoleARN: !GetAtt rIngestionFireHoseRole.Arn
  rEntitiesFirehoseStream:
    Type: AWS::KinesisFirehose::DeliveryStream
    Properties:
      ExtendedS3DestinationConfiguration:
        BucketARN: !Join ['', ['arn:aws:s3:::', !Ref rTweetsBucket]]
        BufferingHints:
          IntervalInSeconds: 60
          SizeInMBs: 5
        Prefix: 'entities/'
        CompressionFormat: 'UNCOMPRESSED'
        RoleARN: !GetAtt rIngestionFireHoseRole.Arn
  rSentimentFirehoseStream:
    Type: AWS::KinesisFirehose::DeliveryStream
    Properties:
      ExtendedS3DestinationConfiguration:
        BucketARN: !Join ['', ['arn:aws:s3:::', !Ref rTweetsBucket]]
        BufferingHints:
          IntervalInSeconds: 60
          SizeInMBs: 5
        Prefix: 'sentiment/'
        CompressionFormat: 'UNCOMPRESSED'
        RoleARN: !GetAtt rIngestionFireHoseRole.Arn
  rRekognitionFirehoseStream:
    Type: AWS::KinesisFirehose::DeliveryStream
    Properties:
      ExtendedS3DestinationConfiguration:
        BucketARN: !Join ['', ['arn:aws:s3:::', !Ref rTweetsBucket]]
        BufferingHints:
          IntervalInSeconds: 60
          SizeInMBs: 5
        Prefix: 'media/'
        CompressionFormat: 'UNCOMPRESSED'
        RoleARN: !GetAtt rIngestionFireHoseRole.Arn
  rIngestionFireHoseRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service:
            Action: sts:AssumeRole
            Condition:
              StringEquals:
                sts:ExternalId: !Ref 'AWS::AccountId'
  rIngestionFirehosePolicy:
    Type: AWS::IAM::Policy
    Properties:
      PolicyName: TweetIngestionFirehosePolicy
      Roles:
        - !Ref rIngestionFireHoseRole
      PolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Action:
              - s3:AbortMultipartUpload
              - s3:GetBucketLocation
              - s3:GetObject
              - s3:ListBucket
              - s3:ListBucketMultipartUploads
              - s3:PutObject
            Resource:
              - !Join ['', ['arn:aws:s3:::', !Ref rTweetsBucket]]
              - !Join ['', ['arn:aws:s3:::', !Ref rTweetsBucket, '/*']]
          - Effect: Allow
            Action:
              - logs:CreateLogGroup
              - logs:CreateLogStream
              - logs:PutLogEvents
              - logs:DescribeLogStreams
            Resource:
              - arn:aws:logs:*:*:*
  rSocialMediaImageAnalyticsLambdaFunctionRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service: []
            Action: ['sts:AssumeRole']
      Path: /
      Policies:
        - PolicyName: SocialMediaImageAnalyticsLambdaFunctionExecutionPolicy
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Sid: CloudWatchAccess
                Effect: Allow
                Action: ['logs:CreateLogGroup', 'logs:CreateLogStream', 'logs:PutLogEvents']
                Resource: arn:aws:logs:*:*:*
              - Sid: S3Access
                Effect: Allow
                Action: ['s3:GetObject', 's3:ListBucket', 's3:PutObject']
                Resource:
                  - !GetAtt [rTweetsBucket, Arn]
                  - !Join ['', [!GetAtt [rTweetsBucket, Arn], '/*']]
              - Sid: FirehoseAccess
                Effect: Allow
                Action: ['firehose:ListDeliveryStreams', 'firehose:PutRecord', 'firehose:PutRecordBatch']
                Resource:
                  - !GetAtt [rSentimentFirehoseStream, Arn]
                  - !GetAtt [rEntitiesFirehoseStream, Arn]
                  - !GetAtt [rRekognitionFirehoseStream, Arn]
              - Sid: ComprehendAccess
                Effect: Allow
                Action: ['comprehend:DetectEntities', 'comprehend:DetectSentiment']
                Resource: '*'
              - Sid: RekognitionAccess
                Effect: Allow
                Action: ['rekognition:DetectLabels', 'rekognition:DetectFaces', 'rekognition:RecognizeCelebrities', 'rekognition:DetectModerationLabels', 'rekognition:DetectText']
                Resource: '*'
              - Sid: SQSAccess
                Effect: Allow
                Action: ['sqs:SendMessage']
                Resource:
                  - !GetAtt [rDeadLetterQueue, Arn]
  rSocialMediaImageAnalyticsLambda:
    Type: 'AWS::Lambda::Function'
    Properties:
      Code:
        S3Bucket: !Ref pLambdaS3Bucket
        S3Key: !Ref pAnalyzeTweetsLambdaS3Key
      Handler: index.handler
      Runtime: python3.7
      MemorySize: 256
      Timeout: 900
      ReservedConcurrentExecutions: 100 # Aligns with Rekognition TPS limit
      Role:
        Fn::GetAtt:
          - rSocialMediaImageAnalyticsLambdaFunctionRole
          - Arn
      Environment:
        Variables:
          SENTIMENT_STREAM: !Ref rSentimentFirehoseStream
          ENTITY_STREAM: !Ref rEntitiesFirehoseStream
          REKOGNITION_STREAM: !Ref rRekognitionFirehoseStream
          BUCKET: !Ref rTweetsBucket
          DEADLETTER_QUEUE: !Ref rDeadLetterQueue
          IMAGEKEY_PREFIX: tmp/images/
  rSocialMediaGlueDB:
    Type: "AWS::Glue::Database"
    Properties:
      DatabaseInput:
        Name: !Ref pDatabaseName
      CatalogId: !Ref AWS::AccountId
  rTweetsTable:
    Type: "AWS::Glue::Table"
    Properties:
      TableInput:
        Name: tweets
        StorageDescriptor:
          Compressed: False
          InputFormat: org.apache.hadoop.mapred.TextInputFormat
          NumberOfBuckets: -1
          OutputFormat:
          Location: !Join ['', ['s3://', !Ref 'rTweetsBucket', '/raw/']]
          SerdeInfo:
            SerializationLibrary:
          Columns:
            - Name: id
              Type: bigint
            - Name: coordinates
              Type: struct>
            - Name: retweeted
              Type: boolean
            - Name: source
              Type: string
            - Name: entities
              Type: struct>>,urls:array>>>
            - Name: extended_entities
              Type: struct,media_url:string,media_url_https:string,url:string,display_url:string,expanded_url:string,type:string,sizes:struct,large:struct,medium:struct,small:struct>>>,favorited:boolean,retweeted:boolean,possibly_sensitive:boolean,filter_level:string,lang:string>
            - Name: reply_count
              Type: bigint
            - Name: favorite_count
              Type: bigint
            - Name: geo
              Type: struct>
            - Name: id_str
              Type: string
            - Name: timestamp_ms
              Type: bigint
            - Name: truncated
              Type: boolean
            - Name: text
              Type: string
            - Name: retweet_count
              Type: bigint
            - Name: possibly_sensitive
              Type: boolean
            - Name: filter_level
              Type: string
            - Name: created_at
              Type: string
            - Name: place
              Type: struct>>>>
            - Name: favorited
              Type: boolean
            - Name: lang
              Type: string
            - Name: in_reply_to_screen_name
              Type: string
            - Name: is_quote_status
              Type: boolean
            - Name: in_reply_to_user_id_str
              Type: string
            - Name: user
              Type: struct
            - Name: quote_count
              Type: bigint
        Parameters: {'classification': 'json'}
      DatabaseName: !Ref rSocialMediaGlueDB
      CatalogId: !Ref AWS::AccountId
  rTweetSentimentTable:
    Type: "AWS::Glue::Table"
    Properties:
      TableInput:
        Name: tweet_sentiments
        StorageDescriptor:
          Compressed: False
          InputFormat: org.apache.hadoop.mapred.TextInputFormat
          NumberOfBuckets: -1
          OutputFormat:
          Location: !Join ['', ['s3://', !Ref 'rTweetsBucket', '/sentiment/']]
          SerdeInfo:
            SerializationLibrary:
          Columns:
            - Name: tweetid
              Type: bigint
            - Name: text
              Type: string
            - Name: originaltext
              Type: string
            - Name: sentiment
              Type: string
            - Name: sentimentposscore
              Type: double
            - Name: sentimentnegscore
              Type: double
            - Name: sentimentneuscore
              Type: double
            - Name: sentimentmixedscore
              Type: double
        Parameters: {'classification': 'json'}
      DatabaseName: !Ref rSocialMediaGlueDB
      CatalogId: !Ref AWS::AccountId
  rTweetEntitiesTable:
    Type: "AWS::Glue::Table"
    Properties:
      TableInput:
        Name: tweet_entities
        StorageDescriptor:
          Compressed: False
          InputFormat: org.apache.hadoop.mapred.TextInputFormat
          NumberOfBuckets: -1
          OutputFormat:
          Location: !Join ['', ['s3://', !Ref 'rTweetsBucket', '/entities/']]
          SerdeInfo:
            SerializationLibrary:
          Columns:
            - Name: tweetid
              Type: bigint
            - Name: entity
              Type: string
            - Name: type
              Type: string
            - Name: score
              Type: double
        Parameters: {'classification': 'json'}
      DatabaseName: !Ref rSocialMediaGlueDB
      CatalogId: !Ref AWS::AccountId
  rTweetMediaRekognitionTable:
    Type: "AWS::Glue::Table"
    Properties:
      TableInput:
        Name: media_rekognition
        StorageDescriptor:
          Compressed: False
          InputFormat: org.apache.hadoop.mapred.TextInputFormat
          NumberOfBuckets: -1
          OutputFormat:
          Location: !Join ['', ['s3://', !Ref 'rTweetsBucket', '/media/']]
          SerdeInfo:
            SerializationLibrary:
          Columns:
            - Name: tweetid
              Type: bigint
            - Name: mediaid
              Type: string
            - Name: text
              Type: string
            - Name: media_url
              Type: string
            - Name: image_labels
              Type: struct,confidence:double>>,confidence:double,parents:array>,name:string>>,textdetections:array,polygon:array>>,confidence:double,detectedtext:string,type:string,id:int,parentid:int>>,facedetails:array,sunglasses:struct,gender:struct,landmarks:array>,pose:struct,emotions:array>,agerange:struct,eyesopen:struct,boundingbox:struct,smile:struct,mouthopen:struct,quality:struct,mustache:struct,beard:struct>>,celebrityrecognition:struct,confidence:double,landmarks:array>,pose:struct,quality:struct>>,celebrityfaces:array,confidence:double,landmarks:array>,pose:struct,quality:struct>,urls:array,name:string,id:string,matchconfidence:double>>,orientationcorrection:string>,moderationlabels:array>,labelmodelversion:string>
        Parameters: {'classification': 'json'}
      DatabaseName: !Ref rSocialMediaGlueDB
      CatalogId: !Ref AWS::AccountId
  rS3Notification:
    Type: Custom::Notification
    Properties:
      ServiceToken: !GetAtt rLambdaS3EventCreationCustomResource.Arn
  rAttachPolicyCustomResourceLambdaRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service: []
            Action: ['sts:AssumeRole']
      Path: /
      Policies:
        - PolicyName: SocialMediaAnalyticLambdaFunctionExecutionPolicy
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Sid: CloudWatchAccess
                Effect: Allow
                Action: ['logs:CreateLogGroup', 'logs:CreateLogStream', 'logs:PutLogEvents']
                Resource: arn:aws:logs:*:*:*
              - Sid: S3Access
                Effect: Allow
                Action: ['s3:GetObject', 's3:PutObject', 's3:PutBucketNotification', 's3:ListBucket', 's3:DeleteObject']
                Resource:
                  - !Join ['', ['arn:aws:s3:::', !Ref 'rTweetsBucket']]
                  - !Join ['', ['arn:aws:s3:::', !Ref 'rTweetsBucket', '/*']]
              - Sid: LambdaAddPermission
                Effect: Allow
                Action: ['lambda:AddPermission']
                Resource: !GetAtt rSocialMediaImageAnalyticsLambda.Arn
  rLambdaS3EventCreationCustomResource:
    Type: AWS::Lambda::Function
    Properties:
      Handler: index.lambda_handler
      Role: !GetAtt rAttachPolicyCustomResourceLambdaRole.Arn
      Runtime: python3.7
      Timeout: 300
      Environment:
        Variables:
          lambda_arn: !GetAtt rSocialMediaImageAnalyticsLambda.Arn
          account_number: !Ref 'AWS::AccountId'
          s3_bucket: !Ref 'rTweetsBucket'
      Code:
        S3Bucket: !Ref pLambdaS3Bucket
        S3Key: !Ref pAddTriggerLambdaS3Key
  rDeadLetterQueue:
    Type: AWS::SQS::Queue
Outputs:
  SSHCommand:
    Description: To open an SSH session to the Twitter stream reader, run the following command.
    Value: !Join ['', ['ssh -i ~/', !Ref 'pInstanceKeyName', '.pem ec2-user@', !GetAtt [rTwitterStreamingReaderServer, PublicDnsName]]]
  EC2InstanceConsoleURL:
    Description: URL to the EC2 instance console
    Value: !Join ['', ['https://', !Ref "AWS::Region", '', !Ref "AWS::Region", '#Instances:search=', !Ref 'rTwitterStreamingReaderServer']]
  LambdaFunctionConsoleURL:
    Description: URL to the Lambda Function console
    Value: !Join ['', ['', !Ref "AWS::Region", '#/functions/', !Ref "rSocialMediaImageAnalyticsLambda", '?tab=graph']]
  S3ConsoleURL:
    Description: URL to the S3 bucket console
    Value: !Join ['', ['', !Ref 'rTweetsBucket', '/?region=', !Ref "AWS::Region", '&tab=overview']]
  TwitterRawLocation:
    Description: S3 Twitter Raw location.
    Value: !Join ['', ['s3://', !Ref 'rTweetsBucket', '/raw/']]
  TwitterEntitiesLocation:
    Description: S3 Twitter Entities location.
    Value: !Join ['', ['s3://', !Ref 'rTweetsBucket', '/entities/']]
  TwitterSentimentLocation:
    Description: S3 Twitter Sentiment location.
    Value: !Join ['', ['s3://', !Ref 'rTweetsBucket', '/sentiment/']]
  TwitterMediaLabelsLocation:
    Description: S3 Twitter Media Labels location.
    Value: !Join ['', ['s3://', !Ref 'rTweetsBucket', '/media/']]
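
# ---------------------------------------------------------------------------
# Optional sketch, not part of the original template.
#
# The four Twitter credential parameters in this template are declared as
# plain String parameters, so the values entered at deploy time can be
# displayed by the CloudFormation console and API. Assuming that is not
# wanted, one possible adjustment is to set NoEcho on each of them. The
# fragment below is illustrative only and is left commented out so that it
# does not change the template's behavior; the other three credential
# parameters would follow the same shape.
#
# Parameters:
#   pTwitterAuthConsumerSecret:
#     Description: Consumer secret for accessing Twitter
#     Type: String
#     NoEcho: true   # assumption: masks the value in console and API output
#
# NoEcho only masks the parameter value. It does not alter how the value is
# passed to the AWS::SecretsManager::Secret resources through !Ref.
# ---------------------------------------------------------------------------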