export const meta = {
  title: `Connect to machine learning services`,
  description: `Add AI/ML capabilities such as text recognition, image labeling, text-to-speech, and translation to your GraphQL API.`,
};

Amplify allows you to identify text on an image, identify labels on an image, translate text, and synthesize speech from text with the `@predictions` directive.

> Note: The `@predictions` directive requires an S3 storage bucket configured via `amplify add storage`.

## Identify text on an image

To configure text recognition on an image, use the `identifyText` action in the `@predictions` directive.

```graphql
type Query {
  recognizeTextFromImage: String @predictions(actions: [identifyText])
}
```

In your GraphQL query, you can pass in an S3 `key` for the image. At the moment, this directive works only with objects located within the `public/` folder of your S3 bucket. The `public/` prefix is automatically added to the `key` input. For instance, in the example below, `public/myimage.jpg` will be used as the input.

```graphql
query RecognizeTextFromImage($input: RecognizeTextFromImageInput!) {
  recognizeTextFromImage(input: { identifyText: { key: "myimage.jpg" } })
}
```
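For example, assuming you have run `amplify codegen` so that a `recognizeTextFromImage` query is available in `./graphql/queries` (the same setup as the full example later on this page), a minimal client-side helper might look like this. The helper name is hypothetical:

```js
import { API, graphqlOperation } from 'aws-amplify';
import { recognizeTextFromImage } from './graphql/queries';

// Hypothetical helper: recognize text on an image that is already
// stored under the `public/` prefix of your storage bucket.
async function recognizeText(key) {
  const response = await API.graphql(
    graphqlOperation(recognizeTextFromImage, {
      input: { identifyText: { key } }
    })
  );
  return response.data.recognizeTextFromImage; // the detected text as a String
}
```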
## Identify labels on an image

To configure label recognition on an image, use the `identifyLabels` action in the `@predictions` directive.

```graphql
type Query {
  recognizeLabelsFromImage: [String] @predictions(actions: [identifyLabels])
}
```

In your GraphQL query, you can pass in an S3 `key` for the image. At the moment, this directive works only with objects located within the `public/` folder of your S3 bucket. The `public/` prefix is automatically added to the `key` input. For instance, in the example below, `public/myimage.jpg` will be used as the input.

The query below will return a list of identified labels. Review [Detecting Labels](https://docs.aws.amazon.com/rekognition/latest/dg/labels.html) in the Amazon Rekognition documentation for the full list of supported labels.

```graphql
query RecognizeLabelsFromImage($input: RecognizeLabelsFromImageInput!) {
  recognizeLabelsFromImage(input: { identifyLabels: { key: "myimage.jpg" } })
}
```
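A sketch of the corresponding client-side call, again assuming the generated `recognizeLabelsFromImage` query from `amplify codegen` and a hypothetical helper name:

```js
import { API, graphqlOperation } from 'aws-amplify';
import { recognizeLabelsFromImage } from './graphql/queries';

// Hypothetical helper: list the labels Rekognition detects on an image
// stored under the `public/` prefix.
async function recognizeLabels(key) {
  const response = await API.graphql(
    graphqlOperation(recognizeLabelsFromImage, {
      input: { identifyLabels: { key } }
    })
  );
  return response.data.recognizeLabelsFromImage; // e.g. ["Coffee", "Mug"]
}
```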
## Translate text

To configure text translation, use the `translateText` action in the `@predictions` directive.

```graphql
type Query {
  translate: String @predictions(actions: [translateText])
}
```

The query below will return the translated string. Populate the `sourceLanguage` and `targetLanguage` parameters with one of the [Supported Language Codes](https://docs.aws.amazon.com/translate/latest/dg/what-is.html#what-is-languages). Pass in the text to translate via the `text` parameter.

```graphql
query TranslateText($input: TranslateTextInput!) {
  translate(input: {
    translateText: {
      sourceLanguage: "en"
      targetLanguage: "de"
      text: "Translate me"
    }
  })
}
```
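The same pattern works from the JS library; here is a minimal sketch, assuming the generated `translate` query and a hypothetical helper name:

```js
import { API, graphqlOperation } from 'aws-amplify';
import { translate } from './graphql/queries';

// Hypothetical helper: translate English text to German.
async function translateToGerman(text) {
  const response = await API.graphql(
    graphqlOperation(translate, {
      input: {
        translateText: {
          sourceLanguage: 'en',
          targetLanguage: 'de',
          text
        }
      }
    })
  );
  return response.data.translate; // the translated String
}
```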
## Synthesize speech from text

To configure text-to-speech synthesis, use the `convertTextToSpeech` action in the `@predictions` directive.

```graphql
type Query {
  textToSpeech: String @predictions(actions: [convertTextToSpeech])
}
```

The query below will return a presigned URL with the synthesized speech. Populate the `voiceID` parameter with one of the [Supported Voice IDs](https://docs.aws.amazon.com/polly/latest/dg/voicelist.html). Pass in the text to synthesize via the `text` parameter.

```graphql
query ConvertTextToSpeech($input: ConvertTextToSpeechInput!) {
  textToSpeech(input: {
    convertTextToSpeech: {
      voiceID: "Nicole"
      text: "Hello from AWS Amplify!"
    }
  })
}
```
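Because the query resolves to a presigned URL, a browser client can feed the result straight into an `Audio` element. A minimal sketch, assuming the generated `textToSpeech` query and a hypothetical helper name:

```js
import { API, graphqlOperation } from 'aws-amplify';
import { textToSpeech } from './graphql/queries';

// Hypothetical helper: synthesize speech and play it in the browser.
async function speak(text) {
  const response = await API.graphql(
    graphqlOperation(textToSpeech, {
      input: { convertTextToSpeech: { voiceID: 'Nicole', text } }
    })
  );
  const url = response.data.textToSpeech; // presigned URL to the audio file
  await new Audio(url).play(); // browser-only; elsewhere, fetch the URL instead
}
```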
## Combining Predictions actions

You can also combine multiple Predictions actions together into a sequence. The following action sequences are supported:

- `identifyText -> translateText -> convertTextToSpeech`
- `identifyLabels -> translateText -> convertTextToSpeech`
- `translateText -> convertTextToSpeech`

In the example below, `speakTranslatedImageText` identifies text from an image, then translates it into another language, and finally converts the translated text to speech.

```graphql
type Query {
  speakTranslatedImageText: String
    @predictions(actions: [
      identifyText
      translateText
      convertTextToSpeech
    ])
}
```

An example query looks like this:

```graphql
query SpeakTranslatedImageText($input: SpeakTranslatedImageTextInput!) {
  speakTranslatedImageText(input: {
    identifyText: { key: "myimage.jpg" }
    translateText: {
      sourceLanguage: "en"
      targetLanguage: "es"
    }
    convertTextToSpeech: {
      voiceID: "Conchita"
    }
  })
}
```

A code example of this using the JS Library is shown below:

```js
import React, { useState } from 'react';
import { Amplify, Storage, API, graphqlOperation } from 'aws-amplify';
import awsconfig from './aws-exports';
import { speakTranslatedImageText } from './graphql/queries';

/* Configure Exports */
Amplify.configure(awsconfig);

function SpeakTranslatedImage() {
  const [src, setSrc] = useState('');
  const [img, setImg] = useState('');

  function putS3Image(event) {
    const file = event.target.files[0];
    Storage.put(file.name, file)
      .then(async (result) => {
        setSrc(await speakTranslatedImageTextOP(result.key));
        setImg(await Storage.get(result.key));
      })
      .catch((err) => console.log(err));
  }

  return (
    <div className="Text">
      <div>
        <h5>Upload Image</h5>
        <input
          type="file"
          accept="image/jpeg"
          onChange={(event) => {
            putS3Image(event);
          }}
        />
        <br />
        {img && <img src={img} alt="uploaded preview" />}
        {src && (
          <div>
            <audio id="audioPlayback" controls>
              <source id="audioSource" type="audio/mp3" src={src} />
            </audio>
          </div>
        )}
      </div>
    </div>
  );
}

async function speakTranslatedImageTextOP(key) {
  const inputObj = {
    translateText: {
      sourceLanguage: 'en',
      targetLanguage: 'es'
    },
    identifyText: { key },
    convertTextToSpeech: { voiceID: 'Conchita' }
  };
  const response = await API.graphql(
    graphqlOperation(speakTranslatedImageText, { input: inputObj })
  );
  return response.data.speakTranslatedImageText;
}

function App() {
  return (
    <div className="App">
      <h3>Speak Translated Image</h3>
      <SpeakTranslatedImage />
    </div>
  );
}

export default App;
```
## How it works

Definition of the `@predictions` directive:

```graphql
directive @predictions(actions: [PredictionsActions!]!) on FIELD_DEFINITION
enum PredictionsActions {
  identifyText # uses Amazon Rekognition to detect text
  identifyLabels # uses Amazon Rekognition to detect labels
  convertTextToSpeech # uses Amazon Polly in a Lambda function to output a presigned URL to synthesized speech
  translateText # uses Amazon Translate to translate text from source to target language
}
```

`@predictions` creates resources to communicate with Amazon Rekognition, Amazon Translate, and Amazon Polly. For each action, the following is created:

- An IAM policy for each service (e.g. an Amazon Rekognition `detectText` policy)
- An AppSync VTL function
- An AppSync DataSource

Finally, a pipeline resolver is created for the query or field. The pipeline resolver is composed of the AppSync functions defined by the action list provided in the directive.