openapi: 3.0.3 info: version: 1.0.0 title: AWS Docs API description: API for AWS Docs paths: /tags: post: operationId: createFormReviewWorkflowTag description: create a form review workflow tag requestBody: required: true content: application/json: schema: $ref: '#/components/schemas/CreateFormReviewWorkflowTagInput' responses: '200': description: Returned on successful addition of a form review workflow tag content: application/json: schema: $ref: '#/components/schemas/FormReviewWorkflowTag' get: operationId: listFormReviewWorkflowTags description: List all form review workflow tags parameters: - $ref: '#/components/parameters/pageSize' - $ref: '#/components/parameters/nextToken' responses: '200': description: Returned on successful list of all form review workflow tags content: application/json: schema: $ref: '#/components/schemas/ListFormReviewWorkflowTagsResponse' /sources/document: post: operationId: submitSourceDocument description: Submit a document for processing requestBody: required: true content: application/json: schema: $ref: '#/components/schemas/SubmitSourceDocumentInput' responses: '200': description: Returned on successful submission of a form content: application/json: schema: $ref: '#/components/schemas/DocumentMetadata' /documents/upload-url: get: operationId: getDocumentUploadUrl description: Get a presigned url for uploading a document parameters: - in: query name: fileName required: true schema: type: string - in: query name: contentType required: true schema: type: string responses: '200': description: Returned presigned url for uploading a document content: application/json: schema: $ref: '#/components/schemas/GetDocumentUploadUrlResponse' /documents: get: operationId: listDocuments description: List all documents parameters: - $ref: '#/components/parameters/pageSize' - $ref: '#/components/parameters/nextToken' responses: '200': description: Returns a list of documents content: application/json: schema: $ref: '#/components/schemas/ListDocumentsResponse' /documents/{documentId}: get: operationId: getDocument description: Get details about a document being ingested parameters: - in: path name: documentId required: true schema: type: string responses: '200': description: Returned on successful retrieval of document metadata content: application/json: schema: $ref: '#/components/schemas/DocumentMetadata' '404': description: Returned when a document is not found content: application/json: schema: $ref: '#/components/schemas/ApiError' /documents/{documentId}/forms: get: operationId: listDocumentForms description: Get details about the forms within a processed document parameters: - in: path name: documentId required: true schema: type: string - $ref: '#/components/parameters/pageSize' - $ref: '#/components/parameters/nextToken' responses: '200': description: Returned on successful retrieval of document forms content: application/json: schema: $ref: '#/components/schemas/ListFormsResponse' '404': description: Returned when a document is not found content: application/json: schema: $ref: '#/components/schemas/ApiError' /forms: get: operationId: listForms description: List all forms within documents parameters: - $ref: '#/components/parameters/pageSize' - $ref: '#/components/parameters/nextToken' responses: '200': description: Returns a list of forms content: application/json: schema: $ref: '#/components/schemas/ListFormsResponse' /documents/{documentId}/forms/{formId}: get: operationId: getDocumentForm description: Get details about a form within a processed document parameters: - in: path name: documentId required: true schema: type: string - in: path name: formId required: true schema: type: string responses: '200': description: Returned on successful retrieval of the form content: application/json: schema: $ref: '#/components/schemas/FormMetadata' '404': description: Returned when a document is not found content: application/json: schema: $ref: '#/components/schemas/ApiError' /documents/{documentId}/forms/{formId}/status: put: operationId: updateStatus description: start a new review parameters: - in: path name: documentId required: true schema: type: string - in: path name: formId required: true schema: type: string requestBody: required: true content: application/json: schema: $ref: '#/components/schemas/UpdateStatusInput' responses: '200': description: The newly updated form metadata content: application/json: schema: $ref: '#/components/schemas/FormMetadata' /documents/{documentId}/forms/{formId}/review: put: operationId: updateFormReview description: Update the extracted data details object from a document form parameters: - in: path name: documentId required: true schema: type: string - in: path name: formId required: true schema: type: string requestBody: required: true content: application/json: schema: $ref: '#/components/schemas/UpdateFormInput' responses: '200': description: Returned on successful update of the form content: application/json: schema: $ref: '#/components/schemas/FormMetadata' '404': description: Returned when a document is not found content: application/json: schema: $ref: '#/components/schemas/ApiError' /schemas: get: operationId: listFormSchemas description: List all schemas for forms parameters: - $ref: '#/components/parameters/pageSize' - $ref: '#/components/parameters/nextToken' responses: '200': description: List all registered form schemas content: application/json: schema: $ref: '#/components/schemas/ListFormSchemasResponse' post: operationId: createFormSchema description: Create a new form schema requestBody: required: true content: application/json: schema: $ref: '#/components/schemas/FormSchemaInput' responses: '200': description: The newly created schema content: application/json: schema: $ref: '#/components/schemas/FormSchema' /schemas/{schemaId}: get: operationId: getFormSchema description: Retrieve a specific form schema parameters: - in: path name: schemaId required: true schema: type: string responses: '200': description: The newly created schema content: application/json: schema: $ref: '#/components/schemas/FormSchema' put: operationId: updateFormSchema description: Update an existing form schema parameters: - in: path name: schemaId required: true schema: type: string requestBody: required: true content: application/json: schema: $ref: '#/components/schemas/FormSchema' responses: '200': description: The updated schema content: application/json: schema: $ref: '#/components/schemas/FormSchema' delete: operationId: deleteFormSchema description: Delete a form schema parameters: - in: path name: schemaId required: true schema: type: string responses: '200': description: The deleted schema content: application/json: schema: $ref: '#/components/schemas/FormSchema' /metrics: get: operationId: getMetrics description: Retrieve average aggregate metrics for disclosure data extraction for the given time period parameters: - in: query name: startTimestamp required: true schema: type: string - in: query name: endTimestamp required: true schema: type: string responses: '200': description: Aggregate metrics content: application/json: schema: $ref: '#/components/schemas/AggregateMetrics' components: parameters: pageSize: name: pageSize in: query description: The number of results to return in a page required: true schema: type: integer nextToken: name: nextToken in: query description: Passed to continue retrieving pages of results required: false schema: type: string schemas: ApiError: description: Returned when an error occurs type: object properties: message: type: string required: - message CreateFormReviewWorkflowTagInput: description: Describes the input for creating a new form review workflow tag type: object properties: tagText: type: string required: - tagText FormReviewWorkflowTag: description: Describes the object of a form review workflow tag type: object allOf: - $ref: '#/components/schemas/CreateFormReviewWorkflowTagInput' - $ref: '#/components/schemas/CreateUpdateDetails' properties: tagId: type: string required: - tagId ListFormReviewWorkflowTagsResponse: description: A list of form review workflow tags type: object allOf: - $ref: '#/components/schemas/PaginatedResponse' properties: tags: type: array items: $ref: '#/components/schemas/FormReviewWorkflowTag' required: - tags IngestionExecution: description: Describes the execution of the document ingestion pipeline type: object properties: executionId: type: string status: $ref: '#/components/schemas/ExecutionStatus' statusReason: type: string required: - executionId - status ExecutionStatus: type: string enum: [IN_PROGRESS, SUCCEEDED, FAILED] ExtractionExecution: description: Describes the execution of the form data extraction pipeline type: object properties: executionId: type: string status: $ref: '#/components/schemas/ExtractionExecutionStatus' statusReason: type: string required: - executionId - status ExtractionExecutionStatus: type: string enum: [ NOT_STARTED, IN_PROGRESS, READY_FOR_REVIEW, REVIEWING, REVIEWED, FAILED, ] GetDocumentUploadUrlResponse: description: Response to getting a document upload url type: object properties: documentId: type: string url: type: string location: $ref: '#/components/schemas/S3Location' required: - documentId - url - location SubmitSourceDocumentInput: description: Request to submit a document type: object properties: schemaId: type: string documentId: type: string name: type: string description: Name of the document location: $ref: '#/components/schemas/S3Location' required: - documentId - name - location - schemaId DocumentMetadata: description: Metadata about a document type: object allOf: - $ref: '#/components/schemas/CreateUpdateDetails' properties: documentId: type: string name: description: The name of the document type: string location: $ref: '#/components/schemas/S3Location' ingestionExecution: $ref: '#/components/schemas/IngestionExecution' numberOfPages: description: The number of pages in the document, discovered during classification type: integer numberOfClassifiedForms: description: The number of forms discovered within the document type: integer url: description: Presigned url for fetching the document (returned on get individual document only) type: string statusTransitionLog: description: A log of status transitions type: array items: $ref: '#/components/schemas/StatusTransition' required: - documentId - name - location - statusTransitionLog FormMetadata: description: Metadata about a form within a document type: object allOf: - $ref: '#/components/schemas/CreateUpdateDetails' properties: documentId: type: string documentName: type: string formId: type: string schemaId: type: string numberOfPages: description: The number of pages in the form type: integer startPageIndex: type: integer endPageIndex: type: integer location: $ref: '#/components/schemas/S3Location' extractionExecution: $ref: '#/components/schemas/ExtractionExecution' extractedData: description: Data extracted from the form - has any type, will be of the shape of the schema. This can be modified by reviewers who may correct data that has been inaccurately extracted originalExtractedData: description: The original data extracted from the form - has any type, will be of the shape of the schema. This is what was originally extracted by the system, prior to any human review. extractedDataMetadata: description: Metadata of extracted data values, of same shape as the data above, but leaf values contain confidence and bounding box metadata. extractionAccuracy: $ref: '#/components/schemas/ExtractionAccuracy' averageConfidence: description: The average confidence computed by textract for all fields in the form type: number schemaSnapshot: $ref: '#/components/schemas/FormJSONSchema' url: description: Presigned url for fetching the document (returned on get individual form only) type: string textractOutputLocation: $ref: '#/components/schemas/S3Location' statusTransitionLog: description: A log of status transitions type: array items: $ref: '#/components/schemas/StatusTransition' tags: type: array items: type: string notes: type: string required: - documentId - documentName - formId - schemaId - numberOfPages - startPageIndex - endPageIndex - location - extractionExecution - schemaSnapshot - statusTransitionLog ExtractionAccuracy: description: A collection of measures of the accuracy of data extracted for the form, computed once the form has been reviewed by a human type: object properties: fieldDistancePercentage: description: A percentage based on the Levenshtein Distance between the original extracted values and the human corrected values. Since it computes the minimum number of single-character edits (substitutions, insertions, deletions) required to transform the original to the reviewed, it acts as a measure much like 'how much manual work was required for the review?' See https://en.wikipedia.org/wiki/Levenshtein_distance type: number fieldCorrectnessPercentage: description: The percentage of fields that were not changed during review type: number required: - fieldDistancePercentage - fieldCorrectnessPercentage AggregateMetrics: description: Aggregated metrics for disclosure data extraction type: object allOf: - $ref: '#/components/schemas/AggregateDocumentMetrics' - $ref: '#/components/schemas/AggregateFormMetrics' properties: bySchemaId: type: object additionalProperties: $ref: '#/components/schemas/AggregateFormMetrics' required: - bySchemaId AggregateDocumentMetrics: description: Aggregated metrics for documents type: object properties: averageClassificationTimeMilliseconds: type: integer averageClassificationTimePerPageMilliseconds: type: integer totalProcessedDocumentCount: type: integer totalSuccessfulDocumentCount: type: integer totalFailedDocumentCount: type: integer AggregateFormMetrics: description: Aggregated metrics for forms type: object properties: averageExtractionAccuracyDistance: type: number averageExtractionAccuracyCorrectness: type: number averageConfidence: type: number averageExtractionTimeMilliseconds: type: integer averageExtractionTimePerPageMilliseconds: type: integer averageProcessingTimeMilliseconds: type: integer averageProcessingTimePerPageMilliseconds: type: integer averageWaitForReviewTimeMilliseconds: type: integer averageWaitForReviewTimePerPageMilliseconds: type: integer averageReviewTimeMilliseconds: type: integer averageReviewTimePerPageMilliseconds: type: integer averageEndToEndTimeMilliseconds: type: integer averageEndToEndTimePerPageMilliseconds: type: integer totalProcessedFormCount: type: integer totalSuccessfulFormCount: type: integer totalFailedFormCount: type: integer CreateUpdateDetails: description: Metadata about when an item was created/updated type: object properties: createdBy: type: string updatedBy: type: string createdTimestamp: type: string updatedTimestamp: type: string S3Location: description: A location in s3 type: object properties: bucket: type: string objectKey: type: string required: - bucket - objectKey PaginatedResponse: type: object properties: nextToken: type: string ListDocumentsResponse: description: A list of documents type: object allOf: - $ref: '#/components/schemas/PaginatedResponse' properties: documents: type: array items: $ref: '#/components/schemas/DocumentMetadata' required: - documents ListFormsResponse: description: A list of forms type: object allOf: - $ref: '#/components/schemas/PaginatedResponse' properties: forms: type: array items: $ref: '#/components/schemas/FormMetadata' required: - forms StatusTransition: description: Defines an item in a status transition log type: object properties: timestamp: description: The time at which the status transition occurred type: string status: description: The status that was transitioned to type: string actingUser: description: The user that triggered the change of status type: string required: - timestamp - status - actingUser ListFormSchemasResponse: description: A list of form schemas type: object allOf: - $ref: '#/components/schemas/PaginatedResponse' properties: schemas: type: array items: $ref: '#/components/schemas/FormSchema' required: - schemas FormSchema: description: A schema defining the structured data expected for a form type: object allOf: - $ref: '#/components/schemas/FormSchemaInput' - $ref: '#/components/schemas/CreateUpdateDetails' properties: schemaId: type: string required: - schemaId FormSchemaInput: description: A schema defining the structured data expected for a form (without an id) type: object properties: title: type: string description: The title of the form, as it appears in the form description: type: string description: A description of the form and schema schema: $ref: '#/components/schemas/FormJSONSchema' required: - title - schema FormFieldExtractionMetadata: type: object description: Metadata to assist with the extraction of this form field from a document properties: formKey: description: The literal text uses as the key for this field in a form, eg 'Name of Entity'. Capitalisation should be the same as appears in the form. type: string tablePosition: description: The 1-indexed table number in which this field appears. type: integer rowPosition: description: The 1-indexed row number within the table in which this field appears type: integer columnPosition: description: The 1-indexed column number within the table in which this field appears. type: integer textractQuery: description: When specified, try to extract the field using this textract query before falling back to other means type: string FormJSONSchema: type: object description: Schema for a json schema for a form, an extended definition of a standard JSON schema. See See https://github.com/OAI/OpenAPI-Specification/blob/main/schemas/v3.0/schema.yaml properties: order: description: The relative order of this property (for use in object types) type: integer extractionMetadata: $ref: '#/components/schemas/FormFieldExtractionMetadata' title: type: string multipleOf: type: number minimum: 0 exclusiveMinimum: true maximum: type: number exclusiveMaximum: type: boolean default: false minimum: type: number exclusiveMinimum: type: boolean default: false maxLength: type: integer minimum: 0 minLength: type: integer minimum: 0 default: 0 pattern: type: string format: regex maxItems: type: integer minimum: 0 minItems: type: integer minimum: 0 default: 0 uniqueItems: type: boolean default: false maxProperties: type: integer minimum: 0 minProperties: type: integer minimum: 0 default: 0 required: type: array items: type: string minItems: 1 enum: type: array items: {} minItems: 1 uniqueItems: false typeOf: type: string enum: - array - boolean - integer - number - object - string allOf: type: array items: $ref: '#/components/schemas/FormJSONSchema' oneOf: type: array items: $ref: '#/components/schemas/FormJSONSchema' anyOf: type: array items: $ref: '#/components/schemas/FormJSONSchema' items: $ref: '#/components/schemas/FormJSONSchema' properties: type: object additionalProperties: $ref: '#/components/schemas/FormJSONSchema' additionalProperties: type: boolean default: true description: type: string formatType: type: string default: {} nullable: type: boolean default: false readOnly: type: boolean default: false writeOnly: type: boolean default: false example: {} deprecated: type: boolean default: false UpdateFormInput: description: A schema defining the extracted data input type: object properties: extractedData: type: object description: an object representing the extracted data to be updated tags: type: array items: type: string description: an optional array of tagIds to support review tagging notes: type: string description: optional reviewer entered notes required: - extractedData UpdateStatusInput: description: An object that represents the updated status of the document form type: object properties: newStatus: type: string required: - newStatus