NAME
        neptune-export.sh export-pg - Export property graph from Neptune to CSV or JSON.

SYNOPSIS
        neptune-export.sh export-pg
                [ --alb-endpoint ] [ --approx-edge-count ] [ --approx-node-count ]
                [ {-b | --batch-size} ]
                [ {-c | --config-file | --filter-config-file} ]
                [ --clone-cluster ] [ --clone-cluster-correlation-id ]
                [ --clone-cluster-instance-type ] [ --clone-cluster-replica-count ]
                [ {--cluster-id | --cluster | --clusterid} ]
                [ {-cn | --concurrency} ] [ {--config | --filter} ]
                {-d | --dir} [ --disable-ssl ] [ {-e | --endpoint} ... ]
                [ --edge-label-strategy ] [ {-el | --edge-label} ... ]
                [ --escape-csv-headers ] [ --escape-newline ]
                [ --exclude-type-definitions ] [ --export-id ] [ --format ]
                [ --gremlin-edge-filter ] [ --gremlin-filter ] [ --gremlin-node-filter ]
                [ --include-last-event-id ] [ --janus ] [ --lb-port ] [ --limit ]
                [ --log-level ] [ --max-content-length ] [ --merge-files ]
                [ --multi-value-separator ] [ {-nl | --node-label} ... ]
                [ --nlb-endpoint ] [ {-o | --output} ] [ {-p | --port} ]
                [ --partition-directories ] [ --per-label-directories ]
                [ --profile ... ] [ {-r | --range | --range-size} ]
                [ {--region | --stream-region} ] [ {-s | --scope} ]
                [ --serializer ] [ --skip ] [ --stream-large-record-strategy ]
                [ --stream-name ] [ --strict-cardinality ] [ {-t | --tag} ]
                [ --token-prefix ] [ --tokens-only ] [ --use-iam-auth ] [ --use-ssl ]

OPTIONS
        --alb-endpoint
            Application load balancer endpoint (optional: use only if connecting to a Neptune cluster with IAM database authentication enabled through an application load balancer (ALB); see https://github.com/aws-samples/aws-dbs-refarch-graph/tree/master/src/connecting-using-a-load-balancer#connecting-to-amazon-neptune-from-clients-outside-the-neptune-vpc-using-aws-application-load-balancer).
            This option may occur at most once.
            This option is part of the group 'load-balancer', from which only one option may be specified.

        --approx-edge-count
            Approximate number of edges in the graph.
            This option may occur at most once.

        --approx-node-count
            Approximate number of nodes in the graph.
            This option may occur at most once.

        -b, --batch-size
            Batch size (optional, default 64). Reduce this number if your queries trigger CorruptedFrameExceptions.
            This option may occur at most once.

        -c, --config-file, --filter-config-file
            Path to JSON schema config file (file path, or 'https' or 's3' URI).
            This option is part of the group 'configFile or config', from which only one option may be specified.

        --clone-cluster
            Clone an Amazon Neptune cluster.
            This option may occur at most once.

        --clone-cluster-correlation-id
            Correlation ID to be added to a correlation-id tag on the cloned cluster.
            This option may occur at most once.

        --clone-cluster-instance-type
            Instance type for the cloned cluster (by default neptune-export will use the same instance type as the source cluster).
            This option's value is restricted to the following set of values:
                db.r4.large, db.r4.xlarge, db.r4.2xlarge, db.r4.4xlarge, db.r4.8xlarge,
                db.r5.large, db.r5.xlarge, db.r5.2xlarge, db.r5.4xlarge, db.r5.8xlarge,
                db.r5.12xlarge, db.t3.medium,
                r4.large, r4.xlarge, r4.2xlarge, r4.4xlarge, r4.8xlarge,
                r5.large, r5.xlarge, r5.2xlarge, r5.4xlarge, r5.8xlarge,
                r5.12xlarge, t3.medium
            This option may occur at most once.

        --clone-cluster-replica-count
            Number of read replicas to add to the cloned cluster (default 0).
            This option may occur at most once.
            This option's value must fall in the following range: 0 <= value <= 15

        --cluster-id, --cluster, --clusterid
            ID of an Amazon Neptune cluster. If you specify a cluster ID, neptune-export will use all of the instance endpoints in the cluster in addition to any endpoints you have specified using the endpoint options.
            This option may occur at most once.
            This option is part of the group 'endpoint or clusterId', from which at least one option must be specified.

        -cn, --concurrency
            Concurrency: the number of parallel queries used to run the export (optional, default 4).
            This option may occur at most once.

        --config, --filter
            JSON schema for property graph.
            This option is part of the group 'configFile or config', from which only one option may be specified.

        -d, --dir
            Root directory for output.
            This option may occur at most once.
            This option's value must be a path to a directory. The provided path must be readable and writable.

        --disable-ssl
            Disables connectivity over SSL.
            This option may occur at most once.

        -e, --endpoint
            Neptune endpoint(s). Supply multiple instance endpoints if you want to load balance requests across a cluster.
            This option is part of the group 'endpoint or clusterId', from which at least one option must be specified.

        --edge-label-strategy
            Export edges by their edge labels, or by a combination of their start vertex label, edge label, and end vertex label (optional, default 'edgeLabelsOnly').
            This option's value is restricted to the following set of values:
                edgeLabelsOnly, edgeAndVertexLabels
            This option may occur at most once.

        -el, --edge-label
            Labels of edges to be included in config (optional, default all labels).

        --escape-csv-headers
            Escape characters in CSV column headers (optional, default 'false').
            This option may occur at most once.

        --escape-newline
            Escape newline characters in CSV files (optional, default 'false').
            This option may occur at most once.

        --exclude-type-definitions
            Exclude type definitions from CSV column headers (optional, default 'false').
            This option may occur at most once.

        --export-id
            Export ID.
            This option may occur at most once.

        --format
            Output format (optional, default 'csv').
            This option's value is restricted to the following set of values:
                json, csv, csvNoHeaders, neptuneStreamsJson, neptuneStreamsSimpleJson
            This option may occur at most once.

        --gremlin-edge-filter
            Gremlin steps for filtering edges (overrides --gremlin-filter).
            This option may occur at most once.

        --gremlin-filter
            Gremlin steps for filtering nodes and edges.
            This option may occur at most once.

        --gremlin-node-filter
            Gremlin steps for filtering nodes (overrides --gremlin-filter).
            This option may occur at most once.

        --include-last-event-id
            Get the last event ID from the Amazon Neptune stream, if enabled, and save it to a JSON file (optional, default 'false').
            This option may occur at most once.

        --janus
            Use JanusGraph serializer.
            This option may occur at most once.

        --lb-port
            Load balancer port (optional, default 80).
            This option may occur at most once.
            This option's value represents a port and must fall in one of the following port ranges: 1-1023, 1024-49151

        --limit
            Maximum number of items to export (optional).
            This option may occur at most once.

        --log-level
            Log level (optional, default 'error').
            This option's value is restricted to the following set of values:
                trace, debug, info, warn, error
            This option may occur at most once.

        --max-content-length
            Max content length (optional, default 50000000).
            This option may occur at most once.

        --merge-files
            Merge files for each vertex or edge label (currently only supports CSV files for export-pg).
            This option may occur at most once.

        --multi-value-separator
            Separator for multi-value properties in CSV output (optional, default ';').
            This option may occur at most once.

        -nl, --node-label
            Labels of nodes to be included in config (optional, default all labels).

        --nlb-endpoint
            Network load balancer endpoint (optional: use only if connecting to a Neptune cluster with IAM database authentication enabled through a network load balancer (NLB); see https://github.com/aws-samples/aws-dbs-refarch-graph/tree/master/src/connecting-using-a-load-balancer#connecting-to-amazon-neptune-from-clients-outside-the-neptune-vpc-using-aws-network-load-balancer).
            This option may occur at most once.
            This option is part of the group 'load-balancer', from which only one option may be specified.

        -o, --output
            Output target (optional, default 'files').
            This option's value is restricted to the following set of values:
                files, stdout, devnull, stream
            This option may occur at most once.

        -p, --port
            Neptune port (optional, default 8182).
            This option may occur at most once.
            This option's value represents a port and must fall in one of the following port ranges: 1-1023, 1024-49151

        --partition-directories
            Partition directory path (e.g. 'year=2021/month=07/day=21').
            This option may occur at most once.

        --per-label-directories
            Create a subdirectory for each distinct vertex or edge label.
            This option may occur at most once.

        --profile
            Name of an export profile.

        -r, --range, --range-size
            Number of items to fetch per request (optional).
            This option may occur at most once.

        --region, --stream-region
            AWS Region in which your Amazon Kinesis Data Stream is located.
            This option may occur at most once.

        -s, --scope
            Scope (optional, default 'all').
            This option's value is restricted to the following set of values:
                all, nodes, edges
            This option may occur at most once.

        --serializer
            Message serializer (optional, default 'GRAPHBINARY_V1D0').
            This option's value is restricted to the following set of values:
                GRAPHSON, GRAPHSON_V1D0, GRAPHSON_V2D0, GRAPHSON_V3D0,
                GRAPHBINARY_V1D0, GRYO_V1D0, GRYO_V3D0, GRYO_LITE_V1D0
            This option may occur at most once.

        --skip
            Number of items to skip (optional).
            This option may occur at most once.

        --stream-large-record-strategy
            Strategy for dealing with records to be sent to Amazon Kinesis that are larger than 1 MB.
            This option's value is restricted to the following set of values:
                dropAll, splitAndDrop, splitAndShred
            This option may occur at most once.

        --stream-name
            Name of an Amazon Kinesis Data Stream.
            This option may occur at most once.

        --strict-cardinality
            Format all set and list cardinality properties as arrays in JSON, including properties with a single value (optional, default 'false').
            This option may occur at most once.

        -t, --tag
            Directory prefix (optional).
            This option may occur at most once.

        --token-prefix
            Token prefix (optional, default '~').
            This option may occur at most once.

        --tokens-only
            Export tokens (~id, ~label, ~from, ~to) only (optional, default 'off').
            This option's value is restricted to the following set of values:
                off, nodes, edges, both
            This option may occur at most once.

        --use-iam-auth
            Use IAM database authentication to authenticate to Neptune (remember to set the SERVICE_REGION environment variable).
            This option may occur at most once.

        --use-ssl
            Enables connectivity over SSL.
            This option is deprecated: neptune-export will always connect via SSL unless you use --disable-ssl to explicitly disable connectivity over SSL.
            This option may occur at most once.

EXAMPLES
        bin/neptune-export.sh export-pg -e neptunedbcluster-xxxxxxxxxxxx.cluster-yyyyyyyyyyyy.us-east-1.neptune.amazonaws.com -d /home/ec2-user/output

            Export all data to the /home/ec2-user/output directory.

        bin/neptune-export.sh export-pg -e neptunedbcluster-xxxxxxxxxxxx.cluster-yyyyyyyyyyyy.us-east-1.neptune.amazonaws.com -d /home/ec2-user/output --format json

            Export all data to the /home/ec2-user/output directory as JSON.

        bin/neptune-export.sh export-pg -e neptunedbcluster-xxxxxxxxxxxx.cluster-yyyyyyyyyyyy.us-east-1.neptune.amazonaws.com -d /home/ec2-user/output -s nodes

            Export only nodes to the /home/ec2-user/output directory.

        bin/neptune-export.sh export-pg -e neptunedbcluster-xxxxxxxxxxxx.cluster-yyyyyyyyyyyy.us-east-1.neptune.amazonaws.com -d /home/ec2-user/output -nl User -el FOLLOWS

            Export only User nodes and FOLLOWS relationships.

        bin/neptune-export.sh export-pg -e neptunedbcluster-xxxxxxxxxxxx.cluster-yyyyyyyyyyyy.us-east-1.neptune.amazonaws.com -d /home/ec2-user/output -cn 2

            Parallel export using 2 threads.

        bin/neptune-export.sh export-pg -e neptunedbcluster-xxxxxxxxxxxx.cluster-yyyyyyyyyyyy.us-east-1.neptune.amazonaws.com -d /home/ec2-user/output -cn 2 -r 1000

            Parallel export using 2 threads, with each thread processing batches of 1000 nodes or edges.
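
Several options carry constraints that can be checked before the tool is invoked: the port ranges for --port and --lb-port, the 0 to 15 range for --clone-cluster-replica-count, the 'endpoint or clusterId' group (at least one) and the 'load-balancer' group (only one). The following is a minimal sketch of such preflight checks. The helper names (in_range, valid_port, valid_replica_count, has_target, valid_lb) are hypothetical and are not part of neptune-export.

```shell
#!/bin/sh
# Preflight checks mirroring constraints stated in OPTIONS.
# All function names are hypothetical helpers, not part of neptune-export.

# Succeeds if $1 is a non-negative integer in [$2, $3].
in_range() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;   # empty or contains a non-digit
  esac
  [ "$1" -ge "$2" ] && [ "$1" -le "$3" ]
}

# --port and --lb-port: allowed port ranges are 1-1023 and 1024-49151.
valid_port() { in_range "$1" 1 49151; }

# --clone-cluster-replica-count: 0 <= value <= 15.
valid_replica_count() { in_range "$1" 0 15; }

# 'endpoint or clusterId' group: at least one must be supplied.
# $1 = endpoint (may be empty), $2 = cluster ID (may be empty)
has_target() { [ -n "$1" ] || [ -n "$2" ]; }

# 'load-balancer' group: only one of --alb-endpoint / --nlb-endpoint.
# $1 = ALB endpoint (may be empty), $2 = NLB endpoint (may be empty)
valid_lb() { [ -z "$1" ] || [ -z "$2" ]; }
```

A wrapper script might call, for example, valid_port "$PORT" || exit 1 before launching the export, so that a bad value fails fast rather than after the JVM has started.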
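
The --use-iam-auth description notes that SERVICE_REGION must be set. One way to keep the two together is to generate the command line from a small helper. This is a sketch only: iam_export_cmd and its argument order are hypothetical, and the function prints the command rather than running it, so the result can be reviewed first.

```shell
#!/bin/sh
# Sketch: print (not run) an export-pg invocation that uses IAM auth.
# iam_export_cmd and its positional arguments are hypothetical.
iam_export_cmd() (
  region="$1"; endpoint="$2"; out_dir="$3"
  if [ -z "$region" ] || [ -z "$endpoint" ] || [ -z "$out_dir" ]; then
    echo "usage: iam_export_cmd REGION ENDPOINT OUTPUT_DIR" >&2
    exit 1   # exits only the function's subshell
  fi
  # SERVICE_REGION is set for this one command, as --use-iam-auth requires.
  printf 'SERVICE_REGION=%s bin/neptune-export.sh export-pg -e %s -d %s --use-iam-auth\n' \
    "$region" "$endpoint" "$out_dir"
)
```

Calling iam_export_cmd us-east-1 <endpoint> /home/ec2-user/output prints a single line that can be pasted into a shell. The values are inserted verbatim, so arguments containing spaces would need quoting by the caller.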
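
Sending output to an Amazon Kinesis Data Stream involves several of the options above at once: -o stream, --stream-name, --region and, optionally, --stream-large-record-strategy. The sketch below assembles such a command line and rejects a strategy outside the documented set. stream_export_cmd and one_of are hypothetical helpers. Whether a particular --format is also needed for stream output is not stated in this help text, so none is added here.

```shell
#!/bin/sh
# Sketch only: prints an export-pg command line instead of running it.
# stream_export_cmd, one_of and the argument order are hypothetical.

# Succeeds if $1 equals one of the remaining arguments.
one_of() {
  needle="$1"; shift
  for candidate in "$@"; do
    [ "$needle" = "$candidate" ] && return 0
  done
  return 1
}

# Args: ENDPOINT OUTPUT_DIR STREAM_NAME STREAM_REGION [STRATEGY]
stream_export_cmd() (
  endpoint="$1"; out_dir="$2"; stream="$3"; region="$4"; strategy="$5"
  if [ -z "$endpoint" ] || [ -z "$out_dir" ] || [ -z "$stream" ] || [ -z "$region" ]; then
    echo "usage: stream_export_cmd ENDPOINT DIR STREAM REGION [STRATEGY]" >&2
    exit 1
  fi
  # -d is still passed because the synopsis lists it as required.
  cmd="bin/neptune-export.sh export-pg -e $endpoint -d $out_dir"
  cmd="$cmd -o stream --stream-name $stream --region $region"
  if [ -n "$strategy" ]; then
    # Allowed values are taken from the option description.
    one_of "$strategy" dropAll splitAndDrop splitAndShred || {
      echo "unknown --stream-large-record-strategy: $strategy" >&2
      exit 1
    }
    cmd="$cmd --stream-large-record-strategy $strategy"
  fi
  printf '%s\n' "$cmd"
)
```

The strategy argument is left optional because the help text does not state a default for --stream-large-record-strategy. When it is omitted, the flag is simply not emitted.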