Copyright Amazon.com Inc. or its affiliates. All Rights Reserved.
SPDX-License-Identifier: Apache-2.0 OR ISC
# P256 Armv8 Assembly Functions
This application was used to develop the assembly functions committed to
[awslc:p256-armv8](https://github.com/aws/aws-lc/tree/p256-armv8).
The goal is to bring P-256 performance on ARMv8 on par with x86_64.
The ARMv8 assembly code is taken from OpenSSL 1.1.1 (at [this commit](openssl/openssl@46a9ee8)).
The goal is achieved by reusing the AWS-LC P-256 implementation in `p256-x86_64.c` (nistz256) on ARMv8.
This is possible because the assembly functions required to support that code have analogous functions in
the imported OpenSSL code.
(Namely, the file [openssl/crypto/ec/asm/ecp_nistz256-armv8.pl](https://github.com/openssl/openssl/blob/46a9ee8c796c8b5f8d95290676119b4f3d72be91/crypto/ec/asm/ecp_nistz256-armv8.pl)
was imported with slight modifications in the first 2 commits.)
However, 3 of the x86_64 assembly functions have no corresponding function in that file:
`ecp_nistz256_select_w5`, `ecp_nistz256_select_w7`, and `beeu_mod_inverse_vartime`.
### `ecp_nistz256_select_w5`
This function performs a constant-time table lookup by reading all entries, one at a time,
and using a mask based on the index to keep only the desired entry in the result destination.
* There are 16 entries in the table
* Each entry consists of 3 256-bit values which are the projective coordinates of a point
on the P-256 curve.
* The index is in the range [1,16].
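
The masked scan described above can be sketched in Python. This is a model of the data flow only, not the ARMv8 assembly: Python integers are not constant-time, and the name `select_model`, the `coords` parameter, and the placeholder table are assumptions made for illustration.

```python
MASK256 = (1 << 256) - 1  # all-ones mask for one 256-bit value


def select_model(table, index, coords=3):
    """Scan every entry of `table` and keep only entry `index` (1-based)."""
    result = [0] * coords
    for i, entry in enumerate(table, start=1):
        diff = i ^ index                   # zero only when i == index
        is_match = ((diff - 1) >> 63) & 1  # 1 if diff == 0, else 0 (small diff)
        mask = -is_match & MASK256         # all ones on a match, else all zeros
        for k in range(coords):
            result[k] |= entry[k] & mask   # every entry is read; one survives
    return result


# Hypothetical 16-entry table, 3 values per entry (placeholders, not curve points).
table = [(3 * i, 3 * i + 1, 3 * i + 2) for i in range(1, 17)]
selected = select_model(table, 5)  # values of the fifth entry
```

In this sketch the loop never branches on `index`, so the sequence of table reads is the same for every index. Calling it with a 64-entry table and `coords=2` models the `w7` variant in the same way.
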
### `ecp_nistz256_select_w7`
This function is almost identical to `ecp_nistz256_select_w5`. The differences are:
* There are 64 entries in the table
* Each entry consists of 2 256-bit values which are the affine coordinates of a point
on the P-256 curve.
* The index is in the range [1,64].
### `beeu_mod_inverse_vartime`
This function implements the Binary Extended GCD (Euclidean) Algorithm.
For a reference, see A. Menezes, P. van Oorschot, and S. Vanstone, Handbook of Applied Cryptography,
Chapter 14, Algorithm 14.61 and Note 14.64:
http://cacr.uwaterloo.ca/hac/about/chap14.pdf
The function computes the modular inverse of a value |a| modulo n, where n is odd.
A Python model of this function is in `beeu.py`.
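
As a rough illustration of the technique, here is a minimal Python sketch of a binary extended GCD inverse for an odd modulus. It is not the contents of `beeu.py`; the function name `binary_mod_inverse` and its error handling are assumptions, and like the assembly function it runs in variable time.

```python
def binary_mod_inverse(a, n):
    """Return x with (a * x) % n == 1, for odd n > 1, using only halving and subtraction."""
    if n < 3 or n % 2 == 0:
        raise ValueError("modulus must be odd and greater than 1")
    a %= n
    if a == 0:
        raise ValueError("value is not invertible")
    u, v = a, n
    x, y = 1, 0  # invariants: x*a == u (mod n) and y*a == v (mod n)
    while u != 0:
        while u % 2 == 0:
            u //= 2
            # Halve x modulo n; adding the odd n makes an odd x even first.
            x = x // 2 if x % 2 == 0 else (x + n) // 2
        while v % 2 == 0:
            v //= 2
            y = y // 2 if y % 2 == 0 else (y + n) // 2
        if u >= v:
            u -= v
            x = (x - y) % n
        else:
            v -= u
            y = (y - x) % n
    if v != 1:  # v now holds gcd(a, n)
        raise ValueError("value is not invertible")
    return y


inverse = binary_mod_inverse(3, 7)  # 3 * 5 == 15 == 1 (mod 7), so this is 5
```

Because n is odd, 2 is invertible modulo n, which is why each halving step can keep the coefficients in the range [0, n) without tracking a second coefficient.
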
## Tests
The tests for the three functions described above are in `main.c`.
## Additional code
The code in `beeu_scratch.c` is not directly used in the implementation; it was compiled
to ARMv8 assembly in order to experiment with the generated instructions for `beeu.S`.
The Makefile, inspired by the [ARM SVE examples](https://developer.arm.com/documentation/dai0548/a/),
generates the assembly files from the C files during the build, so it is easy to see the assembly
instructions that correspond to the functions in this file.