author Andy Polyakov <appro@openssl.org> 2015-03-03 22:05:25 +0100
committer Andy Polyakov <appro@openssl.org> 2015-04-02 09:47:56 +0200
commit 94376cccb4ed5b376220bffe0739140ea9dad8c8
tree 162487337d9149d1c34e585f5bfb5583f0561f64 /crypto/sha
parent 7b644df899d0c818488686affc0bfe2dfdd0d0c2
aes/asm/aesv8-armx.pl: optimize for Cortex-A5x.

ARM has optimized the Cortex-A5x pipeline to favour pairs of complementary AES instructions. While the modified code improves performance on post-r0p0 Cortex-A53 by >40% (for CBC decrypt and CTR), it hurts the original r0p0. We favour later revisions, because one can't prevent the future from coming. Improvement on post-r0p0 Cortex-A57 exceeds 50%, while the new code is not slower on r0p0, or on Apple A7 for that matter.

[Also update SHA results for the latest Cortex-A53.]

Reviewed-by: Richard Levitte <levitte@openssl.org>
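
To put the updated SHA-1 figures from the diff below in perspective, here is a minimal sketch that computes how many times faster the hardware-assisted path is than the software path on each core. It assumes the table values are cycles per processed byte (lower is better), which this excerpt does not state; the name `hw_speedup` and the `SHA1_RESULTS` dictionary are illustrative only and not part of OpenSSL.

```python
# Post-commit figures from the sha1-armv8.pl table, as (hardware-assisted, software).
# Assumption: values are cycles per processed byte, so lower is better.
SHA1_RESULTS = {
    "Apple A7": (2.31, 4.13),
    "Cortex-A53": (2.24, 8.03),
    "Cortex-A57": (2.35, 7.88),
}


def hw_speedup(hardware: float, software: float) -> float:
    """Return how many times faster the hardware-assisted path is (hypothetical helper)."""
    return software / hardware


if __name__ == "__main__":
    for core, (hw, sw) in SHA1_RESULTS.items():
        print(f"{core}: {hw_speedup(hw, sw):.2f}x")
```

This ratio is separate from the percentages shown in parentheses in the tables, which, going by the wording of footnote (***), appear to compare the assembly against compiler-generated code.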
Diffstat (limited to 'crypto/sha')
-rw-r--r-- crypto/sha/asm/sha1-armv8.pl 2
-rw-r--r-- crypto/sha/asm/sha512-armv8.pl 4
2 files changed, 3 insertions, 3 deletions
diff --git a/crypto/sha/asm/sha1-armv8.pl b/crypto/sha/asm/sha1-armv8.pl
index 6be8624342..bdf48b8ec4 100644
--- a/crypto/sha/asm/sha1-armv8.pl
+++ b/crypto/sha/asm/sha1-armv8.pl
@@ -14,7 +14,7 @@
#
#               hardware-assisted   software(*)
# Apple A7      2.31                4.13 (+14%)
-# Cortex-A53   2.19                8.73 (+108%)
+# Cortex-A53   2.24                8.03 (+97%)
# Cortex-A57    2.35                7.88 (+74%)
#
# (*) Software results are presented mostly for reference purposes.
diff --git a/crypto/sha/asm/sha512-armv8.pl b/crypto/sha/asm/sha512-armv8.pl
index 45eb719fe5..717473b86e 100644
--- a/crypto/sha/asm/sha512-armv8.pl
+++ b/crypto/sha/asm/sha512-armv8.pl
@@ -14,7 +14,7 @@
#
#               SHA256-hw   SHA256(*)       SHA512
# Apple A7      1.97        10.5 (+33%)     6.73 (-1%(**))
-# Cortex-A53   2.38        15.6 (+110%)    10.1 (+190%(***))
+# Cortex-A53   2.38        15.5 (+115%)    10.0 (+150%(***))
# Cortex-A57    2.31        11.6 (+86%)     7.51 (+260%(***))
#
# (*) Software SHA256 results are of lesser relevance, presented
@@ -25,7 +25,7 @@
# (***) Super-impressive coefficients over gcc-generated code are
# indication of some compiler "pathology", most notably code
# generated with -mgeneral-regs-only is significanty faster
-# and lags behind assembly only by 50-90%.
+# and the gap is only 40-90%.
$flavour=shift;
$output=shift;