path: root/crypto/sha
Commit message | Author | Age | Files | Lines
* ARMv8 assembly pack: add Qualcomm Kryo results. | Andy Polyakov | 2017-11-13 | 3 | -0/+3
    [skip ci]
    Reviewed-by: Tim Hudson <tjh@openssl.org>
* Many spelling fixes/typos corrected. | Josh Soref | 2017-11-11 | 12 | -22/+22
    Around 138 distinct errors found and fixed; thanks!
    Reviewed-by: Kurt Roeckx <kurt@roeckx.be>
    Reviewed-by: Tim Hudson <tjh@openssl.org>
    Reviewed-by: Rich Salz <rsalz@openssl.org>
    (Merged from https://github.com/openssl/openssl/pull/3459)
* s390x assembly pack: extend s390x capability vector. | Patrick Steuer | 2017-10-30 | 3 | -4/+13
    Extend the s390x capability vector to store the longer facility list available from z13 onwards. The bits indicating the vector extensions are set to zero, if the kernel does not enable the vector facility.
    Also add capability bits returned by the crypto instructions' query functions.
    Signed-off-by: Patrick Steuer <patrick.steuer@de.ibm.com>
    Reviewed-by: Andy Polyakov <appro@openssl.org>
    Reviewed-by: Tim Hudson <tjh@openssl.org>
    (Merged from https://github.com/openssl/openssl/pull/4542)
* Remove parentheses of return. | KaoruToda | 2017-10-18 | 3 | -5/+5
    Since the use of parentheses around return values was inconsistent, I removed the unnecessary parentheses and unified the style.
    Reviewed-by: Rich Salz <rsalz@openssl.org>
    Reviewed-by: Matt Caswell <matt@openssl.org>
    (Merged from https://github.com/openssl/openssl/pull/4541)
* s390x assembly pack: remove capability double-checking. | Patrick Steuer | 2017-10-17 | 2 | -6/+0
    An instruction's QUERY function is executed at initialization, iff the required MSA level is installed. Therefore, it is sufficient to check the bits returned by the QUERY functions. The MSA level does not have to be checked at every function call.
    crypto/aes/asm/aes-s390x.pl: The AES key schedule must be computed if the required KM or KMC function codes are not available. Formally, the availability of a KMC function code does not imply the availability of the corresponding KM function code.
    Signed-off-by: Patrick Steuer <patrick.steuer@de.ibm.com>
    Reviewed-by: Andy Polyakov <appro@openssl.org>
    Reviewed-by: Rich Salz <rsalz@openssl.org>
    (Merged from https://github.com/openssl/openssl/pull/4501)
* Remove email addresses from source code. | Rich Salz | 2017-10-13 | 15 | -22/+20
    Names were not removed. Some comments were updated. Replace Andy's address with openssl.org
    Reviewed-by: Andy Polyakov <appro@openssl.org>
    Reviewed-by: Paul Dale <paul.dale@oracle.com>
    (Merged from https://github.com/openssl/openssl/pull/4516)
* sha/asm/keccak1600-armv8.pl: fix return value buglet and ... | Andy Polyakov | 2017-09-09 | 1 | -147/+11
    ... script data load.
    On a related note, an attempt was made to merge rotations with logical operations. As we know, the ARM ISA has merged rotate-n-logical instructions which can be used here, and they were used to improve keccak1600-armv4 performance. But not here. Even though this approach resulted in an improvement on Cortex-A53 proportional to the reduction in instruction count, ~8%, it didn't exactly work out on non-Cortex cores, presumably because they break merged instructions into separate μ-ops, which results in a higher *operations* count. X-Gene and Denver went ~20% slower and Apple A7 ~40% slower. The optimization was therefore dismissed.
    Reviewed-by: Rich Salz <rsalz@openssl.org>
* MSC_VER <= 1200 isn't supported; remove dead code | Rich Salz | 2017-08-27 | 1 | -3/+0
    VisualStudio 6 and earlier aren't supported.
    Reviewed-by: Andy Polyakov <appro@openssl.org>
    (Merged from https://github.com/openssl/openssl/pull/4263)
* sha/asm/keccak1600-armv4.pl: optimize for Thumb-2. | Andy Polyakov | 2017-08-16 | 1 | -144/+242
    Reduce per-round instruction count in Thumb-2 case by 16%. This is achieved by folding ldr/str pairs to their double-word counterparts.
    Reviewed-by: Rich Salz <rsalz@openssl.org>
* sha/asm/keccak1600-avx512.pl: fix buglet in SHA3_squeeze tail. | Andy Polyakov | 2017-08-12 | 1 | -1/+1
    Reviewed-by: Rich Salz <rsalz@openssl.org>
* sha/asm/keccak1600-armv4.pl: improve non-NEON performance by ~10%. | Andy Polyakov | 2017-08-02 | 1 | -352/+388
    This is achieved mostly by a ~10% reduction in the number of instructions per round thanks to a) switch to KECCAK_2X variant; b) merge of almost 1/2 of the rotations with logical instructions. Performance is improved on all observed processors except Cortex-A15. This is because it's capable of exploiting more parallelism and can execute the original code in the same amount of time.
    Reviewed-by: Rich Salz <rsalz@openssl.org>
    Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
    (Merged from https://github.com/openssl/openssl/pull/4057)
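To illustrate item b), a rotation can be folded into the logical instruction that consumes it. The sketch below is not taken from keccak1600-armv4.pl; `ror32`, `eor_ror` and `eor_separate` are hypothetical helpers that model a 32-bit rotate and ARM's `eor rd, rn, rm, ror #k` form, and they only show that the merged operation computes the same value as a separate rotate followed by an exclusive-or.

```python
MASK32 = 0xFFFFFFFF

def ror32(x, k):
    """Rotate a 32-bit word right by k bits."""
    k %= 32
    return ((x >> k) | (x << (32 - k))) & MASK32

def eor_ror(rn, rm, k):
    """Model of ARM 'eor rd, rn, rm, ror #k': one instruction, rotate folded in."""
    return rn ^ ror32(rm, k)

def eor_separate(rn, rm, k):
    """Same value from two instructions: 'ror tmp, rm, #k' then 'eor rd, rn, tmp'."""
    tmp = ror32(rm, k)
    return rn ^ tmp
```

Whether the merged form is actually faster depends on the core, as the keccak1600-armv8.pl commit above notes for cores that split such instructions into separate μ-ops.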
* sha/keccak1600.c: choose more sensible default parameters. | Andy Polyakov | 2017-08-01 | 1 | -11/+21
    "More" refers to the fact that we make an active BIT_INTERLEAVE choice in some specific cases. Update commentary correspondingly.
    Reviewed-by: Rich Salz <rsalz@openssl.org>
* Fix typo in sha1-thumb.pl | Xiaoyin Liu | 2017-07-30 | 1 | -1/+1
    Reviewed-by: Tim Hudson <tjh@openssl.org>
    Reviewed-by: Rich Salz <rsalz@openssl.org>
    (Merged from https://github.com/openssl/openssl/pull/4056)
* sha/keccak1600.c: build and make it work with strict warnings. | Andy Polyakov | 2017-07-25 | 2 | -1/+6
    Reviewed-by: Paul Dale <paul.dale@oracle.com>
    Reviewed-by: Richard Levitte <levitte@openssl.org>
    (Merged from https://github.com/openssl/openssl/pull/3943)
* sha/asm/keccak1600-avx512.pl: improve performance by 17%. | Andy Polyakov | 2017-07-24 | 1 | -176/+278
    Improvement is the result of a combination of data layout ideas from Keccak Code Package and the initial version of this module.
    Hardware used for benchmarking courtesy of Atos, experiments run by Romain Dolbeau <romain.dolbeau@atos.net>. Kudos!
    Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
    Reviewed-by: Rich Salz <rsalz@openssl.org>
* sha/asm/keccak1600-avx512.pl: absorb bug-fix and minor optimization. | Andy Polyakov | 2017-07-21 | 1 | -19/+17
    Hardware used for benchmarking courtesy of Atos, experiments run by Romain Dolbeau <romain.dolbeau@atos.net>. Kudos!
    Reviewed-by: Rich Salz <rsalz@openssl.org>
* x86_64 assembly pack: "optimize" for Knights Landing, add AVX-512 results. | Andy Polyakov | 2017-07-21 | 2 | -0/+2
    "Optimize" is in quotes because it's rather a "salvage operation" for now. The idea is to identify processor capability flags that drive Knights Landing to suboptimal code paths and mask them. Two flags were identified, XSAVE and ADCX/ADOX. The former affects the choice of the AES-NI code path specific to Silvermont (Knights Landing is of Silvermont "ancestry"). And 64-bit ADCX/ADOX instructions are effectively mishandled at decode time. In both cases we are looking at ~2x improvement.
    AVX-512 results cover even Skylake-X :-)
    Hardware used for benchmarking courtesy of Atos, experiments run by Romain Dolbeau <romain.dolbeau@atos.net>. Kudos!
    Reviewed-by: Rich Salz <rsalz@openssl.org>
* sha/asm/keccak1600-avx2.pl: optimized remodelled version. | Andy Polyakov | 2017-07-15 | 1 | -97/+99
    New register usage pattern allows achieving slightly better performance. Not as much as I hoped for. Performance is believed to be limited by irreconcilable write-back conflicts, rather than lack of computational resources or data dependencies.
    Reviewed-by: Rich Salz <rsalz@openssl.org>
* sha/asm/keccak1600-avx2.pl: remodel register usage. | Andy Polyakov | 2017-07-15 | 1 | -109/+105
    This gives much more freedom to rearrange instructions. This is the unoptimized version, provided for reference. Basically you need to compare it to the initial 29724d0e15b4934abdf2d7ab71957b05d1a28256 to figure out the key difference.
    Reviewed-by: Rich Salz <rsalz@openssl.org>
* Optimize sha/asm/keccak1600-avx2.pl. | Andy Polyakov | 2017-07-10 | 1 | -84/+87
    Reviewed-by: Rich Salz <rsalz@openssl.org>
* Add sha/asm/keccak1600-avx2.pl. | Andy Polyakov | 2017-07-10 | 1 | -0/+479
    Reviewed-by: Rich Salz <rsalz@openssl.org>
* Add sha/asm/keccak1600-avx512.pl. | Andy Polyakov | 2017-07-07 | 1 | -0/+449
    Reviewed-by: Rich Salz <rsalz@openssl.org>
    Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
    (Merged from https://github.com/openssl/openssl/pull/3861)
* sha/keccak1600.c: internalize KeccakF1600 and simplify SHA3_absorb. | Andy Polyakov | 2017-07-03 | 1 | -35/+17
    Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
* sha/asm/keccak1600-x86_64.pl: close gap with Keccak Code Package. | Andy Polyakov | 2017-07-03 | 1 | -32/+31
    [Also typo and readability fixes. Ryzen result is added.]
    Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
* sha/asm/keccak1600-s390x.pl: typo and readability, minor size optimization. | Andy Polyakov | 2017-07-03 | 1 | -15/+8
    Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
* x86_64 assembly pack: fill some blanks in Ryzen results. | Andy Polyakov | 2017-07-03 | 2 | -2/+2
    Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
* Add sha/asm/keccak1600-s390x.pl. | Andy Polyakov | 2017-06-29 | 1 | -0/+568
    Reviewed-by: Richard Levitte <levitte@openssl.org>
* sha/asm/keccak1600-x86_64.pl: add CFI directives. | Andy Polyakov | 2017-06-29 | 1 | -0/+40
    Reviewed-by: Richard Levitte <levitte@openssl.org>
* sha/asm/keccak1600-x86_64.pl: optimize by re-ordering instructions. | Andy Polyakov | 2017-06-29 | 1 | -83/+95
    Reviewed-by: Richard Levitte <levitte@openssl.org>
* sha/asm/keccak1600-x86_64.pl: remove redundant moves. | Andy Polyakov | 2017-06-29 | 1 | -28/+50
    Reviewed-by: Richard Levitte <levitte@openssl.org>
* Add sha/asm/keccak1600-x86_64.pl. | Andy Polyakov | 2017-06-29 | 1 | -0/+535
    Reviewed-by: Richard Levitte <levitte@openssl.org>
* sha/asm/keccak1600-mmx.pl: optimize for Atom and add comparison data. | Andy Polyakov | 2017-06-24 | 1 | -115/+126
    Curiously enough, out-of-order Silvermont benefited most from the optimization, 33%.
    [The originally mentioned "anomaly" turned out to be a misreported frequency scaling problem. Correct results were collected under an older kernel.]
    Reviewed-by: Rich Salz <rsalz@openssl.org>
    Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
    (Merged from https://github.com/openssl/openssl/pull/3739)
* Add sha/asm/keccak1600-mmx.pl, x86 MMX module. | Andy Polyakov | 2017-06-24 | 1 | -0/+429
    Reviewed-by: Rich Salz <rsalz@openssl.org>
    Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
    (Merged from https://github.com/openssl/openssl/pull/3739)
* sha/asm/sha512p8-ppc.pl: add POWER8 performance data. | Andy Polyakov | 2017-06-21 | 1 | -0/+9
    [skip ci]
    Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
    Reviewed-by: Rich Salz <rsalz@openssl.org>
    (Merged from https://github.com/openssl/openssl/pull/3705)
* Add Keccak-1600 modules for PPC64 and POWER8. | Andy Polyakov | 2017-06-21 | 2 | -0/+1607
    [skip ci]
    Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
    Reviewed-by: Rich Salz <rsalz@openssl.org>
    (Merged from https://github.com/openssl/openssl/pull/3705)
* Add sha/asm/keccak1600-c64x.pl | Andy Polyakov | 2017-06-21 | 1 | -0/+882
    [skip ci]
    Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
    (Merged from https://github.com/openssl/openssl/pull/3708)
* Add sha/asm/keccak1600-armv8.pl. | Andy Polyakov | 2017-06-15 | 1 | -0/+653
    Reviewed-by: Rich Salz <rsalz@openssl.org>
* sha/asm/keccak1600-armv4.pl: switch to more efficient bit interleaving algorithm. | Andy Polyakov | 2017-06-08 | 1 | -119/+260
    Reviewed-by: Rich Salz <rsalz@openssl.org>
* sha/keccak1600.c: switch to more efficient bit interleaving algorithm. | Andy Polyakov | 2017-06-08 | 1 | -43/+95
    [Also bypass sizeof(void *) == 8 check on some platforms.]
    Reviewed-by: Rich Salz <rsalz@openssl.org>
* sha/asm/keccak1600-armv4.pl: add NEON code path. | Andy Polyakov | 2017-06-06 | 1 | -20/+530
    Reviewed-by: Rich Salz <rsalz@openssl.org>
* sha/asm/keccak1600-armv4.pl: add SHA3_absorb and SHA3_squeeze. | Andy Polyakov | 2017-06-06 | 1 | -50/+319
    Reviewed-by: Rich Salz <rsalz@openssl.org>
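For readers unfamiliar with the absorb/squeeze split, here is a minimal Python sketch of the sponge construction around Keccak-f[1600]. It is a slow reference model, not the assembly module's code: the names `sha3_absorb` and `sha3_squeeze` only echo the commit subject, and their signatures, the byte-array state and the `sha3_256` wrapper are assumptions made for illustration. The result is checked against Python's `hashlib`.

```python
import hashlib

MASK64 = (1 << 64) - 1

def rol64(x, n):
    """Rotate a 64-bit lane left by n bits."""
    n %= 64
    return ((x << n) | (x >> (64 - n))) & MASK64

def keccak_f1600(lanes):
    """Keccak-f[1600] permutation on a 5x5 array of 64-bit lanes, lanes[x][y]."""
    r = 1  # LFSR state used to derive the round constants
    for _ in range(24):
        # theta
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
             for x in range(5)]
        d = [c[(x + 4) % 5] ^ rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # rho and pi
        x, y = 1, 0
        cur = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            cur, lanes[x][y] = lanes[x][y], rol64(cur, (t + 1) * (t + 2) // 2)
        # chi
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # iota
        for j in range(7):
            r = ((r << 1) ^ ((r >> 7) * 0x71)) % 256
            if r & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    return lanes

def permute(state):
    """Apply Keccak-f[1600] in place to a 200-byte state (little-endian lanes)."""
    lanes = [[int.from_bytes(state[8 * (x + 5 * y):8 * (x + 5 * y) + 8], "little")
              for y in range(5)] for x in range(5)]
    lanes = keccak_f1600(lanes)
    for x in range(5):
        for y in range(5):
            state[8 * (x + 5 * y):8 * (x + 5 * y) + 8] = lanes[x][y].to_bytes(8, "little")

def sha3_absorb(state, data, rate):
    """XOR whole rate-sized blocks into the state; return the unprocessed tail."""
    while len(data) >= rate:
        for i in range(rate):
            state[i] ^= data[i]
        permute(state)
        data = data[rate:]
    return data

def sha3_squeeze(state, outlen, rate):
    """Read outlen bytes from the state, permuting between rate-sized chunks."""
    out = b""
    while True:
        out += bytes(state[:rate])
        if len(out) >= outlen:
            return out[:outlen]
        permute(state)

def sha3_256(msg):
    """SHA3-256: rate of 136 bytes, domain suffix 0x06, final pad bit 0x80."""
    rate = 136
    state = bytearray(200)
    tail = sha3_absorb(state, msg, rate)
    for i, b in enumerate(tail):
        state[i] ^= b
    state[len(tail)] ^= 0x06
    state[rate - 1] ^= 0x80
    permute(state)
    return sha3_squeeze(state, 32, rate)

print(sha3_256(b"abc") == hashlib.sha3_256(b"abc").digest())
```

Returning the unprocessed tail from the absorb step lets the caller buffer a partial block and apply padding separately, which is the design choice this sketch assumes.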
* sha/asm/keccak1600-armv4.pl: optimization based on profiler feedback. | Andy Polyakov | 2017-06-06 | 1 | -80/+80
    Reviewed-by: Rich Salz <rsalz@openssl.org>
* Add sha/asm/keccak1600-armv4.pl. | Andy Polyakov | 2017-06-06 | 1 | -0/+532
    Reviewed-by: Rich Salz <rsalz@openssl.org>
* sha/keccak1600.c: add #ifdef KECCAK1600_ASM. | Andy Polyakov | 2017-06-05 | 1 | -0/+7
    Reviewed-by: Rich Salz <rsalz@openssl.org>
* sha/keccak1600.c: reduce temporary storage utilization even further. | Andy Polyakov | 2017-06-05 | 1 | -47/+46
    Reviewed-by: Rich Salz <rsalz@openssl.org>
* sha/keccak1600.c: add another 1x variant. | Andy Polyakov | 2017-06-05 | 1 | -0/+144
    Reviewed-by: Rich Salz <rsalz@openssl.org>
* sha/keccak1600.c: add ARM-specific "reference" tweaks. | Andy Polyakov | 2017-06-05 | 1 | -21/+41
    Reviewed-by: Rich Salz <rsalz@openssl.org>
* sha/keccak1600.c: implement lane complementing transform | Andy Polyakov | 2017-05-30 | 1 | -0/+58
    ...as discussed in section 2.2 of "Keccak implementation overview".
    [skip ci]
    Reviewed-by: Rich Salz <rsalz@openssl.org>
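The Boolean identities behind lane complementing can be sketched as follows. This is only an illustration of why storing some lanes in complemented form removes NOT operations from the chi step; which lanes are kept complemented is not shown, and the helper names are hypothetical rather than taken from keccak1600.c.

```python
MASK64 = (1 << 64) - 1

def chi(a, b, c):
    """Plain chi step for one lane: a ^ (~b & c), which costs one NOT."""
    return a ^ (~b & c & MASK64)

def chi_b_complemented(a, nb, c):
    """b is stored complemented (nb == ~b): the NOT simply disappears."""
    return a ^ (nb & c)

def chi_c_complemented(a, b, nc):
    """c is stored complemented (nc == ~c). By De Morgan, ~b & ~nc == ~(b | nc),
    so a ^ (b | nc) yields the *complemented* chi output with no NOT at all."""
    return a ^ (b | nc)
```

In the second case the output lane itself ends up complemented, which is acceptable when that lane is one of those tracked in complemented form.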
* sha/keccak1600.c: implement bit interleaving optimization. | Andy Polyakov | 2017-05-30 | 1 | -78/+103
    This targets 32-bit processors and is discussed in section 2.1 of "Keccak implementation overview".
    Reviewed-by: Rich Salz <rsalz@openssl.org>
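A minimal sketch of the bit interleaving idea, assuming the usual formulation: each 64-bit lane is stored as two 32-bit words holding its even and odd bits, so that a 64-bit rotation becomes two 32-bit rotations. The function names are hypothetical, and the bit-by-bit loops are written for clarity, not for speed; they are not the algorithm used in keccak1600.c.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def rol32(x, n):
    """Rotate a 32-bit word left by n bits."""
    n %= 32
    return ((x << n) | (x >> (32 - n))) & MASK32

def rol64(x, n):
    """Rotate a 64-bit lane left by n bits (reference for comparison)."""
    n %= 64
    return ((x << n) | (x >> (64 - n))) & MASK64

def interleave(lane):
    """Split a 64-bit lane into (even, odd) 32-bit words."""
    even = odd = 0
    for i in range(32):
        even |= ((lane >> (2 * i)) & 1) << i
        odd |= ((lane >> (2 * i + 1)) & 1) << i
    return even, odd

def deinterleave(even, odd):
    """Inverse of interleave(): rebuild the 64-bit lane."""
    lane = 0
    for i in range(32):
        lane |= ((even >> i) & 1) << (2 * i)
        lane |= ((odd >> i) & 1) << (2 * i + 1)
    return lane

def rol64_interleaved(even, odd, n):
    """64-bit left rotation expressed as two 32-bit rotations."""
    n %= 64
    if n % 2 == 0:
        # Even distance: each half rotates by n/2 and keeps its role.
        return rol32(even, n // 2), rol32(odd, n // 2)
    # Odd distance: the halves swap roles; old odd bits wrap one step further.
    return rol32(odd, (n + 1) // 2), rol32(even, n // 2)
```

The logical operations of Keccak (XOR, AND, NOT) act bitwise, so they work on the two halves independently; only rotations need the special handling shown here.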
* Remove filename argument to x86 asm_init. | David Benjamin | 2017-05-11 | 3 | -3/+3
    The assembler already knows the actual path to the generated file and, in other perlasm architectures, is left to manage debug symbols itself. Notably, in OpenSSL 1.1.x's new build system, which allows a separate build directory, converting .pl to .s as the scripts currently do results in the wrong paths. This also avoids inconsistencies from some of the files using $0 and some passing in the filename.
    Reviewed-by: Richard Levitte <levitte@openssl.org>
    Reviewed-by: Andy Polyakov <appro@openssl.org>
    (Merged from https://github.com/openssl/openssl/pull/3431)