
Commit 398a41e

Authored by jargh and Lucas Gabriel Vuotto
Enable x86 indirect branch tracking in code and proofs (#203)
* x86: add script for adding endbr{32,64} instructions

* Implement CET for amd64

* Sync AT&T syntax files

* Add ENDBR64 to x86 model

  This is intended to support the insertion of ENDBR64 into the x86 code for indirect branch tracking (IBT), part of CET (control-flow enforcement technology). The semantics assumed is simply a NOP; on machines without CET enabled, including older machines from before the feature existed, this is indeed how it behaves. We do not attempt to model the restrictions that are imposed when CET is enabled; such restrictions arguably belong on the indirect branches themselves, of which there are none in s2n-bignum itself. For more discussion see https://stackoverflow.com/questions/56905811/what-does-the-endbr64-instruction-actually-do

* Enable IBT by default in the code

  This makes the code compile with IBT unless it is explicitly flagged otherwise in the compiler options, hence inserting one initial ENDBR64 instruction at the main entry point of each function. The copy of the code given in each "define_assert_from_elf" is updated correspondingly. So far, the proofs are not updated.

* Re-derive non-IBT version of code and restore proofs

  This uses the function define_trimmed, which trims away the first 4 bytes of the code (expected to be ENDBR64) and makes a definition matching the non-IBT original, hence restoring the proofs.

* Use symbolic lengths for most x86 toplevel correctness theorems

  This switches most xxx_SUBROUTINE_CORRECT statements to use explicit "LENGTH code" hypotheses in place of explicit numbers. The main statements left unchanged are those that also have data in the text section.

* Add IBT variants of x86 toplevel correctness statements

  Every xxx_SUBROUTINE_CORRECT result, for both the standard and Windows ABIs, now has a variant xxx_IBT_SUBROUTINE_CORRECT in the same file, with a proof that mostly works automatically via the new ADD_IBT_RULE, corresponding to the case where the code is prefixed by ENDBR64.

  This rule fails to handle the cases with data in the text section; these occur in the following proofs, which will need fixing next: edwards25519_scalarmuldouble_alt.ml, edwards25519_scalarmuldouble.ml, edwards25519_scalarmulbase_alt.ml, edwards25519_scalarmulbase.ml, curve25519_x25519base_alt.ml, curve25519_x25519base.ml.

* Fix remaining IBT proof breaks

  This is done by making ADD_IBT_RULE a little more robust, so that it handles the case of appended code and data tables and also adjusts explicit code-length numbers as well as symbolic lengths.

* Invert naming conventions for x86 Windows and IBT versions

  This switches around the naming conventions in correctness theorems and machine-code definitions:

  - Instead of the IBT forms of correctness theorems having their names flagged with _IBT, that is now the default, and the non-IBT forms are labelled explicitly as _NOIBT.
  - The WINDOWS_ modifier is used as a suffix rather than a prefix in correctness statements, and likewise windows_ in the machine-code definitions.
  - The full machine code with ENDBR64 has a name ending in _mc, and the trimmed form without ENDBR64 has a name ending in _tmc (t for trimmed).

* Use platform CET header where possible

  The explicit ENDBR64 definition is now used only if the platform is not already enabling CET. This improvement was suggested by lgv5 in the discussion at #173.

* Expand ENDBR64 explanatory comment, fix whitespace

  Here "fix whitespace" means both of the following: (1) remove all trailing spaces; (2) expand all tabs to spaces, except in Makefiles and tools.

* Install libpcre2-dev in the CI codebuild

---------

Co-authored-by: Lucas Gabriel Vuotto <[email protected]>
1 parent 7a293e0 commit 398a41e

File tree

979 files changed: +45646 −27100 lines


arm/fastmul/bignum_emontredc_8n.S (+182 −182)
arm/fastmul/bignum_ksqr_16_32.S (+371 −371)
arm/fastmul/bignum_ksqr_32_64.S (+371 −371)
arm/fastmul/bignum_sqr_8_16.S (+364 −364)
(Large diffs are not rendered by default.)

arm/fastmul/unopt/bignum_emontredc_8n_cdiff_base.S (+42 −42)

Note: the removed and added lines below differ only in leading whitespace (tabs expanded to spaces, per the "fix whitespace" item in the commit message), which this rendering cannot show.

@@ -19,10 +19,10 @@
 // ----------------------------------------------------------------------------
 #include "_internal_s2n_bignum.h"

-S2N_BN_SYM_VISIBILITY_DIRECTIVE(bignum_emontredc_8n_cdiff_base)
-S2N_BN_SYM_PRIVACY_DIRECTIVE(bignum_emontredc_8n_cdiff_base)
-.text
-.balign 4
+S2N_BN_SYM_VISIBILITY_DIRECTIVE(bignum_emontredc_8n_cdiff_base)
+S2N_BN_SYM_PRIVACY_DIRECTIVE(bignum_emontredc_8n_cdiff_base)
+.text
+.balign 4

 // Silly SLOTHY limitation: It needs the loop counter to have the name 'count'
 count .req x27 // inner loop counter
@@ -247,10 +247,10 @@ bignum_emontredc_8n_cdiff_base_outerloop:
 vmul_2x_64_64_128 v21, x4, v0, v1
 mov x14, v0.d[0]
 mov x15, v0.d[1]
-mul x12, x4, x8
+mul x12, x4, x8
 adds x17, x17, x12
 umulh x12, x4, x8
-mul x13, x4, x9
+mul x13, x4, x9
 adcs x19, x19, x13
 umulh x13, x4, x9
 adcs x20, x20, x14
@@ -270,12 +270,12 @@ bignum_emontredc_8n_cdiff_base_outerloop:
 vmul_2x_64_64_128 v21, x5, v0, v1
 mov x14, v0.d[0]
 mov x15, v0.d[1]
-mul x12, x5, x8
-adds x19, x19, x12
-umulh x12, x5, x8
-mul x13, x5, x9
-adcs x20, x20, x13
-umulh x13, x5, x9
+mul x12, x5, x8
+adds x19, x19, x12
+umulh x12, x5, x8
+mul x13, x5, x9
+adcs x20, x20, x13
+umulh x13, x5, x9
 adcs x21, x21, x14
 adcs x22, x22, x15
 mov x14, v1.d[0]
@@ -289,49 +289,49 @@ bignum_emontredc_8n_cdiff_base_outerloop:

 // Montgomery step 2

-mul x6, x20, x3
+mul x6, x20, x3
 // NEON: Calculate x6 * (x10, x11) that does two 64x64->128-bit multiplications.
 vmul_2x_64_64_128 v21, x6, v21, v1
 mov x14, v21.d[0]
 mov x15, v21.d[1]
-mul x12, x6, x8
-adds x20, x20, x12
-umulh x12, x6, x8
-mul x13, x6, x9
-adcs x21, x21, x13
-umulh x13, x6, x9
+mul x12, x6, x8
+adds x20, x20, x12
+umulh x12, x6, x8
+mul x13, x6, x9
+adcs x21, x21, x13
+umulh x13, x6, x9
 adcs x22, x22, x14
 adcs x23, x23, x15
 mov x14, v1.d[0]
 mov x15, v1.d[1]
-adc x24, xzr, xzr
-adds x21, x21, x12
-mul x7, x21, x3
-adcs x22, x22, x13
-adcs x23, x23, x14
-adc x24, x24, x15
+adc x24, xzr, xzr
+adds x21, x21, x12
+mul x7, x21, x3
+adcs x22, x22, x13
+adcs x23, x23, x14
+adc x24, x24, x15

 stph x6, x7, x1, #16, t1

 // Montgomery step 3

-mul x12, x7, x8
-mul x13, x7, x9
-mul x14, x7, x10
-mul x15, x7, x11
-adds x21, x21, x12
-umulh x12, x7, x8
-adcs x22, x22, x13
-umulh x13, x7, x9
-adcs x23, x23, x14
-umulh x14, x7, x10
-adcs x24, x24, x15
-umulh x15, x7, x11
-adc x25, xzr, xzr
-adds x12, x22, x12
-adcs x13, x23, x13
-adcs x14, x24, x14
-adc x15, x25, x15
+mul x12, x7, x8
+mul x13, x7, x9
+mul x14, x7, x10
+mul x15, x7, x11
+adds x21, x21, x12
+umulh x12, x7, x8
+adcs x22, x22, x13
+umulh x13, x7, x9
+adcs x23, x23, x14
+umulh x14, x7, x10
+adcs x24, x24, x15
+umulh x15, x7, x11
+adc x25, xzr, xzr
+adds x12, x22, x12
+adcs x13, x23, x13
+adcs x14, x24, x14
+adc x15, x25, x15

 lsr count, x0, #5

0 commit comments