I have been investigating adding default compiler and linker options to CPython, both to improve safety at runtime and to uncover potential vulnerabilities by digging through the warnings these options generate. OpenSSF has been developing guidance for hardening compiler options, and some time ago an issue was opened suggesting that CPython consider adopting some of the OpenSSF suggestions. That is where I enter the story.

My initial goals are to:

Get pyperformance baseline for existing CPython mainline
See how options suggested in the OpenSSF guidance affect benchmarks
Get an understanding of new warnings
Give a recommendation for a set of compiler options

For my local benchmarking machine I decided to install Fedora on a PC I built a few years ago. I chose Fedora because the Dockerfile that comes with the CPython repo specifies a Fedora:40 image. The PC has an AMD Ryzen 7 3700X with 32GB of memory. I will use this machine to screen options before sending them on to a lab at Microsoft that is set up to run CPython benchmarks. More information on that here, and special thanks to Michael Droettboom for running these benchmarks!

First I would like to figure out which compiler options we can get for “free”: warning flags that don’t generate any new warnings, and other hardening options that have minimal impact on pyperformance benchmarks. These flags will be benchmarked on the Microsoft test machines, and if the results are close to the baseline they will be included in a PR.

Benchmarks

The OpenSSF guidance document provides a TL;DR set of compiler options for C/C++ code that the working group considers worth enabling for the majority of projects. The first tests should be with these options:

-O2 -Wall -Wformat -Wformat=2 -Wconversion -Wimplicit-fallthrough \
-Werror=format-security \
-U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=3 \
-D_GLIBCXX_ASSERTIONS \
-fstrict-flex-arrays=3 \
-fstack-clash-protection -fstack-protector-strong \
-Wl,-z,nodlopen -Wl,-z,noexecstack \
-Wl,-z,relro -Wl,-z,now

Compiler and linker flags can be passed to the CPython configuration script. We know that the linker options are going to impact performance, so we will benchmark each of the options we think we are going to implement, observe the compile-time warnings, and assess the benchmarks to arrive at an initial PR of options that neither generate warnings nor impact performance.

Note: Benchmark numbering may not make much sense without more context. The intent is that these runs can be referenced in my backlog of benchmarks sometime in the future.

Run Name	Configuration Command	Notes
Run 45	./configure CFLAGS="-O2 -fdiagnostics-format=json" LDFLAGS=""
Run 46	./configure CFLAGS="-Wall -fdiagnostics-format=json" LDFLAGS=""	Just a warning, true zero performance impact
Run 47	./configure CFLAGS="-Wformat=2 -fdiagnostics-format=json" LDFLAGS=""	Just a warning, true zero performance impact
Run 48	./configure CFLAGS="-Wconversion -fdiagnostics-format=json" LDFLAGS=""	Just a warning, true zero performance impact
Run 49	./configure CFLAGS="-Wimplicit-fallthrough -fdiagnostics-format=json" LDFLAGS=""	Just a warning, true zero performance impact
Run 50	./configure CFLAGS="-Werror=format-security -fdiagnostics-format=json" LDFLAGS=""	Just a warning, true zero performance impact
Run 51	./configure CFLAGS="-fstack-protector-strong -fdiagnostics-format=json" LDFLAGS=""
Run 52	./configure CFLAGS="-fcf-protection=full -fdiagnostics-format=json" LDFLAGS=""
Run 53	./configure CFLAGS="-Wtrampolines -fdiagnostics-format=json" LDFLAGS=""	Just a warning, true zero performance impact
Run 54	./configure CFLAGS="-fno-delete-null-pointer-checks -fdiagnostics-format=json" LDFLAGS=""
Run 55	./configure CFLAGS="-fno-strict-overflow -fdiagnostics-format=json" LDFLAGS=""
Run 56	./configure CFLAGS="-fstrict-flex-arrays=1 -fdiagnostics-format=json" LDFLAGS=""
Run 57	./configure CFLAGS="-fno-strict-aliasing -fdiagnostics-format=json" LDFLAGS=""
Run 58	./configure CFLAGS="-Wbidi-chars=any -fdiagnostics-format=json" LDFLAGS=""	Just a warning, true zero performance impact
Run 59	./configure	None (another baseline run for fun)
Run 60	./configure	None (yet another baseline run)

Above run data is run_set_20240618_011115
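
The one-flag-per-run pattern in the table above can be generated rather than typed by hand. The following is a minimal sketch, not the script actually used for these runs; the helper names `configure_command` and `run_plan` are hypothetical, and it assumes each flag under test contains no shell quoting characters.

```python
def configure_command(flag=None, diagnostics_json=True):
    """Build a ./configure invocation that tests one flag in isolation.

    With no flag, return the plain baseline invocation.
    """
    if flag is None:
        return "./configure"
    cflags = [flag]
    if diagnostics_json:
        # Ask the compiler for machine-readable diagnostics, as in the table above.
        cflags.append("-fdiagnostics-format=json")
    return f'./configure CFLAGS="{" ".join(cflags)}" LDFLAGS=""'


def run_plan(flags, first_run=45):
    """Pair each flag with a sequential run name, starting at `first_run`."""
    return [
        (f"Run {first_run + offset}", configure_command(flag))
        for offset, flag in enumerate(flags)
    ]


if __name__ == "__main__":
    for name, command in run_plan(["-O2", "-Wall", "-Wformat=2"]):
        print(name, command, sep="\t")
```

Keeping each run to a single flag makes it possible to attribute any new warning or benchmark change to that flag alone.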

Results

By default CPython is compiled with -O3 optimization, so -O2 is not considered for inclusion, although it was tested here.

We can see that warnings are generated when some options are enabled:

Run Name	New Warnings?
Run 45	No
Run 46	Yes
Run 47	Yes (-Wformat-nonliteral)
Run 48	Yes (-Wsign-conversion, -Wconversion, -Wfloat-conversion)
Run 49	No
Run 50	Yes
Run 51	No
Run 52	No
Run 53	No
Run 54	Yes (-Wformat-overflow replaces a -Wstringop-overflow)
Run 55	No
Run 56	No
Run 57	No
Run 58	No
Run 59	N/A
Run 60	N/A
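
Deciding whether a run produced new warnings amounts to comparing warning counts against a baseline. The sketch below shows one way this could be done with the output of -fdiagnostics-format=json, where GCC emits a JSON array of diagnostic objects with fields such as `kind` and `option`. It is an illustration under assumptions, not the tooling used for the table above: the function names are hypothetical, and it assumes the JSON array for a build has already been separated from the rest of the build log.

```python
import json
from collections import Counter


def count_warnings(diagnostics_json):
    """Count warnings per -W option in one GCC JSON diagnostics array."""
    counts = Counter()
    for diagnostic in json.loads(diagnostics_json):
        # Skip notes and errors; only warnings are of interest here.
        if diagnostic.get("kind") == "warning":
            counts[diagnostic.get("option", "<no option>")] += 1
    return counts


def new_warnings(baseline, candidate):
    """Return the options whose warning count grew relative to the baseline."""
    return {
        option: count - baseline.get(option, 0)
        for option, count in candidate.items()
        if count > baseline.get(option, 0)
    }
```

Comparing counts per option, rather than the total, also catches a case like Run 54, where one kind of warning replaces another while the total stays the same.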

Below is an analysis of the benchmarks for the options that did not generate new warnings, including the flags that only enable warnings:

Run 49 vs. Run 59

-Wimplicit-fallthrough

Benchmark Tag	Geometric Mean
apps	1.00x slower
asyncio	1.01x slower
math	1.00x faster
regex	1.00x slower
serialize	1.00x slower
startup	1.00x slower
template	1.01x slower
overall	1.00x slower

Just a warning flag. Including in next stage.

Run 51 vs. Run 59

-fstack-protector-strong

Benchmark Tag	Geometric Mean
apps	1.01x slower
asyncio	1.01x slower
math	1.01x faster
regex	1.01x faster
serialize	1.02x slower
startup	1.01x slower
template	1.04x slower
overall	1.01x slower

A little suspicious but will include it in the set to be tested in the Microsoft lab.

Run 53 vs. Run 59

-Wtrampolines

Benchmark Tag	Geometric Mean
apps	1.00x slower
asyncio	1.01x slower
math	1.00x faster
regex	1.00x slower
serialize	1.00x slower
startup	1.00x slower
template	1.01x slower
overall	1.00x slower

Looks good to move to next step.

Run 55 vs. Run 59

-fno-strict-overflow

Benchmark Tag	Geometric Mean
apps	1.01x slower
asyncio	1.01x slower
math	1.00x faster
regex	1.00x faster
serialize	1.01x slower
startup	1.00x slower
template	1.01x slower
overall	1.01x slower

Looks good to include in next step.
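
For readers unfamiliar with how figures such as "1.01x slower" in the tables above come about, here is a minimal sketch of a geometric mean over per-benchmark time ratios. This is not the code of the benchmarking tools; the function name is hypothetical, and the convention that a ratio of candidate time to baseline time of at least 1 is reported as "slower" is an assumption made for illustration.

```python
from statistics import geometric_mean


def speed_ratio_summary(baseline_times, candidate_times):
    """Summarize candidate vs. baseline timings as one 'N.NNx slower/faster' string.

    Both arguments map benchmark names to mean run times in seconds.
    Only benchmarks present in both mappings are compared.
    """
    ratios = [
        candidate_times[name] / baseline_times[name]
        for name in baseline_times
        if name in candidate_times
    ]
    # The geometric mean treats a 2x slowdown and a 2x speedup as cancelling out,
    # which an arithmetic mean of ratios would not.
    mean_ratio = geometric_mean(ratios)
    if mean_ratio >= 1.0:
        return f"{mean_ratio:.2f}x slower"
    return f"{1.0 / mean_ratio:.2f}x faster"
```

With two decimal places, a result labelled "1.00x slower" and one labelled "1.00x faster" differ by less than half a percent, which is why several rows above are effectively indistinguishable from the baseline.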

Benchmark in Microsoft Lab

I modified the CPython autoconf configure.ac to check and enable the options above:

# Enable flags that warn and protect for potential security vulnerabilities.
# These flags should be enabled by default for all builds.
AX_CHECK_COMPILE_FLAG([-Wimplicit-fallthrough], [BASECFLAGS="$BASECFLAGS -Wimplicit-fallthrough"], [AC_MSG_WARN([-Wimplicit-fallthrough not supported])])
AX_CHECK_COMPILE_FLAG([-fstack-protector-strong], [BASECFLAGS="$BASECFLAGS -fstack-protector-strong"], [AC_MSG_WARN([-fstack-protector-strong not supported])])
AX_CHECK_COMPILE_FLAG([-fno-strict-overflow], [BASECFLAGS="$BASECFLAGS -fno-strict-overflow"], [AC_MSG_WARN([-fno-strict-overflow not supported])])
case $CC in
  *gcc*)
    # Add GCC-specific compiler flags
    AX_CHECK_COMPILE_FLAG([-Wtrampolines], [BASECFLAGS="$BASECFLAGS -Wtrampolines"], [AC_MSG_WARN([-Wtrampolines not supported])])
    ;;
esac
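
The idea behind the checks above is to try compiling a trivial program with the flag and see whether the compiler accepts it. The sketch below loosely mirrors that idea outside of autoconf so a flag can be tried by hand; it is not how the macro itself is implemented. The function names are hypothetical, and passing -Werror alongside the flag is an assumption added here so that compilers which only warn about an unknown option are treated as not supporting it.

```python
import os
import subprocess
import tempfile


def flag_supported(cc, flag):
    """Return True if `cc` compiles a trivial C program with `flag` and -Werror."""
    with tempfile.TemporaryDirectory() as tmp:
        source = os.path.join(tmp, "conftest.c")
        with open(source, "w") as handle:
            handle.write("int main(void) { return 0; }\n")
        command = [cc, "-Werror", flag, "-c", source,
                   "-o", os.path.join(tmp, "conftest.o")]
        try:
            result = subprocess.run(command, capture_output=True)
        except OSError:
            # The compiler executable is missing or cannot be run.
            return False
        return result.returncode == 0


def supported_flags(cc, flags):
    """Filter `flags` down to those that `cc` accepts."""
    return [flag for flag in flags if flag_supported(cc, flag)]
```

For example, `supported_flags("gcc", ["-Wimplicit-fallthrough", "-Wtrampolines"])` would report which of the two a locally installed gcc accepts; the result depends on the compiler and its version.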

The benchmarks showed that the difference between my build and the baseline was below the threshold that we could reasonably measure.

Base vs. New Options

fork: nohlson
ref: enable_no_impact_def
machine: linux-x86_64
commit hash: 98d9ea0
commit date: 2024-06-20
overall geometric mean: 1.00x slower
HPT reliability: 70.47%
HPT 99th percentile: 1.00x faster
Memory change: 1.00x

The full set of benchmarks can be found here

Conclusion

I will create an initial PR with the following options since they do not negatively impact performance and do not generate new warnings at compile time:

-Wimplicit-fallthrough -fstack-protector-strong -fno-strict-overflow -Wtrampolines

PR: gh-112301: Enable compiler flags with low performance impact and no warnings #120975

UPDATE (6/25/24)

It was noted in the comments of the PR that -fno-strict-overflow is already enabled if the compiler supports it. I removed my check for that flag, and the code enabling it, since both are redundant.
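
A redundancy like this one can be caught early by comparing the candidate flags against the flags a build already uses. The following is a small sketch of that comparison; the function name `redundant_flags` is hypothetical, and it only detects exact matches, so equivalent flags spelled differently would be missed.

```python
import shlex


def redundant_flags(existing_cflags, candidates):
    """Return the candidate flags that already appear in an existing CFLAGS string."""
    # shlex.split handles quoted arguments the way a POSIX shell would.
    present = set(shlex.split(existing_cflags or ""))
    return [flag for flag in candidates if flag in present]
```

On a POSIX build of CPython, one place such a string could come from is `sysconfig.get_config_var("CFLAGS")`, which reports the flags recorded when that interpreter was built.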

CC BY-SA 4.0 Nate Ohlson. Last modified: June 25, 2024.
