Optimization in C++ Compilers: 2 – Compile Flags
Many compiled languages suffer from long compile times. In C++, compile time costs developers more and more as a project grows, and some companies work solely on reducing it. The problem is especially visible in the game industry, where the size of the products makes builds slow. Companies such as Incredibuild develop technology to speed these processes up. In this article, I will talk about the optimization flags in GCC.
Compilers provide flags to control optimization, and with these flags optimization can be performed at various levels. In GCC, optimization levels are selected with the “-O” flag, which takes a different suffix for each level. Each -O level is really a group: it enables a set of individual optimizations whose names start with -f, and most of those can also be switched on or off on their own.
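One way to see from inside a program which kind of -O level it was built with is to inspect the predefined macros that GCC documents for this purpose: __OPTIMIZE__, __OPTIMIZE_SIZE__, __NO_INLINE__ and __FAST_MATH__. The sketch below is only an illustration. The BuildInfo struct and the function names are hypothetical and not part of GCC.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper type: records what the predefined macros say about
// the optimization mode of this translation unit.
struct BuildInfo {
    bool optimizing;           // __OPTIMIZE__: any optimizing compilation
    bool optimizing_for_size;  // __OPTIMIZE_SIZE__: size preferred over speed
    bool inlining_disabled;    // __NO_INLINE__: no functions are inlined
    bool fast_math;            // __FAST_MATH__: -ffast-math is in effect
};

BuildInfo build_info() {
    BuildInfo info = {false, false, false, false};
#ifdef __OPTIMIZE__
    info.optimizing = true;
#endif
#ifdef __OPTIMIZE_SIZE__
    info.optimizing_for_size = true;
#endif
#ifdef __NO_INLINE__
    info.inlining_disabled = true;
#endif
#ifdef __FAST_MATH__
    info.fast_math = true;
#endif
    return info;
}

// Summarizes the build as "none", "speed" or "size".
std::string optimization_mode() {
    const BuildInfo info = build_info();
    // GCC documents __OPTIMIZE__ as defined in all optimizing compilations,
    // so a size-optimized build is expected to set both macros.
    assert(!info.optimizing_for_size || info.optimizing);
    if (info.optimizing_for_size) return "size";
    if (info.optimizing) return "speed";
    return "none";
}
```

Built with -O0 this is expected to report "none", with -O2 "speed" and with -Os "size". To see the exact set of -f flags behind a level, GCC can be invoked with -Q --help=optimizers together with the -O option in question.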
-O0 turns optimization off, and it is the level GCC uses when no -O flag is given. It is generally preferred for code that will be debugged intensively, because it keeps compile time short. Since no optimization is performed, -O0 is the fastest level to compile. However, it does not produce the best code size or execution speed, so it is not recommended for builds that will ship as products.
-O1 provides the most basic optimization by enabling the following flags, without using too much compile time or memory. In most cases it improves both code size and execution speed.
Flags that are automatically activated with -O1:
-fauto-inc-dec -fbranch-count-reg -fcombine-stack-adjustments -fcompare-elim -fcprop-registers -fdce -fdefer-pop -fdelayed-branch -fdse -fforward-propagate -fguess-branch-probability -fif-conversion -fif-conversion2 -finline-functions-called-once -fipa-modref -fipa-profile -fipa-pure-const -fipa-reference -fipa-reference-addressable -fmerge-constants -fmove-loop-invariants -fmove-loop-stores -fomit-frame-pointer -freorder-blocks -fshrink-wrap -fshrink-wrap-separate -fsplit-wide-types -fssa-backprop -fssa-phiopt -ftree-bit-ccp -ftree-ccp -ftree-ch -ftree-coalesce-vars -ftree-copy-prop -ftree-dce -ftree-dominator-opts -ftree-dse -ftree-forwprop -ftree-fre -ftree-phiprop -ftree-pta -ftree-scev-cprop -ftree-sink -ftree-slsr -ftree-sra -ftree-ter -funit-at-a-time
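As a rough, source-level picture of what passes such as -ftree-ccp (constant propagation) and -ftree-dce (dead code elimination) from the list above aim for, the sketch below shows a function before and after a simplification done by hand. The function names are hypothetical, and whether GCC arrives at exactly this result depends on the version and target.

```cpp
#include <cassert>

// Written the "long way": contains a dead store and an unreachable branch.
int scale_verbose(int x) {
    const int factor = 4;
    int result = x / 2;   // dead store: overwritten before it is ever read
    if (factor > 10) {    // always false once factor is known to be 4
        return -1;        // unreachable
    }
    result = x * factor;
    return result;
}

// What remains after propagating the constant and removing dead code by hand.
int scale_simplified(int x) {
    return x * 4;
}

// An optimization may never change observable behaviour, so both versions
// must agree for every input that does not overflow.
void check_equivalence(int x) {
    assert(scale_verbose(x) == scale_simplified(x));
}
```

Comparing the assembly of both functions (for example with -S) at -O0 and at -O1 is a simple way to check what the compiler actually did.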
-O2 enables all the flags in -O1 as well as the following flags in order to further increase performance. Unless special optimization is required, it is the recommended and usually sufficient optimization level for Release builds of products.
Flags that are automatically activated with -O2:
-falign-functions -falign-jumps -falign-labels -falign-loops -fcaller-saves -fcode-hoisting -fcrossjumping -fcse-follow-jumps -fcse-skip-blocks -fdelete-null-pointer-checks -fdevirtualize -fdevirtualize-speculatively -fexpensive-optimizations -ffinite-loops -fgcse -fgcse-lm -fhoist-adjacent-loads -finline-functions -finline-small-functions -findirect-inlining -fipa-bit-cp -fipa-cp -fipa-icf -fipa-ra -fipa-sra -fipa-vrp -fisolate-erroneous-paths-dereference -flra-remat -foptimize-sibling-calls -foptimize-strlen -fpartial-inlining -fpeephole2 -freorder-blocks-algorithm=stc -freorder-blocks-and-partition -freorder-functions -frerun-cse-after-loop -fschedule-insns -fschedule-insns2 -fsched-interblock -fsched-spec -fstore-merging -fstrict-aliasing -fthread-jumps -ftree-builtin-call-dce -ftree-loop-vectorize -ftree-pre -ftree-slp-vectorize -ftree-switch-conversion -ftree-tail-merge -ftree-vrp -fvect-cost-model=very-cheap
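One flag in this list that often matters in practice is -fstrict-aliasing: it lets the compiler assume that pointers to unrelated types do not refer to the same object, so reading a float through a cast uint32_t pointer is undefined behaviour and may stop working at -O2. A minimal sketch of the usual safe alternative, copying the bytes with std::memcpy, is shown below. The function names are hypothetical, and the expected bit pattern assumes IEEE 754 single precision.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Returns the object representation of a float as a 32-bit integer
// without violating the strict aliasing rule.
std::uint32_t float_bits(float value) {
    static_assert(sizeof(std::uint32_t) == sizeof(float),
                  "this sketch assumes a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);  // well-defined byte copy
    return bits;
}

// Assuming IEEE 754 single precision, 1.0f is encoded as 0x3f800000.
void check_float_bits() {
    assert(float_bits(1.0f) == 0x3f800000u);
}
```

Compilers typically turn the std::memcpy call into a plain register move when optimizing, so the safe version is not expected to cost anything.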
-O3 enables all the flags in -O2 plus the more aggressive ones listed below. However, this does not mean that it always produces the fastest code. It can increase compile time and memory usage, and because the generated code tends to be larger, the program may even run slower in some cases. Third-party libraries do not have to be built with -O3 as well, since code compiled at different optimization levels can be linked together, but more aggressive optimization can expose latent bugs such as undefined behaviour.
Flags that are automatically activated with -O3:
-fgcse-after-reload -fipa-cp-clone -floop-interchange -floop-unroll-and-jam -fpeel-loops -fpredictive-commoning -fsplit-loops -fsplit-paths -ftree-loop-distribution -ftree-partial-pre -funswitch-loops -fvect-cost-model=dynamic -fversion-loops-for-strides
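To give an idea of the kind of transformation these flags stand for, the sketch below shows by hand what loop unswitching (-funswitch-loops in the list above) is meant to achieve: a condition that does not change inside the loop is moved outside it, at the price of duplicating the loop. The function names are hypothetical, and GCC decides on its own whether the transformation is worthwhile.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// The loop-invariant condition is tested on every iteration.
long signed_sum(const std::vector<int>& values, bool negate) {
    long total = 0;
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (negate) {
            total -= values[i];
        } else {
            total += values[i];
        }
    }
    return total;
}

// Hand-unswitched form: one test, two specialized loops, more code.
long signed_sum_unswitched(const std::vector<int>& values, bool negate) {
    long total = 0;
    if (negate) {
        for (std::size_t i = 0; i < values.size(); ++i) total -= values[i];
    } else {
        for (std::size_t i = 0; i < values.size(); ++i) total += values[i];
    }
    return total;
}

// Both forms must produce the same result for either value of negate.
void check_same_result(const std::vector<int>& values) {
    assert(signed_sum(values, false) == signed_sum_unswitched(values, false));
    assert(signed_sum(values, true) == signed_sum_unswitched(values, true));
}
```

The duplicated loop also shows why -O3 tends to produce larger binaries than -O2.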
-Os optimizes for code size. It enables the -O2 flags except the following ones, which tend to increase code size. It can be preferred when working on devices with limited memory.
Flags removed from -O2 by -Os:
-falign-functions -falign-jumps -falign-labels -falign-loops -fprefetch-loop-arrays -freorder-blocks-algorithm=stc
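Source code can cooperate with a size-optimized build as well. GCC defines the macro __OPTIMIZE_SIZE__ when optimizing for size, so a function can keep a larger, speed-oriented path out of such builds. The sketch below is a hypothetical example of that idea, not something GCC requires. The function name and the choice of a 4x unrolled loop are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Sums an array. The manually unrolled path is compiled only when the
// build is not optimizing for size.
long sum_array(const int* data, std::size_t count) {
    assert(data != nullptr || count == 0);
    long total = 0;
    std::size_t i = 0;
#ifndef __OPTIMIZE_SIZE__
    // Speed-oriented path: four elements per iteration, more code bytes.
    for (; i + 4 <= count; i += 4) {
        total += data[i];
        total += data[i + 1];
        total += data[i + 2];
        total += data[i + 3];
    }
#endif
    // Compact path: also handles the leftover elements of the unrolled loop.
    for (; i < count; ++i) {
        total += data[i];
    }
    return total;
}
```

The result is the same in both builds. Only the amount of generated code is expected to differ.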
-Og can be thought of as the optimization level for debugging. It offers a better debugging experience than -O0 by applying minimal optimization while keeping compilation fast. It enables the -O1 flags except the following ones, which may interfere with debugging:
-fbranch-count-reg -fdelayed-branch -fdse -fif-conversion -fif-conversion2 -finline-functions-called-once -fmove-loop-invariants -fmove-loop-stores -fssa-phiopt -ftree-bit-ccp -ftree-dse -ftree-pta -ftree-sra
-Oz optimizes aggressively for size rather than speed. To do so, it prefers instructions that take fewer bytes to encode, even if more of them have to be executed. It is similar to -Os in general structure, and most of the flags in -O2 are also active.
-Ofast enables all flags in -O3 and turns off -fsemantic-interposition. It disregards strict standards compliance, so it also enables optimizations that are not valid for every standard-compliant program. (Not applicable to all programs.)
While doing this, it also activates the following flags for C++:
-ffast-math -fallow-store-data-races
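To illustrate why -Ofast is not valid for all programs: -ffast-math, which -Ofast turns on, allows floating-point operations to be regrouped, and regrouping can change results. The sketch below shows two groupings of the same sum that differ under ordinary IEEE 754 double arithmetic. The function names are hypothetical. __FAST_MATH__ is the macro GCC defines when -ffast-math is in effect.

```cpp
#include <cassert>

// Adds from the left: (a + b) + c
double sum_left(double a, double b, double c) {
    return (a + b) + c;
}

// Adds from the right: a + (b + c)
double sum_right(double a, double b, double c) {
    return a + (b + c);
}

// Reports whether this translation unit was compiled with -ffast-math.
bool fast_math_enabled() {
#ifdef __FAST_MATH__
    return true;
#else
    return false;
#endif
}

// Without fast math the grouping is respected, and 1.0 is lost when it is
// first added to -1e20 because it is far below that value's precision.
void check_grouping_matters() {
    if (!fast_math_enabled()) {
        assert(sum_left(1e20, -1e20, 1.0) == 1.0);
        assert(sum_right(1e20, -1e20, 1.0) == 0.0);
    }
}
```

Under -Ofast the compiler is free to treat both functions as equivalent, so code that relies on a specific evaluation order, or on NaN and infinity handling, should be checked before the flag is used.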
That covers, in broad strokes, the optimization options you can apply through the compiler. These options affect compile time. In a future article, I plan to cover Link Time Optimization (LTO), which affects link time.