Podcast overview: Internals of GCC
an interview with Morgan Deters
In this post I go over what I understood from this podcast, which was published on the Software Engineering Radio website.
A couple of semesters ago I took my Advanced Programming course, which focuses on some of the low-level facilities that the C language provides. In the first few lectures I was taught that libraries need to be linked manually from the command line. I also learnt that I could pass a couple of options to the compiler so that it applies some optimizations while it compiles my program.
I'm still unsure about what each of the layers does that my C program has to go through in order to be transformed into machine code. This podcast gave me a nice insight into how a tool such as GCC manages to perform such optimizations.
From what I understood, GCC first transforms the code into a standardized tree of instructions that is architecture independent. So every language supported by GCC converges into the same internal representation, regardless of the architecture that is being targeted. A "front end" needs to be built for each of the languages, since high-level expressions are treated differently in each one. This language-specific front end is then connected, through the so-called "middle end", to the architecture-specific work done in the back end.
At one point in the podcast there is a question about optimization support for multi-core architectures in the GCC collection. My friend and classmate Rodrigo Garcia speculated about GCC's lack of support for this kind of architecture in his very own post. Personally I believe that there will be support at some point. However, the number of variables that have to be considered to build a safe compiler makes such optimizations much harder to build, because resource sharing is, as far as I know, one of the fundamentals of multi-core processing, and it is also quite risky and difficult to implement.
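Going back to the front end, middle end and back end split, here is a small sketch of how one could peek at those stages. The function `sum_to` and the file name are my own, made up for the example, and the commands in the comment are just the ones I would try first, assuming a reasonably recent GCC. GIMPLE is the name GCC gives to one of its intermediate tree forms; the podcast summary above does not name it.

```c
/* sum_demo.c (hypothetical file name)
 *
 * Commands one could try:
 *     gcc -O2 -fdump-tree-gimple -c sum_demo.c
 *         also writes a dump file with the GIMPLE form of the function,
 *         an intermediate form that is largely independent of the target CPU
 *     gcc -O2 -S sum_demo.c
 *         stops after the back end and leaves the target-specific
 *         assembly in sum_demo.s
 */

/* Sum of the integers 1..n, written as a plain loop so the optimizer
 * has something to work on. Returns 0 when n is 0. */
unsigned int sum_to(unsigned int n)
{
    unsigned int total = 0;
    for (unsigned int i = 0; i < n; i++) {
        total += i + 1;   /* adds 1, 2, ..., n */
    }
    return total;
}
```

Comparing the assembly produced with `-O0` and with `-O2` is a simple way to see that the optimizations happen inside the compiler, before the program ever runs. The exact output depends on the GCC version and on the target machine.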
One thing I found interesting is the allocation of registers at the assembly level. It seems that this has to be done by GCC, since a processor only has a few registers available for the execution of a program. I believe that this process might play a role in making multi-processor optimization as limited as it is.
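To give a rough picture of why this matters, here is a small sketch of my own (the function `dot8` is hypothetical, not something taken from the podcast). It is written with eight partial products that are all needed for the final sum. Each value the compiler keeps around wants a register, and on a machine with few registers it may have to keep some of them in memory instead, which is usually called spilling.

```c
/* Dot product of two 8-element integer vectors, written with eight
 * separate partial products that are all used in the final addition. */
int dot8(const int a[8], const int b[8])
{
    int s0 = a[0] * b[0];
    int s1 = a[1] * b[1];
    int s2 = a[2] * b[2];
    int s3 = a[3] * b[3];
    int s4 = a[4] * b[4];
    int s5 = a[5] * b[5];
    int s6 = a[6] * b[6];
    int s7 = a[7] * b[7];

    /* How many of these values really sit in registers at the same
     * time is decided by the compiler, not by the source code. */
    return s0 + s1 + s2 + s3 + s4 + s5 + s6 + s7;
}
```

Compiling this with `gcc -O2 -S` and reading the resulting assembly shows which registers were actually chosen. The answer depends on the target architecture and on the GCC version, so I would not expect two machines to agree.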