CodeNewbie Community 🌱

Tristan

Java Quickie. Compilers

Introduction

  • This series is dedicated to a basic understanding of Java. Whenever I find myself asking, "How does this work?", I will create a blog post and put it here. The series will not be in order, so feel free to read whatever post you find most relevant.
  • This post came about as a result of wondering what a compiler is, what it does, and why it is so important.

A brief overview of compilers

  • Compilers are fundamental to modern computing. They act as translators, transforming human-readable code (source code) into computer-readable code (machine code). Compilers are what allow us to ignore all the machine-dependent details of a machine language. It is thanks to compilers that we can run the same source code on different machines.
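
  • To make the idea of "translating" a bit more concrete, here is a minimal sketch of a toy compiler. Everything in it is an assumption made up for illustration: the class name ToyCompiler, the tiny source language (whole numbers with + and *), and the PUSH/ADD/MUL target instructions. Real compilers are far more involved, so treat this as a sketch rather than how javac or any real compiler works.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy compiler: translates expressions such as "2 + 3 * 4"
// into instructions for an imaginary stack machine.
final class ToyCompiler {
    private final String src;
    private int pos = 0;
    private final List<String> out = new ArrayList<>();

    private ToyCompiler(String src) {
        // Spaces carry no meaning in this toy language, so drop them up front.
        this.src = src.replace(" ", "");
    }

    // Entry point: source text in, list of target instructions out.
    static List<String> compile(String source) {
        ToyCompiler c = new ToyCompiler(source);
        c.expression();
        if (c.pos != c.src.length()) {
            throw new IllegalArgumentException("Unexpected input at position " + c.pos);
        }
        return c.out;
    }

    // expression := term ('+' term)*
    private void expression() {
        term();
        while (pos < src.length() && src.charAt(pos) == '+') {
            pos++;
            term();
            out.add("ADD");
        }
    }

    // term := number ('*' number)*
    private void term() {
        number();
        while (pos < src.length() && src.charAt(pos) == '*') {
            pos++;
            number();
            out.add("MUL");
        }
    }

    // number := digit+
    private void number() {
        int start = pos;
        while (pos < src.length() && Character.isDigit(src.charAt(pos))) {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("Expected a number at position " + pos);
        }
        out.add("PUSH " + src.substring(start, pos));
    }

    public static void main(String[] args) {
        System.out.println(compile("2 + 3 * 4")); // prints [PUSH 2, PUSH 3, PUSH 4, MUL, ADD]
    }
}
```

  • The source text on the left ("2 + 3 * 4") is what a human writes, and the instruction list on the right is what an (imaginary) machine would execute. Notice that MUL comes before ADD, because the translator has already worked out the order of operations.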

History of compilers

  • The term compiler was coined in the early 1950s by Grace Murray Hopper. At the time, compilation was called automatic programming, and there was a lot of skepticism about it ever becoming successful. If you are unfamiliar with Grace Hopper and her accomplishments, just know that she was a very hardcore computer scientist whose work laid much of the groundwork for modern programming.

  • Among the first real compilers were the Fortran compilers. These compilers presented the user with a largely machine-independent source language. They also performed some optimization to produce efficient machine code. These Fortran compilers really paved the way for the flood of modern languages and compilers that we enjoy today.

  • In the early days of compilers, things were sort of built as they were needed, which meant compilers were often complex and costly. Today, compiler construction is well documented and understood. Do not get the wrong idea, though: compiler construction is still a very complex feat of engineering.

What compilers do

  • As stated earlier, compilers act as translators from a programming language (the source) to a machine language (the target). This oversimplification might make you believe that all compilers do the same thing, but that is not true. Compilers can be distinguished in two ways.

1) The kind of machine code they generate
2) The format of the machine code they generate

  • Machine code is just the binary-encoded instructions (all the ones and zeros) that the computer understands and executes directly.

Machine code generated by compilers

  • Compilers may generate 3 types of code by which they can be differentiated.

1) Pure machine code
2) Augmented machine code
3) Virtual machine code

Pure machine code

  • Compilers generate this code for a particular machine without assuming the existence of any operating system. This kind of machine code is called pure code because it includes nothing but instructions for the machine. This approach is rare and is usually only used in compilers which are intended for implementing operating systems or embedded applications. This form of machine code can execute on bare hardware without dependence on any other software.

Augmented machine code

  • Compilers generate this kind of code for machines that are augmented with an operating system. Running a program generated by such a compiler requires that a particular operating system be present on the target machine. Most Fortran compilers generated this kind of code. This was basically the world of compilers before Java.

Virtual machine code

  • Here the generated code is composed entirely of virtual instructions. This approach is particularly attractive as a technique for producing code that can run easily on a variety of computers. This level of portability is achieved by writing an interpreter for the virtual machine (VM) on each target machine. Code generated by a compiler that targets the VM can then be run on any machine that has an interpreter for the VM. This is basically how the Java Virtual Machine (JVM) works. The Java compiler produces virtual instructions (bytecode) that can run on any computer for which a JVM is available.
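
  • Here is a minimal sketch of what "an interpreter for the virtual machine" can look like. The class name TinyVm and its three instructions (PUSH, ADD, MUL) are made up for this post and are not real JVM bytecode. The real JVM is vastly more elaborate, so this only illustrates the general idea of a program that reads virtual instructions and carries them out.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical virtual machine with three made-up instructions.
final class TinyVm {
    static final int PUSH = 0; // followed by one operand: the value to push
    static final int ADD = 1;  // pop two values, push their sum
    static final int MUL = 2;  // pop two values, push their product

    // Interprets the virtual instructions one at a time and returns
    // the value left on top of the stack when the program ends.
    static int run(int[] code) {
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0; // program counter: index of the next instruction
        while (pc < code.length) {
            int op = code[pc++];
            switch (op) {
                case PUSH:
                    stack.push(code[pc++]);
                    break;
                case ADD:
                    stack.push(stack.pop() + stack.pop());
                    break;
                case MUL:
                    stack.push(stack.pop() * stack.pop());
                    break;
                default:
                    throw new IllegalStateException("Unknown instruction: " + op);
            }
        }
        return stack.pop();
    }

    public static void main(String[] args) {
        // Virtual program for 2 + 3 * 4.
        int[] program = {PUSH, 2, PUSH, 3, PUSH, 4, MUL, ADD};
        System.out.println(run(program)); // prints 14
    }
}
```

  • The program array never mentions a real processor. Any machine that can run this interpreter can run the program, which is the portability argument made above.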

Machine Code Formats

  • Another way that compilers differ from one another is in the format of machine code they generate. Formats may be categorized as follows:

1) Assembly source format
2) Relocatable binary
3) Absolute binary

Assembly Language format

  • Before we dive any deeper, I just want to make a quick distinction between machine code and assembly language. Assembly language is a low-level, human-readable programming language that requires a piece of software called an assembler to convert it into machine code. Machine code is the binary that the machine actually understands.
  • Generating assembly code is useful for cross-compilation, where the compiler executes on one computer but generates code that can execute on another machine.
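
  • To illustrate the difference between assembly language and machine code, here is a minimal sketch of a toy assembler. The class name ToyAssembler, the mnemonics, and the opcode numbers are all invented for this example and do not belong to any real processor. A real assembler also handles labels, addressing modes, and much more.

```java
import java.io.ByteArrayOutputStream;
import java.util.Map;

// Hypothetical assembler for an imaginary machine: each line is a mnemonic
// optionally followed by one numeric operand (0 to 255).
final class ToyAssembler {
    // Made-up opcode numbers, not those of any real processor.
    private static final Map<String, Integer> OPCODES = Map.of(
            "LOAD", 0x01,
            "ADD", 0x02,
            "STORE", 0x03,
            "HALT", 0xFF);

    // Converts human-readable assembly text into raw bytes.
    static byte[] assemble(String source) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String line : source.split("\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            String[] parts = trimmed.split("\\s+");
            Integer opcode = OPCODES.get(parts[0]);
            if (opcode == null) {
                throw new IllegalArgumentException("Unknown mnemonic: " + parts[0]);
            }
            out.write(opcode); // one byte for the instruction
            if (parts.length > 1) {
                out.write(Integer.parseInt(parts[1])); // one byte for the operand
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] machineCode = assemble("LOAD 10\nADD 11\nSTORE 12\nHALT");
        StringBuilder hex = new StringBuilder();
        for (byte b : machineCode) {
            hex.append(String.format("%02X ", b));
        }
        System.out.println(hex.toString().trim()); // prints 01 0A 02 0B 03 0C FF
    }
}
```

  • The text going in ("LOAD 10") is the assembly language, and the bytes coming out (01 0A) are the machine code for our imaginary machine.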

Relocatable Binary

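  • Most production-quality compilers do not generate assembly language. Instead, most generate target code in relocatable binary because it is more efficient and allows the compiler more control over the translation process. Relocatable binary is not yet runnable on its own: external references and code addresses are not bound until a linker (or loader) resolves them.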
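
  • A minimal sketch of what "relocatable" means in practice, under some loud assumptions: the class name ToyRelocator, the {opcode, address} layout, and the numbers are all invented here. The idea is that addresses in the code are stored relative to zero, and a relocation table lists which slots need the final load address added. Real object file formats are far richer than this.

```java
import java.util.Arrays;

// Hypothetical relocatable object: code whose address slots are still
// relative to 0, plus a table saying which slots must be patched.
final class ToyRelocator {
    // Returns a copy of the code with every listed slot shifted by the load address.
    static int[] relocate(int[] code, int[] relocationSlots, int loadAddress) {
        int[] absolute = code.clone();
        for (int slot : relocationSlots) {
            absolute[slot] += loadAddress;
        }
        return absolute;
    }

    public static void main(String[] args) {
        // Made-up layout: {opcode, address, opcode, address}.
        int[] code = {1, 4, 3, 6};
        // Only the address slots (indexes 1 and 3) need patching.
        int[] relocationSlots = {1, 3};
        int[] absolute = relocate(code, relocationSlots, 1000);
        System.out.println(Arrays.toString(absolute)); // prints [1, 1004, 3, 1006]
    }
}
```

  • The same relocatable code can be placed at any load address, which is what lets separately compiled pieces be combined later.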

Absolute binary

  • Some compilers generate an absolute binary format that can be directly executed when the compiler is finished. This process is usually the fastest of all the machine code formats. However, the program must be recompiled for each execution, making the compilation cost high. This format is best suited for prototyping and learning, not for production-quality compilers.

  • This is not a very in-depth guide, but hopefully it gives us both a better understanding of compilers.

Conclusion

  • Thank you for taking the time out of your day to read this blog post of mine. If you have any questions or concerns please comment below or reach out to me on Twitter.
