We can use the C language as our programming language; what we write in it is called source code, and it is human-readable. The computer can only understand 1s and 0s, or machine code. We need a tool that takes the source code we write and turns it into machine code that the computer can understand. This is called a compiler. To write the code we also need a text editor; one of the most popular is VSCode, and this is what we will use. The top right box is where we write code. Bottom right is the terminal, where we run commands to compile and run code. On the left hand side is a file explorer, and on the far left is an activity bar or menu. All of this is a GUI, or graphical user interface. The terminal is a CLI, or command line interface, because we type commands into it.
The first thing we will do is in the terminal window type:
code hello.c
This command tells VSCode to create (or open) a file named “hello.c”. It recognises the .c file extension and knows that we will be writing code in C. The “code” command also accepts various flags and arguments. Malan writes out the following code:
#include <stdio.h>

int main(void)
{
    printf("hello, world\n");
}
This is a very simple program in C. The call to printf is similar to f(x) in maths: f is a function, and x is the input, or argument, to the function. The next thing to do is to compile the code. In the terminal window we type:
make hello
We don’t include the “.c” because “hello” is the name of the program we want make to create; make looks for hello.c on its own. This compiles the source code into machine code and gives us a file called “hello”. Looking at this file in the text editor, it is gobbledegook to humans, but it is what the computer understands. To run the program we type into the terminal window:
./hello
The dot slash essentially means right here. The dot refers to the current working directory and the slash is the path separator used to navigate around directories. So by typing “./hello” we are commanding the computer to execute the “hello” file in the current working directory. When we hit enter on this command, the hello program runs and “hello, world” appears in the terminal window. This “hello, world” is a side effect, not a return value. Return values will be discussed below.
Let’s break down the C code.
#include <stdio.h>
tells the compiler to include the Standard Input Output header file (.h means a header file). It contains the declarations and definitions needed to input and output data (usually to and from the terminal). By putting this here we are telling the compiler to go and grab those declarations so that we can use them. Such functions include printf (which we used), puts, putchar, scanf, fopen and others. If we left this line out, the compiler would report an error when we call printf because it wouldn’t know what printf is. The code behind header files like this is more generally called a library. Libraries contain code other people have written that we can use. They make our lives easier because we don’t have to define these functions ourselves every time we want to use them.
printf("hello, world\n");
printf is the name of a function declared in stdio.h, followed by parentheses. These enclose the argument passed to the printf function. The double quote marks denote a string. They tell the compiler that everything inside them is literal text and not code. Single quote marks, by contrast, tell the compiler that what they contain is a single character. \n is a new line; without it the terminal prompt would appear straight after “hello, world” on the same line, which looks a bit messy. It is an example of an ‘escape sequence’. These are special sequences of symbols that do something unusual: \n is a new line, \r is a carriage return, \" prints an actual double quotation mark. Think about what would happen if we wanted to print a double quotation mark and put just that in the code: the compiler would interpret it as the end of the string, and everything after it would be treated as code (which would not make sense). There are all sorts of escape sequences that solve these kinds of problems. Finally we put the semicolon to end the statement.
Now let’s look at the int main(void) line. When the program runs, execution begins at ‘main’. ‘int’ says that main returns an integer (a whole number); this acts as a status report. By convention, returning zero means everything went ok. A non-zero value such as 1 denotes some kind of error. Lastly, (void): anything in the parentheses is an argument, but since this simple program doesn’t take any, we write void. It tells the compiler that main needs no other information.
Lastly, the curly brackets. These enclose the code. They mark the start and end of the current block.
If we need to get more information on functions we can check the manuals, or head to manual.cs50.io. It contains all of the functions used in cs50 and how to use them.
CS50 have created their own header file named cs50.h. In the real world, with a standard installation of VSCode, we can’t just include this header file because it is not part of standard C; the CS50 library would have to be installed first. It declares a bunch of functions that we will use over the course. Some of these functions get input from the user. Doing this in C without them is quite difficult, so they are there to make things easier for the time being. One of these functions is called get_string. It prompts the user for input and returns what they typed, so that we can store it in a variable for later use. The way we use this function is:
string answer = get_string("What's your name? ");
Here we are declaring a variable called “answer” and saying that the data stored in it is the return value of get_string. The equals sign is the assignment operator: it copies the value on the right into the variable on the left. What tells the compiler that we are declaring a variable is the type written before the name, which we have set as string. By convention variable names should be lower case and not contain any spaces. So to rewrite the hello world code with this new function:
#include <stdio.h>
#include <cs50.h>

int main(void)
{
    string answer = get_string("What's your name? ");
    printf("Hello, %s\n", answer);
}
First off, see that we have included the cs50.h header file. The next difference is that we are prompting the user to enter a string, stored as “answer”. In the line after that, the first difference is %s. This is used as a placeholder. The s denotes that a string will go there. Then we end the format string with the double quote, followed by a comma, a space and the name of our variable. The comma here separates the arguments to the printf function; it is not interpreted as an English comma. We can pass as many arguments as we need. They must be given in the order that their placeholders appear in the preceding string. When we execute this code, the string “What’s your name? ” is printed in the terminal, which implies that somewhere inside the get_string function something like printf is being called. The user types their name, then the program prints “Hello, ” followed by that name.
LINUX AND THE COMMAND LINE
Linux is an operating system that is commonly used in computing. It is great at being a server. Usually we interact with it through a command line interface. There are a bunch of common commands. I have already learnt these previously and they can be looked up online, so I won’t go into more detail. This section of the lecture shows how to navigate around the file structure and do the common tasks a normal person would do using the GUI.
CONDITIONALS
We use the ‘if’ statement to write a conditional. It is not a function; it is a feature of the C language. The parentheses are required and are part of the syntax. We use ‘else’ to specify what happens when the ‘if’ condition is not true. The expression in the parentheses after the if is a boolean expression. It evaluates to either true or false.
if (x < y)
{
    printf("x is less than y\n");
}
else
{
    printf("x is not less than y\n");
}
We can also use ‘else if’ if there are more than 2 conditions that could be true:
if (x < y)
{
    printf("x is less than y\n");
}
else if (x > y)
{
    printf("y is less than x\n");
}
else if (x == y) // double equals (==) tests equality; a single equals (=) is the assignment operator
{
    printf("x is equal to y\n");
}
The last condition, ‘else if (x == y)’, is the only remaining possibility, so checking it is a waste of time and resources. It can be replaced with:
else
{
    printf("x is equal to y\n");
}
DATA TYPES
Some more operators:
<= less than or equal to
>= greater than or equal to
!= not equal to
Some more data types:
string – a bunch of characters
bool – true or false
char – a single character
double – like a float but with 64 bits, so it is more precise
float – floating point value; a number that has a decimal point (32 bits, so about 4 billion possible values)
int – whole numbers, including negatives (32 bits; roughly negative 2 billion to positive 2 billion)
long – like int but with 64 bits
Integer overflow happens if we need to represent a number bigger than 32 bits can handle. In that case we need to use long, but if the number is too big even for 64 bits, we will again get integer overflow. This has caused some interesting behaviour out in the real world. If we store a decimal in an int, it is truncated to just its whole-number part. Float is what we need to use to represent decimals. However, a float cannot store 1/3 as 0.333 recurring; after a certain number of digits the computer uses an approximation. To get more correct decimal places, we can use a double.
In cs50.h there are also functions like get_long, and so on. Above we used %s as a placeholder for a string stored in a variable; in the same way we can use %i as a placeholder for an integer, %c for a char, %f for a float or double, and %li for a long.
COUNTERS
Last week in Scratch we made a counter. In C we can do the same thing:
int counter = 0;
We are defining an integer variable called ‘counter’ and assigning a value of 0.
We can increment the counter:
counter = counter + 1;
This is not a maths equation; it means take the value of ‘counter’, add 1 to it, and store the result back in ‘counter’. We use this kind of expression so much that it has a shorthand syntax:
counter += 1;
This does the exact same thing. This is so common that this even has a more succinct syntax:
counter++;
This all works for subtracting too; replace + with -. Don’t forget that counter still has to be declared first.
EFFICIENCY
David goes on to talk about being precise and succinct. He uses the example of why we use ‘else if’ instead of a second ‘if’: if we only use ‘if’, the computer asks all of the questions even when we already know the answer. It’s wasting resources. Another place where we could be more succinct is when we ask for a char as the answer to a “do you agree?” question. The user would usually type either ‘y’ or ‘Y’. We could handle this with an else if, but then we are repeating ourselves. Instead we should use logical operators. Pipe pipe, or ||, is the logical operator for ‘or’. So we would code
if (c == 'y' || c == 'Y')
Notice here that there are single quote marks around the y, this is because we are using a char. If it was a string, then it would be double quote marks.
&& is another logic operator, meaning ‘and’.
LOOPS
The first loop we will look at is a while loop:
{
    int i = 3;
    while (i > 0)
    {
        printf("Meow\n");
        i--;
    }
}
This defines a variable i, sets it to 3, and gives the loop a boolean expression to check: while i is greater than zero, print Meow. When i becomes equal to zero, the loop stops. This will print Meow 3 times. The while loop checks whether the condition in the parentheses is true first, and only then executes the statements in the curly braces. The next loop is a for loop:
{
    for (int i = 0; i < 3; i++)
    {
        printf("Meow\n");
    }
}
This does the same thing, but it is more succinct. The difference between the two is the intent: we use a for loop when we know how many times we want it to run, and a while loop when we want something to loop until a condition is met. Notice that the for loop has 3 sections, delimited by semicolons. The three sections are the start (do this once at the start), the condition (check this before every iteration), and the update (do this after every iteration).
Next we can create a program that asks the user how many times the program should meow:
#include <cs50.h>
#include <stdio.h>

int main(void)
{
    int n = get_int("What's n? ");
    for (int i = 0; i < n; i++)
    {
        printf("Meow\n");
    }
}
It is also possible to do it this way:
#include <cs50.h>
#include <stdio.h>

int main(void)
{
    int n = get_int("How many meows do you want?\n");
    for (; n > 0; n--)
    {
        printf("Meow\n");
    }
}
But in the second example the value of n changes as the loop runs. What happens if we need to refer back to the original value the user input? In the first example, that value is preserved in n, and we create a second variable i to do the looping. In most programming languages, by convention, we count up from zero, not down to zero. The second example counts down from the user input value to zero. Scope can be important: in the first example, the variable i only exists inside the loop. After the loop finishes, i goes out of scope and can no longer be used. This is cleaner, which matters in big programs.
The get_int function in cs50.h has a bunch of error checking built in, so if a string or char is entered it will simply reprompt until an int is input. However, it is possible to enter in -14 which won’t make any sense in our situation above. So we can add in our own error checking. To do this we will use a while loop:
#include <cs50.h>
#include <stdio.h>

int main(void)
{
    int n;
    while (true)
    {
        n = get_int("What's n? ");
        if (n < 0)
        {
            continue;
        }
        else
        {
            break;
        }
    }
    for (int i = 0; i < n; i++)
    {
        printf("Meow\n");
    }
}
With this example, we first declare a variable n, but don’t give it a value. If we declared it inside the while loop, it would be out of scope for the for loop and we wouldn’t be able to use it there. Then we create the while loop, assign a value to n, and set up an if conditional: if n is less than 0, loop again (continue), otherwise move on (break). This can be tightened up as follows:
{
    int n;
    while (true)
    {
        n = get_int("What's n? ");
        if (n >= 0)
        {
            break;
        }
    }
    for (int i = 0; i < n; i++)
    {
        printf("Meow\n");
    }
}
Another way of doing this is a do while loop:
{
    int n;
    do
    {
        n = get_int("What's n? ");
    }
    while (n < 1);
    for (int i = 0; i < n; i++)
    {
        printf("Meow\n");
    }
}
This tells the computer to do what is in the curly braces, and then to repeat it while the condition is true. This is good when we want the body to run at least once and then check the condition, unlike the while loop, which performs the check first and only runs the body if the condition is true.
We can abstract code away into functions of our own. We declare them at the top of the file with a prototype, so they are in scope for the rest of the code, and define them below main, which keeps main succinct. For example:
#include <cs50.h>
#include <stdio.h>

void meow(int times);

int main(void)
{
    int n = get_int("What's n? ");
    meow(n);
}

void meow(int times)
{
    for (int i = 0; i < times; i++)
    {
        printf("Meow\n");
    }
}
In this example, the function we have made, ‘meow’, is abstracted away below main. We do have to declare it with a prototype above main to tell the compiler that it exists, but the meat of the function goes below main. To declare this function we use ‘void meow(int times)’, which looks a lot like ‘int main(void)’. The first word is the return type. Since meow doesn’t return a value, its return type is void. The second word is the name of the function. The content in the parentheses is the argument we are passing to the function. To our function we are passing an integer ‘times’, which is the number of times we want meow to happen, whereas with int main(void) we are not passing any inputs. More on this next week. We take some user input in main and store it in a variable n, but n only exists within the scope of main. Its value is passed into meow, where it is known as ‘times’. The name doesn’t matter; we could use n again.
Now if we use what we have learned about the loops, conditionals, abstracting functions and being succinct we can write the code as follows:
#include <cs50.h>
#include <stdio.h>

int getPosiInt(void);
void meow(int times);

int main(void)
{
    int n = getPosiInt();
    meow(n);
}

int getPosiInt(void)
{
    int numMeow;
    do
    {
        numMeow = get_int("How many Meows do you want? ");
    }
    while (numMeow < 0);
    return numMeow;
}

void meow(int times)
{
    for (int i = 0; i < times; i++)
    {
        printf("Meow.\n");
    }
}
Notice that we have now declared a function int getPosiInt(void). This means that the function will return an integer (a return value, not a side effect). In the code we see that it returns numMeow. At the top we see 2 prototypes, named descriptively. In main we have a brief to-do list. Then we define our functions, which we can use over and over if we choose. The code is well organized and demonstrates good separation of concerns. The use of helper functions for input validation and output is a good design choice. The function and variable names are descriptive and make the code easy to understand. It also conforms to the cs50 style guide; everything is indented and formatted correctly. This follows good correctness, design and style. It could however use more comments to tell people exactly what is going on. To comment we use // followed by the comment.
We have been declaring variables that we can change, but what if we don’t want them to change and want them to be constant? We use const:
const int n = 9;
Now if we write code that will try to change that int, the compiler won’t let us. So this code will not be allowed:
#include <stdio.h>

int main(void)
{
    const int n = 9;
    n++; // error: n is const, so it cannot be changed
    printf("%i\n", n);
}