What happens when you type `ls -l *.c` in the shell

By Katherine Soto & Renato Leon

ls -l *.c simulation in shell

Hello! Today we will talk about how a shell works. To do that, we will run the example “ls -l *.c” in our own shell program, “simple_shell”, a project we developed in C over the last 15 days with the goal of replicating the behaviour of the shell.

First of all, you can see the logic of the program in the flowchart image.

Flowchart of “simple_shell” program

The “simple_shell” program simulates a shell using an infinite loop, “while(1)”, and the only ways to leave it are the “exit” command or “EOF” (Ctrl + D).

Our main function consists of the following steps:

  • Check whether the shell is interactive, to decide whether to print the prompt
  • Read the line (getline)
  • Handle the format (‘#’, ‘;’, ‘||’, ‘&&’, ‘$’)
  • Convert the line into an array, **commands
  • Check whether the command exists
  • Create a child process and run the command, taking the logical operators (‘||’, ‘&&’) into account
  • Handle the different types of errors

Now, let’s go to the example: “ls -l *.c”

  1. We write “ls -l *.c” in the prompt
  2. Main.c — Entry Point
main: Entry point of the program

Inside main, the program first does the following:

  • Imports the headers from shell.h, which is protected by an include guard (#ifndef, #define and #endif) so that its contents are included only once
header of the program simple_shell

First, we use isatty to check whether the shell is running in interactive or non-interactive mode, to determine whether or not to print the prompt.

“isatty” is a function that returns 1 if fd is an open file descriptor referring to a terminal; otherwise 0 is returned.

It is important to note that this check runs inside the while(1) loop, which is why the prompt is printed on every iteration in interactive mode:

  • Print the prompt. For this example we work with the prompt “$ ”, but to automate this we could use PS1 to print both the prompt and the start of the error line.
  • Print buffer
  • Execute and print the result
  • And repeat!
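
The prompt check can be sketched as follows. This is only an illustration under our own assumptions: the helper name print_prompt and its file descriptor parameter are hypothetical and are not taken from the simple_shell source.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: print "$ " only when fd refers to a terminal.
 * Returns 1 if the prompt was printed, 0 otherwise. */
int print_prompt(int fd)
{
    if (!isatty(fd))
        return 0;           /* non-interactive: input comes from a pipe or file */
    fputs("$ ", stdout);
    fflush(stdout);         /* the prompt has no '\n', so flush it explicitly */
    return 1;
}
```

Inside the while(1) loop, a call such as print_prompt(STDIN_FILENO) would run before each read, so the prompt appears on every iteration in interactive mode and never when input is piped in.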

How does PS1 work?

  • PS1 is the variable that contains the value of the default prompt, so we can use it to change the prompt automatically. It is also helpful when we want to write the start of an error message
  • Example for the PROMPT:
  • Example for the error: with PS1, we can write the start of the error message (path):
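
One possible way to pick the prompt from PS1 is sketched below. The function names are hypothetical. Keep in mind that PS1 is usually a shell variable that is not exported, so getenv("PS1") often returns NULL in a child program, which is why the sketch falls back to “$ ”.

```c
#include <stdlib.h>

/* Hypothetical helper: use ps1 if it is set and not empty, else "$ ". */
const char *choose_prompt(const char *ps1)
{
    if (ps1 != NULL && ps1[0] != '\0')
        return ps1;
    return "$ ";
}

/* Hypothetical helper: read PS1 from the environment (may be unset). */
const char *prompt_string(void)
{
    return choose_prompt(getenv("PS1"));
}
```

This sketch prints PS1 literally; real shells also expand escape sequences inside it, which is not covered here.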

Secondly, we read “ls -l *.c” from the buffer

The program reads the buffer “ls -l *.c” with the function getline, which also gives us buff_len, the length of the line.

The getline function; the program also has its own _getline that simulates getline
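
A minimal sketch of the read step with the standard getline is shown below. The wrapper name read_command is hypothetical; the project's own _getline is not reproduced here.

```c
#define _POSIX_C_SOURCE 200809L /* for getline() */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: read one line from `in` into *buf, which getline
 * allocates or grows as needed. Returns the number of characters read
 * (including the trailing '\n', if any), or -1 on EOF or error. */
ssize_t read_command(FILE *in, char **buf, size_t *cap)
{
    ssize_t len = getline(buf, cap, in);

    if (len == -1)
        return -1;      /* EOF (Ctrl + D) or a read error */
    return len;
}
```

The caller starts with buf set to NULL and cap set to 0, reuses the same buffer on every iteration of the loop, and frees it once when the shell exits.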

Third, handle the format:

After reading the line, the program handles the format (“#”, “\n”, “;”, “&&”, “||”) to convert the input from a “*buff line” into “**command”. For example, in this step the program checks whether there is a “#” comment and, if there is one, deletes the comment from the buffer.
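
Comment removal could look roughly like the sketch below. The name strip_comment is hypothetical, and the rule that “#” only starts a comment at the beginning of a word is the usual shell behaviour, assumed here rather than taken from the project. Quoting is ignored.

```c
#include <stddef.h>

/* Hypothetical helper: cut the line at the first '#' that starts a word,
 * that is, at the beginning of the line or right after a space or tab. */
void strip_comment(char *line)
{
    size_t i;

    for (i = 0; line[i] != '\0'; i++)
    {
        if (line[i] == '#' &&
            (i == 0 || line[i - 1] == ' ' || line[i - 1] == '\t'))
        {
            line[i] = '\0';     /* everything after this point is dropped */
            return;
        }
    }
}
```

With this rule, “ls -l #comment” becomes “ls -l ”, while “echo a#b” is left untouched because the “#” is in the middle of a word.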

  • Handle_comment: checks whether there is a “#” comment and deletes it from the buffer
handle comment
  • After removing the comments, we handle the “\n” through the _strok function
handle “\n”
  • After that, we handle the separators (“;”, “&&”, “||”, “ ”) at different levels using the function “parse_user_input”
  • Parse_user_input: the shell takes the user’s input and tokenizes it; the resulting tokens are what we call commands.
  • The operators are handled at separate levels because they behave differently: “;” always runs the next command, “&&” runs it only if the previous one succeeded, and the logical OR (“||”) keeps running commands only until one of them succeeds
  • Finally, we handle “ ” and “exit” in the function “execute_commands”
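
The last two levels can be sketched as follows. Both function names are hypothetical, and the sketch uses the standard strtok rather than the project's own tokenizer; it is an illustration of the idea, not the parse_user_input implementation.

```c
#include <string.h>

/* Hypothetical helper: split `line` in place on spaces, tabs and newlines.
 * Stores at most max - 1 tokens in argv, adds a NULL terminator and
 * returns the number of tokens stored. max must be at least 1. */
int split_args(char *line, char **argv, int max)
{
    int argc = 0;
    char *tok = strtok(line, " \t\n");

    while (tok != NULL && argc < max - 1)
    {
        argv[argc++] = tok;
        tok = strtok(NULL, " \t\n");
    }
    argv[argc] = NULL;
    return argc;
}

/* Hypothetical helper: decide whether the next command runs, given the
 * operator in front of it and the exit status of the previous command. */
int should_run(const char *op, int prev_status)
{
    if (strcmp(op, "&&") == 0)
        return prev_status == 0;    /* run only after a success */
    if (strcmp(op, "||") == 0)
        return prev_status != 0;    /* run only after a failure */
    return 1;                       /* ";" always runs the next command */
}
```

For our example, split_args turns “ls -l *.c\n” into the three tokens “ls”, “-l” and “*.c”, followed by NULL, which is the shape execve expects for its argument list.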

Once we have the commands, the shell checks whether the first one corresponds to an existing command. First, the shell checks whether that command is a builtin.

If it is, the shell executes the code corresponding to that builtin, gives the result back to the user and displays the prompt again. If the command is not a builtin, the shell checks whether it is an alias (an alias is a token used to represent some value, written as name=value, where name is the alias name and value is the value it represents). If it is, the shell replaces it with the alias value.

If it is neither, the shell searches the current directory and, if the command is not found there, the list of directories held in the PATH environment variable. The shell does this by getting PATH from the environment variables and splitting its value into directories. For each directory, the shell prepends the directory to the command and tests whether the resulting file exists. If it does, the shell replaces the command with its full path, formed by adding the directory path to the beginning of the command.

Once the shell has the commands it needs, it prepares to execute them with an “exec” syscall, in our case “execve”. This syscall takes 3 arguments: the first is a string with the path of the command we want to execute, the second is the list of arguments the command will be executed with, and the third is the list of environment variables the process will be executed with. The list of environment variables can be accessed by declaring it as a third argument of the main function, as follows:
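
The PATH search can be sketched like this. The helper name find_in_path and its parameters are hypothetical, the current-directory step is left out, and empty PATH entries are simply skipped in this simplified version.

```c
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: look for `cmd` in each directory of the
 * colon-separated list `path`. On success, writes the full path to `out`
 * and returns 0. Returns -1 if no executable regular file is found, in
 * which case the contents of `out` are unspecified. */
int find_in_path(const char *cmd, const char *path, char *out, size_t outsz)
{
    const char *start = path;

    while (start != NULL && *start != '\0')
    {
        const char *end = strchr(start, ':');
        size_t dirlen = end ? (size_t)(end - start) : strlen(start);
        struct stat st;

        if (dirlen > 0)
        {
            /* Build "directory/cmd" and test it */
            int n = snprintf(out, outsz, "%.*s/%s", (int)dirlen, start, cmd);

            if (n > 0 && (size_t)n < outsz &&
                stat(out, &st) == 0 && S_ISREG(st.st_mode) &&
                access(out, X_OK) == 0)
                return 0;
        }
        start = end ? end + 1 : NULL;
    }
    return -1;
}
```

On a typical Linux system, searching for “ls” this way would replace the command with something like “/bin/ls” or “/usr/bin/ls”, depending on which PATH directory contains it first.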

int main(int argc, char **argv, char **env)
{
    /* env is a NULL-terminated array of "NAME=value" strings */
    return (0);
}

Or through the global variable “environ” (glibc also exposes it under the name “__environ”)
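
Either way, the environment is a NULL-terminated array of “NAME=value” strings. A lookup over that array, similar in spirit to getenv, could be sketched as follows; the name env_lookup is hypothetical.

```c
#include <string.h>

/* Hypothetical helper: find `name` in a NULL-terminated array of
 * "NAME=value" strings, such as the env argument of main.
 * Returns a pointer to the value, or NULL if the name is not present. */
const char *env_lookup(char **env, const char *name)
{
    size_t len = strlen(name);

    for (; *env != NULL; env++)
    {
        /* The name must match completely and be followed by '=' */
        if (strncmp(*env, name, len) == 0 && (*env)[len] == '=')
            return *env + len + 1;
    }
    return NULL;
}
```

Calling env_lookup(env, "PATH") from main returns the colon-separated directory list that the shell then splits to search for commands.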

All clear up to this point, but now we have to face an issue: when we use an “exec” syscall, the program running in the calling process is replaced by the new program, so if our shell calls exec directly, it stops existing as soon as that call succeeds. Fortunately, we can get around this with another syscall, “fork”. This system call creates a child process from which we can call “exec”, and our program continues executing afterwards because the call to “exec” was made from the child process and therefore did not affect the main process in which our shell runs.

Something to keep in mind is that once the child process is created, it runs at the same time as its parent. To avoid this and make the parent process wait until the child has terminated, we use another syscall, “wait”. Once the child has terminated, the main process continues.

The solution works like this:

  1. Create a child process by calling the fork syscall
  2. In the child process, call execve, passing the command, its arguments and the environment variables
  3. Make the parent wait until its child has terminated by calling the “wait” syscall
  4. Continue the parent process execution
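
These steps can be sketched as below. The helper name run_command is hypothetical, and the sketch uses waitpid, a variant of wait that waits for one specific child, instead of plain wait.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] (a full path) in a child process.
 * Returns the child's exit status, or -1 if fork or waitpid failed or
 * the child did not exit normally. */
int run_command(char *const argv[], char *const envp[])
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;                  /* fork failed */
    if (pid == 0)
    {
        /* Child: replace this process image with the command */
        execve(argv[0], argv, envp);
        _exit(127);                 /* only reached if execve failed */
    }
    /* Parent: block until the child has terminated */
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The returned status is what the “&&” and “||” logic needs: 0 means the command succeeded, and any other value means it failed. The value 127 follows the common shell convention for a command that could not be executed.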

Handling *.c:

Part of the beauty of functions and shell scripts is the ability to make a single function or script behave differently by passing parameters to it.

source: https://pubs.opengroup.org/onlinepubs/009604599/utilities/xcu_chap02.html#tag_02_02_03

We can handle this part by doing the following:

  1. Reverse the string
  2. Use strncmp on the first 2 characters (which were the last 2 before reversing: “.c”)
  3. If strncmp(“command_reverse”, “c.”, 2) == 0, the entry matches *.c and is listed; otherwise, it is discarded.
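
The reverse-and-compare check can be sketched as follows, assuming it is applied to each candidate file name. Both function names are hypothetical and this is not the project's actual code.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a newly allocated reversed copy of s,
 * or NULL if the allocation fails. The caller frees the result. */
char *reverse_copy(const char *s)
{
    size_t len = strlen(s), i;
    char *r = malloc(len + 1);

    if (r == NULL)
        return NULL;
    for (i = 0; i < len; i++)
        r[i] = s[len - 1 - i];
    r[len] = '\0';
    return r;
}

/* Hypothetical helper: return 1 if name ends in ".c", and 0 otherwise
 * (including when the allocation fails). */
int matches_dot_c(const char *name)
{
    char *rev = reverse_copy(name);
    int match;

    if (rev == NULL)
        return 0;
    match = strncmp(rev, "c.", 2) == 0;     /* ".c" reversed is "c." */
    free(rev);
    return match;
}
```

With this check, “main.c” is kept and “shell.h” is discarded. Comparing the last two characters of the name directly would give the same result without the extra copy; the reversed version is shown because it follows the steps listed above.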

And that’s it! Now you know how a shell works! Finally, if you also want to do this project, we encourage you to manage your time (you can use the kanban and sprint methodology with Trello) and keep good communication with your peers, because one function is often useful in several places, and you can reuse code such as strtoken, parse, builtins, readline, etc.

To check out more of the code, see the GitHub repository: https://github.com/kateincoding/simple_shell

Software engineer in progress