DevOps and SRE interview: What do you need to learn in Linux?

When we talk about DevOps and system administration, Linux is the first thing that comes to mind. But Linux is so vast that it’s hard to know what you should focus on when preparing for an interview. In this article, we will go through the areas of Linux that are used most often and are most likely to come up, so that you can spend your preparation time on them.


Process debugging

You should be able to debug issues related to a process: how many resources it is consuming, and how to start, stop, or kill it. On most modern distributions, services are managed through systemd. You should also know how to see the number of threads a process has spawned, and how to see which system calls it is making using strace. A few commands worth reading about are ps, top, vmstat, and lsof. Learning about the proc file system will also help you a lot here.
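As a rough illustration of what the proc file system exposes, here is a minimal Python sketch that reads a few fields from /proc/<pid>/status, including the thread count. The function names are made up for this example, and it assumes a Linux system with /proc mounted.

```python
from pathlib import Path


def parse_proc_status(text: str) -> dict[str, str]:
    """Turn the 'Key:<tab>value' lines of /proc/<pid>/status into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def process_summary(pid: str = "self") -> dict[str, str]:
    """Read a few commonly checked fields for one process (Linux only)."""
    status = parse_proc_status(Path(f"/proc/{pid}/status").read_text())
    # VmRSS is absent for kernel threads, so fall back to "n/a".
    wanted = ("Name", "State", "Threads", "VmRSS")
    return {key: status.get(key, "n/a") for key in wanted}


if __name__ == "__main__" and Path("/proc/self/status").exists():
    # Summary of the current Python process.
    print(process_summary())
```

Passing a numeric PID as a string, for example process_summary("1"), reads the same fields for another process, subject to normal file permissions.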

You should also be able to view the logs of a process and search through them. On newer systems, logs are collected by journald, and knowing how to query it is highly recommended.
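Searching journald logs can be sketched in Python by calling journalctl with JSON output and filtering the MESSAGE field. This is only an illustration: the helper names are invented for this example, the unit name you pass in (say, nginx.service) is your own, and it assumes journalctl is installed and you have permission to read the journal.

```python
import json
import subprocess


def journalctl_command(unit: str, since: str = "1 hour ago") -> list[str]:
    """Build a journalctl call that prints one JSON object per log entry."""
    return ["journalctl", "-u", unit, "--since", since, "-o", "json", "--no-pager"]


def matching_messages(json_lines: str, needle: str) -> list[str]:
    """Return the MESSAGE fields that contain needle (case-insensitive)."""
    hits = []
    for line in json_lines.splitlines():
        if not line.strip():
            continue
        message = json.loads(line).get("MESSAGE", "")
        # Non-text messages are not plain strings in the JSON output; skip them.
        if isinstance(message, str) and needle.lower() in message.lower():
            hits.append(message)
    return hits


def search_unit_logs(unit: str, needle: str, since: str = "1 hour ago") -> list[str]:
    """Run journalctl for one unit and keep only the matching messages."""
    result = subprocess.run(
        journalctl_command(unit, since),
        capture_output=True, text=True, check=True,
    )
    return matching_messages(result.stdout, needle)
```

On the command line, the same search is usually just journalctl -u with the unit name, piped through grep.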

Memory debugging

You should be able to tell what amount of memory is in use and where it is going, for example how much of it is cache. Keeping track of memory is very important: in latency-sensitive systems, poorly managed memory can cause latency spikes or failures due to out-of-memory (OOM) kills. A few of the commands used here are free, top, vmstat, and iostat. Again, learning about the proc file system will help you a lot. One thing to focus on is page faults: a high rate of page faults, major faults in particular, can point to a problem.
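To make this concrete, here is a small Python sketch that reads /proc/meminfo and the page fault counters in /proc/vmstat. The function names are made up for this example, and the "used" figure is a simple approximation; free computes its own number with a slightly different formula.

```python
from pathlib import Path


def parse_meminfo(text: str) -> dict[str, int]:
    """Parse /proc/meminfo text into {field: value}; sizes are in kB."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts:
            values[key.strip()] = int(parts[0])
    return values


def memory_breakdown(meminfo: dict[str, int]) -> dict[str, int]:
    """Split total memory into free, cache, and an approximate 'used' share."""
    total = meminfo["MemTotal"]
    free = meminfo["MemFree"]
    cache = meminfo.get("Cached", 0) + meminfo.get("Buffers", 0)
    return {
        "total_kb": total,
        "free_kb": free,
        "cache_kb": cache,
        "used_kb": total - free - cache,  # approximation, see the note above
    }


def parse_page_faults(text: str) -> dict[str, int]:
    """Pick the cumulative page fault counters out of /proc/vmstat text."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in ("pgfault", "pgmajfault"):
            counters[parts[0]] = int(parts[1])
    return counters


if __name__ == "__main__" and Path("/proc/meminfo").exists():
    print(memory_breakdown(parse_meminfo(Path("/proc/meminfo").read_text())))
    print(parse_page_faults(Path("/proc/vmstat").read_text()))
```

The fault counters are totals since boot, so to get a rate you would read them twice and divide the difference by the time between the two reads.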

Disk debugging

Here you should be able to get disk statistics, manage disks, and point out exactly where the space is being used. A few commands that are very useful here are df, du, and the LVM commands. Then there is disk performance; to monitor that, you can use iostat and lsof.
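The two questions df and du answer (how full is the filesystem, and which directories take the space) can be sketched in Python with the standard library. The function names are invented for this example, and the directory sizes are sums of apparent file sizes, so they can differ from the block counts du reports by default.

```python
import os
import shutil


def filesystem_usage(path: str = "/") -> dict[str, float]:
    """df-like view of the filesystem that contains path (sizes in bytes)."""
    usage = shutil.disk_usage(path)
    return {
        "total": usage.total,
        "used": usage.used,
        "free": usage.free,
        "percent_used": round(usage.used / usage.total * 100, 1),
    }


def largest_subdirs(root: str, top: int = 5) -> list[tuple[str, int]]:
    """du-like view: total file size under each immediate subdirectory of root."""
    sizes = []
    for entry in os.scandir(root):
        if not entry.is_dir(follow_symlinks=False):
            continue
        total = 0
        for dirpath, _dirnames, filenames in os.walk(entry.path):
            for name in filenames:
                try:
                    total += os.lstat(os.path.join(dirpath, name)).st_size
                except OSError:
                    pass  # file vanished or is unreadable; skip it
        sizes.append((entry.path, total))
    return sorted(sizes, key=lambda item: item[1], reverse=True)[:top]


if __name__ == "__main__":
    print(filesystem_usage("/"))
```

Calling largest_subdirs on a directory such as /var and then repeating it on the biggest result is the same drill-down you would do by hand with du.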

Network debugging

In this section, you should be able to debug network issues such as two systems being unable to connect to each other, DNS names not resolving, and similar problems. Commands that will help you here include telnet, traceroute, ping, ss, netstat, and dig.
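The two most common checks (does the name resolve, and can I open a TCP connection to the port) can be sketched in Python as below. The function names are made up for this example. Note that the lookup goes through the system resolver, so it also honours /etc/hosts, whereas dig queries DNS servers directly.

```python
import socket


def resolve(hostname: str) -> list[str]:
    """Return the unique IP addresses a hostname resolves to, or [] on failure."""
    try:
        infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """telnet-style check: can a TCP connection to host:port be opened?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

If resolve returns addresses but can_connect fails, the problem is more likely a firewall, routing, or the service itself than DNS, which is where traceroute and ss become useful.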

These are a few basic things that you should be able to do. Keep in mind that in Linux, everything is a file. If you know which files hold the data, you can read them directly to get what you need. Most of the tools mentioned above simply parse those files and give you an easier-to-read view.
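As a small illustration of that idea, the load averages that tools like uptime and top display are available as plain text in /proc/loadavg. The sketch below parses that file; the function name is invented for this example, and it assumes a Linux system with /proc mounted.

```python
from pathlib import Path


def parse_loadavg(text: str) -> dict[str, float]:
    """Parse the single line of /proc/loadavg into named fields."""
    load1, load5, load15, entities, last_pid = text.split()
    running, total = entities.split("/")
    return {
        "load1": float(load1),      # 1-minute load average
        "load5": float(load5),      # 5-minute load average
        "load15": float(load15),    # 15-minute load average
        "running": int(running),    # currently runnable scheduling entities
        "total": int(total),        # all scheduling entities on the system
        "last_pid": int(last_pid),  # PID of the most recently created process
    }


if __name__ == "__main__" and Path("/proc/loadavg").exists():
    print(parse_loadavg(Path("/proc/loadavg").read_text()))
```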

That’s it for this article. If you liked it, please share and subscribe.

Also, you can find a compilation of what you have to read for DevOps and SRE in this book.

Gaurav Yadav

Gaurav is a cloud infrastructure engineer, full stack web developer, and blogger. A sportsperson at heart, he loves football. Scale is something he loves to work on, and he is always keen to learn new tech. Experienced with CI/CD, distributed cloud infrastructure, build systems, and a lot of SRE stuff.

  • Neeraj Prem Verma

    Good to see you included LVM in your article. Else the newbies who start with Cloud Platforms take disk-sizing for granted.

    Considering security as an important aspect, a DevOps/SRE person should have good clarity on user management as well.
    I also feel that some hands-on knowledge of servers and their protocols is very helpful.
