Linux HowTO

What is a shell

A shell is a text-based interactive program that lets you communicate with a system. The command prompt (cmd.exe) on a Windows system is an example of a shell. A variety of shells are available on Unix/Linux systems. One of the most popular is bash, which we'll be using for this training.

The first thing you'd want to do on a system is to move around through files and directories. It's like learning how to drive a car...

Useful commands

The next section lists some of the most useful commands available inside a shell. A full man(ual) page can be found by executing man followed by the command, e.g.: man cp. The manual page contains a full description of all possible options of a command or tool. If you don't know how to use the man pages, run man man.

ls

ls will show you the content of the directory you're currently located in:

[root@linuxhowto.in ~]# ls
anaconda-ks.cfg  Documents  install.log         Music     Public     Videos
Desktop          Downloads  install.log.syslog  Pictures  Templates

Another way to get the content of a directory is to supply the directory as an argument:

[root@linuxhowto.in etc]# ls /etc/sysconfig/

atd         ip6tables-config network-scripts selinux
auditd      ip6tables.old     nfs              smartmontools
authconfig iptables          nspluginwrapper snmpd

In this view, you cannot tell directories from files, nor can you see the permissions of a file. The -l option shows both. Another useful option is ls -lh, which also prints the file sizes in human-readable format:

[root@linuxhowto.in sysconfig]# ls -l /etc/sysctl.conf
-rw-r--r--. 1 root root 1148 Oct  7  2011 /etc/sysctl.conf

[root@linuxhowto.in sysconfig]# ls -lh /etc/sysctl.conf
-rw-r--r--. 1 root root 1.2K Oct  7  2011 /etc/sysctl.conf

pwd

The pwd command will print your current working directory (Print Working Directory):

[root@linuxhowto.in sysconfig]# pwd
/etc/sysconfig

This is very useful for scripts.
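As a minimal sketch of why pwd is handy in scripts: the snippet below remembers the starting directory, works in a scratch directory and then returns. The variable names start_dir and work_dir are made up for this illustration.

```shell
#!/bin/sh
# Remember where we started so we can come back later.
start_dir="$(pwd)"

# Hypothetical scratch directory, created only for this example.
work_dir="$(mktemp -d)"
cd "$work_dir"
echo "working in: $(pwd)"

# Return to the starting directory and clean up.
cd "$start_dir"
echo "back in: $(pwd)"
rmdir "$work_dir"
```

Storing the result of pwd in a variable keeps the script independent of the directory it happens to be started from.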

cd

The cd command is used to navigate through the filesystem structure of your system. It allows you to move your shell to another directory. The following command will move our shell to /etc/sysconfig/:

[root@linuxhowto.in sysconfig]# cd /etc/sysconfig/
[root@linuxhowto.in sysconfig]#

If you execute cd without any arguments, it moves you to the home directory of the user you're logged in as. In our case, we're logged in as root. As root's home directory is /root, this is where we will go:

[root@linuxhowto.in sysconfig]# cd
[root@linuxhowto.in ~]#

When executing cd -, it will return to the directory you were previously using. An example to illustrate:

[root@linuxhowto ~]# cd /etc/sysconfig/
[root@linuxhowto sysconfig]# cd
[root@linuxhowto ~]# cd -
/etc/sysconfig
[root@linuxhowto sysconfig]#
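The same behaviour can be tried without touching real system directories. This is only a sketch: the directory tree under base is created for the example and its names are made up.

```shell
# Build a small throwaway directory tree (the names are made up for this example).
base="$(cd "$(mktemp -d)" && pwd)"
mkdir -p "$base/etc/sysconfig"

cd "$base/etc/sysconfig"
cd "$base"

# cd - jumps back to the previous directory and prints its name;
# the shell keeps the directory we just left in the OLDPWD variable.
cd - > /dev/null
current="$(pwd)"
echo "now in:    $current"
echo "came from: $OLDPWD"
```

Running cd - a second time would take you back to the directory stored in OLDPWD, so it works as a toggle between two directories.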

mv

This tool lets you move files and directories. It's also used to rename files or directories. Some examples to illustrate:

[root@linuxhowto visitor]# ls -l
total 0
-rw-r--r--. 1 root root 0 Aug 15 10:44 abc
-rw-r--r--. 1 root root 0 Aug 15 10:44 testing

[root@linuxhowto visitor]# mv abc newname

[root@linuxhowto visitor]# ls -l
total 0
-rw-r--r--. 1 root root 0 Aug 15 10:44 newname
-rw-r--r--. 1 root root 0 Aug 15 10:44 testing

cp

The cp utility copies files and directories:

[root@linuxhowto visitor]# ls -l
total 0
-rw-r--r--. 1 root root 0 Aug 15 10:44 newname
-rw-r--r--. 1 root root 0 Aug 15 10:44 testing

[root@linuxhowto visitor]# cp testing newfile

[root@linuxhowto visitor]# ls -l
total 0
-rw-r--r--. 1 root root 0 Aug 15 10:45 newfile
-rw-r--r--. 1 root root 0 Aug 15 10:44 newname
-rw-r--r--. 1 root root 0 Aug 15 10:44 testing

When you copy files or directories, the new files are owned by the user running cp, and their timestamps and permissions may differ from the original ones. If you want to preserve file permissions and copy multiple files and folders recursively, use:

cp -a /path/to/original_dir /path/to/destination_dir

The -a argument is a synonym for -drp:

-d will preserve links

-r makes the copy action recursive. This means it will also copy subdirectories

-p makes sure file permissions and owners are the same as the original ones
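The difference between a plain recursive copy and an archive copy is easiest to see on timestamps. The sketch below uses throwaway directories; the names src, dst, plain and archive are made up for the illustration, and the ownership part of -a only shows when the copy is made as root.

```shell
# Throwaway source and destination directories (names are made up for this example).
src="$(mktemp -d)"
dst="$(mktemp -d)"
mkdir "$src/data"
echo hello > "$src/data/file.txt"

# Give the original an old modification time: 15 Aug 2012, 10:44.
touch -t 201208151044 "$src/data/file.txt"

# Plain recursive copy: the copy gets a fresh modification time.
cp -r "$src/data" "$dst/plain"

# Archive copy: modification time and permissions are kept
# (and owners too, when run as root).
cp -a "$src/data" "$dst/archive"

ls -l "$src/data/file.txt" "$dst/plain/file.txt" "$dst/archive/file.txt"
```

In the listing, the file under archive should carry the 2012 date of the original, while the file under plain shows the time of the copy.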

rm

This command will delete files:

[root@linuxhowto visitor]# ls -l
total 0
-rw-r--r--. 1 root root 0 Aug 15 10:45 newfile
-rw-r--r--. 1 root root 0 Aug 15 10:44 newname
-rw-r--r--. 1 root root 0 Aug 15 10:44 testing

[root@linuxhowto visitor]# rm testing
rm: remove regular empty file `testing'? y

[root@linuxhowto visitor]# ls -l
total 0
-rw-r--r--. 1 root root 0 Aug 15 10:45 newfile
-rw-r--r--. 1 root root 0 Aug 15 10:44 newname

If you want to delete files and folders recursively, use the -rf options (-r works recursively, -f skips the confirmation prompt, so double-check the path first):

[root@linuxhowto visitor]# ls -l
total 4
-rw-r--r--. 1 root root 0 Aug 15 10:45 newfile
-rw-r--r--. 1 root root 0 Aug 15 10:44 newname
drwxr-xr-x. 2 root root 4096 Aug 15 10:48 test

[root@linuxhowto visitor]# rm -rf test/

[root@linuxhowto visitor]# ls -l
total 0
-rw-r--r--. 1 root root 0 Aug 15 10:45 newfile
-rw-r--r--. 1 root root 0 Aug 15 10:44 newname
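Because rm -rf deletes without asking, scripts usually guard the path they pass to it. A minimal sketch, assuming a throwaway directory; the variable names parent and target_dir are made up for the example:

```shell
# Create a throwaway directory tree to delete.
parent="$(mktemp -d)"
target_dir="$parent/test"
mkdir -p "$target_dir/sub"
touch "$target_dir/sub/file"

# ${target_dir:?} makes the shell abort with an error if the variable is
# empty or unset, so an empty variable can never expand to a dangerous path.
# The -- marks the end of the options, in case the name starts with a dash.
rm -rf -- "${target_dir:?}"

ls -l "$parent"
```

The guard costs nothing and protects against the classic mistake of running rm -rf on a path built from a variable that was never set.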

touch

The touch utility will update a file's access and modification times. If the file does not exist, it will create a new, empty file:

[root@linuxhowto visitor]# ls -l
total 0
-rw-r--r--. 1 root root 0 Aug 15 10:45 newfile
-rw-r--r--. 1 root root 0 Aug 15 10:44 newname

[root@linuxhowto visitor]# touch pankaj

[root@linuxhowto visitor]# ls -l
total 0
-rw-r--r--. 1 root root 0 Aug 15 10:45 newfile
-rw-r--r--. 1 root root 0 Aug 15 10:44 newname
-rw-r--r--. 1 root root 0 Aug 15 10:48 pankaj

cat

cat will print the content of a file to the standard output of your shell. Only use cat on small files: the whole content of the file is dumped into your shell, so with a big file you'll be scrolling forever. If you do want to check a bigger file, use vi or less.

[root@linuxhowto visitor]# echo linuxhowto.in > new.txt
[root@linuxhowto visitor]# cat new.txt
linuxhowto.in

less

less can be used in 2 ways:

Show the contents of a file so you're able to scroll and search through the file without opening a real editor

Pipe the output of a command which produces a lot of output to scroll and search the results

Some examples:

[root@linuxhowto log]# less /var/log/messages

[root@linuxhowto log]# grep ERROR /var/log/messages | less

head

head (executed without any options) will show you the first 10 lines of whatever you feed it. With the -n option, you can specify the number of lines you want to see:

[root@linuxhowto log]# head -n 2 /var/log/messages
Aug 13 13:00:49 localhost kernel: imklog 4.6.2, log source = /proc/kmsg started.
Aug 13 13:00:49 localhost rsyslogd: [origin software="rsyslogd" swVersion="4.6.2" x-pid="1162" x-info="http://www.rsyslog.com"] (re)start

tail

tail does the same as head but will show you the last lines of a file or of stdin:

[root@linuxhowto log]# tail -n 2 /var/log/messages
Aug 15 09:14:24 server kernel: sdb: sdb1 < >
Aug 15 09:14:46 server kernel: sdb: sdb1 < sdb5 >

tail can also be used to follow content added to a file, e.g.: a log file. When using the -f option, it will print new content added to a file and it will keep on running until you press ctrl+c to exit.

[root@linuxhowto log]# tail -f /var/log/messages
Aug 15 09:13:13 server pulseaudio[1873]: alsa-util.c: Disabling timer-based scheduling because running inside a VM.
Aug 15 09:13:13 server rtkit-daemon[1875]: Sucessfully made thread 1879 of process 1873 (/usr/bin/pulseaudio) owned by '42' RT at priority 5.
Aug 15 09:13:13 server pulseaudio[1873]: alsa-util.c: Disabling timer-based scheduling because running inside a VM.
Aug 15 09:13:13 server pulseaudio[1873]: alsa-sink.c: Most likely this is a bug in the ALSA driver 'snd_intel8x0'. Please report this issue to the ALSA developers.
Aug 15 09:13:13 server pulseaudio[1873]: alsa-sink.c: We were woken up with POLLOUT set -- however a subsequent snd_pcm_avail() returned 0 or another value < min_avail.
<snip>
ctrl + c
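Since head and tail both read from stdin, they can be chained to cut an arbitrary range of lines out of some output. A small sketch, using seq to generate sample input instead of a real log file (the variable names are made up):

```shell
# seq 1 20 prints the numbers 1 to 20, one per line, as sample input.
first_three="$(seq 1 20 | head -n 3)"
last_two="$(seq 1 20 | tail -n 2)"

# Lines 5 to 7: take the first 7 lines, then keep the last 3 of those.
middle="$(seq 1 20 | head -n 7 | tail -n 3)"

echo "$first_three"
echo "$last_two"
echo "$middle"
```

The same head | tail combination works on any command output, for example to look at a block of lines in the middle of a long log.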

uptime

The uptime command will print a lot of useful information:

[root@linuxhowto log]# uptime
 10:53:59 up  1:41,  3 users,  load average: 0.01, 0.01, 0.00

10:53:59: the current time
up 1:41: the machine has an uptime of 1 hour and 41 minutes
3 users: 3 users are currently logged in (including me)
load average: 0.01, 0.01, 0.00: the load averages over the last minute, the last 5 minutes and the last 15 minutes

Check the part on system load in this guide to get some more information on what the numbers mean.

date

The date command will print the current time and date configured on the system:

[root@linuxhowto log]# date
Wed Aug 15 10:56:08 IST 2012

grep

grep searches files (or standard input if no files are specified) for lines containing a match to the given pattern. By default, grep prints the matching lines. Let's go through some examples to illustrate.
Let's suppose you want to find all lines mentioning the kernel in /var/log/dmesg:

[root@linuxhowto log]# grep kernel /var/log/dmesg
kernel direct mapping tables up to 17ff0000 @ 7000-f000
Booting paravirtualized kernel on bare hardware
Memory: 364992k/393152k available (4313k kernel code, 27544k reserved, 2412k data, 504k init, 0k highmem)
virtual kernel memory layout:
Freeing unused kernel memory: 504k freed
Write protecting the kernel text: 4316k
Write protecting the kernel read-only data: 1852k

You need to search a whole directory structure looking for all occurrences of a certain pattern:

`grep -Rni pattern /path/to/directories/you/want/to/search`
Option	Description
R	Recursive: search all underlying directories
n	Print line numbers inside the file
i	Case insensitive: can be useful if you're not sure about what you're searching for

Grep is a very powerful tool. Instead of just using words as search patterns, you can also use regular expressions. If you're interested in the advanced usage of grep and regexes, have a look at man grep.
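To try the recursive search without depending on real log files, here is a small sketch that builds its own sample directory first. All file names and log lines are made up for the illustration.

```shell
# Build a small directory tree with sample log files (all names are made up).
logdir="$(mktemp -d)"
mkdir "$logdir/app"
printf 'all good\nERROR: disk full\n' > "$logdir/app/a.log"
printf 'Error: timeout\nstill fine\n' > "$logdir/b.log"

# -R: descend into subdirectories, -n: show line numbers, -i: ignore case.
matches="$(grep -Rni 'error' "$logdir")"
echo "$matches"

# Count the matching lines (grep -c counts the lines matching the pattern ".").
count="$(echo "$matches" | grep -c .)"
echo "matching lines: $count"
```

Each result line has the form file:line number:matching text, which makes it easy to jump straight to the right place in an editor.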

wc

This command can be used to count (wc == word count) the number of words or lines in a file or output:

[root@linuxhowto log]# cat /var/log/messages | wc -l
2567

This means that /var/log/messages contains 2567 lines.
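In a script you often want just the number. The sketch below, which uses a made-up sample file, feeds the file to wc through a redirect instead of cat and stores the counts in variables:

```shell
# Sample file with three lines and five words (the name is generated by mktemp).
sample="$(mktemp)"
printf 'one two\nthree\nfour five\n' > "$sample"

# Redirecting the file into wc avoids the extra cat process and makes wc
# print only the number, without the file name. tr removes the padding
# spaces that some wc implementations put in front of the number.
lines="$(wc -l < "$sample" | tr -d ' ')"
words="$(wc -w < "$sample" | tr -d ' ')"

echo "lines: $lines, words: $words"
```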

watch

The watch command can be useful when following up on things that change. Let's say there is a raid array rebuilding. The thing you would do to get the current raid status is cat /proc/mdstat. But you would have to execute it every time you want to know how it's doing. watch can do this for you:

[root@linuxhowto log]# watch cat /proc/mdstat

This will print the content of /proc/mdstat every 2 seconds. With the -n option, you can specify the refresh rate.

free

free will print the current memory usage of the host:

[root@linuxhowto log]# free
             total       used       free     shared    buffers     cached
Mem:        380116     234588     145528          0      22312     111152
-/+ buffers/cache:     101124     278992
Swap:      1048568          0    1048568

For a full explanation of the numbers and what they mean, please refer to the part on memory management.

dmesg

dmesg allows you to print or control the kernel ring buffer. Usually, you'll just run dmesg without any options to print all messages in the buffer:

[root@linuxhowto log]# dmesg
SELinux: initialized (dev rpc_pipefs, type rpc_pipefs), uses genfs_contexts
eth0: no IPv6 routers present
SELinux: initialized (dev autofs, type autofs), uses genfs_contexts
SELinux: initialized (dev autofs, type autofs), uses genfs_contexts
<snip>

ssh

The ssh client allows you to log in to other machines which have an ssh daemon running.

root@linuxhowto:~# ssh -l root 192.168.8.75
root@192.168.8.75's password:
Linux cpunode2 2.6.35-29-alinuxhowto.in #51 SMP Mon May 9 23:11:39 CEST 2011 x86_64 GNU/Linux
Ubuntu 10.04.2 LTS
Last login: Fri Jul 29 10:54:32 2011 from 192.168.8.56
root@server2:~#

The -l option specifies the user (login) you want to use on the remote system. A popular variant on this command is:

ssh root@192.168.8.75

Useful tools

top

The top program provides a dynamic real-time view of a running system. It can display system summary information as well as a list of tasks currently being managed by the Linux kernel. It will show you a real time list of all running processes. When executing top you'll get output like this:

top - 11:08:21 up  1:56,  3 users,  load average: 0.00, 0.00, 0.00
Tasks: 114 total,   1 running, 113 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.3%us,  0.5%sy,  0.0%ni, 98.8%id,  0.4%wa,  0.0%hi,  0.1%si,  0.0%st
Mem:    380116k total,   235084k used,   145032k free,    22360k buffers
Swap:  1048568k total,        0k used,  1048568k free,   111468k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2360 root      20   0  2696 1000  772 R  1.9  0.3   0:00.03 top
    1 root      20   0  2864 1400 1180 S  0.0  0.4   0:01.20 init
<snip>

The header shows some general system information. Please refer to the system load and memory management section for more information about these numbers.
The list of processes has the following headers:

Header	Description
PID	Process id of the process
USER	The process is running as this user
PR	Priority
NI	Nice value
VIRT	Amount of virtual memory used
RES	Amount of resident memory used
SHR	Amount of shared memory used
S	Process state
%CPU	Percentage of CPU usage
%MEM	Percentage used of total memory
TIME+	Amount of CPU time used
COMMAND	The process running

Useful list of shortcuts to be used when running top:

Shortcut	Description
Shift+M	Change the sort to memory usage
Shift+P	Change the sort to CPU usage
d	Change the refresh rate (delay). This is set to 3 seconds by default
k	Kill a process from within top
1	Split out the CPU usage on a per core basis instead of aggregating over all cores

vmstat

vmstat reports information about processes, memory, paging, block IO, traps, disks and cpu activity. When executed like this, it will print 3 lines each with a 1 second interval.

[root@linuxhowto log]# vmstat 1 3
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy  id wa st
 0  0      0 145040  22376 111468    0    0    20    10   26   19  0  1  99  0  0
 0  0      0 145032  22376 111492    0    0     0     0   26   12  0  0 100  0  0
 0  0      0 145032  22376 111492    0    0     0     0   19    9  0  0 100  0  0

The output headers are the following:

Header	Description
procs - r	The number of running processes
procs - b	The number of blocked processes
memory - swpd	The amount of memory in swap
memory - free	The amount of unused memory
memory - buff	The amount of memory used for buffers
memory - cache	The amount of memory used for caching
swap - si	The amount of memory swapped in from disk, per second
swap - so	The amount of memory swapped out to disk, per second
io - bi	The number of blocks read from a block device, per second
io - bo	The number of blocks written to a block device, per second
system - in	The number of interrupts per second
system - cs	The number of context switches per second
cpu - us	Percentage of CPU time spent running user code
cpu - sy	Percentage of CPU time spent running kernel (system) code
cpu - id	Percentage of CPU time spent idle
cpu - wa	Percentage of CPU time spent waiting for IO
cpu - st	Percentage of CPU time stolen from a virtual machine

Please refer to the system load and memory management section for more information about these numbers.

screen

Screen is a utility that can create a virtual terminal which can be attached and detached as needed. It is mostly used to start a shell that can stay running even when nobody is logged in on the system. This is why you should run all long running processes inside a screen session. If you should run long running processes from a regular shell and the network connection breaks, the long running process will be killed as soon as the ssh daemon cleans up dead connections. When using screen, these processes will keep on running. When logged back in, you can just attach the previous session and you're off.
Starting a screen:

screen -h 10000 -S test

The -h argument sets the scrollback buffer to 10000 lines and -S gives the session the name test. When you're inside a screen, press ctrl+a followed by d to detach from the current session.
Getting a list of available screen sessions:

root@linuxhowto.in:~# screen -ls
There are screens on:
    6839.pts-0.linuxhowto.in  (08/07/11 14:05:59)  (Detached)
    6827.pts-0.linuxhowto.in  (08/07/11 14:05:55)  (Detached)
2 Sockets in /var/run/screen/S-root.

Attaching to a screen session:

screen -dr 6839.pts-0.linuxhowto.in

Attaching to a screen session when somebody is already attached to it, so you can watch it together:

screen -x 6839.pts-0.linuxhowto.in

gzip

gzip is a tool which compresses files (like zip, rar, 7zip, ...).

root@linuxhowto.in:~# ls -lh debpkg.lst
-rw-r--r-- 1 root root 5.5K Jun 30 15:53 debpkg.lst

root@linuxhowto.in:~# gzip debpkg.lst

root@linuxhowto.in:~# ls -lh debpkg.lst.gz
-rw-r--r-- 1 root root 2.5K Jun 30 15:53 debpkg.lst.gz

root@linuxhowto.in:~# gunzip debpkg.lst.gz

root@linuxhowto.in:~# ls -lh debpkg.lst
-rw-r--r-- 1 root root 5.5K Jun 30 15:53 debpkg.lst
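As the transcript shows, gzip replaces the original file with the compressed one. If you want to keep the original, you can write the compressed data to standard output instead. A small sketch with a generated sample file (the file content and variable names are made up):

```shell
# Sample file with 2000 short lines of numbers (names are made up).
workdir="$(mktemp -d)"
original="$workdir/debpkg.lst"
seq 1 2000 > "$original"

# -c writes the compressed data to standard output, so the original file is kept.
gzip -c "$original" > "$original.gz"
ls -l "$original" "$original.gz"

# Decompress to standard output and compare with the original, byte for byte.
if gunzip -c "$original.gz" | cmp -s - "$original"; then
    roundtrip=ok
else
    roundtrip=failed
fi
echo "round trip: $roundtrip"
```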

tar

tar is an archiving tool. It originates from the time when people still used tapes to store data. It takes multiple files and creates a single file stream. This stream is saved into a file.
Create a tar archive:

tar zcvpf /tmp/archive.tar.gz *

Extract an archive:

tar zxvpf /tmp/archive.tar.gz

Explanation of the arguments:

Argument	Description
z	Compress the archive with gzip (or decompress it when extracting)
c	Create an archive
x	Extract an archive
v	Verbose: print every file added or extracted from the archive
p	Preserve file permissions and owners
f	File - points to your tar archive
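The following sketch walks through a full create, list and extract cycle in throwaway directories. The directory and file names are made up; the t argument (list the content of an archive) and the -C option (change to a directory first) are not in the table above but are standard tar options.

```shell
# Throwaway directories for the example (all names are made up).
src="$(mktemp -d)"
dst="$(mktemp -d)"
archive="$(mktemp -d)/archive.tar.gz"
mkdir "$src/docs"
echo "hello" > "$src/docs/readme.txt"

# Create a gzipped archive; -C changes into $src first so the archive
# stores relative paths (docs/readme.txt) instead of absolute ones.
tar -czpf "$archive" -C "$src" docs

# t lists the archive content without extracting anything.
listing="$(tar -tzf "$archive")"
echo "$listing"

# Extract into another directory.
tar -xzpf "$archive" -C "$dst"
cat "$dst/docs/readme.txt"
```

Listing an archive before extracting it is a cheap way to check where its files will end up.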

parted

Parted is an interactive tool which allows you to create and alter partition tables on a hard disk or volume.

root@linuxhowto.in:~# parted /dev/sda
GNU Parted 2.2
Using /dev/sda
Welcome to GNU Parted! Type 'help' to view a list of commands.

(parted) print
Model: ATA WDC WD5001ABYS-0 (scsi)
Disk /dev/sda: 500GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt

Number  Start   End     Size    File system     Name  Flags
 1      17.4kB  2098kB  2080kB                  grub  bios_grub
 2      2098kB  53.7GB  53.7GB  ext4            root  raid
 3      53.7GB  62.3GB  8589MB  linux-swap(v1)  swap  raid
 4      62.3GB  500GB   438GB   ext4            dss1
(parted) help
  align-check TYPE N                       check partition N for TYPE(min|opt) alignment
  check NUMBER                             do a simple check on the file system
  cp [FROM-DEVICE] FROM-NUMBER TO-NUMBER   copy file system to another partition
  help [COMMAND]                           print general help, or help on COMMAND
  mklabel,mktable LABEL-TYPE               create a new disklabel (partition table)
  mkfs NUMBER FS-TYPE                      make a FS-TYPE file system on partition NUMBER
  mkpart PART-TYPE [FS-TYPE] START END     make a partition
  mkpartfs PART-TYPE FS-TYPE START END     make a partition with a file system
  move NUMBER START END                    move partition NUMBER
  name NUMBER NAME                         name partition NUMBER as NAME
  print [devices|free|list,all|NUMBER]     display the partition table, available devices, free space, all found partitions, or a particular partition
  quit                                     exit program
  rescue START END                         rescue a lost partition near START and END
  resize NUMBER START END                  resize partition NUMBER and its file system
  rm NUMBER                                delete partition NUMBER
  select DEVICE                            choose the device to edit
  set NUMBER FLAG STATE                    change the FLAG on partition NUMBER
  toggle [NUMBER [FLAG]]                   toggle the state of FLAG on partition NUMBER
  unit UNIT                                set the default unit to UNIT
  version                                  display the version number and copyright information of GNU Parted

(parted)

Using an editor

As you might know, our favorite editor is vi or vim (vi improved). On DCOS, which is Ubuntu based, when using vi you're actually using vi improved anyway. The real vi is only used on more Spartan systems.

To open a file with vi, use this command:

vi /path/to/file

Let's assume you have some kind of python stacktrace and it points you to a certain line number inside a file (in the example line number 345):

vi /path/to/file +345


Date and time

It's important for any live system to have the date set correctly, and this goes for all servers. Even if you're only doing a simple task such as debugging a system which is running on multiple servers, nothing is more confusing than comparing log files whose timestamps are not in sync.

A linux system saves the current time in GMT in the hardware clock of the system. When the system boots, it reads the current time from the hardware clock and keeps it in memory. When shutting down, it saves the time back to the hardware clock, which keeps track of time while the machine is off. By default, the time in the hardware clock is saved in GMT (in contrast to a Windows system, which saves the local time to the hardware clock).

Date

The date command, issued without arguments, prints the local time and date.

[root@linuxhowto log]# date
Wed Aug 15 11:12:25 IST 2012

The date command can also be used to configure a new time on the system. The example will set the date to August 16th 2012 at 8:57PM:

date MMDDhhmm[[CC]YY]

eg: date 081620572012

When writing a script, you might need the current date in a custom format. The following example shows you how:

root@linuxhowto.in:~# date '+%Y%m%d'
20120816
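A typical use of such a custom format is to build file names or log lines in a script. A minimal sketch; the backup file name and the log message are made up for the illustration:

```shell
# Date stamp such as 20120816 (year, month, day).
stamp="$(date '+%Y%m%d')"

# Hypothetical backup file name built from the stamp.
backup_name="backup-$stamp.tar.gz"
echo "$backup_name"

# A finer-grained stamp with hours, minutes and seconds, handy for log lines.
echo "$(date '+%Y-%m-%d %H:%M:%S') backup started"
```

Putting the year first makes the names sort chronologically in a plain ls listing.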

NTP

NTP stands for Network Time Protocol. It's a way for systems to request the correct date and time from a remote server. As explained before, it's very important to have all your systems synchronized. NTP was designed to keep track of time on your system and to keep it in sync at a resolution of microseconds.

An NTP daemon will request the correct time from a configured time server and will adjust the local clock to match the time sent by the server. It will also keep track of the drift of the local clock. Every system has some time deviations. The NTP daemon running on your system will keep track of the clock skew and will, after a period of time, be able to calculate the correct drift of your system vs the correct time. The calculated drift will be applied to the system to reduce clock skew to a minimum.

To start the NTP daemon:

[root@linuxhowto log]# /etc/init.d/ntpd start
Starting ntpd:                                             [  OK  ]

To stop the NTP daemon:

[root@linuxhowto log]# /etc/init.d/ntpd stop
Shutting down ntpd:                                        [  OK  ]

The configuration files for an NTP daemon can be found here:

root@linuxhowto.in:~# cat /etc/ntp.conf
driftfile /var/lib/ntp/ntp.drift

server 127.127.1.0
server asia.pool.ntp.org

Ntpdate

There is 1 downside to all regular ntp daemons: if the clock is too far off, the regular ntp daemon cannot and will not adjust the clock to match the correct time. This is why we have ntpdate. Ntpdate will request the correct time from a time server and will apply it to the system, no matter how far the clock is off.

[root@linuxhowto log]# ntpdate -u asia.pool.ntp.org
15 Aug 11:21:43 ntpdate[2411]: adjust time server 59.124.196.83 offset -0.047474 sec

Hard disks

Disks

All devices are mapped to an entry in /dev/. A disk is usually represented by a device node on a linux system. In our case, internal hard drives (scsi, sata and sas drives) are represented as sd devices. This basically means that the 1st drive in your system will be /dev/sda, the 2nd drive will be /dev/sdb, ... Hard drives are detected during boot up. They get their drive letter assigned in the order in which they're detected by the kernel.

Partitions

A disk can be divided into several logical blocks. These blocks are called partitions. The layout of a disk is described in the disk's partition table. The 2 most commonly used partition tables are GPT and MSDOS partition tables. Inside SSO, we use GPT partition tables.

To view and alter partition tables, we use the tool parted. Please refer to the section on parted on how to use it.

Software RAID arrays

Like any good operating system, Linux also supports software RAID. One of the big advantages of software RAID is that it's not hardware dependent and you don't need extra hardware. If you run a RAID array in hardware, you need a hardware RAID controller. These are more expensive than a regular SATA controller. If you created a RAID array on a specific brand of hardware RAID controller and your controller should ever fail, you'll need an almost identical type, or at least the same brand, to get your old disks back up and running. With software RAID this is not the case. You can swap hardware as much as you like. Here is how it works.

On a linux system, the most common approach is to use partitions for raid arrays. We'll start with a regular system:

root@linuxhowto.in:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]

unused devices: <none>

We have no RAID devices yet. Let's create one. The used partitions must be identical in size:

`mdadm --create -l 1 -n 2 /dev/md0 /dev/sda2 /dev/sdb2`

Argument	Description
--create	Since we'll be creating a RAID array
-l 1	The RAID level, 1 meaning a mirror raid
-n 2	The number of devices used in the array, in our case: 2
/dev/md0	The name of our new raid array
/dev/sdxx	A list of all members

This will create and assemble the RAID array:

root@linuxhowto.in:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]

md0 : active raid1 sdb2[1] sda2[0]
52426688 blocks [2/2] [UU]

unused devices: <none>
root@linuxhowto.in:~#

Let's pretend a disk failed. Your raid status should now look like this:

root@linuxhowto.in:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]

md0 : active raid1 sdb2[F] sda2[0]
52426688 blocks [1/2] [U_]

unused devices: <none>
root@linuxhowto.in:~#
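A degraded array shows an underscore in its status field ([U_] instead of [UU]), which makes it easy to check from a script. The following is only a sketch: the function name degraded_arrays is made up, and the sample text is copied from the degraded example in this guide rather than read from a live /proc/mdstat.

```shell
# Print the name of every md array whose status field contains an underscore.
# Reads text in /proc/mdstat format from standard input.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/      { name = $1 }     # remember the current array name
        /\[[U_]*_[U_]*\]/  { print name }    # e.g. [U_]: one member is missing
    '
}

# Sample input copied from the degraded example in this guide.
sample='md0 : active raid1 sdb2[F] sda2[0]
52426688 blocks [1/2] [U_]'

degraded="$(printf '%s\n' "$sample" | degraded_arrays)"
echo "degraded arrays: $degraded"

# On a real system you would run: degraded_arrays < /proc/mdstat
```

For a healthy array the function prints nothing, so an empty result means all members are up.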

Let's assume we have a spare disk in place. So we'll need to remove the failed drive from the RAID array and insert a new partition (which has the same size as the other disks):

root@linuxhowto.in:~# mdadm /dev/md0 -r /dev/sdb2

root@linuxhowto.in:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]

md0 : active raid1 sda2[0]
52426688 blocks [1/2] [U_]

unused devices: <none>

root@linuxhowto.in:~# mdadm /dev/md0 -a /dev/sdc2    # if /dev/sdc is the new disk

root@linuxhowto.in:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]

md0 : active raid1 sdc2[1] sda2[0]
52426688 blocks [1/2] [UU]

unused devices: <none>

Deconfiguring a RAID array involves several steps. We first need to stop the RAID array. After that, the metadata must be cleared on all members, as it is stored on the disks themselves. This metadata is called the superblock:

root@linuxhowto.in:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]

md0 : active raid1 sdc2[1] sda2[0]
52426688 blocks [1/2] [UU]

unused devices: <none>

root@linuxhowto.in:~# mdadm --stop /dev/md0

root@linuxhowto.in:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]

unused devices: <none>

root@linuxhowto.in:~# mdadm --zero-superblock /dev/sda2
root@linuxhowto.in:~# mdadm --zero-superblock /dev/sdc2

If you do not clear the superblocks, the next time the machine boots, it will start the RAID array again as the superblocks get detected automatically by the kernel.

Filesystems

A filesystem is the middle layer between a block device, like a hard drive or a RAID array, and the possibility to create files and directories. A filesystem is actually some kind of indexing table which describes which blocks you need to read from the block device to assemble directories and files.

Ext4

ext4 is the newer version of ext2 and ext3. Some statistics:

Max file size	16TB
Max files in directory	4 billion
Max volume size	1 Exabyte (= 1.000 petabyte = 1.000.000 terabyte)
Became stable	21 October 2008

File permissions

The -l option of ls will show you a detailed overview on which permissions are applied to which file.

root@linuxhowto.in:/opt/qbase3# ls -ld apps
drwxr-xr-x 17 root root 4096 Jul 28 13:47 apps

drwxr-xr-x: the filetype and the permissions on the file (see below for explanation)
17: number of hard links to the file
root: user owning the file
root: group owning the file
4096: size of the file in bytes
Jul 28 13:47: last modification time
apps: name of the file (in our case - the directory; the -d option makes ls list the directory itself instead of its content)

The very first character of a line will tell you what type of file is listed:

Character	File type
d	directory
-	regular file
l	symlink
c	character device
b	block device node

The next 9 characters are the user permissions (owner), group permissions (group owner) and world permissions (everybody else).

Character	Meaning	Description for files	Description for directories
r	readable	You can get the file's content	You can get a directory listing
w	writeable	You can edit the file	You can create new files
x	executable	You can execute it	You can enter the directory and access the files in it

The full explanation of our example above:

It's a directory

The directory's name is apps

The directory is owned by the user root and the group root

The size is 4096 bytes. This is the size of the directory entry in the filesystem, not of the files inside it; for a small directory it is typically 4096 bytes.

Directories usually have the x (executable) flag set for all users

The directory is readable and writable for the user root

The directory is readable for the group root

The directory is readable for world - which means everybody can read it

File permissions can be changed with a number of tools, such as chmod and chown.
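As an illustration of how the permission characters change, here is a small sketch using chmod on a throwaway file. The file name script.sh is made up; cut -c1-10 keeps only the file type and the 9 permission characters of the ls -l output.

```shell
# Throwaway file for the example (the name script.sh is made up).
workdir="$(mktemp -d)"
file="$workdir/script.sh"
touch "$file"

# Numeric mode: 6 = rw- for the owner, 4 = r-- for the group, 4 = r-- for world.
chmod 644 "$file"
before="$(ls -l "$file" | cut -c1-10)"

# Symbolic mode: add the executable flag for the owner (user) only.
chmod u+x "$file"
after="$(ls -l "$file" | cut -c1-10)"

echo "before: $before"
echo "after:  $after"
```

The first line should read -rw-r--r-- and the second -rwxr--r--: only the owner's x flag has changed.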

Symlinks

Links are a way to link files and directories to each other. An example will clarify it all. Let's assume you have several users, each with their own directory structure, and you have a set of files and documents that you want all users to work on. Instead of trying to teach each user where the files are located on the filesystem, somewhere 34 directories deep inside, you can just link the same set of files into each user's directory. The files are still present in the original location, but the user has a kind of shortcut to the files which is completely transparent for them.
There are 2 types of links: softlinks (symbolic links, or symlinks for short) and hardlinks.

Softlinks

The most used type of link is the softlink. This has multiple advantages:

Easily manageable

Viewable by user

A softlink creates a file which contains a link to the original fileset. As softlinks are viewable by regular users (but still transparent), they are easy to detect and to change:

root@linuxhowto.in:~# echo test > testfile1
root@linuxhowto.in:~# ln -s testfile1 testfile2
root@linuxhowto.in:~# ls -l testfile*
-rw-r--r-- 1 root root 5 Aug 9 21:51 testfile1
lrwxrwxrwx 1 root root 9 Aug 9 21:51 testfile2 -> testfile1
root@linuxhowto.in:~# cat testfile2
test
root@linuxhowto.in:~# unlink testfile2
root@linuxhowto.in:~# ls -l testfile*
-rw-r--r-- 1 root root 5 Aug 9 21:51 testfile1

The example shows that a symlink named testfile2 was created, pointing to testfile1. testfile2 can be accessed in the same way as testfile1, and removing the link with unlink leaves testfile1 untouched.
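
The shared-files scenario from the introduction can be sketched in the same way with a directory instead of a file. All paths and the user name alice are made up, and the tree is built in a temporary directory:

```shell
link_demo=$(mktemp -d)

# The real files live deep in the tree...
mkdir -p "$link_demo/srv/projects/2012/shared" "$link_demo/home/alice"
echo "meeting notes" > "$link_demo/srv/projects/2012/shared/notes.txt"

# ...and the user gets a shortcut in her own directory
ln -s "$link_demo/srv/projects/2012/shared" "$link_demo/home/alice/shared"

# Reading through the link works as if the files were stored there
cat "$link_demo/home/alice/shared/notes.txt"

# readlink shows where the link points
readlink "$link_demo/home/alice/shared"
```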

Hardlinks

Hardlinks serve the same purpose as softlinks: they make the same file available under more than one name, transparently for the user and applications. The major difference is that a hardlink operates on filesystem level. Instead of creating a file which points to the path of the original, a hardlink is an additional directory entry for the same inode in the filesystem metadata. Because of this, a hardlink looks like any regular file in a plain ls listing; only the inode number (ls -i) and the link count in ls -l reveal it. Hardlinks cannot cross filesystems and are normally not allowed on directories. Having a closer look at inodes would take us too far off course. For now you just need to know they exist.
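
For the curious, a minimal sketch of a hardlink in action. The file names are made up and the files live in a temporary directory; stat -c is the GNU coreutils form.

```shell
hard_demo=$(mktemp -d)
echo test > "$hard_demo/testfile1"

# No -s: this creates a hardlink, a second name for the same inode
ln "$hard_demo/testfile1" "$hard_demo/testfile3"

# Both names report the same inode number and a link count of 2
inode_a=$(stat -c '%i' "$hard_demo/testfile1")
inode_b=$(stat -c '%i' "$hard_demo/testfile3")
links_before=$(stat -c '%h' "$hard_demo/testfile3")
echo "inodes: $inode_a $inode_b, link count: $links_before"

# Removing one name leaves the data reachable through the other
rm "$hard_demo/testfile1"
cat "$hard_demo/testfile3"
```

Unlike a softlink, there is no "original" and "link" here: both names are equal, and the data is only freed when the last name is removed.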

Mountpoints

Instead of assigning a drive letter to a hard disk or a partition, a linux system uses mountpoints. This allows for a larger flexibility. In order to fully understand the concept, we'll start with the standard layout of a linux system.

Default layout of a linux system

Directory	Usage
/bin	All binaries a regular user should be able to use
/boot	Bootloader, kernel and initrd images
/dev	Contains all device nodes describing the system's hardware
/etc	System wide configuration files
/home	Home directories of all non-root users
/lib	Contains all system libraries
/mnt	Should be used when mounting disks and remote resources
/opt	Home for 3rd party software – no real agreement on this
/proc	Has special filesystem which lets you read and set kernel parameters
/sbin	All system binaries should be here, not needed by normal users
/sys	The new version of /proc. Does basically the same but with a different layout
/usr	Place for 3rd party tools and programs
/var	Place for temporary files, log files, www-root, mail directories, ...

Hard disks and partitions

To mount a partition on a certain mount point, you of course need 2 things: a partition and the mount point. The partition must have a filesystem; you cannot mount a raw block device. The mount point is just a directory, preferably empty. These are the commands for creating a mount point and mounting a disk on it:

mkdir /mnt/testmount
mount /dev/sdf2 /mnt/testmount
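
Mounting over a directory that is not empty hides its contents until the filesystem is unmounted again, so it can help to check the mount point first. A minimal sketch: prepare_mountpoint is a hypothetical helper name, and a temporary directory stands in for /mnt/testmount so the example runs without root.

```shell
# Create the mount point if needed and refuse to continue when it is not empty
prepare_mountpoint() {
    mp_dir=$1
    mkdir -p "$mp_dir" || return 1
    if [ -n "$(ls -A "$mp_dir")" ]; then
        echo "warning: $mp_dir is not empty; its contents would be hidden while mounted" >&2
        return 2
    fi
    echo "$mp_dir is ready"
}

demo_mnt=$(mktemp -d)/testmount
prepare_mountpoint "$demo_mnt"
# As root, the real mount would follow, e.g.: mount /dev/sdf2 /mnt/testmount
```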

For local disks, the mount utility will autodetect the filesystem used on the partition and select the correct helper tool. If you want to force the filesystem type, you can pass it with the -t option (for some filesystems a separate mount.<filesystem type> helper exists as well). Eg:

mount -t ext4 /dev/sdf2 /mnt/testmount

NFS

NFS (Network File System) is a bit like Windows shares. You can locally mount a remote resource and use its directories and files as if they were stored on your own computer. The network installation system to add extra nodes to an SSO environment used NFS to PXE boot the machines and launch the installer. If you log in on a node which is being installed, you will see that the hostname is set to 'nfsroot'. This means that the root filesystem is not located on local disk, but is mounted over the network and actually exists on the management node.
To mount a remote resource, you can use the following command (assuming the remote side is set up properly for NFS sharing):

mount -t nfs 192.168.8.75:/data/nfsshare /mnt/testmount

CIFS

CIFS is the protocol behind Windows shares, and Linux can mount such shares with the cifs filesystem type. When using cifs, you can mount any remote resource coming from a Linux box running Samba or from any Windows machine offering regular Windows shares. SSO uses regular Windows shares too for the systemnas. Customers can connect to the systemnas from any of their local machines. This is how you would connect from a Linux node to the same systemnas:

mount -t cifs -o username=admin,password=admin //192.168.8.56/systemnas /mnt/testmount
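
A password given on the command line ends up in the shell history and is visible in the process list. The cifs mount helper also accepts a credentials= option that reads the username and password from a file. A minimal sketch, in which a temporary file stands in for a real location such as a file in root's home directory (that location is an assumption, not something prescribed):

```shell
# Temporary stand-in for the credentials file
cred_file=$(mktemp)
cat > "$cred_file" <<'EOF'
username=admin
password=admin
EOF

# Only the owner may read or write the file
chmod 600 "$cred_file"

# As root, the mount would then look like this (only printed, not executed here)
echo "mount -t cifs -o credentials=$cred_file //192.168.8.56/systemnas /mnt/testmount"
```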

The fstab file

The fstab file is located at /etc/fstab. It contains a list of all filesystems and mountpoints configured on the system. In most cases it is used to automount volumes during startup. This is an example of the content of an fstab file:

[root@linuxhowto visitor]# cat /etc/fstab
UUID=9e92d823-890b-454e-bcdf-da052db1edeb /                       ext4    defaults        1 1
UUID=79fe8f6a-d6c4-4fbc-a423-4482fd06ca64 /boot                   ext4    defaults        1 2
UUID=823f3ef1-450a-40a9-a407-6ce7ffe4cd33 swap                    swap    defaults        0 0
tmpfs                   /dev/shm                tmpfs   defaults        0 0
devpts                  /dev/pts                devpts  gid=5,mode=620  0 0
sysfs                   /sys                    sysfs   defaults        0 0
proc                    /proc                   proc    defaults        0 0
UUID="ac822e9d-3109-4504-850a-5103e6ae7757"     /mnt/lvdata     ext4    defaults        0 0
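
Each line of the file has six whitespace-separated fields: the device (or its UUID), the mount point, the filesystem type, the mount options, the dump flag and the order in which fsck checks the filesystem at boot (0 means never). As a small sketch, the hypothetical helper list_fstab below prints the mount point, type and options of every entry. It works on a sample copy made from a few of the lines above, so the real /etc/fstab is not touched.

```shell
sample_fstab=$(mktemp)
cat > "$sample_fstab" <<'EOF'
# device                                   mountpoint  type  options   dump pass
UUID=9e92d823-890b-454e-bcdf-da052db1edeb  /           ext4  defaults  1 1
UUID=79fe8f6a-d6c4-4fbc-a423-4482fd06ca64  /boot       ext4  defaults  1 2

proc                                       /proc       proc  defaults  0 0
EOF

# Skip comments and blank lines, then print mount point, type and options
list_fstab() {
    awk '!/^[ \t]*(#|$)/ { print $2, $3, $4 }' "$1"
}

list_fstab "$sample_fstab"
```

Running list_fstab /etc/fstab on a real system gives a quick overview and makes a misspelled option easy to spot.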

If a partition or disk has an entry in /etc/fstab, it is enough to only give the mountpoint or disk to the mount utility to mount the device as all other information is located in the fstab file. A little example:

mount /boot
mount /dev/sda2
mount /dev/sda2 /boot
mount -t ext3 /dev/sda2 /boot
mount.ext3 /dev/sda2 /boot

The commands above all do the same thing. The first 2 versions need an entry in /etc/fstab; the others don't, as all options are specified on the command line.
If your /etc/fstab file has the proper content and you want to mount all configured volumes, there is no need to issue a mount command for each of them. The following command will mount them all at once:

mount -a

CPU usage

CPU usage can be split up in 4 big categories:

User

System or kernel

IOwait

Idle

The user CPU load is the amount of time the CPU spends running programs in user space, as opposed to kernel code. It does not matter whether the process belongs to a regular user or to the root user. These processes are usually daemons or programs, eg: the SSH daemon, a movie player, music player, syslog daemon, your graphical environment (Gnome, KDE, ...), a web browser, a PDF reader, Skype, MSN, a shell, ...

The kernel or system CPU load is the amount of time the CPU is performing kernel related tasks. This is more related to making sure your computer does what it is supposed to do, eg: make sure the disks are ok, fetching network packets from the NIC or sending network packets, writing parts of the physical memory to swap, run context switches on the CPU, schedule programs to make sure every program gets enough CPU time, ...

IOwait is the amount of time the CPU has to wait for peripheral devices to accept or fetch data for the CPU to continue what it was doing. Usually this is disk or network bound (there aren't that many ways to get things in and out of your system).

The last part is idle. This is the part where the CPU is doing nothing. If we do the math: 100% = user + kernel + iowait + idle

You can read all of this by running the top utility. The output example below is what top might print in its header:

top - 13:32:34 up 37 min,  2 users,  load average: 0.00, 0.02, 0.05
Tasks: 150 total,   1 running, 148 sleeping,   0 stopped,   1 zombie
Cpu(s):  2.0%us,  1.7%sy,  0.0%ni, 96.0%id,  0.0%wa,  0.0%hi,  0.3%si,  0.0%st
Mem:   4055732k total,   995056k used,  3060676k free,    91240k buffers
Swap:        0k total,        0k used,        0k free,   387112k cached

If you take a closer look at the 3rd line, you can see us (user), sy (system or kernel time), id (idle) and wa (IO wait). You will also see ni (user processes running with a changed nice value), hi (hardware interrupts), si (software interrupts) and st (time taken away by a hypervisor). Interrupts should be taken into account as well, but if they become important they will also show up in the other numbers. Interrupt handling would take us too far off course.
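
The kernel exposes the underlying counters in /proc/stat, as clock ticks since boot. As a rough sketch, the function below turns one cpu line into the four categories described above. cpu_breakdown is a hypothetical helper name; folding nice time into user and interrupt time into system is a simplification made here, and the sample line is hand-made. Steal time is counted in the total but not printed.

```shell
# Fields on the "cpu" line of /proc/stat:
# cpu user nice system idle iowait irq softirq steal ...
cpu_breakdown() {
    awk '{
        total = $2 + $3 + $4 + $5 + $6 + $7 + $8 + $9
        printf "user=%.1f%% system=%.1f%% iowait=%.1f%% idle=%.1f%%\n",
               100 * ($2 + $3) / total,
               100 * ($4 + $7 + $8) / total,
               100 * $6 / total,
               100 * $5 / total
    }'
}

# Hand-made sample line (values are invented for the demonstration)
echo "cpu 200 0 170 9600 0 0 30 0" | cpu_breakdown

# On a live system: averages since boot, not the current load
if [ -r /proc/stat ]; then
    head -n 1 /proc/stat | cpu_breakdown
fi
```

Tools like top show the current load by taking two such readings a few seconds apart and comparing the differences.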

Memory usage

Memory usage will, in most cases, tell you something about the sizing of your system. If you're running low on memory, then your system probably has too little memory for its workload. If you have lots of free (unused) memory, you could run with less.

Physical memory

The physical memory is the total amount of memory installed as memory modules in your system, memory you can touch.

Swap

The swap memory is a kind of overflow protection for when your system runs out of physical memory. The swap is usually configured on hard drives, which are a lot slower than the physical memory installed in your system. Once certain parts are written to swap, they are not transferred from swap back to the physical memory until that part of the memory is needed. If it is not needed for the next 100 years, it will stay in swap for the next 100 years.
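
The totals for physical memory and swap can be read from /proc/meminfo. A minimal sketch: mem_summary is a hypothetical helper name, and the sample values are copied from the top header in the CPU usage section. Note that "used" here simply means "not free", so it includes memory the kernel uses for buffers and cache.

```shell
# Summarise MemTotal, MemFree and SwapTotal from /proc/meminfo-style input
mem_summary() {
    awk '$1 == "MemTotal:"  { total = $2 }
         $1 == "MemFree:"   { free = $2 }
         $1 == "SwapTotal:" { swap = $2 }
         END {
             if (total > 0)
                 printf "used=%.1f%% of %d kB, swap=%d kB\n",
                        100 * (total - free) / total, total, swap
         }'
}

# Sample input in the same layout as /proc/meminfo
mem_summary <<'EOF'
MemTotal:        4055732 kB
MemFree:         3060676 kB
SwapTotal:             0 kB
EOF

# On a live system
if [ -r /proc/meminfo ]; then
    mem_summary < /proc/meminfo
fi
```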

Network interfaces

ip

The ip command is the Swiss army knife for networking on your Linux system. The list of possibilities of this tool is too long to cover here, that's why I will only give you the most used ones.
Getting the current configuration of network interfaces:

[root@linuxhowto visitor]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 08:00:27:d1:63:41 brd ff:ff:ff:ff:ff:ff
    inet 10.10.8.198/23 brd 10.10.9.255 scope global eth0
    inet6 fe80::a00:27ff:fed1:6341/64 scope link
       valid_lft forever preferred_lft forever
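
When you only need the IPv4 addresses, the output can be filtered. A small sketch: list_ipv4 is a hypothetical helper name, and it assumes the interface name is the last word on each inet line, as it is in the output above. The sample input is a shortened copy of that output.

```shell
# Print "interface address/prefix" for every IPv4 address in `ip a` output
list_ipv4() {
    awk '$1 == "inet" { print $NF, $2 }'
}

list_ipv4 <<'EOF'
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
    inet 127.0.0.1/8 scope host lo
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    inet 10.10.8.198/23 brd 10.10.9.255 scope global eth0
    inet6 fe80::a00:27ff:fed1:6341/64 scope link
EOF

# On a live system, if the ip tool is installed
if command -v ip >/dev/null 2>&1; then
    ip a | list_ipv4
fi

# Other often used forms (not executed here):
#   ip link show          # link state of all interfaces
#   ip route show         # the routing table
#   ip link set eth0 up   # bring an interface up (needs root)
```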

ethtool

ethtool is a tool that will allow you to request the current state of a network card and allows you to set or get certain parameters on ethernet level. Some examples will clarify.
This will give you an overview of the current status of the network card:

[root@linuxhowto visitor]# ethtool eth0
Settings for eth0:
        Supported ports: [ TP ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Supports auto-negotiation: Yes
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full

        Advertised pause frame use: No
        Advertised auto-negotiation: Yes
        Speed: 1000Mb/s
        Duplex: Full
        Port: Twisted Pair
        PHYAD: 0
        Transceiver: internal
        Auto-negotiation: on
        MDI-X: Unknown
        Supports Wake-on: umbg
        Wake-on: d
        Current message level: 0x00000007 (7)
        Link detected: yes
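
Most of the time only a few of these lines matter: the negotiated speed, the duplex mode and whether a link is detected. A small sketch that picks them out of the output; link_summary is a hypothetical helper name and the sample lines are copied from the output above.

```shell
# Reduce ethtool output to speed, duplex and link state
link_summary() {
    awk -F': ' '
        /^[ \t]*Speed:/         { speed = $2 }
        /^[ \t]*Duplex:/        { duplex = $2 }
        /^[ \t]*Link detected:/ { link = $2 }
        END { print "speed=" speed, "duplex=" duplex, "link=" link }'
}

link_summary <<'EOF'
        Speed: 1000Mb/s
        Duplex: Full
        Port: Twisted Pair
        Link detected: yes
EOF

# On a live system (as root): ethtool eth0 | link_summary
```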

The -k option will show you the offload parameters for the NIC added as argument:

[root@linuxhowto visitor]# ethtool -k eth0

Offload parameters for eth0:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp-segmentation-offload: on
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: off
large-receive-offload: off

The -a option shows the pause parameters. As some of you may have heard, there can sometimes be an issue with pause frames on ethernet level with the current NIC driver (forcedeth):

[root@linuxhowto visitor]# ethtool -a eth0

Pause parameters for eth0:
Autonegotiate: on
RX: on
TX: off

Kernel

The core building block of the OS is the kernel. In fact, most people use the word Linux incorrectly. When people say Linux, they usually mean a distribution like Debian, Ubuntu, Gentoo, Slackware, Red Hat, Mandriva, ... Strictly speaking, Linux is just the kernel. Nothing more, nothing less.

Now, what is the kernel?

The kernel is the main management environment of your system. It schedules processes on the CPU to make sure everybody gets their share, it interacts with peripheral devices (internal and external), and it does memory management for you. Linux is a monolithic kernel, but it is not one fixed block of code: the core contains what every system needs and has the ability to load modules. As drivers are needed to interact with a certain type of hardware, a module is created for each brand or type of hardware. When that hardware is available in the system, the module is loaded and the kernel gains the ability to speak to that particular piece of hardware. When the hardware is not available, the module is simply not loaded, so it does not use any system resources by just being there and doing nothing.
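
The modules that are currently loaded are listed in /proc/modules, which is also where the lsmod tool gets its information. A minimal sketch: list_modules is a hypothetical helper name, and the sample line is hand-written in the same layout (module name, size in bytes, use count, then further fields), with made-up values.

```shell
# Print name, size and use count for each line of /proc/modules-style input
list_modules() {
    awk '{ print $1 ": " $2 " bytes, use count " $3 }'
}

# Hand-written sample line (values are invented)
echo "e1000 151236 0 - Live 0xffffffffa0000000" | list_modules

# On a live system: show the first few loaded modules
if [ -r /proc/modules ]; then
    list_modules < /proc/modules | head -n 5
fi
```

Loading and unloading modules by hand is done with modprobe and needs root.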

Of course you can change all of this behavior, because you are able to compile a custom kernel for your system. Just head over to www.kernel.org, download the version you like (stable or unstable), compile it and run it. You have total freedom to decide which drivers to compile as modules and which to compile directly into the kernel. If you're a coder, you can change it to behave the way you want it to behave.

Wednesday, August 15, 2012

Basic Commands in Linux / Linux for Beginners

