Display Specific Columns From a File in Linux

1. Overview

As Linux users, we frequently perform various operations on files. One common operation is displaying specific columns from a file.

In this tutorial, we’ll discuss how to do this with the awk and cut commands.

2. Display Single Column

Let’s create a file to use as an example. The input.txt file contains the output of the ls command in the long listing format:

$ cat input.txt 
-rw-r--r-- 1 jarvis jarvis 200M Apr 27 22:04 file1.dat
-rw-r--r-- 1 jarvis jarvis 400M Apr 27 22:04 file2.dat
-rw-r--r-- 1 jarvis jarvis 500M Apr 27 22:04 file3.dat
-rw-r--r-- 1 jarvis jarvis 600M Apr 27 22:04 file4.dat
-rw-r--r-- 1 jarvis jarvis 700M Apr 27 22:04 file5.dat

We can use the awk command to display specific columns. Let’s print the 5th column from the file:

$ awk '{print $5}' input.txt 
200M
400M
500M
600M
700M

Let’s break down the awk command:

print: it’s awk’s built-in statement that writes text to the standard output stream
$5: it represents the 5th column, which holds the file size

Note that awk uses $N to represent the Nth column. For example, $2 represents the 2nd column.

We can also use the cut command to display specific columns. Let’s print the same column using the cut command:

$ cut -d' ' -f5 input.txt 
200M
400M
500M
600M
700M

Let’s take a look at the options we used in the cut command:

-d: it sets the field delimiter, here a single space. Its default value is a tab character
-f5: it selects the 5th column, which holds the file size
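One difference between the two commands is worth illustrating: cut treats every single delimiter character as a column boundary, while awk, by default, collapses runs of whitespace. The sketch below uses a hypothetical line (not taken from input.txt) in which ls has padded the size column with an extra space:

```shell
# Hypothetical sample line: two spaces before the size, as ls prints when it pads columns
line='-rw-r--r-- 1 jarvis jarvis  20M Apr 27 22:04 small.dat'

# cut counts the empty string between the two spaces as a field of its own
cut_field=$(printf '%s\n' "$line" | cut -d' ' -f5)

# awk collapses the run of spaces, so the size is still field 5
awk_field=$(printf '%s\n' "$line" | awk '{print $5}')

printf 'cut: [%s]\nawk: [%s]\n' "$cut_field" "$awk_field"
```

Here cut returns an empty 5th field, whereas awk still returns 20M. A common workaround is to squeeze repeated spaces with tr -s ' ' before piping the text into cut.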

3. Display Multiple Columns

We can use awk to display multiple columns as well. Let’s print the file name and its size:

$ awk '{print $9 " " $5}' input.txt 
file1.dat 200M
file2.dat 400M
file3.dat 500M
file4.dat 600M
file5.dat 700M

Let’s see what’s new in this awk command:

$9: it represents the 9th column, which holds the file name
" ": it’s a literal space that awk concatenates between the two fields

It’s also possible to use the cut command to display multiple columns. For example, we can specify multiple columns using a comma-separated list as follows:

$ cut -d' ' -f9,5 input.txt 
200M file1.dat
400M file2.dat
500M file3.dat
600M file4.dat
700M file5.dat

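Note that it’s not possible to rearrange column order with the cut command. Selected fields are written in the order in which they appear in the input, which is why the size is printed before the file name even though we asked for -f9,5.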
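To make the ordering behavior concrete, here is a minimal sketch that recreates two lines of the sample data in a temporary file (the variable names are only illustrative) and compares both field orders:

```shell
# Recreate two lines of the sample file in a temporary location
tmp=$(mktemp)
printf '%s\n' \
  '-rw-r--r-- 1 jarvis jarvis 200M Apr 27 22:04 file1.dat' \
  '-rw-r--r-- 1 jarvis jarvis 400M Apr 27 22:04 file2.dat' > "$tmp"

# Both field lists select the same set of columns, so the output is identical
a=$(cut -d' ' -f9,5 "$tmp")
b=$(cut -d' ' -f5,9 "$tmp")
[ "$a" = "$b" ] && echo "cut ignores the order of the field list"

# awk prints fields in whatever order we write them
reordered=$(awk '{print $9, $5}' "$tmp")
echo "$reordered"

rm -f "$tmp"
```

When the output order matters, awk is usually the simpler choice.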

4. Display Range of Columns

When there are many columns to display, it’s often more convenient to use a loop than to list each one. Let’s print all columns within the range of 3 to 8:

$ awk '{ for (i = 3; i <= 8; ++i) printf "%s ", $i; print "" }' input.txt
jarvis jarvis 200M Apr 27 22:04 
jarvis jarvis 400M Apr 27 22:04 
jarvis jarvis 500M Apr 27 22:04 
jarvis jarvis 600M Apr 27 22:04 
jarvis jarvis 700M Apr 27 22:04

Let’s take a look at the constructs we used in the awk command:

for: it’s awk’s looping construct
printf: it’s awk’s built-in statement that writes formatted text to the standard output stream
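The loop above leaves a trailing space at the end of each line. As a hedged variation, the sketch below chooses the separator inside the loop and passes each field through a fixed "%s" format, so a % character in the data is printed literally instead of being interpreted by printf:

```shell
line='-rw-r--r-- 1 jarvis jarvis 200M Apr 27 22:04 file1.dat'

# Print fields 3 to 8: a space follows every field except the last, which gets a newline
range=$(printf '%s\n' "$line" |
  awk '{ for (i = 3; i <= 8; i++) printf "%s%s", $i, (i < 8 ? " " : "\n") }')

# The brackets make it visible that there is no trailing space
printf '[%s]\n' "$range"
```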

We can use the cut command to achieve the same result. It allows us to specify a range of columns using a hyphen as follows:

$ cut -d' ' -f3-8 input.txt 
jarvis jarvis 200M Apr 27 22:04
jarvis jarvis 400M Apr 27 22:04
jarvis jarvis 500M Apr 27 22:04
jarvis jarvis 600M Apr 27 22:04
jarvis jarvis 700M Apr 27 22:04

Additionally, we can display all columns from a specified column to the last column, which can be useful if we don’t know the exact number of columns. To demonstrate, let’s print all columns from the 5th column to the last column:

$ awk '{for (i=5; i<=NF; i++) printf "%s ", $i; print ""}' input.txt
200M Apr 27 22:04 file1.dat 
400M Apr 27 22:04 file2.dat 
500M Apr 27 22:04 file3.dat 
600M Apr 27 22:04 file4.dat 
700M Apr 27 22:04 file5.dat

Above, for (i=5; i<=NF; i++) represents a loop that iterates from the 5th column up to NF, printing each field. NF is a variable built into awk that represents the number of fields or columns in a line.
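Two related shortcuts may be handy here. cut accepts an open-ended range such as 5-, and awk can address the last column directly as $NF. A minimal sketch on a single sample line:

```shell
line='-rw-r--r-- 1 jarvis jarvis 200M Apr 27 22:04 file1.dat'

# cut: "5-" selects field 5 through the end of the line
tail_fields=$(printf '%s\n' "$line" | cut -d' ' -f5-)

# awk: $NF is the last field, whatever its position
last_field=$(printf '%s\n' "$line" | awk '{print $NF}')

printf '%s\n%s\n' "$tail_fields" "$last_field"
```

The first line of output holds everything from the size onward, and the second holds only the file name.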

5. Changing the awk Field Separator

By default, awk splits each line into columns on runs of whitespace (spaces and tabs). However, we can change the separator to suit our input. First, let’s modify our original input file by replacing the spaces between columns with commas. The modified file now looks like this:

$ cat input.txt 
-rw-r--r--,1,jarvis,jarvis,200M,Apr 27 22:04,file1.dat
-rw-r--r--,1,jarvis,jarvis,400M,Apr 27 22:04,file2.dat
-rw-r--r--,1,jarvis,jarvis,500M,Apr 27 22:04,file3.dat
-rw-r--r--,1,jarvis,jarvis,600M,Apr 27 22:04,file4.dat
-rw-r--r--,1,jarvis,jarvis,700M,Apr 27 22:04,file5.dat

Let’s print the file name, its size, and timestamp using a comma as a column separator:

$ awk -F"," '{print $7 " " $5 " " $6}' input.txt 
file1.dat 200M Apr 27 22:04
file2.dat 400M Apr 27 22:04
file3.dat 500M Apr 27 22:04
file4.dat 600M Apr 27 22:04
file5.dat 700M Apr 27 22:04

Let’s see the option we used in the awk command:

-F: it sets the input field separator, here a comma

Note that the spaces inside the 6th column are intentionally left in place. Because the separator is now a comma, awk treats the whole timestamp as a single column.
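The -F option only affects how awk reads its input. To also separate the printed columns with commas, we can set awk’s OFS (output field separator) variable. The sketch below shows this on one line of the comma-separated data, alongside the equivalent cut delimiter:

```shell
line='-rw-r--r--,1,jarvis,jarvis,200M,Apr 27 22:04,file1.dat'

# -F sets the input separator; OFS is what print puts between comma-separated arguments
csv=$(printf '%s\n' "$line" | awk -F',' -v OFS=',' '{print $7, $5, $6}')

# cut takes the same delimiter through -d, but keeps the input order
cut_out=$(printf '%s\n' "$line" | cut -d',' -f5-7)

printf '%s\n%s\n' "$csv" "$cut_out"
```

Note that cut reuses the input delimiter between the fields it prints, so no extra option is needed there.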

6. Conclusion

In this tutorial, we discussed various examples of displaying specific columns from a file with awk and cut. The commands shown here are useful in day-to-day work on Linux systems.
