Last updated: December 5, 2023
For a Linux administrator, manipulating and managing files is a common task. In this tutorial, we’ll discuss how to delete a certain column of a file. To demonstrate, we’ll make use of the awk and cut commands.
Before we begin deleting columns in a file, we first need to understand its structure. For instance, we need to know whether the file columns are separated by a space, comma, or tab delimiter. Knowing this helps to correctly remove a column.
To illustrate, we’ll use two files containing similar information but different delimiters. The first is a space-delimited file named cars.txt, which we’ll view with the cat command:
$ cat cars.txt
Make Model Year Color
Toyota Camry 2022 Blue
Honda Accord 2021 Silver
Ford Mustang 2023 Red
Meanwhile, the second file is a comma-delimited file named cars.csv:
$ cat cars.csv
Make,Model,Year,Color
Toyota,Camry,2022,Blue
Honda,Accord,2021,Red
Ford,Mustang,2023,Silver
In the upcoming sections, we’ll cover how to delete a column from both files.
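Before picking a delimiter for awk or cut, it can help to confirm how many fields each line has under that delimiter. The following is a minimal sketch, assuming shortened copies of the sample files are recreated in a scratch directory; the workdir variable and the count_fields helper are illustrative names, not part of the original commands:

```shell
# Recreate shortened sample files in a scratch directory (illustrative setup).
workdir=$(mktemp -d)
printf '%s\n' 'Make Model Year Color' 'Toyota Camry 2022 Blue' > "$workdir/cars.txt"
printf '%s\n' 'Make,Model,Year,Color' 'Toyota,Camry,2022,Blue' > "$workdir/cars.csv"

# count_fields DELIM FILE: print the distinct per-line field counts for DELIM.
count_fields() {
  awk -F"$1" '{ print NF }' "$2" | sort -u
}

txt_counts=$(count_fields ' ' "$workdir/cars.txt")   # space-delimited file, space delimiter
csv_counts=$(count_fields ',' "$workdir/cars.csv")   # comma-delimited file, comma delimiter
wrong_counts=$(count_fields ',' "$workdir/cars.txt") # wrong delimiter for this file

echo "cars.txt with space: $txt_counts"
echo "cars.csv with comma: $csv_counts"
echo "cars.txt with comma: $wrong_counts"

rm -rf "$workdir"
```

A single count that matches the number of header columns suggests the delimiter is right, whereas a count of 1 usually means the file uses a different delimiter.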
awk is a command-line tool for manipulating and analyzing text files. It processes each input line against the rules we provide, which makes it flexible and customizable:
$ awk 'pattern { action }' input_file
This example represents its general syntax. For every input line that matches pattern, awk runs action. If we omit the pattern, the action applies to every line.
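As a small illustration of the pattern-action form, the sketch below prints the first field of every line except the header. The sample data is inlined with printf so that the snippet runs on its own. Here, NR > 1 is the pattern and { print $1 } is the action:

```shell
# The pattern NR > 1 skips the header line; the action prints the first field.
makes=$(printf '%s\n' 'Make Model Year Color' 'Toyota Camry 2022 Blue' 'Honda Accord 2021 Silver' \
  | awk 'NR > 1 { print $1 }')
echo "$makes"
```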
First, let’s delete the third column in the cars.txt file:
$ awk '{OFS=" "; $3=""; gsub(/[[:space:]]+/, " "); print $0}' cars.txt > updated_cars.txt
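Let’s break down this command. OFS=" " sets the output field separator to a single space. $3="" empties the third field, which makes awk rebuild the line and leaves two adjacent separators where the field used to be. gsub(/[[:space:]]+/, " ") then collapses every run of whitespace into a single space, and print $0 writes the resulting line. Finally, we redirect the output to the updated_cars.txt file to prevent overwriting the original file.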
Next, let’s check the updated_cars.txt file:
$ cat updated_cars.txt
Make Model Color
Toyota Camry Blue
Honda Accord Silver
Ford Mustang Red
The output above shows that we’ve successfully deleted the third column.
Here, let’s delete the first column in the cars.csv file:
$ awk -F',' '{OFS=","; $1=""; sub("^,",""); gsub(",,",","); sub(",$",""); print $0}' cars.csv > updated_cars.csv
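Further, let’s break down the command that helps us achieve this. -F',' sets the input field separator to a comma, and OFS="," does the same for the output. $1="" empties the first field, which leaves a leading comma behind, so sub("^,","") removes it. gsub(",,",",") collapses doubled commas, which would appear if we emptied a middle column, and sub(",$","") removes a trailing comma, which would appear if we emptied the last column. Lastly, print $0 writes the resulting line.
In summary, awk reads the content of the cars.csv file, defines the Output Field Separator as a comma, deletes the first column, and cleans up any leftover commas at the start, middle, and end of each line.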
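One caveat with the substitution approach is that gsub(",,",",") would also collapse fields that are legitimately empty in the input. A possible alternative, sketched here under the assumption that fields contain no quoted or embedded delimiters, is to rebuild each line from the fields we want to keep. The delete_column helper name is hypothetical:

```shell
# delete_column COL DELIM: read lines on stdin and print them without field COL.
delete_column() {
  awk -F"$2" -v OFS="$2" -v col="$1" '{
    out = ""; sep = ""
    for (i = 1; i <= NF; i++) {
      if (i == col) continue      # skip the column being deleted
      out = out sep $i            # append the field, with a separator after the first kept field
      sep = OFS
    }
    print out
  }'
}

# The empty Model field in the second row is preserved in the output.
result=$(printf '%s\n' 'Make,Model,Year,Color' 'Honda,,2021,Red' | delete_column 1 ',')
echo "$result"
```

Because the line is assembled field by field, no cleanup substitutions are needed, and the same helper works for the first, a middle, or the last column.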
The cut command extracts specific columns or fields from each line of a file. For this reason, it’s useful when working with files organized into columns or fields separated by a delimiter such as a space, tab, or comma:
$ cut OPTION... [FILE]...
Considering the general syntax above, OPTION determines the behavior of the cut command whereas [FILE] represents the file from which data is extracted.
To demonstrate, let’s delete the second column in the cars.txt file:
$ cut -d' ' --complement -f2 cars.txt > updated_cars.txt
Afterward, we can check whether the updated_cars.txt file contains the desired content.
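Note that cut treats every single delimiter character as a field boundary, so a space-delimited file padded with runs of spaces yields empty fields. The sketch below shows one possible workaround, squeezing repeated spaces with tr -s before cutting. The padded sample line is made up for illustration:

```shell
# Two spaces follow "Toyota", so cut sees an empty second field.
padded='Toyota  Camry 2022 Blue'

raw=$(printf '%s\n' "$padded" | cut -d' ' -f2)                   # the empty field between the two spaces
squeezed=$(printf '%s\n' "$padded" | tr -s ' ' | cut -d' ' -f2)  # runs of spaces squeezed to one first

printf 'raw=[%s] squeezed=[%s]\n' "$raw" "$squeezed"
```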
Similarly, we can use the cut command to delete the third column from the cars.csv file:
$ cut -d',' --complement -f3 cars.csv > updated_cars.csv
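Now, let’s explain the above syntax. The -d',' option sets the comma as the field delimiter, and -f3 selects the third field. On its own, that would print only the third column. The --complement option, which GNU cut provides, inverts the selection so that cut prints every field except the third.
At this point, the third column is absent in the new updated_cars.csv file.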
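The --complement option is a GNU extension that POSIX cut doesn’t specify, so it may be unavailable in other cut implementations. Assuming that constraint, the same result can be sketched with a plain field list that names the columns to keep. The sample data is inlined so that the snippet runs on its own:

```shell
# Keep fields 1, 2, and 4 onward, which drops the third column.
kept=$(printf '%s\n' 'Make,Model,Year,Color' 'Toyota,Camry,2022,Blue' | cut -d',' -f1,2,4-)
echo "$kept"
```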
In this article, we discussed how to delete a certain column from a file using the Linux command line. To achieve this, we used both the awk and cut commands.