1. Overview
Linux provides various utilities for processing file contents and command output. One of the most useful is the cut command.
In this tutorial, we’ll see how we can use the cut command to slice files and command output.
2. The cut Command
The cut command is a command-line utility for cutting sections from each line of a file. It writes the result to the standard output.
It’s worth noting that cut never modifies the input file. It only reads the content and prints the selected parts.
Although typically the input to a cut command is a file, we can pipe the output of other commands and use it as input.
3. Slicing by Bytes
First, let’s see how we can slice the data in a file by byte.
Let’s suppose we have a file of employee records, employee_data.txt:
Name	Age	Department
John Smith	36	HR
John Wayne	48	Finance
Edward King	40	Finance
Stephen Fry	50	IT
The individual fields above are separated by the tab character.
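To follow along, the file can be recreated with printf so that the separators are real tab characters rather than spaces. This is only a sketch: the scratch directory and the workdir, data_file, and line_count variable names are assumptions made for illustration.

```shell
# Hypothetical scratch directory, so the example does not overwrite existing files.
workdir=$(mktemp -d)
data_file="$workdir/employee_data.txt"

# printf reuses the format for every group of three arguments,
# producing one tab-separated line per record.
printf '%s\t%s\t%s\n' \
  'Name' 'Age' 'Department' \
  'John Smith' '36' 'HR' \
  'John Wayne' '48' 'Finance' \
  'Edward King' '40' 'Finance' \
  'Stephen Fry' '50' 'IT' > "$data_file"

# Count the lines to confirm that the header and four records were written.
line_count=$(wc -l < "$data_file" | tr -d ' ')
echo "$line_count"   # 5
```

The commands in the following sections can then be run against "$data_file", or against a copy named employee_data.txt in the current directory.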
To slice by bytes, we’ll use the -b or --bytes option:
$ cut -b 2 employee_data.txt
This will print the second byte from each line in the file:
a
o
o
d
t
We’re not restricted to a single byte. We can also select several bytes from each line.
For example, we can slice by the 3rd, 5th, and 8th bytes simultaneously using the “,” separator:
$ cut -b 3,5,8 employee_data.txt
m	e
h i
h y
wrK
eh
We can also specify a range, using the "-" separator:
$ cut -b 2-5 employee_data.txt
ame
ohn
ohn
dwar
teph
It’s worth noting that we can omit the starting position or the ending position while specifying the range. So, "-5" will select all bytes from the first position to the 5th position. And, "5-" will select all bytes from the 5th position to the end of the line.
As mentioned above, apart from files, we can also pipe output from other Linux commands as input to the cut command:
$ echo slicing example | cut -b 3-7
icing
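The open-ended ranges described earlier can be tried with the same piped input. In this small sketch, the variable names first_five and from_fifth are chosen only for illustration:

```shell
# Open-ended ranges on a short sample string.
first_five=$(echo "slicing example" | cut -b -5)   # bytes 1 through 5
from_fifth=$(echo "slicing example" | cut -b 5-)   # byte 5 through the end of the line

echo "$first_five"   # slici
echo "$from_fifth"   # ing example
```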
4. Slicing by Characters
For slicing by character, we’ll use the -c or --characters option.
It’s similar to slicing by byte, except that it uses the character position rather than the byte position.
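So, if a character uses multiple bytes, the output will include the whole character instead of a single byte from it. This assumes a cut implementation with multibyte support. The GNU coreutils manual describes -c as currently equivalent to -b, so on some systems the -c example below prints a single byte instead.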
Let’s look at an example:
$ echo spéciale | cut -c 3
é
$ echo spéciale | cut -b 3
?
$ echo spéciale | cut -b 3,4
é
Note that the second command prints ? because it selects only the first byte of the two-byte character, and that byte on its own is not a valid printable character.
It’s worth noting that tabs and backspaces are each treated as a single character.
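To see why byte and character positions differ, it can help to count the bytes directly. The sketch below assumes UTF-8 encoding and writes é as its two bytes using octal escapes (\303\251), so it does not depend on how the script itself is saved. The variable names are illustrative only.

```shell
# "é" in UTF-8 is the two bytes 0xC3 0xA9, written here as octal escapes.
byte_count=$(printf '\303\251' | wc -c | tr -d ' ')
echo "$byte_count"   # 2

# Selecting bytes 3 and 4 of "spéciale" keeps both halves of the character.
whole_char=$(printf 'sp\303\251ciale\n' | cut -b 3-4)
echo "$whole_char"   # shows é on a UTF-8 terminal

# Selecting only byte 3 leaves half of the character behind.
half_len=$(printf 'sp\303\251ciale\n' | cut -b 3 | wc -c | tr -d ' ')
echo "$half_len"     # 2: the single byte plus the trailing newline
```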
5. Slicing by Fields
Now, let’s see how we can slice file data by field.
Let’s say we want to list only the names of all the employees from the file. We can do this by slicing the file data by the first field in the file using the -f or --fields option:
$ cut -f 1 employee_data.txt
Here, we’ve used the -f option of the cut command and sliced the input using 1 as the field number:
Name
John Smith
John Wayne
Edward King
Stephen Fry
Above, we’re assuming that the fields in the file are separated using the tab delimiter. But, we can override this behavior by using the -d or --delimiter option to specify a different delimiter:
$ cut -d " " -f 2 employee_data.txt
Here, we’ve used the -d option to specify space as the delimiter. Also, we’re slicing the data using field number 2.
Now, let’s look at the output:
Name	Age	Department
Smith	36	HR
Wayne	48	Finance
King	40	Finance
Fry	50	IT
It’s worth noting that each data line now includes part of the earlier first field and all the rest of the fields. This is because tab is now treated like any other character, and the only space in each of these lines is the one inside the name. The header line contains no spaces at all, so cut prints it unchanged, which is its default for lines that lack the delimiter.
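Since cut takes a single delimiter character per invocation, one way to reach a value nested inside a tab-separated field is to chain two cut commands. The following sketch uses two records from the sample data. The records and first_names variables are names chosen for illustration.

```shell
# Two sample records with tab-separated fields (printf expands \t to a tab).
records=$(printf 'John Smith\t36\tHR\nEdward King\t40\tFinance\n')

# First cut by tab to isolate the name, then by space to keep the first name.
first_names=$(printf '%s\n' "$records" | cut -f 1 | cut -d ' ' -f 1)
echo "$first_names"
# John
# Edward
```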
As with the other options, we can select multiple fields using the “,” separator:
$ cut -f 1,3 employee_data.txt
Name	Department
John Smith	HR
John Wayne	Finance
Edward King	Finance
Stephen Fry	IT
And, we can select a range of fields using the “-“ separator:
$ cut -f 2- employee_data.txt
Age	Department
36	HR
48	Finance
40	Finance
50	IT
The above command will output all fields from the second field onwards.
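One detail worth checking when listing fields is that cut writes the selected fields in the order they appear in the input, regardless of the order in the list. This short sketch on a single record illustrates the point. The variable names are illustrative.

```shell
# A single tab-separated record.
row=$(printf 'John Smith\t36\tHR\n')

# The list is written as 3,1, but the output keeps the input order (field 1, then 3).
reordered=$(printf '%s\n' "$row" | cut -f 3,1)
in_order=$(printf '%s\n' "$row" | cut -f 1,3)

# Both commands print the same line: John Smith<TAB>HR
echo "$reordered"
echo "$in_order"
```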
By default, when slicing by fields, the cut command prints a line in full if it doesn’t contain the delimiter. But, we can alter this behavior using the -s or --only-delimited option, which tells the cut command to skip the lines that don’t have the delimiter.
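The difference can be seen on a small two-line input, where only the second line contains a tab. This is a sketch with made-up input text and illustrative variable names:

```shell
# One line without a tab, followed by one line with a tab.
input=$(printf 'no tab on this line\nkey\tvalue\n')

# Default: the line without the delimiter is passed through unchanged.
default_out=$(printf '%s\n' "$input" | cut -f 2)
echo "$default_out"
# no tab on this line
# value

# With -s: the line without the delimiter is dropped.
strict_out=$(printf '%s\n' "$input" | cut -s -f 2)
echo "$strict_out"
# value
```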
6. Other Options
Now, let’s look at other options that can be used with the above slicing methods.
When we use "," to specify multiple bytes or characters, the cut command concatenates the selected parts without any separator, while selected fields are joined using the input delimiter. But, we can set a custom delimiter using the --output-delimiter option:
$ echo slicing example | cut -c 2-5,9,11-13 --output-delimiter=@
This will add the delimiter character ‘@’ between each part of the output:
lici@e@amp
Another interesting option is --complement. This will print everything except the content at the specified position.
Let’s look at an example:
$ echo slicing example | cut -c 5-10 --complement
slicample
As we can see, the output includes all characters except those in positions 5 through 10.
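Both options can also be combined with -f. The sketch below applies them to one tab-separated record. Note that --output-delimiter and --complement are long options from GNU coreutils, so other cut implementations may not provide them. The variable names are illustrative.

```shell
# A single tab-separated record.
row=$(printf 'John Smith\t36\tHR\n')

# Replace the tab between the selected fields with a comma.
csv=$(printf '%s\n' "$row" | cut -f 1,3 --output-delimiter=,)
echo "$csv"           # John Smith,HR

# Print every field except the second one (the age).
without_age=$(printf '%s\n' "$row" | cut -f 2 --complement)
echo "$without_age"   # John Smith<TAB>HR
```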
7. Conclusion
In this article, we saw examples of using the cut command. This command can be a useful tool for extracting data from files or from the output of other commands.