Linux Command cut and 5 Options

The Linux command cut is particularly useful when you want to extract only specific columns or fields from a file. It is highly efficient when working with text data, especially when the data is divided by a delimiter. In this post, we will explore the basic usage of the cut command and its various options.

What is the Linux Command cut?

The cut command is used to extract specific columns or fields from a text file. When the contents of the file are divided by a consistent delimiter, this command allows you to extract only the parts you need. For instance, it is very handy for pulling specific columns out of simple CSV files. Note that cut does not interpret quotes, so a quoted CSV field that itself contains a comma is split like any other.

Basic Syntax

The basic syntax of the cut command is as follows:

cut [options] [filename]

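With options, you specify which columns or fields to extract, and then provide the filename. If you omit the filename, cut reads from standard input. A list option such as -f or -c is mandatory: without one, cut prints an error message instead of any output, so make sure to include it.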
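As a quick illustration of the syntax, here is a minimal sketch that pipes a line into cut instead of naming a file. The sample text is made up for this example. Because no -d option is given, cut splits the line on tab characters.

```shell
# Feed one tab-separated line to cut through a pipe instead of naming a file.
# With no -d option, cut splits fields on the tab character.
second_field=$(printf 'one\ttwo\tthree\n' | cut -f 2)
echo "$second_field"   # prints: two
```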

Main Options

-f (Select Field)

The -f option is used to select specific fields from a text file that is divided by a delimiter. This option is usually paired with the -d option, which sets the delimiter. By default, cut uses the tab character as a delimiter, but you can also specify commas, semicolons, spaces, or other characters as delimiters.

cut -f 2 -d ',' example.csv

This command extracts the second field from the example.csv file, which is separated by commas. Below is the content of example.csv.

Name,Age
Freud,23
Rachel,37
Mary,59
Adler,93

And here is the result after running the cut command:

Figure 1. Linux Command cut with -f and -d options: Output of the second column from a CSV file
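For readers who want to try this without the screenshot, the following sketch recreates the sample file and runs the same command. The temporary directory and the variable names are choices made for this example only.

```shell
# Recreate the sample file from this post in a temporary directory.
workdir=$(mktemp -d)
printf 'Name,Age\nFreud,23\nRachel,37\nMary,59\nAdler,93\n' > "$workdir/example.csv"

# Extract the second comma-separated field from every line.
ages=$(cut -f 2 -d ',' "$workdir/example.csv")
echo "$ages"

# Clean up the temporary directory.
rm -r "$workdir"
```

The output is the header Age followed by 23, 37, 59, and 93, one value per line. The header row is treated like any other line, because cut has no notion of column names.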

-d (Set Delimiter)

The -d option specifies the character used to separate fields. By default, the cut command recognizes the tab character as the delimiter. However, when working with CSV files or other formats with different delimiters, you must manually set the correct delimiter using the -d option.

cut -f 1 -d ':' /etc/passwd

This command extracts the first field from the /etc/passwd file, using the colon : as the delimiter. Each line of /etc/passwd describes one user account in colon-separated fields, and the first field is the login name, so this command prints the list of account names. Below is the result showing the first column from the file:

Figure 2. Linux Command cut with the -d option: Output of the first column from /etc/passwd
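Since the real /etc/passwd differs from system to system, the sketch below uses two made-up lines in the same format. The account names and paths are hypothetical. It shows that the same -d ':' setting can pick any field, here the login name (field 1) and the login shell (field 7).

```shell
# Two made-up lines in /etc/passwd format (hypothetical accounts, not real system data).
sample='root:x:0:0:root:/root:/bin/bash
alice:x:1000:1000:Alice:/home/alice:/bin/sh'

# Field 1 is the login name; field 7 is the login shell.
names=$(printf '%s\n' "$sample" | cut -f 1 -d ':')
shells=$(printf '%s\n' "$sample" | cut -f 7 -d ':')

echo "$names"    # root, then alice
echo "$shells"   # /bin/bash, then /bin/sh
```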

-c (Extract by Character)

The -c option is used to extract data by characters rather than fields. For example, if you want to extract the first three characters from each line, you can use the following command:

cut -c 1-3 example.txt

This command outputs the first three characters of each line in the example.txt file. The -c option is particularly helpful when every line follows the same fixed-width layout. Below is an example run of the command on the /etc/passwd file:

Figure 3. Output of the first 1-3 characters from each line using the Linux command cut -c option
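The character list accepts more than a simple range. The sketch below, which assumes plain ASCII input and uses one line from example.csv as sample text, shows a closed range, an open-ended range, and a mixed list.

```shell
# Sample line taken from example.csv; positions are counted from 1.
line='Rachel,37'

first_three=$(printf '%s\n' "$line" | cut -c 1-3)    # characters 1 through 3
from_eighth=$(printf '%s\n' "$line" | cut -c 8-)     # character 8 to end of line
picked=$(printf '%s\n' "$line" | cut -c 1,8-9)       # character 1, then 8 through 9

echo "$first_three"   # prints: Rac
echo "$from_eighth"   # prints: 37
echo "$picked"        # prints: R37
```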

--complement (Exclude Selected Fields)

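The --complement option is used when you want to exclude a specific field or character and display everything else. Note that --complement is a GNU coreutils extension, so it may not be available in other implementations of cut. For example, to exclude the second field and display the rest, you can run the following command: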

cut -f 2 --complement -d ',' example.csv

This command excludes the second field from example.csv and displays the remaining fields. It is very useful when you want to exclude certain data while retaining the rest.

Figure 4. Output excluding a specified field using the Linux command cut --complement option
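The effect is easier to see with more than two fields. The sketch below assumes GNU cut and adds a hypothetical third column that is not part of example.csv.

```shell
# Requires GNU cut: --complement is not part of the POSIX cut specification.
# The third column (city) is hypothetical data added for this example.
row='Freud,23,Vienna'

# Print every field except the second.
without_second=$(printf '%s\n' "$row" | cut -f 2 --complement -d ',')
echo "$without_second"   # prints: Freud,Vienna
```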

--output-delimiter (Set Output Delimiter)

To select multiple fields, separate the field numbers with commas, as in -f 1,2, as shown in the figure below. Alternatively, you can specify a range using a hyphen, like -f 1-5. Notice that the output reuses the input delimiter, here a comma, between the selected fields.

Figure 5. Selecting multiple fields using the -f option with the Linux command cut

When extracting multiple fields, you can set the output delimiter using the --output-delimiter option. By default, cut uses the input delimiter for the output as well. Like --complement, this option is a GNU coreutils extension.

cut -f 1,2 -d ',' --output-delimiter='|' example.csv

This command extracts the first and second fields from example.csv and separates them with a pipe (|). This option is especially helpful when you want to change the output format.

Figure 6. Output using the Linux command cut --output-delimiter option to change the output delimiter
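The sketch below puts the default and the custom output delimiter side by side, using the first rows of example.csv. It assumes GNU cut. It also shows a detail that often surprises people: cut always prints fields in their original order, so -f 2,1 gives the same result as -f 1,2.

```shell
# First rows of example.csv, kept in a variable for this example.
csv='Name,Age
Freud,23
Rachel,37'

# Default: the input delimiter (comma) is reused in the output.
same_delim=$(printf '%s\n' "$csv" | cut -d ',' -f 1,2)

# GNU extension: replace the delimiter in the output with a pipe.
piped=$(printf '%s\n' "$csv" | cut -d ',' -f 1,2 --output-delimiter='|')

# Listing fields in a different order does not reorder the output.
reordered=$(printf '%s\n' "$csv" | cut -d ',' -f 2,1)

echo "$same_delim"
echo "$piped"
echo "$reordered"
```

If you need the columns in a different order, a tool such as awk is a better fit than cut.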

Precautions

  1. Set the Delimiter Properly: By default, cut uses the tab character as a delimiter, so when dealing with CSV or other files with different delimiters, you must use the -d option to specify the correct delimiter.
  2. Handling Empty Fields: cut treats an empty field as a real field, so two adjacent delimiters produce an empty string for the field between them. If you need to skip or fill in empty fields, combine cut with other commands.
  3. File Format Awareness: cut splits on one fixed delimiter character, so it may not produce accurate results if the file uses inconsistent delimiters, such as a varying number of spaces between columns. In such cases, it may be better to use more advanced text processing commands like awk or sed.
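These precautions can be checked with a few one-liners. The sketch below uses made-up input strings. The -s option (print only lines that contain the delimiter) is not covered above and is included here as an extra.

```shell
# Empty field: two adjacent commas yield an empty second field.
empty=$(printf 'a,,c\n' | cut -d ',' -f 2)

# Runs of spaces: each space is a separate delimiter for cut,
# while awk treats a run of whitespace as a single separator.
cut_result=$(printf 'a   b\n' | cut -d ' ' -f 2)
awk_result=$(printf 'a   b\n' | awk '{print $2}')

# A line without the delimiter is printed whole unless -s is given.
no_delim=$(printf 'no delimiter here\n' | cut -d ',' -f 2)
suppressed=$(printf 'no delimiter here\n' | cut -s -d ',' -f 2)

echo "empty field:      [$empty]"
echo "cut on spaces:    [$cut_result]"
echo "awk on spaces:    [$awk_result]"
echo "no delimiter:     [$no_delim]"
echo "no delimiter, -s: [$suppressed]"
```

Only the awk call returns b for the space-separated line. With cut, the second field is the empty string between the first and second space.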

Summary

The cut command is an incredibly useful tool for managing text files in Linux. It is especially effective for extracting fields or columns from data divided by delimiters. By mastering options like -f, -d, and -c, you can efficiently extract only the necessary information from complex text files. However, be cautious with the file format and delimiter settings to avoid unintended results. Once you are familiar with the cut command, your data processing tasks in Linux will become much easier.
