The Linux command cut is particularly useful when you want to extract only specific columns or fields from a file. It is highly efficient when working with text data, especially when the data is divided by a delimiter. In this post, we will explore the basic usage of the cut command and its various options.
What is the Linux Command cut?
The cut command is used to extract specific columns or fields from a text file. When the contents of the file are divided by a consistent delimiter, this command allows you to extract only the parts you need. For instance, it’s very handy when working with CSV files and extracting specific columns.
Basic Syntax
The basic syntax of the cut command is as follows:
cut [options] [filename]
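With options, you specify the columns or fields to extract, and then provide the filename. If no filename is given, cut reads from standard input. A selection option (-b, -c, or -f) is mandatory; without one, cut prints an error message instead of any output, so make sure to include it.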
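As a quick illustration of the syntax, the sketch below pipes a tab-separated line into cut instead of naming a file. The sample text is made up for this example. It relies on two facts: cut reads standard input when no filename is given, and tab is the default delimiter. The -f option used here is described in the next section.

```shell
# Hypothetical sample line with three tab-separated fields.
# No filename is given, so cut reads from standard input.
second=$(printf 'one\ttwo\tthree\n' | cut -f 2)

echo "$second"   # prints: two
```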
Main Options
-f (Select Field)
The -f option is used to select specific fields from a text file that is divided by a delimiter. This option is usually paired with the -d option, which sets the delimiter. By default, cut uses the tab character as the delimiter, but you can specify a comma, semicolon, space, or any other single character instead.
cut -f 2 -d ',' example.csv
This command extracts the second field from the example.csv file, which is separated by commas. Below is the content of example.csv.
Name,Age
Freud,23
Rachel,37
Mary,59
Adler,93
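And here is the result after running the cut command: the second column of every line, starting with the header Age and followed by 23, 37, 59, and 93, each on its own line.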
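To try this yourself, the following sketch recreates example.csv in a scratch directory and runs the same command. The use of mktemp and the workdir variable are assumptions made for this example, not part of the original setup.

```shell
# Recreate the sample file in a temporary directory (assumed location).
workdir=$(mktemp -d)
cat > "$workdir/example.csv" <<'EOF'
Name,Age
Freud,23
Rachel,37
Mary,59
Adler,93
EOF

# Extract the second comma-separated field from every line.
ages=$(cut -f 2 -d ',' "$workdir/example.csv")
echo "$ages"
# Output:
# Age
# 23
# 37
# 59
# 93

# Clean up the scratch directory.
rm -r "$workdir"
```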
-d (Set Delimiter)
The -d option specifies the character used to separate fields. By default, the cut command recognizes the tab character as the delimiter. However, when working with CSV files or other formats with different delimiters, you must set the correct delimiter yourself using the -d option.
cut -f 1 -d ':' /etc/passwd
This command extracts the first field from the /etc/passwd file, using the colon (:) as the delimiter. Since /etc/passwd stores user information in colon-separated fields, cut is a natural fit for it. The result is the first column of the file: the account names, one per line.
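Because the contents of /etc/passwd differ from system to system, the sketch below uses a made-up entry in the usual seven-field layout. The account name alice and all of its values are hypothetical. It extracts the first field (login name) and the seventh (login shell).

```shell
# Hypothetical passwd-style entry: name:password:UID:GID:comment:home:shell
entry='alice:x:1000:1000:Alice:/home/alice:/bin/bash'

# Field 1 is the login name, field 7 is the login shell.
login=$(printf '%s\n' "$entry" | cut -f 1 -d ':')
shell_path=$(printf '%s\n' "$entry" | cut -f 7 -d ':')

echo "$login"        # prints: alice
echo "$shell_path"   # prints: /bin/bash
```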
-c (Extract by Character)
The -c option is used to extract data by character position rather than by field. For example, if you want to extract the first three characters from each line, you can use the following command:
cut -c 1-3 example.txt
This command will output the first three characters from each line in the example.txt file. This option is particularly helpful when the structure of the file follows a consistent pattern, such as fixed-width columns. Run on the /etc/passwd file, for example, the same option prints roo for any line that begins with root:.
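A minimal sketch of character-based extraction on an inline string (the sample text is invented for this example). Besides a closed range such as 1-3, -c also accepts comma-separated positions and open-ended ranges.

```shell
# Hypothetical sample line.
line='abcdefgh'

first_three=$(printf '%s\n' "$line" | cut -c 1-3)   # characters 1 through 3
picked=$(printf '%s\n' "$line" | cut -c 1,4)        # characters 1 and 4 only
from_sixth=$(printf '%s\n' "$line" | cut -c 6-)     # character 6 to end of line

echo "$first_three"   # prints: abc
echo "$picked"        # prints: ad
echo "$from_sixth"    # prints: fgh
```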
--complement (Exclude Selected Fields)
The --complement option is used when you want to exclude a specific field or character and display everything else. For example, to exclude the second field and display the rest, you can run the following command:
cut -f 2 --complement -d ',' example.csv
This command excludes the second field from example.csv and displays the remaining fields. It is very useful when you want to exclude certain data while retaining the rest.
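A minimal sketch, assuming GNU cut: --complement is a GNU extension and may be missing from other implementations. The three-field sample line is invented for this example so that more than one field remains after the exclusion.

```shell
# Hypothetical record with three comma-separated fields.
record='Freud,23,Vienna'

# Drop field 2 and keep everything else.
kept=$(printf '%s\n' "$record" | cut -f 2 --complement -d ',')

echo "$kept"   # prints: Freud,Vienna
```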
--output-delimiter (Set Output Delimiter)
To select multiple fields, separate them with commas (,), as in -f 1,3. Alternatively, you can specify a range using a hyphen, like -f 1-5. In both cases, the selected fields are joined in the output by the input delimiter, which here is ','.
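The field list and range forms can be sketched as follows. The six-field sample line is made up for this example; note that the output keeps the comma as its separator.

```shell
# Hypothetical line with six comma-separated fields.
record='a,b,c,d,e,f'

listed=$(printf '%s\n' "$record" | cut -d ',' -f 1,3)   # fields 1 and 3
ranged=$(printf '%s\n' "$record" | cut -d ',' -f 2-4)   # fields 2 through 4

echo "$listed"   # prints: a,c
echo "$ranged"   # prints: b,c,d
```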
When extracting multiple fields, you can set the output delimiter using the --output-delimiter option. By default, cut uses the input delimiter for the output as well.
cut -f 1,2 -d ',' --output-delimiter='|' example.csv
This command extracts the first and second fields from example.csv and separates them with a pipe (|). This option is especially helpful when you want to change the output format.
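A minimal sketch, assuming GNU cut, since --output-delimiter is a GNU extension. It runs the same selection twice on one data line from example.csv to contrast the default separator with the custom one.

```shell
# One data line from example.csv.
record='Freud,23'

# Default: the input delimiter is reused in the output.
default_out=$(printf '%s\n' "$record" | cut -f 1,2 -d ',')

# Custom: fields are joined with a pipe instead.
piped_out=$(printf '%s\n' "$record" | cut -f 1,2 -d ',' --output-delimiter='|')

echo "$default_out"   # prints: Freud,23
echo "$piped_out"     # prints: Freud|23
```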
Precautions
- Set the Delimiter Properly: By default, cut uses the tab character as a delimiter, so when dealing with CSV or other files with different delimiters, you must use the -d option to specify the correct delimiter.
- Handling Empty Fields: Even if fields are empty, the cut command recognizes them and includes them in the output. You might want to combine it with other commands to handle empty fields more effectively.
- File Format Awareness: Since cut works based on fixed delimiters, it may not produce accurate results if the file contains inconsistent delimiters. In such cases, it may be better to use more advanced text processing commands like awk or sed.
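The precautions above can be illustrated with a short sketch (the sample strings are invented for this example). It shows that an empty field produces an empty result, and that cut treats each space in a run of spaces as its own delimiter, whereas awk splits on runs of whitespace by default.

```shell
# An empty second field is still counted, so the result is an empty string.
empty=$(printf 'a,,c\n' | cut -d ',' -f 2)
echo "[$empty]"      # prints: []

# Two spaces between the words: cut sees an empty field between them.
with_cut=$(printf 'a  b\n' | cut -d ' ' -f 2)
echo "[$with_cut]"   # prints: []

# awk splits on runs of whitespace by default, so the second field is "b".
with_awk=$(printf 'a  b\n' | awk '{print $2}')
echo "[$with_awk]"   # prints: [b]
```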
Summary
The cut command is an incredibly useful tool for managing text files in Linux. It is especially effective for extracting fields or columns from data divided by delimiters. By mastering options like -f, -d, and -c, you can efficiently extract only the necessary information from complex text files. However, be cautious with the file format and delimiter settings to avoid unintended results. Once you are familiar with the cut command, your data processing tasks in Linux will become much easier.