tsv-filter
tsv-filter is a command-line tool for filtering tab-separated values (TSV) files. It processes input as a stream, one line at a time, so it handles large datasets efficiently. It reads from standard input or from one or more files, keeps the lines that pass the specified tests, and writes them to standard output.
With tsv-filter, you filter TSV files by running tests against individual fields. Tests are given as command-line options rather than symbolic operators: --eq, --ne, --gt, --ge, --lt, and --le compare a field numerically, and string variants such as --str-eq and --str-ne compare it as text. When several tests are given, a line must pass all of them (logical AND); the --or option keeps lines that pass at least one.
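The difference between combining tests with AND and with OR can be sketched on a small file. The sample data below is hypothetical, and awk is used only as a runnable stand-in so the sketch works without tsv-filter installed; the corresponding tsv-filter invocation is shown in the comment above each awk line.

```shell
# Build a small sample file (hypothetical data) with a header line.
sample=$(mktemp)
printf 'name\tscore\tteam\n' >  "$sample"
printf 'ann\t100\tred\n'     >> "$sample"
printf 'bob\t42\tblue\n'     >> "$sample"
printf 'cy\t100\tblue\n'     >> "$sample"

# AND (the default when several tests are given):
#   tsv-filter -H --eq 2:100 --str-eq 3:blue "$sample"
and_rows=$(awk -F'\t' 'NR == 1 || ($2 == 100 && $3 == "blue")' "$sample")

# OR (requested with --or):
#   tsv-filter -H --or --eq 2:100 --str-eq 3:blue "$sample"
or_rows=$(awk -F'\t' 'NR == 1 || ($2 == 100 || $3 == "blue")' "$sample")

printf 'AND:\n%s\n' "$and_rows"
printf 'OR:\n%s\n' "$or_rows"
rm -f "$sample"
```

With this data the AND form keeps only the cy row, while the OR form keeps all three data rows, since each has a score of 100 or belongs to the blue team.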
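tsv-filter does not select or rename columns: by default, every line that passes is written out unchanged, and with -H the header line is passed through as is. Column selection is handled by the companion tool tsv-select. tsv-filter does support tests based on regular expressions (--regex, and --iregex for case-insensitive matching), enabling powerful pattern matching.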
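As a rough illustration of regular-expression tests, the sketch below matches a field against a pattern, first case-sensitively and then case-insensitively. The file contents are made up for the example, and awk again serves only as a stand-in that runs anywhere; the tsv-filter form of each test appears in the comment above it.

```shell
# Hypothetical sample data: an id column and a path column.
sample=$(mktemp)
printf 'id\tpath\n'               >  "$sample"
printf '1\t/var/log/app.log\n'    >> "$sample"
printf '2\t/home/ann/notes.txt\n' >> "$sample"
printf '3\t/var/LOG/old.LOG\n'    >> "$sample"

# Case-sensitive match on field 2:
#   tsv-filter -H --regex '2:\.log$' "$sample"
regex_rows=$(awk -F'\t' 'NR == 1 || $2 ~ /\.log$/' "$sample")

# Case-insensitive match on field 2:
#   tsv-filter -H --iregex '2:\.log$' "$sample"
iregex_rows=$(awk -F'\t' 'NR == 1 || tolower($2) ~ /\.log$/' "$sample")

printf 'regex:\n%s\n' "$regex_rows"
printf 'iregex:\n%s\n' "$iregex_rows"
rm -f "$sample"
```

The case-sensitive test keeps only row 1; the case-insensitive one also keeps row 3.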
tsv-filter itself does not sort, group, or aggregate data. Those steps are handled by other tools that its output can be piped into: for example, tsv-summarize from the same toolkit groups data by one or more columns and computes aggregations such as sum, mean, minimum, maximum, and count, and the standard sort command handles numerical and textual sorting.
The tool is well documented, and the --help option prints a summary of its options. tsv-filter is part of eBay's TSV Utilities (tsv-utils), which are written in the D programming language, not Python, and are not installed with pip. Prebuilt binaries are published for Linux and macOS, and the tools can also be built from source.
Overall, tsv-filter is a fast, focused command-line filter for TSV files, useful to data scientists, analysts, and anyone working with large datasets.
List of commands for tsv-filter:
- Count matching lines, interpreting first line as a [H]eader:
  $ tsv-filter --count -H --eq ${field_name}:${number} ${path-to-tsv_file}
- Print the lines where a specific column is [eq]ual/[n]ot [e]qual/[l]ess [t]han/[l]ess than or [e]qual/[g]reater [t]han/[g]reater than or [e]qual to a given number (${select} is one of eq, ne, lt, le, gt, ge):
  $ tsv-filter --${select} ${column_number}:${number} ${path-to-tsv_file}
- Filter for non-empty fields:
  $ tsv-filter --not-empty ${column_number} ${path-to-tsv_file}
- Print the lines where a specific column is numerically equal to a given number:
  $ tsv-filter -H --eq ${field_name}:${number} ${path-to-tsv_file}
- Print the lines that satisfy two conditions:
  $ tsv-filter --eq ${column_number1}:${number} --str-eq ${column_number2}:${string} ${path-to-tsv_file}
- Print the lines that match at least one condition:
  $ tsv-filter --or --eq ${column_number1}:${number} --str-eq ${column_number2}:${string} ${path-to-tsv_file}
- Print the lines where a specific column is empty:
  $ tsv-filter --invert --not-empty ${column_number} ${path-to-tsv_file}
- Print the lines where a specific column is [eq]ual/[n]ot [e]qual/part of/not part of a given string (${select} is one of eq, ne, in-fld, not-in-fld):
  $ tsv-filter --str-${select} ${column_number}:${string} ${path-to-tsv_file}
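
The empty and non-empty tests in the list above can be tried on a two-row file. This is a minimal sketch with invented data; awk stands in for tsv-filter so the block runs without it installed, and each comment gives the tsv-filter command it mirrors. Note that --invert flips the result of the whole filter, which is why combining it with --not-empty yields the lines whose field is empty.

```shell
# Hypothetical sample data: the second row has an empty email field.
sample=$(mktemp)
printf 'name\temail\n'          >  "$sample"
printf 'ann\tann@example.com\n' >> "$sample"
printf 'bob\t\n'                >> "$sample"

# Lines whose field 2 is non-empty:
#   tsv-filter -H --not-empty 2 "$sample"
filled=$(awk -F'\t' 'NR == 1 || $2 != ""' "$sample")

# Lines whose field 2 is empty (inverted filter):
#   tsv-filter -H --invert --not-empty 2 "$sample"
missing=$(awk -F'\t' 'NR == 1 || $2 == ""' "$sample")

printf 'non-empty:\n%s\n' "$filled"
printf 'empty:\n%s\n' "$missing"
rm -f "$sample"
```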