
parquet-tools: Concatenate several Parquet files into the target one.

parquet-tools

$ parquet-tools merge ${path-to-parquet1} ${path-to-parquet2} ${path-to-target_parquet}


The command parquet-tools merge ${path-to-parquet1} ${path-to-parquet2} ${path-to-target_parquet} merges the two Parquet files at ${path-to-parquet1} and ${path-to-parquet2} and writes the result to ${path-to-target_parquet}.

Parquet is a columnar storage file format that is commonly used in big data processing frameworks like Apache Spark and Apache Hadoop. The parquet-tools command-line utility provides various operations to work with Parquet files, including merging multiple Parquet files into a single file.

The command above runs the merge operation of parquet-tools. In this form it takes three arguments:

${path-to-parquet1}: The path to the first Parquet file you want to merge.
${path-to-parquet2}: The path to the second Parquet file you want to merge.
${path-to-target_parquet}: The path to the resulting merged Parquet file.

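When you execute this command, parquet-tools reads the two input files and writes their combined contents to ${path-to-target_parquet}. In the Java parquet-tools from the Apache Parquet project, merge places the row groups of the inputs one after another rather than rewriting them, so merging many small files still produces a file made of small row groups.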
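As a rough sketch, the invocation described above could be wrapped in a small shell function that checks its arguments before calling parquet-tools. The function name merge_parquet, the DRY_RUN variable, and the refusal to overwrite an existing target are all assumptions made for illustration; none of them are part of parquet-tools itself.

```shell
#!/bin/sh
# Hypothetical wrapper around "parquet-tools merge" (not part of parquet-tools).
# Usage: merge_parquet <input1> <input2> <target>
# Set DRY_RUN=1 to print the command instead of running it.
merge_parquet() {
    # Expect exactly two inputs and one target, as in the command above.
    if [ "$#" -ne 3 ]; then
        echo "usage: merge_parquet <input1> <input2> <target>" >&2
        return 2
    fi

    # Precaution chosen for this sketch: never clobber an existing file.
    if [ -e "$3" ]; then
        echo "refusing to overwrite existing target: $3" >&2
        return 1
    fi

    # In dry-run mode, only show what would be executed.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "parquet-tools merge $1 $2 $3"
        return 0
    fi

    parquet-tools merge "$1" "$2" "$3"
}
```

Running the function once with DRY_RUN=1 shows the exact command before anything is written, which fits the advice to check a command before executing it.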

This explanation was created by an AI. Such explanations are correct in most cases, but always be careful and never run a command unless you are sure it is safe.
