tool overview
This page lists the most important commands for the CLI tool tabula.

tabula

Tabula is a command-line tool for extracting tables from PDF files. It provides the extraction engine behind the Tabula web app and also works standalone. It converts PDF tables into structured data in CSV, TSV, or JSON format. It is written in Java and requires a Java Runtime Environment (JRE). Tabula can guess which portion of a page contains a table, or you can restrict extraction to specific pages. Cell boundaries are determined either from ruling lines or from blank space between columns, depending on the option you choose. Extracted data is printed to the console by default or saved to a file with -o. It is commonly used for data scraping and for preparing data for analysis.

List of commands for tabula:

  • Extract tables from page 1 of a PDF, guessing which portion of the page to examine.
    $ tabula --guess --pages ${1} ${file-pdf}
  • Extract all tables from a PDF to a JSON file.
    $ tabula --format JSON -o ${file-json} ${file-pdf}
  • Extract all tables from a PDF, using blank space to determine cell boundaries.
    $ tabula --no-spreadsheet ${file-pdf}
  • Extract all tables from a PDF to a CSV file.
    $ tabula -o ${file-csv} ${file-pdf}
  • Extract all tables from a PDF, using ruling lines to determine cell boundaries.
    $ tabula --spreadsheet ${file-pdf}
  • Extract tables from pages 1, 2, 3, and 6 of a PDF.
    $ tabula --pages ${1-3,6} ${file-pdf}
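
The commands above each handle a single PDF. To run the same extraction over a whole directory, they can be wrapped in a small shell loop. The following is a minimal sketch, not an official recipe: the helper name extract_all and the TABULA override variable are hypothetical, and it assumes that the --guess, --pages, and -o options behave as described in the list above.

```shell
#!/bin/sh
# Sketch: write one CSV per PDF found in a directory.
# TABULA is a hypothetical override (not a tabula feature) so the loop
# can be dry-run, for example with TABULA=echo.
TABULA="${TABULA:-tabula}"

# extract_all DIR: hypothetical helper, not part of tabula itself.
extract_all() {
    dir="$1"
    for pdf in "$dir"/*.pdf; do
        # If the directory holds no PDFs the pattern stays unexpanded; skip it.
        [ -e "$pdf" ] || continue
        # Derive the output name by swapping the .pdf suffix for .csv.
        csv="${pdf%.pdf}.csv"
        # Only options shown in the list above are used here.
        "$TABULA" --guess --pages 1 -o "$csv" "$pdf" || echo "failed: $pdf" >&2
    done
}

# Usage (after sourcing this file): extract_all ./reports
```

Setting TABULA=echo before calling extract_all prints the commands instead of running them, which is a cheap way to check file names before a long batch. Adjust the --pages value to match the documents being processed.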