Active 0.1.1 Go CLI Windows Linux macOS

Preprocess

Fast, cross-platform CLI tool for preprocessing data ahead of analysis

Features

  • Fast preprocessing of tabular data files (CSV, TSV…)
  • Group, aggregate and summarize operations
  • Diff — compute differences between two versions of a dataset
  • Skim — instant overview of a file's structure and column types
  • Scale and normalize numeric columns
  • One-command install on Linux, macOS and Windows
  • Native binaries for each platform, built and released with GoReleaser

Preprocess is a command-line tool written in Go, designed to speed up repetitive data preprocessing tasks. It targets analysts and developers who work with tabular files (CSV, TSV) and want a fast, scriptable alternative to graphical tools.

Installation

Linux / macOS

curl -LsSf https://preprocess-cli.netlify.app/install.sh | sh

Windows

powershell -ExecutionPolicy ByPass -c "irm https://preprocess-cli.netlify.app/install.ps1 | iex"

Or download the binary directly from the GitHub releases page.
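
After installing, it can help to confirm that the shell can actually find the binary. The snippet below is a minimal sketch: `check_preprocess` is a hypothetical helper name, not part of the tool, and it assumes the binary is called `preprocess` as in the usage examples. The install location is not documented here, so the helper only reports whatever path the shell resolves.

```shell
# Hypothetical helper: report whether `preprocess` is reachable on PATH.
# Relies only on the POSIX `command -v` builtin; it never runs the tool itself.
check_preprocess() {
  if command -v preprocess >/dev/null 2>&1; then
    echo "preprocess found at: $(command -v preprocess)"
  else
    echo "preprocess not found on PATH; open a new shell or add the install directory to PATH" >&2
    return 1
  fi
}

# Example: check_preprocess
```

If the helper reports that the binary is missing right after installation, opening a new terminal session is usually the first thing to try, since PATH changes often apply only to new shells.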

Usage Examples

# Quick file overview
preprocess skim data.csv

# Descriptive statistics
preprocess statistics data.csv

# Group by column
preprocess group data.csv --by country

# Diff between two files
preprocess diff before.csv after.csv
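
Because each command takes plain file arguments, the tool can be driven from a shell script. The following is a sketch under stated assumptions: `skim_all` is a hypothetical wrapper name, it uses only the `skim` subcommand shown above, and it assumes `preprocess` is on PATH. The tool's exit codes and output format are not documented here, so the wrapper does not interpret them.

```shell
# Hypothetical wrapper: run `preprocess skim` on every file passed as an argument.
# Paths that are not regular files are skipped with a message on stderr.
skim_all() {
  for file in "$@"; do
    if [ ! -f "$file" ]; then
      echo "skipping $file: not a regular file" >&2
      continue
    fi
    echo "== $file =="
    preprocess skim "$file"
  done
}

# Example: skim_all data/*.csv
```

The same loop shape works for the other commands listed above, for example replacing `skim` with `statistics`.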

Changelog

Fix 0.1.1

Update internal version number

Feature 0.1.0

First public release — skim, group, statistics, scale and diff commands