From fbb3ed809308288947a6757673db9874b8397ba1 Mon Sep 17 00:00:00 2001
From: Hayden Schiff
Date: Thu, 21 Jan 2016 18:02:42 -0500
Subject: [PATCH 01/17] added csvclean

---
 pages/common/csvclean.md | 12 ++++++++++++
 1 file changed, 12 insertions(+)
 create mode 100644 pages/common/csvclean.md

diff --git a/pages/common/csvclean.md b/pages/common/csvclean.md
new file mode 100644
index 000000000..612748e3b
--- /dev/null
+++ b/pages/common/csvclean.md
@@ -0,0 +1,12 @@
+# csvclean
+
+> Finds and cleans common syntax errors in CSV files.
+> Included in csvkit.
+
+- Clean a CSV file:
+
+`csvclean {{bad.csv}}`
+
+- List locations of syntax errors in a CSV file:
+
+`csvclean -n {{bad.csv}}`

From b3710bc9a1557b57883fe4e67b611d769bee694a Mon Sep 17 00:00:00 2001
From: Hayden Schiff
Date: Thu, 21 Jan 2016 18:02:55 -0500
Subject: [PATCH 02/17] added csvcut

---
 pages/common/csvcut.md | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)
 create mode 100644 pages/common/csvcut.md

diff --git a/pages/common/csvcut.md b/pages/common/csvcut.md
new file mode 100644
index 000000000..f74ecf5ad
--- /dev/null
+++ b/pages/common/csvcut.md
@@ -0,0 +1,20 @@
+# csvcut
+
+> Filter and truncate CSV files. Like unix "cut" command, but for tabular data.
+> Included in csvkit.
+
+- Print indices and names of all columns:
+
+`csvcut -n {{data.csv}}`
+
+- Extract the first and third columns:
+
+`csvcut -c {{1,3}} {{data.csv}}`
+
+- Extract all columns EXCEPT the fourth one:
+
+`csvcut -C {{4}} {{data.csv}}`
+
+- Extract the columns named "id" and "first name" (in that order):
+
+`csvcut -c {{id,"first name"}} {{data.csv}}`

From 7f5bc2fec7beee2453ff1bdfc1811d9936292c8d Mon Sep 17 00:00:00 2001
From: Hayden Schiff
Date: Thu, 21 Jan 2016 18:03:08 -0500
Subject: [PATCH 03/17] added csvformat

---
 pages/common/csvformat.md | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)
 create mode 100644 pages/common/csvformat.md

diff --git a/pages/common/csvformat.md b/pages/common/csvformat.md
new file mode 100644
index 000000000..e05e29628
--- /dev/null
+++ b/pages/common/csvformat.md
@@ -0,0 +1,24 @@
+# csvformat
+
+> Convert a CSV file to a custom output format.
+> Included in csvkit.
+
+- Convert to a tab-delimited file:
+
+`csvformat -T {{data.csv}}`
+
+- Convert delimiters to a custom character:
+
+`csvformat -D "{{custom_character}}" {{data.csv}}`
+
+- Convert line endings to carriage return + line feed:
+
+`csvformat -M "{{\r\n}}" {{data.csv}}`
+
+- Convert to minimal use of quote characters:
+
+`csvformat -U 0 {{data.csv}}`
+
+- Convert to maximum use of quote characters:
+
+`csvformat -U 1 {{data.csv}}`

From 588482fc786707a73408dea06e285329b89296b5 Mon Sep 17 00:00:00 2001
From: Hayden Schiff
Date: Thu, 21 Jan 2016 18:03:29 -0500
Subject: [PATCH 04/17] added csvgrep

---
 pages/common/csvgrep.md | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)
 create mode 100644 pages/common/csvgrep.md

diff --git a/pages/common/csvgrep.md b/pages/common/csvgrep.md
new file mode 100644
index 000000000..a5723729a
--- /dev/null
+++ b/pages/common/csvgrep.md
@@ -0,0 +1,16 @@
+# csvgrep
+
+> Filter CSV rows with string and pattern matching.
+> Included in csvkit.
+
+- Find rows that have a certain string in column 1:
+
+`csvgrep -c {{1}} -m {{string_to_match}} {{data.csv}}`
+
+- Find rows in which columns 3 or 4 matches a certain regex pattern:
+
+`csvgrep -c {{3,4}} -r {{regex_pattern}} {{data.csv}}`
+
+- Find rows in which the "name" column does NOT include the string "John Doe":
+
+`csvgrep -i -c {{name}} -m {{"John Doe"}} {{data.csv}}`

From 0db5d3a67869f2397aff9db66a3507fb876cf785 Mon Sep 17 00:00:00 2001
From: Hayden Schiff
Date: Thu, 21 Jan 2016 18:03:45 -0500
Subject: [PATCH 05/17] added csvlook.md

---
 pages/common/csvlook.md | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)
 create mode 100644 pages/common/csvlook.md

diff --git a/pages/common/csvlook.md b/pages/common/csvlook.md
new file mode 100644
index 000000000..9e8726832
--- /dev/null
+++ b/pages/common/csvlook.md
@@ -0,0 +1,16 @@
+# csvlook
+
+> Render a CSV file in the console as a fixed-width table.
+> Included in csvkit.
+
+- View a CSV file:
+
+`csvlook {{data.csv}}`
+
+- View a CSV file with _less_ for easy scrolling:
+
+`csvlook {{data.csv}} | less`
+
+- View columns 2 and 3 of a CSV file:
+
+`csvcut -c {{2,3}} | csvlook`

From 0acdc5208d65a6733f8e46ae2f4bd6daf0554914 Mon Sep 17 00:00:00 2001
From: Hayden Schiff
Date: Thu, 21 Jan 2016 18:04:00 -0500
Subject: [PATCH 06/17] added csvpy

---
 pages/common/csvpy.md | 12 ++++++++++++
 1 file changed, 12 insertions(+)
 create mode 100644 pages/common/csvpy.md

diff --git a/pages/common/csvpy.md b/pages/common/csvpy.md
new file mode 100644
index 000000000..5d4b06ee6
--- /dev/null
+++ b/pages/common/csvpy.md
@@ -0,0 +1,12 @@
+# csvpy
+
+> Loads a CSV file into a Python shell.
+> Included in csvkit.
+
+- Load a CSV file into a _CSVKitReader_ object:
+
+`csvpy {{data.csv}}`
+
+- Load a CSV file into a _CSVKitDictReader_ object:
+
+`csvpy --dict {{data.csv}}`

From 4746a7342c194b683aaaad1ba0adc043ad99719c Mon Sep 17 00:00:00 2001
From: Hayden Schiff
Date: Thu, 21 Jan 2016 18:04:19 -0500
Subject: [PATCH 07/17] added csvsort

---
 pages/common/csvsort.md | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)
 create mode 100644 pages/common/csvsort.md

diff --git a/pages/common/csvsort.md b/pages/common/csvsort.md
new file mode 100644
index 000000000..81cb649c5
--- /dev/null
+++ b/pages/common/csvsort.md
@@ -0,0 +1,20 @@
+# csvsort
+
+> Sorts CSV files.
+> Included in csvkit.
+
+- Sort a CSV file by column 9:
+
+`csvsort -c {{9}} {{data.csv}}`
+
+- Sort a CSV file by the "name" column in descending order:
+
+`csvsort -r -c {{name}} {{data.csv}}`
+
+- Sort a CSV file by columns 2, then by column 4:
+
+`csvsort -c {{2,4}} {{data.csv}}`
+
+- Sort a CSV file without inferring data types:
+
+`csvsort --no-inference -c {{columns}} {{data.csv}}`

From ff695a1601c3dfc31242525f91b82e1d38d7b153 Mon Sep 17 00:00:00 2001
From: Hayden Schiff
Date: Thu, 21 Jan 2016 18:04:35 -0500
Subject: [PATCH 08/17] added csvstat

---
 pages/common/csvstat.md | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)
 create mode 100644 pages/common/csvstat.md

diff --git a/pages/common/csvstat.md b/pages/common/csvstat.md
new file mode 100644
index 000000000..a118eb424
--- /dev/null
+++ b/pages/common/csvstat.md
@@ -0,0 +1,28 @@
+# csvstat
+
+> Print descriptive statistics for all columns in a CSV file.
+> Included in csvkit.
+
+- Show all stats for all columns:
+
+`csvstat {{data.csv}}`
+
+- Show all stats for columns 2 and 4:
+
+`csvstat -c {{2,4}} {{data.csv}}`
+
+- Show sums for all columns:
+
+`csvstat --sum {{data.csv}}`
+
+- Show averages for all columns:
+
+`csvstat --mean {{data.csv}}`
+
+- Show the max value length for column 3:
+
+`csvstat -c {{1}} --len {{data.csv}}`
+
+- Show the number of unique values in the "name" column:
+
+`csvstat -c {{name}} --unique {{data.csv}}`

From bb12fba8485e0d249022503e22e32ca1722c6707 Mon Sep 17 00:00:00 2001
From: Hayden Schiff
Date: Thu, 21 Jan 2016 18:04:54 -0500
Subject: [PATCH 09/17] added in2csv

---
 pages/common/in2csv.md | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)
 create mode 100644 pages/common/in2csv.md

diff --git a/pages/common/in2csv.md b/pages/common/in2csv.md
new file mode 100644
index 000000000..e206a75b0
--- /dev/null
+++ b/pages/common/in2csv.md
@@ -0,0 +1,20 @@
+# in2csv
+
+> Converts various tabular data formats into CSV.
+> Included in csvkit.
+
+- Convert an XLS file to CSV:
+
+`in2csv {{data.xls}}`
+
+- Convert a DBF file to a CSV file:
+
+`in2csv {{data.dbf}} > {{data.csv}}`
+
+- Convert a specific sheet from an XLSX file to CSV:
+
+`in2csv --sheet={{sheet_name}} {{data.xlsx}}`
+
+- Fetch csvkit's open issues from GitHub's JSON API, and convert them to CSV:
+
+`curl https://api.github.com/repos/onyxfish/csvkit/issues?state=open | in2csv -f json > issues.csv`

From 5c3c6f9e081bae21a9510ee46925c0175ff0104b Mon Sep 17 00:00:00 2001
From: Hayden Schiff
Date: Thu, 21 Jan 2016 18:16:56 -0500
Subject: [PATCH 10/17] typo fix

---
 pages/common/csvsort.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/pages/common/csvsort.md b/pages/common/csvsort.md
index 81cb649c5..c4150ddce 100644
--- a/pages/common/csvsort.md
+++ b/pages/common/csvsort.md
@@ -11,7 +11,7 @@
 
 `csvsort -r -c {{name}} {{data.csv}}`
 
-- Sort a CSV file by columns 2, then by column 4:
+- Sort a CSV file by column 2, then by column 4:
 
 `csvsort -c {{2,4}} {{data.csv}}`
 

From 6937b082d69f41f527217f8c714bc81e60a29625 Mon Sep 17 00:00:00 2001
From: Hayden Schiff
Date: Thu, 21 Jan 2016 18:18:05 -0500
Subject: [PATCH 11/17] fixed typo, removed an example for brevity

---
 pages/common/csvstat.md | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/pages/common/csvstat.md b/pages/common/csvstat.md
index a118eb424..36fcbb62c 100644
--- a/pages/common/csvstat.md
+++ b/pages/common/csvstat.md
@@ -15,13 +15,9 @@
 
 `csvstat --sum {{data.csv}}`
 
-- Show averages for all columns:
-
-`csvstat --mean {{data.csv}}`
-
 - Show the max value length for column 3:
 
-`csvstat -c {{1}} --len {{data.csv}}`
+`csvstat -c {{3}} --len {{data.csv}}`
 
 - Show the number of unique values in the "name" column:
 

From 50142e923affcbd1fbbb39d9f32b3ada5083c39d Mon Sep 17 00:00:00 2001
From: Hayden Schiff
Date: Thu, 21 Jan 2016 21:55:19 -0500
Subject: [PATCH 12/17] csvlook: removed all but simplest example

---
 pages/common/csvlook.md | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/pages/common/csvlook.md b/pages/common/csvlook.md
index 9e8726832..3abdb0e23 100644
--- a/pages/common/csvlook.md
+++ b/pages/common/csvlook.md
@@ -6,11 +6,3 @@
 - View a CSV file:
 
 `csvlook {{data.csv}}`
-
-- View a CSV file with _less_ for easy scrolling:
-
-`csvlook {{data.csv}} | less`
-
-- View columns 2 and 3 of a CSV file:
-
-`csvcut -c {{2,3}} | csvlook`

From 4ddf7488752cf53a8026afbaf3f66b6e35ab4000 Mon Sep 17 00:00:00 2001
From: Hayden Schiff
Date: Thu, 21 Jan 2016 21:59:40 -0500
Subject: [PATCH 13/17] csvgrep: grammatical fix

---
 pages/common/csvgrep.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/pages/common/csvgrep.md b/pages/common/csvgrep.md
index a5723729a..bc7458ebc 100644
--- a/pages/common/csvgrep.md
+++ b/pages/common/csvgrep.md
@@ -7,7 +7,7 @@
 
 `csvgrep -c {{1}} -m {{string_to_match}} {{data.csv}}`
 
-- Find rows in which columns 3 or 4 matches a certain regex pattern:
+- Find rows in which columns 3 or 4 match a certain regex pattern:
 
 `csvgrep -c {{3,4}} -r {{regex_pattern}} {{data.csv}}`
 

From 5676e5892d86ce67acec5a1712e6012440dd4b6c Mon Sep 17 00:00:00 2001
From: Hayden Schiff
Date: Sun, 24 Jan 2016 02:10:38 -0500
Subject: [PATCH 14/17] in2csv: better example for piping / `-f` flag

---
 pages/common/in2csv.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/pages/common/in2csv.md b/pages/common/in2csv.md
index e206a75b0..6342fe081 100644
--- a/pages/common/in2csv.md
+++ b/pages/common/in2csv.md
@@ -15,6 +15,6 @@
 
 `in2csv --sheet={{sheet_name}} {{data.xlsx}}`
 
-- Fetch csvkit's open issues from GitHub's JSON API, and convert them to CSV:
+- Pipe a JSON file to in2csv:
 
-`curl https://api.github.com/repos/onyxfish/csvkit/issues?state=open | in2csv -f json > issues.csv`
+`cat {{data.json}} | in2csv -f json > {{data.csv}}`

From 9e24ad78a289722a6b9e4e682d9b911e58f28cc9 Mon Sep 17 00:00:00 2001
From: Hayden Schiff
Date: Thu, 28 Jan 2016 15:07:06 -0500
Subject: [PATCH 15/17] csvpy: changed italics to code formatting

---
 pages/common/csvpy.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/pages/common/csvpy.md b/pages/common/csvpy.md
index 5d4b06ee6..c49bd1536 100644
--- a/pages/common/csvpy.md
+++ b/pages/common/csvpy.md
@@ -3,10 +3,10 @@
 > Loads a CSV file into a Python shell.
 > Included in csvkit.
 
-- Load a CSV file into a _CSVKitReader_ object:
+- Load a CSV file into a `CSVKitReader` object:
 
 `csvpy {{data.csv}}`
 
-- Load a CSV file into a _CSVKitDictReader_ object:
+- Load a CSV file into a `CSVKitDictReader` object:
 
 `csvpy --dict {{data.csv}}`

From 749ca0897dd640c043d93317cf5e696c6bed6297 Mon Sep 17 00:00:00 2001
From: Hayden Schiff
Date: Thu, 28 Jan 2016 15:08:47 -0500
Subject: [PATCH 16/17] csvformat: better wording and clarifications

---
 pages/common/csvformat.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/pages/common/csvformat.md b/pages/common/csvformat.md
index e05e29628..16276fbb1 100644
--- a/pages/common/csvformat.md
+++ b/pages/common/csvformat.md
@@ -3,7 +3,7 @@
 > Convert a CSV file to a custom output format.
 > Included in csvkit.
 
-- Convert to a tab-delimited file:
+- Convert to a tab-delimited file (TSV):
 
 `csvformat -T {{data.csv}}`
 
@@ -11,14 +11,14 @@
 
 `csvformat -D "{{custom_character}}" {{data.csv}}`
 
-- Convert line endings to carriage return + line feed:
+- Convert line endings to carriage return (^M) + line feed:
 
 `csvformat -M "{{\r\n}}" {{data.csv}}`
 
-- Convert to minimal use of quote characters:
+- Minimize use of quote characters:
 
 `csvformat -U 0 {{data.csv}}`
 
-- Convert to maximum use of quote characters:
+- Maximize use of quote characters:
 
 `csvformat -U 1 {{data.csv}}`

From ac13221b4179f8006d621acc9dee5f94ce60d364 Mon Sep 17 00:00:00 2001
From: Hayden Schiff
Date: Thu, 28 Jan 2016 15:11:53 -0500
Subject: [PATCH 17/17] csvcut: formatting cleanup

---
 pages/common/csvcut.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/pages/common/csvcut.md b/pages/common/csvcut.md
index f74ecf5ad..03ad76f09 100644
--- a/pages/common/csvcut.md
+++ b/pages/common/csvcut.md
@@ -1,6 +1,6 @@
 # csvcut
 
-> Filter and truncate CSV files. Like unix "cut" command, but for tabular data.
+> Filter and truncate CSV files. Like Unix's `cut` command, but for tabular data.
 > Included in csvkit.
 
 - Print indices and names of all columns:
@@ -11,7 +11,7 @@
 
 `csvcut -c {{1,3}} {{data.csv}}`
 
-- Extract all columns EXCEPT the fourth one:
+- Extract all columns **except** the fourth one:
 
 `csvcut -C {{4}} {{data.csv}}`
 