PG Casts

Generate a CSV from Table Data

Episode #33 hosted by Mary Lee

Hey everyone, today we're going to look at how to generate a CSV from data in Postgres.

Postgres Version 11

Transcript

We can generate a CSV from table data in Postgres using the copy command.

The copy command expects a table as an argument, followed by the absolute path for the outputted csv file.

In this example, we're using a dummy database with a table called users.

We can see from inspecting the users table that we have a few columns.

\d users

If we select from the table we can see that we have a few rows of data.

select * from users;

Let's dump this data into a CSV using the copy command.

copy users
to '/Users/marylee/Documents/users.csv';

Once the copy command has finished running, we can cat the newly generated file.

You'll notice that our CSV file doesn't have headers. To include headers in the generated csv file, you can pass "with csv header" to the copy command after specifying the output file path.

Let's try this out.

copy users
to '/Users/marylee/Documents/users.csv'
with csv header;

This time when we cat our output file, we can see that the headers have been added to the top.

If we only want specific columns from our table, we can pass them as a comma separated list after our table name. In this case, let's just select name and email from our users.

copy users (name, age)
to '/Users/marylee/Documents/users.csv'
with csv header;

Now if we cat our CSV again, we should see that we only have the name and email from our users table.

The copy command also supports queries on table being dumped. To do this, we wrap our query in parentheses and pass it as the first argument to the copy command. In this example, let's only dump users over the age of 40.

copy (select * from users where age > 40)
to '/Users/marylee/Documents/users.csv'
with csv header;

With the copy command finished, we can cat our CSV file again. This time, you can see that we only have the two users that satisfied our query conditions.

One thing to note when we're using a query to get the data from our table, is that we can't pass the column name args after the query. Instead, we update the query so we're only returning the columns we want.

Let's take our last example. To only dump the name and email of users over the age of 40, we alter our select statement to select only the columns we want.

copy (select name, email from users where age > 40)
to '/Users/marylee/Documents/users.csv'
with csv header;

If we cat the users CSV one more time, we can see that only name and email of our users over the age of 40 were included.

Thanks for watching!

Setup

create table users (
  name varchar not null,
  email varchar not null unique,
  age integer
);

insert into users
values ('Sam', 'sam@example.com', 35),
('Sue', 'sue@example.com', 65),
('Carl', 'carl@example.com', 45);