Postgres and Clojure Using clojure.java.jdbc 1/28/17


Introduction

In this post I'll be going over how to use the clojure.java.jdbc library to work with PostgreSQL. We'll touch upon all the basic CRUD operations, as well as how to implement Foreign Key constraints, define update and deletion policies, and build Many-To-Many relationships.

Being fairly low-level, clojure.java.jdbc presents a very thing abstraction layer and we'll often we passing in parameterized (and sometimes raw) SQL into our forms. There is a corresponding video series on YouTube if you prefer watching over reading.

  1. Postgres & Clojure Using clojure.java.jdbc - Part 1 of 3
  2. Postgres & Clojure Using clojure.java.jdbc - Part 2 of 3
  3. Postgres & Clojure Using clojure.java.jdbc - Part 3 of 3

As usual, let me know if you have any questions or problems.

Pre-Requisites

  1. PostgreSQL
  2. Leiningen

Quick Setup

If you already have an empty database and don't really care about setting up the Leiningen app, you can just clone the project repository and load the demo worksheet into Gorilla.

  1. Clone the project repo
  2. cd into the newly cloned project and run lein gorilla
  3. Load the clojure-java-jdbc-lib.cljw worksheet from the ws directory into your Gorilla REPL session

How to Follow Along

If you'd prefer to work in the normal clojure REPL, or want to build up your own Gorilla worksheet from scratch, the rest of this post will walk you through the session. Good luck!

Database Setup

First thing's first, we'll need a database. I'll also be creating a new user since I like to make my local dev environment as close to what I'll be using in production.

$ psql
psql (9.6.1)
Type "help" for help.

postgres=# CREATE USER public_land_user WITH PASSWORD 'keeppubliclandinpublichands';
CREATE ROLE
postgres=# CREATE DATABASE public_land OWNER public_land_user;
CREATE DATABASE
postgres=# \c public_land
You are now connected to database "public_land" as user "pstratton".
postgres=# \dt
No relations found.
postgres=# \q
$

Leiningen Project Setup

Let's create a new project using the built-in app template:

$ lein new app public-land-finder

Next we'll need to add our project dependencies. If we navigate to the clojure.java.jdbc repo we can see the Lein dependency information listed, so lets add that to the :dependencies section of our project.clj file first:

...
:dependencies [[org.clojure/clojure "1.8.0"]
               [org.clojure/java.jdbc "0.7.0-alpha1"]]
...

We're also going to need to pull down the JDBC driver for whichever database we'll be using, in our case, PostgreSQL. Let's follow the Maven Central link and sort out which version we need. It looks like 9.4.1212 is the latest, so let's add that:

...
:dependencies [[org.clojure/clojure "1.8.0"]
               [org.clojure/java.jdbc "0.7.0-alpha1"]
               [org.postgresql/postgresql "9.4.1212"]]
...

Finally, let's add the lein-gorilla plugin so we can do all of our work in the gorilla REPL.

:dependencies [[org.clojure/clojure "1.8.0"]
               [org.clojure/java.jdbc "0.7.0-alpha1"]
               [org.postgresql/postgresql "9.4.1212"]]
:plugins [[lein-gorilla "0.4.0"]]
...

REPL Time!

Fire up a Gorilla REPL session and we can get started:

$ lein gorilla
Gorilla-REPL: 0.4.0
Started nREPL server on port 59902
Running at http://127.0.0.1:59905/worksheet.html .
Ctrl+C to exit.

Then open the HTML link to where the server is running, in my case http://127.0.0.1:59905/worksheet.html.

Let's set a new namespace and do our imports. We'll need the clojure.java.jdbc library, of course. We'll also want to alias clojure.string for use later.

(ns public-land-finder
  (:require [clojure.string :as str]
            [clojure.java.jdbc :as j]))
; nil

Creating Tables

Now that we've imported clojure.java.jdbc library, we can create a new db-spec for our public_land database:

(def db {:dbtype "postgresql"
         :dbname "public_land"
         :host "localhost"
         :user "public_land_user"
         :password "keeppubliclandinpublichands"})
; #'public-land-finder/db

Our nascent project datastore will only have two tables: state and state_forest. Here's the ERD:

ERD v01

Next we'll use the create-table-ddl form to auto-generate the SQL to create our State table:

(def state-sql (j/create-table-ddl :state [[:state_id :serial "PRIMARY KEY"]
                                           [:state "VARCHAR(32)"]
                                           [:abrv "VARCHAR(2)"]]))
; #'public-land-finder/state-sql

Let's take a look at the the generated ddl:

(println state-sql)
; CREATE TABLE state (state_id serial PRIMARY KEY, state VARCHAR(32), abrv VARCHAR(2))

Everything looks good, so we'll go ahead and execute! it:

(j/execute! db [state-sql])
; [0]

Then add our first record to the new table:

(j/insert! db :state {:state "Alabama" :abrv "AL"})
; ({:state_id 1, :state "Alabama", :abrv "AL"})

Running Queries

Let's run a simple query to retrieve our first record:

(j/query db ["SELECT * FROM state WHERE state = ?" "Alabama"])
; ({:state_id 1, :state "Alabama", :abrv "AL"})

Since Postgres returns the full record after a successful insert!, our output is exactly the same. Let's add some more state records.

Adding Multiple Records

There are a few ways to add multiple records in a single query. The insert-multi! form takes a vector of hash-maps and returns the row data that was inserted:

(j/insert-multi! db :state [{:state "Alaska" :abrv "AK"}
                            {:state "Arizona" :abrv "AZ"}
                            {:state "Arkansas" :abrv "AR"}])
; ({:state_id 2, :state "Alaska", :abrv "AK"} 
;  {:state_id 3, :state "Arizona", :abrv "AZ"}
;  {:state_id 4, :state "Arkansas", :abrv "AR"})

The db-do-prepared form accepts a parameterized SQL string and then a series of parameter groups. Instead of returning row data, it returns a seq of update counts:

(j/db-do-prepared db ["INSERT INTO state (state, abrv) VALUES (?, ?)"
                      ["California" "CA"]
                      ["Colorado" "CO"]
                      ["Connecticut" "CN"]] {:multi? true})
; (1 1 1)

Updating Records

Looks like we've made a mistake and used "CN" rather than "CT" for Connecticut's abbreviation, so let's fix that:

(j/update! db :state {:abrv "CT"} ["abrv = ?" "CN"])
; (1)

then run a sanity check to make sure everything is cool:

(j/query db ["SELECT abrv FROM state WHERE state = ?" "Connecticut"])
; ({:abrv "CT"})

Query Result Set Processing

We can specify a function to be run against either the entire query result set and/or each individual row. Let's look at result-set-fn first. In this example we're going to use the count function that is part of the clojure.core library.

(j/query db ["SELECT state_id FROM state"] {:result-set-fn count})
; 7

Normally a query would return a result set of hash-maps. If we wanted a sequence of just the state names, we could pass the :state key in as the row-fn:

(j/query db ["SELECT state FROM state"] {:row-fn :state})
; ("Alabama" "Alaska" "Arizona" "Arkansas" "California" "Colorado" "Connecticut")

We can use row and result set functions in the same query. Let's write a utility function to retrieve a state_id by its abrv value or full name:

(defn id-for-state 
  [s]
  (if (= 2 (count s))
    (j/query db ["SELECT state_id FROM state where abrv = ?" s] 
             {:row-fn :state_id :result-set-fn first})
    (j/query db ["SELECT state_id FROM state where state = ?" s]
             {:row-fn :state_id :result-set-fn first})))
; #'public-land-finder/id-for-state
(id-for-state "California")
; 5
(id-for-state "AK")
; 2

Foreign Key Constraints

Now that we have our states table started, let's add a state_forest table. We'll want each state_forest to include a Foreign Key constraint to the state in which it's located. We do this by using REFERENCES. Since we want to use the state table's primary key, we only have to specify the name of the table.

We'll create the DDL and execute! it in the same form this time:

(j/execute! db [(j/create-table-ddl :state_forest [[:state_id :int "REFERENCES state"]
                                                   [:state_forest_id :serial "PRIMARY KEY"]
                                                   [:state_forest "VARCHAR(256)"]
                                                   [:acres :int]])])
[0]

Let's create a little utility function to facilitate loading state_forest records from a simple [state_forest acres] data structure. Here's Alabama's data set:

(def al-sfs [["Choccolocco" 4536]
             ["Hauss" 319]
             ["Geneva" 7120]
             ["Little River" 2100]
             ["Macon" 190]
             ["Weogufka" 240]])
; #'public-land-finder/al-sfs

Our utility will map over the data structure and build a vector of hashmaps that we can pass into insert-multi!:

(defn load-state-forests! 
  [sf-vec s]
  (let [state-id (id-for-state s)]
    (j/insert-multi! db :state_forest (map #(hash-map :state_id state-id
                                                      :state_forest (first %1)
                                                      :acres (second %1)) sf-vec))))
; #'public-land-finder/load-state-forests!

And now we can load Alabama's state forest data:

(load-state-forests! al-sfs "AL")
; ({:state_id 1, :state_forest_id 1, :state_forest "Choccolocco", :acres 4536}
;  {:state_id 1, :state_forest_id 2, :state_forest "Hauss", :acres 319}
;  {:state_id 1, :state_forest_id 3, :state_forest "Geneva", :acres 7120} 
;  {:state_id 1, :state_forest_id 4, :state_forest "Little River", :acres 2100}
;  {:state_id 1, :state_forest_id 5, :state_forest "Macon", :acres 190} 
;  {:state_id 1, :state_forest_id 6, :state_forest "Weogufka", :acres 240})

Deleting Records

Let's go over how to delete records. First, we'll blow away a state_forest:

(j/delete! db :state_forest ["state_forest = ?" "Macon"])
; (1)

(j/query db ["SELECT * FROM state_forest WHERE state_forest = ?" "Macon"])
; ()

Nice. Now let's see what happens if we delete a state:

(j/delete! db :state ["abrv = ?" "AL"])
; An exception was caused by: org.postgresql.util.PSQLException 
;  (ERROR: update or delete on table "state" violates foreign key constraint 
;  "state_forest_state_id_fkey" on table "state_forest" 
;  Detail: Key (state_id)=(1) is still referenced from table "state_forest".)
; ...

Looks like we didn't establish a deletion policy for our state_forest Foreign Key constraint! Typically we'd want all the child records to be deleted if the parent is deleted. Our ERD should look something like this:

ERD v02

Let's completely blow away the state_forest table and rebuild it properly. First we'll use the drop-table-ddl form to drop the table:

(j/execute! db (j/drop-table-ddl :state_forest))
; [0]

With that done, let's rebuild it with the proper UPDATE and DELETE policies in place (the UPDATE policy handles an event where the Foreign Key value on the parent object changes). We want both policies to CASCADE, which means automatically UPDATE or DELETE the child to mirror the parent:

(j/execute! db [(j/create-table-ddl :state_forest [
[:state_id :int "REFERENCES state ON UPDATE CASCADE ON DELETE CASCADE"]
                                                   [:state_forest_id :serial "PRIMARY KEY"]
                                                   [:state_forest "VARCHAR(256)"]
                                                   [:acres :int]])])
; [0]

Next we repopulate our state_forest table:

(load-state-forests! al-sfs "AL")
; ({:state_id 1, :state_forest_id 1, :state_forest "Choccolocco", :acres 4536}
;  {:state_id 1, :state_forest_id 2, :state_forest "Hauss", :acres 319}
;  {:state_id 1, :state_forest_id 3, :state_forest "Geneva", :acres 7120} 
;  {:state_id 1, :state_forest_id 4, :state_forest "Little River", :acres 2100}
;  {:state_id 1, :state_forest_id 5, :state_forest "Macon", :acres 190} 
;  {:state_id 1, :state_forest_id 6, :state_forest "Weogufka", :acres 240})

(j/query db ["SELECT state_forest_id FROM state_forest"] {:result-set-fn count})
; 6

With our deletion policy set to CASCADE, if we delete Alabama all of its state forests should go along with it:

(j/delete! db :state ["abrv = ?" "AL"])
; (1)

(j/query db ["SELECT state_forest_id FROM state_forest"] {:result-set-fn count})
; 0

Modifying Constraints

Dropping the table and rebuilding is fine during development, but typically you wouldn't want to run something so destructive on a production table. Let's add Alabama back into the state table and repopulate our state_forests:

(j/insert! db :state {:state "Alabama" :abrv "AL"})
; ({:state_id 8, :state "Alabama", :abrv "AL"})

(load-state-forests! al-sfs "AL")
; ({:state_id 1, :state_forest_id 1, :state_forest "Choccolocco", :acres 4536}
;  {:state_id 1, :state_forest_id 2, :state_forest "Hauss", :acres 319}
;  {:state_id 1, :state_forest_id 3, :state_forest "Geneva", :acres 7120} 
;  {:state_id 1, :state_forest_id 4, :state_forest "Little River", :acres 2100}
;  {:state_id 1, :state_forest_id 5, :state_forest "Macon", :acres 190} 
;  {:state_id 1, :state_forest_id 6, :state_forest "Weogufka", :acres 240})

Next we need to figure out what the auto-generated name for our Foreign Key constraint is since we didn't explicitly set it in our DDL statement:

(j/query db ["SELECT constraint_name FROM information_schema.table_constraints WHERE table_name = ?" "state_forest"])
; ({:constraint_name "state_forest_pkey"} 
;  {:constraint_name "state_forest_state_id_fkey"} 
;  {:constraint_name "2200_16898_2_not_null"})

Postgres won't let us modify a Foreign Key constraint directly. We'll need to drop it, then reimplement it:

(j/execute! db ["ALTER TABLE state_forest DROP CONSTRAINT state_forest_state_id_fkey"])
; [0]

We'll run a sanity check to make sure we did in fact drop it:

(j/query db ["SELECT constraint_name FROM information_schema.table_constraints WHERE table_name = ?" "state_forest"])
; ({:constraint_name "state_forest_pkey"}
;  {:constraint_name "2200_16935_2_not_null"})

Let's add the Foreign Key constraint back in, this time we'll set the deletion policy to RESTRICT, which means it'll throw an error if we try and delete a state that has state_forest relations:

(j/execute! db ["ALTER TABLE state_forest ADD CONSTRAINT state_forest_state_id_fkey_new FOREIGN KEY (state_id) REFERENCES state ON DELETE RESTRICT"])
; [0]

Now we're back at our original (undesirable) state where the attempt to delete a state raises an exception:

(j/delete! db :state ["abrv = ?" "AL"])
; An exception was caused by: org.postgresql.util.PSQLException 
;  (ERROR: update or delete on table "state" violates foreign key constraint 
;  "state_forest_state_id_fkey_new" on table "state_forest" 
;  Detail: Key (state_id)=(8) is still referenced from table "state_forest".)
; ...

Now that you know how to "modify" Foreign Key constraints, let's set it back to CASCADE before moving on:

(j/execute! db ["ALTER TABLE state_forest DROP CONSTRAINT state_forest_state_id_fkey_new"])
; [0]

(j/execute! db ["ALTER TABLE state_forest ADD CONSTRAINT state_forest_state_id_fkey FOREIGN KEY (state_id) REFERENCES state ON UPDATE CASCADE ON DELETE CASCADE"])
; [0]

Many-To-Many Relationships

The final thing I'd like to go over is how to set up a Many-To-Many (or N:M) relationship. Relational databases represent a N:M relationship using a special table that consists of the Foreign Key relationships and any associated data.

We want to have an activity table that has records for all the different activities that might be allowed in a state_forest. Our revised ERD looks like this:

ERD v02

Let's go ahead and create the new table and add a few records.

(j/execute! db [(j/create-table-ddl :activity [[:activity_id :serial "PRIMARY KEY"]
                                               [:activity "VARCHAR(64)"]])])
; [0]

(j/insert-multi! db :activity [{:activity "hunting"}
                               {:activity "fishing"}
                               {:activity "trail riding"}
                               {:activity "hiking"}
                               {:activity "primitive camping"}])
; ({:activity_id 1, :activity "hunting"}
;  {:activity_id 2, :activity "fishing"}
;  {:activity_id 3, :activity "trail riding"} 
;  {:activity_id 4, :activity "hiking"} 
;  {:activity_id 5, :activity "primitive camping"})

A state_forest can allow many different activities, and many state forests can allow the same activity. As far as UPDATE and DELETE policies, if we delete a state_forest or an activity we want all the N:M relationships involving the deleted record to go as well. Our relationship table DDL looks like this:

(j/execute! db [(j/create-table-ddl :state_forest_activity [[:state_forest_id :int "REFERENCES state_forest ON UPDATE CASCADE ON DELETE CASCADE"]
                                                            [:activity_id :int "REFERENCES activity ON UPDATE CASCADE ON DELETE CASCADE"]
                                                            ["CONSTRAINT state_forest_activity_pkey PRIMARY KEY (state_forest_id, activity_id)"]])])
; [0]

While we could certainly build up our relationship table manually by querying the primary keys of state_forest and activity then inserting the new records into our relationship table, it'll be a common enough operation that a few helper functions are more than justified.

First we'll need one to get the state_forest_id. Retrieving an id is so common I'd probably create a more general function that would work on any record when it's time to create the actual application, but for now we'll just create another table specific form:

(defn id-for-state-forest
  [name]
  (j/query db ["SELECT state_forest_id FROM state_forest where state_forest = ?" (str/capitalize name)]
           {:row-fn :state_forest_id :result-set-fn first}))

; #'public-land-finder/id-for-state-forest

(id-for-state-forest "Geneva")
; 9

Now we'll create another table specific form for activity_id:

(defn id-for-activity 
  [name]
  (j/query db ["SELECT activity_id FROM activity where activity = ?" (str/lower-case name)]
           {:row-fn :activity_id :result-set-fn first}))
; #'public-land-finder/id-for-activity

(id-for-activity "fishing")
; 2

Let's take another look at what our full activity query result set looks like:

(j/query db ["SELECT * FROM activity"])
; ({:activity_id 1, :activity "hunting"}
;  {:activity_id 2, :activity "fishing"}
;  {:activity_id 3, :activity "trail riding"} 
;  {:activity_id 4, :activity "hiking"} 
;  {:activity_id 5, :activity "primitive camping"})

Adding N:M Relationships (conveniently)

Ideally, I'd like to be able to specify the name of a state_forest then pass in a vector of activities that are allowed. For this to work, we'll need activity row maps to use activity as the key and activity_id as the value. It'll make sense in a second, so just bear with me. Here's the remapping function:

(defn activity-remapper 
  [m] 
  (hash-map (keyword (:activity m)) (:activity_id m)))
; #'public-land-finder/activity-remapper

(activity-remapper {:activity_id 1, :activity "hunting"})
; {:hunting 1}

Finally we'll create our new form for building state_forest activity relationships:

(defn load-state-forest-activities!
  [sf-name activities]
  (let [sf-id (id-for-state-forest sf-name)
        activity-map (apply merge (j/query db ["SELECT * FROM activity"]{:row-fn activity-remapper}))]
    (j/insert-multi! db :state_forest_activity 
                     (map #(hash-map :state_forest_id sf-id
                                     :activity_id ((keyword %) activity-map)) activities))))
; #'public-land-finder/load-state-forest-activities!

Now we can (more) easily add new relationships:

(load-state-forest-activities! "Geneva" ["hunting" "fishing" "trail riding" "hiking" "primitive camping"])
; ({:state_forest_id 9, :activity_id 1} 
;  {:state_forest_id 9, :activity_id 2} 
;  {:state_forest_id 9, :activity_id 3} 
;  {:state_forest_id 9, :activity_id 4} 
;  {:state_forest_id 9, :activity_id 5})

Querying N:M Relationships (conveniently)

Nice. Now we just need a convenient way to get our data back out. I'd like a function that accepts a state_forest and returns a sequence of activities. Since our relationship table stores the id values, let's create a helper function that will perform the activity lookup for an id:

(defn name-for-activity-id 
  [id] 
  (j/query db ["SELECT activity FROM activity WHERE activity_id = ?" id] {:row-fn :activity}))
; #'public-land-finder/name-for-activity-id

(name-for-activity-id 2)
; ("fishing")

The activity query itself will need to use the IN sql condition. Passing a parameter list into the normal query like we've done elsewhere doesn't work since the underlying JDBC library can't infer the datatype. Since the list of activity ids is generated interally we don't really have to worry about a malicious user injecting raw SQL. That being the case, I'll just use a string format function to build the query. If SQL injection is a concern, we'd have to go through hassle of extending the underlying protocol to convery clojure data types into a format that the underlying JDBC library understands. But for this example, format is fine:

(defn query-activities-for-sf 
  [sf-name]
  (let [act-ids (j/query db ["SELECT activity_id FROM state_forest_activity WHERE state_forest_id = ?" 
                             (id-for-state-forest sf-name)]{:row-fn :activity_id})]
    (j/query db [(format "SELECT activity FROM activity WHERE activity_id IN (%s)" 
                         (str/join "," act-ids))]{:row-fn :activity})))
; #'public-land-finder/query-activities-for-sf

(query-activities-for-sf "Geneva")
; ("hunting" "fishing" "trail riding" "hiking" "primitive camping")

Conclusion

As you can see, clojure.java.jdbc is pretty low level. We're still writing a fair amount of raw SQL to get our work done. There are a few other Clojure libraries that offer SQL DSLs at a higher level of abstraction. While it might make sense to go with one of the other libraries, they are all built on top of clojure.java.jdbc so it's worth understanding how they work under the hood. However, if you are comfortable working with SQL and understand the risks of SQL injection attacks, clojure.java.jdbc might be exactly what you want.