Collection Health Holds.sql (31.8 KB)
This “Collection Health Holds Query” has been in development for over a year. I first mentioned it in this post and presented a snippet of it at IUG 2024.
Using the filled hold and checkout transactions in circ_trans, output an analysis of checkouts by branch for the [previous] reporting month:
A. Reporting month (or first month of the reporting period if adjusted)
B. Display order
C. Branch/Collection/Pickup Location Name
D. Count of holds checked out
E. Count of browsed items checked out
F. Ratio of D:E
G. Count of holds filled from the local collection
H. Count of holds transited in from another branch
I. Ratio of G:H ± an estimated margin of error calculated as J÷E
J. Count of browsed checkouts having a non-local branch location code prefix
K. Count of inbound INN-Reach items
L. Count of outbound INN-Reach items
M. Ratio of K:L
Most useful if your circ_trans table is configured to retain at least 32 days of history; ours is configured to retain 1,110 days or 36½ months. You could run it weekly, but the smaller sample size will be more sensitive to spikes in circ activity, closed holidays, unscheduled branch closures, etc. The ratios are expressed as percentages of a total, but you can ditch the % symbols if they are clutter to your eyes.
Customization will of course be required for your site, primarily to map location codes and stat groups to branches. Note that this mapping is required in three places in the query batch (there are 15-16 queries in here, plus two optional diagnostic queries).
In addition to producing insightful statistics, the code presents useful examples of:
- CTEs (Common Table Expressions)
- Date functions and date arithmetic using INTERVAL
- Regular Expression string matching
- LEFT OUTER JOIN used to find non-matching rows
- CROSS JOINs
- UNIONs
- Creation of Temporary Tables
- Building a Primary Key on a temp table
- Building an Index on a temp table
- INSERTing data into temp tables
- UPDATE statements on temp tables
- Use of the LATERAL keyword for what amounts to set-based looping
- The GENERATE_SERIES() function
When run for the default one-month reporting period, this query batch completes for me in about 15 seconds. It naturally takes a wee bit longer if you’re running a YTD report, full year, or even all of circ_trans history, but no queries have timed out. YMMV. If you want to try building a multi-column index to improve performance, use a btree
index instead of the hash
index I use:
CREATE INDEX descriptive-name_idx
ON tmpFilledHold USING btree (column1, column2, ...)
;
…as PostgreSQL hash indicies only work for single columns. Enjoy!