Postgres supports a wide array of aggregate and window functions which can be used to perform analytics on data or data visualization in general.Aggregate functions act as window functions only when an OVER clause follows the call; otherwise they act as regular aggregates.

There are different categories of aggregate functions in postgres. The most common ones being

'N' FOR NORMAL AGGREGATES, LIKE MAX, MIN, ETC.
'O' FOR THE ORDERED-SET AGGREGATES
'H' FOR THE HYPOTHETICAL-SET AGGREGATES, WHICH ARE A SUBCLASS OF ORDERED-SET AGGREGATES

The hypothetical aggregates do not drop input rows containing nulls. Null values sort according to the rule specified in the ORDER BY clause.

psql=>
SELECT AGGFNOID, AGGKIND
FROM PG_AGGREGATE
WHERE AGGKIND IN ('O', 'H');

AGGFNOID | AGGKIND
——————————————+---------
PG_CATALOG.PERCENTILE_DISC | O
PG_CATALOG.PERCENTILE_CONT | O
PG_CATALOG.PERCENTILE_CONT | O
PG_CATALOG.PERCENTILE_DISC | O
PG_CATALOG.PERCENTILE_CONT | O
PG_CATALOG.PERCENTILE_CONT | O
MODE                       | O
PG_CATALOG.RANK            | H
PG_CATALOG.PERCENT_RANK    | H
PG_CATALOG.CUME_DIST       | H
PG_CATALOG.DENSE_RANK      | H

In this blog I am going to take an example data set and walk through some of the common window functions and where they can be useful.

Lets go ahead and create following table

CREATE TABLE orderranking
(
  orderid serial NOT NULL,
  customerid integer,
  ordertotal numeric(15,2)
)
DISTRIBUTED BY (orderid);

And then insert data into the table

INSERT INTO OrderRanking (CustomerID, OrderTotal)
SELECT 1, 1000
UNION ALL
SELECT 1, 1000
UNION ALL
SELECT 1, 500
UNION ALL
SELECT 1, 650
UNION ALL
SELECT 1, 3000
UNION ALL
SELECT 2, 1000
UNION ALL
SELECT 2, 2000
UNION ALL
SELECT 2, 500
UNION ALL
SELECT 2, 500
UNION ALL
SELECT 3, 500

select * from orderranking order by orderid;
 orderid | customerid | ordertotal 
---------+------------+------------
       1 |          1 |     500.00
       2 |          1 |     650.00
       3 |          1 |    1000.00
       4 |          1 |    3000.00
       5 |          2 |     500.00
       6 |          2 |    1000.00
       7 |          2 |    2000.00
       8 |          3 |     500.00
       9 |          1 |    1000.00
      10 |          2 |     500.00
(10 rows)

The primary window functions have two major operations.

  1. Order by
  2. Order by Partition

ROW_NUMBER()
Assigns a unique number to each row to which it is applied (either each row in the partition or each row returned by the query), in the ordered sequence of rows specified in the order_by_clause, beginning with 1

select *,
ROW_NUMBER() OVER (ORDER BY OrderTotal DESC) AS row_num
from OrderRanking order by OrderTotal desc;

 orderid | customerid | ordertotal | row_num                                                                                                                              
---------+------------+------------+---------
       4 |          1 |    3000.00 |       1
       7 |          2 |    2000.00 |       2
       6 |          2 |    1000.00 |       3
       3 |          1 |    1000.00 |       4
       9 |          1 |    1000.00 |       5
       2 |          1 |     650.00 |       6
       1 |          1 |     500.00 |       7
      10 |          2 |     500.00 |       8
       8 |          3 |     500.00 |       9
       5 |          2 |     500.00 |      10
(10 rows)

select *,
ROW_NUMBER() OVER (ORDER BY OrderTotal DESC) AS row_num,
ROW_NUMBER() OVER (PARTITION BY CustomerID ORDER BY OrderTotal DESC) AS row_num_partition
from OrderRanking order by customerid, ordertotal desc;

 orderid | customerid | ordertotal | row_num | row_num_partition 
---------+------------+------------+---------+-------------------
       4 |          1 |    3000.00 |       1 |                 1
       3 |          1 |    1000.00 |       4 |                 2
       9 |          1 |    1000.00 |       5 |                 3
       2 |          1 |     650.00 |       6 |                 4
       1 |          1 |     500.00 |       7 |                 5
       7 |          2 |    2000.00 |       2 |                 1
       6 |          2 |    1000.00 |       3 |                 2
       5 |          2 |     500.00 |      10 |                 4
      10 |          2 |     500.00 |       9 |                 3
       8 |          3 |     500.00 |       8 |                 1
(10 rows)

RANK()
Computes the rank of each row returned from a query with respect to the other rows returned by the query, based on the values of the expression in the order_by_clause. Ties are assigned the same rank, with the next ranking(s) skipped

select *,
RANK() OVER (ORDER BY OrderTotal DESC) AS rank
from OrderRanking order by OrderTotal desc;

 orderid | customerid | ordertotal | rank                                                                                                                                 
---------+------------+------------+------
       4 |          1 |    3000.00 |    1
       7 |          2 |    2000.00 |    2
       6 |          2 |    1000.00 |    3
       9 |          1 |    1000.00 |    3
       3 |          1 |    1000.00 |    3
       2 |          1 |     650.00 |    6
      10 |          2 |     500.00 |    7
       8 |          3 |     500.00 |    7
       5 |          2 |     500.00 |    7
       1 |          1 |     500.00 |    7
(10 rows)

select *,
RANK() OVER (ORDER BY OrderTotal DESC) AS rank,
RANK() OVER (PARTITION BY CustomerID ORDER BY OrderTotal DESC) AS rank_partition
from OrderRanking order by customerid, ordertotal desc;

 orderid | customerid | ordertotal | rank | rank_partition                                                                                                                
---------+------------+------------+------+----------------
       4 |          1 |    3000.00 |    1 |              1
       3 |          1 |    1000.00 |    3 |              2
       9 |          1 |    1000.00 |    3 |              2
       2 |          1 |     650.00 |    6 |              4
       1 |          1 |     500.00 |    7 |              5
       7 |          2 |    2000.00 |    2 |              1
       6 |          2 |    1000.00 |    3 |              2
      10 |          2 |     500.00 |    7 |              3
       5 |          2 |     500.00 |    7 |              3
       8 |          3 |     500.00 |    7 |              1
(10 rows)

DENSE_RANK()
Rank of the current row without gaps. This function counts peer groups

select *,
DENSE_RANK() OVER (ORDER BY OrderTotal DESC) AS dense_rank
from OrderRanking order by OrderTotal desc;

 orderid | customerid | ordertotal | dense_rank                                                                                                                           
---------+------------+------------+------------
       4 |          1 |    3000.00 |          1
       7 |          2 |    2000.00 |          2
       6 |          2 |    1000.00 |          3
       3 |          1 |    1000.00 |          3
       9 |          1 |    1000.00 |          3
       2 |          1 |     650.00 |          4
       1 |          1 |     500.00 |          5
      10 |          2 |     500.00 |          5
       8 |          3 |     500.00 |          5
       5 |          2 |     500.00 |          5
(10 rows)

select *,
DENSE_RANK() OVER (ORDER BY OrderTotal DESC) AS dense_rank,
DENSE_RANK() OVER (PARTITION BY CustomerID ORDER BY OrderTotal DESC) AS dense_rank_partition
from OrderRanking order by customerid, ordertotal desc;

 orderid | customerid | ordertotal | dense_rank | dense_rank_partition 
---------+------------+------------+------------+----------------------
       4 |          1 |    3000.00 |          1 |                    1
       9 |          1 |    1000.00 |          3 |                    2
       3 |          1 |    1000.00 |          3 |                    2
       2 |          1 |     650.00 |          4 |                    3
       1 |          1 |     500.00 |          5 |                    4
       7 |          2 |    2000.00 |          2 |                    1
       6 |          2 |    1000.00 |          3 |                    2
      10 |          2 |     500.00 |          5 |                    3
       5 |          2 |     500.00 |          5 |                    3
       8 |          3 |     500.00 |          5 |                    1
(10 rows)

We could also use partiiton by clause for Aggregate functions such as SUM, MIN, MAX.

select *,
SUM(OrderTotal) OVER (PARTITION BY CustomerID) AS sum
from OrderRanking order by customerid, ordertotal desc;

 orderid | customerid | ordertotal |   sum   
---------+------------+------------+---------
       4 |          1 |    3000.00 | 6150.00
       3 |          1 |    1000.00 | 6150.00
       9 |          1 |    1000.00 | 6150.00
       2 |          1 |     650.00 | 6150.00
       1 |          1 |     500.00 | 6150.00
       7 |          2 |    2000.00 | 4000.00
       6 |          2 |    1000.00 | 4000.00
       5 |          2 |     500.00 | 4000.00
      10 |          2 |     500.00 | 4000.00
       8 |          3 |     500.00 |  500.00
(10 rows)

PERCENT_RANK()
Relative rank of the current row: (rank - 1) / (total rows - 1)

select *,
RANK() OVER (ORDER BY OrderTotal DESC) AS rank,
PERCENT_RANK() OVER (ORDER BY OrderTotal DESC) AS percent_rank
from OrderRanking order by ordertotal desc;

 orderid | customerid | ordertotal | rank |   percent_rank                                                                                                                
---------+------------+------------+------+-------------------
       4 |          1 |    3000.00 |    1 |                 0
       7 |          2 |    2000.00 |    2 | 0.111111111111111
       6 |          2 |    1000.00 |    3 | 0.222222222222222
       9 |          1 |    1000.00 |    3 | 0.222222222222222
       3 |          1 |    1000.00 |    3 | 0.222222222222222
       2 |          1 |     650.00 |    6 | 0.555555555555556
      10 |          2 |     500.00 |    7 | 0.666666666666667
       8 |          3 |     500.00 |    7 | 0.666666666666667
       5 |          2 |     500.00 |    7 | 0.666666666666667
       1 |          1 |     500.00 |    7 | 0.666666666666667
(10 rows)

select *,
RANK() OVER (PARTITION BY CustomerID ORDER BY OrderTotal DESC) AS rank_partition,
PERCENT_RANK() OVER (PARTITION BY CustomerID ORDER BY OrderTotal DESC) AS percent_rank_partition
from OrderRanking order by customerid, ordertotal desc;

 orderid | customerid | ordertotal | rank_partition | percent_rank_partition 
---------+------------+------------+----------------+------------------------
       4 |          1 |    3000.00 |              1 |                      0
       3 |          1 |    1000.00 |              2 |                   0.25
       9 |          1 |    1000.00 |              2 |                   0.25
       2 |          1 |     650.00 |              4 |                   0.75
       1 |          1 |     500.00 |              5 |                      1
       7 |          2 |    2000.00 |              1 |                      0
       6 |          2 |    1000.00 |              2 |      0.333333333333333
      10 |          2 |     500.00 |              3 |      0.666666666666667
       5 |          2 |     500.00 |              3 |      0.666666666666667
       8 |          3 |     500.00 |              1 |                      0
(10 rows)

CUME_DIST()
Relative rank of the current row: (number of rows preceding or peer with current row) / (total rows)

select *,
CUME_DIST() OVER (ORDER BY OrderTotal DESC) AS cume_dist
from OrderRanking order by OrderTotal desc;

 orderid | customerid | ordertotal | cume_dist 
---------+------------+------------+-----------
       4 |          1 |    3000.00 |       0.1
       7 |          2 |    2000.00 |       0.2
       6 |          2 |    1000.00 |       0.5
       9 |          1 |    1000.00 |       0.5
       3 |          1 |    1000.00 |       0.5
       2 |          1 |     650.00 |       0.6
      10 |          2 |     500.00 |         1
       8 |          3 |     500.00 |         1
       5 |          2 |     500.00 |         1
       1 |          1 |     500.00 |         1
(10 rows)

select *,
CUME_DIST() OVER (ORDER BY OrderTotal DESC) AS cume_dist,
CUME_DIST() OVER (PARTITION BY CustomerID ORDER BY OrderTotal DESC) AS cume_dist_partition
from OrderRanking order by customerid, ordertotal desc;

 orderid | customerid | ordertotal | cume_dist | cume_dist_partition 
---------+------------+------------+-----------+---------------------
       4 |          1 |    3000.00 |       0.1 |                 0.2
       3 |          1 |    1000.00 |       0.5 |                 0.6
       9 |          1 |    1000.00 |       0.5 |                 0.6
       2 |          1 |     650.00 |       0.6 |                 0.8
       1 |          1 |     500.00 |         1 |                   1
       7 |          2 |    2000.00 |       0.2 |                0.25
       6 |          2 |    1000.00 |       0.5 |                 0.5
      10 |          2 |     500.00 |         1 |                   1
       5 |          2 |     500.00 |         1 |                   1
       8 |          3 |     500.00 |         1 |                   1
(10 rows)

Percentile functions

select ordertotal from orderranking order by 1;
 ordertotal 
------------
     500.00
     500.00
     500.00
     500.00
     650.00
    1000.00
    1000.00
    1000.00
    2000.00
    3000.00
(10 rows)

The median for above data set is (1000+650)/2 = 825

select percentile_disc(0.5) WITHIN GROUP (order by ordertotal),
percentile_cont(0.5) WITHIN GROUP (order by ordertotal), 
median(ordertotal) from orderranking; 
 
 percentile_disc | percentile_cont | median 
-----------------+-----------------+--------
             650 |             825 |    825
(1 row)

Percentiles by customer

select 
 customerid,
 percentile_disc(0.25) WITHIN GROUP (order by ordertotal) as bottom_quartile,
 percentile_cont(0.25) WITHIN GROUP (order by ordertotal) as bottom_continuos,
 percentile_disc(0.5) WITHIN GROUP (order by ordertotal) as median_quartile,
 percentile_cont(0.5) WITHIN GROUP (order by ordertotal) as median_continuos,
 percentile_disc(0.75) WITHIN GROUP (order by ordertotal) as top_quartile,
 percentile_cont(0.75) WITHIN GROUP (order by ordertotal) as top_continuos
from orderranking group by 1 order by 1;

 customerid | bottom_quartile | bottom_continuos | median_quartile | median_continuos | top_quartile | top_continuos 
------------+-----------------+------------------+-----------------+------------------+--------------+---------------
          1 |             650 |              650 |            1000 |             1000 |         1000 |          1000
          2 |             500 |              500 |             500 |              750 |         1000 |          1250
          3 |             500 |              500 |             500 |              500 |          500 |           500
(3 rows)

Lead & Lag

select
customerid,
ordertotal,
lag(customerid, 1) OVER (order by ordertotal) as lag_func,
lead(customerid, 1) OVER (order by ordertotal) as lead_func
from orderranking group by 1,2 order by ordertotal;

 customerid | ordertotal | lag_func | lead_func 
------------+------------+----------+-----------
          3 |     500.00 |          |         2
          2 |     500.00 |        3 |         1
          1 |     500.00 |        2 |         1
          1 |     650.00 |        1 |         1
          1 |    1000.00 |        1 |         2
          2 |    1000.00 |        1 |         2
          2 |    2000.00 |        2 |         1
          1 |    3000.00 |        2 |          
(8 rows)

Reusing a window function

select 
CustomerID, 
SUM(OrderTotal) OVER (w) as sum, 
AVG(OrderTotal) OVER (w) as avg 
from OrderRanking WINDOW w AS (PARTITION BY CustomerID)
order by 1

 customerid |   sum   |          avg          
------------+---------+-----------------------
          1 | 6150.00 | 1230.0000000000000000
          1 | 6150.00 | 1230.0000000000000000
          1 | 6150.00 | 1230.0000000000000000
          1 | 6150.00 | 1230.0000000000000000
          1 | 6150.00 | 1230.0000000000000000
          2 | 4000.00 | 1000.0000000000000000
          2 | 4000.00 | 1000.0000000000000000
          2 | 4000.00 | 1000.0000000000000000
          2 | 4000.00 | 1000.0000000000000000
          3 |  500.00 |  500.0000000000000000
(10 rows)