2021年1月28日星期四

RECURSIVE CTE very slow when joined

Suppose the following relations:

  • match(match_id)
  • event(match_id, seq, gt, ...)

There are the following indexes:

  • match(match_id)
  • event(match_id, seq)

Further notes:

  • gt is monotonically increasing
  • For a given match I have a collection of events which happens at a specific 'gt' time

I am using postgresql 13.1

My goal is to come up with a RECURSIVE CTE query which solves the above, however I find that very slow. While this can be solved practically with a self-join, I am not interested in that, I want to find out why my CTE is slow. I believe it should not be that slow.

Further numbers:

  • number of matches is 400
  • each match has an average of 541 events

My RECURSIVE CTE query is the following:

  WITH RECURSIVE      delta_gts AS (          SELECT m.match_id, 1 AS seq, 0 AS gt, 0 AS delta          FROM matches m            UNION            SELECT dgt.match_id, ev.seq AS seq, ev.gt AS gt, (ev.gt - dgt.gt) AS delta          FROM delta_gts dgt          JOIN events ev ON ev.match_id = dgt.match_id AND ev.seq = (dgt.seq + 1)        )    SELECT * FROM delta_gts g  

Futher notes I also tried by adding the following (for one match only):

WHERE g.match_id = 'ita_1672780'  

and I find out in the plan that there is no predicate pushdown. I think this was implemented in pgsql 13.1

This is the plan:

QUERY PLAN  CTE Scan on delta_gts g  (cost=123323.44..123780.40 rows=22848 width=76)    CTE delta_gts      ->  Recursive Union  (cost=0.00..123323.44 rows=22848 width=76)            ->  Seq Scan on matches m  (cost=0.00..10.08 rows=408 width=28)            ->  Hash Join  (cost=10789.50..12285.64 rows=2244 width=76)                  Hash Cond: ((dgt.match_id = ev.match_id) AND ((dgt.seq + 1) = ev.seq))                  ->  WorkTable Scan on delta_gts dgt  (cost=0.00..81.60 rows=4080 width=72)                  ->  Hash  (cost=6194.60..6194.60 rows=220260 width=24)                        ->  Seq Scan on events ev  (cost=0.00..6194.60 rows=220260 width=24)  JIT:    Functions: 15  "  Options: Inlining false, Optimization false, Expressions true, Deforming true"    

Considerations:

  • It is not using the index (match_id, seq) on the events table at all when the recursive part of the CTE is executed.
https://stackoverflow.com/questions/65948130/recursive-cte-very-slow-when-joined January 29, 2021 at 10:50AM

没有评论:

发表评论