This query was fixed (improved in efficiency) recently in #607 (cc @Andrew7234), but now it's consistently timing out again.
From mainnet staging logs:
WITH relevant_rounds AS (
SELECT min(round) as min_round, max(round) as max_round
FROM chain.runtime_blocks AS b
WHERE (b.runtime = $1 AND b.timestamp >= $2::timestamptz AND b.timestamp < $3::timestamptz)
)
SELECT COUNT(DISTINCT account_address)
FROM chain.runtime_related_transactions AS rt, relevant_rounds
WHERE (rt.runtime = $1 AND tx_round >= relevant_rounds.min_round AND tx_round <= relevant_rounds.max_round)
args: [
0: "emerald"
1: "2022-03-15T02:20:00Z"
2: "2022-03-16T02:20:00Z"
]
oasisindexer=> explain analyze WITH relevant_rounds AS (
SELECT min(round) as min_round, max(round) as max_round
FROM chain.runtime_blocks AS b
WHERE (b.runtime = 'emerald' AND b.timestamp >= '2022-03-15T02:20:00Z' AND b.timestamp < '2022-03-16T02:20:00Z')
)
SELECT COUNT(DISTINCT account_address)
FROM chain.runtime_related_transactions AS rt, relevant_rounds
WHERE (rt.runtime = 'emerald' AND tx_round >= relevant_rounds.min_round AND tx_round <= relevant_rounds.max_round);
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=6654916.70..6654916.71 rows=1 width=8) (actual time=18810.986..18810.994 rows=1 loops=1)
-> Nested Loop (cost=705652.14..6610085.65 rows=17932419 width=47) (actual time=4268.994..12804.725 rows=1452131 loops=1)
-> Aggregate (cost=22921.23..22921.24 rows=1 width=16) (actual time=369.592..369.597 rows=1 loops=1)
-> Index Scan using ix_runtime_blocks_timestamp on runtime_blocks b (cost=0.56..22890.94 rows=6057 width=8) (actual time=2.250..367.111 rows=9937 loops=1)
Index Cond: ((runtime = 'emerald'::runtime) AND ("timestamp" >= '2022-03-15 02:20:00+00'::timestamp with time zone) AND ("timestamp" < '2022-03-16 02:20:00+00'::timestamp with time zone))
-> Bitmap Heap Scan on runtime_related_transactions rt (cost=682730.91..6407840.22 rows=17932419 width=55) (actual time=3899.384..12176.384 rows=1452131 loops=1)
Recheck Cond: ((runtime = 'emerald'::runtime) AND ((tx_round)::bigint >= (min((b.round)::bigint))) AND ((tx_round)::bigint <= (max((b.round)::bigint))))
Heap Blocks: exact=20638
-> Bitmap Index Scan on ix_runtime_related_transactions_round_index (cost=0.00..678247.81 rows=17932419 width=0) (actual time=3893.056..3893.057 rows=1452131 loops=1)
Index Cond: ((runtime = 'emerald'::runtime) AND ((tx_round)::bigint >= (min((b.round)::bigint))) AND ((tx_round)::bigint <= (max((b.round)::bigint))))
Planning Time: 0.454 ms
Execution Time: 18811.122 ms
(12 rows)
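For context, the two indexes the plan relies on appear to cover (runtime, timestamp) and (runtime, tx_round). Below is a rough reconstruction of their definitions, inferred only from the index names and Index Cond lines above; the actual DDL in the migrations may differ:

-- Assumed index definitions, reconstructed from the plan above; the real
-- schema may define them differently.
CREATE INDEX ix_runtime_blocks_timestamp
    ON chain.runtime_blocks (runtime, timestamp);
CREATE INDEX ix_runtime_related_transactions_round_index
    ON chain.runtime_related_transactions (runtime, tx_round);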
Discussion:

@Andrew7234: The query itself is using the indexes as expected(?), but it's just slow because emerald was quite active in 2022. Each query returns ~4600 accounts.

The stats are making progress, but slowly, since we recalculate daily_active_accounts in 5-minute window increments. The batch timeout is currently 3 minutes, which allows ~5-6 queries to finish before hitting the timeout.
@mitjat: Thank you for digging and explaining! That looks pretty good to me, actually. And if 5-6 queries succeed per batch, it means we're throwing out only about 10% of compute on average (roughly 0.5 of 5 queries is in flight and discarded when the timeout hits).

Speed projection: we process ~30 minutes of chain time in every 3-minute batch (5-6 queries × 5-minute windows), so to process the spammy ~3 months (IIRC) of emerald, we'll need ~0.3 months, or about 10 days. That's acceptable IMO.

Of your options above, I like option 2 best; it seems simple enough. But for now, IDK that we should invest time even into that. Your call on whether it's a really quick fix. I'm fine to just close the ticket after this exploration of yours.
@Andrew7234: Turns out the query was slow not just for the busy period of emerald, but in general. On sapphire it was even slower, to the point where even option 2 would not have helped enough for the analyzer to catch up within a reasonable time frame. Discussed on Slack with Warren; it looks like Postgres doesn't recognize that the timestamp column is monotonic (in round), and therefore produces strange, inefficient query plans.

Warren crafted a much more efficient query (~500ms), which has been added in #622.
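For reference, one common way to work around this kind of planner blindness is to resolve the round bounds in a separate, cheap query and feed them into the aggregate as constants, so the planner can estimate the tx_round range from its own statistics instead of guessing. The sketch below is purely illustrative and is not necessarily the query added in #622:

-- Illustrative sketch only; the actual query from #622 may differ.
-- Step 1: resolve the boundary rounds with two cheap lookups that can be
-- satisfied directly from the (runtime, timestamp) index.
SELECT round FROM chain.runtime_blocks
WHERE runtime = 'emerald' AND timestamp >= '2022-03-15T02:20:00Z'
ORDER BY timestamp ASC LIMIT 1;   -- -> :min_round

SELECT round FROM chain.runtime_blocks
WHERE runtime = 'emerald' AND timestamp < '2022-03-16T02:20:00Z'
ORDER BY timestamp DESC LIMIT 1;  -- -> :max_round

-- Step 2: run the aggregate with the bounds inlined as constants, so the
-- planner sees a plain range condition on tx_round rather than a range
-- derived from aggregates inside a CTE.
SELECT COUNT(DISTINCT account_address)
FROM chain.runtime_related_transactions
WHERE runtime = 'emerald'
  AND tx_round >= :min_round
  AND tx_round <= :max_round;

The split matters because Postgres cannot estimate a range whose bounds come from an aggregate inside a CTE (it falls back to a default selectivity, hence the ~18M-row estimate vs. ~1.45M actual rows in the plan above); with literal bounds it can use the tx_round statistics directly.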