Skip to content

Commit

Permalink
[#26205] YSQL: Fix REFRESH MV CONCURRENTLY to use 1 XID per statement…
Browse files Browse the repository at this point in the history
…/txn

Summary:
**The problem**
`REFRESH MATERIALIZED VIEW CONCURRENTLY` generates 1 Postgres Transaction ID (XID) per row in the refreshed view.
This can lead to an exhaustion of XIDs in YugabyteDB when the view contains a large number of tables, and is refreshed frequently.
This exhaustion happens because committed XIDs are not vacuumed in order to reclaim XIDs.
To delay the exhaustion of XIDs, it is beneficial to generate 1 XID per concurrent refresh.

**Current State**
To understand why 1 XID is generated per row in case of a concurrent refresh, consider the table below which describes the current behavior.

| Operation | IsCurrentTxnWithPGRel() | Is XID set on pgproc |
| REFRESH MV [NONONCURRENTLY] | No | No |
| REFRESH MV CONCURRENTLY | No | Yes |
| INSERT/UPDATE/DELETE on TEMP TABLE | Yes | Yes |

- IsCurrentTxnWithPGRel() determines whether a commit log is written for the current transaction (and its subtransactions). When set to false (as in the case of `REFRESH MATERIALIZED VIEW`), writing a commit log is skipped. A commit log is needed by postgres whenever an XID is generated for a transaction.
- In terms of transactional handling of materialized views, the main difference between concurrent and non-concurrent refreshes is whether an XID is generated for the transaction.

Currently, `REFRESH MATERIALIZED VIEW CONCURRENTLY` generates an XID each time postgres' `heap_insert()` is invoked (once per row). It is then cleared immediately, so when the transaction is about to commit, there is no XID associated with the transaction. Therefore writing to the commit log is skipped.
`REFRESH MATERIALIZED VIEW [NONONCURRENTLY]` does not generate an XID as postgres' `heap_insert()` is not invoked. So, there is no log to be written at commit time.
In this manner, both refreshes behave identically with respect to writing to the commit log (ie. they don't).

**The fix**
 - To fix the problem mentioned above, this revision skips clearing the XID after `heap_insert()`.
 - This further requires clearing the XID from its pgproc entry at commit time. In the absence of this future statements/transactions will be incorrectly associated with this refresh which has already committed.
 - In order to run this cleanup only when `REFRESH MATERIALIZED VIEW CONCURRENTLY` (aside from temp tables) is involved, `isYBTxnWithPostgresRel` is expanded to distinguish between cases where `REFRESH MATERIALIZED VIEW` is invoked concurrently and non-concurrently, aside from the existing distinction for temp tables. This struct member is renamed to `ybPostgresOpsInTxn` in the process.

Note that a commit log entry is not written for any flavor of `REFRESH MATERIALIZED VIEW` to keep things as close as possible to current behavior.
Jira: DB-15551

Test Plan:
Run the following tests:
```
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressPgMatview'
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressMatview'
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressFeature'
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressDml#schedule'
```

Jenkins: urgent

Reviewers: fizaa, hsunder, pjain, smishra, amartsinchyk

Reviewed By: fizaa, amartsinchyk

Subscribers: amartsinchyk, mihnea, smishra, yql

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D42159
  • Loading branch information
karthik-ramanathan-3006 committed Mar 4, 2025
1 parent 0ae37fd commit 0ff1f71
Show file tree
Hide file tree
Showing 15 changed files with 879 additions and 43 deletions.
46 changes: 25 additions & 21 deletions src/postgres/src/backend/access/transam/xact.c
Original file line number Diff line number Diff line change
Expand Up @@ -218,8 +218,8 @@ typedef struct TransactionStateData
bool ybDataSentForCurrQuery; /* Whether any data has been sent to
* frontend as part of current query's
* execution */
bool isYBTxnWithPostgresRel; /* does the current transaction
* operate on a postgres table? */
uint8 ybPostgresOpsInTxn; /* An OR'ed list of operations performed on
* postgres (temp) tables by current txn. */
List *YBPostponedDdlOps; /* We postpone execution of non-revertable
* DocDB operations (e.g. drop
* table/index) until the rest of the txn
Expand Down Expand Up @@ -263,7 +263,7 @@ static TransactionStateData TopTransactionStateData = {
.topXidLogged = false,
.ybDataSent = false,
.ybDataSentForCurrQuery = false,
.isYBTxnWithPostgresRel = false,
.ybPostgresOpsInTxn = 0,
.YBPostponedDdlOps = NULL,
};

Expand Down Expand Up @@ -1384,9 +1384,18 @@ RecordTransactionCommit(void)
bool RelcacheInitFileInval = false;
bool wrote_xlog;

if (IsYugaByteEnabled() && !IsCurrentTxnWithPGRel())
if (IsYugaByteEnabled() && !YbGetPgOpsInCurrentTxn())
return InvalidTransactionId;
/* TODO(kramanathan): The bitwise flags returned by YbGetPgOpsInCurrentTxn()
* are not independent of each other. The flag YB_TXN_USES_REFRESH_MAT_VIEW_CONCURRENTLY
* only requires a subset of YB_TXN_USES_TEMPORARY_RELATIONS's operations at
* commit. Logical equality is used intentionally here to exit early.
*/
else if (IsYugaByteEnabled() &&
YbGetPgOpsInCurrentTxn() == YB_TXN_USES_REFRESH_MAT_VIEW_CONCURRENTLY)
{
return latestXid;
nchildren = xactGetCommittedChildren(&children);
return TransactionIdLatest(xid, nchildren, children);
}

/*
Expand Down Expand Up @@ -2086,7 +2095,7 @@ static void
YBStartTransaction(TransactionState s)
{
elog(DEBUG2, "YBStartTransaction");
s->isYBTxnWithPostgresRel = !IsYugaByteEnabled();
s->ybPostgresOpsInTxn = 0;
s->ybDataSent = false;
s->ybDataSentForCurrQuery = false;
s->YBPostponedDdlOps = NULL;
Expand Down Expand Up @@ -3292,25 +3301,27 @@ RestoreTransactionCharacteristics(const SavedTransactionCharacteristics *s)
}

void
SetTxnWithPGRel(void)
YbSetTxnWithPgOps(uint8 pg_op_type)
{
TransactionState s = CurrentTransactionState;

/*
* YB doesn't support subtransactions for now and only top level transaction is committed.
* So the isYBTxnWithPostgresRel flag must be set on current and all top level transactions.
* TODO(kramanathan): This flag needs to be rolled back appropriately when
* rolling back a sub-transaction. Currently, the flag(s) being set is
* persisted until commit/abort, even if the transaction at commit does not
* end up performing the operation whose flag is being set.
*/
while (s != NULL && !s->isYBTxnWithPostgresRel)
while (s != NULL)
{
s->isYBTxnWithPostgresRel = true;
s->ybPostgresOpsInTxn |= pg_op_type;
s = s->parent;
}
}

bool
IsCurrentTxnWithPGRel(void)
uint8
YbGetPgOpsInCurrentTxn(void)
{
return CurrentTransactionState->isYBTxnWithPostgresRel;
return CurrentTransactionState->ybPostgresOpsInTxn;
}

/*
Expand Down Expand Up @@ -6711,13 +6722,6 @@ YBClearDdlHandles()
CurrentTransactionState->YBPostponedDdlOps = NULL;
}

void
YbClearCurrentTransactionId()
{
CurrentTransactionState->fullTransactionId = InvalidFullTransactionId;
MyProc->xid = InvalidTransactionId;
}

/*
* YbClearParallelContexts
* Clean up parallel contexts as a part of transparent query restart.
Expand Down
2 changes: 1 addition & 1 deletion src/postgres/src/backend/commands/copy.c
Original file line number Diff line number Diff line change
Expand Up @@ -130,7 +130,7 @@ DoCopy(ParseState *pstate, const CopyStmt *stmt,

if (rel->rd_rel->relpersistence == RELPERSISTENCE_TEMP &&
IsYugaByteEnabled())
SetTxnWithPGRel();
YbSetTxnWithPgOps(YB_TXN_USES_TEMPORARY_RELATIONS);

relid = RelationGetRelid(rel);

Expand Down
2 changes: 1 addition & 1 deletion src/postgres/src/backend/commands/createas.c
Original file line number Diff line number Diff line change
Expand Up @@ -602,7 +602,7 @@ intorel_receive(TupleTableSlot *slot, DestReceiver *self)
*/
if (myState->rel->rd_rel->relpersistence == RELPERSISTENCE_TEMP &&
IsYugaByteEnabled())
SetTxnWithPGRel();
YbSetTxnWithPgOps(YB_TXN_USES_TEMPORARY_RELATIONS);

if (IsYBRelation(myState->rel))
{
Expand Down
8 changes: 0 additions & 8 deletions src/postgres/src/backend/commands/matview.c
Original file line number Diff line number Diff line change
Expand Up @@ -533,14 +533,6 @@ transientrel_receive(TupleTableSlot *slot, DestReceiver *self)
myState->output_cid,
myState->ti_options,
myState->bistate);

/*
* In this case when the transient relation is a temporary relation, heap_insert
* will set a transaction id. However, we must clear this transaction id, since
* REFRESH matview should not invoke any TxnWithPGRel code paths
* (that are used for temp tables).
*/
YbClearCurrentTransactionId();
}

/* We know this is a newly created relation, so there are no indexes */
Expand Down
2 changes: 1 addition & 1 deletion src/postgres/src/backend/commands/prepare.c
Original file line number Diff line number Diff line change
Expand Up @@ -191,7 +191,7 @@ ExecuteQuery(ParseState *pstate,
*/
if (entry->plansource->usesPostgresRel)
{
SetTxnWithPGRel();
YbSetTxnWithPgOps(YB_TXN_USES_TEMPORARY_RELATIONS);
}

/* Evaluate parameters, if any */
Expand Down
2 changes: 1 addition & 1 deletion src/postgres/src/backend/commands/tablecmds.c
Original file line number Diff line number Diff line change
Expand Up @@ -5983,7 +5983,7 @@ ATRewriteTables(AlterTableStmt *parsetree, List **wqueue, LOCKMODE lockmode,
* determine what rows are visible to a specific transaction.
*/
if (!IsYBRelationById(tab->relid) && tab->rewrite > 0)
SetTxnWithPGRel();
YbSetTxnWithPgOps(YB_TXN_USES_TEMPORARY_RELATIONS);

/*
* If we change column data types, the operation has to be propagated
Expand Down
2 changes: 1 addition & 1 deletion src/postgres/src/backend/executor/spi.c
Original file line number Diff line number Diff line change
Expand Up @@ -2485,7 +2485,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
*/
if (plansource->usesPostgresRel)
{
SetTxnWithPGRel();
YbSetTxnWithPgOps(YB_TXN_USES_TEMPORARY_RELATIONS);
}

spicallbackarg.query = plansource->query_string;
Expand Down
6 changes: 3 additions & 3 deletions src/postgres/src/backend/parser/analyze.c
Original file line number Diff line number Diff line change
Expand Up @@ -130,7 +130,7 @@ parse_analyze_fixedparams(RawStmt *parseTree, const char *sourceText,
pstate->p_target_relation->rd_rel->relpersistence == RELPERSISTENCE_TEMP
&& IsYugaByteEnabled())
{
SetTxnWithPGRel();
YbSetTxnWithPgOps(YB_TXN_USES_TEMPORARY_RELATIONS);
}

if (IsQueryIdEnabled())
Expand Down Expand Up @@ -176,7 +176,7 @@ parse_analyze_varparams(RawStmt *parseTree, const char *sourceText,
pstate->p_target_relation->rd_rel->relpersistence == RELPERSISTENCE_TEMP
&& IsYugaByteEnabled())
{
SetTxnWithPGRel();
YbSetTxnWithPgOps(YB_TXN_USES_TEMPORARY_RELATIONS);
}

/* make sure all is well with parameter types */
Expand Down Expand Up @@ -223,7 +223,7 @@ parse_analyze_withcb(RawStmt *parseTree, const char *sourceText,
pstate->p_target_relation->rd_rel->relpersistence == RELPERSISTENCE_TEMP
&& IsYugaByteEnabled())
{
SetTxnWithPGRel();
YbSetTxnWithPgOps(YB_TXN_USES_TEMPORARY_RELATIONS);
}

if (IsQueryIdEnabled())
Expand Down
2 changes: 1 addition & 1 deletion src/postgres/src/backend/storage/ipc/procarray.c
Original file line number Diff line number Diff line change
Expand Up @@ -696,7 +696,7 @@ ProcArrayEndTransaction(PGPROC *proc, TransactionId latestXid)
*/
Assert(TransactionIdIsValid(proc->xid));

if (!IsCurrentTxnWithPGRel())
if (!YbGetPgOpsInCurrentTxn())
return;
/*
* If we can immediately acquire ProcArrayLock, we clear our own XID
Expand Down
2 changes: 1 addition & 1 deletion src/postgres/src/backend/utils/cache/plancache.c
Original file line number Diff line number Diff line change
Expand Up @@ -446,7 +446,7 @@ CompleteCachedPlan(CachedPlanSource *plansource,
plansource->resultDesc = PlanCacheComputeResultDesc(querytree_list);

/* If the planner txn uses a pg relation, so will the execution txn */
plansource->usesPostgresRel = IsCurrentTxnWithPGRel();
plansource->usesPostgresRel = YbGetPgOpsInCurrentTxn() & YB_TXN_USES_TEMPORARY_RELATIONS;

MemoryContextSwitchTo(oldcxt);

Expand Down
8 changes: 8 additions & 0 deletions src/postgres/src/backend/utils/misc/pg_yb_utils.c
Original file line number Diff line number Diff line change
Expand Up @@ -3096,12 +3096,20 @@ YbGetDdlMode(PlannedStmt *pstmt, ProcessUtilityContext context)

is_breaking_change = false;
if (stmt->concurrent)
{
/*
* REFRESH MATERIALIZED VIEW CONCURRENTLY does not need
* a catalog version increment as it does not alter any
* metadata. The command only performs data changes.
*/
is_version_increment = false;
/*
* REFRESH MATERIALIZED VIEW CONCURRENTLY uses temp tables
* which generates a PostgreSQL XID. Mark the transaction
* as such, so that it can be handled at commit time.
*/
YbSetTxnWithPgOps(YB_TXN_USES_REFRESH_MAT_VIEW_CONCURRENTLY);
}
else
/*
* REFRESH MATERIALIZED VIEW NONCONCURRENTLY needs a catalog
Expand Down
9 changes: 6 additions & 3 deletions src/postgres/src/include/access/xact.h
Original file line number Diff line number Diff line change
Expand Up @@ -464,8 +464,8 @@ extern void YBInitializeTransaction(void);
extern void YBResetTransactionReadPoint(void);
extern void YBRestartReadPoint(void);
extern void YBCRestartWriteTransaction(void);
extern void SetTxnWithPGRel(void);
extern bool IsCurrentTxnWithPGRel(void);
extern void YbSetTxnWithPgOps(uint8 pg_op_type);
extern uint8 YbGetPgOpsInCurrentTxn(void);
extern void StartTransactionCommand(void);
extern void SaveTransactionCharacteristics(SavedTransactionCharacteristics *s);
extern void RestoreTransactionCharacteristics(const SavedTransactionCharacteristics *s);
Expand Down Expand Up @@ -568,6 +568,9 @@ extern void YBClearDdlHandles(void);
/*
* Utility for clearing transaction ID.
*/
extern void YbClearCurrentTransactionId(void);
extern void YbClearParallelContexts(void);

#define YB_TXN_USES_REFRESH_MAT_VIEW_CONCURRENTLY 0x0001
#define YB_TXN_USES_TEMPORARY_RELATIONS 0x0002

#endif /* XACT_H */
129 changes: 129 additions & 0 deletions src/postgres/src/test/regress/expected/yb.orig.matview.out
Original file line number Diff line number Diff line change
@@ -1,6 +1,74 @@
--
-- YB tests for materialized views
--
CREATE TABLE t_simple (k INT PRIMARY KEY);
INSERT INTO t_simple (SELECT i FROM generate_series(1, 10) AS i);
-- GH-26205: Test that PostgreSQL XID increases by 1 for a concurrent refresh
-- and remains the same for a non-concurrent refresh. Placing this at the top
-- of the file to ensure that this is close to being the first set of commands
-- that are executed upon cluster creation and there is no variability in the
-- XIDs generated.
CREATE MATERIALIZED VIEW mv_xid_test AS (SELECT * FROM t_simple);
CREATE UNIQUE INDEX ON mv_xid_test (k);
SELECT age('3'::xid);
age
-----
0
(1 row)

REFRESH MATERIALIZED VIEW CONCURRENTLY mv_xid_test;
SELECT age('3'::xid);
age
-----
1
(1 row)

REFRESH MATERIALIZED VIEW mv_xid_test;
SELECT age('3'::xid);
age
-----
1
(1 row)

-- A few variations of the above test.
BEGIN ISOLATION LEVEL REPEATABLE READ;
REFRESH MATERIALIZED VIEW mv_xid_test;
INSERT INTO t_simple VALUES (11), (12);
REFRESH MATERIALIZED VIEW CONCURRENTLY mv_xid_test;
SELECT COUNT(*) FROM mv_xid_test;
count
-------
10
(1 row)

COMMIT;
SELECT age('3'::xid);
age
-----
2
(1 row)

BEGIN ISOLATION LEVEL READ COMMITTED;
REFRESH MATERIALIZED VIEW CONCURRENTLY mv_xid_test;
REFRESH MATERIALIZED VIEW mv_xid_test;
INSERT INTO t_simple VALUES (13), (14);
SELECT COUNT(*) FROM mv_xid_test;
count
-------
12
(1 row)

COMMIT;
-- The XID should jump by 2 in Read Committed mode as the concurrent refresh of
-- the matview is run as a sub-transaction. So, an additional XID is assigned to
-- its parent. In release builds where Read Committed isolation is disabled, the
-- XID should increase by 1.
SELECT age('3'::xid);
age
-----
4
(1 row)

CREATE TABLE test_yb (col int);
INSERT INTO test_yb VALUES (null);
CREATE UNLOGGED MATERIALIZED VIEW unlogged_mv_yb AS SELECT * FROM test_yb; -- not supported
Expand Down Expand Up @@ -492,3 +560,64 @@ DETAIL: Data movement is a long running asynchronous process and can be monitor
t | integer | | |
j | integer | | |
Tablespace: "mv_tblspace2"

-- Matview with temp tables + transactions
CREATE TEMPORARY TABLE t_temp (a INT, b INT);
BEGIN;
INSERT INTO t_temp VALUES (1, 1), (2, 2);
REFRESH MATERIALIZED VIEW CONCURRENTLY mv_xid_test;
SELECT COUNT(*) FROM t_temp;
count
-------
2
(1 row)

COMMIT;
SELECT COUNT(*) FROM t_temp;
count
-------
2
(1 row)

-- Rolling back operations on temp tables should also roll back any
-- transactional side effects of the corresponding postgres transaction.
BEGIN;
SAVEPOINT sp1;
INSERT INTO t_temp VALUES (3, 3), (4, 4);
SAVEPOINT sp2;
ROLLBACK TO sp1;
REFRESH MATERIALIZED VIEW CONCURRENTLY mv_xid_test;
COMMIT;
SELECT COUNT(*) FROM t_temp;
count
-------
2
(1 row)

BEGIN;
SAVEPOINT sp1;
REFRESH MATERIALIZED VIEW mv_xid_test;
SAVEPOINT sp2;
INSERT INTO t_temp VALUES (3, 3), (4, 4);
ROLLBACK TO sp2;
COMMIT;
SELECT COUNT(*) FROM t_temp;
count
-------
2
(1 row)

-- Matview with prepared statements + temp tables
PREPARE temp_stmt AS INSERT INTO t_temp VALUES (5, 5), (6, 6);
BEGIN;
REFRESH MATERIALIZED VIEW CONCURRENTLY mv_xid_test;
SAVEPOINT sp1;
EXECUTE temp_stmt;
ROLLBACK TO sp1;
EXECUTE temp_stmt;
COMMIT;
SELECT COUNT(*) FROM t_temp;
count
-------
4
(1 row)
Loading

0 comments on commit 0ff1f71

Please sign in to comment.