Optimize journal table indices #496

Arkatufus · 2024-12-06T09:13:40Z

Fixes #495

Changes

Optimize journal table query speed by adding indices on (persistence_id) and (persistence_id, ordering)
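
For illustration, a minimal T-SQL sketch of what the two indices could look like. Only the name `IX_journal_persistence_id` appears in this thread (in the forced-index query further down); the second index name and the exact DDL options are assumptions, not the script committed in this PR.

```sql
-- Sketch only: the PR's actual table initialization script may differ.
-- Single-column index on persistence_id (name taken from the forced-index query in this thread).
CREATE NONCLUSTERED INDEX [IX_journal_persistence_id]
  ON [journal] ([persistence_id]);

-- Composite index on (persistence_id, ordering); this index name is hypothetical.
CREATE NONCLUSTERED INDEX [IX_journal_persistence_id_ordering]
  ON [journal] ([persistence_id], [ordering]);
```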

Arkatufus · 2024-12-09T18:59:33Z

BenchmarkDotNet v0.14.0, Windows 10 (10.0.19045.5131/22H2/2022Update)
AMD Ryzen 9 3900X, 1 CPU, 24 logical and 12 physical cores
.NET SDK 8.0.111
  [Host]     : .NET 8.0.11 (8.0.1124.51707), X64 RyuJIT AVX2
  DefaultJob : .NET 8.0.11 (8.0.1124.51707), X64 RyuJIT AVX2

`dev` branch tag table - csv benchmark

Method	TagMode	Mean	Error	StdDev	Gen0	Gen1	Gen2	Allocated
QueryByTag10	Csv	2,422.617 ms	48.3860 ms	59.4224 ms	-	-	-	526.11 KB
QueryByTag100	Csv	2,386.379 ms	28.3730 ms	25.1520 ms	-	-	-	1355.14 KB
QueryByTag1000	Csv	2,399.557 ms	45.5216 ms	42.5809 ms	1000.0000	-	-	10571.98 KB
QueryByTag10000	Csv	2,514.846 ms	34.4976 ms	30.5812 ms	12000.0000	3000.0000	-	101477.38 KB
QueryByTag10	TagTable	4.303 ms	0.1250 ms	0.3647 ms	39.0625	-	-	347.56 KB
QueryByTag100	TagTable	5.713 ms	0.1133 ms	0.2558 ms	148.4375	31.2500	-	1240.29 KB
QueryByTag1000	TagTable	23.715 ms	0.4728 ms	0.9332 ms	1281.2500	468.7500	-	10461.22 KB
QueryByTag10000	TagTable	209.548 ms	4.1742 ms	11.4268 ms	12500.0000	3000.0000	500.0000	101584.23 KB

This PR benchmark

Method	TagMode	Mean	Error	StdDev	Gen0	Gen1	Gen2	Allocated
QueryByTag10	Csv	2,412.107 ms	44.5888 ms	41.7083 ms	-	-	-	527.56 KB
QueryByTag100	Csv	2,456.264 ms	48.5973 ms	69.6968 ms	-	-	-	1355.56 KB
QueryByTag1000	Csv	2,413.896 ms	42.7739 ms	37.9180 ms	1000.0000	-	-	10510.35 KB
QueryByTag10000	Csv	2,567.690 ms	50.4452 ms	51.8035 ms	12000.0000	3000.0000	-	101474.84 KB
QueryByTag10	TagTable	4.054 ms	0.0805 ms	0.1662 ms	39.0625	-	-	347.65 KB
QueryByTag100	TagTable	5.465 ms	0.1061 ms	0.2307 ms	148.4375	31.2500	-	1250.44 KB
QueryByTag1000	TagTable	23.638 ms	0.4622 ms	0.7464 ms	1281.2500	562.5000	-	10441.6 KB
QueryByTag10000	TagTable	206.246 ms	4.0650 ms	7.2256 ms	12500.0000	3000.0000	500.0000	101577.28 KB

Arkatufus · 2024-12-10T22:28:37Z

MS SQL Server dev VS new benchmark comparison

Test	dev	Optimized	Optimized VS dev
Persist	2,068.46	2,043.64	-1.20%
PersistAsync	239,005.74	216,919.74	-9.24%
PersistAll	58,146.30	54,936.00	-5.52%
PersistAllAsync	239,291.70	210,748.16	-11.93%
PersistGroup10	10,799.25	10,547.86	-2.33%
PersistGroup25	23,002.78	20,963.04	-8.87%
PersistGroup50	38,626.44	33,891.41	-12.26%
PersistGroup100	51,856.46	52,222.05	0.70%
PersistGroup200	41,349.65	36,129.78	-12.62%
Recovering	67,114.09	64,267.35	-4.24%
RecoveringTwo	43,224.55	42,140.75	-2.51%
RecoveringFour	50,352.47	50,511.43	0.32%
Recovering8	56,927.35	55,031.99	-3.33%
Recovering 100	63,678.04	64,110.37	0.68%
Recovering 500	64,248.77	64,826.29	0.90%
Recovering 1000	64,854.17	64,861.44	0.01%
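
The thread does not spell out how the last column is derived, but the figures are consistent with a plain relative change against `dev`. Worked through for the Persist row:

```latex
\[
\text{Optimized VS dev} = \frac{\text{Optimized} - \text{dev}}{\text{dev}} \times 100\%,
\qquad
\frac{2043.64 - 2068.46}{2068.46} \times 100\% \approx -1.20\%
\]
```

The unit of the raw numbers is not stated here. Read alongside the later comment about write speed, a negative value on a Persist row appears to mean a slowdown.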

Arkatufus · 2024-12-10T22:29:37Z

PostgreSQL dev VS new benchmark comparison

Test	dev	Optimized	Optimized VS dev
Persist	3,895.90	3,782.61	-2.91%
PersistAsync	124,906.32	123,609.39	-1.04%
PersistAll	129,466.60	117,439.81	-9.29%
PersistAllAsync	125,786.16	124,626.12	-0.92%
PersistGroup10	17,884.93	18,042.07	0.88%
PersistGroup25	36,952.18	33,708.62	-8.78%
PersistGroup50	56,593.10	50,735.67	-10.35%
PersistGroup100	92,293.49	86,482.75	-6.30%
PersistGroup200	70,487.07	59,269.80	-15.91%
Recovering	105,485.23	106,837.61	1.28%
RecoveringTwo	42,817.38	43,020.00	0.47%
RecoveringFour	51,314.95	51,092.09	-0.43%
Recovering8	57,016.61	57,240.98	0.39%
Recovering 100	63,678.04	63,829.65	0.24%
Recovering 500	64,248.77	64,380.64	0.21%
Recovering 1000	64,301.36	64,451.55	0.23%

Arkatufus · 2024-12-10T22:32:10Z

To be honest, the numbers are not good. It gives a fraction of query/read speed at the cost of write speed.

Aaronontheweb · 2024-12-18T14:04:15Z

> To be honest, the numbers are not good. It gives a fraction of query/read speed at the cost of write speed.

Yes, but these benchmarks aren't testing what happens with a large, pre-existing data set sitting inside the journal and tag tables, which is how 99% of successful Akka.NET applications run after a modest amount of time, so the measurements aren't realistic here.
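
One way to approximate that situation is to pre-fill the journal before running the benchmarks. The following is a rough T-SQL sketch, not something taken from this PR: the column list comes from the queries in this thread, but the column types, the identity behaviour of `ordering`, the row counts, and the `SeedPid` naming are all assumptions. It does not populate the tags table.

```sql
-- Sketch only. Column types and nullability are assumptions inferred from the
-- queries in this thread; [ordering] is assumed to be an identity column and is omitted.
DECLARE @actors Int = 1000;          -- hypothetical number of persistence ids
DECLARE @eventsPerActor Int = 1000;  -- hypothetical events per persistence id

WITH [n] AS (
  -- Generate @actors * @eventsPerActor consecutive integers starting at 0.
  SELECT TOP (@actors * @eventsPerActor)
    ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1 AS [i]
  FROM sys.all_objects [a] CROSS JOIN sys.all_objects [b]
)
INSERT INTO [journal]
  ([created], [deleted], [persistence_id], [sequence_number],
   [message], [manifest], [identifier], [writer_uuid])
SELECT
  DATEDIFF_BIG(MILLISECOND, '1970-01-01', SYSUTCDATETIME()), -- assumes a BigInt timestamp column
  0,                                                          -- not deleted
  CONCAT(N'SeedPid', [n].[i] / @eventsPerActor),              -- one id per block of events
  [n].[i] % @eventsPerActor + 1,                              -- 1 .. @eventsPerActor per id
  0x00,                                                       -- placeholder payload
  N'',                                                        -- empty manifest
  0,                                                          -- hypothetical serializer id
  CAST(NEWID() AS NVarChar(36))
FROM [n];
```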

Aaronontheweb · 2024-12-18T16:05:39Z

Relevant issue: akkadotnet/akka.net#5503

Arkatufus · 2024-12-19T17:34:40Z

Execution Plan Benchmarking

We ran all of the SQL queries generated by LinqToDb against SQL Server 2022 running in Docker and observed the actual execution plan for each. These were the findings:

Observations

Actor recovery SQL query

/* Actor recovery */
DECLARE @take Int -- Int32
SET     @take = 1000
DECLARE @persistenceId NVarChar(255) -- String
SET     @persistenceId = N'PersistPid1'
DECLARE @fromSequenceNr BigInt -- Int64
SET     @fromSequenceNr = 1
DECLARE @toSequenceNr BigInt -- Int64
SET     @toSequenceNr = 1000

SELECT TOP (@take)
  [r].[ordering],
  [r].[created],
  [r].[deleted],
  [r].[persistence_id],
  [r].[sequence_number],
  [r].[message],
  [r].[manifest],
  [r].[identifier],
  [r].[writer_uuid]
FROM
  [journal] [r]
WHERE
  [r].[persistence_id] = @persistenceId AND
  [r].[sequence_number] >= @fromSequenceNr AND
  [r].[sequence_number] <= @toSequenceNr AND
  [r].[deleted] = 0
ORDER BY
  [r].[sequence_number]

GetHighestOrderingNr SQL Query

/* Highest ordering */
SELECT
  Max([r].[ordering])
FROM
  [journal] [r]

CurrentPersistenceIds SQL Query

/* CurrentPersistenceIds query */
DECLARE @take Int -- Int32
SET     @take = 2147483647
SELECT DISTINCT TOP (@take)
  [r].[persistence_id]
FROM
  [journal] [r]
WHERE
  [r].[deleted] = 0

CurrentPersistenceIds SQL Query With Forced Index

/* CurrentPersistenceIds query */
DECLARE @take Int -- Int32
SET     @take = 2147483647
SELECT DISTINCT TOP (@take)
  [r].[persistence_id]
FROM
  [journal] [r]
WITH(INDEX(IX_journal_persistence_id)) -- Force query to use the new index
WHERE
  [r].[deleted] = 0

CurrentEventsByTag SQL Query

DECLARE @take Int -- Int32
SET     @take = 500
DECLARE @Offset BigInt -- Int64
SET     @Offset = 0
DECLARE @MaxOffset BigInt -- Int64
SET     @MaxOffset = 0
DECLARE @Tag NVarChar(64) -- String
SET     @Tag = N'Tag1'
SELECT TOP (@take)
  [x].[ordering],
  [x].[created],
  [x].[deleted],
  [x].[persistence_id],
  [x].[sequence_number],
  [x].[message],
  [x].[manifest],
  [x].[identifier],
  [x].[writer_uuid],
    (
      SELECT STRING_AGG([r].[tag], N';')
      FROM [tags] [r]
      WHERE [r].[ordering_id] = [x].[ordering]
    )
FROM
  [journal] [x]
    LEFT JOIN [tags] [jtr] ON [jtr].[ordering_id] = [x].[ordering]
WHERE
  [jtr].[ordering_id] > @Offset AND
  [jtr].[ordering_id] <= @MaxOffset AND
  [x].[deleted] = 0 AND
  [jtr].[tag] = @Tag
ORDER BY
  [x].[ordering]

Findings

None of the generated SQL statements used the new indices, and forcing the queries to use them actually hurts performance (execution time goes from 5 seconds to 1 minute 25 seconds).
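
For anyone who wants to repeat this kind of check, here is a sketch using standard SQL Server diagnostics. These are not necessarily the exact steps used for the findings above; the sample query is a simplified form of the CurrentPersistenceIds query.

```sql
-- Sketch only: standard SQL Server diagnostics, not the exact procedure used in this thread.
SET STATISTICS IO ON;    -- report logical/physical reads per table
SET STATISTICS TIME ON;  -- report parse/compile and execution times
SET STATISTICS XML ON;   -- return the actual execution plan as XML alongside the results

-- Query under test (simplified CurrentPersistenceIds).
SELECT DISTINCT [r].[persistence_id]
FROM [journal] [r]
WHERE [r].[deleted] = 0;

SET STATISTICS XML OFF;
SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;

-- Afterwards, check whether the optimizer ever read from each index on the journal table.
SELECT i.[name], s.[user_seeks], s.[user_scans], s.[user_lookups], s.[user_updates]
FROM sys.indexes i
LEFT JOIN sys.dm_db_index_usage_stats s
  ON s.[object_id] = i.[object_id]
 AND s.[index_id] = i.[index_id]
 AND s.[database_id] = DB_ID()
WHERE i.[object_id] = OBJECT_ID(N'journal');
```

An index whose `user_seeks`, `user_scans`, and `user_lookups` stay at zero (or NULL) while `user_updates` grows is being maintained on every write without serving any reads, which matches the trade-off described in this thread.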

Conclusion

The new indices do not help with actor recovery or persistence query performance, so we're dropping this PR and closing the issue.

Arkatufus added 3 commits December 6, 2024 01:25

Convert footer generator to raw string interpolation

5e98b41

Implement indexes for table initialization

d43f7b3

Fix SQL script

859092b

Arkatufus marked this pull request as ready for review December 10, 2024 22:30

Arkatufus closed this Dec 19, 2024

Arkatufus mentioned this pull request Dec 19, 2024

Missing indices on tables #495

Closed
