Math: IIR DF1: Optimizations for HiFi4 and HiFi5 #9684

Merged 3 commits on Dec 3, 2024

Conversation

singalsu
Collaborator

No description provided.

This patch adds iir_df1_hifi4.c, a modified version of
iir_df1_hifi3.c. The IIR calculation uses the 32x32 dual MAC.
The update of the IIR delay lines is improved by using delay
shift, round, and pack instructions.
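
For orientation, the calculation being optimized can be sketched in portable C. This is a minimal floating-point reference, assuming the filter is a cascade of second order sections in direct form I; the struct layout and the name iir_df1_ref() are hypothetical and are not the fixed-point SOF code or its data format.

```c
#include <assert.h>
#include <stddef.h>

/* One second order section. Hypothetical layout, for illustration only. */
struct biquad {
	double b0, b1, b2;	/* feed-forward coefficients */
	double a1, a2;		/* feedback coefficients, a0 normalized to 1 */
	double x1, x2;		/* delayed inputs x[n-1], x[n-2] */
	double y1, y2;		/* delayed outputs y[n-1], y[n-2] */
};

/* Process one sample through nseries cascaded direct form I sections:
 * y[n] = b0*x[n] + b1*x[n-1] + b2*x[n-2] - a1*y[n-1] - a2*y[n-2]
 */
static double iir_df1_ref(struct biquad *s, size_t nseries, double x)
{
	double in = x;
	double out = x;
	size_t i;

	assert(s || nseries == 0);
	for (i = 0; i < nseries; i++) {
		out = s[i].b0 * in + s[i].b1 * s[i].x1 + s[i].b2 * s[i].x2
			- s[i].a1 * s[i].y1 - s[i].a2 * s[i].y2;
		/* Shift the delay lines of this section */
		s[i].x2 = s[i].x1;
		s[i].x1 = in;
		s[i].y2 = s[i].y1;
		s[i].y1 = out;
		/* The output of this section feeds the next one */
		in = out;
	}
	return out;
}
```

Each section needs five multiply-accumulates per sample, which is why a dual MAC and a faster delay line update translate directly into fewer cycles.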

The iir->delay address must be aligned to 64 bits / 8 bytes
because the code uses the fastest 64-bit load/store, which does
not handle unaligned addresses.
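
The alignment requirement can be checked with a small helper. The sketch below is plain C11 and assumes nothing about the SOF allocator; is_aligned(), example_delay, and get_checked_delay() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of SOF: true if ptr is a multiple of
 * align bytes. align must be a power of two.
 */
static bool is_aligned(const void *ptr, size_t align)
{
	return ((uintptr_t)ptr & (align - 1)) == 0;
}

/* Illustrative delay line storage, forced to 8-byte alignment (C11). */
static _Alignas(8) int32_t example_delay[8];

/* Return the delay pointer after verifying the 64-bit alignment that the
 * commit message requires for iir->delay.
 */
static int32_t *get_checked_delay(void)
{
	assert(is_aligned(example_delay, 8));
	return example_delay;
}
```

The same helper can be called with 16 to check a 128-bit alignment requirement.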

In a sof-testbench4 run for the MTL build
(scripts/sof-testbench-helper.sh -x -m eqiir), the updated version
saves 0.8 MCPS, from 10.6 to 9.8 MCPS, for a 10th order filter.

On a real MTL device with a 2nd order high-pass filter, the saving
is 0.4 MCPS, from 7.8 to 7.4 MCPS.

Signed-off-by: Seppo Ingalsuo <[email protected]>
This patch adds iir_df1_hifi5.c, a modified version of
iir_df1_hifi4.c. The coefficients and data are loaded 128 bits at
a time when possible. The data load is the fastest non-aligning
type, so the iir->delay address needs to be aligned to
128 bits / 16 bytes.

In a sof-testbench4 run, the updated version saves 2.1 MCPS, from
10.4 to 8.3 MCPS, for the 10th order filter used. The test run
command for the HiFi5 build of sof-testbench4 was
"scripts/sof-testbench-helper.sh -x -m eqiir".

Signed-off-by: Seppo Ingalsuo <[email protected]>
Setting the alignment register with AE_ZALIGN64() is needed only
for aligning writes, not reads. The IIR coefficients are only read
in function iir_df1(), so the initialization is not needed.

Ref: HiFi3 DSP User's Guide, page 35, aligning stores.

Signed-off-by: Seppo Ingalsuo <[email protected]>
@singalsu singalsu force-pushed the iir_df1_hifi5_version branch from 7b3f824 to e2bf878 Compare November 27, 2024 15:11
in = x;
for (i = 0; i < nseries; i++) {
/* Load data */
AE_LA32X2_IP(delay_y2y1, data_r_align, delay_r);
Collaborator Author

@singalsu singalsu Nov 27, 2024

Actually, I just noticed that the HiFi3 version required 64-bit aligned data while this version doesn't. I changed this to aligned load/store too, since I wasn't sure about 128-bit alignment for the HiFi5 version. So this might be too cautious.

But I think the largest saving for IIR can be achieved with a new stereo data function that uses common coefficients for L and R, which is the most common use case today. E.g. the EQ component would check for identical coefficients and a stereo channel count, and then select a different processing core.
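
The selection check proposed here could look roughly like the sketch below. It is only an illustration of the idea: can_use_stereo_path() and its parameters are hypothetical, and the flat int32_t coefficient arrays are an assumption rather than the EQ component's actual data layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical check: a shared-coefficient stereo path is usable only for
 * two channels whose coefficient sets are bit-identical.
 */
static bool can_use_stereo_path(int channels, const int32_t *coef_l,
				const int32_t *coef_r, size_t num_coefs)
{
	assert(coef_l && coef_r);
	if (channels != 2)
		return false;

	return memcmp(coef_l, coef_r, num_coefs * sizeof(int32_t)) == 0;
}
```

A caller would run this once at configuration time and fall back to the per-channel function whenever it returns false.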

Member

@lgirdwood lgirdwood left a comment


LGTM - can the hifi4 parts be tested in CI today, or do we need a Kconfig/topology update to test?

@singalsu
Collaborator Author

> LGTM - can the hifi4 parts be tested in CI today, or do we need a Kconfig/topology update to test?

There's good coverage for hifi4 IIR in CI, at least against gross failures such as a firmware crash or bad audio quality. It's part of many alsabat sine quality checks and of the testbench run with a chirp. I've tested this myself with process_test('eqiir', 32, 32, 48000, 1, 1, 'xt-run') for both hifi4 and hifi5. All the objective quality measurements look similar to before, though the change is not bit-exact.

The MAC operation that I now use has asymmetric rounding, since the symmetric rounding from the previous version isn't available for it. My feeling is that both are equally good, with slightly different pros and cons (minor linearity and offset differences).
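
The difference between the two rounding modes shows up only at exact ties. The sketch below is portable C, not HiFi intrinsics, and assumes the common definitions: asymmetric rounding adds half an LSB and floors, so ties go toward plus infinity, while symmetric rounding sends ties away from zero. All function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Floor division by 2^shift without relying on >> of negative values. */
static int64_t floor_shift(int64_t v, int shift)
{
	int64_t d = (int64_t)1 << shift;
	int64_t q = v / d;	/* C division truncates toward zero */

	if (v % d < 0)
		q--;

	return q;
}

/* Asymmetric rounding: add half an LSB, then floor. Ties go toward +inf.
 * The caller is assumed to keep acc well inside the int64_t range.
 */
static int64_t round_asym(int64_t acc, int shift)
{
	assert(shift > 0 && shift < 62);
	return floor_shift(acc + ((int64_t)1 << (shift - 1)), shift);
}

/* Symmetric rounding: round the magnitude, then restore the sign.
 * Ties go away from zero.
 */
static int64_t round_sym(int64_t acc, int shift)
{
	assert(shift > 0 && shift < 62);
	if (acc >= 0)
		return round_asym(acc, shift);

	return -round_asym(-acc, shift);
}
```

With a shift of 1, the accumulator value -5 represents -2.5: asymmetric rounding gives -2 and symmetric rounding gives -3, while both agree for non-tie values. The one-sided treatment of ties is the source of the small offset in the asymmetric case.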

@kv2019i
Collaborator

kv2019i commented Dec 3, 2024

SOFCI TEST

@kv2019i
Collaborator

kv2019i commented Dec 3, 2024

The sof-docs failure and the Intel LNL failures are all known and tracked in https://github.com/thesofproject/sof/issues?q=is%3Aissue+is%3Aopen+label%3A%22Known+PR+Failures%22+

@kv2019i kv2019i merged commit 05020ba into thesofproject:main Dec 3, 2024
44 of 47 checks passed