-
I use an RPi4 and can run two instances of NAM (one STANDARD and one LITE or FEATHER) plus IR, expander, split, EQ, and delay with 64/2.
I also tested the Radxa Zero 3W with one STANDARD NAM, which used around 80% CPU but got xruns when using 64/2. I have not tried increasing to 3 or 4 yet. The board gets very hot even with a heat sink. The WiFi on it is really bad and doesn't work well with 2.4GHz in my tests. It creates a lot of rxfrag errors that hog the CPU and journald. As long as WiFi is not used it works quite well.
On a Libre La Frite I can use one FEATHER NAM and lots of additional effects at 64/2. It also does not get hot and has been very stable, and most of it works off of the mainline kernel.
PiPedal has been incredibly stable on my devices, while MODEP used a lot of CPU, which caused xruns even on an RPi4 and also an RPi5, I think.
-
Buffer size
FYI, I think 64x2 is not a good choice of buffer size. 32x4 will
definitely give you fewer underruns, with equivalent or better latency. I
did release a version of PiPedal that defaulted to 64x2
buffers, but it quickly became apparent to me that this was a very bad
choice. PiPedal currently defaults to 64x3 buffers, but I have suspected
for quite some time that it should actually be defaulting to 32x4. I'm
pretty sure this is true, even on very lightweight hardware. Unfortunately,
I've been chasing higher-priority issues for a while, so I haven't had the
luxury of being able to play with this as much as I should before making a
fairly risky change. There's even some reason to think that 16x5 or 16x6
buffer configurations might be a good idea.
I still don't have a firm understanding of how audio buffer configuration
works on Linux. There seems to be lots of lore, and not a whole lot of
detail.
My current best understanding is that buffer size primarily affects
PiPedal's software rendering; and the number of buffers affects how much
breathing room ALSA has to keep the hardware fed. And that under the
covers, ALSA is mostly chasing buffer pointers for USB (or I2S), and
doesn't really care what the buffer sizes are, just how many bytes of data
it has to play with. In point of fact, the ALSA driver doesn't even get
told what the buffer size is, just the value of SIZE x NUMBER OF BUFFERS.
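
To put numbers on SIZE x NUMBER OF BUFFERS, here is a minimal sketch that converts a few of the configurations mentioned in this thread into total frames and milliseconds. It assumes a 48 kHz sample rate, and `ring_buffer_ms` is a hypothetical helper name, not part of PiPedal or ALSA. The result is the duration of the whole output buffer, not a measured round-trip latency.

```python
def ring_buffer_ms(period_frames: int, periods: int, sample_rate: int = 48_000) -> float:
    """Duration in milliseconds of a buffer of period_frames * periods frames."""
    return 1000.0 * period_frames * periods / sample_rate


if __name__ == "__main__":
    # Configurations discussed in the thread (SIZE x NUMBER OF BUFFERS).
    for period_frames, periods in [(64, 2), (32, 4), (64, 3), (16, 6)]:
        total_frames = period_frames * periods
        duration = ring_buffer_ms(period_frames, periods)
        print(f"{period_frames}x{periods}: {total_frames} frames, {duration:.2f} ms")
```

Under this assumption 64x2 and 32x4 hold the same 128 frames, which is why the total buffering is equivalent while the amount processed per step is halved.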
There is deep lore that says that buffers should be a multiple of 48 bytes
for USB audio devices. This may have been true for USB 1.0 devices, but I
think it is no longer true for USB 2.0+ devices, which have a more efficient
way to transfer bulk data. So I sincerely believe that piece of lore should
be retired, except for exceptionally ancient hardware.
So I think... 32x4 is much better because PiPedal needs one buffer in
which to process input, which gives the OS up to 32x3=96 samples of data to
feed the hardware with. At any given time, ALSA is feeding the hardware
with some portion of that buffered data, but it can release at least two of
the three buffers back to PiPedal so that PiPedal can start filling them
again. So: 1 buffer for PiPedal, 1 buffer to feed the hardware, and 2
buffers to keep things running smoothly. In the 64x2 case, PiPedal needs
one buffer to fill, but it can't get access to the next buffer until the OS
has finished transferring the last byte of the other buffer to hardware. So:
1 buffer for PiPedal to fill, and some potentially very small lead time
between the time that the hardware transfer completes and the time that
PiPedal gets to start filling a new buffer. A disaster waiting to happen! So
for the 64x2 case, there's no spare buffer. PiPedal has to process the
entire buffer in the interval between when the hardware releases the end of
the previous buffer and when it starts requesting data for the start of the
next buffer.
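
The argument above can be turned into a deliberately simplified toy model. It assumes 48 kHz, assumes that all the other buffers are already queued for output when PiPedal starts on a buffer (a best case), and assumes processing time scales with CPU load. `stall_budget_ms` is a hypothetical name, and none of this reflects how ALSA actually schedules transfers.

```python
def stall_budget_ms(period_frames: int, periods: int, cpu_load: float,
                    sample_rate: int = 48_000) -> float:
    """How long the audio thread could be delayed before the hardware runs dry.

    Toy model: (periods - 1) buffers are queued for output when processing of
    one buffer starts, and processing takes cpu_load * one buffer's duration.
    """
    period_ms = 1000.0 * period_frames / sample_rate
    queued_ms = (periods - 1) * period_ms   # audio the hardware can still play
    processing_ms = cpu_load * period_ms    # time spent filling the next buffer
    return queued_ms - processing_ms


if __name__ == "__main__":
    for period_frames, periods in [(64, 2), (32, 4)]:
        budget = stall_budget_ms(period_frames, periods, cpu_load=0.8)
        print(f"{period_frames}x{periods} at 80% load: {budget:.2f} ms of slack")
```

In this model both configurations buffer the same 128 frames in total, but 32x4 leaves several times more slack at the same load, and 64x2 has no slack at all once the load reaches 100%.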
(Omitted for the sake of simplicity: input and output each get 32x4
buffers, so the same general argument holds for input buffers if you were
just reading input data; in actual fact, input and output transfers are
locked together, so it's not clear how many of the input buffers actually
get used).
Problems with Wi-Fi
One of the nice things about the Pi 4 and above is that WiFi and USB run on
separate buses. On older Pis, the WiFi device appears as a USB device and
shares an internal USB bus with USB audio. If your troublesome devices
have USB 2.0 AND USB 3.0 connectors, you might want to experiment with
using either the USB 2.0 or USB 3.0 ports, which may get their own
dedicated buses and controllers. On my Pi 4, I take great care to ensure
that my SSD drive goes on a USB 3.0 port and my USB audio device goes on
the USB 2.0 bus (so that it doesn't share an internal bus with the SSD).
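
One way to check which bus each device actually landed on is to read the Linux sysfs USB tree. This is a minimal sketch assuming the standard `busnum`, `speed` and `product` attribute files under `/sys/bus/usb/devices`. `list_usb_devices` is a hypothetical helper name. `lsusb -t` shows the same topology from the command line.

```python
from pathlib import Path


def list_usb_devices(sysfs_root: str = "/sys/bus/usb/devices"):
    """Return (bus number, speed in Mbit/s, product name) for each USB device."""
    devices = []
    for dev in sorted(Path(sysfs_root).glob("*")):
        busnum, speed = dev / "busnum", dev / "speed"
        if not (busnum.is_file() and speed.is_file()):
            continue  # interface entries such as 1-1:1.0 lack these files
        product = dev / "product"
        name = product.read_text().strip() if product.is_file() else "(unnamed)"
        devices.append((int(busnum.read_text()), speed.read_text().strip(), name))
    return devices


if __name__ == "__main__":
    for bus, speed, name in list_usb_devices():
        print(f"bus {bus}: {speed} Mbit/s  {name}")
```

Devices that report the same bus number sit under the same root hub, so an SSD and an audio adapter showing different bus numbers is the arrangement described above.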
(In reply to 38github's comment of Mon, Nov 25, 2024, 16:04.)
-
I use 32x4 on a Pi 5 with a hardware codec, and it is more stable than 64x2. The difference in latency is minimal. If you use 16, the sound with neural plugins starts to disappear completely. Apparently the network window can't be larger than the buffer.
-
Sorry to bring a thread back from the dead, but I'm looking into this stuff and not understanding the buffer. I use my MOTU mostly at a 48000 sample rate and a 64-sample buffer. How does that equate to the buffer configurations you guys are talking about, like 64x2 and 32x4?
-
If you are not a programmer, it will be difficult to understand how the audio buffer works. It always consists of at least two parts. While one part is continuously recording, the other part is being processed and then transferred. In Linux, unlike Windows (WASAPI) and macOS (Core Audio), you can specify the number of such periods, increasing it to 4 or more. This will increase stability and reduce the risk of xruns, but it will also increase latency. For this reason, you always try to find a compromise between speed and stability.
-
As a programmer, it's difficult to understand the operation of the audio buffers, too! :-)
Assuming you have selected 64x4 buffers for the sake of simplicity... A sample "frame" is one 32-bit floating point value for a mono signal, two 32-bit floating point values for a stereo signal, or N floating point values for an N-channel audio signal. PiPedal reads 64 sample frames at a time from the audio device input, processes all 64 of them in one go, and writes the processed buffer back to the audio adapter. So that's what the buffer size is: how many frames at a time PiPedal processes. So the "64x" part affects only PiPedal.
As far as the actual device itself is concerned, on Linux it actually has one buffer consisting of 64x4=256 sample frames. (Actually one buffer for input and one buffer for output, each of 64x4=256 sample frames.) And then the hardware chases the available data as fast as it can in ways that are not entirely well understood, to be perfectly honest.
The user interface could, in fact, allow users to specify the device's buffer size in frames (259 sample frames, for example), although there's good reason to think that some devices would perform badly if you did. Specifying it the way PiPedal does is a fairly widely used convention on both Linux and Windows audio systems. And there IS actually good reason to make sure that the audio device buffer is an integer multiple of the size of PiPedal's buffer.
Smaller buffers are generally better -- to a point. There is always a minimum of one buffer's worth of delay when processing. So smaller buffers will generally have lower audio latency, all things being equal. However, there is a certain amount of fixed system overhead for processing each buffer. The amount of overhead is not that significant when running with 64-sample buffers, but it would require (probably) about 5% extra system overhead when using 32-sample buffers, and about 35%(?) overhead when using 8-sample buffers.
And more buffers increase latency, but reduce the probability of audio under-runs, where the system can't feed the hardware fast enough. Unfortunately, there's a fair bit of hidden buffering going on in the overall system. And USB audio hardware introduces additional buffers and delays. So all of these are general rather than exact principles.
At the time PiPedal was first written, there was a fair bit of deep lore that claimed that USB audio adapters performed much better if their buffers were multiples of 48 bytes (the size of a data frame on USB 1.0). So that's why PiPedal makes sure that you can always select 3 buffers if you want to. I don't actually think that's true anymore on USB 2.0+ devices. But, for historical reasons, PiPedal uses a 64x3 buffer configuration by default.
The best value actually depends on what audio device you're using, how heavily you're loading the processor, what kinds of plugins you're using, and whether the host system is headless or not. My current thinking: 32-sample buffers are almost always better than 64-sample buffers; 16-sample buffers require too much overhead. And, in retrospect, I think there might be value in allowing buffer configurations like 32x6, which PiPedal does not currently allow.
The original posting was meant to float the idea that selecting 32x4 buffers is actually significantly better in all respects than using the default, and to serve as an exploratory poke to see if I should consider changing the default configuration. (I have not.)
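
The fixed per-buffer overhead described above can be sketched with a simple model: a constant cost per processed buffer, divided by the time one buffer represents. The 25 microsecond cost below is a made-up placeholder, not a measurement of PiPedal, and `overhead_fraction` is a hypothetical name. The model only shows why overhead grows as buffers shrink; it will not reproduce the rough percentages quoted in the comment.

```python
def overhead_fraction(period_frames: int, fixed_cost_us: float,
                      sample_rate: int = 48_000) -> float:
    """Fraction of each buffer's time budget consumed by a fixed per-buffer cost."""
    period_us = 1_000_000 * period_frames / sample_rate
    return fixed_cost_us / period_us


if __name__ == "__main__":
    HYPOTHETICAL_FIXED_COST_US = 25.0  # placeholder value for illustration only
    for period_frames in (64, 32, 16, 8):
        fraction = overhead_fraction(period_frames, HYPOTHETICAL_FIXED_COST_US)
        print(f"{period_frames}-sample buffers: {fraction:.1%} overhead")
```

Halving the buffer size doubles the overhead fraction in this model, which is consistent with the claim that 16-sample and smaller buffers get expensive quickly.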
-
Stable, no under-runs, with plugins using 85% of available CPU, 64x3 buffer configuration, MOTU M2 USB audio adapter!
I'm not sure when it happened, but all of a sudden I seem to be getting it.
I was doing some testing on changes made to audio handling. Hoping to cause ALSA audio to get upset, I added a TooB NAM plugin AND a ToobML plugin with one of the large Proteus models to the same preset. And it ran stably with NO under-runs! At 85% CPU use, with a 64x3 buffer configuration. (It also seems fine at 32x3 and 32x4.)
There is a fix in TooB NAM that prevents memory allocations in the audio thread (a bug in the underlying Neural Amp Modeler library, which was fixed and pushed back upstream to Steven Atkinson's project), which allows NAM to run with less variability in each audio frame.
There were a bunch of updates to audio subsystems and drivers in a recent Raspberry Pi OS release. I'm wondering whether those changes account for the difference. One very reasonable thing that might account for it: updated device drivers for non-audio devices that have full PREEMPT_RT patches applied.
At any rate, I am quite amazed by this. I have never seen audio work stations on any OS running stably with 85% plugin CPU use.
So, I'm asking users of PiPedal to give me some feedback on what kind of CPU use works for you while still not under-running, as well as what kinds of buffer configurations are working for you, when you're using the latest release of PiPedal with full Raspberry Pi OS updates applied.
I'm also curious about whether I should be allowing things like 16x6 or 16x8 buffer configurations. These sorts of buffer configurations do work, and at first glance, they seem to be surprisingly stable. And I do have reason to think that 32x4 might actually be more stable than 64x3, while providing even lower latency. But I haven't yet done sufficient testing to make that an actionable point. So I'm asking.
In passing, so you know: graphics operations in Bookworm have a dire effect on audio stability. Much more so than in Buster. Just moving the cursor will cause underruns on my system even with large audio buffers. The solution: run headless, or disconnect your HDMI cable. Bookworm seems to shut down the GPU if there's no display attached, which is a very good thing for PiPedal. That may not be surprising to you, but it is quite a big change from previous versions of Raspberry Pi OS. I suspect that the problem is more than one of relative priority of GPU and audio interrupts, and may have more to do with the fact that graphics operations put a heavy load on CPU L2 memory caches. The heavyweight plugins (ML, NAM, and the convolution plugins) are carefully optimised to make best use of CPU L1 and L2 memory caches. So even low-priority processes can and will cause audio under-runs if they do big flat memory operations that invalidate the caches being used by real-time audio.