
Use of double literals severely harming performance #40

Open
Jaimies opened this issue Sep 16, 2024 · 0 comments
The classes MultiStepper and AccelStepper frequently use double literals, such as 20.0, where float literals (20.0f) should be used.
One example is AccelStepper's setSpeed() method, where line 316 reads:

_stepInterval = fabs(1000000.0 / speed);

Here a double literal is divided by speed (a float), so speed is promoted to double and the division is carried out in double precision.
When the following code runs on an Arduino Uno R4 WiFi with the library unmodified, the call to setSpeed() takes 833 CPU cycles, as shown by "cycles for setspeed: 833" being printed on the serial port. The specifics can be ignored: the code simply reads the CPU cycle counter immediately before and after the setSpeed() call and prints the difference.

#include "AccelStepper.h"
#include "Serial.h"

#include <cstdint>

// Cortex-M debug registers used to access the DWT cycle counter
#define DEMCR_TRCENA    0x01000000
#define DWT_CTRL        (*((volatile uint32_t *) 0xE0001000))
#define DWT_CYCCNT      (*((volatile uint32_t *) 0xE0001004))
#define DEMCR           (*((volatile uint32_t *) 0xE000EDFC))
#define CYCCNTENA       (1 << 0)

// Enable the trace unit, zero the cycle counter, and start it counting
void stopwatch_reset() {
    DEMCR |= DEMCR_TRCENA;
    DWT_CYCCNT = 0;
    DWT_CTRL |= CYCCNTENA;
}

AccelStepper stepper(AccelStepper::DRIVER, 2, 5);

void setup() {
    stopwatch_reset();
    Serial.begin(500000);
    stepper.setMaxSpeed(500000);
    uint32_t start = DWT_CYCCNT;
    stepper.setSpeed(42000.0f);
    uint32_t end = DWT_CYCCNT;
    Serial.print("cycles for setspeed: ");
    Serial.println(end - start);
}

void loop() {}

However, after changing line 316 in the source code to the following (using a float literal), the very same call requires only 69 CPU cycles.

_stepInterval = fabs(1000000.0f / speed);

The following code (using an integer literal) does the job just as well with identical performance:

_stepInterval = fabs(1000000 / speed);
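
The difference between the three variants comes down to the type in which the division is carried out. The following is a minimal sketch of that, not code from the library: the function names are hypothetical stand-ins for the expression in setSpeed(), and it uses std::fabs from <cmath>, which may not be the exact fabs the library resolves to.

```cpp
#include <cassert>
#include <cmath>
#include <type_traits>

// Usual arithmetic conversions: the wider operand type decides the result type.
static_assert(std::is_same<decltype(1000000.0 / 1.0f), double>::value,
              "double literal: the float operand is promoted to double");
static_assert(std::is_same<decltype(1000000.0f / 1.0f), float>::value,
              "float literal: the division stays in single precision");
static_assert(std::is_same<decltype(1000000 / 1.0f), float>::value,
              "int literal: the int is converted to float");

// Hypothetical stand-ins for the three variants (not part of AccelStepper).
inline double interval_double_literal(float speed) {
    return std::fabs(1000000.0 / speed);   // double-precision division
}

inline float interval_float_literal(float speed) {
    return std::fabs(1000000.0f / speed);  // single-precision division
}

inline float interval_int_literal(float speed) {
    return std::fabs(1000000 / speed);     // single-precision division
}

// Sanity check: the float and int variants compute the same value.
inline void check_variants() {
    assert(interval_float_literal(42000.0f) == interval_int_literal(42000.0f));
    assert(interval_float_literal(-42000.0f) > 0.0f);
}
```

GCC's -Wdouble-promotion warning flags implicit float-to-double promotions of this kind, which may help locate the remaining occurrences in the library.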

I would attribute this performance difference to the fact that the Arm Cortex-M4 processor powering the Arduino Uno R4 has a floating-point unit (FPU) that supports only single precision, not double precision. Double-precision operations therefore have to be emulated in software, which is far more expensive. I have not tested whether a similar performance difference is observed on the Arduino Uno R3. The situation may be different there since it has no FPU, and the AVR toolchain it uses typically treats double as a 32-bit type, in which case there may be little or no difference.

If you need any more information, let me know.
I might work on a pull request addressing this issue at some point in the future.
