
impl trig for core::simd #6

Open
13 tasks
Lokathor opened this issue Sep 27, 2020 · 34 comments
Labels
A-simd Area: SIMD. Put this on tracking issues to help with cross-repo issue organization C-tracking-issue Ongoing issue with checkboxes for partial progress and such

Comments

@Lokathor
Contributor

  • sin
  • cos
  • tan
  • asin
  • acos
  • atan
  • atan2
  • sinh
  • cosh
  • tanh
  • asinh
  • acosh
  • atanh
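For context, a minimal sketch of the surface this checklist implies, using a plain array type with a scalar per-lane fallback. The trait name and array type here are placeholders for illustration, not the eventual core::simd API:

```rust
// Hypothetical sketch only: a scalar per-lane fallback on a plain
// array type, to show the shape of the API this checklist tracks.
// The real core::simd implementation would operate on whole vectors.
trait Trig {
    fn sin(self) -> Self;
    fn cos(self) -> Self;
    fn tan(self) -> Self;
}

impl Trig for [f32; 4] {
    fn sin(self) -> Self { self.map(f32::sin) }
    fn cos(self) -> Self { self.map(f32::cos) }
    fn tan(self) -> Self { self.map(f32::tan) }
}

fn main() {
    let v = [0.0f32, 0.5, 1.0, 1.5];
    println!("{:?}", v.sin());
}
```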
@Lokathor Lokathor added C-tracking-issue Ongoing issue with checkboxes for partial progress and such A-simd Area: SIMD. Put this on tracking issues to help with cross-repo issue organization labels Sep 27, 2020
@etemesi-ke

Hey, I could implement some of these and submit a PR. I was wondering if the appropriate way would be to use the Taylor series for the above trig functions, since I'm not sure what the Rust standard library uses.

@calebzulawski
Member

calebzulawski commented Sep 30, 2020

The Taylor series is probably not sufficient since it's only accurate near 0. This paper summarizes a variety of algorithms used for vectorized implementations of popular numeric functions, including trigonometry: https://arxiv.org/abs/2001.09258. The referenced SLEEF library is used by packed_simd, however I don't believe SLEEF itself can be used in this crate.
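To make the accuracy point concrete, a quick scalar demonstration (illustrative only): the same degree-7 Taylor polynomial for sin that is accurate to machine precision near 0 degrades by many orders of magnitude as x approaches pi.

```rust
// Demonstrates why a raw Taylor series is not enough: the same 4-term
// polynomial for sin(x) is excellent near 0 but useless near pi.
fn taylor_sin(x: f64) -> f64 {
    // x - x^3/3! + x^5/5! - x^7/7!, evaluated in Horner form
    let x2 = x * x;
    x * (1.0 + x2 * (-1.0 / 6.0 + x2 * (1.0 / 120.0 + x2 * (-1.0 / 5040.0))))
}

fn main() {
    for &x in &[0.1f64, 1.0, 3.0] {
        let err = (taylor_sin(x) - x.sin()).abs();
        println!("x = {x}: |error| = {err:e}");
    }
}
```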

@etemesi-ke

What algorithm did you have in mind then?

And would the impls look something like this?

trait trig {
    type Output;

    #[target_feature(enable = "sse")]
    unsafe fn cos(&self) -> Self::Output;
}

impl trig for _mm128 {
    type Output = _mm128;

    #[target_feature(enable = "sse")]
    unsafe fn cos(&self) -> Self::Output {
        // Code
    }
}

@etemesi-ke

Sorry for the first part, after re-reading your comment if I understand it right you want to use the algorithms in the paper used by sleef but not necessarily the sleef library. I.e rewrite them in rust?

@Lokathor
Contributor Author

Lokathor commented Oct 1, 2020

So, I don't know what the quality or speed is, but LLVM has built-in handling for vector trig same as with vector add, sub, etc.

Example from packed_simd: https://github.com/rust-lang/packed_simd/blob/master/src/codegen/math/float/cos.rs

If the LLVM versions are of sufficient quality, we should likely just use that.

@bjorn3
Member

bjorn3 commented Oct 1, 2020

There are platform-intrinsic versions for sin and cos by the way, so no need to use LLVM intrinsics directly. Curiously enough there are no versions for other trigonometry functions, but that should be easy to add.

https://github.com/rust-lang/rust/blob/41507ed0d57eba71adc20a021a19b64322162f04/compiler/rustc_codegen_llvm/src/intrinsic.rs#L1080-L1085

@etemesi-ke

Are the LLVM intrinsics useful for SIMD? From what I understand of their docs, their cos function, for example, operates on just one floating-point value.

@bjorn3
Member

bjorn3 commented Oct 1, 2020

There are LLVM intrinsics with names like llvm.cos.v2f32 that operate on vectors.

@etemesi-ke

@bjorn3 Thanks for clarifying that,

so the code might look like

#![feature(link_llvm_intrinsics)]
extern {
    #[link_name = "llvm.cos.v2f32"]
    fn cos(a: [f64; 2]) -> [f64; 2];
}

/// A vector of two floating points
struct f64x2([f64; 2]);

impl trig for f64x2 {
    fn cos(&self) -> f64x2 {
        f64x2(unsafe { cos(self.0) })
    }
}

@bjorn3
Member

bjorn3 commented Oct 1, 2020

More like

#![feature(link_llvm_intrinsics)]
extern {
    #[link_name = "llvm.cos.v2f32"]
    fn cos(a: f64x2) -> f64x2;
}

/// A vector of two floating points
#[repr(simd)]
#[derive(Copy, Clone)]
struct f64x2([f64; 2]);

impl trig for f64x2 {
    fn cos(&self) -> f64x2 {
        unsafe { cos(*self) }
    }
}

@etemesi-ke

etemesi-ke commented Oct 1, 2020

Okay, that makes sin and cos easy.

And the rest, are we going to implement them from the ground up?

@emmanuelantony2000

I think so, yes; tan and the others have to be implemented from the ground up.
Is it implemented using sin and cos, or is there another way?

@programmerjake
Member

Note that llvm.cos.v2f32 means f32x2; if you want f64x2, you need to use llvm.cos.v2f64.

@Lokathor
Contributor Author

Lokathor commented Oct 1, 2020

tan = sin / cos, so that one specifically is easy. Also, you can write sin_cos and merge 90% of the work.
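As a scalar illustration of that point: std's f64::sin_cos already computes both values with one shared range reduction, and tan falls out as the quotient.

```rust
// tan derived from a single fused sin_cos call; a vector version
// would do the same with one shared range reduction per lane.
fn tan_via_sin_cos(x: f64) -> f64 {
    let (s, c) = x.sin_cos();
    s / c
}

fn main() {
    let x = 0.7f64;
    println!("{:e}", (tan_via_sin_cos(x) - x.tan()).abs());
}
```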

We probably want to port a lot of the Agner Fog vector math stuff, https://github.com/vcoda/vectorclass (edit: wrong link, this is an older gpl3 version, see below for apache2)

Some of which is already ported in the wide crate.

@calebzulawski
Member

I think we need to be careful about "porting" libraries, both SLEEF and that library are incompatible with MIT/Apache. Additionally I think we need to be careful about our implementations, since naive approaches will likely have poor accuracy. I think our best bet is to find branchless algorithms (could be scalar) that are available under compatible licenses, and port them.

@Lokathor
Contributor Author

Lokathor commented Oct 1, 2020

oh my oh my, that's the wrong link! google search let me down.

Version 2 of the lib is here: https://github.com/vectorclass/version2, and is Apache-2.0 licensed, so we could write the Rust version and have it be Apache/MIT.

@calebzulawski
Member

Oh awesome, much better. I'm definitely okay with using this implementation.

@workingjubilee
Member

Ah, this was the issue I kept forgetting. This issue, roughly, makes a lot of this effort more complex: rust-lang/rust#50145

@fu5ha

fu5ha commented Feb 22, 2021

Myself, @Lokathor, and one of the other wide contributors already ported several of these from the Agner Fog lib linked above to Rust for use in wide; it would probably not be very hard to port those Rust impls to stdsimd. I would be willing to do this if nobody else is working on it currently.

@Lokathor
Contributor Author

A contributor would be welcomed.

The current status appears to be that most of the library can't be in core anyway for other reasons, so I think it's easiest to make this work at all first, and then we can (maybe!) make it work in core later.

@thomcc
Member

thomcc commented Feb 22, 2021

I think it's still an open question where to put it, though, since part of the motivation behind supporting these is to use dedicated instructions on the platforms that have them. That probably means using simd_sin and such...

@Lokathor
Contributor Author

I propose:

Stage 1:
  • we just code it in the library

Stage 2:
  • the library transitions to intrinsics
  • when LLVM can't emit a direct call to a platform version, it calls a vector version of libm
  • our vector version of libm is the new home of the code from stage 1

@workingjubilee
Member

workingjubilee commented Feb 22, 2021

Please do!
And that sounds like a plan, @Lokathor!

simd_sin and simd_cos on x86 seem to degenerate to scalar libm calls even with -Ctarget-feature=+avx2 enabled, unfortunately (but with AVX vmovss instructions, just to make sure you know it's in fact using AVX!), so we can't really depend on LLVM emitting useful code for those, even if LLVM nominally supports them.

@programmerjake
Member

WIP implementation:
#109
Code:
https://salsa.debian.org/Kazan-team/vector-math

@andy-thomason

If you are interested:

I usually use a Newton polynomial on the Chebyshev points, pinned at the extrema, to avoid any conditional code.

This way sin and cos, for example, just become a single Horner-form polynomial over the range [0..1) corresponding to 0..2pi.

You can use combinations of CORDIC transforms to lower the initial range, and this is necessary with the inverse trig functions and exponentials, but for most cases there is no need: just adding a few more terms to the polynomial is sufficient to get accuracy over the range and exact values at the extrema.

You can choose your Newton polynomial points at points of interest to get exact values where you like.
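A rough scalar sketch of this approach: build the divided-difference (Newton) coefficients through Chebyshev-spaced nodes, then evaluate the nested form with no branches. The node count and interval here are illustrative choices, not tuned values, and the nodes are not pinned at the extrema as described above.

```rust
// Newton-polynomial sketch: interpolate sin on [0, 2*pi] through
// Chebyshev-spaced nodes, then evaluate the nested (Horner-like) form.
fn divided_differences(xs: &[f64], ys: &[f64]) -> Vec<f64> {
    let n = xs.len();
    let mut c = ys.to_vec();
    for j in 1..n {
        for i in (j..n).rev() {
            c[i] = (c[i] - c[i - 1]) / (xs[i] - xs[i - j]);
        }
    }
    c
}

fn eval_newton(xs: &[f64], c: &[f64], x: f64) -> f64 {
    // Nested evaluation, analogous to Horner form: no branches.
    let mut acc = c[c.len() - 1];
    for i in (0..c.len() - 1).rev() {
        acc = acc * (x - xs[i]) + c[i];
    }
    acc
}

fn sin_approx(x: f64) -> f64 {
    use std::f64::consts::PI;
    let n: usize = 12;
    // Chebyshev nodes mapped onto [0, 2*pi]
    let xs: Vec<f64> = (0..n)
        .map(|i| PI * (1.0 - ((2 * i + 1) as f64 * PI / (2.0 * n as f64)).cos()))
        .collect();
    let ys: Vec<f64> = xs.iter().map(|&xi| xi.sin()).collect();
    let c = divided_differences(&xs, &ys);
    eval_newton(&xs, &c, x)
}

fn main() {
    let x = 2.5f64;
    println!("{:e}", (sin_approx(x) - x.sin()).abs());
}
```

With 12 nodes, the maximum error over the interval is on the order of 1e-6; pinning nodes at the zeros and extrema would additionally make those values exact.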

@programmerjake
Member

hmm, ok. For sin_cos_pi, I range-reduced to sin([-pi/4, pi/4]) and cos([-pi/4, pi/4]) and used a Taylor series up to x^16 for f64, x^9 for f32, and x^5 for f16. That gave me 2 ULP accuracy (tested against f64::sin/cos for f16 and f32, and against MPFR for f64). Using more terms didn't give more accuracy due to round-off error.
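A scalar sketch of that reduction scheme. The coefficients here are plain truncated Taylor terms of lower degree than described above, so accuracy is far from 2 ULP, and a SIMD version would replace the match with branchless lane selects:

```rust
// Reduce x to r in [-pi/4, pi/4] plus a quadrant index, evaluate
// short polynomials for sin(r) and cos(r), then pick/negate by
// quadrant. Large |x| would need a more careful reduction.
fn poly_sin(r: f64) -> f64 {
    let r2 = r * r;
    r * (1.0 + r2 * (-1.0 / 6.0 + r2 * (1.0 / 120.0 + r2 * (-1.0 / 5040.0))))
}

fn poly_cos(r: f64) -> f64 {
    let r2 = r * r;
    1.0 + r2 * (-0.5 + r2 * (1.0 / 24.0 + r2 * (-1.0 / 720.0)))
}

fn sin_reduced(x: f64) -> f64 {
    use std::f64::consts::FRAC_PI_2;
    let k = (x / FRAC_PI_2).round();
    let r = x - k * FRAC_PI_2; // r in [-pi/4, pi/4]
    match (k as i64).rem_euclid(4) {
        0 => poly_sin(r),
        1 => poly_cos(r),
        2 => -poly_sin(r),
        _ => -poly_cos(r),
    }
}

fn main() {
    let x = 2.0f64;
    println!("{:e}", (sin_reduced(x) - x.sin()).abs());
}
```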

@andy-thomason

I'll try to get some time to do some manual runs for you with carefully chosen Newton polynomials.

Sin and cos are very well behaved: their derivatives are well behaved and the series converges rapidly. Even Taylor (Maclaurin) series, which are about the worst case because they only approximate the zero point, will converge.

Chebyshev approximation gives you a more consistent error profile, spreading the error out over the function, whereas Remez can suffer from instability.

doctor_syn, my fledgling computer algebra system based on syn, currently generates code based on something close to Chebyshev. It uses Newton polynomials to build a polynomial that is exact in certain places.

By spreading the places where it is exact out along the valid range, you can get a much lower maximum error for fewer terms.

To get a good f64 result, you need to work with algebra of greater-than-f64 precision. This was traditionally f80 (legacy 8087), but there are a few extended-precision Rust libs out there. Rug is a popular choice, but is a huge Frankenstein's monster wrapping GMP, another huge monster. You can also use BigRational, as long as you keep approximating the numerator/denominator to avoid outrageous run times.

There may be an f128 crate out there somewhere, too.

https://en.wikipedia.org/wiki/Newton_polynomial
https://en.wikipedia.org/wiki/Approximation_theory#Chebyshev_approximation

@andy-thomason

Anyway, great work, @programmerjake! I'm quite excited!

I'll try to get to the next SIMD group meeting on Monday. I was doing a talk last night with the Cardiff Rust group and could not attend.

@programmerjake
Member

If you don't care about speed, you can use the algebraics crate (it implements real algebraic numbers), or the simple-soft-float crate, which uses algebraics (it implements IEEE 754 fp math with support for huge fp types, e.g. f65536). I wrote both of them.

@andy-thomason

Cool!

@programmerjake
Member

It took me several months to figure out how to efficiently factor polynomials with bigint coefficients, which is used to implement real algebraic numbers.

@programmerjake
Member

modular fields of polynomials with modular integer coefficients are complicated!

@andy-thomason

Tell Galois that :)

Good to see this work being done. I have slightly wider requirements, with the full gamut of stats functions in R, but I worked with SCEE on the PlayStation compilers for a long time. I used to use Maple to calculate approximations and factorise polynomials, but I prefer to do it myself now.

@programmerjake
Member

for higher precision in core::simd, I'm planning on using double-double arithmetic, since f64 SIMD is commonly supported.
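For reference, the core building block of double-double arithmetic is Knuth's error-free two_sum. A minimal sketch (a real library also needs two_prod and full renormalization):

```rust
// Minimal double-double sketch: an unevaluated sum (hi, lo) of two
// f64s carries roughly twice f64 precision.
fn two_sum(a: f64, b: f64) -> (f64, f64) {
    // Knuth's exact addition: s + err == a + b exactly.
    let s = a + b;
    let bb = s - a;
    let err = (a - (s - bb)) + (b - bb);
    (s, err)
}

/// Add a plain f64 to a double-double value.
fn dd_add_f64(hi: f64, lo: f64, b: f64) -> (f64, f64) {
    let (s, e) = two_sum(hi, b);
    // Fold the old low part into the error, then renormalize so
    // |lo| stays small relative to |hi|.
    two_sum(s, e + lo)
}

fn main() {
    // 1.0 + 1e-17 rounds to 1.0 in a single f64, but the (hi, lo)
    // pair keeps the low part exactly.
    let (hi, lo) = dd_add_f64(1.0, 0.0, 1e-17);
    println!("hi = {hi}, lo = {lo:e}");
}
```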


10 participants