Releases · FluxML/NNlib.jl
NNlib v0.9.6
Merged pull requests:
- Bump AMDGPU compat to 0.6 (#530) (@pxl-th)
- Simplify runtests.jl (#531) (@CarloLucibello)
- bump CUDA compat (#535) (@CarloLucibello)
NNlib v0.9.5
Merged pull requests:
- Add `bias_act!` (#457) (@mcabbott)
- Add a new workflow to trigger the benchmark workflow (#522) (@skyleaworlder)
- Fix BenchmarkTrigger.yml (#524) (@skyleaworlder)
- Fix BenchmarkTrigger.yml (#526) (@skyleaworlder)
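
The `bias_act!` function added in #457 fuses the bias addition and activation into one in-place pass. As a minimal sketch of the idea only (an illustrative reimplementation with a hypothetical name, not NNlib's actual code, which also reuses buffers in the gradient pass):

```julia
# Sketch of the fused bias-plus-activation idea behind bias_act! (#457).
# my_bias_act! is a hypothetical helper, not NNlib's implementation.
function my_bias_act!(σ, x::AbstractArray, b)
    x .= σ.(x .+ b)   # overwrite x in place instead of allocating σ.(x .+ b)
    return x
end

x = randn(Float32, 4, 3)
b = randn(Float32, 4)
my_bias_act!(tanh, x, b)  # x now holds tanh of (old x .+ b), no extra array
```

The payoff is avoiding one temporary array per layer, which matters when the bias-add and activation sit between large matrix multiplications.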
NNlib v0.9.4
NNlib v0.9.3
Closed issues:
- ∇conv_filter_direct! nonzero beta + flipkernel bug (#518)
NNlib v0.9.2
Closed issues:
- pointer(CuArray) is not defined (#131)
- NNPACK convolution issue (#203)
- move issues from NNlibCUDA.jl (#495)
Merged pull requests:
- move from bad thread-local to task-local (#497) (@IanButterworth)
- backports v0.8.21 (#498) (@CarloLucibello)
- port batchnorm rrule from Flux (#499) (@CarloLucibello)
- Update AMDGPU compat to 0.5 (#521) (@pxl-th)
NNlib v0.8.21
Merged pull requests:
- Add (log)softmax benchmarks (#491) (@ToucheSir)
- make NNlibCUDA an extension (#492) (@CarloLucibello)
- Fix nthreads on 1.9 (failure in the presence of interactive threads) (#496) (@IanButterworth)
- backports v0.8.21 (#498) (@CarloLucibello)
NNlib v0.9.0
Merged pull requests:
- Add (log)softmax benchmarks (#491) (@ToucheSir)
- make NNlibCUDA an extension (#492) (@CarloLucibello)
- Fix nthreads on 1.9 (failure in the presence of interactive threads) (#496) (@IanButterworth)
NNlib v0.8.20
Merged pull requests:
- Add hand written gelu derivative (#480) (@chengchingwen)
- [NNlibAMDGPUExt] Load MIOpen module only if it is available (#483) (@pxl-th)
- Use KernelAbstractions.jl for upsample kernels (#486) (@pxl-th)
- Use KernelAbstractions.jl for gather/scatter kernels (#487) (@pxl-th)
- [AMDGPU] Add dispatch path for FP16 batched mul (#488) (@pxl-th)
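
PR #480 replaced an autodiff-derived gelu gradient with a hand-written one. As a sketch of what such a closed-form derivative looks like, assuming the common tanh approximation of GELU (the names `my_gelu`/`my_dgelu` are illustrative, and NNlib's actual rule may differ in form):

```julia
# Tanh approximation of GELU and a hand-derived gradient.
# Illustrative sketch only, not NNlib's implementation.
const λ = sqrt(2 / π)
const α = 0.044715

my_gelu(x) = 0.5x * (1 + tanh(λ * (x + α * x^3)))

function my_dgelu(x)
    u = λ * (x + α * x^3)
    t = tanh(u)
    # Product rule: d/dx [0.5x(1 + tanh u)] = 0.5(1 + tanh u) + 0.5x * sech²(u) * u′,
    # with sech²(u) = 1 - tanh²(u) and u′ = λ(1 + 3αx²).
    return 0.5 * (1 + t) + 0.5x * (1 - t^2) * λ * (1 + 3α * x^2)
end
```

Writing the derivative by hand avoids recomputing `tanh` inside the pullback and tends to be faster and more numerically stable than the traced gradient.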
NNlib v0.8.19
Merged pull requests:
- Add AMDGPU extension (#470) (@pxl-th)
- Merge NNlibCUDA as subpackage (#471) (@ToucheSir)
- Add dropout & attention tests for AMDGPU (#472) (@pxl-th)
- Allow regular convolution for AMDGPU (#473) (@pxl-th)
- Add downstream tests for Lux.jl (#474) (@avik-pal)
- change AMDGPUExt to NNlibAMDGPUExt (#477) (@CarloLucibello)
NNlib v0.8.18
Closed issues:
- Unstable performance (#461)
Merged pull requests:
- Fix conv with groups when falling in direct backend (#468) (@gabrielpreviato)