cudnn error on windows #188

CorcovadoMing · 2016-12-21T19:12:45Z

I've installed torch on Windows with CUDA and cuDNN.
I can run cunn without errors, but when I convert my model to cudnn, the following error appears:

In 1 module of nn.Sequential:
...\.\install\luarocks\systree/share/lua/5.1/cudnn\find.lua:379: bad argument #7 to 'call' (cannot convert 'int *' to 'uint64_t *')
stack traceback:
        [C]: in function 'call'
        ...\.\install\luarocks\systree/share/lua/5.1/cudnn\find.lua:379: in function 'callCudnn'
        ...\.\install\luarocks\systree/share/lua/5.1/cudnn\find.lua:472: in function 'forwardAlgorithm'
        ...rocks\systree/share/lua/5.1/cudnn\SpatialConvolution.lua:190: in function <...rocks\systree/share/lua/5.1/cudnn\SpatialConvolution.lua:186>
        [C]: in function 'xpcall'
        ...\install\luarocks\systree/share/lua/5.1/nn\Container.lua:63: in function 'rethrowErrors'
        ...install\luarocks\systree/share/lua/5.1/nn\Sequential.lua:44: in function 'forward'
        Main.lua:221: in function 'Train'
        Main.lua:343: in main chunk
        [C]: in function 'dofile'
        ...l\luarocks\systree\lib\luarocks\rocks\trepl\scm-1\bin\th:145: in main chunk
        [C]: at 0x7ff726b71eb0

Any suggestions?

BTNC · 2016-12-22T03:30:54Z

The cudnn package is not fully patched to work on Windows. The main problem is that cudnn uses LongTensor as storage for 64-bit integers in a few places, while the underlying long type is only 32 bits on Windows, the same as int. For now, if you hit errors with cudnn, you have to use cunn instead, or you can try replacing those LongTensors with real 64-bit integers.
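The size difference described above can be checked directly from LuaJIT. This is a minimal sketch, assuming Torch is running on LuaJIT so that the `ffi` library is available; it only prints the sizes of the C types involved and does not touch cudnn.

```lua
-- Requires LuaJIT; the ffi library is not part of plain Lua 5.1.
local ffi = require("ffi")

-- Sizes, in bytes, of the C types involved on the current platform.
for _, ctype in ipairs({ "int", "long", "size_t", "uint64_t" }) do
    print(string.format("%-9s %d", ctype, ffi.sizeof(ctype)))
end

-- True on 64-bit Windows, where long stays 32 bits wide;
-- false on 64-bit Linux, where long is 64 bits wide.
local longIsNarrow = ffi.sizeof("long") < ffi.sizeof("uint64_t")
print("long narrower than uint64_t:", longIsNarrow)
```

If the last line reports true, a pointer to long cannot stand in for a pointer to a 64-bit integer, which matches the conversion error in the stack trace.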

CorcovadoMing · 2016-12-22T05:34:55Z

@BTNC I can use cunn without errors. However, it is slow: running vgg16, an epoch still takes 8 min, compared with 30 min per epoch on CPU. I assumed the poor performance comes from not using cudnn, or do you think there is another performance issue? (An epoch takes only 30 sec on Linux.)

Could you give me some pointers on how I could help port cudnn to Windows?

elikosan · 2016-12-27T23:22:35Z

I am facing the same problem. When do you think a patch will be available?
Thanks!

elikosan · 2016-12-27T23:40:32Z

Actually, I tried @BTNC's suggestion and replaced a LongTensor with a 64-bit buffer:

    --local bufSize = torch.LongTensor(1)
    local bufSize = ffi.new("size_t[1]")

And it seems to work fine.
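Why the replacement helps can be shown without cudnn. The sketch below is an illustration under assumptions, not the code from find.lua: `writeSize` is a hypothetical stand-in for a C function with a `uint64_t *` out-parameter, built from a LuaJIT callback so the example runs on its own.

```lua
local ffi = require("ffi")

-- Hypothetical stand-in for a C function with a uint64_t* out-parameter;
-- a LuaJIT callback is used so the sketch runs without cudnn installed.
local writeSize = ffi.cast("void (*)(uint64_t *)", function(out)
    out[0] = 4096
end)

-- An int buffer has the wrong element size, so the call is rejected
-- before the function body ever runs.
local narrowBuf = ffi.new("int[1]")
local ok, err = pcall(writeSize, narrowBuf)
print(ok, err)

-- A buffer of true 64-bit integers is accepted.
local wideBuf = ffi.new("uint64_t[1]")
writeSize(wideBuf)

-- uint64_t values are boxed cdata; tonumber() converts them to a Lua number.
local bufSize = tonumber(wideBuf[0])
print(bufSize)

-- Release the callback slot once it is no longer needed.
writeSize:free()
```

The `tonumber` step matters when the result is later used as an ordinary Lua number, for example when it is written back into a tensor.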

wakanawakana · 2017-01-07T08:50:34Z

I tried this code and cudnn works:

    local bufSize = torch.LongTensor(1)
    -- 64-bit buffer for the workspace size reported by cudnn
    local uint64buf = ffi.new("size_t[1]")
    ret = cudnn.call(getWSAlgos[findAPI_idx],
                     cudnn.getHandle(),
                     params[1], params[3], layer.convDesc[0], params[6],
                     retAlgo, ffi.cast('uintptr_t*', uint64buf)) -- was: bufSize:data()
    -- copy the result back so the rest of the code can keep using bufSize
    bufSize[1] = tonumber(uint64buf[0])

wakanawakana mentioned this issue on Jan 7, 2017 in "'image' module error using from python" #193 (Open)
