
Less memory usage by I+II layer convolution instead of sequential I and II layer convolution. #10

Open · rageworx opened this issue on Jan 5, 2023 · 10 comments
Assignees: rageworx
Labels: enhancement (New feature or request)

rageworx (Owner) commented on Jan 5, 2023

In some issues from years ago, issue reporter @zvezdochiot introduced his stb-based code.
It uses less memory (it does the layer I and layer II convolutions at once) but has worse performance in the OpenMP model (about double the time).
Let's take a look at using less memory while keeping performance somehow.
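
For illustration only, here is a minimal sketch of what fusing layer I and layer II could look like, assuming a 9-1-5 style SRCNN (9x9 layer I with 64 maps, 1x1 layer II with 32 maps). The names, shapes, and border handling below are assumptions for the sketch, not the actual libsrcnn code; the point is that only a per-pixel buffer of layer-I responses is needed instead of the full layer-I feature image.

```cpp
#include <vector>
#include <cstddef>
#include <algorithm>

// Hypothetical sizes for the sketch (9-1-5 SRCNN configuration assumed).
static const int F1 = 9;   // layer I kernel size
static const int N1 = 64;  // layer I feature maps
static const int N2 = 32;  // layer II feature maps

// src is a single-channel image of size w*h; dst receives w*h*N2 values.
// weights1/bias1/weights2/bias2 are illustrative names, assumed loaded elsewhere.
void fused_conv12(const std::vector<float>& src, int w, int h,
                  const float weights1[N1][F1][F1], const float bias1[N1],
                  const float weights2[N2][N1], const float bias2[N2],
                  std::vector<float>& dst)
{
    dst.assign((size_t)w * h * N2, 0.0f);

    #pragma omp parallel for
    for (int y = 0; y < h; ++y)
    {
        for (int x = 0; x < w; ++x)
        {
            float feat1[N1]; // per-pixel layer I responses, never a full image

            // Layer I: 9x9 convolution with clamped borders + ReLU.
            for (int n = 0; n < N1; ++n)
            {
                float acc = bias1[n];
                for (int ky = 0; ky < F1; ++ky)
                {
                    int sy = std::min(std::max(y + ky - F1 / 2, 0), h - 1);
                    for (int kx = 0; kx < F1; ++kx)
                    {
                        int sx = std::min(std::max(x + kx - F1 / 2, 0), w - 1);
                        acc += weights1[n][ky][kx] * src[(size_t)sy * w + sx];
                    }
                }
                feat1[n] = std::max(acc, 0.0f);
            }

            // Layer II: 1x1 convolution applied immediately, so the
            // N1-channel intermediate image is never allocated.
            for (int m = 0; m < N2; ++m)
            {
                float acc = bias2[m];
                for (int n = 0; n < N1; ++n)
                    acc += weights2[m][n] * feat1[n];
                dst[((size_t)y * w + x) * N2 + m] = std::max(acc, 0.0f);
            }
        }
    }
}
```

Whether this wins or loses against the two-pass version depends on cache behaviour under OpenMP, which is exactly the performance question raised in this issue.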

rageworx self-assigned this on Jan 5, 2023
rageworx changed the title from "Less memory usage by skip I+II layer convolution." to "Less memory usage by I+II layer convolution instead of sequential I and II layer convolution." on Jan 5, 2023
rageworx (Owner, Author) commented on Jan 5, 2023

This code is about 4 years old now.
I need to understand the code again myself, so it may take more time.

rageworx added the enhancement (New feature or request) label on Jan 5, 2023
zvezdochiot commented on Jan 5, 2023

Hi @rageworx.

See the block algorithm. It allows you to process images of any size practically losslessly. But due to the block overlaps, performance is even lower. The size of the overlaps was chosen on the basis of dssim.

See also: shuwang127/SRCNN_Cpp#4
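
To make the block idea concrete, here is a minimal sketch of tiling with overlap; the block size, overlap value, and helper names are placeholders, not the ones stb-image-srcnn actually uses. Each tile is cut out with a margin, processed as a small independent image, and only its interior is written back.

```cpp
#include <vector>
#include <cstddef>
#include <algorithm>

// Placeholder for the per-tile SRCNN pass; a real implementation would run the
// network on the tile here. Kept trivial so the sketch stays self-contained.
static void process_tile(std::vector<float>& /*tile*/, int /*tw*/, int /*th*/) {}

// Illustrative block processing with overlap: only one (block + overlap)
// buffer is alive at a time, and only each block's interior is written back.
static void process_blocked(std::vector<float>& img, int w, int h,
                            int block = 128, int overlap = 8)
{
    std::vector<float> out(img.size());

    for (int by = 0; by < h; by += block)
    {
        for (int bx = 0; bx < w; bx += block)
        {
            // Tile bounds extended by the overlap margin, clamped to the image.
            int x0 = std::max(bx - overlap, 0);
            int y0 = std::max(by - overlap, 0);
            int x1 = std::min(bx + block + overlap, w);
            int y1 = std::min(by + block + overlap, h);
            int tw = x1 - x0, th = y1 - y0;

            // Copy the tile out and process it as a small, independent image.
            std::vector<float> tile((size_t)tw * (size_t)th);
            for (int y = y0; y < y1; ++y)
            {
                const float* srcp = img.data() + (size_t)y * w + x0;
                std::copy(srcp, srcp + tw, tile.data() + (size_t)(y - y0) * tw);
            }

            process_tile(tile, tw, th);

            // Write back only the interior; the overlap ring is discarded,
            // which hides the seams at the block junctions.
            for (int y = by; y < std::min(by + block, h); ++y)
                for (int x = bx; x < std::min(bx + block, w); ++x)
                    out[(size_t)y * w + x] =
                        tile[(size_t)(y - y0) * tw + (x - x0)];
        }
    }
    img.swap(out);
}
```

The overlap copies are redundant work, which matches the note above that block processing trades some speed for the lower memory footprint.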

rageworx (Owner, Author) commented on Jan 5, 2023

> See also: shuwang127/SRCNN_Cpp#4

Just this header, right?

https://github.com/ImageProcessing-ElectronicPublications/stb-image-srcnn/blob/main/src/srcnn.h

Interesting. I will run a performance check on a low-power-consumption system such as an aarch64-based Debian Linux system.

rageworx (Owner, Author) commented on Jan 5, 2023

> See also: shuwang127/SRCNN_Cpp#4

Also, the shuwang127 repo seems to be abandoned.
It looks better to forget about asking for a pull request there ...

zvezdochiot commented on Jan 5, 2023

@rageworx says:

> Just this header, right?

Stop! Be afraid! Do you really want to shove in a defective bicubic interpolant?

@rageworx says:

> Also, the shuwang127 repo seems to be abandoned.

That is the question of combining Layer I and Layer II.

rageworx (Owner, Author) commented on Jan 5, 2023

> Stop! Be afraid! ...

Is this some kind of Russian slogan?
Actually, I cannot get your point.
Anyway, your suggestion may help improve my old code.

Regards, Raph.

rageworx (Owner, Author) commented on Jan 5, 2023

> Hi @rageworx.
>
> See the block algorithm. It allows you to process images of any size practically losslessly. But due to the block overlaps, performance is even lower. The size of the overlaps was chosen on the basis of dssim.
>
> See also: shuwang127/SRCNN_Cpp#4

I have never heard of the algorithm you mention; block? dssim?
But I will try!

zvezdochiot commented on Jan 5, 2023

@rageworx says:

> Actually, I cannot get your point.

bicubic.h is verified. See stb-image-resize and its demo.

@rageworx says:

> I have never heard of the algorithm you mention; block?

It is a simple division of the image into blocks with an overlap, with each block processed as a small image. Only one block is processed at a time, which means that only one block needs to be allocated in memory.

stb-image-srcnn says:

> For complete processing, memory for 175 original images is required. With block processing, this is reduced to 170 × block size + 5 × the size of the original image.
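
As a back-of-the-envelope check of the figures quoted above, the two estimates can be compared directly; the image size, block size, and single-channel float assumption below are made up for the example.

```cpp
#include <cstdio>
#include <cstddef>

int main()
{
    // Hypothetical example: a 1920x1080 single-channel float image,
    // processed either monolithically or in 128x128 blocks.
    const size_t w = 1920, h = 1080, block = 128;
    const size_t px = sizeof(float);

    // Figures quoted above: ~175 image-sized buffers for monolithic
    // processing vs. ~170 block-sized buffers + 5 image-sized buffers.
    double monolithic = 175.0 * w * h * px;
    double blocked    = 170.0 * block * block * px + 5.0 * w * h * px;

    std::printf("monolithic: %.1f MiB\n", monolithic / (1024.0 * 1024.0));
    std::printf("blocked   : %.1f MiB\n", blocked / (1024.0 * 1024.0));
    return 0;
}
```

Under these assumed sizes the blocked estimate comes out far smaller, which is the motivation for the block algorithm.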

@rageworx says:

> dssim?

A metric: delta SSIM == 1/SSIM - 1. Maybe use stb-image-nhwmetrics.

```
dssim -o butterfly.x2.dssim.2-0.png butterfly.x2.0.png butterfly.x2.2.png
0.00003022      butterfly.x2.2.png
```

[image: butterfly.x2.dssim.2-0.png]

```
stbnhwmetrics -q butterfly.x2.0.png butterfly.x2.2.png butterfly.x2.nhw-r.2-0.png
0.014613    butterfly.x2.2.png
```

[image: butterfly.x2.nhw-r.2-0.png]
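
For reference, the metric above is just a reparameterization of SSIM; a minimal sketch of the conversion (my own helper, not part of any of the tools mentioned):

```cpp
#include <cstdio>

// delta SSIM as defined above: dssim = 1/SSIM - 1, so 0 means identical images
// and larger values mean a larger structural difference.
static double dssim_from_ssim(double ssim) { return 1.0 / ssim - 1.0; }

int main()
{
    // The reported 0.00003022 corresponds to an SSIM very close to 1.
    std::printf("%.8f\n", dssim_from_ssim(0.99996978));
    return 0;
}
```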

rageworx (Owner, Author) commented on Mar 9, 2023

Merged Conv I+II.
And dssim looks like it checks frequency differences by "Fast Fourier Transform (FFT)", judging from the result above; let it be checked.

zvezdochiot commented:

@rageworx says:

> let it be checked.

I have already checked everything with metrics. There are differences between the monolithic and the block algorithm only at the "junctions" of blocks. What needs to be checked now is not the metrics but the memory allocation. Combining layers I and II greatly reduced memory consumption, but the monolithic algorithm still eats a decent amount anyway.
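
Since the remaining question is memory allocation rather than image quality, one simple way to compare the monolithic and block runs (my own suggestion, not something either repository currently does) is to report the peak resident set size after each run:

```cpp
#include <sys/resource.h>
#include <cstdio>

// Peak resident set size of the current process.
// On Linux ru_maxrss is in kilobytes (on macOS it is in bytes).
static long peak_rss_kb()
{
    struct rusage ru;
    getrusage(RUSAGE_SELF, &ru);
    return ru.ru_maxrss;
}

int main()
{
    // Run the monolithic or the block SRCNN pass here, then print the peak.
    std::printf("peak RSS: %ld kB\n", peak_rss_kb());
    return 0;
}
```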
