Post by Matthias BodenbinderCan it be that the DT GUI is not releasing all the GPU memory and then the commandline fails? And the worst thing is that the system is not recovering from that. Even with bench.SRW which is a lot smaller it is not working anymore. The GPU memory is gone.
You should check on your system with nvidia-smi (typically comes as part of the nvidia-compute package). The tool will tell you the amount of used GPU memory. You should be able to see if it differs before and after running darktable.
You may also try 'darktable -d opencl -d memory' which gives you a log of darktable's opencl memory usage. At the end it should go down back to zero and also tell you the peak memory usage.
Ulrich
____________________________________________________________________________
darktable user mailing list
Ok. I did the testing again with -d memory and with nvidia-smi in parallel.
I am repeating the following cycle:
1st: on the commadline darktable-cli with the big Canon 5Ds RAW file
and measuring with nvidia-smi
2nd: in darktable GUI export of 26 canon 6D raw files
and measuring with nvidia-smi
3rd: back to 1st
When I repeat this cycle I see with nvidia-smi that the GPU memory is not fully released.
When I start it is:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 396.24 Driver Version: 396.24 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 105... Off | 00000000:01:00.0 On | N/A |
| 20% 29C P0 N/A / 75W | 205MiB / 4032MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 3161 G /usr/lib/xorg-server/Xorg 119MiB |
| 0 3204 G /usr/bin/gnome-shell 55MiB |
| 0 3595 G /usr/lib/thunderbird/thunderbird 27MiB |
+-----------------------------------------------------------------------------+
after about 3-4 cycles it is:
Thu May 24 17:36:40 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 396.24 Driver Version: 396.24 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 105... Off | 00000000:01:00.0 On | N/A |
| 21% 47C P0 N/A / 75W | 365MiB / 4032MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 3161 G /usr/lib/xorg-server/Xorg 177MiB |
| 0 3204 G /usr/bin/gnome-shell 139MiB |
| 0 3595 G /usr/lib/thunderbird/thunderbird 29MiB |
| 0 12453 G ...-token=A9014C684CDE331E22E41D13D5C24B83 15MiB |
+-----------------------------------------------------------------------------+
And this is where the commandline with the big Canon 5Ds file is getting into memory issues:
[memory] after successful startup
[memory] max address space (vmpeak): 14048600 kB
[memory] cur address space (vmsize): 14048600 kB
[memory] max used memory (vmhwm ): 173900 kB
[memory] cur used memory (vmrss ): 173900 kB
0,577826 [dev] took 0,358 secs (0,294 CPU) to load the image.
0,657177 [export] creating pixelpipe took 0,079 secs (0,165 CPU)
0,657193 [pixelpipe_process] [export] using device 0
[memory] before pixelpipe process
[memory] max address space (vmpeak): 16013512 kB
[memory] cur address space (vmsize): 16013512 kB
[memory] max used memory (vmhwm ): 385344 kB
[memory] cur used memory (vmrss ): 285088 kB
0,657246 [dev_pixelpipe] took 0,000 secs (0,000 CPU) initing base buffer [export]
0,657260 [opencl memory] device 0: 105328640 bytes in use
0,667911 [opencl memory] device 0: 309913856 bytes in use
0,669008 [opencl memory] device 0: 309913872 bytes in use
0,669015 [opencl memory] device 0: 309913888 bytes in use
0,669421 [opencl memory] device 0: 309913872 bytes in use
0,675027 [opencl memory] device 0: 309913856 bytes in use
0,675107 [opencl memory] device 0: 204585216 bytes in use
0,675292 [dev_pixelpipe] took 0,018 secs (0,015 CPU) processed `Raw-Schwarz-/Weißpunkt' on GPU, blended on GPU [export]
0,675299 [opencl memory] device 0: 409170432 bytes in use
0,676389 [opencl memory] device 0: 409170444 bytes in use
0,678972 [opencl memory] device 0: 409170432 bytes in use
0,684469 [opencl memory] device 0: 204585216 bytes in use
0,684793 [dev_pixelpipe] took 0,009 secs (0,005 CPU) processed `Weißabgleich' on GPU, blended on GPU [export]
0,684800 [opencl memory] device 0: 409170432 bytes in use
0,693756 [opencl memory] device 0: 204585216 bytes in use
0,694007 [dev_pixelpipe] took 0,009 secs (0,008 CPU) processed `Spitzlicht-Rekonstruktion' on GPU, blended on GPU [export]
0,694014 [opencl memory] device 0: 1022926080 bytes in use
0,695104 [opencl memory] device 0: 1841266944 bytes in use
0,700809 [opencl memory] device 0: 1022926080 bytes in use
0,751384 [opencl memory] device 0: 818340864 bytes in use
0,755737 [dev_pixelpipe] took 0,062 secs (0,018 CPU) processed `Entrastern' on GPU, blended on GPU [export]
1,016704 [opencl memory] device 0: 0 bytes in use
1,018748 [default_process_tiling_cl_ptp] use tiling on module 'denoiseprofile' for image with full size 8734 x 5856
1,018754 [default_process_tiling_cl_ptp] (2 x 1) tiles with max dimensions 8128 x 5856 and overlap 8
1,018756 [default_process_tiling_cl_ptp] tile (0, 0) with 8128 x 5856 at origin [0, 0]
1,018760 [opencl memory] device 0: 761561088 bytes in use
1,018762 [opencl memory] device 0: 1523122176 bytes in use
1,113819 [opencl memory] device 0: 2284683264 bytes in use
1,113829 [opencl memory] device 0: 3046244352 bytes in use
1,113831 [opencl memory] device 0: 3236634624 bytes in use
1,113837 [opencl memory] device 0: 3427024896 bytes in use
1,113839 [opencl memory] device 0: 3617415168 bytes in use
1,113841 [opencl memory] device 0: 3807805440 bytes in use
11,528715 [opencl memory] device 0: 3617415168 bytes in use
11,528904 [opencl memory] device 0: 3427024896 bytes in use
11,531146 [opencl memory] device 0: 3236634624 bytes in use
11,533251 [opencl memory] device 0: 3046244352 bytes in use
11,537523 [opencl memory] device 0: 2284683264 bytes in use
11,542206 [opencl memory] device 0: 1523122176 bytes in use
11,546235 [opencl_denoiseprofile] couldn't enqueue kernel! -4
11,546240 [opencl memory] device 0: 761561088 bytes in use
11,550443 [opencl memory] device 0: 0 bytes in use
11,550447 [default_process_tiling_opencl_ptp] couldn't run process_cl() for module 'denoiseprofile' in tiling mode: 0
11,550450 [opencl_pixelpipe] could not run module 'denoiseprofile' on gpu. falling back to cpu path
36,835311 [dev_pixelpipe] took 36,080 secs (206,513 CPU) processed `Entrauschen (Profil)' on CPU, blended on CPU [export]
36,835361 [opencl memory] device 0: 818340864 bytes in use
36,926261 [opencl memory] device 0: 1636681728 bytes in use
36,927550 [opencl memory] device 0: 1636943872 bytes in use
36,927559 [opencl memory] device 0: 1636943884 bytes in use
36,928372 [opencl memory] device 0: 1636681740 bytes in use
36,962240 [opencl memory] device 0: 1636681728 bytes in use
36,962379 [opencl memory] device 0: 818340864 bytes in use
36,963098 [dev_pixelpipe] took 0,128 secs (0,183 CPU) processed `Basiskurve' on GPU, blended on GPU [export]
36,963108 [opencl memory] device 0: 1636681728 bytes in use
36,964171 [opencl memory] device 0: 1636681764 bytes in use
36,964177 [opencl memory] device 0: 1636681800 bytes in use
36,964276 [opencl memory] device 0: 1636943944 bytes in use
36,964372 [opencl memory] device 0: 1637206088 bytes in use
36,964468 [opencl memory] device 0: 1637468232 bytes in use
36,964471 [opencl memory] device 0: 1637468268 bytes in use
36,978946 [opencl memory] device 0: 1637468232 bytes in use
37,010210 [opencl memory] device 0: 1637468196 bytes in use
37,010222 [opencl memory] device 0: 1637206052 bytes in use
37,010228 [opencl memory] device 0: 1636943908 bytes in use
37,010232 [opencl memory] device 0: 1636681764 bytes in use
37,010575 [opencl memory] device 0: 1636681728 bytes in use
37,010693 [opencl memory] device 0: 818340864 bytes in use
37,011223 [dev_pixelpipe] took 0,048 secs (0,034 CPU) processed `Eingabefarbprofil' on GPU, blended on GPU [export]
37,011230 [opencl memory] device 0: 1636681728 bytes in use
37,012426 [opencl memory] device 0: 2455022592 bytes in use
37,012433 [opencl memory] device 0: 2455022636 bytes in use
37,022835 [opencl memory] device 0: 2455022592 bytes in use
37,120083 [opencl memory] device 0: 1636681728 bytes in use
37,120613 [opencl memory] device 0: 818340864 bytes in use
37,125276 [dev_pixelpipe] took 0,114 secs (0,076 CPU) processed `Schärfen' on GPU, blended on GPU [export]
37,125284 [opencl memory] device 0: 1636681728 bytes in use
37,126477 [opencl memory] device 0: 1636681764 bytes in use
37,126560 [opencl memory] device 0: 1636943908 bytes in use
37,126630 [opencl memory] device 0: 1637206052 bytes in use
37,126702 [opencl memory] device 0: 1637468196 bytes in use
37,126705 [opencl memory] device 0: 1637468232 bytes in use
37,141732 [opencl memory] device 0: 1637468196 bytes in use
37,202925 [opencl memory] device 0: 1637206052 bytes in use
37,202937 [opencl memory] device 0: 1636943908 bytes in use
37,202942 [opencl memory] device 0: 1636681764 bytes in use
37,203554 [opencl memory] device 0: 1636681728 bytes in use
37,203667 [opencl memory] device 0: 818340864 bytes in use
37,204190 [dev_pixelpipe] took 0,079 secs (0,044 CPU) processed `Ausgabefarbprofil' on GPU, blended on GPU [export]
37,269189 [opencl memory] device 0: 0 bytes in use
37,309177 [dev_pixelpipe] took 0,105 secs (0,363 CPU) processed `Gamma' on CPU, blended on CPU [export]
37,309216 [opencl_profiling] profiling device 0 ('GeForce GTX 1050 Ti'):
37,309220 [opencl_profiling] spent 0,1814 seconds in [Write Image (from host to device)]
37,309223 [opencl_profiling] spent 0,0056 seconds in rawprepare_1f
37,309226 [opencl_profiling] spent 0,0054 seconds in whitebalance_1f
37,309228 [opencl_profiling] spent 0,0058 seconds in highlights_1f_clip
37,309231 [opencl_profiling] spent 0,0254 seconds in ppg_demosaic_green
37,309234 [opencl_profiling] spent 0,0244 seconds in ppg_demosaic_redblue
37,309236 [opencl_profiling] spent 0,0017 seconds in border_interpolate
37,309240 [opencl_profiling] spent 0,3254 seconds in [Read Image (from device to host)]
37,309242 [opencl_profiling] spent 0,0219 seconds in denoiseprofile_precondition
37,309245 [opencl_profiling] spent 0,0076 seconds in denoiseprofile_init
37,309247 [opencl_profiling] spent 2,0810 seconds in denoiseprofile_dist
37,309250 [opencl_profiling] spent 0,8712 seconds in denoiseprofile_horiz
37,309252 [opencl_profiling] spent 2,1148 seconds in denoiseprofile_vert
37,309254 [opencl_profiling] spent 5,1772 seconds in denoiseprofile_accu
37,309257 [opencl_profiling] spent 0,0337 seconds in basecurve_lut
37,309259 [opencl_profiling] spent 0,0312 seconds in colorin_unbound
37,309262 [opencl_profiling] spent 0,0357 seconds in sharpen_hblur
37,309264 [opencl_profiling] spent 0,0207 seconds in sharpen_vblur
37,309267 [opencl_profiling] spent 0,0447 seconds in sharpen_mix
37,309269 [opencl_profiling] spent 0,0612 seconds in colorout
37,309271 [opencl_profiling] spent 11,0762 seconds totally in command queue (with 1 event missing)
37,309278 [dev_process_export] pixel pipeline processing took 36,652 secs (207,261 CPU)
[export_job] exported to `test-1.jpg'
37,787315 [opencl_summary_statistics] device 'GeForce GTX 1050 Ti' (0): peak memory usage 3807805440 bytes
37,787326 [opencl_summary_statistics] device 'GeForce GTX 1050 Ti' (0): 499 out of 500 events were successful and 1 events lost
####
# with the first tries I did not have the memory issue and it looked like this:
[memory] after successful startup
[memory] max address space (vmpeak): 14048600 kB
[memory] cur address space (vmsize): 14048600 kB
[memory] max used memory (vmhwm ): 174320 kB
[memory] cur used memory (vmrss ): 174320 kB
0,569022 [dev] took 0,352 secs (0,285 CPU) to load the image.
0,651794 [export] creating pixelpipe took 0,082 secs (0,187 CPU)
0,651810 [pixelpipe_process] [export] using device 0
[memory] before pixelpipe process
[memory] max address space (vmpeak): 16013444 kB
[memory] cur address space (vmsize): 16013444 kB
[memory] max used memory (vmhwm ): 385628 kB
[memory] cur used memory (vmrss ): 285376 kB
0,651859 [dev_pixelpipe] took 0,000 secs (0,000 CPU) initing base buffer [export]
0,651872 [opencl memory] device 0: 105328640 bytes in use
0,662504 [opencl memory] device 0: 309913856 bytes in use
0,663567 [opencl memory] device 0: 309913872 bytes in use
0,663570 [opencl memory] device 0: 309913888 bytes in use
0,663956 [opencl memory] device 0: 309913872 bytes in use
0,668663 [opencl memory] device 0: 309913856 bytes in use
0,668743 [opencl memory] device 0: 204585216 bytes in use
0,668907 [dev_pixelpipe] took 0,017 secs (0,010 CPU) processed `Raw-Schwarz-/Weißpunkt' on GPU, blended on GPU [export]
0,668913 [opencl memory] device 0: 409170432 bytes in use
0,669972 [opencl memory] device 0: 409170444 bytes in use
0,671924 [opencl memory] device 0: 409170432 bytes in use
0,677295 [opencl memory] device 0: 204585216 bytes in use
0,677508 [dev_pixelpipe] took 0,009 secs (0,001 CPU) processed `Weißabgleich' on GPU, blended on GPU [export]
0,677515 [opencl memory] device 0: 409170432 bytes in use
0,687658 [opencl memory] device 0: 204585216 bytes in use
0,687909 [dev_pixelpipe] took 0,010 secs (0,003 CPU) processed `Spitzlicht-Rekonstruktion' on GPU, blended on GPU [export]
0,687915 [opencl memory] device 0: 1022926080 bytes in use
0,689002 [opencl memory] device 0: 1841266944 bytes in use
0,693816 [opencl memory] device 0: 1022926080 bytes in use
0,738143 [opencl memory] device 0: 818340864 bytes in use
0,742500 [dev_pixelpipe] took 0,055 secs (0,030 CPU) processed `Entrastern' on GPU, blended on GPU [export]
1,003227 [opencl memory] device 0: 0 bytes in use
1,005273 [default_process_tiling_cl_ptp] use tiling on module 'denoiseprofile' for image with full size 8734 x 5856
1,005279 [default_process_tiling_cl_ptp] (2 x 1) tiles with max dimensions 8128 x 5856 and overlap 8
1,005281 [default_process_tiling_cl_ptp] tile (0, 0) with 8128 x 5856 at origin [0, 0]
1,005286 [opencl memory] device 0: 761561088 bytes in use
1,005288 [opencl memory] device 0: 1523122176 bytes in use
1,099501 [opencl memory] device 0: 2284683264 bytes in use
1,099511 [opencl memory] device 0: 3046244352 bytes in use
1,099513 [opencl memory] device 0: 3236634624 bytes in use
1,099519 [opencl memory] device 0: 3427024896 bytes in use
1,099521 [opencl memory] device 0: 3617415168 bytes in use
1,099522 [opencl memory] device 0: 3807805440 bytes in use
11,516149 [opencl memory] device 0: 3617415168 bytes in use
11,548085 [opencl memory] device 0: 3427024896 bytes in use
11,550326 [opencl memory] device 0: 3236634624 bytes in use
11,552427 [opencl memory] device 0: 3046244352 bytes in use
11,554597 [opencl memory] device 0: 2284683264 bytes in use
11,559091 [opencl memory] device 0: 1523122176 bytes in use
11,810621 [opencl memory] device 0: 761561088 bytes in use
11,811683 [opencl memory] device 0: 0 bytes in use
11,816696 [default_process_tiling_cl_ptp] tile (1, 0) with 622 x 5856 at origin [8112, 0]
11,816703 [opencl memory] device 0: 58278912 bytes in use
11,816706 [opencl memory] device 0: 116557824 bytes in use
11,833042 [opencl memory] device 0: 174836736 bytes in use
11,833051 [opencl memory] device 0: 233115648 bytes in use
11,833053 [opencl memory] device 0: 247685376 bytes in use
11,833055 [opencl memory] device 0: 262255104 bytes in use
11,833057 [opencl memory] device 0: 276824832 bytes in use
11,833059 [opencl memory] device 0: 291394560 bytes in use
12,569053 [opencl memory] device 0: 276824832 bytes in use
12,570918 [opencl memory] device 0: 262255104 bytes in use
12,571352 [opencl memory] device 0: 247685376 bytes in use
12,571788 [opencl memory] device 0: 233115648 bytes in use
12,572220 [opencl memory] device 0: 174836736 bytes in use
12,572684 [opencl memory] device 0: 116557824 bytes in use
12,586578 [opencl memory] device 0: 58278912 bytes in use
12,586720 [opencl memory] device 0: 0 bytes in use
12,588392 [dev_pixelpipe] took 11,846 secs (7,755 CPU) processed `Entrauschen (Profil)' on GPU with tiling, blended on CPU [export]
12,588401 [opencl memory] device 0: 818340864 bytes in use
12,672145 [opencl memory] device 0: 1636681728 bytes in use
12,673425 [opencl memory] device 0: 1636943872 bytes in use
12,673433 [opencl memory] device 0: 1636943884 bytes in use
12,674257 [opencl memory] device 0: 1636681740 bytes in use
12,706723 [opencl memory] device 0: 1636681728 bytes in use
12,706837 [opencl memory] device 0: 818340864 bytes in use
12,707443 [dev_pixelpipe] took 0,119 secs (0,104 CPU) processed `Basiskurve' on GPU, blended on GPU [export]
12,707452 [opencl memory] device 0: 1636681728 bytes in use
12,708645 [opencl memory] device 0: 1636681764 bytes in use
12,708652 [opencl memory] device 0: 1636681800 bytes in use
12,708748 [opencl memory] device 0: 1636943944 bytes in use
12,708840 [opencl memory] device 0: 1637206088 bytes in use
12,708932 [opencl memory] device 0: 1637468232 bytes in use
12,708935 [opencl memory] device 0: 1637468268 bytes in use
12,723289 [opencl memory] device 0: 1637468232 bytes in use
12,752757 [opencl memory] device 0: 1637468196 bytes in use
12,752768 [opencl memory] device 0: 1637206052 bytes in use
12,752773 [opencl memory] device 0: 1636943908 bytes in use
12,752778 [opencl memory] device 0: 1636681764 bytes in use
12,753118 [opencl memory] device 0: 1636681728 bytes in use
12,753225 [opencl memory] device 0: 818340864 bytes in use
12,753745 [dev_pixelpipe] took 0,046 secs (0,022 CPU) processed `Eingabefarbprofil' on GPU, blended on GPU [export]
12,753752 [opencl memory] device 0: 1636681728 bytes in use
12,754946 [opencl memory] device 0: 2455022592 bytes in use
12,754953 [opencl memory] device 0: 2455022636 bytes in use
12,765358 [opencl memory] device 0: 2455022592 bytes in use
12,860049 [opencl memory] device 0: 1636681728 bytes in use
12,860570 [opencl memory] device 0: 818340864 bytes in use
12,865228 [dev_pixelpipe] took 0,111 secs (0,073 CPU) processed `Schärfen' on GPU, blended on GPU [export]
12,865235 [opencl memory] device 0: 1636681728 bytes in use
12,866429 [opencl memory] device 0: 1636681764 bytes in use
12,866504 [opencl memory] device 0: 1636943908 bytes in use
12,866569 [opencl memory] device 0: 1637206052 bytes in use
12,866637 [opencl memory] device 0: 1637468196 bytes in use
12,866640 [opencl memory] device 0: 1637468232 bytes in use
12,881695 [opencl memory] device 0: 1637468196 bytes in use
12,941271 [opencl memory] device 0: 1637206052 bytes in use
12,941280 [opencl memory] device 0: 1636943908 bytes in use
12,941285 [opencl memory] device 0: 1636681764 bytes in use
12,941892 [opencl memory] device 0: 1636681728 bytes in use
12,942000 [opencl memory] device 0: 818340864 bytes in use
12,942517 [dev_pixelpipe] took 0,077 secs (0,045 CPU) processed `Ausgabefarbprofil' on GPU, blended on GPU [export]
13,007575 [opencl memory] device 0: 0 bytes in use
13,043600 [dev_pixelpipe] took 0,101 secs (0,335 CPU) processed `Gamma' on CPU, blended on CPU [export]
13,043661 [opencl_profiling] profiling device 0 ('GeForce GTX 1050 Ti'):
13,043665 [opencl_profiling] spent 0,1912 seconds in [Write Image (from host to device)]
13,043668 [opencl_profiling] spent 0,0047 seconds in rawprepare_1f
13,043671 [opencl_profiling] spent 0,0053 seconds in whitebalance_1f
13,043674 [opencl_profiling] spent 0,0049 seconds in highlights_1f_clip
13,043676 [opencl_profiling] spent 0,0195 seconds in ppg_demosaic_green
13,043679 [opencl_profiling] spent 0,0233 seconds in ppg_demosaic_redblue
13,043681 [opencl_profiling] spent 0,0017 seconds in border_interpolate
13,043684 [opencl_profiling] spent 0,5840 seconds in [Read Image (from device to host)]
13,043687 [opencl_profiling] spent 0,0220 seconds in denoiseprofile_precondition
13,043689 [opencl_profiling] spent 0,0095 seconds in denoiseprofile_init
13,043692 [opencl_profiling] spent 2,1725 seconds in denoiseprofile_dist
13,043694 [opencl_profiling] spent 0,9731 seconds in denoiseprofile_horiz
13,043697 [opencl_profiling] spent 2,2716 seconds in denoiseprofile_vert
13,043699 [opencl_profiling] spent 5,4051 seconds in denoiseprofile_accu
13,043702 [opencl_profiling] spent 0,0336 seconds in denoiseprofile_finish
13,043704 [opencl_profiling] spent 0,0324 seconds in basecurve_lut
13,043706 [opencl_profiling] spent 0,0294 seconds in colorin_unbound
13,043709 [opencl_profiling] spent 0,0342 seconds in sharpen_hblur
13,043711 [opencl_profiling] spent 0,0195 seconds in sharpen_vblur
13,043713 [opencl_profiling] spent 0,0449 seconds in sharpen_mix
13,043716 [opencl_profiling] spent 0,0595 seconds in colorout
13,043718 [opencl_profiling] spent 11,9418 seconds totally in command queue (with 0 events missing)
13,043726 [dev_process_export] pixel pipeline processing took 12,392 secs (8,381 CPU)