Voukoder R2.0 beta3 - GPU at 20% (beta2: 95/98%)

  • I tested a few more things and found another curious thing. If I change the preset value from nvenc to Slow, Medium or Fast, there is a performance difference in Voukoder 1.2.0 (as there should be).

    Slow = 275 fps average

    Medium = 285 fps average

    Fast = 300 fps average

    But in Voukoder R2 all presets have pretty much the same encoding speed and fps values (it never goes above roughly 200 fps). Something must be slowing down the encoding in Voukoder R2.

    • Official Post

    In Voukoder 1 the settings defaulted to proper values. Even if you did not change them, they were already pretty well optimized. In version 2 you start completely from zero. So you cannot really compare the two unless you are very careful and very sure about which values NVENC gets called with.


    But even then:


    gpu=0 preset=slow qp=15 rc=constqp

    Code
    [16:23:21] Video frame #13077: Render: 13 µs, Process: 3 µs, Encoding: 1429 µs
    [16:23:21] Video frame #13078: Render: 12 µs, Process: 3 µs, Encoding: 1701 µs
    [16:23:21] Video frame #13079: Render: 19 µs, Process: 4 µs, Encoding: 1532 µs


    gpu=0 preset=fast qp=15 rc=constqp

    Code
    [16:24:15] Video frame #10041: Render: 17 µs, Process: 5 µs, Encoding: 594 µs
    [16:24:15] Video frame #10042: Render: 10 µs, Process: 2 µs, Encoding: 492 µs
    [16:24:16] Video frame #10043: Render: 15 µs, Process: 5 µs, Encoding: 676 µs
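
As a rough check, the two excerpts above can be averaged with a few lines of Python. This is only a sketch: the helper name `mean_encoding_us` is made up for this example, and three frames per preset are far too few for a real benchmark.

```python
import re
from statistics import mean

# Log excerpts copied from the two runs above (preset=slow and preset=fast).
SLOW = """\
[16:23:21] Video frame #13077: Render: 13 µs, Process: 3 µs, Encoding: 1429 µs
[16:23:21] Video frame #13078: Render: 12 µs, Process: 3 µs, Encoding: 1701 µs
[16:23:21] Video frame #13079: Render: 19 µs, Process: 4 µs, Encoding: 1532 µs
"""
FAST = """\
[16:24:15] Video frame #10041: Render: 17 µs, Process: 5 µs, Encoding: 594 µs
[16:24:15] Video frame #10042: Render: 10 µs, Process: 2 µs, Encoding: 492 µs
[16:24:16] Video frame #10043: Render: 15 µs, Process: 5 µs, Encoding: 676 µs
"""

# Only the number after "Encoding:" is captured; the unit is ignored.
ENCODING = re.compile(r"Encoding:\s*(\d+)")


def mean_encoding_us(log: str) -> float:
    """Average the Encoding value (in µs) over all frame lines in a log excerpt."""
    return mean(int(value) for value in ENCODING.findall(log))


slow_us = mean_encoding_us(SLOW)
fast_us = mean_encoding_us(FAST)
print(f"slow: {slow_us:.0f} µs, fast: {fast_us:.0f} µs, ratio: {slow_us / fast_us:.2f}")
# prints: slow: 1554 µs, fast: 587 µs, ratio: 2.65
```

On these six frames the slow preset spends roughly 2.6 times as long in the Encoding stage as the fast preset.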


    It always depends on your video sources / filters / effects / almost everything


    On my system CPU and GPU are at 100% load, and it is encoding H.264 at 1152p with 430 fps. I can't see anything slow there. It is all about the settings.


    I would really like to have some good settings to build presets from, but I need to find someone with real insight into encoder configuration to create the best possible presets and make this easier for users.

  • As I can see your Render time is about 15 µs average.

    Is it normal that this value sometimes goes up to over 5000 µs?

    You can see it in the logfile.

    My GTX 1080 Ti reaches about 200 fps with your settings (gpu=0 preset=slow qp=15 rc=constqp). I don't think that a 2080 Ti is more than twice as fast. (My 1080p video has no effects, filters or anything else applied and is located on an SSD RAID 0.)

  • Rendering is done on the CPU mostly. It also depends on your source pixel format and all effects and filters you have applied.

    I have no filters and no effects in my test.

    I also have no effects or anything else applied. And I think a TR 1950X should have enough power.

    • Official Post

    Just to clarify:


    Render: 15 µs <- Voukoder has no impact on this, it is entirely Premiere

    Process: 5 µs <- Voukoder has high impact on this

    Encoding: 676 µs <- Voukoder has small impact on this. Mostly libav / FFmpeg


    Voukoder is very performant with 8bit formats (esp. yuv420p). Formats with higher pixel depths require an expensive frame conversion and are slower.

  • Can you look at my logfile above and tell me if the latencies are normal?

    I am changing and debugging some things in the source code right now. Maybe I can find something.

    • Official Post

    It's interesting to see that every 4th frame is slow.


    [18:46:11] Video frame #12935: Render: 14392 µs, Process: 6 µs, Encoding: 861 µs
    [18:46:11] Video frame #12936: Render: 75 µs, Process: 3 µs, Encoding: 768 µs
    [18:46:11] Video frame #12937: Render: 69 µs, Process: 2 µs, Encoding: 1461 µs
    [18:46:11] Video frame #12938: Render: 64 µs, Process: 4 µs, Encoding: 1841 µs


    Average FPS: 205
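
The four frames quoted above are enough to see where the time goes. The sketch below sums the three stage times per frame and converts the average into frames per second. It assumes the stages run one after another for each frame, which the log does not confirm; the function name `stage_summary` is made up for this example.

```python
import re

# The four frames quoted above (one slow Render followed by three fast ones).
LOG = """\
[18:46:11] Video frame #12935: Render: 14392 µs, Process: 6 µs, Encoding: 861 µs
[18:46:11] Video frame #12936: Render: 75 µs, Process: 3 µs, Encoding: 768 µs
[18:46:11] Video frame #12937: Render: 69 µs, Process: 2 µs, Encoding: 1461 µs
[18:46:11] Video frame #12938: Render: 64 µs, Process: 4 µs, Encoding: 1841 µs
"""

# Capture the three numbers; \D+ skips the unit and separators between them.
STAGES = re.compile(r"Render: (\d+)\D+Process: (\d+)\D+Encoding: (\d+)")


def stage_summary(log: str) -> tuple[float, float]:
    """Return (implied fps, share of total time spent in Render).

    Assumption: Render, Process and Encoding run sequentially per frame,
    so their times simply add up.
    """
    frames = [tuple(map(int, match)) for match in STAGES.findall(log)]
    total_us = sum(sum(frame) for frame in frames)
    render_us = sum(frame[0] for frame in frames)
    fps = 1_000_000 * len(frames) / total_us
    return fps, render_us / total_us


fps, render_share = stage_summary(LOG)
print(f"{fps:.0f} fps, Render share {render_share:.0%}")
# prints: 205 fps, Render share 75%
```

Under that assumption the single slow Render call accounts for about three quarters of the time spent on these four frames, and the implied rate matches the reported average.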

    • Official Post

    There are some issues with your system, yes. I have no clue why it is like that. It could be many things:


    Maybe ...

    • with AMD CPUs in general
    • with Threadripper CPUs
    • with high core counts
    • with specific Premiere settings? Hardware decoding?
    • ...

    I did not get any hardware donations and I can't afford buying any test equipment. So I can just guess here.


    But since you have a working Visual Studio IDE and the Voukoder sources, I'd recommend that you try to find the issue yourself. Try to find that slow spot.


    Try to profile it.


    Edit:


    My i7-4770 shows a similar pattern but is still faster than your Threadripper:


    Code
    [20:01:44] Video frame #2297: Render: 808 µs, Process: 6 µs, Encoding: 1884 µs
    [20:01:44] Video frame #2298: Render: 56 µs, Process: 3 µs, Encoding: 2543 µs
    [20:01:44] Video frame #2299: Render: 19 µs, Process: 4 µs, Encoding: 1799 µs
    [20:01:44] Video frame #2300: Render: 16 µs, Process: 3 µs, Encoding: 1747 µs
    [20:01:44] Video frame #2301: Render: 9 µs, Process: 3 µs, Encoding: 3062 µs
    [20:01:44] Video frame #2302: Render: 17 µs, Process: 3 µs, Encoding: 2303 µs
    [20:01:44] Video frame #2303: Render: 8254 µs, Process: 6 µs, Encoding: 1191 µs
    [20:01:44] Video frame #2304: Render: 654 µs, Process: 5 µs, Encoding: 1199 µs
    [20:01:44] Video frame #2305: Render: 991 µs, Process: 5 µs, Encoding: 1220 µs
    [20:01:44] Video frame #2306: Render: 227 µs, Process: 5 µs, Encoding: 1145 µs
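
To find such spikes in a full logfile without reading it line by line, Render times can be compared against the median. This is a minimal sketch, not part of Voukoder: the function name `render_spikes` and the factor of 10 are arbitrary choices for this example.

```python
import re
from statistics import median

# The i7-4770 excerpt from above.
LOG = """\
[20:01:44] Video frame #2297: Render: 808 µs, Process: 6 µs, Encoding: 1884 µs
[20:01:44] Video frame #2298: Render: 56 µs, Process: 3 µs, Encoding: 2543 µs
[20:01:44] Video frame #2299: Render: 19 µs, Process: 4 µs, Encoding: 1799 µs
[20:01:44] Video frame #2300: Render: 16 µs, Process: 3 µs, Encoding: 1747 µs
[20:01:44] Video frame #2301: Render: 9 µs, Process: 3 µs, Encoding: 3062 µs
[20:01:44] Video frame #2302: Render: 17 µs, Process: 3 µs, Encoding: 2303 µs
[20:01:44] Video frame #2303: Render: 8254 µs, Process: 6 µs, Encoding: 1191 µs
[20:01:44] Video frame #2304: Render: 654 µs, Process: 5 µs, Encoding: 1199 µs
[20:01:44] Video frame #2305: Render: 991 µs, Process: 5 µs, Encoding: 1220 µs
[20:01:44] Video frame #2306: Render: 227 µs, Process: 5 µs, Encoding: 1145 µs
"""

# Capture the frame number and its Render time.
FRAME = re.compile(r"#(\d+): Render: (\d+)")


def render_spikes(log: str, factor: float = 10) -> list[tuple[int, int]]:
    """Return (frame number, Render µs) for frames far above the median Render time.

    The factor is an arbitrary cut-off chosen for this sketch.
    """
    frames = [(int(number), int(us)) for number, us in FRAME.findall(log)]
    typical = median(us for _, us in frames)
    return [(number, us) for number, us in frames if us > factor * typical]


print(render_spikes(LOG))
# prints: [(2303, 8254)]
```

Run over a whole logfile, the gaps between the flagged frame numbers would show whether the spikes are periodic.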
  • But the question is: why is Voukoder 1.2.0 that much faster?

    I am debugging right now, maybe I can find something.

    What I can say so far is that the FFmpeg version doesn't matter; I tested it with both 4.0 and 4.1.

  • I found it. The problem is that CUDA is enabled in the Premiere project settings. If I disable it (set it to software only), Voukoder R2 is exactly as fast as Voukoder 1.2.0 (around 370 fps). Voukoder 1.2.0 doesn't care about the CUDA setting in Premiere, but in Voukoder R2, for some reason, the export is much slower if it's enabled.

    Latencies are also normal with CUDA disabled:

    Code
    [20:04:42] Video frame #574: Render: 38 µs, Process: 8 µs, Encoding: 2456 µs
    [20:04:42] Video frame #575: Render: 77 µs, Process: 19 µs, Encoding: 2560 µs
    [20:04:42] Video frame #576: Render: 79 µs, Process: 19 µs, Encoding: 2813 µs
    [20:04:42] Video frame #577: Render: 47 µs, Process: 15 µs, Encoding: 2729 µs
    [20:04:42] Video frame #578: Render: 84 µs, Process: 16 µs, Encoding: 2063 µs
    [20:04:42] Video frame #579: Render: 58 µs, Process: 13 µs, Encoding: 1721 µs


    But in CS6 this didn't work. In my case this doesn't matter because I am not using CS6. :)

  • So today I found this beautiful plugin, but after reading the forums it looks like my performance is not actually that good.

    It looks like I have the same problem Vogelforscher had. When encoding with the suggested settings (gpu=0 preset=slow qp=15 rc=constqp) I also get around 190 fps in AME and 200 fps in Premiere. Disabling CUDA does nothing, and disabling hardware decoding makes Premiere (or AME) unusable. CPU usage peaks at 85%, GPU at around 50%.

    I am using an i7 8750H (6 cores @ 3.9), 16 GB of RAM and a GTX 1060 (shouldn't matter, right?). I am rendering to and reading from the only SSD in the system (it should not be the bottleneck).