Thanks for the input. I will try to reproduce it over the weekend and see if I can provide a quick fix.
FYI: If you set the encoder to use a "cuda" pixel format, you first need to upload the frame from CPU memory to GPU memory using the upload filter. But that only makes sense if you want to run more or less complex filter chains on the GPU (and not on the CPU). For a standard GPU-accelerated workflow you don't need this.
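To make the difference concrete, here is a rough sketch of the two cases. It assumes the underlying tool is the ffmpeg CLI with an NVENC encoder (`h264_nvenc`), that `hwupload_cuda` is the upload filter meant above, and that `scale_cuda` stands in for a GPU filter. The helper name `build_nvenc_command` and the file names are made up for illustration. It only builds the argument list and does not run anything.

```python
from typing import List, Optional


def build_nvenc_command(
    src: str, dst: str, gpu_filters: Optional[List[str]] = None
) -> List[str]:
    """Build an ffmpeg argument list for NVENC encoding (illustrative helper).

    Without gpu_filters, frames stay in CPU memory and are handed straight
    to the encoder, so no upload filter is needed.
    With gpu_filters, frames are uploaded to GPU memory first so that the
    listed filters can run on the GPU.
    """
    cmd = ["ffmpeg", "-i", src]
    if gpu_filters:
        # Assumption: nv12 is an acceptable software format to upload.
        chain = ",".join(["format=nv12", "hwupload_cuda", *gpu_filters])
        cmd += ["-vf", chain]
    cmd += ["-c:v", "h264_nvenc", dst]
    return cmd


if __name__ == "__main__":
    # Standard workflow: no upload step.
    print(" ".join(build_nvenc_command("in.mp4", "out.mp4")))
    # GPU filter chain: upload first, then filter on the GPU.
    print(" ".join(build_nvenc_command("in.mp4", "out.mp4", ["scale_cuda=1280:720"])))
```

The upload step only appears when there is actually something to run on the GPU, which matches the point above: for plain encoding it is extra work with no benefit.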