Right. I assume this is because the framebuffer must be copied from one GPU (the game) to the other (the OBS compositor / NVENC encoder) over the PCIe bus? I imagine the copy would also be staged through system RAM rather than going directly from card to card over PCIe, which would add yet more overhead? Then again, I'm not sure how this differs from copying that same framebuffer to a capture card or network card for export to a dedicated streaming PC.
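For a rough sense of scale, here is a back-of-the-envelope sketch of how much data such a copy would move per second. The resolution, pixel format, and frame rate are my own assumptions (nothing in this thread specifies them), and the `hops` parameter is a hypothetical stand-in for "copied once directly" versus "copied twice via system RAM".

```python
def framebuffer_copy_rate(width, height, bytes_per_pixel, fps, hops=1):
    """Bytes per second moved when every frame is copied `hops` times."""
    return width * height * bytes_per_pixel * fps * hops


# Assumed values, not taken from the thread: 1080p, 4 bytes per pixel, 60 fps.
direct = framebuffer_copy_rate(1920, 1080, 4, 60)           # one GPU-to-GPU copy
via_ram = framebuffer_copy_rate(1920, 1080, 4, 60, hops=2)  # GPU -> RAM -> GPU

print(f"direct copy:         {direct / 1e9:.2f} GB/s")
print(f"staged through RAM:  {via_ram / 1e9:.2f} GB/s")
```

This only counts bytes on the bus. It says nothing about latency or about either GPU stalling while it waits for the copy, which may be the more relevant cost.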
I agree with you here, but some people clearly don't understand why using a second card for encoding can sometimes provide an improvement, or what the reasons for and limits of those improvements are, so I would like to be able to give them specifics. The results in the video I linked above raise a lot of "Why is that?" questions. We know it is not the encoder, since that is not the same hardware that processes the game. We know (from your posts here) that it is not the compositing of the scene, since that also occurs on the primary monitor's GPU... Could it just be from drawing the preview on screen?
Please don't feel like I am arguing with you here. I just want to be fully informed, and I know that you know your stuff. I'm not fighting you; I'm trying to learn so that I can share it with others :)