>>1061
Isn't this pretty much how Mesa already does this shit? Yeah, I know it translates from OpenGL to TGSI first, but those TGSI instructions are then translated to either native GPU code or CPU code, depending on whether the GPU is able to perform the operation.
There are some situations where not having certain operations on the GPU is slow as fuck though, since it may involve entire arrays of data that have to be copied out of the GPU, have this one fucking operation done on them, and then be copied back. Bonus points if the next operation needs the GPU again and the one after that has to be done on the CPU again. You can actually out-slow a pure-CPU implementation quite easily this way, so a modern GPU would still be beneficial even if it isn't strictly required.
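
Rough toy model of that ping-pong, in Python. To be clear, this is just a sketch: the cost constants are made-up units, not measurements, and `hybrid_cost` / `pure_cpu_cost` are names invented here for illustration, nothing to do with Mesa's actual code. It assumes the data starts out on the GPU for the hybrid case and on the CPU for the pure-CPU case.

```python
# Toy cost model. All numbers are hypothetical units, not benchmarks.
GPU_OP_COST = 1.0     # running one operation on the GPU
CPU_OP_COST = 10.0    # running the same operation on the CPU
TRANSFER_COST = 50.0  # copying the whole array across the bus, one way


def hybrid_cost(ops, gpu_supported):
    """Cost of running ops in order, falling back to the CPU for any op
    the GPU cannot do. Every switch of location pays one transfer."""
    location = "gpu"  # assumption: the data starts on the GPU
    total = 0.0
    for op in ops:
        target = "gpu" if op in gpu_supported else "cpu"
        if target != location:
            total += TRANSFER_COST  # copy the array to the other side
            location = target
        total += GPU_OP_COST if target == "gpu" else CPU_OP_COST
    return total


def pure_cpu_cost(ops):
    """Cost of doing everything on the CPU, with no transfers at all."""
    return len(ops) * CPU_OP_COST


if __name__ == "__main__":
    ops = ["a", "b", "c", "d"]
    # GPU can only do "a" and "c", so every step changes sides.
    print("hybrid:  ", hybrid_cost(ops, {"a", "c"}))
    print("pure cpu:", pure_cpu_cost(ops))
    print("full gpu:", hybrid_cost(ops, set(ops)))
```

With these invented numbers the alternating case comes out far worse than just doing everything on the CPU, and a GPU that supports every op wins by a mile. How bad it gets in practice depends on how expensive the copies really are relative to the work.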