GPT-3 has roughly 350 GB of weights (175B parameters at fp16) and would need at least 8 of the latest 80GB A100s. That would raise the cloud cost quite a bit, from about $1.90/hour to $19/hour. The A100s are also roughly twice as fast as V100s, which offsets some of the cost. Overall, I estimate the margin would drop from 60x to about 10x.
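A quick back-of-the-envelope check of the margin estimate above. The hourly costs, the 2x speedup, and the 60x current margin are the figures assumed in the comment, not official numbers:

```python
# Assumed figures from the comment: V100 node ~$1.90/hr, 8x A100 node ~$19/hr,
# A100 ~2x V100 throughput, current margin ~60x.
v100_cost_per_hr = 1.90
a100_cost_per_hr = 19.00
a100_speedup = 2.0

# Effective cost per generated token scales with $/hr divided by throughput,
# so a 10x price bump at 2x speed is a ~5x cost increase per token.
cost_multiplier = (a100_cost_per_hr / v100_cost_per_hr) / a100_speedup  # ~5x
new_margin = 60 / cost_multiplier  # ~12x, in the same ballpark as the 10x estimate
print(cost_multiplier, new_margin)
```

So the stated 10x is consistent with these assumptions, with a little slack for overhead.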
Does the prompt also consume tokens?
I doubt you could feed in a book and get its sentiment back for the cost of 5-6 tokens, right?
10M tokens / $400 = 25k tokens / $
When you say 17 requests / $, are you assuming each request is 1,500 tokens long?
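The arithmetic in the two lines above checks out; a minimal sketch, assuming the $400 / 10M-token plan and the 1,500-token request size under discussion:

```python
# Figures from the thread: $400 buys 10M tokens; assume ~1,500 tokens per request.
tokens_per_plan = 10_000_000
plan_cost_usd = 400
tokens_per_request = 1_500

tokens_per_dollar = tokens_per_plan / plan_cost_usd           # 25,000 tokens / $
requests_per_dollar = tokens_per_dollar / tokens_per_request  # ~16.7, i.e. "17 requests / $"
print(tokens_per_dollar, requests_per_dollar)
```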
Does the top-up pack cost 6 cents for 1k tokens or 1 token?