Check out the power draw metrics. Going by the combined CPU+GPU power consumption, it looks like it averaged about 22W for roughly a minute. Unless I'm missing something, that's 22 W × 60 s ≈ 1,320 J, so the inference for this example consumed at most about 0.0004 kWh.<p>That's almost nothing. If these models are capable enough for most day-to-day uses, then useful LLM-based GenAI is already at the "too cheap to meter" stage.
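<p>For anyone who wants to check the arithmetic, here is a minimal sketch of the conversion. The 22 W and 60 s figures are the rough averages read off the power metrics above. The function names and the electricity price are my own assumptions for illustration, not anything from the benchmark.

```python
JOULES_PER_KWH = 3_600_000  # 1 kWh = 1000 W * 3600 s


def energy_kwh(avg_power_watts: float, duration_seconds: float) -> float:
    """Convert an average power draw over a duration into kilowatt-hours."""
    joules = avg_power_watts * duration_seconds
    return joules / JOULES_PER_KWH


def cost_usd(kwh: float, price_per_kwh: float) -> float:
    """Electricity cost for a given energy use at a given price."""
    return kwh * price_per_kwh


# Rough figures from the power metrics: ~22 W average for ~1 minute.
energy = energy_kwh(22, 60)
print(f"{energy:.6f} kWh")  # 0.000367 kWh

# Hypothetical price of $0.15/kWh, an assumption; plug in your own rate.
ASSUMED_PRICE_PER_KWH = 0.15
print(f"${cost_usd(energy, ASSUMED_PRICE_PER_KWH):.6f} per run")
```

At any plausible residential rate, that comes to a small fraction of a cent per run, which is the point of the "too cheap to meter" remark.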