On the evening of June 23, QoderWork, Alibaba's intelligent Agent platform, officially launched "Off-Peak Tokens," billed as the first pricing model of its kind in China's AI industry. According to official sources, Agents running between 22:00 and 08:00 the next day automatically qualify for the off-peak discount, with the flagship Qwen3.7-Max model discounted by up to 80% (that is, priced as low as 20% of the standard rate).
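The discount arithmetic can be sketched in a few lines of Python. This is an illustration only, not QoderWork's billing logic: the function names, the per-million-token price, and the treatment of the window as half-open (22:00 inclusive, 08:00 exclusive) are assumptions, and the sketch applies the maximum 80% discount even though the announcement says "up to."

```python
from datetime import time

OFF_PEAK_START = time(22, 0)  # window opens at 22:00 (from the announcement)
OFF_PEAK_END = time(8, 0)     # window closes at 08:00 the next day
OFF_PEAK_FACTOR = 0.20        # assumption: the full 80% discount applies


def is_off_peak(t: time) -> bool:
    """Return True if local time t falls in the 22:00-08:00 window.

    The window wraps past midnight, so it is the union of
    [22:00, 24:00) and [00:00, 08:00). Boundary handling is assumed.
    """
    return t >= OFF_PEAK_START or t < OFF_PEAK_END


def token_cost(tokens: int, price_per_million: float, t: time) -> float:
    """Cost of a run, given a hypothetical standard price per million tokens."""
    factor = OFF_PEAK_FACTOR if is_off_peak(t) else 1.0
    return tokens / 1_000_000 * price_per_million * factor
```

For example, with a hypothetical standard price of 10 currency units per million tokens, a 2-million-token run would cost 20 units at 14:00 but about 4 units at 23:00.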
This billing model is designed to significantly lower the operating costs of AI Agents. In practice, users can set up scheduled tasks during the day or submit complex, long-running instructions just before going to bed. The Agent then executes the entire pipeline autonomously during the off-peak night hours, so users can collect the results first thing in the morning. The discount is currently supported across several products, including QoderWork and Qoder Desktop.
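A client-side scheduler that defers work to the discount window might compute its start time as in the sketch below. This is a hypothetical helper, not part of any QoderWork or Qoder Desktop API; it assumes naive local timestamps in whichever time zone the platform uses for billing, which the announcement does not specify.

```python
from datetime import datetime, time

WINDOW_OPEN = time(22, 0)   # off-peak starts at 22:00 (from the announcement)
WINDOW_CLOSE = time(8, 0)   # off-peak ends at 08:00 the next day


def next_off_peak_start(now: datetime) -> datetime:
    """Return the earliest moment at or after `now` inside the off-peak window.

    If `now` is already off-peak, the task can start immediately;
    otherwise it waits until 22:00 on the same day.
    """
    if now.time() >= WINDOW_OPEN or now.time() < WINDOW_CLOSE:
        return now
    return datetime.combine(now.date(), WINDOW_OPEN)
```

The delay before launching a task is then `(next_off_peak_start(now) - now).total_seconds()`, which is zero whenever the current time is already inside the window.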
[AgentUpdate In-Depth Analysis] Alibaba's introduction of "Off-Peak Tokens" marks a notable shift in the unit economics of the AI Agent ecosystem. Time-of-use pricing is a staple of traditional cloud computing and utility grids, and applying it to LLM inference is a strategic fit for agentic workflows. Unlike standard chat interfaces, AI Agents rely heavily on autonomous multi-step reasoning and iterative execution, which inherently consume large volumes of tokens. By shifting non-urgent, asynchronous batch jobs to underutilized nighttime compute capacity, providers can improve server utilization while substantially lowering the barrier to entry for developers. This cost optimization matters for the transition from simple chatbots to robust, production-grade enterprise AI Agents that run continuously in the background.