Heating, Ventilation, and Air Conditioning (HVAC) systems are major energy consumers in buildings, and operating them requires balancing energy efficiency against occupant comfort. While prior research has explored generative AI for HVAC control in simulation, real-world validation remains scarce. This study addresses this gap by designing, deploying, and evaluating “Office-in-the-Loop,” a novel cyber-physical system that applies generative AI within an operational office setting. Building on multimodal foundation models and Agentic AI, our system integrates real-time environmental sensor data (temperature, occupancy, etc.), occupants’ subjective thermal comfort feedback, and historical context into input prompts from which the generative AI dynamically predicts optimal HVAC temperature setpoints. Extensive real-world experiments demonstrate significant energy savings (up to 47.92%) while simultaneously improving comfort (up to 26.36%) compared to baseline operation. Regression analysis confirms the robustness of our approach against confounding variables such as outdoor conditions and occupancy levels. Furthermore, we introduce Data-Driven Reasoning using Agentic AI and find that prompting the AI for data-grounded rationales significantly enhances prediction stability and enables the inference of system dynamics and cost functions, bypassing the need for traditional reinforcement learning paradigms. This work bridges simulation and reality, showcasing generative AI’s potential for efficient, comfortable building environments and indicating future scalability to large systems such as data centers.