Because cooling accounts for a large share of data center operating cost, improving cooling efficiency has been actively pursued for years. Beyond efficiency, the reliability of the cooling system is essential for guaranteed uptime. In traditional data center cooling system designs with N+1 or higher redundancy, all the computer room air conditioning (CRAC) units are either constantly online or cycled according to a predefined schedule. Each configuration has its drawbacks. When all CRAC units are online all the time, the data center is usually overprovisioned and cooling efficiency is low. Cycling CRAC units and turning off the backups improves efficiency, but it is difficult to schedule the cycling so that sufficient cooling provisioning is guaranteed and gross overprovisioning is avoided. In this paper, we aim to maintain data center cooling redundancy while achieving high cooling efficiency. Using model-based thermal zone mapping, we first partition the data center to achieve the desired level of cooling influence redundancy. We then design a distributed controller for each CRAC unit to regulate the thermal status within its zone of influence. The distributed controllers coordinate with each other to achieve the desired data center thermal status using the least cooling power. When CRAC units or their associated controllers fail, racks in the affected thermal zones remain within the control “radius” of other cooling controllers through predefined thermal zone overlap, and hence their thermal status is still properly managed by the active CRAC units and controllers. With this failure-resistant cooling control approach, cooling efficiency and robustness are achieved simultaneously. Greater flexibility in cooling system maintenance is also expected, since the distributed control system can automatically adapt to the new cooling facility configuration that results from maintenance.
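The zone overlap idea in the abstract can be illustrated with a small sketch. It is not the authors' model or controller: the influence matrix, the threshold, and all function names are hypothetical, chosen only to show how overlapping zones of influence keep racks covered when a CRAC unit drops out.

```python
# Illustrative sketch only: the influence values, the threshold, and the
# function names are assumptions, not taken from the paper.

def zones_of_influence(influence, threshold):
    """Map each CRAC index to the set of rack indices it influences.

    influence[r][c] is a hypothetical normalized measure (0..1) of how
    strongly CRAC c affects the inlet temperature of rack r.
    """
    n_cracs = len(influence[0])
    return {
        c: {r for r, row in enumerate(influence) if row[c] >= threshold}
        for c in range(n_cracs)
    }


def coverage(zones, n_racks, failed=frozenset()):
    """Count, per rack, how many active CRACs include it in their zone."""
    counts = [0] * n_racks
    for c, racks in zones.items():
        if c in failed:
            continue  # a failed CRAC (or controller) covers nothing
        for r in racks:
            counts[r] += 1
    return counts


def uncovered_racks(zones, n_racks, failed=frozenset()):
    """Racks left with no active CRAC after the given failures."""
    counts = coverage(zones, n_racks, failed)
    return [r for r, k in enumerate(counts) if k == 0]


# Hypothetical room with 4 racks (rows) and 3 CRAC units (columns).
INFLUENCE = [
    [0.8, 0.4, 0.0],
    [0.5, 0.6, 0.1],
    [0.1, 0.7, 0.5],
    [0.0, 0.4, 0.9],
]
ZONES = zones_of_influence(INFLUENCE, threshold=0.3)

if __name__ == "__main__":
    print("zones:", ZONES)
    print("coverage, all online:", coverage(ZONES, 4))
    print("uncovered if CRAC 1 fails:", uncovered_racks(ZONES, 4, failed={1}))
    print("uncovered if CRACs 0 and 1 fail:", uncovered_racks(ZONES, 4, failed={0, 1}))
```

With these made-up numbers every rack falls inside two zones, so any single CRAC failure leaves all racks covered, while losing CRACs 0 and 1 together leaves racks 0 and 1 uncovered. A real deployment would derive the influence measure from a thermal model and would also have to check that the surviving units have enough capacity, which this sketch ignores.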
ASME 2012 Heat Transfer Summer Conference collocated with the ASME 2012 Fluids Engineering Division Summer Meeting and the ASME 2012 10th International Conference on Nanochannels, Microchannels, and Minichannels
July 8–12, 2012
Rio Grande, Puerto Rico, USA
Conference Sponsors:
- Heat Transfer Division
ISBN:
978-0-7918-4478-6
PROCEEDINGS PAPER
Failure Resistant Data Center Cooling Control Through Model-Based Thermal Zone Mapping
Rongliang Zhou
Hewlett-Packard Company, Palo Alto, CA
Zhikui Wang
Hewlett-Packard Company, Palo Alto, CA
Cullen E. Bash
Hewlett-Packard Company, Palo Alto, CA
Tahir Cader
Hewlett-Packard Company, Liberty Lake, WA
Alan McReynolds
Hewlett-Packard Company, Palo Alto, CA
Paper No:
HT2012-58403, pp. 751-757; 7 pages
Published Online:
July 24, 2013
Citation
Zhou, R, Wang, Z, Bash, CE, Cader, T, & McReynolds, A. "Failure Resistant Data Center Cooling Control Through Model-Based Thermal Zone Mapping." Proceedings of the ASME 2012 Heat Transfer Summer Conference collocated with the ASME 2012 Fluids Engineering Division Summer Meeting and the ASME 2012 10th International Conference on Nanochannels, Microchannels, and Minichannels. Volume 2: Heat Transfer Enhancement for Practical Applications; Fire and Combustion; Multi-Phase Systems; Heat Transfer in Electronic Equipment; Low Temperature Heat Transfer; Computational Heat Transfer. Rio Grande, Puerto Rico, USA. July 8–12, 2012. pp. 751-757. ASME. https://doi.org/10.1115/HT2012-58403