With the rapid growth in demand for distributed computing, data centers have become a critical physical component of the “cloud.” Recent studies show that the energy data centers consume for both cooling and computing keeps increasing, and the growth in server power densities makes it ever more challenging to keep servers below their maximum operating temperature. This paper presents a new dynamic load-balancing approach based on individual server central processing unit (CPU) temperatures. In this approach, a load balancer assigns each task to a server in real time with the objective of keeping CPU temperatures below a maximum value. Experimental studies are conducted in a single rack using production workload traces from Google clusters. The study also compares the performance of this method with two other load-balancing approaches, Round Robin and a CPU utilization-based method, in terms of temperature distributions, local fan rotation speeds, system loads, and server processing times. Furthermore, we investigate how the effect of the proposed load balancing changes with the type of application assumed to run on the servers. The results indicate that the new method can reduce both server CPU temperatures and local fan rotation speeds in a rack more effectively than the other two approaches, especially for most web applications.
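To make the three assignment policies concrete, the following is a minimal Python sketch, not the implementation evaluated in this paper. The `Server` fields, the 75 °C limit, the rule of picking the coolest server below the limit, and the fallback used when every server exceeds the limit are all illustrative assumptions.

```python
from dataclasses import dataclass
from itertools import count
from typing import Callable, List


@dataclass
class Server:
    name: str
    cpu_temp_c: float  # latest CPU temperature reading, in degrees Celsius
    cpu_util: float    # CPU utilization as a fraction in [0, 1]


def make_round_robin() -> Callable[[List[Server]], Server]:
    """Return a picker that cycles through the servers in order."""
    counter = count()

    def pick(servers: List[Server]) -> Server:
        return servers[next(counter) % len(servers)]

    return pick


def pick_least_utilized(servers: List[Server]) -> Server:
    """Utilization-based policy: choose the server with the lowest CPU load."""
    return min(servers, key=lambda s: s.cpu_util)


def pick_by_temperature(servers: List[Server], t_max_c: float = 75.0) -> Server:
    """Temperature-based policy (assumed rule): choose the coolest server
    among those below t_max_c. If none is below the limit, fall back to
    the coolest server overall."""
    below_limit = [s for s in servers if s.cpu_temp_c < t_max_c]
    candidates = below_limit or servers
    return min(candidates, key=lambda s: s.cpu_temp_c)


# Hypothetical rack state, for illustration only.
rack = [
    Server("s1", cpu_temp_c=68.0, cpu_util=0.30),
    Server("s2", cpu_temp_c=55.0, cpu_util=0.80),
    Server("s3", cpu_temp_c=77.0, cpu_util=0.10),
]

round_robin = make_round_robin()
next_by_rr = round_robin(rack)
next_by_util = pick_least_utilized(rack)
next_by_temp = pick_by_temperature(rack)
```

In this made-up rack state, the utilization-based policy selects `s3` even though it is already above the assumed limit, while the temperature-based policy selects `s2`, the coolest server. The example only illustrates how the two selection criteria can diverge; it says nothing about the measured results.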