We present an efficient implementation of Monte Carlo-based probabilistic fracture mechanics simulation for heterogeneous high-performance computing (HPC) architectures that include both CPUs and GPUs. The application focuses on large heavy-duty gas turbine rotor components for the energy sector. Reliable probabilistic risk quantification requires simulating millions to billions of Monte Carlo (MC) samples. We apply a modified Runge-Kutta algorithm to numerically integrate fatigue crack growth for this large number of cracks, with varying initial crack sizes, locations, and material and service conditions. This compute-intensive simulation has already been shown to run efficiently and scale well on parallel and distributed HPC architectures with hundreds of CPUs using the Message Passing Interface (MPI) paradigm. In this work, we go a step further and include GPUs in our parallelization strategy. We develop a load distribution scheme for sharing one or more GPUs on compute nodes distributed over a network. We detail the technical challenges of performing the simulations efficiently on GPUs, together with our solution strategies. We show that the key computation, the modified Runge-Kutta integration step, runs more than two orders of magnitude faster on a typical GPU than on a single-threaded CPU. This speed-up is supported by our use of GPU textures for efficient interpolation of the multi-dimensional tables used in the implementation. We demonstrate weak and strong scaling of our GPU implementation, i.e., that we can efficiently use a large number of GPUs/CPUs either to solve for more MC samples or to reduce the computational turnaround time, respectively. On seven different GPUs spanning four generations, the presented probabilistic fracture mechanics simulation tool, ProbFM, achieves speed-ups ranging from 16.4x to 47.4x over a single-threaded CPU implementation.
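To make the structure of the computation concrete, the following is a minimal serial sketch of one way such a Monte Carlo crack growth loop could look. It is not the ProbFM implementation. It assumes a simple Paris-law growth rate and a classic fourth-order Runge-Kutta step as stand-ins, since the modified Runge-Kutta scheme and the tabulated material data are not specified here. All names and numeric values (`C`, `M`, `Y`, `DELTA_SIGMA`, `A_CRIT`, the lognormal initial crack size) are hypothetical and chosen only for illustration.

```python
import math
import random

# All values below are illustrative assumptions, not data from the paper.
C = 1.0e-11          # Paris coefficient (hypothetical; m/cycle with dK in MPa*sqrt(m))
M = 3.0              # Paris exponent (hypothetical)
Y = 1.12             # geometry factor (hypothetical)
DELTA_SIGMA = 400.0  # stress range per cycle in MPa (hypothetical)
A_CRIT = 0.02        # critical crack size in m (hypothetical)


def growth_rate(a):
    """Paris law stand-in: da/dN = C * (Y * dsigma * sqrt(pi * a))**M."""
    delta_k = Y * DELTA_SIGMA * math.sqrt(math.pi * a)
    return C * delta_k ** M


def rk4_step(a, dn):
    """One classic fourth-order Runge-Kutta step over dn load cycles."""
    k1 = growth_rate(a)
    k2 = growth_rate(a + 0.5 * dn * k1)
    k3 = growth_rate(a + 0.5 * dn * k2)
    k4 = growth_rate(a + dn * k3)
    return a + dn * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0


def cycles_to_failure(a0, n_max, dn):
    """Return the cycle count at which the crack reaches A_CRIT,
    or None if it survives n_max cycles."""
    a, n = a0, 0
    while n < n_max:
        a = rk4_step(a, dn)
        n += dn
        if a >= A_CRIT:
            return n
    return None


def failure_probability(n_samples, n_max, dn=50, seed=0):
    """Estimate the fraction of sampled cracks that fail within n_max cycles."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(n_samples):
        # Hypothetical lognormal initial crack size with a median of 1 mm.
        a0 = rng.lognormvariate(math.log(1.0e-3), 0.5)
        if a0 >= A_CRIT or cycles_to_failure(a0, n_max, dn) is not None:
            failures += 1
    return failures / n_samples


if __name__ == "__main__":
    print(failure_probability(n_samples=500, n_max=10000))
```

Each sample in the loop is independent of all others, which is the property that allows the samples to be partitioned across MPI ranks and GPU threads. In a GPU version, the body of `cycles_to_failure` would plausibly become the per-thread kernel work, with the analytic `growth_rate` replaced by table interpolation.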