Integrating rotating machinery into human-populated environments requires limiting noise emissions, and multiple factors affect the control of the amplitude and frequency of the acoustic signature. This is a key issue which, combined with the need to comply with minimum efficiency grades, further complicates the design of axial fans. The aim of this research is to assess the capability of unsupervised learning techniques to unveil the mechanisms that contribute to the sound generation process in axial fans, starting from high-fidelity simulations. To this end, a numerical dataset was generated by means of a large eddy simulation (LES) of a low-speed axial fan. The dataset is enriched with sound sources computed a posteriori by solving the perturbed convective wave equation (PCWE). First, the instantaneous flow features are associated with the sound sources through correlation matrices and then projected onto a latent basis to highlight the features with the highest importance. This analysis is also carried out on a reduced dataset, derived by considering two surfaces at 50% and 95% of the blade span. The features sampled on these surfaces are then used to train three clustering algorithms based on partitional, density, and Gaussian criteria. The clustering algorithms are optimized and their results are compared, with the Gaussian mixture model demonstrating the highest similarity (>80%). The derived clusters are analyzed, and the role of the statistical distribution of velocity and pressure gradients is underlined. This suggests that design choices affecting these aspects may be beneficial for controlling the generation of noise sources.
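The first analysis step (correlating flow features with the sound source, then projecting onto a latent basis) can be illustrated with a minimal sketch. It runs on synthetic data, not on the LES/PCWE dataset. The feature names (`grad_p`, `grad_u`, `vorticity`) are hypothetical, and the use of principal component analysis for the latent projection is an assumption, since the specific projection method is not named here.

```python
import numpy as np


def correlate_with_source(features, source):
    """Pearson correlation of each feature column with the sound source."""
    f = features - features.mean(axis=0)
    s = source - source.mean()
    denom = np.sqrt((f ** 2).sum(axis=0) * (s ** 2).sum())
    return (f * s[:, None]).sum(axis=0) / denom


def pca_loadings(features, n_components=2):
    """Principal directions and explained-variance ratios (assumed PCA via SVD)."""
    z = (features - features.mean(axis=0)) / features.std(axis=0)
    _, sing, vt = np.linalg.svd(z, full_matrices=False)
    var_ratio = sing ** 2 / (sing ** 2).sum()
    return vt[:n_components], var_ratio[:n_components]


# Synthetic stand-in for sampled flow features (hypothetical names).
rng = np.random.default_rng(0)
n = 2000
grad_p = rng.normal(size=n)      # pressure-gradient magnitude
grad_u = rng.normal(size=n)      # velocity-gradient magnitude
vorticity = rng.normal(size=n)   # unrelated to the source by construction
features = np.column_stack([grad_p, grad_u, vorticity])

# Synthetic sound source built to depend mainly on the two gradients.
source = 2.0 * grad_p + 1.0 * grad_u + 0.1 * rng.normal(size=n)

corr = correlate_with_source(features, source)
components, var_ratio = pca_loadings(features, n_components=2)

# Feature indices ordered from strongest to weakest correlation with the source.
ranking = np.argsort(-np.abs(corr))
```

In this toy setup the ranking recovers the two gradient features ahead of the unrelated one, which is the kind of importance ordering the correlation step is meant to expose. The clustering stage (partitional, density-based, and Gaussian mixture) would operate on the same feature matrix and is not sketched here.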