Zhang Haoran, Ji Xia
Acoustic point source inversion is a critical yet ill-posed inverse problem in wave physics, and it is particularly challenging when the observation data are sparse, limited-aperture, and noisy. To overcome the limitations of traditional methods and existing deep learning models in complex multi-source, strong-interference scenarios, this paper proposes a unified end-to-end inversion network based on multi-frequency data fusion and the Transformer: the Multi-Frequency Field and Count Transformer (MFFC-Former). The model aggregates the multi-frequency complex response at each measurement point into a single feature token, then uses the Transformer's self-attention mechanism to capture the global dependencies among all measurement points, so that source count classification and location indicator field regression are performed simultaneously within a single network. This end-to-end, multi-task learning paradigm eliminates complex post-processing steps (such as Markov chain Monte Carlo, MCMC) and does not rely on prior information such as noise priors, which improves solution efficiency and eases system integration. Numerical experiments under complex conditions, with up to 6 point sources and noise levels up to 20\%, show that MFFC-Former outperforms a fully connected network (MLP) baseline with a comparable number of parameters in both average localization error and count prediction accuracy. In the challenging scenario with 6 sources and 20\% noise, the MLP baseline fails to resolve all sources, whereas MFFC-Former resolves and locates every source, confirming its resolution and robustness in multi-source, strong-interference environments. These results show that using the Transformer architecture to fuse the intrinsic correlations within multi-frequency observation data is a viable pathway for solving such highly ill-posed inverse problems.
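The tokenization and attention steps described above can be illustrated with a minimal sketch in plain Python. It is not the MFFC-Former implementation. The token layout (real parts followed by imaginary parts), the identity query/key/value projections, the toy data, and the names `build_tokens` and `self_attention` are all assumptions made for illustration; the actual model would use learned projections, multiple heads and layers, and separate classification and regression heads.

```python
import math

def build_tokens(responses):
    """Turn responses[m][f] (complex field at point m, frequency f) into one
    real-valued token per measurement point: [Re(u_1..u_F), Im(u_1..u_F)].
    This layout is an assumption, not taken from the paper."""
    tokens = []
    for per_point in responses:
        real = [u.real for u in per_point]
        imag = [u.imag for u in per_point]
        tokens.append(real + imag)
    return tokens

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Single-head scaled dot-product self-attention with identity Q/K/V
    projections (no learned weights; a simplification for illustration)."""
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    mixed = []
    for query in tokens:
        # Similarity of this measurement point to every measurement point.
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in tokens]
        weights = softmax(scores)
        # Each output token is a weighted average of all input tokens.
        mixed.append([sum(w * value[d] for w, value in zip(weights, tokens))
                      for d in range(dim)])
    return mixed

# Hypothetical toy data: 4 measurement points, 3 frequencies each.
responses = [
    [complex(1.0, 0.5), complex(0.2, -0.3), complex(-0.4, 0.1)],
    [complex(0.9, 0.4), complex(0.1, -0.2), complex(-0.5, 0.2)],
    [complex(-0.3, 0.8), complex(0.7, 0.6), complex(0.0, -0.9)],
    [complex(0.4, -0.1), complex(-0.6, 0.3), complex(0.8, 0.2)],
]

tokens = build_tokens(responses)   # 4 tokens of length 2 * 3 = 6
mixed = self_attention(tokens)     # each token now depends on all 4 points
```

Because every output token is a softmax-weighted average over all input tokens, each measurement point's representation depends on the full set of observations. This is the global-dependency property the abstract attributes to self-attention.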