This paper introduces a distributed proxy architecture for providing an immersive audio communication service to massively multi-player online games. The service enables each avatar to hear a realistic audio mix of the conversations within its hearing range. Our earlier work proposed peer-to-peer and central server architectures for this service. The distributed proxy architecture, which uses either network multicast or unicast between proxies, is introduced to address the limitations of those earlier architectures. The main focus of this paper is to evaluate the bandwidth cost savings of network multicast in the distributed proxy architecture under different avatar grouping behaviours and game player distribution scenarios. In addition, the effect of varying the number of proxy servers on communication delay and network bandwidth usage is investigated. We have developed a simulation environment that creates both the physical world (the geographic distribution of participants and the Internet topology model) and the virtual world (the distribution of avatars based on different avatar aggregation behaviours). Based on the simulation study, we provide recommendations on a cost-effective delivery architecture for this service.