Robust GNSS positioning in urban environments is still plagued by multipath effects, largely because of the complex signal propagation induced by ubiquitous surfaces with varied radio-frequency reflectivities. Current 3D Mapping Aided (3DMA) GNSS techniques show great potential for mitigating multipath but face a critical trade-off between computational efficiency and modeling accuracy. Most approaches rely on offline 3D maps that are outdated or oversimplified. Real-time LiDAR-based reconstruction offers high accuracy but is unreliable under low laser reflectivity conditions. Camera-based 3DMA is a good candidate for balancing accuracy and efficiency, but current methods reconstruct far too slowly for real-time multipath-mitigated navigation. This paper proposes an accelerated framework that combines camera multi-view stereo (MVS) reconstruction with ray tracing. Based on hypotheses about surface texture, an orthogonal visual feature fusion framework is proposed that robustly handles both texture-rich and texture-poor surfaces, alleviating the reflectivity challenges in visual reconstruction. A polygonal surface modeling scheme is further integrated to accurately delineate complex building boundaries, enhancing the reconstruction granularity. To avoid reconstructing more detail than multipath estimation requires, multi-plane fitting of the reprojected point cloud and two complexity control strategies are proposed, thereby improving multipath estimation speed. Experiments were conducted in Lujiazui, Shanghai, a typical multipath-prone district. The results show that the method achieves an average reconstruction accuracy of 2.4 meters in dense urban environments featuring glass curtain wall structures, a traditionally difficult case for reconstruction, and performs ray-tracing-based multipath correction at a rate of 30 image frames per second, 10 times faster than contemporary benchmarks.
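
To make the plane-fitting and ray-tracing idea concrete, the following is a minimal sketch and not the method of this paper. It assumes a single specular reflection off an infinite plane and a far-field satellite (plane-wave approximation), and it uses a plain least-squares plane fit via SVD. The function names `fit_plane` and `excess_path` are hypothetical; polygon boundaries, occlusion, and the paper's complexity control strategies are not modeled.

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane through an (N, 3) array; returns (unit normal, centroid)."""
    centroid = points.mean(axis=0)
    # The right singular vector with the smallest singular value is the plane normal.
    _, _, vt = np.linalg.svd(points - centroid)
    return vt[-1], centroid

def excess_path(receiver, sat_dir, normal, point_on_plane):
    """Extra path length (m) of one specular reflection relative to the direct path.

    Assumes an infinite reflecting plane and a far-field satellite.
    Returns None when the satellite is not on the receiver's side of the plane.
    """
    u = sat_dir / np.linalg.norm(sat_dir)   # unit vector receiver -> satellite
    n = normal / np.linalg.norm(normal)
    d = np.dot(receiver - point_on_plane, n)
    if d < 0:                               # orient the normal toward the receiver
        n, d = -n, -d
    cos_inc = np.dot(u, n)
    if cos_inc <= 0:
        return None
    # Image method: mirroring the receiver across the plane gives 2 * d * (u . n).
    return 2.0 * d * cos_inc

# Hypothetical wall at x = 10 m, receiver at the origin, satellite at 60 deg elevation.
wall = np.array([[10.0, 0, 0], [10, 5, 0], [10, 0, 5], [10, 5, 5]])
normal, centroid = fit_plane(wall)
sat = np.array([-np.cos(np.radians(60)), 0.0, np.sin(np.radians(60))])
delay_m = excess_path(np.zeros(3), sat, normal, centroid)
```

For a vertical wall this reduces to the familiar 2·d·cos(elevation) when the satellite azimuth is perpendicular to the wall, which is why a coarse planar model can already capture the dominant multipath delay.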