Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Zepeng Liu

On Simplifying Large-Scale Spatial Vectors: Fast, Memory-Efficient, and Cost-Predictable k-means

Dec 03, 2024

Yushuai Ji, Zepeng Liu, Sheng Wang, Yuan Sun, Zhiyong Peng

Figure 1 for On Simplifying Large-Scale Spatial Vectors: Fast, Memory-Efficient, and Cost-Predictable k-means

Figure 2 for On Simplifying Large-Scale Spatial Vectors: Fast, Memory-Efficient, and Cost-Predictable k-means

Figure 3 for On Simplifying Large-Scale Spatial Vectors: Fast, Memory-Efficient, and Cost-Predictable k-means

Figure 4 for On Simplifying Large-Scale Spatial Vectors: Fast, Memory-Efficient, and Cost-Predictable k-means

Abstract:The k-means algorithm can simplify large-scale spatial vectors, such as 2D geo-locations and 3D point clouds, to support fast analytics and learning. However, when processing large-scale datasets, existing k-means algorithms have been developed to achieve high performance with significant computational resources, such as memory and CPU usage time. These algorithms, though effective, are not well-suited for resource-constrained devices. In this paper, we propose a fast, memory-efficient, and cost-predictable k-means called Dask-means. We first accelerate k-means by designing a memory-efficient accelerator, which utilizes an optimized nearest neighbor search over a memory-tunable index to assign spatial vectors to clusters in batches. We then design a lightweight cost estimator to predict the memory cost and runtime of the k-means task, allowing it to request appropriate memory from devices or adjust the accelerator's required space to meet memory constraints, and ensure sufficient CPU time for running k-means. Experiments show that when simplifying datasets with scale such as $10^6$, Dask-means uses less than $30$MB of memory, achieves over $168$ times speedup compared to the widely-used Lloyd's algorithm. We also validate Dask-means on mobile devices, where it demonstrates significant speedup and low memory cost compared to other state-of-the-art (SOTA) k-means algorithms. Our cost estimator estimates the memory cost with a difference of less than $3\%$ from the actual ones and predicts runtime with an MSE up to $33.3\%$ lower than SOTA methods.

Via

Access Paper or Ask Questions

SAGE-NDVI: A Stereotype-Breaking Evaluation Metric for Remote Sensing Image Dehazing Using Satellite-to-Ground NDVI Knowledge

Jun 09, 2023

Zepeng Liu, Zhicheng Yang, Mingye Zhu, Andy Wong, Yibing Wei, Mei Han, Jun Yu, Jui-Hsin Lai

Abstract:Image dehazing is a meaningful low-level computer vision task and can be applied to a variety of contexts. In our industrial deployment scenario based on remote sensing (RS) images, the quality of image dehazing directly affects the grade of our crop identification and growth monitoring products. However, the widely used peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) provide ambiguous visual interpretation. In this paper, we design a new objective metric for RS image dehazing evaluation. Our proposed metric leverages a ground-based phenology observation resource to calculate the vegetation index error between RS and ground images at a hazy date. Extensive experiments validate that our metric appropriately evaluates different dehazing models and is in line with human visual perception.

* Accepted by ICME 2023 Industry Track

Via

Access Paper or Ask Questions