Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Junzhou He

StackSight: Unveiling WebAssembly through Large Language Models and Neurosymbolic Chain-of-Thought Decompilation

Jun 07, 2024

Weike Fang, Zhejian Zhou, Junzhou He, Weihang Wang

Abstract:WebAssembly enables near-native execution in web applications and is increasingly adopted for tasks that demand high performance and robust security. However, its assembly-like syntax, implicit stack machine, and low-level data types make it extremely difficult for human developers to understand, spurring the need for effective WebAssembly reverse engineering techniques. In this paper, we propose StackSight, a novel neurosymbolic approach that combines Large Language Models (LLMs) with advanced program analysis to decompile complex WebAssembly code into readable C++ snippets. StackSight visualizes and tracks virtual stack alterations via a static analysis algorithm and then applies chain-of-thought prompting to harness LLM's complex reasoning capabilities. Evaluation results show that StackSight significantly improves WebAssembly decompilation. Our user study also demonstrates that code snippets generated by StackSight have significantly higher win rates and enable a better grasp of code semantics.

* 9 pages. In the Proceedings of the 41st International Conference on Machine Learning (ICML' 24)

Via

Access Paper or Ask Questions

Attention: Large Multimodal Model is Watching your Geo-privacy

Nov 21, 2023

Yifan Yang, Yixian Zhang, Daoyang Li, Shuju Sun, Junhong Duan, Junzhou He, Qingyang Wu, Hao Liu

Figure 1 for Attention: Large Multimodal Model is Watching your Geo-privacy

Figure 2 for Attention: Large Multimodal Model is Watching your Geo-privacy

Abstract:Geographic privacy, a crucial aspect of personal security, often goes unnoticed in daily activities. This paper addresses the underestimation of this privacy in the context of increasing online data sharing and the advancements in information gathering technologies. With the surge in the use of Large Multimodal Models, such as GPT-4, for Open Source Intelligence (OSINT), the potential risks associated with geographic privacy breaches have intensified. This study highlights the criticality of these developments, focusing on their implications for individual privacy. The primary objective is to demonstrate the capabilities of advanced AI tools, specifically a GPT-4 based model named "Dr. Watson," in identifying and potentially compromising geographic privacy through online shared content. We developed "Dr. Watson" to analyze and extract geographic information from publicly available data sources. The study involved five experimental cases, each offering different perspectives on the tool's application in extracting precise location data from partial images and social media content. The experiments revealed that "Dr. Watson" could successfully identify specific geographic details, thereby exposing the vulnerabilities in current geo-privacy measures. These findings underscore the ease with which geographic information can be unintentionally disclosed. The paper concludes with a discussion on the broader implications of these findings for individuals and the community at large. It emphasizes the urgency for enhanced awareness and protective measures against geo-privacy leakage in the era of advanced AI and widespread social media usage.

* 14 pages, 5 figures

Via

Access Paper or Ask Questions