Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Junyi Liu

Adaptive Learning-based Surrogate Method for Stochastic Programs with Implicitly Decision-dependent Uncertainty

May 12, 2025

Boyang Shen, Junyi Liu

Abstract:We consider a class of stochastic programming problems where the implicitly decision-dependent random variable follows a nonparametric regression model with heteroscedastic error. The Clarke subdifferential and surrogate functions are not readily obtainable due to the latent decision dependency. To deal with such a computational difficulty, we develop an adaptive learning-based surrogate method that integrates the simulation scheme and statistical estimates to construct estimation-based surrogate functions in a way that the simulation process is adaptively guided by the algorithmic procedure. We establish the non-asymptotic convergence rate analysis in terms of $(\nu, \delta)$-near stationarity in expectation under variable proximal parameters and batch sizes, which exhibits the superior convergence performance and enhanced stability in both theory and practice. We provide numerical results with both synthetic and real data which illustrate the benefits of the proposed algorithm in terms of algorithmic stability and efficiency.

Via

Access Paper or Ask Questions

Good things come in small packages: Should we adopt Lite-GPUs in AI infrastructure?

Jan 17, 2025

Burcu Canakci, Junyi Liu, Xingbo Wu, Nathanaël Cheriere, Paolo Costa, Sergey Legtchenko, Dushyanth Narayanan, Ant Rowstron

Abstract:To match the blooming demand of generative AI workloads, GPU designers have so far been trying to pack more and more compute and memory into single complex and expensive packages. However, there is growing uncertainty about the scalability of individual GPUs and thus AI clusters, as state-of-the-art GPUs are already displaying packaging, yield, and cooling limitations. We propose to rethink the design and scaling of AI clusters through efficiently-connected large clusters of Lite-GPUs, GPUs with single, small dies and a fraction of the capabilities of larger GPUs. We think recent advances in co-packaged optics can be key in overcoming the communication challenges of distributing AI workloads onto more Lite-GPUs. In this paper, we present the key benefits of Lite-GPUs on manufacturing cost, blast radius, yield, and power efficiency; and discuss systems opportunities and challenges around resource, workload, memory, and network management.

* 5+ pages, 4 figures

Via

Access Paper or Ask Questions

Managed-Retention Memory: A New Class of Memory for the AI Era

Jan 16, 2025

Sergey Legtchenko, Ioan Stefanovici, Richard Black, Antony Rowstron, Junyi Liu, Paolo Costa, Burcu Canakci, Dushyanth Narayanan, Xingbo Wu

Abstract:AI clusters today are one of the major uses of High Bandwidth Memory (HBM). However, HBM is suboptimal for AI workloads for several reasons. Analysis shows HBM is overprovisioned on write performance, but underprovisioned on density and read bandwidth, and also has significant energy per bit overheads. It is also expensive, with lower yield than DRAM due to manufacturing complexity. We propose a new memory class: Managed-Retention Memory (MRM), which is more optimized to store key data structures for AI inference workloads. We believe that MRM may finally provide a path to viability for technologies that were originally proposed to support Storage Class Memory (SCM). These technologies traditionally offered long-term persistence (10+ years) but provided poor IO performance and/or endurance. MRM makes different trade-offs, and by understanding the workload IO patterns, MRM foregoes long-term data retention and write performance for better potential performance on the metrics important for these workloads.

* 8 pages (5 content + 3 refs); 1 figure

Via

Access Paper or Ask Questions

TS-SatMVSNet: Slope Aware Height Estimation for Large-Scale Earth Terrain Multi-view Stereo

Jan 02, 2025

Song Zhang, Zhiwei Wei, Wenjia Xu, Lili Zhang, Yang Wang, Jinming Zhang, Junyi Liu

Figure 1 for TS-SatMVSNet: Slope Aware Height Estimation for Large-Scale Earth Terrain Multi-view Stereo

Figure 2 for TS-SatMVSNet: Slope Aware Height Estimation for Large-Scale Earth Terrain Multi-view Stereo

Figure 3 for TS-SatMVSNet: Slope Aware Height Estimation for Large-Scale Earth Terrain Multi-view Stereo

Figure 4 for TS-SatMVSNet: Slope Aware Height Estimation for Large-Scale Earth Terrain Multi-view Stereo

Abstract:3D terrain reconstruction with remote sensing imagery achieves cost-effective and large-scale earth observation and is crucial for safeguarding natural disasters, monitoring ecological changes, and preserving the environment.Recently, learning-based multi-view stereo~(MVS) methods have shown promise in this task. However, these methods simply modify the general learning-based MVS framework for height estimation, which overlooks the terrain characteristics and results in insufficient accuracy. Considering that the Earth's surface generally undulates with no drastic changes and can be measured by slope, integrating slope considerations into MVS frameworks could enhance the accuracy of terrain reconstructions. To this end, we propose an end-to-end slope-aware height estimation network named TS-SatMVSNet for large-scale remote sensing terrain reconstruction.To effectively obtain the slope representation, drawing from mathematical gradient concepts, we innovatively proposed a height-based slope calculation strategy to first calculate a slope map from a height map to measure the terrain undulation. To fully integrate slope information into the MVS pipeline, we separately design two slope-guided modules to enhance reconstruction outcomes at both micro and macro levels. Specifically, at the micro level, we designed a slope-guided interval partition module for refined height estimation using slope values. At the macro level, a height correction module is proposed, using a learnable Gaussian smoothing operator to amend the inaccurate height values. Additionally, to enhance the efficacy of height estimation, we proposed a slope direction loss for implicitly optimizing height estimation results. Extensive experiments on the WHU-TLC dataset and MVS3D dataset show that our proposed method achieves state-of-the-art performance and demonstrates competitive generalization ability.

Via

Access Paper or Ask Questions

Integrating MedCLIP and Cross-Modal Fusion for Automatic Radiology Report Generation

Dec 10, 2024

Qianhao Han, Junyi Liu, Zengchang Qin, Zheng Zheng

Abstract:Automating radiology report generation can significantly reduce the workload of radiologists and enhance the accuracy, consistency, and efficiency of clinical documentation.We propose a novel cross-modal framework that uses MedCLIP as both a vision extractor and a retrieval mechanism to improve the process of medical report generation.By extracting retrieved report features and image features through an attention-based extract module, and integrating them with a fusion module, our method improves the coherence and clinical relevance of generated reports.Experimental results on the widely used IU-Xray dataset demonstrate the effectiveness of our approach, showing improvements over commonly used methods in both report quality and relevance.Additionally, ablation studies provide further validation of the framework, highlighting the importance of accurate report retrieval and feature integration in generating comprehensive medical reports.

* Accepted in IEEE Big Data 2024

Via

Access Paper or Ask Questions

Learning Robust Treatment Rules for Censored Data

Aug 17, 2024

Yifan Cui, Junyi Liu, Tao Shen, Zhengling Qi, Xi Chen

Abstract:There is a fast-growing literature on estimating optimal treatment rules directly by maximizing the expected outcome. In biomedical studies and operations applications, censored survival outcome is frequently observed, in which case the restricted mean survival time and survival probability are of great interest. In this paper, we propose two robust criteria for learning optimal treatment rules with censored survival outcomes; the former one targets at an optimal treatment rule maximizing the restricted mean survival time, where the restriction is specified by a given quantile such as median; the latter one targets at an optimal treatment rule maximizing buffered survival probabilities, where the predetermined threshold is adjusted to account the restricted mean survival time. We provide theoretical justifications for the proposed optimal treatment rules and develop a sampling-based difference-of-convex algorithm for learning them. In simulation studies, our estimators show improved performance compared to existing methods. We also demonstrate the proposed method using AIDS clinical trial data.

Via

Access Paper or Ask Questions

ALPS: An Auto-Labeling and Pre-training Scheme for Remote Sensing Segmentation With Segment Anything Model

Jun 16, 2024

Song Zhang, Qingzhong Wang, Junyi Liu, Haoyi Xiong

Abstract:In the fast-growing field of Remote Sensing (RS) image analysis, the gap between massive unlabeled datasets and the ability to fully utilize these datasets for advanced RS analytics presents a significant challenge. To fill the gap, our work introduces an innovative auto-labeling framework named ALPS (Automatic Labeling for Pre-training in Segmentation), leveraging the Segment Anything Model (SAM) to predict precise pseudo-labels for RS images without necessitating prior annotations or additional prompts. The proposed pipeline significantly reduces the labor and resource demands traditionally associated with annotating RS datasets. By constructing two comprehensive pseudo-labeled RS datasets via ALPS for pre-training purposes, our approach enhances the performance of downstream tasks across various benchmarks, including iSAID and ISPRS Potsdam. Experiments demonstrate the effectiveness of our framework, showcasing its ability to generalize well across multiple tasks even under the scarcity of extensively annotated datasets, offering a scalable solution to automatic segmentation and annotation challenges in the field. In addition, the proposed a pipeline is flexible and can be applied to medical image segmentation, remarkably boosting the performance. Note that ALPS utilizes pre-trained SAM to semi-automatically annotate RS images without additional manual annotations. Though every component in the pipeline has bee well explored, integrating clustering algorithms with SAM and novel pseudo-label alignment significantly enhances RS segmentation, as an off-the-shelf tool for pre-training data preparation. Our source code is available at: https://github.com/StriveZs/ALPS.

Via

Access Paper or Ask Questions

Neural Implicit Field Editing Considering Object-environment Interaction

Nov 01, 2023

Zhihong Zeng, Zongji Wang, Yuanben Zhang, Weinan Cai, Zehao Cao, Lili Zhang, Yan Guo, Yanhong Zhang, Junyi Liu

Figure 1 for Neural Implicit Field Editing Considering Object-environment Interaction

Figure 2 for Neural Implicit Field Editing Considering Object-environment Interaction

Figure 3 for Neural Implicit Field Editing Considering Object-environment Interaction

Figure 4 for Neural Implicit Field Editing Considering Object-environment Interaction

Abstract:The 3D scene editing method based on neural implicit field has gained wide attention. It has achieved excellent results in 3D editing tasks. However, existing methods often blend the interaction between objects and scene environment. The change of scene appearance like shadows is failed to be displayed in the rendering view. In this paper, we propose an Object and Scene environment Interaction aware (OSI-aware) system, which is a novel two-stream neural rendering system considering object and scene environment interaction. To obtain illuminating conditions from the mixture soup, the system successfully separates the interaction between objects and scene environment by intrinsic decomposition method. To study the corresponding changes to the scene appearance from object-level editing tasks, we introduce a depth map guided scene inpainting method and shadow rendering method by point matching strategy. Extensive experiments demonstrate that our novel pipeline produce reasonable appearance changes in scene editing tasks. It also achieve competitive performance for the rendering quality in novel-view synthesis tasks.

Via

Access Paper or Ask Questions

LFG: A Generative Network for Real-Time Recommendation

Oct 31, 2023

Junyi Liu

Figure 1 for LFG: A Generative Network for Real-Time Recommendation

Figure 2 for LFG: A Generative Network for Real-Time Recommendation

Figure 3 for LFG: A Generative Network for Real-Time Recommendation

Figure 4 for LFG: A Generative Network for Real-Time Recommendation

Abstract:Recommender systems are essential information technologies today, and recommendation algorithms combined with deep learning have become a research hotspot in this field. The recommendation model known as LFM (Latent Factor Model), which captures latent features through matrix factorization and gradient descent to fit user preferences, has given rise to various recommendation algorithms that bring new improvements in recommendation accuracy. However, collaborative filtering recommendation models based on LFM lack flexibility and has shortcomings for real-time recommendations, as they need to redo the matrix factorization and retrain using gradient descent when new users arrive. In response to this, this paper innovatively proposes a Latent Factor Generator (LFG) network, and set the movie recommendation as research theme. The LFG dynamically generates user latent factors through deep neural networks without the need for re-factorization or retrain. Experimental results indicate that the LFG recommendation model outperforms traditional matrix factorization algorithms in recommendation accuracy, providing an effective solution to the challenges of real-time recommendations with LFM.

* 9 pages, 1 figure, 4 tables. Source code would be uploaded to github soon

Via

Access Paper or Ask Questions

TCRA-LLM: Token Compression Retrieval Augmented Large Language Model for Inference Cost Reduction

Oct 25, 2023

Junyi Liu, Liangzhi Li, Tong Xiang, Bowen Wang, Yiming Qian

Abstract:Since ChatGPT released its API for public use, the number of applications built on top of commercial large language models (LLMs) increase exponentially. One popular usage of such models is leveraging its in-context learning ability and generating responses given user queries leveraging knowledge obtained by retrieval augmentation. One problem of deploying commercial retrieval-augmented LLMs is the cost due to the additionally retrieved context that largely increases the input token size of the LLMs. To mitigate this, we propose a token compression scheme that includes two methods: summarization compression and semantic compression. The first method applies a T5-based model that is fine-tuned by datasets generated using self-instruct containing samples with varying lengths and reduce token size by doing summarization. The second method further compresses the token size by removing words with lower impact on the semantic. In order to adequately evaluate the effectiveness of the proposed methods, we propose and utilize a dataset called Food-Recommendation DB (FRDB) focusing on food recommendation for women around pregnancy period or infants. Our summarization compression can reduce 65% of the retrieval token size with further 0.3% improvement on the accuracy; semantic compression provides a more flexible way to trade-off the token size with performance, for which we can reduce the token size by 20% with only 1.6% of accuracy drop.

* EMNLP 2023 Findings

Via

Access Paper or Ask Questions