Tianyu Wu

You Know What I'm Saying -- Jailbreak Attack via Implicit Reference

Oct 04, 2024

Tianyu Wu, Lingrui Mei, Ruibin Yuan, Lujun Li, Wei Xue, Yike Guo

Abstract: While recent advancements in large language model (LLM) alignment have enabled the effective identification of malicious objectives involving scene nesting and keyword rewriting, our study reveals that these methods remain inadequate at detecting malicious objectives expressed through context within nested harmless objectives. This study identifies a previously overlooked vulnerability, which we term Attack via Implicit Reference (AIR). AIR decomposes a malicious objective into permissible objectives and links them through implicit references within the context. This method employs multiple related harmless objectives to generate malicious content without triggering refusal responses, thereby effectively bypassing existing detection techniques. Our experiments demonstrate AIR's effectiveness across state-of-the-art LLMs, achieving an attack success rate (ASR) exceeding 90% on most models, including GPT-4o, Claude-3.5-Sonnet, and Qwen-2-72B. Notably, we observe an inverse scaling phenomenon, where larger models are more vulnerable to this attack method. These findings underscore the urgent need for defense mechanisms capable of understanding and preventing contextual attacks. Furthermore, we introduce a cross-model attack strategy that leverages less secure models to generate malicious contexts, thereby further increasing the ASR when targeting other models. Our code and jailbreak artifacts can be found at https://github.com/Lucas-TY/llm_Implicit_reference.

Optimizing Robotic Manipulation with Decision-RWKV: A Recurrent Sequence Modeling Approach for Lifelong Learning

Jul 23, 2024

Yujian Dong, Tianyu Wu, Chaoyang Song

Abstract: Models based on the Transformer architecture have seen widespread application across fields such as natural language processing, computer vision, and robotics, with large language models like ChatGPT revolutionizing machine understanding of human language and demonstrating impressive memory and reproduction capabilities. Traditional machine learning algorithms struggle with catastrophic forgetting, which is detrimental to the diverse and generalized abilities required for robotic deployment. This paper investigates the Receptance Weighted Key Value (RWKV) framework, known for its advanced capabilities in efficient and effective sequence modeling, and its integration with the decision transformer and experience replay architectures. It focuses on potential performance enhancements in sequential decision-making and lifelong robotic learning tasks. We introduce the Decision-RWKV (DRWKV) model and conduct extensive experiments using the D4RL database within the OpenAI Gym environment and on the D'Claw platform to assess the DRWKV model's performance in single-task tests and lifelong learning scenarios, showcasing its ability to handle multiple subtasks efficiently. The code for all algorithms, training, and image rendering in this study is open-sourced at https://github.com/ancorasir/DecisionRWKV.

* 14 pages, 7 figures, 1 table, submitted to the Special Issue on Large Language Models In Design And Manufacturing in the Journal of Computing and Information Science in Engineering, see https://github.com/ancorasir/DecisionRWKV

Hybrid Control Policy for Artificial Pancreas via Ensemble Deep Reinforcement Learning

Jul 14, 2023

Wenzhou Lv, Tianyu Wu, Luolin Xiong, Liang Wu, Jian Zhou, Yang Tang, Feng Qian

Abstract: Objective: The artificial pancreas (AP) has shown promising potential in achieving closed-loop glucose control for individuals with type 1 diabetes mellitus (T1DM). However, designing an effective control policy for the AP remains challenging due to the complex physiological processes, delayed insulin response, and inaccurate glucose measurements. While model predictive control (MPC) offers safety and stability through the dynamic model and safety constraints, it lacks individualization and is adversely affected by unannounced meals. Conversely, deep reinforcement learning (DRL) provides personalized and adaptive strategies but faces challenges with distribution shifts and substantial data requirements. Methods: We propose a hybrid control policy for the artificial pancreas (HyCPAP) to address the above challenges. HyCPAP combines an MPC policy with an ensemble DRL policy, leveraging the strengths of both policies while compensating for their respective limitations. To facilitate faster deployment of AP systems in real-world settings, we further incorporate meta-learning techniques into HyCPAP, leveraging previous experience and patient-shared knowledge to enable fast adaptation to new patients with limited available data. Results: We conduct extensive experiments using the FDA-accepted UVA/Padova T1DM simulator across three scenarios. Our approaches achieve the highest percentage of time spent in the desired euglycemic range and the lowest occurrences of hypoglycemia. Conclusion: The results clearly demonstrate the superiority of our methods for closed-loop glucose management in individuals with T1DM. Significance: The study presents novel control policies for AP systems, affirming the great potential of the proposed methods for efficient closed-loop glucose control.

* 12 pages
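
The two outcome measures named in the abstract above (time in the euglycemic range and occurrences of hypoglycemia) can be sketched as simple counts over a glucose trace. This is a minimal illustration, not the authors' evaluation code. The function name `glucose_summary`, the 70 to 180 mg/dL range bounds, and the assumption of equally spaced readings are all choices made here for illustration; the abstract does not state them.

```python
def glucose_summary(readings_mg_dl, low=70.0, high=180.0):
    """Summarise a glucose trace of equally spaced readings (mg/dL).

    The default bounds are an assumed euglycemic range, not taken
    from the paper. Returns percentages of readings in range and
    below range.
    """
    n = len(readings_mg_dl)
    if n == 0:
        raise ValueError("need at least one reading")
    # Readings inside the target range, bounds included.
    in_range = sum(1 for g in readings_mg_dl if low <= g <= high)
    # Readings below the lower bound count as hypoglycemic.
    hypo = sum(1 for g in readings_mg_dl if g < low)
    return {
        "time_in_range_pct": 100.0 * in_range / n,
        "hypo_pct": 100.0 * hypo / n,
    }


if __name__ == "__main__":
    # Synthetic readings for illustration only.
    print(glucose_summary([65, 80, 120, 190]))
```

Because the readings are assumed to be equally spaced, the share of readings in range equals the share of time in range; unevenly sampled data would need time weighting.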

Random Copolymer inverse design system orienting on Accurate discovering of Antimicrobial peptide-mimetic copolymers

Dec 08, 2022

Tianyu Wu, Yang Tang

Abstract: Antimicrobial resistance is one of the biggest health problems, especially in the current period of the COVID-19 pandemic. Owing to their unique membrane-destruction bactericidal mechanism, antimicrobial peptide-mimetic copolymers have attracted increasing attention, and it is urgent to find more potential candidates with broad-spectrum antibacterial efficacy and low toxicity. Artificial intelligence has shown significant performance on small-molecule and biotech drugs; however, the higher dimensionality of polymer space and the limited experimental data restrict the application of existing methods to copolymer design. Herein, we develop a universal random copolymer inverse design system via multi-modal copolymer representation learning, knowledge distillation and reinforcement learning. Our system achieves high-precision antimicrobial activity prediction with few-shot data by extracting various chemical information from multi-modal copolymer representations. By pre-training a scaffold-decorator generative model via knowledge distillation, the copolymer space is greatly contracted to the neighborhood of existing data for exploration. Thus, our reinforcement learning algorithm can adapt to customized generation for specific scaffolds and for requirements on properties or structures. We apply our system to collected antimicrobial peptide-mimetic copolymer data, and we discover candidate copolymers with desired properties.

* We have decided to make substantial modifications to this paper and would like to add further evidence for the final generated results to demonstrate their validity. We therefore hope this work can be temporarily withdrawn, and we will release the final version in the near future. Thank you.

Molecular Joint Representation Learning via Multi-modal Information

Nov 25, 2022

Tianyu Wu, Yang Tang, Qiyu Sun, Luolin Xiong

Abstract: In recent years, artificial intelligence has played an important role in accelerating the whole process of drug discovery. Various molecular representation schemes of different modalities (e.g. textual sequence or graph) have been developed. By digitally encoding them, different chemical information can be learned through corresponding network structures. Molecular graphs and the Simplified Molecular Input Line Entry System (SMILES) are currently popular means for molecular representation learning. Previous works have attempted to combine both of them to solve the problem of specific information loss in single-modal representation on various tasks. To further fuse such multi-modal information, the correspondence between learned chemical features from different representations should be considered. To realize this, we propose a novel framework of molecular joint representation learning via Multi-Modal information of SMILES and molecular Graphs, called MMSG. We improve the self-attention mechanism by introducing bond-level graph representation as attention bias in the Transformer to reinforce feature correspondence between multi-modal information. We further propose a Bidirectional Message Communication Graph Neural Network (BMC GNN) to strengthen the information flow aggregated from graphs for further combination. Numerous experiments on public property prediction datasets have demonstrated the effectiveness of our model.

Deploying self-supervised learning in the wild for hybrid automatic speech recognition

May 17, 2022

Mostafa Karimi, Changliang Liu, Kenichi Kumatani, Yao Qian, Tianyu Wu, Jian Wu

Abstract: Self-supervised learning (SSL) methods have proven to be very successful in automatic speech recognition (ASR). These great improvements have been reported mostly based on highly curated datasets such as LibriSpeech for non-streaming End-to-End ASR models. However, the pivotal characteristic of SSL is that it can be utilized with any untranscribed audio data. In this paper, we provide a full exploration of how to utilize uncurated audio data in SSL, from data pre-processing to deploying a streaming hybrid ASR model. More specifically, we present (1) the effect of an Audio Event Detection (AED) model in the data pre-processing pipeline, (2) an analysis of the choice of optimizer and learning rate scheduling, (3) a comparison of recently developed contrastive losses, and (4) a comparison of various pre-training strategies such as utilization of in-domain versus out-domain pre-training data, monolingual versus multilingual pre-training data, multi-head multilingual SSL versus single-head multilingual SSL, and supervised pre-training versus SSL. The experimental results show that SSL pre-training with in-domain uncurated data can achieve better performance in comparison to all the alternative out-domain pre-training strategies.

3D Reconstruction from public webcams

Aug 21, 2021

Tianyu Wu, Konrad Schindler, Cenek Albl

Abstract: In this paper, we investigate the possibility of reconstructing the 3D geometry of a scene captured by multiple webcams. The number of publicly accessible webcams is already large and it is growing every day. A logical question arises: can we use this free source of data for something beyond leisure activities? The challenge is that no internal, external, or temporal calibration of these cameras is available. We show that using recent advances in computer vision, we successfully calibrate the cameras, perform 3D reconstructions of the static scene and also recover the 3D trajectories of moving objects.

Recent Ice Trends in Swiss Mountain Lakes: 20-year Analysis of MODIS Imagery

Mar 23, 2021

Manu Tom, Tianyu Wu, Emmanuel Baltsavias, Konrad Schindler

Abstract: Depleting lake ice can serve as an indicator for climate change, just like sea level rise or glacial retreat. Several Lake Ice Phenological (LIP) events serve as sentinels to understand the regional and global climate change. Hence, monitoring the long-term lake freezing and thawing patterns can prove very useful. In this paper, we focus on observing the LIP events such as freeze-up, break-up and temporal freeze extent in the Oberengadin region of Switzerland, where there are several small- and medium-sized mountain lakes, across two decades (2000-2020) from optical satellite images. We analyse time-series of MODIS imagery (and additionally cross-check with VIIRS data when available), by estimating spatially resolved maps of lake ice for these Alpine lakes with supervised machine learning. To train the classifier we rely on reference data annotated manually based on publicly available webcam images. From the ice maps we derive long-term LIP trends. Since the webcam data is only available for two winters, we also validate our results against the operational MODIS and VIIRS snow products. We find a change in Complete Freeze Duration (CFD) of -0.76 and -0.89 days per annum (d/a) for lakes Sils and Silvaplana respectively. Furthermore, we correlate the lake freezing and thawing trends with climate data such as temperature, sunshine, precipitation and wind measured at nearby meteorological stations.
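
A trend expressed in days per annum, as in the abstract above, is the slope of a line fitted to one value per winter. The sketch below uses ordinary least squares, which is an assumption made here; the abstract does not say which estimator the authors used. The function name `linear_trend` and the input series are hypothetical, and the series is synthetic (an exact line with slope -0.5), not data from the paper.

```python
def linear_trend(years, values):
    """Ordinary least-squares slope of values against years.

    With values in days and years in calendar years, the result
    is in days per annum (d/a).
    """
    n = len(years)
    if n < 2 or n != len(values):
        raise ValueError("need two or more matching (year, value) pairs")
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    # Covariance and variance terms of the closed-form OLS slope.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    sxx = sum((x - mean_x) ** 2 for x in years)
    if sxx == 0:
        raise ValueError("years must not all be identical")
    return sxy / sxx


if __name__ == "__main__":
    # Synthetic series for illustration: 60 days in 2000, losing
    # half a day per year. Not measurements from the paper.
    years = list(range(2000, 2021))
    cfd_days = [60.0 - 0.5 * (y - 2000) for y in years]
    print(f"{linear_trend(years, cfd_days):.2f} d/a")
```

On real, noisy series a robust estimator may be preferable to ordinary least squares, since a single unusual winter can shift the fitted slope noticeably.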

Ice Monitoring in Swiss Lakes from Optical Satellites and Webcams using Machine Learning

Oct 27, 2020

Manu Tom, Rajanie Prabha, Tianyu Wu, Emmanuel Baltsavias, Laura Leal-Taixe, Konrad Schindler

Abstract: Continuous observation of climate indicators, such as trends in lake freezing, is important to understand the dynamics of the local and global climate system. Consequently, lake ice has been included among the Essential Climate Variables (ECVs) of the Global Climate Observing System (GCOS), and there is a need to set up operational monitoring capabilities. Multi-temporal satellite images and publicly available webcam streams are among the viable data sources to monitor lake ice. In this work we investigate machine learning-based image analysis as a tool to determine the spatio-temporal extent of ice on Swiss Alpine lakes as well as the ice-on and ice-off dates, from both multispectral optical satellite images (VIIRS and MODIS) and RGB webcam images. We model lake ice monitoring as a pixel-wise semantic segmentation problem, i.e., each pixel on the lake surface is classified to obtain a spatially explicit map of ice cover. We show experimentally that the proposed system produces consistently good results when tested on data from multiple winters and lakes. Our satellite-based method obtains mean Intersection-over-Union (mIoU) scores >93%, for both sensors. It also generalises well across lakes and winters with mIoU scores >78% and >80% respectively. On average, our webcam approach achieves mIoU values of 87% (approx.) and generalisation scores of 71% (approx.) and 69% (approx.) across different cameras and winters respectively. Additionally, we put forward a new benchmark dataset of webcam images (Photi-LakeIce) which includes data from two winters and three cameras.

* Accepted for publication in MDPI Remote Sensing Journal
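
The mIoU score quoted in the abstract above is, in its usual definition, the per-class ratio of intersection to union between predicted and reference pixel labels, averaged over classes. The sketch below is a minimal pure-Python version over flattened label lists. The function name `mean_iou` and the choice to skip classes absent from both inputs are assumptions made here, not details taken from the paper.

```python
def mean_iou(pred, target, num_classes):
    """Mean Intersection-over-Union over flattened per-pixel labels.

    pred and target are equal-length sequences of integer class ids.
    Classes that occur in neither sequence are left out of the mean.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    ious = []
    for c in range(num_classes):
        # Pixels labelled c in both maps.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels labelled c in at least one map.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else float("nan")


if __name__ == "__main__":
    # Toy example with hypothetical class ids: 0 = water, 1 = ice.
    pred = [0, 0, 1, 1]
    target = [0, 1, 1, 1]
    print(round(mean_iou(pred, target, num_classes=2), 4))
```

In the toy example the water class scores 1/2 and the ice class 2/3, so the mean is 7/12.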

Coordinate Friendly Structures, Algorithms and Applications

Aug 14, 2016

Zhimin Peng, Tianyu Wu, Yangyang Xu, Ming Yan, Wotao Yin

Abstract: This paper focuses on coordinate update methods, which are useful for solving problems involving large or high-dimensional datasets. They decompose a problem into simple subproblems, where each updates one, or a small block of, variables while fixing others. These methods can deal with linear and nonlinear mappings, smooth and nonsmooth functions, as well as convex and nonconvex problems. In addition, they are easy to parallelize. The great performance of coordinate update methods depends on solving simple sub-problems. To derive simple subproblems for several new classes of applications, this paper systematically studies coordinate-friendly operators that perform low-cost coordinate updates. Based on the discovered coordinate friendly operators, as well as operator splitting techniques, we obtain new coordinate update algorithms for a variety of problems in machine learning, image processing, as well as sub-areas of optimization. Several problems are treated with coordinate update for the first time in history. The obtained algorithms are scalable to large instances through parallel and even asynchronous computing. We present numerical examples to illustrate how effective these algorithms are.

* Annals of Mathematical Sciences and Applications, 1 (2016), 57-119
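
The idea of updating one variable while fixing the others, described in the abstract above, can be illustrated on a textbook case: minimising the quadratic 0.5 x^T A x - b^T x for a symmetric positive definite matrix A. Each coordinate subproblem then has a closed-form solution, and cyclic sweeps coincide with the Gauss-Seidel iteration. This is a generic sketch of coordinate update, not one of the algorithms derived in the paper; the function name `coordinate_descent` and the fixed sweep count are choices made here.

```python
def coordinate_descent(A, b, sweeps=100):
    """Cyclic coordinate descent for min 0.5*x^T A x - b^T x.

    A is a symmetric positive definite matrix given as a list of
    rows, b a list. Each inner step minimises exactly over one
    coordinate with all other coordinates held fixed.
    """
    n = len(b)
    x = [0.0] * n
    for _ in range(sweeps):
        for i in range(n):
            # Setting the i-th partial derivative to zero gives
            # A[i][i] * x[i] = b[i] - sum_{j != i} A[i][j] * x[j].
            off_diag = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - off_diag) / A[i][i]
    return x


if __name__ == "__main__":
    A = [[4.0, 1.0], [1.0, 3.0]]
    b = [1.0, 2.0]
    # The minimiser solves A x = b, here x = (1/11, 7/11).
    print(coordinate_descent(A, b))
```

Each update touches only one row of A, which is the kind of low-cost subproblem that makes the approach attractive when the number of variables is large.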
