Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Zishuo Zhao

FLEX-CLIP: Feature-Level GEneration Network Enhanced CLIP for X-shot Cross-modal Retrieval

Nov 26, 2024

Jingyou Xie, Jiayi Kuang, Zhenzhou Lin, Jiarui Ouyang, Zishuo Zhao, Ying Shen

Abstract:Given a query from one modality, few-shot cross-modal retrieval (CMR) retrieves semantically similar instances in another modality with the target domain including classes that are disjoint from the source domain. Compared with classical few-shot CMR methods, vision-language pretraining methods like CLIP have shown great few-shot or zero-shot learning performance. However, they still suffer challenges due to (1) the feature degradation encountered in the target domain and (2) the extreme data imbalance. To tackle these issues, we propose FLEX-CLIP, a novel Feature-level Generation Network Enhanced CLIP. FLEX-CLIP includes two training stages. In multimodal feature generation, we propose a composite multimodal VAE-GAN network to capture real feature distribution patterns and generate pseudo samples based on CLIP features, addressing data imbalance. For common space projection, we develop a gate residual network to fuse CLIP features with projected features, reducing feature degradation in X-shot scenarios. Experimental results on four benchmark datasets show a 7%-15% improvement over state-of-the-art methods, with ablation studies demonstrating enhancement of CLIP features.

Via

Access Paper or Ask Questions

Proof-of-Learning with Incentive Security

Apr 13, 2024

Zishuo Zhao, Zhixuan Fang, Xuechao Wang, Yuan Zhou

Figure 1 for Proof-of-Learning with Incentive Security

Figure 2 for Proof-of-Learning with Incentive Security

Figure 3 for Proof-of-Learning with Incentive Security

Figure 4 for Proof-of-Learning with Incentive Security

Abstract:Most concurrent blockchain systems rely heavily on the Proof-of-Work (PoW) or Proof-of-Stake (PoS) mechanisms for decentralized consensus and security assurance. However, the substantial energy expenditure stemming from computationally intensive yet meaningless tasks has raised considerable concerns surrounding traditional PoW approaches, The PoS mechanism, while free of energy consumption, is subject to security and economic issues. Addressing these issues, the paradigm of Proof-of-Useful-Work (PoUW) seeks to employ challenges of practical significance as PoW, thereby imbuing energy consumption with tangible value. While previous efforts in Proof of Learning (PoL) explored the utilization of deep learning model training SGD tasks as PoUW challenges, recent research has revealed its vulnerabilities to adversarial attacks and the theoretical hardness in crafting a byzantine-secure PoL mechanism. In this paper, we introduce the concept of incentive-security that incentivizes rational provers to behave honestly for their best interest, bypassing the existing hardness to design a PoL mechanism with computational efficiency, a provable incentive-security guarantee and controllable difficulty. Particularly, our work is secure against two attacks to the recent work of Jia et al. [2021], and also improves the computational overhead from $\Theta(1)$ to $O(\frac{\log E}{E})$. Furthermore, while most recent research assumes trusted problem providers and verifiers, our design also guarantees frontend incentive-security even when problem providers are untrusted, and verifier incentive-security that bypasses the Verifier's Dilemma. By incorporating ML training into blockchain consensus mechanisms with provable guarantees, our research not only proposes an eco-friendly solution to blockchain systems, but also provides a proposal for a completely decentralized computing power market in the new AI age.

* 22 pages, 6 figures

Via

Access Paper or Ask Questions

Musical Instrument Classification via Low-Dimensional Feature Vectors

Sep 16, 2019

Zishuo Zhao, Haoyun Wang

Figure 1 for Musical Instrument Classification via Low-Dimensional Feature Vectors

Figure 2 for Musical Instrument Classification via Low-Dimensional Feature Vectors

Figure 3 for Musical Instrument Classification via Low-Dimensional Feature Vectors

Figure 4 for Musical Instrument Classification via Low-Dimensional Feature Vectors

Abstract:Music is a mysterious language that conveys feeling and thoughts via different tones and timbre. For better understanding of timbre in music, we chose music data of 6 representative instruments, analysed their timbre features and classified them. Instead of the current trend of Neural Network for black-box classification, our project is based on a combination of MFCC and LPC, and augmented with a 6-dimensional feature vector designed by ourselves from observation and attempts. In our white-box model, we observed significant patterns of sound that distinguish different timbres, and discovered some connection between objective data and subjective senses. With a totally 32-dimensional feature vector and a naive all-pairs SVM, we achieved improved classification accuracy compared to a single tool. We also attempted to analyze music pieces downloaded from the Internet, found out different performance on different instruments, explored the reasons and suggested possible ways to improve the performance.

Via

Access Paper or Ask Questions

Noise-Stable Rigid Graphs for Euclidean Embedding

Aug 06, 2019

Zishuo Zhao

Figure 1 for Noise-Stable Rigid Graphs for Euclidean Embedding

Figure 2 for Noise-Stable Rigid Graphs for Euclidean Embedding

Figure 3 for Noise-Stable Rigid Graphs for Euclidean Embedding

Figure 4 for Noise-Stable Rigid Graphs for Euclidean Embedding

Abstract:We proposed a new criterion \textit{noise-stability}, which revised the classical rigidity theory, for evaluation of MDS algorithms which can truthfully represent the fidelity of global structure reconstruction; then we proved the noise-stability of the cMDS algorithm in generic conditions, which provides a rigorous theoretical guarantee for the precision and theoretical bounds for Euclidean embedding and its application in fields including wireless sensor network localization and satellite positioning. Furthermore, we looked into previous work about minimum-cost globally rigid spanning subgraph, and proposed an algorithm to construct a minimum-cost noise-stable spanning graph in the Euclidean space, which enabled reliable localization on sparse graphs of noisy distance constraints with linear numbers of edges and sublinear costs in total edge lengths. Additionally, this algorithm also suggests a scheme to reconstruct point clouds from pairwise distances at a minimum of $O(n)$ time complexity, down from $O(n^3)$ for cMDS.

Via

Access Paper or Ask Questions