Muhammad Ahmed

Permutation-Aware Action Segmentation via Unsupervised Frame-to-Segment Alignment

May 31, 2023

Quoc-Huy Tran, Ahmed Mehmood, Muhammad Ahmed, Muhammad Naufil, Anas Zafar, Andrey Konin, M. Zeeshan Zia

Abstract: This paper presents a novel transformer-based framework for unsupervised activity segmentation which leverages not only frame-level cues but also segment-level cues. This is in contrast with previous methods, which often rely on frame-level information only. Our approach begins with a frame-level prediction module which estimates framewise action classes via a transformer encoder. The frame-level prediction module is trained in an unsupervised manner via temporal optimal transport. To exploit segment-level information, we introduce a segment-level prediction module and a frame-to-segment alignment module. The former includes a transformer decoder for estimating video transcripts, while the latter matches frame-level features with segment-level features, yielding permutation-aware segmentation results. Moreover, inspired by temporal optimal transport, we develop simple yet effective pseudo labels for unsupervised training of the above modules. Our experiments on four public datasets, i.e., 50 Salads, YouTube Instructions, Breakfast, and Desktop Assembly, show that our approach achieves comparable or better performance than previous methods in unsupervised activity segmentation.
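The frame-to-segment matching mentioned in the abstract can be illustrated with a toy sketch. It assumes dot-product similarity and a hard argmax assignment, neither of which is stated in the abstract; the function name `align_frames_to_segments` and all feature values are hypothetical, and this is not the authors' implementation.

```python
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain dot product between two equal-length feature vectors."""
    return sum(x * y for x, y in zip(a, b))


def align_frames_to_segments(
    frame_feats: List[List[float]],
    segment_feats: List[List[float]],
    transcript: List[str],
) -> List[str]:
    """Label each frame with the action of its most similar segment.

    Assumption: similarity is a dot product and assignment is a hard argmax.
    segment_feats[i] is the feature of the i-th entry in the transcript.
    """
    labels = []
    for frame in frame_feats:
        scores = [dot(frame, seg) for seg in segment_feats]
        best = max(range(len(scores)), key=scores.__getitem__)
        labels.append(transcript[best])
    return labels


# Toy video: two frames resemble the first segment, two resemble the second.
frames = [[1.0, 0.0], [0.9, 0.1], [0.1, 0.9], [0.0, 1.0]]
segments = [[1.0, 0.0], [0.0, 1.0]]
transcript = ["cut", "pour"]  # hypothetical action names, in predicted order
labels = align_frames_to_segments(frames, segments, transcript)
print(labels)
```

Because the labels come from each video's own transcript, a video whose actions occur in a different order receives a correspondingly different labeling, which is one way to read "permutation-aware" here.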

Learning by Aligning 2D Skeleton Sequences in Time

May 31, 2023

Quoc-Huy Tran, Muhammad Ahmed, Ahmed Mehmood, M. Hassan Ahmed, Murad Popattia, Andrey Konin, M. Zeeshan Zia

Abstract: This paper presents a novel self-supervised temporal video alignment framework which is useful for several fine-grained human activity understanding applications. In contrast with the state-of-the-art method CASA, where sequences of 3D skeleton coordinates are taken directly as input, our key idea is to use sequences of 2D skeleton heatmaps as input. Given 2D skeleton heatmaps, we utilize a video transformer which performs self-attention in the spatial and temporal domains for extracting effective spatiotemporal and contextual features. In addition, we introduce simple heatmap augmentation techniques based on 2D skeletons for self-supervised learning. Despite the lack of 3D information, our approach achieves not only higher accuracy but also better robustness against missing and noisy keypoints than CASA. Extensive evaluations on three public datasets, i.e., Penn Action, IKEA ASM, and H2O, demonstrate that our approach outperforms previous methods in different fine-grained human activity understanding tasks, i.e., phase classification, phase progression, video alignment, and fine-grained frame retrieval.
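As a rough illustration of what a 2D skeleton heatmap input might look like, the sketch below renders one Gaussian heatmap per keypoint. The Gaussian rendering, the `sigma` parameter, and the function names are assumptions for illustration only; the abstract does not specify how the heatmaps are produced.

```python
import math
from typing import List, Tuple

Heatmap = List[List[float]]


def keypoint_heatmap(
    x: float, y: float, width: int, height: int, sigma: float = 1.0
) -> Heatmap:
    """Render one keypoint as a 2D Gaussian centered at (x, y).

    Rows index y (height), columns index x (width). Values lie in (0, 1].
    """
    return [
        [
            math.exp(-((col - x) ** 2 + (row - y) ** 2) / (2.0 * sigma ** 2))
            for col in range(width)
        ]
        for row in range(height)
    ]


def skeleton_heatmaps(
    keypoints: List[Tuple[float, float]],
    width: int,
    height: int,
    sigma: float = 1.0,
) -> List[Heatmap]:
    """Stack one heatmap per keypoint for a single frame."""
    return [keypoint_heatmap(x, y, width, height, sigma) for x, y in keypoints]


# Hypothetical frame with two keypoints on a 5x4 grid.
heatmaps = skeleton_heatmaps([(2, 1), (0, 3)], width=5, height=4)
print(len(heatmaps), heatmaps[0][1][2])
```

A sequence of such per-frame stacks would form the video-like input; a missing keypoint could be represented as an all-zero map, though that choice is also an assumption.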

Sequential Embedding-based Attentive (SEA) classifier for malware classification

Feb 11, 2023

Muhammad Ahmed, Anam Qureshi, Jawwad Ahmed Shamsi, Murk Marvi

Abstract: The tremendous growth in smart devices has given rise to several security threats. One of the most prominent threats is malicious software, also known as malware. Malware is capable of corrupting a device and bringing down an entire network. Therefore, its early detection and mitigation are extremely important to avoid catastrophic effects. In this work, we propose a solution for malware detection using state-of-the-art natural language processing (NLP) techniques. Our main focus is to provide a lightweight yet effective classifier for malware detection which can be used on heterogeneous devices, be it a resource-constrained device or a resourceful machine. Our proposed model is tested on the benchmark data set, achieving an accuracy of 99.13 percent and a log loss of 0.04.

* 2022 International Conference on Cyber Warfare and Security (ICCWS), Islamabad, Pakistan, 2022, pp. 28-35
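For readers unfamiliar with the two metrics reported in the abstract above, here is a minimal sketch of how accuracy and multi-class log loss are commonly computed from predicted class probabilities. The toy labels and probabilities are made up for illustration and have no relation to the paper's data or results.

```python
import math
from typing import List


def accuracy(y_true: List[int], probs: List[List[float]]) -> float:
    """Fraction of samples whose highest-probability class equals the label."""
    correct = 0
    for label, p in zip(y_true, probs):
        predicted = max(range(len(p)), key=p.__getitem__)
        if predicted == label:
            correct += 1
    return correct / len(y_true)


def log_loss(y_true: List[int], probs: List[List[float]], eps: float = 1e-15) -> float:
    """Mean negative log-probability assigned to the true class.

    Probabilities are clipped to [eps, 1 - eps] to avoid log(0).
    """
    total = 0.0
    for label, p in zip(y_true, probs):
        clipped = min(max(p[label], eps), 1.0 - eps)
        total -= math.log(clipped)
    return total / len(y_true)


# Toy example: three samples, two classes.
y_true = [0, 1, 1]
probs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
acc = accuracy(y_true, probs)
loss = log_loss(y_true, probs)
print(acc, loss)
```

Log loss penalizes confident wrong predictions more heavily than accuracy does, so the two are often reported together.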
