Saikat Guha

Entanglement-Assisted Coding for Arbitrary Linear Computations Over a Quantum MAC

Jan 27, 2025

Lei Hu, Mohamed Nomeir, Alptug Aytekin, Yu Shi, Sennur Ulukus, Saikat Guha

Abstract: We study a linear computation problem over a quantum multiple access channel (LC-QMAC), where $S$ servers share an entangled state and separately store classical data streams $W_1, \ldots, W_S$ over a finite field $\mathbb{F}_d$. A user aims to compute $K$ linear combinations of these data streams, represented as $Y = \mathbf{V}_1 W_1 + \mathbf{V}_2 W_2 + \cdots + \mathbf{V}_S W_S \in \mathbb{F}_d^{K \times 1}$. To this end, each server encodes its classical information into its local quantum subsystem and transmits it to the user, who retrieves the desired computations via quantum measurements. We propose an achievable scheme for LC-QMAC based on the stabilizer formalism and ideas from entanglement-assisted quantum error-correcting codes (EAQECC). Specifically, given any linear computation matrix, we construct a self-orthogonal matrix that can be implemented using the stabilizer formalism. We also apply precoding matrices to minimize the number of auxiliary qudits required. Our scheme achieves more computations per qudit, i.e., a higher computation rate, than the best-known methods in the literature, and attains the capacity in certain cases.
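
As a purely classical illustration of the target function $Y = \mathbf{V}_1 W_1 + \cdots + \mathbf{V}_S W_S$, the sketch below evaluates it with modular arithmetic. It assumes $d$ is prime (so that integers modulo $d$ form the field $\mathbb{F}_d$) and that each $W_s$ is a flat list of symbols; the function name `linear_computation` and the toy matrices are hypothetical. It does not model the entangled state, the stabilizer encoding, or the measurement described in the abstract.

```python
def linear_computation(V_list, W_list, d):
    """Return Y = sum_s V_s W_s over the integers modulo d (d assumed prime).

    V_list: one K x n_s matrix per server, given as a list of rows.
    W_list: one length-n_s data vector per server.
    """
    K = len(V_list[0])
    Y = [0] * K
    for V, W in zip(V_list, W_list):
        for k in range(K):
            # Inner product of row k of V_s with W_s, accumulated mod d.
            Y[k] = (Y[k] + sum(v * w for v, w in zip(V[k], W))) % d
    return Y


# Toy instance (hypothetical values): S = 2 servers, K = 2 outputs, d = 3.
V1 = [[1, 2], [0, 1]]
V2 = [[2, 0], [1, 1]]
W1 = [1, 2]
W2 = [2, 1]
Y = linear_computation([V1, V2], [W1, W2], d=3)
print(Y)
```

Here the user only wants the $K$ entries of `Y`, not the individual streams `W1` and `W2`; that is the quantity whose cost in transmitted qudits the computation rate measures.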

Optimum classical beam position sensing

Feb 01, 2024

Wenhua He, Christos N. Gagatsos, Dalziel J. Wilson, Saikat Guha

Abstract: Beam displacement measurements are widely used in optical sensing and communications; however, their performance is affected by numerous intrinsic and extrinsic factors including beam profile, propagation loss, and receiver architecture. Here we present a framework for designing a classically optimal beam displacement transceiver, using quantum estimation theory. We consider the canonical task of estimating the position of a diffraction-limited laser beam after passing through an apertured volume characterized by Fresnel-number product DF. As a rule of thumb, higher-order Gaussian modes provide more information about beam displacement, but are more sensitive to loss. Applying quantum Fisher information, we design mode combinations that optimally leverage this trade-off, and show that a greater than 10-fold improvement in precision is possible, relative to the fundamental mode, for a practically relevant DF = 100. We also show that this improvement is realizable with a variety of practical receiver architectures. Our findings extend previous works on lossless transceivers, may have immediate impact on applications such as atomic force microscopy and near-field optical communication, and pave the way towards globally optimal transceivers using non-classical laser fields.

* 11 pages, 4 figures

ChartParser: Automatic Chart Parsing for Print-Impaired

Nov 16, 2022

Anukriti Kumar, Tanuja Ganu, Saikat Guha

Abstract: Infographics are often an integral component of scientific documents for reporting qualitative or quantitative findings, as they make the underlying complex information much simpler to comprehend. However, their interpretation continues to be a challenge for blind, low-vision, and other print-impaired (BLV) individuals. In this paper, we propose ChartParser, a fully automated pipeline that leverages deep learning, OCR, and image processing techniques to extract all figures from a research paper, classify them into various chart categories (bar chart, line chart, etc.), and extract relevant information from them. We focus on bar charts (including horizontal, vertical, stacked horizontal, and stacked vertical charts), which already pose several interesting challenges. Finally, we present the retrieved content in a tabular format that is screen-reader friendly and accessible to BLV users. We present a thorough evaluation of our approach by applying our pipeline to sample real-world annotated bar charts from research papers.

* Submitted to Scientific Document Understanding Workshop, AAAI 2023

Towards Zero-Shot and Few-Shot Table Question Answering using GPT-3

Oct 31, 2022

Pragya Srivastava, Tanuja Ganu, Saikat Guha

Abstract: We present very early results on using GPT-3 to perform question answering on tabular data. We find that stock pre-trained GPT-3 is able to zero-shot learn the table structure from a serialized JSON array-of-arrays representation, and can answer lookup queries and simple comparison questions in natural language without any fine-tuning. We further find that simple prompt engineering to include few-shot static Q&A examples significantly improves accuracy. Lastly, we find that intermixing passage text improves accuracy even further on heterogeneous data. We apply our approach to a novel dataset of simple tables in newspaper infographics with promising results. Overall, we find much cause for optimism in this basic approach.

* 7 pages
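
The serialization idea in this abstract can be sketched as follows: a table is written out as a JSON array-of-arrays and followed by a few static Q&A pairs and the new question. This is a minimal sketch under assumptions. The `Table:` / `Q:` / `A:` layout, the function name `build_prompt`, and the sample table are all hypothetical, since the abstract does not give the actual prompt template, and no model API is called here.

```python
import json


def build_prompt(table, examples, question):
    """Assemble a few-shot prompt from a table, static Q&A pairs, and a question.

    table: list of rows (first row is the header), serialized as JSON
           array-of-arrays. The surrounding text layout is an assumption.
    examples: list of (question, answer) pairs used as few-shot examples.
    """
    lines = ["Table: " + json.dumps(table)]
    for q, a in examples:
        lines.append(f"Q: {q}")
        lines.append(f"A: {a}")
    # Leave the final answer open for the model to complete.
    lines.append(f"Q: {question}")
    lines.append("A:")
    return "\n".join(lines)


# Hypothetical sample data, for illustration only.
table = [["City", "Rainfall (mm)"], ["Pune", 722], ["Kochi", 3005]]
examples = [("What is the rainfall in Pune?", "722 mm")]
prompt = build_prompt(table, examples, "Which city has more rainfall?")
print(prompt)
```

Passing an empty `examples` list yields the zero-shot variant of the same prompt.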

Towards Optimizing OCR for Accessibility

Jun 24, 2022

Peya Mowar, Tanuja Ganu, Saikat Guha

Abstract: Visual cues such as structure, emphasis, and icons play an important role in efficient information foraging by sighted individuals and make for a pleasurable reading experience. Blind, low-vision, and other print-disabled individuals miss out on these cues since current OCR and text-to-speech software ignore them, resulting in a tedious reading experience. We identify four semantic goals for an enjoyable listening experience, and identify syntactic visual cues that help make progress towards these goals. Empirically, we find that preserving even one or two visual cues in aural form significantly enhances the experience of listening to print content.

* Extended Abstract for Poster Session at Accessibility, Vision, and Autonomy Meet (CVPR 2022 Workshop)

Broken News: Making Newspapers Accessible to Print-Impaired

Jun 23, 2022

Vishal Agarwal, Tanuja Ganu, Saikat Guha

Abstract: Accessing daily news content remains a big challenge for people with print impairment, including blind and low-vision individuals, due to the opacity of printed content and hindrances from online sources. In this paper, we present our approach for digitizing print newspapers into an accessible file format such as HTML. We use an ensemble of instance segmentation and detection frameworks for newspaper layout analysis and then OCR to recognize text elements such as headlines and article text. Additionally, we propose the EdgeMask loss function for the Mask-RCNN framework to improve the segmentation mask boundary and hence the accuracy of the downstream OCR task. Empirically, we show that our proposed loss function reduces the Word Error Rate (WER) of news article text by 32.5%.

* Extended Abstract at Accessibility, Vision, and Autonomy Meet (CVPR 2022 Workshop)
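
For readers unfamiliar with the metric, Word Error Rate is conventionally the word-level edit distance between the OCR output and the reference text, divided by the number of reference words. The sketch below uses that standard definition; it is not the authors' evaluation code, and the function name `word_error_rate` and the sample strings are hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Levenshtein distance over words, keeping only the previous DP row.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # delete a reference word
                           cur[j - 1] + 1,           # insert a hypothesis word
                           prev[j - 1] + (r != h)))  # substitute, or match at no cost
        prev = cur
    return prev[-1] / len(ref)


# Hypothetical strings: one reference word is missing from the OCR output.
wer = word_error_rate("the quick brown fox", "the quick fox")
print(wer)
```

Because the denominator is the reference length, WER can exceed 1 when the hypothesis contains many spurious words.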

Document Navigability: A Need for Print-Impaired

Jun 21, 2022

Anukriti Kumar, Tanuja Ganu, Saikat Guha

Abstract: Printed documents continue to be a challenge for blind, low-vision, and other print-disabled (BLV) individuals. In this paper, we focus on the specific problem of (in-)accessibility of internal references to citations, footnotes, figures, tables, and equations. While sighted users can flip to the referenced content and flip back in seconds, the linear audio narration that BLV individuals rely on makes following these references extremely hard. We propose a vision-based technique to locate the referenced content and extract the metadata needed to (in subsequent work) inline a content summary into the audio narration. We apply our technique to citations in scientific documents and find that it works well on both born-digital and scanned documents.

* Published at Accessibility, Vision, and Autonomy Meet (CVPR 2022 Workshop)
