Abstract:Point cloud registration aligns 3D point clouds using spatial transformations. It is an important task in computer vision, with applications in areas such as augmented reality (AR) and medical imaging. This work explores the intersection of two research trends: the integration of AR into image-guided surgery and the use of deep learning for point cloud registration. The main objective is to evaluate the feasibility of applying deep learning-based point cloud registration methods for image-to-patient registration in augmented reality-guided surgery. We create a dataset of point clouds from medical imaging and corresponding point clouds captured with a popular AR device, the HoloLens 2, and evaluate three well-established deep learning models on registering these data pairs. While some deep learning methods show promise, a conventional registration pipeline still outperforms them on our challenging dataset.
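For context, a minimal sketch of what such a conventional registration pipeline could look like, assuming the Open3D library, placeholder file names for the imaging-derived and HoloLens 2 point clouds, and illustrative parameter values (the paper's actual pipeline and settings are not reproduced here):

import numpy as np
import open3d as o3d

# Load the imaging-derived surface (source) and the HoloLens 2 capture (target).
# File names are placeholders.
source = o3d.io.read_point_cloud("ct_surface.ply")
target = o3d.io.read_point_cloud("hololens_capture.ply")

# Downsample and estimate normals for point-to-plane ICP
# (the voxel size, in the clouds' units, is an assumption).
voxel = 2.0
source_d = source.voxel_down_sample(voxel)
target_d = target.voxel_down_sample(voxel)
for pc in (source_d, target_d):
    pc.estimate_normals(o3d.geometry.KDTreeSearchParamHybrid(radius=2 * voxel, max_nn=30))

# Refine from an initial guess (identity here; a coarse global alignment,
# e.g. RANSAC over FPFH features, would normally provide it).
result = o3d.pipelines.registration.registration_icp(
    source_d, target_d,
    max_correspondence_distance=1.5 * voxel,
    init=np.eye(4),
    estimation_method=o3d.pipelines.registration.TransformationEstimationPointToPlane(),
)
print(result.transformation)  # 4x4 rigid transform aligning source to target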
Abstract:rollama is an R package that wraps the Ollama API, which allows you to run different Generative Large Language Models (GLLMs) locally. The package and its learning material focus on making it easy to use Ollama for annotating textual or image data with open-source models, as well as for using these models for document embedding. Beyond that, users can use or extend rollama to do essentially anything else that is possible through OpenAI's API, while staying more private, reproducible, and free of charge.
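To illustrate the kind of request rollama wraps, the following sketch calls the embeddings endpoint of a locally running Ollama server directly; rollama itself exposes this through R functions rather than raw HTTP, and the model name here is only an assumption for illustration:

import requests

# Assumes an Ollama server on its default local port and a pulled embedding
# model; the model name "nomic-embed-text" is illustrative.
resp = requests.post(
    "http://localhost:11434/api/embeddings",
    json={"model": "nomic-embed-text",
          "prompt": "Text of the document to embed."},
    timeout=60,
)
resp.raise_for_status()
embedding = resp.json()["embedding"]  # list of floats representing the document
print(len(embedding))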
Abstract:This paper explores the use of open generative Large Language Models (LLMs) for annotation tasks in the social sciences. The study highlights the challenges associated with proprietary models, such as limited reproducibility and privacy concerns, and advocates for the adoption of open (source) models that can be operated on independent devices. Two example annotation tasks are provided: sentiment analysis of tweets and the identification of leisure activities in childhood aspirational essays. The study evaluates the performance of different prompting strategies and models (neural-chat-7b-v3-2, Starling-LM-7B-alpha, openchat_3.5, zephyr-7b-alpha and zephyr-7b-beta). The results indicate the need for careful validation and tailored prompt engineering. The study also underlines the advantages of open models for data privacy and reproducibility.
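A hedged sketch of how such a zero-shot annotation call could look against a locally served open model via Ollama's HTTP API; the model tag, prompt wording, and example tweet are illustrative and not the exact prompts or data evaluated in the study:

import requests

# Zero-shot sentiment annotation with a locally served open model.
# The model tag "zephyr" and the prompt are assumptions for illustration.
prompt = (
    "Classify the sentiment of the following tweet as positive, negative, or neutral. "
    "Answer with one word.\n\n"
    "Tweet: I finally got tickets to the concert!"
)
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "zephyr", "prompt": prompt, "stream": False},
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"].strip())  # e.g. "positive"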
Abstract:Segmentation is a key step in analyzing and processing medical images. Due to the low fault tolerance in medical imaging, manual segmentation remains the de facto standard in this domain. Moreover, efforts to automate the segmentation process often rely on large amounts of manually labeled data. While existing software supporting manual segmentation is rich in features and delivers accurate results, the time needed to set it up and become comfortable using it can pose a hurdle to the collection of large datasets. This work introduces a client/server-based online environment, referred to as Studierfenster (studierfenster.at), that can be used to perform manual segmentations directly in a web browser. The aim of providing this functionality in the form of a web application is to ease the collection of ground truth segmentation datasets: a tool that is quickly accessible and usable on a broad range of devices has the potential to accelerate this process. The manual segmentation workflow of Studierfenster consists of dragging and dropping the input file into the browser window and outlining the object under consideration slice by slice. The final segmentation can then be exported as a file storing its contours and as a binary segmentation mask. In order to evaluate the usability of Studierfenster, a user study was performed. When asked about their overall impression of the tool, users gave it a mean rating of 6.3 out of 7.0 possible points. The evaluation also provides insights into the results achievable with the tool in practice by presenting two ground truth segmentations performed by physicians.