Arnoud Visser

Supervised and self-supervised land-cover segmentation & classification of the Biesbosch wetlands

May 27, 2025

Eva Gmelich Meijling, Roberto Del Prete, Arnoud Visser

Abstract:Accurate wetland land-cover classification is essential for environmental monitoring, biodiversity assessment, and sustainable ecosystem management. However, the scarcity of annotated data, especially for high-resolution satellite imagery, poses a significant challenge for supervised learning approaches. To tackle this issue, this study presents a methodology for wetland land-cover segmentation and classification that adopts both supervised and self-supervised learning (SSL). We train a U-Net model from scratch on Sentinel-2 imagery across six wetland regions in the Netherlands, achieving a baseline model accuracy of 85.26%. Addressing the limited availability of labeled data, the results show that SSL pretraining with an autoencoder can improve accuracy, especially for the high-resolution imagery where it is more difficult to obtain labeled data, reaching an accuracy of 88.23%. Furthermore, we introduce a framework to scale manually annotated high-resolution labels to medium-resolution inputs. While the quantitative performance between resolutions is comparable, high-resolution imagery provides significantly sharper segmentation boundaries and finer spatial detail. As part of this work, we also contribute a curated Sentinel-2 dataset with Dynamic World labels, tailored for wetland classification tasks and made publicly available.

* 12 pages, presented at the Netherlands Conference on Computer Vision (NCCV), Utrecht, May 2025
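The abstract above mentions scaling high-resolution labels to medium-resolution inputs but does not say how this is done. As one possible illustration only, the sketch below uses majority voting over square blocks of a label grid; the function name `downsample_labels`, the `factor` parameter, and the voting rule itself are assumptions and are not taken from the paper.

```python
from collections import Counter


def downsample_labels(labels, factor):
    """Reduce a 2D grid of class labels by majority vote per block.

    Hypothetical helper: each factor x factor block of fine labels is
    replaced by its most frequent class. Ties go to the class that is
    encountered first in row-major order within the block.
    """
    rows, cols = len(labels), len(labels[0])
    if rows % factor or cols % factor:
        raise ValueError("grid size must be a multiple of factor")
    coarse = []
    for r in range(0, rows, factor):
        coarse_row = []
        for c in range(0, cols, factor):
            # Collect all fine-resolution labels inside this block.
            block = [labels[r + i][c + j]
                     for i in range(factor) for j in range(factor)]
            coarse_row.append(Counter(block).most_common(1)[0][0])
        coarse.append(coarse_row)
    return coarse


fine = [
    [0, 0, 1, 1],
    [0, 2, 1, 1],
    [3, 3, 2, 2],
    [3, 3, 2, 1],
]
print(downsample_labels(fine, 2))  # [[0, 1], [3, 2]]
```

Majority voting keeps the output a valid class map, whereas averaging class indices would produce meaningless intermediate labels.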

Bringing the RT-1-X Foundation Model to a SCARA robot

Sep 05, 2024

Jonathan Salzer, Arnoud Visser

Abstract:Traditional robotic systems require specific training data for each task, environment, and robot form. While recent advancements in machine learning have enabled models to generalize across new tasks and environments, the challenge of adapting these models to entirely new settings remains largely unexplored. This study addresses this by investigating the generalization capabilities of the RT-1-X robotic foundation model to a type of robot unseen during its training: a SCARA robot from UMI-RTX. Initial experiments reveal that RT-1-X does not generalize zero-shot to the unseen type of robot. However, fine-tuning of the RT-1-X model by demonstration allows the robot to learn a pickup task which was part of the foundation model (but learned for another type of robot). When the robot is presented with an object that is included in the foundation model but not in the fine-tuning dataset, it demonstrates that only the skill, but not the object-specific knowledge, has been transferred.

* 14 pages, submitted to the joint Artificial Intelligence & Machine Learning conference for Belgium, Netherlands & Luxembourg (BNAIC/BeNeLearn)

An Earth Rover dataset recorded at the ICRA@40 party

Jul 08, 2024

Qi Zhang, Zhihao Lin, Arnoud Visser

Abstract:The ICRA conference is celebrating its $40^{th}$ anniversary in Rotterdam in September 2024, with the Happy Birthday ICRA Party at the iconic Holland America Line Cruise Terminal as its highlight. One month later the IROS conference will take place, which will include the Earth Rover Challenge. In this challenge, open-world autonomous navigation models are studied in truly open-world settings. As part of the Earth Rover Challenge, several real-world navigation datasets were recorded in cities worldwide, such as Auckland, Australia and Wuhan, China. The only dataset recorded in the Netherlands is from the small village of Oudewater. The proposal is to record a dataset in Rotterdam with the robot used in the Earth Rover Challenge, in front of the Holland America Line Cruise Terminal, before the festivities of the Happy Birthday ICRA Party start.

* 2 pages, submitted as a Late-Breaking extended abstract to the IEEE Conference on Robotics and Automation

Position and Altitude of the Nao Camera Head from Two Points on the Soccer Field plus the Gravitational Direction

Jul 03, 2024

Stijn Oomes, Arnoud Visser

Abstract:To be able to play soccer, a robot needs a good estimate of its current position on the field. Ideally, multiple features are visible that have known locations. By applying trigonometry we can estimate the viewpoint from which this observation was actually made. Given that the Nao robots of the Standard Platform League have quite a limited field of view, a given camera frame typically only allows for one or two points to be recognized. In this paper we propose a method for determining the (x, y) coordinates on the field and the height h of the camera from the geometry of a simplified tetrahedron. This configuration is formed by two observed points on the ground plane plus the gravitational direction. When the distance between the two points is known, and the directions to the points plus the gravitational direction are measured, all dimensions of the tetrahedron can be determined. By performing these calculations with rational trigonometry instead of classical trigonometry, the computations turn out to be 28.7% faster, with equal numerical accuracy. The position of the head of the Nao can also be externally measured with the OptiTrack system. The difference between the externally measured position and the position predicted internally from sensor data gives mean absolute errors in the 3-6 centimeter range, when the gravitational direction is estimated from the vanishing point of the outer edges of the goal posts.

* to be published in the Proceedings of the RoboCup 2024 symposium - 12 pages
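The geometry in the abstract above can be sketched with plain vector algebra. This is not the authors' rational-trigonometry formulation, only a classical illustration of why two ground points plus the gravitational direction fix the camera pose. Assumptions: the camera frame is right-handed, `g` points downward in camera coordinates, the field has z pointing up, and `p1`, `p2` are the known (x, y) field positions of the two observed points. The function name `camera_position` and all parameter names are hypothetical.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _unit(a):
    n = math.sqrt(_dot(a, a))
    return [x / n for x in a]


def _cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def camera_position(d1, d2, g, p1, p2):
    """Return (x, y, h) of the camera on the field (illustrative sketch).

    d1, d2: viewing directions to the two ground points (camera frame)
    g:      gravitational direction, pointing down (camera frame)
    p1, p2: known (x, y) field coordinates of the two points
    """
    d1, d2, g = _unit(d1), _unit(d2), _unit(g)
    c1, c2 = _dot(d1, g), _dot(d2, g)
    if c1 <= 0 or c2 <= 0:
        raise ValueError("both rays must point below the horizon")
    # A ray scaled to drop exactly one unit of height along gravity.
    r1 = [x / c1 for x in d1]
    r2 = [x / c2 for x in d2]
    span = math.dist(r1, r2)  # point separation per unit camera height
    if span == 0:
        raise ValueError("the two directions must differ")
    h = math.dist(p1, p2) / span
    # Horizontal offsets from the point below the camera to each point.
    q1 = [h * (r - gi) for r, gi in zip(r1, g)]
    q2 = [h * (r - gi) for r, gi in zip(r2, g)]
    # Horizontal basis in the camera frame: e1 along point 1 -> point 2,
    # e2 is e1 rotated 90 degrees counter-clockwise seen from above.
    e1 = _unit([b - a for a, b in zip(q1, q2)])
    up = [-x for x in g]
    e2 = _cross(up, e1)
    # Matching basis on the field.
    f1 = _unit([p2[0] - p1[0], p2[1] - p1[1]])
    f2 = [-f1[1], f1[0]]
    a, b = _dot(q1, e1), _dot(q1, e2)
    x = p1[0] - a * f1[0] - b * f2[0]
    y = p1[1] - a * f1[1] - b * f2[1]
    return x, y, h


# Camera at (1, 2) with height 0.5, camera axes aligned with the field.
print(camera_position([2, -1, -0.5], [2, 1, -0.5], [0, 0, -1], (3, 1), (3, 3)))
# approximately (1.0, 2.0, 0.5)
```

The height follows from the known distance between the points alone; the known field coordinates are only needed to place the result on the field.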

A shallow residual neural network to predict the visual cortex response

Jun 27, 2019

Anne-Ruth José Meijer, Arnoud Visser

Abstract:Understanding how the visual cortex of the human brain really works is still an open problem for science today. A better understanding of natural intelligence could also benefit object-recognition algorithms based on convolutional neural networks. In this paper we demonstrate the advantage of using a shallow residual neural network for this task. The benefit of this approach is that earlier stages of the network can be accurately trained, which allows us to add more layers at the earlier stage. With this additional layer the prediction of the visual brain activity improves from $10.4\%$ (block 1) to $15.53\%$ (last fully connected layer). By training the network for more than 10 epochs this improvement can become even larger.

* 3 pages, 5 figures
