Abstract:The task of locating and classifying different types of vehicles has become a vital element in numerous applications of automation and intelligent systems ranging from traffic surveillance to vehicle identification and many more. In recent times, Deep Learning models have been dominating the field of vehicle detection. Yet, Bangladeshi vehicle detection has remained a relatively unexplored area. One of the main goals of vehicle detection is its real-time application, where `You Only Look Once' (YOLO) models have proven to be the most effective architecture. In this work, intending to find the best-suited YOLO architecture for fast and accurate vehicle detection from traffic images in Bangladesh, we have conducted a performance analysis of different variants of the YOLO-based architectures such as YOLOV3, YOLOV5s, and YOLOV5x. The models were trained on a dataset containing 7390 images belonging to 21 types of vehicles comprising samples from the DhakaAI dataset, the Poribohon-BD dataset, and our self-collected images. After thorough quantitative and qualitative analysis, we found the YOLOV5x variant to be the best-suited model, performing better than YOLOv3 and YOLOv5s models respectively by 7 & 4 percent in mAP, and 12 & 8.5 percent in terms of Accuracy.
Abstract:Ill-managed intersections are the primary reasons behind the increasing traffic problem in urban areas, leading to nonoptimal traffic-flow and unnecessary deadlocks. In this paper, we propose an automated intersection management system that extracts data from a well-defined grid of sensors and optimizes traffic flow by controlling traffic signals. The data extraction mechanism is independent of the optimization algorithm and this paper primarily emphasizes the later one. We have used MiniZinc modeling language to define our system as a constraint satisfaction problem which can be solved using any off-the-shelf solver. The proposed system performs much better than the systems currently in use. Our system reduces the mean waiting time and standard deviation of the waiting time of vehicles and avoids deadlocks.
Abstract:Class Scheduling is a highly constrained task. Educational institutes spend a lot of resources, in the form of time and manual computation, to find a satisficing schedule that fulfills all the requirements. A satisficing class schedule accommodates all the students to all their desired courses at convenient timing. The scheduler also needs to take into account the availability of course teachers on the given slots. With the added limitation of available classrooms, the number of solutions satisfying all constraints in this huge search-space, further decreases. This paper proposes an efficient system to generate class schedules that can fulfill every possible need of a typical university. Though it is primarily a fixed-credit scheduler, it can be adjusted for open-credit systems as well. The model is designed in MiniZinc and solved using various off-the-shelf solvers. The proposed scheduling system can find a balanced schedule for a moderate-sized educational institute in less than a minute.
Abstract:Video captioning, i.e. the task of generating captions from video sequences creates a bridge between the Natural Language Processing and Computer Vision domains of computer science. Generating a semantically accurate description of a video is an arduous task. Considering the complexity of the problem, the results obtained in recent researches are quite outstanding. But still there is plenty of scope for improvement. This paper addresses this scope and proposes a novel solution. Most video captioning models comprise of two sequential/recurrent layers - one as a video-to-context encoder and the other as a context-to-caption decoder. This paper proposes a novel architecture, SSVC (Semantically Sensible Video Captioning) which modifies the context generation mechanism by using two novel approaches - "stacked attention" and "spatial hard pull". For evaluating the proposed architecture, along with the BLEU scoring metric for quantitative analysis, we have used a human evaluation metric for qualitative analysis. This paper refers to this proposed human evaluation metric as the Semantic Sensibility (SS) scoring metric. SS score overcomes the shortcomings of common automated scoring metrics. This paper reports that the use of the aforementioned novelties improves the performance of the state-of-the-art architectures.