Abstract:Understanding the creation, evolution, and dissemination of scientific knowledge is crucial for bridging diverse subject areas and addressing complex global challenges such as pandemics, climate change, and ethical AI. Scientometrics, the quantitative and qualitative study of scientific literature, provides valuable insights into these processes. We introduce Scito2M, a longitudinal scientometric dataset with over two million academic publications, providing comprehensive content information and citation graphs to support cross-disciplinary analyses. Using Scito2M, we conduct a temporal study spanning over 30 years to explore key questions in scientometrics: the evolution of academic terminology, citation patterns, and interdisciplinary knowledge exchange. Our findings reveal critical insights, such as disparities in epistemic cultures, knowledge production modes, and citation practices. For example, rapidly developing, application-driven fields like LLMs exhibit a significantly shorter citation age (2.48 years) than traditional theoretical disciplines like oral history (9.71 years).
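As a rough illustration of the citation-age statistic quoted in the abstract above, the following minimal sketch computes a mean citation age from a small citation graph. It assumes that citation age is the difference between the publication year of a citing paper and that of the paper it cites, averaged over all citation edges; the abstract does not spell out the exact definition, and the function name `mean_citation_age` and the toy data are hypothetical and not taken from Scito2M.

```python
from statistics import mean


def mean_citation_age(papers, citations):
    """Average gap in years between a citing paper and the paper it cites.

    papers:    dict mapping paper id -> publication year
    citations: iterable of (citing_id, cited_id) pairs
    Returns None when no usable citation edge is present.
    """
    ages = [
        papers[citing] - papers[cited]
        for citing, cited in citations
        # Skip edges whose endpoints have no known publication year.
        if citing in papers and cited in papers
    ]
    return mean(ages) if ages else None


# Hypothetical toy data for illustration only.
papers = {"a": 2023, "b": 2021, "c": 2020, "d": 2010}
citations = [("a", "b"), ("a", "c"), ("b", "d")]
print(mean_citation_age(papers, citations))
```

Grouping the citation edges by the field of the citing paper before averaging would give a per-field figure comparable in spirit to the values reported above.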
Abstract:Intelligent maritime, an essential component of smart ocean construction, deeply integrates advanced artificial intelligence technology and data analysis methods. It covers multiple aspects, such as smart vessels, route optimization, and safe navigation, and aims to enhance the efficiency of ocean resource utilization and the intelligence of transportation networks. However, the complex and dynamic maritime environment, together with diverse, heterogeneous, large-scale data sources, poses challenges for real-time decision-making in intelligent maritime. In this paper, we propose KUNPENG, the first embodied large model for intelligent maritime in smart ocean construction, which consists of six systems. The model perceives multi-source heterogeneous data to build cognition of environmental interaction and makes autonomous decisions, which intelligent vessels use to perform navigation behaviors under safety and emergency guarantees and to continuously optimize power, thereby achieving embodied intelligence in maritime. In comprehensive maritime task evaluations, KUNPENG demonstrates excellent performance.
Abstract:Peer review is fundamental to the integrity and advancement of scientific publication. Traditional methods of peer review analysis often rely on exploration and statistics of existing peer review data, which do not adequately address the multivariate nature of the process or account for latent variables, and are further constrained by privacy concerns due to the sensitive nature of the data. We introduce AgentReview, the first large language model (LLM) based peer review simulation framework, which effectively disentangles the impacts of multiple latent factors and addresses the privacy issue. Our study reveals significant insights, including a notable 37.1% variation in paper decisions due to reviewers' biases, supported by sociological theories such as social influence theory, altruism fatigue, and authority bias. We believe that this study could offer valuable insights to improve the design of peer review mechanisms.
Abstract:Image editing is a practical yet challenging task given the diverse demands of users, and one of the hardest parts is describing precisely what the edited image should look like. In this work, we present a new form of editing, termed imitative editing, to help users exercise their creativity more conveniently. Concretely, to edit an image region of interest, users are free to draw inspiration directly from in-the-wild references (e.g., related pictures encountered online), without having to worry about the fit between the reference and the source. Such a design requires the system to automatically figure out what to take from the reference to perform the editing. For this purpose, we propose a generative training framework, dubbed MimicBrush, which randomly selects two frames from a video clip, masks some regions of one frame, and learns to recover the masked regions using the information from the other frame. In this way, our model, developed from a diffusion prior, is able to capture the semantic correspondence between separate images in a self-supervised manner. We experimentally show the effectiveness of our method under various test cases as well as its superiority over existing alternatives. We also construct a benchmark to facilitate further research.
Abstract:We extend the adversarial/non-stochastic multi-play multi-armed bandit (MPMAB) to the case where the number of arms to play is variable. The work is motivated by the fact that the resources allocated to scan different critical locations in an interconnected transportation system change dynamically over time and with the environment. By modeling the malicious hacker and the intrusion monitoring system as the attacker and the defender, respectively, we formulate the problem for the two players as a sequential pursuit-evasion game. We derive the condition under which a Nash equilibrium of the strategic game exists. For the defender side, we provide an exponential-weights-based algorithm with sublinear pseudo-regret. We further extend our model to heterogeneous rewards for both players, and obtain lower and upper bounds on the average reward for the attacker. We provide numerical experiments to demonstrate the effectiveness of variable-arm play.
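To make the exponential-weights idea mentioned in the abstract above concrete, here is a minimal sketch of the standard single-play Exp3 algorithm for adversarial bandits. It is not the variable-arm multi-play algorithm of the paper, which the abstract does not specify; the function name `exp3`, the exploration rate `gamma`, and the toy reward function are assumptions made for illustration.

```python
import math
import random


def exp3(reward_fn, n_arms, horizon, gamma=0.1, seed=0):
    """Standard single-play Exp3; rewards are assumed to lie in [0, 1].

    reward_fn(t, arm) returns the reward of the chosen arm at round t.
    Returns the cumulative reward and the last sampling distribution.
    """
    rng = random.Random(seed)
    weights = [1.0] * n_arms
    total = 0.0
    probs = [1.0 / n_arms] * n_arms
    for t in range(horizon):
        w_sum = sum(weights)
        # Mix the exponential-weights distribution with uniform exploration.
        probs = [(1 - gamma) * w / w_sum + gamma / n_arms for w in weights]
        arm = rng.choices(range(n_arms), weights=probs)[0]
        reward = reward_fn(t, arm)
        total += reward
        # Importance-weighted reward estimate for the arm that was played.
        estimate = reward / probs[arm]
        weights[arm] *= math.exp(gamma * estimate / n_arms)
        # Rescale so weights stay bounded; the distribution is unchanged.
        top = max(weights)
        weights = [w / top for w in weights]
    return total, probs


# Toy environment (hypothetical): arm 2 always pays 1, the others pay 0.
total, probs = exp3(lambda t, arm: 1.0 if arm == 2 else 0.0, n_arms=3, horizon=2000)
print(total, probs)
```

A multi-play variant would sample a set of arms per round rather than a single arm, and the setting described above additionally lets the size of that set change over time.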
Abstract:In this paper, we propose a novel observer-based method for anomaly detection in connected and automated vehicles (CAVs). The proposed method utilizes an augmented extended Kalman filter (AEKF) to smooth sensor readings of a CAV based on a nonlinear car-following motion model with time delay, where the leading vehicle's trajectory is used by the subject vehicle to detect sensor anomalies. We use the classic $\chi^2$ fault detector in conjunction with the proposed AEKF for anomaly detection. To make the proposed model more suitable for real-world applications, we consider a stochastic communication time delay in the car-following model. Our experiments conducted on real-world connected vehicle data indicate that the AEKF with the $\chi^2$-detector can achieve high anomaly detection performance.
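The $\chi^2$ fault detector named in the abstract above can be sketched on a much simpler filter than the AEKF. The example below assumes a scalar random-walk Kalman filter as a stand-in for the nonlinear car-following model; the function name `chi2_detect`, the noise variances `q` and `r`, and the sample readings are hypothetical. A reading is flagged when its normalized squared innovation exceeds the 0.99 quantile of a $\chi^2$ distribution with one degree of freedom.

```python
CHI2_99_1DOF = 6.635  # 0.99 quantile of chi-square with 1 degree of freedom


def chi2_detect(measurements, x0=0.0, p0=1.0, q=0.01, r=0.25,
                threshold=CHI2_99_1DOF):
    """Scalar random-walk Kalman filter with a chi-square innovation test.

    Returns (estimates, flags); flags[k] is True when measurement k
    is declared anomalous.
    """
    x, p = x0, p0
    estimates, flags = [], []
    for z in measurements:
        # Predict: the random-walk model keeps the mean and inflates the variance.
        p = p + q
        # Innovation (measurement residual) and its variance.
        innovation = z - x
        s = p + r
        # Normalized squared innovation, chi-square(1) when no fault is present.
        g = innovation * innovation / s
        anomalous = g > threshold
        if not anomalous:
            # Standard Kalman update; skipping it for flagged readings
            # is a design choice of this sketch.
            k = p / s
            x = x + k * innovation
            p = (1.0 - k) * p
        estimates.append(x)
        flags.append(anomalous)
    return estimates, flags


# Hypothetical sensor readings with one injected spike.
readings = [0.1, -0.2, 0.05, 5.0, 0.0]
estimates, flags = chi2_detect(readings)
print(flags)
```

In a vector-valued filter, the test statistic becomes the quadratic form of the innovation with the inverse innovation covariance, and the degrees of freedom equal the measurement dimension.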
Abstract:In this paper, we propose a novel observer-based method to improve the safety and security of connected and automated vehicle (CAV) transportation. The proposed method combines model-based signal filtering and anomaly detection methods. Specifically, we use an adaptive extended Kalman filter (AEKF) to smooth sensor readings of a CAV based on a nonlinear car-following model. Using the car-following model, the subject vehicle (i.e., the following vehicle) utilizes the leading vehicle's information to detect sensor anomalies by employing previously trained One Class Support Vector Machine (OCSVM) models. This approach allows the AEKF to estimate the state of a vehicle based not only on the vehicle's location and speed, but also on the state of the surrounding traffic. A communication time delay factor is considered in the car-following model to make it more suitable for real-world applications. Our experiments show that, compared with the AEKF with a traditional $\chi^2$-detector, our proposed method achieves better anomaly detection performance. We also demonstrate that a larger time delay factor has a negative impact on the overall detection performance.
Abstract:In this paper, we propose TECU, a realizable framework that embeds task-specific strategies into the update schemes of coordinate descent for optimizing multivariate non-convex problems with coupled objective functions. On the one hand, TECU can improve algorithmic efficiency by embedding productive numerical algorithms for optimizing univariate sub-problems with nice properties. On the other hand, it also increases the probability of obtaining desired results by embedding advanced techniques in the optimization of realistic tasks. Integrating both numerical algorithms and advanced techniques, TECU provides a unified framework for solving a class of non-convex problems. Although the task-embedded strategies introduce inaccuracies in sub-problem optimization, we provide a realizable criterion to control the errors while ensuring robust performance with rigorous theoretical analysis. Experiments that embed ADMM and a residual-type CNN, respectively, in our algorithmic framework verify both the efficiency and the effectiveness of embedding task-oriented strategies in coordinate descent for solving practical problems.
Abstract:Enhancing the visual quality of images plays an important role in various vision and learning applications. In the past few years, both knowledge-driven maximum a posteriori (MAP) estimation with prior modeling and fully data-dependent convolutional neural network (CNN) techniques have been investigated to address specific enhancement tasks. In this paper, by exploiting the advantages of these two types of mechanisms from a complementary propagation perspective, we propose a unified framework, named deep prior ensemble (DPE), for solving various image enhancement tasks. Specifically, we first establish the basic propagation scheme based on fundamental image modeling cues and then introduce residual CNNs to help predict the propagation direction at each stage. By designing prior projections to perform feedback control, we theoretically prove that even with experience-inspired CNNs, DPE is guaranteed to converge and its output always satisfies our fundamental task constraints. The main advantage over conventional optimization-based MAP approaches is that our descent directions are learned from collected training data and are thus much more robust to unwanted local minima. Meanwhile, compared with existing CNN-type networks, which are often designed heuristically without theoretical guarantees, DPE benefits from rich task cues derived from domain knowledge. Therefore, DPE provides a generic ensemble methodology to integrate both knowledge-based and data-based cues for different image enhancement tasks. More importantly, our theoretical investigations verify that the feedforward propagations of DPE are properly controlled toward our desired solution. Experimental results demonstrate that the proposed DPE outperforms state-of-the-art methods on a variety of image enhancement tasks in terms of both quantitative measures and visual perception quality.
Abstract:In recent years, numerous vision and learning tasks have been (re)formulated as nonconvex and nonsmooth programming problems (NNPs). Although some algorithms have been proposed for particular problems, designing fast and flexible optimization schemes with theoretical guarantees remains a challenging task for general NNPs. It has been observed that performing inexact inner iterations often benefits particular applications on a case-by-case basis, but the convergence behavior of such schemes is still unclear. Motivated by these practical experiences, this paper designs a novel algorithmic framework, named the inexact proximal alternating direction method (IPAD), for solving general NNPs. We demonstrate that any numerical algorithm can be incorporated into IPAD for solving subproblems and that the convergence of the resulting hybrid schemes can be consistently guaranteed by a series of simple error conditions. Beyond the theoretical guarantee, numerical experiments on both synthesized and real-world data further demonstrate the superiority and flexibility of our IPAD framework for practical use.