Abstract:Learning effective negotiation strategies poses two key challenges: the exploration-exploitation dilemma and large action spaces. However, learning-based approaches that effectively address these challenges in negotiation are lacking. This paper introduces a comprehensive formulation that covers a variety of negotiation problems. Our approach leverages contextual combinatorial multi-armed bandits: the bandit framework resolves the exploration-exploitation dilemma, and the combinatorial structure handles large action spaces. Building upon this formulation, we introduce NegUCB, a novel method that also handles common issues in negotiation such as partial observations and complex reward functions. NegUCB is contextual and tailored for full-bandit feedback without constraints on the reward functions. Under mild assumptions, it ensures a sub-linear regret upper bound. Experiments conducted on three negotiation tasks demonstrate the superiority of our approach.
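As a rough illustration of how an upper-confidence-bound rule trades off exploration and exploitation, the sketch below implements plain UCB1 on a small stochastic bandit. It is not NegUCB: it is non-contextual and non-combinatorial, and it observes per-arm rewards. The function names `ucb1` and `bernoulli`, the reward probabilities, and the horizon are all assumptions chosen for illustration.

```python
import math
import random


def ucb1(reward_fns, horizon, seed=0):
    """Run UCB1 for `horizon` rounds; return pull counts and mean estimates.

    `reward_fns` is a list of callables, one per arm, each taking a
    random.Random instance and returning a reward in [0, 1].
    """
    rng = random.Random(seed)
    n_arms = len(reward_fns)
    counts = [0] * n_arms
    means = [0.0] * n_arms
    for t in range(1, horizon + 1):
        if t <= n_arms:
            # Pull every arm once so each has an initial estimate.
            arm = t - 1
        else:
            # Choose the arm with the highest empirical mean plus
            # exploration bonus; rarely pulled arms get a larger bonus.
            arm = max(
                range(n_arms),
                key=lambda a: means[a] + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = reward_fns[arm](rng)
        counts[arm] += 1
        # Incremental update of the running mean for the pulled arm.
        means[arm] += (reward - means[arm]) / counts[arm]
    return counts, means


def bernoulli(p):
    """Hypothetical arm: reward 1.0 with probability p, else 0.0."""
    return lambda rng: 1.0 if rng.random() < p else 0.0


if __name__ == "__main__":
    counts, means = ucb1([bernoulli(0.2), bernoulli(0.5), bernoulli(0.8)], horizon=2000)
    print(counts, [round(m, 3) for m in means])
```

The exploration bonus shrinks as an arm is pulled more often, so pulls concentrate on the arm with the best estimated mean while weaker arms are still revisited occasionally.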
Abstract:The generalization of decision-making agents encompasses two fundamental elements: learning from past experiences and reasoning in novel contexts. However, the predominant emphasis in most interactive environments is on learning, often at the expense of complexity in reasoning. In this paper, we introduce CivRealm, an environment inspired by the Civilization game. Civilization's profound alignment with human history and society necessitates sophisticated learning, while its ever-changing situations demand strong reasoning to generalize. Particularly, CivRealm sets up an imperfect-information general-sum game with a changing number of players; it presents a plethora of complex features, challenging the agent to deal with open-ended stochastic environments that require diplomacy and negotiation skills. Within CivRealm, we provide interfaces for two typical agent types: tensor-based agents that focus on learning, and language-based agents that emphasize reasoning. To catalyze further research, we present initial results for both paradigms. The canonical RL-based agents exhibit reasonable performance in mini-games, whereas both RL- and LLM-based agents struggle to make substantial progress in the full game. Overall, CivRealm stands as a unique learning and reasoning challenge for decision-making agents. The code is available at https://github.com/bigai-ai/civrealm.
Abstract:Illegal vehicle parking is a common urban problem in major cities around the world, as it causes traffic jams, which in turn lead to air pollution and traffic accidents. Governments rely heavily on active human effort to detect illegal parking events. However, this approach is highly inefficient for covering a large city, since the police have to patrol all of the city's roads. The massive, high-quality shared-bike trajectories from Mobike offer a unique opportunity to design a ubiquitous illegal parking detection approach, as most illegal parking events happen at curbsides and have a significant impact on bike users. The detection results can guide the patrol schedule, i.e., send patrol police to the regions with higher illegal parking risk, and thereby improve patrol efficiency. Inspired by this idea, the proposed framework employs three main components: 1) trajectory pre-processing, which filters outlier GPS points, performs map-matching, and builds trajectory indexes; 2) illegal parking detection, which models the normal trajectories, extracts features from the evaluation trajectories, and uses a distribution test-based method to discover illegal parking events; and 3) patrol scheduling, which uses the detection results as reference context and models the scheduling task as a multi-agent reinforcement learning problem to guide the patrol police. Finally, extensive experiments validate the effectiveness of illegal parking detection, as well as the improvement in patrol efficiency.
Abstract:People often refer to a place of interest (POI) by an alias. In e-commerce scenarios, the POI alias problem degrades the quality of the delivery addresses of online orders, which poses substantial challenges to intelligent logistics systems and market decision-making. Labeling the aliases of POIs requires heavy human labor, which is inefficient and expensive. Inspired by the observation that users' GPS locations are highly related to their delivery addresses, we propose a ubiquitous alias discovery framework. First, for each POI name in the delivery addresses, we extract the location data of its associated users, termed its mobility profile. Then, we identify alias relationships by modeling the similarity of mobility profiles. Comprehensive experiments on large-scale location data and delivery address data from JD logistics validate the effectiveness of the framework.
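One simple way to picture "similarity of mobility profiles" is to bin the GPS points associated with each POI name into grid cells and compare the resulting histograms. The sketch below does this with cosine similarity. It is an assumption made for illustration, not the similarity model of the framework described above; the function names, the grid cell size, and the coordinates are all hypothetical.

```python
import math
from collections import Counter


def mobility_profile(points, cell_deg=0.001):
    """Histogram of grid cells covered by (lat, lon) points for one POI name.

    `cell_deg` is an assumed grid resolution in degrees.
    """
    return Counter(
        (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        for lat, lon in points
    )


def cosine_similarity(p, q):
    """Cosine similarity between two cell-count histograms (Counters)."""
    # Counter returns 0 for cells that are missing from q.
    dot = sum(count * q[cell] for cell, count in p.items())
    norm_p = math.sqrt(sum(c * c for c in p.values()))
    norm_q = math.sqrt(sum(c * c for c in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_p * norm_q)


if __name__ == "__main__":
    # Hypothetical GPS points for users whose addresses mention each name.
    name_a = [(39.90421, 116.40741), (39.90425, 116.40745), (39.90515, 116.40745)]
    name_b = [(39.90428, 116.40748), (39.90422, 116.40742), (39.90518, 116.40741)]
    name_c = [(31.23040, 121.47370)]
    print(cosine_similarity(mobility_profile(name_a), mobility_profile(name_b)))
    print(cosine_similarity(mobility_profile(name_a), mobility_profile(name_c)))
```

Two names whose users cluster in the same cells score close to 1 and are alias candidates, while names with disjoint footprints score 0. A practical system would likely need a threshold and more robust features than raw cell counts.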
Abstract:Data-driven approaches have been applied to many problems in urban computing. However, in the research community, such approaches are commonly studied using data from limited sources, and are thus unable to characterize the complexity of urban data coming from multiple entities and the correlations among them. Consequently, an inclusive and multifaceted dataset is necessary to facilitate more extensive studies on urban computing. In this paper, we present CityNet, a multi-modal urban dataset containing data from 7 cities, each of which comes from 3 data sources. We first present the generation process of CityNet as well as its basic properties. In addition, to facilitate the use of CityNet, we carry out extensive machine learning experiments, including spatio-temporal prediction, transfer learning, and reinforcement learning. The experimental results not only provide benchmarks for a wide range of tasks and methods, but also uncover internal correlations among cities and tasks within CityNet that, when properly leveraged, can improve performance on various tasks. With the benchmarking results and the correlations uncovered, we believe that CityNet can contribute to the field of urban computing by supporting research on many advanced topics.