Abstract:A point-to-point movable element (ME) enabled reconfigurable intelligent surface (ME-RIS) communication system is investigated, where each element position can be flexibly adjusted to create favorable channel conditions. For maximizing the communication rate, an efficient ME position optimization approach is proposed. Specifically, by characterizing the cascaded channel power gain in an element-wise manner, the position of each ME is iteratively updated by invoking the successive convex approximation method. Numerical results unveil that 1) the proposed element-wise ME position optimization algorithm outperforms the gradient descent algorithm; and 2) the ME-RIS significantly improves the communication rate compared to the conventional RIS with fixed-position elements.
Abstract:Legal case documents play a critical role in judicial proceedings. As the number of cases continues to rise, the reliance on manual drafting of legal case documents is facing increasing pressure and challenges. The development of large language models (LLMs) offers a promising solution for automating document generation. However, existing benchmarks fail to fully capture the complexities involved in drafting legal case documents in real-world scenarios. To address this gap, we introduce CaseGen, the benchmark for multi-stage legal case documents generation in the Chinese legal domain. CaseGen is based on 500 real case samples annotated by legal experts and covers seven essential case sections. It supports four key tasks: drafting defense statements, writing trial facts, composing legal reasoning, and generating judgment results. To the best of our knowledge, CaseGen is the first benchmark designed to evaluate LLMs in the context of legal case document generation. To ensure an accurate and comprehensive evaluation, we design the LLM-as-a-judge evaluation framework and validate its effectiveness through human annotations. We evaluate several widely used general-domain LLMs and legal-specific LLMs, highlighting their limitations in case document generation and pinpointing areas for potential improvement. This work marks a step toward a more effective framework for automating legal case documents drafting, paving the way for the reliable application of AI in the legal field. The dataset and code are publicly available at https://github.com/CSHaitao/CaseGen.
Abstract:Skyline detection plays an important role in geolocalizaion, flight control, visual navigation, port security, etc. The appearance of the sky and non-sky areas are variable, because of different weather or illumination environment, which brings challenges to skyline detection. In this research, we proposed the YUNet algorithm, which improved the YOLOv11 architecture to segment the sky region and extract the skyline in complicated and variable circumstances. To improve the ability of multi-scale and large range contextual feature fusion, the YOLOv11 architecture is extended as an UNet-like architecture, consisting of an encoder, neck and decoder submodule. The encoder extracts the multi-scale features from the given images. The neck makes fusion of these multi-scale features. The decoder applies the fused features to complete the prediction rebuilding. To validate the proposed approach, the YUNet was tested on Skyfinder and CH1 datasets for segmentation and skyline detection respectively. Our test shows that the IoU of YUnet segmentation can reach 0.9858, and the average error of YUnet skyline detection is just 1.36 pixels. The implementation is published at https://github.com/kuazhangxiaoai/SkylineDet-YOLOv11Seg.git.
Abstract:A novel movable-element enabled simultaneously transmitting and reflecting surface (ME-STARS) communication system is proposed, where ME-STARS elements positions can be adjusted to enhance the degress-of-freedom for transmission and reflection. For each ME-STARS operating protocols, namely energy-splitting (ES), mode switching (MS), and time switching (TS), a weighted sum rate (WSR) maximization problem is formulated to jointly optimize the active beamforming at the base station (BS) as well as the elements positions and passive beamforming at the ME-STARS. An alternative optimization (AO)-based iterative algorithm is developed to decompose the original non-convex problem into three subproblems. Specifically, the gradient descent algorithm is employed for solving the ME-STARS element position optimization subproblem, and the weighted minimum mean square error and the successive convex approximation methods are invoked for solving the active and passive beamforming subproblems, respectively. It is further demonstrated that the proposed AO algorithm for ES can be extended to solve the problems for MS and TS. Numerical results unveil that: 1) the ME-STARS can significantly improve the WSR compared to the STARS with fixed position elements and the conventional reconfigurable intelligent surface with movable elements, thanks to the extra spatial-domain diversity and the higher flexibility in beamforming; and 2) the performance gain of ME-STARS is significant in the scenarios with larger number of users or more scatterers.
Abstract:Non-contact manipulation is an emerging and highly promising methodology in robotics, offering a wide range of scientific and industrial applications. Among the proposed approaches, airflow stands out for its ability to project across considerable distances and its flexibility in actuating objects of varying materials, sizes, and shapes. However, predicting airflow fields at a distance, as well as the motion of objects within them, remains notoriously challenging due to their nonlinear and stochastic nature. Here, we propose a model-based learning approach using a jet-induced airflow field for remote multi-object manipulation on a surface. Our approach incorporates an analytical model of the field, learned object dynamics, and a model-based controller. The model predicts an air velocity field over an infinite surface for a specified jet orientation, while the object dynamics are learned through a robust system identification algorithm. Using the model-based controller, we can automatically and remotely, at meter-scale distances, control the motion of single and multiple objects for different tasks, such as path-following, aggregating, and sorting.
Abstract:Ensuring fairness has emerged as one of the primary concerns in AI and its related algorithms. Over time, the field of machine learning fairness has evolved to address these issues. This paper provides an extensive overview of this field and introduces two formal frameworks to tackle open questions in machine learning fairness. In one framework, operator-valued optimisation and min-max objectives are employed to address unfairness in time-series problems. This approach showcases state-of-the-art performance on the notorious COMPAS benchmark dataset, demonstrating its effectiveness in real-world scenarios. In the second framework, the challenge of lacking sensitive attributes, such as gender and race, in commonly used datasets is addressed. This issue is particularly pressing because existing algorithms in this field predominantly rely on the availability or estimations of such attributes to assess and mitigate unfairness. Here, a framework for a group-blind bias-repair is introduced, aiming to mitigate bias without relying on sensitive attributes. The efficacy of this approach is showcased through analyses conducted on the Adult Census Income dataset. Additionally, detailed algorithmic analyses for both frameworks are provided, accompanied by convergence guarantees, ensuring the robustness and reliability of the proposed methodologies.
Abstract:Time series anomaly detection (TSAD) has become an essential component of large-scale cloud services and web systems because it can promptly identify anomalies, providing early warnings to prevent greater losses. Deep learning-based forecasting methods have become very popular in TSAD due to their powerful learning capabilities. However, accurate predictions don't necessarily lead to better anomaly detection. Due to the common occurrence of noise, i.e., local peaks and drops in time series, existing black-box learning methods can easily learn these unintended patterns, significantly affecting anomaly detection performance. Kolmogorov-Arnold Networks (KAN) offers a potential solution by decomposing complex temporal sequences into a combination of multiple univariate functions, making the training process more controllable. However, KAN optimizes univariate functions using spline functions, which are also susceptible to the influence of local anomalies. To address this issue, we present KAN-AD, which leverages the Fourier series to emphasize global temporal patterns, thereby mitigating the influence of local peaks and drops. KAN-AD improves both effectiveness and efficiency by transforming the existing black-box learning approach into learning the weights preceding univariate functions. Experimental results show that, compared to the current state-of-the-art, we achieved an accuracy increase of 15% while boosting inference speed by 55 times.
Abstract:Soft robotic manipulators are generally slow despite their great adaptability, resilience, and compliance. This limitation also extends to current soft robotic micromanipulators. Here, we introduce FilMBot, a 3-DOF film-based, electromagnetically actuated, soft kinematic robotic micromanipulator achieving speeds up to 2117 $\deg$/s and 2456 $\deg$/s in $\alpha$ and $\beta$ angular motions, with corresponding linear velocities of 1.61 m/s and 1.92 m/s using a 4-cm needle end-effector, and 1.57 m/s along the Z axis. The robot can reach ~1.50 m/s in path-following tasks, operates at frequencies up to 30 Hz, and remains functional up to 50 Hz. It demonstrates high precision (~6.3 $\mu$m, or ~0.05% of its workspace) in small path-following tasks. The novel combination of the low-stiffness soft kinematic film structure and strong electromagnetic actuation in FilMBot opens new avenues for soft robotics. Furthermore, its simple construction and inexpensive, readily accessible components could broaden the application of micromanipulators beyond current academic and professional users.
Abstract:The bottleneck associated with the key-value(KV) cache presents a significant challenge during the inference processes of large language models. While depth pruning accelerates inference, it requires extensive recovery training, which can take up to two weeks. On the other hand, width pruning retains much of the performance but offers slight speed gains. To tackle these challenges, we propose KVPruner to improve model efficiency while maintaining performance. Our method uses global perplexity-based analysis to determine the importance ratio for each block and provides multiple strategies to prune non-essential KV channels within blocks. Compared to the original model, KVPruner reduces runtime memory usage by 50% and boosts throughput by over 35%. Additionally, our method requires only two hours of LoRA fine-tuning on small datasets to recover most of the performance.
Abstract:Convergence analysis of Markov chain Monte Carlo methods in high-dimensional statistical applications is increasingly recognized. In this paper, we develop general mixing time bounds for Metropolis-Hastings algorithms on discrete spaces by building upon and refining some recent theoretical advancements in Bayesian model selection problems. We establish sufficient conditions for a class of informed Metropolis-Hastings algorithms to attain relaxation times that are independent of the problem dimension. These conditions are grounded in high-dimensional statistical theory and allow for possibly multimodal posterior distributions. We obtain our results through two independent techniques: the multicommodity flow method and single-element drift condition analysis; we find that the latter yields a tighter mixing time bound. Our results and proof techniques are readily applicable to a broad spectrum of statistical problems with discrete parameter spaces.