Abstract: This paper explores optimal stopping problems in continuous time and state space from a reinforcement learning perspective. We begin by formulating the stopping problem using randomized stopping times, where the decision maker's control is represented by the probability of stopping within a given time, specifically a bounded, non-decreasing, càdlàg control process. To encourage exploration and facilitate learning, we introduce a regularized version of the problem by penalizing it with the cumulative residual entropy of the randomized stopping time. The regularized problem takes the form of an (n+1)-dimensional degenerate singular stochastic control problem with finite fuel. We address this through the dynamic programming principle, which enables us to identify the unique optimal exploratory strategy. For the specific case of a real option problem, we derive a semi-explicit solution to the regularized problem, allowing us to assess the impact of entropy regularization and analyze the vanishing entropy limit. Finally, we propose a reinforcement learning algorithm based on policy iteration and establish both policy improvement and policy convergence results for it.
Abstract: Free-space optics (FSO) is an attractive technology to meet the ever-growing demand for wireless bandwidth in next-generation networks. To increase the spectral efficiency of FSO links, spatial division multiplexing (SDM) can be exploited, where orthogonal light beams have to be shaped according to suitable amplitude, phase, and polarization profiles. In this work, we show that a programmable photonic circuit, consisting of a silicon photonic mesh of tunable Mach-Zehnder interferometers (MZIs), can be used as an adaptive multibeam receiver for an FSO communication link. The circuit can self-configure to simultaneously receive and separate, with negligible mutual crosstalk, signals carried by orthogonal FSO beams sharing the same wavelength and polarization. This feature is demonstrated on signal pairs either arriving at the receiver from orthogonal directions (direction diversity) or shaped according to different orthogonal spatial modes (mode diversity), even in the presence of some mixing during propagation. The performance of the programmable mesh as an adaptive multibeam receiver is assessed by means of data channel transmission at 10 Gbit/s at a wavelength of 1550 nm, but the optical bandwidth of the receiver (>40 nm) allows its use at much higher data rates as well as in wavelength-division multiplexing SDM communication links.
Abstract: There are several indications that the brain is organized not on the basis of individual unreliable neurons, but on a microcircuit scale that provides Lego-like blocks used to build complex architectures. At such an intermediate scale, the firing activity in the microcircuits is governed by collective effects emerging from the background noise eliciting spontaneous firing, the degree of mutual connection between the neurons, and the topology of the connections. We compare the spontaneous firing activity of small populations of neurons adhering to an engineered scaffold with simulations of biologically plausible CMOS artificial neuron populations whose spontaneous activity is ignited by tailored background noise. We provide a full set of flexible, low-power silicon blocks including neurons, excitatory and inhibitory synapses, and both white and pink noise generators for spontaneous firing activation. We achieve a degree of correlation of the firing activity comparable to that of the biological neurons by controlling the kind and the number of connections among the silicon neurons. The correlation between groups of neurons, organized as a ring of four distinct populations connected by the equivalent of interneurons, is triggered more effectively by adding multiple synapses to the connections than by increasing the number of independent point-to-point connections. The comparison between the biological and the artificial systems suggests that a considerable number of synapses is also active in biological populations adhering to engineered scaffolds.