Abstract: We present a novel framework for efficiently and effectively extending powerful continuous diffusion processes to discrete modeling. Previous approaches have suffered from the discrepancy between discrete data and continuous modeling. Our study reveals that the absence of guidance from discrete boundaries when learning probability contours is one of the main causes. To address this issue, we propose a two-step forward process that first estimates the boundary as a prior distribution and then rescales the forward trajectory to construct a boundary-conditional diffusion model. The reverse process is proportionally adjusted to guarantee that the learned contours yield more precise discrete data. Experimental results indicate that our approach achieves strong performance in both language modeling and discrete image generation. In language modeling, our approach surpasses previous state-of-the-art continuous diffusion language models on three translation tasks and a summarization task, while also performing competitively with auto-regressive transformers. Moreover, our method matches continuous diffusion models when using discrete ordinal pixels and establishes a new state of the art for categorical image generation on the CIFAR-10 dataset.
Abstract: This manuscript briefly describes the algorithm we used to participate in the CoNIC Challenge 2022. After the baseline was released, we followed its method but replaced the ResNet backbone with a ConvNeXt one. Moreover, we propose first converting images from RGB space to Haematoxylin-Eosin-DAB (HED) space, then using the Haematoxylin component of the original image to smooth the semantic one-hot labels. We also explored the nuclei distributions of the training and validation sets to select the best fold split for training the model for the final test-phase submission. Results on the validation set show that, even with fewer channels per stage, HoVerNet with a ConvNeXt-tiny backbone still improves mPQ+ by 0.04 and multi r2 by 0.0144.
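The HED conversion and label-smoothing step above can be sketched as follows. This is a minimal, hypothetical illustration, not the authors' implementation: the abstract does not specify how the Haematoxylin channel modulates the labels, so the blending rule and the `alpha` parameter here are assumptions. The colour deconvolution follows the standard Ruifrok-Johnston stain matrix (the same one scikit-image's `rgb2hed` uses), implemented directly in NumPy.

```python
import numpy as np

# Ruifrok-Johnston stain vectors for H&E-DAB (rows: H, E, DAB in RGB).
RGB_FROM_HED = np.array([[0.65, 0.70, 0.29],
                         [0.07, 0.99, 0.11],
                         [0.27, 0.57, 0.78]])
HED_FROM_RGB = np.linalg.inv(RGB_FROM_HED)


def rgb_to_hed(rgb):
    """Colour deconvolution: RGB image in [0, 1] -> (H, E, DAB) stain densities."""
    rgb = np.clip(np.asarray(rgb, dtype=float), 1e-6, None)  # avoid log(0)
    optical_density = -np.log10(rgb)
    return optical_density @ HED_FROM_RGB


def smooth_onehot_with_haematoxylin(rgb_image, onehot, alpha=0.1):
    """Soften a one-hot semantic label map using the Haematoxylin channel.

    `alpha` is a hypothetical smoothing strength; the blending rule below
    (blend toward uniform, weighted by low Haematoxylin response so that
    strongly stained nuclei keep sharp labels) is one plausible reading
    of the abstract, not the paper's exact scheme.
    """
    h = rgb_to_hed(rgb_image)[..., 0]
    # Normalize the Haematoxylin response to [0, 1].
    h = (h - h.min()) / (h.max() - h.min() + 1e-8)
    n_classes = onehot.shape[-1]
    uniform = np.full_like(onehot, 1.0 / n_classes)
    weight = alpha * (1.0 - h)[..., None]
    return (1.0 - weight) * onehot + weight * uniform
```

The smoothed map remains a valid per-pixel distribution (each pixel's class probabilities still sum to 1), so it can be used directly as a soft target for cross-entropy training.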