Jin-Woo Chung

Improbable Bigrams Expose Vulnerabilities of Incomplete Tokens in Byte-Level Tokenizers

Oct 31, 2024
Via arXiv

Ignore Me But Don't Replace Me: Utilizing Non-Linguistic Elements for Pretraining on the Cybersecurity Domain

Mar 15, 2024
Via arXiv

DarkBERT: A Language Model for the Dark Side of the Internet

May 18, 2023
Via arXiv

Shedding New Light on the Language of the Dark Web

Apr 14, 2022
Via arXiv