Text watermarking is a technique for embedding hidden information within textual content to verify its authenticity, origin, or ownership; research on the topic began in 1997.[1] With the rise of generative AI systems built on large language models (LLMs), significant development has focused on watermarking AI-generated text.[2] Potential applications include detecting fake news and academic cheating, and excluding AI-generated material from LLM training data.[3] For LLMs, the emphasis is on linguistic approaches in which the model's word choices form patterns in the text that can later be identified.[1] The first reported large-scale public deployment was a trial using Google's Gemini chatbot, whose results appeared in October 2024: across 20 million responses, Google found that users rated watermarked and unwatermarked text as being of equal quality.[3]
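
To illustrate the idea of word-selection patterns, the sketch below implements one widely studied family of schemes: at each step the vocabulary is pseudo-randomly split into a favoured "green" subset, seeded by the preceding word, and generation is biased toward green words; a detector recomputes the same splits and measures how often the text lands on them. The toy vocabulary, the GREEN_FRACTION and GREEN_BOOST parameters, and all function names are invented for this illustration and do not describe Google's SynthID or any other specific deployed system.

    import hashlib
    import random

    # Toy vocabulary and parameters, invented purely for illustration.
    VOCAB = ["the", "a", "cat", "dog", "sat", "ran", "on", "under", "mat", "rug"]
    GREEN_FRACTION = 0.5   # share of the vocabulary marked "green" at each step
    GREEN_BOOST = 4.0      # how strongly generation favours green words

    def green_list(prev_word):
        """Pseudo-randomly split the vocabulary, seeded by the previous word."""
        seed = int(hashlib.sha256(prev_word.encode()).hexdigest(), 16)
        rng = random.Random(seed)
        shuffled = list(VOCAB)
        rng.shuffle(shuffled)
        return set(shuffled[: int(len(shuffled) * GREEN_FRACTION)])

    def pick_next(prev_word, rng):
        """Sample the next word, up-weighting words on the current green list."""
        greens = green_list(prev_word)
        weights = [GREEN_BOOST if w in greens else 1.0 for w in VOCAB]
        return rng.choices(VOCAB, weights=weights, k=1)[0]

    def green_fraction(text):
        """Detector signal: the fraction of words that fall on their green list.

        Watermarked text scores well above GREEN_FRACTION; ordinary text
        stays close to it.
        """
        words = text.split()
        if len(words) < 2:
            return 0.0
        hits = sum(1 for prev, cur in zip(words, words[1:]) if cur in green_list(prev))
        return hits / (len(words) - 1)

    if __name__ == "__main__":
        rng = random.Random(0)
        words = ["the"]
        for _ in range(200):
            words.append(pick_next(words[-1], rng))
        print("watermarked green fraction:", round(green_fraction(" ".join(words)), 2))

In a production system the same bias is applied to the model's token probabilities inside the LLM rather than to a fixed word list, and detection uses a statistical test over many tokens, so the watermark survives without noticeably changing the text's quality.
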
References
1. Kamaruddin, Nurul Shamimi; Kamsin, Amirrudin; Por, Lip Yee; Rahman, Hameedur (2018). "A Review of Text Watermarking: Theory, Methods, and Applications". IEEE Access. 6: 8011–8028. doi:10.1109/ACCESS.2018.2796585. ISSN 2169-3536.
2. Liu, Aiwei; Pan, Leyi; Lu, Yijian; Li, Jingjing; Hu, Xuming; Zhang, Xi; Wen, Lijie; King, Irwin; Xiong, Hui; Yu, Philip (3 September 2024). "A Survey of Text Watermarking in the Era of Large Language Models". ACM Computing Surveys. doi:10.1145/3691626. ISSN 0360-0300.
3. Gibney, Elizabeth (23 October 2024). "Google unveils invisible 'watermark' for AI-generated text". Nature. Retrieved 26 October 2024.