I'll be attending #EACL this week and presenting a poster. Definitely reach out if you want to grab coffee!
My poster is
`Parameter-Efficient Korean Character-Level Language Modeling`,
where we describe a way to efficiently encode and decode (Korean) syllable-level representations without requiring an embedding for each possible syllable in the Korean writing system (more than 11k).
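For a sense of why a per-syllable embedding table is avoidable: every precomposed Hangul syllable in Unicode factors into a (lead consonant, vowel, optional tail consonant) triple by simple arithmetic. The sketch below shows only that standard Unicode decomposition, not the method from the poster; the function and constant names are my own for illustration.

```python
# Standard Unicode Hangul syllable arithmetic (illustrative names, not from the poster).
HANGUL_BASE = 0xAC00                      # code point of the first syllable, "가"
NUM_LEADS, NUM_VOWELS, NUM_TAILS = 19, 21, 28   # tail index 0 means "no final consonant"
NUM_SYLLABLES = NUM_LEADS * NUM_VOWELS * NUM_TAILS  # 11,172 precomposed syllables


def decompose(syllable: str) -> tuple[int, int, int]:
    """Map one precomposed Hangul syllable to (lead, vowel, tail) indices."""
    index = ord(syllable) - HANGUL_BASE
    if not 0 <= index < NUM_SYLLABLES:
        raise ValueError(f"{syllable!r} is not a precomposed Hangul syllable")
    lead, rest = divmod(index, NUM_VOWELS * NUM_TAILS)
    vowel, tail = divmod(rest, NUM_TAILS)
    return lead, vowel, tail


def compose(lead: int, vowel: int, tail: int) -> str:
    """Inverse of decompose: rebuild the syllable from its three indices."""
    return chr(HANGUL_BASE + (lead * NUM_VOWELS + vowel) * NUM_TAILS + tail)
```

With this factoring, a model could in principle work with 19 + 21 + 28 = 68 component entries instead of 11,172 syllable entries; how the components are actually combined is a modeling choice this sketch does not cover.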
#eacl #nlp #nlproc #eacl2023 #CJK
The GB18030 standard was updated this year (GB18030-2022), and several notable changes to its conformance requirements may affect teams developing CJK typeface families.
Ken Lunde’s take: https://ken-lunde.medium.com/the-gb-18030-2022-standard-3d0ebaeb4132
Peter Constable’s take: https://www.unicode.org/L2/L2022/22274-disruptive-changes.pdf
Time for an #introduction
I'm a PhD student focusing on #tokenization in #CJK translation and do language stuff at Google Tokyo. I also help as an editor for @thegradient.
I'm interested in #federatedlearning, #Julia, #Korean, and high school CS #education. Definitely reach out!
www.theoreticallygoodwithcomputers.com
#Introduction #tokenization #CJK #federatedlearning #Julia #Korean #education