Baseline system + leaderboards are up for the #MerlionChallenge on untangling complex code-mixed speech. Which #ML #DeepLearning #SpeechProc system will do the best job on complex language use in the wild? 👀
TWO TEAMS have already beaten the baseline for Language ID:
🎉Lingua_Lumos (Closed)
🎉UNSW_Signal_Processing (Open)
There’s still time to join the challenge and prep your paper for our special session at #Interspeech2023
Have you ever seen auto-generated subtitles turn to mush because they couldn’t handle a speaker’s accent or figure out what language they’re speaking after a switch?
The #MerlionChallenge for #Interspeech23 tests how well teams can build a language detection system for Code-Switching in >300 Zoom recordings.
Help build robust systems for multilingualism by joining the challenge or sharing with #ML #DeepLearning #SpeechProc friends 💪🏼💪🏽💪🏿
Ever seen auto-generated subtitles turn to mush because they couldn’t handle a speaker’s accent or figure out what language they’re speaking after a switch?
The #MerlionChallenge at #Interspeech23 tests how well teams can build a language detection system for real-world Code-Switching between English and Mandarin Chinese in >300 Zoom recordings.
Join the challenge or boost to help build more robust speech systems for multilingualism 💪🏼💪🏽💪🏿
Bonus
✨✨DID YOU KNOW?✨✨With the body of a fish and the head of a lion, the Merlion is a national icon of Singapore.
✨Just as the Merlion is a mix of different creatures, the code-switched child-directed speech in the #MerlionChallenge is a mix of different languages✨
(apologies for cross posting)
Most AI speech processing systems are developed using samples of monolingual speech between adults. #WEIRDbias
We hope the #MerlionChallenge at @Interspeech 2023 pushes the frontiers of how automated systems handle the diverse kinds of #translanguaging we see in the world 🌍
If you want to see better tools for #LangDev #Multilingualism #DiverseVoices and #GlobalLanguages, then help us ✨boost✨ these posts so they can reach all the lovely #SpeechProc and #CognitiveScience folks!
What makes this a good #AI challenge?
👉Natural code switching (no shuffled segments)
👉Accented English & Mandarin
👉Precision human annotation
👉Various far-field mics (laptops/tablets)
👉Internet audio (Zoom)
👉Adults speaking to kids
Participating teams have a chance to submit their papers at our special session at #Interspeech 2023 in Dublin (yes, Ireland) ☘️
You can find out more about the #MerlionChallenge or sign up to take part by taking a look at our shiny new website!
For the #MerlionChallenge at #Interspeech we’ll be asking teams to train a #SpeechProc / #AI system that can guess which language is spoken (Task 1: Language ID) and when (Task 2: Language Diarization)!
👉Challenge audio is Zoom recordings with English and/or Mandarin Chinese
👉Audio for development matches audio for evaluation 😗👌
Our annotation protocol is documented in the BELA transcription conventions. The Wiki includes instructions for how to do multi-tier multilingual transcriptions using Elan (free!)
BELA Con:
blipntu.github.io/belacon/
For the #MerlionChallenge we hold some info back
All of the #MerlionChallenge audio recordings were collected via Zoom calls, where parents narrated a wordless picturebook to their children (link to an old thread on the bird site)
The book is free to download, and can be used for any language or combo!
https://twitter.com/suzyjstyles/status/1453324258654883844?s=21&t=UaXkchQhLvUn0AzL7Hu90g
The #MerlionChallenge is a 🌏 collab between psycholinguists at NTU in Singapore (me, Victoria Chua, Fei Ting Woon) and Engineers at JHU, TUD and NTU (Leibny Paola GarciaPerera, Sanjeev Khudanpur, Justin Dauwels, Hexin Liu, Andy Khong) for #Interspeech 2023
I’m sure I have a bunch of #Multilingual #LangDev, #SpeechProc #NLP and #CogSci friends over here 🦣
We’ve prepped >30hrs of our English/Mandarin code-switched child-directed speech for the #MerlionChallenge at this year’s INTERSPEECH
>300 files, >100 voices 🙀 (+ training data)
We’re looking for speech systems that can figure out which language is spoken when!
The #MerlionChallenge will see whose system does the best job 💪🏼
Join or help us boost the message: https://sites.google.com/view/merlion-ccs-challenge/