Dr Suzy J Styles · @suzyjstyles
700 followers · 259 posts · Server toot.community

Baseline system + leaderboards are up for untangling complex code-mixed speech. Which system will do the best job on complex language use in the wild? 👀

TWO TEAMS have already beaten the baseline for Language ID:
🎉Lingua_Lumos (Closed)
🎉UNSW_Signal_Processing (Open)

There’s still time to join the challenge and prep your paper for our special session at #interspeech2023

toot.community/@suzyjstyles/10

#merlionchallenge #ml #deeplearning #speechproc #interspeech2023

Last updated 3 years ago

Dr Suzy J Styles · @suzyjstyles

Have you ever seen auto-generated subtitles turn to mush because they couldn’t handle a speaker’s accent or figure out what language they’re speaking after a switch?

The #merlionchallenge for #interspeech23 tests how well teams can build a language detection system for code-switching in >300 Zoom recordings.

Help build robust systems for multilingualism by joining the challenge or sharing with friends 💪🏼💪🏽💪🏿

toot.community/@suzyjstyles/10

#merlionchallenge #interspeech23 #ml #deeplearning #speechproc

Dr Suzy J Styles · @suzyjstyles

Ever seen auto-generated subtitles turn to mush because they couldn’t handle a speaker’s accent or figure out what language they’re speaking after a switch?

The #merlionchallenge at #interspeech23 tests how well teams can build a language detection system for real-world code-switching between English and Mandarin Chinese in >300 Zoom recordings.

Join the challenge or boost to help build more robust speech systems for multilingualism 💪🏼💪🏽💪🏿

toot.community/@suzyjstyles/10

#merlionchallenge #interspeech23

Dr Suzy J Styles · @suzyjstyles

Bonus

✨✨DID YOU KNOW?✨✨With the body of a mermaid and the head of a lion, the Merlion is a national icon of Singapore.

✨Just as the Merlion is a mix of different creatures, the code-switched child-directed speech in the #merlionchallenge is a mix of different languages✨

(apologies for cross posting)

#merlionchallenge

Dr Suzy J Styles · @suzyjstyles

Most AI speech processing systems are developed using samples of monolingual speech between adults.

We hope the #merlionchallenge at @Interspeech 2023 pushes the frontiers of how automated systems handle the diverse kinds of #translanguaging we see in the world 🌍

If you want to see better tools for #LangDev and #multilingualism, then help us ✨boost✨ so these posts can reach all the lovely #speechproc and #cognitivescience folks!

#WEIRDbias #merlionchallenge #translanguaging #LangDev #multilingualism #diversevoices #globallanguages #speechproc #cognitivescience

Dr Suzy J Styles · @suzyjstyles

What makes this a good challenge?
👉Natural code switching (no shuffled segments)
👉Accented English & Mandarin
👉Precision human annotation
👉Various far field mics (laptops/tablets)
👉Internet audio (Zoom)
👉Adults speaking to kids

#ai #merlionchallenge #interspeech

Dr Suzy J Styles · @suzyjstyles

Participating teams have a chance to submit their papers to our special session at #interspeech 2023 in Dublin (yes, Ireland) ☘️

You can find out more about the #merlionchallenge or sign up to take part by taking a look at our shiny new website!

sites.google.com/view/merlion-

#interspeech #merlionchallenge

Dr Suzy J Styles · @suzyjstyles

For the #merlionchallenge at #interspeech we’ll be asking teams to train a #speechproc / #ai system that can guess which language is which (Task 1: Language ID) and when (Task 2: Language Diarization)!

👉Challenge audio is Zoom recordings with English and/or Mandarin Chinese
👉Audio for development matches audio for evaluation 😗👌
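
To make the difference between the two tasks concrete, here is a minimal sketch, not the challenge's actual pipeline or scoring. It assumes a hypothetical upstream model has already produced one language label per fixed-length audio frame; the labels `"en"` and `"zh"`, the 0.5 s frame length, and both function names are illustrative assumptions. Task 1 style output is a single label for a pre-segmented utterance, while Task 2 style output says which language is spoken when.

```python
from collections import Counter
from itertools import groupby


def identify_language(frame_labels):
    """Task 1 style: one label for a whole utterance, by majority vote over frames."""
    return Counter(frame_labels).most_common(1)[0][0]


def diarize_languages(frame_labels, frame_sec=0.5):
    """Task 2 style: merge runs of identical frame labels into (start, end, language)."""
    segments = []
    start = 0  # index of the first frame in the current run
    for label, run in groupby(frame_labels):
        n = len(list(run))
        segments.append((start * frame_sec, (start + n) * frame_sec, label))
        start += n
    return segments


# Hypothetical frame-level predictions for one short recording.
frames = ["en", "en", "en", "zh", "zh", "en"]
print(identify_language(frames))   # single label for the utterance
print(diarize_languages(frames))   # time-stamped language segments
```

Majority voting and run-merging are just the simplest possible post-processing choices here; a real system would likely smooth the frame labels before merging.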

#merlionchallenge #interspeech #speechproc #ai

Dr Suzy J Styles · @suzyjstyles

Our annotation protocol is documented in the BELA transcription conventions. The wiki includes instructions for how to do multi-tier multilingual transcriptions using ELAN (free!)

BELA Con:
blipntu.github.io/belacon/

For the #merlionchallenge we hold some info back

#merlionchallenge

Dr Suzy J Styles · @suzyjstyles

All of the audio recordings were collected via Zoom calls, where parents narrated a wordless picturebook to their children (link to an old thread on the bird site)

The book is free to download, and can be used for any language or combo!

twitter.com/suzyjstyles/status

#merlionchallenge

Dr Suzy J Styles · @suzyjstyles

The #merlionchallenge is a 🌏 collab between psycholinguists at NTU in Singapore (me, Victoria Chua, Fei Ting Woon) and engineers at JHU, TUD and NTU (Leibny Paola GarciaPerera, Sanjeev Khudanpur, Justin Dauwels, Hexin Liu, Andy Khong) for #interspeech 2023

interspeech2023.org

#merlionchallenge #interspeech

Dr Suzy J Styles · @suzyjstyles

I’m sure I have a bunch of , and friends over here 🦣

We’ve prepped >30hrs of our English/Mandarin code-switched child-directed speech for the #merlionchallenge at this year’s INTERSPEECH
>300 files, >100 voices 🙀 (+ training data)

We’re looking for speech systems that can figure out which language is spoken when!

The #merlionchallenge will see whose system does the best job 💪🏼

Join or help us boost the message: sites.google.com/view/merlion-

#multilingual #LangDev #speechproc #nlp #cogsci #merlionchallenge
