2013 spring, tue1, staff and student office building, room s316
who should take this course
Consider this course if you wish to quantitatively analyze spoken language by:
choosing the phenomenon of interest
deciding the quality and quantity of data needed
collecting and labeling speech
analyzing and reporting results
before taking this course
To succeed in this course:
you must have completed undergraduate or graduate courses in:
linguistics (especially phonetics and phonology)
statistics (descriptive statistics at the minimum; predictive statistics preferred)
you must have English language skills sufficient to:
read course material
after taking this course
You will become able to:
explain core concepts of spoken language corpora (e.g., differences between read speech and spontaneous speech; characteristics and usage of close-talking microphones; characteristics and limitations of telephone speech)
explain core concepts of computational phonology (e.g., characteristics of biphones, triphones, and clustering)
explain the purpose of spoken language corpora (e.g., speech analysis, speech synthesis, speech recognition, speech interactive systems, automated spoken language learning)
explain required specifications and development strategies of spoken language corpora (e.g., when labeling at the phone or semantic level is needed; how to collect speech from native or non-native speakers)
label speech at the phone and word level (note: this course does not address ToBI)
display and interpret narrow-band and wide-band spectrograms, spectra, formants, and F0 pitch traces
measure speech rate in various ways (e.g., the number of phones, syllables, and words per unit time with or without considering filled or unfilled pauses)
design and develop a small spoken language corpus
pronounce phones used in the world's languages
During this course:
you may collect and analyze speech in any language you choose
you may participate in class in either English or Japanese
when, where, what
in 2013 spring semester
during tuesday 1st period
staff and student office building, 3rd floor, room s316 (CALL staff room B)
You need a laptop that:
runs Windows, MacOS, or Linux
you can bring to class
connects to a projector using a DVI, VGA, Mini DisplayPort, or HDMI connector
connects to an external loudspeaker using a 3.5 mm plug or via Bluetooth
You need a headset (or microphone and headphones) for your laptop. I will explain during the 1st class session.
Course Title
[Multimedia Language Processing]
Key Words
digital signal processing of speech, acoustical analysis, phonetics, computational phonology, spoken language corpora
Course objective: This is a hands-on course where students acquire the technical skills for using computers to analyze spoken language. Students must be familiar with linguistics and statistics.
Required general skills: (a) Strong English language reading skills are essential. Most reading assignments and all software manuals will be in English. If students desire, lectures and student presentations can be given in English. Regardless of the emphasis on English language, students are expected to become bilingual (English and Japanese) in the technical terminology. (b) Students must bring their own computer to class, and present their assignments. MacOS, Linux, and Windows are acceptable.
Prerequisites: Students must have taken at least undergraduate courses in linguistics (especially phonetics and phonology), statistics (descriptive statistics at the minimum; predictive statistics preferred), and experiment design.
After completing this course, students will be able to do the following: (1) Explain basic characteristics of spoken language corpora (e.g., the differences between read speech and spontaneous speech; features of close-talking microphones; features of telephone-bandwidth speech). (2) Explain basic concepts of computational phonology (e.g., biphones, triphones, clustering). (3) Explain the uses of spoken language corpora (e.g., speech analysis, speech synthesis, speech recognition, spoken language interactive systems, automated pronunciation learning). (4) Explain the design criteria and development strategies of spoken language corpora (e.g., when labeling at the phone level or annotating semantic information are necessary; how to collect speech and develop corpora from native and non-native speakers). (5) Label speech at the phone and word levels. (Note that this course does not teach ToBI labeling.) (6) Display and interpret narrow-band and wide-band spectrograms, spectra, formants, and F0 tracks. (7) Measure speech rate using various methods (e.g., the number of phones, syllables, or words per unit time considering filled and/or unfilled pauses). (8) Design and develop a small spoken language corpus.
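As an illustration of objective (7), here is a minimal Python sketch of how speech rate might be computed from labeled intervals. It is not part of the course materials. The interval format `(start, end, label)` and the pause labels `<sil>`, `uh`, and `um` are assumptions made for this example only; your own labeling conventions may differ. The intervals can hold phones, syllables, or words, so the same function gives units per second at any of those levels.

```python
from typing import List, Tuple

# One labeled interval: (start time in seconds, end time in seconds, label).
Interval = Tuple[float, float, str]

# Hypothetical label conventions, chosen only for this sketch.
UNFILLED_PAUSE = "<sil>"
FILLED_PAUSES = {"uh", "um"}


def speech_rate(
    intervals: List[Interval],
    include_unfilled: bool = True,
    include_filled: bool = True,
) -> float:
    """Return units per second.

    Pauses are never counted as units. The two flags decide whether
    the time spent in unfilled or filled pauses is counted in the
    total duration (the denominator).
    """
    units = 0
    duration = 0.0
    for start, end, label in intervals:
        length = end - start
        if label == UNFILLED_PAUSE:
            if include_unfilled:
                duration += length
        elif label in FILLED_PAUSES:
            if include_filled:
                duration += length
        else:
            units += 1
            duration += length
    if duration <= 0.0:
        raise ValueError("total duration must be positive")
    return units / duration


# A made-up word-level labeling of a 2.5-second utterance.
words = [
    (0.0, 0.5, "so"),
    (0.5, 1.0, "<sil>"),
    (1.0, 1.5, "um"),
    (1.5, 2.0, "we"),
    (2.0, 2.5, "start"),
]

rate_with_pauses = speech_rate(words)
rate_without_pauses = speech_rate(
    words, include_unfilled=False, include_filled=False
)
```

With pause time included, the result is often called speaking rate; with pause time excluded, articulation rate. Which one suits a project depends on the phenomenon being measured.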
Course Schedule
Concepts and techniques will be explained based on the needs of students.
Phase 1 (Week 1) Understand the course's objectives, format, requirements, and outcomes. Learn the instructor's background and interests. Review basic knowledge required for the course. Learn requirements for audio computer hardware and speech analysis software.
Phase 2 (Weeks 2 and 3) Read and understand some reference material on articulatory phonetics and speech analysis. Analyze short speech samples at the syllable, word, and utterance levels. Display and interpret narrow-band and wide-band spectrograms, spectra, formants, and F0 tracks.
Phase 3 (Weeks 4 and 5) Read and understand some reference material on computational phonology. Analyze longer speech samples at the utterance level. Label speech at the phone and word levels.
Phase 4 (Weeks 6 and 7) Understand the design criteria, development strategies, and computational tools of spoken language corpora. Understand procedures for collecting speech from native and non-native speakers. Collect and analyze a small spoken language corpus.
Phase 5 (Weeks 8 and 9) Design, collect, and analyze a small spoken language corpus.
Phase 6 (Weeks 10 and 11) Design and explain term projects, focusing on the phenomena of interest, why understanding those phenomena is important, how to measure them, and how to interpret the measurements. (Each student chooses their own term project.)
Phase 7 (Weeks 12 and 13) Report and discuss term projects. Improve collection or interpretation of data.
Phase 8 (Week 15) No class unless earlier classes were canceled or delayed.
Much of the work for this course is done individually outside of class (e.g., installing software, interviewing people, collecting speech, analyzing waveforms). Classroom time is for presenting students' assignments. Assignments are structured incrementally and require substantial hands-on effort. We will use Praat, an excellent and freely available software package (http://www.praat.org/).
Information about me (including my educational background, vocational background, list of research publications, courses offered, student comments, and contact information) is online at http://goh.kawai.com/. If you are considering taking my course(s), I urge you to talk to my past students. View my English language class "English Seminar at the introductory level" on Hokudai's Open Courseware at http://ocw.hokudai.ac.jp/.