Spoken language corpora


2013 spring, tue1, staff and student office building, room s316

who should take this course

Consider this course if you wish to quantitatively analyze spoken language by:
  • choosing the phenomenon of interest
  • deciding the quality and quantity of data needed
  • collecting and labeling speech
  • analyzing and reporting results

before taking this course

To succeed in this course:
  • you must have completed undergraduate or graduate courses in:
    • linguistics (especially phonetics and phonology)
    • statistics (descriptive statistics at the minimum; predictive statistics preferred)
    • experiment design
  • you must have English language skills sufficient to:
    • read course material
    • use software

after taking this course

You will become able to:
  • explain core concepts of spoken language corpora (e.g., differences between read speech and spontaneous speech; characteristics and usage of close-talking microphones; characteristics and limitations of telephone speech)
  • explain core concepts of computational phonology (e.g., characteristics of biphones, triphones, and clustering)
  • explain the purpose of spoken language corpora (e.g., speech analysis, speech synthesis, speech recognition, speech interactive systems, automated spoken language learning)
  • explain required specifications and development strategies of spoken language corpora (e.g., when labeling at the phone or semantic level is needed; how to collect speech from native or non-native speakers)
  • label speech at the phone and word level (note: this course does not address ToBI)
  • display and interpret narrow-band and wide-band spectrograms, spectra, formants, and F0 (pitch) traces
  • measure speech rate in various ways (e.g., the number of phones, syllables, and words per unit time with or without considering filled or unfilled pauses)
  • design and develop a small spoken language corpus
  • pronounce phones used in the world's languages
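The speech-rate item in the list above can be made concrete with a small sketch. The Python example below is only an illustration, not course material: the `Segment` type, the `kind` tags, and the toy timings are assumptions invented here. It counts words per second twice, once over the total duration (pauses included) and once over articulation time only (filled and unfilled pauses excluded).

```python
from dataclasses import dataclass


@dataclass
class Segment:
    label: str    # a word, a filler such as "uh", or "" for silence
    start: float  # start time in seconds
    end: float    # end time in seconds
    kind: str     # "word", "filled_pause", or "unfilled_pause" (hypothetical tags)


def speech_rate(segments, exclude_pauses):
    """Return words per second, with or without pause time in the denominator."""
    n_words = sum(1 for s in segments if s.kind == "word")
    if exclude_pauses:
        # Articulation time only: sum the durations of the word segments.
        duration = sum(s.end - s.start for s in segments if s.kind == "word")
    else:
        # Total time from the start of the first segment to the end of the last.
        duration = segments[-1].end - segments[0].start
    return n_words / duration


# Toy utterance with made-up timings.
utterance = [
    Segment("I", 0.0, 0.2, "word"),
    Segment("uh", 0.2, 0.6, "filled_pause"),
    Segment("", 0.6, 1.0, "unfilled_pause"),
    Segment("like", 1.0, 1.4, "word"),
    Segment("speech", 1.4, 2.0, "word"),
]

rate_with_pauses = speech_rate(utterance, exclude_pauses=False)
rate_without_pauses = speech_rate(utterance, exclude_pauses=True)
print(f"{rate_with_pauses:.2f} words/s including pauses")
print(f"{rate_without_pauses:.2f} words/s excluding pauses")
```

The same function works for phones or syllables if the segments are labeled at that level; which unit and which denominator to use depends on the phenomenon being studied.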

language

During this course:
  • you may collect and analyze speech in any language you choose
  • you may participate in class in either English or Japanese

when, where, what

We meet:
  • in 2013 spring semester
  • during Tuesday 1st period
  • in the staff and student office building, 3rd floor, room s316 (CALL staff room B)
You need a laptop that:
  • runs Windows, MacOS, or Linux
  • you can bring to class
  • connects to a projector using a DVI, VGA, mini-display-port, or HDMI connector
  • connects to an external loudspeaker using a 3.5 mm plug or via Bluetooth
You need a headset (or microphone and headphones) for your laptop. I will explain during the 1st class session.

course offering

Course Title: Multimedia Language Processing (マルチメディア言語情報処理論演習)
Subtitle:
Instructor (Institution): Goh KAWAI (河合 剛), Research Faculty of Media and Communication (大学院メディア・コミュニケーション研究院)
Other Instructors (Institution): Goh KAWAI (河合 剛), Research Faculty of Media and Communication (大学院メディア・コミュニケーション研究院)
Course Type:
Open To Other Faculties / Schools: ----
Year: 2013
Semester: 1st semester
Course Number: 083156
Type of Class:
Number of Credits: 2
Year of Eligible Students: 1
Eligible Department/Class: 国際広報メディア専攻
Other Information:
Numbering Code:
Major Category Code:
Major Category Title:
Offering Department:
Level Code:
Level:
Middle Category Code:
Middle Category Title:
Small Category Code:
Small Category Title:
Language Code: 2
Language Type: Bilingual class in Japanese and English; the language of instruction (Japanese or English) is decided after enrollment is finalized.

Key Words
digital signal processing of speech, acoustical analysis, phonetics, computational phonology, spoken language corpora
Course Objectives
Course objective: This is a hands-on course in which students acquire the technical skills for using computers to record and analyze spoken language. Students must already be familiar with the underlying concepts of linguistics and statistics.

Required general skills: (a) Strong English reading skills are essential. Most reading assignments and all software manuals will be in English. If students wish, lectures and student presentations can be given in English. Despite the emphasis on English, students are expected to handle the technical terminology in both English and Japanese. (b) Students must bring their own computer to class, be able to operate it, and use it to present their assignments. MacOS, Linux, and Windows are all acceptable.

Prerequisites: Students must have taken at least undergraduate courses in linguistics (especially phonetics and phonology), statistics (descriptive statistics at the minimum; predictive statistics preferred), and experiment design.
 
Course Goals
After completing this course, students will be able to do the following:
(1) Explain basic characteristics of spoken language corpora (e.g., the differences between read speech and spontaneous speech; features of close-talking microphones; features of telephone-bandwidth speech).
(2) Explain basic concepts of computational phonology (e.g., biphones, triphones, clustering).
(3) Explain the uses of spoken language corpora (e.g., speech analysis, speech synthesis, speech recognition, spoken language interactive systems, automated pronunciation learning).
(4) Explain the design criteria and development strategies of spoken language corpora (e.g., when labeling at the phone level or annotating semantic information is necessary; how to collect speech and develop corpora from native and non-native speakers).
(5) Label speech at the phone and word levels. (Note that this course does not teach ToBI labeling.)
(6) Display and interpret narrow-band and wide-band spectrograms, spectra, formants, and F0 tracks.
(7) Measure speech rate using various methods (e.g., the number of phones, syllables, or words per unit time considering filled and/or unfilled pauses).
(8) Design and develop a small spoken language corpus.
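
Goal (2) mentions biphones and triphones. As a rough illustration, the sketch below expands a phone sequence into context-dependent units. The `left-phone+right` notation follows a convention common in speech recognition toolkits; the function name and the example phone sequence are assumptions made here, not part of the course.

```python
def triphones(phones):
    """Expand a phone sequence into context-dependent units.

    Each phone is written as left-phone+right. Phones at the edges of the
    sequence lack one neighbour, so they come out as biphones.
    """
    units = []
    for i, phone in enumerate(phones):
        left = f"{phones[i - 1]}-" if i > 0 else ""
        right = f"+{phones[i + 1]}" if i < len(phones) - 1 else ""
        units.append(f"{left}{phone}{right}")
    return units


# Hypothetical phone sequence for the English word "speech".
units = triphones(["s", "p", "iy", "ch"])
print(units)
```

The number of possible triphones grows with the cube of the phone inventory, so many of them occur rarely or never in a small corpus. Clustering similar contexts together is the usual response to that sparsity.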
 
Course Schedule

Concepts and techniques will be explained based on the needs of students.

Phase 1 (Week 1)
Understand the course's objectives, format, requirements, and outcomes.
Learn the instructor's background and interests.
Review basic knowledge required for the course.
Learn requirements for audio computer hardware and speech analysis software.

Phase 2 (Weeks 2 and 3)
Read and understand some reference material on articulatory phonetics and speech analysis.
Analyze short speech samples at the syllable, word, and utterance levels.
Display and interpret narrow-band and wide-band spectrograms, spectra, formants, and F0 tracks.
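
As a rough guide to the narrow-band versus wide-band distinction, the sketch below relates analysis window length to spectral bandwidth. It assumes a Gaussian analysis window and uses the relation stated in the Praat manual (-3 dB bandwidth of about 1.2982804 divided by the window length); the function name is made up for this illustration.

```python
import math


def gaussian_bandwidth_hz(window_length_s):
    """-3 dB bandwidth in Hz of a Gaussian analysis window of the given length.

    Relation taken from the Praat manual: 2 * sqrt(6 * ln 2) / (pi * length).
    """
    return 2 * math.sqrt(6 * math.log(2)) / (math.pi * window_length_s)


wide_band = gaussian_bandwidth_hz(0.005)    # short window: good time resolution
narrow_band = gaussian_bandwidth_hz(0.030)  # long window: good frequency resolution
print(f"5 ms window:  about {wide_band:.0f} Hz")
print(f"30 ms window: about {narrow_band:.0f} Hz")
```

A short window smears individual harmonics together but shows glottal pulses and formant movement clearly; a long window resolves the harmonics at the cost of time detail.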

Phase 3 (Weeks 4 and 5)
Read and understand some reference material on computational phonology.
Analyze longer speech samples at the utterance level.
Label speech at the phone and word levels.

Phase 4 (Weeks 6 and 7)
Understand the design criteria, development strategies, and computational tools of spoken language corpora.
Understand procedures for collecting speech from native and non-native speakers.
Collect and analyze a small spoken language corpus.

Phase 5 (Weeks 8 and 9)
Design, collect, and analyze a small spoken language corpus.

Phase 6 (Weeks 10 and 11)
Design and explain term projects, focusing on the phenomenon of interest, why understanding that phenomenon is important, how to measure it, and how to interpret the measurements. (Each student chooses their own term project.)

Phase 7 (Weeks 12 and 13)
Report and discuss term projects.
Improve collection or interpretation of data.

Phase 8 (Week 15)
Reserve day: no class unless an earlier class was canceled or delayed.
 
Homework
Class sessions consist of lectures, hands-on practice, explanations of assignments, and presentations of students' assignments. Much of the work for this course is done individually outside of class (e.g., installing software, interviewing people, collecting speech, analyzing waveforms), so students must be able to set aside time outside class and to use a computer. Assignments are structured incrementally and require substantial hands-on effort. We will use Praat (http://www.praat.org/), an excellent and freely available software package.
 
Grading System
Grades are based on presentations and discussion. Students must present their work and critique that of their classmates. Neither is possible without being in class, so attendance and active participation are mandatory. There will be no written exam or term paper.
 
Textbooks
No textbooks need to be purchased. Excerpts will be given in class as a readings package.
 
Reading List
Speech and Language Processing (2nd Edition) / Daniel Jurafsky, James H. Martin : Prentice Hall, 2008, ISBN:978-0131873216
Foundations of Statistical Natural Language Processing / Christopher D. Manning, Hinrich Schütze : MIT Press, 1999, ISBN:978-0262133609
Introduction to Information Retrieval / Christopher D. Manning, Prabhakar Raghavan, Hinrich Schütze : Cambridge University Press, 2008, ISBN:978-0521865715
The Oxford Handbook of Computational Linguistics / Ruslan Mitkov : Oxford University Press, 2005, ISBN:978-0199276349
Digital Processing of Speech Signals / Lawrence R. Rabiner, Ronald W. Schafer : Prentice Hall, 1978, ISBN:978-0132136037
Discrete-Time Speech Signal Processing: Principles and Practice / Thomas F. Quatieri : Prentice Hall, 2001, ISBN:978-0132429429
 
Websites
http://goh.kawai.com/
http://www.praat.org/
http://ocw.hokudai.ac.jp/
 
Website of Laboratory
http://goh.kawai.com/
 
Additional Information
Information about me (including my educational background, work history, research publications, courses taught, student comments, and contact information) is online at http://goh.kawai.com/. If you are considering taking my course(s), I urge you to ask my past students about the content and teaching style. To see what my classes look like, view the videos of my fall 2012 course for first-year undergraduates, "English Seminar at the introductory level" (a different subject from this course), on Hokudai's Open Courseware at http://ocw.hokudai.ac.jp/.
 
Update
2013/01/17 14:52:41

course material