end2end-asr-pytorch - PAD_TOKEN - SOS_TOKEN - EOS_TOKEN
- 1. End-to-End Speech Recognition on Pytorch
- 2. end2end-asr-pytorch/utils/constant.py
- References
1. End-to-End Speech Recognition on Pytorch
https://github.com/gentaiscool/end2end-asr-pytorch
Transformer-based Speech Recognition Model
2. end2end-asr-pytorch/utils/constant.py
https://github.com/gentaiscool/end2end-asr-pytorch/blob/master/utils/constant.py
PAD_TOKEN = 0
SOS_TOKEN = 1
EOS_TOKEN = 2
PAD_CHAR = "¶"
SOS_CHAR = "§"
EOS_CHAR = "¤"
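A minimal sketch of how such constants are typically used: each label sequence is wrapped with the start and end tokens, and shorter sequences in a batch are right-padded with the padding token. The helper names `wrap_with_sos_eos` and `pad_batch` are hypothetical and are not taken from the repository; only the three token values come from `utils/constant.py`.

```python
from typing import List

# Values copied from utils/constant.py as quoted above.
PAD_TOKEN = 0
SOS_TOKEN = 1
EOS_TOKEN = 2


def wrap_with_sos_eos(label_ids: List[int]) -> List[int]:
    """Prepend SOS_TOKEN and append EOS_TOKEN to one label sequence (hypothetical helper)."""
    return [SOS_TOKEN] + list(label_ids) + [EOS_TOKEN]


def pad_batch(sequences: List[List[int]]) -> List[List[int]]:
    """Right-pad every sequence with PAD_TOKEN to the longest length in the batch (hypothetical helper)."""
    max_len = max((len(seq) for seq in sequences), default=0)
    return [seq + [PAD_TOKEN] * (max_len - len(seq)) for seq in sequences]


# Two label sequences of different lengths; the IDs 5, 6, 7, 8 are arbitrary.
batch = pad_batch([wrap_with_sos_eos([5, 6, 7]), wrap_with_sos_eos([8])])
print(batch)  # [[1, 5, 6, 7, 2], [1, 8, 2, 0, 0]]
```

Reserving 0 for padding is a common convention because padded positions can then be masked out of the loss by their ID.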
- Sequence to Sequence models: Attention Models
https://deeplearning.cs.cmu.edu/F21/document/slides/lec18.attention.noanim.pdf
To make it explicit, we will add two additional symbols (in addition to the words) to the base vocabulary
<sos>: Indicates start of a sentence
<eos>: Indicates end of a sentence
- Neural Abstractive Text Summarization with Sequence-to-Sequence Models
SOS and EOS represent the start and end of a sequence, respectively.
- Listen, Attend and Spell
Here <sos> and <eos> are the special start-of-sentence token, and end-of-sentence tokens, respectively.
- END-TO-END SPEECH RECOGNITION WITH ADAPTIVE COMPUTATION STEPS
The decoding targets include 7065 Chinese characters and four special tokens as <UNK> (unknown), <PAD> (padding), <SOS> (start of speech) and <EOS> (end of speech).
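As an illustration of a target vocabulary that combines characters with the four special tokens, the sketch below reserves the first IDs for the special tokens and maps any unseen character to <UNK>. The ID order and the helper names `build_vocab` and `encode` are assumptions made for this example; the paper quoted above does not specify them.

```python
from typing import Dict, List

# Assumed ordering: the quoted paper names these tokens but not their IDs.
SPECIAL_TOKENS = ["<PAD>", "<SOS>", "<EOS>", "<UNK>"]


def build_vocab(chars: List[str]) -> Dict[str, int]:
    """Assign IDs 0..3 to the special tokens, then one ID per distinct character (hypothetical helper)."""
    vocab = {tok: i for i, tok in enumerate(SPECIAL_TOKENS)}
    for ch in chars:
        if ch not in vocab:
            vocab[ch] = len(vocab)
    return vocab


def encode(text: str, vocab: Dict[str, int]) -> List[int]:
    """Map characters to IDs, use <UNK> for unseen ones, and wrap with <SOS>/<EOS> (hypothetical helper)."""
    unk = vocab["<UNK>"]
    return [vocab["<SOS>"]] + [vocab.get(ch, unk) for ch in text] + [vocab["<EOS>"]]


vocab = build_vocab(list("你好"))
print(encode("你好吗", vocab))  # [1, 4, 5, 3, 2]
```

Here "吗" is not in the vocabulary, so it is encoded as the <UNK> ID 3.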
References
[1] Yongqiang Cheng, https://yongqiang.blog.csdn.net/