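[Audio] How to evaluate ASR results with the PyTorch-Kaldi scoring script

If you want to add a new evaluation metric or handle a different input format, you can modify the scoring script as needed. For example, to compute the character error rate (CER), you can add another function and adjust the argument-parsing logic accordingly.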

奔跑草- · Published 2024-12-12 17:27:24

1. Environment setup

Install dependencies

Make sure Python and the required libraries (NumPy, SciPy, PyTorch, and torchaudio) are installed. You can install them with:

pip install numpy scipy torch torchaudio

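You also need the Kaldi toolkit, a speech recognition toolkit that provides many useful tools and scripts. Clone the PyTorch-Kaldi repository, then clone and build Kaldi, which lives in its own repository (see Kaldi's INSTALL files for the required dependencies):

git clone https://github.com/mravanelli/pytorch-kaldi.git
git clone https://github.com/kaldi-asr/kaldi.git
cd kaldi/tools && make
cd ../src && ./configure --shared && make depend && make
cd ..
export KALDI_ROOT=`pwd`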

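Set environment variables

Set the KALDI_ROOT environment variable so that the scripts can find the Kaldi binaries at run time. To make the setting persist across shell sessions, append it to ~/.bashrc. Run this from the Kaldi root directory, because $PWD is expanded at the moment the line is written: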

echo "export KALDI_ROOT=$PWD" >> ~/.bashrc
source ~/.bashrc
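
A script that depends on Kaldi can fail with confusing errors when the variable is missing, so it may help to check it up front. The following is a minimal sketch; the function name `find_kaldi_root` is an assumption made for illustration and does not come from PyTorch-Kaldi or Kaldi.

```python
import os


def find_kaldi_root(env=None):
    """Return the KALDI_ROOT directory, or raise if it is unset or not a directory."""
    # Allow a custom mapping to be passed in so the check is easy to test.
    env = os.environ if env is None else env
    root = env.get("KALDI_ROOT")
    if not root:
        raise RuntimeError("KALDI_ROOT is not set; export it before running the scripts")
    if not os.path.isdir(root):
        raise RuntimeError(f"KALDI_ROOT points to a missing directory: {root}")
    return root
```

Calling `find_kaldi_root()` at the top of a scoring script turns a missing variable into an immediate, readable error.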

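2. Prepare the data

Create two text files, hyp.txt and ref.txt, in which each line holds the transcript of one audio segment: hyp.txt contains the hypothesis transcripts (the recognizer output) and ref.txt contains the reference transcripts. Their contents might look like this:

hyp.txt (hypothesis transcripts):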

this is an example of a hypothesis transcription
the quick brown fox jumps over the lazy dog
...

ref.txt (reference transcripts):

this is an example of a reference transcription
the quick brown fox jumped over the lazy dog
...

Make sure the sentences appear in the same order in both files, because the scoring script compares the two files line by line.
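
Because the comparison is line by line, a misaligned or blank line can distort the score without any obvious error. A small pre-flight check is sketched below, assuming both files are UTF-8 with one utterance per line. The names `check_alignment` and `read_lines` are made up for illustration and are not part of PyTorch-Kaldi.

```python
def read_lines(path):
    """Read a UTF-8 text file into a list of lines without trailing newlines."""
    with open(path, encoding="utf-8") as f:
        return f.read().splitlines()


def check_alignment(hyp_lines, ref_lines):
    """Return a list of human-readable problems; an empty list means no problem found."""
    problems = []
    if len(hyp_lines) != len(ref_lines):
        problems.append(
            f"line count differs: {len(hyp_lines)} hypotheses vs {len(ref_lines)} references"
        )
    # An empty reference has zero words, which leaves its error rate undefined.
    for i, ref in enumerate(ref_lines, start=1):
        if not ref.strip():
            problems.append(f"reference line {i} is empty")
    return problems
```

For example, `check_alignment(read_lines("hyp.txt"), read_lines("ref.txt"))` returns an empty list when the two files line up.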

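3. Using the scoring script

Go to the PyTorch-Kaldi project root and enter the subdirectory that holds the scoring script. This is typically something like local/ or steps/. Then run the scoring script with Python. Here we assume the script is named compute-wer.py.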

cd path/to/pytorch-kaldi/local/
python compute-wer.py --hyp ../data/hyp.txt --ref ../data/ref.txt

To save the output to a file, use the shell redirection operator:

python compute-wer.py --hyp ../data/hyp.txt --ref ../data/ref.txt > evaluation_results.txt

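Example: computing WER

Below is a simple example Python script that computes the word error rate (WER). Note that a real compute-wer.py may be more complex and offer more features and options. This example is for illustration only.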

# compute-wer.py
import sys
from jiwer import wer

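def calculate_wer(hypothesis_file, reference_file):
    # Read one utterance per line; splitlines() drops the trailing newlines.
    with open(hypothesis_file, 'r', encoding='utf-8') as hyp_f, \
         open(reference_file, 'r', encoding='utf-8') as ref_f:
        hypotheses = hyp_f.read().splitlines()
        references = ref_f.read().splitlines()

    # The files are compared line by line, so the line counts must agree.
    if len(hypotheses) != len(references):
        print("The number of lines in the hypothesis and reference files does not match.",
              file=sys.stderr)
        sys.exit(1)

    # jiwer expects the reference first, then the hypothesis.
    word_error_rate = wer(references, hypotheses)
    print(f"Word Error Rate: {word_error_rate:.4f}")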

if __name__ == "__main__":
    if len(sys.argv) != 5 or sys.argv[1] != '--hyp' or sys.argv[3] != '--ref':
        print("Usage: python compute-wer.py --hyp <hypothesis_file> --ref <reference_file>")
        sys.exit(1)

    hyp_file = sys.argv[2]
    ref_file = sys.argv[4]

    calculate_wer(hyp_file, ref_file)

This example uses the jiwer library to compute WER. Install it first:

pip install jiwer
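
To make the metric itself concrete, here is a dependency-free sketch of what a WER computation does: a word-level edit distance per sentence pair, with total errors divided by total reference words. It is an illustration under that definition, not jiwer's implementation, and the function names are made up for this example. The input is the two sample lines from section 2.

```python
def edit_distance(ref_words, hyp_words):
    """Levenshtein distance between two token lists."""
    prev = list(range(len(hyp_words) + 1))
    for i, r in enumerate(ref_words, start=1):
        curr = [i]
        for j, h in enumerate(hyp_words, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion: reference word missing
                            curr[j - 1] + 1,      # insertion: extra hypothesis word
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def corpus_wer(references, hypotheses):
    """Total word errors divided by total reference words."""
    errors = sum(edit_distance(r.split(), h.split())
                 for r, h in zip(references, hypotheses))
    ref_words = sum(len(r.split()) for r in references)
    return errors / ref_words


refs = ["this is an example of a reference transcription",
        "the quick brown fox jumped over the lazy dog"]
hyps = ["this is an example of a hypothesis transcription",
        "the quick brown fox jumps over the lazy dog"]
print(f"{corpus_wer(refs, hyps):.4f}")  # 2 substitutions / 17 reference words -> 0.1176
```

Each pair differs by one substituted word, and the references contain 8 + 9 = 17 words, so the corpus-level WER is 2/17.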

4. Customizing the scoring script

Example: adding CER computation

from jiwer import cer

def calculate_cer(hypothesis_file, reference_file):
    with open(hypothesis_file, 'r', encoding='utf-8') as hyp_f, \
         open(reference_file, 'r', encoding='utf-8') as ref_f:
        hypotheses = hyp_f.read().splitlines()
        references = ref_f.read().splitlines()

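    # Exit with a non-zero status on mismatch (sys is imported at the top of compute-wer.py).
    if len(hypotheses) != len(references):
        print("The number of lines in the hypothesis and reference files does not match.",
              file=sys.stderr)
        sys.exit(1)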
    
    character_error_rate = cer(references, hypotheses)
    print(f"Character Error Rate: {character_error_rate:.4f}")

Then call the new function from the main program. In this version, passing --cer in place of --hyp selects CER:

if __name__ == "__main__":
    # Expected form: (--hyp | --cer) <hypothesis_file> --ref <reference_file>
    if len(sys.argv) != 5 or sys.argv[1] not in ('--hyp', '--cer') or sys.argv[3] != '--ref':
        print("Usage: python compute-wer.py (--hyp | --cer) <hypothesis_file> --ref <reference_file>")
        sys.exit(1)

    hyp_file = sys.argv[2]
    ref_file = sys.argv[4]

    if sys.argv[1] == '--hyp':
        # Default mode: word error rate
        calculate_wer(hyp_file, ref_file)
    else:
        # --cer takes the place of --hyp and switches to character error rate
        calculate_cer(hyp_file, ref_file)
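
Handling a different input format mostly comes down to changing how the files are read. Kaldi transcript files usually start each line with an utterance ID followed by the text, so pairs can be matched by ID instead of by line order. The sketch below shows one way to do that. The helper names `parse_kaldi_text` and `pair_by_utt_id` are illustrative assumptions, not functions from Kaldi or PyTorch-Kaldi.

```python
def parse_kaldi_text(lines):
    """Map utterance ID -> transcript for lines shaped like 'utt_id word1 word2 ...'."""
    table = {}
    for line in lines:
        parts = line.strip().split(maxsplit=1)
        if not parts:
            continue  # skip blank lines
        utt_id = parts[0]
        # An ID with no words after it is kept as an empty transcript.
        table[utt_id] = parts[1] if len(parts) > 1 else ""
    return table


def pair_by_utt_id(ref_lines, hyp_lines):
    """Return aligned (references, hypotheses) lists for IDs present in both inputs."""
    refs = parse_kaldi_text(ref_lines)
    hyps = parse_kaldi_text(hyp_lines)
    shared = sorted(refs.keys() & hyps.keys())
    return [refs[u] for u in shared], [hyps[u] for u in shared]
```

The two returned lists can be passed to wer or cer in place of the line-by-line lists. Note that this sketch silently drops IDs that appear in only one file; a stricter script would probably report them or count missing hypotheses as errors.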