===== Evaluation =====

=== Tools ===

  * 2021-04-29 | [[https://github.com/princeton-nlp/metric-wsd|MetricWSD]] - Non-Parametric Few-Shot Learning for Word Sense Disambiguation
  * 2021-02-18 | [[https://github.com/cl-tohoku/PheMT/tree/main/eval_tools|PheMT evaluation toolkit]] - evaluation dataset for Japanese-English [[機械翻訳|machine translation]], organized by linguistic phenomenon
  * 2019-10-17 | [[https://github.com/gcunhase/NLPMetrics|Natural Language Processing Performance Metrics]]
  * 2017-11-02 | [[https://github.com/borgr/gec-ranking|Ground Truth for Grammatical Error Correction Metrics]] - Python implementation of the GLEU metric

=== Articles ===

  * 2022-06-01 | [[https://blog.shinonome.io/huggingface-evaluate/|[Machine Learning] Trying out Evaluate, Hugging Face's library for computing evaluation metrics]]
  * 2022-02-25 | [[https://ai-scholar.tech/articles/natural-language-processing/mauve|MAUVE: Accurately modeling how human-like and interesting generated text is]]
  * 2021-12-18 | [[https://gotutiyan.hatenablog.com/entry/2021/12/18/123008|Evaluation methods that are not used as evaluation methods]]
  * 2020-12-15 | [[https://stop-the-world.hatenablog.com/entry/cs276-information-retrieval-15|Information Retrieval and Web Search summary (15): Evaluation (2)]]
  * 2020-12-14 | [[https://stop-the-world.hatenablog.com/entry/cs276-information-retrieval-14|Information Retrieval and Web Search summary (14): Evaluation (1)]]
  * 2020-06-03 | [[https://webbigdata.jp/ai/post-5978|BLEURT: Evaluating the quality of AI-generated text (1/3)]]
  * 2020-05-15 | (slides) [[https://speakerdeck.com/cfiken/15-nlpaper-dot-challenge-bertying-yong-mian-qiang-hui-tekisutosheng-cheng-falseping-jia-x-bert|[2020/05/15] nlpaper.challenge BERT applications study group: Evaluating text generation × BERT]]
  * 2020-01-12 | [[https://qiita.com/amtsyh/items/a926b79b90dfabe895e9|On automatic evaluation metrics for text generation]]