LLM Practical Benchmark

Rank	Model Name	Score	Params (#P)	Template

The LLM Practical Benchmark quantifies the practical usefulness of large language models. It evaluates six areas: knowledge, logical reasoning, instruction following, code writing, generality of answers, and degree of non-censorship.
