
PromptBench

PromptBench: A unified library for evaluating and understanding large language models, enabling quick model assessment and robustness testing.

Introduction

PromptBench is a PyTorch-based Python package designed for evaluating Large Language Models (LLMs). It offers user-friendly APIs for researchers to conduct thorough evaluations. Key features include:

  • Quick Model Performance Assessment: Easily build models, load datasets, and evaluate performance (see the sketch after this list).
  • Prompt Engineering: Implements techniques like Few-shot Chain-of-Thought, Emotion Prompt, and Expert Prompting.
  • Adversarial Prompt Evaluation: Integrates prompt attacks to assess model robustness.
  • Dynamic Evaluation: Uses DyVal to generate evaluation samples on-the-fly, mitigating test data contamination.
  • Efficient Multi-Prompt Evaluation: Integrates PromptEval to estimate performance across a large set of prompts from only a small number of evaluations.
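
To give a sense of the workflow, here is a minimal sketch of a single evaluation run. The names used (pb.DatasetLoader, pb.LLMModel, pb.Prompt, pb.InputProcess, pb.OutputProcess, pb.Eval) follow the project's documented quick-start example, but treat the exact signatures, the model/dataset choices, and the proj_func helper as assumptions rather than a definitive reference:

```python
# Minimal PromptBench evaluation sketch. Names follow the project's
# quick-start example; exact signatures may differ, check the docs.
import promptbench as pb

# Load a dataset (here: SST-2 sentiment classification).
dataset = pb.DatasetLoader.load_dataset("sst2")

# Build a model with short, near-deterministic generations.
model = pb.LLMModel(model="google/flan-t5-large",
                    max_new_tokens=10, temperature=0.0001)

# Define one or more prompt templates; {content} is filled per example.
prompts = pb.Prompt([
    "Classify the sentence as positive or negative: {content}",
])

def proj_func(pred):
    # Hypothetical helper: map the model's text output to a label id.
    mapping = {"positive": 1, "negative": 0}
    return mapping.get(pred.strip().lower(), -1)

for prompt in prompts:
    preds, labels = [], []
    for data in dataset:
        # Fill the prompt template with this example's content.
        input_text = pb.InputProcess.basic_format(prompt, data)
        raw_pred = model(input_text)
        # Normalize the raw generation into a class label.
        preds.append(pb.OutputProcess.cls(raw_pred, proj_func))
        labels.append(data["label"])
    # Report accuracy for this prompt.
    print(prompt, pb.Eval.compute_cls_accuracy(preds, labels))
```

The same loop generalizes to the other features: swapping in an adversarial prompt attack or a DyVal-generated dataset changes only how the prompts or examples are produced, not the evaluate-and-score structure above.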
