markotesic.org
Robust evaluation of generative AI
A tutorial on evaluating the capabilities of LLMs, presented at the European Association for Data Science Summer School on Generative AI.
Marko Tešić
Jun 20, 2024
Measurement layouts for capability-oriented AI evaluation
A tutorial presented at AAAI-24 on AI evaluation that focuses on estimating capabilities and creating capability profiles of AI systems (e.g., reinforcement learning agents and large language models) using a Bayesian framework.
John Burden, José Hernández-Orallo, Marko Tešić, Konstantinos Voudouris
Feb 20, 2024