New versions of evaluation metrics can be added to the table by sending us the updated metrics as a CSV file.
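As a rough illustration of how such a submission might be folded into the existing table, the sketch below merges an incoming CSV with the current leaderboard. The file names and column names (model, metric, metric_version, score) are assumptions for the example, not the actual schema.

```python
# Minimal sketch, assuming a hypothetical leaderboard layout with one row
# per model/metric pair. File names and columns are placeholders.
import pandas as pd

# Existing leaderboard table (hypothetical file name).
leaderboard = pd.read_csv("leaderboard.csv")

# Incoming submission with updated metric versions (hypothetical layout:
# model,metric,metric_version,score).
update = pd.read_csv("updated_metrics.csv")

# Keep the newest version of each model/metric pair; submitted rows
# override older rows with the same keys.
merged = (
    pd.concat([leaderboard, update])
    .sort_values("metric_version")
    .drop_duplicates(subset=["model", "metric"], keep="last")
)

merged.to_csv("leaderboard.csv", index=False)
```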
Open Questions
Will the missing evaluation scores be filled in at some point? Can we convince all curators to evaluate every model in the database so that the leaderboard becomes more comprehensive?