Open Questions
1. In your work or industry, where are you currently using language models? What specific tasks are these models performing for you?
2. Which aspects of the models' performance are your users responding well to, and which are causing challenges? Have you encountered any recurring issues, such as inaccuracies, biases, or hallucinations?
3. When selecting a language model for your tasks, what key features or capabilities do you look for? Are there any particular strengths or weaknesses in current models that significantly influence your choice?
4. What factors are most important to you when choosing a model (for example, open-source availability, multilingual capabilities, or specific performance metrics)? How do you prioritise these factors?
5. How do you define and measure "success" for the tasks where you're using language models? What metrics or outcomes are most critical for you?
6. When evaluating different language models, which performance metrics do you consider most important? How do you balance trade-offs between different metrics for various tasks?
7. Could you describe three specific tasks you use language models for, and outline what characteristics make a model particularly suitable or unsuitable for each of those tasks?
8. Based on your experience working with language models, what additional factors or considerations do you think are important when choosing between different models? Is there anything else you'd like to share that could help in developing better model selection criteria?