Commit 55fc2b6

balogh.adam@icloud.com authored and committed
prompt
1 parent 060e0c9 commit 55fc2b6

1 file changed

Lines changed: 37 additions & 41 deletions

subnet/evaluation_prompt.txt

@@ -1,48 +1,44 @@
-You are tasked with evaluating the quality of quantitative analysis performed by an AI quant agent. Assess each analysis on a scale from 0 to 1, where 0 represents critically flawed analysis and 1 represents exemplary analysis. Your evaluation should be objective, consistent, and based on the criteria outlined below.
-
-Scoring Criteria (Each weighted equally)
-
-1. Methodological Rigor
-Data Quality Assessment: Did the analyst properly evaluate data quality, outliers, and missing values?
-Model Selection: Was the chosen model appropriate for the problem and data characteristics?
-Statistical Validity: Were statistical tests properly applied and interpreted?
-Assumptions: Were model assumptions explicitly stated and verified?
-Robustness Checks: Were appropriate sensitivity analyses or robustness checks performed?
-
-2. Technical Execution
-Implementation Accuracy: Was the analysis implemented without technical errors?
-Computational Efficiency: Were appropriate algorithms and computational approaches used?
-Feature Engineering: Were variables appropriately transformed, normalized, or engineered?
-Cross-Validation: Were proper validation techniques employed to avoid overfitting?
-Reproducibility: Is the analysis reproducible with the provided code and data?
-
-3. Analytical Depth
-Complexity Handling: Did the analysis appropriately address complex relationships in the data?
-Alternative Hypotheses: Were alternative explanations considered and tested?
-Contextual Understanding: Did the analysis reflect domain knowledge and business context?
-Causal Reasoning: Were causal claims properly supported or appropriately avoided?
-Comparative Analysis: Was the approach benchmarked against relevant alternatives?
-
-4. Interpretation & Communication
-Results Clarity: Were results presented clearly and accurately?
-Uncertainty Communication: Was uncertainty properly quantified and communicated?
-Visual Representation: Were visualizations effective and accurately represented the data?
-Limitations Acknowledgment: Were limitations of the analysis explicitly discussed?
-Actionable Insights: Did the analysis lead to clear, actionable recommendations?
-
-5. Business Impact & Relevance
-Problem Alignment: Did the analysis directly address the business question?
-Decision Support: Did the analysis effectively support decision-making?
-Value Quantification: Was the potential business value or impact quantified?
-Implementation Feasibility: Were recommendations practical and implementable?
-Strategic Consideration: Did the analysis consider broader strategic implications?
+You are tasked with evaluating the quality of responses from BitQuant, an AI quant agent specialized in crypto/DeFi analytics. Assess each response on a scale from 0 to 10 for each criterion, where 0 represents poor quality and 10 represents excellent quality. Your evaluation should be objective and consistent.
+
+Scoring Criteria (Each weighted equally - 10 points each, maximum total score: 50)
+
+1. Tool Usage & Data Accuracy
+- Did the agent use appropriate tools for the query?
+- Was the data accurate and up-to-date?
+- Were API calls handled properly (no errors, appropriate fallbacks)?
+- For simple queries: Was the right tool used efficiently?
+- For complex queries: Were multiple tools used appropriately?
+
+2. Crypto/DeFi Knowledge
+- Did the response show understanding of crypto/DeFi concepts?
+- Were protocols, tokens, and metrics explained correctly?
+- Did the analysis consider relevant market factors?
+- Was the terminology used appropriately?
+
+3. Response Quality
+- Did the response directly answer the user's question?
+- Was the information presented clearly and concisely?
+- Were numbers and data formatted properly?
+- Did the response include relevant context when needed?
+
+4. User Experience
+- Was the response helpful and actionable?
+- Were pool IDs, token IDs, or wallet addresses formatted correctly for interaction?
+- Did the response match the expected tone (authoritative, data-driven)?
+- Was the response complete without requiring follow-up questions?
+
+5. Technical Execution
+- Were calculations performed correctly?
+- Was data processing accurate?
+- Did the response handle edge cases appropriately?
+- Was the analysis reproducible with the same inputs?
 
 Final Scoring Calculation:
 
-Score each of the 5 main criteria on a scale of 0 to 10.
-Calculate the final score as the sum of the score for each criteria (so maximum final score is 50).
+Score each of the 5 criteria on a scale of 0 to 10.
+Calculate the final score as the sum of all criteria scores (maximum: 50).
 
-Explain your scoring and evaluation method and return the final score as a JSON like: ```json{"score":35}```
+Provide a brief explanation of your scoring and return the final score as JSON: ```json{"score":35}```
 
 =======
 
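
For context on how a caller might consume the output this prompt requests, here is a minimal sketch that pulls the fenced JSON score out of an evaluator reply and checks it against the 0 to 50 range the prompt states. The names `extract_score` and `MAX_TOTAL` are hypothetical and do not come from this repository; the commit does not show how the subnet actually parses the reply.

```python
import json
import re

# Five criteria at 10 points each, as stated in the prompt.
MAX_TOTAL = 50

# Matches a fenced block tagged "json". Whitespace after the tag is optional
# because the prompt's own example puts the object directly after "json".
_JSON_FENCE = re.compile(r"`{3}json\s*(\{.*?\})\s*`{3}", re.DOTALL)


def extract_score(reply: str) -> float:
    """Return the final score from an evaluator reply (hypothetical helper).

    Raises ValueError if the block is missing, is not valid JSON,
    or the score falls outside 0..MAX_TOTAL.
    """
    match = _JSON_FENCE.search(reply)
    if match is None:
        raise ValueError("no fenced json block found in reply")
    payload = json.loads(match.group(1))  # JSONDecodeError is a ValueError
    score = payload.get("score")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(score, bool) or not isinstance(score, (int, float)):
        raise ValueError(f"score is not numeric: {score!r}")
    if not 0 <= score <= MAX_TOTAL:
        raise ValueError(f"score {score} is outside 0..{MAX_TOTAL}")
    return float(score)
```

Rejecting out-of-range values instead of clamping them is a design choice in this sketch: a reply that reports 51 more likely signals a miscounted rubric than a strong response.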
