Summaries Annotation Platform
Welcome to the code summary annotation platform!
The summaries you will see are either written by a domain expert or generated by a pre-trained LLM. Some of the LLM-generated summaries were produced with our approach, which takes into account the context of the method being analyzed.
During your session, you will annotate these summaries without knowing whether they are human- or machine-generated.
Please evaluate each summary on two criteria: precision (accuracy with respect to the source code) and recall (completeness of the conveyed information). For each criterion, assign a score from 1 to 4, where 4 is the highest; the levels below are listed from lowest (1) to highest (4).
- Precision (Consistency):
  - Very Poor: Contains hallucinated facts; the summary is unusable.
  - Poor: Explains the code poorly; important elements are missing.
  - Good: The summary matches the source code. Any errors are tolerable.
  - Excellent: The summary accurately describes the source code.
- Recall (Relevance):
  - Strongly Disagree: Misses all important information about the source code.
  - Disagree: Misses some major information about the source code.
  - Agree: Contains important information.
  - Strongly Agree: Contains all important information.
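For anyone tallying annotations afterwards, the two scales above can be read as a lookup from label to score. The following is a minimal sketch, assuming the labels map to scores 1 to 4 in the order they are listed; the names `PRECISION_SCALE`, `RECALL_SCALE`, and `score_annotation` are hypothetical and not part of the platform.

```python
# Hypothetical encoding of the two rubrics. Assumption: labels map to
# scores 1-4 in the order they appear in the lists above.
PRECISION_SCALE = {"Very Poor": 1, "Poor": 2, "Good": 3, "Excellent": 4}
RECALL_SCALE = {
    "Strongly Disagree": 1,
    "Disagree": 2,
    "Agree": 3,
    "Strongly Agree": 4,
}


def score_annotation(precision_label: str, recall_label: str) -> dict[str, int]:
    """Convert one pair of rubric labels into numeric scores."""
    if precision_label not in PRECISION_SCALE:
        raise ValueError(f"Unknown precision label: {precision_label!r}")
    if recall_label not in RECALL_SCALE:
        raise ValueError(f"Unknown recall label: {recall_label!r}")
    return {
        "precision": PRECISION_SCALE[precision_label],
        "recall": RECALL_SCALE[recall_label],
    }
```

For example, a summary judged "Good" on precision and "Strongly Agree" on recall would be recorded as precision 3 and recall 4.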