Due Jan. 24, 2026 by email to the teaching assistant.
In class, you have learned about a number of networking papers, but in the absence of open-source prototypes, manually reproducing one of these articles takes a long time. In this project, you will learn how to use large language models, chain-of-thought prompt engineering, and few-shot learning to reproduce networking papers. Your goals are to (1) select a paper from the networking domain and develop a solid grasp of what large language models can contribute to reproduction, and (2) reproduce your selected paper with our semi-automated reproduction framework and evaluate the result.
You may use one or more large language models to help you reproduce the paper.
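As an illustration of the prompting techniques mentioned above, here is a minimal Python sketch of a few-shot, chain-of-thought prompt that asks a model to turn a paper excerpt into code. Everything in it is hypothetical (the worked example, the wording, and the `call_llm` placeholder are not part of the framework); adapt it to your chosen model's actual API.

```python
# Sketch of a few-shot, chain-of-thought prompt for code reproduction.
# The example pair below and call_llm() are hypothetical placeholders.

FEW_SHOT_EXAMPLE = """\
Paper excerpt: "The switch marks a packet when queue depth exceeds K."
Reasoning: We need a per-packet check against a threshold K, so the
code compares the current queue depth to K before forwarding.
Code:
def mark(pkt, queue_depth, K):
    pkt.ecn = 1 if queue_depth > K else pkt.ecn
    return pkt
"""

def build_prompt(paper_excerpt: str) -> str:
    """Combine a worked example (few-shot) with an explicit request to
    reason step by step (chain of thought) before emitting code."""
    return (
        "You are reproducing a networking system from its paper.\n"
        "Here is one worked example:\n\n"
        f"{FEW_SHOT_EXAMPLE}\n"
        "Now do the same for the following excerpt. First explain your\n"
        "reasoning step by step, then output runnable code.\n\n"
        f'Paper excerpt: "{paper_excerpt}"\n'
    )

# Example usage (call_llm is a stand-in for your model's API):
# response = call_llm(build_prompt("The sender halves cwnd on ECN echo."))
```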
Choose one of the papers below to reproduce.
Criteria for judging the success of reproducing the paper:
Verify whether the reproduction system achieves the basic functionality of the original system. If the reproduction matches or exceeds the original in these respects, it can be deemed functionally successful.
Compare the performance of the reproduction system with that of the original system. Performance metrics may include speed, stability, resource utilization, etc.
Record the following information during the experiment:
Your choice of paper and large language model(s).
Count the total number of prompts you used, broken down into prompts constructed by the semi-automated framework, prompts used for debugging, and prompts added manually beyond the semi-automated framework (human involvement); a logging sketch follows the table below. If you believe the number of manual prompts is excessive, you may improve the semi-automated reproduction framework; include a document describing your improvements to the prompt framework in the final submission.
| All Prompts | Prompts Constructed with the Semi-automated Framework | Prompts with Human Involvement | Debug Prompts |
|---|---|---|---|
| | | | |
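One way to keep these counts accurate is to log every prompt as you go. Below is a minimal sketch, assuming you record each prompt in a `prompts.csv` file with a `category` column; the file name and category labels are illustrative choices, not requirements of the assignment.

```python
# Tally prompts from a simple CSV log (one row per prompt).
# Assumed columns: at least "category" in {"framework", "human", "debug"}.
import csv
from collections import Counter

def tally_prompts(path: str = "prompts.csv") -> Counter:
    with open(path, newline="") as f:
        return Counter(row["category"] for row in csv.DictReader(f))

counts = tally_prompts()
print("All prompts:", sum(counts.values()))
print("Framework-constructed:", counts["framework"])
print("Human involvement:", counts["human"])
print("Debug:", counts["debug"])
```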
Record the total time from when you first read the paper to when you finished reproducing the system, broken down into time spent reading the paper, time spent on code generation, and time spent debugging.
| Total Time (hours) | Reading the Paper | Code Generation | Debugging |
|---|---|---|---|
| | | | |
List the functions implemented in the original system and the functions you reproduced. For example:
| | Simple Marking at the Switch | ... | ... |
|---|---|---|---|
| Functions realized in the original system | | | |
| Functions in the reproduced system | | | |
If there are functions that have not been reproduced, please explain why.
Perform two or more performance evaluations, each tested on at least two datasets. As far as possible, use the datasets used by the original system. If a dataset tested by the original system is too large, you may select a subset of the data for this project's testing; be sure to document the method used to select the subset. For example (a sketch of the error computation follows the table):
| Dataset | Original System | Reproduction System | Average Relative Error (%) |
|---|---|---|---|
| Dataset1 | X s | Y s | \|Y-X\|/X * 100% |
| Dataset2 | | | |
| ... | | | |
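The last column can be computed directly from your measurements. Below is a minimal sketch of the computation; the timing numbers are hypothetical placeholders, so substitute your own measured X and Y values.

```python
# Average relative error across datasets, matching the table above.

def relative_error_pct(original: float, reproduced: float) -> float:
    """|Y - X| / X * 100%, the metric used in the evaluation table."""
    return abs(reproduced - original) / original * 100.0

# Hypothetical measurements in seconds: {dataset: (X, Y)}.
results = {"Dataset1": (12.0, 13.5), "Dataset2": (8.0, 7.6)}

errors = {name: relative_error_pct(x, y) for name, (x, y) in results.items()}
for name, err in errors.items():
    print(f"{name}: {err:.1f}%")
print(f"Average relative error: {sum(errors.values()) / len(errors):.1f}%")
```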
Count the lines of code of both the original system and the reproduction system; a counting sketch follows the table.
| | Original System | Reproduction System |
|---|---|---|
| LOC (lines) | X lines | Y lines |
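For a fair comparison, apply the same counting rule to both code bases. Below is a minimal sketch that counts non-blank, non-comment lines, assuming Python-style `#` comments; adjust the extension and comment prefix per language, or use a tool such as `cloc`, which applies the same idea per language.

```python
# Rough LOC counter: non-blank, non-comment lines under a directory.
from pathlib import Path

def count_loc(root: str, ext: str = ".py", comment: str = "#") -> int:
    """Count non-blank lines that do not start with the comment prefix."""
    total = 0
    for path in Path(root).rglob(f"*{ext}"):
        for line in path.read_text(errors="ignore").splitlines():
            stripped = line.strip()
            if stripped and not stripped.startswith(comment):
                total += 1
    return total

# Directory names are placeholders for your two code bases.
print("Original system LOC:", count_loc("original/"))
print("Reproduction system LOC:", count_loc("reproduction/"))
```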
Please submit by email to the instructor. Turn in the electronic and paper materials as follows.
Submission should include: