Project
Code data preprocessing module to remove license headers and commented-out code
Objective
This module automatically removes license headers and commented-out code from source code files. It will support multiple programming languages and be easy to extend to new ones. The module will be integrated into the data preprocessing pipeline to generate clean code datasets for training code LLMs.
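As a rough illustration of the objective, the sketch below shows one way such a cleaning step could look. It is not the project's implementation and does not use any Data Prep Kit API: the names `LINE_COMMENT`, `strip_license_header`, `strip_commented_code`, and `clean_source` are hypothetical, the keyword list and the "looks like code" pattern are simple assumed heuristics, and only line comments are handled (block comments such as `/* ... */` are out of scope here).

```python
import re

# Hypothetical registry: language name -> line-comment prefix.
# Supporting a new language means adding one entry here.
LINE_COMMENT = {
    "python": "#",
    "java": "//",
    "c": "//",
}

# Assumed keywords that mark a leading comment block as a license header.
LICENSE_WORDS = re.compile(r"copyright|license|spdx", re.IGNORECASE)

# Crude heuristic for "this comment is really code": an assignment,
# a call that ends the line, or a trailing ';', '{' or '}'.
CODE_LIKE = re.compile(r"\w\s*=\s*\S|\w\(.*\)\s*;?\s*$|[;{}]\s*$")


def strip_license_header(lines, prefix):
    """Drop the leading run of comment/blank lines if it mentions a license."""
    end = 0
    while end < len(lines) and (
        lines[end].strip() == "" or lines[end].lstrip().startswith(prefix)
    ):
        end += 1
    header = "\n".join(lines[:end])
    return lines[end:] if LICENSE_WORDS.search(header) else lines


def strip_commented_code(lines, prefix):
    """Drop full-line comments whose body matches the code heuristic."""
    kept = []
    for line in lines:
        stripped = line.lstrip()
        if stripped.startswith(prefix):
            body = stripped[len(prefix):].strip()
            if CODE_LIKE.search(body):
                continue  # looks like commented-out code, skip it
        kept.append(line)
    return kept


def clean_source(text, language):
    """Apply both cleaning steps to one source file's text."""
    prefix = LINE_COMMENT[language]
    lines = text.splitlines()
    lines = strip_license_header(lines, prefix)
    lines = strip_commented_code(lines, prefix)
    return "\n".join(lines)
```

For example, `clean_source(source_text, "java")` would drop a leading `// Copyright ...` block and a line such as `// int x = 1;`, while keeping ordinary explanatory comments. The heuristic will misjudge some comments (for instance prose that ends in a function call), so a real module would likely need a parser-based or per-language approach.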
Outcome
Deliver a new code data preprocessing module based on the Data Prep Kit.
Apply By Date | 15 May 2024 |
Students | 1 / 1 |
Duration | 8 weeks |
Mentor | Parameswaran Selvam |
Tools-Technologies | Jupyter Python Notebooks |
Platform | 1. WatsonX |
College | 1. Sarvajanik College of Engineering and Technology |