Assignment Submission Guidelines
Chinta Geetha Charan Reddy edited this page Oct 8, 2021 · 1 revision
- Fork the repository to utilize the template
- Create a branch named as follows: {your_name}-{data_product_name}
- Rename the root folder for the data product according to the name of the data source
- Add your scripts for individual steps in an appropriate folder.
- Any additional utility scripts should be added to the utils folder.
- Follow this general template for writing abstraction classes for every step
# import your dependencies here


class MyClass:
    def __init__(self, **kwargs):
        self.config = kwargs.get("config")

    def do_something(self):
        """Do extraction or processing of data here"""
        return None

    def load_data(self):
        """Function to load data"""
        return None

    def save_data(self):
        """Function to save data"""
        return None

    def run(self):
        """Load data, do_something and finally save the data"""
        return None


if __name__ == "__main__":
    config = {}
    obj = MyClass(config=config)
    obj.run()
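As an illustration of how the template might be filled in, here is a minimal sketch of one step. It is not part of the assignment template: the class name `CsvToJsonStep` and the config keys `input_path` and `output_path` are hypothetical, and it assumes a well-formed CSV in which every row has the same number of columns as the header.

```python
import csv
import json
from pathlib import Path


class CsvToJsonStep:
    """Hypothetical step: read a CSV file, tidy its rows, write JSON."""

    def __init__(self, **kwargs):
        # "input_path" and "output_path" are assumed config keys.
        self.config = kwargs.get("config") or {}
        self.rows = []

    def load_data(self):
        """Read every row of the input CSV into a list of dicts."""
        with open(self.config["input_path"], newline="", encoding="utf-8") as f:
            self.rows = list(csv.DictReader(f))
        return self.rows

    def do_something(self):
        """Strip stray whitespace from every column name and value."""
        self.rows = [
            {key.strip(): (value or "").strip() for key, value in row.items()}
            for row in self.rows
        ]
        return self.rows

    def save_data(self):
        """Write the processed rows to the output path as JSON."""
        out = Path(self.config["output_path"])
        out.parent.mkdir(parents=True, exist_ok=True)
        with open(out, "w", encoding="utf-8") as f:
            json.dump(self.rows, f, ensure_ascii=False, indent=2)
        return out

    def run(self):
        """Load data, process it and finally save it."""
        self.load_data()
        self.do_something()
        return self.save_data()
```

With a config such as `{"input_path": "raw.csv", "output_path": "data/clean.json"}` (placeholder paths), calling `CsvToJsonStep(config=config).run()` returns the path of the written file. Keeping `run` as a thin wrapper around the three other methods lets each one be tested on its own.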
- The class can have other helper functions if required
- Use the black formatter
- Use the flake8 linter
- A settings.json is provided in the .vscode directory
- The data can be submitted in one of two formats: CSV or JSON
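A minimal sketch of writing the same records in either accepted format, using only the Python standard library. The helper name `save_records` and the choice of format by file extension are assumptions made for illustration. The guidelines do not prescribe a schema or a naming convention.

```python
import csv
import json


def save_records(records, path):
    """Write a non-empty list of flat dicts to `path` as CSV or JSON.

    The format is picked from the file extension (an assumption of this sketch).
    """
    if not records:
        raise ValueError("no records to save")
    if path.endswith(".csv"):
        with open(path, "w", newline="", encoding="utf-8") as f:
            # Column order follows the keys of the first record.
            writer = csv.DictWriter(f, fieldnames=list(records[0].keys()))
            writer.writeheader()
            writer.writerows(records)
    elif path.endswith(".json"):
        with open(path, "w", encoding="utf-8") as f:
            json.dump(records, f, ensure_ascii=False, indent=2)
    else:
        raise ValueError(f"unsupported format: {path}")
```

CSV stores every value as text, so numeric fields come back as strings when the file is read again. JSON keeps numbers, booleans and nesting.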
- The intermediate and final standardized data can be submitted in the following manner
  - Create a data directory outside of the root folder for the data product and push the data there
  - Use Git LFS (Git Large File Storage) to commit those files
  - Push the relevant scripts to your branch
- When you are done working on the data product, create a pull request from your branch to the original repository you forked from
- The assignment will be evaluated based upon the following criteria