Aim: Predict the market value of football players
This directory consists of a Python notebook (.ipynb) and two Python files (.py). The notebook explores the dataset and compares and analyses various machine learning techniques. The Python file called football_transfer_predictor.py runs the chosen regression model on a new set of football player data and predicts their market value. To play with the prediction model, you can change the data in the new_player.csv file. The other Python file, called football_transfer_prediction.py, is the same as the notebook but as a pure Python file.

Dear students,

The session on Wednesday 20th February, on Practical Application Case Study, is a practical working group session. For this practical working group session you will need to work in groups and apply the material already learnt to a practical problem of your choice. In contrast to previous non-class learning sessions, the assignment for this one will stay open beyond today. You should work in groups on the assignment in the teams already assigned.

The assignment is to be submitted on 20th March 2023. Each group should then present the results of their assignment during the face-to-face session on 24th March 2023.

According to the syllabus: "The group assignment will consist of practical application of the learning techniques learnt to specific datasets. Evaluation of the group work will be both for the discussion and decision taking (20%) and for the presentation (20%)."

The assignment involves:
- selecting a dataset -- with an associated problem -- for which machine learning solutions may provide interesting insights
- preparing the dataset in the best possible manner (cleaning, normalising, feature selection...)
- selecting a set of machine learning techniques that might be applicable to the dataset (this may involve applying different competing techniques to address the same task)
- running the selected set of techniques on the dataset to the best advantage (trying for different configurations of parameters, applying intermediate validation on part of the dataset...)
- comparative analysis of the different solutions applied, leading to a set of conclusions concerning the most suitable solution
- informed reflection on how the application of machine learning solutions provides insights useful for addressing the underlying problem

The submission for the assignment should include:
- a Jupyter notebook covering the six points above
- the set of pure Python programs required to address each of the tasks

You are encouraged to search the Internet for ideas, datasets and for code to reuse. However, keep in mind that your submissions will be graded on originality as well as on correctness. If you find good material applicable to your problem, make sure it is (correctly) combined with other elements to ensure your submission has added value of its own.

(To facilitate interaction across the group and submission of the material, I highly recommend the use of a GitHub site for the project. I have noticed in the past that students were using GitHub and it seems a pity not to take advantage of that. If you do not know what I am talking about, do not worry and proceed as you would have without this paragraph.)

Best regards,
Pablo