
🔭 Extracting data from a dataset and manipulating it with Python.


bielborgesc/data-manipulation


✔️ FINISHED

📙 Final project for Programming Algorithms I

👮‍♂️💻 Handling Police Data

This was my first data manipulation project written using only the Python language. I did it in my first semester of college, and three semesters later I decided to rewrite the code to see how much I had evolved.

🔍 Objective

The main goal was to rewrite the code using what I have learned since then: software engineering, clean code, English, data structures, and making the code as efficient as possible.

💻 Development

  • I explain and list the differences between the versions and their functions so that the optimization of the code can be seen in depth. The file "data_manipulation.py" in folder V1 is the first version and is written in Portuguese. The file "data_manipulation.py" in folder V2 is the optimized code and is written in English. The file "data_manipulation.ipynb" in folder V3 is the latest version, which uses libraries such as Pandas and was written in Jupyter.

  • The file "original_database.csv" is the original dataset from "https://www.kaggle.com/ahsen1330/us-police-shootings". It was downloaded in 2020 and may have been updated since then.

  • The file "ajusted_database.csv" is a copy of "original_database.csv" with some corrections, because the original file had unfilled fields that caused many errors in the application.

  • The file "invented_database.csv" has the same columns as the original dataset, but contains dummy data that I created.

  • Each function is preceded by a brief comment describing what it does.

  • PLEASE SEE OPTION 10 OF BOTH VERSIONS: THEY SHOW A BIG GAIN IN PRODUCTIVITY AND HOW MUCH I HAVE EVOLVED.
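The unfilled fields mentioned above are the kind of problem that can also be handled while loading the file. The following is only a minimal sketch, not the code from this repository: the sample rows are invented, the column names `name`, `age`, and `state` are assumptions for illustration, and `load_rows` is a hypothetical helper.

```python
import csv
import io

# Invented sample rows. The column names are assumptions for illustration
# and may not match the real dataset.
SAMPLE_CSV = """name,age,state
Person A,34,TX
Person B,,CA
Person C,51,
"""


def load_rows(source, defaults):
    """Read CSV rows, replacing empty fields with the given default values."""
    rows = []
    for row in csv.DictReader(source):
        cleaned = {}
        for key, value in row.items():
            text = (value or "").strip()
            # Fall back to the default when the field was left unfilled.
            cleaned[key] = text if text else defaults.get(key, "")
        rows.append(cleaned)
    return rows


# "0" and "unknown" are placeholder defaults chosen only for this sketch.
rows = load_rows(io.StringIO(SAMPLE_CSV), {"age": "0", "state": "unknown"})

# Converting to int no longer fails on the row whose age was empty.
ages = [int(row["age"]) for row in rows]
```

To try it on a real file, an open file object (for example `open("original_database.csv", newline="")`) can be passed in place of the `io.StringIO` wrapper, with defaults chosen to suit the actual columns.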

✔️ Concluding

Many values are hard-coded, but they can easily be replaced with user input. The code was written to adapt more easily to future updates. From V1 to V2 there is a clear optimization as the code moves from a function-based style to one closer to OOP. From V2 to V3 we can see how libraries such as Pandas provide more productivity with fewer lines of code.
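To illustrate the kind of change described between V1 and V2, here is a small sketch under stated assumptions. It is not taken from the repository: the records are invented, and `count_by_state` and the `Dataset` class are hypothetical names used only to contrast the two styles.

```python
from collections import Counter

# Invented records, used only for this illustration.
RECORDS = [
    {"state": "TX", "age": 34},
    {"state": "CA", "age": 27},
    {"state": "TX", "age": 51},
]


# Function-based style: one function per question, with the column fixed
# inside the function body.
def count_by_state(records):
    counts = {}
    for record in records:
        counts[record["state"]] = counts.get(record["state"], 0) + 1
    return counts


# Class-based style: the data is stored once and the methods are reusable
# for any column.
class Dataset:
    def __init__(self, records):
        self.records = list(records)

    def count_by(self, column):
        """Count how many records share each value of the given column."""
        return Counter(record[column] for record in self.records)

    def average(self, column):
        """Return the arithmetic mean of a numeric column."""
        values = [record[column] for record in self.records]
        return sum(values) / len(values)


dataset = Dataset(RECORDS)
state_counts = dataset.count_by("state")
mean_age = dataset.average("age")
```

In Pandas, a count like this is typically a single call such as `df["state"].value_counts()`, which is the sort of reduction in code size that the move to V3 refers to.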

▶️ Run the code

This application was developed with JetBrains' PyCharm IDE (https://www.jetbrains.com/pt-br/pycharm/), but you can use any IDE you prefer. The Python version used for development was 3.10.6. The code uses libraries such as Matplotlib (https://matplotlib.org/) and Pandas (https://pandas.pydata.org/). V1 and V2 can be run in PyCharm, but V3 needs to be run in Jupyter Notebook (https://jupyter.org/). I recommend using Anaconda (https://www.anaconda.com/products/distribution).

🙋‍♂️ Developer

Gabriel Carvalho
