Refactor/optimize pipeline #57

MariaGabrielaReis · 2023-11-26T16:19:46Z

Optimizing steps 4 to 6 plus the extra step: preprocessing, sentiment analysis, model testing, and topic modeling

PR Type

What kind of change does this PR introduce?

Feature
Code style update (formatting, local variables)
Refactoring (no functional changes, no api changes)

Describe the change

New files were created to better organize the pipeline steps: one file only for the classification model and its functions, another dedicated to preprocessing, and another dedicated to topic modeling, plus a separate file in the utils folder that evaluates the classification model (metrics generation).
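As an illustration of what a metrics helper in the utils folder could compute, here is a minimal sketch using only the standard library. The function name `classification_metrics`, the label strings, and the choice of accuracy, precision, recall, and F1 are assumptions made for this example; the PR does not state which metrics the real file generates.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return accuracy plus per-label precision, recall, and F1.

    Hypothetical helper: the metrics chosen here are an assumption,
    not necessarily what the project's utils file reports.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    true_pos = Counter()   # predictions that match the actual label
    predicted = Counter()  # how often each label was predicted
    actual = Counter()     # how often each label really occurs
    for truth, guess in zip(y_true, y_pred):
        actual[truth] += 1
        predicted[guess] += 1
        if truth == guess:
            true_pos[truth] += 1

    per_label = {}
    for label in sorted(set(actual) | set(predicted)):
        precision = true_pos[label] / predicted[label] if predicted[label] else 0.0
        recall = true_pos[label] / actual[label] if actual[label] else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_label[label] = {"precision": precision, "recall": recall, "f1": f1}

    accuracy = sum(true_pos.values()) / len(y_true)
    return {"accuracy": accuracy, "per_label": per_label}
```

Calling `classification_metrics(test_labels, predicted_labels)` on the held-out test reviews would give a single dictionary that can be logged or stored alongside the other pipeline metrics.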

Refactoring and optimizations were applied to the model, the code formatting, and the organization of functions, among other changes, which allowed the entire database to be pre-processed in an average time of 9 to 12 minutes, as shown in the terminal capture below:

Note: After the general refactoring and optimizations, we will check whether the chunking technique is still needed and whether there are other ways to improve the sentiment classification model.
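For reference, the chunking technique mentioned in the note can be sketched as follows. This is only an assumed shape: `iter_chunks`, `preprocess_text`, `preprocess_in_chunks`, and the default chunk size are hypothetical names and values, and `preprocess_text` is a trivial stand-in for the project's real pre-processing step.

```python
from itertools import islice


def iter_chunks(records, chunk_size):
    """Yield successive lists of at most chunk_size records."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    iterator = iter(records)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def preprocess_text(text):
    """Hypothetical stand-in: lowercase and collapse whitespace."""
    return " ".join(text.lower().split())


def preprocess_in_chunks(records, chunk_size=1000):
    """Pre-process records one chunk at a time and collect the results."""
    processed = []
    for chunk in iter_chunks(records, chunk_size):
        processed.extend(preprocess_text(text) for text in chunk)
    return processed
```

Collecting everything into one list keeps the sketch short; if memory were the reason for chunking, each processed chunk would more likely be written to storage before the next one is read.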


sonarcloud · 2023-11-26T22:16:52Z

SonarCloud Quality Gate failed.

0 Bugs
0 Vulnerabilities
0 Security Hotspots
3 Code Smells

16.4% Coverage
0.0% Duplication


* hotfix names
* Refactor: Format date (#27)
* refactor: capitalize sentiments (#24) (Co-authored-by: JoaoM-py)
* fix: translate sentiments (#28)
* Feat/separate training reviews (#35) (Co-authored-by: Maria Gabriela Reis)
  * feat: create function to get random data
  * feat: create function to get training data
  * feat: get training data
  * chore: Manual sentiment classification
* Feat/#0303 create classification model (#37) (Co-authored-by: JoaoM-py)
  * feat: update manual classification
  * fix: translate topics and sentiments
  * feat: add seaborn lib
  * fix: translate topics, update reviews count
  * feat: create classification model
  * feat: training, test and apply classification model
  * refactor: Update stars name
  * refactor: Update training method
* Feat/#42 test coverage (#38)
  * feat: create pipeline tests
  * refactor: Update files names
  * feat: Training data
  * feat: Creating pipeline tests
  * merge: Merge develop
  * feat: Coverage and sonarcloud config
  * refactor: Update processing and remove comments
  * chore: update python version
  * chore: update sonarcloud
  * chore: update tests (3 commits)
* Feat/#0106 bring birth year and gender (#50) (Co-authored-by: Maria Gabriela Reis, GabrielCamargoL)
  * Tests and Sentment Analysis (#39): repeats every entry above, from "hotfix names" through Feat/#42 test coverage (#38) (Co-authored-by: GabrielCamargoL, JoaoM-py)
  * feat: Bring informations of the client
* Feat/#11 new runtime tracker (#49) (Co-authored-by: Maria Gabriela Reis, GabrielCamargoL)
  * Tests and Sentment Analysis (#39), as above
  * feat: New time metric, storage of metrics and
  * feat: increase test coverage
  * feat: Coverage
  * feat: increase test coverage
  * feature: Update sonar yml
  * feat: Update test url (5 commits)
  * feat: Config env
  * refactor:Update utils imports
  * refactor: Update env
  * refactor: Update url and env
* Feat/#48 unit tests (#51) (Co-authored-by: Maria Gabriela Reis, GabrielCamargoL)
  * Tests and Sentment Analysis (#39), as above
  * feat: increase test coverage
  * feat: Coverage
  * feat: increase test coverage
  * feature: Update sonar yml
  * feat: Update test url (5 commits)
  * feat: Config env
  * refactor:Update utils imports
  * refactor: Update env
  * refactor: Update url and env
* Feat: new classification model (#52)
  * feat: add new model to sentiment analysis
  * fix: use correct training and treat exceptions
  * fix: remove console logs
  * feat: increment reviews quantity
* Refactor/optimize pipeline (#56)
  * the four commits of #52, followed by:
  * feat: select only necessary columns and apply your types
  * feat: optimize clear data step
  * Delete .env
* Refactor/optimize pipeline (#57)
  * the seven commits of #56, followed by:
  * style: format and remove unused files
  * feat: create a file for pre processing step
  * feat: create file to classification model
  * feat: create file to topic model
  * feat: get metrics from classification model
  * feat: use new steps and adjust details
  * fix: remove some code smells and run processing
  * fix: update training and test data visualization
* Fix/unit tests (#59) (Co-authored-by: Maria Gabriela Reis, GabrielCamargoL)
  * Tests and Sentment Analysis (#39), as above
  * the commits of #57, with "fix: tests of the updated pipeline" after "feat: increment reviews quantity", followed by:
  * feat: create and fix unit tests
  * fix: remove .coverage
  * reafacto: remove unused test
* Feat/logs and alerts (#60) (Co-authored-by: Maria Gabriela Reis, GabrielCamargoL)
  * Tests and Sentment Analysis (#39), as above
  * the commits of #59, followed by:
  * feat: Organizing metrics, collect of logs and alerts
  * fix: Pipeline exec time
  * fix: code format and increase reviews quantity
  * feat: increase reviews quantity
  * fix: Pipeline exec time (3 commits)
  * fix: Pipeline exec time format

Co-authored-by: GabrielCamargoL, JoaoM-py

MariaGabrielaReis added 17 commits November 7, 2023 20:47

feat: add new model to sentiment analysis

7e4f0de

fix: use correct training and treat exceptions

109fad1

fix: remove console logs

e5407e5

feat: increment reviews quantity

76c87fe

Merge branch 'develop' into feat/new-model

eb05f43

feat: select only necessary columns and apply your types

ebcf792

feat: optimize clear data step

8481e3a

Merge branch 'develop' into refactor/optimize-pipeline

71700a7

Delete .env

689ab02

style: format and remove unused files

6e3171a

feat: create a file for pre processing step

fc59817

feat: create file to classification model

6d0c238

feat: create file to topic model

d386c86

feat: get metrics from classification model

7bfd189

feat: use new steps and adjust details

32a6dfa

Merge branch 'refactor/optimize-pipeline' of github.com:The-Bugger-Du…

4c9c201

…cks/mood-hound-nlp into refactor/optimize-pipeline

Merge branch 'develop' into refactor/optimize-pipeline

01a3eb9

MariaGabrielaReis added this to the Sprint 4 milestone Nov 26, 2023

MariaGabrielaReis requested a review from JoaoM-py November 26, 2023 16:19

MariaGabrielaReis self-assigned this Nov 26, 2023

This was linked to issues Nov 26, 2023

[#0057] Process the entire database #53

Closed

[#0058] Improve code quality #54

Closed

MariaGabrielaReis added 2 commits November 26, 2023 15:42

fix: remove some code smells and run processing

e8e5c2e

fix: update training and test data visualization

770b825

JoaoM-py mentioned this pull request Nov 27, 2023

Fix/unit tests #59

Merged

7 tasks

MariaGabrielaReis merged commit ad1f1ed into develop Nov 27, 2023
2 of 3 checks passed
