Refactor/optimize pipeline #57
Merged
…cks/mood-hound-nlp into refactor/optimize-pipeline
This was linked to issues Nov 26, 2023
SonarCloud Quality Gate failed. 0 Bugs, 16.4% Coverage.
JoaoM-py added a commit that referenced this pull request Dec 1, 2023
* hotfix names * Refactor: Format date (#27) * refactor: capitalize sentiments (#24) Co-authored-by: JoaoM-py <[email protected]> * fix: translate sentiments (#28) * Feat/separate training reviews (#35) * feat: create function to get random data * feat: create function to get training data * feat: get training data * chore: Manual sentiment classification --------- Co-authored-by: Maria Gabriela Reis <[email protected]> * Feat/#0303 create classification model (#37) * feat: update manual classification * fix: translate topics and sentiments * feat: add seaborn lib * fix: translate topics, update reviews count * feat: create classification model * feat: training, test and apply classification model * refactor: Update stars name * refactor: Update training method --------- Co-authored-by: JoaoM-py <[email protected]> * Feat/#42 test coverage (#38) * feat: create pipeline tests * refactor: Update files names * feat: Training data * feat: Creating pipeline tests * merge: Merge develop * feat: Coverage and sonarcloud config * refactor: Update processing and remove comments * chore: update python version * chore: update sonarcloud * chore: update tests * chore: update tests * chore: update tests * Feat/#0106 bring birth year and gender (#50) * Tests and Sentment Analysis (#39) * hotfix names * Refactor: Format date (#27) * refactor: capitalize sentiments (#24) Co-authored-by: JoaoM-py <[email protected]> * fix: translate sentiments (#28) * Feat/separate training reviews (#35) * feat: create function to get random data * feat: create function to get training data * feat: get training data * chore: Manual sentiment classification --------- Co-authored-by: Maria Gabriela Reis <[email protected]> * Feat/#0303 create classification model (#37) * feat: update manual classification * fix: translate topics and sentiments * feat: add seaborn lib * fix: translate topics, update reviews count * feat: create classification model * feat: training, test and apply classification model 
* refactor: Update stars name * refactor: Update training method --------- Co-authored-by: JoaoM-py <[email protected]> * Feat/#42 test coverage (#38) * feat: create pipeline tests * refactor: Update files names * feat: Training data * feat: Creating pipeline tests * merge: Merge develop * feat: Coverage and sonarcloud config * refactor: Update processing and remove comments * chore: update python version * chore: update sonarcloud * chore: update tests * chore: update tests * chore: update tests --------- Co-authored-by: GabrielCamargoL <[email protected]> Co-authored-by: JoaoM-py <[email protected]> Co-authored-by: JoaoM-py <[email protected]> * feat: Bring informations of the client --------- Co-authored-by: Maria Gabriela Reis <[email protected]> Co-authored-by: GabrielCamargoL <[email protected]> * Feat/#11 new runtime tracker (#49) * Tests and Sentment Analysis (#39) * hotfix names * Refactor: Format date (#27) * refactor: capitalize sentiments (#24) Co-authored-by: JoaoM-py <[email protected]> * fix: translate sentiments (#28) * Feat/separate training reviews (#35) * feat: create function to get random data * feat: create function to get training data * feat: get training data * chore: Manual sentiment classification --------- Co-authored-by: Maria Gabriela Reis <[email protected]> * Feat/#0303 create classification model (#37) * feat: update manual classification * fix: translate topics and sentiments * feat: add seaborn lib * fix: translate topics, update reviews count * feat: create classification model * feat: training, test and apply classification model * refactor: Update stars name * refactor: Update training method --------- Co-authored-by: JoaoM-py <[email protected]> * Feat/#42 test coverage (#38) * feat: create pipeline tests * refactor: Update files names * feat: Training data * feat: Creating pipeline tests * merge: Merge develop * feat: Coverage and sonarcloud config * refactor: Update processing and remove comments * chore: update python 
version * chore: update sonarcloud * chore: update tests * chore: update tests * chore: update tests --------- Co-authored-by: GabrielCamargoL <[email protected]> Co-authored-by: JoaoM-py <[email protected]> Co-authored-by: JoaoM-py <[email protected]> * feat: New time metric, storage of metrics and * feat: increase test coverage * feat: Coverage * feat: increase test coverage * feature: Update sonar yml * feat: Update test url * feat: Update test url * feat: Update test url * feat: Update test url * feat: Update test url * feat: Config env * refactor:Update utils imports * refactor: Update env * refactor: Update url and env --------- Co-authored-by: Maria Gabriela Reis <[email protected]> Co-authored-by: GabrielCamargoL <[email protected]> * Feat/#48 unit tests (#51) * Tests and Sentment Analysis (#39) * hotfix names * Refactor: Format date (#27) * refactor: capitalize sentiments (#24) Co-authored-by: JoaoM-py <[email protected]> * fix: translate sentiments (#28) * Feat/separate training reviews (#35) * feat: create function to get random data * feat: create function to get training data * feat: get training data * chore: Manual sentiment classification --------- Co-authored-by: Maria Gabriela Reis <[email protected]> * Feat/#0303 create classification model (#37) * feat: update manual classification * fix: translate topics and sentiments * feat: add seaborn lib * fix: translate topics, update reviews count * feat: create classification model * feat: training, test and apply classification model * refactor: Update stars name * refactor: Update training method --------- Co-authored-by: JoaoM-py <[email protected]> * Feat/#42 test coverage (#38) * feat: create pipeline tests * refactor: Update files names * feat: Training data * feat: Creating pipeline tests * merge: Merge develop * feat: Coverage and sonarcloud config * refactor: Update processing and remove comments * chore: update python version * chore: update sonarcloud * chore: update tests * chore: update 
tests * chore: update tests --------- Co-authored-by: GabrielCamargoL <[email protected]> Co-authored-by: JoaoM-py <[email protected]> Co-authored-by: JoaoM-py <[email protected]> * feat: increase test coverage * feat: Coverage * feat: increase test coverage * feature: Update sonar yml * feat: Update test url * feat: Update test url * feat: Update test url * feat: Update test url * feat: Update test url * feat: Config env * refactor:Update utils imports * refactor: Update env * refactor: Update url and env --------- Co-authored-by: Maria Gabriela Reis <[email protected]> Co-authored-by: GabrielCamargoL <[email protected]> * Feat: new classification model (#52) * feat: add new model to sentiment analysis * fix: use correct training and treat exceptions * fix: remove console logs * feat: increment reviews quantity * Refactor/optimize pipeline (#56) * feat: add new model to sentiment analysis * fix: use correct training and treat exceptions * fix: remove console logs * feat: increment reviews quantity * feat: select only necessary columns and apply your types * feat: optimize clear data step * Delete .env * Refactor/optimize pipeline (#57) * feat: add new model to sentiment analysis * fix: use correct training and treat exceptions * fix: remove console logs * feat: increment reviews quantity * feat: select only necessary columns and apply your types * feat: optimize clear data step * Delete .env * style: format and remove unused files * feat: create a file for pre processing step * feat: create file to classification model * feat: create file to topic model * feat: get metrics from classification model * feat: use new steps and adjust details * fix: remove some code smells and run processing * fix: update training and test data visualization * Fix/unit tests (#59) * Tests and Sentment Analysis (#39) * hotfix names * Refactor: Format date (#27) * refactor: capitalize sentiments (#24) Co-authored-by: JoaoM-py <[email protected]> * fix: translate sentiments (#28) * 
Feat/separate training reviews (#35) * feat: create function to get random data * feat: create function to get training data * feat: get training data * chore: Manual sentiment classification --------- Co-authored-by: Maria Gabriela Reis <[email protected]> * Feat/#0303 create classification model (#37) * feat: update manual classification * fix: translate topics and sentiments * feat: add seaborn lib * fix: translate topics, update reviews count * feat: create classification model * feat: training, test and apply classification model * refactor: Update stars name * refactor: Update training method --------- Co-authored-by: JoaoM-py <[email protected]> * Feat/#42 test coverage (#38) * feat: create pipeline tests * refactor: Update files names * feat: Training data * feat: Creating pipeline tests * merge: Merge develop * feat: Coverage and sonarcloud config * refactor: Update processing and remove comments * chore: update python version * chore: update sonarcloud * chore: update tests * chore: update tests * chore: update tests --------- Co-authored-by: GabrielCamargoL <[email protected]> Co-authored-by: JoaoM-py <[email protected]> Co-authored-by: JoaoM-py <[email protected]> * feat: add new model to sentiment analysis * fix: use correct training and treat exceptions * fix: remove console logs * feat: increment reviews quantity * fix: tests of the updated pipeline * feat: select only necessary columns and apply your types * feat: optimize clear data step * Delete .env * style: format and remove unused files * feat: create a file for pre processing step * feat: create file to classification model * feat: create file to topic model * feat: get metrics from classification model * feat: use new steps and adjust details * fix: remove some code smells and run processing * fix: update training and test data visualization * feat: create and fix unit tests * fix: remove .coverage * reafacto: remove unused test --------- Co-authored-by: Maria Gabriela Reis <[email 
protected]> Co-authored-by: GabrielCamargoL <[email protected]> * Feat/logs and alerts (#60) * Tests and Sentment Analysis (#39) * hotfix names * Refactor: Format date (#27) * refactor: capitalize sentiments (#24) Co-authored-by: JoaoM-py <[email protected]> * fix: translate sentiments (#28) * Feat/separate training reviews (#35) * feat: create function to get random data * feat: create function to get training data * feat: get training data * chore: Manual sentiment classification --------- Co-authored-by: Maria Gabriela Reis <[email protected]> * Feat/#0303 create classification model (#37) * feat: update manual classification * fix: translate topics and sentiments * feat: add seaborn lib * fix: translate topics, update reviews count * feat: create classification model * feat: training, test and apply classification model * refactor: Update stars name * refactor: Update training method --------- Co-authored-by: JoaoM-py <[email protected]> * Feat/#42 test coverage (#38) * feat: create pipeline tests * refactor: Update files names * feat: Training data * feat: Creating pipeline tests * merge: Merge develop * feat: Coverage and sonarcloud config * refactor: Update processing and remove comments * chore: update python version * chore: update sonarcloud * chore: update tests * chore: update tests * chore: update tests --------- Co-authored-by: GabrielCamargoL <[email protected]> Co-authored-by: JoaoM-py <[email protected]> Co-authored-by: JoaoM-py <[email protected]> * feat: add new model to sentiment analysis * fix: use correct training and treat exceptions * fix: remove console logs * feat: increment reviews quantity * fix: tests of the updated pipeline * feat: select only necessary columns and apply your types * feat: optimize clear data step * Delete .env * style: format and remove unused files * feat: create a file for pre processing step * feat: create file to classification model * feat: create file to topic model * feat: get metrics from classification model 
* feat: use new steps and adjust details * fix: remove some code smells and run processing * fix: update training and test data visualization * feat: create and fix unit tests * fix: remove .coverage * reafacto: remove unused test * feat: Organizing metrics, collect of logs and alerts * fix: Pipeline exec time * fix: code format and increase reviews quantity * feat: increase reviews quantity * fix: Pipeline exec time * fix: Pipeline exec time * fix: Pipeline exec time * fix: Pipeline exec time format --------- Co-authored-by: Maria Gabriela Reis <[email protected]> Co-authored-by: GabrielCamargoL <[email protected]> --------- Co-authored-by: GabrielCamargoL <[email protected]> Co-authored-by: JoaoM-py <[email protected]> Co-authored-by: JoaoM-py <[email protected]>
Optimizing steps 4 to 6 plus an extra step: preprocessing, sentiment analysis, model testing, and topic modeling
PR Type
What kind of change does this PR introduce?
Describe the change
New files were created to better organize the pipeline steps: one file dedicated to the classification model and its functions, one dedicated to preprocessing, and one dedicated to topic modeling, plus a separate file in the utils folder that evaluates the classification model (metric generation). Refactoring and optimization work on the model, the code formatting, and the organization of functions, among other changes, allowed the entire dataset to be preprocessed in an average of 9 to 12 minutes, as the terminal capture attached to this description shows.
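As an illustration of the kind of helper a metrics file in utils could hold, here is a minimal sketch in plain Python. It is not the code from this PR: the function name classification_metrics, the returned dictionary layout, and the sentiment labels in the usage note are all assumptions made for the example.

```python
from collections import Counter
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[str], y_pred: Sequence[str]) -> Dict:
    """Hypothetical helper: accuracy plus per-label precision and recall.

    The name and return shape are assumptions for illustration only;
    they are not taken from the repository.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    true_pos: Counter = Counter()
    false_pos: Counter = Counter()
    false_neg: Counter = Counter()

    for expected, predicted in zip(y_true, y_pred):
        if expected == predicted:
            true_pos[expected] += 1
        else:
            # A wrong prediction is a false positive for the predicted
            # label and a false negative for the expected label.
            false_pos[predicted] += 1
            false_neg[expected] += 1

    per_label = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted_total = true_pos[label] + false_pos[label]
        actual_total = true_pos[label] + false_neg[label]
        per_label[label] = {
            # Report 0.0 when a label was never predicted or never present.
            "precision": true_pos[label] / predicted_total if predicted_total else 0.0,
            "recall": true_pos[label] / actual_total if actual_total else 0.0,
        }

    accuracy = sum(true_pos.values()) / len(y_true)
    return {"accuracy": accuracy, "per_label": per_label}
```

Calling classification_metrics(["Positive", "Negative", "Neutral", "Positive"], ["Positive", "Negative", "Positive", "Positive"]) with these made-up labels yields an accuracy of 0.75. Keeping this logic outside the classification model file means the model code only has to produce predictions, and the evaluation can be unit tested on its own.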