Bag-of-words-model.

Description

This program implements the bag-of-words model (https://en.wikipedia.org/wiki/Bag-of-words_model) to study similarities between texts. In this example, we selected three texts containing the lyrics of three Mexican and Cuban songs, but the program can be modified to analyze any other documents.

Execution of the program

In your terminal, run the command: python BagOfWordsM.py

Note that the source code (BagOfWordsM.py) must sit outside the "InputTexts" folder, which holds the texts we want to analyze.
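
The folder layout described above can be sketched as a small loader. This is only an illustration, not the code in BagOfWordsM.py: the helper name load_texts, the .txt extension, and the UTF-8 encoding are all assumptions.

```python
from pathlib import Path


def load_texts(folder="InputTexts"):
    """Return {file name: file contents} for every .txt file in `folder`.

    Hypothetical helper: the .txt extension and UTF-8 encoding are assumptions.
    The folder is resolved relative to the current working directory.
    """
    base = Path(folder)
    if not base.is_dir():
        raise FileNotFoundError(f"folder {folder!r} not found in the current directory")
    # Sort the paths so the documents are always read in the same order.
    return {p.name: p.read_text(encoding="utf-8") for p in sorted(base.glob("*.txt"))}
```

Calling load_texts() from the directory that contains "InputTexts" would return one entry per text file, keyed by file name.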

Relevant functions:

clean_data -> cleans the input text data
vectorization_frequencies -> represents each input text as a vector of word frequencies
cos_similarities -> computes the cosine similarity between those vectors
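
A minimal sketch of how these three steps might fit together is shown below. The function names come from the list above, but the signatures, the cleaning rule (lowercasing and splitting on word characters), and the sample strings are assumptions made for illustration; the actual code in BagOfWordsM.py may differ.

```python
import math
import re
from collections import Counter


def clean_data(text):
    """Lowercase the text and split it into word tokens (assumed cleaning rule)."""
    return re.findall(r"\w+", text.lower())


def vectorization_frequencies(token_lists):
    """Turn each token list into a frequency vector over the shared vocabulary."""
    vocabulary = sorted({tok for tokens in token_lists for tok in tokens})
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)  # missing words count as 0
        vectors.append([counts[word] for word in vocabulary])
    return vocabulary, vectors


def cos_similarities(u, v):
    """Cosine similarity dot(u, v) / (|u| * |v|); 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Made-up sample strings, used only to exercise the functions.
texts = ["sun moon star sky", "moon sky rain wind"]
vocabulary, vectors = vectorization_frequencies([clean_data(t) for t in texts])
similarity = cos_similarities(vectors[0], vectors[1])
print(vocabulary)
print(similarity)
```

The two sample strings share two of their four words, so their similarity is 0.5. A value of 1.0 would mean the frequency vectors point in the same direction, and 0.0 would mean the texts share no words.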

Songs:

"La Llorona" -> https://www.youtube.com/watch?v=5pqPFMVAIeM
"La Ixhuateca" -> https://www.youtube.com/watch?v=VHRDLv5Y9Lg
"La niña de Guatemala" -> https://www.youtube.com/watch?v=XAAP6bfNGK4
