experienceAnalysis

Package {experienceAnalysis} contains a suite of functions for text mining: sentiment analysis and the analysis of word counts, TF-IDF scores and n-grams. It was developed as a helper package for other packages and repositories of the CDU Data Science Team, but the functions are generic and therefore suitable for broader use. The focus is on calculating sentiment indicators and word counts or frequencies for labeled or unlabeled text, and on plotting the results so that potentially important information in the text is easy to spot. There are also a few "spin-off" functions for assessing the performance of a classification model, e.g. calculating accuracy per class and building and plotting confusion matrices.

The function documentation is here.

For an example of how the package is used in practice, see the source code for this dashboard.

The package makes extensive use of {tidytext} (Silge & Robinson, 2017).
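As a rough illustration of the kind of {tidytext} operations involved, the sketch below computes word counts and TF-IDF scores for a small labeled text sample. It calls {tidytext} and {dplyr} directly rather than any {experienceAnalysis} function, and the toy data and the column names label and text are assumptions made for the example.

```r
library(dplyr)
library(tidytext)

# Toy labeled feedback; data and column names are illustrative assumptions.
feedback <- tibble::tibble(
  label = c("staff", "staff", "food"),
  text  = c("The nurses were kind", "Kind and helpful staff", "The food was cold")
)

# One row per (lower-cased) word, then count words within each label.
word_counts <- feedback %>%
  unnest_tokens(word, text) %>%
  count(label, word, sort = TRUE)

# TF-IDF, treating each label as one "document".
word_tf_idf <- word_counts %>%
  bind_tf_idf(word, label, n) %>%
  arrange(desc(tf_idf))

print(word_tf_idf)
```

Words that occur under every label (here "the") receive a TF-IDF of zero, which is why the score is useful for surfacing terms that are distinctive of a label.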

In line with the broader work of the CDU Data Science Team, all function names have prefixes that hint at the type of operation they perform (e.g. prep_* and plot_* for preparing and plotting data respectively). See Naming guidelines for functions.

Naming guidelines for functions

Data manipulations

get_*(): Get data, e.g. from a database or a file;
tidy_*(): Tidy data, e.g. renaming variables, removing duplicates, creating factors, converting from "wide" to "long" format etc.;
collect_*(): Collect data of specific cases, mainly wrapper functions for specific filter commands;
prep_*(): Prepare data for further use, e.g. creating tables, sorting vectors etc.;

Data analyses

calc_*(): Calculations or analyses, e.g. counting data, regression analyses etc.;
summary_*(): Summarise results of calculations (there might be some overlap with prep_*());

Visualisations

plot_*(): Create a plot;
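To make the prefix convention concrete, here is a minimal base R sketch with one prep_* and one calc_* helper that computes accuracy per class. The function names prep_demo_predictions() and calc_demo_accuracy_per_class() are hypothetical and chosen only for illustration; they are not functions of this package.

```r
# Hypothetical helpers, named only to illustrate the prefix convention.

# prep_*: coerce actual and predicted labels to factors with shared levels.
prep_demo_predictions <- function(actual, predicted) {
  classes <- sort(unique(c(actual, predicted)))
  data.frame(
    actual    = factor(actual, levels = classes),
    predicted = factor(predicted, levels = classes)
  )
}

# calc_*: share of correctly predicted rows within each actual class.
calc_demo_accuracy_per_class <- function(predictions) {
  correct <- predictions$actual == predictions$predicted
  tapply(correct, predictions$actual, mean)
}

predictions <- prep_demo_predictions(
  actual    = c("care", "care", "food", "food", "food"),
  predicted = c("care", "food", "food", "food", "care")
)
accuracy <- calc_demo_accuracy_per_class(predictions)
print(accuracy)
```

Reading the prefix alone tells a user that the first function only reshapes data while the second performs a calculation on it.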

Deployment on a server

Note that not all of the sentiment dictionaries loaded from {tidytext} (via {textdata}) have an open licence, and users must accept the licence agreement the first time they run these functions. The console prompt for doing so is not accessible when the software is deployed on a server. Consequently, before deploying, it is necessary to run the contents of the data-raw/ folder, accepting the terms of the licences yourself. Once this is done, the data will be available to the package in data/, and all of the sentiment dictionary functions will automatically load from this location instead of using the {tidytext} functions.
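One way to run the contents of data-raw/ is sketched below. It assumes the folder holds plain .R scripts and that the working directory is the package root; the helper name run_data_raw() is hypothetical and not part of the package. It should be run in an interactive session so that the licence prompts can be answered.

```r
# Hypothetical helper: source every .R script found in a folder.
# Assumes the scripts are plain R files that can be run in any order.
run_data_raw <- function(dir = "data-raw") {
  scripts <- list.files(dir, pattern = "\\.R$", full.names = TRUE,
                        ignore.case = TRUE)
  for (script in scripts) {
    message("Running ", script)
    source(script)
  }
  # Return the paths of the scripts that were run, invisibly.
  invisible(scripts)
}

# From the package root, in an interactive R session:
# run_data_raw()
```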

References

Silge J. & Robinson D. (2017). Text Mining with R: A Tidy Approach. Sebastopol, CA: O’Reilly Media. ISBN 978-1-491-98165-8.
