This repository has been archived by the owner on Sep 5, 2020. It is now read-only.

uwescience/dssg-disinfo

dssg-disinfo

DSSG 2020 Online disinformation classification project

Using deep learning to identify disinformation news articles online

Websites that disseminate disinformation about coronavirus likely contribute to public harm by sowing confusion and distrust, by preventing people from taking appropriate prevention measures, and by encouraging people to engage in dangerous fake treatments and cures. These effects could result in increased virus transmission, morbidity, and mortality worldwide.

Developing a method to identify disinformation sites could mitigate these harmful effects by allowing advertisers to avoid funding such sites. The purpose of this project is to develop an open-source natural language processing model that can accurately classify news articles according to their risk of containing disinformation about the coronavirus.

See project web page.

For access to a sample of the dataset, please contact: XXX?

ADUniverse Web App Demonstration

By using the dataset (adunits.db) from this repository, you agree to the City of Seattle's Terms of Use and Policy, as well as to the King County Assessors', the US Census Bureau's and Zillow's, from whom this data was acquired.

Installation and Running

  1. Your machine should have the following installed already:

  2. First, clone the repository.

    • git clone https://github.com/uwescience/dssg-disinfo.git
    • cd dssg-disinfo
  3. Next, download the following packages:

    • NLTK, Pandas, SciPy, SpaCy, Keras, TensorFlow, numpy, matplotlib and sklearn
  4. You will be working in a "virtual environment".

    • conda create -n test_adu python=3.6
    • conda activate test_adu
  5. This code works with Python 3.6. You should have miniconda installed. Then issue the following commands:

    • conda install -f -y -q --name test_adu -c conda-forge --file requirements.txt
    • pip install dash-dangerously-set-inner-html
  6. You have now installed all the necessary dependencies except Git LFS (Large File Storage). Install it with the following command:

    • git lfs install
  7. Clone ADUniverse again.

    • cd ..
    • mv ADUniverse ADUniverse_old
    • git clone https://github.com/uwescience/ADUniverse
    • cd ADUniverse
  8. To run the code

    • Change directories to the subfolder within ADUniverse by doing cd ADUniverse
    • Run the application. python index.py.
    • You will see a URL like http://127.0.0.1:8050. Browse to this URL and the application will load.
  9. When you are done,

    • conda deactivate
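
Before running the code, it can help to confirm that the packages named in step 3 are visible from the active environment. The following is a minimal sketch, not part of the repository: the helper `missing_packages` is hypothetical, and the import names in `REQUIRED` are assumptions derived from the package list (for example, scikit-learn is imported as `sklearn` and SpaCy as `spacy`).

```python
import importlib.util

# Import names for the packages listed in step 3. These are assumptions:
# the import name can differ from the name used to install the package.
REQUIRED = [
    "nltk", "pandas", "scipy", "spacy", "keras",
    "tensorflow", "numpy", "matplotlib", "sklearn",
]


def missing_packages(names):
    """Return the names that cannot be located in the current environment."""
    # find_spec returns None when a top-level module is not installed;
    # it locates the module without importing it.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages are importable.")
```

Run the script with the virtual environment activated. Any name it reports as missing still needs to be installed into that environment.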

Notes for Windows 10 users

  • You should have python 3.7 installed already.
  • Open a Git Bash command prompt from the search bar. You will do the git clone from this prompt. Then close it.
  • Install the 64-bit version of miniconda for Python 3. This will run an installer. When it finishes, an Anaconda prompt will be available from the command search.
  • Open the Anaconda prompt as administrator. Change directories to the clone of ADUniverse. This should be in c:\Users\<user name>\ADUniverse
  • Resume with item (3) above.
  • In step 7, you will use move instead of mv.