
shiva1003/Projects


PROJECT-1 Fake News Detection.

Task 1 for the CodeClause internship. Data is central to the modern world: by an often-quoted estimate, about 1.7 megabytes of data were being generated every second for every person by 2020. Many technologies put this volume of data to use, and machine learning is one of them; this project uses it to detect fake news. Put simply, fake news is information that misleads people. Fake news now spreads quickly, and people share it without verifying it. It is often spread to further or impose certain ideas, frequently with a political agenda.

Many datasets for fake news detection are available on Kaggle and other sites. I downloaded the datasets used here from Kaggle. There are two: one of fake news and one of true news. In text preprocessing, the text is cleaned by stemming, lemmatization, removing stopwords, removing special symbols and numbers, and so on. After cleaning, the text is fed into a vectorizer, which converts it into numerical features.
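The cleaning and vectorizing steps can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the code used in this project: the stopword list, the function names (`clean_text`, `fit_vocabulary`, `vectorize`), and the sample sentences are all made up for the example, stemming and lemmatization are left out, and a plain bag-of-words count stands in for whatever vectorizer the project actually uses.

```python
import re
from collections import Counter

# Hypothetical, tiny stopword list for illustration only; a real project
# would use a much fuller list from an NLP library.
STOPWORDS = {"a", "an", "the", "is", "are", "was", "of", "in", "to", "and", "it"}


def clean_text(text):
    """Lowercase, drop digits and special symbols, and remove stopwords."""
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)  # keep only letters and whitespace
    return [tok for tok in text.split() if tok not in STOPWORDS]


def fit_vocabulary(token_lists):
    """Map each distinct token to a column index, in sorted order."""
    vocab = sorted({tok for tokens in token_lists for tok in tokens})
    return {tok: i for i, tok in enumerate(vocab)}


def vectorize(tokens, vocabulary):
    """Turn one token list into a bag-of-words count vector."""
    counts = Counter(tokens)
    # Dict keys iterate in insertion order, which is the sorted vocabulary.
    return [counts.get(tok, 0) for tok in vocabulary]


# Made-up sample headlines, one "fake" style and one "true" style.
docs = [
    "BREAKING!!! Aliens landed in 2020, the government is hiding it.",
    "The government released the 2020 budget report.",
]
cleaned = [clean_text(d) for d in docs]
vocabulary = fit_vocabulary(cleaned)
vectors = [vectorize(tokens, vocabulary) for tokens in cleaned]

print(list(vocabulary))
for vec in vectors:
    print(vec)
```

Each document becomes a fixed-length list of counts, one column per vocabulary word, which is the kind of numerical input a classifier can be trained on. A library vectorizer such as scikit-learn's `TfidfVectorizer` does the same job with term weighting added.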

PROJECT-2 Sentiment Analysis on tweets using LSTM.

Task 2 for the CodeClause internship. In today's digital world, social media platforms such as Facebook, WhatsApp, and Twitter have become part of our daily routine. Many NLP (Natural Language Processing) techniques can be applied to the text data available from Twitter. Sentiment analysis means predicting the sentiment (happy, sad, neutral) expressed in a piece of text. Here, I perform sentiment analysis on a large real-world dataset using NLP techniques.

This is an entity-level sentiment analysis dataset of Twitter messages. Given a message and an entity, the task is to judge the sentiment of the message about the entity. There are three classes in this dataset: Positive, Negative, and Neutral. Messages that are not relevant to the entity (i.e. Irrelevant) are regarded as Neutral. The data comes from a dataset available on Kaggle, which has around a million extracted tweets. The labels for the tweets are as follows: 0 = negative, 1 = positive.

Accuracy: 0.8633
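The rule of folding Irrelevant messages into Neutral can be sketched as a small label-normalization step. This is only an illustration under assumptions: the label spellings, the `normalize_label` helper, and the sample rows are hypothetical, and the actual column names and values depend on the dataset file.

```python
from collections import Counter

# Hypothetical label spellings; check the dataset file for the real values.
LABEL_MAP = {
    "Positive": "Positive",
    "Negative": "Negative",
    "Neutral": "Neutral",
    "Irrelevant": "Neutral",  # not relevant to the entity, so treated as Neutral
}


def normalize_label(raw_label):
    """Map a raw label to one of the three classes, ignoring case and spaces."""
    return LABEL_MAP[raw_label.strip().capitalize()]


# Made-up (entity, message, raw label) rows for illustration.
rows = [
    ("Nvidia", "This new card is amazing", "Positive"),
    ("Nvidia", "Drivers crashed again today", "negative"),
    ("Nvidia", "Having pasta for dinner", "Irrelevant"),
    ("Nvidia", "They announced a release date", "Neutral"),
]

labels = [normalize_label(label) for _, _, label in rows]
class_counts = Counter(labels)
print(class_counts)
```

Normalizing labels before training keeps the class set fixed at three and makes it easy to check the class balance with a simple count.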

PROJECT-3 Detection of Parkinson's disease.

Task 3 for the CodeClause internship. Several machine learning models were implemented, with the following scores:

- Linear SVM: 79.30%
- XGBClassifier: 85.90%
- Gradient Boosting: 84.58%
- Decision Tree: 81.94%
- Random Forest: 86.34%
- KNeighborsClassifier: 72.69%
- Bagging Classifier: 82.82%
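To compare the models at a glance, the scores reported above can be ranked with a few lines of Python. This sketch only sorts the numbers already listed in this README; it does not train anything, and the variable names are chosen for the example.

```python
# Scores as reported above, in percent.
reported_scores = {
    "Linear SVM": 79.30,
    "XGBClassifier": 85.90,
    "Gradient Boosting": 84.58,
    "Decision Tree": 81.94,
    "Random Forest": 86.34,
    "KNeighborsClassifier": 72.69,
    "Bagging Classifier": 82.82,
}

# Sort from highest to lowest score.
ranked = sorted(reported_scores.items(), key=lambda item: item[1], reverse=True)
best_model, best_score = ranked[0]

for name, score in ranked:
    print(f"{name:<22}{score:6.2f}%")
print(f"Best: {best_model} ({best_score:.2f}%)")
```

On these numbers Random Forest ranks first and XGBClassifier second, with KNeighborsClassifier last.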