Skip to content

OSINT microservices architecture using Apache Kafka for message exchange.

License

Notifications You must be signed in to change notification settings

TNOCS/cyber-security

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

19 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

cyber-security

OSINT microservices architecture using Apache Kafka for message exchange.

Welcome to the Cyber-Security repository for Web scraping and Entity Extraction. Whitin the packages folder you’ll find both projects.

Web scraper: This is a scraper build using TypeScript and Puppeteer. Its self-contained and produces a JSON file with alle collected objects. It is originally built for www.nu.nl.

Using the config.json file you can change selectors and set the timer for the scraper. In the current built the timer is set to 2 minutes for testing purposes.

The scraper can be used for different sections of nu.nl but must be altered if used with different news sites.

Simple instructions (refer to README.MD for detailed instructions):

npm i         #	to install dependencies 
npm start     # to start compile (in watch mode)
npm run node  # to run

When running:

  • You must accept the Cookie notification on screen manually
  • To stop the automated process type Ctrl+c, then Yes (Y)
  • data will be saved to data/baseline.json

The config.json file in src/config.json. This file contains variables used to select elements on the webpage.

There is also an "IdleTimeMin" that allows the user to decide how many minutes the program waits before scraping again.

About

OSINT microservices architecture using Apache Kafka for message exchange.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published