Organizer for job searching across multiple sites. Fetch offers, measure recruitment progress, and collect information about potential employers.

Ne0bliviscaris/Job-Search-Tool

Job-Search-Tool

Current dataframe state:

[Screenshot: raw scraped dataframe]

TODO:

Data processing

Location fetching adjustments

  • If the site puts the selected location first, use only the first location
  • Otherwise, fetch the HTML with the location block hovered, so that the full list of locations is shown and can be extracted
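
The two rules above could be sketched roughly as follows. This is an illustration only: `pick_locations` and the `selected_first` flag are hypothetical names, and the function assumes the location strings have already been pulled out of the HTML. In the hover case, that HTML would be fetched after hovering over the location block (for example with Selenium's `ActionChains.move_to_element`).

```python
def pick_locations(locations: list[str], selected_first: bool) -> list[str]:
    """Return the locations to store for one offer (hypothetical helper)."""
    # Drop empty entries and surrounding whitespace.
    cleaned = [loc.strip() for loc in locations if loc and loc.strip()]
    if not cleaned:
        return []
    if selected_first:
        # The site lists the searched location first, so keep only that one.
        return cleaned[:1]
    # Otherwise keep the full list, de-duplicated while preserving order.
    return list(dict.fromkeys(cleaned))
```

Whether a given site puts the selected location first would have to be recorded per site, for example as a flag next to its search link.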

Get proper search links

Raw data extraction improvements:

  • Location extraction improvements - making sure that either a list or the proper location is extracted

Synchronization ETL module:

  • Extract elements from raw CSV -> unify them across all sites
  • Use tag and location dictionaries to unify variable elements
  • Mark new offers as new
  • Move finished offers to archive
  • Gather additional data, like added time, removed time
  • Browseable archive file
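
A minimal sketch of how such a synchronization step might look. Everything here is an assumption made for illustration: offers are plain dictionaries keyed by their URL rather than CSV rows, the alias dictionaries hold made-up entries, and `unify` and `sync` are hypothetical names.

```python
from datetime import datetime

# Hypothetical unification dictionaries (lowercase alias -> canonical name).
LOCATION_ALIASES = {"warsaw": "Warszawa", "warszawa": "Warszawa"}
TAG_ALIASES = {"py": "Python", "python": "Python", "python3": "Python"}


def unify(value: str, aliases: dict[str, str]) -> str:
    """Map a raw value to its canonical form, or keep it if unknown."""
    stripped = value.strip()
    return aliases.get(stripped.lower(), stripped)


def sync(current: dict, scraped: dict, archive: dict, now: datetime):
    """Merge freshly scraped offers into the current set.

    Returns (updated, new_archive). Offers are keyed by URL, which is
    assumed here to be a stable unique identifier.
    """
    updated = {}
    for url, offer in scraped.items():
        record = dict(offer)
        record["location"] = unify(offer["location"], LOCATION_ALIASES)
        record["tags"] = [unify(tag, TAG_ALIASES) for tag in offer["tags"]]
        if url in current:
            # Known offer: keep its original "added" timestamp.
            record["is_new"] = False
            record["added_at"] = current[url]["added_at"]
        else:
            # First time this offer is seen: mark it as new.
            record["is_new"] = True
            record["added_at"] = now
        updated[url] = record

    new_archive = dict(archive)
    for url, record in current.items():
        if url not in scraped:
            # The offer disappeared from the site: move it to the archive.
            new_archive[url] = {**record, "is_new": False, "removed_at": now}
    return updated, new_archive
```

Keeping `added_at` and `removed_at` on each record is one way to gather the "added time, removed time" data mentioned above.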

Records visualization:

  • Prepare record template - fetch one record from CSV, fill specific fields

  • Initially collapsed, showing minimal info. Click to show full record details

  • Add additional editable fields:

    • Mark as applied button - saves current time as time applied
    • Application status - not applied, applied, rejected
    • Feedback status - received or not received
    • Note field for feedback
    • Mark as interesting, preferably with a 1-5 star ranking
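
The editable fields listed above could be modeled along these lines. This is only a sketch: `RecordState`, `ApplicationStatus`, and the field names are hypothetical and not taken from the project's `JobRecord` class.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Optional


class ApplicationStatus(Enum):
    NOT_APPLIED = "not applied"
    APPLIED = "applied"
    REJECTED = "rejected"


@dataclass
class RecordState:
    """User-editable state attached to one offer (hypothetical)."""
    status: ApplicationStatus = ApplicationStatus.NOT_APPLIED
    applied_at: Optional[datetime] = None
    feedback_received: bool = False
    feedback_note: str = ""
    stars: Optional[int] = None  # 1-5, or None when not rated

    def mark_applied(self, now: Optional[datetime] = None) -> None:
        """Save the current time as the time applied and update the status."""
        self.applied_at = now or datetime.now()
        self.status = ApplicationStatus.APPLIED

    def rate(self, stars: int) -> None:
        """Store a 1-5 star rating, rejecting values outside that range."""
        if not 1 <= stars <= 5:
            raise ValueError("stars must be between 1 and 5")
        self.stars = stars
```

A "Mark as applied" button would then only need to call `mark_applied()` on the record's state and save it.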

Cloud related issues

Session and data access:

  • Introduce session for admin user
  • Columns with non-public info are visible only to the admin
  • Saving data/files available only for admin

Move to docker container and host it remotely

  • Run updater on a scheduler

Ideas for the future:

  • Scrape each interesting offer (3+ stars)
  • Fetch and unify requirements, additional info etc
  • Build RAG using my CV to analyze each offer in relation to my skills
  • Use RAG to generate a unified template from scraped offers

Changelog:

16.09.2024

  • Improvement in extracting job location. Added separate field for remote job status
  • Properly extracting salary details (currency etc)
  • Fixed logo extraction from Nofluffjobs
  • Storing job tags as a string

14.09.2024

  • Introduced Streamlit

11.09.2024

  • Integrated JustJoinIT.pl site
  • Integrated Solid.jobs site
  • Integrated it.pracuj.pl site

10.09.2024

  • Integrated Rocketjobs.pl site
  • Integrated Bulldogjob.pl site
  • Minor improvements to handling data extraction

09.09.2024

  • Massively reduced update time by reusing a single webdriver
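
The idea of reusing one webdriver can be sketched as below. The function name `fetch_all`, the `Driver` protocol, and the `make_driver` parameter are assumptions for illustration; only `get`, `page_source`, and `quit` are relied upon, all of which Selenium drivers provide.

```python
from typing import Callable, Protocol


class Driver(Protocol):
    """The small part of the Selenium WebDriver interface used here."""
    page_source: str

    def get(self, url: str) -> None: ...
    def quit(self) -> None: ...


def fetch_all(links: dict[str, str], make_driver: Callable[[], Driver]) -> dict[str, str]:
    """Fetch every search link with a single driver instance."""
    driver = make_driver()  # started once, not once per link
    try:
        pages = {}
        for name, url in links.items():
            driver.get(url)
            pages[name] = driver.page_source
        return pages
    finally:
        # Always close the browser, even if one of the pages fails.
        driver.quit()
```

With Selenium installed, this would be called as `fetch_all(search_links, webdriver.Chrome)`. The saving comes from starting the browser once instead of once per link.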

06.09.2024

  • Moved data extraction to containers: instead of only pointing to containers, functions now handle data extraction. This greatly improves scalability for the project
  • Big improvements to code clarity
  • Solved theprotocol fetching inconsistencies by setting a fixed chromedriver window size (the window is not displayed anyway). The point of failure was the site rendering in its mobile version by default

05.09.2024

  • Now salary extraction properly handles various notations
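
As an illustration of what handling several notations can involve, here is a minimal sketch. The supported notations (space-separated thousands, a `k` suffix, ranges) are assumed examples, and `parse_salary` is a hypothetical name, not the project's function.

```python
import re
from typing import Optional

# A number with an optional decimal part and an optional "k" suffix.
_NUMBER = re.compile(r"(\d+(?:[.,]\d+)?)\s*(k\b)?", re.IGNORECASE)


def parse_salary(text: str) -> Optional[tuple[int, int]]:
    """Return (minimum, maximum) found in a salary string, or None.

    Limitation: "." and "," are treated as decimal separators, so a
    notation like "15.000" (dot as thousands separator) is not handled.
    """
    # Join digit groups split by a regular or non-breaking space: "15 000" -> "15000".
    compact = re.sub(r"(?<=\d)[ \u00a0](?=\d)", "", text)
    amounts = []
    for number, k_suffix in _NUMBER.findall(compact):
        value = float(number.replace(",", "."))
        if k_suffix:
            value *= 1000
        amounts.append(int(round(value)))
    if not amounts:
        return None
    return min(amounts), max(amounts)
```

A single amount yields the same value for both bounds, which keeps the return type uniform for ranges and fixed salaries.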

04.09.2024

  • Moved to Selenium scraping. This provides better results than requests.
  • Introduced file handling. Now data is extracted from saved files, resulting in improved performance. Update function scrapes search links to their respective file.
  • Search links are now stored in a dictionary with this structure: {website_tag1-tag2-tag3 : link} This enables using multiple links from same website.
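
The key structure described above can be built and split with two small helpers. `make_key` and `split_key` are hypothetical names, the URLs are placeholders, and the sketch assumes that website names contain no underscore and tags contain no hyphen.

```python
def make_key(website: str, tags: list[str]) -> str:
    """Build a key of the form website_tag1-tag2-tag3."""
    return f"{website}_{'-'.join(tags)}"


def split_key(key: str) -> tuple[str, list[str]]:
    """Split a key back into the website name and its list of tags."""
    website, _, tag_part = key.partition("_")
    tags = tag_part.split("-") if tag_part else []
    return website, tags


# Two different searches on the same website get distinct keys.
search_links = {
    make_key("nofluffjobs", ["python", "junior"]): "https://example.com/search-1",
    make_key("nofluffjobs", ["python", "remote"]): "https://example.com/search-2",
}
```

Because the tags are part of the key, several links from the same website can coexist in the dictionary, and the key can double as a file name for the saved page.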

03.09.2024

  • Temporarily dropped Streamlit and Selenium to work on basics.

27.08.2024

  • Moved to Streamlit
  • Added function to turn records into dataframe

26.08.2024

  • Introduced JobRecord class to handle HTML records
