This simple application scores each row in src/data/profile.csv, which holds the candidate data: job title, the industries they have worked in, and their residence.
Scoring is based on how well each candidate matches the project data provided in src/data/project.json on the relevant fields: industries, job title, and the distance between the project location and the candidate's location.
We use the TF-IDF algorithm to weigh the industries that represent the industry experience an ideal candidate should possess against the industry experience a candidate actually possesses. We measure the average frequency of each expected project industry in the candidate's profile data to get the matching score for industries.
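As a rough illustration of the idea above, the sketch below computes a TF-IDF weight for each expected industry and averages the weights per candidate. It is not the project's NlpTfIdf class: the function names, the smoothed IDF variant, and the choice to treat each industry name as a single term are all assumptions made for this example.

```typescript
// Illustrative sketch only; names and the IDF variant are assumptions,
// not the API of the project's NlpTfIdf class.

/** Term frequency: share of entries in `doc` equal to `term`. */
function termFrequency(term: string, doc: string[]): number {
  if (doc.length === 0) return 0;
  const hits = doc.filter((t) => t === term).length;
  return hits / doc.length;
}

/** Smoothed inverse document frequency: ln((1 + N) / (1 + df)) + 1. */
function inverseDocumentFrequency(term: string, corpus: string[][]): number {
  const df = corpus.filter((d) => d.includes(term)).length;
  return Math.log((1 + corpus.length) / (1 + df)) + 1;
}

/** Average TF-IDF weight of the expected industries within one candidate's industries. */
function industryScore(expected: string[], doc: string[], corpus: string[][]): number {
  if (expected.length === 0) return 0;
  const total = expected
    .map((term) => termFrequency(term, doc) * inverseDocumentFrequency(term, corpus))
    .reduce((sum, weight) => sum + weight, 0);
  return total / expected.length;
}

// Each inner array is one candidate's (already normalized) industry list.
const industryCorpus: string[][] = [
  ["banking", "insurance"],
  ["banking", "retail"],
  ["healthcare"],
];
const industryScores = industryCorpus.map((doc) =>
  industryScore(["banking", "insurance"], doc, industryCorpus)
);
console.log(industryScores);
```

Raw TF-IDF weights computed this way are not bounded to the 0 to 1 range, so a real implementation would likely need to normalize them before averaging them with the other scores.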
We use a fuzzy match to score how closely the title a candidate holds matches the expected job titles.
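The README does not say which fuzzy matching algorithm the FuzzyMatch class uses. One common option is a similarity based on Levenshtein edit distance, sketched below as an assumption. The function names are hypothetical and chosen for this example only.

```typescript
// Illustrative sketch: Levenshtein-based similarity, assumed here as one
// possible fuzzy matching technique. Not the project's FuzzyMatch class.

/** Levenshtein edit distance, computed with two rolling rows. */
function levenshtein(a: string, b: string): number {
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr.push(Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost));
    }
    prev = curr;
  }
  return prev[b.length];
}

/** Similarity in [0, 1]: 1 for identical strings, 0 for entirely different ones. */
function similarity(a: string, b: string): number {
  const longest = Math.max(a.length, b.length);
  if (longest === 0) return 1;
  return 1 - levenshtein(a, b) / longest;
}

/** Best similarity between the candidate's title and any expected title. */
function titleScore(expectedTitles: string[], candidateTitle: string): number {
  const normalize = (s: string): string => s.trim().toLowerCase();
  return expectedTitles.reduce(
    (best, title) => Math.max(best, similarity(normalize(title), normalize(candidateTitle))),
    0
  );
}

console.log(titleScore(["Software Engineer", "Developer"], "software enginer"));
```

Taking the best match across all expected titles means a candidate only needs to be close to one of them to score well.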
We use the Haversine formula to measure the distance between two geographic coordinates. Once we have the distance, we score it and scale it to match the other scores so the results stay comparable. To do this, we subtract the scaled distance from 1, so that closer candidates get a score near 1. We also filter out candidates whose distance is more than 100 km, as per the requirements.
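The distance step can be sketched as follows. The Haversine formula itself is standard. The scoring function is an assumption: it divides the distance by the 100 km cutoff before subtracting from 1, which is one plausible reading of the description above. All names are hypothetical.

```typescript
// Illustrative sketch; the scaling by the 100 km cutoff is an assumption.

interface Coordinates {
  lat: number; // latitude in degrees
  lon: number; // longitude in degrees
}

const EARTH_RADIUS_KM = 6371; // mean Earth radius
const MAX_DISTANCE_KM = 100; // cutoff taken from the stated requirement

function toRadians(degrees: number): number {
  return (degrees * Math.PI) / 180;
}

/** Great-circle distance in kilometres between two points (Haversine formula). */
function haversineKm(a: Coordinates, b: Coordinates): number {
  const sinLat = Math.sin(toRadians(b.lat - a.lat) / 2);
  const sinLon = Math.sin(toRadians(b.lon - a.lon) / 2);
  const h =
    sinLat * sinLat +
    Math.cos(toRadians(a.lat)) * Math.cos(toRadians(b.lat)) * sinLon * sinLon;
  // Clamp to 1 to guard against floating-point overshoot before asin.
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.min(1, Math.sqrt(h)));
}

/** Score in [0, 1], or null when the candidate is beyond the cutoff. */
function distanceScore(distanceKm: number): number | null {
  if (distanceKm > MAX_DISTANCE_KM) return null;
  return 1 - distanceKm / MAX_DISTANCE_KM;
}

const projectLocation: Coordinates = { lat: 52.52, lon: 13.405 };
const candidateLocation: Coordinates = { lat: 52.4, lon: 13.05 };
console.log(distanceScore(haversineKm(projectLocation, candidateLocation)));
```

Returning null for filtered candidates keeps them distinct from candidates who sit exactly at the cutoff and legitimately score 0.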
The final score is simply the average of all three scores.
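A minimal sketch of that last step is shown below, assuming each of the three scores is already on a 0 to 1 scale. The interface and function names are hypothetical and not taken from the Profiler class.

```typescript
// Illustrative sketch; type and function names are assumptions.

interface CandidateScores {
  name: string;
  industry: number; // industry match score
  title: number; // job title match score
  distance: number; // distance score
}

/** Final score: the plain average of the three partial scores. */
function finalScore(scores: CandidateScores): number {
  return (scores.industry + scores.title + scores.distance) / 3;
}

/** Returns a sorted copy of the candidates, best final score first. */
function rankCandidates(candidates: CandidateScores[]): CandidateScores[] {
  return candidates.slice().sort((a, b) => finalScore(b) - finalScore(a));
}

const ranked = rankCandidates([
  { name: "a", industry: 0.2, title: 0.5, distance: 0.8 },
  { name: "b", industry: 0.6, title: 0.9, distance: 0.3 },
]);
console.log(ranked.map((c) => `${c.name}: ${finalScore(c).toFixed(2)}`));
```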
We use Docker and docker-compose to ease packaging and deploying the application.
We use Mocha and Chai for unit testing and assertions. To measure coverage and generate the coverage report, we use Istanbul.
The application leverages TypeScript's object-oriented features. We use concepts such as inheritance and encapsulation to make the application strictly typed, more resilient, and secure.
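To illustrate what inheritance and encapsulation can look like in TypeScript, here is a small sketch. The class and field names are invented for this example and are not taken from the files in the model directory, whose actual shape may differ.

```typescript
// Illustrative sketch; class shapes are assumptions, not the project's models.

class Person {
  // Private, read-only fields are reachable only through the getters below.
  constructor(private readonly id: string, private readonly jobTitle: string) {}

  getId(): string {
    return this.id;
  }

  getJobTitle(): string {
    return this.jobTitle;
  }
}

class Profile extends Person {
  private score = 0;

  constructor(id: string, jobTitle: string, private readonly industries: string[]) {
    super(id, jobTitle);
  }

  /** Returns a copy so callers cannot mutate the internal list. */
  getIndustries(): string[] {
    return this.industries.slice();
  }

  getScore(): number {
    return this.score;
  }

  /** Validates the value before storing it, keeping the object consistent. */
  setScore(value: number): void {
    if (value < 0 || value > 1) {
      throw new RangeError("score must be between 0 and 1");
    }
    this.score = value;
  }
}

const exampleProfile = new Profile("c1", "Engineer", ["banking"]);
exampleProfile.setScore(0.75);
console.log(exampleProfile.getJobTitle(), exampleProfile.getScore());
```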
The project structure looks like the following tree:
.
├── app.ts
├── data
│   ├── project.json
│   └── profile.csv
├── lib
│   ├── helper
│   │   └── helper.ts
│   ├── nlp
│   │   ├── fuzzy.ts
│   │   ├── normalizer.ts
│   │   └── tfidf.ts
│   └── profiler.ts
├── model
│   ├── city.ts
│   ├── location.ts
│   ├── person.ts
│   └── profile.ts
└── tests
    ├── fixtures
    │   └── project.ts
    └── unit
        ├── helper
        │   ├── helper.spec.ts
        │   └── profiler.spec.ts
        ├── model
        │   ├── person.spec.ts
        │   └── profile.spec.ts
        └── nlp
            ├── fuzzy.spec.ts
            ├── normalizer.spec.ts
            └── tfidf.spec.ts
11 directories, 21 files
The following list describes what each directory in the project structure represents.
- The file app.ts holds the main function, which drives the full flow from data ingestion to scoring and result printing.
- The lib directory holds all the classes required to perform candidate profiling.
  - The Helper class in src/lib/helper/helper.ts holds utilities for basic tasks shared by multiple classes, such as loading CSV and JSON files and measuring distance and radius.
  - The FuzzyMatch class in src/lib/nlp/fuzzy.ts provides the fuzzy matching functionality.
  - The Normalizer class in src/lib/nlp/normalizer.ts provides the text normalization functionality.
  - The NlpTfIdf class in src/lib/nlp/tfidf.ts performs TF-IDF on any given document.
  - The Profiler class in src/lib/profiler.ts profiles any given candidate profile and calculates the final score.
- The model directory holds the data models for city, location, person, and profile data.
- The tests directory holds all the unit tests and the fixtures they require.
# Clone the Git Repo
> git clone https://github.com/shreyaspatel7/profile-matcher.git
> cd profile-matcher/
# Via NPM
> npm install -D
> env PROFILE_DATA=src/data/profile.csv PROJECT_DATA=src/data/project.json npm start
# Via Docker Compose
> docker-compose run profiler
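# Run the unit tests (via NPM, then via Docker Compose)
> PROFILE_DATA=src/data/profile.csv PROJECT_DATA=src/data/project.json npm test
> docker-compose run test
# Generate the coverage report (via NPM, then via Docker Compose)
> PROFILE_DATA=src/data/profile.csv PROJECT_DATA=src/data/project.json npm run coverage
> docker-compose run coverage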
The coverage report is available in both HTML and CLI format. If we run the coverage via Docker Compose, the results will be available in the results directory.