1 parent
4db4933
commit b8d9ae8
Showing 2 changed files with 121 additions and 4 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,59 @@ | ||
# CLI | ||
|
||
`pyresparser` ships with a command-line interface (**CLI**) that you can use directly from your terminal: | ||
|
||
```bash | ||
usage: pyresparser [-h] [-f FILE] [-d DIRECTORY] [-r REMOTEFILE] | ||
[-sf SKILLSFILE] | ||
|
||
optional arguments: | ||
-h, --help show this help message and exit | ||
-f FILE, --file FILE resume file to be extracted | ||
-d DIRECTORY, --directory DIRECTORY directory containing all the resumes to be extracted | ||
-r REMOTEFILE, --remotefile REMOTEFILE remote path for resume file to be extracted | ||
-sf SKILLSFILE, --skillsfile SKILLSFILE custom skills CSV file against which skills are searched for | ||
``` | ||
|
||
## Parsing single resume | ||
|
||
To extract data from a **single resume** file, run: | ||
|
||
```bash | ||
pyresparser -f /path/to/resume/file | ||
``` | ||
|
||
## Parsing multiple resumes | ||
|
||
To extract data from several resumes, place them in a **directory** and run: | ||
|
||
```bash | ||
pyresparser -d /path/to/resume/directory/ | ||
``` | ||
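The same batch run can be approximated from Python. The sketch below is not the CLI's own implementation: `parse_directory`, `RESUME_SUFFIXES`, and the `parse_one` callback are hypothetical names introduced for illustration, and the suffix filter assumes the PDF, DOCX, and DOC formats the project documents as supported.

```python
from pathlib import Path
from typing import Callable, Dict

# Assumption: only these extensions are treated as resumes.
RESUME_SUFFIXES = {".pdf", ".docx", ".doc"}


def parse_directory(directory: str, parse_one: Callable[[str], dict]) -> Dict[str, dict]:
    """Call parse_one on every resume-like file under directory.

    Returns a mapping of file path to whatever parse_one returned.
    """
    results: Dict[str, dict] = {}
    for path in sorted(Path(directory).rglob("*")):
        if path.is_file() and path.suffix.lower() in RESUME_SUFFIXES:
            results[str(path)] = parse_one(str(path))
    return results


# Hypothetical usage with pyresparser installed:
# from pyresparser import ResumeParser
# results = parse_directory(
#     "/path/to/resume/directory/",
#     lambda p: ResumeParser(p).get_extracted_data(),
# )
```

Passing the parser in as a callback keeps the directory walk independent of the library, so a failing file can be handled by wrapping `parse_one` in your own error handling.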
|
||
## Parsing hosted resumes | ||
|
||
To extract data from **remote resumes**, run: | ||
|
||
```bash | ||
pyresparser -r https://www.example.com/path/to/resume/file | ||
``` | ||
|
||
## Specifying skills explicitly | ||
|
||
Pyresparser comes with a built-in skills file that covers many technical skills by default. You can find the default skills file [here](https://github.com/OmkarPathak/pyresparser/blob/master/pyresparser/skills.csv). | ||
|
||
To extract data against your own list of skills, create a CSV file with no headers and run: | ||
|
||
```bash | ||
pyresparser -sf /path/to/skills.csv -f /path/to/resume/file | ||
``` | ||
|
||
## Specifying export format | ||
|
||
To specify the export format, use the `-e` option: | ||
|
||
```bash | ||
pyresparser -e json -f /path/to/resume/file | ||
``` | ||
|
||
Note: Currently, only JSON export is supported. |
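For results obtained through the Python API rather than the CLI, a comparable JSON export can be produced with the standard library. This is a minimal sketch, not the CLI's exporter: `to_json` is a hypothetical helper, and the sample record is a shortened copy of the example result from the project README.

```python
import json
from typing import List


def to_json(records: List[dict]) -> str:
    """Serialize parsed resume records to a JSON string.

    ensure_ascii=False keeps non-ASCII characters (such as accented
    names) readable instead of escaping them.
    """
    return json.dumps(records, indent=2, ensure_ascii=False)


# Shortened sample record; fields the parser could not find are None.
records = [
    {
        "name": "Omkar Pathak",
        "company_names": None,
        "skills": ["Linux", "Python", "Django"],
        "total_experience": 1.83,
    }
]

print(to_json(records))
```

Python's `None` values are written as JSON `null`, so missing fields survive a round trip through `json.loads`.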
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -43,8 +43,66 @@ from pyresparser import ResumeParser | |
data = ResumeParser('/path/to/resume/file').get_extracted_data() | ||
``` | ||
|
||
# Supported File Formats | ||
## Result | ||
|
||
- PDF and DOCx files are supported on all Operating Systems | ||
- If you want to extract DOC files you can install [textract](https://textract.readthedocs.io/en/stable/installation.html) for your OS (Linux, MacOS) | ||
- Note: You just have to install textract (and nothing else) and doc files will get parsed easily | ||
The module returns a list of dictionary objects with the following structure: | ||
|
||
``` | ||
[ | ||
{ | ||
'college_name': ['Marathwada Mitra Mandal’s College of Engineering'], | ||
'company_names': None, | ||
'degree': ['B.E. IN COMPUTER ENGINEERING'], | ||
'designation': ['Manager', | ||
'TECHNICAL CONTENT WRITER', | ||
'DATA ENGINEER'], | ||
'email': '[email protected]', | ||
'mobile_number': '8087996634', | ||
'name': 'Omkar Pathak', | ||
'no_of_pages': 3, | ||
'skills': ['Operating systems', | ||
'Linux', | ||
'Github', | ||
'Testing', | ||
'Content', | ||
'Automation', | ||
'Python', | ||
'Css', | ||
'Website', | ||
'Django', | ||
'Opencv', | ||
'Programming', | ||
'C', | ||
...], | ||
'total_experience': 1.83 | ||
} | ||
] | ||
``` | ||
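As the sample shows, a field that could not be extracted may be `None` (here `company_names`), so downstream code should not assume every value is a list or string. The sketch below illustrates one defensive way to read a record; `summarize` is a hypothetical helper and the record is a shortened copy of the sample above.

```python
def summarize(record: dict) -> str:
    """Build a one-line summary, tolerating missing or None fields."""
    name = record.get("name") or "Unknown"
    skills = record.get("skills") or []
    companies = record.get("company_names") or []
    experience = record.get("total_experience")
    if experience is not None:
        years = f"{experience:.2f} years"
    else:
        years = "unknown experience"
    return f"{name}: {len(skills)} skills, {len(companies)} companies, {years}"


# Shortened version of the sample result shown above.
sample = [
    {
        "name": "Omkar Pathak",
        "company_names": None,
        "skills": ["Linux", "Python", "Django"],
        "total_experience": 1.83,
    }
]

for record in sample:
    print(summarize(record))
```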
|
||
## Supported Resume File Formats | ||
|
||
- Parsing of PDF and DOCX files is supported on all operating systems | ||
- To parse DOC files, install [textract](https://textract.readthedocs.io/en/stable/installation.html) for your OS (Linux, macOS) | ||
- Note: Installing textract is the only extra step required; DOC files are then parsed without further configuration | ||
|
||
# Advanced Options | ||
|
||
## Explicitly specifying skills file | ||
|
||
Pyresparser comes with a built-in skills file that covers many technical skills by default. You can find the default skills file [here](https://github.com/OmkarPathak/pyresparser/blob/master/pyresparser/skills.csv). | ||
|
||
To extract data against your own list of skills, create a CSV file with no headers. | ||
|
||
```python | ||
from pyresparser import ResumeParser | ||
data = ResumeParser('/path/to/resume/file', skills_file='/path/to/skills.csv').get_extracted_data() | ||
``` | ||
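A custom skills file can be generated with the standard `csv` module. This sketch assumes the skills go on a single comma-separated line with no header row; check the default skills file linked above to confirm the layout before relying on it. `write_skills_file` is a hypothetical helper, not part of pyresparser.

```python
import csv
from typing import Iterable, List


def write_skills_file(path: str, skills: Iterable[str]) -> List[str]:
    """Write skills as one header-less CSV row and return the cleaned list."""
    # Drop empty entries and surrounding whitespace before writing.
    cleaned = [s.strip() for s in skills if s.strip()]
    with open(path, "w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerow(cleaned)
    return cleaned


# Hypothetical usage:
# write_skills_file("/path/to/skills.csv", ["python", "machine learning", "sql"])
```

The resulting path can then be passed as `skills_file`, as in the example above.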
|
||
## Explicitly providing regex to parse phone numbers | ||
|
||
pyresparser parses most phone number formats correctly, but a resume may use a pattern it does not yet recognize. In that case, you can supply your own regular expression for matching phone numbers through the `custom_regex` argument: | ||
|
||
```python | ||
from pyresparser import ResumeParser | ||
data = ResumeParser('/path/to/resume/file', custom_regex='pattern').get_extracted_data() | ||
``` |
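Before passing a pattern as `custom_regex`, it can help to try it against sample text with Python's `re` module. The pattern below is only an illustration that matches standalone 10-digit numbers; `PHONE_PATTERN` and `find_phone_numbers` are hypothetical names, and how the library applies the pattern internally is not covered here.

```python
import re
from typing import List

# Illustrative pattern: exactly 10 digits, not embedded in a longer number.
PHONE_PATTERN = r"\b\d{10}\b"


def find_phone_numbers(text: str, pattern: str = PHONE_PATTERN) -> List[str]:
    """Return every substring of text that matches pattern."""
    return re.findall(pattern, text)


sample_text = "Mobile: 8087996634, PIN 411038"
print(find_phone_numbers(sample_text))

# Hypothetical usage once the pattern behaves as expected:
# from pyresparser import ResumeParser
# data = ResumeParser('/path/to/resume/file',
#                     custom_regex=PHONE_PATTERN).get_extracted_data()
```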