pystandards

Crawl and download meta information and documents on technical standards and contributions

Installation

You can install the development version from GitHub with:

pip install git+https://github.com/lorenzbr/pystandards.git

To crawl meta information on ITU-T recommendations, make sure Google Chrome and the chromedriver.exe matching your Chrome version (see here) are installed.
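Before starting an ITU-T crawl, it can help to confirm that the driver file is actually reachable. The following is a minimal sketch using only the Python standard library; `find_chromedriver` is a hypothetical helper and not part of pystandards.

```python
import shutil
from pathlib import Path


def find_chromedriver(candidate="chromedriver.exe"):
    """Return a usable path to the Chrome driver, or None if it cannot be found.

    `candidate` may be an explicit file path or a bare executable name
    that is looked up on the PATH. (Hypothetical helper, not part of
    pystandards.)
    """
    path = Path(candidate)
    if path.is_file():
        # The file exists at the given location: return its absolute path
        return str(path.resolve())
    # shutil.which searches the PATH and returns None when nothing is found
    return shutil.which(candidate)


driver_file = find_chromedriver("chromedriver.exe")
if driver_file is None:
    print("chromedriver.exe not found; install it before crawling ITU-T data")
```

This only checks that the file can be located. It does not verify that the driver version matches the installed Chrome version.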

Functions

  • Crawl meta information on IEEE contributions (see here)
    • To find the name of a standard (std_name), click on the standard of interest. The name is the path segment of the resulting URL: https://mentor.ieee.org/[standard name]/documents (e.g., 802.11, 802.16, ...)
    • Specify the pages from which to get the meta information (start_page and end_page)
  • Download IEEE contribution documents (see here)
    • Input: a data frame with the meta information on IEEE contributions; it must contain at least the three columns dl_link, file and doc_type
    • Input: a path where the documents are saved
  • Crawl meta information on ITU-T recommendations/standards (see here)
    • Specify the recommendation series (e.g., A, G, H, ...)
    • Provide the path and file name of the Chrome driver
  • Download ITU-T recommendation/standard documents (see here)
    • Input: a data frame with the meta information on ITU-T standards; it must contain at least the two columns download_link_recommendation and citation
    • Input: a path where the documents are saved
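The inputs described above can be prepared and checked with two small helpers. This is a sketch under stated assumptions: `std_name_from_url` and `missing_columns` are hypothetical names, not part of pystandards, and the column names are taken from the list above.

```python
from urllib.parse import urlparse

# Required columns as listed above for the two download functions
IEEE_REQUIRED = ("dl_link", "file", "doc_type")
ITUT_REQUIRED = ("download_link_recommendation", "citation")


def std_name_from_url(url):
    """Extract the standard name from https://mentor.ieee.org/<std_name>/documents."""
    # Split the URL path into its non-empty segments
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) < 2 or parts[1] != "documents":
        raise ValueError(f"unexpected URL layout: {url}")
    return parts[0]


def missing_columns(columns, required):
    """Return the required column names that are absent from `columns`."""
    present = set(columns)
    return [col for col in required if col not in present]


std_name = std_name_from_url("https://mentor.ieee.org/802.11/documents")
print(std_name)
```

With a pandas data frame `df`, a check such as `missing_columns(df.columns, IEEE_REQUIRED)` returns an empty list when the frame has every column needed for the IEEE download.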

For parsing standard documents and for related functions (e.g., accessing ETSI standard documents), see here.

Examples

from pystandards.ieee_contributions import ieee_contributions
from pystandards.itut_standards import itut_standards

# Crawl meta information and download IEEE contributions
ieee_contr = ieee_contributions(verbose=True)
# Name of the Wi-Fi standard
std_name = "802.11"
# Get meta information
df_output = ieee_contr.get_meta(std_name, start_page=1, end_page=3)
# Download three contribution documents
df_download = df_output[0:3]
ieee_contr.download_contributions(df_download, path="")

# Crawl meta information and download ITU-T recommendations
itut_std = itut_standards(verbose=True)
series = ['A']
# Specify the file of the Chrome driver (required for the use of Selenium)
driver_file = "chromedriver.exe"
# Get meta information
df_output = itut_std.get_meta(series, driver_file)
# Download three standard documents as PDFs
df_download = df_output[0:3]
itut_std.download_standards(df_download, path="")

License

This repository is licensed under the MIT license.

See here for further information.
