This repository has been archived by the owner on Oct 9, 2019. It is now read-only.
A Python module to fetch and parse results from different search engines.

rohithpr/py-web-search

py-web-search

NOTE: This project is not being maintained anymore.

Join the chat at https://gitter.im/rohithpr/py-web-search

A Python module to fetch and parse results from different search engines.

Warning: Do not make queries rapidly! The servers may block you.
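
One way to follow this advice is to space the calls out yourself. The sketch below is not part of py-web-search: `throttled` and `min_interval` are hypothetical names, and `fake_search` is a stand-in so the example runs without network access. It wraps any callable so that successive calls are at least a given number of seconds apart.

```python
import time


def throttled(func, min_interval=2.0):
    """Wrap func so consecutive calls are at least min_interval seconds apart."""
    last_call = [None]  # mutable cell holding the time of the previous call

    def wrapper(*args, **kwargs):
        if last_call[0] is not None:
            elapsed = time.monotonic() - last_call[0]
            if elapsed < min_interval:
                time.sleep(min_interval - elapsed)
        last_call[0] = time.monotonic()
        return func(*args, **kwargs)

    return wrapper


# Stand-in for a real search function (assumption: any callable works here).
def fake_search(query):
    return {'query': query}


slow_search = throttled(fake_search, min_interval=0.2)
started = time.monotonic()
responses = [slow_search(q) for q in ('hello', 'world', 'python')]
total_wait = time.monotonic() - started
```

In real use the wrapped callable could be something like `Google.search`, with a longer interval than the short one used here for demonstration.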

Related project

Use the search-api to get results in JSON format over HTTP requests (Python is not required).

Search engines supported

  • Google
  • Bing

Installation

Python3: Install using pip:

    pip install py-web-search

Python2: Not available on PyPI at the moment. You can download this repository and set it up manually.

Usage

Web search

    from pws import Google
    from pws import Bing

    print(Google.search(query='hello world', num=5, start=2, country_code="es"))
    print(Bing.search('hello world', 5, 2))
    
    # Arguments:
    # search(query, num, start, sleep, recent, country_code)
    # query: Required. The keyword to search for.
    # num: Default 10. The number of results to return.
    # start: Default 0. The number of top results to skip.
    # sleep: Default True. If True, the program waits for a second, when applicable, to avoid overwhelming the servers.
    # recent: Default None. Allowed values: 'h' (hour), 'd' (day), 'w' (week), 'm' (month) and 'y' (year). (Buggy)
    # country_code: Country code used to get local results, for example "es".
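
The `num` and `start` arguments can be combined to walk through results page by page. The following is a minimal sketch of the offset arithmetic only; `page_offsets` is a hypothetical helper, not part of the module, and it makes no search calls itself.

```python
def page_offsets(total, page_size, first=0):
    """Yield (start, num) pairs that cover `total` results in pages."""
    offset = first
    remaining = total
    while remaining > 0:
        num = min(page_size, remaining)  # the last page may be shorter
        yield offset, num
        offset += num
        remaining -= num


pages = list(page_offsets(total=25, page_size=10, first=2))
print(pages)  # [(2, 10), (12, 10), (22, 5)]
```

Each pair could then be passed on as `start` and `num`, for example `Google.search(query='hello world', num=num, start=start)`, ideally with a pause between calls.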

Prints 5 results starting from the third result (the first 2 are skipped) in the following format:

    {
        'url': '...',
        'expected_num': 5,
        'received_num': 5,  # Lower than expected_num when there are insufficient results
        'start': 2,
        'search_engine': 'google',
        'total_results': ...,
        'results':
        [
            {
                'link': '...',
                'link_text': '...',
                'link_info': '...',
                'related_queries': [...],
                'additional_links':
                {
                    linktext: link,
                    ...
                }
            },
            ...
        ]
    }
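
A response in this shape can be processed like any other dictionary. The sketch below assumes only the keys shown above; `summarize` is a hypothetical helper and `sample` is hand-written data, not real search output. It reports how many results are missing and collects the link text and URL of each result.

```python
def summarize(response):
    """Return (short_by, links) for a response shaped like the one above."""
    # .get() with defaults keeps the sketch tolerant of missing keys.
    short_by = response.get('expected_num', 0) - response.get('received_num', 0)
    links = [(item.get('link_text', ''), item.get('link', ''))
             for item in response.get('results', [])]
    return short_by, links


# Hand-written sample data for illustration only.
sample = {
    'url': 'https://example.com/search?q=hello',
    'expected_num': 5,
    'received_num': 2,
    'start': 2,
    'search_engine': 'google',
    'results': [
        {'link': 'https://example.com/a', 'link_text': 'A', 'link_info': '',
         'related_queries': [], 'additional_links': {}},
        {'link': 'https://example.com/b', 'link_text': 'B', 'link_info': '',
         'related_queries': [], 'additional_links': {}},
    ],
}

short_by, links = summarize(sample)
print(short_by)  # 3
for text, link in links:
    print(text, link)
```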

News search

    from pws import Bing
    from pws import Google

    print(Bing.search_news('github', 10, 0, True, 'h'))
    print(Google.search_news('github', 10, 0, True, 'd', "es"))
    
    # Arguments:
    # search_news(query, num, start, sleep, recent, country_code)
    # query: Required. The keyword to search for.
    # num: Default 10. The number of results to return.
    # start: Default 0. The number of top results to skip.
    # sleep: Default True. If True, the program waits for a second, when applicable, to avoid overwhelming the servers.
    # recent: Default None. Allowed values: 'h' (hour), 'd' (day), 'w' (week), 'm' (month) and 'y' (year). (Buggy)
    # country_code: Country code used to get local results, for example "es".
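
Since `recent` only accepts a handful of one-letter codes, it can help to validate the value before making a call. A small sketch, assuming the allowed values listed above; `check_recent` and `RECENT_LABELS` are hypothetical names, not part of the module.

```python
# Allowed codes for `recent`, taken from the argument list above.
RECENT_LABELS = {'h': 'hour', 'd': 'day', 'w': 'week', 'm': 'month', 'y': 'year'}


def check_recent(recent):
    """Return recent unchanged if it is None or an allowed code, else raise."""
    if recent is None or recent in RECENT_LABELS:
        return recent
    allowed = ', '.join(sorted(RECENT_LABELS))
    raise ValueError('recent must be None or one of: ' + allowed)


print(check_recent('d'))   # d
print(check_recent(None))  # None
```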

Prints 10 results starting from the first result (none are skipped) in the following format:

    {
        'url': '...',
        'num': 10,
        'start': 0,
        'search_engine': 'bing',
        'results':
        [
            {
                'link': '...',
                'link_text': '...',
                'link_info': '...',
                'source': '...',
                'time': '...',
                'additional_links': {},  # Always empty for Bing.
            },
            ...
        ]
    }
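
News results carry a `source` field, so one natural use is grouping headlines by publisher. The sketch below assumes only the keys shown above; `group_by_source` is a hypothetical helper and `sample_news` is hand-written data, not real search output.

```python
from collections import defaultdict


def group_by_source(response):
    """Map each news source to the list of headlines it contributed."""
    grouped = defaultdict(list)
    for item in response.get('results', []):
        # Fall back to 'unknown' if a result has no 'source' key.
        grouped[item.get('source', 'unknown')].append(item.get('link_text', ''))
    return dict(grouped)


# Hand-written sample data for illustration only.
sample_news = {
    'url': 'https://example.com/news?q=github',
    'num': 10,
    'start': 0,
    'search_engine': 'bing',
    'results': [
        {'link': 'https://example.com/1', 'link_text': 'First headline',
         'link_info': '', 'source': 'Example Times', 'time': '...',
         'additional_links': {}},
        {'link': 'https://example.com/2', 'link_text': 'Second headline',
         'link_info': '', 'source': 'Example Post', 'time': '...',
         'additional_links': {}},
        {'link': 'https://example.com/3', 'link_text': 'Third headline',
         'link_info': '', 'source': 'Example Times', 'time': '...',
         'additional_links': {}},
    ],
}

by_source = group_by_source(sample_news)
for source, headlines in by_source.items():
    print(source, headlines)
```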

Todo

  • Other search engines
  • Images etc.

Contribution

Feel free to add any features that you think might be useful.
