Couldn't disable log or change log level in stdout #83

Open
jh88 opened this issue Dec 19, 2018 · 1 comment
Labels
bug: project maintainers identified this issue as potential bug in project

Comments

jh88 commented Dec 19, 2018

In my custom CrawlManager, I changed LOG_LEVEL and LOG_ENABLED.

class CrawlManager(ScrapyrtCrawlManager):

    ...

    def get_scrapyrt_settings(self):
        spider_settings = {
            "LOG_LEVEL": "INFO",
            "LOG_ENABLED": False,
            "LOG_FILE": None,
            "LOG_STDOUT": False,
            "EXTENSIONS": {
                "scrapy.extensions.logstats.LogStats": None,
                "scrapy.webservice.WebService": None,
                "scrapy.extensions.telnet.TelnetConsole": None,
                "scrapy.extensions.throttle.AutoThrottle": None,
            },
        }
        return spider_settings

    def get_project_settings(self):
        custom_settings = self.get_scrapyrt_settings()
        return get_project_settings(custom_settings=custom_settings)

    ...

However, neither of them worked: I could still see logs in stdout, including DEBUG-level messages.

2018-12-20 09:56:44+1100 [scrapyrt] {'request': {'url': 'https://www.railexpress.com.au/brisbanes-cross-river-rail-will-feed-the-centre-at-the-expense-of-people-in-the-suburbs/', 'meta': {'test': False, 'show_rule': True}}, 'spider_name': 'news_scraper'}
WARNING:py.warnings:/Users/ubuntu/miniconda3/envs/streem/lib/python3.6/site-packages/scrapyrt/core.py:9: ScrapyDeprecationWarning: Module `scrapy.log` has been deprecated, Scrapy now relies on the builtin Python library for logging. Read the updated logging entry in the documentation to learn more.
  from scrapy import signals, log as scrapy_log

2018-12-20 09:56:44+1100 [scrapyrt] Created request for spider news_scraper with url https://www.railexpress.com.au/brisbanes-cross-river-rail-will-feed-the-centre-at-the-expense-of-people-in-the-suburbs/ and kwargs {'meta': {'test': False, 'show_rule': True}}
INFO:scrapy.crawler:Overridden settings: {'BOT_NAME': 'news_scraper', 'LOG_ENABLED': False, 'LOG_LEVEL': 'INFO', 'NEWSPIDER_MODULE': 'news_scraper.spiders', 'SPIDER_MODULES': ['news_scraper.spiders'], 'USER_AGENT': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.84 Safari/537.36'}
INFO:scrapy.middleware:Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
 'scrapy.extensions.memusage.MemoryUsage']
INFO:scrapy.middleware:Enabled downloader middlewares:
['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
 'scrapy.downloadermiddlewares.retry.RetryMiddleware',
 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
 'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
 'scrapy.downloadermiddlewares.stats.DownloaderStats']
INFO:scrapy.middleware:Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
 'scrapy.spidermiddlewares.referer.RefererMiddleware',
 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
 'scrapy.spidermiddlewares.depth.DepthMiddleware']
INFO:scrapy.middleware:Enabled item pipelines:
[]
INFO:scrapy.core.engine:Spider opened
DEBUG:scrapy.core.engine:Crawled (200) <GET https://www.railexpress.com.au/brisbanes-cross-river-rail-will-feed-the-centre-at-the-expense-of-people-in-the-suburbs/> (referer: http://media.streem.com.au)
DEBUG:scrapy.core.scraper:Scraped from <200 https://www.railexpress.com.au/brisbanes-cross-river-rail-will-feed-the-centre-at-the-expense-of-people-in-the-suburbs/>
{'author': 'Industry Opinion',
 'body': 'OPINION: The rail project may well help get more commuters into the '
         '...',
 'detected_lang': {'code': 'en', 'confidence': 99.0},
 'language': 'en',
 'modified_at': None,
 'published_at': '2017-07-10T02:42:24+10:00',
 'title': 'Brisbane’s Cross River Rail will feed the centre at the expense of '
          'people in the suburbs',
 'url': 'https://www.railexpress.com.au/brisbanes-cross-river-rail-will-feed-the-centre-at-the-expense-of-people-in-the-suburbs/'}
INFO:scrapy.core.engine:Closing spider (finished)
INFO:scrapy.statscollectors:Dumping Scrapy stats:
{'downloader/request_bytes': 426,
 'downloader/request_count': 1,
 'downloader/request_method_count/GET': 1,
 'downloader/response_bytes': 19470,
 'downloader/response_count': 1,
 'downloader/response_status_count/200': 1,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2018, 12, 19, 22, 56, 44, 571116),
 'item_scraped_count': 1,
 'log_count/INFO': 6,
 'log_count/WARNING': 1,
 'memusage/max': 57487360,
 'memusage/startup': 57487360,
 'response_received_count': 1,
 'scheduler/dequeued': 1,
 'scheduler/dequeued/memory': 1,
 'scheduler/enqueued': 1,
 'scheduler/enqueued/memory': 1,
 'start_time': datetime.datetime(2018, 12, 19, 22, 56, 44, 277020)}
INFO:scrapy.core.engine:Spider closed (finished)
2018-12-20 09:56:44+1100 [-] "127.0.0.1" - - [19/Dec/2018:22:56:43 +0000] "POST /scrape HTTP/1.1" 200 7665 "-" "PostmanRuntime/7.4.0"
2018-12-20 09:57:44+1100 [-] Timing out client: IPv4Address(TCP, '127.0.0.1', 53118)

My goal is to remove all default logs from stdout, so that I can use stdout to show only my own logs.
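
For anyone after the same result, here is a possible workaround sketch that relies only on the Python standard library `logging` module. It is not a fix for the ScrapyRT behaviour described above. It assumes the unwanted lines are emitted through loggers named under `scrapy` and `py.warnings`, as the `INFO:scrapy...` and `WARNING:py.warnings` prefixes in the output suggest. The function names `silence_loggers` and `make_app_logger` and the logger name `my_app` are hypothetical. The timestamped `[scrapyrt]` and `[-]` lines look like Twisted log output and would probably not be affected by this.

```python
import logging
import sys


def silence_loggers(names=("scrapy", "py.warnings")):
    """Mute the named loggers and everything below them in the hierarchy.

    Child loggers such as "scrapy.core.engine" have no level of their own by
    default, so they inherit the threshold set on "scrapy".
    """
    for name in names:
        lib_logger = logging.getLogger(name)
        # A threshold above CRITICAL rejects every standard level.
        lib_logger.setLevel(logging.CRITICAL + 1)
        # Stop records from reaching handlers attached to the root logger.
        lib_logger.propagate = False


def make_app_logger(name="my_app", stream=None):
    """Return a logger that writes only to its own handler (stdout by default)."""
    app_logger = logging.getLogger(name)
    app_logger.setLevel(logging.INFO)
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))
    app_logger.addHandler(handler)
    # Keep application records away from root handlers to avoid duplicates.
    app_logger.propagate = False
    return app_logger
```

Calling `silence_loggers()` once at start-up and then logging through `make_app_logger()` should leave only the application's own lines on stdout, assuming nothing later resets the level or `propagate` flag on those loggers.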

pawelmhm (Member) commented

related to #10

pawelmhm added the bug label (project maintainers identified this issue as potential bug in project) on Sep 20, 2019