
Forbid await in for loop #1600

Open
sobolevn opened this issue Sep 13, 2020 · 10 comments
Assignees
Labels
good first issue Entrypoint to the project · Hacktoberfest Hacktoberfest fun! · help wanted Extra attention is needed · rule request Adding a new rule

Comments

@sobolevn
Member

https://github.com/Instagram/Fixit/blob/master/fixit/rules/gather_sequential_await.py
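Judging by its name, the linked Fixit rule targets sequential awaits that could run concurrently via asyncio.gather. A minimal sketch of the pattern (my own illustration, not the rule's code):

```python
import asyncio

async def slow(n):
    # Stand-in for an awaitable unit of work.
    await asyncio.sleep(0.01)
    return n

async def sequential():
    a = await slow(1)  # the second call cannot start until this one finishes
    b = await slow(2)
    return a + b

async def concurrent():
    # Both calls run concurrently; total time is one sleep, not two.
    a, b = await asyncio.gather(slow(1), slow(2))
    return a + b

print(asyncio.run(sequential()), asyncio.run(concurrent()))  # 3 3
```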

@sobolevn sobolevn added help wanted Extra attention is needed rule request Adding a new rule good first issue Entrypoint to the project labels Sep 13, 2020
@Sxderp
Contributor

Sxderp commented Sep 16, 2020

What if you're using aiohttp or something similar to fetch web pages? Any considerate person writing a bot will space out their requests. How would you implement that without awaiting in the for loop? I suppose you could spin the work off to an actual thread and do it entirely synchronously, but that feels like a lot of setup just to avoid an await in a loop.

* Note, we're making the assumption that the code is also doing something other than fetching pages.

@shivang970

Hi @sobolevn
I want to get involved in this project. It's my first time contributing to an open source project, hoping you could help me...

@sobolevn
Member Author

@shivang970 you would need to use visit_For somewhere in https://github.com/wemake-services/wemake-python-styleguide/blob/master/wemake_python_styleguide/visitors/ast/loops.py

Then use ast.walk on that node to make sure there are no await nodes inside.
You can have a look at this implementation, it is pretty similar: https://github.com/wemake-services/wemake-python-styleguide/blob/master/wemake_python_styleguide/visitors/ast/loops.py#L177-L179

Thanks a lot for your help!
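A minimal sketch of the check described above (a standalone illustration, not the project's actual visitor base classes): visit each for node and ast.walk it looking for Await nodes.

```python
import ast

class AwaitInLoopChecker(ast.NodeVisitor):
    """Collects line numbers of `await` expressions inside for loops."""

    def __init__(self):
        self.violations = []

    def visit_For(self, node):
        self._check(node)
        self.generic_visit(node)

    def visit_AsyncFor(self, node):  # also cover `async for`
        self._check(node)
        self.generic_visit(node)

    def _check(self, node):
        # ast.walk yields the node and all of its descendants.
        for child in ast.walk(node):
            if isinstance(child, ast.Await):
                self.violations.append(child.lineno)

code = """
async def main():
    for url in urls:
        await fetch(url)
"""
checker = AwaitInLoopChecker()
checker.visit(ast.parse(code))
print(checker.violations)  # [4] -- the line of the `await`
```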

@sobolevn
Member Author

@Sxderp
Contributor

Sxderp commented Sep 23, 2020

I understand how gather works, but there are workflows in which we don't want to "start off" 100 connection requests. Using gather would cause everything to initialize in fairly quick succession. I can see something like gathering in increments of four, but you are still waiting in a loop.

Anyway, I think rate limiting is definitely an appropriate use.
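The "gathering in increments of four" idea can be sketched as follows (my own illustration, with a dummy fetch standing in for a real HTTP call): at most four requests run at a time, yet the outer loop still awaits.

```python
import asyncio

async def fetch(url):
    # Stand-in for a real HTTP request.
    await asyncio.sleep(0.01)
    return url

async def fetch_in_batches(urls, batch_size=4):
    results = []
    for i in range(0, len(urls), batch_size):
        batch = urls[i:i + batch_size]
        # Each batch runs concurrently, but we still await in a for loop,
        # so no more than batch_size requests are in flight at once.
        results += await asyncio.gather(*(fetch(u) for u in batch))
    return results

urls = [f'https://example.com/{n}' for n in range(10)]
results = asyncio.run(fetch_in_batches(urls))
print(len(results))  # 10
```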

@Sxderp
Contributor

Sxderp commented Sep 23, 2020

I can get behind a Rule that forbids await in list comprehensions or a single line for loop. I think in most circumstances that would be a mistake. The rate limiting could potentially still apply, but if you need rate limiting you're generally doing other things with the data (more than one line).
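For concreteness, here is what that narrower rule would flag (a sketch with a dummy coroutine): an await inside a comprehension runs the calls one at a time, which is usually a mistake when the equivalent gather form exists.

```python
import asyncio

async def double(n):
    await asyncio.sleep(0)
    return n * 2

async def main():
    # await inside a comprehension: each call waits for the previous one --
    # the pattern the comment above suggests forbidding.
    sequential = [await double(n) for n in range(3)]
    # The equivalent concurrent form:
    concurrent = await asyncio.gather(*(double(n) for n in range(3)))
    return sequential, concurrent

seq, conc = asyncio.run(main())
print(seq, list(conc))  # [0, 2, 4] [0, 2, 4]
```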

@sobolevn
Member Author

Can you please provide a real-ish code sample with rate limiting in a loop?

@Sxderp
Contributor

Sxderp commented Sep 23, 2020

Here is a stripped-down sample of something similar to what I've used. I can also think of IPC examples in which state is encoded in the response and you need to send part of the response back in order to perform your next request, something like a Redis "SCAN" operation. I have no actual implementations doing that with asyncio, but it's just a thought.

import aiohttp
import asyncio
import random
from bs4 import BeautifulSoup

# Galleries would normally be fetched too. But for simplicity's sake,
# hardcode them.
GALLERIES = [
    'https://example.com/gallery/1',
    'https://example.com/gallery/2',
    'https://example.com/gallery/3',
    'https://example.com/gallery/4',
]

async def extract_image_srcs(session, gallery_url):
    async with session.get(gallery_url) as gallery_response:
        gallery_soup = BeautifulSoup(await gallery_response.text(), features='lxml')
        return [img.attrs['src'] for img in gallery_soup.findAll('img')]

async def get_image_data(session, image_src):
    async with session.get(image_src) as image_response:
        return await image_response.read()

async def fetch_gallery(gallery_url):
    session = aiohttp.ClientSession()
    image_sources = await extract_image_srcs(session, gallery_url)
    for image_src in image_sources:
        # This could technically be "gathered", but await is used to artificially
        # reduce the number of potential requests. We can't guarantee any
        # specific number, but it'll definitely be less than doing everything
        # at once.
        image_data = await get_image_data(session, image_src)
        with open(str(random.randint(1, 255)), 'wb') as fd:
            fd.write(image_data)
        # Optionally include additional waiting if we want to be extra nice.
        await asyncio.sleep(random.randint(1, 10))
    await session.close()

async def bootstrap():
    await asyncio.gather(*[fetch_gallery(gallery_url) for gallery_url in GALLERIES])

asyncio.run(bootstrap())

@sobolevn sobolevn added the Hacktoberfest label Sep 30, 2020
@strmwalker

You should use asyncio.Semaphore if you need to rate-limit your requests.
But asyncio.sleep(0) may be useful if you have an async context but are doing a CPU-intensive task (crunching image data in the script), to allow other coroutines to advance.
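The Semaphore approach can be sketched like this (a dummy fetch stands in for the real request): gather launches everything, while the semaphore caps how many coroutines are actually inside the request section at once.

```python
import asyncio

async def fetch(url, sem):
    async with sem:  # at most 3 coroutines pass this point concurrently
        await asyncio.sleep(0.01)  # stand-in for the actual request
        return url

async def main(urls):
    sem = asyncio.Semaphore(3)
    # All tasks are scheduled at once, but the semaphore limits
    # how many requests are in flight at any moment.
    return await asyncio.gather(*(fetch(u, sem) for u in urls))

urls = [f'https://example.com/{n}' for n in range(8)]
results = asyncio.run(main(urls))
print(len(results))  # 8
```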

@Sxderp
Contributor

Sxderp commented Aug 3, 2022

If you're rate-limiting with a concurrency of more than one, I can agree. If it's one, I'm not sure I see the point.
