IndexError running 1_scraping_fbref.ipynb #6

Open
chrisdaloa opened this issue Aug 9, 2024 · 3 comments

Comments

@chrisdaloa

Hi, when I run the 3rd cell of "1_scraping_fbref.ipynb" I get this error. Can you help me?

IndexError Traceback (most recent call last)
Cell In[3], line 1
----> 1 df_outfield = get_outfield_data('https://fbref.com/en/comps/11/','/Serie-A-Stats')
3 df_outfield.to_csv('fbref_data/outfield_players.csv', index=False)
5 df_outfield

Cell In[2], line 152
151 def get_outfield_data(top, end):
--> 152 df1 = frame_for_category('stats',top,end,stats)
153 df2 = frame_for_category('shooting',top,end,shooting2)
154 df3 = frame_for_category('passing',top,end,passing2)

Cell In[2], line 136
134 def frame_for_category(category,top,end,features):
135 url = (top + category + end)
--> 136 player_table, team_table = get_tables(url,'for')
137 df_player = get_frame(features, player_table)
138 return df_player

Cell In[2], line 45
42 soup = BeautifulSoup(comm.sub("",res.text),'lxml')
43 all_tables = soup.findAll("tbody")
---> 45 team_table = all_tables[0]
46 team_vs_table = all_tables[1]
47 player_table = all_tables[2]

IndexError: list index out of range

@uPeppe
Owner

uPeppe commented Aug 9, 2024

Try replacing the FBref link with the one for the 2023-2024 season (see the old-seasons part of the code for the format), as the current season (2024-2025) doesn't have any stats yet.

Or it could be that FBref is blocking the connection; I've had this problem before.
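
Both causes show up the same way in the traceback: the page comes back with fewer `<tbody>` elements than the code indexes into. A minimal sketch of a guard that makes this explicit is below. `pick_tables` is a hypothetical helper (not part of the notebook); it assumes it is given the list returned by `soup.findAll("tbody")` and that the first three entries are the team, team-vs and player tables, as in the traceback.

```python
def pick_tables(all_tables, url="<unknown url>"):
    """Return (team_table, team_vs_table, player_table) from a list of <tbody> elements.

    Hypothetical helper: raises a descriptive ValueError instead of a bare
    IndexError when the page does not contain the three expected tables.
    """
    expected = 3
    if len(all_tables) < expected:
        raise ValueError(
            f"Expected at least {expected} <tbody> elements at {url}, "
            f"found {len(all_tables)}. The season may have no stats yet, "
            "or FBref may be blocking or rate-limiting the connection."
        )
    # Same order as in the traceback: team, team-vs, player.
    team_table, team_vs_table, player_table = all_tables[:expected]
    return team_table, team_vs_table, player_table
```

Inside `get_tables`, the three index lookups could then be replaced by something like `team_table, team_vs_table, player_table = pick_tables(all_tables, url)`, so that an empty page fails with a message that points at the likely causes.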

@AndreaCovelli

AndreaCovelli commented Aug 17, 2024

@chrisdaloa It happens because fbref.com rate-limits requests to keep bots from overloading its servers. If I remember correctly, the limit is about 15 requests per minute. If you go over that limit, your IP is banned for an hour.

To solve the problem, I suggest inserting some time.sleep calls in the Python code.
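
A minimal sketch of the pause suggested above, using the standard library's `time.sleep`. `Throttle` is a hypothetical helper, not part of the notebook. The 6-second interval is an assumption derived from the roughly 15 requests per minute recalled above (60 / 15 = 4 seconds, padded for margin); the real limit is not confirmed here.

```python
import time


class Throttle:
    """Enforce a minimum interval between successive calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # seconds between requests
        self._clock = clock               # injectable for testing
        self._sleep = sleep               # injectable for testing
        self._last = None                 # time of the previous call, if any

    def wait(self):
        """Sleep just long enough that calls are at least min_interval apart."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Assumed value: 60 s / 15 requests = 4 s, padded to 6 s for safety.
throttle = Throttle(min_interval=6.0)
```

Calling `throttle.wait()` right before each HTTP request in `get_tables` would space the requests out without slowing the first one. A plain `time.sleep(6)` before each request does the same job with less code; the class only avoids sleeping longer than necessary.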

@chrisdaloa
Author

Hi, with the new season's data it works.
