403 Error Cant access my .py files in my Repo? #121229

liot-io · 2024-04-26T15:24:17Z

Hi.

Suddenly I cannot access some of the scripts in my repo.
I can still open some of them, but others return an error with the following text:

Error loading page
An unexpected 403 error occured. Try reloading the page.

I have been working on some custom projects that I would really hate to lose :S Is there someone who can help, or some way to rescue the content of the files I cannot open?

Repo = https://github.com/liot-io/AIOpenDK/blob/main/projects/scrapers/web_scraper.py

davevad93 · 2024-04-26T17:48:22Z

Hi @liot-io, here's the content of your web_scraper.py file:

import os
import html2text
from requests_html import HTMLSession
from urllib.parse import urlparse

def extract_filename_from_url(url: str) -> str:
    """Extract the filename from the URL."""
    parsed_url = urlparse(url)
    filename = parsed_url.netloc  # Extract domain name from URL
    return filename + ".md"  # Add .md extension

def download_and_save_in_markdown(url: str, dir_path: str) -> None:
    """Download the HTML content from the web page and save it as a markdown file."""
    # Extract a filename from the URL
    filename = extract_filename_from_url(url)
    print(f"Downloading {url} into {filename}...")

    session = HTMLSession()
    response = session.get(url, timeout=30)

    # Check if the content type is HTML
    content_type = response.headers.get('content-type', '')
    if 'text/html' not in content_type:
        print(f"Skipping {url} as it is not an HTML page")
        return

    # Render the page, which will execute JavaScript
    response.html.render(timeout=60)  # Increased timeout to 60 seconds

    # Convert the rendered HTML content to markdown
    h = html2text.HTML2Text()
    markdown_content = h.handle(response.html.raw_html.decode("utf-8"))

    # Write the markdown content to a file
    filename = os.path.join(dir_path, filename)
    if not os.path.exists(filename):
        with open(filename, "w", encoding="utf-8") as f:
            f.write(markdown_content)

def download_target_page(url: str) -> None:
    """Download the HTML content from the target page and save it as a markdown file."""
    # Create the content directory if it doesn't exist
    base_dir = os.path.dirname(os.path.abspath(__file__))
    dir_path = os.path.join(base_dir, "content")
    os.makedirs(dir_path, exist_ok=True)
    
    # Download and save the target page
    download_and_save_in_markdown(url, dir_path)
    print("Target page has been successfully downloaded!")

# Define the target page
TARGET_PAGES = [
    "https://Example.dk/",
]

if __name__ == "__main__":
    for target_page in TARGET_PAGES:
        download_target_page(target_page)
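
A side note on the script above: `extract_filename_from_url` uses only the domain, so every page from the same site maps to the same `.md` file, and because the file is written only when it does not already exist, later pages from that site would be skipped. If that ever matters, a minimal sketch of a variant that also uses the URL path could look like this. The name `extract_filename_with_path` and the character whitelist are assumptions for illustration, not part of the original file, and query strings are ignored:

```python
import re
from urllib.parse import urlparse


def extract_filename_with_path(url: str) -> str:
    """Hypothetical variant: build a filename from the domain plus the URL path."""
    parsed_url = urlparse(url)
    # Combine domain and path so different pages on one site get different names
    raw_name = parsed_url.netloc + parsed_url.path
    # Replace every run of characters outside letters, digits, "." and "-" with "_"
    safe_name = re.sub(r"[^A-Za-z0-9.-]+", "_", raw_name).strip("_")
    return safe_name + ".md"
```

For the root page this gives the same name as the original function, while a deeper page such as `https://Example.dk/about/team` would be saved as `Example.dk_about_team.md`.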

Jv2350 · 2024-04-26T17:56:41Z

Hey there,

That sounds frustrating! It's never fun to hit a roadblock when you're in the middle of a project. Let's try to troubleshoot this together.

First off, have you checked your GitHub permissions and made sure you're logged in with the right account? Sometimes, things get a bit wonky with permissions, so it's worth a double-check.

If everything seems fine on your end, it might be worth reaching out to GitHub support. They're usually pretty helpful and might be able to shed some light on what's going on.

In the meantime, if you have local copies of those files, you should still be able to access them. If not, don't worry just yet. We'll figure this out together!

liot-io · 2024-04-26T20:03:20Z

Hi!

I can access everything from my phone. It seems to be a problem with the SSL certificate and company policies, as I'm on my work computer.
Phew, I thought I had lost 3 months of work there! Zscaler is a B*tch for productivity.

Thank you for the quick reply! And thank you @davevad93!
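
For anyone who lands here with the same symptom: one way to see whether a company proxy is re-signing HTTPS traffic is to look at who issued the certificate your machine receives for github.com. Below is a rough sketch using only the Python standard library. The helper names (`fetch_peer_cert`, `issuer_fields`, `issued_by`) and the `"zscaler"` marker are my own assumptions for illustration, and this only hints at interception; it does not prove that the proxy is what causes the 403.

```python
import socket
import ssl


def fetch_peer_cert(host: str, port: int = 443, timeout: float = 10.0) -> dict:
    """Connect to host over TLS and return the verified certificate as a dict."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()


def issuer_fields(cert: dict) -> dict:
    """Flatten the nested 'issuer' tuples from getpeercert() into a plain dict."""
    return {key: value for rdn in cert.get("issuer", ()) for key, value in rdn}


def issued_by(cert: dict, marker: str) -> bool:
    """Return True if any issuer field contains marker (case-insensitive)."""
    values = issuer_fields(cert).values()
    return any(marker.lower() in value.lower() for value in values)
```

On the affected machine you could run `issued_by(fetch_peer_cert("github.com"), "zscaler")` and compare `issuer_fields(...)` with what you get on another network. If Python does not trust the proxy's root certificate, `fetch_peer_cert` may instead raise `ssl.SSLCertVerificationError`, which points in the same direction.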
