Add help file to crawl github repos #51
Comments
@zackees I found it very difficult to get it to work with GitHub repos, so I created my own repo based on this one, focused on crawling GitHub repos using GitHub's API, if you want to check it out: https://github.com/haydenwhayne/gpt-github-crawler
I was after the same thing and had similar difficulty. I think my problem was that it wouldn't automatically traverse the subfolders, perhaps because the content is not loaded until it is interacted with 🤔🤷‍♂️ I experimented with these selectors: #repo-content-turbo-frame, #read-only-cursor-text-area, #repos-file-tree. Thinking on it a bit more, since the repository is fully retrievable, perhaps this kind of thing could be done effectively by cloning the repo and then traversing the file system. Perhaps a web crawler is not really required for this.
@nicholascross Yep, I agree. The GitHub repo I linked above allows for crawling both remote and local repos. This way you can clone the repository if you want and run it in local mode to traverse the file system. You can still add match patterns to specify which files and file types you want.
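For anyone who wants to try the clone-then-traverse idea without any particular tool, here is a minimal Python sketch. It is not how gpt-github-crawler works internally; the function name `crawl_local_repo`, its parameters, and the default patterns are made up for illustration. It assumes the repo is already cloned and that shell-style glob patterns are good enough as match patterns.

```python
import fnmatch
import os
from pathlib import Path


def crawl_local_repo(root, match_patterns=("*.py", "*.md"), skip_dirs=(".git",)):
    """Yield (relative_path, text) for matching files under a local checkout.

    Hypothetical helper: names and defaults are illustrative only.
    """
    root = Path(root)
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune skipped directories in place so os.walk never descends into them.
        dirnames[:] = sorted(d for d in dirnames if d not in skip_dirs)
        for name in sorted(filenames):
            path = Path(dirpath) / name
            rel = path.relative_to(root).as_posix()
            # fnmatch's "*" also matches "/", so "*.py" matches "src/a.py".
            if not any(fnmatch.fnmatch(rel, pattern) for pattern in match_patterns):
                continue
            try:
                text = path.read_text(encoding="utf-8")
            except (UnicodeDecodeError, OSError):
                continue  # skip binary or unreadable files
            yield rel, text
```

Something like `for rel, text in crawl_local_repo("path/to/checkout", match_patterns=("*.swift",)): ...` would then give you the file contents to write out in whatever format your GPT upload needs.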
I found local filesystem crawling has been requested here, so maybe go upvote if you want it. I ended up going down the local checkout path myself, though it is unlikely to be useful for anyone unless they are a Swift dev experimenting in this space: https://github.com/nicholascross/SourceCrawler I found it interesting that, once I had the first version of this, which used heuristic regexes for type extraction, I was able to use the crawling output with a GPT agent to add AST-based type extraction using a "third party" library I had no experience with. 🤯
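To give a rough idea of what "heuristic regexes for type extraction" can look like, here is a small Python sketch that pulls type declarations out of Swift source text. It is not taken from SourceCrawler; the pattern, the `TYPE_DECL` name, and the `extract_types` helper are assumptions for illustration, and a regex like this will miss or misread plenty of real code (nested generics, declarations inside comments or strings), which is presumably why an AST-based approach is the better end state.

```python
import re

# Heuristic only: optional modifiers, then a type keyword, then the type name.
# The negative lookahead avoids treating "class func" / "class var" as a type.
TYPE_DECL = re.compile(
    r"^\s*(?:(?:public|open|internal|fileprivate|private|final|indirect)\s+)*"
    r"(class|struct|enum|protocol|actor)\s+(?!func\b|var\b|let\b)([A-Za-z_]\w*)",
    re.MULTILINE,
)


def extract_types(source):
    """Return (kind, name) pairs for type declarations found in Swift source text."""
    return [(m.group(1), m.group(2)) for m in TYPE_DECL.finditer(source)]
```

Feeding it the text of each crawled `.swift` file gives a quick list of declared types to include alongside the raw source.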
I ended up creating a repo crawler as well. Mine supports either crawling a public repo based on its URL, or crawling the locally checked out repo:
+1
I would love to create a GPT out of a GitHub repo. Can you please add this?
K thx bai