docs: Improve the Features section in README #772

Merged: vdusek merged 4 commits into master from update-readme on Dec 3, 2024

Conversation

vdusek (Collaborator) commented Dec 2, 2024

No description provided.

vdusek added the labels documentation (Improvements or additions to documentation.), t-tooling (Issues with this label are in the ownership of the tooling team.), and adhoc (Ad-hoc unplanned task added during the sprint.) on Dec 2, 2024
vdusek added this to the 104th sprint - Tooling team milestone on Dec 2, 2024
vdusek requested a review from janbuchar on December 2, 2024 18:31
vdusek self-assigned this on Dec 2, 2024
honzajavorek (Contributor) left a comment

I added two comments which I think improve the spelling or wording.

The rest is my subjective commentary on the Crawlee and Scrapy comparison, which you can take as feedback or ignore completely. I wanted to provide an outsider's perspective, but at the same time, I think the framework creators should have the freedom to express their opinionated view and their ambition for how the project should differentiate itself.

Just chiming in, neither approving nor disapproving.

Co-authored-by: Honza Javorek <[email protected]>
vdusek (Collaborator, Author) commented Dec 3, 2024

@honzajavorek, thanks for your feedback.

honzajavorek (Contributor) commented Dec 3, 2024

One more thing I didn't notice previously - I'm sorry! Most of the points start with "Crawlee something..." or "unlike Scrapy, Crawlee..." Given the heading already sets the scene, I think we can be shorter by just listing the benefits:

  • Newer project built with modern Python and complete type hint coverage for a better developer experience.
  • Its crawlers are regular Python scripts. You don't need a separate command to launch them, and you can integrate them directly into other applications.
  • Supports state persistence during interruptions, saving time and costs by avoiding the need to restart scraping pipelines from scratch after an issue.
  • Allows saving of multiple types of results in a single scraping run. Offers several storing options (see datasets and key-value stores).

Something along these lines. Feel free to drop my suggestion or use it only as loose inspiration.

vdusek merged commit 27db2e4 into master on Dec 3, 2024
22 of 23 checks passed
vdusek deleted the update-readme branch on December 3, 2024 14:07
Mantisus pushed a commit to Mantisus/crawlee-python that referenced this pull request Dec 10, 2024
3 participants