Redesign for 1.0.0 #26

Open · 1 of 5 tasks
micahcochran opened this issue Aug 3, 2021 · 0 comments
Milestone: 1.0.0

micahcochran (Owner) commented Aug 3, 2021

Need a RecipeSchema class. It will store the data both as a JSON-like dictionary and as a Python-object dictionary in a single class.

The functions load(), loads(), scrape(), and scrape_url() should return a list of RecipeSchema objects.

RecipeSchema.__init__ will take JSON-style dictionaries (recipe_json) and Python-object dictionaries (recipe_dict). A RecipeSchema class should help code reuse, and I think this is a better design paradigm than the current, entirely functional code. It adds an additional decoding step, but future class methods could likely be added without a major overhaul.

Here are some ideas:

>>> import scrape_schema_recipe

>>> url = 'https://www.foodnetwork.com/recipes/alton-brown/honey-mustard-dressing-recipe-1939031'
>>> recipe_list = scrape_schema_recipe.scrape_url(url)  # the python_objects=True argument is gone

# OLD WAY
>>> recipe = recipe_list[0]

# NEW WAY
>>> recipe = recipe_list[0].dict
# OR
>>> recipe = dict(recipe_list[0])
# OR if you need the JSON-like dictionary version
>>> recipe = recipe_list[0].json

# to get the name of the recipe
>>> recipe['name']
'Honey Mustard Dressing'

I could still be convinced that one syntax might be better than another.

This is a design decision to remove the python_objects parameter from the functions entirely; it is a little more complicated than it should be. The RecipeSchema.dict and/or dict(RecipeSchema) representation(s) should provide that functionality instead.
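Here is a rough sketch of how RecipeSchema could support both the .dict/.json attributes and dict(recipe_list[0]); the _to_python helper and the defaulting behavior are assumptions, not a settled design:

class RecipeSchema:
    """Holds one recipe as a JSON-like dict (self.json) and a Python-object dict (self.dict)."""

    def __init__(self, recipe_json=None, recipe_dict=None):
        # recipe_json: values as they appear in the JSON-LD (dates are ISO 8601 strings)
        # recipe_dict: values converted to Python objects (datetime.datetime, etc.)
        self.json = recipe_json if recipe_json is not None else {}
        self.dict = recipe_dict if recipe_dict is not None else self._to_python(self.json)

    def keys(self):
        # keys() plus __getitem__ is enough for dict(RecipeSchema(...)) to work,
        # which is what replaces python_objects=True
        return self.dict.keys()

    def __getitem__(self, key):
        return self.dict[key]

    @staticmethod
    def _to_python(json_dict):
        # placeholder for the existing string-to-datetime/timedelta conversion code
        return dict(json_dict)

With something like that, recipe = dict(recipe_list[0]) and recipe = recipe_list[0].dict both give the Python-object dictionary, while recipe_list[0].json keeps the JSON-like version.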

These will be breaking changes to the API from version 0.1.5.

TODO:

  • Redesign with OOP (above)
  • use dataclasses (use the dataclasses backport to support Python 3.6, or just leave 3.6 in the dust)
  • Create SSR's own TypeError exception, perhaps SSRTypeError, so that Python's built-in TypeErrors won't be caught unnecessarily. -- Done in Release v0.2.0
  • Remove code duplication in functions load(), loads(), scrape(), and scrape_url(). (pretty minor)
  • Fix sub key date parsing. See edit below for details.

If designed correctly, these changes might make issue #19 a moot point. That might need a __setattr__ method or a specific method like define_default_value to make it happen.
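For example, a define_default_value method on RecipeSchema might be as small as this (the name comes from the sentence above; the exact semantics issue #19 needs are a guess):

    # hypothetical method on RecipeSchema (building on the sketch above)
    def define_default_value(self, key, value):
        # only fill in the default when the recipe doesn't already have the key
        self.dict.setdefault(key, value)
        self.json.setdefault(key, value)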

Edit: There are sub-keys whose dates are not being parsed, e.g. 'video' -> 'uploadDate'.

Here's 0.2.0 code:

In [1]: import scrape_schema_recipe                                             

In [3]: s=scrape_schema_recipe.scrape_url("https://www.midgetmomma.com/olive-garden-at-home-alfredo-sauce-salad-dressing-recipe/", python_objects=True)                      

In [4]: s[0]["video"]["uploadDate"]                            
Out[4]: '2020-04-10T16:41:25.000Z'

Here's proposed 1.0.0 code:

In [1]: import scrape_schema_recipe                                             

In [3]: s=scrape_schema_recipe.scrape_url("https://www.midgetmomma.com/olive-garden-at-home-alfredo-sauce-salad-dressing-recipe/")                      

In [4]: s[0].py["video"]["uploadDate"]                                          
Out[4]: '2020-04-10T16:41:25.000Z'

In both cases, that should return datetime.datetime(2020, 4, 10, 16, 41, 25).
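A sketch of the kind of recursive walk that would also convert dates in sub-keys; the DATE_KEYS set and the fromisoformat handling are assumptions about where the conversion would live, not the current implementation:

from datetime import datetime

DATE_KEYS = {'datePublished', 'dateModified', 'dateCreated', 'uploadDate'}  # assumed set of date-bearing keys

def convert_dates(node):
    # walk nested dicts/lists so sub-keys like 'video' -> 'uploadDate' get parsed too
    if isinstance(node, dict):
        return {k: parse_date(v) if k in DATE_KEYS else convert_dates(v)
                for k, v in node.items()}
    if isinstance(node, list):
        return [convert_dates(item) for item in node]
    return node

def parse_date(value):
    if not isinstance(value, str):
        return value
    try:
        # datetime.fromisoformat (Python 3.7+) doesn't accept a trailing 'Z', so normalize it
        return datetime.fromisoformat(value.replace('Z', '+00:00'))
    except ValueError:
        return value

Note that parsing '2020-04-10T16:41:25.000Z' this way yields a timezone-aware datetime (tzinfo=timezone.utc) rather than the naive one shown above.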

micahcochran added this to the 1.0.0 milestone Aug 3, 2021