Redesign for 1.0.0 #26

Open · 1 of 5 tasks
micahcochran opened this issue Aug 3, 2021 · 0 comments
Milestone: 1.0.0

micahcochran (Owner) commented Aug 3, 2021

Need a RecipeSchema class. It will store the data both as a JSON-like dictionary and as a Python-object dictionary in a single class.

The functions load(), loads(), scrape(), and scrape_url() should return a list of RecipeSchema objects.

RecipeSchema.__init__ will take JSON-style dictionaries (recipe_json) and Python-object dictionaries (recipe_dict). A RecipeSchema class should help code reuse, and I think this is a better design paradigm than the current, entirely functional code. It adds an additional decoding step, but future class methods could likely be added without a major overhaul.

Here are some ideas:

>>> import scrape_schema_recipe

>>> url = 'https://www.foodnetwork.com/recipes/alton-brown/honey-mustard-dressing-recipe-1939031'
>>> recipe_list = scrape_schema_recipe.scrape_url(url)  # the python_objects=True argument is gone

# OLD WAY
>>> recipe = recipe_list[0]

# NEW WAY
>>> recipe = recipe_list[0].dict
# OR
>>> recipe = dict(recipe_list[0])
# OR if you need the JSON-like dictionary version
>>> recipe = recipe_list[0].json

# to get the name of the recipe
>>> recipe['name']
'Honey Mustard Dressing'

I could still be convinced that one syntax might be better than another.

This is a design decision to remove the python_objects parameter from the functions entirely; it is a little more complicated than it should be. The RecipeSchema.dict and/or dict(RecipeSchema) representation(s) should provide that functionality instead.
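Here is a rough sketch of how RecipeSchema could support both the .dict/.json attributes and dict(recipe_list[0]); the _to_python helper and the defaulting behavior are assumptions, not a settled design:

class RecipeSchema:
    """Holds one recipe as a JSON-like dict (self.json) and a Python-object dict (self.dict)."""

    def __init__(self, recipe_json=None, recipe_dict=None):
        # recipe_json: values as they appear in the JSON-LD (dates are ISO 8601 strings)
        # recipe_dict: values converted to Python objects (datetime.datetime, etc.)
        self.json = recipe_json if recipe_json is not None else {}
        self.dict = recipe_dict if recipe_dict is not None else self._to_python(self.json)

    def keys(self):
        # keys() plus __getitem__ is enough for dict(RecipeSchema(...)) to work,
        # which is what replaces python_objects=True
        return self.dict.keys()

    def __getitem__(self, key):
        return self.dict[key]

    @staticmethod
    def _to_python(json_dict):
        # placeholder for the existing string-to-datetime/timedelta conversion code
        return dict(json_dict)

With something like that, recipe = dict(recipe_list[0]) and recipe = recipe_list[0].dict both give the Python-object dictionary, while recipe_list[0].json keeps the JSON-like version.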

These will be breaking changes to the API from version 0.1.5.

TODO:

  • Redesign with OOP (above)
  • use dataclasses (use the dataclasses backport to support Python 3.6, or just leave 3.6 in the dust)
  • Create SSR's own TypeError exception, perhaps SSRTypeError, so that Python's built-in TypeErrors won't be caught unnecessarily. -- Done in Release v0.2.0
  • Remove code duplication in functions load(), loads(), scrape(), and scrape_url(). (pretty minor)
  • Fix sub key date parsing. See edit below for details.

If designed correctly, these changes might make issue #19 a moot point. That might need a __setattr__ method or a specific method like define_default_value to make it happen.
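For example, a define_default_value method on RecipeSchema might be as small as this (the name comes from the sentence above; the exact semantics issue #19 needs are a guess):

    # hypothetical method on RecipeSchema (building on the sketch above)
    def define_default_value(self, key, value):
        # only fill in the default when the recipe doesn't already have the key
        self.dict.setdefault(key, value)
        self.json.setdefault(key, value)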

Edit: There are sub-keys whose dates are not being parsed, e.g. 'video' -> 'uploadDate'.

Here's 0.2.0 code:

In [1]: import scrape_schema_recipe                                             

In [3]: s=scrape_schema_recipe.scrape_url("https://www.midgetmomma.com/olive-garden-at-home-alfredo-sauce-salad-dressing-recipe/", python_objects=True)                      

In [4]: s[0]["video"]["uploadDate"]                            
Out[4]: '2020-04-10T16:41:25.000Z'

Here's proposed 1.0.0 code:

In [1]: import scrape_schema_recipe                                             

In [3]: s=scrape_schema_recipe.scrape_url("https://www.midgetmomma.com/olive-garden-at-home-alfredo-sauce-salad-dressing-recipe/")                      

In [4]: s[0].py["video"]["uploadDate"]                                          
Out[4]: '2020-04-10T16:41:25.000Z'

In both cases, that should return datetime.datetime(2020, 4, 10, 16, 41, 25).
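A sketch of the kind of recursive walk that would also convert dates in sub-keys; the DATE_KEYS set and the fromisoformat handling are assumptions about where the conversion would live, not the current implementation:

from datetime import datetime

DATE_KEYS = {'datePublished', 'dateModified', 'dateCreated', 'uploadDate'}  # assumed set of date-bearing keys

def convert_dates(node):
    # walk nested dicts/lists so sub-keys like 'video' -> 'uploadDate' get parsed too
    if isinstance(node, dict):
        return {k: parse_date(v) if k in DATE_KEYS else convert_dates(v)
                for k, v in node.items()}
    if isinstance(node, list):
        return [convert_dates(item) for item in node]
    return node

def parse_date(value):
    if not isinstance(value, str):
        return value
    try:
        # datetime.fromisoformat (Python 3.7+) doesn't accept a trailing 'Z', so normalize it
        return datetime.fromisoformat(value.replace('Z', '+00:00'))
    except ValueError:
        return value

Note that parsing '2020-04-10T16:41:25.000Z' this way yields a timezone-aware datetime (tzinfo=timezone.utc) rather than the naive one shown above.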

micahcochran added this to the 1.0.0 milestone Aug 3, 2021