[New Hack] 067-FabricLakehouse #714
Added numbered list to the Success Criteria.
I have done an initial review and discussed this with Jordan.
First off, we really like the topic and what you did with the shipwreck data set to make it a fun experience. One concern is that it should incorporate more lakehouse architecture best practices, for example what should sit in the bronze, silver, and gold layers. See the following articles for reference:
• Building the Lakehouse - Implementing a Data Lake Strategy with Azure Synapse - Microsoft Community Hub
• What is a Medallion Architecture? (databricks.com)
• Get started securing your data in OneLake - Microsoft Fabric | Microsoft Learn
Think about how you can expand this topic to incorporate some of these ideas, and more shipwreck / pirate puns too. Medallion architecture and buried treasure? 😁
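To make the bronze / silver / gold question concrete, here is a minimal sketch in plain Python. It is not taken from the hack or from Fabric: the record fields (`name`, `year`, `depth_m`), the sample values, and the function names are all made up for illustration, and a real lakehouse would use Delta tables rather than in-memory lists.

```python
from collections import defaultdict

# Hypothetical raw rows as they might arrive from a source file.
# All names and values are invented for illustration only.
RAW_ROWS = [
    {"name": " Sea Witch ", "year": "1721", "depth_m": "34.5"},
    {"name": "Sea Witch", "year": "1721", "depth_m": "34.5"},     # duplicate once trimmed
    {"name": "Gilded Gull", "year": "unknown", "depth_m": "12"},  # unparseable year
    {"name": "Black Tide", "year": "1798", "depth_m": "60"},
]


def to_bronze(rows):
    """Bronze: keep the data exactly as received (copied, not cleaned)."""
    return [dict(r) for r in rows]


def to_silver(bronze):
    """Silver: trim strings, cast types, drop invalid rows, de-duplicate."""
    seen, out = set(), []
    for r in bronze:
        try:
            rec = (r["name"].strip(), int(r["year"]), float(r["depth_m"]))
        except ValueError:
            continue  # skip rows that cannot be typed
        if rec not in seen:
            seen.add(rec)
            out.append({"name": rec[0], "year": rec[1], "depth_m": rec[2]})
    return out


def to_gold(silver):
    """Gold: aggregate for reporting, here wreck counts per century."""
    counts = defaultdict(int)
    for r in silver:
        counts[r["year"] // 100 + 1] += 1
    return dict(counts)


bronze = to_bronze(RAW_ROWS)
silver = to_silver(bronze)
gold = to_gold(silver)
```

The point of the split is that bronze stays untouched so it can be reprocessed, silver holds the cleaned and typed records, and gold holds only what a report needs.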
Also, here are some other areas to address.
- The coach/solution steps need to be more thorough. They should be documented to the point that a coach can walk through them step by step and achieve the success criteria for each challenge, or come pretty close. A good example is Jordan’s hack at https://github.com/microsoft/WhatTheHack/tree/master/032-MLOpsFromScratch/Coach. Part of our job as coaches is to understand what students should and should not do. For example, I looked at your data flow for the forecast data and thought “I could do this in a notebook.” After a few hours of weird errors while trying Maven examples that work in Databricks, I got a reply on one of the Fabric channels the next morning that the Maven library is not supported yet. This is a great example of “do not use a notebook for this, and here is why.”
- For the forecast data, is a notebook that connects anonymously to an FTP server to download XML data a relevant example of what we would see in the field in 2023? Is there a more modern data set that would incorporate more mainstream concepts we typically see in a notebook, pipeline, or data flow? Think of this as a good springboard to get them used to what they would see in the field. Something like the open data sets at Datasets in Azure Open Datasets - Azure Open Datasets | Microsoft Learn, but for weather data.
All in all, it is a good start. Let me know if you want to do a review call.
…tTheHack into xxx-FabricLakehouse
… up Student guides and added more pirate puns.
All markdown links were evaluated and are functional.
Spelling and grammatical errors corrected.
No description provided.