[New Hack] 067-FabricLakehouse #714

Open
wants to merge 55 commits into
base: master
from

Conversation

liesel-h

@liesel-h liesel-h commented Sep 7, 2023

No description provided.

@liesel-h liesel-h requested a review from a team as a code owner September 7, 2023 02:52
@jrzyshr jrzyshr self-assigned this Sep 7, 2023
@jrzyshr jrzyshr requested a review from jcbendernh September 7, 2023 03:34
Contributor

Added numbered list to the Success Criteria.

Contributor

I have done an initial review and discussed this with Jordan.

First off, we really like the topic and what you did with the shipwreck data set to make it a fun experience. One concern is that the hack should incorporate more best practices for lakehouse architectures, for example, what should sit in the bronze, silver, and gold layers. See the following articles for reference:
Building the Lakehouse - Implementing a Data Lake Strategy with Azure Synapse - Microsoft Community Hub
What is a Medallion Architecture? (databricks.com)
Get started securing your data in OneLake - Microsoft Fabric | Microsoft Learn

Thus, think about how you can expand this topic to incorporate some of these ideas, and more shipwreck / pirate puns too. Medallion architecture and buried treasure? 😁
Also, here are some other areas to address.

  1. The coach/solution steps need to be more thorough. They should be documented to the point that a coach can walk through the items step by step and achieve the success criteria for each challenge, or come pretty close. A good example of this is Jordan’s hack at https://github.com/microsoft/WhatTheHack/tree/master/032-MLOpsFromScratch/Coach. Part of our job as coaches is to understand what they should be doing and what they should not do. For example, I looked at your data flow for the forecast data and thought, “I could do this in a notebook.” After a few hours of fighting weird errors and trying Maven examples that work in Databricks, I got a reply on one of the Fabric channels the next morning saying that the Maven library is not supported yet. This is a great example of where the coach guide should say: do not use a notebook for this, and here is why.
  2. For the forecast data, is a notebook that connects anonymously to an FTP server to download XML data a relevant example of what we would see in the field in 2023? Is there a more modern data set that would incorporate the mainstream concepts we typically see in a notebook, pipeline, or data flow? Think of this as a good springboard to get them used to what they would see in the field. Something like the open data sets at Datasets in Azure Open Datasets - Azure Open Datasets | Microsoft Learn, but for weather data.

All in all, it is a good start. Let me know if you want to do a review call.

@jrzyshr jrzyshr requested a review from kriation January 29, 2024 20:09

@kriation kriation left a comment

All markdown links were evaluated and are functional.
Spelling and grammatical errors corrected.

Labels
None yet
Projects
Status: In Review - Awaiting WTH VTeam Action
5 participants