Hey,
I took samples from your code in order to test Dataflow. For running the pipeline on the full dataset, your examples notebook says: "Note, you can change the first argument to "None" to process the full dataset." This did not work for me. I had to change the create_query function. Instead of
AND MOD(ABS(FARM_FINGERPRINT(CAST(pickup_datetime AS STRING))), EVERY_N) = 1
I had to put
AND MOD(ABS(FARM_FINGERPRINT(CAST(pickup_datetime AS STRING))), EVERY_N) = 0
and then pass EVERY_N=1 in the function call:
preprocess("1", "DataflowRunner")
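To illustrate why the comparison value matters when EVERY_N is 1, here is a minimal Python sketch. It does not call BigQuery: `stand_in_fingerprint` is a hypothetical replacement for FARM_FINGERPRINT (any non-negative integer hash behaves the same way here), and the sample timestamps are made up. Any integer modulo 1 is 0, so a filter of the form `MOD(..., 1) = 1` can never match a row, while `MOD(..., 1) = 0` matches every row.

```python
import hashlib


def stand_in_fingerprint(value: str) -> int:
    """Hypothetical stand-in for FARM_FINGERPRINT: a non-negative integer hash."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def count_selected(keys, every_n, remainder):
    """Count the keys whose hash modulo every_n equals remainder."""
    return sum(1 for k in keys if stand_in_fingerprint(k) % every_n == remainder)


# Made-up pickup timestamps, one per minute, purely for illustration.
keys = [f"2015-01-01 00:{m:02d}:00" for m in range(60)]

rows_with_eq_1 = count_selected(keys, every_n=1, remainder=1)  # "... = 1" filter
rows_with_eq_0 = count_selected(keys, every_n=1, remainder=0)  # "... = 0" filter

print(rows_with_eq_1, rows_with_eq_0)  # prints "0 60"
```

With a larger EVERY_N both remainders select roughly one row in EVERY_N, so the difference only shows up at EVERY_N=1.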
My notebook is available here.
Original notebook "Data Preprocessing for Machine Learning": /courses/machine_learning/deepdive/04_advanced_preprocessing/a_dataflow.ipynb
Best, Henry