This repository has been archived by the owner on Nov 12, 2024. It is now read-only.
As data sources wind down (& perhaps shut down?)...
"I haven't seen any winding down yet, but yes, this will be an issue in the future. In our data, we have a hack to work around data sources that stop updating: we flag them with "skip", which prevents unit tests from failing and forces the pipeline to fetch the latest valid data. But if we make any other configuration changes, the data would be lost.
A good solution to this would be to add a "snapshot ID" field. If it is populated, we wouldn't even try to go to the original data source; instead we would fetch the intermediate file from the last successful processing of that data source (which we have saved). Unfortunately, the intermediate files are not externally available, so that makes the fetch step of the pipelines not reproducible. I don't think there's a way around that: the original data source is gone, so of course you can't reproduce our work."
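To make the "snapshot ID" idea concrete, here is a minimal sketch of how the fetch step might branch on such a field. It is not this repository's actual code: `SourceConfig`, `fetch_source`, the `snapshot_dir` layout, and the `.csv` file naming are all hypothetical names chosen for illustration, and the caller supplies the `download` function.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Optional


@dataclass
class SourceConfig:
    """Hypothetical per-source configuration entry."""
    name: str
    url: str
    # Assumption: when set, this names a saved intermediate file from the
    # last successful processing of the source.
    snapshot_id: Optional[str] = None


def fetch_source(
    config: SourceConfig,
    snapshot_dir: str,
    download: Callable[[str], bytes],
) -> bytes:
    """Return raw bytes for a source, preferring a pinned snapshot.

    If snapshot_id is populated, the original URL is never contacted;
    the saved intermediate file is read instead.
    """
    if config.snapshot_id:
        # Hypothetical layout: one file per snapshot ID.
        snapshot_path = Path(snapshot_dir) / f"{config.snapshot_id}.csv"
        return snapshot_path.read_bytes()
    # No snapshot pinned: fetch from the live data source as usual.
    return download(config.url)
```

Under this sketch, a retired source keeps working across unrelated configuration changes because its data comes from the saved file, not from a cached result of the last live fetch. It does not address the reproducibility concern raised above: anyone without access to `snapshot_dir` still cannot rerun the fetch step.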