Describe the bug
Using Great Expectations version 1.2.4, I created a custom expectation using SQL and added it to an expectation suite. After the Python session ended (my Databricks cluster was turned off because of inactivity), I turned the cluster back on and re-initialized the existing data context that holds the expectation suite. When I tried to access the suite to see its contents, I found that the custom expectation I created earlier was no longer present, and loading the suite throws an error.
To Reproduce
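Create the data source, asset, batch definition and expectation suite:
import great_expectations as gx

dataframe = spark.sql("SELECT * FROM xxx")
context = gx.get_context(project_root_dir="/dbfs/xxx")
data_source_name = "my_data_source"
data_asset_name = "my_dataframe_data_asset"
batch_definition_name = "my_batch_definition"
data_source = context.data_sources.add_or_update_spark(name=data_source_name)
try:
    data_asset = data_source.add_dataframe_asset(name=data_asset_name)
except Exception as e:
    data_asset = context.data_sources.get(data_source_name).get_asset(data_asset_name)
try:
    batch_definition = data_asset.add_batch_definition_whole_dataframe(batch_definition_name)
except Exception as e:
    batch_definition = data_asset.get_batch_definition(batch_definition_name)
suite_name = "d365_enriched_generaljournaltransaction_expectation_suite"
try:
    suite = gx.ExpectationSuite(name=suite_name)
    suite = context.suites.add(suite)
except Exception as e:
    # The suite already exists, so load it instead
    suite = context.suites.get(name=suite_name)

# Now let's create a custom expectation using SQL
class ExpectValidLineItemSum(gx.expectations.UnexpectedRowsExpectation):
    unexpected_rows_query: str = ("""SELECT CrayonCompanyIdRef, SUM(accountingcurrencyamount) AS total_amount
        FROM {batch}
        WHERE accountingdate >= MAKE_DATE(YEAR(CURRENT_DATE) - 1, 1, 1)
        AND accountingdate <= MAKE_DATE(YEAR(CURRENT_DATE) - 1, 12, 31)
        GROUP BY CrayonCompanyIdRef
        HAVING SUM(accountingcurrencyamount) NOT BETWEEN -1 AND 1""")
    description: str = "Line items should have a valid sum between -1 and +1"

expectation = ExpectValidLineItemSum()
try:
    suite.add_expectation(expectation)
except Exception as e:
    print("Expectation already exists in the suite")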
At this point the expectation has been added successfully to the expectation suite.
End the current Python session (in my case, restarting the Databricks cluster does that).
Run the following lines of code to get the expectation suite saved in the previous Python session:
import great_expectations as gx

# Re-initialize the existing data context and try to load the saved suite
context = gx.get_context(project_root_dir="/dbfs/xxx")
suite_name = "d365_enriched_generaljournaltransaction_expectation_suite"
try:
    suite = gx.ExpectationSuite(name=suite_name)
    suite = context.suites.add(suite)
except Exception as e:
    suite = context.suites.get(name=suite_name)
This is the error it throws
ERROR:great_expectations.core.expectation_suite:Could not add expectation; provided configuration is not valid: Could not add expectation; provided configuration is not valid: expect_valid_line_item_sum not found
Traceback (most recent call last):
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.11/site-packages/great_expectations/core/expectation_suite.py", line 639, in _build_expectation
    expectation = expectation_configuration.to_domain_obj()
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.11/site-packages/great_expectations/expectations/expectation_configuration.py", line 447, in to_domain_obj
    expectation_impl = self._get_expectation_impl()
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.11/site-packages/great_expectations/expectations/expectation_configuration.py", line 444, in _get_expectation_impl
    return get_expectation_impl(self.type)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.11/site-packages/great_expectations/expectations/registry.py", line 396, in get_expectation_impl
    raise gx_exceptions.ExpectationNotFoundError(f"{expectation_name} not found")  # noqa: TRY003
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
great_expectations.exceptions.exceptions.ExpectationNotFoundError: expect_valid_line_item_sum not found

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.11/site-packages/great_expectations/core/expectation_suite.py", line 94, in __init__
    self.expectations.append(self._process_expectation(exp))
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.11/site-packages/great_expectations/core/expectation_suite.py", line 216, in _process_expectation
    return self._build_expectation(expectation_configuration=expectation_like)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.11/site-packages/great_expectations/core/expectation_suite.py", line 646, in _build_expectation
    raise gx_exceptions.InvalidExpectationConfigurationError(  # noqa: TRY003
great_expectations.exceptions.exceptions.InvalidExpectationConfigurationError: Could not add expectation; provided configuration is not valid: expect_valid_line_item_sum not found

Expected behavior
The expectation I saved in the expectation suite should be retrievable after I restart the Python session, because the suite is meant to persist it as a JSON file inside the data context.
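For context, here is a minimal sketch of why the JSON file alone may not be enough. It assumes, as the `get_expectation_impl(self.type)` call in the traceback suggests, that the suite file records each expectation by a snake_case type name plus its settings, not the Python class itself. The names `camel_to_snake`, `to_config` and the stand-in class are hypothetical and are not part of Great Expectations.

```python
import json
import re


def camel_to_snake(name: str) -> str:
    """Convert a CamelCase class name to snake_case."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


class ExpectValidLineItemSum:
    """Stand-in for the custom class; NOT a real GX expectation."""

    unexpected_rows_query = "SELECT 1 FROM {batch}"


def to_config(expectation) -> dict:
    # Hypothetical serializer: it records the type name and the settings,
    # but never the Python source of the class.
    return {
        "type": camel_to_snake(type(expectation).__name__),
        "kwargs": {"unexpected_rows_query": expectation.unexpected_rows_query},
    }


# What a persisted suite could look like under this assumption
suite_json = json.dumps(
    {"name": "demo_suite", "expectations": [to_config(ExpectValidLineItemSum())]}
)
stored = json.loads(suite_json)
print(stored["expectations"][0]["type"])
```

Under this assumption the stored type name matches the `expect_valid_line_item_sum` in the error message, and a new session has to get the class definition from somewhere other than the suite file.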
Environment (please complete the following information):
Operating System: Windows
Great Expectations Version: 1.2.4
Data Source: Spark Dataframe
Cloud environment: Databricks
Erua-chijioke changed the title from "Custom expectation created using SQL becomes missing after initializing the datacontext" to "[BUG] Custom expectation created using SQL becomes missing after initializing the datacontext" on Dec 10, 2024
Does the earlier suggestion of importing your custom expectation in the Python session where you load your context help with this issue? We believe importing the expectation in the second Python session will help.
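To illustrate the suggestion, here is a toy model of a name-based registry, written under the assumption that a class only becomes findable once its definition has been executed in the current session. `REGISTRY`, `ToyExpectation` and `build_from_config` are hypothetical stand-ins, not the actual Great Expectations internals.

```python
from typing import Dict

# Hypothetical in-memory registry; it starts empty in every new session
REGISTRY: Dict[str, type] = {}


class ToyExpectation:
    """Toy base class: defining a subclass registers it under its type name."""

    expectation_type: str = ""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        REGISTRY[cls.expectation_type] = cls


def build_from_config(config: dict) -> ToyExpectation:
    """Rebuild an expectation object from a stored config by looking up its type."""
    try:
        impl = REGISTRY[config["type"]]
    except KeyError:
        raise LookupError(f"{config['type']} not found") from None
    return impl()


# What the saved suite refers to
stored_config = {"type": "expect_valid_line_item_sum"}

# Fresh session: the class has not been defined or imported yet
first_attempt = None
try:
    build_from_config(stored_config)
except LookupError as err:
    first_attempt = str(err)


# Defining (or importing) the class registers it again
class ExpectValidLineItemSum(ToyExpectation):
    expectation_type = "expect_valid_line_item_sum"


second_attempt = build_from_config(stored_config)
print(first_attempt)
print(type(second_attempt).__name__)
```

In Great Expectations terms, this would mean keeping the `ExpectValidLineItemSum` class in a module or notebook cell that runs in every session, before `context.suites.get(name=suite_name)` is called.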