
Commit

Merge branch 'main' into ui/feature/deterministic-uuid-demo-projects
DimaAmega committed Jul 5, 2024
2 parents 521fac6 + 9f99354 commit ad4c909
Showing 54 changed files with 396 additions and 150 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/ui.yml
@@ -61,7 +61,7 @@ jobs:
    name: UI type-check
    runs-on: ubuntu-22.04
    needs: changed_files
    if: ${{ github.event.pull_request.draft == false && needs.changed_files.outputs.ui_any_modified == 'true' && needs.changed_files.outputs.evidently_python_any_modified == 'true' }}
    if: ${{ github.event.pull_request.draft == false && (needs.changed_files.outputs.ui_any_modified == 'true' || needs.changed_files.outputs.evidently_python_any_modified == 'true') }}

    steps:
      - name: ⬇️ Checkout repo
71 changes: 36 additions & 35 deletions docs/book/get-started/quickstart-llm.md
@@ -8,7 +8,7 @@ You can run this example in Colab or any Python environment.

Install the Evidently Python library.

```
```python
!pip install evidently[llm]
```

@@ -19,93 +19,94 @@ import pandas as pd
from sklearn import datasets
from evidently.report import Report
from evidently.metric_preset import TextEvals

import nltk
nltk.download('words')
nltk.download('wordnet')
nltk.download('omw-1.4')
nltk.download('vader_lexicon')
from evidently.descriptors import *
```

**Optional**. Import components to send evaluation results to Evidently Cloud:
**Optional**. Import the components to send evaluation results to Evidently Cloud:

```python
from evidently.ui.workspace.cloud import CloudWorkspace
```

# 2. Import the toy dataset

Import a toy dataset with e-commerce reviews. It contains a column with "Review_Text" that you'll analyze.
Import a toy dataset with e-commerce reviews. It contains a "Review_Text" column. You will take the first 100 rows to analyze.

```python
reviews_data = datasets.fetch_openml(name='Womens-E-Commerce-Clothing-Reviews', version=2, as_frame='auto')
reviews_data = datasets.fetch_openml(
    name='Womens-E-Commerce-Clothing-Reviews',
    version=2, as_frame='auto')
reviews = reviews_data.frame[:100]
```

# 3. Run the evals
# 3. Run your first eval

Run an evaluation Preset to check basic text descriptive text properties:
* text sentiment (scale -1 to 1)
* text length (number of symbols)
* number of sentences in a text
* percentage of out-of-vocabulary words (scale 0 to 100)
* percentage of non-letter characters (scale 0 to 100)
Run a few basic evaluations for all texts in the "Review_Text" column:
* text sentiment (measured on a scale from -1 for negative to 1 for positive)
* text length (returns an absolute number of symbols)

```python
text_evals_report = Report(metrics=[
    TextEvals(column_name="Review_Text")
    ]
)
    TextEvals(column_name="Review_Text", descriptors=[
        Sentiment(),
        TextLength(),
        ]
    ),
])

text_evals_report.run(reference_data=None, current_data=reviews)
```

There are more evals to choose from. You can also create custom ones, including LLM-as-a-judge.
There are 20+ built-in evals to choose from. You can also create custom ones, including LLM-as-a-judge. We call the result of each such evaluation a `descriptor`.
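
For instance, here is a sketch of the same Report with a few extra built-in descriptors. The exact names below (`OOV()`, `NonLetterCharacterPercentage()`, `IncludesWords()`) are assumptions about this release's descriptor set — check the descriptor reference for what is available.

```python
# A sketch only: extends the quickstart Report with extra built-in descriptors.
# OOV, NonLetterCharacterPercentage and IncludesWords are assumed names from
# evidently.descriptors — verify them against the descriptor reference.
extended_report = Report(metrics=[
    TextEvals(column_name="Review_Text", descriptors=[
        Sentiment(),
        TextLength(),
        OOV(),  # share of out-of-vocabulary words
        NonLetterCharacterPercentage(),  # share of non-letter characters
        IncludesWords(words_list=["return", "refund"]),  # flag reviews that mention returns
    ]),
])

extended_report.run(reference_data=None, current_data=reviews)
```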

View a Report in Python:

```python
text_evals_report
```

You will see a summary distribution of results for each evaluation.
You will see the summary results: the distribution of length and sentiment for all evaluated texts.
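
If you want to keep or post-process the results, the same Report object can also be exported. A minimal sketch using the standard export methods (assuming they are unchanged in this release):

```python
# Optional ways to work with the same Report outside the notebook output.
text_evals_report.save_html("text_evals_report.html")  # standalone HTML file
report_dict = text_evals_report.as_dict()  # results as a Python dictionary
report_json = text_evals_report.json()  # results as a JSON string
```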

# 4. Send results to Evidently Cloud

To record and monitor evaluations over time, send them to Evidently Cloud. You'll need an API key.
* Sign up for an [Evidently Cloud account](https://app.evidently.cloud/signup), and create your Organization.
* Click on the **Teams** icon on the left menu. Create a Team - for example, "Personal". Copy and save the team ID. ([Team page](https://app.evidently.cloud/teams)).
* Click the **Key** icon in the left menu to go. Generate and save the token. ([Token page](https://app.evidently.cloud/token)).

Connect to Evidently Cloud using your token.
To record and monitor evaluations over time, send them to Evidently Cloud.
* **Sign up**. Create an [Evidently Cloud account](https://app.evidently.cloud/signup) and your Organization.
* **Add a Team**. Click **Teams** in the left menu. Create a Team, copy and save the Team ID. ([Team page](https://app.evidently.cloud/teams)).
* **Get your API token**. Click the **Key** icon in the left menu. Generate and save the token. ([Token page](https://app.evidently.cloud/token)).
* **Connect to Evidently Cloud**. Pass your API key to connect from your Python environment.

```python
ws = CloudWorkspace(token="YOUR_TOKEN_HERE", url="https://app.evidently.cloud")
ws = CloudWorkspace(token="YOUR_API_TOKEN", url="https://app.evidently.cloud")
```

Create a Project inside your Team. Pass the `team_id`:
* **Create a Project**. Create a new Project inside your Team, adding your title and description:

```python
project = ws.create_project("My test project", team_id="YOUR_TEAM_ID")
project.description = "My project description"
project.save()
```

Send the Report to the Cloud:
* **Upload the Report to the Project**. Send the evaluation results:

```python
ws.add_report(project.id, text_evals_report)
```

Go to the Evidently Cloud. Open your Project and head to the "Reports" in the left menu. ([Cloud home](https://app.evidently.cloud/)).
* **View the Report**. Go to Evidently Cloud. Open your Project and head to "Reports" in the left menu. ([Cloud home](https://app.evidently.cloud/)).

![](../.gitbook/assets/cloud/toy_text_report_preview.gif)

In the future, you can log ongoing evaluation results to build monitoring panels and send alerts.
# 5. Get a dashboard

Go to the "Dashboard" tab and enter the "Edit" mode. Add a new tab, and select the "Descriptors" template.

You'll see a set of panels that show Sentiment and Text Length with a single data point. As you log ongoing evaluation results, you can track trends and set up alerts.

![](../.gitbook/assets/cloud/add_descriptor_tab.gif)
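
You can also define panels in code instead of the UI, using the dashboard API documented in `design_dashboard_api.md` (diffed further below). A rough sketch — the `metric_id` and `field_path` values here are illustrative assumptions, not the exact keys for descriptor panels:

```python
# A sketch of adding a dashboard panel from code. Adjust metric_id and
# field_path (both assumed values) to the descriptor or metric you want to plot.
from evidently.ui.dashboards import DashboardPanelPlot, PanelValue, PlotType, ReportFilter

project.dashboard.add_panel(
    DashboardPanelPlot(
        title="Review sentiment over time",
        filter=ReportFilter(metadata_values={}, tag_values=[]),
        values=[
            PanelValue(
                metric_id="ColumnSummaryMetric",  # assumed metric id
                field_path="current_characteristics.mean",  # assumed field path
                legend="mean sentiment",
            ),
        ],
        plot_type=PlotType.LINE,
    )
)
project.save()
```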

# Want to see more?

Check out a more in-depth tutorial to learn key workflows. It covers using LLM-as-a-judge, running conditional test suites, monitoring results over time and more.
Check out a more in-depth tutorial to learn key workflows. It covers using LLM-as-a-judge, running conditional test suites, monitoring results over time, and more.

{% content-ref url="tutorial-llm.md" %}
[Evidently LLM Tutorial](tutorial-llm.md).
4 changes: 2 additions & 2 deletions docs/book/monitoring/design_dashboard_api.md
@@ -434,7 +434,7 @@ project.dashboard.add_panel(

**Aggregated by Status**. To show the total number of failed Tests (status filter), with daily level aggregation.

```
```python
project.dashboard.add_panel(
    DashboardPanelTestSuite(
        title="All tests: aggregated",
@@ -452,7 +452,7 @@ project.dashboard.add_panel(

**Filtered by Test ID**. To show all results for a specified list of Tests (on constant columns, missing values, empty rows) with daily-level aggregation.

```
```python
project.dashboard.add_panel(
    DashboardPanelTestSuite(
        title="Data quality tests",
2 changes: 1 addition & 1 deletion src/evidently/_version.py
@@ -1,5 +1,5 @@
#!/usr/bin/env python
# coding: utf-8

version_info = (0, 4, 29)
version_info = (0, 4, 30)
__version__ = ".".join(map(str, version_info))
2 changes: 1 addition & 1 deletion src/evidently/base_metric.py
@@ -283,7 +283,7 @@ def required_features(self, data_definition: DataDefinition) -> List["GeneratedF
        for field, value in sorted(self.__dict__.items(), key=lambda x: x[0]):
            if field in ["context"]:
                continue
            if issubclass(type(value), ColumnName) and value.feature_class is not None:
            if isinstance(value, ColumnName) and value.feature_class is not None:
                required_features.append(value.feature_class)
        return required_features

12 changes: 8 additions & 4 deletions src/evidently/calculation_engine/engine.py
@@ -1,6 +1,7 @@
import abc
import functools
import logging
from typing import TYPE_CHECKING
from typing import Dict
from typing import Generic
from typing import List
@@ -19,8 +20,11 @@
from evidently.features.generated_features import GeneratedFeature
from evidently.utils.data_preprocessing import DataDefinition

if TYPE_CHECKING:
    from evidently.suite.base_suite import Context

TMetricImplementation = TypeVar("TMetricImplementation", bound=MetricImplementation)
TInputData = TypeVar("TInputData")
TInputData = TypeVar("TInputData", bound=GenericInputData)


class Engine(Generic[TMetricImplementation, TInputData]):
@@ -34,10 +38,10 @@ def set_metrics(self, metrics):
    def set_tests(self, tests):
        self.tests = tests

    def execute_metrics(self, context, data: GenericInputData):
    def execute_metrics(self, context: "Context", data: GenericInputData):
        calculations: Dict[Metric, Union[ErrorResult, MetricResult]] = {}
        converted_data = self.convert_input_data(data)
        context.features = self.generate_additional_features(converted_data)
        context.set_features(self.generate_additional_features(converted_data))
        context.data = converted_data
        for metric, calculation in self.get_metric_execution_iterator():
            if calculation not in calculations:
@@ -65,7 +69,7 @@ def get_data_definition(
        raise NotImplementedError()

    @abc.abstractmethod
    def generate_additional_features(self, data: TInputData):
    def generate_additional_features(self, data: TInputData) -> Optional[Dict[tuple, GeneratedFeature]]:
        raise NotImplementedError

def get_metric_implementation(self, metric):
4 changes: 2 additions & 2 deletions src/evidently/calculation_engine/python_engine.py
@@ -54,10 +54,10 @@ def get_data_definition(
            raise ValueError("PandasEngine works only with pd.DataFrame input data")
        return create_data_definition(reference_data, current_data, column_mapping, categorical_features_cardinality)

    def generate_additional_features(self, data: PythonInputData):
    def generate_additional_features(self, data: PythonInputData) -> Dict[tuple, GeneratedFeature]:
        curr_additional_data = None
        ref_additional_data = None
        features = {}
        features: Dict[tuple, GeneratedFeature] = {}
        for metric, calculation in self.get_metric_execution_iterator():
            try:
                required_features = metric.required_features(data.data_definition)
8 changes: 6 additions & 2 deletions src/evidently/metrics/data_quality/column_category_metric.py
@@ -43,6 +43,7 @@ class Config:
"column_name": {IncludeTags.Parameter},
"counts": {IncludeTags.Extra},
}
smart_union = True

def __init__(self, **data):
"""for backward compatibility"""
@@ -59,7 +60,7 @@ def __init__(self, **data):
        super().__init__(**data)

    column_name: str
    category: Union[int, float, str]
    category: Union[bool, int, float, str]
    current: CategoryStat
    reference: Optional[CategoryStat] = None
    counts: CountOfValues
@@ -76,8 +77,11 @@ def counts_of_values(self) -> Dict[str, pd.DataFrame]:
class ColumnCategoryMetric(Metric[ColumnCategoryMetricResult]):
    """Calculates count and shares of values in the predefined values list"""

    class Config:
        smart_union = True

    column_name: ColumnName
    category: Union[int, float, str]
    category: Union[bool, int, float, str]

    def __init__(
        self, column_name: Union[str, ColumnName], category: Union[int, float, str], options: AnyOptions = None
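
The `smart_union = True` lines added above enable pydantic's smarter Union matching, which is what lets the new `bool` member of `category: Union[bool, int, float, str]` survive validation. A standalone sketch of the coercion this avoids, assuming the pydantic v1 `Config.smart_union` behaviour (illustrative code, not part of the diff):

```python
# Illustration of pydantic v1 Union coercion, not evidently code.
# Without smart_union, a bool input is coerced by the first matching member (int);
# with smart_union (and bool in the Union), the exact type is preserved.
from typing import Union

from pydantic import BaseModel  # pydantic v1 API assumed


class Loose(BaseModel):
    category: Union[int, float, str]  # old annotation, default Union handling


class Smart(BaseModel):
    class Config:
        smart_union = True

    category: Union[bool, int, float, str]  # new annotation


print(Loose(category=True).category)  # 1 -> True silently coerced to int
print(Smart(category=True).category)  # True -> exact type preserved
```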
