Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Enhancement]: call chaining readability rule #286

Open
sbrugman opened this issue Sep 11, 2023 · 1 comment
Open

[Enhancement]: call chaining readability rule #286

sbrugman opened this issue Sep 11, 2023 · 1 comment
Labels
enhancement New feature or request

Comments

@sbrugman
Copy link
Contributor

sbrugman commented Sep 11, 2023

Overview

Many popular python libraries support an API for call chaining, also known as fluent interface. Examples are pandas, pyspark and pytorch.

Using the fluent interface increases readability and has more than once helped in spotting bugs early.
The core pattern of reassigning to the same variable is missing.

def process(spark, file_name: str):
    common_columns = ["col1_renamed", "col2_renamed", "custom_col"]
    df = spark.read.parquet(file_name)
    df = df \
        .withColumnRenamed('col1', 'col1_renamed') \
        .withColumnRenamed('col2', 'col2_renamed')
    df = df \
        .select(common_columns) \
        .withColumn('service_type', F.lit('green'))
    return df
def process(spark, file_name: str):
    common_columns = ["col1_renamed", "col2_renamed", "custom_col"]
    return (
        spark.read.parquet(file_name)
        .withColumnRenamed('col1', 'col1_renamed')
        .withColumnRenamed('col2', 'col2_renamed')
        .select(common_columns)
        .withColumn('service_type', F.lit('green'))
    )

In existing linters and formatters, I have found partial functionality to enforce this pattern. :

Proposal

Rather than creating a stand-alone lint for this, I would like to propose to include a rule to refurb that detects multiple assignments to the same variable that could be chained.

Are you open to adding this rule to the library?

@dosisod
Copy link
Owner

dosisod commented Sep 12, 2023

Thank you @sbrugman for opening this! I think this is a good thing to add. There are probably a bunch of edge cases in how people use (or misuse) fluent API's, meaning it might be hard to detect all cases where chaining could be used instead. I don't think that is much of an issue though, so long as we can detect the trivial/basic cases.

I will go ahead and review the PR you opened sometime tomorrow!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants