formats `assert f(a,) == (b)` and `assert f(a,) == b` very differently #15842

rdaysky · 2025-01-31T01:43:56Z

Description

ruff format transforms

assert f(a,) == (b)
assert f(a,) == b

into

assert f(
    a,
) == (b)
assert (
    f(
        a,
    )
    == b
)

which doesn’t make too much sense.

The text was updated successfully, but these errors were encountered:

dhruvmanila · 2025-01-31T04:11:41Z

Thanks for the report. Can you say why it doesn't make sense?

@MichaReiser will be able to provide more context on why this is the case but he's currently on leave and will be able to reply next week.

Regardless, it matches what Black does https://black.vercel.app/?version=main&state=_Td6WFoAAATm1rRGAgAhARYAAAB0L-Wj4ACpAGFdAD2IimZxl1N_WlPTQ06-Nur1lMjjikbZBUWzXuAQNKhFphrTQvWGzojFvt24KBUXaI3nTaBMACVKkaNvMxbDMiDTeDauQNv4yFlbtT0lJmLmdtWcjJOxsFmoi2ajLXi1CwAAAAAAM4BGDhyTg2EAAX2qAQAAAJAqwvaxxGf7AgAAAAAEWVo%3D

rdaysky · 2025-02-03T00:54:21Z

It doesn’t make sense for two reasons:

Why should a short expression and the same expression but parenthesized behave differently?
Why should outer parentheses appear when the expression can be satisfactorily formatted without them?

dhruvmanila · 2025-02-03T03:42:55Z

I think one part of the reason is the trailing comma that's present in the function call arguments. That indicates the formatter to break the expression, even though it might short and under the configured line length, into multiple lines and keep them as multiline expression. You can opt-out of this behavior by setting format.skip-magic-trailing-comma to true. Refer to the playground to checkout this behavior.

rdaysky · 2025-02-03T12:23:03Z

But the magic trailing comma is an immensely useful parameter that can’t be turned off without major impact on the rest of the code.

In any case, the magic comma in assert f(a,) == b should insert newlines around a, not around b.

MichaReiser · 2025-02-03T13:19:11Z

What you see here is also not specific to assert; you can observe the same in any clause header (if, while, match ...) playground

Specifically, Ruff supports two layouts for clause headers (which are also used for assert conditions):

A layout that avoids adding parentheses around the condition and instead breaks after existing parentheses ([, {, (). This is what you see for assert f(a,) == (b). However, this layout is only applied if the parentheses are at the very right (or very start
If the parentheses aren't at the very start (which isn't for calls because the leftmost part is an identifier), then Ruff/Black parenthesizes the entire expression instead. You see this for f(a,) = b because it neither starts nor ends with parentheses.

You can also see this if you flip the operands:

assert f(a,) == b
assert b == f(a, )

The code for this can be found here:

ruff/crates/ruff_python_formatter/src/expression/mod.rs

Lines 527 to 610 in f3b918d

    
           pub(crate) fn can_omit_optional_parentheses(expr: &Expr, context: &PyFormatContext) -> bool { 
        
               let mut visitor = CanOmitOptionalParenthesesVisitor::new(context); 
        
               visitor.visit_subexpression(expr); 
        
               if !visitor.any_parenthesized_expressions { 
        
                   // Only use the more complex IR when there is any expression that we can possibly split by 
        
                   false 
        
               } else if visitor.max_precedence_count > 1 { 
        
                   false 
        
               } else if visitor.max_precedence == OperatorPrecedence::None { 
        
                   // Micha: This seems to apply for lambda expressions where the body ends in a subscript. 
        
                   // Subscripts are excluded by default because breaking them looks odd, but it seems to be fine for lambda expression. 
        
                   // 
        
                   // ```python 
        
                   // mapper = lambda x: dict_with_default[ 
        
                   //  np.nan if isinstance(x, float) and np.isnan(x) else x 
        
                   // ] 
        
                   // ``` 
        
                   // 
        
                   // to prevent that it gets formatted as: 
        
                   // 
        
                   // ```python 
        
                   // mapper = ( 
        
                   //      lambda x: dict_with_default[ 
        
                   //          np.nan if isinstance(x, float) and np.isnan(x) else x 
        
                   //      ] 
        
                   // ) 
        
                   // ``` 
        
                   // I think we should remove this check in the future and instead parenthesize the body of the lambda expression: 
        
                   // 
        
                   // ```python 
        
                   // mapper = lambda x: ( 
        
                   //      dict_with_default[ 
        
                   //          np.nan if isinstance(x, float) and np.isnan(x) else x 
        
                   //     ] 
        
                   // ) 
        
                   // ``` 
        
                   // 
        
                   // Another case are method chains: 
        
                   // ```python 
        
                   // xxxxxxxx.some_kind_of_method( 
        
                   //     some_argument=[ 
        
                   //         "first", 
        
                   //         "second", 
        
                   //         "third", 
        
                   //     ] 
        
                   // ).another_method(a) 
        
                   // ``` 
        
                   true 
        
               } else if visitor.max_precedence == OperatorPrecedence::Attribute { 
        
                   // A single method call inside a named expression (`:=`) or as the body of a lambda function: 
        
                   // ```python 
        
                   // kwargs["open_with"] = lambda path, _: fsspec.open( 
        
                   //      path, "wb", **(storage_options or {}) 
        
                   // ).open() 
        
                   // 
        
                   // if ret := subprocess.run( 
        
                   //      ["git", "rev-parse", "--short", "HEAD"], 
        
                   //      cwd=package_dir, 
        
                   //      capture_output=True, 
        
                   //      encoding="ascii", 
        
                   //      errors="surrogateescape", 
        
                   // ).stdout: 
        
                   // ``` 
        
                   true 
        
               } else { 
        
                   fn is_parenthesized(expr: &Expr, context: &PyFormatContext) -> bool { 
        
                       // Don't break subscripts except in parenthesized context. It looks weird. 
        
                       !expr.is_subscript_expr() 
        
                           && has_parentheses(expr, context).is_some_and(OwnParentheses::is_non_empty) 
        
                   } 
        
                   // Only use the layout if the first expression starts with parentheses 
        
                   // or the last expression ends with parentheses of some sort, and 
        
                   // those parentheses are non-empty. 
        
                   visitor 
        
                       .last 
        
                       .is_some_and(|last| is_parenthesized(last, context)) 
        
                       || visitor 
        
                           .first 
        
                           .expression() 
        
                           .is_some_and(|first| is_parenthesized(first, context)) 
        
               } 
        
           }

The layout 1 exists to avoid unnecessary parentheses in common cases and it obviously don't pick the ideal solution in this case. However, this is also a very trivial example where the parentheses around b are unnecessary. Respecting the parentheses is useful because you normally want to keep the parenthesized expressions together

assert f(
    a,
) == (
    aaaaaaaaaaaaaa
    and bbbbbbbbbbbbbb
    and cccccccccccc
    and dddddddddddddddddd
    and eeeeeeeeeeee
)
assert f(
    a,
) == (aaaaaaaaaaaaaa and bbbbbbbbbbbbbb and cccccccccccc and dddddddddddddddddd)

I do agree that there's some room for improvement here so I'll keep the issue open but I don't expect to change this anytime soon because changing this behavior is a fairly significant break with Black's style guide and there are other "bad formattings" that I want to improve first.

rdaysky · 2025-02-04T00:56:15Z

The same issue in Black: psf/black#2581

dhruvmanila added question Asking for support or clarification formatter Related to the formatter labels Jan 31, 2025

MichaReiser added style How should formatted code look and removed question Asking for support or clarification labels Feb 3, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

formats `assert f(a,) == (b)` and `assert f(a,) == b` very differently #15842

formats `assert f(a,) == (b)` and `assert f(a,) == b` very differently #15842

rdaysky commented Jan 31, 2025

dhruvmanila commented Jan 31, 2025

rdaysky commented Feb 3, 2025

dhruvmanila commented Feb 3, 2025

rdaysky commented Feb 3, 2025

MichaReiser commented Feb 3, 2025

rdaysky commented Feb 4, 2025

formats assert f(a,) == (b) and assert f(a,) == b very differently #15842

formats assert f(a,) == (b) and assert f(a,) == b very differently #15842

Comments

rdaysky commented Jan 31, 2025

Description

dhruvmanila commented Jan 31, 2025

rdaysky commented Feb 3, 2025

dhruvmanila commented Feb 3, 2025

rdaysky commented Feb 3, 2025

MichaReiser commented Feb 3, 2025

rdaysky commented Feb 4, 2025

formats `assert f(a,) == (b)` and `assert f(a,) == b` very differently #15842

formats `assert f(a,) == (b)` and `assert f(a,) == b` very differently #15842