Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

formats assert f(a,) == (b) and assert f(a,) == b very differently #15842

Open
rdaysky opened this issue Jan 31, 2025 · 6 comments
Open

formats assert f(a,) == (b) and assert f(a,) == b very differently #15842

rdaysky opened this issue Jan 31, 2025 · 6 comments
Labels
formatter Related to the formatter style How should formatted code look

Comments

@rdaysky
Copy link

rdaysky commented Jan 31, 2025

Description

ruff format transforms

assert f(a,) == (b)
assert f(a,) == b

into

assert f(
    a,
) == (b)
assert (
    f(
        a,
    )
    == b
)

which doesn’t make too much sense.

@dhruvmanila
Copy link
Member

Thanks for the report. Can you say why it doesn't make sense?

@MichaReiser will be able to provide more context on why this is the case but he's currently on leave and will be able to reply next week.

Regardless, it matches what Black does https://black.vercel.app/?version=main&state=_Td6WFoAAATm1rRGAgAhARYAAAB0L-Wj4ACpAGFdAD2IimZxl1N_WlPTQ06-Nur1lMjjikbZBUWzXuAQNKhFphrTQvWGzojFvt24KBUXaI3nTaBMACVKkaNvMxbDMiDTeDauQNv4yFlbtT0lJmLmdtWcjJOxsFmoi2ajLXi1CwAAAAAAM4BGDhyTg2EAAX2qAQAAAJAqwvaxxGf7AgAAAAAEWVo%3D

@dhruvmanila dhruvmanila added question Asking for support or clarification formatter Related to the formatter labels Jan 31, 2025
@rdaysky
Copy link
Author

rdaysky commented Feb 3, 2025

It doesn’t make sense for two reasons:

  1. Why should a short expression and the same expression but parenthesized behave differently?
  2. Why should outer parentheses appear when the expression can be satisfactorily formatted without them?

@dhruvmanila
Copy link
Member

I think one part of the reason is the trailing comma that's present in the function call arguments. That indicates the formatter to break the expression, even though it might short and under the configured line length, into multiple lines and keep them as multiline expression. You can opt-out of this behavior by setting format.skip-magic-trailing-comma to true. Refer to the playground to checkout this behavior.

@rdaysky
Copy link
Author

rdaysky commented Feb 3, 2025

But the magic trailing comma is an immensely useful parameter that can’t be turned off without major impact on the rest of the code.

In any case, the magic comma in assert f(a,) == b should insert newlines around a, not around b.

@MichaReiser MichaReiser added style How should formatted code look and removed question Asking for support or clarification labels Feb 3, 2025
@MichaReiser
Copy link
Member

What you see here is also not specific to assert; you can observe the same in any clause header (if, while, match ...) playground

Specifically, Ruff supports two layouts for clause headers (which are also used for assert conditions):

  1. A layout that avoids adding parentheses around the condition and instead breaks after existing parentheses ([, {, (). This is what you see for assert f(a,) == (b). However, this layout is only applied if the parentheses are at the very right (or very start
  2. If the parentheses aren't at the very start (which isn't for calls because the leftmost part is an identifier), then Ruff/Black parenthesizes the entire expression instead. You see this for f(a,) = b because it neither starts nor ends with parentheses.

You can also see this if you flip the operands:

assert f(a,) == b
assert b == f(a, )

The code for this can be found here:

pub(crate) fn can_omit_optional_parentheses(expr: &Expr, context: &PyFormatContext) -> bool {
let mut visitor = CanOmitOptionalParenthesesVisitor::new(context);
visitor.visit_subexpression(expr);
if !visitor.any_parenthesized_expressions {
// Only use the more complex IR when there is any expression that we can possibly split by
false
} else if visitor.max_precedence_count > 1 {
false
} else if visitor.max_precedence == OperatorPrecedence::None {
// Micha: This seems to apply for lambda expressions where the body ends in a subscript.
// Subscripts are excluded by default because breaking them looks odd, but it seems to be fine for lambda expression.
//
// ```python
// mapper = lambda x: dict_with_default[
// np.nan if isinstance(x, float) and np.isnan(x) else x
// ]
// ```
//
// to prevent that it gets formatted as:
//
// ```python
// mapper = (
// lambda x: dict_with_default[
// np.nan if isinstance(x, float) and np.isnan(x) else x
// ]
// )
// ```
// I think we should remove this check in the future and instead parenthesize the body of the lambda expression:
//
// ```python
// mapper = lambda x: (
// dict_with_default[
// np.nan if isinstance(x, float) and np.isnan(x) else x
// ]
// )
// ```
//
// Another case are method chains:
// ```python
// xxxxxxxx.some_kind_of_method(
// some_argument=[
// "first",
// "second",
// "third",
// ]
// ).another_method(a)
// ```
true
} else if visitor.max_precedence == OperatorPrecedence::Attribute {
// A single method call inside a named expression (`:=`) or as the body of a lambda function:
// ```python
// kwargs["open_with"] = lambda path, _: fsspec.open(
// path, "wb", **(storage_options or {})
// ).open()
//
// if ret := subprocess.run(
// ["git", "rev-parse", "--short", "HEAD"],
// cwd=package_dir,
// capture_output=True,
// encoding="ascii",
// errors="surrogateescape",
// ).stdout:
// ```
true
} else {
fn is_parenthesized(expr: &Expr, context: &PyFormatContext) -> bool {
// Don't break subscripts except in parenthesized context. It looks weird.
!expr.is_subscript_expr()
&& has_parentheses(expr, context).is_some_and(OwnParentheses::is_non_empty)
}
// Only use the layout if the first expression starts with parentheses
// or the last expression ends with parentheses of some sort, and
// those parentheses are non-empty.
visitor
.last
.is_some_and(|last| is_parenthesized(last, context))
|| visitor
.first
.expression()
.is_some_and(|first| is_parenthesized(first, context))
}
}

The layout 1 exists to avoid unnecessary parentheses in common cases and it obviously don't pick the ideal solution in this case. However, this is also a very trivial example where the parentheses around b are unnecessary. Respecting the parentheses is useful because you normally want to keep the parenthesized expressions together

assert f(
    a,
) == (
    aaaaaaaaaaaaaa
    and bbbbbbbbbbbbbb
    and cccccccccccc
    and dddddddddddddddddd
    and eeeeeeeeeeee
)
assert f(
    a,
) == (aaaaaaaaaaaaaa and bbbbbbbbbbbbbb and cccccccccccc and dddddddddddddddddd)

I do agree that there's some room for improvement here so I'll keep the issue open but I don't expect to change this anytime soon because changing this behavior is a fairly significant break with Black's style guide and there are other "bad formattings" that I want to improve first.

@rdaysky
Copy link
Author

rdaysky commented Feb 4, 2025

The same issue in Black: psf/black#2581

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
formatter Related to the formatter style How should formatted code look
Projects
None yet
Development

No branches or pull requests

3 participants