Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[basicprofiles] potential problem in calculations over series of QuantityType #18122

Open
andrewfg opened this issue Jan 18, 2025 · 37 comments · May be fixed by #18144
Open

[basicprofiles] potential problem in calculations over series of QuantityType #18122

andrewfg opened this issue Jan 18, 2025 · 37 comments · May be fixed by #18144

Comments

@andrewfg
Copy link
Contributor

The StateFilterProfile explicitly uses toInvertibleUnit in calculateMedian below, but it is notable that other calculation methods use QuantityType.add() for summation instead, so it is not obvious why this one method requires toInvertibleUnit.

General comment: this calculation use case looks to be similar to the calculation use case that you cite in OH Core, so we should try to follow similar logic in resolving any potential issues.

The good news: for the case of StateFilterProfile there is probably no problem. Reason is that calculation are done over a time series set of values for the SAME Item, and the Item will not flip its unit over that time series, so all calculations will remain self consistent over that series.

The bad news: I am beginning to think that there may be bigger issues in other cases that calculate over sets of QuantityType. I think it is not just a question of toInvertibleUnit, and specifically not something where this PR would have an impact.

Originally posted by @andrewfg in openhab/openhab-core#4561 (comment)

@mherwege
Copy link
Contributor

In this class, toInvertibleUnit really does nothing. Any item with a dimension different from the first item in the list is ignored anyway. And as the inverse unit is of a different dimension, these get ignored. This does lead to unpredictable behavior when there are items with different dimensions in the group (in theory should not be the case, but might be tempting for color temperatures). The first item will dictate which items in the group get considered.

I believe the calculations should always use the unit of the base item and use the items in the group that are in the same dimension. The functions are not consistent. Sometimes this is the case, but not always.

As for invertible units, there are two options:

  1. Replace toInvertibleUnit with toUnit to make it explicit that it only works in the same dimension (would be as is now, but clearer).
  2. Make it work with the invers unit of the base unit as well, by converting to the base u it. I always find implicit calculations with mixes of units and inverse units somewhat dangerous.

I am not sure what the best choice is.

@andrewfg
Copy link
Contributor Author

@mherwege have a look at openhab/openhab-core#4563 .. probably a similar approach applies here.

@andrewfg andrewfg changed the title [basicprofiles] potential problem in calculations over series of QunatityType [basicprofiles] potential problem in calculations over series of QuantityType Jan 19, 2025
@andrewfg
Copy link
Contributor Author

This does lead to unpredictable behavior when there are items with different dimensions in the group

@mherwege I think the StateFilter profile (like all other profiles) applies to a single Item being linked to a single Channel (so your comment about "group" is not relevant). The profile configuration contains one or more conditions that must be fulfilled (see example below) where if the channel receives its raw data in (say) Mirek and the conditions are set in (say) Kelvin then an invertible unit conversion would indeed be required.

// typical expected use case
Number:Temperature ColorTemperature{
  channel="mybinding:mything:mychannel" [ profile="basic-profiles:state-filter", conditions=">= 2000 K", "<= 6500 K" ]
}

Furthermore there is a conceivable edge case where the conditions might even use mixed units..

// mixed unit use case
Number:Temperature ColorTemperature{
  channel="mybinding:mything:mychannel" [ profile="basic-profiles:state-filter", conditions=">= 2000 K", ">= 153 mired" ]
}

@mherwege
Copy link
Contributor

so your comment about "group" is not relevant

Indeed, and I mixed the group calculation issues with the profiles in my responses.

The profile configuration contains one or more conditions that must be fulfilled (see example below) where if the channel receives its raw data in (say) Mirek and the conditions are set in (say) Kelvin then an invertible unit conversion would indeed be required.

So the conversion should be towards the unit of the condition, or the comparison operators may have to be inverted as well.

Furthermore there is a conceivable edge case where the conditions might even use mixed units..

Which means you have to do the conversion towards the unit of the condition for each condition independently.

As I believe this issue would only apply to the State Filter problem, you should be fine using toInvertibleUnit for this.

@andrewfg
Copy link
Contributor Author

Arrgh! There are actually two UoM related issues in this class..

  1. The current (wrong) usage of toInvertibleUnit: This method is only called in the MEDIAN function. The function is not documented. However it seems to apply to a time series of the Item's prior state values. Therefore in that case the units in the time series remain identical across the series so neither toInvertibleUnit, nor even toUnit, need be applied. => So question 1 is whether we should remove this?

  2. The current non usage of toInvertibleUnit in everything else: The class uses Comparable to compare the current state against the filter conditions, and since QuantityType implements Comparable then it does work as documented. With the only edge case being that comparisons involving inverse units would fail. => So question 2 is whether we should add this?

@J-N-K you are marked as the code owner for this (although not the author) so I wonder if you have any thoughts?

@mherwege
Copy link
Contributor

  1. the units in the time series remain identical across the series

I am not sure this is a correct assumption. You could be updating an item with different units over time. I know this would not happen most of the time, but it is possible.

@mherwege
Copy link
Contributor

2. The current non usage of toInvertibleUnit in everything else: The class uses Comparable to compare the current state against the filter conditions, and since QuantityType implements Comparable then it does work as documented. With the only edge case being that comparisons involving inverse units would fail. => So question 2 is whether we should add this?

I am not sure yet how to make this work properly.

The result of the calculation should be in the dimension of what you compare to. So as long as you do not use an invertible unit, it is OK, but as soon as there is an inverse unit in it, the result becomes unpredictable at the moment.

@andrewfg
Copy link
Contributor Author

You could be updating an item with different units over time.

Ok so if it is required to do mixed unit calculations over a time series of Item state values for the MEDIAN function, then we must be consistent in applying the same logic to the other functions..

        enum Function {
            DELTA,
            DELTA_PERCENT,
            AVERAGE,
            AVG,
            MEDIAN,
            STDDEV,
            MIN,
            MAX
        }

@jimtng
Copy link
Contributor

jimtng commented Jan 20, 2025

I believe the median code in StateFilter was more or less copy pasted from the core's median code, hence the toInvertibleUnit there. At the time I wasn't aware of the possibility of mixed units, hence why it didn't make it to the other sum code.

And yes, it needs to be applied at the end too, towards the opposite operand, in the same manner that the delta PR is done. Right now there's a backlog of PRs to StateFilter. Once they've all been merged, I'll start another PR for the invertibleunit case.

@mherwege
Copy link
Contributor

I believe the median code in StateFilter was more or less copy pasted from the core's median code, hence the toInvertibleUnit there. At the time I wasn't aware of the possibility of mixed units, hence why it didn't make it to the other sum code.

@jimtng Yes, I indeed think that was the case. Except that it didn't work for inverse units in the core's group functions either (as they explicitely excluded values from other dimensions, and an inverted unit value is in another dimension). But @andrewfg is working on a fix for that.

Special care has to be taken on how to apply this in these filters. The issue is the result will be different depending on the dimension you convert to. And if you take the first value to define the dimension (e.g. your first value is in Kelvin), everything will calculate with Kelvin. So the next values in mirek will convert to Kelvin. Then take the average and the average will be a Kelvin value. Now take the same series but express the all in the inverse unit, so your first value is in mirek. Now the average will be average mirek values. If you convert this result to Kelvin, it will be very different from the same calculation using Kelvin. This becomes very unpredictable.
With group functions, we can solve this by using the unit of the group base item as the reference unit. I don't know how to solve this with these basic profiles. Ideally you would want to use the unit of the value you compare to, but there you could have different units mixed.

I somewhat think things might have been less complicated if we mandated OH internally always uses mirek (or Kelvin) for color temperature and forced addons to convert to/from if needed. That way we wouldn't be fighting with cross dimension conversions that make calculations hard.

@andrewfg
Copy link
Contributor Author

@mherwege even if you don’t personally like inverted units, the problem remains. In calculations across values with mixed units in a group or a time series, the target unit has to be taken into account. This applies to fahrenheit/celsius/kelvin, feet/metre/furlongs, or whatever. The case of invertible units is only a subset of the problem.

And wishing that Mirek will magically disappear is not a solution. We have to fix it.

@mherwege
Copy link
Contributor

mherwege commented Jan 20, 2025

In calculations across values with mixed units in a group or a time series, the target unit has to be taken into account. This applies to fahrenheit/celsius/kelvin, feet/metre/furlongs, or whatever.

And I disagree on that. If you look at the way the additions are done, it doesn't matter what the unit is for these you use in your calculation. You can start from any unit and convert back to any unit. The result will be the same.

So the case of invertible units is not a subset of the problem. It is an extra problem we created and we have to make sure anything we do to make it work doesn't harm the normal logic.

And for the record, I am not advocating for mirek to go away. I believe it should be defined as a unit in the system. I am questioning the soundness of the decision to allow automatic conversion between Kelvin and mirek and assume you can convert between invertible units in all cases without consequence. I would have wished this conversion was always explicit.

@andrewfg
Copy link
Contributor Author

If you look at the way the additions are done, it doesn't matter what the unit is for these you use in your calculation.

Wrong! As you yourself said, if the target unit is Celsius then 20C plus 20C yields 40C, whereas if the target unit is Kelvin then the result is around 600K.

@andrewfg
Copy link
Contributor Author

I am questioning the soundness of the decision to allow automatic conversion
I would have wished this conversion was always explicit.

For the avoidance of doubt: Do you want to open PRs to revisit the past decisions and do it another way?

Personally I do not want to revisit the past. In my PRs I am working to ensure that all conversions and calculations work in the way that the average Joe User would expect it to, without fully reengineering the system. In other words let us face reality and develop a practical and transparent solution.

@jimtng
Copy link
Contributor

jimtng commented Jan 20, 2025

I'm wondering - in the case of the StateFilter profile, it is not really needed to convert during sum. Why would the binding be sending values and switching its units during its lifetime? If it's reconfigured (manually) to another unit, the profile would get reinitialized, won't it?

It is only needed to compare the result of the calculation result (sum, median, avg) against the other operand (a constant), i.e.

$MEDIAN > X

@andrewfg
Copy link
Contributor Author

it is not really needed to convert during sum

@jimtng that was my own assumption. But others (@jlaur ?) have implied that the Item may be linked to more than one channel so it could in principle receive mixed units. I am not sure the architecture in this case: does the profile apply to each Item / Channel link, or to the overall Item??

@jimtng
Copy link
Contributor

jimtng commented Jan 20, 2025

the Item may be linked to more than one channel

Each link has its own profile

@jimtng
Copy link
Contributor

jimtng commented Jan 20, 2025

It is only needed to compare the result of the calculation result (sum, median, avg) against the other operand (a constant), i.e.

$MEDIAN > X

Come to think of it, no unit conversion is needed here - core already does the inversion if needed, inside QuantityType.compareTo

So apart from removing the unnecessary toInvertibleUnit (which shouldn't do any harm as is), there's nothing to be done in StateFilter?

@andrewfg
Copy link
Contributor Author

Two questions:

  1. If an Item is linked to two channels does that mean there are two separate StateFilter instances?
  2. When the StateFilter stores the time series for incoming values, are these stored "raw" (as received) or are they immediately converted to the unit of the Item? If the former then toUnit/toInvertibleUnit are required in the calculations. If the latter then not (not even in the MEDIAN function).

@jimtng
Copy link
Contributor

jimtng commented Jan 20, 2025

  1. If an Item is linked to two channels does that mean there are two separate StateFilter instances?

Corrrect. Each StateFilter only deals with / is linked to one channel, and only handles data from that one channel. Its job is to pass through the data or block it. Each StateFilter does not see the other channels that are linked to the same item. Only the item gets this data from multiple channels

2. When the StateFilter stores the time series for incoming values, are these stored "raw" (as received) or are they immediately converted to the unit of the Item? If the former then toUnit/toInvertibleUnit are required in the calculations. If the latter then not (not even in the MEDIAN function).

They are stored raw.

So can data from timeseries (from the binding) potentially be in different units within the timeseries?

@andrewfg
Copy link
Contributor Author

andrewfg commented Jan 20, 2025

For reference in the above discussion, below is a hypothetical use case.

Number:Temperature Temperature [ unit="F" ] {
  channel="abinding:athing:achannel" [ profile="basic-profiles:state-filter", conditions="$MEDIAN > 270 K", "$MEDIAN < 290 K" ]
  channel="bbinding:bthing:bchannel" [ profile="basic-profiles:state-filter", conditions="$AVG > 0 C", "$AVG < 20 C" ]
}

@andrewfg
Copy link
Contributor Author

can data from timeseries (from the binding) potentially be in different units within the timeseries?

I think no. => @jlaur ??

They are stored raw.

As discussed above, there are some calculation functions (e.g. sum, median, average, ..) where the result differs depending on the chosen base unit. So the question is whether the chosen base unit is that of a) the raw incoming data (e.g. kelvin in the use case above) or b) the target Item (e.g. fahrenheit in the use case above). I think there are arguments for either option so we just need to make a choice, and ensure it is properly documented..

@mherwege
Copy link
Contributor

mherwege commented Jan 20, 2025

Wrong! As you yourself said, if the target unit is Celsius then 20C plus 20C yields 40C, whereas if the target unit is Kelvin then the result is around 600K.

No, it is not, as the second argument will always be treated as relative when doing the sum. So the result will be 20 C + 20 K = 313.15 K = 40 C.

See openhab/openhab-core#3792

@mherwege
Copy link
Contributor

mherwege commented Jan 20, 2025

So can data from timeseries (from the binding) potentially be in different units within the timeseries?

I actually think it can, but it is highly unlikely. It would mean the binding sends values with different units at different points in time. And while that may very well be the case when using the same dimension, I consider it very unlikely it suddenly starts sending data in another dimension (inverted unit).

An example where it might happen, but without impact (as it is in the same dimension): ping times coming back from a binding sometimes in ms, sometimes in µs.

So if we assume it is always the same unit, we impose a constraint on the binding developer to always provide the same unit, which we don't do right now.

@andrewfg
Copy link
Contributor Author

andrewfg commented Jan 20, 2025

No, it is not, as the second argument will always be treated as relative when doing the sum. So the result will be 20 C + 20 K = 313.15 K = 40 C.

I worry about my sanity. Perhaps I am living on a different planet? So @mherwege perhaps you can indulge me by adding the following JUnit test to #4563 and please kindly tell me your results?

    @ParameterizedTest
    @MethodSource("locales")
    public void testAardvarks(Locale locale) {
        Locale.setDefault(locale);

        Set<Item> items = new LinkedHashSet<>();
        items.add(createNumberItem("TestItem1", Temperature.class, new QuantityType<>("23.54 °C")));
        items.add(createNumberItem("TestItem2", Temperature.class, UnDefType.NULL));
        items.add(createNumberItem("TestItem3", Temperature.class, new QuantityType<>("192.2 °F")));
        items.add(createNumberItem("TestItem4", Temperature.class, UnDefType.UNDEF));
        items.add(createNumberItem("TestItem5", Temperature.class, new QuantityType<>("395.56 K")));

        // base unit is Celsius
        NumberItem baseNumberItem = createNumberItem("BaseItem", Temperature.class, UnDefType.UNDEF);
        GroupFunction function = new QuantityTypeArithmeticGroupFunction.Sum(baseNumberItem);
        State state = function.calculate(items);
        assertEquals(new QuantityType<>("234.95 °C"), state);
        assertNotEquals(new QuantityType<>("1054.40 K"), state);

        // base unit is Kelvin
        baseNumberItem = createKelvinNumberItem("BaseItem", UnDefType.UNDEF);
        function = new QuantityTypeArithmeticGroupFunction.Sum(baseNumberItem);
        state = function.calculate(items);
        assertNotEquals(new QuantityType<>("234.95 °C"), state);
        assertEquals(new QuantityType<>("1054.40 K"), state);
    }

@mherwege
Copy link
Contributor

For the avoidance of doubt: Do you want to open PRs to revisit the past decisions and do it another way?

Personally I do not want to revisit the past. In my PRs I am working to ensure that all conversions and calculations work in the way that the average Joe User would expect it to, without fully reengineering the system. In other words let us face reality and develop a practical and transparent solution.

No, I don't want to revert it, no matter how much I regret it got in in the first place. But I do prefer to not allow calculations with units inverted dimensions when it can lead to unpredictable outcomes. I think there is now a reasonble solution for group functions where I currently don't see an issue in practice (the identified potential issues are all hypothetical in the OH context).
I still think there is more work need in the state profiles to really make them work predictably.

@andrewfg
Copy link
Contributor Author

I do prefer to not allow calculations with units inverted dimensions when it can lead to unpredictable outcomes.

I concur. That is also my goal.

.. and IMHO the set of "predictable outcomes" must include 2000 K <=> 1726.85 C <=> 3140.33 F <=> 500 Mirek :)

@mherwege
Copy link
Contributor

I worry about my sanity. Perhaps I am living on a different planet? So @mherwege perhaps you can indulge me by adding the following JUnit test to #4563 and please kindly tell me your results?

Damn, you are right. It goes wrong for the group functions if they convert to the same unit.

The issue is the conversion to the base unit for all arguments. That should not be done. It should only be done for the first argument.

The QuantityType.add method correctly treats the first argument as absolute and the second as relative: https://github.com/openhab/openhab-core/blob/ce374252fa2c821103da888302f930c1c127f73c/bundles/org.openhab.core/src/main/java/org/openhab/core/library/types/QuantityType.java#L586

But if you first convert everything to Kelvin, the second argument is not the same relative argument anymore.

Which means that for the add group function in the same dimension, no conversion to base unit should happen. But it then leaves the problem with inverted units.

It also means that the average function needs to work differently, because that actually needs to work from the same 0 reference for all arguments (everything absolute) when calculating the sum.

@andrewfg
Copy link
Contributor Author

I still think there is more work needed in the state profiles to really make them work predictably.

Agreed. Given our experience on the Group Item functions, I propose to open a PR also for StateFilter. It might generate a merge conflict with the other StateFilter PRs but we can sort that out later. => @jimtng is that Ok with you?

@andrewfg
Copy link
Contributor Author

andrewfg commented Jan 20, 2025

Damn, you are right.

:)

The issue is the conversion to the base unit for all arguments. That should not be done.

In our new model the top level Group Item defines the target unit. So ALL (repeat ALL) arguments are converted to the target unit. This means if the Group Item is in Celsius then members will be added relative to 0 C whereas if the Group Item is in Kelvin then members will be added relative to 0 K i.e. absolutely.

The QuantityType.add method correctly treats the first argument as absolute and the second as relative

Not any more. Now we do not take the target unit from the first item. We take it from the overall Group Item instead. Indeed that is largely the whole point of my PR..

It also means that the average function needs to work differently

Indeed. And so it does. .. and the Median function too.

.. by contrast the Min and Max functions work anyway.

@mherwege
Copy link
Contributor

So ALL (repeat ALL) arguments are converted to the target unit. This means if the Group Item is in Celsius then members will be added relative to 0 C whereas if the Group Item is in Kelvin then members will be added relative to 0 K i.e. absolutely.

If the group item is in Celcius, all will be converted to Celcius and then added. When adding, the first argument is always converted to absolute and then the second is added relative (treated as an offset from the first argument).
The same applies for K. So if the group item is K, the second argument will also be converted to K and added as an offset (which is equal to the absolute K value), leading to the difference in result.

The whole point of the changes to the add function in QuantityType.add earlier was to make adding temperatures with different units predictable, and make sure it didn't matter in what unit the first and second argument are. It also makes intuitive sense when you add temperatures, the second is typically a temperature offset, not an absolute temperature.
Because the ArithmeticGroupFunction code converts ALL to the same unit, this breaks apart. If you want to keep on treating the second as an offset, we should only convert the first argument to the group base item unit, not the others. Then all others will be treated as offsets.

But that of course does not work for invertible units. They would have to be converted.

In the current code, before the PR, any values in the invers dimension would just get ignored. We still want to consider them. But I am not sure how now.

We can debate what the right logic of adding is here. But I would like to see some consistency overall. If we normal treat adding temperatures as adding offsets to a base, I would like to see it treated in the group arithmetic functions in the same way as well. And I would prefer the same again in the state profiles.

@mherwege
Copy link
Contributor

mherwege commented Jan 20, 2025

Not any more. Now we do not take the target unit from the first item. We take it from the overall Group Item instead. Indeed that is largely the whole point of my PR..

It still does. It gets called from your modified function and is not changed. But because the toUnit call (or toInvertibleUnit call) before calling the add actually changes the relative value the overall calculation does not give the same result anymore.

@mherwege
Copy link
Contributor

and the Median function too

I don't think it impacts Median, as it does not add values, just picks the middle value.

@mherwege
Copy link
Contributor

Come to think of it, no unit conversion is needed here - core already does the inversion if needed, inside QuantityType.compareTo

Actually, the compareTo does not work.

I tried adding following test for QuantityType.compareTo to the QuantityType tests:

    @Test
    public void testCompareTo() {
        QuantityType<Temperature> temp1 = new QuantityType<>("293.15 K");
        QuantityType<Temperature> temp2 = new QuantityType<>("20 °C");
        assertEquals(0, temp1.compareTo(temp2), 0.001);
        temp2 = new QuantityType<>("-5 °C");
        assertEquals(1, temp1.compareTo(temp2));
        temp2 = new QuantityType<>("50 °C");
        assertEquals(-1, temp1.compareTo(temp2));

        temp1 = new QuantityType<>("100000 K");
        temp2 = new QuantityType<>("10 mirek");
        assertEquals(0, temp1.compareTo(temp2), 0.001);
        assertEquals(0, temp2.compareTo(temp1), 0.001);
        temp2 = new QuantityType<>("20 mirek");
        assertEquals(1, temp1.compareTo(temp2));         // temp1 is more than temp2 (20 mirek = 50000 K) in Kelvin
        assertEquals(1, temp2.compareTo(temp1));         // temp2 is more than temp1 (100000 K = 10 mirek) in mirek
        temp2 = new QuantityType<>("0 mirek");
        assertEquals(-1, temp1.compareTo(temp2));
        assertEquals(-1, temp2.compareTo(temp1));

        temp1 = new QuantityType<>("0.1 MK");
        temp2 = new QuantityType<>("10 mirek");
        assertEquals(0, temp1.compareTo(temp2), 0.001);
        assertEquals(0, temp2.compareTo(temp1), 0.001);
        temp2 = new QuantityType<>("20 mirek");
        assertEquals(1, temp1.compareTo(temp2));
        assertEquals(1, temp2.compareTo(temp1));
        temp2 = new QuantityType<>("0 mirek");
        assertEquals(-1, temp1.compareTo(temp2));
        assertEquals(-1, temp2.compareTo(temp1));
    }

These fail when comparing K to mirek because the scale is in the other direction. The commented assertions fail. Also note that it leads to somewhat strange outcomes as, depending on the unit used, the result is different.

@jimtng
Copy link
Contributor

jimtng commented Jan 20, 2025

Agreed. Given our experience on the Group Item functions, I propose to open a PR also for StateFilter. It might generate a merge conflict with the other StateFilter PRs but we can sort that out later. => @jimtng is that Ok with you?

@openhab/add-ons-maintainers I'd really appreciate it if the current StateFilter PRs be merged first. I'm not good with dealing with merge conflicts :( The existing PRs would generate some conflicts already.

@andrewfg
Copy link
Contributor Author

These fail when comparing K to mirek

It should never be an issue since I will normalize all values to the same targetUnit before calling compare.

PS I already started work on coding the StateFilter changes..

@andrewfg
Copy link
Contributor Author

See my work in progress. Not tested!

#18144

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants