Thoughts, Questions and Confusions about the Sum trait [RUST] [QUESTION]

maegul (he/they)@lemmy.ml · edit-2 6 months ago

metiulekm@sh.itjust.works · edit-2 6 months ago

pub trait Sum<A = Self>: Sized {
    fn sum<I: Iterator<Item = A>>(iter: I) -> Self;
}
So I’d presume the A = Self followed by I: Iterator<Item = A> for the iterator binds the implementation pretty clearly to the type of the iterator’s elements.

Quite confusingly, the two =s have very different meanings here. The Item = A syntax just says that the iterator’s item type, which is an associated type of the Iterator trait, should be A. So, you could read this as “I should implement the Iterator trait, and the Item associated type of this implementation should be A”.

However, A = Self does not actually mean any requirement of A. Instead, it means that Self is the default value of A: that is, you can do impl Sum<i64> for i32 and then you will have Self equal to i32 and A equal to i64, but you can also do impl Sum for i32 and it will essentially be a shorthand for impl Sum<i32> for i32, giving you both Self and A equal to i32.
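
To make the default parameter concrete, here is a minimal sketch with both forms side by side. The Total newtype and the two helper functions are hypothetical names made up for this illustration, not anything from the standard library.

```rust
use std::iter::Sum;

// Hypothetical newtype, used only for illustration.
#[derive(Debug, PartialEq)]
struct Total(i64);

// `A` is left out, so it defaults to `Self`:
// this is shorthand for `impl Sum<Total> for Total`.
impl Sum for Total {
    fn sum<I: Iterator<Item = Total>>(iter: I) -> Self {
        Total(iter.map(|t| t.0).sum())
    }
}

// `A` is given explicitly: `Self` is `Total`, the iterator items are `i64`.
impl Sum<i64> for Total {
    fn sum<I: Iterator<Item = i64>>(iter: I) -> Self {
        Total(iter.sum())
    }
}

// Uses `impl Sum<Total> for Total`.
fn sum_of_totals() -> Total {
    vec![Total(1), Total(2)].into_iter().sum()
}

// Uses `impl Sum<i64> for Total`.
fn sum_of_i64s() -> Total {
    vec![10i64, 20].into_iter().sum()
}
```

Both helpers return Total, but they go through different implementations, chosen by the item type of the iterator.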

In the end, we have the relationship that the iterator item should be the same as A, but we do not have the relationship that Self should be the same as A. So, given this trait, the iterator item can actually be different to Self.

Note that the standard library does actually have implementations where these two differ. For instance, it has impl<'a> Sum<&'a i32> for i32, giving you a possibility to sum the iterator of &i32 into i32. This is useful when you think about this: you might want to sum such an iterator without .copied() for some extra ergonomics, but you can’t just return &i32, there is nowhere to store the referenced i32. So, you need to return the i32 itself.
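
A small sketch of the two routes to the same result; the function names total and total_copied are made up for this example.

```rust
// Sums a slice through an iterator of references.
fn total(values: &[i32]) -> i32 {
    // `values.iter()` yields `&i32`, so this goes through
    // `impl<'a> Sum<&'a i32> for i32`.
    values.iter().sum()
}

// Same result, but each `&i32` is turned into an `i32` first,
// so this goes through `impl Sum for i32` (that is, `Sum<i32>`).
fn total_copied(values: &[i32]) -> i32 {
    values.iter().copied().sum()
}
```

In both cases the return type is the owned i32, only the item type of the iterator differs.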

The definition is pretty clear here right? The generic here is Sum<Self::Item>, abbreviated to S … which AFAIU … means that the element type of the iterator — here Self::Item — is the type that has implemented Sum … and the type that will be returned.

In Sum<Self::Item>, Self::Item is the A parameter, and S is the type that implements the trait Sum<Self::Item>. (S is what is called Self in the definition of the Sum trait, which is different to the Self in the sum method definition, where Self is the iterator.) As above, A and S can be different.

It might be helpful to contrast this definition with a more usual one, where the trait does not have parameters:

fn some_function<S>(…) -> …
    where
        S: SomeTrait,
{…}

fn sum<S>(…) -> …
    where
        S: Sum<Self::Item>,
{…}

Note that you might have an intuition from some other languages that in case of polymorphism, the chosen function either depends on the type of one special parameter (like in many OOP languages, where everything is decided by the class of the called object), or of the parameter list as a whole (like in C++, where the compiler won’t let you define int f() and float f() at the same time, but will be fine with int f(int) and float f(float)). As you can see, in Rust, the return type also matters. A simpler example of this is the Default trait.
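
As a rough illustration of the Default case: the two function bodies below are identical, and only the declared return type decides which implementation runs. The function names are hypothetical.

```rust
// The body is the same in both functions; the return type
// selects `<i32 as Default>::default` here...
fn zero_int() -> i32 {
    Default::default()
}

// ...and `<String as Default>::default` here.
fn empty_string() -> String {
    Default::default()
}
```

This is the same mechanism that lets sum pick an implementation from the requested result type.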

Regarding inference, some examples (Compiler Explorer link):

vec![1i32].into_iter().sum();
// or: <_ as Sum<_>>::sum(vec![1i32].into_iter());
// error[E0283]: type annotations needed
// note: cannot satisfy `_: Sum<i32>`

The compiler knows that the iterator contains i32s, so it looks for something that implements Sum<i32>. But we don’t tell the compiler what to choose, and the compiler does not want to guess by itself.

vec![1i32].into_iter().sum::<i32>();
// or: <i32 as Sum<_>>::sum(vec![1i32].into_iter());

As above, the compiler knows that it wants to call something that implements Sum<i32>, but now it only has to check that i32 is such a type. It is, so the code compiles.
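
The turbofish is only one way to name the result type. As a sketch (function names are made up for illustration), any of these gives the compiler the same information:

```rust
// Result type given with the turbofish on `sum`.
fn with_turbofish() -> i32 {
    vec![1i32, 2, 3].into_iter().sum::<i32>()
}

// Result type given by the annotation on the `let` binding.
fn with_let_annotation() -> i32 {
    let total: i32 = vec![1i32, 2, 3].into_iter().sum();
    total
}

// Result type given by the function's declared return type.
fn with_return_type() -> i32 {
    vec![1i32, 2, 3].into_iter().sum()
}
```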

vec![1i32].iter().sum::<i32>();
// or: <i32 as Sum<_>>::sum(vec![1i32].iter());

Now we actually have an iterator of references, as we used .iter() instead of .into_iter(). But the code still compiles, since i32 also implements Sum<&i32>.

vec![1i64].into_iter().sum::<i32>();
// or: <i32 as Sum<_>>::sum(vec![1i64].into_iter());
// error[E0277]: a value of type `i32` cannot be made by summing an iterator over elements of type `i64`
// help: the trait `Sum<i64>` is not implemented for `i32`

Now the compiler can work out by itself that it wants to call something that implements Sum<i64>. However, i32 does not actually implement it, hence the error. If it did, the code would compile correctly.

vec![].into_iter().sum::<i32>();
// or: <i32 as Sum<_>>::sum(vec![].into_iter());
// error[E0283]: type annotations needed
// (in the second case) note: multiple `impl`s satisfying `i32: Sum<_>` found in the `core` crate: impl Sum for i32; impl<'a> Sum<&'a i32> for i32;

Now the situation is reversed. The compiler knows the return type, so it knows that i32 should implement some Sum<_>. But it doesn’t know the iterator element type, and so it doesn’t know whether to choose the owned-value version or the reference version. Note that the wording of the error is different: this time the compiler is willing to guess, but it can’t, as there are multiple possible choices. If there is only one choice, the compiler does guess it:

struct X {}
impl Sum for X {
    fn sum<I: Iterator<Item = X>>(_: I) -> Self { Self{} }
}
vec![].into_iter().sum::<X>();
// or: <X as Sum<_>>::sum(vec![].into_iter());

This builds correctly. I am not sure about the reason for the difference (I feel like it’s related to forward compatibility and the fact that outside the standard library I can do impl Sum<i32> for MyType but not impl Sum<MyType> for i32, but I don’t really know).

Hope that helps :3

EDIT:

I’d also caught mentions of the whole zero thing being behind the design. Which is funny because once you get down to the implementation for the numeric types, zero seems (I’m not on top of macro syntax) to be just a parameter of the macro, which then gets undefined in the call of the macro, so I have to presume it defaults to 0 somehow??. In short, the zero has to be provided in the implementation of sum for a specific type. Which I suppose is flexible. Though in this case I can’t discern what the zero is for the integer types (it’s explicitly 0.0 for floats).

Ah, I read this, thought about this, and forgot about this almost immediately. I know almost nothing about macros, but if I understand correctly, the zero is in line 92, here:

    ($($a:ty)*) => (
        integer_sum_product!(@impls 0, 1,
                #[stable(feature = "iter_arith_traits", since = "1.12.0")],
                $($a)*);
        integer_sum_product!(@impls Wrapping(0), Wrapping(1),
                #[stable(feature = "wrapping_iter_arith", since = "1.14.0")],
                $(Wrapping<$a>)*);
    );

The intention seems to be to take a list of types (i8 i16 i32 i64 i128 isize u8 u16 u32 u64 u128 usize), and then for each type to generate both the regular and Wrapping version, each time calling into the path you have seen before. For floats there is no Wrapping version, so this time 0.0 is really the only kind of zero that can appear.
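
To show the idea of passing the zero as a macro parameter without the standard library’s extra machinery, here is a stripped-down sketch. MySum and my_sum_impl are hypothetical stand-ins invented for this example (the real macro also takes the one for Product, a stability attribute, and the Wrapping types).

```rust
// Hypothetical trait standing in for `Sum`, so the macro can implement
// it for primitive types without clashing with the standard library.
trait MySum: Sized {
    fn my_sum<I: Iterator<Item = Self>>(iter: I) -> Self;
}

// The zero is passed in as an expression, followed by a list of types.
// One `impl` is generated per type, each starting its fold from `$zero`.
macro_rules! my_sum_impl {
    ($zero:expr, $($t:ty)*) => {
        $(
            impl MySum for $t {
                fn my_sum<I: Iterator<Item = Self>>(iter: I) -> Self {
                    iter.fold($zero, |acc, x| acc + x)
                }
            }
        )*
    };
}

// Integers start from `0`, floats from `0.0`.
my_sum_impl!(0, i32 i64);
my_sum_impl!(0.0, f32 f64);

fn int_total() -> i32 {
    MySum::my_sum(vec![1, 2, 3].into_iter())
}

fn float_total() -> f64 {
    MySum::my_sum(vec![0.5, 1.5].into_iter())
}
```

So the zero never “defaults” to anything: each invocation of the macro spells it out for the group of types it covers.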

maegul (he/they)@lemmy.ml · 6 months ago

You are a fucking hero my friend!! Thanks so much!!

Thanks for clarifying the macro … I didn’t understand what was going on there, but it makes sense that that’s where the zero comes from.

And yea, as for the generics … I didn’t know about the default value (I was clearly reaching beyond what I could grasp!) … thanks again!!

Quite confusingly, the two =s have very different meaning here. The Item = A syntax just says that the iterator’s item type, which is set as the trait’s associated type, should be A. So, you could read this as “I should implement the Iterator trait, and the Item associated type of this implementation should be A”.

However, A = Self does not actually mean any requirement of A. Instead, it means that Self is the default value of A: that is, you can do impl Sum<i64> for i32 and then you will have Self equal to i32 and A equal to i64, but you can also do impl Sum for i32 and it will essentially be a shorthand for impl Sum<i32> for i32, giving you both Self and A equal to i32.

maegul (he/they)@lemmy.ml · 6 months ago

So, just to riff on this a bit for fun …

It seems then that we could have a Sum trait that didn’t require providing the result type??

If the trait were defined something like:

trait Sum2: Sized {
    fn sum2<I: Iterator<Item = Self>>(iter: I) -> Self;
}

… that is, without the generic parameter and with the Item type of the Iterator bound to Self.

I wonder if rust could then do with sum being basic like this and not requiring a result type and then sum_gen for the generic case??

I had a shot at sort of quickly prototyping this and came up with the following:

fn main() {
    // so that there's no need to provide the result type

    trait Sum2: Sized {
        fn sum2<I: Iterator<Item = Self>>(iter: I) -> Self;
    }

    impl Sum2 for i32 {
        fn sum2<I: Iterator<Item = Self>>(iter: I) -> Self {
            iter.fold(0, |a,x| a + x)
        }
    }

    struct MyVec<T>(Vec<T>);

    impl<T> MyVec<T> {
        fn sum2(self) -> T
        where
            Self: Sized,
            T: Sum2,
        {
            Sum2::sum2(self.0.into_iter())
        }
    }

    let z = MyVec(vec![1i32, 2, 3]).sum2();
    println!("My own custom sum trait?: {z}");

    // The line below doesn't compile, as `i64: Sum2` is not satisfied;
    // uncomment it to see the error.
    // let z2 = MyVec(vec![1i64, 2, 3]).sum2();
}

I don’t know the best way of implementing a new sum method for vecs (let alone all Iterators) and so the best I could come up with was to wrap Vec and then create an Iterator directly in the sum2() method. It seemed to work well enough though!

Once I learn more about traits etc it might be a fun exercise to see how close you can get to implementing one’s own convenience sum method!
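
One possible way to get such a method on all iterators, rather than on a Vec wrapper, is an extension trait with a blanket implementation. This is only a sketch: IteratorSum2 and demo are hypothetical names, and Sum2 is the trait from the prototype above, repeated so the snippet stands alone.

```rust
// The prototype trait: the result type is always the item type.
trait Sum2: Sized {
    fn sum2<I: Iterator<Item = Self>>(iter: I) -> Self;
}

impl Sum2 for i32 {
    fn sum2<I: Iterator<Item = Self>>(iter: I) -> Self {
        iter.fold(0, |a, x| a + x)
    }
}

// Extension trait (hypothetical name) that adds `.sum2()` to iterators
// whose items implement `Sum2`.
trait IteratorSum2: Iterator + Sized {
    fn sum2(self) -> Self::Item
    where
        Self::Item: Sum2,
    {
        <Self::Item as Sum2>::sum2(self)
    }
}

// Blanket impl: every iterator gets the provided method.
impl<I: Iterator> IteratorSum2 for I {}

fn demo() -> i32 {
    // No turbofish needed: the result type is tied to the item type.
    vec![1i32, 2, 3].into_iter().sum2()
}
```

The cost of this design is the one discussed earlier in the thread: an iterator of &i32 no longer sums into i32 unless it is first passed through .copied().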

Thanks again!!
