Intro

I’m not on top of traits or generics but found myself looking some of them up anyhow, and came across the Sum trait.

Here is the Std Lib documentation on Sum (I believe).

And I guess all of the generics and/or type logic and how they interoperate has thrown me for a bit of a spin … so I thought I’d put my thoughts here. Maybe I’ll work things out in writing it or maybe someone here can help me/us out?

A bit long … sorry


Trait Definition

From the docs and source, here is the trait’s signature:

// core::iter::Sum
pub trait Sum<A = Self>: Sized {
    // Required method
    fn sum<I: Iterator<Item = A>>(iter: I) -> Self;
}

First thoughts: Defined on elements not iterators?

  • The part that confused me at first was what Self is actually. Naively, I imagined it was referring to the iterator (or type that’d implemented Iterator) … but that clearly can’t be true because the return type is Self.
  • So … Sum is implemented not on any collection but on the element type?!
  • If so, why not rely on the Add Trait at the element level, which is responsible for the addition operator (see docs here)?

Kinda seems so?

  • So, in trying to understand this, I thought I’d look at the source of Iterator::sum() first figuring that it’d be the main implementation.
  • This is the sum you’d be calling in something like vec![1, 2, 3].into_iter().sum() to get 6.
core::iter::Iterator::sum
fn sum<S>(self) -> S
where
    Self: Sized,
    S: Sum<Self::Item>,
{
    Sum::sum(self)
}
  • Ok, so the call of Sum::sum(self) clearly indicates that this is not where Sum is defined (instead it must be in Sum::sum() somehow).
  • Moreover, self is being passed into Sum::sum(), withself being the Iterator here … which means there’s no method being called on Iterator itself but something from another module.
  • Additionally, this method is bound by the generic <S> which is defined in the where clause as Sum<Self::Item> … which … wait WTF is going on?
    • So this method (Iterator::sum()) must return a type that has implemented the trait Sum??
    • If that’s correct, then that confirms my suspicion that Sum is implemented on the elements of an iterator (where I’m sure those comfortable with the generics syntax of the definition above are yelling YES!! OF course!!)
    • That’s because the return type of sum() would generally have to be the same type as the summed elements, so S is both the type of the elements in the iterator and the return type of sum. All good.
    • And indeed, in the definition of the type alias S we’ve got Sum<Self::Item> which binds the return type of Iterator::sum() to the type of the iterator’s elements (ie Self::Item)
      • Self::Item is technically the Item type of the Iterator which can, AFAIU, be defined as distinct from the type of the elements of the collection from which the iterator is derived but that’s another story.

Back to the beginning

  • So back to trying to understand the definition of core::iter::Sum (which I believe is the definition of the trait):
// core::iter::Sum
pub trait Sum<A = Self>: Sized {
    // Required method
    fn sum<I: Iterator<Item = A>>(iter: I) -> Self;
}
  • The trait itself is bound to Sized. I don’t know the details around Sized (see docs here and The book, ch 19.4 here) but it seems fundamental likely that it applies to vectors and the like.
  • The generic A = Self and its occurrences in the generics for the sum() function and its return type … are a lot:
    • AFAIU, Self, ie the type on Sum is implemented for, must be the Item type for the Iterator that will be passed into the sum method.
    • But it must also be the return type of sum() … which makes sense.
  • So the confusing part here then is the generic type of the sum() method: <I: Iterator<Item = A>>.
    • Remember, A = Self, so it’s really <I: Iterator<Item = Self>> (right?)
    • This generic type is any Iterator whose Item (ie, the type that is returned each iteration) is the same type as Self.
  • Which means that if I want to sum a vector if i32 numbers, I’d have to make sure I’ve implemented Sum not on Vec but on i32 and defined it as a method that takes any iterator of i32 (ie Self) elements to then return an i32 element.
  • Ok …

Confirmation

  • We can look at the implementors of core::iter::Sum ( see docs here) and check the source for the i32 implementation …
  • Which gives us this source code:
integer_sum_product! { i8 i16 i32 i64 i128 isize u8 u16 u32 u64 u128 usize }
macro_rules! integer_sum_product {
    (@impls $zero:expr, $one:expr, #[$attr:meta], $($a:ty)*) => ($(
        #[$attr]
        impl Sum for $a {
            fn sum<I: Iterator<Item=Self>>(iter: I) -> Self {
                iter.fold(
                    $zero,
                    #[rustc_inherit_overflow_checks]
                    |a, b| a + b,
                )
            }
        }
  • which … uses fold() (basically reduce but with an initial value) and plain addition in the anonymous/closure function |a, b| a + b. What!?

Why? How?

  • Ok that was a long way to go to find the addition operator at the bottom of the heap of traits!

  • Hopefully I’ve grasped the mechanics?!

  • I’m not quite clear on why it’s build this way. I’m guessing there’s some flexibility baked into the way that the relevant implementation of Sum depends on the element type, which can be flexibly defined as the Item type of an Iterator independently of the type of the collection’s elements. That is, an iterator can utilise a type different from the actual elements of a collection and then rely on its particular implementation of sum. And then this can be independent from Add.

  • But that feels like a lot of obscure flexibility for a pretty basic operation, no?

  • For example, this code doesn’t compile because a type needs to be specified, presumably type inference gets lost amongst all the generics?

// doesn't compile
let x = vec![1i32, 2, 3].into_iter().sum();

// These do compile
let x2 = vec![1i32, 2, 3].into_iter().sum::<i32>();  // turbofish!!
let x3: i32 = vec![1i32, 2, 3].into_iter().sum();

  • Design choices aside …
  • I’m still unclear as to how Iterator::sum() works
fn sum<S>(self) -> S
where
    Self: Sized,
    S: Sum<Self::Item>,
{
    Sum::sum(self)
}
  • How does Sum::sum(self) work!?
  • self is the Iterator (so vec![1i32, 2, 3].iter()).
  • And Sum::sum() is the essential trait addressed above.
  • How does rust go from a call of a trait’s method to using the actual implementation on the specific type? I guess I hadn’t really thought about it, but it makes sense and it’s what all those Selfs are for.
  • In this case though, it’s rather confusing that the relevant implementation isn’t on the type of self, but because of the definition of Sum, the implementation is on the type of the elements (or Item specifically) of self. Sighs

Thoughts??

  • metiulekm@sh.itjust.works
    link
    fedilink
    English
    arrow-up
    2
    ·
    edit-2
    6 months ago
    pub trait Sum<A = Self>: Sized {
        fn sum<I: Iterator<Item = A>>(iter: I) -> Self;
    }
    

    So I’d presume the A = Self followed by I: Iterator<Item = A> for the iterator binds the implementation pretty clearly to the type of the iterator’s elements.

    Quite confusingly, the two =s have very different meaning here. The Item = A syntax just says that the iterator’s item type, which is set as the trait’s associated type, should be A. So, you could read this as “I should implement the Iterator trait, and the Item associated type of this implementation should be A”.

    However, A = Self does not actually mean any requirement of A. Instead, it means that Self is the default value of A: that is, you can do impl Sum<i64> for i32 and then you will have Self equal to i32 and A equal to i64, but you can also do impl Sum for i32 and it will essentially be a shorthand for impl Sum<i32> for i32, giving you both Self and A equal to i32.

    In the end, we have the relationship that the iterator item should be the same as A, but we do not have the relationship that Self should be the same as A. So, given this trait, the iterator item can actually be different to A.

    Note that the standard library does actually have implementations where these two differ. For instance, it has impl<'a> Sum<&'a i32> for i32, giving you a possibility to sum the iterator of &i32 into i32. This is useful when you think about this: you might want to sum such an iterator without .copied() for some extra ergonomics, but you can’t just return &i32, there is nowhere to store the referenced i32. So, you need to return the i32 itself.

    The definition is pretty clear here right? The generic here is Sum<Self::Item>, abbreviated to S … which AFAIU … means that the element type of the iterator — here Self::Item — is the type that has implemented Sum … and the type that will be returned.

    In Sum<Self::Item>, Self::Item is the A parameter, and Sum<Self::Item>, or S, is the type that implements the trait (which is called Self in the definition of the Sum trait, but is different to the Self in the sum method definition). As above, A and S can be different.

    It might be helpful to contrast this definition with a more usual one, where the trait does not have parameters:

    fn some_function<S>(…) ->where
            S: SomeTrait,
    {…}
    
    fn sum<S>(…) ->where
            S: Sum<Self::Item>,
    {…}
    

    Note that you might have an intuition from some other languages that in case of polymorphism, the chosen function either depends on the type of one special parameter (like in many OOP languages, where everything is decided by the class of the called object), or of the parameter list as a whole (like in C++, where the compiler won’t let you define int f() and float f() at the same time, but will be fine with int f(int) and float f(float)). As you can see, in Rust, the return type also matters. A simpler example of this is the Default trait.

    Regarding inference, some examples (Compiler Explorer link):

    vec![1i32].into_iter().sum();
    // or: <_ as Sum<_>>::sum(vec![1i32].into_iter());
    // error[E0283]: type annotations needed
    // note: cannot satisfy `_: Sum<i32>`
    

    Compiler knows that the iterator contains i32s, so it looks for something that implements Sum<i32>. But we don’t tell the compiler what to choose, and the compiler does not want to guess by itself.

    vec![1i32].into_iter().sum::<i32>();
    // or: <i32 as Sum<_>>::sum(vec![1i32].into_iter());
    

    As above the compiler knows that it wants to call something that implements Sum<i32>, but now it only has to check that i32 is such type. It is, so the code compiles.

    vec![1i32].iter().sum::<i32>();
    // or: <i32 as Sum<_>>::sum(vec![1i32].iter());
    

    Now we actually have a iterator of references, as we used .iter() instead of .into_iter(). But the code still compiles, since i32 also implements Sum<&i32>.

    vec![1i64].into_iter().sum::<i32>();
    // or: <i32 as Sum<_>>::sum(vec![1i64].into_iter());
    // error[E0277]: a value of type `i32` cannot be made by summing an iterator over elements of type `i64`
    // help: the trait `Sum<i64>` is not implemented for `i32`
    

    Now the compiler can calculate itself that it want to call something that implements Sum<i64>. However, i32 does not actually implement it, hence the error. If it did, the code would compile correctly.

    vec![].into_iter().sum::<i32>();
    // or: <i32 as Sum<_>>::sum(vec![].into_iter());
    // error[E0283]: type annotations needed
    // (in the second case) note: multiple `impl`s satisfying `i32: Sum<_>` found in the `core` crate: impl Sum for i32; impl<'a> Sum<&'a i32> for i32;
    

    Now the situation is reversed. The compiler knows the return type, so it knows that i32 should implement some Sum<_>. But it doesn’t know the iterator element type, and so it doesn’t know if it should choose the owned value, or the reference version. Note that the wording is different, the compiler wants to guess, but it can’t, as there are multiple possible choices. But if there is only one choice, the compiler does guess it:

    struct X {}
    impl Sum for X {
        fn sum<I: Iterator<Item = X>>(_: I) -> Self { Self{} }
    }
    vec![].into_iter().sum::<X>();
    // or: <X as Sum<_>>::sum(vec![].into_iter());
    

    builds correctly. I am not sure about the reason for the difference (I feel like it’s related to forward compatibility and the fact that outside the standard library I can do impl Sum<i32> for MyType but not impl Sum<MyType> for i32, but I don’t really know).

    Hope that helps :3

    EDIT:

    I’d also caught mentions of the whole zero thing being behind the design. Which is funny because once you get down to the implementation for the numeric types, zero seems (I’m not on top of macro syntax) to be just a parameter of the macro, which then gets undefined in the call of the macro, so I have to presume it defaults to 0 somehow??. In short, the zero has to be provided in the implementation of sum for a specific type. Which I suppose is flexible. Though in this case I can’t discern what the zero is for the integer types (it’s explicitly 0.0 for floats).

    Ah, I read this, thought about this, and forgot about this almost immediately. I know almost nothing about macros, but if I understand correctly, the zero is in line 92, here:

        ($($a:ty)*) => (
            integer_sum_product!(@impls 0, 1,
                    #[stable(feature = "iter_arith_traits", since = "1.12.0")],
                    $($a)*);
            integer_sum_product!(@impls Wrapping(0), Wrapping(1),
                    #[stable(feature = "wrapping_iter_arith", since = "1.14.0")],
                    $(Wrapping<$a>)*);
        );
    

    The intention seems to be to take a list of types (i8 i16 i32 i64 i128 isize u8 u16 u32 u64 u128 usize), and then for each type to generate both the regular and Wrapping version, each time calling into the path you have seen before. For floats there is no Wrapping version, so this time 0.0 is really the only kind of zero that can appear.

    • maegul (he/they)@lemmy.mlOPM
      link
      fedilink
      English
      arrow-up
      2
      ·
      6 months ago

      You are a fucking hero my friend!! Thanks so much!!

      Thanks for clarifying the macro … I didn’t understand what was going on there, but it makes sense that that’s where the zero comes from.

      And yea, as for the generics … I didn’t know about the default value (I was clearly reaching beyond what I could grasp!) … thanks again!!

      Quite confusingly, the two =s have very different meaning here. The Item = A syntax just says that the iterator’s item type, which is set as the trait’s associated type, should be A. So, you could read this as “I should implement the Iterator trait, and the Item associated type of this implementation should be A”.

      However, A = Self does not actually mean any requirement of A. Instead, it means that Self is the default value of A: that is, you can do impl Sum<i64> for i32 and then you will have Self equal to i32 and A equal to i64, but you can also do impl Sum for i32 and it will essentially be a shorthand for impl Sum<i32> for i32, giving you both Self and A equal to i32.

    • maegul (he/they)@lemmy.mlOPM
      link
      fedilink
      English
      arrow-up
      1
      ·
      6 months ago

      So, just to riff on this a bit for fun …

      It seems then that we could have a Sum trait that didn’t require providing the result type??

      If the trait were defined something like:

      trait Sum2: Sized {
              fn sum2<I: Iterator<Item = Self>>(iter: I) -> Self;
          }
      

      … that is, without the a generic and with the Item type of the Iterator bound to Self.

      I wonder if rust could then do with sum being basic like this and not requiring a result type and then sum_gen for the generic case??


      I had a shot at sort of quickly prototyping this and came up with the following:

      fn main() {
          // so that there's no need to provide the result type
      
          trait Sum2: Sized {
              fn sum2<I: Iterator<Item = Self>>(iter: I) -> Self;
          }
      
          impl Sum2 for i32 {
              fn sum2<I: Iterator<Item = Self>>(iter: I) -> Self {
                  iter.fold(0, |a,x| a + x)
              }
          }
      
          struct MyVec<T>(Vec<T>);
      
          impl<T> MyVec<T> {
              fn sum2(self) -> T
              where
                  Self: Sized,
                  T: Sum2,
              {
                  Sum2::sum2(self.0.into_iter())
              }
          }
      
          let z = MyVec(vec![1i32, 2, 3]).sum2();
          println!("My own custom sum trait?: {z}");
      
          // doesn't compile as `i64: Sum2` not satisfied
          let z2 = MyVec(vec![1i64, 2, 3]).sum2();
      }
      

      I don’t know the best way of implementing a new sum method for vecs (let alone all Iterators) and so the best I could come up with was to wrap Vec and then create an Iterator directly in the sum2() method. It seemed to work well enough though!

      Once I learn more about traits etc it might be a fun exercise to see how close you can get to implementing one’s own convenience sum method!

      Thanks again!!