• 0 Posts
  • 2 Comments
Joined 10 months ago
Cake day: November 17th, 2024

  • This paragraph suggests that making a profit was intended to be easy.

    As seen in Figure 3, Claude 3.5 Sonnet outperformed the human baseline in mean performance, but its variance was very high. We only have a single sample for the human baseline and therefore cannot compare variances. However, there are qualitative reasons to expect that human variance would be much lower. All models had runs where they went bankrupt. When questioned, the human stated that they estimated this would be very unlikely to happen to them, regardless of the number of samples.
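The point about a single human sample can be made concrete with a minimal sketch. The numbers below are invented purely for illustration and are not results from the paper; `sample_variance_or_none` is a hypothetical helper. It shows that a sample variance cannot be computed from one run, which is why the variances cannot be compared.

```python
import statistics

# Hypothetical net-worth figures, invented for illustration only;
# they are NOT taken from the paper.
model_runs = [2200.0, 350.0, 0.0, 4100.0, 900.0]  # several runs, one bankrupt
human_runs = [850.0]                              # a single run


def sample_variance_or_none(values):
    """Return the sample variance, or None if fewer than two values exist."""
    try:
        return statistics.variance(values)
    except statistics.StatisticsError:
        # statistics.variance needs at least two data points.
        return None


model_var = sample_variance_or_none(model_runs)
human_var = sample_variance_or_none(human_runs)

print("model variance:", model_var)
print("human variance:", human_var)  # None: undefined for a single sample
```

With only one human run, any claim about human variance has to rest on qualitative reasoning, as the quoted paragraph does.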