Hacker News | brookst's comments

The observation was that death row represents the highest level of scrutiny, and still had 10% false positives for guilt.

Is there any argument that less-scrutinized cases would have a lower level of false convictions?


The 10% claim has been refuted.

I like it! Though given the focus, you had AIpaca sitting right there… (best read with a sans serif font)

I’m co-developing lots of projects with AI. Right now I have a hand-rolled backlog system that lives in each project’s git repo with a standard prompt on how to create, triage, and review backlog items.

This looks great for me. Better than what I have, smaller/cheaper/more AI focused than Jira.


Most corrupt US administration in history, by a long shot.

Wonder how many US-based early-stage startups are using Opus to research incorporating and moving overseas at this very moment.

EU isn’t tenable, UK is iffy. Australia? Thailand? Who wants to be innovation-friendly?


And experiences absolutely benefit from productivity improvements. AI has helped me plan better trips, find places and activities I had no idea about, better prepare for weather in remote destinations.

It’s said that “productivity” is mistakenly understood as applying only to work.


I don’t know that Terry much cares about the opinions of people who judge claims based on innuendo and cynicism rather than the actual merits of the claim.

Your point is contextualizing the humans involved, and it is a good and righteous post.

Yeah it’s one of those words that gets snapped up early, like https://news.ycombinator.com/user?id=stars

That doesn't look like a well used account - how do I get it transferred to me?

Edit: never mind, I guessed the password as it was only five stars.


I’m finding Fable dramatically better for auditing PRs and large features. In a side-by-side with the same prompt I’ve been happily using on Opus, Opus found one major and one minor issue; Fable found two major and four minor (a superset of Opus’s).

I’ve taken to using Fable to plan arch, specs, build plan, and then to be the final QA. Opus for the actual build.


Is it not a trade off? I think they made the wrong choice, but it seems reductive to say there was no choice at all and that the trade offs of silent versus not should never have been considered.

Even wide open, uncensored models are often the product of a deliberate choice. I have a hard time faulting people for intentionality (even when they get it wrong).


They have a lot of choices, why would that specifically be a tradeoff? It's common for people to construct a tradeoff under which their preferred action is the more virtuous option, and thus they can be "the good guys", but that doesn't mean their framing makes any sense at all. Silently downgrading requests to a weaker model and billing the customer at full price, then framing the debate as how much (not if) this behavior is correct, that's an expression of values. People make mistakes all the time; if they thought it was actually wrong they could well have said so and explained what corrective action they've taken. One of the most famous examples of doing this right was the Pentium FDIV bug. Intel stood behind the product by recalling the affected units at great expense, and that (rightly) earned a lot of trust for decades.

You seem to be focused on the decision making, but I still don’t understand how it’s not a tradeoff. All binary decisions (silent or not) are tradeoffs because there are upsides and downsides to each, the question is which is better on the whole.

If I’m deciding whether or not to eat ice cream, there are trade offs involved because I can’t simultaneously have it both ways.

And Anthropic did apologize, explain reasoning, and what they learned.

They got it wrong; they picked the wrong trade offs and got a net worse decision than they should have. I’m with you on everything except this idea that it was an obvious decision with no upsides to silent and no downsides to loud.


Sure, ad absurdum anything can be called a tradeoff, but when you describe something as a tradeoff in communication you're also making an argument that there are a range of reasonable answers (otherwise why would you bring it up). Whether or not to eat ice cream, it's very reasonable to consider lots of situations where both, or some compromise (a small ice cream), are fine choices. If I came to you on the street while you were holding your ice cream, told you I was going to take your ice cream and eat it, and after your protest changed my mind and said that was the "wrong balance" in how much I took, you could very fairly conclude that I was the sort of person who thinks it's ok to steal people's ice cream (maybe just not today). Now maybe there was some great miscommunication, and I thought you were done with it, but then I wouldn't say I made the wrong tradeoff, I would just tell you I misunderstood the situation and hope you'd invite me to ice cream another time. Anthropic in this case is saying they think there's a balance in how much of the service customers pay for that they should deliver, as clarified in their Wired statement. That's a choice.

(they also reset usage for many accounts after the Mythos/Fable rollback, which is great)
