[He/Him, Nosist, Touch typist, Enthusiast, Superuser impostorist, keen-eyed humorist, endeavourOS shillist, kotlin useist, wonderful bastard, professinal pedant miser]
Stuped person says stuped things, people boom

I have trouble with using tone in my words but not interpreting tone from others’ words. Weird, isn’t it?

Formerly on kbin.social and thriv.social, now on dbzer0 or piefed.social

  • 78 Posts
  • 1.46K Comments
Joined 2 years ago
Cake day: March 5th, 2024

  • the decision to graph per-community (i’ll call those "commag"s) doesn’t make sense to me. it seems to me like a bad approximation for lifetime total users that doesn’t control for either instance attitudes on commag creation or troll account registration (these usually don’t create communities). i’m unacquainted with pawb but i wouldn’t think they’re very ban-happy, and the fact that these graphs show them as twice as ban-happy per-community as, and a bit more ban-happy than ml, should tell you this isn’t very good methodology.

    it seems more like dbzer0 and pawb don’t create many communities (but don’t close stale ones either, unlike .world), which makes sense as the ones that do exist are quite focused and targeted to the instance userbase.
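
to make the denominator problem concrete, here's a minimal sketch with made-up numbers (the instance names, ban counts, community counts and user counts are all hypothetical, not real data from any instance). two instances with the same bans and the same userbase look wildly different per-community just because one of them creates fewer communities:

```python
# Hypothetical numbers, invented purely for illustration (not real instance data).
instances = {
    "instance_a": {"bans": 200, "communities": 50, "users": 10_000},
    "instance_b": {"bans": 200, "communities": 500, "users": 10_000},
}


def bans_per(stats, denominator):
    """Return bans divided by the chosen denominator ("communities" or "users")."""
    return stats["bans"] / stats[denominator]


# Compute both normalisations for every instance.
rates = {
    name: {
        "per_community": bans_per(stats, "communities"),
        "per_user": bans_per(stats, "users"),
    }
    for name, stats in instances.items()
}

for name, r in rates.items():
    print(f"{name}: {r['per_community']:.2f} bans/community, "
          f"{r['per_user']:.3f} bans/user")
```

under these assumed numbers instance_a looks ten times as ban-happy per-community, while the per-user rate is identical. that's the kind of distortion a per-community graph can't rule out.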


  • Another researcher, Davi Ottenheimer, pointed out that the security section (Section 3, pages 47-53) of Anthropic’s 244-page documentation “contains no count of zero-days at all. With no CVE list, no CVSS distribution, no severity bucket, no disclosure timeline, no vendor-confirmed-novel table, no false-positive rate.”

    excerpts from the summary of the post linked in “Devanash ultimately concluded”, a lot of which Register repeats (which I think is a good thing since the copyediting makes the language a lot more accessible and wide-reaching and of course it was credited):

    The bugs are real. 17-year-old FreeBSD RCE, 23-year-old Linux kernel heap overflow, 27-year-old OpenBSD TCP flaw. LLMs catch these because they can reason about the gap between what code does and what the developer intended. Fuzzers and static analysis literally cannot do this.

    The coverage is wrong on almost every detail. The “181 Firefox exploits” ran with the browser sandbox (yes, the thing that stops browser exploits) off. The FreeBSD exploit transcript shows substantial human guidance, not autonomy. The “thousands of severe vulnerabilities” extrapolates from 198 manually reviewed reports. The Linux kernel bug was found by Opus 4.6, the public model, not Mythos.

    The moat is thinner than anyone reported. AISLE tested eight models including a 3.6B model at $0.11/M tokens. All eight found the FreeBSD bug. Mythos’s actual lead is in multi-step exploit development, not detection. That’s a narrower and more replicable advantage than what’s being sold.