Coders spent more time prompting and reviewing AI generations than they saved on coding. On the surface, METR’s results seem to contradict other benchmarks and experiments that show gains in coding efficiency when AI tools are used. But those studies often measure productivity in terms of total lines of code or the number of discrete tasks, code commits, or pull requests completed, all of which can be poor proxies for actual coding efficiency. These factors lead the researchers to conclude that current AI coding tools may be particularly ill-suited to “settings with very high quality standards, or with many implicit requirements (e.g., relating to documentation, testing coverage, or linting/formatting) that take humans substantial time to learn.” While those factors may not apply in “many realistic, economically relevant settings” involving simpler code bases, they could limit the impact of AI tools in this study and in similar real-world situations.

  • NegentropicBoy@lemmy.world
    2 days ago

    Great as an assistant for boring tasks. Still needs checking.

    Can also help suggest improvements, but still needs checking.

    Have to learn when to stop interacting with it and do it yourself.

    • tourist@lemmy.world
      2 days ago

      A “junior” project manager at my company vibe coded an entire full stack web app with one of those LLM IDEs. His background is industrial engineering, and he claims to have basically no programming experience.

      It “works”, as in, it does what it’s meant to, but as you can guess, it relies on calls to LLM APIs where it really doesn’t have to, has several critical security flaws and inconsistencies in project structure and convention, and uses deprecated library features.

      He already pitched it to one of our largest clients, and they’re on board. They want to start testing at the end of the month.

      He’s had one junior dev who’s been managing to keep things somewhat stable, but the poor dude really had his work cut out for him. I only recently joined the project because “it sounded cool”, so I’ve been trying to fix some flaws while adding new requested features.

      I’ve never worked with the frameworks and libraries before, so it’s a good opportunity to upskill, but god damn I don’t know if I want my name on this project.

      A similar thing is happening with my brother at a different company. An executive vibe coded a web application, but this thing absolutely did not work.

      My brother basically had one night to get it into a working state. He somehow (Ritalin) managed to do it. The next day they presented it to one of their major clients. They really want it.

      These AI dev tools absolutely have a direct negative impact on developer productivity, but they also have an indirect impact where non-devs use them and pass their Eldritch abominations to the actual devs to fix, extend and maintain.

      Two years ago, I was worried about AI taking dev jobs, but now it feels to me like we’ll need more human devs than ever in the long run.

      Like, weren’t these things supposed to exponentially get better? Like, cool, gh copilot can fuck up my project files now.

      • nyan@lemmy.cafe
        2 days ago

        “These AI dev tools absolutely have a direct negative impact on developer productivity, but they also have an indirect impact where non-devs use them and pass their Eldritch abominations to the actual devs to fix, extend and maintain.”

        Sounds like the next evolution of the Excel spreadsheet macro. Or maybe it’s convergent evolution toward the same niche. (I still have nightmares about Excel spreadsheet macros.)