Industry Insights

From Copilot to Devin: The Evolution of Coding Assistants and How To Overcome Risks to IP

April 15, 2024

If you went on the internet last week, you probably saw the viral release of the latest GenAI tool – Devin. Its creator, Cognition Labs, describes Devin as the “world's first AI software engineer”. 

It was labeled “the biggest AI news of 2024”. Another commentator called it “the single most impressive demo I have seen in my entire life”. Redditors, on the other hand, took a characteristically more cynical view of the tool’s readiness and largely dismissed it as hype.

Whatever you think about the future of Devin, GenAI-powered coding assistants are here to stay. If you think your organization doesn’t use these kinds of tools… you are probably mistaken.

This blog digs into the rise of coding assistants, the associated risks, and what to do about them.

Coding assistants constitute 15% of enterprise GenAI apps 

Coding assistants can be considered one of the OG use cases of GenAI.

Shortly after OpenAI released GPT-3 in 2020, GitHub began working on GitHub Copilot. In 2022, GitHub’s research on Copilot users found that developers completed tasks around 55% faster with it. Fast forward four years, and Microsoft has extended the “Copilot” brand to almost all of its products.

Nowadays, software engineers have a huge range of GenAI tools they can turn to for help with their code. Amazon CodeWhisperer, AskCodi, and Replit are some of the top alternatives to GitHub Copilot. Our own research found that 15% of enterprise GenAI apps are coding assistants.

Moreover, this is about more than just code generation. There are all sorts of tools to help with documentation, pairing, and testing. No matter the seniority of the engineer, there’s probably a tool that can help save time.

And it’s no longer only about roles with “engineer” in the title. Low-code tools now put these capabilities into the hands of far more employees.

In short, GenAI coding assistants are about far, far more than Devin. 

Oversharing and intellectual property concerns

We don’t yet know precisely how much time coding assistants actually save. My instinct is that it’s less than the 55% suggested by GitHub, but well above zero. However, even small percentage gains in productivity can compound into a significant competitive advantage over time. Critically, we are still very early in the evolution of these tools, and we can expect them to improve rapidly.

But these productivity rewards come with associated risks. Apple, for example, has banned employees from using ChatGPT over concerns about leaking intellectual property to competitors. This concern is particularly acute for code-rich technology companies that want to safeguard their lines of code, but it applies to a broad range of companies whose engineers have access to sensitive data.

The risk of oversharing is not new; it's been a problem for engineers for many years – even before Copilot was a twinkle in GitHub’s eye. For years, I’ve seen employees and contractors inadvertently upload sensitive code and data to Stack Overflow or accidentally make repositories public. This problem is exacerbated by GenAI and the sheer number of tools available to engineers.

Increasingly complex software supply chain 

“How much of your code is AI-generated?”

This question sends chills down the spines of some security leaders I have spoken to recently. It’s particularly important if you’re selling software to the US government or going through a merger or acquisition. Furthermore, if some of these apps end up facing legal action for having been trained on copyrighted data, knowing whether your codebase contains their output will also matter from a legal perspective.

Software Bills of Materials (SBOMs) must now detail the use and origin of GenAI-generated code to clarify IP ownership. This extension of SBOMs ensures that all software components, including those produced by AI, are documented. It will become critical to document the security vetting and licensing compliance of GenAI-produced code.
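To make this concrete, here is a minimal sketch of how a team might record the provenance of an AI-assisted component in a CycloneDX-style SBOM entry. The “x-genai” property names are an illustrative in-house convention, not part of any official specification, and the component details are invented for the example.

```python
import json


def genai_component(name, version, assistant, reviewed_by, license_id):
    """Build a CycloneDX-style component entry that records GenAI provenance.

    The "x-genai:*" property names below are an illustrative convention for
    flagging AI-assisted code and who reviewed it - not an official standard.
    """
    return {
        "type": "library",
        "name": name,
        "version": version,
        "licenses": [{"license": {"id": license_id}}],
        "properties": [
            {"name": "x-genai:generated", "value": "true"},
            {"name": "x-genai:assistant", "value": assistant},
            {"name": "x-genai:human-reviewed-by", "value": reviewed_by},
        ],
    }


# Hypothetical component, purely for illustration.
sbom = {
    "bomFormat": "CycloneDX",
    "specVersion": "1.5",
    "version": 1,
    "components": [
        genai_component(
            name="payment-retry-helper",
            version="0.3.1",
            assistant="GitHub Copilot",
            reviewed_by="jane.doe",
            license_id="MIT",
        )
    ],
}

print(json.dumps(sbom, indent=2))
```

Even a lightweight convention like this makes it far easier to answer “how much of your code is AI-generated?” during due diligence or a government procurement review.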

The supply chain is about more than SBOMs, though. Another pre-existing problem that GenAI worsens is Shadow IT (or, more precisely, Shadow AI). Even if you have blocked engineers from using ChatGPT, how do you know they’re not using one of the other 1,500 coding assistant apps out there? “Whack-a-mole” style blocking of GenAI sites doesn’t work.

Users will bypass controls if they have to

For organizations like Apple and Samsung that seek to ban the use of GenAI tools, there’s bad news. Realistically, if an engineer spots a way to ship code more quickly, they will probably find a way to use it. According to Snyk’s 2023 AI Code Security Report, 80% of developers bypass AI code security policies.

Like it or not, the onus is on the security team to understand the end users, cater to their use cases, and coach them. 

Balancing Productivity and Security

If we are to realize the benefits of GenAI-powered coding assistants, we need to overcome the associated risks. The productivity gains don’t come for free.

These are not new problems – but GenAI has dramatically magnified them.

Track usage. Understand the use cases, learn what coding assistants are actually doing, and run an application audit. This will show you how engineers are using the tools and the extent of Shadow IT.

Create an AI policy. It sounds obvious, but make sure you have an AI usage policy – even if you don’t yet have a formal AI initiative in your organization. Being proactive here will save pain later on.

Coach employees. Provide clear, ongoing training on how engineers should and should not use the tools. Be specific and tailor the training to their use cases – don’t just publish an AI policy and leave it at that.

Provide a safe alternative. Many security teams are opting to block ChatGPT or other GenAI apps. That’s fine, but aim to give employees a safe alternative they can use instead. Dead ends will often push them to circumvent security controls.

Detect sensitive data. To truly embrace these tools, you need to be able to detect sensitive data being shared with them. That’s what we’re building here at Harmonic. If you’re grappling with these challenges and you’d like to learn more about our unique approach, I’d love to hear from you: https://www.harmonic.security/book-a-meeting.
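By way of illustration only – this is not a description of Harmonic’s approach – a naive first pass at the problem might scan outgoing prompts for obvious secret patterns before they ever reach a coding assistant. A minimal sketch, assuming simple regex heuristics and an invented example prompt:

```python
import re

# Naive patterns for a few well-known secret formats. Real detection needs far
# broader coverage (proprietary code, customer data, entropy checks, context) -
# this is only a sketch.
SECRET_PATTERNS = {
    "AWS access key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "Private key block": re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
    "Generic API key assignment": re.compile(
        r"(?i)\b(api[_-]?key|secret|token)\b\s*[:=]\s*['\"][A-Za-z0-9_\-]{16,}['\"]"
    ),
}


def scan_prompt(prompt: str) -> list[str]:
    """Return the names of any secret patterns found in an outgoing prompt."""
    return [name for name, pattern in SECRET_PATTERNS.items() if pattern.search(prompt)]


if __name__ == "__main__":
    # Hypothetical prompt an engineer might paste into a coding assistant.
    prompt = 'Fix this config: api_key = "sk_live_0123456789abcdef0123"'
    findings = scan_prompt(prompt)
    if findings:
        print("Blocked: prompt appears to contain", ", ".join(findings))
    else:
        print("Prompt looks clean (by these naive rules)")
```

Regex rules alone miss most of what actually matters – source code, customer records, internal identifiers – which is exactly why this problem is harder than it first appears.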

Request a demo

Concerned about the data privacy implications of Generative AI? You're not alone. Get in touch to learn more about Harmonic Security's approach.
Alastair Paterson