
When I couldn't solve a critical Hibernate issue a couple of months ago, I turned to AI first and hit a wall. Ten years ago, I would have immediately found the answer on StackOverflow. This shift represents a fundamental change in how developers solve problems today. In the early days of computing, during the 1950s and 1960s, people relied heavily on academic papers, textbooks, and manuals provided by computer manufacturers. Institutions like universities and research labs were the main hubs for knowledge sharing. With the rise of microcomputers in the 1970s, books and technical journals became the go-to resources for learning programming and solving programming problems.

The 1980s saw the rise of personal computers and the growing influence of software companies such as Microsoft, along with the introduction of the MS-DOS operating system and word processors like Microsoft Word. It was a pivotal decade for the software industry. Compared to the 1970s, software vendors began providing more comprehensive documentation and support for their products, in addition to books and journals.

The 1990s brought another technological renaissance with the arrival of the internet, which further transformed how programmers resolved problems. Usenet groups like comp.lang.c and comp.software.testing served as forums for discussion and problem-solving. This was the early age of web forums, which paved the way for more accessible knowledge-sharing platforms. In parallel, subscription-based email lists gained traction, allowing their members to discuss technical topics of their choosing.

In the 2000s, with the Web 2.0 era, knowledge sharing consolidated further onto dedicated platforms. StackOverflow, launched in 2008, revolutionized how engineers seek and share knowledge by providing a centralized Q&A platform with a reputation system. For over a decade, StackOverflow has been the go-to destination for software developers seeking solutions to their coding challenges. The platform's extensive database of questions and answers, combined with its strong community of developers, has made it an indispensable resource for programmers.

My Lifeline as an Engineer

Like most engineers, I used StackOverflow extensively in the past, especially during the 2010s. You might ask yourself why it was so popular back in the day. Imagine forgetting one tiny detail in a properties file that configures logging. You check every known issue you can think of, yet you still can't locate the problem. The natural next step was to turn to Google and search for StackOverflow entries. Even today, I come across code snippets with comments pointing to a StackOverflow answer that explains the code. The platform helped me find answers to my issues, and when I couldn't find a solution, I would post the question myself. StackOverflow resolved many of my problems and saved me hours of reading documentation. And its biggest strength was its community, which, much like Reddit's, was engaged and responsive.

Then came large language models (LLMs for short). Attention exploded with the release of ChatGPT, which OpenAI launched on November 30, 2022, initially powered by the GPT-3.5 model. While earlier models, including GPT-3, had generated significant interest and demonstrated impressive capabilities, ChatGPT marked a pivotal moment due to its conversational and responsive nature. The combination of advanced capabilities, a user-friendly interface, and widespread demonstrations on platforms like Twitter and YouTube dramatically increased its popularity, leading to rapid adoption and broad recognition. With this breakthrough into the mainstream, StackOverflow's popularity began to decline rapidly, and so did community engagement.

Problem with AI

Since the introduction of ChatGPT, I have found myself relying less on StackOverflow. I use Copilot at work, and occasionally Claude for personal matters. These tools tremendously help me speed up parts of my processes. However, the tools come with their own set of problems, and I recently stumbled upon one.

Imagine you encounter an issue while developing, one that is not yet covered by any LLM, or at least not by the ones you use most. As you know, LLMs are trained on large datasets up to a certain point in time. If your problem involves something the LLMs have not yet been trained on, you will have a harder time finding a solution.

This happened to me about two months ago when I tried to update the Dropwizard framework. Part of the upgrade included moving from Hibernate 5.x to Hibernate 6.x. The service uses the PostgreSQL jsonb type, and I encountered the following issue:

org.postgresql.util.PSQLException: ERROR: column "dummy" is of type jsonb but the expression is of type character varying

I asked Copilot and ChatGPT for help, but they either provided generic Hibernate migration advice or misunderstood the JSONB mapping issue entirely. The models simply hadn't been trained on enough examples of this specific Hibernate 6.x migration pattern. I checked the official documentation, and everything seemed alright: as the documentation suggests, I had the proper AttributeConverter implementations in place, yet I was still facing this error. Only after extensive googling did I find a Baeldung article that explained the issue. In the past, I could have saved time by finding the answer on StackOverflow, but with the platform's decline in popularity, that's no longer a given.
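For illustration, here is a minimal sketch of how Hibernate 6 can map a jsonb column natively via @JdbcTypeCode. The entity and field names are hypothetical (only the column name dummy is borrowed from the error above), so treat it as a sketch of the general Hibernate 6 approach rather than the exact fix from the article.

import java.util.Map;

import jakarta.persistence.Column;
import jakarta.persistence.Entity;
import jakarta.persistence.Id;
import jakarta.persistence.Table;

import org.hibernate.annotations.JdbcTypeCode;
import org.hibernate.type.SqlTypes;

// Hypothetical entity; only the column name matches the error message above.
@Entity
@Table(name = "example")
public class ExampleEntity {

    @Id
    private Long id;

    // In Hibernate 6, SqlTypes.JSON lets the PostgreSQL dialect bind this value
    // as jsonb instead of character varying, which is the type mismatch the
    // error above complains about.
    @JdbcTypeCode(SqlTypes.JSON)
    @Column(name = "dummy", columnDefinition = "jsonb")
    private Map<String, Object> dummy;

    // Getters and setters omitted for brevity.
}

As I understand it, a Hibernate 5-era AttributeConverter that serializes the object to a String still compiles under Hibernate 6, but on PostgreSQL the value ends up bound as character varying, which is exactly the mismatch the error reports.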

Current State of AI

AI is incredibly powerful, and as I've mentioned, I use it almost daily. I work hard to refine my daily processes and incorporate AI into them. I see tremendous value in many aspects of my work, including development. I might use Copilot to help me debug code or Claude to improve my grammar and wording. There are many use cases where I leverage AI. That being said, we are all aware of the issues that come with AI:

Hallucinations: In the context of LLMs, hallucinations refer to the phenomenon where the model confidently generates false, misleading, or nonsensical information. While LLMs can generate human-like text, they do not truly understand context or have personal experiences. They might struggle with understanding nuances, idioms, or culturally specific references.

Lack of Real-Time Information: LLMs are trained on static datasets and, with a handful of exceptions (such as ChatGPT and Claude), most do not have real-time web browsing capabilities for now. They may therefore lack up-to-date information, and their knowledge cutoff is determined by the data they were trained on. As an example, GPT-3.5's knowledge cutoff was January 2022.

Bias and Stereotyping: LLMs can inadvertently perpetuate and amplify biases present in their training data, leading to stereotyped or prejudiced outputs. They may also generate offensive or inappropriate content.

Lack of Common Sense and Logical Reasoning: LLMs may struggle with tasks that require common sense or complex logical reasoning. They can provide illogical or nonsensical responses to certain prompts.

No Fact-Checking: LLMs do not have a built-in mechanism to verify the accuracy of the information they generate. They can confidently provide false or misleading information, which can be challenging to distinguish from accurate information.

Over-Reliance: Becoming overly dependent on LLMs for information or decisions can lead to a lack of critical thinking and skill development. It is essential to maintain and improve your own problem-solving and research skills.

Privacy and Security Concerns: When interacting with LLMs, users might share sensitive information, which could raise privacy concerns. Additionally, the outputs generated by LLMs could potentially be used maliciously, such as for creating convincing misinformation or phishing attempts.

Consistency Issues: LLMs can provide different answers to the same question when asked multiple times due to the probabilistic nature of their outputs. This inconsistency can make it difficult to rely on their responses.

I perceive AI as a great tool, but you are in the driver's seat: you must evaluate the responses you receive and make the final decisions. Knowledge and expertise are still crucial, because they are what allow you to write good prompts and judge the answers AI gives back.

Going Forward

Knowledge sharing remains valuable for engineers, but the sources of knowledge might change. It may become crucial to rely on sources such as blog posts or guides. It has become clear to me that you can't always depend on AI. There will be occasions when you need to understand the underlying concept of a certain technology and seek answers. Having the option to access someone's knowledge by reading guides or blogs will be important. AI will fall short in certain ways, at least for the foreseeable future. You might end up searching for a solution on Google and finding your answers on page six or seven (most likely sooner) because someone wrote a blog post or created a comprehensive guide.

As engineers, we have both an opportunity and a responsibility to consider how we fill this emerging knowledge gap. AI needs high-quality sources for training, and with the decline of platforms like StackOverflow, this void might be filled with our writing. No one knows what the future holds, but I believe it does not mean the end of writing; perhaps it signifies another evolutionary step in the process. After all, critical thinking remains essential even as AI becomes more capable. What do you think?

Published by Jernej Klancic