With generative AI, don’t believe the hype (or the anti-hype)

No technology in human history has seen as much interest in such a short time as generative AI (gen AI). Many leading tech companies are pouring billions of dollars into training large language models (LLMs). But can this technology justify the investment? Can it possibly live up to the hype?

High hopes

Back in the spring of 2023, quite a long time ago in the artificial intelligence (AI) space, Goldman Sachs released a report estimating that the emergence of generative AI could raise global GDP by 7% over a 10-year period, an increase of almost USD 7 trillion.

How might generative AI achieve this? The applications of this technology are numerous, but they can generally be described as improving the efficiency of communication between humans and machines. This improvement will lead to the automation of low-level tasks and the augmentation of human abilities, enabling workers to accomplish more with greater proficiency.

Because of the wide-ranging applications and complexity of generative AI, many media reports might lead readers to believe that the technology is an almost magical cure-all. Indeed, this perspective characterized much of the coverage around generative AI as the release of ChatGPT and other tools mainstreamed the technology in 2022, with some analysts predicting that we were on the brink of a revolution that would reshape the future of work.

4 crises

Not even 2 years later, media enthusiasm around generative AI has cooled slightly. In June, Goldman Sachs released another report with a more measured assessment, questioning whether the benefits of generative AI could justify the trillion-dollar investment in its development. The Financial Times, among other outlets, published an op-ed with a similarly skeptical view. The IBM Think Newsletter team summarized and responded to some of these uncertainties in an earlier post.

Subsequent stock market fluctuations led several analysts to proclaim that the “AI bubble” was about to pop and that a market correction on the scale of the dot-com collapse of the early 2000s might follow.

The media skepticism around generative AI can be roughly broken down into 4 distinct crises developers face:

  • The data crisis: The vast troves of data used to train LLMs are diminishing in value. Publishers and online platforms are locking up their data, and our demand for training data might soon exhaust the supply.
  • The compute crisis: The demand for graphics processing units (GPUs) to process this data is leading to a bottleneck in chip supply.
  • The power crisis: Companies developing the largest LLMs are consuming more power every year, and our current energy infrastructure is not equipped to keep up with the demand.
  • The use case crisis: Generative AI has yet to find its “killer app” in the enterprise context. Some especially pessimistic critics suggest that future applications might not meaningfully extend beyond “parlor trick” status.

These are serious hurdles, but many remain optimistic that solving the last problem (use cases) will help resolve the other 3. The good news is that developers are already identifying and working on meaningful use cases.

Stepping outside the hype cycle

“Generative AI is having a marked, measurable impact on ourselves and our clients, fundamentally changing the way that we work,” says IBM distinguished engineer Chris Hay. “This is across all industries and disciplines, from transforming HR processes and marketing transformations through branded content to contact centers or software development.” Hay believes we are in the corrective phase that often follows a period of rampant enthusiasm, and perhaps the recent media pessimism can be seen as an attempt to balance out earlier statements that, in hindsight, seem like hype.

“I wouldn’t want to be that analyst,” says Hay, referencing one of the gloomier recent prognostications about the future of AI. “I wouldn’t want to be the person who says, ‘AI is not going to do anything useful in the next 10 years,’ because you’re going to be quoted on that for the rest of your life.”

Such statements might prove as shortsighted as claims that the early internet wouldn’t amount to much or the 1943 prediction, often attributed to IBM’s Thomas Watson, that the world wouldn’t need more than 5 computers. Hay argues that part of the problem is that the media often conflates gen AI with a narrower application of LLM-powered chatbots such as ChatGPT, which might indeed not be equipped to solve every problem that enterprises face.

Overcoming limitations and working within them

If we start to run into supply bottlenecks—whether in data, compute or power—Hay believes that engineers will get creative to resolve these impediments.

“When you have an abundance of something, you consume it,” says Hay. “If you’ve got hundreds of thousands of GPUs sitting around, you’re going to use them. But when you have constraints, you become more creative.”

For example, synthetic data represents a promising way to address the data crisis. This data is created algorithmically to mimic the characteristics of real-world data and can serve as an alternative or supplement to it. While machine learning engineers must be careful about overusing synthetic data, a hybrid approach might help overcome the scarcity of real-world data in the short term. For instance, the recent Microsoft Phi-3.5 models and Hugging Face SmolLM models have been trained with substantial amounts of synthetic data, resulting in highly capable small models.
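As a rough illustration of the hybrid approach, the sketch below keeps every real example and tops the set up with synthetic ones until they make up a chosen share of the mix. The function name `mix_datasets`, the 20% share and the placeholder records are assumptions made for this example; they do not describe how any of the models named above were trained.

```python
import random

def mix_datasets(real, synthetic, synthetic_fraction, seed=0):
    """Return a shuffled training set in which about `synthetic_fraction`
    of the examples are synthetic. Every real example is kept."""
    if not 0.0 <= synthetic_fraction < 1.0:
        raise ValueError("synthetic_fraction must be in [0, 1)")
    rng = random.Random(seed)
    # Synthetic examples needed so that they form the requested share
    # of the final mix, capped by how many are actually available.
    wanted = round(len(real) * synthetic_fraction / (1.0 - synthetic_fraction))
    wanted = min(wanted, len(synthetic))
    mixed = list(real) + rng.sample(list(synthetic), wanted)
    rng.shuffle(mixed)
    return mixed

# Hypothetical placeholder records standing in for real training data.
real = [f"real-{i}" for i in range(80)]
synthetic = [f"synthetic-{i}" for i in range(500)]
train = mix_datasets(real, synthetic, synthetic_fraction=0.2)
```

Capping the synthetic share explicitly is one simple way to act on the caution about overuse: the ratio becomes a parameter that can be tuned and audited rather than an accident of how much generated data happens to be on hand.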

Today’s LLMs are power-hungry, but there’s little reason to believe that current transformers are the final architecture. Models built on state space models (SSMs), such as Mistral Codestral Mamba, Jamba 1.5 or Falcon Mamba 7B, are gaining popularity due to their increased context length capabilities. Hybrid architectures that use multiple types of models are also gaining traction. Beyond architecture, engineers are finding value in other methods, such as quantization, chips designed specifically for inference, and fine-tuning, a deep learning technique that involves adapting a pretrained model for specific use cases.
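To make the idea of quantization concrete, here is a minimal sketch of symmetric 8-bit quantization in plain Python: each floating-point weight is stored as a small integer plus one shared scale factor. The weight values are invented for illustration, and production toolkits use more refined schemes (per-channel scales, calibration data, 4-bit formats), so treat this as the basic principle only.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the int8 range.
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    """Recover approximate float values from the stored integers."""
    return [q * scale for q in quantized]

# Hypothetical weights; a real layer would hold millions of values.
weights = [0.42, -1.27, 0.05, 0.9]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Each weight now needs 8 bits instead of 32, at the cost of a rounding error of at most half a quantization step (`scale / 2`), which is the trade that lets larger models fit on smaller hardware.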

“I’d love to see more of a community around fine-tuning in the industry, rather than the pretraining,” says Hay. “Pretraining is the most expensive part of the process. Fine-tuning is so much cheaper, and you can potentially get a lot more value out of it.”

Hay suggests that in the future, we might have more GPUs than we know what to do with because our techniques have become much more efficient. He recently experimented with turning a personal laptop into a machine capable of training models. By rebuilding more efficient data pipelines and tinkering with batching, he is figuring out ways to work within the limitations. He could naturally do all this on an expensive H100 Tensor Core GPU, but a scarcity mindset enabled him to find more efficient ways to achieve the desired results. Necessity was the mother of invention.

Thinking smaller

Models are becoming smaller and more powerful.

“If you look at the smaller models of today, they’re trained with more tokens than the larger models of last year,” says Hay. “People are stuffing more tokens into smaller models, and those models are becoming more efficient and faster.”

“When we think about applications of AI to solve real business problems, what we find is that these specialty models are becoming more important,” says Brent Smolinski, IBM’s Global Head of Tech, Data and AI Strategy. These include so-called small language models and non-generative models, such as forecasting models, which require a narrower data set. In this context, data quality often outweighs quantity. Also, these specialty models consume less power and are easier to control.

“A lot of research is going into developing more computationally efficient algorithms,” Smolinski adds. More efficient models address all 4 of the proposed crises: they consume less data, power and compute, and being faster, they open up new use cases.

“The LLMs are great because they have a very natural conversational interface, and the more data you feed in, the more natural the conversation feels,” says Smolinski. “But these LLMs are, in the context of narrow domains or problems, subject to hallucinations, which is a real problem. So, our clients are often opting for small language models, and if the interface isn’t perfectly natural, that’s OK because for certain problems, it doesn’t need to be.”

Agentic workflows

Generative AI might not be a cure-all, but it is a powerful tool in the belt. Consider the agentic workflow, which refers to a multi-step approach to using LLMs and AI agents to perform tasks. These agents act with a degree of independence and decision-making capability, interacting with data, systems and sometimes people, to complete their assigned tasks. Specialized agents can be designed to handle specific tasks or areas of expertise, bringing in deep knowledge and experience that LLMs might lack. These agents can either draw on more specialized data or integrate domain-specific algorithms and models.

Imagine a telecommunications company where an agentic workflow orchestrated by an LLM efficiently manages customer support inquiries. When a customer submits a request, the LLM processes the inquiry, categorizes the issue, and triggers specific agents to handle various tasks. For instance, one agent retrieves the customer’s account details and verifies the information provided, while another diagnoses the problem, such as running checks on the network or examining billing discrepancies.

When the issue is identified, a third agent formulates a solution, whether that’s resetting equipment, offering a refund or scheduling a technician visit. The LLM then assists a communication agent in generating a personalized response to the customer, helping to ensure that the message is clear and consistent with the company’s brand voice. After resolving the issue, a feedback loop is initiated, where an agent collects customer feedback to determine satisfaction. If the customer is unhappy, the LLM reviews the feedback and might trigger other follow-up actions, such as a call from a human agent.
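The telecommunications scenario above can be sketched as a small pipeline in Python. Everything here is hypothetical: the `Ticket` fields, the agent function names and the keyword rule that stands in for the LLM’s categorization step are assumptions chosen to show the control flow, not a real product’s API. The communication agent is omitted for brevity.

```python
from dataclasses import dataclass, field

@dataclass
class Ticket:
    customer_id: str
    message: str
    category: str = ""
    resolution: str = ""
    log: list = field(default_factory=list)

def classify(ticket):
    """Stand-in for the LLM step: a keyword rule picks the category."""
    text = ticket.message.lower()
    is_billing = "bill" in text or "charge" in text
    ticket.category = "billing" if is_billing else "network"
    ticket.log.append(f"classified:{ticket.category}")

def account_agent(ticket):
    # Would look up and verify account details in a real system.
    ticket.log.append(f"account verified:{ticket.customer_id}")

def diagnosis_agent(ticket):
    check = "billing audit" if ticket.category == "billing" else "line test"
    ticket.log.append(f"diagnosis:{check}")

def solution_agent(ticket):
    if ticket.category == "billing":
        ticket.resolution = "refund issued"
    else:
        ticket.resolution = "technician visit scheduled"
    ticket.log.append(f"solution:{ticket.resolution}")

def feedback_agent(ticket, satisfied):
    # An unhappy customer triggers escalation to a human agent.
    outcome = "satisfied" if satisfied else "escalate to human agent"
    ticket.log.append(f"feedback:{outcome}")

def handle(ticket, satisfied=True):
    """Orchestrate the agents in the order described in the scenario."""
    classify(ticket)
    for agent in (account_agent, diagnosis_agent, solution_agent):
        agent(ticket)
    feedback_agent(ticket, satisfied)
    return ticket

ticket = handle(Ticket("C-1001", "I was charged twice on my last bill"),
                satisfied=False)
```

Note that only `classify` would need a language model in practice. The other agents are ordinary functions calling ordinary systems, so the expensive component is invoked once per request rather than at every step.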

LLMs, while versatile, can struggle with tasks that require deep domain expertise or specialized knowledge, especially when these tasks fall outside the LLM’s training data. They are also slow and not well-suited for making real-time decisions in dynamic environments. In contrast, agents can operate autonomously and proactively, in real time, by using simpler decision-making algorithms.

Agents, unlike large, monolithic LLMs, can also be designed to learn from and adapt to their environment. They can use reinforcement learning or feedback loops to improve performance over time, adjusting strategies based on the success or failure of previous tasks. Agentic workflows themselves generate new data, which can then be used for further training.

This scenario highlights how an LLM is a useful part of solving a business problem, but not the entire solution. This is good news because the LLM is often the costliest piece of the value chain.

Looking past the hype

Smolinski argues that people often go to extremes when excited about new technology. We might think a new technology will transform the world, and when it fails to do so, we might become overly pessimistic.

“I think the answer is somewhere in the middle,” he says, arguing that AI needs to be part of a broader strategy to solve business problems. “It’s usually never AI by itself, and even if it is, it’s using possibly multiple types of AI models that you’re applying in tandem to solve a problem. But you need to start with the problem. If there’s an AI application that could have a material impact on your decision-making ability that would, in turn, lead to a material financial impact, focus on those areas, and then figure out how to apply the right set of technologies and AI. Leverage the full toolkit, not just LLMs, but the full breadth of tools available.”

As for the so-called “use case crisis,” Hay is confident that even more compelling use cases justifying the cost of these models will emerge.

“If you wait until the technology is perfect and only enter the market once everything is normalized, that’s a good way to be disrupted,” he says. “I’m not sure I’d take that chance.”
