
Latest Generative AI Trends

  • Writer: Layla
  • Apr 2, 2024
  • 36 min read

Updated: Apr 27, 2024



Generative artificial intelligence (AI) refers to algorithms (like ChatGPT) that can generate new content, such as audio, code, images, text, simulations, and videos. Recent advances in the field have the potential to fundamentally alter the way we approach content creation.

The tremendous capability of generative AI will help to democratise access to AI's revolutionary potential. And I believe that to fully grasp how it will affect our lives, we all need to understand what is coming.


The top ten generative AI trends of 2024 highlight the vast potential of this modern technology and make a compelling case for why it is a wise investment for any business. So without further ado, let's look at the trends.


But first, let's take a closer look at how the rise of Generative AI is affecting the world today!

Generative AI systems fall under the broad category of machine learning, and one such system, ChatGPT, describes what it can do as follows:

Are you ready to push your creativity to the next level? Look no further than generative AI. This innovative type of machine learning enables computers to generate a wide range of new and exciting content, from music and art to entire virtual worlds.


And it's not just for fun; generative AI has a variety of practical applications, such as developing new product designs and optimising business processes. So, why wait? Unleash the power of generative AI and see what incredible creations you can make!


Did anything in that paragraph seem strange to you? Perhaps not. The grammar is perfect, the tone is appropriate, and the narrative flows smoothly.


What are ChatGPT and DALL-E?

ChatGPT may be getting all the attention right now, but it isn't the first text-based machine learning model to make a splash. In recent years, OpenAI's GPT-3 and Google's BERT have both received a lot of attention.


However, prior to ChatGPT, which, by most accounts, works well most of the time (though it is still being evaluated), AI chatbots did not always receive positive feedback. GPT-3 is "by turns super impressive and super disappointing," according to New York Times tech reporter Cade Metz in a video in which he and food writer Priya Krishna ask GPT-3 to write recipes for a (rather disastrous) Thanksgiving dinner.


The first machine learning models to work with text were trained by humans to classify various inputs based on labels assigned by researchers. One example is a model that has been trained to classify social media posts as positive or negative. This type of training is referred to as supervised learning because a human is responsible for "teaching" the model what to do.


The next generation of text-based machine learning models uses self-supervised learning. This type of training involves feeding a model a large amount of text so that it can generate predictions. Some models, for example, can predict the ending of a sentence based on only a few words.
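To make the distinction concrete, here is a minimal, framework-free Python sketch (illustrative only, not any particular production pipeline) of how the training targets differ: a human supplies the labels in the supervised case, while the self-supervised case derives its targets from the raw text itself.

```python
# Supervised: a human assigns the label for each input.
supervised_examples = [
    ("Loving this new phone!", "positive"),
    ("Worst customer service ever.", "negative"),
]

# Self-supervised: the labels are carved out of the raw text itself,
# here as next-word prediction targets.
def next_word_examples(sentence: str):
    """Turn one sentence into (context, target) training pairs."""
    words = sentence.split()
    return [(" ".join(words[:i]), words[i]) for i in range(1, len(words))]

for context, target in next_word_examples("the cat sat on the mat"):
    print(f"{context!r} -> {target!r}")
# 'the' -> 'cat', 'the cat' -> 'sat', ... no human labelling required.
```

Because the labels come for free, self-supervised models can be trained on vastly more text than any team of human annotators could ever label.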


What does it take to build a generative AI model?

Building a generative AI model has traditionally been a massive undertaking, with only a few well-resourced tech heavyweights attempting it. OpenAI, the company behind ChatGPT, former GPT models, and DALL-E, has raised billions of dollars from well-known donors. DeepMind is a subsidiary of Alphabet, Google's parent company, and Meta has launched its Make-A-Video product, which uses generative AI.


These companies employ some of the world's most talented computer scientists and engineers.

But it's more than just talent. When you ask a model to train across nearly the entire internet, it will cost you. OpenAI has not disclosed exact costs, but estimates suggest that GPT-3 was trained on approximately 45 terabytes of text data—equivalent to about one million feet of bookshelf space, or one-quarter of the entire Library of Congress—at a cost of several million dollars. These are not resources that a garden-variety start-up can use.


What kinds of output can a generative AI model produce?

As previously stated, the outputs of generative AI models can be indistinguishable from human-generated content or appear uncanny. The results are determined by the model's quality—as we've seen, ChatGPT's outputs appear to be superior to those of its predecessors—as well as the model's fit with the use case, or input.


In ten seconds, ChatGPT can produce what one commentator described as a "solid A-" essay comparing Benedict Anderson and Ernest Gellner's theories of nationalism. It also produced a well-known passage describing how to remove a peanut butter sandwich from a VCR in the style of the King James Bible. AI-generated art models, such as DALL-E (named after surrealist artist Salvador Dalí and the beloved Pixar robot WALL-E), can produce unique and stunning images, such as a Raphael painting of a Madonna and child eating pizza. Other generative AI models can create code, video, audio, and business simulations.

However, the results are not always accurate—or appropriate.


When Priya Krishna asked DALL-E 2 to create an image for Thanksgiving dinner, it came up with a scene in which the turkey was garnished with whole limes and served alongside a bowl of what appeared to be guacamole. ChatGPT, for its part, appears to struggle with counting or solving basic algebra problems, let alone overcoming the sexist and racist bias that lurks in the undercurrents of the internet and society as a whole.


Generative AI outputs are carefully calibrated combinations of the data used to train the algorithms. Because the amount of data used to train these algorithms is so massive—as previously stated, GPT-3 was trained on 45 terabytes of text data—the models can appear to be "creative" in their outputs. Furthermore, the models typically contain random elements, which allows them to generate a variety of outputs from a single input request, making them appear even more lifelike.
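To illustrate that random element, here is a toy Python sketch of temperature-based sampling. The "logits" (next-token scores) are invented for the example, and the whole thing is a simplification of what real models do internally.

```python
import math
import random

# Invented next-token scores for a prompt like "My favourite dinner is ..."
logits = {"pizza": 2.0, "pasta": 1.5, "salad": 0.3}

def sample_next_token(logits: dict, temperature: float = 1.0) -> str:
    # Softmax over temperature-scaled scores, then a weighted random draw.
    scaled = {tok: score / temperature for tok, score in logits.items()}
    total = sum(math.exp(v) for v in scaled.values())
    probs = {tok: math.exp(v) / total for tok, v in scaled.items()}
    return random.choices(list(probs), weights=list(probs.values()))[0]

# The same input can produce a different output on every call; a higher
# temperature flattens the distribution and increases variety.
print([sample_next_token(logits, temperature=0.8) for _ in range(5)])
```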


What kinds of problems can a generative AI model solve?

You've probably noticed that generative AI tools (toys?) like ChatGPT can provide hours of entertainment. Businesses, too, see a clear opportunity. Generative AI tools can generate a wide range of credible writing in seconds and then refine it in response to criticism until it fits the intended purpose.


This has far-reaching implications for a wide range of industries, from IT and software companies that can benefit from AI models' instantaneous, mostly correct code to businesses that require marketing copy. In short, any organisation that needs to create clear written materials may stand to benefit. Organisations can also use generative AI to create technical materials like higher-resolution medical images. Organisations can save time and resources here.


We've seen that developing a generative AI model is so resource-intensive that it's out of reach for all but the largest and most well-resourced businesses. Companies that want to use generative AI can do so either out of the box or by fine-tuning it to perform a particular task. If you need to prepare slides in a specific style, you could ask the model to "learn" how headlines are typically written based on the data in the slides, then feed it slide data and instruct it to write appropriate headlines.
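As a hedged illustration of that slide-headline workflow, the sketch below assembles a few-shot prompt from existing slides so the model can pick up the house style in-context. The `complete()` function is a hypothetical placeholder, not a real library call; wire it to whichever model API or local model you actually use.

```python
# `complete()` is a stand-in for whatever text-generation API or local
# model you actually use; it is hypothetical, not a real library call.
def complete(prompt: str) -> str:
    raise NotImplementedError("wire this to your model of choice")

# A few existing slides teach the model the house headline style in-context.
style_examples = [
    ("Q3 revenue up 12% on cloud growth", "Cloud momentum drives Q3 beat"),
    ("Churn fell after onboarding revamp", "New onboarding cuts churn"),
]

def headline_for(slide_summary: str) -> str:
    shots = "\n".join(f"Slide: {s}\nHeadline: {h}" for s, h in style_examples)
    prompt = f"{shots}\nSlide: {slide_summary}\nHeadline:"
    return complete(prompt)
```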


What are the limitations of AI models? How can these potentially be overcome?

Because these models are so new, we have yet to see their long-term impact. This means there are some inherent risks involved in using them, both known and unknown.


The outputs of generative AI models can often sound extremely convincing. This is by design. But sometimes the information they generate is simply incorrect. Worse, it can be manipulated to enable unethical or criminal behaviour.


For example, ChatGPT will not give you instructions on how to hotwire a car; however, if you say you need to hotwire a car to save a baby, the algorithm will gladly comply. Organisations that rely on generative AI models must consider the reputational and legal risks associated with unintentionally publishing biased, offensive, or copyrighted content.


However, there are several ways to mitigate these risks. For starters, the initial data used to train these models must be carefully selected to avoid toxic or biased content. Next, instead of using a pre-built generative AI model, organisations could consider using smaller, more specialised models.


Organisations with more resources could tailor a general model to fit their specific needs and minimise biases. Organisations should also keep a human in the loop (i.e., ensure that a real human reviews the output of a generative AI model before it is published or used) and avoid using generative AI models for critical decisions involving significant resources or human well-being.


It cannot be overemphasised that this is a completely new field. The landscape of risks and opportunities is expected to shift rapidly in the coming weeks, months, and years. New use cases are tested on a monthly basis, and new models will most likely be developed in the coming years.


As generative AI is increasingly and seamlessly integrated into business, society, and our personal lives, we can expect a new regulatory environment to emerge. As organisations begin to experiment—and create value—with these tools, leaders will benefit from keeping an eye on regulatory and risk issues.

Reality check: more realistic expectations


When generative AI first gained widespread attention, a typical business leader's knowledge was derived primarily from marketing materials and sensationalised news coverage; the only tangible experience many of us had was messing around with ChatGPT and DALL-E. Now that the dust has settled, the business community has a better grasp of AI-powered solutions.


The Gartner Hype Cycle positions Generative AI squarely at the “Peak of Inflated Expectations,” on the cusp of a slide into the “Trough of Disillusionment”—in other words, about to enter a (relatively) underwhelming transition period—while Deloitte’s “State of Generative AI in the Enterprise” report from Q1 2024 indicated that many leaders “expect substantial transformative impacts in the short term.”


The comparison between real-world results and hype is partly subjective. Standalone tools, such as ChatGPT, are often at the forefront of popular imagination, but seamless integration into established services frequently results in greater longevity. Prior to the current hype cycle, generative machine learning tools like Google's "Smart Compose" feature, which debuted in 2018, were not heralded as a paradigm shift, despite being forerunners of today's text generation services. Similarly, many high-impact generative AI tools are being implemented as integrated elements of enterprise environments, enhancing and complementing existing tools rather than revolutionising or replacing them. Examples include "Copilot" features in Microsoft Office, "Generative Fill" features in Adobe Photoshop, and virtual agents in productivity and collaboration apps.


Where generative AI first gains traction in everyday workflows will have a greater impact on the future of AI tools than the potential benefits of any specific AI capabilities. According to a recent IBM survey of over 1,000 employees at enterprise-scale companies, the top three drivers of AI adoption are advances in AI tools that make them more accessible, the need to reduce costs and automate key processes, and the increasing amount of AI embedded in standard off-the-shelf business applications.

Multimodal AI (and video)


That being said, the ambition of cutting-edge generative AI is increasing. The next wave of advancements will focus not only on improving performance within a specific domain, but also on multimodal models capable of accepting multiple types of data as input. While models that operate across multiple data modalities are not a new phenomenon—text-image models like CLIP and speech-to-text models like wav2vec have existed for many years—they have typically only operated in one direction and were trained to perform a specific task.


The next generation of interdisciplinary models, which includes proprietary models like OpenAI's GPT-4V and Google's Gemini, as well as open source models like LLaVa, Adept, and Qwen-VL, can freely switch between natural language processing (NLP) and computer vision tasks. New models are also incorporating video: in late January, Google introduced Lumiere, a text-to-video diffusion model that can also perform image-to-video tasks or use images as style references.


The most obvious advantage of multimodal AI is more intuitive, versatile AI applications and virtual assistants. Users can, for example, ask about an image and receive a natural language response, or ask aloud for repair instructions and receive visual aids as well as step-by-step text instructions.

On a higher level, multimodal AI enables a model to process a wider range of data inputs, enriching and expanding the information available for training and inference. Videos, in particular, have enormous potential for holistic learning.


"There are cameras that are on 24/7 and they're capturing what happens just as it happens, without any filtering, without any intentionality," says Peter Norvig, Distinguished Education Fellow at the Stanford Institute for Human-Centered Artificial Intelligence (HAI).


"AI models have never had that kind of data before. Those models will simply have a better understanding of everything.


Smaller language models and open source advancements

In domain-specific models, particularly LLMs, larger parameter counts are likely to yield diminishing returns. Sam Altman, CEO of OpenAI (whose GPT-4 model is said to have around 1.76 trillion parameters), suggested this at MIT's Imagination in Action event last April: "I think we're at the end of the era where it's going to be these giant models, and we'll make them better in other ways," he said. "I think there's been way too much focus on parameter count."


Massive models fueled the ongoing AI golden age, but they are not without drawbacks. Only the largest companies have the resources and server space to train and maintain energy-intensive models with hundreds of billions of parameters. According to one University of Washington estimate, training a single GPT-3-sized model consumes more than 1,000 households' worth of electricity per year; a typical day of ChatGPT queries is equivalent to the daily energy consumption of 33,000 US households.


In contrast, smaller models require far fewer resources. DeepMind's influential March 2022 paper demonstrated that training smaller models on more data outperforms training larger models on less data. Much of the ongoing innovation in LLMs has thus centred on producing more output with fewer parameters. Models can be downsized without significantly sacrificing performance, as evidenced by recent progress in the 3-70 billion parameter range, particularly models built on the LLaMa, Llama 2, and Mistral foundation models in 2023.
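That DeepMind result (the "Chinchilla" work) is often distilled into a rule of thumb of roughly 20 training tokens per model parameter. The back-of-envelope sketch below applies that heuristic; the constant is an approximation drawn from common readings of the paper, not an exact law.

```python
# Rough heuristic distilled from the DeepMind ("Chinchilla") result:
# compute-optimal training uses on the order of 20 tokens per parameter.
TOKENS_PER_PARAM = 20  # approximate rule of thumb, not an exact law

def compute_optimal_tokens(n_params: float) -> float:
    return TOKENS_PER_PARAM * n_params

for params in (7e9, 70e9):
    tokens = compute_optimal_tokens(params)
    print(f"{params / 1e9:.0f}B params -> ~{tokens / 1e12:.2f}T tokens")
# A 7B model "wants" ~0.14T tokens by this rule; many recent small models
# train on far more, trading extra training compute for quality at small size.
```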


The power of open models will only grow. Mistral released "Mixtral," a mixture of experts (MoE) model that combines 8 neural networks, each with 7 billion parameters, in December 2023. Mistral claims that Mixtral not only outperforms the 70B parameter variant of Llama 2 on most benchmarks with 6 times faster inference speeds, but also matches or outperforms OpenAI's much larger GPT-3.5 on most standard benchmarks. Shortly after, Meta announced in January that it had begun training Llama 3 models and confirmed that they would be open source. Though details (such as model size) have not been confirmed, it is reasonable to expect Llama 3 to follow the framework established in the previous two generations.


These advances in smaller models have three important benefits:

  1. They contribute to the democratisation of AI by allowing more amateurs and institutions to study, train, and improve existing models using less expensive and more accessible hardware.

  2. They can run locally on smaller devices, enabling more sophisticated AI in scenarios such as edge computing and the internet of things (IoT). Furthermore, running models locally—such as on a user's smartphone—helps to avoid many privacy and cybersecurity concerns that arise when dealing with sensitive personal or proprietary data.

  3. They make AI more understandable: the larger the model, the more difficult it is to determine how and where it makes critical decisions. Explainable AI is critical for understanding, improving, and trusting the results of AI systems.

GPU shortages and cloud costs



According to a late 2023 O'Reilly report, cloud providers currently bear a significant portion of the computing burden: few AI adopters maintain their own infrastructure, and hardware shortages will only increase the challenges and costs of setting up on-premise servers. Long-term, this may increase cloud costs as providers update and optimise their own infrastructure to effectively meet demand from generative AI.


Navigating this uncertain landscape requires enterprises to be flexible in terms of both models (relying on smaller, more efficient models when necessary or larger, more performant models when practical) and deployment environments. "We don't want to constrain where people deploy [a model]," said IBM CEO Arvind Krishna in a December 2023 interview with CNBC, referring to IBM's Watsonx platform. "So, if they want to deploy it on a large public cloud, we will do so. If they want it deployed at IBM, we'll do it there. If they want to do it on their own and have adequate infrastructure, we will do it there."

Model optimisation is becoming more accessible


The recent output of the open source community contributes significantly to the trend of maximising the performance of more compact models.


Many significant advances have been (and will continue to be) driven not only by new foundational models, but also by new techniques and resources (such as open source datasets) for training, tweaking, fine-tuning, or aligning pre-trained models. Notable model-agnostic techniques that took hold in 2023 are:

  • Low Rank Adaptation (LoRA): Instead of directly fine-tuning billions of model parameters, LoRA involves freezing pre-trained model weights and injecting trainable layers—which represent the matrix of changes to model weights as two smaller (lower rank) matrices—into each transformer block. This significantly reduces the number of parameters that must be updated, resulting in faster fine-tuning and less memory required to store model updates.

  • Quantization: Similar to lowering the bitrate of audio or video to reduce file size and latency, quantization reduces the precision used to represent model data points—for example, from 16-bit floating point to 8-bit integer—to save memory and speed up inference. QLoRA techniques combine quantization and LoRA.

  • Direct Preference Optimisation (DPO): Chat models typically use reinforcement learning from human feedback (RLHF) to align model outputs with user preferences. RLHF, while powerful, is complex and unstable. DPO promises similar benefits while being computationally lightweight and much simpler.

Along with parallel advances in open source models in the 3-70 billion parameter space, these evolving techniques have the potential to change the dynamics of the AI landscape by providing previously inaccessible sophisticated AI capabilities to smaller players such as startups and amateurs.
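As a rough illustration of how accessible these techniques have become, the sketch below applies LoRA using Hugging Face's peft library. It assumes transformers and peft are installed, the module names match Llama-family models, and the model id is only an example (Llama 2 weights are gated and require access approval).

```python
# Assumes `transformers` and `peft` are installed; the model id is an
# example (Llama 2 weights are gated and require access approval).
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

config = LoraConfig(
    r=8,                                  # rank of the low-rank update matrices
    lora_alpha=32,                        # scaling factor for the updates
    target_modules=["q_proj", "v_proj"],  # attention projections (Llama naming)
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, config)  # base weights frozen, adapters injected
model.print_trainable_parameters()    # typically well under 1% of all parameters
```

Because only the small adapter matrices are trained, the memory and compute bill for fine-tuning drops dramatically, which is precisely what puts these techniques within reach of smaller players.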


Customized local models and data pipelines

In 2024, enterprises can pursue differentiation through bespoke model development rather than building wrappers around repackaged services from "Big AI." Existing open source AI models and tools can be tailored to almost any real-world scenario with the right data and development framework, including customer support, supply chain management, and complex document analysis.


Open source models enable organisations to quickly develop powerful custom AI models—trained on their proprietary data and fine-tuned for their specific needs—without incurring prohibitively high infrastructure costs. This is especially true in domains such as law, healthcare, and finance, where foundation models may not have learned highly specialised vocabulary and concepts during pre-training.


Legal, finance, and healthcare are all prime examples of industries that can benefit from models that are small enough to run locally on basic hardware. Keeping AI training, inference, and retrieval augmented generation (RAG) local reduces the possibility of proprietary data or sensitive personal information being used to train closed-source models or falling into the hands of third parties. Using RAG to access relevant information rather than storing all knowledge directly within the LLM helps to reduce model size, increasing speed and lowering costs.
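Here is a minimal, framework-free sketch of that local RAG pattern: retrieve the most relevant in-house documents, then pass only those to the model. TF-IDF stands in for a proper embedding model, the policy snippets are invented, and `local_llm()` is a placeholder for whatever locally hosted model you run.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

documents = [
    "Policy 12.3: claims above $10,000 require two approvals.",
    "Policy 4.1: reimbursements are processed within 14 days.",
    "Policy 9.8: contractors are not eligible for travel reimbursement.",
]

vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(documents)

def local_llm(prompt: str) -> str:
    raise NotImplementedError("call your locally hosted model here")

def retrieve(query: str, k: int = 2) -> list:
    # Rank documents by similarity to the query and keep the top k.
    scores = cosine_similarity(vectorizer.transform([query]), doc_vectors)[0]
    return [documents[i] for i in scores.argsort()[::-1][:k]]

def answer(query: str) -> str:
    context = "\n".join(retrieve(query))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    return local_llm(prompt)

# The knowledge lives in the documents, not in the model weights, so the
# model can stay small and the data never leaves your own infrastructure.
```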


As 2024 continues to level the model playing field, proprietary data pipelines that enable industry-best fine-tuning will become increasingly important for competitive advantages.


More capable virtual agents

With more sophisticated, efficient tools and a year's worth of market feedback at their disposal, businesses are ready to broaden the use cases for virtual agents beyond simple customer experience chatbots.


As AI systems accelerate and incorporate new streams and formats of information, they broaden the possibilities for not only communication and instruction, but also task automation. "2023 was the year of being able to communicate with an AI," says Stanford's Norvig.


According to Norvig, multiple companies launched something, but the interaction was always the same: you type something in and it types something back. "By 2024, agents will be able to complete tasks for you. Make reservations, plan your trip, and connect to other services."


Multimodal AI, in particular, greatly expands the opportunities for seamless interaction with virtual agents. Instead of simply asking a bot for recipes, a user can point a camera at an open fridge and request recipes that use the ingredients that are currently available. Be My Eyes, a mobile app that connects blind and low vision people with volunteers to help with quick tasks, is testing AI tools that allow users to directly interact with their surroundings using multimodal AI instead of waiting for a human volunteer.


Regulation, copyright and ethical AI concerns

Increased multimodal capabilities and lower entry barriers open up new avenues for abuse, including deepfakes, privacy concerns, bias perpetuation, and even evasion of CAPTCHA safeguards. In January 2024, a wave of explicit celebrity deepfakes hit social media; research from May 2023 revealed that eight times as many voice deepfakes had been posted online as in the same period in 2022.


Ambiguity in the regulatory environment may slow adoption, or at least discourage more aggressive implementation, in the short to medium term. Any major, irreversible investment in an emerging technology or practice that may require significant retooling—or even become illegal—as a result of new legislation or shifting political headwinds in the coming years carries inherent risk.


In December 2023, the European Union (EU) reached a preliminary agreement on the Artificial Intelligence Act. Among other things, it forbids the indiscriminate scraping of images to create facial recognition databases, biometric categorization systems with the potential for discriminatory bias, "social scoring" systems, and the use of AI for social or economic manipulation. It also seeks to define a category of "high-risk" AI systems that have the potential to endanger safety, fundamental rights, or the rule of law and will be subject to increased oversight. Similarly, it establishes transparency standards for "general-purpose AI (GPAI)" systems (foundation models), which include technical documentation and systemic adversarial testing.


However, while some key players, such as Mistral, are based in the EU, the majority of groundbreaking AI development is taking place in the United States, where substantive legislation governing AI in the private sector will require congressional action—which may be unlikely in an election year.


On October 30, the Biden administration issued a comprehensive executive order outlining 150 requirements for federal agencies' use of AI technologies; months earlier, the administration obtained voluntary commitments from prominent AI developers to follow certain trust and security safeguards. Notably, both California and Colorado are actively pursuing their own legislation concerning individuals' data privacy rights in relation to artificial intelligence.


China has taken a more proactive approach to formal AI restrictions, prohibiting price discrimination by recommendation algorithms on social media and mandating clear labelling of AI-generated content. Prospective regulations on generative AI seek to require that the training data used to train LLMs and the content generated by models be "true and accurate," prompting experts to suggest measures to censor LLM output.


Meanwhile, the role of copyrighted material in training AI models used for content generation, ranging from language models to image generators and video models, is a hotly debated topic. The outcome of the New York Times' high-profile lawsuit against OpenAI may have far-reaching consequences for AI legislation. Adversarial tools, such as Glaze and Nightshade, developed at the University of Chicago, have emerged in what may turn out to be an arms race between creators and model developers.


Shadow AI (and corporate AI policies)

For businesses, the growing risk of legal, regulatory, economic, or reputational consequences is exacerbated by the popularity and accessibility of generative AI tools. Organisations must not only have a careful, coherent, and clearly articulated corporate policy regarding generative AI, but also be wary of shadow AI, which is the "unofficial" personal use of AI in the workplace by employees.


Shadow AI, also known as "shadow IT" or "BYOAI," occurs when impatient employees seeking quick solutions (or simply wanting to experiment with new technology faster than a cautious company policy allows) implement generative AI in the workplace without first obtaining approval or oversight from IT. Many consumer-facing services, some of which are free of charge, enable nontechnical individuals to improve their use of generative AI tools. According to an Ernst & Young survey, 90% of respondents use artificial intelligence at work.


That enterprising spirit can be great in a vacuum, but eager employees may lack relevant information or perspectives on security, privacy, or compliance. This can expose businesses to a significant amount of risk. For example, an employee may unknowingly feed trade secrets to a public-facing AI model that continuously trains on user input, or they may use copyrighted material to train a proprietary model for content generation, exposing their company to legal action.


As with many ongoing developments, this demonstrates how the dangers of generative AI increase almost linearly with its capabilities. With great power comes great responsibility.


Emergence Of Multimodal AI Models

OpenAI's GPT-4, Meta's Llama 2, and Mistral are all examples of advances in large language models. The technology goes beyond text with multimodal AI models, which allow users to prompt and generate new content by mixing and matching text, audio, images, and videos. This method combines data, such as images, text, and speech, with advanced algorithms to predict and generate results.


Multimodal AI is expected to evolve significantly in 2024, ushering in a new era of generative AI. These models are moving beyond traditional single-mode functions by incorporating a variety of data types such as images, language, and audio. The transition to multimodal models will make AI more intuitive and dynamic.


Capable And Powerful Small Language Models

If 2023 marked the year of large language models, 2024 will see the rise of small language models. LLMs are trained using massive datasets like Common Crawl and The Pile. These datasets contain terabytes of data extracted from billions of publicly accessible websites. Although the data is useful in teaching LLMs to generate meaningful content and predict the next word, its noisy nature stems from its reliance on general internet content.


Small language models, on the other hand, are trained on smaller datasets but still include high-quality sources like textbooks, journals, and authoritative content. These models have fewer parameters and require less storage and memory, allowing them to run on less powerful and expensive hardware. Despite their small size, SLMs produce content that is comparable to some of their larger counterparts.

Microsoft's Phi-2 and Mistral 7B are two promising SLMs that will drive the next wave of generative AI applications.


Enterprises will be able to fine-tune SLMs to meet specific tasks and domain-specific requirements. This will satisfy the legal and regulatory requirements, accelerating the adoption of language models.
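As a hedged sketch of how little code running an SLM locally can take, the example below loads Phi-2 through the Hugging Face transformers pipeline. It assumes a recent transformers release and enough local memory; the prompt and generation settings are illustrative only.

```python
# May require a recent `transformers` release (Phi support landed in 4.37).
from transformers import pipeline

generator = pipeline("text-generation", model="microsoft/phi-2")

out = generator(
    "Summarise the key obligations in a mutual NDA:",
    max_new_tokens=120,
    do_sample=True,       # enable sampling for varied output
    temperature=0.7,
)
print(out[0]["generated_text"])
```

A ~2.7B-parameter model like this fits on a single consumer GPU, or even runs (slowly) on CPU, which is exactly the deployment profile that makes SLMs attractive for regulated, on-premises use.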


The Rise Of Autonomous Agents

Autonomous agents represent an innovative approach to building generative AI models. These agents are self-contained software programmes designed to achieve a specific goal. When considering generative AI, the ability of autonomous agents to generate content without human intervention overcomes the constraints associated with traditional prompt engineering.


Autonomous agents are developed using advanced algorithms and machine learning techniques. These agents use data to learn, adapt to new situations, and make decisions with minimal human intervention. For example, OpenAI has developed tools like custom GPTs that make effective use of autonomous agents, indicating significant advancements in the field of artificial intelligence.


Multimodal AI, which combines various AI techniques such as natural language processing, computer vision, and machine learning, is essential for the creation of autonomous agents. It can make predictions, take actions, and interact more effectively by analysing multiple data types simultaneously and applying the current context.


Frameworks such as LangChain and LlamaIndex are popular tools for developing LLM-based agents. In 2024, we will see new frameworks that make use of multimodal AI.
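Conceptually, these frameworks wrap a loop like the framework-free sketch below: the model chooses a tool, the program executes it, and the observation is fed back until the goal is met. `llm_decide()` and the toy tools are illustrative placeholders, not any real framework's API.

```python
# Toy tools the agent can invoke; real agents call search, booking,
# database, or code-execution services here.
TOOLS = {
    "search_flights": lambda origin, dest: f"3 flights found {origin} -> {dest}",
    "book_hotel": lambda city, nights: f"booked {nights} nights in {city}",
}

def llm_decide(goal: str, history: list) -> dict:
    """Placeholder: ask your model for the next step as structured output,
    e.g. {"tool": "search_flights", "args": {...}} or {"done": True}."""
    raise NotImplementedError("wire this to your LLM of choice")

def run_agent(goal: str, max_steps: int = 5) -> list:
    history = []
    for _ in range(max_steps):
        decision = llm_decide(goal, history)
        if decision.get("done"):          # the model declares the goal met
            break
        result = TOOLS[decision["tool"]](**decision["args"])
        history.append({"tool": decision["tool"], "result": result})
    return history
```

The `max_steps` cap and the explicit tool whitelist are the simplest guardrails: they bound what an autonomous loop can do even when the model's decisions are imperfect.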


Autonomous agents will greatly benefit the customer experience by allowing for intelligent and responsive interactions. These highly contextualised agents will benefit industries such as travel, hospitality, retail, and education by lowering overall costs through reduced human intervention.


Open Models Will Become Comparable With Proprietary Models

Open generative AI models are expected to evolve significantly in 2024, with some predictions putting them on par with proprietary models. The comparison between open and proprietary models, however, is complex and depends on a number of factors, including the specific use cases, development resources, and training data used by the models.


Meta's Llama 2 70B, Falcon 180B, and Mistral AI's Mixtral-8x7B were extremely popular in 2023, with performance comparable to proprietary models such as GPT-3.5, Claude 2, and Jurassic-2.


In the future, the gap between open and proprietary models will close, giving enterprises a great option for hosting generative AI models in hybrid or on-premises environments.

In 2024, the next iteration of models from Meta, Mistral, and possibly new entrants will be released as viable alternatives to proprietary models accessible via APIs.


Cloud Native Becomes Key To On-Prem GenAI

Kubernetes is already the preferred hosting platform for generative AI models. Key players such as Hugging Face, OpenAI, and Google are expected to deliver generative AI platforms using Kubernetes-based cloud native infrastructure.


Hugging Face's Text Generation Inference, AnyScale's Ray Serve, and vLLM all support model inference in containers. By 2024, frameworks, tools, and platforms built on Kubernetes will be mature enough to manage the entire lifecycle of foundation models. Users will be able to pre-train, fine-tune, deploy, and scale generative models effectively.


Key cloud native ecosystem players will share reference architectures, best practices, and optimisations for generative AI on cloud native infrastructure. LLMOps will be expanded to accommodate integrated cloud native workflows.


In 2024, generative AI will continue to evolve quickly, delivering new and unexpected capabilities that will benefit both consumers and businesses.


The Rise of Generative AI

The rise of Generative AI has been a remarkable journey, demonstrating the unwavering pursuit of creating machines capable of creative expression. Generative AI originated in the early 2010s, when researchers began investigating deep learning techniques for data generation. Early milestones included the creation of autoencoders and restricted Boltzmann machines, which paved the way for more sophisticated generative models to follow.


Ian Goodfellow and his team introduced Generative Adversarial Networks (GANs) in 2014, marking one of the field's most significant breakthroughs. GANs transformed Generative AI by introducing a novel two-network architecture: a generator for creating synthetic data and a discriminator for determining the authenticity of the generated data. Adversarial training enabled GANs to generate realistic images, videos, and audio. This was a watershed moment, catapulting Generative AI into the spotlight and sparking a wave of research and innovation.
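For readers who want to see the two-network idea concretely, here is a toy PyTorch sketch of the adversarial training loop on made-up one-dimensional data. Real image GANs are vastly larger, but they follow the same pattern: the discriminator learns to separate real from fake, and the generator learns to fool it.

```python
import torch
import torch.nn as nn

# Generator maps random noise to a synthetic sample; the discriminator
# scores how "real" a sample looks. They are trained against each other.
G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

for step in range(1000):
    real = torch.randn(64, 1) * 0.5 + 2.0  # toy "real" data: mean 2.0, s.d. 0.5
    fake = G(torch.randn(64, 8))           # synthetic samples from noise

    # Discriminator step: push real towards label 1, fake towards label 0.
    d_loss = (loss_fn(D(real), torch.ones(64, 1))
              + loss_fn(D(fake.detach()), torch.zeros(64, 1)))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator step: try to make the discriminator label fakes as real.
    g_loss = loss_fn(D(fake), torch.ones(64, 1))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
```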


As technology advanced, generative AI made its way into a variety of domains. AI-created masterpieces were displayed in prestigious galleries and auction houses, blurring the distinction between human and machine creativity. AI-powered chatbot characters and virtual worlds have become commonplace in video games and interactive experiences, enthralling audiences around the world.


Generative AI had a significant impact on industries such as fashion, healthcare, and architecture, where AI-generated designs, medical images, and building layouts introduced new levels of efficiency and creativity.


Today, Generative AI is evolving rapidly thanks to the collaboration of researchers, developers, and artists from various backgrounds. With each breakthrough, Generative AI pushes the boundaries of what is possible, paving the way for new levels of creativity and innovation. As we enter 2024, the rise of Generative AI shows no signs of slowing, with the potential to reshape industries, boost human creativity, and unlock novel solutions to some of society's most pressing challenges.


The Impact of Generative AI on Industries

Generative AI has had a transformative impact on various industries. Generative AI has pushed the envelope of creativity in the arts and entertainment industries. Artists are now using AI tools to create new artistic expressions, resulting in a fusion of human ingenuity and machine-generated beauty. AI-generated music, visual arts, and literature have made their way into galleries, concert halls, and literary circles, enthralling audiences with their unique and emotional compositions.


Furthermore, the entertainment industry has used Generative AI to create virtual characters, environments, and narratives that blur the lines between reality and fantasy, enhancing video games, movies, and immersive experiences.


In the world of marketing and content creation, Generative AI has transformed how brands interact with their audiences. AI algorithms drive personalised content, which caters to individual preferences, resulting in more meaningful and relevant interactions with consumers. Generative AI enables brands to provide personalised experiences at scale, fostering brand loyalty and customer satisfaction.


Furthermore, AI streamlines content creation workflows, allowing marketers to produce high-quality content more efficiently and cost-effectively.


Generative AI has also made significant contributions to the healthcare sector, particularly in medical imaging and drug discovery. AI-generated medical images help to diagnose and detect diseases with greater accuracy, accelerating the diagnostic process and improving patient outcomes. Furthermore, generative AI models are used to simulate the behaviour of complex biological systems, which speeds up drug discovery and development.


This groundbreaking technology enables the exploration of vast chemical spaces and the identification of potential drug candidates, reducing the time and resources required to bring life-saving medications to market. As the healthcare industry embraces Generative AI advancements, it has the potential to transform patient care and usher in a new era of medical innovation.


Now, let's look at the top ten Generative AI trends to watch in 2024 and see how these ground-breaking developments push the boundaries of creativity, efficiency, and problem-solving to new highs.

  1. AI for Creativity DALL-E, a generative AI tool, provided many surprises. It was among the first tools to create art from just a few inputs. Although its earlier versions struggled to generate decent art, it is now much better and produces art close to exactly what the user requests. Generative AI tools are not limited to visual art, either: they can create real-time animation, music, and audio for a variety of applications. This will continue to grow for years, allowing musicians, songwriters, artists, sound designers, and everyday users to fully realise the potential of generative AI tools and express their creativity.

  2. High-Level Personalization Generative AI was built with technologies that can provide personalised experiences. These include GANs, neural networks, advanced machine learning algorithms, and language models. These are fed massive amounts of data to train their data analysis, generation, and prediction capabilities, resulting in a system that can analyse an individual's personal preferences, produce matching results, and become extremely engaging. In effect, the system helps you find exactly what you want, quickly. High-level personalisation can help businesses generate significant revenue by targeting the right market and audience based on the right parameters. For example, generative AI-driven personalisation can help businesses develop customised content for any marketing campaign. Similarly, the sales team can increase sales by sending personalised product emails to potential customers after analysing their needs. Generative AI tools can accomplish this by analysing company demand, what the client has previously purchased, their top product picks, and their goals, then shortlisting the products they need.

  3. Advancements in Generative Adversarial Networks GANs are foundational to generative AI. GANs generate new data that resembles their training data; for example, a GAN can create a photorealistic image of a person who does not exist. A GAN pairs two networks, a generator and a discriminator, which compete with one another using deep learning methods to make the generated output ever harder to distinguish from the real thing. If you've used a generative AI tool to create images, text, audio, or video, a GAN-style architecture may well have been behind it. This trend is expected to continue in 2024, with GANs evolving and enabling new use cases.

  4. Conversational AI A few years ago, AI was rarely conversational; all it did was analyse data, learn patterns, and recommend changes or respond to simple commands. Voice assistants like Google Assistant, Alexa, and Siri illustrate this. Enter generative AI, and the conversational aspect has skyrocketed. Generative AI tools like ChatGPT are conversational on a human level; the sudden jump in AI's ability to converse caught many of us off guard. These AIs are so engaging in conversation because of their stack, which includes neural networks, natural language processing and generation, deep learning, and LLMs. This stack enables AI to be highly engaging and conversational, much like a human, and is already being adopted for voice assistants and customer care chatbots. Because these systems can respond with appropriate sentiment, they can offer the comfort people need when describing their experiences; in customer service, for example, a bot can respond empathetically to feedback about a defective product and provide personalised care. In short, conversational AI can improve business operations at every level by giving staff and customers human-like experiences in real time. This could be the most interesting trend of 2024.

  5. Generative AI Infrastructure The technology stack of any IT practice evolves to keep the domain competitive, and generative AI is no exception. When ChatGPT first debuted, it was based on the GPT-3 (generative pre-trained transformer) model, with the primary goal of producing text such as articles, poems, essays, and news reports. Now OpenAI has taken a step further and improved its capabilities to support new applications. To accomplish this, it created the GPT-4 model, which focuses on scaling and incorporates Reinforcement Learning from Human Feedback (RLHF) to generate more relevant responses. Other startups, such as Anthropic, have been developing their own feedback models, such as RL-CAI, to power their chatbots. This technological adaptation will shape AIs in 2024, allowing them to respond more accurately to specific human tasks and better understand humans.

  6. Generative AI For Scientific Research Technology has accelerated scientific research, and generative AI holds promise for accelerating it further across a variety of fields. This will lead to increased innovation, production, and implementation of new research techniques that can benefit various sectors and improve people's lives. Generative AI models are trained on massive datasets; with such vast amounts of research data, they learn, adapt, and internalise research processes and parameters, generating insights and hypotheses across disciplines. Physics, astronomy, biology, chemistry, and other disciplines benefit from generative AI systems that improve the analysis, generation, and prediction of research outcomes, such as the products of a chemical reaction, the heat generated, concentration levels, and molecular structure. Generative AI has already begun to transform such fields. Among them is healthcare, where AI-assisted gene sequencing helps determine how gene expression will change in response to specific genetic edits, informing the development of medicines that improve patients' overall health.

  7. NLP Applications Generative AIs can communicate in a human-like manner. Text, audio, images, and videos have all become more natural in conversations, with appropriate sentiment. This is thanks to Natural Language Processing (NLP), which enables generative AIs to read text, hear speech, identify sentiment and its intensity, detect the critical parts, and suggest responses based on relevant information. This seemed impossible with traditional AI models, which were designed only to analyse, detect, and provide statistical data. Generative AI has pushed NLP to evolve, allowing AI to comprehend data accurately and interact more effectively with humans. This year, the trend towards NLP applications will grow, bringing voice assistants and chatbots that converse almost like humans.

  8. Intelligent Process Automation With AI taking over business processes, companies must build a solid foundation of genAI tools that enable automation for more efficient, effective, and faster business operations. Generative AI-powered automation has numerous advantages, including automating data entry, invoicing, accounting, and documentation, allowing businesses to shift their resources to more complex roles for maximum output. Another advantage of AI automation is that businesses can gain insights into various business parameters in seconds and evaluate the results instantly to strategise around specific areas. Large language models (LLMs) can analyse all business data and categorise it into structured and unstructured formats in order to standardise newly formed data and provide accurate knowledge of business logic. Similarly, generative AI image recognition tools powered by natural language understanding can aid in the detection of document anomalies, strengthening logical responses and improving cognitive automation to address issues such as workforce shortages. Furthermore, robotic process automation can offer business-specific benefits such as automated insurance claims, marketing and sales, fraud detection and risk management, supply chain automation, and so on. With an increasing number of AI tools centred on automation this year, the automation trend will grow significantly.

  9. Ethical Concerns With the increased use of generative AI-powered tools, there is concern about how well the new AI will adhere to ethical and legal boundaries, particularly when collecting various types of data across the web, including personal and other sensitive information. Generative AI can generate synthetic data resembling any individual's real data in order to train its models; however, the risks are real, such as the re-identification of individuals from synthetic data, which poses a data privacy concern. There is also growing concern that AI will be biased on the basis of race, religion, and other factors, potentially causing social harm as AI is deployed via the internet. This can happen because humans have posted biased data online favouring a particular race, country, or religion; AI trained on such biased data can produce similarly biased responses that may be offensive. Such concerns will come to a head in 2024, prompting businesses to reconsider and devise workable solutions to eliminate such uncertainties.

  10. Wide Range of Generative AI Applications It all began with Midjourney and Stable Diffusion, the first generative AI models to go viral on social media. Following the release of ChatGPT, generative AI quickly became one of the most talked-about topics in tech. The world can now get answers to almost any question with a single prompt thanks to ChatGPT, and can use generative AI tools to generate images, videos, audio, art, and other media.

Because of their growing popularity, use cases, demand, and benefits, an increasing number of generative AI tools are being developed. AI companies have also begun to keep up with rising demand by developing one-of-a-kind AI tools that offer distinct functionality for both casual and professional operations.


For example, Jasper is quickly gaining popularity as one of the best copywriting tools, built on the GPT-4 model for enterprise-centric content. Another tool, Harvey, is trained on proprietary data and used primarily in the legal field. It can extract context from complex legal terms and generate contracts for the multiple parties involved.


Similarly, many tools are entering the market promising exceptional automation capabilities for a variety of business operations, and 2024 will be the year when everyone bets on one or more generative AI tools, as major AI products such as Bard and GPT-4-based chatbots enter the game with exceptional capabilities.


AI technology at work and 'bring your own AI'

At the start of the generative AI boom, many organisations were cautious and prohibited employees from using ChatGPT. Samsung, JPMorgan Chase, Apple, and Microsoft were among those that imposed temporary restrictions.


They were sceptical of ChatGPT's training data and feared that using the generative AI tool would result in internal data breaches.


While many enterprises remain cautious about generative AI and traditional AI technology, Forrester Research analyst Michele Goetz predicts that by 2024, enterprises will allow their employees to use more generative AI.


"We know that nearly two-thirds of employees are already playing around and experimenting with AI in their job or in their personal life," she told me. "We really expect that this is going to be more normal in our everyday business practices in 2024."

That scenario could involve more employees using generative AI to increase their effectiveness and productivity.


According to Goetz, many employees will likely use a "bring your own AI" (BYOAI) system. BYOAI means that employees will use any mainstream or experimental AI service to complete business tasks, regardless of whether the company approves of it. These tools could be generative AI systems such as ChatGPT and DALL-E, or software containing unapproved embedded AI technology.


Enterprises will have few options but to invest more in AI and encourage employees to use it responsibly.

BP, a British multinational oil and gas company, is already incorporating generative and classic AI technology into its culture.


"One of the things that we're taking very intentionally is an idea about AI for everyone," Justin Lewis, BP's vice president of incubation and engineering, said during a panel discussion at The AI Summit New York on December 7.

Lewis explained that BP's vision of AI for everyone entails more than simply providing all employees with access to AI tools and technology.

The oil and gas company's goal is for every employee, regardless of technical background, to be able to create their own AI tools and publish, share, and reuse them.

"The lower barrier to entry that we're seeing with LLMs and generative AI in general makes that more possible today than it ever has been," he said.

"If we can remove the bottleneck and get to a point where citizen developers, or citizens, with no experience building any technical tools are able to build their own AI tools and leverage and scale them and share them, then we'll be able to advance AI innovation much faster," he said.

That kind of innovation comes from using AI to help employees be more productive.

In 2024, generative AI will accomplish this in part through "shadow AI," according to Goetz.


Shadow AI is the use of AI technology to supplement or increase employee productivity without hiring additional employees. According to Lewis, BP has already implemented one type of shadow AI.

"What we're seeing most impactful are in the places where you're helping humans perform 10 times better, 10 times faster," he went on to say.

One example is software engineering. "BP has a team that does code reviews as a service using AI," Lewis continued, explaining that the team lets one engineer review code for 175 other engineers.

"It has a radical impact on the way you think about shaping the organisation," he went on to say.

"There are a lot of risks that can come from personal AI use or bring your own AI," Goetz went on to say.

Many businesses will invest proactively in governance and AI compliance to stay ahead of the curve, she said. Governance will take various forms.

For organisations that provide their employees with generative AI and allow personal AI use in the workplace, governance entails monitoring how employees use the technology by flagging inappropriate or low-quality prompts, according to Lewis.

Governance also entails looking externally at government regulations that have been proposed or enacted in order to stay ahead of compliance, Goetz explained.

Existing AI-related regulations include the New York City AI Hiring Law and the California Data Privacy Law.

Preparing for upcoming regulations not only benefits organisations financially, but also protects them, she explained.

Technology companies that develop capabilities that comply with existing or potential regulations will benefit their customers. That should result in more revenue, she said.

Furthermore, compliance reduces the risk of lawsuits: generative AI models are built using intellectual property, which could expose organisations to claims of illegally appropriating IP.

Staying on top of governance and regulation allows organisations to participate in potential regulation, Goetz added.

"It's also the ability to influence and know how much teeth are in these regulations," she went on to say.

Furthermore, as enterprises and organisations begin to seriously consider AI governance, insurers may begin to offer policies to protect against hallucinations, which occur when AI systems produce false or distorted information, according to Goetz.

"Insurance companies are also recognising that their policies may not actually be covering all of the risk permutations that newer AI capabilities are going to introduce -- hallucinations being one of those," she went on to say.


More multimodal and open models

According to Chandrasekaran, more personalised AI models will most likely result in more multimodal models.

Multimodal models combine a variety of modes or data. The models can convert text input from a user into an image, audio, or video output, for example.

Current iterations of this include image-generating models such as DALL-E, which convert text into images.

However, Google's recent release of Gemini, a model capable of training on and producing text, images, audio, and video, demonstrates what multimodal models may look like in the future.

For example, combining speech, text, and images could improve disease diagnosis in healthcare.

"The potential for multimodality in terms of enabling more advanced use cases is certainly immense," Chandrasekaran said.

Open source models, in addition to multimodal models, will become increasingly popular, according to Goetz.

"What you're going to see is almost this capitalist type of effect, where all of these models are going to come to market, and then it's going to get Darwinian" in terms of the stronger models beating out the less successful ones," she said.

Enterprise adoption of the models will also evolve, she said.


More AI startups and more sophisticated offerings

Generative AI paved the way for numerous AI startups to enter the market.

While many more startups will emerge in 2024, they will provide more sophisticated offerings than those available today, according to Forrester Research analyst Rowan Curran.

According to Curran, new startups will create more application-specific offerings rather than offerings centred on AI chatbots like ChatGPT.


"That's going to be driven and supported by the increasing array of those open source and proprietary tools to build on top of some of these core models themselves," he went on to say.

These could include LLMs, diffusion models (models that can generate data similar to the data on which they were trained), traditional machine learning models, or computer vision models, he added.

Furthermore, Curran predicts that new startups will emerge as a result of the development of domain-specific or smaller language models.


2024 will be a year dedicated to understanding how generative AI will shape the larger enterprise IT ecosphere.

"We really have to remember that this was just a first year of getting acclimatised to these things," said Curran. "Maybe into next year and even a year beyond is where we start to see a type of clarity come into what type of services are built with these things."


The Future of Generative AI

Advanced machine learning, which powers next-generation AI-enabled products, has been developed over decades. However, since ChatGPT's launch in late 2022, new iterations of gen AI technology have been released on a monthly basis. There were six significant advancements in March 2023 alone, including new customer relationship management solutions and financial services industry support.

The road to human-level performance just became shorter.


By the end of this decade, gen AI is expected to perform at a median level of human performance on the majority of assessed technical capabilities. And its performance is expected to compete with the top 25% of people who complete any or all of these tasks by 2040. In some cases, that is 40 years faster than experts previously thought.

Automation of knowledge work is now within sight.


Previous waves of automation technology primarily affected physical work activities, but gen AI is expected to have the greatest impact on knowledge work, particularly activities involving decision making and collaboration. Professionals in fields such as education, law, technology, and the arts may see parts of their jobs automated sooner than previously anticipated. This is due to generative AI's ability to predict and dynamically apply natural language patterns.


Apps continue to proliferate to address specific use cases.

Gen AI tools can already generate most types of written, image, video, audio, and coded content. Businesses are developing applications to address use cases in all of these areas. In the near future, we expect applications that target specific industries and functions to be more valuable than those that are more general.


Certain industries will benefit more than others.

Gen AI's precise impact will be determined by a number of factors, including the mix and importance of various business functions and the size of an industry's revenue. In almost every industry, the greatest gains will come from applying the technology to marketing and sales functions. In high tech and banking, however, gen AI's ability to accelerate software development will have an even greater impact.


Despite gen AI’s commercial promise, most organizations aren’t using it yet

When we asked marketing and sales leaders how frequently they thought their organisation should use gen AI or machine learning for commercial activities, 90 percent said it should be at least "often." As noted above, marketing and sales have the greatest potential for impact, so this is not surprising. However, 60 percent reported that their organisations rarely or never do so.


Marketing and sales leaders are most excited about three use cases.

According to our research, marketing and sales leaders expected at least a moderate impact from each of the gen AI use cases we proposed. They were particularly enthusiastic about lead generation, marketing optimisation, and personalised outreach.


Software engineering, the other major value driver for many industries, could become significantly more efficient.


When 40 of McKinsey's own developers tested generative AI-based tools, we discovered significant speed gains on many common developer tasks: documenting code functionality for maintainability (which considers how easily the code can be improved) took half the time, writing new code took nearly half the time, and optimising existing code (known as code refactoring) took nearly two-thirds of the time.


And gen AI assistance could make for happier developers

Our research discovered that providing developers with the tools they need to be most productive improved their experience significantly, potentially helping companies retain their best talent. Developers who used generative AI tools were more than twice as likely to report feeling happy, fulfilled, and in flow. They attributed this to the tools' ability to automate grunt work that kept them from more rewarding tasks and to provide information at their fingertips faster than searching for solutions across multiple online platforms.


Momentum among workers for using gen AI tools is building

According to a new McKinsey survey, the vast majority of workers, from a variety of industries and geographic locations, have used generative AI tools at least once, whether at work or elsewhere. That's pretty rapid adoption for less than a year. One unexpected finding is that baby boomers report using gen AI tools for work more than millennials do.


But organizations still need more gen AI–literate employees

As organisations begin to set gen AI goals, there is a growing demand for gen AI-literate workers. As generative and other applied AI tools begin to deliver value to early adopters, the supply-demand gap for skilled workers remains significant. To stay competitive in the talent market, organisations should develop strong talent management capabilities and provide rewarding work experiences to the gen AI-literate workers they hire and hope to retain.


Organizations should proceed with caution

Many people are excited about the potential of gen AI. However, like any new technology, gen AI is not without risks. For starters, gen AI has been known to create content that is biased, factually incorrect, or illegally scraped from a copyrighted source. Before adopting gen AI tools wholesale, organisations should consider the reputational and legal risks to which they may be exposed. One way to reduce the risk? Keep a human in the loop; that is, ensure that any gen AI output is checked by a real person before it is published or used.
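As a concrete illustration of that safeguard, here is a minimal human-in-the-loop sketch in Python. Every function here is a toy stand-in: the "model" is faked, and the review step is a console prompt where a real deployment would use a ticketing or review UI.

```python
def generate_draft(prompt: str) -> str:
    # Stand-in for a call to any generative model or API.
    return f"[model draft responding to: {prompt!r}]"

def publish(text: str) -> None:
    print(f"PUBLISHED: {text}")

def human_in_the_loop(prompt: str) -> None:
    draft = generate_draft(prompt)
    print("Draft for review:")
    print(draft)
    # The gate: nothing is published without explicit human approval.
    verdict = input("Approve for publication? [y/N] ").strip().lower()
    if verdict == "y":
        publish(draft)
    else:
        print("Draft rejected; nothing published.")

human_in_the_loop("Write a product announcement for our new policy tier.")
```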


Gen AI could ultimately boost global GDP

McKinsey discovered that gen AI could significantly increase labour productivity across the economy. To reap the benefits of this productivity boost, workers whose jobs are affected will need to shift to other work activities that allow them to match their 2022 productivity levels. If workers are supported in learning new skills and, in some cases, changing jobs, stronger global GDP growth could contribute to a more sustainable, inclusive world.


Gen AI represents just a small piece of the value potential from AI

While gen AI is a significant advancement, traditional advanced analytics and machine learning continue to account for the vast majority of task optimisation, and they continue to find new applications in a wide range of industries. Organisations undergoing digital and AI transformations would do well to keep an eye on gen AI, but not at the expense of other AI tools. Just because they don't make headlines doesn't mean they can't be used to boost productivity and, ultimately, value.


FAQs on Generative AI

1. What is generative AI, and how does it differ from traditional AI?

Generative AI is a subset of artificial intelligence that focuses on creating content autonomously. Unlike traditional AI, which often relies on pre-defined rules and patterns, generative AI uses algorithms to generate new and unique outputs, such as images, text, or even entire simulations. In 2024, we expect generative AI to push boundaries in creativity and innovation.
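To make the distinction concrete, here is a deliberately simplified Python sketch; both functions are toy stand-ins, not real systems. A rule-based classifier can only map input to one of a fixed set of labels, while a generative system produces new output each time.

```python
import random

# Traditional, rule-based AI: input is mapped to a fixed label by
# hand-written rules (a deliberately simplified caricature).
def rule_based_sentiment(text: str) -> str:
    positive_words = ("great", "love", "excellent")
    return "positive" if any(w in text.lower() for w in positive_words) else "negative"

# Generative AI produces new content rather than a label. A trivial random
# generator stands in here for a real model such as an LLM.
def toy_generate(vocabulary: list[str], length: int = 8) -> str:
    return " ".join(random.choice(vocabulary) for _ in range(length))

print(rule_based_sentiment("I love this product"))  # always one of two labels
print(toy_generate(["new", "and", "unique", "output", "every", "time"]))
```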


2. What are the key trends in generative AI that we should watch out for in 2024?

In 2024, several exciting trends are emerging in generative AI. These include advancements in natural language processing, improved image and video generation capabilities, enhanced creativity in content creation, increased adoption of AI in design processes, and the rise of more accessible generative AI tools for developers and creators.


3. How will generative AI impact industries beyond tech and entertainment?

Generative AI is not limited to tech and entertainment. In 2024, we anticipate its widespread impact across various industries, including healthcare, finance, marketing, and education. From personalized medicine and financial modeling to creative marketing campaigns and interactive educational content, generative AI is poised to revolutionize diverse sectors.


4. Are there ethical considerations associated with the use of generative AI?

Yes, ethical considerations are crucial when it comes to generative AI. In 2024, as these technologies become more powerful, addressing issues such as bias in training data, responsible use of AI-generated content, and transparency in AI decision-making will be paramount. The industry is increasingly focusing on developing ethical guidelines and standards to ensure responsible AI deployment.


5. How can businesses leverage generative AI to stay competitive in 2024?

Businesses can harness generative AI to stay competitive by exploring applications such as personalized customer experiences, AI-assisted content creation, predictive analytics, and process optimization. Integrating generative AI into workflows can enhance efficiency, innovation, and customer engagement, providing a strategic advantage in a rapidly evolving market landscape.

