
  • Meta raises the bar with open source Llama 3 LLM

    Meta has introduced Llama 3, the next generation of its state-of-the-art open source large language model (LLM). The tech giant claims Llama 3 establishes new performance benchmarks, surpassing previous industry-leading models like GPT-3.5 in real-world scenarios. “With Llama 3, we set out to build the best open models that are on par with the best proprietary models available today,” said Meta in a blog post announcing the release. The initial Llama 3 models being opened up are 8 billion and 70 billion parameter versions. Meta says its teams are still training larger 400 billion+ parameter models which will be released over the coming months, alongside research papers detailing the work. Llama 3 has been over two years in the making with significant resources dedicated to assembling high-quality training data, scaling up distributed training, optimising the model architecture, and innovative approaches to instruction fine-tuning. Meta’s 70 billion parameter instruction fine-tuned model outperformed GPT-3.5, Claude, and other LLMs of comparable scale in human evaluations across 12 key usage scenarios like coding, reasoning, and creative writing. The company’s 8 billion parameter pretrained model also sets new benchmarks on popular LLM evaluation tasks: “We believe these are the best open source models of their class, period,” stated Meta. The tech giant is releasing the models via an “open by default” approach to further an open ecosystem around AI development. Llama 3 will be available across all major cloud providers, model hosts, hardware manufacturers, and AI platforms. Victor Botev, CTO and co-founder of Iris.ai, said: “With the global shift towards AI regulation, the launch of Meta’s Llama 3 model is notable. By embracing transparency through open-sourcing, Meta aligns with the growing emphasis on responsible AI practices and ethical development. 
“Moreover, this grants the opportunity for wider community education, as open models facilitate insights into development and the ability to scrutinise various approaches, with this transparency feeding back into the drafting and enforcement of regulation.” Accompanying Meta’s latest models is an updated suite of AI safety tools, including the second iterations of Llama Guard for classifying risks and CyberSec Eval for assessing potential misuse. A new component called Code Shield has also been introduced to filter insecure code suggestions at inference time. “However, it’s important to maintain perspective – a model simply being open-source does not automatically equate to ethical AI,” Botev continued. “Addressing AI’s challenges requires a comprehensive approach to tackling issues like data privacy, algorithmic bias, and societal impacts – all key focuses of emerging AI regulations worldwide. While open initiatives like Llama 3 promote scrutiny and collaboration, their true impact hinges on a holistic approach to AI governance, compliance, and embedding ethics into AI systems’ lifecycles. Meta’s continuing efforts with the Llama models are a step in the right direction, but ethical AI demands sustained commitment from all stakeholders.” Meta says it has adopted a “system-level approach” to responsible AI development and deployment with Llama 3. While the models have undergone extensive safety testing, the company emphasises that developers should implement their own input/output filtering in line with their application’s requirements. The company’s end-user product integrating Llama 3 is Meta AI, which Meta claims is now the world’s leading AI assistant thanks to the new models. Users can access Meta AI via Facebook, Instagram, WhatsApp, Messenger, and the web for productivity, learning, creativity, and general queries. Multimodal versions of Meta AI integrating vision capabilities are on the way, with an early preview coming to Meta’s Ray-Ban smart glasses. 
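Code Shield's internals aren't detailed in the announcement, but the kind of inference-time output filtering Meta recommends developers implement can be sketched as a deny-list scan over model-generated code. The patterns and function name below are illustrative assumptions only; a real filter would rely on far more robust static analysis than regular expressions.

```python
import re

# Hypothetical deny-list; Code Shield's actual rules are not public, so this
# only illustrates the general shape of inference-time output filtering.
INSECURE_PATTERNS = [
    r"\beval\s*\(",            # arbitrary code execution
    r"\bpickle\.loads\s*\(",   # unsafe deserialisation
    r"password\s*=\s*['\"]",   # hard-coded credential
]

def filter_code_suggestion(suggestion: str) -> tuple[bool, list[str]]:
    """Return (is_safe, matched_patterns) for a model-generated snippet."""
    hits = [p for p in INSECURE_PATTERNS if re.search(p, suggestion)]
    return (not hits, hits)

print(filter_code_suggestion("data = pickle.loads(payload)")[0])  # False
```

In practice such a check would sit between the model and the user, rejecting or regenerating any suggestion that trips a rule.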
Despite the considerable achievements of Llama 3, some in the AI field have expressed scepticism over whether Meta’s open approach is truly motivated by “the good of society.” However, coming just a day after Mistral AI set a new benchmark for open source models with Mixtral 8x22B, Meta’s release once again raises the bar for openly available LLMs.

  • Amazon adds Andrew Ng, a leading voice in artificial intelligence, to its board of directors

Amazon is adding artificial intelligence visionary Andrew Ng to its board of directors, a move that comes amid intense AI competition among startups and big technology companies. The Seattle company said Thursday that Ng, a managing director at the Palo Alto, California-based AI Fund, will fill the board seat vacated by Judy McGrath, a former CEO of MTV who told Amazon she won’t stand for re-election. Ng’s AI Fund, which he founded in 2017, invests in entrepreneurs building artificial intelligence companies. Previously, he led AI teams at the Chinese tech company Baidu and at Google, where the team he oversaw taught a computer system to recognize cats in YouTube videos without ever being taught what a cat was. Ng’s addition to the board comes as Amazon, like other tech companies, makes massive investments in generative artificial intelligence. The company has invested $4 billion in the San Francisco-based startup Anthropic, which is partnering with Amazon to develop so-called foundation models that underpin generative AI technologies. In the past year, Amazon also rolled out a chatbot for businesses called Q and a generative-AI-powered shopping assistant named Rufus. In an annual shareholder letter released Thursday, Amazon CEO Andy Jassy suggested generative AI could be the next big pillar of Amazon’s business, joining the company’s prominent online marketplace, Prime subscription program and its cloud computing unit, AWS. Jassy wrote that generative AI may be the largest technological transformation since cloud computing, and “perhaps since the internet.” Meanwhile, other Amazon innovations have encountered some hiccups. The company said last week it was pulling its Just Walk Out technology from Amazon Fresh stores in the U.S. after receiving customer feedback. Amazon said it was replacing the technology, which allows customers to skip the checkout line, with smart carts that still let shoppers do that while also seeing their spending in real time.

  • Hugging Face launches Idefics2 vision-language model

Hugging Face has announced the release of Idefics2, a versatile model capable of understanding and generating text responses based on both images and texts. The model sets a new benchmark for answering visual questions, describing visual content, story creation from images, document information extraction, and even performing arithmetic operations based on visual input. Idefics2 leapfrogs its predecessor, Idefics1, with just eight billion parameters and the versatility afforded by its open license (Apache 2.0), along with remarkably enhanced Optical Character Recognition (OCR) capabilities. The model not only showcases exceptional performance in visual question answering benchmarks but also holds its ground against far larger contemporaries such as LLaVA-Next-34B and MM1-30B-chat. Central to Idefics2’s appeal is its integration with Hugging Face’s Transformers from the outset, ensuring ease of fine-tuning for a broad array of multimodal applications. For those eager to dive in, models are available for experimentation on the Hugging Face Hub. A standout feature of Idefics2 is its comprehensive training philosophy, blending openly available datasets including web documents, image-caption pairs, and OCR data. Furthermore, it introduces an innovative fine-tuning dataset dubbed ‘The Cauldron,’ amalgamating 50 meticulously curated datasets for multifaceted conversational training. Idefics2 exhibits a refined approach to image manipulation, maintaining native resolutions and aspect ratios—a notable deviation from conventional resizing norms in computer vision. Its architecture benefits significantly from advanced OCR capabilities, adeptly transcribing textual content within images and documents, and boasts improved performance in interpreting charts and figures. 
Simplifying the integration of visual features into the language backbone marks a shift from its predecessor’s architecture, with the adoption of a learned Perceiver pooling and MLP modality projection enhancing Idefics2’s overall efficacy. This advancement in vision-language models opens up new avenues for exploring multimodal interactions, with Idefics2 poised to serve as a foundational tool for the community. Its performance enhancements and technical innovations underscore the potential of combining visual and textual data in creating sophisticated, contextually-aware AI systems.
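The learned Perceiver pooling and MLP modality projection described above can be sketched numerically: a fixed set of learned latent queries cross-attends over a variable number of image-patch features, pooling them into a fixed number of visual tokens, which an MLP then projects to the language backbone's width. This is a toy illustration, not Idefics2's actual implementation; the dimensions are made up and the "learned" queries and weights are random stand-ins for trained parameters.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
n_patches, d_vis = 100, 64   # variable-length vision features (toy sizes)
n_latents, d_lm = 8, 128     # fixed visual-token budget, LM hidden width

patches = rng.normal(size=(n_patches, d_vis))   # vision encoder output
latents = rng.normal(size=(n_latents, d_vis))   # learned queries (random here)

# Cross-attention: each latent query attends over all image patches, so any
# number of patches is pooled down to a fixed number of visual tokens.
attn = softmax(latents @ patches.T / np.sqrt(d_vis))
pooled = attn @ patches                          # (n_latents, d_vis)

# MLP modality projection into the language backbone's hidden size.
W1 = rng.normal(size=(d_vis, d_lm))
W2 = rng.normal(size=(d_lm, d_lm))
visual_tokens = np.maximum(pooled @ W1, 0) @ W2
print(visual_tokens.shape)  # (8, 128)
```

The point of the design is visible in the shapes: however many patches come in, the language model always receives the same small, fixed number of visual tokens.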

  • Latest Generative AI Trends

Generative artificial intelligence (AI) refers to algorithms (like ChatGPT) that can generate new content, such as audio, code, images, text, simulations, and videos. Recent advances in the field have the potential to fundamentally alter the way we approach content creation. The tremendous capability of generative AI will help to democratise access to AI's revolutionary potential. And I believe that, in order to fully consider how it will affect our lives, we must all be aware of what is coming. The top ten generative AI trends for 2024 highlight the vast potential of this modern technology and make a compelling case for why it is a wise investment for any business. So without further ado, let's look at the trends. But first, let's take a closer look at how the rise of generative AI is affecting the world today! Generative AI systems fall under the broad category of machine learning, and one such system, ChatGPT, describes what it can do as follows: "Are you ready to push your creativity to the next level? Look no further than generative AI. This innovative type of machine learning enables computers to generate a wide range of new and exciting content, from music and art to entire virtual worlds. And it's not just for fun; generative AI has a variety of practical applications, such as developing new product designs and optimising business processes. So, why wait? Unleash the power of generative AI and see what incredible creations you can make!" Did anything in that paragraph seem strange to you? Perhaps not. The grammar is perfect, the tone is appropriate, and the narrative flows smoothly. What are ChatGPT and DALL-E? ChatGPT may be getting all the attention right now, but it isn't the first text-based machine learning model to make a splash. In recent years, OpenAI's GPT-3 and Google's BERT have both received a lot of attention. 
However, prior to ChatGPT, which, by most accounts, works well most of the time (though it is still being evaluated), AI chatbots did not always receive positive feedback. GPT-3 is "by turns super impressive and super disappointing," according to New York Times tech reporter Cade Metz in a video in which he and food writer Priya Krishna ask GPT-3 to write recipes for a (rather disastrous) Thanksgiving dinner. The first machine learning models to work with text were trained by humans to classify various inputs based on labels assigned by researchers. One example is a model that has been trained to classify social media posts as positive or negative. This type of training is referred to as supervised learning because a human is responsible for "teaching" the model what to do. The next generation of text-based machine learning models uses self-supervised learning. This type of training involves feeding a model a large amount of text so that it can generate predictions. Some models, for example, can predict the ending of a sentence based on only a few words. What does it take to build a generative AI model? Building a generative AI model has traditionally been a massive undertaking, with only a few well-resourced tech heavyweights attempting it. OpenAI, the company behind ChatGPT, earlier GPT models, and DALL-E, has raised billions of dollars from well-known investors. DeepMind is a subsidiary of Alphabet, Google's parent company, and Meta has launched its Make-A-Video product, which uses generative AI. These companies employ some of the world's most talented computer scientists and engineers. But it's more than just talent. When you ask a model to train across nearly the entire internet, it will cost you. OpenAI has not disclosed exact costs, but estimates suggest that GPT-3 was trained on approximately 45 terabytes of text data—equivalent to about one million feet of bookshelf space, or one-quarter of the entire Library of Congress—at a cost of several million dollars. 
These are not resources that a garden-variety start-up can use. What kinds of output can a generative AI model produce? As previously stated, the outputs of generative AI models can be indistinguishable from human-generated content or appear uncanny. The results are determined by the model's quality—as we've seen, ChatGPT's outputs appear to be superior to those of its predecessors—as well as the model's fit with the use case, or input. In ten seconds, ChatGPT can produce what one commentator described as a "solid A-" essay comparing Benedict Anderson and Ernest Gellner's theories of nationalism. It also produced a well-known passage describing how to remove a peanut butter sandwich from a VCR in the style of the King James Bible. AI-generated art models, such as DALL-E (named after surrealist artist Salvador Dalí and the beloved Pixar robot WALL-E), can produce unique and stunning images, such as a Raphael painting of a Madonna and child eating pizza. Other generative AI models can create code, video, audio, and business simulations. However, the results are not always accurate—or appropriate. When Priya Krishna asked DALL-E 2 to create an image for Thanksgiving dinner, it came up with a scene in which the turkey was garnished with whole limes and served alongside a bowl of what appeared to be guacamole. ChatGPT, for its part, appears to struggle with counting or solving basic algebra problems, let alone overcoming the sexist and racist bias that lurks in the undercurrents of the internet and society as a whole. Generative AI outputs are precise combinations of the data used to train the algorithms. Because the amount of data used to train these algorithms is so massive—as previously stated, GPT-3 was trained on 45 terabytes of text data—the models can appear to be "creative" in their outputs. 
Furthermore, the models typically contain random elements, which allows them to generate a variety of outputs from a single input request, making them appear even more lifelike. What kinds of problems can a generative AI model solve? You've probably noticed that generative AI tools (toys?) like ChatGPT can provide hours of entertainment. Businesses, too, see a clear opportunity. Generative AI tools can generate a wide range of credible writing in seconds and then respond to criticism to make the writing more suitable for its intended purpose. This has far-reaching implications for a wide range of industries, from IT and software companies that can benefit from AI models' instantaneous, mostly correct code to businesses that require marketing copy. In short, any organisation that needs to create clear written materials may stand to benefit. Organisations can also use generative AI to create technical materials, like higher-resolution medical images, saving time and resources. We've seen that developing a generative AI model is so resource-intensive that it's out of reach for all but the largest and most well-resourced businesses. Companies that want to use generative AI can instead use a model out of the box or fine-tune one to perform a particular task. If you need to prepare slides in a specific style, you could ask the model to "learn" how headlines are typically written based on the data in the slides, then feed it slide data and instruct it to write appropriate headlines. What are the limitations of AI models? How can these potentially be overcome? Because they are so new, we have yet to see the long-term impact of generative AI. This means there are some inherent risks to using them, both known and unknown. The outputs of generative AI models can often sound extremely convincing. This is by design. But sometimes the information they generate is simply incorrect. Worse, it can be manipulated to enable unethical or criminal behaviour. 
For example, ChatGPT will not give you instructions on how to hotwire a car; however, if you say you need to hotwire a car to save a baby, the algorithm will gladly comply. Organisations that rely on generative AI models must consider the reputational and legal risks associated with unintentionally publishing biased, offensive, or copyrighted content. However, there are several ways to mitigate these risks. For starters, the initial data used to train these models must be carefully selected to avoid toxic or biased content. Next, instead of using a pre-built generative AI model, organisations could consider using smaller, more specialised models. Organisations with more resources could tailor a general model to their specific needs and to minimise bias. Organisations should also keep a human in the loop (i.e., ensure that a real human reviews the output of a generative AI model before it is published or used) and avoid using generative AI models for critical decisions involving significant resources or human well-being. It cannot be overemphasised that this is a completely new field. The landscape of risks and opportunities is expected to shift rapidly in the coming weeks, months, and years. New use cases are tested on a monthly basis, and new models will most likely be developed in the coming years. As generative AI is increasingly and seamlessly integrated into business, society, and our personal lives, we can expect a new regulatory environment to emerge. As organisations begin to experiment—and create value—with these tools, leaders will benefit from keeping an eye on regulatory and risk issues. Reality check: more realistic expectations When generative AI first gained widespread attention, a typical business leader's knowledge was primarily derived from marketing materials and sensationalised news coverage. The only tangible experience many had was messing around with ChatGPT and DALL-E. 
Now that the dust has settled, the business community has a better grasp of AI-powered solutions. The Gartner Hype Cycle positions generative AI squarely at the “Peak of Inflated Expectations,” on the cusp of a slide into the “Trough of Disillusionment”—in other words, about to enter a (relatively) underwhelming transition period—while Deloitte’s “State of Generative AI in the Enterprise” report from Q1 2024 indicated that many leaders “expect substantial transformative impacts in the short term.” The comparison between real-world results and hype is partly subjective. Standalone tools, such as ChatGPT, are often at the forefront of popular imagination, but seamless integration into established services frequently results in greater longevity. Prior to the current hype cycle, generative machine learning tools like Google's "Smart Compose" feature, which debuted in 2018, were not heralded as a paradigm shift, despite being forerunners of today's text generation services. Similarly, many high-impact generative AI tools are being implemented as integrated elements of enterprise environments, enhancing and complementing existing tools rather than revolutionising or replacing them. Examples include "Copilot" features in Microsoft Office, "Generative Fill" features in Adobe Photoshop, and virtual agents in productivity and collaboration apps. Where generative AI first gains traction in everyday workflows will have a greater impact on the future of AI tools than the potential benefits of any specific AI capabilities. According to a recent IBM survey of over 1,000 employees at enterprise-scale companies, the top three drivers of AI adoption are advances in AI tools that make them more accessible, the need to reduce costs and automate key processes, and the increasing amount of AI embedded in standard off-the-shelf business applications. Multimodal AI (and video) That being said, the ambition of cutting-edge generative AI is increasing. 
The next wave of advancements will focus not only on improving performance within a specific domain, but also on multimodal models capable of accepting multiple types of data as input. While models that operate across multiple data modalities are not a new phenomenon—image-text models like CLIP and speech-to-text models like Wav2Vec have existed for many years—they have typically only operated in one direction and were trained to perform a specific task. The next generation of interdisciplinary models, which includes proprietary models like OpenAI's GPT-4V and Google's Gemini, as well as open source models like LLaVA, Adept, and Qwen-VL, can freely switch between natural language processing (NLP) and computer vision tasks. New models are also incorporating video: in late January, Google introduced Lumiere, a text-to-video diffusion model that can also perform image-to-video tasks or use images as style references. The most obvious advantage of multimodal AI is more intuitive, versatile AI applications and virtual assistants. Users can, for example, ask about an image and receive a natural language response, or ask aloud for repair instructions and receive visual aids as well as step-by-step text instructions. On a higher level, multimodal AI enables a model to process a wider range of data inputs, enriching and expanding the information available for training and inference. Videos, in particular, have enormous potential for holistic learning. "There are cameras that are on 24/7 and they're capturing what happens just as it happens, without any filtering, without any intentionality," says Peter Norvig, Distinguished Education Fellow at the Stanford Institute for Human-Centered Artificial Intelligence (HAI). "AI models have never had that kind of data before. Those models will simply have a better understanding of everything." 
Smaller language models and open source advancements In domain-specific models, particularly LLMs, larger parameter counts are likely to yield diminishing returns. Sam Altman, CEO of OpenAI (whose GPT-4 model is said to have around 1.76 trillion parameters), suggested this at MIT's Imagination in Action event last April: "I think we're at the end of the era where it's going to be these giant models, and we'll make them better in other ways," he said. "I think there's been way too much focus on parameter count." Massive models fueled the ongoing AI golden age, but they are not without drawbacks. Only the largest companies have the resources and server space to train and maintain energy-intensive models with hundreds of billions of parameters. According to one University of Washington estimate, training a single GPT-3-sized model consumes more than 1,000 households' worth of electricity per year; a typical day of ChatGPT queries is equivalent to the daily energy consumption of 33,000 US households. In contrast, smaller models require far fewer resources. DeepMind's influential March 2022 paper demonstrated that training smaller models on more data can outperform training larger models on less data. Much of the ongoing innovation in LLMs has thus centred on producing more output with fewer parameters. Models can be downsized without significantly sacrificing performance, as evidenced by recent progress in the 3-70 billion parameter range, particularly in models built on LLaMA, Llama 2, and Mistral foundations in 2023. The power of open models will only grow. In December 2023, Mistral released "Mixtral," a mixture of experts (MoE) model that combines 8 neural networks, each with 7 billion parameters. Mistral claims that Mixtral not only outperforms the 70B parameter variant of Llama 2 on most benchmarks with 6 times faster inference speeds, but also matches or outperforms OpenAI's much larger GPT-3.5 on most standard benchmarks. 
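The mixture-of-experts routing that lets Mixtral run faster than its total parameter count suggests can be sketched for a single token. This is a toy illustration under assumed dimensions, not Mixtral's implementation: real MoE layers use trained routers, feed-forward expert networks, and batched routing.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_experts, top_k = 16, 8, 2   # Mixtral-style: 8 experts, 2 active per token

router = rng.normal(size=(d, n_experts))              # gating weights (random here)
experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route one token to its top-k experts and mix their outputs."""
    scores = x @ router                               # (n_experts,) gating scores
    top = np.argsort(scores)[-top_k:]                 # indices of the best experts
    weights = np.exp(scores[top])
    weights /= weights.sum()                          # normalise over chosen experts
    # Only top_k of the 8 expert networks run for this token, so per-token
    # compute resembles a much smaller dense model even though the total
    # parameter count is large.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

out = moe_layer(rng.normal(size=d))
print(out.shape)  # (16,)
```

The design choice is the trade: all 8 experts' parameters must sit in memory, but each token pays the compute cost of only 2 of them.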
Shortly after, Meta announced in January that it had begun training Llama 3 models and confirmed that they would be open source. Though details (such as model size) have not been confirmed, it is reasonable to expect Llama 3 to follow the framework established in the previous two generations. These advances in smaller models have three important benefits: They contribute to the democratisation of AI by allowing more amateurs and institutions to study, train, and improve existing models using less expensive and more accessible hardware. They can run locally on smaller devices, enabling more sophisticated AI in scenarios such as edge computing and the internet of things (IoT). Furthermore, running models locally—such as on a user's smartphone—helps to avoid many privacy and cybersecurity concerns that arise when dealing with sensitive personal or proprietary data. They improve AI's explainability: the larger the model, the more difficult it is to pinpoint how and where it makes important decisions, and explainable AI is critical for understanding, improving, and trusting the results of AI systems. GPU shortages and cloud costs According to a late 2023 O'Reilly report, cloud providers currently bear a significant portion of the computing burden: few AI adopters maintain their own infrastructure, and hardware shortages will only increase the challenges and costs of setting up on-premise servers. Long-term, this may increase cloud costs as providers update and optimise their own infrastructure to effectively meet demand from generative AI. 
Navigating this uncertain landscape requires enterprises to be flexible in terms of both models (relying on smaller, more efficient models when necessary or larger, more performant models when practical) and deployment environments. "We don't want to constrain where people deploy [a model]," said IBM CEO Arvind Krishna in a December 2023 interview with CNBC, referring to IBM's Watsonx platform. "So, if they want to deploy it on a large public cloud, we will do so. If they want it deployed at IBM, we'll do it there. If they want to do it on their own and have adequate infrastructure, we will do it there." Model optimisation is becoming more accessible. The recent output of the open source community contributes significantly to the trend of maximising the performance of more compact models. Many significant advances have been (and will continue to be) driven not only by new foundational models, but also by new techniques and resources (such as open source datasets) for training, tweaking, fine-tuning, or aligning pre-trained models. Notable model-agnostic techniques that took hold in 2023 are: Low Rank Adaptation (LoRA): Instead of directly fine-tuning billions of model parameters, LoRA involves freezing pre-trained model weights and injecting trainable layers—which represent the matrix of changes to model weights as two smaller (lower rank) matrices—into each transformer block. This significantly reduces the number of parameters that must be updated, resulting in faster fine-tuning and less memory required to store model updates. Quantization: Similar to lowering the bitrate of audio or video to reduce file size and latency, quantization reduces the precision used to represent model data points—for example, from 16-bit floating point to 8-bit integer—to save memory and speed up inference. QLoRA techniques combine quantization and LoRA. 
Direct Preference Optimisation (DPO): Chat models typically use reinforcement learning from human feedback (RLHF) to align model outputs with user preferences. RLHF, while powerful, is complex and unstable. DPO promises similar benefits while being computationally lightweight and much simpler. Along with parallel advances in open source models in the 3-70 billion parameter space, these evolving techniques have the potential to change the dynamics of the AI landscape by providing previously inaccessible sophisticated AI capabilities to smaller players such as startups and amateurs. Customized local models and data pipelines In 2024, enterprises can pursue differentiation through bespoke model development rather than building wrappers around repackaged services from "Big AI." Existing open source AI models and tools can be tailored to almost any real-world scenario with the right data and development framework, including customer support, supply chain management, and complex document analysis. Open source models enable organisations to quickly develop powerful custom AI models—trained on their proprietary data and fine-tuned for their specific needs—without incurring prohibitively high infrastructure costs. This is especially true in domains such as law, healthcare, and finance, where foundation models may not have learned highly specialised vocabulary and concepts during pre-training. Legal, finance, and healthcare are all prime examples of industries that can benefit from models that are small enough to run locally on basic hardware. Keeping AI training, inference, and retrieval augmented generation (RAG) local reduces the possibility of proprietary data or sensitive personal information being used to train closed-source models or falling into the hands of third parties. Using RAG to access relevant information rather than storing all knowledge directly within the LLM helps to reduce model size, increasing speed and lowering costs. 
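The retrieval augmented generation pattern described above can be sketched end to end. The snippet below is a minimal illustration with made-up documents; it substitutes keyword overlap for the embedding-based similarity search a production RAG system would use.

```python
# Toy document store standing in for an organisation's proprietary data.
documents = [
    "Refunds are processed within 5 business days of approval.",
    "Premium support is available 24/7 via the customer portal.",
    "Warranty claims require the original proof of purchase.",
]

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Rank documents by keyword overlap with the query -- a stand-in for
    the vector similarity search a production RAG system would use."""
    q_words = set(query.lower().split())
    ranked = sorted(docs, key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return ranked[:k]

def build_prompt(query: str) -> str:
    # The LLM answers from retrieved context instead of memorised knowledge,
    # which is what lets the model itself stay small.
    context = "\n".join(retrieve(query, documents))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

print(retrieve("how long do refunds take", documents)[0])
```

Because the knowledge lives in the document store rather than in the model's weights, it can be updated without retraining and never leaves the organisation's infrastructure.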
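The LoRA and quantization techniques outlined earlier can both be illustrated with a few lines of arithmetic. This is a toy sketch with assumed dimensions (a 512-wide layer, rank 8), not a real training setup; libraries such as Hugging Face's PEFT and bitsandbytes implement production versions.

```python
import numpy as np

d_model, rank = 512, 8   # toy layer width and LoRA rank

# Full fine-tuning would update every entry of the d_model x d_model matrix.
full_params = d_model * d_model                  # 262,144 trainable parameters

# LoRA freezes W and learns the update as a product of two low-rank matrices:
# W_eff = W + A @ B, with A (d_model x rank) and B (rank x d_model).
lora_params = d_model * rank + rank * d_model    # 8,192 -> 32x fewer to train

rng = np.random.default_rng(0)
W = rng.normal(size=(d_model, d_model)).astype(np.float32)          # frozen
A = rng.normal(scale=0.01, size=(d_model, rank)).astype(np.float32)
B = np.zeros((rank, d_model), dtype=np.float32)  # B starts at zero in LoRA

x = rng.normal(size=d_model).astype(np.float32)
y = x @ W + (x @ A) @ B   # forward pass: frozen path plus low-rank update

# Quantization sketch: store weights as int8 plus one float scale factor,
# roughly quartering storage relative to float32 at a small precision cost.
scale = np.abs(W).max() / 127
W_int8 = np.round(W / scale).astype(np.int8)
W_dequant = W_int8.astype(np.float32) * scale
print(full_params // lora_params)  # 32
```

QLoRA combines the two ideas: the frozen base weights are held quantized while only the small A and B matrices are trained in higher precision.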
As 2024 continues to level the model playing field, proprietary data pipelines that enable industry-best fine-tuning will become increasingly important for competitive advantage. More capable virtual agents With more sophisticated, efficient tools and a year's worth of market feedback at their disposal, businesses are ready to broaden the use cases for virtual agents beyond simple customer experience chatbots. As AI systems accelerate and incorporate new streams and formats of information, they broaden the possibilities for not only communication and instruction, but also task automation. "2023 was the year of being able to communicate with an AI," according to Stanford's Norvig. "Multiple companies launched something, but the interaction was always the same: you type something in and it types something back. By 2024, agents will be able to complete tasks for you. Make reservations, plan your trip, and connect to other services." Multimodal AI, in particular, greatly expands the opportunities for seamless interaction with virtual agents. Instead of simply asking a bot for recipes, a user can point a camera at an open fridge and request recipes that use the ingredients that are currently available. Be My Eyes, a mobile app that connects blind and low vision people with volunteers to help with quick tasks, is testing AI tools that allow users to directly interact with their surroundings using multimodal AI instead of waiting for a human volunteer. Regulation, copyright and ethical AI concerns Increased multimodal capabilities and lower entry barriers open up new avenues for abuse, including deepfakes, privacy concerns, bias perpetuation, and even evasion of CAPTCHA safeguards. In January 2024, a wave of explicit celebrity deepfakes hit social media; research from May 2023 revealed that there were 8 times as many voice deepfakes posted online as in the same period in 2022. 
Ambiguity in the regulatory environment may slow adoption, or at least discourage more aggressive implementation, in the short to medium term. Any major, irreversible investment in an emerging technology or practice that may require significant retooling—or even become illegal—as a result of new legislation or shifting political headwinds in the coming years carries inherent risk. In December 2023, the European Union (EU) reached a preliminary agreement on the Artificial Intelligence Act. Among other things, it forbids the indiscriminate scraping of images to create facial recognition databases, biometric categorization systems with the potential for discriminatory bias, "social scoring" systems, and the use of AI for social or economic manipulation. It also seeks to define a category of "high-risk" AI systems that have the potential to endanger safety, fundamental rights, or the rule of law and will be subject to increased oversight. Similarly, it establishes transparency standards for "general-purpose AI (GPAI)" systems (foundation models), which include technical documentation and systemic adversarial testing. However, while some key players, such as Mistral, are based in the EU, the majority of groundbreaking AI development is taking place in the United States, where substantive legislation governing AI in the private sector will require congressional action—which may be unlikely in an election year. On October 30, the Biden administration issued a comprehensive executive order outlining 150 requirements for federal agencies' use of AI technologies; months earlier, the administration obtained voluntary commitments from prominent AI developers to follow certain trust and security safeguards. Notably, both California and Colorado are actively pursuing their own legislation concerning individuals' data privacy rights in relation to artificial intelligence. 
China has taken a more proactive approach to formal AI restrictions, prohibiting price discrimination by recommendation algorithms on social media and mandating clear labelling of AI-generated content. Prospective regulations on generative AI seek to require that the training data used to train LLMs and the content generated by models be "true and accurate," which experts suggest may require censoring LLM output.

Meanwhile, the role of copyrighted material in training AI models used for content generation, ranging from language models to image generators and video models, is a hotly debated topic. The outcome of the New York Times' high-profile lawsuit against OpenAI may have far-reaching consequences for AI legislation. Adversarial tools, such as Glaze and Nightshade, developed at the University of Chicago, have emerged in what may turn out to be an arms race between creators and model developers.

Shadow AI (and corporate AI policies)

For businesses, the growing risk of legal, regulatory, economic, or reputational consequences is exacerbated by the popularity and accessibility of generative AI tools. Organisations must not only have a careful, coherent, and clearly articulated corporate policy regarding generative AI, but also be wary of shadow AI: the "unofficial" personal use of AI in the workplace by employees. Shadow AI, sometimes called "shadow IT" or "BYOAI," occurs when impatient employees seeking quick solutions (or simply wanting to experiment with new technology faster than a cautious company policy allows) implement generative AI in the workplace without first obtaining approval or oversight from IT. Many consumer-facing services, some free of charge, make it easy for nontechnical individuals to use generative AI tools. According to an Ernst & Young survey, 90% of respondents use artificial intelligence at work.
That enterprising spirit can be great in a vacuum, but eager employees may lack relevant information or perspectives on security, privacy, or compliance. This can expose businesses to a significant amount of risk. For example, an employee may unknowingly feed trade secrets to a public-facing AI model that continuously trains on user input, or use copyrighted material to train a proprietary model for content generation, exposing their company to legal action. As with many ongoing developments, this demonstrates how the dangers of generative AI increase almost linearly with its capabilities. With great power comes great responsibility.

Emergence Of Multimodal AI Models

OpenAI's GPT-4, Meta's Llama 2, and Mistral are all examples of advances in large language models. The technology goes beyond text with multimodal AI models, which allow users to prompt and generate new content by mixing and matching text, audio, images, and videos. This method combines data, such as images, text, and speech, with advanced algorithms to predict and generate results. Multimodal AI is expected to evolve significantly in 2024, ushering in a new era of generative AI. These models are moving beyond traditional single-mode functions by incorporating a variety of data types such as images, language, and audio. The transition to multimodal models will make AI more intuitive and dynamic.

Capable And Powerful Small Language Models

If 2023 marked the year of large language models, 2024 will see the rise of small language models (SLMs). LLMs are trained on massive datasets like Common Crawl and The Pile, which contain terabytes of data extracted from billions of publicly accessible websites. Although this data is useful in teaching LLMs to generate meaningful content and predict the next word, it is noisy because it relies on general internet content.
Small language models, on the other hand, are trained on smaller datasets that still include high-quality sources like textbooks, journals, and authoritative content. These models have fewer parameters and require less storage and memory, allowing them to run on less powerful and less expensive hardware. Despite their small size, SLMs produce content that is comparable to some of their larger counterparts. Microsoft's Phi-2 and Mistral 7B are two promising SLMs that will drive the next wave of generative AI applications. Enterprises will be able to fine-tune SLMs to meet specific tasks and domain-specific requirements. This will help satisfy legal and regulatory requirements, accelerating the adoption of language models.

The Rise Of Autonomous Agents

Autonomous agents represent an innovative approach to building generative AI applications. These agents are self-contained software programmes designed to achieve a specific goal. In generative AI, the ability of autonomous agents to generate content without human intervention overcomes the constraints associated with traditional prompt engineering. Autonomous agents are developed using advanced algorithms and machine learning techniques. These agents use data to learn, adapt to new situations, and make decisions with minimal human intervention. For example, OpenAI has developed tools like custom GPTs that make effective use of autonomous agents, indicating significant advancements in the field of artificial intelligence.

Multimodal AI, which combines various AI techniques such as natural language processing, computer vision, and machine learning, is essential for the creation of autonomous agents. By analysing multiple data types simultaneously and applying the current context, an agent can make predictions, take actions, and interact more effectively. Frameworks such as LangChain and LlamaIndex are popular tools for developing LLM-based agents. In 2024, we will see new frameworks that make use of multimodal AI.
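The plan-act loop at the heart of such agent frameworks can be sketched in a few lines of plain Python. This is a minimal illustration with a stubbed planner, not the actual API of LangChain or LlamaIndex; every name here (`stub_llm`, `TOOLS`, the flight-search tool) is hypothetical.

```python
# Minimal sketch of an agent's reason-act loop. A real agent would replace
# stub_llm with a call to a live LLM and TOOLS with real services.

def stub_llm(goal: str, observations: list) -> dict:
    """Stand-in for an LLM that chooses the next tool call toward a goal."""
    if not observations:
        return {"tool": "search_flights", "args": {"route": "JFK-SFO"}}
    # Once it has an observation, the stub decides the goal is met.
    return {"tool": "finish", "args": {"summary": observations[-1]}}

TOOLS = {
    "search_flights": lambda route: f"3 flights found for {route}",
}

def run_agent(goal: str, max_steps: int = 5) -> str:
    """Loop: ask the model for an action, execute it, feed back the result."""
    observations = []
    for _ in range(max_steps):
        action = stub_llm(goal, observations)
        if action["tool"] == "finish":
            return action["args"]["summary"]
        result = TOOLS[action["tool"]](**action["args"])
        observations.append(result)
    return "gave up"

print(run_agent("book a flight"))  # → 3 flights found for JFK-SFO
```

The loop structure (model proposes an action, the runtime executes it and feeds the observation back) is what distinguishes an agent from a single-shot prompt, and it stays the same whatever model or tools are plugged in.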
Autonomous agents will greatly benefit the customer experience by allowing for intelligent and responsive interactions. These highly contextualised agents will benefit industries such as travel, hospitality, retail, and education by lowering overall costs through reduced human intervention.

Open Models Will Become Comparable With Proprietary Models

Open, generative AI models are expected to evolve significantly in 2024, with some predictions putting them on par with proprietary models. The comparison between open and proprietary models, however, is complex and depends on a number of factors, including the specific use cases, development resources, and training data used by the models. Meta's Llama 2 70B, Falcon 180B, and Mistral AI's Mixtral-8x7B were extremely popular in 2023, with performance comparable to proprietary models such as GPT-3.5, Claude 2, and Jurassic-2. In the future, the gap between open and proprietary models will close, giving enterprises a great option for hosting generative AI models in hybrid or on-premises environments. In 2024, the next iteration of models from Meta, Mistral, and possibly new entrants will be released as viable alternatives to proprietary models accessible via APIs.

Cloud Native Becomes Key To On-Prem GenAI

Kubernetes is already the preferred hosting platform for generative AI models. Key players such as Hugging Face, OpenAI, and Google are expected to deliver generative AI platforms using Kubernetes-based cloud native infrastructure. Hugging Face's Text Generation Inference, AnyScale's Ray Serve, and vLLM all support model inference in containers. By 2024, frameworks, tools, and platforms built on Kubernetes will be mature enough to manage the entire lifecycle of foundation models, letting users pre-train, fine-tune, deploy, and scale generative models effectively.
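From the application side, a containerized inference server is just an HTTP endpoint. The sketch below builds a request body in the shape used by Hugging Face's Text Generation Inference `/generate` endpoint (an `inputs` string plus a `parameters` object); treat the exact field names as an assumption to verify against the server's own documentation before relying on them.

```python
import json

# Sketch of a client-side request to a containerized text-generation server.
# The payload shape follows Text Generation Inference's /generate endpoint;
# confirm field names against the deployed server's docs.

def build_generate_request(prompt: str, max_new_tokens: int = 64) -> str:
    payload = {
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens, "temperature": 0.7},
    }
    return json.dumps(payload)

body = build_generate_request("Explain Kubernetes in one sentence.")
print(body)
```

In a real deployment this body would be POSTed to the service's cluster address (for example via a Kubernetes Service in front of the model pods); the client code stays the same whether the model runs on-prem or in the cloud.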
Key cloud native ecosystem players will share reference architectures, best practices, and optimisations for generative AI on cloud native infrastructure. LLMOps will be expanded to accommodate integrated cloud native workflows. In 2024, generative AI will continue to evolve quickly, delivering new and unexpected capabilities that will benefit both consumers and businesses.

The Rise of Generative AI

The rise of Generative AI has been a remarkable journey, demonstrating the unwavering pursuit of creating machines capable of creative expression. Generative AI originated in the early 2010s, when researchers began investigating deep learning techniques for data generation. Early milestones included the creation of autoencoders and restricted Boltzmann machines, which paved the way for more sophisticated generative models to follow.

Ian Goodfellow and his team introduced Generative Adversarial Networks (GANs) in 2014, marking one of the field's most significant breakthroughs. GANs transformed Generative AI by introducing a novel two-network architecture: a generator for creating synthetic data and a discriminator for determining the authenticity of the generated data. Adversarial training enabled GANs to generate realistic images, videos, and audio. This was a watershed moment, catapulting Generative AI into the spotlight and sparking a wave of research and innovation.

As the technology advanced, generative AI made its way into a variety of domains. AI-created masterpieces were displayed in prestigious galleries and auction houses, blurring the distinction between human and machine creativity. AI-powered chatbot characters and virtual worlds became commonplace in video games and interactive experiences, enthralling audiences around the world. Generative AI had a significant impact on industries such as fashion, healthcare, and architecture, where AI-generated designs, medical images, and building layouts introduced new levels of efficiency and creativity.
Today, Generative AI is evolving rapidly thanks to the collaboration of researchers, developers, and artists from various backgrounds. With each breakthrough, Generative AI pushes the boundaries of what is possible, paving the way for new levels of creativity and innovation. As we enter 2024, the rise of Generative AI shows no signs of slowing, with the potential to reshape industries, boost human creativity, and unlock novel solutions to some of society's most pressing challenges.

The Impact of Generative AI on Industries

Generative AI has had a transformative impact on various industries. In the arts and entertainment industries, it has pushed the envelope of creativity. Artists are now using AI tools to create new artistic expressions, resulting in a fusion of human ingenuity and machine-generated beauty. AI-generated music, visual arts, and literature have made their way into galleries, concert halls, and literary circles, enthralling audiences with their unique and emotional compositions. Furthermore, the entertainment industry has used Generative AI to create virtual characters, environments, and narratives that blur the lines between reality and fantasy, enhancing video games, movies, and immersive experiences.

In the world of marketing and content creation, Generative AI has transformed how brands interact with their audiences. AI algorithms drive personalised content that caters to individual preferences, resulting in more meaningful and relevant interactions with consumers. Generative AI enables brands to provide personalised experiences at scale, fostering brand loyalty and customer satisfaction. Furthermore, AI streamlines content creation workflows, allowing marketers to produce high-quality content more efficiently and cost-effectively. Generative AI has also made significant contributions to the healthcare sector, particularly in medical imaging and drug discovery.
AI-generated medical images help diagnose and detect diseases with greater accuracy, accelerating the diagnostic process and improving patient outcomes. Furthermore, generative AI models are used to simulate the behaviour of complex biological systems, which speeds up drug discovery and development. This groundbreaking technology enables the exploration of vast chemical spaces and the identification of potential drug candidates, reducing the time and resources required to bring life-saving medications to market. As the healthcare industry embraces Generative AI advancements, it has the potential to transform patient care and usher in a new era of medical innovation.

Now, let's look at the top ten Generative AI trends to watch in 2024 and see how these groundbreaking developments push the boundaries of creativity, efficiency, and problem-solving to new highs.

AI for Creativity

DALL-E, a generative AI tool, provided many surprises. It was the first tool to create art from only a few inputs. Although its earlier versions struggled to generate decent art, it is now much better and produces art much as the user requests. Art is not all these tools can do, however: generative AI can also create real-time animation, music, and audio for a variety of applications. This will continue to grow for years, allowing musicians, songwriters, art creators, sound effects professionals, and everyday users to fully utilise the potential of generative AI tools and express their creativity.

High-Level Personalization

Generative AI was built with technologies that can provide personalised experiences, including GANs, neural networks, advanced machine learning algorithms, and language models. These are fed massive amounts of data to train their data analysis, data generation, and prediction capabilities, resulting in a system capable of analysing an individual's personal preferences, producing matching results, and becoming extremely engaging.
This is similar to assisting you in selecting exactly what you want, and receiving those items quickly. High-level personalisation can help businesses generate significant revenue by targeting the right market and audience based on the right parameters. For example, generative AI-driven personalisation can assist businesses in developing customised content for any marketing campaign. Similarly, a sales team can increase sales by sending personalised product emails to potential customers after analysing their needs. Generative AI tools accomplish this by analysing company demand, what the client has previously purchased, their top product picks, and their goals, in order to shortlist the products they require.

Advancements in Generative Adversarial Networks

GANs are a foundational technology of generative AI. GANs generate new data that is similar to their training data; for example, a GAN can create a convincing image of a person who does not exist. A GAN consists of two neural networks, a generator and a discriminator, that compete with one another using deep learning methods to improve accuracy. If you've used generative AI tools to create images, text, audio, or videos, a GAN may well have been behind them. This trend is expected to continue in 2024, with GANs evolving and enabling new use cases.

Conversational AI

A few years ago, AI was never particularly conversational: it analysed data, learned things, and recommended changes or responded to a command. Voice assistants like Google Assistant, Alexa, and Siri support this claim. Enter generative AI, and the conversational aspect has skyrocketed. Generative AI tools like ChatGPT are conversational on a human level; the sudden increase in AI's ability to converse was unexpected; in other words, we were caught off guard.
And the reason these AIs are so appealing in conversation is their stack, which includes neural networks, natural language processing and generation, deep learning, and LLMs. These stacks enable AI to be highly engaging and conversational, much like a human, and have already been applied to voice assistants and various customer care chatbots. Such systems can respond with appropriate sentiment and provide humans with the comfort they need when expressing their experiences; this is especially useful in customer service, where, when a customer provides feedback on a defective product, the bot can respond sympathetically to provide personalised care. In summary, these systems can improve business operations at all levels by providing business personnel with human-like experiences in real time. This could be the most interesting trend for 2024.

Generative AI Infrastructure

The technology stack of any IT practice evolves to keep the domain competitive, and generative AI is no exception. When ChatGPT first debuted, it was based on the GPT-3 (generative pre-trained transformer) model, with the primary goal of producing texts such as articles, poems, essays, and news reports. Now, OpenAI has taken a step further and improved its capabilities to power unique applications. To accomplish this, it created the GPT-4 model, which focuses on scaling and incorporates Reinforcement Learning from Human Feedback (RLHF) to generate more relevant responses. Other startups, such as Anthropic, have been developing their own version of the feedback model, known as RL-CAI, to power their chatbots. This technological adaptation will shape AIs in 2024, allowing them to respond more accurately to specific human tasks and better understand humans.

Generative AI For Scientific Research

Scientific research has been accelerated by technology, and the emerging technology of generative AI holds promise for further accelerating research in a variety of fields.
This will lead to increased innovation, production, and implementation of new research techniques that can benefit various sectors and improve people's lives. This is because generative AIs are trained on massive datasets. With such vast amounts of research data, they learn, adapt, and become aware of research processes and parameters, generating insights and hypotheses across disciplines. Physics, astronomy, biology, chemistry, and other disciplines benefit from generative AI's ability to create systems that improve the analysis, generation, and prediction of research objects, such as identifying the output of a chemical reaction, the heat generated, the concentration level, and the structure. Generative AI has begun to transform such fields. Among them is healthcare, where gene sequencing is performed using artificial intelligence to determine how gene expression will change in response to specific changes in genes, and medicines are produced accordingly to improve patients' overall health.

NLP Applications

Generative AIs can communicate in a human-like manner. Text, audio, images, and videos have all become more natural in conversations, with appropriate sentiment. This is due to Natural Language Processing (NLP), which enables generative AIs to read text, hear speech, identify sentiments and their proportions, detect critical parts, and suggest responses based on relevant information. This seemed impossible with traditional AI models, which were designed only to analyse, detect, and provide statistical data. In contrast, generative AI has pushed NLP to evolve, allowing AI to accurately comprehend data and interact more effectively with humans. This year, the trend towards NLP applications will grow, resulting in the rise of voice assistants and chatbots that converse almost like humans.
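The sentiment-identification step described above can be illustrated at toy scale with a lexicon-based scorer. This is only a sketch of the idea of mapping words to sentiment; the word lists are invented for the example, and production NLP systems use trained models rather than fixed lexicons.

```python
# Toy lexicon-based sentiment scorer: counts positive vs negative words.
# The word sets below are illustrative assumptions, not a real lexicon.

POSITIVE = {"great", "love", "helpful", "fast"}
NEGATIVE = {"broken", "slow", "defective", "hate"}

def sentiment(text: str) -> str:
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(sentiment("The product arrived broken and support was slow"))  # → negative
```

A customer-care bot would route a "negative" result like this one to a sympathetic response template or a human agent, which is the personalised-care behaviour the paragraph above describes.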
Intelligent Process Automation

With AI taking over business processes, companies must build a solid foundation of genAI tools that enable automation for more efficient, effective, and faster business operations. Generative AI-powered automation has numerous advantages, including automating data entry, invoicing, accounting, and documentation, allowing businesses to shift their resources to more complex roles for maximum output. Another advantage of AI automation is that businesses can gain insights into various business parameters in seconds and evaluate the results instantly to strategise for specific areas. Large language models (LLMs) can analyse all business data and categorise it into structured and unstructured formats in order to standardise newly formed data and provide accurate knowledge of business logic. Similarly, generative AI image recognition tools powered by natural language understanding can aid in detecting document anomalies, strengthening logical responses, and improving cognitive automation to address issues such as workforce shortages. Furthermore, robotic process automation can offer business-specific benefits such as automated insurance claims, marketing and sales, fraud detection and risk management, supply chain automation, and so on. With an increasing number of AI tools centred on automation this year, the automation trend will grow significantly.

Ethical Concerns

With the increased use of generative AI-powered tools, there is concern about how closely the new AI will adhere to ethical and legal boundaries, particularly when collecting various types of data across the web, including personal and other sensitive information. Generative AI can generate synthetic data resembling any individual's real data for training its models; however, risks such as the re-identification of individuals from synthetic data are imminent, posing a data privacy concern.
There is also growing concern that AI will be biased based on race, religion, and other factors, potentially causing social issues as AI is deployed via the internet. This could happen because humans have posted biased data on the internet favouring a particular race, country, citizenry, or religion; as a result, AI could be trained on such biased data and produce similar responses that may be offensive. Such concerns will come to the fore in 2024, prompting businesses to reconsider and devise workable solutions to eliminate such uncertainties.

Wide Range of Generative AI Applications

It all began with Midjourney and Stable Diffusion, the first generative AI models to go viral on social media. Following the release of ChatGPT, generative AI quickly became one of the most talked-about topics in tech. The world can now get answers to almost all of its questions with a single prompt thanks to ChatGPT, and can even use such generative AI tools to generate images, videos, audio, art, and other media. Because of their growing popularity, use cases, demand, and benefits, an increasing number of generative AI tools are being developed. AI companies have also begun to keep up with rising demand by developing one-of-a-kind AI tools that offer distinct functionality for both casual and professional operations. For example, Jasper is quickly gaining popularity as one of the best copywriting tools, built on the GPT-4 model for enterprise-centric content. Another tool, Harvey, is created by training on proprietary data and is primarily used in the legal field. It can extract context from complex legal terms and generate contracts for the multiple parties involved.
Similarly, many tools entering the market promise exceptional automation capabilities for a variety of business operations, and 2024 will be the year when everyone bets on one or more generative AI tools, as major AI products such as Bard and GPT-4-based chatbots enter the game with exceptional capabilities.

AI technology at work and 'bring your own AI'

At the start of the generative AI boom, many organisations were cautious and prohibited employees from using ChatGPT. Samsung, JPMorgan Chase, Apple, and Microsoft were among those that imposed temporary bans. They were sceptical of ChatGPT's training data and feared that using the generative AI tool would result in internal data breaches. While many enterprises remain cautious about generative AI and traditional AI technology, Forrester Research analyst Michele Goetz predicts that by 2024, enterprises will allow their employees to use more generative AI. "We know that nearly two-thirds of employees are already playing around and experimenting with AI in their job or in their personal life," she said. "We really expect that this is going to be more normal in our everyday business practices in 2024." That scenario could involve more employees using generative AI to increase their effectiveness and productivity. According to Goetz, many employees will likely use a "bring your own AI" (BYOAI) approach. BYOAI means that employees will use any mainstream or experimental AI service to complete business tasks, regardless of whether the company approves of it. These tools could be generative AI systems such as ChatGPT and DALL-E, or software containing unapproved embedded AI technology. Enterprises will have few options but to invest more in AI and encourage employees to use it responsibly. BP, a British multinational oil and gas company, is already incorporating generative and classic AI technology into its culture.
"One of the things that we're taking very intentionally is an idea about AI for everyone," Justin Lewis, BP's vice president of incubation and engineering, said during a panel discussion at The AI Summit New York on December 7. Lewis explained that BP's vision of AI for everyone entails more than simply providing all employees with access to AI tools and technology. The oil and gas company's goal is for every employee, regardless of technical background, to be able to create their own AI tools and publish, share, and reuse them. "The lower barrier to entry that we're seeing with LLMs and generative AI in general makes that more possible today than it ever has been," he said. "If we can remove the bottleneck and get to a point where citizen developers, or citizens, with no experience building any technical tools are able to build their own AI tools and leverage and scale them and share them, then we'll be able to advance AI innovation much faster," he said. That kind of innovation comes from using AI to help employees be more productive. In 2024, generative AI will accomplish this in part through "shadow AI," according to Goetz. Shadow AI is the use of AI technology to supplement or increase employee productivity without hiring additional employees. According to Lewis, BP has already implemented one type of shadow AI. "What we're seeing most impactful are in the places where you're helping humans perform 10 times better, 10 times faster," he went on to say. One example is in software engineering. Lewis continued, "BP has a team that does code reviews as a service using AI." Lewis explained that BP has assembled a team in which one engineer can review code for 175 other engineers. "It has a radical impact on the way you think about shaping the organisation," he went on to say. "There are a lot of risks that can come from personal AI use or bring your own AI," Goetz went on to say. 
Many businesses will invest proactively in governance and AI compliance to stay ahead of the curve, she said. Governance will take various forms. For organisations that provide their employees with generative AI and allow personal AI use in the workplace, governance entails monitoring how employees use the technology by flagging inappropriate or low-quality prompts, according to Lewis. Governance also entails looking externally at government regulations that have been proposed or enacted in order to stay ahead of compliance, Goetz explained. Existing AI-related regulations include the New York City AI Hiring Law and the California Data Privacy Law. Preparing for upcoming regulations not only benefits organisations financially, but also protects them, she explained. Technology companies that develop capabilities that comply with existing or potential regulations will benefit their customers. That should result in more revenue, she said. Furthermore, compliance reduces the risk of lawsuits, because generative AI builds models using intellectual property, which could expose organisations to the risk of illegally appropriating IP. Staying on top of governance and regulation allows organisations to participate in potential regulation, Goetz added. "It's also the ability to influence and know how much teeth are in these regulations," she went on to say. Furthermore, as enterprises and organisations begin to seriously consider AI governance, insurers may begin to offer policies to protect against hallucinations, which occur when AI systems produce false or distorted information, according to Goetz. "Insurance companies are also recognising that their policies may not actually be covering all of the risk permutations that newer AI capabilities are going to introduce -- hallucinations being one of those," she went on to say.

More multimodal and open models

According to Chandrasekaran, more personalised AI models will most likely result in more multimodal models.
Multimodal models combine a variety of modes or data. The models can, for example, convert text input from a user into an image, audio, or video output. Current iterations of this include image-generating models such as DALL-E, which convert text into images. However, Google's recent release of Gemini, a model capable of training on and producing text, images, audio, and video, demonstrates what multimodal models may look like in the future. For example, combining speech, text, and images could improve disease diagnosis in healthcare. "The potential for multimodality in terms of enabling more advanced use cases is certainly immense," Chandrasekaran said. Open source models, in addition to multimodal models, will become increasingly popular, according to Goetz. "What you're going to see is almost this capitalist type of effect, where all of these models are going to come to market, and then it's going to get Darwinian," she said, with the stronger models beating out the less successful ones. Enterprise adoption of the models will also evolve, she said.

More AI startups and more sophisticated offerings

Generative AI paved the way for numerous AI startups to enter the market. While many more startups will emerge in 2024, they will provide more sophisticated offerings than those available today, according to Forrester Research analyst Rowan Curran. According to Curran, new startups will create more application-specific offerings rather than offerings centred on AI chatbots like ChatGPT. "That's going to be driven and supported by the increasing array of those open source and proprietary tools to build on top of some of these core models themselves," he went on to say. These could include LLMs, diffusion models (models that can generate data similar to the data on which they were trained), traditional machine learning models, or computer vision models, he added.
Furthermore, Curran predicts that new startups will emerge as a result of the development of domain-specific or smaller language models. 2024 will be a year dedicated to understanding how generative AI will shape the larger enterprise IT ecosphere. "We really have to remember that this was just a first year of getting acclimatised to these things," said Curran. "Maybe into next year and even a year beyond is where we start to see a type of clarity come into what type of services are built with these things."

The Future of Generative AI

Advanced machine learning, which powers next-generation AI-enabled products, has been developed over decades. However, since ChatGPT's launch in late 2022, new iterations of gen AI technology have been released on a monthly basis. There were six significant advancements in March 2023 alone, including new customer relationship management solutions and financial services industry support.

The road to human-level performance just became shorter. By the end of this decade, gen AI will perform at a median level of human performance on the majority of the technical capabilities shown in this chart. And its performance will compete with the top 25% of people who complete any or all of these tasks by 2040. In some cases, this is 40 years faster than previously expected.

Automation of knowledge work is now within sight. Previous waves of automation technology primarily affected physical work activities, but gen AI is expected to have the greatest impact on knowledge work, particularly activities involving decision making and collaboration. Professionals in fields such as education, law, technology, and the arts may see parts of their jobs automated sooner than previously anticipated. This is due to generative AI's ability to predict and dynamically apply natural language patterns.

Apps continue to proliferate to address specific use cases. Gen AI tools can already generate most types of written, image, video, audio, and coded content.
Businesses are developing applications to address use cases in all of these areas. In the near future, we expect applications that target specific industries and functions to be more valuable than those that are more general. Certain industries will benefit more than others. Gen AI's precise impact will be determined by a number of factors, including the mix and importance of various business functions, as well as the size of an industry's revenue. Almost all industries will benefit the most from implementing the technology in their marketing and sales functions. However, the ability of gen AI to accelerate software development will have an even greater impact in high tech and banking. Despite gen AI’s commercial promise, most organisations aren’t using it yet When we asked marketing and sales leaders how frequently they thought their organisation should use gen AI or machine learning for commercial activities, 90 percent said it should be at least "often." As previously stated, marketing and sales have the greatest potential for impact, so this is not surprising. However, 60 percent reported that their organisations rarely or never do this. Marketing and sales leaders are most excited about three use cases. According to our research, marketing and sales leaders expected at least moderate impact from each of the gen AI use cases we proposed. They were particularly enthusiastic about lead generation, marketing optimisation, and personalised outreach. Software engineering, the other major value driver for many industries, could become significantly more efficient. When 40 of McKinsey's own developers tested generative AI-based tools, we discovered significant speed gains for many common developer tasks. 
Documenting code functionality for maintainability (which takes into account how easily code can be improved) can be done in half the time, new code can be written in nearly half the time, and existing code can be optimised (a process known as code refactoring) in nearly two-thirds the time. And gen AI assistance could make for happier developers Our research discovered that providing developers with the tools they need to be most productive improved their experience significantly, potentially helping companies retain their best talent. Developers who used generative AI tools were more than twice as likely to report feeling happy, fulfilled, and in flow. They attributed this to the tools' ability to automate grunt work that kept them from more rewarding tasks and to provide information at their fingertips faster than searching for solutions across multiple online platforms. Momentum among workers for using gen AI tools is building According to a new McKinsey survey, the vast majority of workers—from a variety of industries and geographic locations—have used generative AI tools at least once, whether at work or elsewhere. That's pretty rapid adoption for less than a year. One unexpected finding is that baby boomers report using gen AI tools for work more than millennials. But organisations still need more gen AI–literate employees As organisations begin to set gen AI goals, there is a growing demand for gen AI-literate workers. As generative and other applied AI tools begin to provide value to early adopters, the supply-demand gap for skilled workers remains significant. To remain competitive in the talent market, organisations should develop excellent talent management capabilities, providing rewarding work experiences to the next generation of AI-literate workers they hire and hope to retain. Organisations should proceed with caution Many people are excited about the potential of gen AI. However, like any new technology, gen AI is not without its risks. 
For starters, gen AI has been known to create content that is biased, factually incorrect, or illegally scraped from a copyrighted resource. Before adopting gen AI tools wholesale, organisations should consider the reputational and legal risks to which they may be exposed. One way to reduce the risk? Keep a human in the loop; that is, ensure that any gen AI output is checked by a real human before it is published or used. Gen AI could ultimately boost global GDP McKinsey discovered that gen AI could significantly increase labour productivity across the economy. To reap the benefits of this productivity boost, workers whose jobs are affected must shift to other work activities that allow them to match their 2022 productivity levels. Stronger global GDP growth could lead to a more sustainable, inclusive world if workers are encouraged to learn new skills and, in some cases, change jobs. Gen AI represents just a small piece of the value potential from AI While gen AI is a significant advancement, traditional advanced analytics and machine learning continue to account for the vast majority of task optimisation, and they continue to find new applications in a wide range of industries. Organisations undergoing digital and AI transformations would do well to keep an eye on gen AI, but not at the expense of other AI tools. Just because they don't make headlines doesn't mean they can't be used to boost productivity—and, ultimately, value. FAQs on Generative AI 1. What is generative AI, and how does it differ from traditional AI? Generative AI is a subset of artificial intelligence that focuses on creating content autonomously. Unlike traditional AI, which often relies on pre-defined rules and patterns, generative AI uses algorithms to generate new and unique outputs, such as images, text, or even entire simulations. In 2024, we expect generative AI to push boundaries in creativity and innovation. 2. 
What are the key trends in generative AI that we should watch out for in 2024? In 2024, several exciting trends are emerging in generative AI. These include advancements in natural language processing, improved image and video generation capabilities, enhanced creativity in content creation, increased adoption of AI in design processes, and the rise of more accessible generative AI tools for developers and creators. 3. How will generative AI impact industries beyond tech and entertainment? Generative AI is not limited to tech and entertainment. In 2024, we anticipate its widespread impact across various industries, including healthcare, finance, marketing, and education. From personalized medicine and financial modeling to creative marketing campaigns and interactive educational content, generative AI is poised to revolutionize diverse sectors. 4. Are there ethical considerations associated with the use of generative AI? Yes, ethical considerations are crucial when it comes to generative AI. In 2024, as these technologies become more powerful, addressing issues such as bias in training data, responsible use of AI-generated content, and transparency in AI decision-making will be paramount. The industry is increasingly focusing on developing ethical guidelines and standards to ensure responsible AI deployment. 5. How can businesses leverage generative AI to stay competitive in 2024? Businesses can harness generative AI to stay competitive by exploring applications such as personalized customer experiences, AI-assisted content creation, predictive analytics, and process optimization. Integrating generative AI into workflows can enhance efficiency, innovation, and customer engagement, providing a strategic advantage in a rapidly evolving market landscape.
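The human-in-the-loop safeguard recommended earlier can be sketched as a simple review gate: generated drafts are held in a pending queue, and nothing reaches the published list until a person signs off. This is a hypothetical minimal sketch; the class and method names (`ReviewGate`, `submit`, `approve`, `reject`) are illustrative, not any particular vendor's workflow.

```python
from dataclasses import dataclass

@dataclass
class Draft:
    """A piece of gen AI output awaiting human review."""
    text: str
    approved: bool = False

class ReviewGate:
    """Holds gen AI output until a human reviewer signs off on it."""
    def __init__(self):
        self.pending = []    # drafts awaiting review
        self.published = []  # only human-approved text lands here

    def submit(self, text):
        """Queue a model output for review instead of publishing directly."""
        draft = Draft(text)
        self.pending.append(draft)
        return draft

    def approve(self, draft):
        """A human accepts the draft; only now is it published."""
        draft.approved = True
        self.pending.remove(draft)
        self.published.append(draft.text)

    def reject(self, draft):
        """A human rejects the draft; it is discarded, never published."""
        self.pending.remove(draft)
```

In practice the same gate pattern applies whether the "publish" step is posting marketing copy, merging generated code, or sending a customer reply.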

  • Future of Consulting: A Peek into the Latest Generative AI Models (2024)

    Gazing into the future of consulting? Generative AI models are the new crystal balls, blurring the lines between imagination and reality. These powerful AI models can create entirely new data, not just analyze existing information. What's the Big Deal? Generative AI models hold immense potential for consultants, allowing them to: Dream Up New Products: Imagine an AI brainstorming partner that generates innovative product ideas based on market trends and customer preferences. This is the future! Industry Insight:  A McKinsey report estimates that Generative AI has the potential to unlock $1 trillion in annual value creation by 2030 through product innovation. Craft Compelling Content: Say goodbye to writer's block! Generative AI can create marketing copy, social media posts, and even personalized reports tailored to specific audiences. Industry Insight:  According to a TechCrunch article, 60% of marketers are already leveraging AI for content creation, with a projected 30% increase in adoption by 2025. Design the Future: Architects and urban planners can utilize generative AI to design sustainable cities, optimize traffic flow, and create virtual prototypes before construction begins. Industry Insight:  A study by MIT researchers revealed that generative AI models can generate energy-efficient building designs, potentially reducing construction costs by 15%. The Rise of the Human-AI Duo While generative AI is impressive, it's not about replacing consultants. The future lies in collaboration. Consultants will leverage AI's creative spark to generate ideas, analyze vast datasets, and personalize recommendations, all while applying their human expertise to interpret results, strategize, and build trusted client relationships. Are You Ready to Embrace the Future? The world of consulting is on the cusp of a generative AI revolution. 
By upskilling yourself in AI and understanding its capabilities, you can position yourself as a future-proof consultant, guiding clients through this exciting new frontier.

  • Generative AI and its Impact on Creative Industries Muse or Menace for Creative Industries?

    Science fiction is becoming reality. Generative AI, a branch of artificial intelligence, is making waves in the creative industries. But is it a friend or foe? Generative AI offers a treasure trove of possibilities for creative professionals: Brainstorming on Steroids: Imagine an AI that throws out unique design ideas, catchy song hooks, or thought-provoking storylines. This is the future of creative collaboration! A study by Accenture reveals that 63% of creative professionals believe generative AI will enhance their creativity, not replace them. Content Creation on Autopilot: Generate marketing copy, social media posts, or even personalized scripts – all powered by AI. This frees up time for creatives to focus on higher-level concept development. According to a Bloomberg report, the global market for AI-powered content creation is expected to reach a staggering $10 billion by 2026. Pushing Creative Boundaries: Generative AI can analyze vast datasets of music, art, or literature to identify patterns and generate entirely new artistic styles. This opens doors to exciting new creative expressions. A team at MIT developed an AI that can create music in the style of famous composers, blurring the lines between human and machine creativity. The Human Touch Still Matters While generative AI is impressive, creativity is more than just generating content. Human intuition, emotional intelligence, and the ability to understand cultural nuances remain irreplaceable. The most successful creatives in the future will be those who can leverage AI's capabilities while retaining their own unique human touch. Consultants can play a crucial role in guiding creative teams through this transition. Are You Ready to Advise on the Future of Creativity? The creative industries are on the cusp of a transformative era. 
By upskilling yourself in generative AI and its impact on creativity, you can become a sought-after consultant, helping creative businesses navigate this exciting new landscape. The question lingers: Are you ready to be a champion for human-AI collaboration in the future of creativity?

  • Building Realistic Virtual Worlds with Generative AI

    Imagine. Explore. Believe. Virtual Reality (VR) promises immersive experiences that transport us to fantastical worlds. But creating these worlds can be a time-consuming and laborious process. Enter Generative AI (artificial intelligence), a game-changer for VR development. With its ability to create realistic environments and objects, AI is accelerating the construction of believable virtual realms. How is Generative AI Building Realistic Virtual Worlds? Let's delve into the various ways AI is shaping the future of VR worldbuilding: Procedural Content Generation: Imagine vast landscapes or intricate cityscapes created automatically. Generative AI can procedurally generate realistic environments based on predefined rules and parameters. A study by Epic Games, developers of the popular VR game engine Unreal Engine, found that "procedural content generation tools can reduce environment creation time by up to 90%" [25]. This highlights the significant efficiency gains offered by AI. 3D Object Creation with a Twist: Need furniture, vehicles, or unique props for your VR world? Generative AI can create high-quality 3D models based on text descriptions or reference images. According to a report by NVIDIA, a leading manufacturer of graphics processing units (GPUs) used in VR technology, "generative AI models are achieving photorealistic quality in 3D object creation, blurring the lines between the real and virtual". This underscores the potential of AI for creating highly immersive VR experiences. AI-Powered Texturing and Lighting: Textures and lighting play a crucial role in bringing virtual worlds to life. Generative AI can automatically generate realistic textures and lighting effects, enhancing the visual fidelity of VR environments. A recent article by TechCrunch discussed research on using AI for real-time lighting in VR worlds. 
The article stated that "AI-powered lighting techniques can dynamically adjust lighting conditions based on user position and actions, resulting in a more natural and immersive experience". This highlights the potential of AI for enhancing VR realism. Beyond Automation: Generative AI Fosters Creativity While automation is a key benefit, generative AI offers more than just efficiency: Exploration of Uncharted Territories: Generative AI can help VR creators explore unfamiliar design spaces and generate environments that push the boundaries of imagination. Rapid Prototyping and Iteration: The ability to quickly generate and modify virtual environments using AI allows for faster prototyping and iteration, leading to more refined and engaging VR experiences. The Future of VR Worldbuilding: A Collaborative Effort The future of VR worldbuilding lies in collaboration between humans and AI: Human Expertise Remains Essential: While AI can automate many tasks, human creativity is still needed for conceptualizing VR worlds, crafting meaningful narratives, and ensuring a positive user experience. AI as a Powerful Tool: VR creators will increasingly leverage generative AI as a powerful tool to augment their creative process, freeing them to focus on the more strategic aspects of VR worldbuilding.
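Procedural content generation predates neural methods; the "predefined rules and parameters" idea above can be illustrated with a classic midpoint-displacement terrain sketch. This is an illustrative toy, not Unreal Engine's actual tooling: two endpoint heights are repeatedly subdivided, with each midpoint nudged by a random offset whose amplitude shrinks at every level, yielding a plausibly rugged profile from a handful of parameters.

```python
import random

def midpoint_displacement(left, right, depth, roughness=0.5, seed=42):
    """Procedurally generate a 1-D terrain height profile.

    Starts from two endpoint heights and, `depth` times, displaces the
    midpoint of every segment by a random offset whose amplitude is
    scaled by `roughness` at each level. Returns 2**depth + 1 heights.
    """
    rng = random.Random(seed)  # fixed seed makes the terrain reproducible
    heights = [left, right]
    amplitude = 1.0
    for _ in range(depth):
        next_heights = []
        for a, b in zip(heights, heights[1:]):
            mid = (a + b) / 2 + rng.uniform(-amplitude, amplitude)
            next_heights += [a, mid]
        next_heights.append(heights[-1])  # keep the right endpoint
        heights = next_heights
        amplitude *= roughness  # finer levels get smaller displacements
    return heights
```

A lower `roughness` gives smoother rolling hills, a higher one gives jagged peaks; the same idea extends to 2-D heightmaps (the diamond-square algorithm) used for game landscapes.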

  • Generative Adversarial Networks (GANs): State-of-the-Art Developments

    Creating. Evolving. Transforming. Generative Adversarial Networks (GANs) are a class of AI models revolutionizing the field of artificial intelligence. These networks can create entirely new and original data, from images and music to text and code. According to a recent report by Gartner, "by 2025, generative AI will be used by 20% of large organizations to create synthetic content". How Do GANs Work? Imagine a competition between two AI models: a generator and a discriminator. The generator creates new data, while the discriminator tries to determine if the generated data is real or fake. This ongoing battle pushes both models to improve - the generator creates increasingly realistic data, and the discriminator hones its ability to detect forgeries. Pushing the Boundaries: State-of-the-Art Developments GAN research is constantly evolving. Here are some exciting advancements: Improved Image Synthesis: Recent advancements like StyleGAN2 have produced incredibly realistic and high-resolution images, blurring the lines between reality and simulation. Beyond Images: GANs are now being used to generate other forms of data, like realistic 3D models, musical pieces, and even text that can mimic different writing styles. A study by Nvidia revealed that "compared to StyleGAN, StyleGAN2 offers a 4x improvement in image quality and a 16x improvement in sample diversity". This highlights the rapid progress in image generation with GANs. Conditional GANs: These GANs can generate data based on specific conditions. For example, a conditional GAN could be used to create images of cats with different breeds, fur colors, or poses. According to a research paper published by MIT, "Conditional GANs have shown promising results in various applications, including image editing, text-to-image synthesis, and facial attribute manipulation". This demonstrates the versatility of conditional GANs. The Future of GANs The potential applications of GANs are vast and constantly expanding. 
Here's a glimpse into what's on the horizon: Personalized Experiences: GANs can personalize content for individual users, tailoring shopping experiences, educational materials, and even entertainment to specific preferences. Drug Discovery and Material Science: GANs can be used to simulate molecular structures and accelerate drug discovery and material development processes. Combating Deepfakes: The same technology behind GANs can be used to detect deepfakes (realistic AI-generated videos) and mitigate the spread of misinformation. Conclusion GANs are a powerful technology with the potential to transform numerous industries. As research progresses, we can expect even more groundbreaking developments. The question remains: How will GANs be used to shape the future?
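The generator-versus-discriminator competition described above can be made concrete with a deliberately tiny sketch: a scalar "generator" tries to mimic numbers drawn from a normal distribution while a logistic "discriminator" tries to tell real from fake. All parameters and targets here are toy assumptions chosen so the hand-written gradients stay readable; real GANs use deep networks and autodiff frameworks.

```python
import numpy as np

def sigmoid(u):
    # Clip to avoid overflow in exp for extreme inputs.
    return 1.0 / (1.0 + np.exp(-np.clip(u, -60, 60)))

def train_toy_gan(steps=2000, batch=64, lr=0.05, seed=0):
    """Minimal scalar GAN. Generator g(z) = a*z + b tries to mimic samples
    from N(4, 0.5); discriminator D(x) = sigmoid(w*x + c) tries to tell
    real samples from generated ones. Gradients of the standard GAN
    losses are written out by hand to make the adversarial update explicit."""
    rng = np.random.default_rng(seed)
    a, b = 1.0, 0.0  # generator parameters
    w, c = 0.1, 0.0  # discriminator parameters
    for _ in range(steps):
        real = rng.normal(4.0, 0.5, batch)
        z = rng.normal(0.0, 1.0, batch)
        fake = a * z + b
        # Discriminator step: push D(real) toward 1 and D(fake) toward 0.
        sr, sf = sigmoid(w * real + c), sigmoid(w * fake + c)
        w -= lr * np.mean(-(1 - sr) * real + sf * fake)
        c -= lr * np.mean(-(1 - sr) + sf)
        # Generator step: push D(fake) toward 1, i.e. fool the discriminator.
        sf = sigmoid(w * fake + c)
        a -= lr * np.mean(-(1 - sf) * w * z)
        b -= lr * np.mean(-(1 - sf) * w)
    return a, b, w, c
```

After training, the generator's offset `b` drifts toward the real mean of 4: exactly the "ongoing battle pushes both models to improve" dynamic, shrunk to four parameters.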

  • How to spot AI-generated deepfake images?

    AI fakery is quickly becoming one of the biggest problems confronting us online. Deceptive pictures, videos and audio are proliferating as a result of the rise and misuse of generative artificial intelligence tools. With AI deepfakes cropping up almost every day, depicting everyone from Taylor Swift to Donald Trump, it’s getting harder to tell what’s real from what’s not. Video and image generators like DALL-E, Midjourney and OpenAI’s Sora make it easy for people without any technical skills to create deepfakes — just type a request and the system spits it out. These fake images might seem harmless. But they can be used to carry out scams and identity theft or propaganda and election manipulation. Here is how to avoid being duped by deepfakes: How to spot deepfakes? In the early days of deepfakes, the technology was far from perfect and often left telltale signs of manipulation. Fact-checkers have pointed out images with obvious errors, like hands with six fingers or eyeglasses that have differently shaped lenses. But as AI has improved, it has become a lot harder. Some widely shared advice — such as looking for unnatural blinking patterns among people in deepfake videos — no longer holds, said Henry Ajder, founder of consulting firm Latent Space Advisory and a leading expert in generative AI. Still, there are some things to look for, he said. A lot of AI deepfake photos, especially of people, have an electronic sheen to them, “an aesthetic sort of smoothing effect” that leaves skin “looking incredibly polished,” Ajder said. He warned, however, that creative prompting can sometimes eliminate this and many other signs of AI manipulation. Check the consistency of shadows and lighting. Often the subject is in clear focus and appears convincingly lifelike but elements in the backdrop might not be so realistic or polished. Face-swapping is one of the most common deepfake methods. Experts advise looking closely at the edges of the face. 
Does the facial skin tone match the rest of the head or the body? Are the edges of the face sharp or blurry? If you suspect video of a person speaking has been doctored, look at their mouth. Do their lip movements match the audio perfectly? Ajder suggests looking at the teeth. Are they clear, or are they blurry and somehow not consistent with how they look in real life? Cybersecurity company Norton says algorithms might not be sophisticated enough yet to generate individual teeth, so a lack of outlines for individual teeth could be a clue. Sometimes the context matters. Take a beat to consider whether what you’re seeing is plausible. The Poynter journalism website advises that if you see a public figure doing something that seems “exaggerated, unrealistic or not in character,” it could be a deepfake. For example, would the pope really be wearing a luxury puffer jacket, as depicted by a notorious fake photo? If he did, wouldn’t there be additional photos or videos published by legitimate sources? How can AI catch deepfakes? Another approach is to use AI to fight AI. Microsoft has developed an authenticator tool that can analyze photos or videos to give a confidence score on whether it’s been manipulated. Chipmaker Intel’s FakeCatcher uses algorithms to analyze an image’s pixels to determine if it’s real or fake. There are tools online that promise to sniff out fakes if you upload a file or paste a link to the suspicious material. But some, like Microsoft’s authenticator, are only available to selected partners and not the public. That’s because researchers don’t want to tip off bad actors and give them a bigger edge in the deepfake arms race. Open access to detection tools could also give people the impression they are “godlike technologies that can outsource the critical thinking for us” when instead we need to be aware of their limitations, Ajder said. 
All this being said, artificial intelligence has been advancing with breakneck speed and AI models are being trained on internet data to produce increasingly higher-quality content with fewer flaws. That means there’s no guarantee this advice will still be valid even a year from now. Experts say it might even be dangerous to put the burden on ordinary people to become digital Sherlocks because it could give them a false sense of confidence as it becomes increasingly difficult, even for trained eyes, to spot deepfakes.
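Detection tools work by finding statistical regularities that human eyes miss. As a purely illustrative toy (not Microsoft's or Intel's actual method), the "smoothing effect" Ajder describes can be turned into a measurable feature: AI-polished imagery tends to show less pixel-to-pixel noise than camera output. The synthetic data, the smoothness feature, and the threshold search below are all assumptions made up for this sketch.

```python
import numpy as np

def smoothness(row):
    """Mean absolute difference between neighbouring pixels; heavily
    smoothed imagery scores much lower than natural sensor noise."""
    return np.mean(np.abs(np.diff(row)))

def make_dataset(n=200, seed=1):
    """Synthetic stand-ins: noisy 'camera' rows vs box-blurred 'AI' rows."""
    rng = np.random.default_rng(seed)
    real = rng.normal(0.5, 0.2, (n, 64))
    fake = np.apply_along_axis(
        lambda r: np.convolve(r, np.ones(8) / 8, mode="same"),
        1, rng.normal(0.5, 0.2, (n, 64)))
    X = np.array([[smoothness(r)] for r in np.vstack([real, fake])])
    y = np.array([1] * n + [0] * n)  # 1 = real, 0 = fake
    return X, y

def fit_threshold(X, y):
    """Pick the smoothness cut-off that best separates the two classes."""
    best_t, best_acc = 0.0, 0.0
    for t in np.linspace(X.min(), X.max(), 200):
        acc = np.mean((X[:, 0] > t) == (y == 1))
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc
```

Real detectors combine many such signals and still degrade as generators improve, which is exactly why the article cautions against treating them as infallible.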

  • What is Conversational AI and how is it different from Gen-AI

Artificial intelligence, often conflated with machine learning, has been established for over a decade, but it gained prominence with the emergence of generative AI. A subset of AI, generative AI foregrounds the technology that has long operated behind the scenes, refining user experiences. Another noteworthy AI subset poised to reshape the technological landscape is conversational AI. To understand its distinctiveness from generative AI, let us delve deeper: What is Generative AI Generative AI is a subset of artificial intelligence that focuses primarily on producing fresh content spanning text, images, audio, video, code, and synthetic data. Leveraging machine learning algorithms, generative AI discerns and comprehends patterns within training data, utilising them to generate novel outputs. Instances of generative AI products encompass OpenAI’s ChatGPT chatbot and DALL-E text-to-image generator, alongside Google’s Gemini chatbot. What is Conversational AI Also a subset of AI, conversational AI accentuates natural language processing to fashion human-like responses to inquiries. Characterised by interactive dialogues, conversational AI finds utility in chatbots, messaging apps, and virtual assistants. Prominent examples encompass Amazon Alexa, Google Assistant, and Apple’s Siri. Distinguishing Generative AI from Conversational AI Fundamentally, both generative AI and conversational AI deploy natural language processing (NLP) to dissect inputs and decipher their meaning. Subsequently, employing machine learning, both generate responses grounded in their training data. Nevertheless, whereas generative AI is trained to recognise patterns and frameworks within extensive datasets, deploying these insights to produce fresh content, conversational AI models are trained on human dialogues and conversations. 
This informs their ability to predict conversational trajectories and formulate contextually appropriate responses, fostering a more human-like interaction. Whilst generative AI generates unique responses, conversational AI may draw from preset responses for akin inputs. Furthermore, generative AI is not confined solely to NLP; it may possess multimodal capabilities enabling the recognition and comprehension of visual stimuli like images and videos. Can conversational AI and generative AI be mutually exclusive Despite their differing objectives, training data, and applications, the two AI models are not mutually exclusive. Indeed, certain applications integrate both functionalities. Take, for instance, ChatGPT—an AI-driven chatbot proficient in natural conversations while concurrently possessing generative capabilities. Key takeaways Conversational AI focuses on human-machine interaction, facilitating seamless conversation through text or speech. It specialises in understanding and crafting human-like responses, engrossing users in meaningful dialogue. Generative AI, by contrast, covers a broader ambit, encompassing conversational AI whilst extending to diverse content generation such as text, images, and music, sans specific conversational context. While conversational AI excels in dialogue, generative AI boasts a wider remit, capable of generating varied outputs beyond just conversation.

  • UK and South Korea to co-host AI Summit

The UK and South Korea are set to co-host the AI Seoul Summit on the 21st and 22nd of May. This summit aims to pave the way for the safe development of AI technologies, drawing on the cooperative framework laid down by the Bletchley Declaration. The two-day event will feature a virtual leaders’ session, co-chaired by British Prime Minister Rishi Sunak and South Korean President Yoon Suk Yeol, and a subsequent in-person meeting among Digital Ministers. UK Technology Secretary Michelle Donelan and Korean Minister of Science and ICT Lee Jong-Ho will co-host the latter. This summit builds upon the historic discussions held at Bletchley Park in the UK last year, emphasising AI safety, inclusion, and innovation. It aims to ensure that AI advancements benefit humanity while minimising potential risks and enhancing global governance on tech innovation. “The summit we held at Bletchley Park was a generational moment,” stated Donelan. “If we continue to bring international governments and a broad range of voices together, I have every confidence that we can continue to develop a global approach which will allow us to realise the transformative potential of this generation-defining technology safely and responsibly.” Echoing this sentiment, Minister Lee Jong-Ho highlighted the importance of the upcoming Seoul Summit in furthering global cooperation on AI safety and innovation. “AI is advancing at an unprecedented pace that exceeds our expectations, and it is crucial to establish global norms and governance to harness such technological innovations to enhance the welfare of humanity,” explained Lee. 
“We hope that the AI Seoul Summit will serve as an opportunity to strengthen global cooperation on not only AI safety but also AI innovation and inclusion, and promote sustainable AI development.” Innovation remains a focal point for the UK, evidenced by initiatives like the Manchester Prize and the formation of the AI Safety Institute: the first state-backed organisation dedicated to AI safety. This proactive approach mirrors the UK’s commitment to international collaboration on AI governance, underscored by a recent agreement with the US on AI safety measures. Accompanying the Seoul Summit will be the release of the International Scientific Report on Advanced AI Safety. This report, independently led by Turing Award winner Yoshua Bengio, represents a collective effort to consolidate the best scientific research on AI safety. It underscores the summit’s role not only as a forum for discussion but as a catalyst for actionable insight into AI’s safe development. The agenda of the AI Seoul Summit reflects the urgency of addressing the challenges and opportunities presented by AI, from discussing model safety evaluations to fostering sustainable AI development. As the world embraces AI innovation, the AI Seoul Summit embodies a concerted effort to shape a future where technology serves humanity safely and delivers prosperity and inclusivity for all.

  • Creative Applications of Generative AI in Art and Design

Unleashing. Inspiring. Transforming. Generative AI is no longer confined to research labs. It's rapidly making waves in the creative world, empowering artists and designers to push boundaries and explore uncharted territories. Here's a glimpse into the exciting intersection of creative expression and generative AI. Reimagining Art Forms: From Painting to Music A New Brushstroke: Generative AI can create stunning and unique digital artwork. Tools like StyleGAN2 allow artists to experiment with different styles, textures, and color palettes, inspiring new artistic movements. A recent survey by Adobe revealed that "61% of artists believe generative AI will play a significant role in the future of art creation". This highlights the growing acceptance of AI as a creative tool. Composing the Unheard: Generative AI can be used to generate original musical pieces. Algorithms can learn from existing music genres and styles, composing new works that pay homage to traditions while introducing fresh elements. According to a study of Jukebox, an AI music generation system created by OpenAI, "listeners struggle to distinguish between human-composed and AI-generated music, highlighting the sophistication of generative models". Beyond the Canvas: Generative AI in Design Generative AI can design new clothing patterns and generate unique and innovative fashion concepts. This can accelerate the design process and lead to the creation of never-before-seen styles. A report by McKinsey estimates that "the global fashion industry could benefit from up to $2.7 trillion in productivity improvements through the adoption of AI by 2030" [20]. This suggests significant potential for AI to revolutionize the fashion industry. Product Innovation: Generative AI can be used to generate 3D models and product prototypes, streamlining the design process and allowing designers to explore a wider range of possibilities. 
A study by GE Healthcare found that "using generative AI to design new medical devices reduced development time by 50%". This demonstrates the efficiency gains possible with AI in product design. The Art of Collaboration: Human and AI It's important to remember that generative AI isn't meant to replace artists and designers. Instead, it serves as a powerful collaborative tool. Breaking Through Creative Blocks: AI can help artists overcome creative roadblocks by generating unexpected ideas and exploring unfamiliar stylistic territories. Augmenting Human Capabilities: Designers can leverage AI to automate repetitive tasks, freeing up their time to focus on the more strategic and creative aspects of design. The Future of Creative Expression The future of creative expression is brimming with possibilities. As generative AI continues to evolve, we can expect even more groundbreaking applications: Personalized Art Experiences: Imagine AI-powered art galleries that curate exhibitions tailored to your individual preferences. Democratizing Art Creation: Generative AI tools can make art creation more accessible to everyone, allowing people to explore their creativity without formal artistic training. Conclusion Generative AI is a game-changer for the creative world. While some may fear AI replacing artists, the true potential lies in collaboration. By embracing generative AI as a tool, artists and designers can unlock new levels of creativity and push the boundaries of artistic expression. The question remains: How will you leverage generative AI to shape the future of art and design?
