
  • Crystal Ball Not Included: Our (Slightly Wild) Guesses About the Future of Generative AI

    While 2023 witnessed a surge in the adoption of generative AI, concerns regarding its impact on data security, ethics, and bias persist. Research shows that 81% of customers want human oversight when it comes to reviewing and validating the outputs of generative AI. This highlights a lack of trust in the technology, with only 37% trusting its outputs as much as those of an employee. This trust gap is further amplified as generative AI becomes more widespread. Despite these concerns, there's a clear push for efficiency and engagement through AI. Brands are increasingly turning to generative AI, but customers demand a responsible approach built on trust and human involvement. In fact, 80% of customers believe it's crucial for humans to validate AI's outputs. Demystifying the Black Box: Building Trust in AI Transparency is key to addressing fears surrounding generative AI. By making the technology more understandable, we can foster wider adoption and responsible use. An interesting proposal suggests assigning a "FICO score" to AI models to gauge their reliability, similar to how creditworthiness is assessed. Generative AI: A Trend Poised to Continue Despite the concerns, the future of generative AI appears promising. According to the latest State of IT 2023 Report by Salesforce, surveying over 4,300 IT leaders: 90% of IT decision-makers believe generative AI has already become mainstream. Process automation is on the rise as businesses seek to improve efficiency and manage costs. 86% of IT leaders anticipate a prominent role for generative AI in their organizations within the near future. The message is clear: while trust and transparency are essential for successful AI adoption, generative AI seems poised to play a significant role in shaping the future of various industries. By addressing concerns and fostering open communication, we can unlock the full potential of this powerful technology while ensuring its responsible and ethical use. Poised for Explosive Growth and Transformation The adoption of artificial intelligence (AI) is skyrocketing, with: Half of organizations already utilizing it in 2022 (McKinsey). Global AI spending projected to increase a staggering 26.9% in 2023 alone (IDC). Customer service AI adoption jumping 88% between 2020 and 2022. This rapid growth isn't just about hype. Experts like McKinsey estimate generative AI could contribute $2.6 trillion to $4.4 trillion annually across various industries, revolutionizing how we work. Just like electricity transformed the last century, AI is poised to do the same for this one. Ignoring it could leave businesses behind. We already see glimpses of this transformation: Research suggests AI could automate 40% of the workday (Valoir), boosting productivity and efficiency. Generative AI is raising public awareness of both its potential and potential risks. The future of AI is bright, but it's crucial to acknowledge and address potential risks alongside its exciting possibilities. By embracing AI responsibly and thoughtfully, we can unlock its full potential to shape a better future for all. Tech Experts Forecast a Generative AI Revolution in the Workplace Leading technology analyst firms like Gartner are painting a clear picture: generative AI is poised to significantly impact how businesses operate within the next five years. Here are some key predictions: Conversational AI adoption will soar: By 2024, 40% of enterprise applications are expected to have embedded conversational AI, compared to a mere 5% in 2020. 
This signifies a widespread integration of AI-powered chatbots and virtual assistants into various workplace functions. AI-powered development takes center stage: By 2025, 30% of enterprises are expected to implement an AI-augmented development and testing strategy, up from 5% in 2021. This suggests AI will play a crucial role in streamlining software development and testing processes. Generative design automates website and app creation: By 2026, generative design AI is predicted to automate 60% of the design effort for new websites and mobile applications, potentially revolutionizing the design landscape. Rise of the "robocolleagues": By 2026, over 100 million people are expected to collaborate with AI-powered "robocolleagues" in their daily work, blurring the lines between human and machine interaction within teams. AI-generated applications on the horizon: By 2027, nearly 15% of new applications are predicted to be automatically generated by AI without human intervention, marking a significant step towards autonomous AI development. Edge AI for real-time data analysis: By 2025, over 55% of all data analysis by deep neural networks is expected to occur at the point of data capture (edge computing), compared to only 10% in 2021. This signifies a shift towards real-time, decentralized data analysis powered by AI. These predictions highlight the transformative potential of generative AI in reshaping the future of work and the enterprise. As AI capabilities continue to evolve, businesses that embrace this technology strategically will likely be well-positioned to thrive in the years to come. Autonomous Agents: The Next Frontier in Generative AI The rise of autonomous agents marks a significant leap in generative AI technology. These aren't just ordinary software programs; they are self-directed AI models designed to achieve specific goals, breaking free from the limitations of traditional prompt engineering. How do they work? Powered by advanced algorithms and machine learning: Autonomous agents learn and adapt using data, enabling them to make decisions and respond to situations with minimal human guidance. Think of them like self-training AI assistants: Tools like OpenAI's custom GPTs exemplify the progress made in this field, with these agents demonstrating remarkable capabilities. The secret sauce? Multimodal AI: This approach combines various AI techniques like natural language processing, computer vision, and machine learning. By analyzing various data types simultaneously and understanding the context, autonomous agents can make better predictions, take appropriate actions, and interact more effectively. Building the future, one agent at a time: Popular frameworks like LangChain and LlamaIndex are paving the way for building these powerful agents. 2024 promises exciting advancements with new frameworks leveraging the power of multimodal AI to take agent capabilities to the next level. The impact on your world: Imagine highly contextualized AI assistants enhancing customer experience in sectors like travel, hospitality, retail, and education. Reduced costs due to less human intervention are just one of the many benefits these intelligent and responsive agents offer. The rise of autonomous agents represents a significant leap forward in generative AI, promising to revolutionize how we interact with technology and experience the world around us. The Powerhouse for Generative AI Kubernetes is taking center stage as the go-to platform for deploying generative AI models. 
Big players like Hugging Face, OpenAI, and Google are all set to leverage the cloud-native infrastructure of Kubernetes to offer robust generative AI platforms. Here's what the future holds: Seamless workflow management: By 2024, we expect to see mature frameworks, tools, and platforms running on Kubernetes that can handle the entire lifecycle of generative models. This means users will be able to pre-train, fine-tune, deploy, and scale their models efficiently, all within a single environment. Collaboration and best practices: Key players in the cloud-native ecosystem will work together to establish reference architectures, best practices, and optimization strategies for running generative AI on cloud-native infrastructure. This will create a standardized and efficient approach for everyone involved. LLMOps on the rise: The concept of LLMOps, managing the lifecycle of large language models, will be extended to seamlessly integrate with cloud-native workflows. This ensures smooth operation and maintenance of these powerful AI models. As 2024 unfolds, we can expect generative AI to continue its rapid evolution and deliver even more groundbreaking capabilities that benefit both consumers and businesses alike. With the strong foundation of Kubernetes and a collaborative ecosystem, the future of generative AI is bright and full of exciting possibilities. Ready to delve deeper into the future of generative AI and its potential to revolutionize various industries? Follow TheGen.ai for Generative AI news, trends, startup stories, and more.

  • OpenAI faces complaint over fictional outputs

    European data protection advocacy group noyb has filed a complaint against OpenAI over the company’s inability to correct inaccurate information generated by ChatGPT. The group alleges that OpenAI’s failure to ensure the accuracy of personal data processed by the service violates the General Data Protection Regulation (GDPR) in the European Union. “Making up false information is quite problematic in itself. But when it comes to false information about individuals, there can be serious consequences,” said Maartje de Graaf, Data Protection Lawyer at noyb. “It’s clear that companies are currently unable to make chatbots like ChatGPT comply with EU law when processing data about individuals. If a system cannot produce accurate and transparent results, it cannot be used to generate data about individuals. The technology has to follow the legal requirements, not the other way around.” The GDPR requires that personal data be accurate, and individuals have the right to rectification if data is inaccurate, as well as the right to access information about the data processed and its sources. However, OpenAI has openly admitted that it cannot correct incorrect information generated by ChatGPT or disclose the sources of the data used to train the model. “Factual accuracy in large language models remains an area of active research,” OpenAI has argued. The advocacy group highlights a New York Times report that found chatbots like ChatGPT “invent information at least 3 percent of the time – and as high as 27 percent.” In the complaint against OpenAI, noyb cites an example where ChatGPT repeatedly provided an incorrect date of birth for the complainant, a public figure, despite requests for rectification. “Despite the fact that the complainant’s date of birth provided by ChatGPT is incorrect, OpenAI refused his request to rectify or erase the data, arguing that it wasn’t possible to correct data,” noyb stated. OpenAI claimed it could filter or block data on certain prompts, such as the complainant’s name, but not without preventing ChatGPT from filtering all information about the individual. The company also failed to adequately respond to the complainant’s access request, which the GDPR requires companies to fulfil. “The obligation to comply with access requests applies to all companies. It is clearly possible to keep records of training data that was used to at least have an idea about the sources of information,” said de Graaf. “It seems that with each ‘innovation,’ another group of companies thinks that its products don’t have to comply with the law.” European privacy watchdogs have already scrutinised ChatGPT’s inaccuracies, with the Italian Data Protection Authority imposing a temporary restriction on OpenAI’s data processing in March 2023 and the European Data Protection Board establishing a task force on ChatGPT. In its complaint, noyb is asking the Austrian Data Protection Authority to investigate OpenAI’s data processing and measures to ensure the accuracy of personal data processed by its large language models. The advocacy group also requests that the authority order OpenAI to comply with the complainant’s access request, bring its processing in line with the GDPR, and impose a fine to ensure future compliance.

  • Prafulla Dhariwal, the child prodigy behind OpenAI's GPT-4o

    "My name sounds like truffle, but with a P." No, this isn't some cheesy pickup line. It's how Prafulla Dhariwal introduces himself on his website. And he's more than just a catchy introduction — Dhariwal is the mastermind behind the creation of GPT-4o (with the 'o' standing for Omni). Dhariwal leads the Omni team at OpenAI, and GPT-4o marks their first foray into natively multimodal models. Earlier this week, OpenAI unveiled its latest flagship AI model GPT-4o at its Spring Update event, showcasing its ability to reason across voice, text, and vision. But until OpenAI CEO Sam Altman's recent post on X (formerly Twitter), the world knew little about this Pune native. “GPT-4o would not have happened without the vision, talent, conviction, and determination of Prafulla Dhariwal over a long period of time. That, along with the work of many others, led to what I hope will turn out to be a revolution in how we use computers,” Altman said in the post. Who is Prafulla Dhariwal? Dhariwal's journey has been marked by a string of remarkable accomplishments. In 2009, he won the National Talent Search Scholarship from the Government of India. The same year, he also won the gold medal at the International Astronomy Olympiad in China. In 2012 and 2013, he won gold medals at the International Mathematical Olympiad and the International Physics Olympiad, respectively. His Class XII performance was nothing short of stellar, evidenced by his remarkable score of 295 out of 300 in the physics-chemistry-mathematics (PCM) group. He even excelled in entrance exams, securing 190 in the Maharashtra Technical Common Entrance Test (MT-CET) and 330 out of 360 in the Joint Entrance Exam (JEE-Mains). He was even awarded the annual Abasaheb Naravane memorial prize in 2013, instituted by the Maharashtra State Board of Secondary and Higher Secondary Education (MSBSHSE). “In class XII I studied throughout the year and my special focus was on JEE preparations as I wanted to study in IIT. But now, I am more than happy that I have also been selected in MIT,” Dhariwal had told the Mid-Day newspaper in 2013. Dhariwal hails from Pune, India, and has always been a child prodigy, winning tech competitions since his early years. His parents recognised his natural talent at a very young age. “When he was only one-and-a-half years old, we bought a computer,” his mother recalled in an old interview. She added that whenever Prafulla’s dad sent an email, he would sit next to him, eager to learn. At 11, he designed his first website. His feats don’t end there. Prafulla also featured in a Pogo ad called ‘Amazing Kid Genius’ and received a scholarship for a 10-day trip to NASA. In high school, he scored 295 out of 300 in the physics-chemistry-mathematics (PCM) group in his Class XII exams and achieved a score of 190 in the Maharashtra Technical Common Entrance Test (MT-CET). Additionally, he scored 330 out of 360 in the Joint Entrance Exam (JEE-Mains). Prafulla received the prestigious Abasaheb Naravane Memorial Award for achieving the highest marks in PCM. He also represented India in international Olympiads, including the International Astronomy Olympiad in China and the International Mathematics Olympiad in Argentina. After completing high school, Dhariwal chose to pursue his undergraduate studies at the Massachusetts Institute of Technology (MIT) instead of IIT. He studied there from 2013 to 2017, majoring in computer science and mathematics. When asked if it was a tough choice to make between IIT and MIT, he said, “Absolutely! 
Both institutes are the best. Fortunately, MIT is providing me a scholarship that includes both tuition fees and residential facilities. That’s why I have decided to go to MIT,” said Dhariwal in an old interview. Dhariwal pursued his Bachelor's in Computer Science (Mathematics) at the Massachusetts Institute of Technology (MIT), graduating in 2017 with a perfect GPA of 5.0/5.0. Dhariwal joined OpenAI in May 2016 as a research intern and grew up the ranks as a research scientist at the company working on generative models and unsupervised learning. He is one of the co-creators of GPT-3, text-to-image platform DALL-E 2, music generator Jukebox, and reversible generative model Glow. Prior to joining OpenAI, Dhariwal had short stints as a software engineering intern at Pinterest, a quantitative analyst intern at the D.E. Shaw Group, and as an undergraduate researcher at both the Centre for Brain, Mind and Machines, and the Computer Visions Group at MIT. OpenAI recently released GPT-4o at its latest Spring Update event, which won hearts with its ‘omni’ capabilities across text, vision, and audio. OpenAI’s demos, which included a real-time translator, a coding assistant, an AI tutor, a friendly companion, a poet, and a singer, soon became the talk of the town. However, little did the world know that an Indian who was a child prodigy, Prafulla Dhariwal, was behind it until OpenAI chief Sam Altman posted about it on X. “GPT-4o would not have happened without the vision, talent, conviction, and determination of Prafulla Dhariwal over a long period of time. that (along with the work of many others) led to what i hope will turn out to be a revolution in how we use computers,” posted OpenAI chief Sam Altman on X, praising Dhariwal’s efforts behind GPT-4o. “GPT-4o (o for ‘Omni’) is the first model to come out of the Omni team, OpenAI’s first natively fully multimodal model. This launch was a huge org-wide effort, but I’d like to give a shout out to a few of my awesome team members who made this magical model even possible!” posted Dhariwal on X, highlighting the contributions of his team. The Journey So Far After completing his undergraduate degree, Dhariwal joined OpenAI in 2017 as a research scientist, focusing on generative models and unsupervised learning. Dhariwal has also worked on the Jukebox project, a generative model for music that can create high-fidelity and diverse songs with coherence for several minutes. This model uses a multi-scale VQ-VAE to compress raw audio into discrete codes, which are then modeled using autoregressive Transformers. Building on his expertise in generative models, Dhariwal has also contributed to the development of diffusion models that outperform GANs in image synthesis. These diffusion models achieve superior image sample quality and can be used for both unconditional and conditional image synthesis, further showcasing his impact on the field of AI. Moreover, he has made significant contributions to advanced AI models, such as Glow for generating high-resolution images quickly, the Variational Lossy Auto-encoder to prevent issues in autoencoders, PPO (Proximal Policy Optimisation) for reinforcement learning, and GamePad for applying reinforcement learning to formal theorem proving. With OpenAI’s model now capable of engaging in natural, real-time voice conversations, the next pitstop for OpenAI appears to be music generation, and undoubtedly, Dhariwal will be at the center of it all.

  • Why Should News Organizations (Not) Build an LLM?

    Integrating Large Language Models (LLMs) into the newsroom has the potential to unlock a myriad of opportunities for news organizations in tasks relevant to content creation and editing, as well as news gathering and distribution. But as newsrooms continue to explore the avenues and prospects for harnessing LLMs, a question arises around the strategic and competitive use of the technology: should news organizations strive to train their own LLMs? In this post I argue that news organizations (especially those with limited resources) that use prompt engineering, fine-tuning, and retrieval augmented generation (RAG) to enhance their productivity and offerings will be strategically better off than if they train their own LLMs from scratch. I will first lay out the cost calculus for engineering and deploying your own model, and then I'll elaborate on the benefits and trade-offs of these other techniques for leveraging the value from generative AI. Training LLMs from scratch could be a costly decision Building and training LLMs from scratch is challenging due to the need for large datasets, extensive computing resources, and specialized talent to develop and train these models. For instance, the computing resources needed to train BloombergGPT — an LLM for finance — are estimated to have cost approximately $1M. While the cost of serving such a model is not public information, the infrastructure required to serve even such a moderate-size model, irrespective of the number of users, is probably not cheap, and is likely somewhere in the six figures. In addition, the ethical considerations around building a responsible model, such as ensuring fairness, privacy, and transparency, while sourcing ethical data, require dedicated attention and resources that could divert the focus of news organizations from their core journalistic tasks and reporting. Optimizing an LLM's performance with respect to the amount of compute resources required for training remains a work in progress. Rushing forward to train an LLM without a detailed cost-benefit analysis is likely to end up costing news organizations hefty amounts of money that may not yield a high return on investment. Since the development of BloombergGPT in March 2023, smaller and more capable open-source model architectures (such as the Mistral models) have become publicly available, presenting a competitive alternative to larger, costly, proprietary models. News organizations may instead want to consider the strategic advantages of utilizing third-party models that are accessible via API endpoints. This can reduce infrastructure costs while also ensuring access to state-of-the-art models, and even facilitate versatility in terms of quickly swapping models. For instance, news organizations could deploy the open-source and quite performant Mistral-7B model via the HuggingFace Inference Endpoint on a single A10G GPU for $1.3 per hour for experimentation purposes. They could then decide to switch to Gemma-7B from Google at no cost while still paying the same amount for compute, allowing for rapid iteration and testing of different models.
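To make the endpoint route concrete, here is a minimal sketch of how a newsroom script might send a carefully crafted prompt to a hosted model. The endpoint URL, token variable, and response shape are assumptions for illustration (they follow the general pattern of Hugging Face text-generation endpoints); a different provider would need a different payload and parsing.

```python
import os
import requests

# Placeholder endpoint URL and token: illustrative assumptions, not real credentials.
ENDPOINT_URL = "https://YOUR-ENDPOINT.endpoints.huggingface.cloud"
HEADERS = {"Authorization": f"Bearer {os.environ.get('HF_TOKEN', '')}"}

def generate_headline(article_text: str) -> str:
    """Ask a hosted instruction-tuned model for a one-line headline."""
    prompt = (
        "You are a copy editor. Write one concise, factual headline "
        "for the article below. Do not add information.\n\n"
        f"Article:\n{article_text}\n\nHeadline:"
    )
    payload = {
        "inputs": prompt,
        "parameters": {"max_new_tokens": 30, "temperature": 0.3},
    }
    response = requests.post(ENDPOINT_URL, headers=HEADERS, json=payload, timeout=60)
    response.raise_for_status()
    # Hugging Face text-generation endpoints typically return objects with a
    # "generated_text" field; adjust the parsing for other providers.
    return response.json()[0]["generated_text"].strip()

if __name__ == "__main__":
    print(generate_headline("The city council voted 7-2 on Tuesday to expand bus service..."))
```

The same thin wrapper could point at GPT-4, Claude, or a self-hosted Mistral-7B endpoint by swapping only the URL and payload format, which is exactly the model-swapping flexibility described above.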
Accordingly, news organizations exploring the potential of prompt engineering, fine-tuning, and retrieval-augmented generation (RAG) may have a competitive cost advantage and application development agility, possibly achieving a faster return on investment through the use of readily available models (e.g., GPT-4 or Claude) via API or inference endpoints (e.g., Mistral-7B deployed via HuggingFace Inference Endpoint) for their applications. What is prompt engineering? Prompt engineering is an emerging communication technique between users and LLMs that is used to craft questions and instructions to elicit a desired response from LLMs. While prompt engineering appears to be straightforward on the surface, it requires domain expertise in different prompting techniques to fully reap the benefits of LLMs. For instance, this guide lists 17 different approaches to prompting, some of which are rather structured and involved. And different models may require different prompt formats or tricks to get the best performance. Yet prompt engineering is still the fastest way to get information from a general-purpose LLM (at least, one that is already tuned to behave like a chat assistant similar to ChatGPT) without modifying its architecture or retraining it. You can refer to the Introduction to prompt design documentation guide to learn how to create prompts that elicit the desired output from Google's LLMs. OpenAI offers a similar prompting guide to use its models effectively. Or, Journalist's ToolBox offers useful prompting resources that are more oriented towards use cases in journalism. What is Retrieval Augmented Generation (RAG)? While prompt engineering is a very powerful and resource-efficient way to generate desired content, the knowledge of many LLMs is capped by the cut-off date of their training data. For example, GPT-4 has a training-data cut-off of December 2023. In other words, without merging GPT-4's knowledge with the information available online, the model won't be able to infer the latest updates in the world. News organizations can build their own cost-efficient RAG systems using externally hosted LLMs (such as GPT-4 or Claude) or internally hosted open source models (e.g., Mistral-7B) to enable journalists and users to sift through and converse with a large corpus of archival documents, knowledge bases, or reporting material, similar to the Financial Times AI chatbot. RAG services can also be multi-modal. Using multi-modal open source vector databases such as Weaviate, users can query and retrieve audio, video, and text data in natural language. Overall, RAG allows LLMs to access real-time information (via connecting and retrieving information from the internet) or domain-specific knowledge (e.g., archival data) from a specific set of sources. This capability can potentially enable journalists to generate answers to questions that are grounded in a curated set of factual and up-to-date information, enhancing the accuracy of the LLM. When to fine-tune an LLM? News organizations interested in specializing pre-trained LLMs (e.g., GPT-3.5) for specific applications and tasks, such as reflecting a specific writing style in generated text, should consider fine-tuning. Fine-tuning LLMs eliminates the need for constant prompt engineering to get the desired output by instead curating a dataset that closely mirrors the nature of a specific task. For instance, perhaps your organization has a specific style or tone used in headlines that you would like a model to be able to mimic.
By curating a dataset of articles and headlines, you could then fine-tune a model to produce such headlines without requiring a user to know how to prompt the model in any specific way. While fine-tuning offers the advantage of a model with specialized and tailored responses, and it reduces the prompting expertise and knowledge burden for end-users, the process of fine-tuning could potentially be expensive depending on the compute and data resources required for a particular task. However, it will surely be a much cheaper option than training an LLM from scratch. Fine-tuning LLMs also requires constant monitoring and qualitative evaluation for potential model drift. OpenAI offers a great guide that walks through the necessary steps of customizing LLMs for your application. In addition, cloud service providers such as Google and Amazon also offer users the ability to fine-tune LLMs via their platforms Vertex AI and Bedrock, respectively. Which method to pick? Prompt engineering offers rapid adaptability to tasks in the newsroom with low computational overhead and technical complexity. It requires human expertise to craft prompts, but it does not require in-house compute resources for inference, especially when using model endpoints offered by providers such as OpenAI and Anthropic. Retrieval Augmented Generation (RAG) extends an LLM's capacity to incorporate real-time or external data for more factual responses. Although RAG does not require training or fine-tuning LLMs, storing the knowledge bases from which the LLMs fetch information may incur some cost as the knowledge base grows. Fine-tuning, on the other hand, provides high specialization for task-specific responses, requires careful selection of the fine-tuning dataset, and involves moderate computational and technical resources. In closing, based on these various factors, I would generally recommend that news organizations explore the use cases in which LLMs can be integrated into their workflows through prompting and possibly fine-tuning third-party models for their tasks. This is often going to be preferable to grappling with the expensive infrastructure costs needed to train and deploy models that may well be outdated and less efficient in the near future. Until infrastructure costs come down and training LLMs becomes more accessible, I would not recommend that news organizations build their own LLMs.
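As a companion to the RAG discussion above, the sketch below shows the retrieve-then-generate pattern over a tiny in-memory archive. The embedding model name is an illustrative choice and the final generation call is left as a placeholder (it could be the endpoint wrapper sketched earlier); a production newsroom system would typically use a vector database such as Weaviate instead of a Python list.

```python
import numpy as np
from sentence_transformers import SentenceTransformer  # assumes `pip install sentence-transformers`

# Tiny stand-in for a newsroom archive; a real system would use a vector database.
ARCHIVE = [
    "2021 council minutes: the downtown bus line was cut due to budget shortfalls.",
    "2023 feature: ridership on suburban routes rebounded after the pandemic.",
    "2024 report: the council proposes restoring the downtown bus line next year.",
]

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative model choice
doc_vectors = embedder.encode(ARCHIVE, normalize_embeddings=True)

def retrieve(question: str, k: int = 2) -> list[str]:
    """Return the k archive passages most similar to the question."""
    q_vec = embedder.encode([question], normalize_embeddings=True)[0]
    scores = doc_vectors @ q_vec  # cosine similarity on normalized vectors
    return [ARCHIVE[i] for i in np.argsort(scores)[::-1][:k]]

def build_prompt(question: str) -> str:
    """Stuff retrieved context into the prompt so the answer stays grounded."""
    context = "\n".join(retrieve(question))
    return (
        "Answer using only the context below. If the answer is not there, say so.\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )

# The final call would go to whichever LLM the newsroom uses, for example the
# endpoint wrapper sketched earlier in this article.
print(build_prompt("What happened to the downtown bus line?"))
```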

  • Igor Jablokov, Pryon: Building a responsible AI future

    As artificial intelligence continues to rapidly advance, ethical concerns around the development and deployment of these world-changing innovations are coming into sharper focus. In an interview ahead of the AI & Big Data Expo North America , Igor Jablokov, CEO and founder of AI company Pryon , addressed these pressing issues head-on. Critical ethical challenges in AI “There’s not one, maybe there’s almost 20 plus of them,” Jablokov stated when asked about the most critical ethical challenges. He outlined a litany of potential pitfalls that must be carefully navigated—from AI hallucinations and emissions of falsehoods, to data privacy violations and intellectual property leaks from training on proprietary information. Bias and adversarial content seeping into training data is another major worry, according to Jablokov. Security vulnerabilities like embedded agents and prompt injection attacks also rank highly on his list of concerns, as well as the extreme energy consumption and climate impact of large language models. Pryon’s origins can be traced back to the earliest stirrings of modern AI over two decades ago. Jablokov previously led an advanced AI team at IBM where they designed a primitive version of what would later become Watson. “They didn’t greenlight it. And so, in my frustration, I departed, stood up our last company,” he recounted. That company, also called Pryon at the time, went on to become Amazon’s first AI-related acquisition, birthing what’s now Alexa. The current incarnation of Pryon has aimed to confront AI’s ethical quandaries through responsible design focused on critical infrastructure and high-stakes use cases. “[We wanted to] create something purposely hardened for more critical infrastructure, essential workers, and more serious pursuits,” Jablokov explained. A key element is offering enterprises flexibility and control over their data environments. “We give them choices in terms of how they’re consuming their platforms…from multi-tenant public cloud, to private cloud, to on-premises,” Jablokov said. This allows organisations to ring-fence highly sensitive data behind their own firewalls when needed. Pryon also emphasises explainable AI and verifiable attribution of knowledge sources. “When our platform reveals an answer, you can tap it, and it always goes to the underlying page and highlights exactly where it learned a piece of information from,” Jablokov described. This allows human validation of the knowledge provenance. In some realms like energy, manufacturing, and healthcare, Pryon has implemented human-in-the-loop oversight before AI-generated guidance goes to frontline workers. Jablokov pointed to one example where “supervisors can double-check the outcomes and essentially give it a badge of approval” before information reaches technicians. Ensuring responsible AI development Jablokov strongly advocates for new regulatory frameworks to ensure responsible AI development and deployment. While welcoming the White House’s recent executive order as a start, he expressed concerns about risks around generative AI like hallucinations, static training data, data leakage vulnerabilities, lack of access controls, copyright issues, and more.   Pryon has been actively involved in these regulatory discussions. “We’re back-channelling to a mess of government agencies,” Jablokov said. 
“We’re taking an active hand in terms of contributing our perspectives on the regulatory environment as it rolls out…We’re showing up by expressing some of the risks associated with generative AI usage.” On the potential for an uncontrolled, existential “AI risk” – as has been warned about by some AI leaders – Jablokov struck a relatively sanguine tone about Pryon’s governed approach: “We’ve always worked towards verifiable attribution…extracting out of enterprises’ own content so that they understand where the solutions are coming from, and then they decide whether they make a decision with it or not.” The CEO firmly distanced Pryon’s mission from the emerging crop of open-ended conversational AI assistants, some of which have raised controversy around hallucinations and lacking ethical constraints. “We’re not a clown college. Our stuff is designed to go into some of the more serious environments on planet Earth,” Jablokov stated bluntly. “I think none of you would feel comfortable ending up in an emergency room and having the medical practitioners there typing in queries into a ChatGPT, a Bing, a Bard…” He emphasised the importance of subject matter expertise and emotional intelligence when it comes to high-stakes, real-world decision-making. “You want somebody that has hopefully many years of experience treating things similar to the ailment that you’re currently undergoing. And guess what? You like the fact that there is an emotional quality that they care about getting you better as well.” At the upcoming AI & Big Data Expo , Pryon will unveil new enterprise use cases showcasing its platform across industries like energy, semiconductors, pharmaceuticals, and government. Jablokov teased that they will also reveal “different ways to consume the Pryon platform” beyond the end-to-end enterprise offering, including potentially lower-level access for developers. As AI’s domain rapidly expands from narrow applications to more general capabilities, addressing the ethical risks will become only more critical. Pryon’s sustained focus on governance, verifiable knowledge sources, human oversight, and collaboration with regulators could offer a template for more responsible AI development across industries.

  • Generative AI gold rush drives IT spending — with payoff in question

    A scramble to invest in artificial intelligence and a natural replacement cycle for computing devices purchased during the COVID pandemic will lead to an 8% increase in global IT spending this year, Gartner predicted. Interest in AI, building since last year, will push a 10% increase in data center system spending this year, driving worldwide IT spending to $5.06 trillion, said John-David Lovelock, distinguished vice president analyst at Gartner. “No company got out of 2023 without having a story about how much better their company was going to be, how much better their products were going to be, how much better their customers’ lives were going to be because of generative AI,” he said. “There were very robust stories about how great generative AI was going to be.” That said, CIOs buying into the AI hype should beware that their ambitions are likely out front of their execution, Lovelock said. According to a Gartner survey conducted in late 2023, 42% of CIOs planned to deploy generative AI tools by the third quarter of this year, with 55% planning to roll out some other type of AI or machine learning in the same timeframe. But those timelines — and investments — are “highly aspirational,” Lovelock said. “There’s no way most companies are going to get anywhere near this. They are not going to have a meaningful generative AI product or service running at their company by the end of this year.” In other words, CIOs are likely to be apportioning more budget toward speculative initiatives less likely to pan out. What does Generative AI give me? Companies buying the marketing hype about the benefits of AI need to look for proofs of concept, added Mark McDonald, a distinguished vice president analyst at Gartner. The first deployments of generative AI will likely be focused on simple tasks, such as drafting responses to emails and analyzing contracts. “It takes more than just installing software to make gen AI work,” McDonald said. A huge problem with AI adoption is a lack of a clear and compelling investment approach, McDonald added. “Many gen AI initiatives are being driven off the board and the C-level executives with this fear of falling behind,” he said. “Everybody is still pretty much at the starting line from an enterprise perspective.” AI promises cost savings, productivity improvements, and better customer experiences, but CIOs need to figure out how to calculate the ROI, McDonald said. “The question then becomes, ‘What am I going to get from this?’” he added. “That investment is not necessarily trivial.” While enterprises look to adopt AI, many software vendors will be flooding the market with AI-based products in the next two years, Lovelock suggested. AI-based email and collaboration tools, content services, and CRM are already here, and AI-based security software, supply chain management, app development, and ERP are coming or have come this year, he noted. Many AI apps will become commonplace in the next year or two. “If [vendors are] looking for a first-mover advantage, this is a one-year window in many software areas, two at the most,” he said. “If you allow your competition to have a gen AI product and you didn’t, you’re going to lose market share. Gen AI systems will be coming into every product and service.” Still, early returns, associated price hikes, and questions around value add have some CIOs not entirely sold on generative AI features just yet. 
Data centers shoulder the load To support AI workloads, spending on data center systems will increase by 10% in 2024, Gartner predicted, compared to a 4% increase in 2023. Hyperscalers will buy about 70% of the AI servers purchased this year, Lovelock predicted. Gartner’s new IT spending forecast predicts spending growth in all five major IT categories, including software, IT services, and communications services. In January, Gartner had predicted a 6.8% increase in IT spending this year, and it measured a 3.3% growth in 2023. Software spending will see the largest percentage growth in 2024, with 13.9%, but IT services and communications services remain the largest two IT categories, with 2024 spending at $1.39 trillion and $1.49 trillion, respectively. Another big change from 2023 comes in the devices category, which saw a 9.1% drop in 2023. Gartner predicts spending on smartphones, PCs, and tablets will increase by 3.6% this year, as enterprises and consumers begin to replace the huge number of devices purchased in 2021 for employees and students staying home during the COVID pandemic. About $809 billion worth of devices were sold in 2021, while Gartner predicts $687 billion in device spending in 2024, after a couple of flat to down years. “We’re finally getting back to a regular replacement cycle,” Lovelock said. “Enterprises are coming back three years later and saying, ‘Now it’s time to start refreshing the devices again.’”

  • Meta’s newest AI model beats some peers. But its amped-up AI agents are confusing Facebook users

    Facebook parent Meta Platforms unveiled a new set of artificial intelligence systems Thursday that are powering what CEO Mark Zuckerberg calls “the most intelligent AI assistant that you can freely use.” But as Zuckerberg’s crew of amped-up Meta AI agents started venturing into social media this week to engage with real people, their bizarre exchanges exposed the ongoing limitations of even the best generative AI technology. One joined a Facebook moms’ group to talk about its gifted child. Another tried to give away nonexistent items to confused members of a Buy Nothing forum. Meta, along with leading AI developers Google and OpenAI, and startups such as Anthropic, Cohere and France’s Mistral, have been churning out new AI language models and hoping to persuade customers they’ve got the smartest, handiest or most efficient chatbots. While Meta is saving the most powerful of its AI models, called Llama 3, for later, on Thursday it publicly released two smaller versions of the same Llama 3 system and said it’s now baked into the Meta AI assistant feature in Facebook, Instagram and WhatsApp. AI language models are trained on vast pools of data that help them predict the most plausible next word in a sentence, with newer versions typically smarter and more capable than their predecessors. Meta’s newest models were built with 8 billion and 70 billion parameters — a measurement of how much data the system is trained on. A bigger, roughly 400 billion-parameter model is still in training. “The vast majority of consumers don’t candidly know or care too much about the underlying base model, but the way they will experience it is just as a much more useful, fun and versatile AI assistant,” said Nick Clegg, Meta’s president of global affairs, in an interview. He added that Meta’s AI agent is loosening up. Some people found the earlier Llama 2 model — released less than a year ago — to be “a little stiff and sanctimonious sometimes in not responding to what were often perfectly innocuous or innocent prompts and questions,” he said. But in letting down their guard, Meta’s AI agents also were spotted this week posing as humans with made-up life experiences. An official Meta AI chatbot inserted itself into a conversation in a private Facebook group for Manhattan moms, claiming that it, too, had a child in the New York City school district. Confronted by group members, it later apologized before the comments disappeared, according to a series of screenshots shown to The Associated Press. “Apologies for the mistake! I’m just a large language model, I don’t have experiences or children,” the chatbot told the group. One group member who also happens to study AI said it was clear that the agent didn’t know how to differentiate a helpful response from one that would be seen as insensitive, disrespectful or meaningless when generated by AI rather than a human. “An AI assistant that is not reliably helpful and can be actively harmful puts a lot of the burden on the individuals using it,” said Aleksandra Korolova, an assistant professor of computer science at Princeton University. Clegg said Wednesday he wasn’t aware of the exchange. 
Facebook’s online help page says the Meta AI agent will join a group conversation if invited, or if someone “asks a question in a post and no one responds within an hour.” The group’s administrators have the ability to turn it off. In another example shown to the AP on Thursday, the agent caused confusion in a forum for swapping unwanted items near Boston. Exactly one hour after a Facebook user posted about looking for certain items, an AI agent offered a “gently used” Canon camera and an “almost-new portable air conditioning unit that I never ended up using.” Meta said in a written statement Thursday that “this is new technology and it may not always return the response we intend, which is the same for all generative AI systems.” The company said it is constantly working to improve the features. In the year after ChatGPT sparked a frenzy for AI technology that generates human-like writing, images, code and sound, the tech industry and academia introduced some 149 large AI systems trained on massive datasets, more than double the year before, according to a Stanford University survey. They may eventually hit a limit — at least when it comes to data, said Nestor Maslej, a research manager for Stanford’s Institute for Human-Centered Artificial Intelligence. “I think it’s been clear that if you scale the models on more data, they can become increasingly better,” he said. “But at the same time, these systems are already trained on percentages of all the data that has ever existed on the internet.” More data — acquired and ingested at costs only tech giants can afford, and increasingly subject to copyright disputes and lawsuits — will continue to drive improvements. “Yet they still cannot plan well,” Maslej said. “They still hallucinate. They’re still making mistakes in reasoning.” Getting to AI systems that can perform higher-level cognitive tasks and commonsense reasoning — where humans still excel— might require a shift beyond building ever-bigger models. For the flood of businesses trying to adopt generative AI, which model they choose depends on several factors, including cost. Language models, in particular, have been used to power customer service chatbots, write reports and financial insights and summarize long documents. “You’re seeing companies kind of looking at fit, testing each of the different models for what they’re trying to do and finding some that are better at some areas rather than others,” said Todd Lohr, a leader in technology consulting at KPMG. Unlike other model developers selling their AI services to other businesses, Meta is largely designing its AI products for consumers — those using its advertising-fueled social networks. Joelle Pineau, Meta’s vice president of AI research, said at a London event last week the company’s goal over time is to make a Llama-powered Meta AI “the most useful assistant in the world.” “In many ways, the models that we have today are going to be child’s play compared to the models coming in five years,” she said. But she said the “question on the table” is whether researchers have been able to fine tune its bigger Llama 3 model so that it’s safe to use and doesn’t, for example, hallucinate or engage in hate speech. In contrast to leading proprietary systems from Google and OpenAI, Meta has so far advocated for a more open approach, publicly releasing key components of its AI systems for others to use. “It’s not just a technical question,” Pineau said. “It is a social question. What is the behavior that we want out of these models? 
How do we shape that? And if we keep on growing our model ever more in general and powerful without properly socializing them, we are going to have a big problem on our hands.”

  • AI tool finds cancer signs missed by doctors

    An AI tool has proven capable of detecting signs of cancer that were overlooked by human radiologists. The AI tool, called Mia, was piloted alongside NHS clinicians in the UK and analysed the mammograms of over 10,000 women. Most of the participants were cancer-free, but the AI successfully flagged all of those with symptoms of breast cancer—as well as an additional 11 cases that the doctors failed to identify. Of the 10,889 women who participated in the trial, only 81 chose not to have their scans reviewed by the AI system. The AI tool was trained on a dataset of over 6,000 previous breast cancer cases to learn the subtle patterns and imaging biomarkers associated with malignant tumours. When evaluated on the new cases, it correctly predicted the presence of cancer with 81.6 percent accuracy and correctly ruled it out 72.9 percent of the time. Breast cancer is the most common cancer in women worldwide, with two million new cases diagnosed annually. While survival rates have improved with earlier detection and better treatments, many patients still experience severe side effects like lymphoedema after surgery and radiotherapy. Researchers are now developing the AI system further to predict a patient’s risk of such side effects up to three years after treatment. This could allow doctors to personalise care with alternative treatments or additional supportive measures for high-risk patients. The research team plans to enrol 780 breast cancer patients in a clinical trial called Pre-Act to prospectively validate the AI risk prediction model over a two-year follow-up period. The long-term goal is an AI system that can comprehensively evaluate a patient’s prognosis and treatment needs.
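For readers unfamiliar with screening metrics, the two percentages quoted above read like the standard pair of sensitivity (how often actual cancers are flagged) and specificity (how often cancer-free scans are correctly cleared). The snippet below shows how those metrics are computed; the counts are made up to reproduce the quoted percentages and are not the trial's actual confusion matrix.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of actual cancer cases the model flags (also called recall)."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg: int, false_pos: int) -> float:
    """Share of cancer-free scans the model correctly clears."""
    return true_neg / (true_neg + false_pos)

# Illustrative counts only, chosen to reproduce the percentages in the article.
print(f"sensitivity: {sensitivity(true_pos=816, false_neg=184):.1%}")   # 81.6%
print(f"specificity: {specificity(true_neg=729, false_pos=271):.1%}")   # 72.9%
```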

  • From Grocery Lists to Gothic Castles: Text-to-Image - AI's New Party Trick

    Imagine conjuring a vibrant picture with the mere power of your mind. Describing a scene in detail, and witnessing it come to life on a canvas. This captivating ability, once confined solely to the realm of imagination, is becoming increasingly achievable with the revolutionary field of text-to-image synthesis. This technology, powered by generative models, unlocks a remarkable possibility: automatically generating images based on textual descriptions. It's like having a personal artist who interprets your words and translates them into visual masterpieces. The Spark of Creation: Understanding Text-to-Image Synthesis Text-to-image synthesis delves into the fascinating world of artificial intelligence (AI) and machine learning. At its core, the technology relies on models trained on massive amounts of data. This data consists of text-image pairs, where each pair includes a textual description and its corresponding image. By analyzing these paired examples, the models learn the intricate relationship between language and visual representation. Think of it like teaching a language to a computer. Just as you learn the connection between words and their meanings, the model learns to associate specific words and phrases with visual elements like colors, shapes, and textures. Once trained, the model can then generate new images based on novel textual descriptions provided by users. "Text-to-image models are like having a personal artist in your pocket, ready to bring your ideas to life with the power of words." - Jeff Clune, Chief Scientist at OpenAI 2022: A Year of Remarkable Advancements in Text-to-Image Models 2022 marked a significant year in the evolution of text-to-image models. These models, powered by artificial intelligence, can generate stunning images based on textual descriptions. This year witnessed a rapid pace of innovation, with new models like OpenAI's DALL-E 2, Google's Imagen, and StabilityAI's Stable Diffusion pushing the boundaries of what was possible. Diffusion Models Take Center Stage: Similar to the transformative impact of transformers in natural language processing, experts believe diffusion models will have a similar effect on text-to-image generation. These models, gaining prominence in 2022, work by gradually refining a random noise image into the desired final image based on the provided text description. This approach proved highly successful, with several notable breakthroughs occurring throughout the year: April 2022: OpenAI unveiled DALL-E 2, showcasing its capability to generate artistic and photorealistic images based on text prompts. May 2022: Google introduced Imagen, further pushing the boundaries of image quality and realism. August 2022: StabilityAI released Stable Diffusion, another powerful diffusion-based model. November 2022: NVIDIA joined the scene with eDiffi, a novel diffusion model raising the bar for text-to-image generation. eDiffi: Redefining Text-to-Image Capabilities While generating images from textual descriptions is impressive in itself, the latest research from NVIDIA with eDiffi takes things a step further. This innovative model introduces two groundbreaking features: Combining Text and Reference Images: eDiffi allows users to incorporate a reference image alongside their text description. This is particularly beneficial for scenarios where complex concepts are difficult to describe solely with words. The reference image guides the model in applying the desired style to the generated image. 
Paint-by-Word for Precise Control: eDiffi empowers users with granular control over object placement within the generated image. By associating specific phrases in the text prompt with different areas of a virtual canvas, users can dictate the layout and arrangement of objects in the final image. These advancements in text-to-image models hold immense potential for various applications, from creative design and artistic exploration to education and scientific visualization. As research continues to accelerate, the future of this technology appears dazzlingly bright, offering exciting possibilities for generating even more sophisticated and user-controlled visual experiences. The Artistic Palette: Exploring Different Generative Models The world of text-to-image synthesis boasts a variety of generative models, each with its unique approach to the creative process: Generative Adversarial Networks (GANs): Imagine two artists competing, constantly pushing each other to improve their work. This is the essence of GANs. They involve two neural networks: a generator that attempts to create realistic images based on the text description, and a discriminator that tries to distinguish between the generated images and real images from the training data. This continuous competition drives the generator to create increasingly realistic and accurate images. Transformers: Inspired by the revolutionary architecture used in natural language processing (NLP), transformers excel at capturing the long-range dependencies within a sentence. This makes them well-suited for text-to-image synthesis. Transformer-based models learn to encode the textual description and then use this information to meticulously construct the image, pixel by pixel. Diffusion Models: Picture a process akin to sculpting. Diffusion models begin with a random noise image and gradually refine it towards the final desired image based on the provided text description. Think of this as adding details and removing noise step-by-step, guided by the text instructions, until the final image emerges. Each model has its own strengths and weaknesses, and the choice often depends on factors like the desired level of realism, the complexity of the description, and the computational resources available. Beyond the Canvas: Challenges and the Road Ahead While significant strides have been made, text-to-image synthesis still faces a few challenges: Understanding complex semantics: Just like humans sometimes misinterpret nuances in language, models can struggle with complex or ambiguous descriptions. Ensuring accurate interpretation of intricate details remains a work in progress. Generating diverse and creative outputs: While models can create realistic images, they sometimes lack the spark of true creativity. Encouraging models to generate diverse and original outputs that go beyond replicating existing styles is an ongoing pursuit. Ethical considerations: The training data, and consequently the generated images, can inadvertently encode societal biases. Addressing these biases and ensuring fairness and inclusivity in the generated images is crucial. Despite these challenges, researchers are actively working on addressing them, constantly refining model architectures, employing advanced training techniques, and fostering responsible AI development. 
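Of the three families above, diffusion models are also the easiest to experiment with directly. As a minimal sketch, assuming the open-source Hugging Face diffusers library and a publicly released Stable Diffusion checkpoint (the model ID and sampling settings here are illustrative, not prescriptive), text-to-image generation looks like this:

```python
import torch
from diffusers import StableDiffusionPipeline  # assumes `pip install diffusers transformers`

# Illustrative checkpoint; any compatible text-to-image checkpoint works similarly.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")  # move to GPU; drop this line to run (much more slowly) on CPU

prompt = "a gothic castle on a cliff at dusk, oil painting, dramatic lighting"
# Under the hood the pipeline runs the iterative denoising loop described above:
# it starts from random noise and refines it toward an image that matches the text.
image = pipe(prompt, num_inference_steps=30, guidance_scale=7.5).images[0]
image.save("gothic_castle.png")
```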
A Glimpse into the Future: Where Text-to-Image Synthesis Leads Us The future of text-to-image synthesis is brimming with exciting possibilities: Enhanced control and interpretability: Imagine being able to fine-tune the style, content, and composition of the generated images. This level of control and interpretability would empower users to create truly personalized visual experiences. Multimodal learning: Combining textual information with other modalities like audio or video promises even richer and more immersive experiences. Imagine describing a bustling cityscape, and not only seeing the image but also hearing the sounds of traffic and conversations. Real-time generation: The ability to generate images from text descriptions in real-time would open up a plethora of interactive applications. Imagine describing a new outfit for your character in a game and seeing it come to life instantaneously. Text-to-Image: A Stepping Stone to a Broader AI Revolution Large-scale diffusion models are spearheading a significant breakthrough in text-driven image generation, producing high-resolution images based on textual descriptions. This technology holds immense potential beyond just creating visuals from words. Beyond Text-to-Image: A Ripple Effect on AI The groundbreaking advancements in text-to-image models are merely the beginning. These advancements are expected to have a significant impact on other AI fields, leading to substantial improvements in both: Computer Vision: Diffusion models offer the potential to generate vast amounts of realistic synthetic data, overcoming the limitations of collecting real-world data, which can be expensive and time-consuming. This synthetic data can be used to train computer vision models for various tasks, from medical image analysis to autonomous driving. Natural Language Processing (NLP): The ability to generate textual descriptions of images can significantly improve NLP tasks like image captioning and visual question answering. "It's fascinating to see how text-to-image models are blurring the lines between language and visual representation, opening doors to entirely new forms of creative expression." - Fei-Fei Li, Co-Director of the Stanford Human-Centered AI Institute Language: The Bridge Between Our Minds and the World Language plays a crucial role in shaping our world. We use it to describe and understand our surroundings, share knowledge across generations, and even explain the complex mental imagery we experience. Text-to-image models act as a bridge between the power of language and the visual world, opening doors to remarkable possibilities. Real-World Applications of Text-to-Image: The potential applications of text-to-image technology extend far beyond mere image creation. It can be used for various purposes across diverse industries, including: Creative fields: Generating images for marketing campaigns, designing characters for games or animation, and creating unique artwork for NFTs. Image restoration: Restoring blurry or low-resolution images and enhancing their resolution through super-resolution techniques. Image editing: Filling in missing parts of images (inpainting) or making targeted edits based on textual descriptions. In essence, text-to-image models mark a significant step towards a more advanced and versatile AI landscape. Their ability to bridge the gap between language and visual representation unlocks a plethora of applications across various domains, paving the way for a future filled with exciting possibilities. 
Words Become Worlds: Witnessing the Rise of Text-to-Image, AI's New Storytelling Tool

The journey of text-to-image synthesis is like witnessing a seed blossom into a vibrant flower. From the initial spark of translating words into visuals to the current ability to generate high-fidelity images, the advancements in this field are truly remarkable. This technology transcends mere image creation; it bridges the gap between language and visual representation, unlocking a treasure trove of potential across various domains.

Imagine a world where language becomes the brush and text descriptions transform into breathtaking visuals. Artists can explore uncharted creative territories, scientists can visualize complex concepts with ease, and everyday individuals can express their ideas in a whole new way. The possibilities are truly endless.

As we stand at the precipice of this exciting revolution, the future seems ablaze with the vibrant colors of imagination. While challenges remain, the relentless pursuit of innovation and the collaborative spirit of the AI community offer a glimpse into a future where words will not just paint a picture in our minds, but on the canvas of reality itself.

Ready to delve deeper into the captivating world of generative AI and explore its diverse applications? Follow TheGen.ai for Generative AI news, trends, startup stories, and more.

  • Data Da Vinci? AI Michelangelo? GenAI Modeling Genius Awaits

    The convergence of Generative Data Modeling (GDM) and Artificial Intelligence (AI) is shaping the future across diverse industries. GDM empowers the creation of synthetic data, while AI leverages this data to train powerful models, opening doors to groundbreaking advancements. Let's delve into this dynamic duo and explore how they are crafting the future together.

Beyond the Real: Unleashing the Power of Synthetic Data

GDM enables the creation of realistic and diverse synthetic data: data points that are artificially generated but statistically resemble real-world data (a small illustrative sketch of the idea appears later in this post). This offers several advantages:

- Overcoming Data Scarcity: GDM allows businesses to address the challenge of limited real-world data, particularly in areas like healthcare and autonomous vehicles. "Synthetic data can significantly improve the performance of AI models when real-world data is scarce," states a recent article in Harvard Business Review.
- Enhancing Data Privacy: GDM can be used to anonymize or mask sensitive information in real-world data, enabling data sharing and collaboration while protecting privacy. "Synthetic data offers a promising solution for balancing data utility and privacy concerns," highlights a report by the World Economic Forum.
- Augmenting Training Datasets: GDM can be used to expand and diversify existing datasets, improving the generalizability and robustness of AI models. "Enhancing training data with synthetic examples can lead to more accurate and unbiased AI models," explains a recent research paper published in Nature.

A recent survey by Deloitte reveals that 73% of executives believe synthetic data will play a significant role in overcoming data scarcity challenges in their organizations within the next three years, highlighting the growing adoption of GDM.

From Synthetic to SMARTER: AI Fueling Innovation with GDM

AI algorithms rely on vast amounts of data to learn and improve. GDM empowers them by providing diverse and realistic synthetic data:

- Improved Model Performance: AI models trained on synthetic data can achieve better performance on real-world tasks, particularly when real-world data is limited or biased. "Leveraging synthetic data can lead to more accurate and reliable AI models for various applications," states a study by McKinsey Global Institute.
- Enhanced Generalizability: GDM allows the creation of diverse synthetic data scenarios, improving the ability of AI models to adapt to real-world situations beyond their training data. "AI models trained on diverse synthetic data can generalize better and perform well on unseen scenarios," explains a recent article on TechCrunch.
- Accelerated Development Cycles: GDM can shorten the development cycle of AI models by facilitating faster and more efficient training with readily available synthetic data. "Utilizing synthetic data can expedite the development and deployment of AI applications across various industries," highlights a recent report by Accenture.

A report by Gartner predicts that 20% of large enterprises will adopt synthetic data generation platforms by 2024, showcasing the increasing momentum for integrating GDM into AI development processes.
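To ground the idea of "statistically similar" synthetic data, here is a deliberately small sketch. It uses a Gaussian mixture model from scikit-learn as a stand-in for the far more sophisticated generative models discussed in this post; the simulated dataset, the column meanings, and the library choice are illustrative assumptions, not anything prescribed here.

```python
# Minimal synthetic-tabular-data sketch. A Gaussian mixture stands in for a
# real generative data model; the "real" data below is simulated purely so
# the example runs end to end.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(42)
# Pretend these are two business metrics from a real (scarce) dataset.
real_data = np.column_stack([
    rng.normal(100.0, 15.0, size=500),   # e.g. weekly demand
    rng.normal(0.20, 0.05, size=500),    # e.g. return rate
])

# Fit a generative model to the real rows, then sample as many synthetic rows as needed.
model = GaussianMixture(n_components=3, random_state=0).fit(real_data)
synthetic_rows, _ = model.sample(5_000)

# The synthetic sample should roughly match the real distribution's statistics.
print(real_data.mean(axis=0), synthetic_rows.mean(axis=0))
```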
Navigating the Ethical Landscape: Responsible Innovation

As GDM and AI continue to evolve, responsible development and ethical considerations are crucial:

- Bias and Fairness: Ensuring that GDM and AI models are free from bias requires careful attention to training data and algorithms. Mitigating potential biases throughout the development and deployment stages is essential.
- Transparency and Explainability: Understanding how AI models function, and the role GDM plays in their decision-making, is essential for building trust and ensuring responsible use. Transparency fosters responsible application and ethical practice.
- Data Privacy and Security: Protecting user data privacy remains paramount. Robust data security measures and clear communication about data collection and usage are crucial for building trust and ensuring ethical practices.

By prioritizing responsible development and ethical considerations, we can ensure that GDM and AI work together to create a positive and sustainable future.

GDM: Where Data Dances with AI and Innovation is Born

GDM and AI, working in tandem, offer immense potential to address complex challenges, unlock innovation, and shape a brighter future. However, embracing responsible development and ethical considerations is paramount to ensure this powerful duo fosters positive change. The question remains: how can we leverage GDM and AI responsibly and creatively to contribute to a more equitable and sustainable future?

Follow TheGen.ai for Generative AI news, trends, startup stories, and more.

  • Kamal Ahluwalia, Ikigai Labs: How to take your business to the next level with generative AI

    Ikigai is helping organisations transform sparse, siloed enterprise data into predictive and actionable insights with a generative AI platform specifically designed for structured, tabular data. A significant portion of enterprise data is structured, tabular data, residing in systems like SAP and Salesforce. This data drives the planning and forecasting for an entire business.

While there is a lot of excitement around Large Language Models (LLMs), which are great for unstructured data like text, Ikigai's patented Large Graphical Models (LGMs), developed out of MIT, are focused on solving problems using structured data. Ikigai's solution focuses particularly on time-series datasets, as enterprises run on four key time series: sales, products, employees, and capital/cash. Understanding how these time series come together in critical moments, such as launching a new product or entering a new geography, is crucial for making better decisions that drive optimal outcomes.

How would you describe the current generative AI landscape, and how do you envision it developing in the future?

The technologies that have captured the imagination, such as LLMs from OpenAI, Anthropic, and others, come from a consumer background. They were trained on internet-scale data, and the training datasets are only getting larger, which requires significant computing power and storage. It took $100m to train GPT-4, and GPT-5 is expected to cost $2.5bn.

This reality works in a consumer setting, where costs can be shared across a very large user base, and some mistakes are just part of the training process. But in the enterprise, mistakes cannot be tolerated, hallucinations are not an option, and accuracy is paramount. Additionally, the cost of training a model on internet-scale data is simply not affordable, and companies that leverage a foundational model risk exposing their IP and other sensitive data. While some companies have gone the route of building their own tech stack so LLMs can be used in a safe environment, most organisations lack the talent and resources to build it themselves.

In spite of the challenges, enterprises want the kind of experience that LLMs provide. But the results need to be accurate – even when the data is sparse – and there must be a way to keep confidential data out of a foundational model. It's also critical to find ways to lower the total cost of ownership, including the cost to train and upgrade the models, reliance on GPUs, and other issues related to governance and data retention. All of this leads to a very different set of solutions than what we currently have.

How can companies create a strategy to maximise the benefits of generative AI?

While much has been written about Large Language Models (LLMs) and their potential applications, many customers are asking, "How do I build differentiation?" With LLMs, nearly everyone will have access to the same capabilities, such as chatbot experiences or generating marketing emails and content; if everyone has the same use cases, it's not a differentiator. The key is to shift the focus from generic use cases to finding areas of optimisation and understanding specific to your business and circumstances.

For example, if you're in manufacturing and need to move operations out of China, how do you plan for uncertainty in logistics, labour, and other factors? Or, if you want to build more eco-friendly products, your materials, vendors, and cost structures will change. How do you model this?
These use cases are some of the ways companies are attempting to use AI to run their business and plan in an uncertain world. Finding specificity and tailoring the technology to your unique needs is probably the best way to use AI to find true competitive advantage.

What are the main challenges companies face when deploying generative AI, and how can these be overcome?

Listening to customers, we've learned that while many have experimented with generative AI, only a fraction have pushed things through to production due to prohibitive costs and security concerns. But what if your models could be trained just on your own data, running on CPUs rather than requiring GPUs, with accurate results and transparency around how you're getting those results? What if all the regulatory and compliance issues were addressed, leaving no questions about where the data came from or how much data is being retained? This is what Ikigai is bringing to the table with Large Graphical Models.

One challenge we've helped businesses address is the data problem. Nearly 100% of organisations are working with limited or imperfect data, and in many cases this is a barrier to doing anything with AI. Companies often talk about data clean-up, but in reality, waiting for perfect data can hinder progress. AI solutions that can work with limited, sparse data are essential, as they allow companies to learn from what they have and account for change management.

The other challenge is how internal teams can partner with the technology for better outcomes. Especially in regulated industries, human oversight, validation, and reinforcement learning are necessary. Adding an expert in the loop ensures that AI is not making decisions in a vacuum, so finding solutions that incorporate human expertise is key.

To what extent do you think adopting generative AI successfully requires a shift in company culture and mindset?

Successfully adopting generative AI requires a significant shift in company culture and mindset, with strong commitment from executives and continuous education. I saw this firsthand at Eightfold when we were bringing our AI platform to companies in over 140 countries.

I always recommend that teams first educate executives on what's possible, how to do it, and how to get there. They need the commitment to see it through, which involves some experimentation and a committed course of action. They must also understand the expectations placed on colleagues, so they can be prepared for AI becoming a part of daily life. Top-down commitment and communication from executives go a long way, as there's a lot of fear-mongering suggesting that AI will take jobs. Executives need to set the tone that, while AI won't eliminate jobs outright, everyone's job is going to change in the next couple of years, not just for people at the bottom or middle levels, but for everyone.

Ongoing education throughout the deployment is key for teams learning how to get value from the tools and adapting the way they work to incorporate the new skillsets. It's also important to adopt technologies that fit the reality of the enterprise. For example, you have to let go of the idea that you need to get all your data in order before taking action. In time-series forecasting, by the time you've taken four quarters to clean up data, there's more data available, and it's probably a mess. If you keep waiting for perfect data, you won't be able to use your data at all.
So AI solutions that can work with limited, sparse data are crucial, as you have to be able to learn from what you have. Another important aspect is adding an expert in the loop. It would be a mistake to assume AI is magic. There are a lot of decisions, especially in regulated industries, where you can't have AI just make the decision. You need oversight, validation, and reinforcement learning; this is exactly how consumer solutions became so good.

Are there any case studies you could share with us regarding companies successfully utilising generative AI?

One interesting example is a marketplace customer that is using us to rationalise their product catalogue. They're looking to understand the optimal number of SKUs to carry, so they can reduce their inventory carrying costs while still meeting customer needs. Another partner does workforce planning, forecasting, and scheduling, using us for labour balancing in hospitals, retail, and hospitality companies. In their case, all their data sits in different systems, and they must bring it into one view so they can balance employee wellness with operational excellence. But because we can support a wide variety of use cases, we work with clients doing everything from forecasting product usage as part of a move to a consumption-based model, to fraud detection.

You recently launched an AI Ethics Council. What kind of people are on this council and what is its purpose?

Our AI Ethics Council is all about making sure that the AI technology we're building is grounded in ethics and responsible design. It's a core part of who we are as a company, and I'm humbled and honoured to be a part of it alongside such an impressive group of individuals. Our council includes luminaries like Dr. Munther Dahleh, the Founding Director of the Institute for Data Systems and Society (IDSS) and a Professor at MIT; Aram A. Gavoor, Associate Dean at George Washington University and a recognised scholar in administrative law and national security; Dr. Michael Kearns, the National Center Chair for Computer and Information Science at the University of Pennsylvania; and Dr. Michael I. Jordan, a Distinguished Professor at UC Berkeley in the Departments of Electrical Engineering and Computer Science, and Statistics.

The purpose of our AI Ethics Council is to tackle pressing ethical and security issues affecting AI development and usage. As AI rapidly becomes central to consumers and businesses across nearly every industry, we believe it is crucial to prioritise responsible development, and we cannot ignore the need for ethical considerations. The council will convene quarterly to discuss important topics such as AI governance, data minimisation, confidentiality, lawfulness, accuracy, and more. Following each meeting, the council will publish recommendations for actions and next steps that organisations should consider. As part of Ikigai Labs' commitment to ethical AI deployment and innovation, we will implement the action items recommended by the council.

Ikigai Labs raised $25m funding in August last year. How will this help develop the company, its offerings and, ultimately, your customers?

We have a strong foundation of research and innovation coming out of our core team with MIT, so this funding is focused on making the solution more robust, as well as building out the team that works with clients and partners.
We can solve a lot of problems but are staying focused on solving just a few meaningful ones through time-series super apps. We know that every company runs on four time series, so the goal is to cover these in depth and at speed: things like sales forecasting, consumption forecasting, discount forecasting, how to sunset products, catalogue optimisation, etc. We're excited and looking forward to putting GenAI for tabular data into the hands of as many customers as possible.

  • Why Should News Organizations (Not) Build an LLM?

    Integrating Large Language Models (LLMs) into the newsroom has the potential to unlock a myriad of opportunities for news organizations in tasks relevant to content creation and editing, as well as news gathering and distribution. But as newsrooms continue to explore the avenues and prospects for harnessing LLMs, a question arises around the strategic and competitive use of the technology: should news organizations strive to train their own LLMs?

In this post I argue that news organizations (especially those with limited resources) that use prompt engineering, fine-tuning, and retrieval augmented generation (RAG) to enhance their productivity and offerings will be strategically better off than if they train their own LLMs from scratch. I will first lay out the cost calculus for engineering and deploying your own model, and then I'll elaborate on the benefits and trade-offs of these other techniques for leveraging the value of generative AI.

Training LLMs from scratch could be a costly decision

Building and training LLMs from scratch is challenging due to the need for large datasets, extensive computing resources, and specialized talent to develop and train these models. For instance, the computing resources needed to train BloombergGPT, an LLM for finance, are estimated to have cost approximately $1M. While the cost of serving such a model is not public information, the infrastructure required to serve even such a moderate-size model, irrespective of the number of users, is probably not cheap, and is likely somewhere in the six figures. In addition, the ethical considerations around building a responsible model, such as ensuring fairness, privacy, and transparency while sourcing ethical data, require dedicated attention and resources that could divert the focus of news organizations from their core journalistic tasks and reporting.

Optimizing an LLM's performance with respect to the amount of compute required for training remains a work in progress. Rushing ahead to train an LLM without a detailed cost-benefit analysis is likely to cost news organizations hefty amounts of money that may not yield a high return on investment. Since the development of BloombergGPT in March 2023, smaller and more capable open-source model architectures (such as the Mistral models) have become publicly available, presenting a competitive alternative to large, costly, proprietary models.

News organizations may instead want to consider the strategic advantages of utilizing third-party models that are accessible via API endpoints. This can reduce infrastructure costs while ensuring access to state-of-the-art models, and it offers the versatility to swap models quickly. For instance, news organizations could deploy the open-source and quite performant Mistral-7B model via the HuggingFace Inference Endpoint on a single A10G GPU for $1.3 per hour for experimentation purposes. They could then decide to switch to Gemma-7B from Google at no cost while still paying the same amount for compute, allowing for rapid iteration and testing of different models. (A minimal sketch of calling such a hosted model follows.)
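As a rough illustration of this swap-a-model-behind-an-endpoint pattern, here is a minimal sketch using the huggingface_hub client library. The library choice, model IDs, and prompt are assumptions made for the example, not recommendations from the post.

```python
# Minimal sketch of calling a hosted open-source model (assumes the
# `huggingface_hub` library and an HF access token in the HF_TOKEN env var).
import os
from huggingface_hub import InferenceClient

# Point the client at whichever hosted model (or dedicated endpoint URL) is deployed;
# swapping Mistral-7B for Gemma-7B is a one-line change.
client = InferenceClient(
    model="mistralai/Mistral-7B-Instruct-v0.2",
    token=os.environ.get("HF_TOKEN"),
)

summary = client.text_generation(
    "Summarize in two sentences: The city council voted on Tuesday to ...",
    max_new_tokens=120,
)
print(summary)
```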
Accordingly, news organizations exploring the potential of prompt engineering, fine-tuning, and retrieval-augmented generation (RAG) may gain a cost advantage and application development agility, possibly achieving a faster return on investment by using readily available models (e.g., GPT-4 or Claude) via API or inference endpoints (e.g., Mistral-7B deployed via the HuggingFace Inference Endpoint) for their applications.

What is prompt engineering?

Prompt engineering is an emerging technique for communicating with LLMs: crafting questions and instructions to elicit a desired response. While prompt engineering appears straightforward on the surface, it requires domain expertise in different prompting techniques to fully reap the benefits of LLMs. For instance, this guide lists 17 different approaches to prompting, some of which are rather structured and involved. And different models may require different prompt formats or tricks to get the best performance. Yet prompt engineering is still the fastest way to get information from a general-purpose LLM (at least, one that is already tuned to behave like a chat assistant, similar to ChatGPT) without modifying its architecture or retraining it. You can refer to the Introduction to prompt design documentation guide to learn how to create prompts that elicit the desired output from Google's LLMs. OpenAI offers a similar prompting guide for using its models effectively. Or, Journalist's ToolBox offers useful prompting resources that are more oriented towards use cases in journalism.

What is Retrieval Augmented Generation (RAG)?

While prompt engineering is a very powerful and resource-efficient way to generate desired content, the knowledge of many LLMs is capped by the cut-off date of their training data. For example, GPT-4 has a training cut-off date of December 2023. In other words, without merging GPT-4's knowledge with information available online, the model won't be able to reflect the latest developments in the world. News organizations can build their own cost-efficient RAG systems using externally hosted LLMs (such as GPT-4 or Claude) or internally hosted open-source models (e.g., Mistral-7B) to enable journalists and users to sift through and converse with a large corpus of archival documents, knowledge bases, or reporting material, similar to the Financial Times AI chatbot. RAG services can also be multi-modal: using multi-modal open-source vector databases such as Weaviate, users can query and retrieve audio, video, and text data in natural language.

Overall, RAG allows LLMs to access real-time information (by connecting to and retrieving information from the internet) or domain-specific knowledge (e.g., archival data) from a specific set of sources. This capability can enable journalists to generate answers that are grounded in a curated set of factual and up-to-date information, enhancing the accuracy of the LLM's output. (A minimal RAG sketch follows below.)
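Here is a deliberately small sketch of the RAG pattern described above: embed an archive, retrieve the most relevant document for a question, and ground the model's answer in it. The OpenAI Python client, the model names, and the two-document "archive" are assumptions chosen only so the example is self-contained; any embedding model and any hosted or self-hosted LLM would fit the same shape.

```python
# Minimal RAG sketch (assumes the `openai` Python package >= 1.0 and an
# OPENAI_API_KEY in the environment; model names are illustrative).
import numpy as np
from openai import OpenAI

client = OpenAI()

archive = [
    "City hall approved the new transit levy on March 4, 2024 ...",
    "The riverfront redevelopment plan was shelved after budget overruns ...",
]  # stand-in for a newsroom's archival corpus

def embed(texts):
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([item.embedding for item in resp.data])

doc_vectors = embed(archive)

def answer(question: str) -> str:
    q_vec = embed([question])[0]
    # Cosine similarity between the question and every archived document.
    scores = doc_vectors @ q_vec / (
        np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q_vec)
    )
    context = archive[int(scores.argmax())]  # retrieve the best match
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context: {context}\n\nQuestion: {question}"
    )
    chat = client.chat.completions.create(
        model="gpt-4", messages=[{"role": "user", "content": prompt}]
    )
    return chat.choices[0].message.content

print(answer("When was the transit levy approved?"))
```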
When to fine-tune an LLM?

News organizations interested in specializing pre-trained LLMs (e.g., GPT-3.5) for specific applications and tasks, such as reflecting a particular writing style in generated text, should consider fine-tuning. Fine-tuning eliminates the need for constant prompt engineering to get the desired output by instead curating a dataset that closely mirrors the nature of a specific task. For instance, perhaps your organization has a specific style or tone used in headlines that you would like a model to mimic. By curating a dataset of articles and headlines, you could then fine-tune a model to produce such headlines without requiring a user to know how to prompt the model in any particular way. (A brief sketch of this flow appears at the end of this post.)

While fine-tuning yields a model with specialized, tailored responses and reduces the prompting expertise and knowledge burden on end users, the process can be expensive depending on the compute and data resources a particular task requires. It will, however, still be far cheaper than training an LLM from scratch. Fine-tuned LLMs also require ongoing monitoring and qualitative evaluation for potential model drift. OpenAI offers a helpful guide that walks through the steps of customizing LLMs for your application. In addition, cloud service providers such as Google and Amazon offer the ability to fine-tune LLMs via their platforms, Vertex AI and Bedrock, respectively.

Which method to pick?

Prompt engineering offers rapid adaptability to tasks in the newsroom with low computational overhead and technical complexity. It requires human expertise to craft prompts but does not require compute resources for inference, especially with model endpoints offered by providers such as OpenAI and Anthropic.

Retrieval Augmented Generation (RAG) extends an LLM's capacity to incorporate real-time or external data for more factual responses. Although RAG does not require training or fine-tuning LLMs, storing the knowledge bases from which the LLM fetches information may incur some cost as the knowledge base grows.

Fine-tuning, on the other hand, provides high specialization for task-specific responses, requires careful selection of the fine-tuning dataset, and involves moderate computational and technical resources.

In Closing

Based on these various factors, I would generally recommend that news organizations explore use cases in which LLMs can be integrated into their workflows through prompting and, possibly, fine-tuning third-party models for their tasks. This will often be preferable to grappling with the expensive infrastructure needed to train and deploy models that may well be outdated and less efficient in the near future. Until infrastructure costs come down and training LLMs becomes more accessible, I would not recommend that news organizations build their own LLMs.
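For readers who want to see what the headline fine-tuning flow referenced above might look like in practice, here is a hedged sketch using OpenAI's fine-tuning API. The file name, example record, and model choice are assumptions for illustration; other providers (e.g., Vertex AI or Bedrock) expose similar workflows.

```python
# Hedged sketch of fine-tuning a hosted model on a headline-style dataset
# (assumes the `openai` package >= 1.0, an OPENAI_API_KEY, and a prepared
# headlines.jsonl file; all names here are illustrative).
import json
from openai import OpenAI

client = OpenAI()

# Each training record pairs an article body with the headline your desk would write.
example = {
    "messages": [
        {"role": "system", "content": "Write a headline in our house style."},
        {"role": "user", "content": "Article text goes here ..."},
        {"role": "assistant", "content": "Council Backs Transit Levy After Marathon Session"},
    ]
}
with open("headlines.jsonl", "w") as f:
    f.write(json.dumps(example) + "\n")  # in practice: hundreds of curated examples

# Upload the dataset and launch the fine-tuning job.
training_file = client.files.create(file=open("headlines.jsonl", "rb"), purpose="fine-tune")
job = client.fine_tuning.jobs.create(training_file=training_file.id, model="gpt-3.5-turbo")
print(job.id)  # monitor the job, then call the resulting fine-tuned model by name
```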
