What’s New in AI: Multimodal AI with Purvanshi Mehta
Published by O'Reilly Media, Inc.
Leveraging images, text, audio, and other data for general-purpose AI applications
Join host George Anadiotis and guest Purvanshi Mehta, cofounder of Lica World, for a discussion about multimodal AI and its applications. Trained on various types of data from text to images to audio and video, multimodal AI models are expanding the possibilities for the kinds of AI applications we can build.
New large AI models such as GPT-4, Gemini, and Claude 3 are all general-purpose multimodal foundation models. More specialized multimodal AI models, such as OpenAI’s yet-to-be-released Sora, which generates video from text, or Suno AI, which generates songs from text, are fueling the imagination with ways we might leverage AI to automate and augment tasks in robotics, entertainment, healthcare, manufacturing, and other industries.
George and Purvanshi discuss where this technology stands and share their thoughts on where the field is headed.
What’s New in AI gives you a chance to hear from leading minds in the field on topics such as generative AI, large language models, responsible AI, current regulations, and other developments as they appear. Host George Anadiotis, founder of Linked Data Orchestration, and members of O’Reilly’s community of experts help you make sense of the bigger picture in AI and leverage emerging AI technologies to solve your organization’s most challenging problems.
What you’ll learn and how you can apply it
- Learn about state-of-the-art multimodal AI and technologies that you can leverage today
- Understand the specific techniques and skills needed to build multimodal AI systems
- Explore what’s in store for multimodal AI and how to keep up with the latest developments
This live event is for you because...
- You want to stay up-to-date on the latest developments and breakthroughs in the field of AI.
- You’re an AI practitioner who wants to expand your skills beyond one particular field of application.
Prerequisites
- Come with your questions for Purvanshi
- Have a pen and paper handy to capture notes, insights, and inspiration
Recommended follow-up:
- Read “Multimodal Foundation Models” (chapter 10 in Generative AI on AWS)
- Listen to “Now you see me—multimodality” (episode 9 of AI Unveiled)
- Watch Multilingual and Multimodal Prompt Engineering (on-demand course)
- Watch Enhancing Lakehouse Infrastructure for Multimodal AI (video)
Schedule
The time frames are only estimates and may vary according to how the class is progressing.
Thursday, June 6, 2024, at 8:00am PT / 11:00am ET
- Interactive discussion and Q&A (60 minutes)
Your Hosts and Guests
George Anadiotis
George Anadiotis has got tech, data, and media, and he’s not afraid to use them. An analyst, consultant, engineer, researcher, and writer, George also founded Linked Data Orchestration, Connected Data World, the Year of the Graph, and Cricket Hill. He’s a VentureBeat contributor and a coauthor of the first book on personal knowledge graphs.
Purvanshi Mehta
Purvanshi Mehta cofounded Lica World, a platform that specializes in multimodal content transformations. Lica converts long-form videos and documents into summary reels, blog posts, and podcasts with a single click. Previously, she developed and expanded multimodal and language models at Microsoft, integrating them into Office 365 products. She contributed as a manager for Microsoft’s AI for Good initiative, collaborating with Innovations for Poverty Action to enhance the Poverty Probability Index using multimodal data. She also worked on the Alexa AI natural language understanding team at Amazon Lab126 and as a research scientist at Luleå Technical University in Sweden and TU Kaiserslautern in Germany. Purvanshi has presented papers on probabilistic deep learning, graph learning, arithmetic word-problem solving, and language processing at prestigious conferences such as NeurIPS, IJCNLP, and WSDM.