Infrastructure & Ops Superstream: Generative AI Use Cases, Risks, and Tooling
Published by O'Reilly Media, Inc.
Explore the challenges, opportunities, and early patterns of AI implementation
ChatGPT launched in November 2022 and landed like a meteor, upending many assumptions about software development and forcing organizations to rethink how they work. Generative AI tools such as ChatGPT, GitHub Copilot, and Replit Ghostwriter have increased the speed and efficiency of software development but pose myriad challenges and questions: Who trains these tools? What are their implications for operations?
Join our lineup of experts for a discussion of the brave new world of generative AI. Examine some of its use cases, risks, and tooling, as well as the many security and ethical issues it raises. And although a set of best practices has yet to crystallize, you’ll explore emerging patterns that can help guide your organization as you move toward implementation.
About the Infrastructure & Ops Superstream Series: This three-part Superstream series guides you through what you need to know about modernizing your organization’s infrastructure and operations, with each event day covering different topics and lasting no more than four hours. They’re packed with the expert insights, skills, and tools that will help you effectively manage existing legacy systems while migrating to modern, scalable, cost-effective infrastructures—with no interruption to your business.
What you’ll learn and how you can apply it
- Understand and assess the trade-offs between higher developer efficiency and the introduction of a new and complex technology into the workflow
- Learn what to expect if you plan to run AI workloads on Kubernetes
- Explore how AI will impact observability in distributed systems
- Understand the main security risks that AI adoption brings
This live event is for you because...
- Your organization has embraced AI, and your development and DevOps teams want a sense of the road ahead.
- You need to understand the pragmatic implications of AI and its impact on operations.
- You want to better understand the GenAI tooling landscape before making critical, impactful decisions.
Prerequisites
- Have a pen and paper handy to capture notes, insights, and inspiration
Recommended follow-up:
- Follow IO Superstream: Generative AI Use Cases, Risks, and Tooling (expert playlist)
- Read Introduction to Generative AI (book)
- Take Generative AI for Everyone (live online course with Altaf Rehmani)
- Listen to Generative AI in the Real World: Pete Warden on Running AI on Small Systems (audiobook)
Schedule
The time frames are only estimates and may vary according to how the class is progressing.
Sam Newman: Introduction (5 minutes) - 8:00am PT | 11:00am ET | 3:00pm UTC/GMT
- Sam Newman welcomes you to the Infrastructure & Ops Superstream.
Daniele Zonca: Serving LLMs on Kubernetes (35 minutes) - 8:05am PT | 11:05am ET | 3:05pm UTC/GMT
- What are the key hurdles in running large language models efficiently on Kubernetes? Daniele Zonca shares the challenges and offers effective strategies for LLM integration. You’ll get an overview of the current landscape for LLM deployment options and explore the suitability of Kubernetes for these models.
- Daniele Zonca is a senior principal software engineer and the architect for model serving in OpenShift AI, Red Hat's flagship AI product, which combines multiple stacks. He’s responsible for the architecture of the serving components: model server management, model registry, monitoring, and trustworthy AI. He’s one of the founders of the TrustyAI project, which focuses on responsible and trustworthy AI, and he’s also involved with the KServe, ModelMesh, and Kubeflow model registry projects.
Emily Arnott: Generative AI in Incident Management—Risks and Opportunities (35 minutes) - 8:40am PT | 11:40am ET | 3:40pm UTC/GMT
- Emily Arnott explores what large language models can do when things go terribly wrong. LLMs can act as a powerful "sidekick" in incident management, from parsing your codebase to find bugs, to writing ad hoc testing scripts, to brainstorming solutions with engineers. Not only do these features reduce the time sink involved in incident response, but they also open up that time for feature development. Still, LLMs aren’t perfect, and their common failure modes can have notable consequences. Emily examines some of these failures, including hallucination, misprioritization, and black boxing, and how they’d look in the context of incident response. She also shows you how the resilience, adaptability, and knowledge of your incident response teams can compensate for the risks of LLMs.
- Emily Arnott is a big fan of the internet and has been passionate about keeping it humming smoothly ever since she was introduced to the topic of site reliability engineering. She's written articles and ebooks and delivered talks to bring SRE to wider audiences and is excited to be on the frontlines of the next generation of reliability tech!
- Break (10 minutes)
Ezequiel Lanza: Cloud-Native LLM Deployments Made Easy Using LangChain (35 minutes) - 9:25am PT | 12:25pm ET | 4:25pm UTC/GMT
- Deploying large language model architectures with billions of parameters can pose significant challenges. Creating GenAI interfaces is hard enough, but add the difficulty of managing a complex architecture while juggling computational requirements and ensuring efficient resource utilization, and you have a potential recipe for disaster when you move models from training to real-world scenarios. LangChain, an open source framework for developing applications powered by LLMs, simplifies the creation of these interfaces by streamlining the use of several NLP components into easily deployable chains, and Kubernetes can help manage the underlying infrastructure. Ezequiel Lanza shows you how quickly and easily this is achieved by deploying an end-to-end containerized LLM application built with LangChain in a cloud native environment.
- Ezequiel Lanza is an AI open source evangelist at Intel. He holds an MS in data science, and he’s passionate about helping people discover the exciting world of artificial intelligence. Ezequiel is a frequent AI conference presenter and the creator of use cases, tutorials, and guides that help developers adopt open source AI tools.
Phillip Carter: How to Wrangle Generative AI to Actually Work in Production, Using Observability (40 minutes) - 10:00am PT | 1:00pm ET | 5:00pm UTC/GMT
- Generative AI, such as large language models, poses unique and interesting problems once it's live in production. Like many tech companies, Honeycomb released features using generative AI last year and discovered the inherent difficulty of producing reliable behavior from the technology. Phillip Carter reports on lessons learned the hard way: that user inputs are unpredictable, subtle changes to prompts can create dramatic changes in responses, adjustments to a retrieval-augmented generation pipeline can significantly influence the quality of responses, and more. He shares how Honeycomb wrangled these problems and explains why and how observability (the practice of using telemetry to understand system behavior) is essential to making generative AI work for you in the long run.
- Phillip Carter is a principal product manager at Honeycomb and leads its AI initiatives. He’s also a maintainer of the OpenTelemetry project, the de facto standard for observability instrumentation. In a past life, he worked on the C# and F# languages at Microsoft and helped bring the .NET stack into the modern cross-platform era.
- Break (5 minutes)
Grady Booch: Meet the Expert (45 minutes) - 10:45am PT | 1:45pm ET | 5:45pm UTC/GMT
- Join us for a special conversation between Sam Newman and Grady Booch about some of the challenges and opportunities facing developers in the age of generative AI and some of the possibilities and pitfalls presented by this new technology.
- Grady Booch is chief scientist for software engineering at IBM Research, where he leads the company’s research and development for embodied cognition. Having originated the term and the practice of object-oriented design, he’s best known for his work in advancing the fields of software engineering and software architecture. A coauthor of the Unified Modeling Language and a founding member of both the Agile Alliance and the Hillside Group, Grady has published six books and writes a column for IEEE Software and IEEE Spectrum journals. He’s been awarded the Lovelace Medal, has given the Turing Lecture for the British Computer Society, and was named an IEEE Computer Pioneer for his work in software architecture.
Sam Newman: Closing Remarks (5 minutes) - 11:30am PT | 2:30pm ET | 6:30pm UTC/GMT
- Sam Newman closes out today’s event.
Upcoming Infrastructure & Ops Superstream events:
- Platform Engineering Best Practices - November 13, 2024
Your Hosts and Selected Speakers
Sam Newman
Sam Newman is a technologist focusing on the areas of cloud, microservices, and continuous delivery—three topics which seem to overlap frequently. He provides consulting, training, and advisory services to startups and large multinational enterprises alike, drawing on his more than 20 years in IT as a developer, sysadmin, and architect. Sam is the author of the best-selling Building Microservices and Monolith To Microservices, both from O’Reilly, and is also an experienced conference speaker.
Emily Arnott
Emily Arnott is a big fan of the internet and has been passionate about keeping it humming smoothly ever since she was introduced to the topic of site reliability engineering. She's written articles and ebooks and delivered talks to bring SRE to wider audiences and is excited to be on the frontlines of the next generation of reliability tech!
Phillip Carter
Phillip Carter is a principal product manager at Honeycomb and leads its AI initiatives. He’s also a maintainer of the OpenTelemetry project, the de facto standard for observability instrumentation. In a past life, he worked on the C# and F# languages at Microsoft and helped bring the .NET stack into the modern cross-platform era.
Grady Booch
Grady Booch is chief scientist for software engineering at IBM Research, where he leads the company’s research and development for embodied cognition. Having originated the term and the practice of object-oriented design, he’s best known for his work in advancing the fields of software engineering and software architecture. A coauthor of the Unified Modeling Language (UML), a founding member of the Agile Alliance, and a founding member of the Hillside Group, Grady has published six books and several hundred technical articles, including an ongoing column for IEEE Software and IEEE Spectrum. Grady is a trustee for the Computer History Museum as well as an IBM Fellow and an ACM and IEEE Fellow. He’s been awarded the Lovelace Medal, has given the Turing Lecture for the BCS, and was named an IEEE Computer Pioneer for his work in software architecture. Grady has served as an architect or architectural mentor for a variety of complex software-intensive systems across many domains, including shipping, finance, transportation, defense, commerce, productivity, governmental, medical, software development, artificial intelligence, and many others. He’s currently developing a major transmedia documentary for public broadcast on the intersection of computing and the human experience.
Ezequiel Lanza
Ezequiel Lanza is an AI open source evangelist at Intel. He holds an MS in data science, and he’s passionate about helping people discover the exciting world of artificial intelligence. Ezequiel is a frequent AI conference presenter and the creator of use cases, tutorials, and guides that help developers adopt open source AI tools.