As Java enters the AI era, developers seek ways to run high-performance machine learning workloads without leaving the JVM. This session shares lessons learnt from building GPU-accelerated AI libraries natively in Java using TornadoVM, a framework that transparently offloads Java programs to GPUs and other accelerators. Using GPULlama3.java, a GPU-accelerated library for large language models (LLMs), as a case study, the session explores techniques for supporting quantized data types (FP16, Q8, Q4), integrating with Quarkus and LangChain4J, and optimizing AI inference directly in Java. Attendees will gain practical insights into how TornadoVM complements Project Babylon and Project Panama in advancing Java as a first-class platform for AI development.
Talk Level:
INTERMEDIATE
Bio:
Dr. Thanos Stratikopoulos is a Research Fellow at the University of Manchester, specializing in heterogeneous architectures and reconfigurable accelerators. He has authored more than 20 research articles in the fields of hardware acceleration, system software, and programming languages. Currently, his work involves heterogeneous architectures ranging from low-power devices to high-end cloud deployments. He is one of the lead developers of TornadoVM and has been part of the team for the last eight years. In addition to his core contributions to the system's technical development, Dr. Stratikopoulos leads the project's communication and dissemination efforts, helping to articulate its goals and advancements to both academic and industrial audiences through talks, documentation, and outreach activities.