Data Stream AI
08-24, 11:10–11:40 (Pacific/Auckland), Plenary Space

Data stream learning is an artificial intelligence method for extracting knowledge from continuous, rapid and evolving “streams” of data. Our talk presents CapyMOA, our new Python library, which aims to close the gap between pioneering research and accessible tooling by providing a Python API that tightly integrates MOA (Java data stream mining), PyTorch (hardware-accelerated deep learning), and scikit-learn (machine learning). Although these are advanced topics, the talk is intended for a broad audience: it introduces the discipline of data stream learning, looks at some practical examples, and discusses the technical novelty in constructing the CapyMOA project.


This talk showcases our new data stream learning library: CapyMOA (https://capymoa.org/). The talk is structured into three sections:

  1. An introduction to data stream learning for a general audience. We discuss the qualities of data stream learning that make it useful in practical machine learning applications, such as sensor data (IoT), marketing and e-commerce, and cyber-security.
  2. A demonstration of CapyMOA. We give a practical demonstration of CapyMOA on data stream learning problems and compare it against its competitors.
  3. A discussion of how we built CapyMOA. We share the story of creating a Python library that calls an extensive, older, and complex Java library running in a JVM. We discuss how we designed CapyMOA so that Python users do not need to understand this labyrinth of Java code and can instead work with fast, familiar, minimalist Python interfaces.
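
To give a flavour of what learning from a stream means in practice, here is a minimal, library-free sketch of the "test-then-train" loop commonly used in data stream learning: each instance is first used to test the model and only afterwards to train it. This is not CapyMOA's API; the names MajorityClassLearner, drifting_stream, and prequential_accuracy are hypothetical and exist only for illustration.

```python
import random


class MajorityClassLearner:
    """Toy incremental classifier (hypothetical, for illustration only).

    It predicts the label it has seen most often so far and is updated
    one instance at a time, without storing the stream.
    """

    def __init__(self):
        self.counts = {}

    def predict(self, x):
        if not self.counts:
            return None  # nothing has been learned yet
        return max(self.counts, key=self.counts.get)

    def train(self, x, y):
        self.counts[y] = self.counts.get(y, 0) + 1


def drifting_stream(n, seed=0):
    """Yield (feature, label) pairs whose label distribution changes halfway.

    The change is a simple stand-in for an "evolving" stream: label "a"
    dominates the first half and label "b" dominates the second half.
    """
    rng = random.Random(seed)
    for i in range(n):
        common, rare = ("a", "b") if i < n // 2 else ("b", "a")
        label = common if rng.random() < 0.9 else rare
        yield rng.random(), label


def prequential_accuracy(stream, learner):
    """Test-then-train evaluation: predict on each instance, then learn from it."""
    correct = 0
    total = 0
    for x, y in stream:
        if learner.predict(x) == y:  # test first ...
            correct += 1
        learner.train(x, y)          # ... then train on the same instance
        total += 1
    return correct / total if total else 0.0


if __name__ == "__main__":
    accuracy = prequential_accuracy(drifting_stream(1000), MajorityClassLearner())
    print(f"prequential accuracy: {accuracy:.3f}")
```

Because the toy learner never forgets old counts, it adapts slowly once the label distribution changes; coping with that kind of change is one of the problems stream learning methods are designed to address.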

What is the anticipated audience for your presentation?

Intermediate

Anton Lee is a PhD student and research assistant at the University of Wellington, studying continual learning in artificial intelligence. As a research assistant, he is a maintainer of the CapyMOA open-source data-stream learning project.