Introduction to LAVA

About LAVA V2

LAVA V2 is the second major version of LAVA. The major user-visible features are:

  • The Pipeline model for the dispatcher
  • YAML job submissions
  • Results
  • Queries
  • Charts
  • Data export APIs

The architecture has been significantly improved since V1, bringing major changes in terms of how a distributed LAVA instance is installed, configured and used for running test jobs.

LAVA Overview

What is LAVA?

  • LAVA is the Linaro Automation and Validation Architecture.
  • LAVA is a continuous integration system for deploying operating systems onto physical and virtual hardware for running tests. Tests can be simple boot testing, bootloader testing and system-level testing, although extra hardware may be required for some system tests. Results are tracked over time and data can be exported for further analysis.
  • LAVA is a collection of participating components in an evolving architecture. LAVA aims to make systematic, automatic and manual quality control more approachable for projects of all sizes.
  • LAVA is designed for validation during development - testing whether the code that engineers are producing “works”, in whatever sense that means. Depending on context, this could be many things, for example:
    • testing whether changes in the Linux kernel compile and boot
    • testing whether the code produced by gcc is smaller or faster
    • testing whether a kernel scheduler change reduces power consumption for a certain workload
    • etc.
  • LAVA is good for automated validation. LAVA tests the Linux kernel on a range of supported boards every day. LAVA tests proposed Android changes in Gerrit before they are landed, and does the same for other projects like gcc. Linaro runs a central validation lab in Cambridge, containing racks full of computers supplied by Linaro members and the necessary infrastructure to control them (servers, serial console servers, network switches etc.).
  • LAVA is good for providing developers with the ability to run customized tests on a variety of different types of hardware, some of which may be difficult to obtain or integrate. Although LAVA has support for emulation (based on QEMU), LAVA is best at providing test support for real hardware devices.
  • LAVA is principally aimed at testing changes made by developers across multiple hardware platforms to aid portability and encourage multi-platform development. Systems which are already platform independent or which have been optimized for production cannot necessarily be tested in LAVA, or testing them may provide no overall gain.

Note

This overview document explains LAVA using http://validation.linaro.org/ which is the official production instance of LAVA hosted by Linaro. Where examples reference validation.linaro.org, replace with the fully qualified domain name of your LAVA instance.
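
As an illustration of that substitution, the sketch below keeps the instance hostname in a single variable and derives URLs from it. The helper name `instance_url` and the `/RPC2` path used in the second call are assumptions made for this example; check the API documentation of your own instance for the actual endpoints.

```python
from urllib.parse import urlunsplit

# Hostname used throughout this overview; replace it with the fully
# qualified domain name of your own LAVA instance.
LAVA_HOST = "validation.linaro.org"


def instance_url(host, path="/", scheme="https"):
    """Build a URL on a LAVA instance (hypothetical helper)."""
    return urlunsplit((scheme, host, path, "", ""))


print(instance_url(LAVA_HOST))
# The "/RPC2" endpoint path is an assumption for illustration.
print(instance_url("lava.example.com", "/RPC2"))
```

Keeping the hostname in one place means that examples copied from this overview only need a single edit to point at a different instance.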

What is LAVA not?

  • LAVA is not a set of tests - it is infrastructure to enable users to run their own tests. LAVA concentrates on providing a range of deployment methods and a range of boot methods. Once the login is complete, the test consists of whatever scripts the test writer chooses to execute in that environment.
  • LAVA is not a test lab - it is the software that can be used in a test lab to control test devices.
  • LAVA is not a complete CI system - it is software that can form part of a CI loop. LAVA supports data extraction to make it easier to produce a frontend which is directly relevant to particular groups of developers.
  • LAVA is not a build farm - other tools need to be used to prepare binaries which can be passed to the device using LAVA.
  • LAVA is not a production test environment for hardware - LAVA is focused on developers and may require changes to the device or the software to enable automation. These changes are often unsuitable for production units. LAVA also expects that most devices will remain available for repeated testing rather than testing the software with a changing set of hardware.

See also

Continuous Integration, which covers how LAVA relates to continuous integration (CI) and the consequences of what LAVA can and cannot do, with particular emphasis on how automation itself can block some forms of testing.

Features

  • Automated validation - designed for automated processes to create, submit and process results of test jobs to validate the development process.
  • Parallel scheduling - multiple test jobs run at the same time across multiple devices.
  • MultiNode test jobs - test jobs can be run as a single group of tests involving multiple devices.
  • Hardware sharing - uncommon hardware is shared between disparate groups to maximize usage.
  • Wide device coverage - a large number of types of device can be supported with instances ranging from one to more than a hundred devices available for test jobs.
  • Data export for customisation - transform the data using custom interfaces to make the validation output directly relevant to specific teams.
  • Privacy support - test jobs or types of device can be kept private to selected groups, individuals or teams.
  • Live result reporting - if a test job does fail, all results up to the point of failure are retained.
  • UNIX and Android test support - test jobs can be run on systems running various UNIX flavors or using the Android Debug Bridge to interface with mobile devices.
  • Complex network testing - reconfigurable networking across multiple devices using multiple network interfaces.

Architecture

images/arch-overview.svg

A LAVA instance consists of two primary components - a master (also referred to as the server) and a worker. The simplest possible configuration is to run the master and worker components on a single machine, but a larger instance can also be configured to support multiple workers controlling a larger number of attached devices.

Elements of the Master

  • Web interface - This is built using the Apache web server, the uWSGI application server and the Django web framework. It also provides XML-RPC access and the REST API.
  • Database - This uses PostgreSQL locally on the master, with no external access.
  • Scheduler - This is the piece that causes jobs to be run - periodically this will scan the database to check for queued test jobs and available test devices, starting jobs when the needed resources become available.
  • lava-server-gunicorn daemon - This communicates with the worker(s) using HTTP.
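
The scheduler's periodic scan can be pictured with the following minimal sketch. It is not LAVA's implementation: the `Job` and `Device` classes, their fields and the `schedule_pass` function are hypothetical, and the real scheduler reads this state from the database rather than from in-memory lists.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Device:
    hostname: str
    device_type: str
    busy: bool = False


@dataclass
class Job:
    job_id: int
    device_type: str               # type of device the job asks for
    device: Optional[str] = None   # set once the job has been started


def schedule_pass(queue: List[Job], devices: List[Device]) -> List[Tuple[int, str]]:
    """One scan: start each queued job for which an idle device exists."""
    started = []
    for job in queue:
        if job.device is not None:
            continue  # already started in an earlier pass
        for dev in devices:
            if not dev.busy and dev.device_type == job.device_type:
                dev.busy = True
                job.device = dev.hostname
                started.append((job.job_id, dev.hostname))
                break
    return started


devices = [Device("qemu01", "qemu"), Device("bbb01", "beaglebone-black")]
queue = [Job(1, "qemu"), Job(2, "qemu"), Job(3, "beaglebone-black")]
# Job 2 stays queued because the only qemu device is taken by job 1.
print(schedule_pass(queue, devices))
```

Running `schedule_pass` again after a device becomes idle would pick up the jobs that were left waiting, which is the effect of the periodic scan described above.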

Elements of the Worker

  • lava-worker daemon - This receives control messages from the server.
  • Dispatcher - This manages all the operations on the device under test, according to the job submission and device parameters sent by the master.
  • Device Under Test (DUT)

Note

Although the Dispatcher interacts directly with the DUT, all the device configuration is sent from the server.

Preparation

LAVA has a steep learning curve and this does not tend to level off as your lab grows. Even small labs involve additional hardware, infrastructure and administrative tasks.

  1. Do not rush into LAVA setup.
  2. Start small.
  3. Think carefully about what you are trying to test. Avoid common pitfalls of simplistic testing.
  4. Learn how to debug LAVA with a small lab and use standard test jobs.
  5. Invest in additional hardware - a device on your desk is not a good candidate for automation.
  6. Test with emulated devices before thinking about the device on your priority list.
    • Integrating a completely new device type is probably the most complex thing to do in LAVA. It can take a few months of work for devices which do not use currently supported methods or bootloaders.
  7. Start by adding known devices, including purchasing some of the low-cost devices already supported by LAVA.
  8. Talk to us before looking at device types not currently supported on LAVA instances.

Methods

Deployment methods

All test jobs involve a deployment step of some kind, even if that is just to prepare the overlay used to copy the test scripts onto the device or to set up the process of parsing the results when the test job starts.

Boot methods

Hardware devices need to be instructed how to boot; emulated devices need to boot the emulator. For other devices, a boot can simply mean establishing a connection to the device.

Test methods

The principal test method in LAVA is the Lava Test Shell, which requires a POSIX-type environment to be running on the booted device. Other test methods available include executing tests using ADB.
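
The three kinds of method above correspond to the actions of a test job. The following sketch models a job as a Python dictionary, loosely mirroring the shape of a YAML job submission. The key names, values, URL and the `validate` rules are illustrative assumptions, not a validated job definition or LAVA's own checks.

```python
# Illustrative job skeleton; key names and values are assumptions.
job = {
    "job_name": "example-boot-test",
    "device_type": "qemu",
    "actions": [
        {"deploy": {"to": "tmpfs", "image": "https://example.com/rootfs.img"}},
        {"boot": {"method": "qemu"}},
        {"test": {"definitions": ["smoke-tests.yaml"]}},
    ],
}


def action_names(job):
    """Return the action kinds in the order they appear in the job."""
    return [name for action in job["actions"] for name in action]


def validate(job):
    """Return a list of problems found by two simple, assumed rules."""
    names = action_names(job)
    problems = []
    # Every test job involves a deployment step of some kind.
    if "deploy" not in names:
        problems.append("no deploy action")
    # Assumed rule: a test needs a booted device, so a boot must come first.
    if "test" in names and "boot" not in names[: names.index("test")]:
        problems.append("test action before any boot action")
    return problems


print(action_names(job))
print(validate(job))
```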

Multiple device testing

Some test jobs need to boot up multiple devices in a single, coordinated group. For example, a server could be tested against multiple clients. LAVA supports starting these sub-jobs as a group as well as passing messages between nodes via the dispatcher connection, without needing the devices to have a working network stack.

Scheduling

LAVA has advanced support for scheduling multiple jobs across multiple devices, whether those jobs use one device or several. Scheduling is ordered using these criteria, in this order:

  1. health checks
  2. priority
  3. submit time
  4. multinode group - see also MultiNode LAVA
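
One way to picture this ordering is as a composite sort key. The sketch below is an illustration rather than LAVA's scheduler code: the `QueuedJob` fields are hypothetical, and it assumes that health checks sort first, that a larger priority number is more urgent, that earlier submissions win ties, and that multinode group membership acts only as the final tie-breaker.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QueuedJob:
    name: str
    is_health_check: bool
    priority: int                          # assumed: larger number = more urgent
    submit_time: float                     # seconds since the epoch
    multinode_group: Optional[str] = None  # hypothetical group identifier


def sort_key(job):
    """Tuple compared element by element, matching the criteria order."""
    return (
        not job.is_health_check,    # False sorts before True: health checks first
        -job.priority,              # higher priority first
        job.submit_time,            # earlier submission first
        job.multinode_group or "",  # assumed final tie-breaker
    )


jobs = [
    QueuedJob("low-early", False, 10, 100.0),
    QueuedJob("high-late", False, 50, 300.0),
    QueuedJob("health", True, 0, 400.0),
    QueuedJob("high-early", False, 50, 200.0),
]

for job in sorted(jobs, key=sort_key):
    print(job.name)
```

Because tuples compare element by element, a later criterion is consulted only when all earlier ones are equal, which is what "in this order" implies.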

In addition, the admin can restrict scheduling to specified devices.

Advanced use cases

Advanced use cases expand on this support to include:

  • creating and deleting customized virtual networks, where suitable hardware and software support exists.
  • extracting data from LAVA to manage job submission and result handling to support developer-specific tasks like KernelCI.

Full documentation

LAVA comes with comprehensive documentation about use and installation, including advice on how to manage a test lab.