Speed up the computations on a LAScatalog

What takes time when processing a LAScatalog is not necessarily the computation itself but the time required to read the files. In fact the read time (i.e. the time needed to load the data into R) might be much longer than the actual computation time. This vignette explains why, and how to speed up the computation by a factor of 2 to 8 by carefully preparing the catalog.

Generic considerations on LAScatalog processing

When processing a LAScatalog the area covered is divided into chunks that are then processed sequentially. In the following examples we present the case where chunks are equal to tiles, i.e. when processing each file sequentially. This is the common way to process data but not the only one. In any case, the explanation remains valid even when chunks are not equal to tiles.

So each file is processed sequentially. For a given processed file, the content is read and loaded into R. In the following figure chunk number 42 is currently read and processed (in blue):

But the current processed file is not the only one that needs to be read. To properly process the catalog without edge artifacts we need to load an extra buffer around the processed file (in red).

To load a buffer the processing engine must not only read the processed file but also the 8 neighbouring tiles (in red) to selectively read a small buffer around the processed file. Thus, for each processed file it is not one file that is read but nine. This is one of the reasons why the read time is far from negligible compared to the actual computation time.

To process a LAScatalog faster we need to read the files faster.

Read a las file vs read a laz file

A laz file is a strongly compressed las file. Reading a laz file is thus slower than reading a las file because it must be decompressed on the fly at reading time. The following benchmark compares the read time for a single file in each format.

system.time(readLAS(file.las))
system.time(readLAS(file.laz))

So let’s assume that the total computation time is 1 unit of time, split into 0.25 units of actual processing time and 0.75 units of read time (which is a fairly reasonable ratio). If switching from laz to las divides the read time by 3, we get 0.25 units of read time and 0.25 units of computing time, which gives a total of 0.5 instead of 1. We can therefore speed up the computation by a factor of 2 by using the las format instead of laz. Obviously the gain is less significant for more computationally demanding processes.
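The arithmetic above can be written as a small R sketch. The helper `total_time()` is hypothetical (it is not part of lidR) and simply assumes that the total runtime is the sum of the read time and the compute time. The second call uses an arbitrary, heavier compute time to illustrate why the gain shrinks for demanding processes.

```r
# Hypothetical helper: total runtime = read time + compute time,
# with the read time optionally divided by a speed-up factor.
total_time <- function(read, compute, read_speedup = 1) {
  read / read_speedup + compute
}

# Ratio used in the text: 0.75 units of reading, 0.25 units of computing
before  <- total_time(read = 0.75, compute = 0.25)
after   <- total_time(read = 0.75, compute = 0.25, read_speedup = 3)
speedup <- before / after

# Same read speed-up with a heavier (made-up) compute time of 2 units
heavy_speedup <- total_time(0.75, 2) / total_time(0.75, 2, read_speedup = 3)

print(c(light = speedup, heavy = heavy_speedup))
```

With the ratio from the text the overall speed-up is 2, whereas the heavier process gains only about 1.2.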

So for faster computation users can opt for las files instead of laz files. Obviously, there are good reasons to use laz files instead of las files. The strong compression brought by the laz format has a lot of advantages for storing data. It is up to the user to choose a format by considering the trade-offs between space and computation time. This section explains how it works only to help users make a decision that best suits their needs.

Indexation of the points with lax files

Another way to speed-up the total computation time is to avoid reading all 8 neighbouring tiles to load a buffer. Instead, we can read only parts of the neighbouring tiles. The gain comes from the fact that we read only a small portion of the neighbouring files to extract the buffer, skipping most of the file contents. Indeed, the buffer usually corresponds to only a very small percentage of the actual contents of a file (equivalent to a few square meters).
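To get a feel for how small the buffer is relative to the neighbouring tiles, here is a minimal sketch in base R. It assumes square tiles and a buffer that is narrower than a tile. The function `buffer_fraction()` and the values used (1 km tiles, 20 m buffer) are illustrative assumptions, not taken from lidR.

```r
# Fraction of the 8 neighbouring tiles' area that falls inside the buffer.
# Assumes square tiles of side `tile_size` and a buffer width `buffer`,
# both in the same unit (hypothetical values below).
buffer_fraction <- function(tile_size, buffer) {
  ring_area       <- (tile_size + 2 * buffer)^2 - tile_size^2  # buffer ring around the tile
  neighbours_area <- 8 * tile_size^2                           # 8 complete neighbouring tiles
  ring_area / neighbours_area
}

frac <- buffer_fraction(tile_size = 1000, buffer = 20)
print(frac)
```

Under these assumptions only about 1% of the neighbouring tiles' area is needed, which is why skipping the rest of their contents pays off.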

This is made possible by indexing the las or laz files with lax files. A lax file is a tiny file associated with a las or laz file that spatially indexes the points to make faster spatial queries. This file type was created by Martin Isenburg in LAStools. For a better understanding of how it works one can refer to a talk given by Martin Isenburg about lasindex.

By adding lax files alongside your las/laz files, the buffer can be added around the processed file by only partially reading the 8 neighbouring files (in red).

The best way to create a lax file is to use lasindex from LAStools. It is a free and open-source part of LAStools. If you cannot or do not want to use LAStools, the lidR package has an (undocumented) internal function that creates lax files using the rlas package:

lidR:::catalog_laxindex(ctg)

The gain is really significant and allows an additional 2- to 3-fold saving in terms of read time, which significantly speeds up the computation time. Changing from laz to las format has a cost because it implies storing more data. However, using lax files provides a significant gain for free, so there is no incentive not to create lax files.
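Before processing, it can be useful to check which files already have an index. The sketch below uses base R only. `has_lax()` is a hypothetical helper, and it assumes the usual convention that the lax file sits next to the las/laz file and shares its base name (e.g. `tile.las` and `tile.lax`). The demonstration runs on empty placeholder files, not real point clouds.

```r
# Hypothetical helper: for each las/laz file in a folder, report whether a
# lax file with the same base name exists next to it.
has_lax <- function(folder) {
  files <- list.files(folder, pattern = "\\.la[sz]$",
                      full.names = TRUE, ignore.case = TRUE)
  lax   <- sub("\\.la[sz]$", ".lax", files, ignore.case = TRUE)
  data.frame(file = basename(files), indexed = file.exists(lax),
             stringsAsFactors = FALSE)
}

# Demonstration on empty placeholder files created in a temporary folder
demo_dir <- file.path(tempdir(), "lax_demo")
dir.create(demo_dir, showWarnings = FALSE)
invisible(file.create(file.path(demo_dir, c("a.las", "a.lax", "b.laz"))))

status <- has_lax(demo_dir)
print(status)
```

Any file reported with `indexed = FALSE` would have its neighbours read in full when a buffer is loaded.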

Changing the hardware

We demonstrated the importance of decreasing the time taken to read files to improve the overall computation time. The faster you read the files the faster you perform the computation, because the read time is non-negligible. Opting for a fast SSD instead of a slow HDD may significantly speed up the computation independently of the power of your processor. Hardware matters!

Use more cores

This section comes at the end for a reason. lidR allows you to gain speed using multicore processing. However, users will not necessarily have a significant gain with multicore processing. Multicore is not a magic trick. Processing a LAScatalog implies reading a file on disk, a task that is not really parallelizable. As a consequence, multicore is not necessarily faster than single core. The multicore engine implemented in lidR reads several files at a time to process them in parallel. It uses more memory (four cores means four chunks read at one time) and the overhead of reading four files at a time may be more penalizing than the gain.

In conclusion, multicore processing may or may not speed up the computation, and it has a significant memory cost.
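A rough way to see why more cores may not help is Amdahl's law, which bounds the speed-up when part of the work cannot be parallelized. The sketch below is a simplification and not a model of the lidR engine: it assumes that reading is entirely serial and reuses the illustrative 75% read share from earlier in this vignette. It also ignores the memory cost and the overhead of parallelization, so real results may be worse than this bound.

```r
# Amdahl's law: upper bound on the speed-up with n cores when a fraction
# `serial` of the runtime cannot be parallelized.
amdahl <- function(serial, n) {
  1 / (serial + (1 - serial) / n)
}

cores <- c(1, 2, 4, 8, 32)
bound <- amdahl(serial = 0.75, n = cores)  # 0.75 is an assumed read share

print(data.frame(cores = cores, max_speedup = round(bound, 2)))
```

Under this assumption the speed-up can never exceed 1 / 0.75, i.e. about 1.33, however many cores are used.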

Benchmarks

The following are benchmarks for some functions.

Simple canopy height model

A simple point-to-raster canopy height model using 25 files of 150 x 150 m with 30 pts/m² on a laptop with an SSD and an Intel Core i7 processor.

chm <- grid_canopy(ctg, 1, p2r())

Format      Cores   Runtime
laz         1       40 sec
laz + lax   1       20 sec
las         1       10 sec
las + lax   1       7 sec
laz + lax   2       20 sec
las + lax   2       5 sec

Here we found an almost 6-fold increase in speed simply by changing the file types (40 sec down to 7 sec). With 2 cores we gained an additional 30%, reaching an actual 8-fold speed-up compared to the basic use of laz files. One can try laz files only + 32 cores on a supercomputer with a lot of RAM, but the gain will probably not reach an 8-fold speed-up!
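The speed-up factors can be recomputed directly from the runtimes reported in the table above. The short base R sketch below copies those runtimes by hand and divides the laz single-core baseline by each of them.

```r
# Runtimes in seconds, copied from the canopy height model benchmark table
bench <- data.frame(
  format  = c("laz", "laz + lax", "las", "las + lax", "laz + lax", "las + lax"),
  cores   = c(1, 1, 1, 1, 2, 2),
  runtime = c(40, 20, 10, 7, 20, 5),
  stringsAsFactors = FALSE
)

# Speed-up relative to the first row (laz, 1 core)
bench$speedup <- round(bench$runtime[1] / bench$runtime, 1)
print(bench)
```

The las + lax rows give a speed-up of 5.7 with 1 core and 8 with 2 cores.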

Area-based Approach

Computation of a single metric on 360 files of 1 x 1 km with 3 pts/m² (~300 km² and 900 million points) on a laptop with an SSD and an Intel Core i7 processor.

hmean <- grid_metrics(ctg, mean(Z), 20)

Format      Cores   Runtime
laz         1       45 min
las         1       15 min
las + lax   1       8 min
las + lax   4       7 min

Here we found an almost 6-fold speed-up by changing only the file types. With 4 cores the runtime dropped from 8 to 7 min, a further gain of about 12%, reaching almost a 7-fold speed-up compared to the basic use of laz files. Using 2 cores instead of 4 is likely to be faster. Again, one can try laz files only + 32 cores on a supercomputer with a lot of RAM, but the gain will probably not reach an 8-fold speed-up!

Clip ground inventories

Extraction of 140 plots of 12 m radius from 1 x 1 km files with 3 pts/m² on a laptop with an SSD and an Intel Core i7 processor.

# Streaming way
opt_output_files(ctg) <- "/path/to/output/file_{ID}"
new_ctg <- lasclip(ctg, shapefile, radius = 12)

# Load in memory
opt_output_files(ctg) <- ""
new_ctg <- lasclip(ctg, shapefile, radius = 12)

Format      Method      Cores   Runtime
las         Streaming   1       45 sec
las         Streaming   4       35 sec
las + lax   Streaming   1       4 sec
las + lax   Streaming   4       6 sec
las         Memory      1       45 sec
las         Memory      4       35 sec
las + lax   Memory      1       4 sec
las + lax   Memory      4       17 sec

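Here, once the files are indexed with lax files, using 4 cores instead of 1 increases the computation time (from 4 sec to 6 sec when streaming and to 17 sec when loading in memory). Indeed the runtime is very small and the potential gain does not compensate for the overhead of the parallelization. Without lax files, 4 cores bring only a modest gain (45 sec down to 35 sec).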