»PixSet« Dataset

»It doesn’t work without cooperation«

2 June 2021, 08:48   |  Iris Stroh

Continuation of the article from Part 1.

PixSet contains data from a wide range of sensors

Examples of the images included in PixSet, in which individual objects have been manually annotated with 3D bounding boxes
© LeddarTech

To what extent has public opinion towards automated driving deteriorated?

I’m not talking about robotaxis, but about private vehicles. Here, features like autopilot have given consumers hope that the technology is mature, will soon be available and will be affordable. This hope has not been fulfilled: even today we are still somewhere between Level 2 and Level 3 of automated driving. But if I have to keep my eyes on the road all the time because the technology doesn’t work on its own, then it is superfluous; that is not what the consumer wants. After all, these functions are not about people refusing to drive as a matter of principle, but about wanting to do something else in certain situations. They want technology that takes over driving in traffic jams or on motorways, for example. And they want it to take over driving completely, without the driver having to sit there as a constant watchdog. I am convinced that many consumers would use these technologies if they were fully developed and affordable. But what is available today is not mature. Today we are talking about a time horizon of 2025; a few years ago, the claim was that these technologies would already be available by now. These delays increase mistrust among consumers: they no longer believe that the technology works. This is a problem that needs to be solved, and solving it requires cooperation.

Do you expect LeddarTech will benefit in return from the academic research itself?

I think that is quite likely. There are some tasks that no one has solved satisfactorily yet. PixSet is a longer-term investment, but of course we hope that technical developments will be triggered in the academic environment based on this dataset that will help us in industry.

Labelled data is rare; what was the response to your dataset?

Very good, and one reason for this is indeed that the data is labelled. But there is another special feature: the dataset is not based solely on data from LeddarTech sensors, but includes measurement data from quite different sensors. Both points set it apart from the datasets available so far. In some, the objects are not labelled; others contain measurement data from only one sensor type, for example LiDAR. Our dataset includes data from very different sensors: different cameras, radar and different LiDAR sensors, including non-LeddarTech LiDAR sensors.

Can you be more specific?

Our dataset was created with a complete AV sensor suite, i.e., cameras, LiDARs, radar and IMUs (inertial measurement units). As I said, we used different LiDAR sensors. This also means that the dataset includes full-waveform data from our Leddar Pixell, a 3D solid-state flash LiDAR sensor. Full-waveform LiDAR sensors provide a full digital representation of the incoming light signal, which is significantly more information than conventional LiDAR sensors provide. And this additional information enables higher performance in object detection, classification, etc.

In summary, our dataset is much more comprehensive than those from other commercial providers. We also provide open-source tools for manipulating and viewing the data. I think that makes PixSet unique.

The dataset is free of charge for the academic environment, but commercial providers can also license it. Whom are you addressing here specifically? Presumably not the large OEMs, since they have their own huge datasets?

No, we’re not talking about OEMs here; of course they have their own datasets, created over hundreds of thousands of kilometres driven. We are aiming more at start-up companies that are developing corresponding technologies, for example pedestrian detection. For these companies, our dataset is certainly helpful, because up to now they have typically relied on publicly accessible datasets and, as I said, these have a limited scope. Collecting their own data is usually not an option for these companies either, because that is too costly. Now, with PixSet, they can license a comprehensive dataset at a reasonable price, which is very helpful.

Is the dataset sufficient to try out complex algorithms?

Yes, the dataset is quite extensive. We are talking about around 30,000 images in 97 sequences and more than 1.3 million labelled objects. That is more than twice the size of the KITTI dataset, which we ourselves have used very effectively in our own development. PixSet can be used to test what different algorithms are capable of. If the algorithms are to go into production, the dataset would certainly have to be expanded with significantly more data; then we are talking about millions of images. But the dataset is certainly sufficient to demonstrate the possibilities.

Should this kind of support be expanded, for example with datasets for traffic monitoring?

To be honest, we are not following a fixed roadmap in this case. If a dataset were created as part of an internal project, for motorway surveillance for example, then we would certainly consider making it publicly available.

