Parallel Domain has launched Data Lab, an API that lets customers generate synthetic datasets. The San Francisco-based startup uses generative AI to give machine-learning engineers control over dynamic virtual worlds, so they can simulate any scenario they can imagine.
According to Kevin McNamara, founder and CEO of Parallel Domain, customers can install the Data Lab API from GitHub and start generating datasets with Python code. The API lets engineers create objects that were not previously available in the startup’s asset library. Using 3D simulation, engineers can layer the real world onto the virtual world, creating scenarios such as a flipped cab on a highway or a person dressed in an inflatable dinosaur outfit.
The main goal of Data Lab is to provide autonomy, drone, and robotics companies with more control and efficiency in building large datasets, which will help train their models faster and at a deeper level. McNamara explains that the speed of iteration now depends on how fast an ML engineer can translate their ideas into API calls or code.
Data Lab has already attracted major OEMs and autonomous driving companies as customers. Previously, Parallel Domain needed weeks or months to generate datasets based on customer parameters. With the self-serve API, customers can create new datasets in near real time.
In testing, Parallel Domain found that autonomous vehicle models trained on synthetic datasets performed better than those trained on real-world datasets. While the startup is not using popular open AI APIs like ChatGPT, it is building components on open-source foundation models released in the past couple of years.
Parallel Domain’s synthetic data generation engine, called Reactor, was initially launched for internal use and beta testing. Now, with the Data Lab API, the startup is shifting toward a software-as-a-service business model in which customers subscribe based on their usage.
The potential applications of the Data Lab API go beyond autonomous driving, extending to industries like agriculture, retail, and manufacturing, where computer vision-enabled technology can improve efficiency. McNamara envisions Parallel Domain becoming the go-to platform for training AI models in any domain that requires sensor-based perception.