The Transition of Real-World Imagery to Synthetic Data

To create synthetic data that boosts the performance of geospatial analytics, OneView has built an automated pipeline for generating 3D data based on real-world data sources.

Satellite and aerial images can easily cover several square kilometers each, owing to the altitude from which they are captured. To create appropriate synthetic data, we need to be able to recreate scenes at this scale.

When OneView creates synthetic data to train geospatial machine learning algorithms, we not only need to generate large-scale scenes, but also a large number of different scenes, spanning a wide range of parameters. Building these scenes manually is not feasible due to their sheer scale.

OneView developed a solution to automate the creation of these large-scale scenes, such as cities, airports, docks and harbors, and open areas, among others. To achieve this, we turned to OpenStreetMap (OSM), an open-data platform for 2D vector maps that is updated by contributors with access to worldwide data sources, mainly satellite and aerial imagery.

Figure 1: The end-to-end process of creating our synthetic images.
(a) Real Pleiades satellite image of Beijing Capital International Airport, (b) OSM vector map of Beijing Capital International Airport, (c) OneView’s 3D scene generated based on OSM data, (d) Example of a render mimicking Pleiades satellite image.

Step One: OpenStreetMap Contributors Create Vector Maps Using Satellite Data

To understand the process of turning a 2D real-world satellite image into a synthetic scene, let’s take the example of a single satellite image of Beijing Capital International Airport (Figure 1 (a)). This image from Airbus is representative of the data available to the OSM community of contributors. As contributors convert real images into vector maps, each element of the airport is traced and tagged: the terminal, the runway, the taxiway, the apron, the gates, areas with vegetation, and so on. The result is a detailed representation of the original satellite image as a 2D vector map (shown in Figure 1 (b)). This data is already available globally and is constantly being updated.

If a vector map of an area of interest is not available in OSM, OneView (or practically anyone else) can generate its own 2D vector map and ingest it into the OneView pipeline.

Figure 2: Real imagery to 2D vector map.
OSM data contains many annotated elements, creating a rich 2D vector map of the world.
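
To make this concrete, the 2D vector data that feeds the pipeline can be retrieved programmatically. The snippet below is a minimal sketch that queries the public Overpass API for aeroway and building outlines around an airport; the bounding box and tag choices are illustrative assumptions, and this is not necessarily how OneView ingests OSM data.

```python
import requests

# Public Overpass API endpoint for querying OpenStreetMap data.
OVERPASS_URL = "https://overpass-api.de/api/interpreter"

# Illustrative bounding box roughly around Beijing Capital International Airport,
# given as (south, west, north, east) -- an assumption for demonstration only.
BBOX = (39.98, 116.57, 40.11, 116.65)

# Query runways, taxiways, aprons, terminals and buildings as vector geometries.
query = f"""
[out:json][timeout:60];
(
  way["aeroway"~"runway|taxiway|apron|terminal"]({BBOX[0]},{BBOX[1]},{BBOX[2]},{BBOX[3]});
  way["building"]({BBOX[0]},{BBOX[1]},{BBOX[2]},{BBOX[3]});
);
out geom;
"""

response = requests.post(OVERPASS_URL, data={"data": query}, timeout=120)
response.raise_for_status()
elements = response.json()["elements"]

# Each element carries its OSM tags plus a list of lat/lon vertices ("geometry"),
# which is exactly the kind of 2D vector outline a 3D generator can build on.
for way in elements[:5]:
    tags = way.get("tags", {})
    print(tags.get("aeroway") or tags.get("building"), len(way.get("geometry", [])), "vertices")
```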

Step Two: Using a 2D Vector Map to Create a 3D Environment

To begin the transition from this real-world representation to a synthetic image, OneView takes the 2D vector map shown in Figure 1 (b) and uses it as input to its automated generation tool, creating a 3D environment as shown in Figure 1 (c).

This process involves inflating building footprints into 3D, laying out roads and pavements, defining ground and vegetation areas, and so on. The key to good simulated data is variation, and since the output (the 3D-generated airport) is fully controllable in both structure (the width, height, and shape of its elements) and appearance (brightness, textures, material properties), the necessary variation is introduced while the 3D scene is being built. Almost every scene element either draws from a pool of variants (different textures and materials) or changes its appearance procedurally, using OneView’s algorithms. On top of the scene itself, object placement logic and object-to-scene relationships are added, which enables the user to obtain varied yet realistic images at rendering time.

Figure 3: OSM 2D vector map to a 3D environment.
The 3D environment is fully controllable and is configured automatically by OneView.
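
As an illustration of how a 2D footprint can become a varied 3D element, the sketch below extrudes a building outline to a randomly sampled height and picks its texture from a pool of variants. The libraries (shapely, trimesh), height ranges, and texture names are assumptions for demonstration purposes; this is not OneView’s actual generation tool.

```python
import random
from shapely.geometry import Polygon
import trimesh

# Pool of appearance variants an element can draw from -- names are illustrative.
ROOF_TEXTURES = ["concrete_light", "concrete_dark", "metal_panels", "bitumen"]

# Assumed plausible height ranges (in meters) per building type.
HEIGHT_RANGES = {"terminal": (12.0, 30.0), "hangar": (10.0, 20.0), "default": (4.0, 12.0)}

def inflate_footprint(footprint: Polygon, building_type: str, rng: random.Random):
    """Extrude a 2D footprint into a 3D mesh with randomized height and texture."""
    low, high = HEIGHT_RANGES.get(building_type, HEIGHT_RANGES["default"])
    height = rng.uniform(low, high)     # structural variation
    texture = rng.choice(ROOF_TEXTURES) # appearance variation
    mesh = trimesh.creation.extrude_polygon(footprint, height)
    return mesh, {"height_m": round(height, 1), "texture": texture}

# Toy rectangular terminal footprint in local metric coordinates (an assumption).
footprint = Polygon([(0, 0), (120, 0), (120, 35), (0, 35)])

rng = random.Random(42)  # per-scene seed, so every generated scene differs reproducibly
mesh, params = inflate_footprint(footprint, "terminal", rng)
print(mesh.is_watertight, params)
```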

Step Three: A Synthetic World of Endless Variables 

The final step is exploiting the variety of the 3D scenes: placing objects on top of them and rendering, as seen in Figure 1 (d). The ultimate aim is to produce as many variations of this airport scene as possible to feed into a machine learning algorithm. This ensures it can detect objects of interest regardless of geographic location, time of day, weather conditions, position relative to other objects, viewing angle, and other variables, so that an object of interest remains identifiable whatever conditions the algorithm encounters.

Therefore, at each render, objects are positioned at different locations, with different densities and placement arrangements. The scene components change their appearance and structure, as do the objects placed on top of the scene, so no two scenarios are the same.

In addition, the time of day and weather are fully controlled, enabling different appearances of the same scenario. Finally, the virtual sensor itself can capture the environment from different angles and with different acquisition parameters. The resulting images are unique; no two are the same.
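
A highly simplified sketch of such render-time randomization is shown below: each render draws its own object placements, sun position, cloud cover, and sensor viewing angle. The parameter names and value ranges are illustrative assumptions, not OneView’s configuration.

```python
import random
from dataclasses import dataclass, field

@dataclass
class RenderConfig:
    """One set of randomized parameters used to render a single synthetic image."""
    sun_elevation_deg: float   # time of day, expressed as sun elevation
    sun_azimuth_deg: float
    cloud_cover: float         # 0 = clear sky, 1 = fully overcast
    off_nadir_deg: float       # sensor viewing angle
    ground_sample_m: float     # sensor resolution
    placements: list = field(default_factory=list)  # (object_type, x, y, heading)

def sample_render_config(rng: random.Random, apron_extent=(500.0, 300.0)) -> RenderConfig:
    """Draw a unique combination of environment, sensor and placement parameters."""
    placements = [
        ("aircraft", rng.uniform(0, apron_extent[0]), rng.uniform(0, apron_extent[1]),
         rng.uniform(0, 360))
        for _ in range(rng.randint(3, 12))  # varying object density per render
    ]
    return RenderConfig(
        sun_elevation_deg=rng.uniform(15, 75),
        sun_azimuth_deg=rng.uniform(0, 360),
        cloud_cover=rng.uniform(0.0, 0.8),
        off_nadir_deg=rng.uniform(0, 30),
        ground_sample_m=rng.choice([0.3, 0.5, 0.7]),  # illustrative GSD options
        placements=placements,
    )

rng = random.Random()
configs = [sample_render_config(rng) for _ in range(1000)]  # 1,000 unique render setups
```

Because every parameter is sampled independently per render, the chance of two identical configurations is negligible, which is what makes each output image unique.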

The end-to-end result is the ability to rapidly create as many variations of synthetic imagery as needed to boost the training of machine learning algorithms to accurately identify objects of interest.

Figure 4: Baseline 3D environment to a satellite-like rendered image. 
No two renders are the same, creating a rich and diverse training dataset.

OneView’s platform gives machine learning teams the ability to create large-scale synthetic scenes, speeding up and scaling their algorithm training process and unleashing the full potential of their computer vision models. To find out how, click here.