pynif3d.pipeline

class pynif3d.pipeline.BasePipeline

Bases: torch.nn.modules.module.Module

Initializes internal Module state, shared by both nn.Module and ScriptModule.

load_pretrained_model(yaml_file, model_name, cache_directory='.')

training: bool

class pynif3d.pipeline.ConvolutionalOccupancyNetworks(encoder_fn=None, feature_sampler_fn=None, nif_model=None, rendering_fn=None, pretrained=None)

Bases: pynif3d.pipeline.base_pipeline.BasePipeline

This is the main pipeline class for Convolutional Occupancy Networks: https://arxiv.org/abs/2003.04618

This class takes a noisy point cloud and applies an encoding function (e.g. PointNet) to extract features from the inputs. It projects the points onto 2D plane(s) or a 3D grid and optionally applies an auto-encoder network in 2D or 3D to generate features. For each input query point, bilinear (plane) or trilinear (grid) sampling is applied to extract the features of that query point. A shallow neural implicit function model then predicts the occupancy probability of each query point.

The pipeline is defined by the encoder, feature sampler, NIF model and rendering function that are passed at initialization.
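The bilinear sampling step described above can be illustrated with a minimal pure-Python sketch. It is not the library's PlaneFeatureSampler: the function name `bilinear_sample`, the nested-list plane layout and the clamping at the borders are assumptions made for illustration only.

```python
import math


def bilinear_sample(plane, u, v):
    """Sample a feature plane at continuous coordinates (u, v).

    plane: nested list of shape (height, width, n_features).
    u, v:  column and row coordinates in pixel units, clamped to the plane.
    """
    height, width = len(plane), len(plane[0])
    u = min(max(u, 0.0), width - 1.0)
    v = min(max(v, 0.0), height - 1.0)

    # Indices of the four neighbouring cells.
    x0, y0 = int(math.floor(u)), int(math.floor(v))
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)

    # Fractional offsets inside the cell act as interpolation weights.
    wx, wy = u - x0, v - y0
    corners = zip(plane[y0][x0], plane[y0][x1], plane[y1][x0], plane[y1][x1])
    return [
        (1 - wx) * (1 - wy) * f00
        + wx * (1 - wy) * f01
        + (1 - wx) * wy * f10
        + wx * wy * f11
        for f00, f01, f10, f11 in corners
    ]


# A 2x2 plane with a single feature channel.
plane = [
    [[0.0], [1.0]],
    [[2.0], [3.0]],
]
center_feature = bilinear_sample(plane, 0.5, 0.5)
```

Sampling at the center of the four cells averages their features. Trilinear sampling on a 3D grid follows the same pattern with eight neighbours.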

Usage:

model = ConvolutionalOccupancyNetworks()
occupancies = model(input_points, query_points)

Parameters

encoder_fn (instance) – The function instance that is called in order to encode the input points. Default is PointNet_LocalPool.
feature_sampler_fn (instance) – The function instance that is called in order to sample the features on a plane or on a grid. The sampler has to match the 2D/3D operation mode. Default is PlaneFeatureSampler.
nif_model (instance) – The model instance that outputs occupancy information given some query points and sampled features. Default is ConvolutionalOccupancyNetworksModel.
rendering_fn (instance) – The function instance that is called in order to render the query points obtained using the nif_model. Default is PointRenderer.
pretrained (str) – (Optional) The pretrained configuration to load model weights from. Default is None.

forward(input_points, query_points)

Parameters

input_points (torch.Tensor) – Tensor that holds noisy input points. Its shape is (batch_size, n_points, point_dimension).
query_points (torch.Tensor) – Tensor containing the queried occupancy locations. Its shape is (batch_size, n_points, point_dimension).

Returns

Tensor containing occupancy probabilities. Its shape is (batch_size, n_points).

Return type

torch.Tensor

training: bool

class pynif3d.pipeline.IDR(image_size, n_rays_per_image=2048, input_sampler_training=None, input_sampler_inference=None, nif_model=None, rendering_fn=None)

Bases: torch.nn.modules.module.Module

This is the main pipeline class for the Implicit Differentiable Renderer (IDR) algorithm: https://arxiv.org/abs/2003.09852

It takes an image, object mask, intrinsic parameters and camera pose as input and returns the reconstructed 3D points, the rendered pixel values and the predicted mask, given the input pose. During training it also returns the predicted Z values of the sampled points, along with the value of gradient_theta, used in the computation of the eikonal loss.
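To show how a gradient such as gradient_theta typically enters an eikonal loss, here is a minimal sketch. It assumes the gradients are the spatial derivatives of the implicit function at the sampled points and uses a plain mean reduction; the helper name `eikonal_loss` is hypothetical, and the loss used with this pipeline may be weighted or reduced differently.

```python
import math


def eikonal_loss(gradients):
    """Mean squared deviation of the gradient norms from 1.

    gradients: list of 3D gradient vectors, one per sampled point.
    """
    if not gradients:
        return 0.0
    total = 0.0
    for gradient in gradients:
        norm = math.sqrt(sum(c * c for c in gradient))
        total += (norm - 1.0) ** 2
    return total / len(gradients)


# Unit-norm gradients incur no penalty; a gradient of norm 2 is penalized.
loss_unit = eikonal_loss([[1.0, 0.0, 0.0], [0.0, 0.0, -1.0]])
loss_stretched = eikonal_loss([[2.0, 0.0, 0.0]])
```

The penalty pushes the implicit function towards a signed distance function, whose gradient has unit norm almost everywhere.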

Usage:

image_size = (image_height, image_width)
model = IDR(image_size)
pred_dict = model(image, object_mask, intrinsics, camera_poses)

Parameters

image_size (tuple) – Tuple containing the image size, expressed as (image_height, image_width).
n_rays_per_image (int) – The number of rays to be sampled for each image. Default value is 2048.
input_sampler_training (torch.nn.Module) – The ray sampler to be used during training. If set to None, it will default to RandomPixelSampler. Default value is None.
input_sampler_inference (torch.nn.Module) – The ray sampler to be used during inference. If set to None, it will default to AllPixelSampler. Default value is None.
nif_model (torch.nn.Module) – NIF model for outputting the prediction. If set to None, it will default to IDRNIFModel. Default value is None.
rendering_fn (torch.nn.Module) – The rendering function to be used during both training and inference. If set to None, it will default to IDRRenderingModel. Default value is None.
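As an illustration of what sampling n_rays_per_image pixels during training might look like, the sketch below draws distinct pixel locations uniformly at random. The function `sample_pixels` is hypothetical, and sampling without replacement is an assumption; RandomPixelSampler may behave differently.

```python
import random


def sample_pixels(image_size, n_rays, rng=None):
    """Pick n_rays distinct (row, col) pixel locations uniformly at random."""
    if rng is None:
        rng = random.Random()
    height, width = image_size
    n_pixels = height * width
    n_rays = min(n_rays, n_pixels)  # cannot draw more distinct pixels than exist
    flat_indices = rng.sample(range(n_pixels), n_rays)
    # Convert flat indices back to (row, col) pairs.
    return [divmod(index, width) for index in flat_indices]


pixels = sample_pixels((4, 6), n_rays=5, rng=random.Random(0))
```

An inference-time sampler would instead return every pixel, which is why the two modes use different samplers.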

compute_gradient(points)

compute_rgb_values(points, view_dirs)

forward(image, object_mask, intrinsics, camera_poses, **kwargs)

Parameters

image (torch.Tensor) – Tensor containing the input images. Its shape is (batch_size, 3, image_height, image_width).
object_mask (torch.Tensor) – Tensor containing the object masks. Its shape is (batch_size, 1, image_height, image_width).
intrinsics (torch.Tensor) – Tensor containing the camera intrinsics. Its shape is (batch_size, 4, 4).
camera_poses (torch.Tensor) – Tensor containing the camera poses. Its shape is (batch_size, 4, 4).
kwargs (dict) –
- chunk_size (int): The chunk size of the tensor that is passed for NIF prediction.
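The purpose of chunk_size can be sketched as follows: the points are split into consecutive slices so that only one slice is passed to the model at a time, which bounds peak memory use. The helper `predict_in_chunks` and the toy model are hypothetical and only illustrate the general pattern, not the pipeline's internal code.

```python
def predict_in_chunks(predict_fn, points, chunk_size):
    """Apply predict_fn to consecutive slices of points and join the results."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    outputs = []
    for start in range(0, len(points), chunk_size):
        outputs.extend(predict_fn(points[start:start + chunk_size]))
    return outputs


calls = []


def toy_model(chunk):
    # Record the chunk length so the splitting can be inspected.
    calls.append(len(chunk))
    return [2 * x for x in chunk]


result = predict_in_chunks(toy_model, list(range(10)), chunk_size=4)
```

The joined output is the same as a single call on all points, provided the model treats points independently.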

Returns

Dictionary containing the prediction outputs. In both training and inference it holds the 3D coordinates of the intersection points, the corresponding RGB values and the ray-to-surface intersection mask. In training it additionally holds the Z values, gradient theta and the sampled 3D coordinates.

Return type

dict

training: bool

class pynif3d.pipeline.NeRF(image_size, focal_length, n_rays_per_image=1024, n_points_per_chunk=1024, input_sampler_training=None, input_sampler_inference=None, background_color=None, ray_generator=None, ray_samplers=None, n_points_per_ray=None, level_of_sampling=2, near=2, far=6, nif_models=None, rendering_fn=None, aggregation_fn=None, pretrained=None)

Bases: pynif3d.pipeline.base_pipeline.BasePipeline

This is the main pipeline class for the Neural Radiance Fields (NeRF) algorithm: https://arxiv.org/abs/2003.08934

It takes a camera pose as input and returns the rendered pixel values given the input pose.

Usage:

image_size = (image_height, image_width)
focal_length = (focal_x, focal_y)

model = NeRF(image_size, focal_length)
pred_dict = model(camera_pose)
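To make the role of the camera pose and focal length concrete, here is a sketch of generating one ray from a (3, 4) pose with a pinhole model. It assumes a camera-to-world pose [R | t], a principal point at the image center and a camera looking along its negative z axis, as in the original NeRF reference code. The function `generate_ray` is hypothetical and CameraRayGenerator may use other conventions.

```python
def generate_ray(pose, image_size, focal_length, row, col):
    """Return (origin, direction) of the ray through pixel (row, col).

    pose: 3x4 camera-to-world matrix [R | t] given as nested lists.
    """
    height, width = image_size
    focal_x, focal_y = focal_length

    # Direction in camera coordinates (camera looks along -z, y points up).
    d_cam = [
        (col - 0.5 * width) / focal_x,
        -(row - 0.5 * height) / focal_y,
        -1.0,
    ]

    # Rotate into world coordinates; the translation column is the ray origin.
    direction = [sum(pose[i][k] * d_cam[k] for k in range(3)) for i in range(3)]
    origin = [pose[i][3] for i in range(3)]
    return origin, direction


# Identity rotation, camera placed 4 units along the world z axis.
identity_pose = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 4.0],
]
origin, direction = generate_ray(
    identity_pose, (100, 200), (50.0, 50.0), row=50, col=100
)
```

The ray through the image center points straight along the viewing axis, and all rays of one image share the same origin.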

Parameters

image_size (list, tuple) – List or tuple containing the spatial image size (height, width). Its shape is (2,).
focal_length (list, tuple) – List or tuple containing the camera’s focal length (focal_x, focal_y). Its shape is (2,).
n_rays_per_image (int) – (Optional) The number of ray samples that are extracted from an image and processed. Default is 1024.
n_points_per_chunk (int) – (Optional) The number of sampled points passed to the NIF model at once. Default is 1024.
input_sampler_training (instance) – (Optional) The pixel sampling function used during training. Default is RandomPixelSampler.
input_sampler_inference (instance) – (Optional) The pixel sampling function used during inference. Default is AllPixelSampler.
ray_generator (instance) – (Optional) The function that is called in order to generate rays with respect to a given camera pose. Default is CameraRayGenerator.
ray_samplers (list, tuple) – (Optional) List or tuple of the same length as level_of_sampling containing the function(s) that define the sampling logic for each ray. Default is UniformRaySampler for the first level and WeightedRaySampler for the second level.
n_points_per_ray (list, tuple) – (Optional) List or tuple with a length equal to level_of_sampling containing the number of points that are sampled across each ray. Default is 64 for each level.
level_of_sampling (int) – (Optional) The number of sampling levels. Default is 2, following the coarse/fine pattern of the original NeRF paper.
near (float) – (Optional) The lower boundary of the sampling interval along each ray. Each ray is sampled within [near, far]. Default is 2.
far (float) – (Optional) The upper boundary of the sampling interval along each ray. Each ray is sampled within [near, far]. Default is 6.
nif_models (list, tuple) – (Optional) List or tuple with the length equal to level_of_sampling containing the models that define the neural implicit representation of the 3D scene. Default is NeRFModel for each level.
rendering_fn (instance) – (Optional) The function that defines the NIF model execution logic, in order to obtain the resulting pixel values. Default is PointRenderer.
aggregation_fn (instance) – (Optional) The function that defines the aggregation logic for the predicted 3D point values, in order to obtain the final pixel values. Default is NeRFAggregator.
pretrained (str) – (Optional) The pretrained configuration to load model weights from. Default is None.
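The interaction of near, far and n_points_per_ray can be sketched with stratified sampling: the interval [near, far] is split into equal bins and one depth is drawn per bin. The function `sample_depths` is hypothetical, and whether UniformRaySampler jitters the samples in this way is an assumption.

```python
import random


def sample_depths(near, far, n_points, rng=None):
    """Stratified sampling: one depth per evenly sized bin of [near, far]."""
    bin_width = (far - near) / n_points
    depths = []
    for i in range(n_points):
        # Without an rng the bin midpoint is used, which is deterministic.
        offset = rng.random() if rng is not None else 0.5
        depths.append(near + (i + offset) * bin_width)
    return depths


# Documented defaults for the boundaries, with 4 points for readability.
midpoints = sample_depths(near=2.0, far=6.0, n_points=4)
# 64 jittered points per ray, the documented default per level.
jittered = sample_depths(near=2.0, far=6.0, n_points=64, rng=random.Random(0))
```

A second, weighted level would then place additional samples where the first level found high density, which is the coarse/fine pattern of the NeRF paper.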

forward(pose)

Parameters

pose (torch.Tensor) – Tensor containing the camera pose information, used for querying. Its shape is (3, 4).

Returns

Dictionary containing the rendering result (RGB, depth, disparity and transparency values for each pixel that is sampled by input_sampler_inference or input_sampler_training).

Return type

dict
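How per-point predictions become per-pixel RGB, depth and transparency values can be sketched with the volume rendering quadrature from the NeRF paper. This is a simplified single-ray version: the function `composite_ray` is hypothetical, the last interval is assumed to end at far, and the exact definitions used by NeRFAggregator (in particular for transparency and disparity) may differ.

```python
import math


def composite_ray(densities, colors, depths, far):
    """Composite per-point densities and RGB colors along one ray.

    Returns (rgb, depth, transparency), where transparency is the fraction
    of light that passes through all samples.
    """
    rgb = [0.0, 0.0, 0.0]
    expected_depth = 0.0
    transmittance = 1.0  # fraction of light reaching the current sample
    for i, (sigma, color, t) in enumerate(zip(densities, colors, depths)):
        # Distance to the next sample; the last interval ends at far.
        next_t = depths[i + 1] if i + 1 < len(depths) else far
        alpha = 1.0 - math.exp(-sigma * (next_t - t))
        weight = transmittance * alpha
        rgb = [acc + weight * c for acc, c in zip(rgb, color)]
        expected_depth += weight * t
        transmittance *= 1.0 - alpha
    return rgb, expected_depth, transmittance


colors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
depths = [2.0, 4.0]
# Empty space: nothing is accumulated and all light passes through.
empty = composite_ray([0.0, 0.0], colors, depths, far=6.0)
# An opaque first sample hides everything behind it.
opaque = composite_ray([1e9, 1e9], colors, depths, far=6.0)
```

Because each weight is scaled by the transmittance accumulated so far, samples behind an opaque surface contribute nothing to the pixel.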

training: bool