Graphics Runner

Thursday, February 2, 2012

Optix.NET: a managed wrapper for Nvidia Optix

Today I released an open source project I’ve been working on for the last week, Optix.NET. It’s a lightweight wrapper around Nvidia’s Optix GPU ray-tracing library. I figured since there aren’t any .NET wrappers around that I’d go ahead and make one myself. The project started out as a curiosity of managed wrappers and a little of a learning experience working with c++/cli. It’s still in the Alpha stage of development so there may be some bugs, and if you have any suggestions feel free to drop me a line. The math library is pretty sparse at the moment, having only implemented functions that I needed.

Optix.NET may head in the direction of CUDAfy where you can create your optix programs in-line with your C#. The current downside with Optix.NET is that you cannot share structs/classes with your Optix programs as you can when working with the original c/c++ library.

The Optix.NET SDK also comes with a (at the moment very) basic demo framework for creating Optix applications. Such as a basic OBJ model loader and simple camera.

As I talked a little about last post on Instant Radiosity, the general flow of Optix is:

Create a context
- This is similar to a D3D Device.
Create material programs
- These will run when there is an intersection and are akin to pixel shaders.
Create intersection programs
- These are responsible for performing ray-geometry intersection.
Create the main entry program / ray-generation program
- These will launch eye rays in a typical pinhole camera ray-tracer
Load geometry data and creating a scene hierarchy
Perform ray-tracing and display results.

To get a very good introduction to Optix I recommend following the programming guide and quick start guide that comes with the Optix SDK. Let’s get to it then.

This small tutorial will walk through the steps of Sample 6 in the Optix.NET SDK and create a simple program that will ray-trace a cow and shade it with its interpolated normals.

Creating the Optix Context

Context = new Context();
Context.RayTypeCount = 1;
Context.EntryPointCount = 1;

Here we uh create our rendering context :-). We also set the ray type count. This tells optix how many different types of rays will be traversing the scene (e.g. Eye rays, indirect rays, shadow rays, etc ). EntryPointCount sets the number of main entry programs there will be.

Creating the material

Material material = new Material( Context );
material.Programs[ 0 ] = new SurfaceProgram( Context, RayHitType.Closest, shaderPath, "closest_hit_radiance" );

This creates a material that the geometry will use and assigns a SurfaceProgram (similar to a pixel shader), and tells Optix to run this shader on the closest ray-geometry intersection so that there is propery depth sorting.

Creating geometry

Next the geometry is loaded. For brevehity’s sake that part is omitted, but I show the important part of how you get your geometry into Optix.

First we create geometry buffers, similar to vertex and index buffers in D3D, and fill them with the positions, normals, texture coordinates, and triangle indices.

//create buffer descriptions
BufferDesc vDesc = new BufferDesc() { Width = (uint)mVertices.Count,  Format = Format.Float3, Type = BufferType.Input };
BufferDesc nDesc = new BufferDesc() { Width = (uint)mNormals.Count,   Format = Format.Float3, Type = BufferType.Input };
BufferDesc tcDesc = new BufferDesc(){ Width = (uint)mTexcoords.Count, Format = Format.Float2, Type = BufferType.Input };
BufferDesc iDesc = new BufferDesc() { Width = (uint)mIndices.Count,   Format = Format.Int3,   Type = BufferType.Input };

// Create the buffers to hold our geometry data
Optix.Buffer vBuffer = new Optix.Buffer( Context, ref vDesc );
Optix.Buffer nBuffer = new Optix.Buffer( Context, ref nDesc );
Optix.Buffer tcBuffer = new Optix.Buffer( Context, ref tcDesc );
Optix.Buffer iBuffer = new Optix.Buffer( Context, ref iDesc );

vBuffer.SetData<Vector3>( mVertices.ToArray() );
nBuffer.SetData<Vector3>( mNormals.ToArray() );
tcBuffer.SetData<Vector2>( mTexcoords.ToArray() );
iBuffer.SetData<Int3>( mIndices.ToArray() );

Next we create a Geometry node that will tell Optix what intersection programs to use, how many primitives our geometry has, and creates shader variables to hold the geometry buffers.

//create a geometry node and set the buffers
Geometry geometry = new Geometry( Context );
geometry.IntersectionProgram = new Program( Context, IntersecitonProgPath, IntersecitonProgName );
geometry.BoundingBoxProgram = new Program( Context, BoundingBoxProgPath, BoundingBoxProgName );
geometry.PrimitiveCount = (uint)mIndices.Count;

geometry[ "vertex_buffer" ].Set( vBuffer );
geometry[ "normal_buffer" ].Set( nBuffer );
geometry[ "texcoord_buffer" ].Set( tcBuffer );
geometry[ "index_buffer" ].Set( iBuffer );

Now we create a GeometryInstance that pairs a Geometry node with a Material (that we created earlier ).

//create a geometry instance
GeometryInstance instance = new GeometryInstance( Context );
instance.Geometry = geometry;
instance.AddMaterial( Material );

//create an acceleration structure for the geometry
Acceleration accel = new Acceleration( Context, AccelBuilder.Sbvh, AccelTraverser.Bvh );
accel.VertexBufferName = "vertex_buffer";
accel.IndexBufferName = "index_buffer";

We then create an Acceleration structure ( or Bounding Volume Hierarchy ) that will create a spatial data structure that will optimize the ray traversal of the geometry. Here we create the Acceleration node with a Split BVH builder and a BVH traverser. This informs Optix how the BVH should be built and traversed. We also give the Acceleration structure the name of the vertex and index buffers so that it can use that data to optimize the building of the Split BVH (assigning the names of the vertex and index buffers is only required with Sbvh and TriangkeKdTree AccelBuilders ).

Next we create a top-level node to hold our hierarchy. We give it the acceleration structure and the geometry instance. Optix will use this top-level node to begin its scene traversal.

//now attach the instance and accel to the geometry group
GeometryGroup GeoGroup = new GeometryGroup( Context );
GeoGroup.Acceleration = accel;
GeoGroup.AddChild( instance );

Create ray generation program

Now comes the creation of our main entry ray generation program and set it on the Context. This will be responsible for creating pinhole camera rays.

Program rayGen = new Program( Context, rayGenPath, "pinhole_camera" );
Context.SetRayGenerationProgram( 0, rayGen );

Create the output buffer and compile the Optix scene

Finally, we create our output buffer, making sure to define its format and type. The BufferType in Optix defines how the buffer will be used. The BufferTypes are: Input, Output, InputOutput, and Local. The first three are self explanatory. Local sets up the buffer to live entirely on the GPU, which is a huge performance win in multi-gpu setups as it doesn’t require us copying the buffer from GPU memory to main memory after every launch. Local buffers are typically used for intermediate results ( such as accumulation buffers for iterative GI ).

BufferDesc desc = new BufferDesc() { Width = (uint)Width, Height = (uint)Height, Format = Format.UByte4, Type = BufferType.Output };
OutputBuffer = new OptixDotNet.Buffer( Context, ref desc );

Now we setup shader variables to that will hold our top level GeometryGroup, and OutputBuffer. Then Compile Optix (this will validate our node layout and programs are correct) and build the acceleration tree, which only needs to be done on initialization or when geometry changes.

Context[ "top_object" ].Set( model.GeoGroup );
Context[ "output_buffer" ].Set( OutputBuffer );

Context.Compile();
Context.BuildAccelTree();

Ray-tracing and displaying results

To ray-trace the scene we call Launch and give the size of our 2D launch dimensions and the index of our main entry program (zero).

Context.Launch( 0, Width, Height );

And to display the results, we get a pointer to the output buffer, and we use OpenGl’s draw pixels:

BufferStream stream = OutputBuffer.Map();
Gl.glDrawPixels( Width, Height, Gl.GL_BGRA, Gl.GL_UNSIGNED_BYTE, stream.DataPointer );
OutputBuffer.Unmap();

Results:

And that’s pretty much it. A pretty simple program for doing GPU ray-tracing :-).

The current source, samples, and built executables are freely downloadable. Currently, I’ve got 5 samples that mimic the Optix SDK samples and will continue to add more to test functionality and eye candy.

You can download the current release and source here:
http://optixdotnet.codeplex.com/

Or get the source directly with Mercurial here:
https://hg01.codeplex.com/optixdotnet

Saturday, June 4, 2011

New Prey 2 E3 Trailer

Bethesda released a new trailer for Prey 2. I'm not gonna lie, it looks awesome.

Thursday, April 7, 2011

Prey 2 Concept and Screenshot

Bethesda posted a screenshot and concept piece from Prey 2 today: http://bethblog.com/index.php/2011/04/07/sneak-peek-at-prey-2/

Monday, March 28, 2011

Instant Radiosity using Optix and Deferred Rendering

This comes a little later than I wanted, I hadn’t factored in Crysis 2 taking up as much of my time as it did last week :-)

I’ve been using Nvidia’s Optix raytracing API for quite some time, and decided that a good introduction to Optix and what it can do for you would be using it in an Instant Radiosity demo. The demo is fairly large so I won’t cover all of it here, as that would be entirely too long of a post, but just the main parts.

Instant Radiosity

Instant Radiosity is a global illumination algorithm that approximates the diffuse radiance of a scene by placing many virtual point lights that act as indirect light. The algorithm is fairly simple: for each light in the scene you cast N photon rays into the scene. At each intersection the photon either bounces and another ray is cast or, through Russian Roulette, is killed of. At each of the intersections you create a Virtual Point Light (VPL) that has the same radiance value as the photon. Once you have these VPLs you render them as you would any other light source.

One optimization that the demo makes is to divide the scene into a regular grid. For each grid voxel, we find all the VPLs in the voxel and merge them together to form a new VPL that represents the merged VPLs. Any voxels that don’t contain any VPLs are skipped. This dramatically reduces the number of VPLs that we need to render, and trades off indirect accuracy for speed. The following couple of shots demonstrate this idea. The image on the left shows the VPLs as calculated from our Optix program. The image on the right shows the merged VPLs.

Optix

Optix is Nvidia’s ray tracing API that runs on Nvidia GPUs (on G80 and up). Giving an overview to optix could take many blog posts so I won’t go that in depth here. There are a couple of SIGGRAPH presentations that give a good overview:

http://nvidia.fullviewmedia.com/siggraph2010/04-dev-austin-robison.html
http://graphics.cs.williams.edu/papers/OptiXSIGGRAPH10/

To create an optix program you essentially need two things: a ray generation program and a material program ( essentially a shader ) that gets called when a ray intersects geometry. The ray generation program does exactly as it sounds, it generates rays. The program is called for each pixel of your program’s dimensions. Rays cast by your ray generation program will traverse the scene for intersections, once a ray intersects geometry it will call its material program. The material program is responsible for say shading in a classic ray tracer, or any other computation you want to perform. In our case we’ll use it to create our Virtual Point Lights. So lets get down to business.
Here we have the ray generation program that will cast rays from a light. In the case of our cornell box room, we have an area light at the ceiling and we need to cast photons from this light.

RT_PROGRAM void instant_radiosity()
{
     //get our random seed
     uint2 seed = seed_buffer[ launch_index ];

     //create a random photon direction
     float2 raySeed = make_float2( ( (float)launch_index.x + rnd( seed.x ) ) / (float)launch_dim.x,
                                   ( (float)launch_index.y + rnd( seed.y ) ) / (float)launch_dim.y );

     float3 origin = Light.position;
     float3 direction = generateHemisphereLightPhoton( raySeed, Light.direction );

     //create our ray
     optix::Ray ray(origin, direction, radiance_ray_type, scene_epsilon );

     //create our ray data packet and launch a ray
     PerRayData_radiance prd;
     prd.radiance = Light.color * Light.intensity * IndirectIntensity;
     prd.bounce = 0;
     prd.seed = seed;
     prd.index = ( launch_index.y * launch_dim.x + launch_index.x ) * MaxBounces;
     rtTrace( top_object, ray, prd );
}

So here we cast a randomly oriented ray from a hemisphere oriented about the direction of the light. Once we have our ray, we setup a ray data packet that will collect data as this ray traverses the scene. To cast the ray we make a call to rtTrace, providing the ray and its data packet.

Next we have our material program. This program is called when a ray hits the closest piece of geometry from the light. And it is responsible for updating the ray data packet, placing a VPL, and deciding to cast another ray recursively if we’re under the maximum number of bounces.

RT_PROGRAM void closest_hit_radiosity()
{
     //convert the geometry's normal to world space
     //RT_OBJECT_TO_WORLD is an Optix provided transformation
     float3 world_shading_normal   = normalize( rtTransformNormal( RT_OBJECT_TO_WORLD, shading_normal ) );
     float3 world_geometric_normal = normalize( rtTransformNormal( RT_OBJECT_TO_WORLD, geometric_normal ) );
     float3 ffnormal     = faceforward( world_shading_normal, -ray.direction, world_geometric_normal );

     //calculate the hitpoint of the ray
     float3 hit_point = ray.origin + t_hit * ray.direction;

     //sample the texture for the geometry
     float3 Kd = norm_rgb( tex2D( diffuseTex, texcoord.x, texcoord.y ) );
     Kd = pow3f( Kd, 2.2f ); //convert to linear space
     Kd *= make_float3( diffuseColor ); //multiply the diffuse material color
     prd_radiance.radiance = Kd * prd_radiance.radiance; //calculate the ray's new radiance value

     // We hit a diffuse surface; record hit if it has bounced at least once
     if( prd_radiance.bounce >= 0 ) {
          //offset the light a bit from the hit point
          float3 lightPos = ray.origin + ( t_hit - 0.1f ) * ray.direction;
          VirtualPointLight& vpl = output_vpls[ prd_radiance.index + prd_radiance.bounce ];
          vpl.position = lightPos;

          //the light's intensity is divided equally among the photons. Each photon starts out with an intensity
          //equal to the light. So here we must divide by the number of photons cast from the light.
          vpl.radiance = prd_radiance.radiance * 1.0f / ( launch_dim.x * launch_dim.y );
     }

     //if we're less than the max number of bounces shoot another ray
     //we could also implement Russion Roulette here so that we would have a less biased solution
     prd_radiance.bounce++;
     if ( prd_radiance.bounce >= MaxBounces )
          return;

     //here we "rotate" the seeds in order to have a little more variance
     prd_radiance.seed.x = prd_radiance.seed.x ^ prd_radiance.bounce;
     prd_radiance.seed.y = prd_radiance.seed.y ^ prd_radiance.bounce;
     float2 seed_direction = make_float2( ( (float)launch_index.x + rnd( prd_radiance.seed.x ) ) / (float)launch_dim.x,
                                          ( (float)launch_index.y + rnd( prd_radiance.seed.y ) ) / (float)launch_dim.y );

     //generate a new ray in the hemisphere oriented to the surface
     float3 new_ray_dir = generateHemisphereLightPhoton( seed_direction, ffnormal );

     //cast a new ray into the scene
     optix::Ray new_ray( hit_point, new_ray_dir, radiance_ray_type, scene_epsilon );
     rtTrace(top_object, new_ray, prd_radiance);
}

With both of these programs created we need to launch our optix program in order to generate the VPLs. When we’re done running the optix program, we gather all the VPLs into a grid, merging lights that are in the same voxel. Once the VPLs are merged, we add them to the deferred renderer.

//run our optix program
mContext->launch( 0, SqrtNumVPLs, SqrtNumVPLs );

//get a pointer to the GPU buffer of virtual point lights.
VirtualPointLight* lights = static_cast< VirtualPointLight* >( mContext["output_vpls"]->getBuffer()->map() );

//the following block merges the scattered vpls into a structured grid of vpls
//this helps dramatically reduce the number of vpls we need in the scene
if( mMergeVPLs )
{
     //Here we traverse over the VPLs and we merge all the lights that are in a cell
     for( int i = 0; i < TotalVPLs; ++i ) 
     {
          optix::Aabb node = mBoundingBox;

          //start with the root cell and recursively traverse the grid to find the cell this vpl belongs to
          int index = 0;
          if( FindCellIndex( mBoundingBox, -1, mVoxelExtent, lights[ i ].position, index ) )
          {
               //make sure we found a valid cell
               assert( index >= mFirstLeafIndex );

               //subtract the first leaf index to find the zero based index of the vpl
               index -= mFirstLeafIndex;
               float3& light = mVPLs[ index ];
               light += lights[ i ].radiance;
          }
     }

     //once the VPLs have been merged, add them to the renderer as indirect lights
     int numLights = 0;
     int lastIndex = -1;
     for( int i = 0; i < mVPLs.size(); ++i ) 
     {
          const float3& vpl = mVPLs[i];
          if( dot( vpl, vpl ) <= 0.0f )
               continue;

          numLights++;
          float3 radiance = vpl;
          D3DXVECTOR3 pos = *(D3DXVECTOR3*)&mVoxels[i].center();

          Light light =    {    LIGHT_POINT,                                                //type
                                GetColorValue(radiance.x, radiance.y, radiance.z, 1.0f),    //diffuse
                                pos,                                                        //pos
                                Vector3Zero,                                                //direction
                                1.0f                                                        //intensity
                           };

          renderer->AddIndirectLight( light );

          //also add as a light source so we can visualize the VPLs
          LightSource lightSource;
          lightSource.light = light;
          lightSource.Model = mLightModel;
          renderer->AddLightSource( lightSource );
     }
}

Now for some eye candy. The first set are your typical cornell box + dragon. In the Instant Radiosity shot you can see the light bleeding from the green and red walls onto the floor, the dragon and the box.

Direct lighting: