Sony A7 review

Introduction

This is a new post for one of my favourite “off-topic” subjects – photography. I just recently (under 2 weeks ago) bought a Sony A7 and wanted to share some of my first impressions and write a mini review.

Why did I buy a new piece of photo hardware? Well, my main digital camera for the last 3-4 years has been the Fuji FinePix X100. I also owned some Nikon 35mm/FF DSLRs, but since my D700 (bought used and cheap, already with a high shutter count) broke beyond repair and I replaced it with a D600, I have hardly used my Nikon gear. The D600 is a terrible camera with broken AF, wrong metering (exposes +/- 1EV at random, meaning lots of PP at home) and tons of other problems – honestly, I wouldn’t recommend it to anyone and I don’t use it anymore.

With the Fuji X100 I have a love & hate relationship. It has lots of advantages. Great image quality for such a tiny size and an APS-C sensor. It is very small and looks like a toy camera (a serious advantage if you want to travel into not-really-safe areas or simply don’t want to attract too much attention and just enjoy taking photos). A bright f/2.0 lens and an interesting focal length (one good photographer friend of mine told me once that there are no interesting photos taken with focal lengths of more than 50mm, and while it was supposed to be a joke, I hope you get the point). Finally, a nice small built-in flash and an excellent fill-light flash mode working great with the leaf shutter and short sync times – it literally saved thousands of portraits in bright sunlight and other holiday photos. On the other hand, it is slow, has lots of quirks in usage (why do I need to switch to macro mode to take a regular situational portrait?!), the AF is slow and inaccurate (you need to try taking a photo a couple of times, especially in low light…), it’s not pin-sharp, and the fixed 35mm-equivalent focal length can be quite limiting – too wide for standard shooting, too narrow for wide-angle shots.

For at least a year I had been looking around for alternatives / some additional gear and couldn’t find anything interesting enough. I looked into the Fuji X100s – but simply a bit better AF and sensor wouldn’t justify such a big expense, plus software has problems with X-Trans sensor pixel color reconstruction. I read a lot about the Fuji X-series mirrorless system, but going into a new system and buying all the new lenses is a big commitment – especially on APS-C. Finally, a quite recent option is the Sony RX-1. It seemed very interesting, but Angelo Pesce described it quite well – it’s a toy (NO OVF/EVF???).

The Sony A7/A7R and the recent A7S looked like interesting alternatives and something that would compete with the famous Leica, so I looked into it and after a couple of weeks of research I decided to buy the cheapest and most basic one – the A7 with the kit lens. What do I need a kit lens for? Well, to take photos. I knew that its IQ wouldn’t be perfect, but it’s cheap, not very heavy and it’s convenient to have one just in case – especially until you complete your target lens set. After a few days of extensive use (a weekend trip to NYC, yay!) I feel like writing a mini review of it, so here we go!

Hero of this report – no, not me & sunburn! Sony A7 :) Tiny and works great.

I tested it with the kit lens (Sony FE 28-70mm f/3.5-5.6 OSS), Nikkor 50mm 1.4D and Voigtlander Nokton 40mm 1.4.

What I like about it

Size and look

This one is pretty obvious. A full-frame 35mm camera smaller than many APS-C mirrorless or the famous Leica cameras! Very light, so I just throw it in a bag or backpack. My neck doesn’t hurt even after a whole day of photo shooting. Discreet when doing street photography. A nice style that is a kind of blend between modern and retro cameras – especially with M-mount lenses on: classic look and compact size. Really hard to beat in this area. :)

Love how classic and modern styles work great together on this camera

Image quality

Its full-frame sensor has amazing dynamic range at low ISOs. 24MP resolution – way too much for anyone except pros taking shots for printing on billboards, but useful for cropping or reducing high-ISO noise when downsizing. Very nice built-in color profiles and aesthetic color reproduction – I like them much better than the Adobe Lightroom ones. I hope I don’t sound like an audiophile, but you really should be able to see the effect of full frame and large pixel size on the IQ – just like there is a “medium-format look” even with mediocre scans, I believe there is a “full-frame look” better than APS-C or Micro 4/3.

Subtle HDR from a single photo? No problem with Sony A7 dynamic range.

IQ and amount of detail is amazing – even on MF, shot with Voigtlander Nokton 40mm f 1.4

EVF and back display

Surprisingly pleasant in use – high resolution, good dynamic range and fast. I was used to the Fuji X100’s laggy EVF (still useful at night or when doing precise composition) and on the Sony A7 I feel a huge difference. It switches between the EVF and the back display quite quickly and the eye sensor works nicely. The back display can be tilted and I have already used that a couple of times (photos near the ground or above my head) – a nice feature to have.

Manual focusing and compatibility with other lenses

This single advantage is really fantastic and I would buy this camera just because of it. Plugging in Voigtlander or Nikon lenses was super easy – the camera automatically switched into manual focus mode and operated very well. Focusing with magnification and focus assist is super easy and really pleasant. It feels like all those old manual cameras, the same pleasure of slowly composing, focusing, taking your time and enjoying photography – but much more precise. With the EVF and DoF preview always on, you constantly think about DoF and its effect on composition, what will be sharp etc. To be honest, I have never taken such sharp photos in my life – almost none deleted afterwards. So you spend more time on taking the photo (which may not be acceptable for your friends or strangers asked to take a photo of you), but much less on post-processing and selection – again, kind of back to photography roots.

Photo of my wife, shot using the Nikkor 50mm f/1.4D and MF – no AF ever gave me such precise results…

I like the composition and focus in this photo – shot using manual focus on Nikkor 50mm 1.4D

Quality of kit lens and image stabilization

I won’t write any detailed review of the kit lens – but it’s acceptably sharp, with nice micro-contrast and color reproduction, you can correct distortion and vignetting easily in Lightroom, and it’s easy to take great low-light photos with relatively long exposure times thanks to the very good image stabilization. AF is usually accurate. While I don’t intend to use this lens a lot (I have much more fun with primes), I will keep it in my bag for sure and it proves itself useful. The only downside is the size (full-frame zoom lenses cannot be tiny…), because weight-wise it is surprisingly light!

Hand-held photo taken using the kit lens at night – no camera shake!

Speed and handling

I probably feel so good about the Sony A7’s speed and handling partly because I’m coming from the Fuji X100 – but the ergonomics are great, it is fast to use and reacts quickly. The only disadvantage is how long the default photo preview stays up before the EVF shows the live image feed again – 2s is the minimum time selectable in the menu, way too long for me. There are tons of buttons configured very wisely by default – changing ISO or exposure compensation without taking your eye off the camera is easy.

Various additional modes

A pro photographer probably doesn’t need any panorama mode, or a night mode that automatically combines many frames to decrease noise / camera shake / blur, but I’m not a pro photographer and I like those features – especially panoramas. Super easy to take, decent quality and no need to spend hours post-processing or relying on stitch apps!

In-camera panorama image

What I don’t like

Current native lenses available

The current native FE (“full frame E-mount”) lens line-up is a joke. Apart from the kit lens there are only 2 primes (why is the 35mm only f/2.8 when it’s so big?) and 2 zoom lenses – all definitely over-priced and too large. There are some Samyang/Rokinon manual focus lenses available (I played a bit with the 14mm 2.8 on Nikon and it was cheap and good quality – but way too large). There are rumors of many first- and third-party (Zeiss, Sigma, maybe Voigtlander) lenses to be announced at Photokina, so we will see. For now one has to rely on adapters and manual focusing.

Lack of built-in or small external flash

A big problem for me. I very often use flash as fill light and here it’s not possible. The smallest Sony flash, the HVL-F20AM, is currently not available (and not so small anyway).

Not too bad photo – but would have been much better with some fill light from a flash… (ok, I know – would be difficult to sync without ND filters / leaf shutter :) )

What could be better but is not so bad

Accessories

The system is very young so I expect things to improve – but currently the availability of first- or third-party accessories (flashes, cases, screen protectors etc.) is way worse than, for example, for the Fuji X-series system. I hope this changes in the coming months.

Not the best low light behavior

Well, maybe I’m picky and expected too much, as I take tons of night photos and a couple of years ago it was one of the reasons I wanted to buy a full-frame camera. :) But for a 2014 camera the A7’s high-ISO degradation of detail (even in RAW files! they are not a “true” RAW sensor feed…), color and dynamic range is a bit too high. The A7S is much better in this area. Also the AF behavior is not perfect in low light…

Photo taken at night with Nikkor 50mm and f/1.4 – not too bad, but some grain visible and detail loss

Not the best lens adapters

The adapters I have for Nikon and M-mount are OK. Their build quality seems acceptable and I haven’t seen any problems yet. But they are expensive – 50-200 dollars for a piece of metal/plastic? It would also be nice to have some information in EXIF – for example an option to manually specify the focal length in use, or aperture detection. Also the Nikon/Sony A-mount/Canon adapters are too big (they cannot be smaller due to the lens design – the flange focal distance must match the DSLRs’) – what’s the point of having a small camera with big, unbalanced lenses?

Even with mediocre adapters can’t complain about MF lens handling and IQ

Kit zoom and tiny Nikkor 50mm 1.4D with adapter are too big… M-mount adapter and Voigtlander lens are much smaller and more useful.

Photo preview mode

I don’t really like how the magnification button is placed and that by default it magnifies a lot (to 100% image crop level). I didn’t see any setting to change it – I would expect progressive magnification and better button placement, like on Nikon cameras.

Wifi pairing with mobile

I don’t think I will use it a lot – but sometimes it could be cool for remote control. When I tried to set it up, it took me 5 minutes or so to figure it out – definitely not something to do when you just want to take a single nice photo with your camera placed on a bench at night.

 

What’s next?

In the next couple of days (hopefully before Siggraph, as after it I will have a lot more to write about!) I promise to add, in separate posts:

  • More sample photos from my NYC trip
  • Voigtlander Nokton 40mm f/1.4 mini review – I’m really excited about this lens and it definitely deserves a separate review!

So stay tuned!


Hair rendering trick(s)

I didn’t really plan to write this post as I’m quite busy preparing for Siggraph and enjoying the awesome Montreal summer, but after 3 similar discussions with developer friends I realized that the simple hair rendering trick I used during the prototyping stage at CD Projekt Red for Witcher 3 and Cyberpunk 2077 (I have no idea if the guys kept it, though) is worth sharing, as it’s not really obvious. It’s not about hair simulation or content authoring – I’m not really competent to talk about those subjects and they are really well covered by AMD TressFX and NVIDIA HairWorks (plus I know that lots of game rendering engineers work on that topic as well), so check them out if you need awesome-looking hair in your game. The trick I’m going to cover improves the quality of typical alpha-tested hair meshes used in deferred engines. Sorry, but no images in this post though!

Hair rendering problems

There are usually two problems associated with hair rendering that a lot of games and game engines (especially deferred renderers) struggle with.

  1. Material shading
  2. Aliasing and lack of transparency

The first problem is quite obvious – hair shading and material. Using standard Lambertian diffuse and Blinn/Blinn-Phong/microfacet specular models you can’t get the proper look of hair; you need a hair-specific, strongly anisotropic model. Some engines try to hack some hair properties into the G-Buffer and use branching / material IDs to handle it, but as John Hable recently wrote in his great post about the need for forward shading – it’s difficult to get hair right while fitting those properties into a G-Buffer.
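
For illustration, the classic starting point for such a model is a Kajiya-Kay style specular term driven by the hair tangent instead of the normal (Scheuermann’s shifted/dual-lobe variant [9] builds on the same idea) – a minimal HLSL sketch, not tied to any particular engine:

// Minimal Kajiya-Kay style anisotropic hair specular (single lobe, illustration only).
// T = hair strand tangent, L = light direction, V = view direction - all normalized.
float3 KajiyaKaySpecular(float3 T, float3 L, float3 V, float3 specColor, float exponent)
{
    float3 H     = normalize(L + V);
    float  TdotH = dot(T, H);
    float  sinTH = sqrt(saturate(1.0 - TdotH * TdotH)); // peaks when H is perpendicular to the strand
    return specColor * pow(sinTH, exponent);
}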

I’m also quite focused on performance, love low-level work and analyzing assembly, and it just hurts me to see branches and tons of additional instructions (sometimes up to hundreds…) and registers used to branch between various materials in a typical deferred shading shader. I agree that the performance impact may not be really significant compared to the bandwidth usage of fat G-Buffers and complex lighting models, but it’s still a cost you pay for the whole screen even though hair pixels don’t occupy much of the screen area.

One of the tricks we used on The Witcher 2 was faking hair specular using only the dominant light direction + per-character cube-maps and applying it as the “emissive” part of mesh lighting. It worked OK only because really great artists authored those shaders and cube-maps, but I wouldn’t say it is an acceptable solution for any truly next-gen game.

Therefore hair really needs forward shading – but how to do it efficiently, without paying the usual overdraw cost, and combine it with deferred shading?

Aliasing problem.

A nightmare for anyone using alpha-tested quads or strand meshes for hair. Lots of games can look just terrible because of this hair aliasing (the same applies to foliage like grass). Epic proposed to fix it by using MSAA, but this definitely increases the rendering cost and doesn’t solve all the issues. I tried to do it using alpha-to-coverage as well, but the result was simply ugly.

Far Cry 3 and some other games used a screen-space blur on hair strands along the hair tangent and it can improve the quality a lot, but usually the end parts of hair strands either still alias or bleed some background onto the hair (or the other way around) in a non-realistic manner.

The obvious solution here is again to use forward shading and transparency, but then we face another family of problems: overdraw, composition with transparents and problems with transparency sorting. Again, AMD TressFX solved it completely by using order-independent transparency algorithms on just the hair, but the cost and effort to implement it can be too much for many games.

Proposed solution

The solution I tried and played with is quite similar to what Crytek described in their GDC 2014 presentation. I guess we prototyped it independently in a similar time frame (mid-2012?). The Crytek presentation didn’t dig too much into details, so I don’t know how much it overlaps, but the core idea is the same. Another good reference is the old presentation by Scheuermann from ATI at GDC 2004! Their technique was different and based only on a forward shading pipeline, not aimed at being combined with deferred shading – but the main principle of multi-pass hair rendering and treating transparent and opaque parts separately is quite similar. A thing worth noting is that with DX11 and modern GPU-based forward lighting techniques it became much easier to do. :)

The proposed solution is a hybrid of deferred and forward rendering techniques. It is aimed at engines that still rely on alpha-tested strips for hair rendering and have smooth alpha transitions in the textures, but where most hair strands are still solid, not transparent and definitely not sub-pixel (if they are, forget about it and hope you have the perf to do MSAA and even supersampling…). You also need to have some form of forward shading in your engine, but I believe that’s the only way to go for the next gen… Forward+/clustered shading is a must for material variety and properly lit transparency – even in mainly deferred rendering engines. I really believe in the advantages of combining deferred and forward shading for different rendering scenarios within a single rendering pipeline.

Let me first describe the proposed steps:

  1. Render your hair into the G-Buffer with full specular occlusion / zero specularity. Do alpha testing in your shaders with a value Aref close to 1.0 (artist tweakable).
  2. Do your deferred lighting passes.
  3. Render a forward pass of hair specular with no alpha blending and z-testing set to “equal”. Do the alpha testing exactly like in step 1.
  4. Render a forward pass of hair specular and albedo for the transparent part of the hair with alpha blending (alpha rescaled from the 0–Aref range to 0–1), an inverted alpha test (keep alpha below Aref) and a regular depth test.

This algorithm assumes that you use a regular Lambertian hair diffuse model. You can easily swap it – feel free to modify steps 1 and 3: first draw black albedo into the G-Buffer and then add the different diffuse model in step 3.
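
To make the pass setup a bit more concrete, below is a minimal HLSL sketch of the pixel shader logic for passes 1, 3 and 4, with the render states described in comments. All names (textures, constants, the VSOutput layout) and the shading itself are made up for illustration – this is a sketch of the technique, not production code from any engine.

// Illustrative names only - the point is the alpha test / blend / depth state per pass.
Texture2D    g_HairAlbedo : register(t0);
SamplerState g_LinearWrap : register(s0);

cbuffer HairConstants : register(b0)
{
    float  g_AlphaRef;    // Aref - artist tweakable, close to 1.0
    float3 g_LightDir;    // single light, for brevity
    float3 g_LightColor;
};

struct VSOutput
{
    float4 pos     : SV_Position;
    float2 uv      : TEXCOORD0;
    float3 normal  : TEXCOORD1;
    float3 tangent : TEXCOORD2;
    float3 viewDir : TEXCOORD3;
};

// Placeholder anisotropic specular - e.g. the Kajiya-Kay term sketched earlier.
float3 HairSpecular(VSOutput i)
{
    float3 H     = normalize(g_LightDir + normalize(i.viewDir));
    float  sinTH = sqrt(saturate(1.0 - pow(dot(normalize(i.tangent), H), 2.0)));
    return g_LightColor * pow(sinTH, 64.0);
}

// Pass 1 (G-Buffer): opaque hair core. Depth write ON, no blending.
// Alpha test keeps only nearly solid texels; specular intensity written as zero.
float4 HairGBufferPS(VSOutput i) : SV_Target0   // only the albedo RT shown, for brevity
{
    float4 albedo = g_HairAlbedo.Sample(g_LinearWrap, i.uv);
    clip(albedo.a - g_AlphaRef);                // keep alpha >= Aref
    return float4(albedo.rgb, 0.0);             // .a = specular intensity = 0
}

// Pass 3 (after deferred lighting): forward specular on the opaque core only.
// Depth test EQUAL, depth write OFF, no alpha blending (the specular is simply
// added on top of the deferred result, e.g. with additive blending), same alpha test.
float4 HairForwardSpecPS(VSOutput i) : SV_Target
{
    float4 albedo = g_HairAlbedo.Sample(g_LinearWrap, i.uv);
    clip(albedo.a - g_AlphaRef);
    return float4(HairSpecular(i), 1.0);
}

// Pass 4: transparent fringe. Regular depth test, depth write OFF, standard
// alpha blending (SRC_ALPHA / INV_SRC_ALPHA), inverted alpha test, and alpha
// rescaled from [0, Aref] to [0, 1].
float4 HairFringePS(VSOutput i) : SV_Target
{
    float4 albedo = g_HairAlbedo.Sample(g_LinearWrap, i.uv);
    clip(g_AlphaRef - albedo.a);                // keep alpha < Aref
    float  ndl = saturate(dot(normalize(i.normal), g_LightDir));
    float3 lit = albedo.rgb * g_LightColor * ndl + HairSpecular(i);
    return float4(lit, saturate(albedo.a / g_AlphaRef));
}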

Advantages and disadvantages

There are lots of advantages to this trick/algorithm – even with non-obvious hair mesh topologies I didn’t see any problems with alpha sorting, because the alpha-blended areas are small and usually on top of solid geometry. Also, because most of the rendered hair geometry writes depth values, it works OK with particles and other transparents. You avoid hacking your lighting shaders, branching and hardcore VGPR counts. You get smooth, aliasing-free results and a proper, arbitrary shading model (no need to pack material properties). It also avoids any excessive forward shading overdraw (z-testing set to equal and later regular depth testing against an almost complete scene). While there are multiple passes, not all of them need to read all the textures (for example there is no need to re-read albedo after step 1, the G-Buffer pass can use a different normal map, and there is no need to read the specular/gloss mask). The performance numbers I had were really good – as hair usually covers a very small part of the screen except for cutscenes – and the proposed solution meant zero overhead/additional cost for regular mesh rendering or lighting.

Obviously, there are some disadvantages. First of all, there are 3 geometry passes for hair (one could get it down to 2 by combining steps 3 and 4, but at the cost of some of the advantages). It can be too much, especially when using spline/tessellation-based, very complex hair – but this is simply not an algorithm for such cases, they really do need more complex solutions… Again, see TressFX. There can be a problem with the lack of alpha blend sorting and with combining with particles – but it depends a lot on the mesh topology and how much of it is alpha blended. Finally, so many passes complicate the renderer pipeline, and debugging can be problematic as well.

 

Bonus hack for skin subsurface scattering

As a bonus, here is a description of how, in a very similar manner, we hacked skin shading in The Witcher 2.

We couldn’t really separate our speculars from diffuse into 2 buffers (we already had way too many local lights and a big lighting cost; increasing BW on those passes wouldn’t have helped for sure). We didn’t have ANY forward shading in the Red Engine at that time either! For skin shading I really wanted to do SSS without blurring either the albedo textures or the speculars. Therefore I came up with the following “hacked” pipeline.

  1. Render the skin texture with white albedo and zero specularity into the G-Buffer.
  2. During the lighting passes, always write the specular response not modulated by specular color or material properties into the alpha channel of the lighting buffer (using separate alpha blending).
  3. After all lights we had the diffuse response in RGB and the specular response in A – only for skin.
  4. Do a typical bilateral separable screen-space blur (Jimenez) on skin stencil-masked pixels. For masking skin I remember trying both a single bit from the G-Buffer and a “hacky” test for zero specularity/white albedo in the G-Buffer – both worked well, though I don’t remember which version we shipped.
  5. Render the skin meshes again – multiplying the RGB of the blurred lighting pixels by albedo and adding the specular response times the specular intensity.
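
A rough HLSL sketch of what steps 2 and 5 boil down to – illustrative names and deliberately simplified lighting, not the actual Witcher 2 shaders:

// Step 2 - local light pass for skin. The diffuse response accumulates in RGB, the raw,
// unmodulated specular response accumulates in A via separate/dual blending
// (both RGB and A set to additive ONE/ONE blending, accumulating over all lights).
cbuffer SkinLight : register(b0)
{
    float3 g_LightDirWS;
    float3 g_LightColor;
    float  g_Gloss;
};

struct LightPSInput
{
    float4 pos     : SV_Position;
    float3 normal  : TEXCOORD0;
    float3 viewDir : TEXCOORD1;
};

float4 SkinLightPS(LightPSInput i) : SV_Target
{
    float3 N = normalize(i.normal);
    float3 V = normalize(i.viewDir);
    float3 H = normalize(g_LightDirWS + V);
    float3 diffuse = g_LightColor * saturate(dot(N, g_LightDirWS)); // albedo is white in the G-Buffer
    float  spec    = pow(saturate(dot(N, H)), g_Gloss);             // NOT multiplied by spec color/intensity
    return float4(diffuse, spec);
}

// Step 5 - after the separable bilateral screen-space blur of the skin-masked lighting
// buffer: re-render the skin meshes and recombine with the unblurred textures.
Texture2D    g_BlurredLighting  : register(t0);  // RGB = blurred diffuse response, A = specular response
Texture2D    g_AlbedoTex        : register(t1);
Texture2D    g_SpecIntensityTex : register(t2);
SamplerState g_LinearClamp      : register(s0);

cbuffer SkinRecombine : register(b1)
{
    float3 g_SkinSpecColor; // the global, per-environment specular color multiplier mentioned below
};

struct MeshPSInput
{
    float4 pos : SV_Position;
    float2 uv  : TEXCOORD0;
};

float4 SkinRecombinePS(MeshPSInput i) : SV_Target
{
    float4 lighting = g_BlurredLighting.Load(int3(i.pos.xy, 0));
    float3 albedo   = g_AlbedoTex.Sample(g_LinearClamp, i.uv).rgb;
    float  specInt  = g_SpecIntensityTex.Sample(g_LinearClamp, i.uv).r;
    // Diffuse gets the blurred (scattered) response; specular is re-applied with the per-pixel intensity.
    return float4(lighting.rgb * albedo + lighting.a * specInt * g_SkinSpecColor, 1.0);
}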

The main disadvantage of this technique is losing all specular color from lighting (especially visible in dungeons), but AFAIK there was a global, per-environment, artist-specified specular color multiplier for skin. A hack, but it worked. A second, smaller disadvantage was the higher cost of the SSS blur passes (more surfaces to read in order to mask the skin).

On more modern engines and current hardware I honestly wouldn’t bother – use separate lighting buffers for the diffuse and specular responses instead – but I hope it can inspire someone to creatively hack their lighting passes. :)

References

[1] http://www.filmicworlds.com/2014/05/31/materials-that-need-forward-shading/

[2] http://udn.epicgames.com/Three/rsrc/Three/DirectX11Rendering/MartinM_GDC11_DX11_presentation.pdf

[3] http://www.crytek.com/download/2014_03_25_CRYENGINE_GDC_Schultz.pdf

[4] http://developer.amd.com/tools-and-sdks/graphics-development/graphics-development-sdks/amd-radeon-sdk/

[5] https://developer.nvidia.com/hairworks 

[6] “Forward+: Bringing Deferred Lighting to the Next Level” Takahiro Harada, Jay McKee, and Jason C.Yang https://diglib.eg.org/EG/DL/conf/EG2012/short/005-008.pdf.abstract.pdf

[7] “Clustered deferred and forward shading”, Ola Olsson, Markus Billeter, and Ulf Assarsson http://www.cse.chalmers.se/~uffe/clustered_shading_preprint.pdf

[8] “Screen-Space Perceptual Rendering of Human Skin“, Jorge Jimenez, Veronica Sundstedt, Diego Gutierrez

[9] “Hair Rendering and Shading“, Thorsten Scheuermann, GDC 2004


C#/.NET graphics framework on GitHub + updates

As I promised, I posted my C#/.NET graphics framework (more about it and the motivation behind it here) on GitHub: https://github.com/bartwronski/CSharpRenderer

This is my first GitHub submit ever and my first experience with Git, so there is a possibility I didn’t do something properly – thanks for your understanding!

The list of changes since the initial release is quite big: tons of cleanup + some crash fixes in previously untested conditions, plus some features:

Easy render target management

I added helper functions to manage the lifetime of render targets and allow render target re-use. Using render target “descriptors” and the RenderTargetManager you request a texture with all RT and shader resource views, and it is returned from a pool of available surfaces – or lazily allocated when no surface fitting the given descriptor is available. It saves some GPU memory and makes sure the code is 100% safe when changing configurations – no NULL pointers when enabling previously disabled code paths or adding new ones, etc.

I also added a very simple “temporal” surface manager that, for every surface created with it, stores N different physical textures for the requested N frames. All temporal surface pointers are updated automatically at the beginning of a new frame. This way you don’t need to hold state or ping-pong in your rendering pass code, and the code becomes much easier to follow, e.g.:

RenderTargetSet motionVectorsSurface = TemporalSurfaceManager.GetRenderTargetCurrent("MotionVectors");
RenderTargetSet motionVectorsSurfacePrevious = TemporalSurfaceManager.GetRenderTargetHistory("MotionVectors");
m_ResolveMotionVectorsPass.ExecutePass(context, motionVectorsSurface, currentFrameMainBuffer);

Cubemap rendering, texture arrays, multiple render target views

Nothing super interesting, but it allows much easier experimentation with algorithms like GI (see the following point). In my backlog there is a task to add support for geometry shader and instancing-based amplification of data for cubemaps (with proper culling etc.) that should speed it up by an order of magnitude, but it wasn’t my highest priority.

Improved lighting – GI baker, SSAO

I added 2 elements: temporally supersampled SSAO and simple pre-baked global illumination + a fully GPU-based naive GI baker. When adding those passes I was able to really stress my framework and check if it works as it is supposed to – and I can confirm that adding new passes was extremely quick and iteration times were close to zero – the whole GI baker took me just one evening to write.

GI is stored in very low-resolution, currently uncompressed volume textures – three 1MB RGBA16 surfaces storing incoming flux in 2nd-order SH (not preconvolved with the cosine lobe – so not irradiance). There are some artifacts due to the low resolution of the volume (64 x 32 x 64), but for a cost of 3MB for such a scene I guess it’s good enough. :)

It is calculated by doing a cubemap capture at every 3D grid voxel, calculating the irradiance contribution of every texel and projecting it onto SH. I made sure (or I hope so! ;) but it seems to converge properly) that it is energy conserving, so N-bounce GI is achieved by simply feeding the previous N-1 bounce results into the GI baker and re-baking. I simplified it (plus improved baking times – it converges close to the asymptotic value faster) even a bit more, as the baker uses partial results, but with N -> oo it should converge to the same value and be unbiased.
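
For the curious, this is roughly what the projection step looks like – a hedged HLSL sketch (not the actual CSharpRenderer code) that projects a radiance cubemap captured at one voxel onto 4 SH coefficients per color channel, which would match the three RGBA volume surfaces mentioned above (assuming one surface per color channel); the cube face orientation convention may need adjusting:

// Real SH basis for bands 0 and 1 - 4 coefficients.
float4 SHBasis(float3 d)
{
    return float4(0.282095,
                  0.488603 * d.y,
                  0.488603 * d.z,
                  0.488603 * d.x);
}

// Approximate solid angle of a cubemap texel at face coords (u, v) in [-1, 1] for an N x N face.
float TexelSolidAngle(float u, float v, uint N)
{
    float t = u * u + v * v + 1.0;
    return 4.0 / (N * N * t * sqrt(t));
}

// Maps (face, u, v) to a direction; D3D face order +X,-X,+Y,-Y,+Z,-Z (convention may differ).
float3 CubeFaceDirection(uint face, float u, float v)
{
    switch (face)
    {
        case 0:  return float3( 1.0,   -v,   -u);
        case 1:  return float3(-1.0,   -v,    u);
        case 2:  return float3(   u,  1.0,    v);
        case 3:  return float3(   u, -1.0,   -v);
        case 4:  return float3(   u,   -v,  1.0);
        default: return float3(  -u,   -v, -1.0);
    }
}

// Projects the captured radiance onto SH - one float4 of coefficients per color channel.
void ProjectCubemapToSH(TextureCube<float4> radianceCube, SamplerState smp, uint N,
                        out float4 shR, out float4 shG, out float4 shB)
{
    shR = 0.0;
    shG = 0.0;
    shB = 0.0;
    for (uint face = 0; face < 6; ++face)
    {
        for (uint y = 0; y < N; ++y)
        for (uint x = 0; x < N; ++x)
        {
            float u = 2.0 * (x + 0.5) / N - 1.0;
            float v = 2.0 * (y + 0.5) / N - 1.0;
            float3 dir      = normalize(CubeFaceDirection(face, u, v));
            float3 radiance = radianceCube.SampleLevel(smp, dir, 0).rgb;
            float4 basis    = SHBasis(dir) * TexelSolidAngle(u, v, N);
            shR += radiance.r * basis;
            shG += radiance.g * basis;
            shB += radiance.b * basis;
        }
    }
}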

It contains “sky” ambient lighting pre-baked as well, but I will probably split those terms and store them separately, quite possibly at a different storage resolution. This way I could simply “normalize” the flux and make it independent of sun/sky color and intensity (which could then be applied at runtime). There are tons of other simple improvements (compressing textures, storing luma/chroma separately in different-order SH, optimizing the baker etc.) and I plan to add them gradually, but for now the image quality is very good (for something without normal maps and speculars yet ;) ).

Improved image quality – tone-mapping, temporal AA, FXAA

Again nothing super interesting – rather extremely simple and usually unoptimal code just to help debug other algorithms (and make their presentation easier). Again, adding such features was a matter of minutes and I can confirm that my framework so far succeeds in its design goal.

Constant buffer constants scripting

A feature that I’m not 100% happy with.

For me, when working with almost anything in games – from graphics and shader programming through materials/effects to gameplay scripting – the biggest problem is finding the proper boundary between data and code. Where should the splitting point be? Should code drive data, or the other way around? In the multiple engines I have worked with (RedEngine, Anvil/Scimitar, Dunia, plus some very small experience just to familiarize myself with CryEngine, Unreal Engine 3 and Unity3D), it was in a different place in every engine.

Coming back to shaders, a usually tedious task is putting some stuff on the engine side in code and some in the actual shaders, while both parts must match 100%. This not only makes it more difficult to modify such stuff or add new properties, but also makes the code harder to read and follow when trying to understand the algorithms, as it is split between multiple files – not necessarily by functionality, but for example by performance (e.g. precalculating stuff on the CPU and putting it into constants).

Therefore my final goal would be to have one meta shader language and, using some meta decorators, specify the frequency of every code part – for example one part executed per frame, another per viewport, another per mesh, per vertex, per pixel etc. I want to go in this direction, but I didn’t want to get into writing parsers and lexers yet, so temporarily I used Lua (extremely fast to integrate and quite decently performing).

An example would be one of my constant buffer definitions:

cbuffer PostEffects : register(b3)
{
 /// Bokeh
 float cocScale; // Scripted
 float cocBias; // Scripted
 float focusPlane; // Param, Default: 2.0, Range:0.0-10.0, Linear
 float dofCoCScale; // Param, Default: 0.0, Range:0.0-32.0, Linear
 float debugBokeh; // Param, Default: 0.0, Range:0.0-1.0, Linear
 /* BEGINSCRIPT
 focusPlaneShifted = focusPlane + zNear
 cameraCoCScale = dofCoCScale * screenSize_y / 720.0 -- depends on focal length & aperture, rescale it to screen res
 cocBias = cameraCoCScale * (1.0 - focusPlaneShifted / zNear)
 cocScale = cameraCoCScale * focusPlaneShifted * (zFar - zNear) / (zFar * zNear)
 ENDSCRIPT */
};

We can see that 2 constant buffer properties are scripted – there is zero code on the C# side that calculates them; instead a Lua script is executed every frame when we “compile” the constant buffer for use by the GPU.

UI grouping by constant buffer

A simple change to improve the readability of the UI. Right now the UI code is the most temporary, messy part and I will change it completely for sure, but for the time being I focused on its usability.

Further hot-swap improvements

Everything in shader files and related to shaders is now hot-swappable – constant buffer definitions, includes, constant scripts. I can’t imagine working without it any more – it definitely helps iterating faster.

Known issues / requirements

I tested only the x64 version; the 32-bit one may not be configured properly and is for sure lacking proper DLL versions.

One known issue (checked on a different machine with Windows 7 / x64 / VS2010) is a runtime exception complaining about a missing “lua52.dll” – it is probably caused by the lack of the Visual Studio 2012+ runtime.

Future plans

While I update stuff every week/day in my local repo, I don’t plan to do any public commits (except for something either cosmetic or a serious bug/crash fix) until probably late August. I will be busy preparing for my Siggraph 2014 talk and plan to release the source code for the talk using this framework as well.


Coming back to film photography

Yeah, I finally managed to go back to my favourite pastime hobby – film/analog photography, which I started when I was 10 years old with the following camera:

Lomo Smena 8M

Now I’m a bit older and my photo gear has changed as well (but I really miss this camera!). :) So this is what I’m using at the moment:

Why film and not digital? Don’t get me wrong. I love digital photography for its quality, ease of use and the possibility to document events and reality. It’s also very convenient on holiday (especially something small like my Fuji X100). However, lots of people (including me) find it easier to take more “artistic” / aesthetically better photos when working with film, especially on medium format – just due to the fact that you have only 15, 12 or 10 photos per roll (depending on whether it’s 645, 6×6 or 6×7), so you think about every shot and composition and try to make the best of them. Also, shooting B&W is quite an interesting challenge: we are easily attracted to colors and shoot photos based on them, while in B&W that’s impossible and you have to look for interesting patterns, geometric elements, the surfaces of objects and the relations between them. An interesting way to try to “rewire” your brain and sense of aesthetics and learn a new skill.

Finally, developing your film by yourself is an amazing experience – you spend an hour in the darkroom, fully relaxed, carefully treating the film and obeying all the rules, and still you don’t know what the outcome will be – maybe no photo will be good at all. A great and relaxing experience for all the OCD programmer guys. ;)

 

Some photos from the simply awesome Montreal summer – nothing special, just a test roll from the Mamiya I brought from Poland (and it turns out it underexposes – probably an old battery; I will need to calibrate it properly with a light meter…).


Runtime editor-console connection in The Witcher 2

During Digital Dragons, among tons of inspiring talks and discussions, I was asked by a Polish game developer (he and his team are making a quite cool early-access Steam economy/strategy game about space exploration programmes that you can check out here) to write a bit more about the tools we had on The Witcher 2 for connectivity between the game editor and the final game running on a console. As increasing productivity and minimizing iteration times is one of my small obsessions (I believe that fast iteration times, high productivity and efficient, robust pipelines are much, much more important than adding tons of shiny features), I agreed that it is quite a cool topic to write about. :) While I realize that lots of other studios probably have similar pipelines, multiple other (especially smaller) developers can benefit from it. As I don’t like sweeping problems under the carpet, I will also discuss the disadvantages and limitations of the solution we had at CD Projekt RED at the time.

Early motivation

The Xbox 360 version of The Witcher 2 was the first console game done 100% internally by CD Projekt RED. At that time the X360 was already almost 7 years old and far behind the capabilities of the modern PCs for which we had originally developed the game. The whole studio – artists, designers and programmers – was aware that we would need to cut down and change a lot of stuff to make the game run on consoles – but we had to do it wisely so as not to sacrifice the high quality of the player experience our game was known for. Therefore the programming team, apart from porting and optimizing, had to design and implement multiple tools to aid the porting process.

Among many different tools, a need for a connection between the game editor and consoles appeared. There were 2 specific topics that made us consider building such tools:

Performance profiling and real-time tweaking on console

The PC version sometimes had insane amounts of local lights. If you look at the following scene – one of the game’s opening scenes – at specific camera angles it had up to 40 smaller or bigger local deferred lights on PC – and there were even heavier-lit scenes in our game!

Yeah, crazy, but how was it even possible?

Well, our engine didn’t have any kind of global illumination or baking solution – one of the early design decisions was that we wanted to have everything dynamic, switchable, changeable (quite important for such a nonlinear game – most locations had many “states” that depended on game progress and player decisions) and animated.

Therefore, GI was faked by our lighting and environment artists by placing many lights of various kinds – additive, modulative, diffuse-only, specular-only, character- or env-only, with different falloffs, gobo lights, different types of animation on both light brightness and position (for shadow-casting lights this gives those awesome-looking torches and candles!) etc. Especially interesting were the “modulative” lights that subtracted energy from the scene to fake large-radius AO / shadows – such a small-radius modulative light is cheaper than rendering a shadow map and gives nice, soft light occlusion.
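
To illustrate the idea (this is not the actual Red Engine shader, just a minimal sketch with made-up names): a modulative light can be drawn as a regular deferred light volume, but with multiplicative blending into the lighting buffer, so it can only remove energy:

// "Modulative" AO/shadow light - drawn as a light volume over the lighting buffer
// with multiplicative blending: SrcBlend = ZERO, DestBlend = SRC_COLOR,
// so framebuffer = framebuffer * output.rgb.
Texture2D<float> g_DepthTex : register(t0);

cbuffer ModulativeLight : register(b0)
{
    float4x4 g_InvViewProj;
    float2   g_ScreenSize;
    float3   g_LightPosWS;
    float    g_LightRadius;
    float    g_OcclusionStrength; // how much energy is removed at the light's center
};

float3 ReconstructWorldPos(float2 pixelPos)
{
    float  depth = g_DepthTex.Load(int3(pixelPos, 0));
    float2 ndc   = (pixelPos + 0.5) / g_ScreenSize * float2(2.0, -2.0) + float2(-1.0, 1.0);
    float4 posWS = mul(float4(ndc, depth, 1.0), g_InvViewProj); // row-vector convention assumed
    return posWS.xyz / posWS.w;
}

float4 ModulativeLightPS(float4 svPos : SV_Position) : SV_Target
{
    float3 posWS     = ReconstructWorldPos(svPos.xy);
    float  falloff   = saturate(1.0 - distance(posWS, g_LightPosWS) / g_LightRadius);
    float  occlusion = g_OcclusionStrength * falloff * falloff; // simple smooth falloff
    return float4((1.0 - occlusion).xxx, 1.0);                  // multiplied into the lit scene
}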

All of this goes totally against the current trend of doing everything “physically correct”, and while I see almost only benefits in the PBR approach and believe in coherency etc., I also trust great artists and believe they can achieve very interesting results when crossing those physical boundaries and having “advanced mode” magical knobs and tweaks for special cases – just like painters and other artists who are only inspired by reality.

Anyway, having 40+ lights on screen (very often overlapping and causing massive lighting overdraw) was definitely a no-go on X360, even after we optimized our lighting shaders and pipelines a lot. It was hard for our artists to decide which lights should be removed and which ones added significant cost (large overdraw / covered area). Furthermore, they wanted to be able to decide in which specific camera takes a big lighting cost was acceptable – even 12ms of lighting is acceptable if whole-scene mesh rendering takes under 10ms – to make the game as beautiful as possible we had flexible, scene-dependent budgets.

All of this would IMHO be impossible to simulate with any offline tools – visualizing light overdraw is easy, but seeing the final cost together with the scene drawing cost is not. Therefore we decided that artists needed a way to tweak, add, remove, move and change lights at runtime and see the performance change immediately on screen, and that we should create tools to support it.

Color precision and differences

Because of many performance considerations, on X360 we went with an RGBA 1010102 lighting buffer (with some exponent bias to move it to a “similar range” as on PC). We also changed our color grading algorithms, added a filmic tone mapping curve and adapted gamma curves for TV display. All of this had a simply devastating effect on our existing color precision – especially moving from 16-bit lighting to 10-bit while having multiple lighting, fog and particle passes – as you might expect, the difference was huge. Also, our artists wanted some estimate of how the game would look on TVs, with a different and more limited color range etc. – on the PC version most of them used high-quality, calibrated monitors to achieve consistency of texturing and color work across the whole studio. To both preview this TV look while tweaking color grading values and to fight the banding, again they wanted a live preview of all of their tweaks and changes at runtime. I think it was the easier way to go (both in terms of implementation and code maintenance time) than trying to simulate the look of the X360 code path in the PC rendering path.

Obviously, we ended up with many more benefits that I will try to summarize.

Implementation and functionality

To implement this runtime console-editor connection, we wrote a simple custom command-based network protocol. 

Our engine and editor already had support for network-based debugging for the scripting system. We had a custom, internally written C-like scripting system (which automatically extended the RTTI, had access to all of the RTTI types, was aware of game saving/loading and had built-in support for state machines – in general quite an amazing piece of code and a well-designed system, probably worth a separate write-up). This scripting system even had its own small IDE, a debugger with breakpoints and a sampling profiler.

Gameplay programmers and script designers would connect with this IDE to a running editor or game, could debug anything, or even hot-reload all of the scripts and see the property grid layout change in the editor if they added/removed or renamed a dynamic property! Side note: everyone experienced with maintaining complex systems can guess how often those features got broken or crashed the editor after even minor changes… Which is unfortunate – it discouraged gameplay scripters from using those features, so we got fewer bug reports and worked on fixing them even less frequently… The lesson learned is as simple as my advice – if you don’t have a huge team to maintain every single feature, KISS.

Having such a network protocol with support for commands sent both ways already in place, it was super easy to open another listener on another port and start listening for different types of messages!

I remember it took only around one day to get it running and the first couple of commands implemented. :)

So let’s see what kinds of commands we had:

Camera control and override

Extremely simple – a command that hijacked the in-game camera. After connecting from the editor and enabling camera control, every in-editor camera move was just sent with all the camera properties (position, rotation, near/far planes and FOV) and serialized through the network.

The benefit of this feature was that it not only made working with all the remaining features easier – it also allowed debugging streaming, checking which objects were not present in the final build (and why) and, in general, debugging our cooking/exporting system. If something was not present on screen in the final console build, an artist or level designer could analyze why – is it also missing in the editor, does it have proper culling flags, is it assigned to the proper streaming layer etc. – and either fix it or assign a systemic bug to the programming team.

Loading streaming groups / layers

A simple command that sent a list of layers or layer groups to load or unload (as they got un/loaded in the editor), passed directly to the streaming system. Again, it allowed performance debugging and profiling of the streaming and memory cost – optimizing CPU culling efficiency, minimizing the memory cost of loaded objects that were not visible, etc.

While in theory cool and helpful, I must admit that this feature didn’t work 100% as expected and wasn’t very useful or commonly used in practice for those goals. This was mostly because a lot of our streaming was affected by layers being hidden/unhidden by various gameplay conditions. As I mentioned, we had a very non-linear game and streaming was also used for achieving some gameplay goals. I think it was kind of a misconception and bad design in our entity system (lack of proper separation between object logic and visual representation), but we couldn’t easily change it for the Xbox 360 version of The Witcher 2.

Lights control and spawning

Another simple feature. We could spawn new lights at runtime, move existing ones and modify most of their properties – radius, decay exponent, brightness, color, “enabled” flag etc. Every time a light property was modified or a new light component was added to the game world, we sent a command over the network that replicated the event on the console.

A disadvantage of such simple replication was that if we restarted the game running on the console, we would lose all those changes. :( In such cases either a save + re-export (so cooking the whole level again) or redoing the changes was necessary.

Simple mesh moving

Very similar to the previous one. We had many “simple” meshes in our game (ones without any gameplay logic attached) that got exported to a special, compressed list to avoid the memory overhead of storing whole entities and entity templates – and they could be moved without re-exporting the whole level. As we used dynamic occlusion culling and a dynamic scene hierarchy structure (a beam-tree), we didn’t need to recompute anything – it just worked.

Environment presets control

The most complex feature. Our “environment system” was a container for multiple time-of-day control curves for all post-effects, sun and ambient lighting, light groups (under a certain mood, dynamic lights had different colors), fog, color grading etc. It was very complex, as it supported not only dynamic time of day, but also multiple presets active at once with different priorities, overriding specific values only per environment area. To be able to control the final color precision on X360 it was extremely important to allow editing them at runtime. IIRC, when we started editing them while in console connection mode, the whole environment system on the console got turned off and we interpolated and passed all parameters directly from the editor.

Reloading post-process (hlsl file based) shaders

Obvious, simple, and I believe almost every engine has it implemented. For me it is obligatory in order to work productively, therefore I understand how important it is to deliver similar functionality to teams other than graphics programmers. :)

What kinds of useful features we lacked

While our system was very beneficial for the project – and having seen its benefits, I will opt for something similar on every next project at any company – we didn’t implement many other features that would have been just as helpful.

Changing and adding pre-compiled objects

Our system didn’t support adding or modifying any objects that got pre-compiled during export – mostly meshes and textures. It would have been useful to quickly swap textures or meshes at runtime (never-ending problems with dense foliage performance, anyone? :) – so far the biggest perf problem on any project I have worked on), but our mesh and texture caches were static. It would have required partial dynamism of those cache files and systems + adding more support for export in the editor (for exporting we didn’t use the editor, but a separate “cooker” process).

Changing artist-authored materials

While we supported recompiling the HLSL-based shaders used for post-effects, our system didn’t support swapping artist-authored particle or material shaders. Quite similar to the previous point – we would have needed to add more dynamism to the shader cache system… It wouldn’t have been very hard to add if we hadn’t already been deep in “game shipping” mode.

Changing navmesh and collision

While we were able to move some “simple” static objects, the navmesh and gameplay collision didn’t change. It wasn’t a very big deal – artists almost never played on those modified levels – but it could have made the lives of level and quest designers much easier – just imagine hitting a “blocker” or wrong collision during a playthrough, quickly connecting with the editor, moving it and immediately checking the result – without the need to restart a whole complex quest or start it in the editor. :)

Modifying particle effects

I think that being able to change particle system behaviors, curves and properties at runtime would be really useful for FX artists. Effects are often hard to balance – there is a very thin line of compromise between quality and performance due to multiple factors: resolution of the effect (half vs full res), resolution of flipbook textures, overdraw, alpha values and alpha testing etc. Being able to tweak such properties on a paused game during, for instance, an explosion could be a miracle cure for frame timing spikes during explosions, smoke or spell effects. Still, we didn’t do anything about it due to the complexity of particle systems in general and the multiple factors to take into account… I was thinking about simply serializing all the properties, replicating them over the network and deserializing them – it would have worked out of the box – but there was no time and we had many other, more important tasks to do.

Anything related to dynamic objects

While our system worked great on environment objects, we didn’t have anything for dynamic objects like characters. To be honest, I’m not really sure it would have been possible to implement easily without a major refactor of many elements. There are many different systems that interact with each other, many global managers (which may not be the best “object-oriented” design strategy, but are often useful for creating acceleration structures and as a part of data/structure-oriented design), many objects that need their state captured, serialized and then recreated after reloading some properties – definitely not an easy task, especially under console memory constraints. A nasty side effect of this gap was something I mentioned earlier – problems with modifying semi-dynamic/semi-static objects like doors, gameplay torches etc.

Reloading scripts on console

While our whole network debugging code was designed in the first place to enable script reloading between the editor and the scripting IDE, it was impossible to do on console the way it was implemented. The console version of the game had a simplified, stripped-down RTTI system that didn’t support (almost) any dynamism, and moving some editor code there would have meant de-optimizing runtime performance. It could have been part of a “special” debug build, but the point of our dynamic console connection system was to be able to connect it simply to any running game. Also, capturing state while the RTTI gets reinitialized + script code reloaded could again be more difficult due to memory constraints. Still, this topic quite fascinates me and would be kind of the ultimate challenge and goal for such a connection system.

Summary

While our system was lacking multiple useful features, it was extremely easy and fast to implement (a couple of days total?). Having a live editor-console connection is very useful and I’m sure the time spent developing it paid off many times over. It provides a much more “natural” and artist-friendly interface than any in-game debug menu, and allows for faster work and implementing much more complex debug/live editing features. It not only aids debugging and optimization, but if it were a bit more complex, it could even accelerate the actual development process. When your iteration times on various game aspects get shorter, you are able to do more iterations on everything – which gives you not only more content in the same time/for the same cost, but also a much more polished, bug-free and fun-to-play game! :)


Digital Dragons 2014 slides

This Friday I gave a talk at Digital Dragons 2014.

It was a presentation with tons of new, unpublished content and details about our:

  • Global Illumination solution – full description of baking process, storing data in 2D textures and runtime application
  • Temporal supersampled SSAO
  • Multi-resolution ambient occlusion by adding “World Ambient Occlusion” (an Assassin’s Creed 3 technique)
  • Procedural rain ripple effect using compute and geometry shaders
  • Wet surfaces materials approximation
  • How we used screen-space reflections to enhance the look of wet surfaces
  • GPU driven rain simulation
  • Tons of videos and debug displays of every effect and procedural textures!

If you have seen my GDC 2014 talk, there is probably still lots of new content for you – I tried to avoid reusing my GDC talk content as much as possible.

 

Here (and on the publications page) are my slides for the Digital Dragons 2014 conference:

PPTX Version, 226MB - but worth it (tons of videos!)

PPTX Version with extremely compressed videos, 47MB

PDF Version with sparse notes, 6MB

PDF Version, no notes, 7MB

 

 


Temporal supersampling pt. 2 – SSAO demonstration

This weekend I’ve been working on my Digital Dragons 2014 presentation (a great Polish game developer conference I was invited to – if you are somewhere around central Europe in early May, be sure to check it out) and finally got to take some screenshots/movies of temporal supersampling in action on SSAO. I promised to take them quite a while ago in my previous post about temporal techniques and almost forgot. :)

To be honest, I never really had time to properly “benchmark” its quality increase when developing for Assassin’s Creed 4 – it came very late in the production, actually for a title update/patch – the same patch as the increased PS4 resolution and our temporal anti-aliasing. I had motion vectors available, so I simply plugged it in, tweaked the params a bit, double-checked the profilers, asked other programmers and artists to help me assess the quality increase (everybody was super happy with it) and review it, sent it for full testing and later submitted it.

Now I have taken my time to do proper before/after screenshots and the results are surprising even for me.

Let’s have a look at comparison screenshots:

 

Scalable Ambient Obscurance without temporal supersampling / smoothing

Scalable Ambient Obscurance with temporal supersampling / smoothing

On a single image with contrast boosted (click it to see in higher res):

Scalable Ambient Obscurance with/without temporal supersampling – comparison

A quite decent improvement (if we take into account the negligible runtime cost), right? We can see that the ugly pattern/noise around foliage disappeared and the undersampling behind the ladder became less visible.

But it’s nothing compared to how it behaves in motion – be sure to watch it in fullscreen!

(if you see poor quality/compression on wordpress media, check out direct link)

I think that in motion the difference is huge – orders of magnitude better! It fixes all the issues typical of SSAO algorithms that happen because of undersampling. I will explain in a minute why it gets so much better in motion.

You can see some artifacts in the video (minor trailing / slow appearance of some information), but I don’t know if I would notice them without knowing what to look for (and with lighting applied, our SSAO was quite subtle – which is great and exactly how SSAO should look – we had great technical art directors :) ).

Let’s have a look at what we did to achieve it.

 

Algorithm overview

Our SSAO was based on the Scalable Ambient Obscurance algorithm by McGuire et al. [1]

The algorithm itself has very good console performance (around 1.6ms on consoles for full-res AO + two passes of bilateral blur!), decent quality, and is able to calculate ambient obscurance with quite a high radius (up to 1.5m in our case) at a fixed performance cost. The original paper presents multiple interesting concepts/tricks, so be sure to read it!

We plugged our temporal supersampling into the AO pass of the algorithm – we used 3 rotations of the SSAO sampling pattern (which was unique for every screen-space pixel position), alternating every frame (so after 3 frames you get the same pattern again).

To combine them, we simply took the previous SSAO buffer (so it effectively became an accumulation texture), offset it based on the motion vectors, read it and, after deciding on rejection or acceptance (a smooth weight), combined it with the current result using a fixed exponential decay (a weight of 0.9 for the history accumulation buffer on acceptance, going down to zero on rejection) and output the AO.

For a static image this meant tripling the effective sample count and supersampling – which is nice. But given that every screen-space pixel has a different sampling pattern, the number of samples contributing to the final image when moving the game camera could be hundreds of times higher! With the camera moving and pixel reprojection we were getting more and more different sampling patterns and information from different pixels, and they all accumulated together into one AO buffer – that’s why it behaves so well in motion.

Why did we supersample during the AO pass and not after the blur? My motivation was that I wanted to do actual supersampling – increasing the number of samples taken by the AO by splitting them across multiple frames/pixels. It seemed to make more sense (temporal supersampling + smoothing, not just the smoothing) and was much better at preserving detail than doing it after the blur – when the information is already lost (low-pass filtered) and scattered across multiple pixels.

To calculate the rejection/acceptance we used the fact that Scalable Ambient Obscurance has a simple but great trick of storing compressed depth in the same texture as the AO (it really accelerates the subsequent bilateral blurring passes – only 1 sample taken per tap) – the 16-bit depth gets stored in two 8-bit channels. Therefore we had depth information ready and available with the AO and could do the depth rejection at no additional cost! Furthermore, as our motion vectors and temporal AO surfaces were 8 bits only, they didn’t pollute the cache too much and fetching those textures pipelined very well – I couldn’t see any additional cost of temporal supersampling on a typical scene.
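
Putting the pieces together, here is a hedged HLSL sketch of how such an accumulation at the end of the AO pass can look – illustrative names, thresholds and packing, not the shipped Assassin’s Creed 4 code:

Texture2D<float4> g_HistoryAO     : register(t0); // previous frame: AO in .r, packed 16-bit depth in .gb
Texture2D<float2> g_MotionVectors : register(t1); // screen-space motion vectors
SamplerState      g_LinearClamp   : register(s0);

cbuffer TemporalAO : register(b0)
{
    float g_HistoryWeight;    // e.g. 0.9 on acceptance
    float g_DepthRejectScale; // how quickly a depth mismatch kills the history
};

// One simple way of packing a [0,1] depth into two 8-bit channels (the actual SAO
// packing differs in details, but the idea is the same).
float2 PackDepth16(float d)
{
    float hi = floor(d * 255.0) / 255.0;
    return float2(hi, (d - hi) * 255.0);
}

float UnpackDepth16(float2 p)
{
    return p.x + p.y / 255.0;
}

// newAO / newDepth come from the regular SSAO evaluation for this pixel, which uses one
// of 3 sampling pattern rotations (unique per pixel), cycled every frame (frameIndex % 3).
float4 TemporalResolveAO(float2 uv, float newAO, float newDepth)
{
    float2 prevUV  = uv - g_MotionVectors.SampleLevel(g_LinearClamp, uv, 0);
    float4 history = g_HistoryAO.SampleLevel(g_LinearClamp, prevUV, 0);

    // Smooth acceptance: the history weight falls from g_HistoryWeight to 0 as the
    // depths diverge, and is zeroed when the reprojected UV leaves the screen.
    float depthDelta = abs(UnpackDepth16(history.gb) - newDepth);
    float accept     = saturate(1.0 - depthDelta * g_DepthRejectScale);
    accept          *= all(saturate(prevUV) == prevUV) ? 1.0 : 0.0;

    float resolvedAO = lerp(newAO, history.r, g_HistoryWeight * accept);
    return float4(resolvedAO, PackDepth16(newDepth), 1.0);
}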

Depth rejection has a problem of information “trailing” (when an occluder disappears, the previously occluded pixel has no knowledge of it and cannot reject the “wrong” history/accumulation), but it was much cheaper to do (the information for a given pixel is compressed and fetched together with the color) than multi-tap color-based rejection and, as I said, neither we nor any testers/players have seen any actual trailing issues.

 

Comparison to previous approaches

The idea of applying temporal smoothing to SSAO is not new. There were presentations from DICE [2] and Epic Games [3] about similar approaches (thanks to Stephen Hill for mentioning the second one – I had no idea about it), but they differed from our approach a lot, not only in implementation, but also in both reasoning and application. They used temporal reprojection to help smooth the effect and reduce flickering when the camera was moving, especially to reduce half-resolution artifacts when calculating SSAO in half res (essential for getting acceptable perf with the expensive HBAO algorithm). For us, on the other hand, it was not only about smoothing the effect, but about really increasing the number of samples and doing supersampling distributed across multiple frames in time – the main motivation/inspiration came from temporal antialiasing techniques. Therefore our rejection heuristic was totally different from the one used in the DICE presentation – they wanted to do temporal smoothing only on “unstable” pixels, while we wanted to keep the history accumulation for as long as possible on every pixel and get proper supersampling.

 

Summary

I hope I have proved that temporal supersampling works extremely well on some techniques that take multiple stochastic samples like SSAO and solves common issues (undersampling, noise, temporal instability, flickering) at a negligible cost.

So… what is your excuse for not using it for AA, screen-space reflections, AO and other algorithms? :)

 

References

[1] Scalable Ambient Obscurance – McGuire et al

[2] Stable SSAO in Battlefield 3 with Selective Temporal Filtering – Andersson and Bavoil

[3] Rendering Techniques in GEARS OF WAR 2 – Smedberg and Wright

 
