Localized tonemapping – are global exposure and a global tonemapping operator enough for video games?

In this blog post, I wanted to address something that I have been thinking about for many years – ever since I started working on rendering in video games and with HDR workflows, and gained experience with various photographic techniques. For the question in the title of the article, I can immediately give you an answer – “it depends on the content” – and you could stop reading here. 🙂

However, if you are interested in what localized tonemapping is and why it’s a useful tool, I’ll try to demonstrate some related concepts using personal photographs and photographic experiments, as well as screenshots from Sony Santa Monica’s upcoming new God of War and the demo we showed this year at E3.

Note: all screenshots have some extra effects like bloom and film grain turned off – to avoid confusion and to simplify reasoning about them. They are also from an early milestone, so they are not representative of the final game quality.

Note 2: This is a mini-series. A second blog post accompanies this one. It covers dynamic range, gamma operations, tonemapping operators, some numerical analysis and notes about viewing conditions.

Global exposure and tonemapping

Before even talking about tonemapping and the need for localized exposure, I will start with a simple, global exposure setting.

Let’s start with a difficult scene that has some autoexposure applied:

underexposed_linear.png

For lighting in the whole demo we used reference physical radiance values for natural light sources (sky, sun), and because the scene takes place in a cave, you can already see that it has a pretty big dynamic range.

I won’t describe all the details of the autoexposure system we used here (it is pretty standard, though I might describe it in a future blog post or in some presentation), but due to its center-weighted nature it slightly underexposed lots of details in the shade at the bottom of the screen – while the outdoors is perfectly exposed. In most viewing conditions (more about that in the second post) the shadows completely lose all detail, you don’t understand the scene and the main hero character is indistinguishable…
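
The details of that system are out of scope here, but purely as an illustration of what “pretty standard, center-weighted” metering means, here is a minimal sketch (not the shipped implementation; `scene_rgb` is assumed to be a linear HDR radiance buffer and all names are hypothetical):

```python
import numpy as np

def center_weighted_auto_exposure(scene_rgb, key=0.18):
    """Toy center-weighted autoexposure: meters the log-average luminance,
    weighting pixels near the screen center more, and returns an exposure
    multiplier that maps that average to middle gray (the 'key')."""
    h, w, _ = scene_rgb.shape
    # Relative luminance of linear Rec.709 / sRGB primaries.
    lum = scene_rgb @ np.array([0.2126, 0.7152, 0.0722])

    # Center weighting: 1.0 in the middle, falling off towards the edges.
    ys, xs = np.mgrid[0:h, 0:w]
    dist = np.hypot((ys - h / 2) / (h / 2), (xs - w / 2) / (w / 2))
    weight = np.clip(1.0 - 0.5 * dist, 0.0, 1.0)

    # Weighted geometric mean (log-average) of luminance; epsilon avoids log(0).
    log_avg = np.exp(np.sum(weight * np.log(lum + 1e-6)) / np.sum(weight))

    return key / log_avg

# Usage: exposed = scene_rgb * center_weighted_auto_exposure(scene_rgb)
# An artist-set exposure bias in EV would additionally multiply by 2.0**bias_ev.
```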

Let’s have a look at the histogram for the reference:

hist_underexposed.png

An interesting observation here is that even though a big area of the image is clearly underexposed, there is still some clipping in the whites!

Let’s quickly fix it with an artist-set exposure bias:

proper_exp_linear.png

And have a look at the histogram:

hist_exp_linear.png

Hmm, details are preserved slightly better in the shadows (though they still look very dark – as intended) and the histogram is more uniform, but the whole outdoor area is completely blown out.

I have a confession to make now – for the purpose of demonstration I cheated here a bit and used a linear tonemapping curve. 🙂 This is definitely not something you would want to do, and there are excellent presentations on why. I want to point to two specific ones:

The first one is from the SIGGRAPH 2010 course organized by Naty Hoffman – “From Scene to Screen” by Josh Pines. It describes how filmic curves were constructed, what the reasoning behind their shape was and, in general, why we need them.

The second one is from GDC 2016, “Advanced Techniques and Optimization of HDR Color Pipelines” by Timothy Lottes, and covers various topics related to displaying high dynamic range images.

So, this is the scene with a correct, filmic tonemapping curve and some adjustable cross-talk:

proper_exp_filmic.png

And the histogram:

hist_exp_filmic.png

Much better! Quite a uniform histogram and no blown-out whites (in game, the additive bloom would make the whites white).
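
The exact curve and the cross-talk control are not described in this post; purely as a hedged illustration of what a filmic curve looks like, here is John Hable’s widely published “Uncharted 2” operator (his public constants, not the curve used here), applied per channel:

```python
import numpy as np

# John Hable's "Uncharted 2" filmic operator - a commonly cited example of a
# filmic curve. Constants come from his public talk, not from this project.
A, B, C, D, E, F = 0.15, 0.50, 0.10, 0.20, 0.02, 0.30
W = 11.2  # linear value that maps to white

def hable(x):
    return (x * (A * x + C * B) + D * E) / (x * (A * x + B) + D * F) - E / F

def filmic_tonemap(rgb_linear, exposure=1.0):
    """Exposure followed by the filmic curve, applied per channel.
    Per-channel application is also what pushes very bright, saturated
    values towards white instead of clipping hue-shifted channels."""
    x = rgb_linear * exposure
    return np.clip(hable(x) / hable(W), 0.0, 1.0)
```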

I made a gif of the before/after difference:

linear_vs_filmic.gif

And the animation of all three histograms:

hist_anim.gif

Much better! A few extra stops of dynamic range, lots of detail and saturation preserved in the bright areas. However, the histogram still contains many completely black pixels, there is a large spike in the lower 10th percentile and, depending on the viewing conditions, the scene might still be too dark…

Gamma / contrast settings

Why not just reduce the dynamic range of the image? Raising linear pixel values to some power and rescaling them is the equivalent of scaling and shifting in logarithmic EV space, and can give a result that looks like this:

contrast_gamma.png

hist_low_contrast.png

Definitely all the details are visible, but does the scene look good? In my opinion, no; everything is milky, washed out, boring and lacks “punch”. Furthermore, both the environments and the characters started to look chalky. Due to the perceptual character of per-channel gamma adjustments, not only contrast but also saturation is lost. Even the histogram shows the problem – almost no whites or blacks in the scene, everything is packed into grays!
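
To make the earlier equivalence concrete, here is a tiny numeric sketch (my own illustration, assuming per-channel operations on linear values): a gamma adjustment is a scale of the log2 (EV) histogram, while an exposure change is only a shift of it.

```python
import numpy as np

lum = np.array([0.01, 0.18, 1.0, 4.0])   # linear luminance samples
ev = np.log2(lum)

# Exposure: multiplying in linear space == adding a constant in EV space.
np.testing.assert_allclose(np.log2(lum * 2.0**1.5), ev + 1.5)

# Gamma: raising to a power in linear space == scaling in EV space,
# which squeezes the whole histogram towards the middle (and, applied
# per channel, also desaturates).
gamma = 0.7
np.testing.assert_allclose(np.log2(lum**gamma), ev * gamma)

# A practical contrast control would pivot around middle gray instead:
# 0.18 * (x / 0.18) ** gamma scales EV distances from the 0.18 anchor.
```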

Check this gif comparison:

filmic_vs_gamma.gif

Does this problem happen in real life / photography?

Yes, obviously it does! Here is a photo from Catalina Island – on the left you can see how the camera took an underexposed picture with extremely dark shadows and zero detail in them – and on the right how I corrected it to feel more perceptually correct and similar to how I saw the scene.

real_photo_comp.png

So can we do anything about it? Yes, we can! And with the photograph above I have already teased the solution.

But first – some questions that will probably immediately pop up in a discussion with lighting artists.

Why not just change the lighting like they do in movies?

The answer often is – yes, if you can do it right, just adjust the lighting in the scene to make the dynamic range less problematic. We recently interviewed many lighters coming from animation and VFX, and as a solution for large scene dynamic range they usually mention just adding more custom lights and eyeballing it until it looks “good”.

Obviously, by using the word right in my answer, I kind of dodged the question. Let me first explain what it means in real-life photography and cinematography, illustrated by some random YouTube tutorials I have found on the topic. It means:

Diffusors

Pieces of material that partially transmit light, but completely diffuse the directionality and source position of the lighting. The equivalent of placing a large area light in a rendered scene and masking out the original light.

Reflectors

Simply a surface that bounces light and boosts the lighting on shadowed surfaces and on surfaces not facing the light direction.

In a game scene you could either place a white material that would produce more GI / bounced lighting, or, in a cinematic, simply place a soft (area) fill light.

Additional, artificial light sources (fill flash, set lights etc.)

This is pretty obvious / straightforward. Movie sets and games with cinematic lighting have tons of them.

So as you can see – all those techniques apply to games! With a constrained camera, camera cuts, cinematics etc. you can put fill lights and area lights in there, and most games (with enough budget) do that all the time.

This is also what is possible in VFX and animation – manually adding more lights, fixing and balancing stuff in composition and just throwing more people at the problem…

Note: I think this is a good, acceptable solution and in most cases also the easiest! This is where my “it depends” answer to the question in the title of this article comes from. If you can always add many artificial, off-screen light sources safely, your game content allows for it and you have great lighters – then you don’t need to worry.

Is it always possible in games, and can it get us good results?

No. If you have a game with no camera cuts, or you simply need perfect lighting with a 100% free camera, you cannot put an invisible light source in the scene and hope it will look good.

On the other hand, one could argue that there is often a place for grounded light sources (torches, fire, bulbs… you name it) – and sure; however, just as often they can make no sense in the context and level design of the scene, or you might have no performance budget for them.

Hacking Global Illumination – just don’t!

Some artists would desperately suggest “let’s boost the GI!”, “let’s boost the bounced lighting!”, “let’s boost just the ambient/sky lighting intensity!” – and sometimes you could get away with that on the previous generation of consoles and with non-PBR workflows, but in a PBR pipeline it’s almost impossible not to destroy your scene this way. Why? The largest family of problems it creates is the balance of specular and diffuse lighting.

If you hack just the diffuse lighting component (by boosting the diffuse GI), your objects will look chalky or plastic, your metals will be too dark, and some objects will seem to glow in the dark. If you also boost indirect speculars, suddenly under some conditions objects will look oily and/or metallic; your mirror-like glossy surfaces will look weird and lose their sense of grounding.

Finally, this is not GI specific – it applies to any hacks in a PBR workflow. When you start hacking sun/sky/GI intensities, you lose the ability to quickly reason about material responses and the lighting itself, and to debug them – as you can’t trust what you see and many factors can be the source of a problem.

How does photography deal with the problem of too large a dynamic range when operating with natural light?

This is a very interesting question and my main source of inspiration for the solution to this problem. Especially in the film / analog era, photographers had to know a lot about dynamic range, contrast and various tonemapping curves. Technique and process were highly interleaved with the artistic side of photography.

One of the (grand)fathers of photography, Ansel Adams, created the so-called Zone System.

https://en.wikipedia.org/wiki/Zone_System

http://photography.tutsplus.com/tutorials/understanding-using-ansel-adams-zone-system–photo-5607

https://luminous-landscape.com/zone-system/

I won’t describe it in detail here, but it is very similar to many principles that we are used to – mapping middle gray, finding the scene dynamic range, mapping it to the medium’s dynamic range etc.

The fascinating part is the chemical/process side of it:

Picking the correct film stock (different films have different sensitivity and tonemapping curve shape) and the correct developer chemical, diluting it (my favourite developer, Rodinal, can deliver totally different contrast ratios and film acuity/sharpness depending on the dilution), adjusting the development time or even the frequency of agitating the developed film (yes! one rotation of the development tank per minute can produce different results than one rotation every 30 seconds!).

wilno1

A photo captured on Fuji Neopan, developed in diluted Agfa Rodinal – absolutely beautiful tonality from a low contrast, high acuity film.

Manual localized exposure adjustment

This is all interesting, but it is still in the global tonemapping, per-image domain. What photographers had to do to adjust exposure and contrast locally was a tedious process called dodging and burning.

https://en.wikipedia.org/wiki/Dodging_and_burning

It meant literally filtering or boosting light during print development. As film negatives had a very large dynamic range, this made it possible not only to adjust exposure/brightness, but also to recover lots of detail in otherwise overblown / too dark areas.

An easy alternative that works just great for landscape photography is using graduated filters:

https://en.wikipedia.org/wiki/Graduated_neutral-density_filter

Or, even more easily, by using a polarizer (it darkens and saturates the sky and can cancel out specular light / reflections on e.g. water).

https://en.wikipedia.org/wiki/Polarizing_filter_(photography)

Fortunately, in the digital era, we can do it much more easily with localized adjustment brushes! This is not a very interesting process, but it’s extremely simple in software like Adobe Lightroom. A (silly) example of manually boosting exposure in the shadows:

localized_adjustment.gif

As a localized adjustment brush with exposure is only an exposure addition / linear-space multiplication (more about it in the second post in the series!), it doesn’t affect contrast in the modified neighborhood.

It is worth noting here that such an adjustment would probably be impossible (or lead to extreme banding / noise) with plain LDR bmp/jpeg images. Fortunately, Adobe Lightroom and Adobe Camera Raw (just like many other dedicated RAW processors) operate on RAW files that are able to capture 12-16 stops of dynamic range with proper detail! Think of them as HDR files (like EXR), just stored in a compressed format and containing data that is specific to the input device transform.

This is not the topic of this post, but I think it’s worth mentioning that on God of War we implemented a similar capability for lighting artists – in the form of 3D shapes that we called “exposure lights”. Funnily enough, they are not lights at all – just spherical, localized exposure boosters / dimmers. We used the dimming capability in, for example, the first scene of our demo – the Kratos reveal – to make him completely invisible in the darkness (there was too much GI 🙂 ), and we use the brightness boosting capability in many scenes.
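
The actual implementation is not described here; purely as a sketch of the idea (the names, the falloff and the EV parameterization are my own assumptions, not the shipped code), such a spherical exposure volume could be as simple as scaling radiance by a smoothly faded, localized EV offset:

```python
import numpy as np

def apply_exposure_light(radiance, world_pos, center, radius, boost_ev):
    """Hypothetical 'exposure light': not a light at all, just a spherical
    volume that multiplies scene radiance by 2**boost_ev, with a smooth
    falloff towards the sphere boundary (boost_ev < 0 dims, > 0 boosts)."""
    dist = np.linalg.norm(world_pos - center, axis=-1)
    t = np.clip(1.0 - dist / radius, 0.0, 1.0)
    falloff = t * t * (3.0 - 2.0 * t)              # smoothstep falloff
    return radiance * 2.0 ** (boost_ev * falloff[..., None])
```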

Automatic localized adjustments – shadows / highlights

Manual localized exposure adjustments are great, but still – manual. What if we could do it automatically, but without reducing whole-image contrast – so:

a) automatically

b) when necessary

c) preserving local contrast?

Seems like the Holy Grail of exposure settings, but let’s have a look at the tools already at photographers’/artists’ disposal.

Enter… Shadows / Highlights! This is an image manipulation option available in Adobe Photoshop and Lightroom / Camera Raw. Let’s have a look at an image with normal exposure, but lots of bright and dark areas:

photo_orig_exposure.png

We can boost the shadows separately:

photo_shadows.png

(Notice how bright the trees got – with a slight “glow” / “HDR look” – more about it later.)

Now highlights:

photo_highlights.png

Notice more details and saturation in the sky.

And finally, both applied:

photo_both.png

acr_highlights_shadows.gif

What is really interesting is that this is not a global operator and it doesn’t just reshape the exposure curve. It’s actually a contrast-preserving, very high quality localized tonemapping operator. Halo artifacts are barely visible (just some minor “glow”)!

Here is an extreme example that hopefully shows those artifacts well (if you cannot see them due to the small size – open the images in a separate tab):

localized_extreme.gif

halo.png

Interestingly, while the ACR/Lightroom HDR algorithm seems to work great until pushed to the extreme, the same Shadows/Highlights looks quite ugly in Photoshop at extreme settings:

photoshop_highlights_shadows.png

Aaaargh, my eyes! 🙂 Notice halos and weird, washed out saturation.

Is the reason just less information to work with (bilateral weighting in HDR can easily distinguish between -10EV and -8EV, while 1/255 vs 2/255 provides almost no context/information), or a different algorithm? I don’t know.

The actual algorithms used are way beyond the scope of this post – and still a topic I am investigating (trying to minimize artifacts for runtime performance and maximize image quality – no halos), but I have been playing with two main categories of algorithms:

  • Localized exposure (brightness) adjustments, taking just some neighborhood into account and using bilateral weighting to avoid halos (a toy sketch follows this list). I would like to thank our colleagues at Guerrilla Games here for inspiring us with an example of how to apply it at runtime.
  • Localized histogram stretching / contrast adjustment – methods producing those high-structure-visibility, oversaturated, “radioactive” pictures.
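
As a toy, offline sketch of the first category (and emphatically not the shipped algorithm, which runs on the GPU at partial resolution): estimate a smoothed “base” log-luminance of the neighborhood and lift exposure only where that base is dark. A plain Gaussian is used below for brevity; a production version needs bilateral or otherwise edge-aware filtering of the base layer to avoid the halos discussed above. All parameter names and values are illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def shadows_lift(rgb_linear, strength_ev=1.5, pivot=0.18, sigma=25.0):
    """Toy localized 'shadows' operator on a linear HDR image.
    A blurred log-luminance serves as an estimate of local exposure; pixels
    whose neighborhood sits below the pivot get boosted by up to
    strength_ev stops. Because the boost is a smooth, local exposure
    multiplier, local contrast and saturation are preserved."""
    lum = rgb_linear @ np.array([0.2126, 0.7152, 0.0722])
    log_lum = np.log2(lum + 1e-6)

    # Local exposure estimate. NOTE: a plain Gaussian causes halos at strong
    # settings; bilateral / guided filtering is what avoids them.
    base = gaussian_filter(log_lum, sigma)

    # How many stops the neighborhood sits below the pivot (shadows only),
    # soft-limited so the lift saturates at strength_ev.
    deficit = np.clip(np.log2(pivot) - base, 0.0, None)
    boost_ev = strength_ev * np.tanh(deficit / strength_ev)

    return rgb_linear * (2.0 ** boost_ev)[..., None]
```

A “highlights” recovery would work the same way with the sign flipped: measure how far the local base sits above the pivot and subtract a limited number of stops instead.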

There are obviously numerous techniques and many publications available – sadly not many of them fit in a video game performance budget.

In “God of War”

Enough talking about photography and Adobe products, time to get back to God of War!

I implemented a basic shadows/highlights algorithm with artist-tweakable controls, trying to match the behavior of Lightroom. The first screenshot shows a comparison of the “shadows” manipulation against a regular, properly tonemapped screenshot with a filmic tonemapping curve.

filmic_vs_shadows.gif

hist_filmic_vs_shadows.gif

I set it to a value that is relatively subtle, but still visible (artists would vary it from more subtle settings to more pronounced ones in gameplay-sensitive areas). Now the same with the highlights option:

filmic_vs_highlights.gif

hist_filmic_vs_highlights.gif

One thing that you might notice here is haloing artifacts – they result both from the relatively strong setting and from some optimizations and limitations of the algorithm (it works at a lower / partial resolution).

Finally, with both applied:

filmic_vs_shadows_and_highlights.gif

hist_filmic_vs_both.gif

As I mentioned – here it is shown in a slightly exaggerated manner that exposes the artifacts. However, it’s still much better than regular “gamma” low contrast settings:

gamma_vs_both.gif

hist_gamma_vs_both.gif

The histogram shows the difference – while the gamma / contrast operator tends to “compact” the dynamic range and pack it all into midtones / grays, the shadows/highlights operations preserve local contrast, saturation and some information about the darkest and brightest areas of the image.

Why does localized exposure preserve contrast and saturation? The main difference is that gamma becomes a scale in logarithmic space, scaling the whole histogram, while exposure/scale becomes just a linear shift (more about it in part 2) that moves the under- and over-exposed parts, with their histogram shape intact, into the visible range.

Summary

You can check out the final image (with a bit more subtle settings) here:

final.png

To sum up – I don’t think that the problems of exposure and dynamic range in real-time rendering are solved. Sometimes scenes rendered using realistic reference values have way too large a dynamic range – just like photographs.

We can fix it with complicated adjustments to the lighting (like they do on movie sets), with some localized exposure adjustments (3D “exposure lights”) or with simple “procedural” image-space shadows/highlights controls.

Possible solutions depend heavily on the scenario. For example – if you can cut the camera, you have many more options than when it is 100% free and unconstrained, with zero cuts. It also depends on how much budget you have – both in terms of milliseconds to spend on extra lights and in terms of lighting artists’ time.

Sometimes a single slider can make a scene look much better, and while localized exposure / localized tonemapping can have its own problems, I recommend adding it to your artists’ toolset to make their lives easier!

If you are interested in a bit more detail on dynamic range, tonemapping and gamma operations, check out the second post in the mini-series.

References

http://renderwonk.com/publications/s2010-color-course/ SIGGRAPH 2010 Course: Color Enhancement and Rendering in Film and Game Production

http://gpuopen.com/gdc16-wrapup-presentations/  “Advanced Techniques and Optimization of HDR Color Pipelines”, Timothy Lottes.

https://en.wikipedia.org/wiki/Zone_System

http://photography.tutsplus.com/tutorials/understanding-using-ansel-adams-zone-system–photo-5607

https://luminous-landscape.com/zone-system/

https://en.wikipedia.org/wiki/Dodging_and_burning

https://en.wikipedia.org/wiki/Graduated_neutral-density_filter

https://en.wikipedia.org/wiki/Polarizing_filter_(photography)

 

 


Technical debt… or technical weight?

Introduction

The whole idea for this post came from a very inspiring conversation with my friends and ex-colleagues from Ubisoft that we had over dinner a few months ago.

We started to talk about a sophisticated code algorithm – an algorithm that is very well written, a brilliant idea, with lots of hard work by talented and experienced people to make it robust and perform very well on many target hardware platforms. On the other hand, the algorithm and its implementation are so complex that maintaining it takes up almost the full time of some people. One of my colleagues called it “technical debt”, which I disagreed with; we started to discuss the differences and I came up with a name – “technical weight”.

A quick definition of technical weight would be: a property that makes your solutions (very) costly and “heavy” not when implementing them, but in the long run – even in a clean and properly designed environment (unlike technical debt).

This is not a post about technical debt

Let me be clear – this is not yet another post about technical debt. I think enough people have covered it and it’s a very well understood problem. I like the analogy to real debt and taking out a loan (or even a mortgage) – for some short-term benefit one can avoid the hard work of “doing it properly” (the analogy of paying with cash) and instead take this “credit”. It can be many things – “hacking”, introducing unnecessary globals and state, ignoring proper data flow and data structures, injecting weird dependencies (like core data logic depending on visual representation), writing unextendable code, or sometimes just writing ugly, unreadable and aesthetically unpleasant code.

There are posts about what technical debt is, how to avoid it, how to fix it, and even about the cultural changes within a company structure necessary to deal with it (like convincing product owners that something with no visible value can be an investment that will pay off).

What most posts don’t cover is that recently a huge amount of technical debt in many codebases has come from shifting to naïve implementations of agile methodologies like Scrum, working sprint to sprint. It’s very hard to do any proper architectural work in such an environment and short time frame, and POs usually don’t care about it (it’s not a feature visible to customers / upper management). There are some ways of counteracting it, like clean-up weeks (a great idea, provided that your team actually cares about working with clean, good code – and fortunately many teams do)…

…But this is not a post about it. 🙂

Enter the “technical weight”

So what is technical weight then? I wouldn’t use it to describe a single item – a technology / decision / process / outcome; I think of it as a property of every single technical decision you make – from huge architectural decisions, through the design of medium-sized systems, down to the way you write every single line of code. Technical weight is a property that makes your code, systems and decisions in general more “complex” – difficult to debug, difficult to understand, difficult to change, difficult to hand over to a different developer.

The technical weight of code carrying technical debt will usually be very large – this goes without saying. But beautiful, perfectly written and optimized, data-oriented code can also have a large technical weight. It is also inevitable to have some systems like that – so what is the problem here?

I also wanted to talk about my motivation for writing a blog post about it. Enough people cover things like code aesthetics, “smart” (hmm) code tricks to make your code shorter, fighting technical debt, or even potentially counter-productive ideas like design patterns or technical-debt-inducing methodologies like naïve Scrum – but I haven’t seen many posts about technical weight or the psychology of picking technical solutions. So while lots of this will seem “captain obvious”, I think it’s worth writing down and codifying at least some of it.

Before proceeding, I need to emphasize (more on it later): every decision and solution has some technical weight. Some problems are just difficult, and even the “most lightweight” solution can still be a huge struggle. Some very heavy solutions are the way to go for a given project and team and are the only way to progress.

Also, I do not want to criticize any of the examples I will give; I think they are great, interesting solutions – just not always suitable ones.

Analogy one – tax / operating costs

The first analogy I would like to use is similar to the “technical debt” investment strategy. Imagine you want a car – and you decide to buy, with cash, a 1966 Ford Mustang or, so as not to imply necessarily “outdated” technology, a new Corvette / Ferrari. The dream of many people, very expensive, but if you have enough money, what can go wrong…? Well, the initial cost of the item is just the beginning. They use lots of gas. They are expensive to maintain (forget an old car if you don’t have a special workshop or a trusted mechanic). They can break often and require replacement parts that are hard to come by. Insurance costs will be insane, and in many countries their registration cost / tax is much higher (“luxury goods”). Finally, you don’t want to take your perfect Ferrari onto a dirt road.

So even if you could afford something, didn’t take a loan and bought something in technically perfect condition, the initial cost is just the beginning – you will end up spending huge amounts of money or time and still won’t be able to do many tasks.

Analogy two – literal weight of carried baggage

The second analogy compares a project / product / technology to packing for a trip. Depending on the kind of trip (anything longer than a casual walk / hike), you need to pack. A minimum change of clothes, some water / food, a backpack, maybe a tent and a sleeping bag… But someone can decide: “let’s take a portable grill!”, “let’s take specialized hiking gear!”, “let’s take a laptop, portable speakers and a guitar!”. All of those ideas can be decent and provide some value for certain kinds of trips, but then you need to carry them around for the duration of the whole trip. And for sure, if your trip doesn’t involve a minivan, you don’t want to take all of those. 🙂 If you are just walking, the weight on your back will be too heavy for a long trip – it is inconvenient, makes you more exhausted, forces you to stop more often and, in some cases, you might not be able to finish your planned trip because of this weight. You can always throw things away at some point, but that is a pure waste of money / initial investment.

Back to tech

So we have a solution that is well architected, designed and written – no hacks involved. But it is very complex and “heavy” – why *could* that be bad?

  1. Required manpower to maintain it

Almost no system is ever “finished”. After you write and commit it, there will probably be a lot of iteration. Obviously, it depends on the field of IT and the domain (I can imagine some areas require very careful and slow changes – medical equipment and space rockets? Others rely on one-off products that you never go back to once you are done – some webpage frontends?), but in most cases, when you are working on a longer-term project, you (or someone else taking it over) will revisit such code again and again and again. You need someone to be able to maintain it – and the heavier the solution, the more manpower you need. Think of it as a property tax – you will “pay” (in time spent), on average, some percent of project time every month. It can be anything from a marginal 0.5% up to 50% – it scales almost directly with the quality of the code / solution (but as I explained – this is not a post about bad code) and with its complexity.

  2. Difficulty of getting new developers into the system

Very heavy, smart and sophisticated solutions can take lots of time for new people to learn before they can actively work on them. Sometimes it can even be impossible within the project time frame – imagine an algorithm written by a world expert in a given domain, or even by a team of experts that decides to leave your project one day (it happens, and it’s better to be prepared…).

  3. Bugfixing costs

Every system has some bugs. I don’t remember the exact estimate, but I remember seeing some research conducted on many code bases that found the average number of bugs per 1000 LOC to be quite constant from language to language and from developer to developer (at least statistically). So more complicated systems mean more bugs and more time spent bugfixing – and if they have lots of moving parts, also more time spent debugging every single bug. If your code is “smart”, then the “smartness” required during debugging is even higher – good luck with that when you are time-pressured, stressed and tired later on (as lots of bugfixing happens at the end of projects)…

  4. Complicated refactoring

Requirements change, especially on agile-like projects such as game development. If you made your project very “heavy”, any changes will be much more difficult. I noticed that this is usually where technical weight can creep into technical debt – under time pressure, you add “just one, innocent hack” (which, at the end of the project, after N such worse and worse hacks, means huge and unmaintainable tech debt). Or you spend months on a refactor that nobody really asked for and that adds zero value to your project. So technical weight plus a shortage of time or resources can evolve into technical debt.

  5. Complicated adding of new, unrelated code

Similar to the previous point, but unlike changing requirements, this one is almost inevitable. There are always systems surrounding your system, various interacting features and functionalities. Now anyone wanting to interact with it has to understand a lot of its complexity. It’s a problem no matter how well interfaced and encapsulated it is; in general, it is something I would require from every good programmer – if you call a method or use a class, you really should understand how it works, what it will do, what the performance impact is and what all the consequences are.

  6. Psychological aspect

With technically heavy solutions, one major aspect that is ignored by most developers is psychology and cognitive biases. (Side note: we often tend to ignore psychology, as we want to believe that we – programmers, engineers, scientists, educated “computer people” – are reasonable. What a fallacy 🙂 ). All kinds of biases can affect you, but I will list just a few that I see very often in programmers with regard to “technical weight”:

https://en.wikipedia.org/wiki/Confirmation_bias “I made a decision so I see only its advantages and no disadvantages”.

https://en.wikipedia.org/wiki/Escalation_of_commitment “We already invested so much time/money into it! We cannot back out now, it will pay off soon!”.

https://en.wikipedia.org/wiki/Progress_trap “We have to keep on going and adding functionalities, otherwise we will regress”.

https://en.wikipedia.org/wiki/Loss_aversion “It’s more important to avoid bad performance / instability / whatever than focus on advantages of other solutions”.

To put it all together – if we have invested lots of thought, work and effort into something and want to believe it’s good, we will ignore all the problems, pretend they don’t exist, decline to admit them (often blaming others and random circumstances) and tend to see only the benefits. The more you have invested and the heavier the solution – the more you will try to stay with it, making other decisions or changes very difficult, even if they would be the best option for your project.

Examples

I’ll start with the example that started the whole discussion.

We were talking about so-called “GPU pipelines”. For anyone not specializing in real-time graphics, this is a whole family of techniques driven by a great vision – that to produce work for the GPU (rendering / graphics), you don’t need to produce work on the CPU: no need to cull visibility, issue draw calls or put pressure on drivers – all (or almost all) of it could potentially be generated on the GPU itself. This way you can get great performance (finer-granularity culling / work avoidance; also, GPUs are much better at massive amounts of simple/similar tasks), have your CPU available for other tasks like AI, and even enable efficient solutions to a whole family of problems like virtual texturing or shadow mapping. What started the discussion was that we all admired the quality of the work done by our colleagues working on such problems and how it made sense for their projects (for example projects with dynamic destruction, where almost nothing can be precomputed; or user-generated content, or massive crowds), but we started to discuss whether we would want to use it ourselves. The answer was everyone agreeing “it depends”. 🙂 Why wouldn’t we “always” use something that is clearly better?

The reason was simple – the amount of work by extremely smart people involved, and the complexity of not only the initial implementation, but also of maintaining and extending it – especially when working on multiple platforms. Maybe your game doesn’t have many draw calls? Maybe lots of visibility can be precomputed? Maybe you are GPU bound, but not on vertex shading / rasterization? There can be many reasons.

Another example could be relying heavily on complex infrastructure to automate some tasks, like building your data / code and testing it. If you have the manpower to maintain it and make it 99.999% robust, it is the way to go. On the other hand, if the infrastructure is unreliable, flaky and gets changed often – the technical weight totally outweighs the benefits. So yes, maybe you don’t need to do some tasks manually, but now you need to constantly debug both the automation and the automated process itself.

Yet another example – something that will probably resonate with most programmers (I have no idea why it’s so much fun to write “toy” compilers and languages; is it because it’s true “meta”-programming? 🙂 ) – domain-specific languages, especially for scripting! It’s so enjoyable to write a language, and you can make it fit your needs 100%, you have full ownership of it, no integration etc. But on the other hand, you have just raised the entry barrier for anyone new to the system, you need to maintain and debug it, and if it is your first language, it will probably have some bad decisions baked in and be difficult to extend (especially if it needs to stay backwards compatible). Good luck if every programmer on your team eventually adds or writes a new language for your stack…

Conversely, the opposite can also be technically heavy – relying on middleware, off-the-shelf frameworks and solutions, and open source. The costs of integrating, debugging, merging, sending e-mails to tech support… Realizing (too late!) that it lacks some essential or new functionality. Using a complex middleware/engine can definitely add technical weight to your project. It often is the right solution – but the weight has to be taken into account (if you think “we will integrate it and all problems with X are gone”, then you have clearly never worked with a middleware).

Reasons for heavy technical weight

  1. Difficult problem

The first one is obvious and simplest – maybe your problem is inherently difficult, you cannot work around it, it is the nature of your work. But then there is probably nothing to do about it, and you are aware of it. Your solution will provide a unique selling point for your product, you are aware of the consequences – all good. 🙂

  2. Thinking that you have a problem / your problem is difficult

On the other hand, sometimes you may think that your problem is difficult and want to solve it in a “heavy” way, but it is not – or you shouldn’t be solving it at all. Often heavy solutions for this category of problems come from “inheriting” a problem or a system from someone. For example – continuing work on a very legacy system. Trying to untangle technical debt caused by someone else N years ago with gradual changes (spaghetti C code or lava-cake OOP code). Trying to solve a non-tech (cultural? people?) problem with tech solutions – the category that scares me most and is proof of our technocratic arrogance (yes, almost every engineer, me included, falls into this fallacy 🙂 ). There are numerous problems that only seem very difficult – but it’s pretty hard to see that. My advice here – just ask your colleagues for a second opinion (without biasing them with your proposed solution); if both you and they have some extra time, don’t even phrase the problem yourself – let them discover it partially themselves and comment on it.

  3. Not enough scoping

Often an unscoped user story will describe a very complex system that needs to do EVERYTHING, has tons of features, functionalities and all possible options, and works with all existing systems. You as a programmer will also want it to have great performance, readable code etc. If you don’t apply some scoping, split the implementation into stages and let users start giving you feedback on early iterations (“hey, you know, I actually don’t use those options”), you are practically guaranteed to end up with too heavy a solution.

  3a. Future coding

To explain the term – an excellent blog post from Sebastian Sylvan that inspired me and helped me grow as a programmer: http://sebastiansylvan.com/post/the-perils-of-future-coding/

This is a subcategory of 3, but even worse – as the over-engineering doesn’t even come from the user! It is the programmer trying to be overly abstract and predicting user problems ahead of time. On its own that is not a bad thing, but the solution should be, as always, KISS – write simple systems with few moving pieces and few strings to the outer world, systems that you can replace.

  4. Not willing to compromise

This one is really difficult, as it’s not a bad thing per se in many situations… Sometimes you need to sacrifice some aspects of the final result. Is a 5% performance increase worth a much more complicated system (10x more code)? Is having automatic boilerplate code generation worth spending months on some codegen system? It always depends on so many factors, and you cannot predict the future… And for sure, if you are open-minded, you will agree that some past decisions you made were bad – you may even regret them.

  5. Not enough experience in seeing the technical weight and evaluating consequences

This is mostly a problem of junior programmers – I remember being inexperienced and jumping with enthusiasm at every single new feature and request, writing complicated things and wanting them to be the best in every possible aspect. Shipping a few products, especially if dealing with the consequences means lots of effort and problems, usually teaches more humility. 🙂

  6. Programming dogmas / belief

A horrible, horrible reason to add technical weight. Someone says that it has to be “true OOP”, “idiomatic C++”, “this design pattern”, or must obey some weird, religious-like arguments. I have heard of people saying, “oh, you should code it this way, not that way, because this is the way you do things in Java”. Or, recently, that multiple inheritance is better than composition, or that polymorphism is in general better than if statements (wtf?). My advice? If you don’t feel that you have enough energy to inspire a major cultural change with months of advice, examples and discussions, with no guarantee of succeeding – then you don’t want to work with such zealous people.

  7. Fun!

Ok, this is a weird point, but if you enjoy programming and problem solving I am sure you will understand.

We like solving complicated problems in complicated ways! Copy-pasting 5 lines of code is not as “fun” as writing a complex code generator. Using off-the-shelf solutions is not as enjoyable as writing your own domain-specific language. In general – the satisfaction and dopamine “reward” from solving a complex problem and owning a complex, sophisticated system is much higher than from a simple solution to a reduced/scoped problem. We want to do cool things, solve ambitious problems and provide interesting solutions – and it’s OK to work on them – just admit it; don’t try to lie to yourself (or, even worse, to your manager!) that it is “necessary”. Evaluate whether you have some time for this fun coding and what kind of consequences it will have in 1/3/6/12/24 months.

  8. Being “clever” and optimizing for code writing, not reading

This is something between points 6 and 7, but fortunately it often happens on a very small scale (single lines/functions/classes); on the other hand, unfortunately, it can grow and later impact your whole codebase… Some programmers value the perceived “smartness” of their solutions, want to solve a mental puzzle, impress themselves or other programmers, or optimize the time spent writing the code. They will hide complexity using macros, magic iterators, constructors and templates. This all adds lots of technical weight – but can slip through code design and reviews. Good luck reading and debugging such code later, though!

  9. Career progress – individual

Thanks to Kris Narkowicz for pointing out that I forgot about a very common reason for people writing sophisticated, over-ambitious systems and solutions.
I think it can be split into two subcategories.
The first subcategory is individual growth. A pretty interesting one, as it’s something between 7 and 8. We want to do interesting things and be challenged every day, working on more and more ambitious things to develop our careers and skills. Many programmers are ambitious and don’t treat their work as just a “day job”. They would like to develop something that they could give conference talks about, be proud of having in their CV/portfolio, or even contribute to the whole computer science field. Easy solutions and simple tasks don’t get you that, and they don’t leave a “legacy”. You can do it in your spare time, contribute to open source etc. – but you have limited time in your life and, understandably, some people would prefer to do it at work.
Again – something that is OK on a limited scale. If you never do it, you won’t feel challenged, your career will stagnate and you can get burned out (and eventually look for a more interesting / ambitious job). Just make sure it is not unnecessary, doesn’t become a common pattern and doesn’t eat up the majority of your time (and especially doesn’t cause lots of work for others…).

  10. Career progress – in the organization structure

Similar to the previous one – but this time it is not driven by the individual and their goals and ambitions, but by a weird organizational structure and bad management. Pathological organizations will promote only people who seem to do very complex things. If you do your job right, pick simple solutions, predict problems and make sure they never happen – in many places you won’t be appreciated as much as someone who writes a super complex system, puts lots of overtime into it, causes huge problems and in the end “saves the day” at the last minute. As I said – it is pathological and not sustainable. It definitely should be recognized and fixed in the organization itself.
I have even heard of major tech companies that make it a clear and explicit rule for bonuses and promotions that you need to be the author and owner of whole systems or products. I understand that it is an easy and measurable metric, but in the long run the problems and pathological behaviors outweigh the benefits; it can be detrimental to any teamwork and good working culture.

How to deal with technical weight?

You will probably be disappointed by the length of this paragraph, but there are no universal solutions. However – being aware of technical weight will help you make better decisions. When thinking about any problem, evaluate:

– How much technical weight can you carry? Do you already have some heavy systems and spend a lot of time supporting them?

– For how long do you need to carry this weight? Is your product / tech a sprint, or a marathon / “trip around the globe”?

– If your team gets reduced or main contributors leave, are you going to be able to continue carrying it?

– Does the heavier solution provide you with some unique value? Are your users going to be able to iterate faster? Is it a unique feature that customers will truly appreciate? Are you going to be able to use, for example, cutting-edge performance to make your product unique?

– What are the disadvantages of lighter solutions? Are they really unacceptable? Can you prove it, or is it just intuition driven by biases?

– Are you proposing this solution because it is a more interesting and fun problem to work on? Are you trying to be “smart” here and wanting to do some impressive work?

– Are you psychologically biased towards this solution? Have you already invested a lot in it? Is there an ego aspect? Are you obsessed with loss aversion? Have you really considered alternatives with an open mind, and do others suggest the same solutions without you guiding them?

 

To close this post, an observation I have made looking at how different it is to work with teams of different sizes – adding technical weight can be most difficult for medium-sized teams.

Usually small, experienced teams will add some technically complex and heavy solutions to add unique value to their unique, hand-crafted product. There are some specific teams (often coming from the demoscene) and games that have very original, beautiful tech that could be difficult to maintain and use for anyone bigger (and good luck convincing lots of people in a bigger company of such risky ideas!). If you like video games and rendering, you probably know which studios and games I am talking about – Media Molecule and Q-Games and their amazing voxel- or SDF-based tech. There are more examples, but those two come immediately to my mind.

On the other hand, technically heavy solutions are also suitable for giants. EA (specifically their engine division, Frostbite, and the amazing work they do and publish) or Ubisoft (which this year at GDC broke the record for valuable, inspiring and just great technical presentations) can invest lots of money and manpower in R&D and the maintenance of such technology, and because it is shared between many products, it will pay off. So even if the solutions are “heavy”, they can manage to develop them.

Medium-sized teams have neither of those advantages – usually they need to have many different features (as they target a wider audience – costs – they are not able to stick to a single unique selling point), but they don’t have enough manpower to spend time on overly complex problems and endless R&D. Therefore, they have to choose appropriate solutions carefully, calculating the ROI of every single decision. There is again a psychological aspect – with a team of, let’s say, 6-10 developers, you might think that you can do a lot more – and even if you do your planning perfectly and reasonably, having even a single person leave, or the tech requirements change, can totally shift the scales.

Nothing really special here – but working with technical weight is like working with psychological biases – everyone has them, but just being aware of them lets you make better, less biased decisions. I recommend reading about cognitive biases from time to time as well – and analyzing your own decisions while looking out for them.

Special thanks

Special thanks go to my friends who inspired this whole discussion a few months ago – alphabetically – Mickael Gilabert, John Huelin and David Robillard – for lots of insight into the problem. I miss such inspiring conversations with you guys!
Extra special thanks to Kris Narkowicz for pointing out an important missing reason for technical weight.


White balance and physically based rendering pipelines. Part 2 – practical problems.

White balance and lighting conditions

After this long introduction (be sure to check part one if you haven’t!), we can finally get to the problem that started the whole idea for this post (as in my opinion it is an unsolved problem – or rather many small problems).

The problem is – what kind of white balance should you use when capturing and using IBLs? What color and white balance should you use while accumulating lights? When should the white balance correction happen? But first things first.

Image based lighting and spherical panoramas. The typical workflow for them is: fix the exposure and some white balance (to avoid color shifts between shots), take the N required bracketed shots and later merge them in “proper” software into panoramas. The result is then saved in some format (DDS and EXR seem to be the most common?) that is usually only a container format with no color space information, and then, after pre-filtering, it is used in the game engine as a light source. Finally magic happens: lighting, BRDFs, some tonemapping, color grading, output to sRGB with a known target monitor color balance and viewing conditions… But before that “magic” – did you notice how vaguely I described white balance and color management? Well, unfortunately, this is how most pipelines treat this topic…

Why it can be problematic?

OK, let’s go back to setting the white balance during a single time of day, with the same sun position – just one photo captured in the shadows and one in the sunlight.

Photo 1, camera auto WB (5150K) – cool look, blueish shadows.

Photo 1, Lightroom/ACR Daylight WB (5500K) – warm, sunny look, but still slightly blue shadows.

Photo 1, Lightroom/ACR Cloudy WB (6500K) – shadows have no tint, but the photo looks probably too warm / orange.

Photo 2, same time and location, behind the building, camera auto WB (5050K) – strong blue color cast.

Photo 2, same time and location, behind the building, daylight WB (5500K) – blue color cast.

Photo 2, same time and location, behind the building, cloudy WB (6500K) – color cast almost gone (slight hint of magenta).

Now imagine that you use these images as input to your engine’s IBL solution. Obviously, you are going to get different results… To emphasize the difference, I made a small collage of two opposite potential solutions.

Same WB for both scenes.

Different, dynamic/changing WB.

In this example the difference can be quite subtle (but obviously the shadowed parts either get a blueish tint or they don’t). Sometimes (especially at lower sun elevations – long shadows, lower light intensities, more scattering because light travels through a thicker layer of the atmosphere) it can get extreme – to the point that it is impossible to balance the WB even within a single photo!

Example:

White balance set for the sunlit areas, 4300K. Everything in shadows is extremely blue.

White balance set for the shadowed areas, 8900K. People would appear orange in the sunlit areas.

“Neutral”/medium white balance, 5350K. People probably wouldn’t look right neither in shadows (too strong blue tint) nor in sunlight (too strong orange/yellow tint). This is however how I think I perceived the scene at that time.

What is really interesting is that depending on which point you use for your IBL capture (whether your WB and grey card are set in the shadows or in the sunlit part), you will get vastly different lighting and scene colors.

So, depending on the white balance setting, I got completely different colors in the lights and shadows. It affects the mood of the whole image, so it should depend on the artistic needs of a given photograph/scene – but this level of control cannot be applied on the final, sRGB image (too small a color gamut and too much information loss). So when should this artistic control happen? During capture? During lighting? During color grading?

Digression – baking lighting, diffuse bounces and scattering taking full light spectrum into account

A quite interesting side note and observation to which I also don’t have a clear answer – if you go with one of the extremes when taking your panorama for IBL, you can even get a different GI response after light baking! Just imagine tungsten lights in a blue room, or pure, clear early-afternoon sky lighting in an orange room – depending on whether you perform the WB correction or not, you can get almost no multi-bounced lighting versus quite intense bounce.

The only 100% robust solution is to use a spectral renderer. Not many GI bakers (actually, are there any?) support spectral rendering. Some renderers are starting to use it – there was an interesting presentation this year at SIGGRAPH about its use at Weta: http://blog.selfshadow.com/publications/s2015-shading-course/ “Physically Based Material Modeling at Weta Digital” by Luca Fascione (slides not yet there…).

In a similar manner, Oskar Elek and Petr Kmoch pointed out the importance of spectral computations when simulating scattering and the interaction of the atmosphere and water: http://people.mpi-inf.mpg.de/~oelek/ http://www.oskee.wz.cz/stranka/uploads/Real-Time_Spectral_Scattering_in_Large-Scale_Natural_Participating_Media.pdf

I don’t think we need to go that far – at least for this new console generation and until we eliminate much more severe simplifications in the rendering. Still – it’s definitely something to be aware of…

White balance problem and real-time rendering / games

Coming back to our main problem – since all those images come from photographs of natural, physical light sources, this is an actual, potentially serious problem – if you are using, or considering using, real-world data acquisition for IBLs and/or scientific physical sky models for sky and ambient lighting.

Just to briefly summarize the problem and what we know so far:

  • Different times of day and lighting conditions result in completely different temperatures of light, even just for sun and sky lighting.
  • Panoramas captured with different white balance camera settings (which could depend on the first shot) will have completely different color temperatures.
  • Images or light sources with different color temperatures don’t work well together (mixing/blending).
  • If you stay with one white balance value for every scene, uncorrected lighting will look extreme and ugly (the tungsten lights example, or extremely blue skies).
  • Sometimes there are many light sources in the scene with different color temperatures (simplest example – sky/sun) and you cannot find a white balance that works in every situation; you can end up with a strong tint on objects whenever only one of the light sources dominates the lighting (the other one being in shadow), no matter what your settings are.
  • Different white balance can achieve a different mood (cool / warm; day / afternoon / evening; happy / sad) in the final presented scene – but it is not (only) a part of color grading; usually it is set for the “raw” data, way before setting the final mood – when computing the sky model or capturing skies / panoramas.

Such a summary and analysis of the problem suggests some possible solutions.

Before suggesting them, I’m going to write about how those problems were “hidden” in previous console generation and why we didn’t notice them.

Why didn’t we notice those problems in previous-generation games and non-PBR workflows? How did we solve them?

In the previous console generation, light color, light intensity and ambient color were abstract, purely artistic concepts, often LDR and “gamma”. The most important part here is the ambient color. “Ambient” was defined in many alternative ways (ambient hemispheres – 2 colors, ambient cubes – 3-6 colors), but it had nothing to do with sky and sun rendering.

So lighters would light environments and characters to achieve a specific look and color balance (usually working mentally 100% in sRGB), not taking into consideration any physical properties or the color temperature of light sources – just the specific look that they or the art director envisioned. Even with HDR workflows, ambient and light intensities had nothing to do with real ones; they were rather set to be convenient and easy to control with exposure. In Assassin’s Creed 4: Black Flag we had no exposure control / variable exposure at all! Very talented lighting artists were able to create a believable world working in 8-bit sRGB as both the final and the intermediate color space!

Then concept, environment and effect artists would paint and model the sky and clouds, throw in some sun disk flare or post-effect god rays, and voilà. Reflections were handled only by a special system of planar (or cube) reflections, and indirect speculars were virtually non-existent. Metals had some special, custom, artist-authored (and again sRGB/gamma) cubemaps that had nothing to do with the sky and often with the environment itself.

This doesn’t mean that there was no problem. Wrong handling of reflections and indirect specular lighting was one of many reasons for the transition to PBR workflows (and this is why so many engine demos show buzz-worded “next-gen PBR rendering” with metal/wet/shiny areas 😉 ). Without properly taking environment and image based lighting into account, surfaces looked flat, plastic and unnatural. When we integrated IBL and “everything has specular and Fresnel” workflows, artists suddenly realized that a mismatched sky and reflections can look wrong and result in weird rim lighting with intensity and color not matching the environment. Things would get completely out of control…

Also, modern normalized and energy-conserving specular distribution functions, physically correct atmospheric and fog effects and energy-conserving GI started to emphasize the importance of high dynamic range lighting (you don’t have a big dynamic range in your lights? Well, say goodbye to volumetric fog light shafts). As intuitively understanding differences in lighting intensity of many EV within a single scene is IMO virtually impossible, to get consistent behavior we started to look at physically correct sky and sun models. This on the other hand – both when using analytical models and IBL/photogrammetry/real captured panoramas – showed us the problems with white balance and the varied color temperature of lighting.

Intuitively set and “hacked” proper, film-like color balance in The Witcher 2. Lots of work and tweaking, but at that time the result was amazing and critically acclaimed – all thanks to good artists with a good understanding of photographic concepts and hours and hours of tweaking and iteration…

Before I proceed with describing more proper solutions, I want to emphasize that its importance now doesn’t change the fact that we were already looking at color balance in previous, physically incorrect games. It was achieved partially through lighting, partially through color grading, but artists requested various “tweaks” and “hacks” for it.

On The Witcher 2 I had the privilege to work with amazing artists (in all departments) and lots of them were photographers who understood photography, lighting and all such processes very well. We experimented with various elements of the photographic workflow, like a simulation of a polarizing filter to get more saturated skies (without affecting the tonemapping too much). Sometimes we would prototype total hacks, like special color grading (as you can imagine, usually blue) applied only in shadows (you can imagine why it was quite a terrible idea 😀 but it shows how an intuitive need for a specific look gets chased in “hacked” ways).

With one of the artists we even tried to simulate dynamic white balance (in 2011/2012?) – changing it in an artist-authored way depending on the scene average luminance (not average chrominance or anything related) – to simulate a whole-screen color cast and get a warmer WB in shadows/darker areas, and on the other hand get nice color variation and blueish shadows when the scene contained mostly sunlit areas.

Now lots of it sounds funny, but on the other hand I see how artists’ requests were driven by actual, difficult physical problems. With better understanding of PBR, real world lighting interactions and BRDF we can finally look at more proper/systemic solutions!

Dealing with the white balance – proposed solutions

In this paragraph I will first present my general opinions and later group some “alternatives” I don’t have a strong opinion about.

One thing that I would assume is a must (if you don’t agree – please comment!) is sticking with a single color balance value for at least a scene (game fragment / time of day?). This should be done by a lighter / artist with a strong photographic / movie background and a good understanding of white balance. This way you get proper analytical light interactions and realistic resulting colors (warm sun color and cooler sky color balance out perfectly).

One could argue that the “old-school” way of mixing various light sources with different color balance – artists just making them quite “neutral” and not tinted, and relying on color grading – is “better”/”easier”. But then you will never get natural looking, blueish shadows and a perfect HDR balance of light/shadow. You also lose one of the biggest advantages of PBR – the possibility to reason about the outcome, to verify every element in the equation, find errors and make your lights / materials / scenes easily interchangeable (consistency across the whole game!).

Ok, so one white balance value for the lighting. How should you choose this color balance? I see two options:

  1. Sticking with a single value for the whole game – like the common final sRGB 6500K for D65 – and achieving all the proper color warmth / coolness via the final color balance/grading/correction only. It is definitely possible, but without color balance applied (pre color correction), some scenes will look extremely weird (orange/blue, or even green if you have fluorescent lights). You also need to do your color correction in a gamut wider than regular sRGB (which is assumed to be in the final white balance) – just like proper photo WB needs “RAW files“. I see many more benefits of color grading in wider color spaces and at higher dynamic range (maybe I will write more about it in the future), but it’s not something that many titles seem to be doing now.
  2. Picking this value per scene to “neutralize” the white balance either in shadows or in the sunlit parts (or maybe a 50/50 mix of both?). It gives a much easier starting point, nice looking intermediate values, understandable and good looking HDR sky cubemaps and artist-authorable skydomes – but you need to be aware of it all the time! Blending between various lighting conditions / zones also becomes more tricky and you cannot as easily swap elements in the equation – “let’s take the sky capture from noon and the sun radiance from the morning” won’t work. On the other hand, that is probably not a very good idea anyway. 🙂 Still, it can be more difficult for games with a dynamic / varying time of day.

Finally, should the color corrected, display white balance be fixed or should it auto-adapt? This is the most controversial topic. Lots of programmers and artists just hate automatic exposure / eye adaptation… And for a good reason.

It is very difficult (impossible) to do properly… and almost nothing ever works just like the artist / art director would like it to work… too much contrast / not enough contrast; too dark / too bright; too slow / too fast. It’s impossible to solve completely – no matter how accurate, you can always imagine manual settings being artistically better and better serving the scene.

And yet, here we talk about adding an additional dimension to the whole auto-camera-settings problem!

I see 2 “basic” solutions:

  1. Accepting no white balance eye adaptation and a different color cast in shadows and lights. For crucial cinematics that happen in specific spots, manually overriding it and embedding it in color grading/color correction fixed to that cinematic – either fading it in smoothly, or accepting a difference between camera cuts.
  2. Adding auto white balance. I still think that my original “old-school” idea of calculating it from the average reflected luminance + knowledge of the scene light sources can work pretty well… After all, we are lighting the scene and have all the information – way more than is available in cameras! If not, then taking the diffuse lighting (we definitely don’t want to take albedo into account! On the other hand, albedo is partially baked in with the GI / indirect lighting…) and calculating a clamped/limited white balance – a minimal sketch of this follows right after the third option below.

But I see a third one that can actually work for auto-exposure as well:

  3. Relying on some baked or dynamically calculated information about shadows and sky visibility, averaged between some sparse points around the camera. We tend to perceive white balance and exposure “spatially” (averaging values when looking around), not only based on the current “frame” (effective sharp FOV vs. lateral vision), and I see no reason why we shouldn’t try it in real-time rendering.
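To make the clamped auto white balance from option 2 a bit more concrete, here is a minimal HLSL sketch of a von Kries-style correction – the resource name and the clamp limits are hypothetical, and in practice the estimate would of course be smoothed over time just like auto-exposure. It assumes we already have a tiny (e.g. 1×1, fully downsampled) buffer with the average color of the diffuse-only lighting:

```hlsl
// Minimal sketch (not production code). Assumes a 1x1 texture holding the
// average diffuse-only lighting color of the scene (albedo excluded as much
// as possible) - the name and setup are hypothetical.
Texture2D<float3> g_AvgDiffuseLighting : register(t0);

float3 ApplyAutoWhiteBalance(float3 sceneColor)
{
    // Estimated illuminant color.
    float3 illuminant = g_AvgDiffuseLighting.Load(int3(0, 0, 0));

    // Work on chromaticity only - normalize by luminance so exposure is untouched.
    float lum = dot(illuminant, float3(0.2126f, 0.7152f, 0.0722f));
    float3 chroma = illuminant / max(lum, 1e-5f);

    // Clamp how strong the correction can get - full neutralization usually
    // looks sterile, so allow only a partial, limited shift (tweakable).
    const float3 maxShift = float3(1.25f, 1.1f, 1.25f);
    chroma = clamp(chroma, 1.0f / maxShift, maxShift);

    // Von Kries-style correction: divide out the estimated illuminant tint
    // before tonemapping and grading.
    return sceneColor / chroma;
}
```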

For me this is quite a fascinating and open topic and I’m still not sure what kind of approach I would advocate – it would probably depend on the title, setting, lighting conditions, existence (or not) of a day cycle etc.

I hope I presented enough ideas and arguments to inspire some discussion – if you have any feedback on this topic, please comment!

(Next paragraph is just a small rant and feel free to skip it.)

Is research of Physically Based Rendering workflows inventing new / non-existent / abstract problems?

This sub-chapter is again a digression, probably could be in any article about anything PBR workflow-related.

But yet this is a question I keep hearing very often and probably any programmer or technical artist will hear it over and over. If you think a bit about it, it’s as if with deeper and deeper understanding of physically based rendering and more and more advanced workflows, we “invent” some new problems that didn’t exist in the past…

Yes, everyone who is in the industry for a while has shipped games with gamma-space lighting, without properly calibrated monitors (my pet peeve – suggestions of “calibrating” for an “average” TV…), with no care for sRGB/REC-709, color spaces, energy conservation, physical light units, EVs or this unfortunate white balance…

In The Witcher 2 we didn’t care about any BRDFs; most reflective surfaces and their indirect speculars were achieved through artist-assigned, per-shader tweaked cubemaps – so characters had different ones, environment assets had different ones, water had different ones… And still the game looked truly amazing at that time.

Does that mean we shouldn’t care for understanding of all those topics? No, definitely not. We are simply further on the curve of diminishing returns – we must invest much more, gain way more understanding and knowledge to progress further. Things are getting more complicated and complex, it can be overwhelming.

But then you can just look at recent titles like Assassin’s Creed: Unity or The Order: 1886. The upcoming Frostbite engine games like Star Wars: Battlefront. Demos from Unreal Engine 4. Photogrammetrically scanned titles like The Vanishing of Ethan Carter, made by a team of 7 people! You can clearly see that this extra know-how and knowledge pays off.

Good luck getting such fidelity of results at such a scale with the old bag of hacks!


White balance and physically based rendering pipelines. Part 1 – introduction.

This is part one of the whole article. Part two is here.

In these two posts (they started as one, but I had to split it to make it more… digestible) I’m going to talk a bit about white balance. The first part describes what it is, how human white perception works, and how both vintage and modern photography dealt and deal with it to provide perceptually “correct” tones. If you don’t feel very confident about white balance management in lighting, photography and output color spaces – I hope it will provide some interesting new facts.

The second part focuses more on practical application. I will analyze which steps of color and white balance management are missing in many game pipelines, what we ignore and what we used to “hack” in the previous generation of renderers and how. I will present some loose ideas on how we can include this knowledge and improve our handling of light temperature.

White color

Remember that gold/white and black/blue dress and the weirdly vivid discussion about it on the internet?

Source: Tumblr/Swiked

There are already many posts trying to describe what is going on there (lighting conditions, light exposure) and I won’t go more into it, but I wanted to use it as an example – a color that is clearly RGB blue in the picture (when checked with a color picker) can be perceived as white (and to be honest, I was one of those gold/white people 😉 ).

Why is it so? The main reason is that in nature there is no single “white” color. As you know, light can have different wavelengths and every wavelength (within the range of human perception of ~390-800nm) has some color corresponding to the response of the eye.

Light wavelengths and visible color. Source: Wikipedia

No white color there… So what defines white? When all eye photoreceptor cones are excited the same way? When camera Bayer-filtered cells generate the same electrical output? When all layers of film are exposed?

Fujifilm Provia color layer sensitivity curve, source: Fujifilm

Unfortunately no, it is way more complex and there is no single definition of white (though the last answer – film sensitivity – is closest, as film is a very “defined” medium and with the slide/positive process it can be viewed directly).

Usually, white is defined by the specific spectra of light sources of a certain color temperature. Color temperature and its use in photography and color science is in my opinion a quite fascinating concept, as it comes from physics and black body radiation.

The color temperature of a natural (but also artificial) light source is the value assigned to a color perceptually similar to the color emitted by a perfect black body of that temperature. Blackbodies of different temperatures emit different wavelength spectra and perceptually those spectra appear as different colors.

Source: wikipedia

While some cooler blackbodies seem clearly red, a range of yellowish-blueish ones (4000-8000K?) can be used to define a white color.

Look in your monitor settings – you will find some color temperature setting that most probably (and hopefully – if you work with colors) will show 6500K, a common value for “daylight”.

Ok, so is this 6500K a white color? No… Depends!

I will describe it in a second, but first – where does this whole 6500K come from? Quick google-fu will verify that it is not the actual physical temperature of the sun (which is ~5800K). After passing through the atmosphere, the sun’s apparent color temperature drops even lower (Rayleigh scattering removes the blue), resulting in artistically warmer colors.

Note: the naming of warm/cold white balance is very confusing, as artistically “warm” colors correspond to “colder” (lower-temperature) black bodies and color balances! And using a physically “warmer” (higher) temperature of a light source – a perceptually colder color – as the white point will “warm up” the colors in the scene! This sounds crazy and confusing, but please keep reading – I hope that after both posts it will be easier.

On the other hand, we get atmospheric in-scattering and perceptually blue skies, and sky lighting also contributes to the overall light color and intensity. This 6500K is the average daylight temperature during a cloudy, overcast day (after multi-scattering in the sky and clouds) – neither the temperature of the sun nor of the sky on its own. In my camera settings this color temperature is also referred to as the “cloudy” white balance. Naming of white balance settings is not standardized (to my knowledge) and may vary from device to device.

Finally, a standardized ~6500K daylight spectrum defines the so-called Illuminant D65, a CIE definition of daylight corresponding roughly to mid-European midday light. What is most important is that this D65 is the standard white color when outputting in sRGB or REC-709, our standard formats when generating output images for display on the internet or on an HD TV. And this is why you really should make sure that the lights in your studio have the same daylight D65 color temperature (see also the next paragraph).

Perception of white

Ok, so I mentioned that 6500K can be white and it depends. What defines it in reality, outside of artificial monitor color spaces?

Eyes are not our primary vision perception device – they are just a lens and sensor combination. Our main source of vision and perception is obviously the brain, and the brain interprets white differently depending on the lighting conditions.

I intuitively think of it as the brain trying to interpret white in terms of the actually reflected color – a white albedo. Have you ever been in one of those old towns (one of the things I miss so much about Europe!) lit by gas lamps or old, tungsten lamps? Like the one in this perfect masterpiece by Van Gogh:

“Cafe Terrace at Night”, Vincent van Gogh

The painting is quite bright, but you can clearly identify a night scene by the yellow lamps and the blue sky. Still, you can see some white(ish) brush strokes in the painting (the tables).

The colors are pretty saturated, it is clearly (post)impressionist and stylized, yet it still looks believable. This is how the artist saw and imagined that scene and this is how we accept and recognize it.

What is really fascinating is what happens if you try to take a photo of such a scene with the same camera settings as during the day – would it look like the painting?

I found a preeeetty old photo (still in its RAW file) in my personal collection, taken in beautiful Spanish Málaga, and tried that experiment (note: zero color grading).

Let’s set color temperature to D65! It’s the standard, right?

DSCF0160

Hmm, pretty orange and ugly… Especially people’s skin tones look unnatural and uncanny (unless you enjoy the over-the-top color grading of blockbuster Hollywood movies).

Let’s correct it – set the white balance to tungsten (while I didn’t have a handy spectrograph, I expect those lights to have tungsten – wolfram – filaments 😉 ).

DSCF0160-2

More natural looking and more similar to the master’s painting (except for saturation, composition, artistic value and all that stuff, you know 😉 ).

A similar example (also not very artistically appealing, but even more extreme, as there are no billboard lights to neutralize the tungsten – this time from beautiful Valencia), the D65 version:

DSCF0341

…and after the correction:

DSCF0341-2

I hope those 2 examples showed how human perception can differ from a photograph under orange lights. But in my collection I also found an example of the opposite effect – the camera white-balancing the scene for daylight in the evening, when all lighting comes from the sky, resulting in an extreme blueish color cast. (Btw. this is how it looked straight from the camera, showing what a poor job it did at auto white balancing – this is from an old Nikon D90.) Lovely Edinburgh, this time not D65, but ~5300K (no idea why the camera would set it this way…):

DSC_0004

The same photo with corrected white balance (50 000K – insanely high!) looks like this:

DSC_0004-2

Snow and skin colors look much better and more natural. (On the other hand, the photo lost the impression of evening darkness and its accidental, color-graded, cooler winter atmosphere; note that this is a trick used in cinematography – filming during the day using white, bright lights and color grading towards blue to simulate night and evening scenes.)

So what is this photo white balance? What does it mean? If you have never worked with RAW conversion software or dug through camera menus, you might be surprised, but even mobile phone cameras adjust the white balance (on iPhone you can check out Camera+, which allows you to play with various white balance settings).

Small side note – having many light sources of different color temperatures can result in weird, ugly and un-correctable pictures. A common white flash that completely doesn’t match the scene lighting and makes people look unattractive is one example – and this is why Apple implemented dual LED flashes in their new iPhone built-in cameras, a feature I was really enthusiastic about.

White balance and eye adaptation

White balance defines what color temperature (and tint – for artificial light sources, or for scenes dominated by a bounced light color, e.g. green bounce from foliage) is expected to be white. It is also the color that your brain expects to be white in the given light conditions.

Think of it as the color equivalent of exposure adaptation – eyes and brain slowly adapt to the lighting conditions. It doesn’t matter if they are 25 EV stops apart (so 2^25, roughly 33 million times brighter) – after a while, if you have good eyesight, you will adapt to the new lighting conditions.

In the same manner, your brain already knows the light types, knows the materials, knows what it considered white and what you expect to be white – and slowly adjusts to this expected white color.

I have an interesting example – swimming goggles. Some cheap tinted goggles I bought a while ago (they work very well if you swim mainly outdoors).

DSC04395

What is really interesting is that at first, just after putting them on, the tint seems extreme. However, after 20-30 minutes of swimming, my eyesight adapts completely and I no longer notice any tint at all. And after taking them off, everything looks orange / sepia-like and “warm”. 🙂 I tried to reproduce this experiment using my camera, RAW files and some Lightroom / Photoshop.

Day WB and a photo straight through the goggles.

Almost-corrected WB (geek side note – notice how the edges have a different color cast due to more color absorption because of the larger optical depth).

I wasn’t able to correct it perfectly, as I ran out of the WB scale in Adobe Lightroom (!). It would be interesting to see whether those colors are outside of the camera sensor gamut even in the RAW format, or whether it is only a limitation of the software, with UX designers clamping the slider to a “reasonable” / usable range.

As an experiment, I also tried correcting the WB of a resolved JPEG file (so not RAW sensor data). The results look worse – lots of precision loss and banding. I kind of expected it, but this is worth emphasizing – if you ever do strong white balance corrections, never do them in sRGB / 8-bit space, but with as high a dynamic range and as wide a gamut as possible.

Result of trying to correct WB in Adobe Camera RAW using a JPEG file.

Finally, I wanted to check how a properly exposed shot would behave after “taking off the goggles” and simulating the eye adaptation – so using the same WB as the one used to correct the swimming goggles.

Day WB, same scene, no goggles.

Same scene, same WB as used for correcting the goggles tint.

Success! It looks almost exactly the same as how I perceive the scene right after taking the goggles off after a longer swim. So this crude experiment suggests that human white perception works at least similarly to camera white balance correction.

…and this is why I mentioned that your studio lighting conditions should be uniform and match the target daylight temperature / D65 color space. Otherwise, if your eyes adapt to a surrounding “warmer” or “colder” color, you will perceive the colors on the screen “wrong” and end up making your images or video game output too warm/cold and wrongly balanced!

White balance and professional photography

I mentioned that most cameras – from your handy mobile phone through point-and-shoot cameras up to professional, full-frame ones – have an option for automatically setting the white balance with literally zero user intervention. This is not setting the output color space white balance (sRGB or Adobe RGB, which is friendlier when outputting images not only for monitor display but also for print), but neutralizing the color temperature of the incoming light.

I have to admit I’m not 100% sure how it works (Sony/Canon/Nikon trade secrets?) – it definitely seems more sophisticated than calculating the average captured color temperature. My guess would be some reference database or an approximate fitted algorithm based on “common” scenarios – maybe per-scene, maybe only depending on the histogram. But no matter how smart the algorithm is, all of them fail from time to time and you can end up with a photo with the wrong color balance.

This is not a problem when working with RAW files – you can correct them later and, interestingly, Adobe software seems to have a much better auto WB than any camera I have used so far. But this is definitely not enough for professional goals. You cannot get consistent and coherent results when relying on a heuristic algorithm with not enough data. Professional photography developed guidelines and processes for doing it robustly.

Getting correct white and color balance is a quite well-established process, but it consists of many steps and components, and failure at one point can result in wrong colors. I’m not going to cover here the very important topics of having properly calibrated monitors, proper lighting in the studio / room, or working in consistent, proper color spaces and color management. I will focus only on the usual ways of acquiring source data with properly calibrated colors.

The main difficulty during the acquisition part comes from the fact that a photograph usually captures reflected (and/or scattered) light. So we are getting the result of a convolution of the complex lighting environment, the BRDF and the material properties. Therefore it is difficult to establish what the reference “white” color is, when green-filtered sensor pixels can receive green light because of either a green albedo or green bounced lighting. An easy way to solve this equation is to introduce into the scene reference objects that have known properties – by measuring their response to the lighting environment, we can figure out the properties of the light, its color and the perceptual white.

Sounds complex, but in its simplest variant it is just setting the “white balance” with a color picker on objects that you know are grey or white (as a last resort I end up looking for eye whites or teeth in the photograph 🙂 – it gives a good starting point, even with the reddish tint of tired eyes).
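As a tiny illustration of the principle (a sketch of the math only, not anyone’s actual tool code): given the averaged color of a picked patch that we know should be neutral grey, the per-channel white balance gains are simply the ratios that bring the patch back to R = G = B, ideally applied to linear (RAW / HDR) data:

```hlsl
// Sketch: derive per-channel white balance gains from a patch known to be
// neutral grey/white. "patchColor" is assumed to be averaged over the picked
// region and to be in linear (not gamma-encoded) space.
float3 WhiteBalanceGainsFromGreyPatch(float3 patchColor)
{
    // Use the patch luminance as the target, so overall brightness is preserved.
    float targetGrey = dot(patchColor, float3(0.2126f, 0.7152f, 0.0722f));
    return targetGrey / max(patchColor, 1e-5f);
}

// Usage: balancedColor = linearColor * WhiteBalanceGainsFromGreyPatch(patchColor);
```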

A more professional option is using “professional” reference materials, like grey/white cards, grey diffuse or perfect chrome balls.

Semi-pro black/white/gray cards.

An even better option is to use a range of calibrated, known materials. There are commercial products like the X-Rite ColorChecker that contain various printed, known colors and allow analyzing not only a single color temperature/tint (which is obviously an over-simplification of the complex light spectrum!), but also more complex, spectral properties. There are programs that allow creating special camera color profiles for given lighting conditions.

Color checker in action.

I mentioned that this is not an easy thing to do, because it relies on very good discipline and remembering many steps.

In theory, any time the lighting conditions change (if you change the angle from which you approach your subject, the cloud cover changes, or the sun moves), you need to re-capture and re-calibrate the colors. This can be a long, tedious and error-prone process… But following such simple guidelines you are definitely able to capture properly calibrated, natural looking albedo colors, natural skin colors and portraits etc.

White balance before the digital sensor era – film color response

What did color management look like in the times of color film photography?

I’ll be honest – I have never worked with it professionally, only for fun; but what was fun for me didn’t seem like much fun for people who wanted to achieve a perfect, consistent white balance in their prints…

Every film has a different color response and a specific color cast. There is greenish Velvia, purpleish Portra, brownish-warm Provia, blueish Ektar, reddish Ektachrome…

Contrasty and slightly brownish Provia 400X. My home city of Warsaw and Wola train station graffiti.

Kodak Portra 400 – violet all the way! Warsaw botanical garden.

On the other hand, the film response is fixed and has quite a low dynamic range (actually a pretty ok dynamic range for negative film used to make prints, but very small for positives/slides). And you cannot change the white balance…

What was their white balance, btw? Which color temperature would produce a white albedo on a print? Most films used a fixed white balance of around ~5000K (direct, warm sunlight – note that it’s not the “digital era standard” of ~6500K). There were some specific films for tungsten lights (like Ektachrome 160T, balanced for 3200K) and AFAIK some fluorescent-light balanced ones, but that’s it!

This is the reason why, if you ever shot film more “professionally”, you probably literally had a bag full of colored filters: warming daylight filters for cloudy days and shooting in shade, purple filters for correcting green, fluorescent lights, and blue filters for tungsten lights. Lots of it was trial-and-error with an unknown outcome (until you developed the film and prints!), and using such filters also meant some luminance loss – which, with the much lower light sensitivities of film (anything above 100-200 was starting to get grainy…), was definitely undesired…

Ok, this wraps up part one! More (real-time) rendering and practical info to come in part two!

Bonus: gold and white, color corrected dress:

blue_black_dress_wb


Fixing screen-space deferred decals

Screen-space deferred decals are a very popular technique. There were so many presentations and blog posts about it that I will just list a couple of them (just the first google search results page, to be honest…) in no particular order:

Therefore I think it wouldn’t be an exaggeration to call it an “industry standard”.

The beauty of screen-space decals used together with deferred rendering is that they allow deferring yet another part of the rendering pipeline – in this case layered materials and, in general, modifications of the rendered surface, both static and dynamic. Just like you can defer the actual lighting from generating the material properties (in deferred lighting / shading), you can do the same with composite objects and textures.

You don’t need to think about special UV mapping, unwrapping, shader or mesh permutations, difficult and expensive layered material shaders, even more difficult pipelines for artists (how to paint 2 partially overlapping objects at once? How to texture something in a unique way depending on the asset instance?) or techniques as complex and hard to maintain as virtual texturing with a unique space parametrization.

Instead, just render a bunch of quads / convex objects and texture them in world space – extremely easy to implement (a matter of hours, max days, even in complex, multi-platform engines), very easy to maintain (usually the only maintenance is making sure you don’t break the MRT separate blending modes and normal en/decoding in the G-Buffer) and easy for artists to work with. I love those aspects of screen-space decals and how easily they work with the G-Buffer (no extra lighting cost). I have often seen deferred decals listed as one of the important advantages of deferred shading techniques and a con of forward shading!

However, I wouldn’t write this post if not for a serious deferred screen-space decals problem that I believe every presentation failed to mention!

Later post edit: Humus actually described this problem in another blog post (not the original volume decals one). I will comment on it in one of the later sections.

(Btw. a digression – if you are a programmer, researcher, artist, or basically any author of talk or a post – really, please talk about your failures, problems and edge cases! This is where 90% of engineering time is spent and mentioning it doesn’t make any technique any less impressive…).

Dirty screen-space decal problem

Unfortunately, in all those “simple” implementations presented in blog posts, presentations and articles, there is a problem with screen-space decals that in my opinion makes them unshippable without some “fix” or hack in the PS4/XboxOne generation of AAA games, with realistic and complex lighting, materials and sharp, anisotropic filtering. Funnily enough, I found only one (!) screenshot in all those posts with a camera angle that shows this problem… Edge artifacts. This is a screenshot from the Saints Row: The Third presentation.

Problem with screen-space decals – edges. Source: Lighting and Simplifying Saints Row: The Third

I hope the problem is clearly visible in this screenshot – some pixels near geometric edges perpendicular to the camera do not receive the decal properly and the background is clearly visible through it. I must add that in motion this kind of artifact looks even worse. 😦 Seeing it in some other engine, I at first suspected many other “obvious” reasons for edge artifacts – half-texel offsets, wrong depth sampling method, wrong UV coordinates… But the reason for this artifact is quite simple – screen-space UV derivatives and the Texture2D.Sample/tex2DSample instruction!

Edit: there are other interesting problems with screen-space / deferred decals. I highly recommend reading Sébastien Lagarde’s and Charles de Rousiers’ presentation about moving Frostbite to PBR in general (in my opinion the best and most comprehensive PBR-related presentation so far!), but especially section 3.3 about problems with decals and materials and lighting.

Guilty derivatives

The guilty derivatives – a source of never-ending graphics programmer frustration, but also a solution to a problem otherwise unsolved. On the one hand a necessary feature for texture antialiasing and texturing performance, on the other hand a workaround with many problems of its own. They cause quad overshading and the inability to handle massive amounts of very small triangles (well, to be fair, there are some other reasons like vertex assembly etc.), they are automatically calculated only in pixel shaders (in every other shader stage you need to specify the LOD/derivatives manually to use texturing), their calculation is imprecise and possibly low quality, they can cause many types of edge artifacts, and they are incompatible with jittered rasterization patterns (like flip-quad).

In this specific case, let’s have a look at how the GPU would calculate the derivatives, first by looking at how per-quad derivatives are generated in general.

Rasterized pixels – note: different colors belong to different quads.

In the typical rendering scenario – regular rendering (no screen-space techniques) of this example small cylinder object – there would be no problem. The quad containing pixels A and B would get proper derivatives for texturing; the quad containing pixels C and D would cause some overshading, but still have proper texture UV derivatives – no problem here either (except for the GPU power lost on those overshaded pixels).

So how do screen-space techniques break it? The problem lies in the way the UV texture coordinates are calculated and reprojected from screen space (so the core of the technique). And contrary to the triangle rasterization example, the problem with a decal being rendered behind this object is not with pixel D, but actually with pixel C!

Effect of projecting reconstructed position into decal bounding box

We can see on this diagram how the UVs for point C (reprojected from pixel C) will lie completely outside the bounding box of the decal (dashed-line box), while point D has proper UVs inside it.

While we can simply reject those pixels (texkill, branch out with alpha zero etc. – it doesn’t really matter how), unfortunately they still contribute to the derivatives and the mip level calculation.

In this case, the calculated mip level will be extremely blurry – the calculated partial derivative sees a difference of 1.5 in UV space! And as the further mip levels usually contain mip-mapped alpha as well, we end up with an almost transparent alpha from the alpha texture, or a bright/blurred albedo, and many kinds of edge artifacts depending on the decal type and blending mode…
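To see why, recall (in a simplified, isotropic form) how the mip level gets picked from those derivatives – roughly something like this:

```hlsl
// Simplified, isotropic approximation of hardware mip selection - enough to
// see the problem; real hardware additionally handles anisotropic filtering.
float EstimateMipLevel(float2 uv, float2 textureSize)
{
    float2 dx = ddx(uv) * textureSize; // texel-space footprint along screen x
    float2 dy = ddy(uv) * textureSize; // texel-space footprint along screen y
    float maxLenSqr = max(dot(dx, dx), dot(dy, dy));
    return 0.5f * log2(maxLenSqr);     // log2 of the longer footprint edge
}
```

With a UV difference of ~1.5 within a single quad and, say, a 1024×1024 decal texture, that is log2(1.5 * 1024) ≈ 10.5 – the very last, single-texel mip levels.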

Other screen-space techniques suffering

Screen-space/deferred decals are not the only technique suffering from this kind of problem. Any technique that relies on screen-space information reprojected to world space and used as the UV source for texturing will have such problems and artifacts.

Edit: The problem of mip-mapping, derivatives and how screen-space deferred lighting with projection textures can suffer from it was described very well by Aras Pranckevičius.

Other (most common) examples include projection textures for spot lights and cubemaps for environment specular/diffuse lighting. To be honest, in every single game engine I worked with there were some workarounds for this kind of problem (sometimes added unconsciously 🙂 – more about it in one of the next sections).

Not working solution – clamping the UVs

The first, quite natural attempt to fix it is to clamp the UVs – also for the discarded pixels – so that the derivatives used for mip-mapping are smaller in such a problematic case. Unfortunately, it doesn’t solve the issue; it can make it less problematic or even completely fix it when the valid pixel is close to the clamped, invalid one, but it won’t work in many other cases… One example would be having an edge between some rejected pixels close to a U or V of 0 and some valid pixels close to a U or V of 1; in this case we still get the full mip chain dropped due to a huge partial derivative change within this quad.

Still, if you can’t do anything else, it makes sense to throw in a free (on most modern hardware) saturate instruction (or instruction modifier) for those rare cases when it helps…
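For completeness, a sketch of what this mitigation looks like in the decal pixel shader (names hypothetical) – the rejection still happens, but the value feeding the implicit derivatives is clamped:

```hlsl
// Sketch of the "clamp the UVs" mitigation. decalUV is reconstructed from the
// depth buffer and reprojected into decal space.
float4 SampleDecal(Texture2D decalTex, SamplerState samp, float2 decalUV)
{
    bool valid = all(decalUV >= 0.0f) && all(decalUV <= 1.0f);

    // saturate is a free instruction modifier on most modern hardware and
    // keeps the per-quad UV deltas (and thus the mip level) somewhat saner.
    float4 color = decalTex.Sample(samp, saturate(decalUV));

    return valid ? color : float4(0.0f, 0.0f, 0.0f, 0.0f); // or clip(-1) etc.
}
```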

Brutal solution – dropping mip-maps

I mentioned a quite natural “solution” that I have seen in many engines and that is acceptable for most other screen-space techniques – not using mip-maps at all. Replace your Sample with SampleLevel and the derivative and mip level problem is solved, right? 😉

This works “ok” for shadow maps – as the aliasing is partially solved by commonly used cascaded shadow mapping – further distances get lower resolution shadow maps (plus we filter some texels anyway)…

It is “acceptable” for projection textures, usually because they are rendered only close to the camera due to a) the high lighting cost, b) per-scene and per-camera-shot tweaking of lights.

It actually often works well with the environment maps – as lots of engines have Toksvig or other normal variance to roughness remapping and the mip level for the cubemap look-up is derived manually from the roughness or gloss. 🙂

However, mip-mapping is applied to textures for a reason – removing aliasing and information at frequencies higher than the rasterizer can reproduce. For things like shiny, normal-mapped deferred decals like blood splats, the effect of having no mip maps can be quite extreme and the noise and aliasing unacceptable. Therefore I wouldn’t use this as a solution in a AAA game, especially if deferred, screen-space decals are used widely as a tool for the environment art department.

A middle ground here could be dropping just some of the further mip maps (for example keeping mips 0-3). This way one could get rid of the extreme edge artifacts (sampling completely invalid last mip levels) and still get some basic antialiasing.

Possible solution – multi-pass rendering

This is again a partial solution that would fix the problem in some cases, but not in most. The idea is to inject decal rendering in between object rendering, with per-type object sorting. So for example “background”/”big”/”static” objects could be rendered first, decals projected on top of them, and then the other object layers.

This solution has many disadvantages – the first one is the complication of the rendering pipeline and many unnecessary device state changes. The second one – the performance cost: potential overshading and overdraw, wasting bandwidth and ALU on pixels that will be overwritten anyway…

Finally, the original problem can still be visible and unsolved! Imagine a terrain with high curvature and projecting decals onto it – a hill with a valley in the background can still produce completely wrong derivatives and mip level selection.

Possible solution – going back to world space

This category of solutions is a bit of a cheat, as it departs from the original screen-space decals technique and goes back to world space. In this solution, artists would prepare a simplified version of the mesh (in the extreme case, a quad!), map UVs on it and use such source UVs instead of the reprojected ones. Such UVs mip-map correctly and don’t suffer from the edge artifacts.

Other aspects and advantages of the deferred decals technique remain the same here – including the possibility of software Z-tests and rejection based on object ID (or stencil).

manual_decals

On the other hand, this solution is suitable only for environment art. It doesn’t work at all for special effects like bullet holes or blood splats – unless you calculate the source geometry and its UVs on the CPU, like in “old-school” decal techniques…

It also can suffer from a wrong, weird parallax offset due to the UVs not actually touching the target surface – but in general, camera settings in games never allow for extreme close-ups where it would be noticeable.

Still, I mention this solution because it is very easy on the programming side, can be a good tool on the art side and actually works. It was used quite heavily in The Witcher 2 in the last level, Loc Muinne – as an easier alternative to messy 2nd UV sets and costly 2-layered materials.

I’m not sure if the specific assets in the following screenshot used it, but such partially hand-made decals were used on many similar “sharp-ended” assets, like those rock “teeth” to the left and right of the door frame in this level.

Loc_Muinne_sewers_screen1

It is much easier to place them and LOD them out quickly with distance (AFAIK they were present only together with LOD 0 of a mesh) than to create a multi-layered material system or virtual texturing. So even if you need some other, truly screen-space decals – give artists the possibility of authoring manual decal objects blended into the G-Buffer – I’m sure they will come up with great and innovative uses for them!

Possible solution – Forward+ decals

A second type of “cheated” solution – fetch the decal info from some pre-culled list and apply it during the background geometry rendering. Schemes like per-tile pre-culling as in Forward+ or clustered lighting can make it quite efficient. It is hard for me to estimate the cost of decals rendered this way – it probably depends on how expensive your geometry pixel shaders are, how many different decals you have, whether they are bound on memory or ALU, whether they can hide some latency etc. One beauty of this solution is how easy it becomes to use anisotropic filtering, how easy it is to blend normals (blending happens before any encoding!), and that there is no need to introduce any blend states or decide what won’t be overwritten due to storage in the alpha channel; furthermore, it seems it should work amazingly well with MSAA.

Biggest disadvantages – complexity, the need to modify your material shaders (and all of their permutations that probably already eat too much RAM and game build time), increased register pressure, difficult debugging and potentially the biggest runtime cost. Finally, it would work properly only with texture arrays / atlases, which adds a quite restrictive size limitation…

Possible solution – pre-calculating mip map / manual mip selection

Finally, the most “future research” and “ideas” category – if you have played with any of these and have experience, or simply would like to share your opinion about them, please let me know in the comments! 🙂

So, if we a) want the mip-mapping and b) our screen-space derivatives are wrong, then why not compute the mip level or even the partial derivatives (for anisotropic texture filtering) manually? We can do it in many possible ways.

One technique could utilize in-quad communication (available on GCN explicitly, or via tricks with multiple calls to ddx_fine / ddy_fine and masking operations on any DX11-level hw) and compute the derivatives manually only when we know that the neighboring pixels are “valid” and/or come from the same source asset (via testing distances, material ID, normals, a decal mask or maybe something else). In case of zero valid neighbors we could fall back to using the zero mip level. In general, I think this solution could work in many cases, but I have some doubts about its temporal stability under camera movement and with geometric aliasing. It also could be expensive – it all depends on the actual implementation and the used heuristics.
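A rough sketch of this first idea, assuming Shader Model 6-style quad intrinsics are available (on older APIs the same data can be fished out with the ddx_fine/ddy_fine + masking tricks mentioned above); here the validity criterion is just the UV bounds, but it could as well be a depth/normal/ID test:

```hlsl
// Sketch: build gradients only from quad neighbors that are themselves valid
// decal pixels, otherwise fall back to the top mip. Requires SM 6.0 quad ops.
float4 SampleDecalQuadAware(Texture2D decalTex, SamplerState samp, float2 uv)
{
    float valid = (all(uv >= 0.0f) && all(uv <= 1.0f)) ? 1.0f : 0.0f;

    float2 uvX = QuadReadAcrossX(uv); // horizontal neighbor within the 2x2 quad
    float2 uvY = QuadReadAcrossY(uv); // vertical neighbor within the 2x2 quad
    bool neighborsValid = (QuadReadAcrossX(valid) > 0.5f) &&
                          (QuadReadAcrossY(valid) > 0.5f);

    if (valid > 0.5f && neighborsValid)
    {
        // Manual gradients built only from valid quad members.
        return decalTex.SampleGrad(samp, uv, uvX - uv, uvY - uv);
    }
    // Fallback - top mip (or some heuristic, distance-based level).
    return (valid > 0.5f) ? decalTex.SampleLevel(samp, uv, 0.0f)
                          : float4(0.0f, 0.0f, 0.0f, 0.0f);
}
```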

Another possibility is calculating the derivatives analytically during reconstruction, given the target surface normal and the distance from the camera. Unfortunately, a limitation here is how to read the source mesh normals without normal-mapping applied. If your G-Buffer layout has them lying around somewhere (an interesting example was in the Infamous: Second Son GDC 2014 presentation), then great – they can be used easily. 🙂 If not, then IMO the normal-mapped information is useless. One could try to reconstruct normal information from the depth buffer, but this is either not working the way we would like – when using simple derivatives (because we end up with exactly the same problem as the one we are trying to solve!) – or expensive when analyzing a bigger neighborhood. If you have the original surface normals in the G-Buffer though, it is quite convenient and you can safely read from this surface even on the PC – as decals are not supposed to write to it anyway.

Post edit: In an older post, Humus described a technique that is a hybrid of the ones I mentioned in the 2 previous paragraphs – calculating UV derivatives based on depth differences and rejection. It seems to work fine and is probably the best “easy” solution, though I would still be concerned about the temporal stability of the technique (with higher geometric complexity than in the demo), given that the approximations are calculated in screen space. All kinds of “information popping in and out” problems that exist in techniques like SSAO and SSR could be relevant here as well.

Post edit 2: Richard Mitton suggested on twitter a solution that seems both smart and extremely simple – using the target decal normal instead of the surface normal and precomputing those derivatives in the VS. I personally would still scale it by per-pixel depth, but it seems this solution would really work in most cases (unless there is a huge mismatch of surface curvature – but then the decal would be distorted anyway…).
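My back-of-the-envelope interpretation of that idea – treat it as an assumption-laden sketch, not Richard’s actual implementation: the decal has a known world-space size, so the UV footprint of one screen pixel on the decal plane is roughly the world-space size of a pixel at that depth divided by the decal extents; no neighboring-pixel information is needed at all (the constant part can live in the VS or in constants, only the depth scaling is per-pixel):

```hlsl
// Sketch: analytic decal mip level from linear depth and the decal's
// world-space size. Ignores the slope of the receiving surface relative to
// the decal plane (the "curvature mismatch" caveat above). Names hypothetical.
float DecalMipFromDepth(float  linearViewDepth,
                        float2 decalWorldSize,      // world-space extents of the decal
                        float2 decalTexResolution,
                        float  tanHalfFovY,
                        float2 viewportSize)
{
    // World-space size of one pixel at this depth (from the vertical FOV).
    float pixelWorldSize = 2.0f * linearViewDepth * tanHalfFovY / viewportSize.y;

    // Footprint of that pixel in decal texels.
    float2 texelFootprint = pixelWorldSize / decalWorldSize * decalTexResolution;

    return log2(max(max(texelFootprint.x, texelFootprint.y), 1.0f));
}

// Usage: decalTex.SampleLevel(samp, decalUV, DecalMipFromDepth(...));
```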

The final possibility that I would consider is pre-computing and storing the mip level or even the derivative information in the G-Buffer. During the material pass, the most useful information is easily available (one could even use CalculateLevelOfDetail on some texture with a known UV mapping density and later simply rescale it to the target decal density – assuming that the decal projection tangent space is at least somewhat similar to the target tangent space) and, depending on the desired quality, it could probably be stored in just a few bits. The “expensive” option would be to calculate and store the derivatives for potential decal anisotropic filtering, or different densities for target triplanar mapping – but I honestly have no idea if that is necessary; it probably depends on what you intend to use the decals for.
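A sketch of the “cheap” variant (names and packing are just assumptions, not a recommendation): during the G-Buffer pass, ask the hardware what mip it would pick for a material texture with a known UV density, quantize that to a few bits, and in the decal pass shift it by the log2 ratio of texel densities:

```hlsl
// G-Buffer pass (pixel shader): store a quantized "UV density" LOD for the pixel.
// refTexture is any material texture with a known resolution / mapping density.
float ComputeStoredLod(Texture2D refTexture, SamplerState samp, float2 materialUV)
{
    float lod = refTexture.CalculateLevelOfDetail(samp, materialUV);
    return saturate(lod / 15.0f); // to be packed into e.g. 4 bits of the G-Buffer
}

// Decal pass: rescale the stored LOD from the reference texture's density to
// the decal's own density. Rough approximation - assumes the decal projection
// space is not wildly different from the material tangent space (as noted above).
float DecalLodFromGBuffer(float storedLod,
                          float refTexelsPerMeter, float decalTexelsPerMeter)
{
    float lod = storedLod * 15.0f;
    return max(lod + log2(decalTexelsPerMeter / refTexelsPerMeter), 0.0f);
}
```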

This is the most promising and possibly cheap approach (many GDC and Siggraph game presentations proved that next-gen consoles seem to be quite tolerant of even very fat G-Buffers 🙂 ), but it makes the screen-space decals less easy to integrate and use, and probably requires more maintenance, editing your material shaders etc.

This idea could be extended much further and generalized towards deferring other aspects of material shading – I have discussed it many times with my industry colleagues – and a similar approach was described by Nathan Reed in his post about “Deferred Texturing”. I definitely recommend it, a very interesting and inspiring article! Is it practical? It seems to me like it could be – the first game developers who do it right could convince others and maybe push the industry into exploring this interesting and promising area. 🙂

Special thanks

I would like to thank Michal Iwanicki, Krzysztof Narkowicz and Florian Strauss for inspiring discussions about those problems and their potential solutions that led to me writing this post (as it seems that this is NOT a solved problem and many developers try to work around it in various ways).


Anamorphic lens flares and visual effects

Introduction

There are no visual effects more controversial than various lens and sensor effects. Lens flares, bloom, dirty lens, chromatic aberrations… All of those have their lovers and haters. A couple of years ago many games used a cheap pseudo-HDR effect by blooming everything; then we had the light-shafts craze (almost every UE3 game had them, often set terribly – “god rays” not matching the lighting environment and the light source at all) and more recently many lo-fi lens effects – dirty lens, chromatic aberrations and anamorphic flares/bloom.

They are extremely polarizing – on the one hand, for some reason art directors and artists love to use them and programmers implement them in their engines, but on the other hand lots of gamers and movie audiences seem to hate those effects and find their use over the top or even distracting. Looking for examples of those effects in games, it is way easier to find criticism like http://gamrconnect.vgchartz.com/thread.php?id=182932 (more on neogaf and basically any large enough gamer forum) than any actual praise… Hands up if you have ever heard from a player “wow, this dirty lens effect was soooo immersive, more of that please!”. 😉

Killzone lens flares – high dynamic range and highly saturated colors producing interesting visuals, or abused effect and unclear image?

It is visible not only in games, but also in movies – it went to the extreme point that, after tons of criticism, movie director J.J. Abrams supposedly apologized for over-using lens flares in his movies.

Star Trek: Into Darkness lens effects example, source: http://www.slashfilm.com/star-trek-lens-flares/

Among other graphics programmers and artists I have very often heard the quite strong opinion that “anamorphic effects are interesting, but are good only for the sci-fi genre or a modern FPS”.

Before stating any opinion of my own, I wanted to write a bit more about anamorphic effects, which IMO are quite fascinating and actually physically “inspired“. To understand them, one has to understand the history of cinematography and analog film.

Anamorphic lenses and film format

I am not a cinema historian or expert, so first I will point you to two links that cover the topic much more in depth and, in my opinion, much better, and provide some information about the history:

Wikipedia entry

RED guide to anamorphic lenses

To sum it up, anamorphic lenses are lenses that (almost always) squeeze the image 2x in the horizontal plane. They were introduced to provide much higher vertical resolution when cinema started to experiment with widescreen formats. At that time, the most commonly used film was 35mm stock and obviously the whole industry didn’t want to exchange all of its equipment for a larger format (impractical equipment size, more expensive processes), especially just for some experiments. Anamorphic lenses allowed for that by using an essentially analog, optics-based compression scheme. This compression was a literal one – squeezing the image before exposing the film and later decompressing it by unsqueezing during projection in the cinema.

The first example of a movie shot using anamorphic lenses is The Robe from 1953, over 60 years ago! Anamorphic lenses provided a simple 2:1 squeeze no matter what the target aspect ratio was – and there were various target aspect ratios, depending on whether sound was encoded on the same strip, what the format was, etc.

 

No anamorphic image stretching – limited vertical resolution. Source: wikipedia author Wapcaplet

 

Effect of increased vertical resolution due to anamorphic image stretching. Source: wikipedia author Wapcaplet

To compensate for squeezed, anamorphic image inverse conversion and stretching were performed during the actual movie projection. Such compression didn’t leave the image quality unaffected – due to lens imperfections it resulted in various interesting anamorphic effects (more about it later).

Anamorphic lenses are more or less a thing of the past – since the transition to digital formats, 4K resolution etc. they are not really needed anymore, and they are expensive, incompatible with many cameras, have poorer optical quality etc. I don’t believe anamorphic lenses are used much anymore, except maybe for some niche experiments – but please correct me in the comments if I’m wrong.

Lens flares

Before proceeding with the description of how it affects the lens flares, I wanted to refer to a great write-up by Padraic Hennessy about the physical basis of lens flare effects in actual, physical lenses. This post covers comprehensively why all lenses (unfortunately) produce some flares and how to simulate these effects.

In short – physical lenses used for movies and photography consist of many glass lens groups. Because of the Fresnel equations and the different IOR of every layer, light is never transmitted perfectly (100%) between air and glass. Note: lens manufacturers coat the glass with special nano-coatings to reduce it as much as possible (except for some hipster “oldschool” lens versions) – but it’s impossible to eliminate it completely.

Optical elements – notice how close pieces of glass are together (avoiding glass/air contact)

With many groups and different transmission values, this results in light reflecting and bouncing multiple times inside the lens before hitting the film or sensor – and in effect some light leaking, flares, transmittance loss and ghosting. For low dynamic range scenes, due to the very small amount of light that gets reflected each time, the result is negligible – but it is worth noting that the image always contains some ghosting and flares, even when they are not measurable. However, with extremely high dynamic range light sources like the sun (orders and orders of magnitude higher intensity), the light after bouncing and reflecting can still be brighter than the other actual image pixels!

Anamorphic lens flares

Ok, so at this point we should understand the anamorphic format, anamorphic lenses and lens flares – so where do the anamorphic lens flares come from? This is relatively simple – light reflection on the glass-air contact surface can happen in many places in the physical lens, both before and after the anamorphic lens components. Therefore the extra transmitted light producing a lens flare will be ghosted as if the image was non-anamorphic and had a regular, not squished aspect ratio. If you look at such exposed and developed film, you will see a squished image, but with some regular looking, circular lens flares. Then, during film projection, it will be stretched and voilà – a horizontal, streaked, anamorphic lens flare and bloom! 🙂

Reproducing anamorphic effects – an experiment

Due to the extremely simple nature of anamorphic effects – your lens effects just happen in a 2x squeezed texture space – you can reproduce them quite easily. I added an option to do so to my C#/.NET framework for graphics prototyping (git update soon), together with some simple procedural and fake lens flares and bloom. I just squeezed my smaller resolution buffers used for blurring by 2 – that simple. 🙂 Here are some comparison screenshots that I’ll comment on in the next paragraphs – for the first 3 of them the blur is relatively smaller. For some of them I added an extra bloom color multiplier (for the cheap sci-fi look 😉 ), the other ones have uncolored bloom.
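If you want to try it, the core really is that trivial – a minimal sketch of the idea below (shader-side, hypothetical names; in my case the buffer allocation lives in the C# framework): allocate the bloom/flare buffer at half the usual width, do the bright pass and blurs in that squeezed space, and let the final full-screen composite stretch it back horizontally.

```hlsl
// Bright pass into an anamorphically squeezed bloom buffer. The render target
// is allocated at (bloomWidth / 2, bloomHeight) - the 2:1 horizontal squeeze
// happens implicitly via the viewport, so the shader itself stays ordinary.
Texture2D    g_SceneColor  : register(t0);
SamplerState g_LinearClamp : register(s0);

float4 AnamorphicBrightPassPS(float4 pos : SV_Position,
                              float2 uv  : TEXCOORD0) : SV_Target
{
    const float bloomThreshold = 1.0f; // hypothetical, exposure dependent
    float3 color = g_SceneColor.Sample(g_LinearClamp, uv).rgb;
    return float4(max(color - bloomThreshold, 0.0f), 1.0f);
}

// Final composite: sampling the half-width, blurred buffer with regular
// full-screen UVs stretches it back 2:1, producing the horizontal streaks.
float3 CompositeAnamorphicBloom(Texture2D bloomHalfWidth, SamplerState samp,
                                float2 fullscreenUV, float3 sceneColor)
{
    return sceneColor + bloomHalfWidth.Sample(samp, fullscreenUV).rgb;
}
```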

Please note that all of those screenshots are not supposed to produce artistically, aesthetically pleasing image, but to demonstrate the effect clearly!

c1 c2 c3

In the following ones the bloom/flare blur is 2x stronger and the effect probably more natural:

c4 c5 c6

Bonus:

I tried to play a bit with anamorphic bokeh achieved in similar way.

anamorphic_bokeh1 anamorphic_bokeh2

Results discussion

First of all, we can see that with simple, procedural effects and Gaussian blurs using the real stretch ratio of 2:1, it is impossible to achieve the crazy anamorphic flare and bloom effect seen in many movies and games, with single, thin lines across the whole screen. So be aware that it can be an artistic choice – but it has nothing to do with real, old school movies and anamorphic lenses. Still, you will probably get such a request when working on an engine with real artists or for some customers – and there is nothing wrong with that.

Secondly, the fact that the procedural effects are anamorphic makes it more difficult to see the exact shape of the ghosting, blends the ghosts together and makes them less distracting. This is definitely a good thing. It is questionable whether it could be achieved only by a more aggressive blur on its own – in my opinion the non-uniform blurring and the fact that the shapes are not mirrored perfectly is more effective for this purpose.

Thirdly, I had no expectations for the anamorphic bokeh and played with it as a bonus… And I still don't know what to think about it, as I'm not as convinced by the results. I never got a request from an artist to implement it and it definitely can look weird (more like a lazy programmer who got the aspect ratio wrong in the DOF 😉 ), but it is worth knowing that such effects actually existed in the real, anamorphic lens/film format scenario.

I would probably prefer to spend that time investigating the physical basis and implementation of busy, circular bokeh (probably just some anamorphic stretch perpendicular to the radius of the image).

My opinion

In my opinion, anamorphic effects like bloom, glare and lens flares are just some of many effects and tools in the artist's toolbox. There is a physical basis for such effects and they are well established in the history of cinema. Therefore viewers and audiences are used to their characteristic look and may even subconsciously expect to see them.

They can be abused, applied in an ugly or over-stylized manner that has nothing to do with reality – but that is not a problem of the technique; it is again an artistic choice, fitting some specific vision. Trust your artists and their art direction.

I personally really like a subtle and physically inspired 2x ratio of anamorphic lens flares, glare and bloom and think they make the scene look better (less distracting) than isotropic, regular procedural effects. Everything just "melts" together nicely.

I would argue with anyone saying such effects fit only a sci-fi setting – in my opinion creating a simulation of the cinematic experience (and a reference to past movies…) is just as valid for any kind of game as trying to 100% simulate human vision – it is a matter of the creative and artistic direction of the game and its rendering. Old movies didn't pick anamorphic lens flares selectively for a specific set & setting – it was a workaround for a technical film limitation and used to exist in every genre of movies!

Therefore, I don't mind them in a fantasy game – as long as the whole pipeline is created in a cinematic and coherent way. A good example of a 100% coherent and beautiful image pipeline is The Order: 1886 – their use of lens aberrations, distortion and film grain looks just right (and, speaking as an engineer – it is technically amazing!) and doesn't interfere with the fantasy-Victorian game setting. 🙂

Over-stylized and sterile, extreme anamorphic lens flares and bloom producing horizontal light streaks over the whole screen probably don't fit into the same category though. I also find them quite uncanny in fantasy/historical settings. Still, as I said – at this point such extreme effects are probably a conscious decision of the art director and should serve some specific purpose.

I hope that my post helped at least a bit with understanding the history and reasoning behind the anamorphic effects – let me know in comments what you think!


Processing scanned/DSLR photos of film negatives in Lightroom

The topic I wanted to cover in this post is a non-destructive workflow for "developing" photographed or scanned negatives – in this case B&W film. Why even bother with film? Because I still love analog photography as a hobby. 🙂 I wrote a post about it some time ago.

Previously I used to work on my scanned negatives in Photoshop. However, even with specialized scanning software like SilverFast it is quite a painful process. Using Photoshop as an additional step has many problems:

  • Either you are working on huge files or lose lots of quality and resolution on save.
  • Even 16 bit output images from the scanning software are low dynamic range.
  • 16 bit uncompressed TIFF files output from scanning software are insanely big in size.
  • Batch processing is relatively slow and hard to control.
  • If you don’t keep your data in huge and slow to load PSD files and adjustment layers, you are going to lose information on save.
  • Photoshop is much more complex and powerful tool, not very convenient for a quick photo collection edit. That’s why Adobe created Lightroom. 🙂

No scanner? No problem!

Previously I used a scanner for many of my all-time favourite film photos.

At the end of the post (not to make it too long) there are some examples scanned using the excellent (for its price and size) Epson V700. I developed the B&W photos myself; the color one was developed at a pharmacy lab for ~$2 a roll.

They are some of my favourite photos of all time. All were taken using the relatively compact Mamiya 6. Such high quality is possible only with medium format film though – don't expect such results from a small frame 35mm camera.

However, I foolishly left my scanner in Poland and wasn't able to use it anymore (I'm afraid that such fragile equipment could get broken during shipping without proper packaging). Buying a new one is not super-cheap. Therefore I started to experiment with using a DSLR (or any camera, really) to get a decent looking, positive digital representation of negatives. I can confirm that it is definitely possible even without buying an expensive macro lens – I hope to write a bit more about the reproduction process later, when it improves. For now I wanted to describe the non-destructive process I came up with in Lightroom, using a small trick with curves.

The Lightroom workflow

Ok, so you take a photo of your negative using some slide copying adapter. You get results looking like this:

step0-input

Far from perfect, isn't it? The fact that it was a Tri-X 400 roll rated at ASA 1250 and developed in Diafine doesn't help – it is a push developer, quite grainy and not very high in detail, with a specific contrast that cannot be fixed using the Adams Zone System and development times. It was also taken using a 35mm rangefinder (the cheapest Voigtlander Bessa R3A), so when taking the photo you can't be sure of proper crop, orientation or sharpness.

But enough excuses – it’s imperfect, but we can try to work around many of those problems in our digital “lightroom”. 🙂

Ok, so let’s go and fix that!

Before starting, I just wanted to make it clear – everything I'll describe is relevant only if you use RAW files and have a camera with good enough dynamic range – any DSLR or mirrorless bought within the last 5-6 years will be fine.

1. Rough and approximate crop

step1-approx-crop

The first step I recommend, even just for the convenience of evaluating your film negatives, is doing some simple cropping and maybe rotating the shot. It not only helps you judge the photo and decide if you want to continue "developing" it (it could have been a completely missed shot), but will also help the histogram used in further parts of the development.

2. Adjusting white balance and removing saturation and noise reduction

step2-wb-and-saturation

In the next step I propose to adjust the white balance (just use the pipette and pick any gray point) and, if working with black and white, completely remove saturation from your photo. Every film has some color cast (it depends on the film and developer – it can be purple, brown etc.). Also, since you (should) shoot it at a small ISO like 100, you can safely remove the extra digital noise reduction that unfortunately is on by default in Lightroom – that's what I did here as well. Notice that in my example the color almost didn't change at all – the camera was smart enough to automatically adjust its WB to the film color.

3. Magic! Inverting the negative

step3-inverting-negative

Ok, this is the trickiest part and a feature that Lightroom is lacking – a simple "invert". In Photoshop it is one of the basic menu options, there is even a keyboard shortcut; here you have to… use the curves. Simply grab your white point and turn it to 0, and do the opposite with the black point – put it to 1. Simple (though the UI can sometimes get stuck, so adjust those points slowly) and it works great! Finally you can see something in this photo. You can also see that this digital "scan" is far from perfect, as the film was not completely flat – hence the blurriness on the edges. 😦 But in the era of desired lo-fi and Instagram maybe it is an advantage? 😉

4. (Optional) Pre-adjusting the exposure and contrast

step4-exposure

This step can be optional – it depends on the contrast of your developed film. In my case I decided to move the exposure and contrast sliders a bit to make further operations on the curves easier – otherwise you might have to make very precise, tiny adjustments on the curves, which due to UI imprecision can be inconvenient. I also cropped a bit more to make the histogram even better and not fooled by the fully lit or dark borders.

5. Adjusting white point

step5-white-point

Now, having a more equalized and useful histogram, you can set your white point using the curves. Obviously your histogram is now reversed – this can be confusing at first, but after working on a first scan you quickly get used to it. The guidelines here are the same as with regular image processing or photography – tweak your slider looking for the points of the scene you want to be white (with B&W you can be more radical in this step), using the histogram as a helper.

Why am I not using Lightroom's "general" controls like blacks, whites etc.? Because their behavior is reversed and very confusing here. They also do some extra magic and have a non-linear response, so it's easier to work with curves. Though if you can find an optimal workflow using those controls – let me know in the comments!

6. Adjusting the black point

step6-black-point

The next step is simple and similar – you proceed in the same way to find your black point and the darkest parts of your photo.

At this point your photo may look too contrasty, too dark or too bright – but don't worry, we are going to fix it in the next step. Also, since all editing in Lightroom is non-destructive, it will still retain the same quality.

7. Adjusting gamma curve / general brightness

step7-gamma

In this step you add another control point to your curve and, by dragging it, create a smooth, gamma-like response. Look mainly at your midtones – medium grays – and the general ambiance of the photo.

You can make your photo brighter or darker – it all depends. In this case I wanted a slightly brighter one.

It can start to lack extra “punch” and the response can become too flat – we will fix it in the next point.

8. Adding extra contrast in specific tonal parts

step8-contrast-heel

By adding an extra point and a "toe" to your curve, you can boost the contrast in specific tonal parts. I wanted this photo to be aggressive (I like how the black and white chess pieces work in B&W), so I added quite an intense one – looking at it now I think I might have overdone it, but this whole post is instructional.

Fun fact for photographers who are not working in the games industry and are not technical artists or graphics programmers – such an S-shaped curve is often called a "filmic tonemapping" curve in games, after the response of analog photographic or movie film.
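
For the more technically minded, here is a hedged numpy sketch of what steps 3-8 boil down to mathematically: the inversion, black/white points, gamma and the S-shaped contrast curve. All parameter values are made up for illustration; in practice you pick them by eye using the histogram.

import numpy as np


def develop_negative(scan, black_point=0.05, white_point=0.85, gamma=0.8, s_strength=0.2):
    # scan: float array in [0, 1] - a linear-ish photo of the negative.
    # Step 3: invert the negative (white point -> 0, black point -> 1).
    positive = 1.0 - scan

    # Steps 5-6: map the chosen black/white points to 0 and 1.
    leveled = np.clip((positive - black_point) / (white_point - black_point), 0.0, 1.0)

    # Step 7: gamma / general brightness (gamma < 1 brightens the midtones).
    graded = leveled ** gamma

    # Step 8: a gentle S-shaped contrast curve around the midtones
    # (flatter near black and white, steeper in the middle).
    s_curved = graded - s_strength * np.sin(2.0 * np.pi * graded) / (2.0 * np.pi)

    return np.clip(s_curved, 0.0, 1.0)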

9. Final crop and rotation

step9-final-crop

I probably should have done it earlier, but I added some extra cropping and straightened the photo. I also added some sharpening / unsharp mask to compensate for the original photo and the film "scan" not being perfectly sharp.

10. Results after saving to the hard drive

DSC01359

This is how it looks after a resized save from Lightroom to disk – not too bad compared to the starting point, right?

The best advantage of Lightroom – you can extremely easily copy those non-destructive settings (or even create a preset!) and apply them to other photos from your scans! I spent no longer than 3 minutes on each of those 2 extra shots. 🙂 Very convenient and easily controllable batch processing.

DSC01362 DSC01361

Conclusions

I hope I have shown that a non-destructive workflow for processing scans of negatives in Lightroom can be fast, easy and productive, and that you can batch-process many photos. It is an amazing tool and I'm sure other, better photographers will get even better results!

And I promise to write more about my scanning rig assembled for around $100 (assuming you already have a camera) and post some more scans from better quality, lower ASA films or medium format shots.

Bonus

As I promised, some photos that I took a couple of years ago using the Mamiya 6. All scanned and processed manually (the color one was developed at a cheap pharmacy photo lab). Medium format 6×6 composition – another reason to start using film. 🙂

wilno3 wilno2 wilno1 warsaw3 warsaw2 warsaw1


Designing a next-generation post-effects pipeline

Hey, it's been a while since my last post. Today I will focus on the topic of post-effects. Specifically, I wanted to talk about the next-gen post-process pipeline and redesign I worked on while being a part of the Far Cry 4 rendering team. While I'm no longer a Ubisoft employee, my post doesn't represent the company in any way, and I can't share, for example, internal screenshots and debug views of buffers and textures I no longer have access to, I think it is a topic worth discussing in a "general" way, and some ideas could be useful for other developers. Some other aspects of the game were discussed in Michal Drobot's presentation [1]. Also, at GDC 2015 Steve McAuley will talk about Far Cry's impressive lighting and vegetation technology [9] and Remi Quenin about game engine, tools and pipeline improvements [12] – if you are there, be sure to check out their presentations!

Whole image post-processing in 1080p on consoles took around 2.2ms.

Introduction

Yeah, image post-processing – a usual and maybe even boring topic? It was described in detail by almost every game developer during the previous console generation. Game artists and art directors got interested in "cinematic" pipelines and movie-like effects that are used to build mood, attract the viewer's attention to specific parts of the scene and, in general, enhance image quality. So it was covered very well and many games got excellent results. Still, I believe that most games' post-effects can be improved – especially given the new, powerful hardware generation.

The definition of a post-effect can be very wide and cover anything from tone-mapping through AA up to SSAO or even screen-space reflections! Today I will cover only the "final" post-effects that happen after lighting, so:

  • Tonemapping,
  • Depth of field,
  • Motion blur,
  • Color correction,
  • “Distortion” (refraction),
  • Vignetting,
  • Noise/grain,
  • Color separation (can serve as either glitch effect or fake chromatic aberration),
  • Various blur effects – radial blur, gaussian blur, directional blur.

I won't cover AA – Michal Drobot described it exhaustively at Siggraph and mentioned some of his work on SSAO during his Digital Dragons presentation. [1]

State of the art in post-effects

There have been many great presentations, papers and articles about post-effects. I would just like to give some references to great work that we built on and tried to improve in some aspects:

– Crytek presentations in general, they always emphasize importance of highest quality post-effects. I recommend especially Tiago Sousa’s Siggraph 2011-2013 presentations. [2]

– Dice / Frostbite smart trick for hexagonal bokeh rendering. [3]

– Morgan McGuire work together with University of Montreal on state of the art quality in motion blur. [4]

– And recent amazing and comprehensive publication by Jorge Jimenez, expanding work of [4] and improving real-time performance and plausibility of visual results. [5]

Motivation

With so many great publications available, why didn't we use exactly the same techniques on Far Cry 4?

There are many reasons, but the main one is performance and how the effects work together. Far Cry 3, Blood Dragon and then Far Cry 4 are very "colorful" and effect-heavy games – it is part of the games' unique style and art direction. Depth of field, motion blur, color correction and many others are always active, and in heavy combat scenes 4-6 other effects kick in! Unfortunately they were all designed separately, often not working very well together, and they were not working in HDR – so there were no interesting effects like bright bokeh sprites. But even with simple, LDR effects, their frame time often exceeded 10ms! It was clear to us that we needed to address post-processing in a unified manner – so re-think, re-design and re-write the pipeline completely. We got a set of requirements from the art director and FX artists:

– Depth of field had to produce circular bokeh. I was personally relieved! 🙂 I already wrote about how much I don't like hexagonal bokeh and why IMO it makes no sense in games (a low-quality/cheap digital camera effect vs. human vision and high definition cameras and cinematic lenses). [6]

– They wanted “HDRness” of depth of field and potentially other blur and distortion effects. So bright points should cause bright motion blur streaks or bokeh circles.

– Proper handling of near and far depth of field and no visible lerp blend between sharp and blurred image – so gradual increase/decrease of CoC.

– Many other color correction, vignetting, distortion (refraction) and blur effects.

– Motion blur to work stable and behave properly in high-velocity moving vehicles (no blurring of the vehicle itself) without hacks like masks for foreground objects.

– Due to the game's fast tempo, many moving objects and lots of blurs happening all the time, there was no need for proper "smearing" of moving objects; at first the art director prioritized per-object motion blur very low – fortunately we could sneak it in almost for free, getting rid of many artifacts of the previous, "masked" motion blur.

– Most important – almost all effects active all the time! DoF was used for sniper rifle aiming, focus on main weapon, binoculars, subtle background blurring etc.

The last point made it impossible to use many existing techniques at 1080p with good performance. We set ourselves a performance goal – around 2ms total spent on post-effects (not including post-fx AO and AA) per frame on consoles.

Some general GCN/console post-effect performance optimization guidelines

Avoid heavy bandwidth usage. Many post-effects do data multiplication and can eat huge amounts of the available memory bandwidth. Anything done to operate on smaller targets, use smaller color bit depths, cut the number of passes or apply other forms of data/bandwidth compression will help.

Reduce your number of full-screen passes as much as possible. Every such pass has a cost associated with reading and outputting a full-screen texture – there is some cache reload cost as well as export memory bandwidth cost. On next-gen consoles it is relatively small – smaller than on the X360 (where you had to "resolve" after every pass if you wanted to read data back) even at a much higher resolution – but at 1080p and with many passes and effects it adds up!

Avoid weird data-dependent control flow to allow efficient latency hiding. I wrote about latency hiding techniques on the GCN architecture some time ago [7] and suggested that, in the case of many needed samples (the typical post-effect use case), this architecture benefits from batching samples together and hiding latency without wave switching. Any kind of data-dependent control flow will prevent this optimization – watch out for branches (especially dynamically calculating the required number of samples – often planning for the worst case works better! But take it with a grain of salt – sometimes it is good to dynamically reject, for example, half of the samples; just don't rely on a dynamic condition that can take 1-N samples!).

With efficient GPU caches it is easy to see a "discrete performance steps" effect. What I mean is that often adding a new sample from some texture won't make performance worse – the GPU will still fit more or less the same working set in cache and will be able to perfectly hide the latency. But add too many source textures or increase their size and suddenly the timing can even double! It means you just exceeded the optimal cache working set size and started to thrash your caches, causing their reloading. This advice doesn't apply to ALU – it almost always scales with the number of instructions, and if you are not bandwidth-bound it is always worth doing some fast math tricks.

Often previous console generation advice is counterproductive. One example is the practice from previous consoles of saving some ALU in the PS by moving trivial additions (like pixel offsets for many samples) to the VS and relying on hardware triangle parameter interpolation – this way we got rid of some instructions and, if we were not interpolation-bound, we observed only a performance increase. However, on this architecture there is no hardware interpolation – all interpolation is done in the PS! Therefore such code can actually be slower than doing those additions in the PS. And thanks to the "sample with literal offset" functions (the last parameter of almost all Sample / SampleLevel / Gather functions), if you have a fixed sample count you probably don't need to do any ALU operations at all!

Be creative about non-standard instruction use. DX11+ has tons of Sample and Gather functions and they can have many creative uses. For example, to take N horizontal samples from a 1-channel texture (with no filtering), it is better to do N/2 gathers and just ignore half of the gathered results! It really can make a difference and allow for many extra passes with timings of e.g. 0.1ms.

Finally, I would like to touch on a quite controversial topic, and this is my personal opinion – I believe that when designing visual algorithms and profiling runtime performance, we should aim to improve the worst case, not the average case. This point is especially valid for special (post) FX – they kick in exactly when the scenery is heaviest for the GPU because of particles, many characters and dynamic camera movement. I noticed that many algorithms rely on forms of "early out" and special optimal paths. This is great as an addition and to save some milliseconds, but I wouldn't rely on it. Having such fluctuations makes it much harder for technical artists to optimize and profile the game – I prefer to "eat" some part of the budget even if the effect is not visible at the moment. There is nothing worse than stuttering in action-heavy games during those intense moments when the demand for interactivity is highest! But as I said, this is a controversial topic, and I know many great programmers who don't agree with me. There are no easy answers or single solutions – it depends on the specific case of the game, its performance requirements etc. For example, hitting 60fps most of the time with occasional drops to 30fps would probably be better than a constant 45 v-synced to 30.

Blur effects

The whole idea for the pipeline is not new or revolutionary; it has appeared on many internet forums and blogs for a long time (thanks to some people I now have the reference I was talking about – thanks! [13]). It is based on the observation that all blurs can be combined together if we don't really care about their order. Based on this we started with combining motion blur and depth of field, but ended up including many more blurs: whole-screen blur, radial blur and directional blur. A Poisson disk of samples can be "stretched" or "rotated" in a given direction, giving the blur directionality and the desired shape.

Stretching of CoC Poisson disk in the direction of motion vector and covered samples.

If you do it at half screen resolution, take enough samples and calculate "occlusion" smartly – you don't need more than one pass! To be able to fake occlusion we used a "pre-multiplied alpha" approach. The blur effect would be fed 2 buffers:

  1. Half resolution blur RGB parameters/shape description buffer. Red channel contained “radius”, GB channels contained directionality (signed value – 0 in GB meant perfectly radial blur with no stretch).
  2. Half resolution color with “blurriness”/mask in alpha channel.

In the actual blur pass we wouldn't care at all about the source of blurriness – we just did 1 sample from the blur shape buffer, and then 16 or 32 samples (depending on whether it was a cut-scene or not) from the color buffer, weighting by the color alpha and renormalizing afterwards – that's all! 🙂
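
Here is a hedged, CPU-side sketch (in Python/numpy, for illustration only – the real thing was a half-resolution pixel shader) of what that single blur pass boils down to: a Poisson disk scaled by the blur radius, stretched along the per-pixel direction from the shape buffer, with premultiplied-alpha-style weighting and renormalization. The disk values and helper names are made up.

import numpy as np

# Some fixed Poisson-disk-like offsets inside the unit circle (illustrative values).
POISSON_DISK = [
    (0.0, 0.0), (0.53, 0.28), (-0.42, 0.61), (0.19, -0.77),
    (-0.71, -0.21), (0.88, -0.33), (-0.15, 0.34), (0.35, 0.82),
]


def combined_blur_at(color, blur_mask, shape_rgb, x, y):
    # color: (H, W, 3); blur_mask: (H, W) "blurriness" alpha in [0, 1];
    # shape_rgb: (radius_in_pixels, dir_x, dir_y) - the per-pixel shape buffer value.
    radius, dir_x, dir_y = shape_rgb
    h, w, _ = color.shape

    accum = np.zeros(3)
    weight_sum = 0.0
    for ox, oy in POISSON_DISK:
        # Stretch the disk along the (signed) direction from the shape buffer;
        # a zero direction leaves a purely radial, unstretched blur.
        along = ox * dir_x + oy * dir_y
        sx = int(round(x + radius * (ox + along * dir_x)))
        sy = int(round(y + radius * (oy + along * dir_y)))
        sx = min(max(sx, 0), w - 1)
        sy = min(max(sy, 0), h - 1)

        # Premultiplied-alpha style weighting by the sample's own blurriness mask,
        # which fakes occlusion between sharp and blurred regions.
        wgt = blur_mask[sy, sx]
        accum += color[sy, sx] * wgt
        weight_sum += wgt

    # Renormalize by the accumulated weight.
    return accum / max(weight_sum, 1e-5)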

How would the blur shape and blurriness alpha/mask be calculated? It was a mixture of samples from the motion vector buffer, the circle of confusion buffer, some artist-specified masks (in the case of the generic "screen blur" effect) and some ALU for radial or directional blur.

Ok, but what about the desired bleeding of out-of-focus near objects onto sharp, in-focus background objects? We used a simple trick of "smearing" the circle of confusion buffer – blurred objects in front of the focus plane would blur their CoC onto sharp, in-focus objects. To extend the CoC of near objects efficiently, and not extend far-blurred objects onto the sharp background, we used a signed CoC. Objects behind the focus plane had a negative CoC sign, and during the CoC extension we would simply saturate() the fetched value and calculate the maximum with the unclamped, original value. No branches, no ALU cost – the CoC extension was separable and had an almost negligible cost of AFAIR 0.1ms.
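
A small numpy sketch of that signed CoC extension trick, assuming the sign convention above (positive CoC in front of the focus plane, negative behind it). The real version was a separable, two-pass max filter in a shader; here a plain 2D maximum filter stands in for it, and the radius is an arbitrary value.

import numpy as np
from scipy import ndimage


def extend_near_coc(signed_coc, radius=8):
    # signed_coc: (H, W) float array; CoC > 0 in front of the focus plane (near blur),
    # CoC < 0 behind it (far blur).

    # saturate(): far (negative) CoC becomes 0, so it never spreads anywhere.
    near_only = np.clip(signed_coc, 0.0, 1.0)

    # Dilate ("smear") the near CoC over neighbouring pixels.
    near_extended = ndimage.maximum_filter(near_only, size=2 * radius + 1)

    # max() with the unclamped original: far pixels keep their negative CoC,
    # sharp pixels pick up any near blur that bleeds over them.
    return np.maximum(signed_coc, near_extended)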

Synthetic example of DoF CoC without near depth extension.

Synthetic example of DoF CoC with near depth extension. Notice how only near CoC extends onto sharp areas – far CoC doesn't get blurred.

Obviously it was not as good as proper scatter-as-gather approaches and what Jorge Jimenez described in [5], but with some tweaking of this blur “shape” and “tail” it was very fast and produced plausible results.

Whole pipeline overview

You can see a very general overview of this pipeline in the following diagram.

postprocess_diagram

Steps 1-3 were already explained, but what also deserves some attention is how the bloom was calculated. Bloom used fp 11-11-10 color buffers – HDR, with high enough precision when pre-scaled, good looking, and 2x less bandwidth!

For the blur itself, we borrowed an idea from Martin Mittring's Unreal Engine 4 presentation [8]. The mathematical background is easy – according to the Central Limit Theorem, the average of many independent random variables (with many different distributions, including the uniform one) converges to a Gaussian distribution. Therefore we approximated a Gaussian blur with many octaves of efficient box-sampled blurring of the thresholded bloom buffer. The number of samples for every pass was relatively small to keep the data in L1 cache if possible, but with many such passes the combined effect nicely approached a very wide Gaussian curve. They were combined together into a ½ resolution buffer in step 4, with artist-specified masks and the typical "dirty lens" effect texture applied (only the last octaves contributed to the dirty lens). There was also a combine with a "god-rays"/"lens-flare" post-effect in this step, but I don't know if it was used in the final game (the cost was negligible, but it definitely is a past-gen effect…).
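
A tiny numpy demonstration of that Central Limit Theorem argument (purely illustrative – not the actual bloom code): convolving a box filter with itself just a few times already lands very close to a Gaussian of the same variance.

import numpy as np

box = np.ones(9) / 9.0           # a simple box (uniform average) kernel

kernel = box.copy()
for _ in range(3):               # a few "octaves" of repeated box filtering
    kernel = np.convolve(kernel, box)

# Compare against a Gaussian of the same variance.
# Variance of a box of width w is (w^2 - 1) / 12 and adds up over convolutions.
var = 4.0 * (9 ** 2 - 1) / 12.0
x = np.arange(len(kernel)) - (len(kernel) - 1) / 2.0
gauss = np.exp(-x ** 2 / (2.0 * var))
gauss /= gauss.sum()

print(np.abs(kernel - gauss).max())   # prints a small residual - already very close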

The most complex, most expensive and the only full-screen resolution pass was step 5.

It combined not only the bloom, half-resolution blurs and the sharp image, but also performed the tone-mapping operator, 3D texture color correction, other ALU and texture operations and a simple ALU-based noise/dithering effect (the magnitude of the noise was calculated to be at least 1 bit of sRGB). Please note that the tone-mapping didn't include the exposure – the image was already exposed properly in the lighting / emissive / transparent shaders. It allowed for much better color precision, no banding and easier-to-debug color buffers. I hope that Steve McAuley will describe it more in his GDC talk, as part of the lighting pipeline he designed and developed.

But what I found surprising performance-wise, and I think is worth sharing, is that we also calculated distortion / refraction and color separation there. It was cheaper to do color separation as 3x more samples from every combined buffer! Usually they were not very far from the original ones, and the effect was localized in screen space within adjacent pixels, so there was not much additional cost. Separate passes for those effects were more expensive (and harder to maintain) than this single "uber-pass". There were many more passes combined in there and we applied similar logic elsewhere – sometimes it is possible to calculate a cascade of effects in a single pass. It allows for saving bandwidth, reducing export cost and improved latency hiding – and post-process effects usually don't have dependent flow in the code, so even with lower occupancy, performance is great and the latency is hidden.
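
A minimal numpy sketch of the color separation idea – each channel is simply fetched with a slightly different offset, so inside a combined pass it costs only a couple of extra samples. The offsets below are arbitrary illustrative values, not the ones used in the game.

import numpy as np


def color_separation(image, shift_px=2):
    # image: (height, width, 3) array. Green stays put; red and blue are fetched
    # with small opposite horizontal offsets - the "extra samples" per pixel.
    out = image.copy()
    out[:, :, 0] = np.roll(image[:, :, 0], -shift_px, axis=1)
    out[:, :, 2] = np.roll(image[:, :, 2], shift_px, axis=1)
    return out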

Summary

The described solution performed very fast (and the worst case was only a bit slower than the average) and gave nice, natural effects. The way all the effects were unified and worked together allowed for good color precision. As it was all called from a single place and the order was clearly defined in one shader file, it was easy to refactor, maintain and change. The single blur shader provided a great performance optimization but also improved quality (it was affordable to take many samples).

However, there are some disadvantages of this technique.

– There were some "fireflies" – artifacts caused by too much smearing of bright HDR pixels, especially when doing some intermediate steps at partial resolution. A smart and fast workaround seems to be the weighting operator suggested by Brian Karis. [11] It would come at almost no additional cost (we were already doing premultiplied alpha weighting). However, it would mean that artists would lose some of the HDR-ness of the DoF. So as always – if you cannot do "brute-force" supersampling, you have to face some trade-offs…

– There was no handling of the "smearing" of motion blurred objects at all. If you find it a very important feature, then it would probably be possible to do some blurring / extension of the motion vector buffer while taking occlusion into account – but such a pass, even at half resolution, would add some extra cost.

– The circle of confusion extension/blur for near objects was sometimes convincing, but sometimes looked artificial. It depended a lot on tweaked parameters and fudge factors – after all, it was a bit of a "hack", not a proper, realistic sprite-based scatter solution. [6]

– Finally, there were some half-resolution artifacts. This is pretty self-explanatory. The worst one was caused by taking bilinear samples from the half-resolution blur "mask" stored in the blur buffer's alpha channel. The worst case was when moving fast along a wall. The gun was not moving in screen space, but the wall was moving very fast and accidentally grabbed some samples from the gun outline. We experimented with more aggressive weighting, changing the depth minimizing operator to "closest" etc., but it only made the artifact less visible – it could still appear in the case of very bright specular pixels. The firefly reduction weighting technique could probably help here. Also, 3rd person games would be much less prone to such artifacts.

References

[1] http://michaldrobot.files.wordpress.com/2014/08/hraa.pptx

[2] http://www.crytek.com/cryengine/presentations

[3] http://publications.dice.se/attachments/BF3_NFS_WhiteBarreBrisebois_Siggraph2011.pptx

[4] http://graphics.cs.williams.edu/papers/MotionBlurHPG14/

[5] http://advances.realtimerendering.com/s2014/index.html#_NEXT_GENERATION_POST

[6] https://bartwronski.com/2014/04/07/bokeh-depth-of-field-going-insane-part-1/

[7] https://bartwronski.com/2014/03/27/gcn-two-ways-of-latency-hiding-and-wave-occupancy/

[8] http://advances.realtimerendering.com/s2012/index.html

[9] http://www.gdconf.com/news/see_the_world_of_far_cry_4_dec.html

[10] http://www.eurogamer.net/articles/digitalfoundry-2014-vs-far-cry-4

[11] http://graphicrants.blogspot.com/2013/12/tone-mapping.html

[12] http://schedule.gdconf.com/session/fast-iteration-for-far-cry-4-optimizing-key-parts-of-the-dunia-pipeline

[13] http://c0de517e.blogspot.com.es/2012/01/current-gen-dof-and-mb.html


CSharpRenderer Framework update

In a couple of days I'm saying goodbye to my big desktop PC for the next several weeks (relocation), so it's time to commit some stuff to my CSharpRenderer GitHub repository that has been waiting for it for way too long. 🙂

Startup time optimizations

The goal of this framework was to provide iterations as fast as possible. At first, with just a few simple shaders, it wasn't a big problem, but when the framework started growing it became something to address. To speed it up I did the following two optimizations:

Geometry obj file caching

Fairly simple – create a binary cache instead of loading and processing the obj file text every time. On my HDD, in Debug mode, it gives up to two seconds of start-up time speed-up.

Multi-tasked shader compilation

Shader compilation (pre-processing and building binaries) is trivially parallelizable, so I simply needed to make sure it's stateless and that only loading the binaries into the driver and device happens from the main, immediate context.

I highly recommend the .NET Task Parallel Library – it is both super simple and powerful, has very nice syntax with lambdas and allows for complex task dependencies (child tasks, task continuations etc.). It also hides the problematic thread vs. task management from the user (think in tasks and multi-tasking, not multiple threads!). I didn't use all of its power (like the Dataflow features, which would make sense here), but it is definitely worth taking into consideration when developing any form of multitasking in .NET.

Additional tools for debugging

shapshots_features

I added simple feature toggles (auto-registered and auto-reloaded UI) to allow turning features on and off more easily from within the UI. To provide additional debugging help with this feature and some others (like changing a shader when optimizing and checking whether anything changed quality-wise, and in which parts of the scene), I added an option for taking "snapshots" of the final image. It supports quickly switching between the snapshot and the current final image, or displaying the snapshot vs. current image difference. Much faster than reloading a whole shader.

Half resolution / bilateral upsampling helpers

Some helper code to generate an offsets texture for bilateral upsampling. For every full-res pixel it generates offset information that, depending on the depth differences between the full-res and half-res pixels, either uses the original bilinear information (offset equal to zero), snaps to edge-bilinear (instead of quad-bilinear), or even uses point sampling (closest depth) from the low resolution texture when the depth differences are big. The benefit of doing it this way (instead of in every upscale shader) is much lower shader complexity and potentially better performance (when having multiple half-res -> full-res steps); also fewer used registers and better occupancy in the final shaders.
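
A hedged Python sketch of the offset-generation idea, simplified to a two-case decision (plain bilinear vs. snapping towards the closest-depth low-res texel); the actual helper also distinguishes an intermediate edge-bilinear case, and the threshold below is made up.

import numpy as np


def bilateral_upsample_offset(full_depth, half_depth, x, y, depth_eps=0.05):
    # Returns a sample offset (in half-res texel units) for full-res pixel (x, y):
    # (0, 0) means "plain bilinear is fine", otherwise snap towards the half-res
    # neighbour whose depth is closest to the full-res depth.
    hx, hy = x // 2, y // 2
    h, w = half_depth.shape

    best_offset = np.zeros(2)
    best_diff = np.inf
    all_similar = True
    for oy in (0, 1):
        for ox in (0, 1):
            sx, sy = min(hx + ox, w - 1), min(hy + oy, h - 1)
            diff = abs(float(half_depth[sy, sx]) - float(full_depth[y, x]))
            all_similar = all_similar and (diff < depth_eps)
            if diff < best_diff:
                best_diff = diff
                best_offset = np.array([ox - 0.5, oy - 0.5])

    return np.zeros(2) if all_similar else best_offset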

bilat_upsampling

Physically-correct env LUTs, cubemap reflections and (less correct) screen-space reflections

I added importance sampling based cubemap mip chain generation for GGX distribution and usage of proper environment light LUTs – all based on last year’s Brian Karis Siggraph talk.

I also added very simple screen-space reflections. They are not full performance (the reflection calculation code is "simple", not super-optimized) or quality (noise and temporal smoothing), more a demonstration of the technique and of why adding indirect specular occlusion is so important.

Screen-space reflections are temporally supersampled with additional blurring step (source of not being physically correct) and by default look very subtle due to lack of metals or very glossy materials, but still useful for occluding indirect speculars.

without_ssrwith_ssr

As they re-use previous frame lighting buffer we actually get multi-bounce screen-space reflections at cost of increasing temporal smoothing and trailing of moving objects.

Whether to use them or not in a game is something I don't have a clear opinion on – my views were expressed in one of the first posts on this blog. 🙂

Future

I probably won't update the framework for a while because I'll have only a MacBook Pro available for at least several weeks, possibly months (unless I need to integrate a critical fix), but I plan to do a quite big write-up about my experiences with creating an efficient next-gen game post-processing pipeline and optimizing it – and later definitely post some source code. 🙂


Review: “Multithreading for Visual Effects”, CRC Press 2014

Today I wrote a short review of a book I bought and read recently – "Multithreading for Visual Effects", published by CRC Press in 2014 and including articles by Martin Watt, Erwin Coumans, George ElKoura, Ronald Henderson, Manuel Kraemer, Jeff Lait and James Reinders. A couple of friends asked me if I recommend it, so I will try to briefly describe its contents and who I can recommend it to.

BxN4wfoIcAE00_t

What this book is not

This book is a collection of various VFX-related articles. It is not meant to be a complete / exhaustive tutorial on designing multi-threaded programs or algorithms, or on how the VFX industry approaches multithreading in general. On the other hand, I don't really feel it's just a collection of technical papers / advancements like the ShaderX or GPU Pro books are. It doesn't include a very detailed presentation of any single algorithm or technique. Rather, it is a collection of post-mortems from various studios, groups and people working on a specific piece of technology: how they had to face multi-threading, the problems they encountered and how they solved them.

Lots of articles have no direct translation to games or real time graphics – you won’t get any ready-to-use recipe for any specific problem, so don’t expect it.

What I liked about it

I really enjoyed the practical aspects of the book – it talks about actual problems. Most of the problems come from the fact that existing code bases contain tons of non-threaded / non-tasked legacy code with tons of global state and "hacks". It is trivial to say "just rewrite bad code", but when talking about technology developed for many years, producing the desired results and already deployed in huge studios (it seems that VFX studios are often an order of magnitude larger than game ones…), it is obviously rarely possible. One article provides very interesting reasoning in the whole "refactor vs rewrite" discussion.

The authors are not afraid to talk about such imperfect code and provide practical information on how to fix it and avoid such mistakes in the future. There are at least a couple of articles that mention best code practices and ideas about code design (like working on contexts, a stateless / functional approach, avoiding global state, thinking in tasks etc.).

I also liked that the authors provided a very clear description of "failures" and of the practicality of the final solutions – what did and what didn't work, and why. This is definitely something most scientific / academic papers are lacking, but here it was described clearly and will definitely help readers.

Short chapters descriptions

“Introduction and Overview”, James Reinders

A brief introduction to the history of hardware, its multi-threading capabilities and why they are so important. A distinction between threading and tasking. A presentation of different parallel computation solutions easily available in C++ – OpenMP, Intel TBB, OpenCL and others. A very good book introduction.

“Houdini, Multithreading existing software”, Jeff Lait

A great article about the problem of multithreading existing, often legacy, code bases. A description of best practices when designing multi-threaded/tasked code and how to fix existing, imperfect code (and the various kinds of problems / anti-patterns you may face). I can honestly recommend this article to any game or tools programmer.

“The Presto Execution System: Designing for Multithreading”, George ElKoura

An introductory article about threaded system designs when dealing with animations. Very beneficial for any engine or tools programmer, as it describes many options for parallelism strategies and their pros and cons. The final applied solution is not really applicable to game runtimes, but IMO this article is still a very practical and good read for game programmers.

“LibEE: Parallel Evaluation of Character Rigs”, Martin Watt

The second chapter exclusively about animation, but applicable to any node/graph-based system and its evaluation. Probably my favorite article in the book because of all the performance numbers, compared approaches and practical details. I really enjoyed its in-depth analysis of several cases, how multi-tasking worked on specific rigs and how content creators can (and probably at some point will have to) optimize their content for optimal, parallel evaluation. That last part is something rarely covered by other articles at all.

“Fluids: Simulation on the CPU”, Ronald Henderson

An interesting article describing the process of picking and evaluating the most efficient parallel data structures and algorithms for the specific case of fluid simulation. It is definitely not an exhaustive description of the fluid simulation problem, but rather an example analysis of parallelizing a specific problem – very inspiring.

“Bullet Physics: Simulation with OpenCL”, Erwin Coumans

Introduction to GPGPU with OpenCL with case study of Bullet physics engine. Introduction to rigid body simulation, collision detection (tons of references to great “Real-time collision detection“) nicely overlapping with description of OpenCL, GPGPU / compute simulations and differences between them and classic simulation solutions.

“OpenSubdiv: Interoperating GPU Compute and Drawing”, Manuel Kraemer

IMO the most specialized article. As I'm not an expert on mesh topologies, tessellation and Catmull-Clark surfaces, it was quite hard for me to follow. Still, the depiction of the title problem is clear and the proposed solutions can be understood even by someone who doesn't fully understand the domain.

Final words / recommendation

I feel that with next-gen and bigger game levels, vertex counts and texture resolutions, we need not only better runtime algorithms, but also better content creation and modification pipelines. Tools need to be as responsive as they used to be a couple of years ago, but this time with order-of-magnitude bigger data sets to work on. This is the area where we have almost converged with the problems the VFX industry faces. From discussions with many developers, it seems to be the biggest concern of most game studios at the moment – tools are lagging in development compared to the runtime part, and we are just beginning to utilize network caches and parallel, multithreaded solutions.

I always put emphasis on short iteration times (they allow fitting more iterations into the same time, more prototypes, which directly translates to better final quality of everything – from core gameplay to textures and lighting), but with such big data sets to process, they would have to grow unless we optimize our pipelines for modern workstations. Multi-threading and multi-tasking is definitely the way to go.

Too many existing articles and books either only mention the parallelization problem, or silently ignore it. "Multithreading for Visual Effects" is very good, as it finally describes the practical side of designing code for multi-threaded execution.

I can honestly recommend “Multithreading for Visual Effects” to any engine, tools and animations programmers. Gameplay or graphics programmers will benefit from it as well and hopefully it will help them create better quality code that runs efficiently on modern multi-core machines.


Python as scientific toolbox – 8 months later

I started this blog with a simple post about my attempts to find a free Mathematica replacement for general scientific computing with a focus on graphics. At that time I recommended scientific Python and the WinPython environment.

Many months have passed; I have used lots of numerical Python at home and a bit of Mathematica at work, and I would like to share my experiences – both good and bad – as well as some simple tips to increase your productivity. This is not meant to be any kind of detailed description, guide or tutorial – so if you are new to Python as a scientific toolset, I recommend you check out the great Scientific Python 101 by Angelo Pesce before reading my post.

My post is definitely not exhaustive and is very personal – if you have different experiences or I got something wrong – please comment! 🙂

Use Anaconda distribution

In my original post I recommended WinPython. Unfortunately, I don't use it anymore and at the moment I can definitely vouch for Anaconda. One quite obvious reason is that I started to use a MacBook Pro and Mac OS X – WinPython doesn't work there. I'm not a fan of having different working environments and different software on different machines, so I had to find something working on both Windows and Mac OS X.

Secondly, I've had some problems with WinPython. It works great as a portable distribution (it's very handy to have it on a USB key), but once you want to make it an essential part of your computing environment, problems with its registration in the system start to appear. Some packages didn't want to install, some others had problems updating, and there were version conflicts. I even managed to break the distro through desperate attempts to make one of the packages work.

Anaconda is great. Super easy to install, has tons of packages, automatic updater and “just works”. Its registration with system is also good and “works”. Not all interesting packages are available through its packaging system, but I found no conflicts so far with Python pip, so you can work with both.

At the moment, my recommendation would be – if you have administrative rights on a computer, use Anaconda. If you don’t (working not on your computer), or want to go portable, have WinPython on your USB key – might be handy.

Python 2 / 3 issue is not solved at all

This one is a bit sad and ridiculous – a perfect example of what goes wrong in all kinds of open source communities. When someone asks me if they should get Python 2.7+ or 3.4+, I simply don't have an easy answer – I don't know. Some packages don't work with Python 3, some others don't work with Python 2 anymore. I don't feel there is any strong push for Python 3, for "compatibility / legacy reasons"… A very weird situation that definitely blocks the development of the language.

At the moment I use Python 2, but try to use imports from __future__ and write everything compatible with Python 3, so I won’t have problems if and when I switch. Still, I find lack of push in the community quite sad and really limiting the development/improvement of the language.

Use IPython notebooks

My personal mistake was that for too long I didn't use IPython and its amazing notebook feature. Check out this presentation, I'm sure it will convince you. 🙂

I was still doing the old-school code-execute-reload loop that was hindering my productivity. With Sublime Text and Python registered in the OS it is not that bad, but still, with IPython you get much better results. Notebooks provide interactivity that is maybe not as good as Mathematica's, but comparable – and much better than the regular software development loop. You can easily re-run, change parameters, debug, see help and profile your code, and have nice text, TeX or image annotations. IPython notebooks are easy to share, store and come back to later.

IPython as a shell is also quite OK by itself – even as an environment to run your scripts from (with handy profiling macros, help and debugging).

NumPy is great and very efficient…

NumPy is almost all you need for basic numerical work. The SciPy packages (things like distance arrays, least squares fitting or other regression methods) provide almost everything else. 🙂 For stuff like Monte Carlo, numerical integration, pre-computing functions and many other tasks I found it sufficient and performing very well. The slicing and indexing options can be non-obvious at the beginning, but once you get some practice they are very expressive. Big volume operations can boil down to a single expression with implicit loops over many elements that are internally written in efficient C. If you have ever worked with Matlab / Octave you will feel very comfortable with it – to me it is definitely more readable than the weird Mathematica syntax. Also, interfacing with file operations and many libraries is trivial – Python becomes expressive and efficient glue code.
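
As a small illustration of those implicit loops (my own toy example, not from the post): computing all pairwise Euclidean distances between two point sets with broadcasting instead of nested Python loops.

import numpy as np

a = np.random.rand(1000, 3)   # 1000 points in 3D
b = np.random.rand(500, 3)    # 500 points in 3D

# (1000, 1, 3) - (1, 500, 3) broadcasts to (1000, 500, 3); the loops run in C.
dists = np.sqrt(((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1))
print(dists.shape)            # (1000, 500)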

…but you need to understand it and hack around silent performance killers

On the other hand, using NumPy very efficiently requires a quite deep understanding of how it works internally. This is obviously true of any programming language, environment or algorithm – but unfortunately in the case of numerical Python it can be very counter-intuitive. I won't cover examples here (you can easily find numerous tutorials on NumPy optimization), but often writing efficient code means writing not very readable and not self-documenting code. Sometimes there are absurd situations, like some specialized functions performing worse than generic ones, or the need to write incomprehensible hacks (the funniest one was a suggestion to use complex numbers as the most efficient way to do simple Euclidean distance calculations)… Hopefully after a couple of numerically heavy scripts you will understand when NumPy does internal copies (and it does them often!), that any Python iteration over elements will kill your performance, and that you need to try to use implicit loops and slicing etc.

There is no easy way to use multiple cores

Unfortunately, multithreading, multitasking and parallelism are simply terrible in Python. The language wasn't designed to be multitasked / multithreaded, and the Global Interpreter Lock, as part of the language design, makes this a problem that is almost impossible to solve. Even though most NumPy code releases the GIL, there is quite a big overhead from doing so and from other threads becoming active – you won't notice big speed-ups unless you have really huge volumes of work done in pure, single NumPy instructions. Every single line of Python glue code will become a blocking, single-threaded path, and according to Amdahl's law, that makes any massive parallelism impossible. You can try to work around it using multiprocessing – but in that case it is definitely more difficult to pass and share data between processes. I haven't researched it exhaustively – but in any case no simple, annotation-based solution (like in OpenMP / Intel TBB) exists.
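
For completeness, a minimal sketch of the multiprocessing workaround mentioned above, with a made-up, embarrassingly parallel workload – note that arguments and results get pickled between processes, which is exactly the data passing/sharing cost I mean.

import numpy as np
from multiprocessing import Pool


def heavy_work(seed):
    # Each task gets its own chunk of work, independent of the others.
    rng = np.random.RandomState(seed)
    return rng.rand(1000000).mean()


if __name__ == "__main__":
    pool = Pool(processes=4)                   # worker processes, not threads
    results = pool.map(heavy_work, range(16))  # 16 independent tasks
    pool.close()
    pool.join()
    print(np.mean(results))                    # ~0.5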

SymPy cannot serve as replacement for Mathematica

I have played with SymPy just several times – it is definitely not a replacement for symbolic operations in Mathematica. It works ok for symbol substitution, trivial simplification or very simple integrals (like regular Phong normalization), but for anything more complex (normalizing Blinn-Phong… yeah) it doesn't work – after a couple of minutes (!) of calculations it produces no answer. Its syntax is also definitely not as friendly for interactive work as Mathematica's. So for symbolic work it's not a replacement at all and isn't very useful. One potential benefit of using it is that it embeds nicely and produces nice looking results in IPython notebooks – which can be good for sharing them.

No very good interactive 3D plotting

There is matplotlib. It works. It has tons of good features

…But its interactive version is not embeddable in IPython notebooks, and its 3D plotting runs very slowly and is quite ugly. In 2D there is the beautiful Bokeh, generating interactive HTML files, but there is nothing like that for 3D. Nothing on Mathematica's level.

I played a bit with VisPy – if they create as good a WebGL backend for IPython notebooks as they promise, I'm totally for it (even if I have to code the visualizations myself). Until then it is "only" an early stage project for quickly mapping between numerical Python data and simple OpenGL code – but a very cool and simple one, so it's fun to play with anyway. 🙂

There are packages for (almost) everything!

Finally, while some Python issues are there and I feel won't be solved in the near future (multithreading), the situation is very dynamic and changes a lot. Python is becoming a standard for scientific computing and new libraries and packages appear every day. There are excellent existing ones and it's hard to find a topic that isn't covered yet. Image processing? Machine learning? Linear algebra? You name it. Just import the proper package and address the problem you are trying to solve, instead of wasting your time on coding everything from scratch or integrating obscure C++ libraries.

Therefore I really believe it is worth investing your time in learning it and adapting it to your workflow. I wish it became the standard for many CS courses at universities, instead of commercial Matlab, poorly interfaced Octave, or professors asking students to write whole solutions in C++ from scratch. At least in Poland they definitely need more focus on problems, solutions and algorithms, not on coding and learning languages…


New debugging options in CSharpRenderer framework

Hi, a minor update to my C#/.NET graphics rendering framework / playground just got submitted to GitHub. I implemented the following new features:

Surface debugging snapshots

One of the commentators asked me how to easily display the SSAO buffer for debugging – I had no easy answer (except for hacking shaders). I realized that very often debugging requires displaying various buffers that can change / get overwritten over time – we cannot simply rely on grabbing such a surface at the end of the frame for display…

csharprenderer_surfacedebug

Therefore I added an option to create various surface "snapshots" in code that get copied to a debug buffer if needed. They are copied at a given point in time only if the user requested such a copy. You can display RGB, A or fractional values (useful for depth / world position display). In the future there could be some options for linearization / sRGB, range stretching, clamping etc., but for now I didn't need them. 🙂

Its use is trivial – in code after rendering information that you possibly could want debugged – like SSAO and its blurs, write:


// Adding SSAO debug
SurfaceDebugManager.RegisterDebug(context, "SSAOMain", ssaoCurrent);

// Do SSAO H blurring (...)

// Adding SSAO after H blur debug
SurfaceDebugManager.RegisterDebug(context, "SSAOBlurH", tempBlurBuffer);

// Adding SSAO after V blur debug
SurfaceDebugManager.RegisterDebug(context, "SSAOBlurV", tempBlurBuffer);

No additional shader code is needed and debug display is handled on “program” level. Passes get automatically registered and UI refreshed.

GPU debugging using UAVs

Very often when writing complex shader code (especially compute shaders) we would like to have some "printf" / debugging options. There are many tools for shader debugging (I personally recommend the excellent and open-source RenderDoc), but launching external tools often adds time overhead, and it can be impossible to debug everything in the case of complex compute shaders.

We would like to have "printf"-like functionality from the GPU. While no API provides it directly, we can use append buffer UAVs, simply hack it in the shaders and later either display such values on screen or lock/read them on the CPU. This is not a new idea.

I implemented very rough and basic functionality in the playground and made it work for both Pixel Shaders and Compute Shaders.

csharprenderer_uavdebug

It doesn’t require any regular code changes – just a shader file change plus switching the option in the UI to ON.

In pixel shader one would write something like:

float3 lighting = finalLight.diffuse * albedo + finalLight.specular;

if (DEBUG_FILTER_VPOS(i.position, 100, 100))
{
 DebugInfo(i.position.xyz, lighting);
}

In compute shader you would write accordingly:

float4 finalOutValue = float4(lighting * scattering, scattering + absorption);

if (DEBUG_FILTER_TID(dispatchThreadID, 10, 10, 10))
{
 DebugInfo(dispatchThreadID, finalOutValue);
}

If you don’t want to specify / filter values manually in the shader, you don’t have to – you can override the position filter in the UI and use the DEBUG_FILTER_CHECK_FORCE_VPOS and DEBUG_FILTER_CHECK_FORCE_TID macros instead.

If you want to debug pixels, you can even automatically set those filter values in the UI from the last viewport click position (hacked, returns coords only at full resolution, but can be useful when trying to track down negative values / NaN sources).

Minor / other

  • Improved temporal AA a bit, based on luma clamping only – loosely inspired by Brian Karis’s excellent UE4 Siggraph AA talk. 🙂
  • Added small dithering / noise to the final image – I quite like this subtle effect.
  • Extracted global / program constant buffer – very minor, but could help reorganizing code in the future.
  • Option to freeze the time – useful for debugging / comparison screenshots.

 


Updated Poisson-like generator with GUI and more

poisson

Just a super short note:

I updated my simple rendering-oriented Poisson-like pattern generator with:

  • Very simple GUI made in PyQt to make experimenting easier.
  • Option to do rotating disk (with minimizing rotated point distance) for things like Poisson bokeh / shadow maps PCF.
  • Better visualizations with guidelines.
  • …And optimized algorithm a bit.

I’m definitely finished working on it (unless I find some bugs to fix); it was mostly done to learn how to create Python GUIs quickly and for fun. 🙂

I have also started writing a longer blog note about my experiences with Python as a scientific environment and a free / open source alternative to Mathematica. What worked well, what didn’t, and some tips & tricks to avoid my mistakes. 🙂

Stay tuned!


Major C#/.NET graphics framework update + volumetric fog code!

As I already promised too many times, here comes major CSharpRenderer framework update!

As always, all code available on GitHub.

Note that the goal is still the same – not to write the most beautiful or fastest code, but to provide a prototype playground / framework for hacking and having fun with iteration times approaching 0. 🙂 It will still undergo some major changes.

Apart from the volumetric code as an example for my Siggraph talk (which is not in perfect shape code-quality-wise – it is supposed to be a quickly written demo of the technique; note also that this is not the code that was used for the shipping game, it is just a demo; the original code had some NDA’d and console-specific optimizations), other major changes cover:

“Global” shader defines visible from code

You can define a constant as a “global” one in a shader and immediately have it reflected on the C# side after changing / reloading. This way I removed some data / code duplication and the potential for mistakes.

Example:


// shader side

#define GI_VOLUME_RESOLUTION_X 64.0 // GlobalDefine

// C# side

m_VolumeSizeX = (int)ShaderManager.GetUIntShaderDefine("GI_VOLUME_RESOLUTION_X");

Derivative maps

Based on an old but excellent post by Rory Driscoll. I didn’t see much sense in computing tangent frames during mesh preprocessing for the needs of such a simple framework. I used the “hack” of treating normal maps as a derivative map approximation – it doesn’t really matter in such a demo case.

“Improved” Perlin noise textures + generation

Just some code based on state of the art article from GPU Pro by Simon Green. Used in volumetric fog for some animated, procedural effect.

Very basic implementation of BRDFs

GGX specular based on a very good post by John Hable about optimizing it.
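For reference, below is a minimal sketch of just the GGX / Trowbridge-Reitz normal distribution term – this is not Hable’s optimized, combined form, and the function name and constants are mine:

// GGX / Trowbridge-Reitz normal distribution term; alpha = perceptual roughness squared.
static const float PI = 3.14159265f;

float D_GGX(float NdotH, float alpha)
{
    float a2 = alpha * alpha;
    float d  = NdotH * NdotH * (a2 - 1.0f) + 1.0f;
    return a2 / (PI * d * d);
}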

Note that the lighting code is a bit messy now; its major clean-up is my next task.

 

Minor changes added are:

  • UI code clean-up and dynamic UI reloading/recreating after constant buffer / shader reload.
  • Major constants renaming clean-up.
  • Actually fixing structured buffers.
  • Some simple basic geometric algorithms I found useful.
  • Adding shaders to the project (actually I had added them, no idea why they didn’t get into the first submit…).
  • Some more easy-to-use operations on context (blend state, depth state etc.).
  • Simple integers supported in constant buffer reflection.
  • Another type of temporal AA – accumulation-based, trails a bit – I will later try to apply some ideas from the excellent Epic UE4 AA talk.
  • Time-delta based camera movement (well, yeah…).
  • Fixed FPS clamp – my GPU was getting hot and loud. 🙂
  • More use of LUA constant buffer scripting – it is very handy and serves its purpose very well.
  • Simple basis for “particle” rendering based on vertex shaders and GPU buffer objects.
  • Some stupid animated point light.
  • Simple environment BRDF approximation by Dimitar Lazarov from Black Ops 2.

Future work

Within the next few weeks I should update it with:

  • Rewriting post-effects, tone-mapping etc.
  • Adding GPU debugging
  • Improving temporal techniques
  • Adding naive screen-space reflections and an env cube-map
  • Adding proper area light support (should work super-cool with volumetric fog!)
  • Adding local light shadows

Siggraph 2014 talk slides are up!

As promised during my talk, I have just added my Siggraph 2014 Advances in Real-Time Rendering slides – check them out on my Publications page. Some extra future ideas I didn’t manage to cover in time are in the bonus slides section, so be sure to check it out.

They should also be online soon on the “Advances in Real-Time Rendering” web page. When they land there, check the whole page out, as there was lots of amazingly good and practical content in the course this year! Thanks again to Natalya Tatarchuk for organizing the whole event.

…Also I promised that I will release some source code soon, so stay tuned! 🙂


Voigtlander Nokton Classic 40mm f1.4 M on Sony A7 Review

As I promised, here is my delayed review of the Voigtlander Nokton Classic 40mm f/1.4 M used on the Sony Alpha A7. First I’m going to explain some “mysterious” (lots of questions on the internet!) aspects of this lens.

Why 40mm?

So, first of all – why such a weird focal length as 40mm, when there are tons of great M-mount 35mm and 50mm lenses? 🙂
I’ve always had problems with “standard” and “wide-standard” focal lengths. Honestly, 50mm feels too narrow. It’s great for neutral upper-body or full-body portraits and shooting in outdoor environments, but definitely limiting in interiors and for situational portraits.
In theory, it was supposed to be a “neutral” focal length, similar to human perception of perspective, but it is a bit narrower. So why are there so many 50mm lenses, and why are they considered standard? Historical reasons and optics – they are extremely easy to produce, it is easy to correct all kinds of optical problems (distortion, aberration, coma etc.), and they require fewer optical elements than other kinds of lenses to achieve great results.
On the other hand, 35mm usually catches too much of the environment and photos get a bit too “busy”, while it’s still not a true wide-angle lens for amazing city or landscape shots.
40mm feels just right as a standard lens. Lots of people recommend against 40mm on rangefinders, as Leica and similar cameras don’t have any framelines for 40mm. But on a digital full-frame mirrorless with a great-performing EVF? No problem!
Still, this is just personal preference. You must decide on your own if you agree, or maybe prefer something different. 🙂 My advice on picking focal lengths is always – spend a week and take many photos in different scenarios using a cheap zoom kit lens. Later, check the EXIF data and see what focal lengths you used for the photos you enjoy the most.
Great focal length for daily “neutral” shooting.

What does it mean that this lens is “classic”?

There is lots of BS on the internet about “classic” lens design. Some people imply that it means the lens is “soft in highlights”. Obviously this makes no sense, as sharpness is not a function of brightness – a lens is either soft or sharp. It could be transmittance problems wrongly interpreted, but what’s the truth?
“Classic” design usually means lens designs related to the historical designs of the early XX century. Lenses were designed this way before the introduction of complex coatings and many low-dispersion / aspherical elements. Therefore, they have a relatively low number of elements – without modern multi-coating, according to the Fresnel equations there was light transmission loss at every contact point between glass and air, and light got partially reflected. Lack of proper lens coating resulted not only in poor transmission (less light getting to the film / camera sensor) and lower contrast, but also in flares and various other artifacts coming from light bouncing inside the camera. Therefore the number of optical elements and optical groups was kept a bit lower. With a lower number of optical elements it is impossible to fix all lens problems – like coma, aberration, dispersion or even sharpness.
“Classic” lenses were also used with rangefinders that had a quite long minimum focusing distance (usually 1m). All these disadvantages had a good side effect – lenses designed this way were much smaller.
And while the Voigtlander Nokton Classic is based on a “classic” lens design, it has modern optical element coatings and a slightly higher number of optical elements, and it keeps a very small size and weight while fixing some of those issues.
Optical elements – notice how close pieces of glass are together (avoiding glass/air contact)

What’s the deal with Single / Multi Coating?

I mentioned the effect of lens coating in the previous section. For an unknown reason, Voigtlander decided to release both a truly “classic” version with a single, simple coating and a multi-coated version. Some websites try to explain it by saying that a) single coating is cheaper, b) some contrast and transmission loss is not that bad when shooting B&W film, c) flaring can be a desired effect. I understand this reasoning, but if you shoot anything in color, stick to the multi-coated version – no need to lose any light!
Even with Multi-coating, flaring of light sources at night can be a bit strong. Notice quite strong falloff and small coma in corners.

Lens handling

Love how classic and modern styles work great together on this camera / lens combination

The lens handles amazingly well on the Sony A7. With the EVF and monitor it’s really easy to focus even at f/1.4 (although it takes a couple of days of practice). The aperture ring and focus ring work super smoothly. The size is amazing (so small!) even with the adapter – an advantage of the M-mount, as M-mount lenses were designed for a small distance to the film plane. Some people mention problems on the Sony A7/A7R/A7S with purple coloring in the corners with wider-angle Voigtlander lenses due to the grazing angle between light and sensor – fortunately that’s not the case with the Nokton 40mm 1.4.
The only disadvantage is that sometimes, with my eye at the EVF, I “lose” the focus tab and cannot locate it. Maybe it just takes some time to get used to?
In general, it is a very enjoyable and “classic” experience, and it’s fun just to walk around with the camera with the Nokton 40mm on.

Image quality

I’m not a pixel-peeper and won’t analyze all the micro-aspects on image crops or take measurements – just conclusions from everyday shooting. The lens I have (remember that every lens copy can differ!) is very sharp – quite decent sharpness even at f/1.4 (although with only a slight movement it is extremely easy to lose focus…). Performance is just amazing at night – a great lens for wide-open f/1.4 night photos – you don’t have to pump up the ISO or fight with long shutter speeds – just enjoy photography. 🙂
Pin-sharp at f/1.4 with nice, a bit busy bokeh

Higher apertures = corner to corner sharpness

Bokeh is a bit busy, gets “swirly” and squashed, and can sometimes be distracting – but I like it this way. It depends on personal preference. At f/1.4 and 40mm it can almost melt the backgrounds away. Some people complain about purple fringing (spectrochromatism) of the bokeh – something I wrote about in my post about bokeh scatter DoF. I didn’t notice it on almost any of my pictures; on one I removed it with one click in Lightroom – definitely not that bad.
Bokeh

At larger apertures bokeh gets quite “swirly”. Still lots of interesting 3D “pop”.

There is definitely some light fall-off at f/1.4 and f/2.0, but I never mind that kind of artifact. Distortion is negligible in regular shooting – even for architecture.
General contrast and micro-contrast are nice and there is this “3D” look to many photos. I really don’t understand the complaints and don’t see a big difference compared to “modern”-design lenses – but I have never used the latest Summicron/Summilux, so maybe I haven’t seen everything. 😉
Color definition is very neutral – no visible problematic coloring.
Performance is a bit worse in the corners – still quite sharp, but with some visible coma (squashing of the image in the direction perpendicular to the radius).
Some fall-off and coma in corners. Still a pretty amazing night photo – Nokton is a truly deserved name.

Unfortunately, even with multi-coating, there is some flaring at night from very bright light sources. Fortunately, I didn’t notice any of the ghosting that often comes with it.

Disadvantages

So far I have one big problem with this lens – the minimum focus distance of 0.7m. It rules out many tricks with perspective on close-ups and any kind of even semi-macro photography (photos of food at a restaurant). While at f/1.4 you could have amazingly shallow DoF and wide bokeh, that’s not the case here, as you cannot focus any closer… It can even be problematic for half-portraits. A big limitation and a pity – otherwise the lens would be perfect for me – but on the other hand such a focus range contributes to the smaller lens size. As always – you cannot have only advantages (quality, size & weight, aperture and, in this case, close-focus distance). Some Leica M lenses have a minimum focus distance of 1m – I can’t imagine shooting with such lenses…

Recommendations

Do I recommend this lens? Oh yes! Definitely a great buy for any classic photography lover. You can use it on your film rangefinder (especially if you own a Voigtlander Bessa) and on most digital mirrorless cameras. Great image quality, super pleasant handling, acceptable price – if you like 40mm and fast primes, then it’s your only option. 🙂
DSC00331
DSC00270
DSC00292
DSC00226

Poisson disk/square sampling generator for rendering

I have just submitted a small new script to GitHub – a Poisson-like distribution sampling generator suited for various typical rendering scenarios.

Unlike other small generators available, it supports many sampling patterns – disk, disk with a central tap, square, repeating grid.

It outputs ready-to-use (copy & paste) patterns for both HLSL and C++ code. It plots the pattern on very simple graphs.

The generated sequence has the property of maximizing the distance of every next point from the previous points in the sequence. Therefore you can use partial sequences (for example only half, or just a few samples selected by branching) and still keep a well-distributed sampling pattern. It could be useful for various importance sampling and temporal refinement scenarios. Or for your DoF (branching on CoC).
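For example, a generated disk pattern could be dropped into a shadow-map PCF loop like the sketch below – the array values here are placeholders rather than actual script output, and all names are illustrative:

// Illustrative 8-tap Poisson-like disk; paste the script's generated hlsl output here.
static const float2 kPoissonDisk[8] =
{
    float2( 0.000f,  0.000f), float2( 0.527f,  0.771f),
    float2(-0.812f,  0.334f), float2( 0.118f, -0.927f),
    float2(-0.204f,  0.851f), float2( 0.885f, -0.391f),
    float2(-0.693f, -0.472f), float2( 0.451f,  0.087f)
};

float ShadowPCF(Texture2D<float> shadowMap, SamplerComparisonState shadowCmpSampler,
                float2 shadowUV, float compareDepth, float filterRadius, int numTaps)
{
    // Because each next point maximizes its distance to the previous ones, using only
    // the first numTaps entries (branching) still gives a well-spread pattern.
    float sum = 0.0f;
    for (int i = 0; i < numTaps; ++i)
    {
        float2 offset = kPoissonDisk[i] * filterRadius;
        sum += shadowMap.SampleCmpLevelZero(shadowCmpSampler, shadowUV + offset, compareDepth);
    }
    return sum / numTaps;
}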

Edit: I also added an option to optimize sequences for cache locality. It is very approximate, but should work for very large sequences on large sampling areas.

Usage

Just edit the options and execute script: “python poisson.py“. 🙂

Options

Options are edited in code (I use it in Sublime Text and always launch as script, so sorry – no commandline parsing) and are self-describing.

# user defined options
disk = False # this parameter defines if we look for Poisson-like distribution on a disk (center at 0, radius 1) or in a square (0-1 on x and y)
squareRepeatPattern = True # this parameter defines if we look for "repeating" pattern so if we should maximize distances also with pattern repetitions
num_points = 25 # number of points we are looking for
num_iterations = 16 # number of iterations in which we take average minimum squared distances between points and try to maximize them
first_point_zero = disk # should be first point zero (useful if we already have such sample) or random
iterations_per_point = 64 # iterations per point trying to look for a new point with larger distance
sorting_buckets = 0         # if this option is > 0, then sequence will be optimized for tiled cache locality in n x n tiles (x followed by y) 

Requirements

This simple script requires some scientific Python environment like Anaconda or WinPython. Tested with Anaconda.

Have fun sampling! 🙂


Sony A7 review

Introduction

This is a new post for one of my favourite “off-topic” subjects – photography. I just recently (under 2 weeks ago) bought a Sony A7 and wanted to share some of my first impressions and write a mini review.

Why did I buy a new piece of photo hardware? Well, my main digital camera for the last 3-4 years has been the Fuji FinePix X100. I also owned some Nikon 35mm/FF DSLRs, but since my D700 (which I bought used, cheaply, with an already high shutter count) broke beyond repair and I bought a D600, I have almost not used the Nikon gear. The D600 is a terrible camera with broken AF, wrong metering (exposes +/- 1EV at random, lots of post-processing at home) and tons of other problems, and honestly – I wouldn’t recommend it to anyone and I don’t use it anymore.

With the Fuji X100 I have a love & hate relationship. It has lots of advantages. Great image quality for such a tiny size and an APS-C sensor. It is very small and looks like a toy camera (a serious advantage if you want to travel to not-really-safe areas or simply don’t want to attract too much attention and just enjoy taking photos). A bright f/2.0 lens and an interesting focal length (a good photographer friend of mine once told me that there are no interesting photos taken with focal lengths over 50mm, and while it was supposed to be a joke, I hope you get the point). Finally, a nice small built-in flash and an excellent fill-light flash mode working great with the leaf shutter and short sync times – it has literally saved thousands of portraits in bright sunlight and other holiday photos. On the other hand, it is slow, has lots of usage quirks (why do I need to switch to macro mode to take a regular situational portrait?!), slow and inaccurate AF (you need to try taking a photo a couple of times, especially in low light…), it’s not pin-sharp, and the fixed 35mm-equivalent focal length can be quite limiting – too wide for standard shooting, too narrow for wide-angle shots.

For at least a year I had been looking around for alternatives / some additional gear and couldn’t find anything interesting enough. I looked into the Fuji X100s – but simply a slightly better AF and sensor wouldn’t justify such a big expense, plus software has problems with X-Trans sensor pixel color reconstruction. I read a lot about the Fuji X-series mirrorless system, but going into a new system and buying all new lenses is a big commitment – especially on APS-C. Finally, a quite recent option is the Sony RX-1. It seemed very interesting, but Angelo Pesce described it quite well – it’s a toy (no OVF/EVF???).

The Sony A7/A7R and the recent A7S looked like interesting alternatives and something that would compete with the famous Leica, so I looked into them, and after a couple of weeks of research I decided to buy the cheapest and most basic one – the A7 with the kit lens. What do I need a kit lens for? Well, to take photos. I knew its IQ wouldn’t be perfect, but it’s cheap, not very heavy and it’s convenient to have one just in case – especially until you have completed your target lens set. After a few days of extensive use (a weekend trip to NYC, yay!) I feel like writing a mini review of it, so here we go!

Hero of this report – no, not me & sunburn! Sony A7 🙂 Tiny and works great.

I tested it with the kit lens (Sony FE 28-70mm f/3.5-5.6 OSS), Nikkor 50mm 1.4D and Voigtlander Nokton 40mm 1.4.

DSC00353

What I like about it

Size and look

This one is pretty obvious. A full-frame 35mm camera sized smaller than many mirrorless APS-C or famous Leica cameras! Very light, so I just throw it in a bag or backpack. My neck doesn’t hurt even after a whole day of photo shooting. Discreet when doing street photography. Nice styling that is kind of a blend between modern and retro cameras. Especially with M-mount lenses on – classic look and compact size. Really hard to beat in this area. 🙂

Love how classic and modern styles work great together on this camera

Image quality

Its full-frame sensor has amazing dynamic range at low ISOs. 24MP resolution – way too much for anyone except pros taking shots for printing on billboards, but useful for cropping or for reducing high-ISO noise when downsizing. Very nice built-in color profiles and aesthetic color reproduction – I like them much better than the Adobe Lightroom ones. I hope I don’t sound like an audiophile, but you really should be able to see the effect of full frame and large pixel size on the IQ – just like there is a “medium-format look” even with mediocre scans, I believe there is a “full-frame look” better than APS-C or Micro 4/3.

Subtle HDR from a single photo? No problem with Sony A7 dynamic range.

IQ and amount of detail is amazing – even on MF, shot with Voigtlander Nokton 40mm f 1.4

EVF and back display

Surprisingly pleasant in use – high resolution, good dynamic range and fast. I was used to the Fuji X100’s laggy EVF (still useful at night or when doing precise composition) and on the Sony A7 I feel a huge difference. It switches between the EVF and back display quite quickly and the eye sensor works nicely. The back display can be tilted and I have already used that a couple of times (photos near the ground or above my head) – a nice feature to have.

Manual focusing and compatibility with other lenses

This single advantage is really fantastic and I would buy this camera just because of it. Plugging in Voigtlander or Nikon lenses was super easy; the camera automatically switched into manual focus mode and operated very well. Focusing with magnification and focus-assist is super easy and really pleasant. It feels like all those old manual cameras – the same pleasure of slowly composing, focusing, taking your time and enjoying photography – but much more precise. With the EVF and DoF preview always on, you constantly think about DoF and its effect on composition, what will be sharp etc. To be honest, I have never taken such sharp photos in my life – almost none deleted afterwards. So you spend more time taking photos (which may not be acceptable for your friends or strangers asked to take a photo of you), but much less on post-processing and selection – again, kind of back to the roots of photography.

Photo of my wife, shot using a Nikkor 50mm f/1.4D and MF – no AF ever gave me such precise results…

I like the composition and focus in this photo – shot using manual focus on Nikkor 50mm 1.4D

Quality of kit lens and image stabilization

I won’t write a detailed review of the kit lens – but it’s acceptably sharp, with nice micro-contrast and color reproduction, you can correct distortion and vignetting easily in Lightroom, and it’s easy to take great low-light photos with relatively long exposure times thanks to very good image stabilization. AF is usually accurate. While I don’t intend to use this lens a lot – I have much more fun with primes – I will keep it in my bag for sure and it proves itself useful. The only downside is size (zoom FF lenses cannot be tiny…) – because it is surprisingly light!

Hand-held photo taken using the kit lens at night – no camera shake!

Speed and handling

Again, I probably feel so good about the Sony A7’s speed and handling because I’m coming from the Fuji X100 – but the ergonomics are great, it is fast to use and reacts quickly. The only disadvantage is how long the default photo preview takes before the EVF shows the live image feed again – 2s is the minimum time selectable from the menu – way too long for me. There are tons of buttons configured very wisely by default – changing ISO or exposure compensation without taking your eye off the camera is easy.

Various additional modes

A pro photographer probably doesn’t need a panorama mode, or a night mode that automatically combines many frames to decrease noise / camera shake / blur, but I’m not a pro photographer and I like those features – especially panoramas. Super easy to take, decent quality and no need to spend hours post-processing or relying on stitching apps!

In-camera panorama image

What I don’t like

Current native lenses available

The current native FE (“full-frame E-mount”) lens line-up is a joke. Apart from the kit lens there are only 2 primes (why is the 35mm only f/2.8 when it’s so big?) and 2 zoom lenses – all definitely over-priced and too large. 🙁 There are some Samyang/Rokinon manual focus lenses available (I played a bit with the 14mm 2.8 on Nikon and it was cheap and good quality – but way too large). There are rumors of many first- and third-party (Zeiss, Sigma, maybe Voigtlander) lenses to be announced at Photokina, so we will see. For now one has to rely on adapters and manual focusing.

Lack of built-in or small external flash

A big problem for me. I very often use flash as fill light and here it’s not possible. 🙁 The smallest Sony flash, the HVL-F20AM, is currently not available (and not so small anyway).

Not too bad photo – but would have been much better with some fill light from a flash… (ok, I know – would be difficult to sync without ND filters / leaf shutter 🙂 )

What could be better but is not so bad

Accessories

The system is very young so I expect things to improve – but currently the availability of first- or third-party accessories (flashes, cases, screen protectors etc.) is way worse than, for example, for the Fuji X-series system. I hope things change in the coming months.

Not the best low light behavior

Well, maybe I’m picky and expected too much, as I take tons of night photos and a couple of years ago that was one of the reasons I wanted to buy a full-frame camera. 🙂 But for a 2014 camera, the A7’s high-ISO degradation of detail (even in RAW files! they are not a “true” RAW sensor feed…), color and dynamic range is a bit too strong. The A7S is much better in this area. Also, the AF behavior is not perfect in low light…

Photo taken at night with Nikkor 50mm and f/1.4 – not too bad, but some grain visible and detail loss

Not the best lens adapters

The adapters I have for Nikon and M-mount are OK. Their build quality seems acceptable and I haven’t seen any problems yet. But they are expensive – 50-200 dollars for a piece of metal/plastic? It would also be nice to have some information in EXIF – for example an option to manually specify the set focal length or to detect the aperture. Also, Nikon/Sony A-mount/Canon adapters are too big (they cannot be smaller due to the design of the lenses – the flange focal distance must match DSLRs) – what’s the point of having a small camera with big, unbalanced lenses?

Even with mediocre adapters can’t complain about MF lens handling and IQ

Kit zoom and tiny Nikkor 50mm 1.4D with adapter are too big… M-mount adapter and Voigtlander lens are much smaller and more useful.

Photo preview mode

I don’t really like how the magnification button is placed, nor that by default it magnifies a lot (to the 100% image crop level). I didn’t see any setting to change it – I would expect progressive magnification and better button placement, like on Nikon cameras.

Wifi pairing with mobile

I don’t think I will use it a lot – but sometimes it could be cool for remote control. In one such case I tried to set it up and it took me 5 minutes or so to figure it out – definitely not something to do when you just want to take a single nice photo with your camera placed on a bench at night.

 

What’s next?

In the next couple of days (hopefully before Siggraph, as afterwards I will have a lot more to write about!) I promise I will add, in separate posts:

  • More sample photos from my NYC trip
  • Voigtlander Nokton 40mm f/1.4 mini review – I’m really excited about this lens and it definitely deserves a separate review!

So stay tuned!


Hair rendering trick(s)

I didn’t really plan to write this post as I’m quite busy preparing for Siggraph and enjoying the awesome Montreal summer, but after 3 similar discussions with developer friends I realized that the simple hair rendering trick I used during the prototyping stage at CD Projekt Red for Witcher 3 and Cyberpunk 2077 (I have no idea if the guys kept it, though) is worth sharing, as it’s not really obvious. It’s not about hair simulation or content authoring – I’m not really competent to talk about those subjects and they are really well covered by AMD TressFX and NVIDIA HairWorks (plus I know that lots of game rendering engineers work on that topic as well), so check them out if you need awesome-looking hair in your game. The trick I’m going to cover improves the quality of the typical alpha-tested meshes used in deferred engines. Sorry, but no images in this post!

Hair rendering problems

There are usually two problems associated with hair rendering that lots of games and game engines (especially deferred renderers) struggle with.

  1. Material shading
  2. Aliasing and lack of transparency

The first problem is quite obvious – hair shading and material. Using standard Lambertian diffuse and Blinn/Blinn-Phong/microfacet specular models you can’t get the proper look of hair; you need a hair-specific and strongly anisotropic model. Some engines try to hack some hair properties into the G-Buffer and use branching / material IDs to handle it, but as John Hable recently wrote in his great post about the need for forward shading – it’s difficult to get hair right while fitting those properties into a G-Buffer.

I’m also quite focused on performance, love low-level work and analyzing assembly, and it just hurts me to see branches and tons of additional instructions (sometimes up to hundreds…) and registers used to branch for various materials in a typical deferred shading shader. I agree that the performance impact may not be really significant compared to the bandwidth usage of fat G-Buffers and complex lighting models, but it is still a cost that you pay for the whole screen even though hair pixels don’t occupy much of the screen area.

One of the tricks we used on The Witcher 2 was faking hair specular using only the dominant light direction + per-character cube-maps, and applying it as the “emissive” part of mesh lighting. It worked OK only because really great artists authored those shaders and cube-maps, but I wouldn’t say it is an acceptable solution for any truly next-gen game.

Therefore hair really needs forward shading – but how do we do it efficiently, avoid the usual overdraw cost, and combine it with deferred shading?

Aliasing problem.

A nightmare for anyone using alpha-tested quads or meshes with hair strands for hair. Lots of games can look just terrible because of this hair aliasing (the same applies to foliage like grass). Epic proposed fixing it by using MSAA, but this definitely increases the rendering cost and doesn’t solve all the issues. I tried doing it using alpha-to-coverage as well, but the result was simply ugly.

Far Cry 3 and some other games used a screen-space blur on hair strands along the hair tangent, and it can improve the quality a lot, but usually the end parts of hair strands either still alias or bleed some background onto the hair (or the other way around) in a non-realistic manner.

The obvious solution here is again to use forward shading and transparency, but then we face another family of problems: overdraw, composition with transparents and problems with transparency sorting. Again, AMD TressFX solved it completely by using order-independent transparency algorithms on just the hair, but the cost and effort to implement it can be too much for many games.

Proposed solution

The solution I tried and played with is quite similar to what Crytek described trying in their GDC 2014 presentation. I guess we prototyped it independently in a similar time frame (mid-2012?). The Crytek presentation didn’t dig too much into details, so I don’t know how much it overlaps, but the core idea is the same. Another good reference is an old presentation by Scheuermann from ATI at GDC 2004! Their technique was different and based only on a forward shading pipeline, not aimed at being combined with deferred shading – but the main principle of multi-pass hair rendering and treating transparent and opaque parts separately is quite similar. A thing worth noting is that with DX11 and modern GPU-based forward lighting techniques it has become much easier to do. 🙂

The proposed solution is a hybrid of deferred and forward rendering techniques that solves some of those problems. It is aimed at engines that still rely on alpha-tested strips for hair rendering, with smooth alpha transitions in the textures, where most of the hair strands are still solid, not transparent and definitely not sub-pixel (if they are, forget about it and hope you have the perf to do MSAA and even supersampling…). You also need to have some form of forward shading in your engine, but I believe that’s the only way to go for next gen… Forward+/clustered shading is a must for material variety and properly lit transparency – even in mainly deferred rendering engines. I really believe in the advantages of combining deferred and forward shading for different rendering scenarios within a single rendering pipeline.

Let me first describe the proposed steps:

  1. Render your hair with full specular occlusion / zero specularity. Do alpha testing in your shaders with a value Aref close to 1.0 (artist tweakable).
  2. Do your deferred lighting passes.
  3. Render a forward pass of hair specular with no alpha blending and z-testing set to “equal”. Do the alpha testing exactly like in step 1.
  4. Render a forward pass of hair specular and albedo for the transparent part of the hair with alpha blending (alpha rescaled from 0 to Aref into the 0-1 range), an inverted alpha test (passing only alpha < Aref) and a regular depth test.

This algorithm assumes that you use a regular Lambertian hair diffuse model. You can easily swap it – feel free to modify steps 1 and 3: first draw black albedo into the G-Buffer, then add the different diffuse model in step 3.
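To make the pass setup a bit more concrete, here is a rough, HLSL-flavored sketch of the alpha logic of passes 1 and 3 versus pass 4 – the names, the Aref constant and the clip-based tests are illustrative assumptions, not code from any shipped engine:

// Illustrative hair pixel shader; HAIR_OPAQUE_PASS would be defined for passes 1 and 3,
// and left undefined for the alpha-blended pass 4.
Texture2D    hairAlbedoTex : register(t0);
SamplerState linearSampler : register(s0);

static const float Aref = 0.95f;   // artist-tweakable alpha-test threshold (assumption)

float4 HairPS(float4 svPos : SV_Position, float2 uv : TEXCOORD0) : SV_Target
{
    float4 albedoAlpha = hairAlbedoTex.Sample(linearSampler, uv);

#if defined(HAIR_OPAQUE_PASS)
    // Passes 1 and 3: keep only the solid core of the strands (alpha >= Aref).
    clip(albedoAlpha.a - Aref);
    // ...output G-Buffer data here (pass 1) or forward specular with depth test "equal" (pass 3).
    return float4(albedoAlpha.rgb, 1.0f);
#else
    // Pass 4: keep only the semi-transparent tips (alpha < Aref) and rescale alpha for blending.
    clip(Aref - albedoAlpha.a);
    float blendAlpha = saturate(albedoAlpha.a / Aref);
    // ...output forward-lit color here; regular depth test, alpha blending enabled.
    return float4(albedoAlpha.rgb, blendAlpha);
#endif
}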

Advantages and disadvantages

There are lots of advantages to this trick/algorithm – even with non-obvious hair mesh topologies I didn’t see any problems with alpha sorting, because the alpha-blended areas are small and usually on top of solid geometry. Also, because most of the rendered hair geometry writes depth values, it works OK with particles and other transparents. You avoid hacking your lighting shaders, branching and hardcore VGPR counts. You get smooth and aliasing-free results and a proper, arbitrary shading model (no need to pack material properties). It also avoids any excessive forward shading overdraw (z-testing set to equal and later regular depth testing against an almost complete scene). While there are multiple passes, not all of them need to read all the textures (for example there is no need to re-read albedo after step 1, the G-Buffer pass can use a different normal map, and there is no need to read the specular / gloss mask). The performance numbers I had were really good – as hair usually covers a very small part of the screen except for cutscenes – and the proposed solution meant zero overhead / additional cost on regular mesh rendering or lighting.

Obviously, there are some disadvantages. First of all, there are 3 geometry passes for hair (one could get it down to 2 by combining steps 3 and 4, but giving up some of the advantages). It can be too much, especially when using very complex spline/tessellation-based hair – but this is simply not an algorithm for such cases, they really do need more complex solutions… Again, see TressFX. There can be a problem with the lack of alpha blending sorting and later problems with combining with particles – but it depends a lot on the mesh topology and how much of it is alpha blended. Finally, so many passes complicate the renderer pipeline, and debugging can be problematic as well.

 

Bonus hack for skin subsurface scattering

As a bonus, a description of how we hacked skin shading in a very similar manner in The Witcher 2.

We couldn’t really separate our speculars from diffuse into 2 buffers (already way too many local lights and a big lighting cost; increasing BW in those passes wouldn’t have helped for sure). We also didn’t have ANY forward shading in Red Engine at the time! For skin shading I really wanted to do SSS without blurring either the albedo textures or the speculars. Therefore I came up with the following “hacked” pipeline.

  1. Render skin with white albedo and zero specularity into the G-Buffer.
  2. During the lighting passes, always write specular not modulated by specular color or material properties into the alpha channel (separate blending) of the lighting buffer.
  3. After all lights we had the diffuse response in RGB and the specular response in A – only for skin.
  4. Do a typical bilateral separable screen-space blur (Jimenez) on skin stencil-masked pixels. For masking skin I remember trying both a single bit from the G-Buffer and a “hacked” test for zero specularity / white albedo in the G-Buffer – both worked well; I don’t remember which version we shipped though.
  5. Render the skin meshes again – multiplying the RGB from the blurred lighting pixels by albedo and adding the specular response times the specular intensity (see the sketch below).
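A rough sketch of what the final composite of step 5 could look like in a pixel shader – the buffer names, bindings and the global skin specular color are illustrative assumptions, not the shipped code:

// Illustrative final skin pass (step 5).
Texture2D    blurredLightingTex : register(t0);  // RGB = blurred diffuse, A = specular response
Texture2D    albedoTex          : register(t1);
Texture2D    specIntensityTex   : register(t2);
SamplerState pointSampler       : register(s0);

cbuffer SkinCompositeConstants : register(b0)
{
    float2 g_InvScreenSize;
    float3 g_SkinSpecularColor;   // global, per-environment artist value (assumption)
};

float4 SkinCompositePS(float4 svPos : SV_Position, float2 uv : TEXCOORD0) : SV_Target
{
    float2 screenUV = svPos.xy * g_InvScreenSize;
    float4 lighting = blurredLightingTex.Sample(pointSampler, screenUV);
    float3 albedo   = albedoTex.Sample(pointSampler, uv).rgb;
    float  specInt  = specIntensityTex.Sample(pointSampler, uv).r;

    // Multiply the blurred diffuse response by the full-detail albedo and add back the
    // specular response, tinted by a global color (per-light specular color was lost in step 2).
    float3 color = lighting.rgb * albedo + lighting.a * specInt * g_SkinSpecularColor;
    return float4(color, 1.0f);
}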

The main disadvantage of this technique is losing all specular color from lighting (especially visible in dungeons), but AFAIK there was a global, per-environment, artist-specified specular color multiplier for skin. A hack, but it worked. The second, smaller disadvantage was the higher cost of the SSS blur passes (more surfaces to read in order to mask the skin).

In more modern engines and on current hardware I honestly wouldn’t bother – I would use separate lighting buffers for the diffuse and specular responses instead – but I hope it can inspire someone to creatively hack their lighting passes. 🙂

References

[1] http://www.filmicworlds.com/2014/05/31/materials-that-need-forward-shading/

[2] http://udn.epicgames.com/Three/rsrc/Three/DirectX11Rendering/MartinM_GDC11_DX11_presentation.pdf

[3] http://www.crytek.com/download/2014_03_25_CRYENGINE_GDC_Schultz.pdf

[4] http://developer.amd.com/tools-and-sdks/graphics-development/graphics-development-sdks/amd-radeon-sdk/

[5] https://developer.nvidia.com/hairworks 

[6] “Forward+: Bringing Deferred Lighting to the Next Level” Takahiro Harada, Jay McKee, and Jason C.Yang https://diglib.eg.org/EG/DL/conf/EG2012/short/005-008.pdf.abstract.pdf

[7] “Clustered deferred and forward shading”, Ola Olsson, Markus Billeter, and Ulf Assarsson http://www.cse.chalmers.se/~uffe/clustered_shading_preprint.pdf

[8] “Screen-Space Perceptual Rendering of Human Skin“, Jorge Jimenez, Veronica Sundstedt, Diego Gutierrez

[9] “Hair Rendering and Shading“, Thorsten Scheuermann, GDC 2004


C#/.NET graphics framework on GitHub + updates

As I promised, I posted my C#/.NET graphics framework (more about it and the motivation behind it here) on GitHub: https://github.com/bartwronski/CSharpRenderer

This is my first GitHub submit ever and my first experience with Git, so there is a possibility I didn’t do something properly – thanks for your understanding!

The list of changes since the initial release is quite big: tons of cleanup + some crash fixes in previously untested conditions, plus some features:

Easy render target management

I added helper functions to manage the lifetime of render targets and allow render target re-use. Using render target “descriptors” and the RenderTargetManager you request a texture with all RT and shader resource views, and it is returned from a pool of available surfaces – or lazily allocated when no surface fitting the given descriptor is available. It allows saving some GPU memory and makes sure that the code is 100% safe when changing configurations – no NULL pointers when enabling previously disabled code paths or adding new ones etc.

I also added a very simple “temporal” surface manager that, for every surface created with it, stores N different physical textures for the requested N frames. All temporal surface pointers are updated automatically at the beginning of a new frame. This way you don’t need to hold state or ping-pong in your rendering pass code, and the code becomes much easier to follow, e.g.:

RenderTargetSet motionVectorsSurface = TemporalSurfaceManager.GetRenderTargetCurrent("MotionVectors");
RenderTargetSet motionVectorsSurfacePrevious = TemporalSurfaceManager.GetRenderTargetHistory("MotionVectors");
m_ResolveMotionVectorsPass.ExecutePass(context, motionVectorsSurface, currentFrameMainBuffer);

Cubemap rendering, texture arrays, multiple render target views

Nothing super interesting, but it allows much easier experimentation with algorithms like GI (see the following point). In my backlog there is a task to add geometry shader and instancing support for amplification of data for cubemaps (with proper culling etc.), which should speed it up by an order of magnitude, but it wasn’t my highest priority.

Improved lighting – GI baker, SSAO

I added 2 elements: temporally supersampled SSAO and simple pre-baked global illumination + fully GPU-based naive GI baker. When adding those passes I was able to really stress my framework and check if it works as it is supposed to – and I can confirm that adding new passes was extremely quick and iteration times were close to zero – whole GI baker took me just one evening to write.

csharprenderer_withgi

GI is stored in very low resolution, currently uncompressed volume textures – three 1MB R16 RGBA surfaces storing incoming flux in 2nd order SH (not preconvolved with the cosine lobe – so not irradiance). There are some artifacts due to the low resolution of the volume (64 x 32 x 64), but for a cost of 3MB for such a scene I guess it’s good enough. 🙂
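Since the stored data is flux rather than irradiance, the runtime has to apply the cosine-lobe convolution when shading. Below is a minimal sketch of what such an evaluation could look like, assuming 4 SH coefficients (L00, L1-1, L10, L11) per color channel packed into the three RGBA volumes – the function name and layout are my assumptions, not the framework’s actual code:

// Evaluate diffuse lighting from 2-band SH radiance coefficients (not pre-convolved).
// shR/shG/shB hold (L00, L1-1, L10, L11) per color channel, as sampled from the volumes.
float3 EvaluateSHDiffuse(float4 shR, float4 shG, float4 shB, float3 n)
{
    const float A0 = 3.141593f;   // cosine-lobe convolution weight for band 0 (pi)
    const float A1 = 2.094395f;   // cosine-lobe convolution weight for band 1 (2*pi/3)

    // SH basis evaluated in direction n, in the same (L00, L1-1, L10, L11) order.
    float4 basis   = float4(0.282095f, 0.488603f * n.y, 0.488603f * n.z, 0.488603f * n.x);
    float4 weights = basis * float4(A0, A1, A1, A1);

    float3 irradiance = float3(dot(shR, weights), dot(shG, weights), dot(shB, weights));
    return max(irradiance, 0.0f) / 3.141593f;   // Lambertian: divide by pi, multiply by albedo afterwards
}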

It is calculated by doing a cubemap capture at every 3D grid voxel, calculating irradiance for every texel and projecting it onto SH. I made sure (or I hope so! 😉 but it seems to converge properly) it is energy conserving, so N-bounce GI is achieved by simply feeding the previous N-1 bounce results into the GI baker and re-baking the irradiance. I simplified it (plus improved baking times – it converges close to the asymptotic value faster) even a bit more, as the baker uses partial results, but with N -> oo it should converge to the same value and be unbiased.

It contains “sky” ambient lighting pre-baked as well, but I will probably split those terms and store them separately, quite possibly at different storage resolutions. This way I could simply “normalize” the flux and make it independent of sun / sky color and intensity (those could be applied at runtime). There are tons of other simple improvements (compressing textures, storing luma/chroma separately in different-order SH, optimizing the baker etc.) and I plan to gradually add them, but for now the image quality is very good (for something without normal maps and speculars yet 😉 ).

Improved image quality – tone-mapping, temporal AA, FXAA

Again, nothing super interesting – rather extremely simple and usually unoptimized code, just to help debug other algorithms (and make their presentation easier). Again, adding such features was a matter of minutes and I can confirm that my framework so far succeeds in its design goal.

Constant buffer constants scripting

A feature that I’m not 100% happy with.

For me, when working with almost anything in games – from graphics and shader programming through materials/effects to gameplay scripting – the biggest problem is finding the proper boundaries between data and code. Where should the splitting point be? Should code drive data, or the other way around? Across the multiple engines I have worked with (RedEngine, Anvil/Scimitar, Dunia, plus some very small experience just to familiarize myself with CryEngine, Unreal Engine 3, Unity3D), in every engine it was in a different place.

Coming back to shaders, a usually tedious task is putting some stuff on the engine side in code and some in the actual shaders, while both parts must match 100%. It not only makes it more difficult to modify such stuff or add new properties, but also makes the code harder to read and follow when trying to understand the algorithms, as it is split between multiple files not necessarily by functionality, but for example by performance (e.g. precalculate stuff on the CPU and put it into constants).

Therefore my final goal would be to have one meta shader language and, using some meta decorators, specify the frequency of every code part – for example one part should be executed per frame, another per viewport, another per mesh, per vertex, per pixel etc. I want to go in this direction, but I didn’t want to get into writing parsers and lexers, so temporarily I used LUA (extremely fast to integrate and quite decently performing).

An example would be one of my constant buffer definitions:

cbuffer PostEffects : register(b3)
{
 /// Bokeh
 float cocScale; // Scripted
 float cocBias; // Scripted
 float focusPlane; // Param, Default: 2.0, Range:0.0-10.0, Linear
 float dofCoCScale; // Param, Default: 0.0, Range:0.0-32.0, Linear
 float debugBokeh; // Param, Default: 0.0, Range:0.0-1.0, Linear
 /* BEGINSCRIPT
 focusPlaneShifted = focusPlane + zNear
 cameraCoCScale = dofCoCScale * screenSize_y / 720.0 -- depends on focal length & aperture, rescale it to screen res
 cocBias = cameraCoCScale * (1.0 - focusPlaneShifted / zNear)
 cocScale = cameraCoCScale * focusPlaneShifted * (zFar - zNear) / (zFar * zNear)
 ENDSCRIPT */
};

We can see that 2 constant buffer properties are scripted – there is zero code on the C# side that would calculate them like this; instead, a LUA script is executed every frame when we “compile” the constant buffer for use by the GPU.

UI grouping by constant buffer

Simple change to improve readability of UI. Right now the UI code is the most temporary, messy part and I will change it completely for sure, but for the time being I focused on the use of it.

constant_buffer_grouping

Further hot-swap improvements

Right now everything in shader files and related to shaders is hot-swappable – constant buffer definitions, includes, constant scripts. I can’t imagine working without it anymore; it definitely helps iterate faster.

Known issues / requirements

I was testing only the x64 version; 32-bit may not be configured properly and for sure is lacking the proper DLL versions.

One known issue (checked on a different machine with Windows 7 / x64 / VS2010) is a runtime exception complaining about a missing “lua52.dll” – it is probably caused by the lack of the Visual Studio 2012+ runtime.

Future plans

While I update stuff every week/day in my local repo, I don’t plan to do any public commits (except for something either cosmetic or a serious bug/crash fix) until probably late August. I will be busy preparing for my Siggraph 2014 talk and plan to release the source code for the talk using this framework as well.
