Augmented Reality with web technologies

Daniel

With the start of the implementation of the WebRTC API around 2012, JavaScript developers gained access to the video and audio streams coming from webcams and microphones. This paved the way for augmented reality (AR) applications built solely with web technologies, an approach also known as "the augmented web". Most AR applications analyze a webcam feed for certain patterns, for instance a simple color field, a movement or a marker. Nowadays there are several other ways to augment the reality of a webpage, for instance using geolocation or devices such as the Leap Motion. In this post I will focus on AR with markers.

Two libraries

While some developers have created their own AR libraries, most use either JSAruco or JSARToolkit. Both libraries are JavaScript ports of long-established C++ libraries: JSAruco is based on OpenCV, and JSARToolkit is a port of ARToolKit via the in-between ports NyARToolkit (Java) and FLARToolkit (ActionScript). The inner workings of both libraries are as follows: on every animation frame a snapshot of the video feed is taken by copying image data from the video to a canvas. This image data gets analyzed for markers, and the position and rotation of every detected marker is returned. This information can subsequently be used to render a 3D object on top of the video feed, thus augmenting the reality. Three.js works very well with both JSAruco and JSARToolkit, and I made two simple examples that show you how to use the libraries with Three.js; the code and some markers are available at Github (a minimal sketch of the capture-and-detect loop is included below).

AR 3D model rendered on a marker

Markers

A marker is usually a grid of small black and white squares, and there are certain rules for how these black and white squares must be patched together. Diogok has put code on Github that generates all possible JSAruco markers. Note that JSAruco markers cannot be used with JSARToolkit; for JSARToolkit you have to use the markers that you can find in the markers folder of the repository on Github. Both libraries support multiple markers, which means that each distinct marker gets its own id, and this id can be used to couple a model (or an action) to a specific marker. For instance, I made this small test using multiple markers:

Conditions

In the process of analyzing, the image is turned into an inverted plain black and white image: every pixel becomes either white or black, which makes it much easier to detect a marker. For the best results, good bright lighting is mandatory. Putting the marker on a surface with a plain color is also recommended, and if possible, using backlight is ideal. In general you should turn off the auto focus of your webcam.

Marker detection

Performance

In JSARToolkit you can set the size of the image data that will be processed: parsing smaller images is faster, but smaller images also have less detail. Besides tweaking the size of the image being processed, you can set the threshold, which is the color value that determines whether a pixel becomes white or black. In JSAruco the size of the image data has to match the size of the canvas that you use to render the 3D scene (in our case: where we render the Three.js scene). I have noticed that if the width of the canvas is more than about 700 pixels, JSAruco starts to have difficulties detecting markers, and the wider the canvas, the more severe this problem becomes.
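To make the capture-and-detect loop described above a bit more concrete, here is a minimal sketch. It assumes the js-aruco build of JSAruco (which exposes an `AR.Detector`) and uses the modern `navigator.mediaDevices.getUserMedia`; at the time this post was written you would have used the prefixed `navigator.getUserMedia` instead. The sizes and the logging are placeholders, not the code from my examples.

```javascript
// Minimal capture-and-detect loop with JSAruco (js-aruco) — a sketch, not production code.
// Assumes js-aruco's AR.Detector is available globally (e.g. via a script tag).
const video = document.createElement('video');
const canvas = document.createElement('canvas');
const context = canvas.getContext('2d');
canvas.width = 640;   // keep the analyzed image small: smaller images are parsed faster
canvas.height = 480;

const detector = new AR.Detector();

navigator.mediaDevices.getUserMedia({ video: true }).then((stream) => {
  video.srcObject = stream;
  video.play();
  requestAnimationFrame(tick);
});

function tick() {
  if (video.readyState === video.HAVE_ENOUGH_DATA) {
    // Copy the current video frame to the canvas and grab its pixels.
    context.drawImage(video, 0, 0, canvas.width, canvas.height);
    const imageData = context.getImageData(0, 0, canvas.width, canvas.height);

    // Detect markers; each detected marker has an id and four corner points.
    const markers = detector.detect(imageData);
    markers.forEach((marker) => {
      // Use marker.id and marker.corners to position a Three.js object on top of
      // the video feed, e.g. via js-aruco's POS.Posit pose estimator (not shown here).
      console.log('marker', marker.id, marker.corners);
    });
  }
  requestAnimationFrame(tick);
}
```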
In general JSARToolkit performs better than JSAruco, but both libraries suffer from missed or wrongly positioned markers, resulting in an unsteady presentation. You can compare both libraries yourself using the simple test applications that I mentioned earlier; the code is at Github.

Web or native

On iOS you don't have access to the streams coming from the camera or the microphone due to restrictions put in place by Apple, so on this platform it is impossible to create an AR application with only web technologies. Since mobile devices have become ubiquitous, an increasing number of native AR libraries for mobile platforms is appearing on Github, especially for iOS. The benefits of going native are twofold: better performance and control over all camera settings (contrast, brightness, auto focus, etc.). Better performance means faster and more accurate marker detection, and control over the camera settings provides tools for optimizing the incoming feed. Moreover, you can use the light sensor of your device to detect the light intensity and adjust the camera settings accordingly. Currently you can't use the light sensor API on iOS, but on Android, Ubuntu, Firefox OS and Windows you can.

Conclusion

Technically you can build an AR application using web technologies, but the performance isn't as good as native AR apps such as Layar and Roomle. For some applications web technologies might suffice, for instance art installations or applications that just want to show the possibilities of AR. The advantage of using web technologies is obvious: it is much simpler to set up an application and it runs on any platform (iOS being the sad exception). The lesser performance is partly because analyzing the image data is done in the main JavaScript thread, and partly because the lack of control over the camera settings leads to a poor-quality incoming feed, for instance due to bad or fluctuating light conditions. In the short term, using web workers may improve the analysis and detection step (a small sketch follows below), and in the longer term the ever improving performance of browsers will eventually lead to more reliable marker detection. Furthermore, Web APIs keep evolving, so in the near future we might get more control over the camera settings via JavaScript; the draft version of the MediaCapture API already shows some useful future capabilities. There is also a draft Web API for the light sensor that is currently only implemented in Firefox. The future of the augmented web looks bright.
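As a rough illustration of the web worker idea mentioned above, the pixels of each frame could be handed to a worker so the detection no longer blocks the main thread. This is only a sketch under the same js-aruco assumption as before; the file name `detector-worker.js` and the script paths are hypothetical.

```javascript
// Main thread: send each frame's pixels to a worker (call analyzeFrame from the animation loop).
const worker = new Worker('detector-worker.js');   // hypothetical file name
worker.onmessage = (event) => {
  // event.data is the array of detected markers; update the Three.js scene here.
  console.log('markers from worker', event.data);
};

function analyzeFrame(context, width, height) {
  const imageData = context.getImageData(0, 0, width, height);
  // Transfer the underlying buffer to avoid copying the pixels.
  worker.postMessage({ buffer: imageData.data.buffer, width, height },
                     [imageData.data.buffer]);
}
```

```javascript
// detector-worker.js — runs the detection off the main thread (sketch, paths are assumptions).
importScripts('cv.js', 'aruco.js');   // js-aruco sources
const detector = new AR.Detector();

self.onmessage = (event) => {
  const { buffer, width, height } = event.data;
  // js-aruco only needs an object with width, height and data, like ImageData.
  const imageData = { width, height, data: new Uint8ClampedArray(buffer) };
  self.postMessage(detector.detect(imageData));
};
```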

Point light shadows in Three.js, part II

Ruben

While working on [a 3D project] that involved garden lights we stumbled upon unexpected problems with shadows cast from point lights. The shadows were either not there at all or they were just all over the place. It seemed impossible to create a decent light/shadow world (a garden, in our case). After recovering from the initial shock and disappointment we started investigating the problem. This should be doable. As it turns out, it sort of is. Read on.

In the [previous blog post] about this topic we talked about possible ways to implement an efficient way of calculating point light shadows in [Three.js], the JavaScript 3D library. 'Efficient' meaning: taking fewer texture samplers than the naive approach, which takes six samplers for each point light, one for each of the principal directions. We came up with three potential solutions:

- Divide a larger texture into smaller viewports and draw the depth map to each of these viewports
- Render each of the depth maps to one side of a cube texture map
- Use dual-paraboloid shadow mapping

We will first discuss the three approaches and finish up with the results of the most successful one.

Possible solutions

The first approach - dividing a larger texture into smaller viewports - proved to be difficult to integrate with the current shader and uniform variables. The GLSL shader language requires all of the for loops to be unrollable, so that the shaders can be divided into small chunks of work for each of the stream processors of the GPU. When handling an assortment of textures, some of which have this subdivision into smaller textures, it is a pretty complex task to write the code in such a way that it is unrollable without making two separate loops for the two types of textures, which in turn makes maintaining the shader code quite a hassle.

viewports
Image from the internal rendering in Three.js. The grid is added to clarify what happens. The grid locations correspond with the +x, −x, +y, −y, +z and −z axes, going from top left to bottom right.

The second approach, the cube texture, was scrapped halfway through development. While the solution seemed obvious and is also used in the industry, it was very hard to debug. Both Firefox's native canvas debugger and Chrome's WebGL debugger, the [WebGL Inspector], did not render the cube texture properly. We could observe the switching of framebuffers (which are like internal screens to draw on) but they stayed blank while the draw calls proceeded. This means Three.js did not cull them and they should have shown up on the framebuffer. With no way to debug this step and no output it would be ill-advised to continue developing this method.

cube depth map
Image taken from [devmaster.net] explaining shadow mapping using cube maps.

The final approach is dual-paraboloid shadow mapping. This approach takes two textures per point light; the [previous blog post] mentioned one, but that proved to be incorrect. This fact alone makes it less ideal than the other two approaches. On top of that, the implementation is rather complex. If we had complete control over the OpenGL code this could be a solution, but figuring out where to adapt the Three.js code and the shaders would probably turn out to be a struggle. As it would also involve a transformation to paraboloid space it would be really hard to debug. All this would be required for a lesser effect than the other - hopefully simpler - methods, like the larger texture with viewports.
paraboloid transformation
Image taken from [gamedevelop.eu] explaining the paraboloid transformation.

The most favorable approach

In conclusion, the best way to make point light shadows work, without going over the texture-sampler limit or spending too much time, is the "large texture with viewports" approach. This means we have to duplicate some code in the shader and implement two loops to do the shadow calculation: one calculating shadows for all the spot lights and one for all the point lights.

After implementing this strategy we ran into another problem. This time the number of varying variables ([GLSL standard], page 31) in the shaders exceeded the register limit of the WebGL implementation. In Chrome this limit is fixed at 16, which meant we could only have one point light with shadows - even fewer than with the naive implementation. In Firefox the limit is higher because it is left to the hardware implementation. On my - basic - hardware it works smoothly with two point lights, but performance starts to suffer when enabling three or more point lights. The result is shown in the video below. (A small sketch for querying these limits from JavaScript follows at the end of this post.)

The reason for this is that the hardware implementation of the fragment shaders only has a couple of "fast registers". These are separate hardware implementations of real registers which allow fast access to the data stored within. If you exceed this hardware limit, values normally stored in these fast registers will be stored in "slow registers", which are implemented by storing the values in video RAM - much slower than the fast registers.

Shadows from point lights in our demo

Conclusion

Can we use these results for something practical? Yes, in Firefox this demo will run in real time with a couple of point lights IF your hardware has some extra "fast registers". If you want to use more than a couple of point lights, you can still use this implementation to generate screenshots of scenes that give a nice impression of the shadows being cast (in a garden, in a living room, etc.). For an extensive, real-time solution you will need above-average desktop hardware. Consumers using popular devices (smartphones, tablets, laptops) are obviously not part of the target audience. However, practical applications are still to be found in - for example - fixed setups like a presentation at an exhibition stand.

[GLSL standard]: https://www.khronos.org/files/openglesshadinglanguage.pdf#page=37
[devmaster.net]: http://devmaster.net/p/3002/shader-effects-shadow-mapping
[gamedevelop.eu]: http://gamedevelop.eu/en/tutorials/dual-paraboloid-shadow-mapping.htm
[WebGL Inspector]: http://benvanik.github.io/WebGL-Inspector/ "WebGL inspector homepage"
[Three.js]: http://threejs.org/ "three.js homepage"
[previous blog post]: /blog/15/point-light-shadows-in-threejs
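As a side note to the limits mentioned above, you can check what a given browser/GPU combination actually supports from JavaScript. This is just a small standalone sketch, unrelated to the Three.js internals discussed in the post.

```javascript
// Query the WebGL limits that bit us: varying vectors and texture samplers.
// A standalone sketch; run it in the browser console on any page.
const gl = document.createElement('canvas').getContext('webgl');

console.log('max varying vectors:',
            gl.getParameter(gl.MAX_VARYING_VECTORS));          // the post reports 16 in Chrome
console.log('max fragment texture samplers:',
            gl.getParameter(gl.MAX_TEXTURE_IMAGE_UNITS));      // 16 in our case
console.log('max fragment uniform vectors:',
            gl.getParameter(gl.MAX_FRAGMENT_UNIFORM_VECTORS));
```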

Point light shadows in Three.js

Marlon

For a research and development project we created a small garden environment in which you can place lights. The objective was to visualise what your garden would look like during the night, beautifully lit according to your personal light design. Of course, objects in your garden cast shadows and influence the look and feel of the lighting, so we needed to include shadow casting in our demos. That seemed doable, but it turned out that Three.js, the framework we use for our WebGL development, only supports shadow casting for spot lights. Unfortunately not all lights in our gardens are spot lights... We needed to find a way to cast shadows from point lights in Three.js.

YGA lighting
Screen from the garden prototype.

Spot light support only...

The first step was understanding how shadow casting works for spot lights and why this would not work for point lights. The process of shadow casting is quite elegant: it takes an extra render pass and one comparison per object and light to determine whether a fragment should be shaded. The extra render pass renders the distance of every object to the light to a separate texture; this is called the depth pass. You can do this by applying the inverse view matrix, which represents the position and rotation of the camera, to the current object and then applying the model matrix of the light you want to calculate shadows for.

Shadow pass

The above picture shows a color representation of the depth value. Each color corresponds with a 32-bit integer indicating where the z-position lies between Z-Near and Z-Far: the maximum value of the 32-bit integer corresponds with Z-Far and the value 0 corresponds with Z-Near.

Now you can check for every pixel whether it is in the shade of a certain light, as follows. Take the world coordinate of the fragment you are processing right now; this information can be passed on by the vertex shader and interpolated in the fragment shader. Transform it into light space: mimic looking at that spot from the position of that light. This is achieved by passing the transformation matrix of the light you are processing to the fragment shader. Then transform it to a pixel coordinate by applying the perspective transformation of the corresponding "shadow camera"; this determines what area is shaded by this light. The next step is to calculate the distance between the point and the light: if it is greater than the distance stored in the depth pass texture (the z-buffer texture), the pixel is shaded by this lamp; if it is closer, the pixel is illuminated. (A minimal Three.js sketch of this supported spot light case follows below.)

But this only works well for spot lights, because of how perspective works in OpenGL. Perspective mapping in OpenGL works with frustums: a frustum is a pyramid with the top cut off. The new plane that is created by removing the top of the pyramid is called the near plane, and this is what you see on screen. All other pixels are coloured by ray tracing from the eye through the pixel you want to color to the far plane (which is the bottom of the pyramid). This works for small angles, but the closer you get to the maximum angle of 180 degrees, the smaller and closer to the camera position the near plane becomes, which causes weird distortions in the rendering. For point lights we would need a 360 degree field of view, as it is called, and this is simply impossible.

Frustum
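For reference, this is roughly what the supported case looks like: enabling shadows for a spot light in Three.js. A minimal sketch using the current Three.js API; some property names differed in the Three.js versions available when this post was written.

```javascript
// Enabling shadow casting for a spot light in Three.js (sketch, current API).
// Assumes THREE is available globally, e.g. via a script tag.
const renderer = new THREE.WebGLRenderer();
renderer.shadowMap.enabled = true;               // turn on the shadow map pass

const scene = new THREE.Scene();

const light = new THREE.SpotLight(0xffffff, 1);
light.position.set(2, 5, 2);
light.castShadow = true;                         // this light gets a depth pass
light.shadow.mapSize.set(1024, 1024);            // resolution of the depth texture
scene.add(light);

const ground = new THREE.Mesh(
  new THREE.PlaneGeometry(10, 10),
  new THREE.MeshLambertMaterial({ color: 0x88aa66 })
);
ground.rotation.x = -Math.PI / 2;
ground.receiveShadow = true;                     // shadows are drawn onto this mesh
scene.add(ground);

const box = new THREE.Mesh(
  new THREE.BoxGeometry(1, 1, 1),
  new THREE.MeshLambertMaterial({ color: 0xaaaaaa })
);
box.position.y = 0.5;
box.castShadow = true;                           // this mesh occludes the light
scene.add(box);
```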
So what do we do? Instead of doing one depth pass we do six: one for every unit direction of the space we are in, so one in the +x, -x, +y, -y, +z and -z direction, all with a horizontal and vertical field of view of 90 degrees. This covers the whole space. But where do we render them to? We can't use one texture for all of them, as each pass would simply overwrite the previous depth pass. There are four solutions:

- Render them to 6 different textures. This takes up 6 of the (in our case) 16 texture samplers available to us in the WebGL fragment shader pipeline.
- Render them to one big texture with smaller viewports. This lowers the maximum possible shadow resolution and complicates the shaders, as you need to map the shadow camera matrices to the right parts of the texture.
- Use a cube map: a 3D texture that has 6 2D textures corresponding to each side of a cube, creating a shadow cube. This only uses one sampler but has the same shader complexity of mapping a shadow perspective to the right cube side.
- Use dual-paraboloid shadow mapping, which does allow us to use one texture per point light but still needs the 6 render passes we talked about. Another downside of this technique is that there are slight distortions, but all in all it should look nice enough.

These are all candidate implementations for the final prototype. None of them has been implemented yet except for the first one, which uses 6 separate textures per point light (a rough sketch of such a six-direction depth pass follows at the end of this post). The result of that implementation is shown below.

Update: we've written a follow-up article "Point light shadows in Three.js, part II".

Test environment for point light shadows
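To make the first option a bit more concrete, here is a rough sketch of how a six-direction depth pass could be set up with Three.js render targets. This is an illustration of the idea, not the code used in the prototype; it assumes a global THREE namespace and uses the current `renderer.setRenderTarget` API (older Three.js versions passed the render target directly to `renderer.render`).

```javascript
// Six depth passes for one point light, one per principal axis (sketch only).
const SHADOW_MAP_SIZE = 512;
const directions = [
  new THREE.Vector3( 1, 0, 0), new THREE.Vector3(-1, 0, 0),
  new THREE.Vector3( 0, 1, 0), new THREE.Vector3( 0,-1, 0),
  new THREE.Vector3( 0, 0, 1), new THREE.Vector3( 0, 0,-1)
];
// Up vectors chosen so lookAt never coincides with the camera's up axis (±y cases).
const ups = [
  new THREE.Vector3(0, 1, 0), new THREE.Vector3(0, 1, 0),
  new THREE.Vector3(0, 0,-1), new THREE.Vector3(0, 0, 1),
  new THREE.Vector3(0, 1, 0), new THREE.Vector3(0, 1, 0)
];

// A 90 degree fov with aspect 1 in each of the six directions covers the whole space.
const shadowCamera = new THREE.PerspectiveCamera(90, 1, 0.1, 100);
const depthMaterial = new THREE.MeshDepthMaterial();

// One texture per direction: this is what costs 6 of the 16 available samplers.
const targets = directions.map(() =>
  new THREE.WebGLRenderTarget(SHADOW_MAP_SIZE, SHADOW_MAP_SIZE)
);

function renderPointLightDepth(renderer, scene, light) {
  shadowCamera.position.copy(light.position);
  scene.overrideMaterial = depthMaterial;        // render depth instead of colors

  directions.forEach((dir, i) => {
    shadowCamera.up.copy(ups[i]);
    shadowCamera.lookAt(light.position.clone().add(dir));
    renderer.setRenderTarget(targets[i]);        // draw into this direction's texture
    renderer.render(scene, shadowCamera);
  });

  renderer.setRenderTarget(null);                // back to the screen
  scene.overrideMaterial = null;
}
```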