A brief survey of modern technological solutions for server-side vector graphics rendering.
Updated: May 7
Nevertheless, there are cases when a software product cannot be built on top of web technologies. One such case is a requirement to render graphics content on remote servers, processing many incoming requests in parallel and sending the results back to users, all in real time and with minimal latency. Furthermore, if the application requires interleaving 2D graphics with 3D at maximum performance, it usually calls for a customized solution, which can be an existing third-party library or a ‘roll your own’ implementation matched to the specific needs of the application. In this post I will give a brief overview of the most common 2D scalable vector graphics rendering technologies and their pros and cons, based on my personal experience working on projects that required implementing such functionality.
Raster (left) vs vector (right) shapes when scaled up.
I want to give a hypothetical example of a niche application that could require high-performance vector graphics rendering: you want to develop cloud-based software capable of streaming live video and overlaying 2D graphics on top of it. This is quite a common application in sports, entertainment, advertising and other industries. You’re the tech lead in charge of developing this product, and you have to decide what kind of infrastructure the company is going to rely on to achieve the goal and deliver the best solution possible.
Abusing the Web Browser
Ideally, you would like a turnkey solution that provides 2D graphics rendering out of the box with minimal fuss. This can be a web browser like Chromium or Firefox, which can run even on a headless server. Full-stack developers would probably want to use node.js with browser-less canvas modules like this one, issue drawing commands and download the results from the server as a bitmap. I won’t continue down the node.js path in this article as I don’t have any experience using it. Having said that, it stands to reason that running vector graphics rendering via node.js should be more efficient than doing the same with a full web browser.
State-of-the-art vector graphics rendering API with a convenient interface supporting any type of path drawing defined by SVG; it will save you the headache of developing and maintaining your own solution.
Zero ability to adapt the solution to specific needs, and unique apps often come with specific use cases. Vector path rendering may be just one part of a complex composition of graphics layers whose sources come from other places (video, images, 3D models). Such a scenario becomes even harder to implement if there is a requirement for interaction between different layers, like dynamic z-order, intersection, pixel-perfect collision tests, or 2D/3D picking. That’s because a browser will return a bitmap with all the shape data baked into it, and performing a hit-test on a 2D shape baked into an image doesn’t allow pixel-perfect precision. Animated shapes would first require per-frame re-rendering with such an API, then re-submitting the results to your application to update that specific graphics element. And what if your app uses hundreds of highly dynamic animated shapes which interact with other graphics objects in a layered manner, including alpha blending? You would have to mark for re-render every vector shape element which overlays another shape whose graphics state has changed. And there is much more to this when it comes to layered rendering that involves 2D graphics manipulated in a specific order.
In other words, if the 2D graphics are used as one monolithic overlay bitmap, such a solution is acceptable, but forget about high performance.
Then there is a scalability problem. Just take a look at how much memory Google Chrome consumes on your system with every new tab you open. Using node.js on the server (I assume it uses fewer hardware resources when calling the various web APIs than running a browser headlessly) with dozens of cores and a ton of RAM available will make this less painful, but it is a bad starting point for efficient scaling, and you will find yourself wasting a significant amount of money running extra AWS instances to amortize the problem.
The above solution should probably belong with what I put at the bottom of this article, as it introduces the same inefficiencies as game engines and video authoring tools run in unconventional ways. But I put it separately on purpose: today’s world is heavily based on web technologies, and many companies take it for granted that they can be used to solve problems these technologies were never designed to solve.
GL_CHROMIUM_path_rendering or GL_NV_path_rendering extensions.
The first one probably (again, Skia experts, please correct me if I am wrong here) uses the ANGLE back-end, which provides WebGL and GL ES interfaces, translating calls to D3D on Windows. DirectX has the Direct2D library, which provides its own implementation of GPU-accelerated path rendering. The second extension is an Nvidia-specific OpenGL extension called NV_path_rendering, which adds the ability to render vector paths using the OpenGL API. If you run Skia on an Nvidia GPU, it will likely try to use this back-end to rasterize 2D vector shapes.
The code that checks for GPU-accelerated path rendering support is located in google/skia/src/gpu/gl/GrGLCaps.cpp.
Additionally, Skia provides its own GPU-accelerated path rasterizer (again, correct me if I am wrong, but that’s what it seems to be, based on the code in GrTriangulator and other Gr* files in Skia’s GPU directory). Finally, Skia provides a complete software rasterizer implementation as a fallback for systems lacking proper GPU hardware.
Use Skia as a third-party library integrated into your application to provide support for GPU-accelerated vector graphics rendering.
State-of-the-art vector graphics rendering API supporting standards like SVG; it will save you the headache of developing and maintaining your own solution.
Compared to using a web browser as an interface, this one provides much better overall performance and significantly lower hardware resource consumption, as it is not part of another application like a web browser.
Can be integrated directly into the target application.
It is not easy to decouple unneeded functionality, as Skia brings its own types even for simple math data structures like SkPoint, SkRect and SkScalar. It also manages dynamic memory allocation using memory pools with its own data structures, as well as its own types for reference counting (aka smart pointers). In short, if you plan to extract the GPU path rendering functionality from the code base, it is going to be a non-trivial amount of work: you will need to replace or include all the POD types and memory management routines, refactor almost every function signature, and so on. And that’s before any functional modifications your project may need. It won’t be an easy task, as the library is pretty big and its complex algorithmic areas don’t come with any documentation.
Skia may produce noticeable artifacts for complex paths (see below) when running in software mode.
Image source: “Anecdotal Survey of Variations in Path Stroking among Real-world Implementations”, M. J. Kilgard.
In the above illustration the author (who is also the person behind Nvidia’s NV_path_rendering extension) compares the results of different vector path rasterizers. Here we can see that Skia CPU and GPU (without NV_path_rendering) receive grade C, which is pretty low, due to the artifacts they produce.
Below is the grade A+ (perfect score) result obtained using the NV_path_rendering OpenGL extension.
Visual bugs of this magnitude are hard to dismiss in real-world applications. If you want to read more about using NV_path_rendering directly, skip to the bottom of the article.
Another option is the Qt SDK. Yes, it is best known as a robust toolkit for building complex cross-platform desktop and mobile UI systems, but the SDK can also be used in offscreen mode, and it has its own cross-platform CPU/GPU vector graphics rendering functionality. In fact, cross-platform native UI programming is just a small portion of what the Qt SDK does nowadays; you can use it to build a full-fledged 3D rendering engine with minimal effort. The latest version of the SDK provides vector graphics rendering via the C++ Graphics API or, alternatively, the QML Shape API. Both render shapes at very high quality using the GPU. But here is the catch: the Qt SDK code base is huge and so complex that you should forget about “hacking” relevant pieces out of it. It won’t work. The standard approach when working with Qt is either relying totally on the SDK (which provides nearly every conceivable module for any sort of development task), or creating a hybrid architecture where, for instance, the core rendering logic is written outside the Qt SDK in “pure” C++, then integrated into the Qt project as a static lib or just as raw source files. In that case, from my personal experience, it won’t be easy to interact with the shape rendering API, again, since it is an integral part of Qt’s rendering pipeline. Therefore, the workflow would be similar to using the Skia library: call Qt’s drawing routines, bake the results into a bitmap, and render the bitmap in your renderer via texture mapping.
One of the best existing turnkey C++-based solutions for rich-graphics, cross-platform application development available on the market, provided you’re fine using the Qt SDK as-is.
Easy learning curve.
Fast development pace.
Vector shapes are constrained to 2D space. To render in 3D one has to render to a texture, then map it onto a 3D model.
Modifying the source is possible, though it is going to be a hell of a lot of work. If your finances allow, you can hire Qt’s creators to modify the source code to fit your specific needs.
Noticeable memory overhead due to the SDK’s size. This can be mitigated by building the SDK from source and excluding unused modules. Still, some of the libraries are mandatory, so you can’t get it down to a bare minimum.
The licensing model may be at odds with your company’s business model.
Getting “Close to the metal”
It’s important to note that the above-mentioned solutions (except browser-based rendering) can truly be considered “close to the metal”, by definition: the code is compiled to machine code, no VMs involved. Still, libraries like Qt and Skia present relatively complex architectures, with pretty deep call stacks and processing pipelines, which brings a performance overhead that cannot be dismissed. So how can we shave these off and communicate with the hardware even more efficiently?
Low-level graphics hardware APIs like OpenGL, Direct3D, Vulkan and Metal (yeah, the low-level graphics libs market is thriving…) allow fairly fast access to the GPU. Don’t be mistaken, though: these libraries do not issue function calls directly to your graphics card. They are still implemented in C/C++ and installed on the system as .dll or .so libraries. To get a payload to the GPU, you issue a function call into the user-space library (executed on the CPU), which proceeds into the kernel driver, which finally dispatches the corresponding hardware-specific command to the GPU. Still, we are talking about something that happens extremely fast, and you probably can’t get faster unless you write your own GPU driver implementation.
So I have mentioned Nvidia’s path rendering extension quite a few times in this article (from now on I will call it NVPR). Skia supports it as one of its GPU backends. If you have the capacity to use “raw” OpenGL in your application, and deployment is planned on AWS or Google Cloud, it is not a bad idea to grab NVPR and implement a high-level interface on top of it to provide state-of-the-art, high-performance vector shape rendering directly on the GPU. I have used NVPR in the past both for vector shapes and for text rendering, and all I can say is: it is a pity this extension was never promoted to core OpenGL, which means it will probably stay vendor-specific forever, which in turn means you can’t run an OpenGL-based app that uses this extension on anything but Nvidia cards. But hey, who uses Intel or AMD GPUs for server-side rendering in the cloud today? NVPR doesn’t just provide extremely accurate vector rasterization (see the Skia paragraph above) that beats industry-standard alternatives like Qt, Skia and Cairo; it is also as fast as it gets, mostly due to specific optimization work done in the OpenGL driver to allow faster batching and pipelining of path commands with minimal hardware state change at runtime.
Image source. Different results from projective (3D) rendering of shapes by NVPR, Skia, Cairo, Qt path rendering implementations.
NVPR features, out of the box: text, SVG and PostScript path formats, quadratic and cubic curves, state-of-the-art shape strokes with different join and cap styles, path interpolation, accurate intersection tests, gradient fills, bitmap fills (via pixel shader) and more. And… it integrates seamlessly with 3D rendering. Adobe Illustrator uses it for vector shape rasterization, which probably means it is a rock-solid solution for professional vector graphics.
I can summarize the disadvantages to the following, again, based on my experience working with this extension:
Vendor-specific. Unsupported on non-Nvidia platforms.
The API is partially based on fixed-pipeline functionality, though some of the calls are DSA methods, which is nice and makes using the API and managing GL state a bit more convenient. In fact, Nvidia is committed to optimizing the fixed-pipeline calls to carry minimal overhead compared to the programmable pipeline. This is more of an overall application design consideration, but it may lead to somewhat ugly inconsistencies in execution flow if the rendering core is written with modern OpenGL. (For example, with a programmable pipeline you upload matrices to the GPU via uniforms and GPU buffers, while NVPR uses immediate-mode functions like glPushMatrix/glLoadMatrixf.)
Only fragment shader stage access is allowed, which means you have zero access to the geometry buffers generated by NVPR (not sure you will ever need them, but who knows…). Access to fragment shader attributes like UV coordinates is provided only via built-in GLSL variables (similar to the pre-GLSL-3.0 era).
MSAA or some other sort of screen-space anti-aliasing must be used, as NVPR takes care of smoothing only the curvature; straight lines stay aliased by default (this has to do with the path rendering implementation technique, which is based on the stencil-then-cover algorithm).
GPU debugging. This is probably the most important one. If you are used to GPU debuggers like RenderDoc or Nvidia Nsight, you will feel blind, as those refuse to support debugging of vendor-specific extensions. I hadn’t used Nsight for a long time, but the last time I did, it didn’t work when NVPR function calls were used in the app. Now (12/02/2020) I installed the latest Nsight Graphics 2021.1.0 release and, well, it still doesn’t support debugging NVPR draw calls… which is kind of weird, given that NVPR is Nvidia’s own extension.
I skip Microsoft’s Direct2D rendering library in this article, as I have never used it. The library is also Windows-specific, which makes it less attractive for server-side rendering, which today is heavily based on Linux operating systems.
“Roll your own” solution.
Then you can just decide to implement a GPU-accelerated vector graphics rendering library from scratch. This is totally doable. It will require at least a minimal understanding of computational geometry and some linear algebra. If you’re particularly good at those subjects, it may take you several months to get it done.
The Pros are obvious:
Made and maintained in-house, which gives total control over what to implement and how.
Can fit into modern low level APIs like Vulkan seamlessly.
What about Cons?
Development and maintenance time.
Hard to implement advanced features without proper research and multiple dev iteration cycles.
Must always have an expert on the R & D team to take care of this module.
Probably will never be as fast as NVPR.
If you decide to go for it, I highly recommend reading “Resolution Independent Curve Rendering using Programmable Graphics Hardware” by Loop & Blinn. This is a foundation for many GPU accelerated implementations nowadays.
Image source: “Resolution Independent Curve Rendering using Programmable Graphics Hardware”, page 4.
NVPR is based on this paper, with some further improvements, like real-time triangulation (vs. the offline geometry generation proposed in the original paper), which is possible thanks to the two-pass stencil-then-cover trick. There are other interesting techniques based on scanline rasterization, but those would probably perform slower. If you can afford to invest in thorough research, you can also invent your own technique. For example, here you can check out probably the best GPU-accelerated vector text rendering solution on the market, created and patented by computer scientist and mathematician Eric Lengyel. Sometimes it is better to pay for a state-of-the-art solution: it will save you both time and money in the long run, giving you immediate access to cutting-edge technology.
Headless game engines, image and video editing software.
Finally, this part is dedicated to the desperate souls: PoC implementers and those who lack the technical capacity to develop robust solutions.
You think this is a joke, right? Why on earth would anyone want to run a game engine in the cloud? Well, it is not a joke. In fact, many businesses do it, from cloud-based game streaming all the way down to media companies which lack proprietary rendering solutions and are therefore forced to run video editing software like Adobe After Effects on the server. I will not get into the details of running Adobe After Effects on a server. My advice: just don’t do it.
Epic’s Unreal and Unity game engines, on the other hand, are totally workable solutions. I will put Unity aside, as my experience with its recent releases is very limited, and focus on Unreal Engine 4 (UE4).
UE4 can, with minimal effort, be turned into a headless remote rendering server, but it doesn’t support vector shape rendering by default; you will need to buy or write a plugin that answers your needs. At the time of this writing, UE4 can run in headless mode on Linux using the Vulkan rendering back-end. UE4 also completely exposes the whole engine source code (C++), so you can hack it according to your needs. The fact that it deprecated OpenGL in favor of Vulkan makes it somewhat difficult to integrate low-level path rendering functionality via the NVPR extension, for example. So it won’t be easy to get path rendering into the UE4 rendering core.
OpenGL-Vulkan interoperability extensions can be used to render with NVPR in an OpenGL context and then copy the result into a Vulkan surface. Again, we are talking about hacking into UE4’s RHI, which is pretty big and complex. So yes, this won’t be easy.
In terms of hardware resource consumption, UE4 would still present a major bottleneck for any setup that needs to run more than a single UE4 instance simultaneously on the same server. A minimal UE4 app takes ~1 GB+ of RAM and utilizes close to 100% of the GPU; anything more serious may easily surpass the 12 GB mark. It’s a beast. Payments to cloud providers will increase dramatically, as you will have to add more and more hardware (both CPU and GPU) instances to meet scaling demands. Another important consideration here is royalties and maintenance. Some game engines are free to use and open source but will ask you to start paying royalties once you reach some earnings threshold; others are royalty-free but closed source and require payment by subscription. It is hard to tell what the right choice is from the business point of view, as it depends on the product’s market potential. From the engineering perspective, in my opinion, open source is preferable, as you have the ability to maintain the code base yourself. Remember, any bug shipped with the next release of a closed-source engine means a bug in your product.
I didn’t forget OpenVG; I omitted it on purpose. More than a decade ago, the Khronos Group designed an open standard for GPU-accelerated scalable vector graphics and called it OpenVG. The thing is, the standard didn’t take off across the major hardware platforms. It was meant to be the OpenGL of vector graphics; it didn’t happen. One of the reasons may be that there is not much need for GPU 2D shape rendering in the gaming industry, who knows? Anyhow, the API is pretty ugly (OpenGL 1.1-style immediate mode), there are several commercial hardware implementations targeting mostly embedded devices, and some open-source implementations on top of OpenGL exist as well, but I have never used those and have no idea how robust and performant they are.