Jake Ryan JuanDiegoMontoya

@JuanDiegoMontoya
JuanDiegoMontoya / opengl_resource_indexing.md
Last active April 16, 2024 10:52
How to do bindless in OpenGL without blowing your legs off.

The Definitive Guide to Non-Uniform Resource Indexing in OpenGL

Code

For those short on time:

#extension GL_NV_gpu_shader5 : enable
#extension GL_EXT_nonuniform_qualifier : enable

#ifdef GL_EXT_nonuniform_qualifier
#define NonUniformIndex(x) nonuniformEXT(x)
JuanDiegoMontoya / glslang_includer.cpp
Last active March 28, 2024 12:26
A shrimple little include handler for glslang that works (doesn't crash lol) for my subset of use cases. Supports nested relative includes with local include syntax (""). Does not support absolute paths for includees or system include syntax (<>).
#include <glslang/Public/ShaderLang.h>
#include <glslang/SPIRV/GlslangToSpv.h>
#include <vector>
#include <cassert>
#include <stdexcept>
#include <fstream>
#include <memory>
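The description above mentions nested relative includes with local include syntax. As a rough sketch of just the path-resolution step (this is not the gist's code and does not use the glslang includer interface), a local include can be resolved against the directory of the file that contains the directive. `ResolveLocalInclude` is a hypothetical name, and C++17 `std::filesystem` is assumed.

```cpp
#include <filesystem>
#include <string>

// Hypothetical helper (not part of glslang or the gist): resolves a local
// include ("...") relative to the directory of the file that includes it.
inline std::string ResolveLocalInclude(const std::string& includerPath,
                                       const std::string& includeName)
{
  namespace fs = std::filesystem;
  // Directory of the including file, e.g. "shaders/a" for "shaders/a/main.glsl".
  const fs::path includerDir = fs::path(includerPath).parent_path();
  // lexically_normal() collapses "." and ".." purely textually (no disk access).
  return (includerDir / includeName).lexically_normal().generic_string();
}
```

For nested includes, the resolved path of each includee would become the `includerPath` for the directives it contains, so relative paths keep resolving against the right directory.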
JuanDiegoMontoya / coro_semaphore_2.cpp
Created December 22, 2023 12:43
test of sync primitive for libcoro: this semaphore can be incremented and decremented by an arbitrary amount (instead of just 1), which makes it useful for rate-limiting resources
#include <coro/coro.hpp>
namespace test
{
class semaphore
{
public:
enum class acquire_result
{
acquired,
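To illustrate the idea of a semaphore that moves by arbitrary amounts, here is a minimal thread-based sketch. It is an assumption-laden analogue, not the gist's implementation: the gist targets libcoro coroutines, while this version blocks OS threads with `std::mutex` and `std::condition_variable`. The class and method names are hypothetical.

```cpp
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Hypothetical blocking analogue of a semaphore whose count can change by
// an arbitrary amount per operation (the gist itself is coroutine-based).
class CountingSemaphore
{
public:
  explicit CountingSemaphore(std::int64_t initial) : count_(initial) {}

  // Blocks until `amount` units are available, then takes them all at once.
  void Acquire(std::int64_t amount)
  {
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [&] { return count_ >= amount; });
    count_ -= amount;
  }

  // Non-blocking variant: takes `amount` units only if they are available now.
  bool TryAcquire(std::int64_t amount)
  {
    std::lock_guard<std::mutex> lock(mutex_);
    if (count_ < amount)
      return false;
    count_ -= amount;
    return true;
  }

  // Returns `amount` units. Waiters may want different amounts, so wake all.
  void Release(std::int64_t amount)
  {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      count_ += amount;
    }
    cv_.notify_all();
  }

  std::int64_t Available() const
  {
    std::lock_guard<std::mutex> lock(mutex_);
    return count_;
  }

private:
  mutable std::mutex mutex_;
  std::condition_variable cv_;
  std::int64_t count_;
};
```

For rate limiting, the count could represent a budget such as bytes of staging memory: each job acquires its size up front and releases the same amount when it finishes.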
JuanDiegoMontoya / SampleShadow.glsl
Created June 3, 2023 09:31
Adaptive shadow map sampling
uint pixel_index = uint(gl_FragCoord.x) + uint(gl_FragCoord.y) * 4096u;
uint seed = pcg_hash(pixel_index);
float shadow = Shadow(seed, fragWorldPos, normal, -shadingUniforms.sunDir.xyz);
float shadow_accum = shadow;
int succ = 0;
for (int i = 0; i < 4; i++)
{
if (shadow_accum / (i + 1) > 0.0 && shadow_accum / (i + 1) < 2.0 / (shadowUniforms.pcfSamples * (i + 1)) + 0.0001)
{
succ++;
JuanDiegoMontoya / glCopyBufferSubData_but_ancient.cpp
Last active June 22, 2022 08:34
glCopyBufferSubData (ancient edition)
/*
* This snippet demonstrates how to copy the contents of a buffer to another buffer without CPU readback
* using only features available in OpenGL 2.1.
* I use DSA for brevity, but hopefully it should be obvious how to convert this to ancient GL if needed :)
* The method involves creating a temporary texture and framebuffer, copying the source buffer's contents
* to the texture (via pixel unpack), then copying the texture's contents to the destination buffer using pixel pack.
*/
// convenience
glPixelStorei(GL_UNPACK_ALIGNMENT, 1);
JuanDiegoMontoya / NonUniformHack.glsl
Last active April 11, 2024 22:38
UPDATE: This does not actually work in its current form! Use nonuniformEXT on available drivers (AMD at the time of writing). GLSL cross-platform non-uniform resource indexing hack (I did not make this)
#extension GL_ARB_shader_ballot : require
uint NonUniform(uint i)
{
for (;;)
{
uint cur = readFirstInvocationARB(i);
if (cur == i)
{
return cur;
}
JuanDiegoMontoya / rand.glsl
Last active March 21, 2024 03:59
A fast, robust, and easy-to-use set of functions for generating random numbers in GLSL. Rehosted/stolen from here so it's in an easier-to-access location: https://www.shadertoy.com/view/sd3GR2
// PCG hash, see:
// https://www.reedbeta.com/blog/hash-functions-for-gpu-rendering/
// Used as initial seed to the PRNG.
uint pcg_hash(uint seed)
{
uint state = seed * 747796405u + 2891336453u;
uint word = ((state >> ((state >> 28u) + 4u)) ^ state) * 277803737u;
return (word >> 22u) ^ word;
}
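For checking shader results on the CPU, the hash above ports to C++ almost verbatim, since GLSL `uint` arithmetic wraps modulo 2^32 just like `std::uint32_t`. The sketch below assumes a platform where `int` is 32 bits. `hash_to_unit_float` is a hypothetical helper added for illustration and is not taken from the gist.

```cpp
#include <cstdint>

// CPU-side port of the GLSL pcg_hash above; unsigned overflow wraps, as in GLSL.
inline std::uint32_t pcg_hash(std::uint32_t seed)
{
  std::uint32_t state = seed * 747796405u + 2891336453u;
  std::uint32_t word = ((state >> ((state >> 28u) + 4u)) ^ state) * 277803737u;
  return (word >> 22u) ^ word;
}

// Hypothetical helper: maps a hash to a float in [0, 1) using the top 24 bits,
// which a 32-bit float can represent exactly.
inline float hash_to_unit_float(std::uint32_t h)
{
  return static_cast<float>(h >> 8) * (1.0f / 16777216.0f);
}
```

Seeding from a pixel index, as the shadow-sampling snippet earlier does, gives each pixel its own value, because every step of the hash is invertible and distinct seeds therefore map to distinct outputs.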
JuanDiegoMontoya / cubeHelpers.h.glsl
Last active October 29, 2022 21:14
Functions for getting the index and UV for a cube face struck by a direction vector originating inside the cube (i.e., cubemap coord to texture index and UV).
int GetCubeFaceIndex(vec3 dir)
{
float x = abs(dir.x);
float y = abs(dir.y);
float z = abs(dir.z);
if (x > y && x > z)
return 0 + (dir.x > 0 ? 0 : 1);
else if (y > z)
return 2 + (dir.y > 0 ? 0 : 1);
return 4 + (dir.z > 0 ? 0 : 1);
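The face selection above picks the axis with the largest absolute component, then uses that component's sign to choose between the positive and negative face. A C++ mirror of the same logic makes it easy to sanity-check on the CPU. `Vec3` is a hypothetical stand-in for GLSL's `vec3`.

```cpp
#include <cmath>

// Hypothetical stand-in for GLSL's vec3.
struct Vec3
{
  float x, y, z;
};

// Mirrors the GLSL logic above: the dominant axis selects the face pair,
// and the sign of that component selects the face within the pair.
// Resulting order: +X, -X, +Y, -Y, +Z, -Z (indices 0 through 5).
inline int GetCubeFaceIndex(Vec3 dir)
{
  const float x = std::fabs(dir.x);
  const float y = std::fabs(dir.y);
  const float z = std::fabs(dir.z);
  if (x > y && x > z)
    return 0 + (dir.x > 0 ? 0 : 1);
  else if (y > z)
    return 2 + (dir.y > 0 ? 0 : 1);
  return 4 + (dir.z > 0 ? 0 : 1);
}
```

The index order matches OpenGL's cube map face enumeration, so the result can be added to `GL_TEXTURE_CUBE_MAP_POSITIVE_X` or used directly as a layer index into a six-layer array.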
JuanDiegoMontoya / Timer.h
Last active September 10, 2022 06:45
Fast-compiling timer class that relies on cursed and undefined behavior. Notably, the header does not include any headers itself. At the bottom should be a not-UB version that includes the necessary header.
#pragma once
class Timer
{
public:
Timer();
void Reset();
double Elapsed_s() const;
Timer(const Timer& other);
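For comparison, a conventional timer with the same three operations can be sketched on top of `<chrono>`. This is not the gist's not-UB version, which is not shown in the preview. It is a guess at what a straightforward equivalent might look like, and it gives up the no-includes property in exchange for defined behavior. `ChronoTimer` is a hypothetical name.

```cpp
#include <chrono>

// Hypothetical <chrono>-based equivalent of the interface above.
class ChronoTimer
{
  // steady_clock is monotonic, so elapsed time never goes backwards.
  using Clock = std::chrono::steady_clock;

public:
  ChronoTimer() : begin_(Clock::now()) {}

  // Restarts the measurement from the current instant.
  void Reset() { begin_ = Clock::now(); }

  // Seconds since construction or the last Reset().
  double Elapsed_s() const
  {
    return std::chrono::duration<double>(Clock::now() - begin_).count();
  }

private:
  Clock::time_point begin_;
};
```

The compile-time cost here comes entirely from including `<chrono>` in the header, which is presumably what the original avoids by hiding the time point behind opaque storage.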
JuanDiegoMontoya / TimerQueryAsync.h
Last active March 6, 2023 17:16
Asynchronous OpenGL timer query for measuring nested render passes without stalling or blocking. Thanks to Hexcoder for providing an original implementation of this.
#include <cstdint>
#include <optional>
// Async N-buffered timer query.
// Does not induce pipeline stalls.
// Useful for measuring performance of passes every frame without causing stalls.
// However, the results returned may be from multiple frames ago,
// and results are not guaranteed to be available.
// In practice, setting N to 5 should allow at least one query to be available.
class TimerQueryAsync