Wow, that’s painful. It seems the limits can be even more severe than that, too. This actually seems to stem from a clause in C99:
The implementation shall be able to translate and execute at least one program that contains at least one instance of every one of the following limits:
65535 bytes in an object (in a hosted environment only)
I guess we could try chunking this data into multiple strings. It’s perverse to be doing this dance just to make the compiler successfully shove a bunch of literal bytes into the output binary. We could try bypassing the compiler and passing this as a blob to the linker, but my experience is that getting this working cross-platform is more trouble than it’s worth.
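For illustration, the chunking could look something like this: split the blob into several sub-arrays, each comfortably under the 65535-byte limit, plus a small index so callers still see one logical table. All of the names and the 32 KB chunk size here are made up, not from the real code.

#include <stddef.h>

/* hypothetical sub-arrays, each under the 65535-byte translation limit */
static const unsigned char chunk0[32768] = { 0x00 /* , ... first 32 KB of data ... */ };
static const unsigned char chunk1[32768] = { 0x00 /* , ... next 32 KB of data ... */ };
/* ... more chunks as needed ... */

static const unsigned char *const chunks[] = { chunk0, chunk1 };

/* read byte i as if the chunks were one contiguous blob */
static unsigned char table_byte(size_t i) {
    return chunks[i / sizeof(chunk0)][i % sizeof(chunk0)];
}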
Maybe we should take a step back and solve a higher order problem: what actually is this data? It looks vaguely regular to me. Could we compute it programmatically instead of shipping a 10MB look-up-table in the binary?
I am totally out of my depth here, but I wonder if we can just move to 64 bits? I know, I’m repeating something you already said.
We can try to get someone’s attention at GitLab.
Yes, it is data Yifan Hu used for a color palette. We have had problems
with the size in the past. I remember playing with some ways of
generating it at build/install time or dynamically as needed. I got
sidetracked when things fell apart. Let me refresh my memory about it.
After some detailed research (i.e. chain-reading Wikipedia), I think this data is the visible portion of the CIELAB color space under a certain configuration. I tried to learn how to compute this programmatically, eventually coming across this SO post that links to this helpful site. With a combination of that and staring at the lab.c code, I came up with the Python below that almost reproduces the data. I think the remaining discrepancies are due to precision loss or order-of-operations differences I’ve missed somewhere.
The main takeaways are:
yes, we can compute this data programmatically at runtime, however…
it seems prohibitively expensive to do so
Having said that, knowing how to compute it, we can probably come up with a more compact/efficient representation of it (see the sketch after the script below).
def srgb_to_xyz(sR, sG, sB):
    var_R = sR / 255.0
    var_G = sG / 255.0
    var_B = sB / 255.0
    if var_R > 0.04045:
        var_R = ((var_R + 0.055) / 1.055) ** 2.4
    else:
        var_R /= 12.92
    if var_G > 0.04045:
        var_G = ((var_G + 0.055) / 1.055) ** 2.4
    else:
        var_G /= 12.92
    if var_B > 0.04045:
        var_B = ((var_B + 0.055) / 1.055) ** 2.4
    else:
        var_B /= 12.92
    var_R *= 100
    var_G *= 100
    var_B *= 100
    X = var_R * 0.4124 + var_G * 0.3576 + var_B * 0.1805
    Y = var_R * 0.2126 + var_G * 0.7152 + var_B * 0.0722
    Z = var_R * 0.0193 + var_G * 0.1192 + var_B * 0.9505
    return X, Y, Z

# D65 reference white (CIE 1931 2° observer)
Reference_X = 95.047
Reference_Y = 100.000
Reference_Z = 108.883

def xyz_to_cielab(X, Y, Z):
    var_X = X / Reference_X
    var_Y = Y / Reference_Y
    var_Z = Z / Reference_Z
    # threshold is (6/29)**3; textbooks often round it to 0.008856
    if var_X > 0.008856451679035631:
        var_X = var_X ** (1 / 3.0)
    else:
        # textbook form: var_X = 7.787 * var_X + 16 / 116.0
        var_X = (903.2962962962963 * var_X + 16) / 116.0
    if var_Y > 0.008856451679035631:
        var_Y = var_Y ** (1 / 3.0)
    else:
        # textbook form: var_Y = 7.787 * var_Y + 16 / 116.0
        var_Y = (903.2962962962963 * var_Y + 16) / 116.0
    if var_Z > 0.008856451679035631:
        var_Z = var_Z ** (1 / 3.0)
    else:
        # textbook form: var_Z = 7.787 * var_Z + 16 / 116.0
        var_Z = (903.2962962962963 * var_Z + 16) / 116.0
    CIE_L = 116 * var_Y - 16
    CIE_a = 500 * (var_X - var_Y)
    CIE_b = 200 * (var_Y - var_Z)
    return CIE_L, CIE_a, CIE_b

colors = set()
for r in range(256):
    for g in range(256):
        for b in range(256):
            x, y, z = srgb_to_xyz(r, g, b)
            # renamed to L/A/B so the loop variable b is not reused
            L, A, B = xyz_to_cielab(x, y, z)
            # truncate to integers (int() rounds toward zero)
            L = int(L)
            A = int(A)
            B = int(B)
            # reject out-of-range values
            if L < 0 or L > 100:
                continue
            if A < -128 or A > 127:
                continue
            if B < -128 or B > 127:
                continue
            colors.add((L, A, B))

for color in sorted(colors):
    print(color)
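On the compact-representation idea from above: if it turned out that for each (L, a) pair the in-gamut b values form one contiguous run (an assumption the script could check, not something I have verified), the table could shrink to one 4-byte row per (L, a) pair. A hedged sketch in C, with hypothetical names:

#include <stdio.h>

/* hypothetical row type; assumes contiguity of the b runs, which is unverified */
typedef struct {
    signed char L;    /* lightness, 0..100 */
    signed char a;    /* -128..127 */
    signed char bmin; /* first b value in the run */
    signed char bmax; /* last b value in the run */
} lab_run;

/* expand one run back into individual (L, a, b) points on demand */
static void expand_run(const lab_run *r) {
    for (int b = r->bmin; b <= r->bmax; b++)
        printf("(%d, %d, %d)\n", r->L, r->a, b);
}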
A Stack Overflow post claims that, in a somewhat similar situation, declaring a large array as static reduced compile time dramatically. Unfortunately we have two source files that may need to access this table; possibly they need to be merged?
I noticed our interesting ideas about include files. Also, there’s this code:
double *lab_gamut_from_file(char *gamut_file, const char *lightness, int *n){
    /* give a list of n points in the file defining the LAB color gamut. return NULL if the mgamut file is not found.
       lightness is a string of the form 0,70, or NULL.
    */
mgamut? A string of the form 0,70, or NULL? Chuckle.
I think I’ve concluded that this is not a 32/64-bit issue. After an almost endless struggle with using environment variables in PowerShell that are defined in Visual Studio .bat files, I finally managed to compile using the HostX64 compiler. It still fails, but only the 32-bit jobs give any detailed error message. The pipeline is here: Pipeline · Magnus Jacobsson / graphviz · GitLab. I’d appreciate another pair of eyes on the logs to check that I haven’t come to the wrong conclusion. Since we don’t get any detailed error message for the 64-bit builds, it’s possible they encounter a completely different problem, but I doubt it.
How performance-critical is this functionality? We could require users of this library to call an initialization function on startup that computes this table. As an added benefit, we could construct the table as doubles (rather than chars) and then drop the heap allocation in lab_gamut() and just return a const pointer into the table.
Is this worth exploring? If so, I can write up some patches for this so we can test how it behaves.
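To make the suggestion concrete, here is a rough sketch of the opt-in initialization, where compute_lab_gamut() stands in for a runtime implementation of the algorithm above; all names here are hypothetical, not the actual API:

#include <stddef.h>

/* hypothetical: computes the gamut at runtime, returns the rows and their count */
extern double *compute_lab_gamut(size_t *n);

static double *lab_table;
static size_t lab_rows;

/* callers would invoke this once at startup */
int lab_gamut_init(void) {
    if (lab_table != NULL)
        return 0; /* already initialized */
    lab_table = compute_lab_gamut(&lab_rows);
    return lab_table == NULL ? -1 : 0;
}

/* lab_gamut() could then return a const view instead of allocating */
const double *lab_gamut(size_t *n) {
    *n = lab_rows;
    return lab_table;
}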
What’s wrong with making it static (private) data and accessing through a function?
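Something like the following, say; the names are placeholders. This would also address the earlier concern about two source files needing the table, since both would simply call the accessor:

#include <stddef.h>

/* the table stays private to this translation unit */
static const signed char lab_data[] = { 0 /* , ... generated rows ... */ };

/* the only way other files see the data; declare this in a shared header */
const signed char *lab_gamut_data(size_t *n) {
    *n = sizeof(lab_data) / sizeof(lab_data[0]);
    return lab_data;
}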
This seems like a really bad problem with the Microsoft Visual Studio C compiler.
I think others have broken up large tables into smaller tables and concatenated them.
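That could be as simple as copying the sub-tables into one buffer at startup, so downstream code still sees a single contiguous table; part_a/part_b and their lengths below are hypothetical:

#include <stdlib.h>
#include <string.h>

/* hypothetical sub-tables produced by splitting the big one */
extern const signed char part_a[], part_b[];
extern const size_t part_a_len, part_b_len;

/* concatenate once at runtime; keep the result for the program's lifetime */
signed char *concat_table(size_t *n) {
    *n = part_a_len + part_b_len;
    signed char *t = malloc(*n);
    if (t != NULL) {
        memcpy(t, part_a, part_a_len);
        memcpy(t + part_a_len, part_b, part_b_len);
    }
    return t;
}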
The main point is that the algorithm is too slow to run every time. I’m also against penalizing other platforms because of this poor behavior in Visual C.
I have to call it a day here now, but I just wanted to mention that I’ve managed to compile it using HostX86 (32-bit) (but not using HostX64 (64-bit mode)) here: Pipeline · Magnus Jacobsson / graphviz · GitLab.
In these jobs I’m not using msbuild, but the MSVC compiler (cl) directly.
I’m going to have to backtrack tomorrow to figure out exactly what I did to get there, since unfortunately, in the latest series of commits leading up to this working one, I used only HostX64, so it’s going to take some bisecting. I’m not even sure that I’m compiling it in a way that is usable in the appropriate context. I also have to find out how to make msbuild set the cl options to what I’ve manually set them to here.
Also worth noting: this version uses a plain array with 1000 data points per row, not an array of structs with one struct per row, though I don’t know whether that matters. On its own, that change didn’t help anyway.
Wow, nice work!! I was beginning to think it would be impossible to get MSVC to work without significant code changes.
I can’t immediately tell what the key change to make it work was. Maybe msbuild is passing some different flags or a different environment to cl?
If we can figure out a way to decrease memory usage when compiling this file, I think we should still apply it to aid anyone building on an underpowered machine.
I’m now finished with everything except deployment. To test this I’m going to have to push to a branch (or branches) in the master repo in order to be able to use the necessary tokens to deploy to https://www2.graphviz.org. I will name those branches magjac-<something>. I will also use some kind of magjac directory on https://www2.graphviz.org so I don’t interfere with the normal distribution.
This may mean you get emails about failing pipelines. As long as they look like they’re coming from one of my branches, you can safely ignore them.