For better color, there’s a stupid smart answer and a smart stupid answer.
The stupid smart answer is that the k-means algorithm you use to find best-fit colors also works to find best-fit palettes. You take some initial guesswork palettes, bin tiles according to which palette is closest, and modify each palette using the colors from those tiles. A few loops of that get you something half-decent. There’s not much point in overdoing it because I don’t think the results are stable. For this application, you’d want to do one loop per frame, to minimize sudden flickery changes.
The smart stupid answer is to use RGBK and let dithering do its thing. Or for the GBC’s dim screen, CMYK. The point is, error diffusion makes up for whatever nearby pixels missed, and you have a color to deal with the channel that needs it most.