We trained a CLIP model entirely on synthetic data. Here is the paper on arXiv.