When an A.I. model is trained to generate images from text, it learns from a massive dataset of images paired with captions. During training, the model is given each caption and attempts to recreate the associated image as accurately as possible, and the gap between its attempt and the real image is used to adjust the model. Across millions of images, the model learns both general concepts, such as how humans look, and more specific details, such as textures, environments, poses, and compositions, which are more readily identifiable.
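
The training loop described above can be illustrated with a deliberately tiny sketch. It is not how any real image generator is built: the four-caption dataset, the four-"pixel" images, the bag-of-words caption encoding, the linear model, and every name in the code (`DATASET`, `encode`, `generate`, `train`) are assumptions made up for illustration. It only shows the core idea of reading a caption, attempting the image, measuring the error, and adjusting the model.

```python
import random

# Hypothetical toy dataset: each caption is paired with a 4-"pixel" image.
DATASET = [
    ("red square", [1.0, 0.0, 0.0, 1.0]),
    ("blue square", [0.0, 0.0, 1.0, 1.0]),
    ("red circle", [1.0, 0.0, 0.0, 0.0]),
    ("blue circle", [0.0, 0.0, 1.0, 0.0]),
]

VOCAB = sorted({word for caption, _ in DATASET for word in caption.split()})
NUM_PIXELS = 4


def encode(caption):
    """Bag-of-words vector: 1.0 for each vocabulary word in the caption."""
    words = set(caption.split())
    return [1.0 if word in words else 0.0 for word in VOCAB]


def generate(weights, caption):
    """Predict pixel values as a linear function of the caption encoding."""
    x = encode(caption)
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def mean_squared_error(predicted, target):
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


def train(epochs=500, learning_rate=0.1, seed=0):
    """Repeatedly attempt each image from its caption and reduce the error."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.1, 0.1) for _ in VOCAB] for _ in range(NUM_PIXELS)]
    history = []  # average reconstruction error per epoch
    for _ in range(epochs):
        total = 0.0
        for caption, image in DATASET:
            x = encode(caption)
            predicted = generate(weights, caption)
            total += mean_squared_error(predicted, image)
            # Gradient step on the squared error for each pixel's weights.
            for i in range(NUM_PIXELS):
                error = predicted[i] - image[i]
                for j in range(len(VOCAB)):
                    weights[i][j] -= learning_rate * 2 * error * x[j] / NUM_PIXELS
        history.append(total / len(DATASET))
    return weights, history


if __name__ == "__main__":
    weights, history = train()
    print("first epoch error:", history[0])
    print("last epoch error:", history[-1])
    print("red square ->", [round(p, 2) for p in generate(weights, "red square")])
```

In this sketch the weights tied to a word such as "red" are updated by every image whose caption contains it, which loosely mirrors how a concept shared across many captions is learned from all of the images that mention it.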

