Where does ChatGPT get its training data?

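ChatGPT is an AI language model that was trained on a large amount of text from various sources. The most basic form of language model training consists of predicting a word (more precisely, a token) in a sequence of words. In most cases, this takes one of two forms: next-token prediction, where the model sees the preceding text and predicts what comes next, or masked language modeling, where a token inside the sequence is hidden and the model predicts it from the surrounding context. GPT-style models such as ChatGPT use next-token prediction.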
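To make the two objectives concrete, here is a minimal sketch of how training examples could be derived from raw text. It is an illustration only, not how ChatGPT's training pipeline is actually built: the function names `next_token_pairs` and `masked_examples` are hypothetical, whitespace splitting stands in for a real tokenizer, and hiding exactly one token at a time is a simplification (masked language models usually hide a random subset of tokens).

```python
def next_token_pairs(tokens):
    """Next-token prediction: each prefix of the sequence predicts the token after it."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


def masked_examples(tokens, mask="[MASK]"):
    """Masked language modeling (simplified): hide one token at a time and predict it."""
    examples = []
    for i, target in enumerate(tokens):
        masked = tokens[:i] + [mask] + tokens[i + 1:]
        examples.append((masked, target))
    return examples


if __name__ == "__main__":
    # Assumption: whitespace splitting is a stand-in for a real subword tokenizer.
    tokens = "the cat sat on the mat".split()

    print("Next-token prediction:")
    for context, target in next_token_pairs(tokens):
        print(" ", context, "->", target)

    print("Masked language modeling:")
    for masked, target in masked_examples(tokens):
        print(" ", masked, "->", target)
```

In both cases the training signal comes from the text itself, so no manual labeling is needed. That is why large collections of ordinary text can serve as training data.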

