Making My Own Transformer Language Model
I built my own transformer language model in late 2024, using PyTorch. I compiled my own small dataset from
Project Gutenberg, consisting of children's books.
I also wrote my own byte-pair encoder and tokeniser, in addition to the attention and MLP layers.
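For context, the byte-pair encoder boils down to one loop: count adjacent token pairs, merge the most frequent pair into a new token, and repeat. Here is a minimal sketch of that idea, not my actual code; the function names and the byte-level starting vocabulary are just illustrative:

```python
from collections import Counter

def merge_pair(ids, pair, new_id):
    """Replace every occurrence of `pair` in `ids` with `new_id`."""
    out, i = [], 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out

def train_bpe(text, num_merges):
    """Learn merge rules over raw UTF-8 bytes (ids 0..255 are the base vocab)."""
    ids = list(text.encode("utf-8"))
    merges = {}
    for n in range(num_merges):
        pair_counts = Counter(zip(ids, ids[1:]))
        pair = pair_counts.most_common(1)[0][0]   # most frequent adjacent pair
        merges[pair] = 256 + n                    # assign the next free token id
        ids = merge_pair(ids, pair, merges[pair])
    return merges
```

The attention and MLP layers combine in the usual pre-norm transformer block. A rough sketch of one block follows; note it leans on PyTorch's built-in nn.MultiheadAttention as a stand-in for my from-scratch attention, and the sizes are placeholders:

```python
import torch
import torch.nn as nn

class Block(nn.Module):
    """One pre-norm transformer block: causal self-attention, then an MLP."""
    def __init__(self, d_model=128, n_heads=4):
        super().__init__()
        self.ln1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ln2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )

    def forward(self, x):
        T = x.size(1)
        # Boolean causal mask: True entries (future positions) are blocked.
        mask = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)
        h = self.ln1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=mask)
        x = x + attn_out
        x = x + self.mlp(self.ln2(x))
        return x
```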
The resulting model is pretty terrible. Here is a sample of its output:
Prompt: "He "
He cut him into O
between, recly off he asted out them to castly,-son
charts. She grew in the lakeyeous old other, who suppose. And he looked as well old wel and to
to servers heart, too,
k.On of a simply at the poor birth, stood life room
turn food and beat to
King and followed, the mountains over on, on to --the to-water away. The young Thunable to
both farmer ance."What his pleasanty had even spatful
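Samples like this come from plain autoregressive generation: encode the prompt, sample one token from the model's output distribution, append it, and repeat. A minimal sketch, where encode/decode stand in for the tokeniser and the model is assumed to return per-position next-token logits:

```python
import torch

@torch.no_grad()
def sample(model, encode, decode, prompt, max_new_tokens=200, temperature=1.0):
    """Extend `prompt` one token at a time, feeding each sample back in."""
    ids = torch.tensor([encode(prompt)])      # shape (1, T)
    for _ in range(max_new_tokens):
        logits = model(ids)[:, -1, :]         # logits at the last position
        probs = torch.softmax(logits / temperature, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)
        ids = torch.cat([ids, next_id], dim=1)
    return decode(ids[0].tolist())
```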
The code can be found here.