ManaGPT: 4,080 NLP Prompts and Generated Texts

Gladden, Matthew E. “ManaGPT: 4,080 NLP prompts and generated texts. Output from an LLM trained on a corpus from organizational futures studies.” Dataset published on Kaggle.com, March 26, 2023.

This dataset can be downloaded from Kaggle.com.

[Figure: ManaGPT: Generated text length by input-sequence subject]

Summary. This dataset includes 4,080 texts that were generated by the ManaGPT-1020 large language model in response to particular input sequences.

ManaGPT-1020 is a free, open-source model available for download and use via Hugging Face’s “transformers” Python package. The model is a 1.5-billion-parameter LLM that generates text to complete a sentence whose first words are provided in a user-supplied input sequence. It is an elaboration of GPT-2 that has been fine-tuned (using Python and TensorFlow) on a specialized English-language corpus of over 509,000 words from the domain of organizational futures studies. In particular, the model has been trained to generate analysis, predictions, and recommendations regarding the emerging role of advanced AI, social robotics, ubiquitous computing, virtual reality, neurocybernetic augmentation, and other “posthumanizing” technologies in organizational life.

In generating the texts, 204 different prompts were used, each of which was employed to generate 20 responses (204 × 20 = 4,080 texts). The 204 input sequences were created by concatenating 12 different “subjects” with 17 different “modal variants” in every possible combination (12 × 17 = 204).
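
The prompt construction can be sketched in a few lines of Python. This is only an illustration of the counting, not the author's actual script: the subject and modal-variant strings below are hypothetical placeholders (the real ones are not listed in this description), and joining the two parts with a single space is an assumption. Pairing each of 12 subjects with each of 17 modal variants gives 12 × 17 = 204 prompts, and 20 responses per prompt gives the 4,080 texts named in the dataset title.

```python
from itertools import product

# Hypothetical placeholders: the real subject and modal-variant strings
# are not given in this description.
subjects = [f"<subject {i}>" for i in range(1, 13)]              # 12 subjects
modal_variants = [f"<modal variant {j}>" for j in range(1, 18)]  # 17 modal variants

RESPONSES_PER_PROMPT = 20

# Concatenate every subject with every modal variant (Cartesian product).
# Joining with a single space is an assumption made for this sketch.
prompts = [f"{subject} {variant}" for subject, variant in product(subjects, modal_variants)]

total_texts = len(prompts) * RESPONSES_PER_PROMPT

print(len(prompts), total_texts)  # prints "204 4080"
```

Replacing the placeholder lists with the dataset's real subjects and modal variants would reproduce the full set of input sequences.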