GPT models’ learning and disclosure of personal data: An experimental vulnerability analysis

GPT-PDVS banner

Medium.com • April 10, 2023

SUMMARY: The possible gathering, retention, and later dissemination of individuals’ personal data by AI systems built on Generative Pretrained Transformers (GPTs) is an area of growing concern from legal, ethical, and business perspectives. To better understand at least one aspect of the privacy risks involved in the rapidly expanding public use of GPT-type systems and other large language models (LLMs), we conducted an experimental analysis in which we prepared a series of GPT models fine-tuned on a Wikipedia text corpus into which we had purposefully inserted personal data for hundreds of imaginary persons. (We refer to these as “GPT Personal Data Vulnerability Simulator” or “GPT-PDVS” models.) We then used customized input sequences (or prompts) to seek information about these individuals, in order to ascertain how much of their personal data a model had absorbed and to what extent it could output that information without confusing or distorting it. The results of our analysis are described in this article. They suggest that – at least for the class of models tested – it is unlikely that personal data will be “inadvertently” learned by a model during fine-tuning in a way that makes the data available for extraction by system users, absent a concentrated effort on the part of the model’s developers. Nevertheless, the development of ever more powerful models – and the existence of other avenues by which models might absorb individuals’ personal data – means that the findings of this analysis are better taken as guideposts for further scrutiny of GPT-type models than as definitive answers regarding any InfoSec vulnerabilities inherent in such LLMs.
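The probe-and-score loop described above can be sketched as follows. This is a minimal illustration only: the planted records, the prompt template, and the scoring thresholds are invented stand-ins, not the study’s actual data or protocol; in the experiment itself, completions would come from one of the fine-tuned GPT-PDVS models.

```python
# Hypothetical sketch of probing a fine-tuned model for planted personal
# data and scoring how faithfully it is reproduced. All names and
# occupations below are invented examples, not data from the study.

PLANTED_RECORDS = {
    "Anna Kowalska": "orthopedic surgeon",
    "Tomasz Nowak": "deep-sea welder",
}

def build_probe(name: str) -> str:
    """Templated input sequence asking the model about one imaginary person."""
    return f"{name} works as a"

def score_completion(completion: str, planted_value: str) -> str:
    """Classify a model completion as an exact, partial, or failed extraction."""
    completion = completion.lower()
    planted = planted_value.lower()
    if planted in completion:
        return "exact"
    # Partial: at least one content word of the planted value surfaces,
    # i.e. the datum was absorbed but confused or distorted.
    if any(word in completion for word in planted.split()):
        return "partial"
    return "none"
```

In the experiment, each probe would be fed to a GPT-PDVS model and the distribution of exact, partial, and failed extractions tallied across the hundreds of imaginary persons.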


“Modal Hints” for ManaGPT: Better AI Text Generation Through Prompts Employing the Language of Possibility, Probability, and Necessity

ManaGPT banner

Medium.com • March 12, 2023

SUMMARY: The crafting of optimal input sequences (or “prompts”) for large language models is an art and a science. In this article, we conduct an exploratory analysis of 4,080 sentence-completion responses generated by ManaGPT-1020. This model is an LLM that has been fine-tuned on a corpus of scholarly and popular works from the domain of management and organizational foresight, with the aim of engineering a model that can produce texts containing novel insights into the emerging impact of advanced AI, social robotics, virtual reality, and other “posthumanizing” technologies on the structure of organizations and our human experience of organizational life. More particularly, we investigate how the length and quality of texts generated by the model vary in relation to “modal hints” that are supplied by a user’s input sequences. Such hints take the form of modal verbs and phrases that suggest the degree of possibility, probability, or logical or moral necessity that a completed sentence should reflect. Our preliminary analysis suggests that such “modal shading” of prompts can have at least as great an impact on the nature of the generated sentences as the identity of the subject that a user has chosen for a given sentence.
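The “modal shading” of prompts can be sketched as crossing sentence subjects with modal hints. The subjects and modal phrases below are illustrative assumptions, not the study’s actual prompt set of 4,080 input sequences.

```python
# Sketch of generating "modally shaded" input sequences: each subject is
# paired with modal verbs suggesting possibility, probability, or
# necessity. Subjects and hints here are invented examples.

SUBJECTS = ["Social robotics", "Virtual reality"]
MODAL_HINTS = {
    "possibility": "may",
    "probability": "will likely",
    "necessity": "must",
}

def shaded_prompts(subjects, hints):
    """Yield one incomplete sentence per (subject, modal hint) pair."""
    for subject in subjects:
        for modality, verb in hints.items():
            yield modality, f"{subject} {verb}"

prompts = list(shaded_prompts(SUBJECTS, MODAL_HINTS))
# Each prompt, e.g. "Social robotics may", is then passed to the model
# for sentence completion, and the outputs compared across modalities.
```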


ManaGPT: 4,080 NLP Prompts and Generated Texts

ManaGPT: Generated text length by input-sequence subject

“ManaGPT: 4,080 NLP prompts and generated texts. Output from an LLM trained on a corpus from organizational futures studies” • March 26, 2023

SUMMARY: This dataset includes 4,080 texts that were generated by the ManaGPT-1020 large language model, in response to particular input sequences. ManaGPT-1020 is a free, open-source model available for download and use via Hugging Face’s “transformers” Python package. The model has been trained to generate analysis, predictions, and recommendations regarding the emerging role of advanced AI, social robotics, ubiquitous computing, virtual reality, neurocybernetic augmentation, and other “posthumanizing” technologies in organizational life.
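The kind of length-by-subject analysis pictured above can be sketched with pandas. The column names (“subject”, “generated_text”) are assumptions about the dataset’s schema, and the three rows are invented stand-ins for the 4,080 real records.

```python
# Hypothetical sketch: computing the mean generated-text length per
# input-sequence subject. Column names and rows are invented examples.
import pandas as pd

df = pd.DataFrame({
    "subject": ["AI", "AI", "VR"],
    "generated_text": [
        "will reshape managerial decision-making.",
        "may augment workers.",
        "must be governed carefully in organizations.",
    ],
})

# Word count of each generated text, then the mean length per subject.
df["length_words"] = df["generated_text"].str.split().str.len()
mean_length = df.groupby("subject")["length_words"].mean()
```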


ManaGPT-1020: An LLM Generating Insights into “Posthumanized” Organizations

ManaGPT logo

ManaGPT-1020 • March 23, 2023

SUMMARY: ManaGPT-1020 is a free, open-source model available for download and use via Hugging Face’s “transformers” Python package. The model is a 1.5-billion-parameter LLM that’s capable of generating text in order to complete a sentence whose first words have been provided via a user-supplied input sequence. The model represents an elaboration of GPT-2 that has been fine-tuned (using Python and TensorFlow) on a specialized English-language corpus of over 509,000 words from the domain of organizational futures studies. In particular, the model has been trained to generate analysis, predictions, and recommendations regarding the emerging role of advanced AI, social robotics, ubiquitous computing, virtual reality, neurocybernetic augmentation, and other “posthumanizing” technologies in organizational life.
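Using the model through the “transformers” package can be sketched as below. The repository id is an assumption about where the model is hosted on Hugging Face; adjust it to the actual repo before running, and note that the pipeline call downloads the model weights on first use.

```python
# Minimal sketch of sentence completion with ManaGPT-1020 via Hugging
# Face's "transformers" package. MODEL_ID is an assumed repo id.

MODEL_ID = "NeuraXenetica/ManaGPT-1020"  # assumption: actual repo id may differ

def load_generator(model_id: str = MODEL_ID):
    """Build a text-generation pipeline (downloads weights on first use)."""
    from transformers import pipeline  # deferred so the helper below stays importable
    return pipeline("text-generation", model=model_id)

def complete_sentence(opening: str, generator, max_new_tokens: int = 40) -> str:
    """Ask the model to finish a sentence begun by the input sequence."""
    out = generator(opening, max_new_tokens=max_new_tokens,
                    num_return_sequences=1)
    return out[0]["generated_text"]

if __name__ == "__main__":
    gen = load_generator()
    print(complete_sentence("Advanced AI will transform organizations by", gen))
```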


8-Bit Mystique: An Ingardenian Aesthetic Analysis of the Appeal of Retro Computer Games

“8-Bit Mystique: An Ingardenian Aesthetic Analysis of the Appeal of Retro Computer Games”

In Roman Ingarden’s Aesthetics and Ontology, edited by Leszek Sosnowski and Natalia Anna Michna • London: Bloomsbury, 2023

ABSTRACT: Recent years have seen revived interest in 8-bit computer games developed in the 1980s and the increasing popularity of “8-bit-style” or “retro” games designed to imitate their look. Judged objectively, 8-bit games appear far more “primitive” than typical contemporary video games, leading some to suggest that they are deficient works of art whose resurgent popularity results solely from nostalgia. However, by drawing on Ingarden’s analysis of artworks as schematic constructs, we argue that 8-bit-style games’ “primitiveness” is actually a form of indeterminacy that can generate singularly meaningful aesthetic experiences by allowing (and requiring) players to perform a uniquely enjoyable kind of concretization that is impossible with more “sophisticated” contemporary games. We also draw on Ingarden’s account of the “life cycle” of a work of art to show how 8-bit games’ pattern of initial popularity, neglect, and revival reflects an organic vitality demonstrated not by kitsch but by exceptional artwork.


A Better Way of Forecasting Employees’ Performance: Evaluating the use of composite ceiling-floor models for predicting the likely range of workers’ future job performance

A joint range model created with Comport_AI

Medium.com • March 12, 2023

SUMMARY: This text is Part 3 of a three-article series on “Advanced modelling of workers’ future performance ranges through ANNs with custom loss functions.” It demonstrates how custom ceiling and floor models can be combined to create a composite prediction interval that can outperform simpler models based on MAE or SD in forecasting the probable range of workers’ future job performance.
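Joining an independently modelled floor and ceiling into a composite interval, and scoring it on the two properties that matter here (coverage of actual values and narrowness), can be sketched as follows. All numbers are invented illustrations.

```python
# Sketch of evaluating a composite ceiling-floor prediction interval on
# coverage (actuals fall inside) and tightness (intervals stay narrow).
# The example values are invented, not results from the article.
import numpy as np

def interval_metrics(floors, ceilings, actuals):
    """Return (coverage rate, mean interval width) for a composite model."""
    floors, ceilings, actuals = map(np.asarray, (floors, ceilings, actuals))
    inside = (actuals >= floors) & (actuals <= ceilings)
    return inside.mean(), (ceilings - floors).mean()

coverage, width = interval_metrics(
    floors=[60, 55, 70], ceilings=[90, 85, 95], actuals=[75, 88, 72])
```

A composite model outperforms an MAE- or SD-based baseline when it achieves comparable coverage with a smaller mean width.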


Optimize Your Performance Intervals! Use ANNs with custom loss functions to predict probable ceilings and floors for workers’ future job performance

A ceiling model created with Comport_AI

Medium.com • March 12, 2023

SUMMARY: This text is Part 2 of a three-article series on “Advanced modelling of workers’ future performance ranges through ANNs with custom loss functions.” It investigates the mechanics of independently modelling the likely ceiling and floor of the range of a worker’s probable future job performance using separate artificial neural networks with custom loss functions.
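The core idea of a ceiling network’s custom loss can be sketched as an asymmetric error: an actual value landing above the predicted ceiling is penalized far more heavily than a ceiling that overshoots. The penalty ratio below is an invented illustration, not Comport_AI’s actual loss function.

```python
# Sketch of an asymmetric loss for a ceiling model: cheap to overshoot
# the ceiling, costly to undershoot it. A floor model would flip the
# penalties. Penalty weights are illustrative assumptions.
import numpy as np

def ceiling_loss(y_true, y_pred, over_penalty=1.0, under_penalty=10.0):
    """Mean asymmetric error over predicted ceilings."""
    y_true = np.asarray(y_true, float)
    y_pred = np.asarray(y_pred, float)
    diff = y_true - y_pred  # positive when the actual exceeds the ceiling
    return np.mean(np.where(diff > 0,
                            under_penalty * diff,
                            over_penalty * -diff))
```

Trained with such a loss, the ceiling network learns to sit just above the bulk of actual outcomes rather than symmetrically around them.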


One Number Is Rarely Enough: Why prediction intervals are critical (and challenging!) for HR predictive analytics

A joint range model created with Comport_AI

Medium.com • March 12, 2023

SUMMARY: This text is Part 1 of a three-article series on “Advanced modelling of workers’ future performance ranges through ANNs with custom loss functions.” It explores why it’s useful to predict the probable ceiling and floor for an employee’s future performance – and why it’s difficult to do so effectively, using conventional methods based on mean absolute error or standard deviation.
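The conventional approach critiqued here can be sketched as deriving a symmetric interval from a single point prediction plus a global spread statistic. The k multiplier and residuals below are illustrative.

```python
# Sketch of the conventional SD-based interval: point prediction plus or
# minus k times the standard deviation of the model's residuals. The
# numbers are invented illustrations.
import numpy as np

def symmetric_interval(point_pred, residuals, k=2.0):
    """Return (low, high) bounds centred on the point prediction."""
    sd = float(np.std(residuals))
    return point_pred - k * sd, point_pred + k * sd

# The same half-width is applied to every worker, regardless of how
# uncertain that individual's forecast actually is: the limitation the
# ceiling/floor approach is meant to address.
low, high = symmetric_interval(80.0, residuals=[-4, -2, 0, 2, 4])
```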


Comport_AI™

Comport_AI screenshots

Comport_AI™ (version 0.3.22) • March 5, 2023

ABSTRACT: Comport_AI is a free, open-source HR predictive analytics tool in the form of a Python-based web app that uses advanced machine learning to forecast the likely range of a worker’s future job performance. Rather than mechanistically deriving the predicted ceiling and floor of a worker’s future performance from a single predicted target value using calculations based on MAE or SD, Comport_AI treats the likely ceiling and likely floor of a worker’s performance during a future timeframe as independent entities, which are modelled by artificial neural networks whose custom loss functions enable them to formulate prediction intervals that are as small as possible, while being just large enough to contain a worker’s actual future performance value, in the vast majority of cases. This allows more precise, nuanced, and useful forecasting of workers’ future job performance. Comport_AI utilizes TensorFlow, Keras, scikit-learn, FastAPI, Uvicorn, Jinja2, NumPy, Pandas, and Matplotlib. It’s developed by Matthew E. Gladden (with support from Cognitive Firewall LLC and NeuraXenetica LLC) and is made available for use under GNU General Public License Version 3.
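The trade-off the abstract describes, intervals as small as possible yet just large enough to contain the actual value, can be sketched as a loss with two terms: a constant penalty on interval width plus a steep penalty whenever an actual value escapes the interval. The weights are invented; Comport_AI’s real TensorFlow/Keras loss functions are more elaborate.

```python
# Sketch of a width-vs-coverage trade-off loss for composite intervals:
# width is always penalized, and any miss (actual outside the interval)
# incurs a much larger penalty. Weights are illustrative assumptions.
import numpy as np

def interval_loss(floors, ceilings, actuals, width_w=1.0, miss_w=25.0):
    """Mean penalty: width term plus heavily weighted miss distances."""
    floors, ceilings, actuals = map(np.asarray, (floors, ceilings, actuals))
    width = ceilings - floors
    miss = (np.maximum(actuals - ceilings, 0) +
            np.maximum(floors - actuals, 0))
    return np.mean(width_w * width + miss_w * miss)
```

Minimizing such a loss pushes the networks toward tight intervals that still capture actual performance values in the vast majority of cases.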
