April AI SIG Notes | Midwest Historical and Genealogical Society

Accuracy
- Large language models deal in probability, not truth. Outputs must be checked for accuracy.
  - Ask the AI to cite sources – and check that they are real

Training Data
Large language models are only as good as their training data.
Biased data leads to biased outputs: article
- Did the company have rights to use training data?
  - Google – uses search, web, etc
  - Facebook – uses posts on FB. Also used a database of pirated books: article
  - Beginning to sign contracts: article

Confidential Data
- When you go to, say, chatgpt.com, everything you input may get added to training data, eventually. Don’t include confidential or sensitive information. (AI is actually good at helping to anonymize data, such as generating lists of names to substitute for the real names.)
- Assume everything that you type on the public internet is added to training data.
- New downloadable models run on your personal device. Your interactions are not necessarily uploaded as training data.
- Check the terms of use!
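
One way to act on the name-substitution idea above is to swap names locally before pasting anything into a chatbot, then swap them back in the reply. The Python below is only a minimal sketch: the function name swap_names, the sample sentence, and the stand-in names are all made up for illustration, and it only catches names that you list yourself (it will not find dates, addresses, or misspelled names).

```python
import re

def swap_names(text, mapping):
    """Replace every key of `mapping` found in `text` with its value.

    All names are replaced in a single pass, longest first, so a longer
    name is never partially overwritten by a shorter one.
    """
    if not mapping:
        return text
    names = sorted(mapping, key=len, reverse=True)
    pattern = re.compile(r"\b(?:" + "|".join(re.escape(n) for n in names) + r")\b")
    return pattern.sub(lambda m: mapping[m.group(0)], text)

# Hypothetical example data: real names on the left, stand-ins on the right.
stand_ins = {
    "Eleanor Whitfield": "Person A",
    "Thomas Whitfield": "Person B",
}

note = "Eleanor Whitfield married Thomas Whitfield in 1952."

# Text that is safer to paste into a public AI tool.
safe_note = swap_names(note, stand_ins)

# After the AI replies, reverse the mapping to put the real names back.
reverse = {fake: real for real, fake in stand_ins.items()}
restored = swap_names(safe_note, reverse)
```

Keep the mapping on your own computer; it is the key that links the stand-ins back to real people.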

Copyright, Trademarks & Privacy
- Purely AI-generated content cannot be copyrighted in the US. AI makes it really easy to generate illegal derivatives of copyrighted content.
- May you use AI to create images using Coke cans, Lady Gaga’s face or graphic violence?
  - Most products have restrictions and guidelines. Grok (the X product) does not.

Authorship
- When and how do we disclose the assistance of AI? If it is being used as a tool in a way similar to a dictionary or word processor – i.e., as a tool whose results get filtered through a human brain – disclosure is probably not needed. If it is doing unaudited work, disclosure probably is needed.
- How do we make sure future AI products, such as images and letters, can be distinguished from “original” products?

Environmental Issues
- Running a new machine learning model on a huge training dataset, and offering it to millions of users, requires huge amounts of electricity and water. Article

Coalition for Responsible AI in Genealogy
- RootsTech session on CRAIGEN: https://www.familysearch.org/en/rootstech/session/guidelines-for-the-responsible-use-of-artificial-intelligence-ai-in-genealogy

Resources
- The Family History AI Show, by Mark Thompson & Steve Little
- Genealogy and Artificial Intelligence Facebook Group
- University of South Florida has written a very useful LibGuide on using AI: website

Wichita, Kansas