Top 5 things to know about AI ethics
Wondering why some people say that most current AI tools (especially generative AI) weren't developed ethically or responsibly, why it matters, and what you can do? Here's what you need to know!
If you’ve been using generative AI tools like ChatGPT, Gemini, Claude, and Perplexity and wondering why AI ethics are an issue, here’s the scoop. This article highlights five reasons why more and more people view most current AI tools, especially generative AI tools, as unethically developed. Learn about these five ethical concerns and what you can do about each one.
This article is not a substitute for legal advice and is for general information only.
If you’re more of an AI power user or developer, you might prefer this deeper dive with extensive references: “Five Concerns in Ethical and Responsible AI (and how most tech companies fall short)” in 6 'P's in AI Pods (AI6P).
“Generative AI was trained on the internet and so has inherited many of its unsolved issues, including those related to bias, misinformation, copyright infringement, human rights abuses, and all-round economic upheaval.”1 [MIT Technology Review, 2023]
Table of Contents
Concern 1. Adverse Environmental Impacts
Concern 2. Unethical Data Sourcing
Concern 3. Exploitation of Data Workers
Concern 4. Harmful Model Biases
Concern 5. Impact on Lives and Livelihoods
Overview
AI algorithms and companies need data and computing power. For those who use AI but aren’t familiar with how the tools are built, here’s a simplified view.

Here’s a quick overview of these five steps in delivering AI tools (a toy code sketch follows the list). AI companies:
Build huge, high-capacity data centers. (Or contract with a company that has these data centers.)
Acquire large volumes of data for ‘training’ AI models in those data centers. Data sources include: ‘scraping’ the public web; ‘data brokers’ who collect and sell our data; licensing data from other companies (such as book publishers or music labels); and people who already use their tools.
Process the data to get it ready to use for AI. (For instance, a song might be labeled with its genre, or a video might be labeled as offensive.) This is done partly by human ‘data workers’ and partly by AI models used for labeling.
Have AI developers build models with the data, evaluate them for accuracy and fairness, and tune them.
Offer AI tools with features that use these models, and collect more data on how people use the tool and features.
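For readers who like to see things concretely, here is a toy sketch of steps 2 through 5 in miniature (step 1, building data centers, obviously can’t be shown in code). It is not any real company’s pipeline; the data, labels, and names are invented purely for illustration, and a real system operates at vastly larger scale.

```python
# A toy stand-in for steps 2-5 (not any real company's pipeline).
# All data, labels, and names here are invented purely for illustration.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Step 2: "acquired" data -- four hard-coded reviews standing in for scraped or licensed text
raw_texts = [
    "great song, love the melody",
    "terrible noise, refund please",
    "best album this year",
    "worst purchase ever",
]

# Step 3: processing and labeling -- done by hand here; in practice by data workers or other models
labels = [1, 0, 1, 0]  # 1 = positive, 0 = negative

# Step 4: model building plus a (very rough) evaluation
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(raw_texts)
model = LogisticRegression().fit(X, labels)
print("training accuracy:", model.score(X, labels))

# Step 5: the "tool" -- note that every request also becomes new data the provider can keep
usage_log = []

def classify(text: str) -> int:
    usage_log.append(text)  # user input flows back into the provider's data stores
    return int(model.predict(vectorizer.transform([text]))[0])

print("prediction:", classify("what a wonderful track"))
print("data collected from users so far:", usage_log)
```

The last step is the one many users overlook: every prompt or upload can flow back into the provider’s data, which is exactly the loop described in Concern 5 below.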
As shown in the diagram above, ethical concerns arise in each of these 5 steps. Let’s look at each one.
1. Adverse Environmental Impacts
AI algorithms need data and computing power. Inefficient algorithms or wasteful use of AI drive demand for even more data, power, or both. Environmental and climate impacts are a key factor in cost-benefit analyses of AI technologies, especially generative AI.
On the plus side, AI (mostly not generative AI) has the potential to help humans find more efficient ways to design and operate all kinds of systems.
However, current AI algorithms, systems, and levels of demand need massive data centers. Data center construction and operation consume water, power, and rare earth minerals, and create hazardous waste. When data center power is generated unsustainably, CO2 emissions or nuclear power risks increase, and the environmental impacts and risks are even higher.
You may have heard buzz about AI training costs. While training can be costly, people’s everyday use of AI tools (‘inference’) collectively consumes far more power and water than training does.
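To see why usage can outweigh training, here is a back-of-the-envelope sketch. Every number in it is invented and deliberately round; the point is only the shape of the math, not the actual figures for any real model.

```python
# Back-of-the-envelope only: every number below is invented and deliberately round.
# The point is the shape of the math, not real figures for any actual model.
training_energy_kwh = 1_000_000      # hypothetical one-time cost of training a model
energy_per_query_kwh = 0.001         # hypothetical cost of answering a single prompt
queries_per_day = 100_000_000        # hypothetical daily usage of a popular tool

daily_inference_kwh = energy_per_query_kwh * queries_per_day
days_to_match_training = training_energy_kwh / daily_inference_kwh

print(f"daily inference energy: {daily_inference_kwh:,.0f} kWh")
print(f"days of use that equal the whole training run: {days_to_match_training:.0f}")
# With these made-up numbers, roughly ten days of use already matches the entire
# training cost -- and usage keeps going, day after day.
```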
Environmental concerns can potentially be addressed in several ways:
Developers can make AI itself more efficient.
Companies can increase power capacity for data centers in sustainable, low-impact ways, e.g. solar, wind, hydro, or geothermal.
People can use AI tools more judiciously and more efficiently, reducing demand.
2. Unethical Data Sourcing
AI tools are developed and run on massive amounts of data. This can include written texts, images, videos, or other ‘modes’ of data. Most of this data comes from us humans!
Most people worldwide believe we all have inherent rights to the “3Cs: Consent. Credit. Compensation.”2 for our data. However, ownership rights can get complicated when people’s data is:
stored in online platforms that tech companies control,
released under a legal agreement (e.g. a contract between a musician and a label),
collected from people without their full, free, and informed consent.
Many concerns have arisen about whether companies that obtain data for AI are sourcing it ethically, i.e. whether consent is obtained and whether the people are credited and/or compensated.
Most AI companies are not open about where they get their data. However, most AI tools are built on stolen data. Even worse, some datasets used for AI training (e.g. LAION-5B) are polluted with harmful content, such as CSAM (child sexual abuse material).
There are many ways to obtain creators’ consent (opt-in), and to credit and compensate them for use of their works. We can and must expect AI companies to do better in order to earn our business.
3. Exploitation of Data Workers
Raw data generally needs processing before it can be used to train AI models and to operate AI-based systems. While much data processing is automated, some isn’t, and human labor is needed to make the data and tools useful.
Data work can offer people in job-scarce areas remote opportunities to support themselves and their families. However, some companies mistreat data workers. For more information, see our recent article about this: “AI ethics: beyond not stealing data”.
4. Harmful Model Biases
Biases permeate our society and lives; 12 common types are recognized. (Example: biased performance feedback on the job.) Many social biases are ‘unconscious’ or implicit.
Well-trained and carefully-developed tools, including AI tools, could help to identify and mitigate biases in many everyday work and life situations. For instance, a recent article by Zieminski highlighted how use of data and AI tools could help to improve objectivity in business decision-making and reduce friction in prioritizing product features.
When developing an AI-based tool, biases can be introduced, worsened, or mitigated in steps 2-4: data sourcing, data labeling (by humans or AI models), and model building. At best, an AI tool can only be as good and as fair as:
the diversity of society represented in the data,
priorities for the tool’s design,
developers’ awareness and abilities to identify and address bias risks.
People build tool designs and models, and those data choices, priorities, and risk awareness shape the result. Datasets tend to be biased because our society (from which the data is drawn) is biased. Without care in sourcing, selecting, and processing data for use in AI, biased AI tools will reinforce and worsen existing harmful biases in our society. (Mistakes made by AI tools, sometimes called ‘hallucinations’, can be harmful, too.)
Stories abound of AI algorithms that are biased and have caused harm. (Examples: Machine learning vision tools fail to recognize people with darker skin; generative AI creates only pictures of white men in suits when asked to picture a CEO; predictive policing algorithms show high racial disparities.)
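One common way practitioners look for disparities like the examples above is to compare a model’s accuracy across demographic groups (a ‘disaggregated evaluation’). The sketch below uses synthetic data and made-up group names purely to show the shape of such a check; real audits use real predictions, real ground truth, and more nuanced metrics.

```python
# A toy "disaggregated evaluation": compare how often a model is right for each group.
# The data is synthetic and the group names are placeholders, purely for illustration.
import random
random.seed(0)

# Pretend prediction outcomes for two groups, where group B was underrepresented in training
records = (
    [("A", random.random() < 0.90) for _ in range(1000)]   # group A: ~90% of predictions correct
    + [("B", random.random() < 0.70) for _ in range(100)]  # group B: only ~70% correct
)

for group in ("A", "B"):
    outcomes = [correct for g, correct in records if g == group]
    accuracy = sum(outcomes) / len(outcomes)
    print(f"group {group}: n={len(outcomes):4d}  accuracy={accuracy:.2f}")

# A gap like 0.90 vs 0.70 is a red flag: the tool works much better for one group
# than another, the same kind of disparity behind the examples above.
```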
Even when companies ensure well-selected data, hire diverse teams, and give high priority to detecting and fixing biases, it’s still not easy to do. That’s no excuse for not trying, though, and some companies are trying.
5. Impact on Lives and Livelihoods
Some people question whether there is really any harm in AI companies harvesting or scraping data to build generative tools that “we all” can use, sometimes even “for free”. Like other aspects of ethics, there are pros and cons.
AI-based tools can potentially help people work more efficiently or enable them to create new businesses. AI tools can also improve accessibility and global communications. [If you’re using Substack’s “read aloud” feature to listen to this article, that’s AI-based voice cloning in action.] Several AISW interview guests have touted these benefits.
AI and ML can also be used wisely and ethically to improve people’s lives in many ways. For instance, AI tools may detect lung cancer earlier (a cause that’s long been close to my heart).
But most AI companies aren’t altruistic. They’re commercial, profit-driven enterprises. And “we all” will not benefit from AI tools to the same degree; some people’s lives and livelihoods will be severely harmed.
To be blunt, data scraping for AI is stealing: human creators worldwide lose potential income from selling rights to their existing works. Selling or using generative AI tools trained on that stolen content benefits other people, and deprives those creators of income for new works. And markets for creative works by musicians, artists, writers, and other professionals are being overrun by competing AI-generated content (e.g. Spotify and Amazon books).
This impact on creators’ livelihoods is already happening and is projected to get much worse for many industries. Some AISW interview guests have reported this happening to them: AI tools that were unfairly built on stolen professional-quality work are being used to compete against them, and their businesses are withering as a result.
Keep in mind that even “free” tools aren’t really free. Every person who uses those AI tools (and every person on whom the AI tools are used) is giving the tool companies even more data — sometimes very personal data! — that the companies will likely use for AI training and tuning. Through interactions, feedback, and prompt sequences while using the tool, human users create computing workloads and give free data and labor to the tool companies. This is reflected by the arrows in the reference diagram, looping back from human use of the tool to data centers, data sourcing, data processing, and model building:
Many people view providing this labor and data as part of their cost for being able to use the tools (whether they are free or not). However, without knowing how AI companies will use their data, people may not know the actual price they pay. It’s not ‘informed consent’, and may not be a fair exchange of value.
The Bottom Line: What We Can Do
When creators’ prior work is stolen for use in AI, without the 3Cs, they lose the opportunity to earn money from their hard efforts. Other people benefit, perhaps unknowingly, at the expense of the creators whose work was stolen. As HBR presciently noted in 2022, a month after ChatGPT was released:
“The key to adjusting is figuring out how to redesign our economic systems to fully engage the working population. That will require system solutions that don’t just shift the same tasks between people and machines.”
We’re not there yet.
In the meantime, here are three tips on things we can do to become more ‘data literate’, stay informed about responsible AI, and use AI more wisely and ethically.
Tip 1: Choose AI tools carefully.
Check for responsible AI certifications and assessments when choosing an AI tool, such as ISO/IEC 42001 (Claude) or Fairly Trained (19 companies).
Look for an “AI Usage Policy” and “Terms of Use”. (This 2023 podcast by Mark Miller gives plain-English explanations by a lawyer of EULAs from 10 major companies.)
Look into the site’s privacy and data protection notices and whether the tool shares data with an underlying AI cloud platform or partner. Prefer AI platforms and tools that publicly affirm their adherence to regional data protection standards (even if you don’t live in a covered region).
Prefer tools that can trace the results they give you to specific citations or sources.
Be alert for news on how the tool provider addresses biases, where they get their training data, and how they treat data workers.
Tip 2: Protect your data.
Be thoughtful about which tools and companies you choose to allow to use your personal data - or the data of children, friends, patients, clients, colleagues, partners, or businesses. Some tips:
Before you enter detailed prompts or responses, or upload files to an AI tool, know how that data might be used to train the company’s tool, or by a platform or partner it uses.
If you post images of your creative work online, consider using one of the ‘data poisoning’ tools to protect your images (e.g. Nightshade, Glaze, or Kudurru).
Tip 3: Use genAI with care, not by default.
Many needs can be well met without AI, ML, or genAI. Less generation means less impact on the environment, less time and cost to you, lower risks, and less harm to the lives and livelihoods of human creators and data workers.
Before reaching for your chosen AI-based tool, consider whether the need could be met with a non-AI search, an image repository, another kind of tool, or an original human creator. (For example, many public domain and free-to-use sources of images are readily available; the list of image sources I use is here.)
If not, use the genAI tool efficiently. Learn to prompt it better so that you get good results with fewer runs. For ideas, try searching the table in this directory of 240+ women and nonbinary folks in 37+ countries who write about AI and data here on Substack. You’ll probably find people in your industry and role who are using AI and data well, and writing about how they do it!
Be alert for biases in what you get from AI tools, especially generative AI tools. Not all medical advice cited by AI is based on studies that include women, and not all CEOs are white males in suits!
Many AI tool users may be unaware of the risks they are taking with unethical tools, such as loss of privacy, copyright infringement, or unintentionally generating biased or harmful content. One goal of this article is to help more people become aware of the ethical risks and what they can do to use AI safely. If you enjoyed this article, please add a heart, share, restack, or post a Note! This helps others to find the article, and it lets me know you value this work.
What questions do you have about finding and using AI tools ethically and responsibly?
Credits
The ethical AI tool ecosystem diagram used in this article is © 2025 Karen Smiley CC BY-NC-ND.
No AI tools were used in the creation of this article.

References
For a much deeper dive on these ethical concerns, with extensive references, please see “Five Concerns in Ethical and Responsible AI (and how most tech companies fall short)”, 6 'P's in AI Pods (AI6P), by Karen Smiley.
“These six questions will dictate the future of generative AI”, Will Douglas Heaven, MIT Technology Review, 2023-12-19
Credit for the original 3Cs belongs to CIPRI (Cultural Intellectual Property Rights Initiative®) for their “3Cs' Rule: Consent. Credit. Compensation©.”
6 'P's in AI Pods (AI6P) is a 100% reader-supported publication. (No ads, no affiliate links, no paywalls on new posts.) All new posts are FREE to read and listen to. To automatically receive new posts and support Karen’s work on AI ethics and inclusion, consider becoming a subscriber (it’s free)!
Special offer: In honor of International Women’s History Month and my first full year on Substack, as a thank you to my readers, I am offering a 25% discount for life on all monthly or annual paid subscriptions started by March 31, 2025. Use this Subscribe link or the button above to redeem!
One-time tips or voluntary donations via paid subscriptions are always welcome and appreciated, too 😊