=AI policy principles in California=
What is the policy?
How do we implement it?
How do we communicate what we have learned / concluded?
==Goals for State Policy==
- Job protection
- Technology leadership
- Climate (this technology uses an enormous amount of power)
Data stewardship: how do we use data ethically and responsibly?
What do we require of companies?
- Without scaring tech companies out of California?
- Without relying on tools and service providers that are beyond the reach of CA law?
How are we building these tools? Where is the data being hosted?
How are we explaining this to the rest of the world (citizens, tech partners)?
The people making policy here genuinely want to do the right thing, at least in this context.
Establishing an advisory council that includes people without something to sell, such as academics.
The state is currently sandboxing implementations -- eight so far, each tackling a specific challenge, but without the ability to take data out of the systems.
==Concerns==
Black boxing -- compare to Wikipedia, which is heavily cited and where the debates are visible.
Is there a way to write policy that requires some level of data sourcing?
How do you extract data that was included in a training model? How do we say "you have to be able to remove data"? E.g., this was trained on my intellectual property; or this was trained on an out-of-date law and needs to forget that law and learn the new one; or this was trained on a Nazi website.
What does it look like to do due diligence on language support? (by policy the state has to support Spanish, Hmong, Cantonese and a few other key languages)
Is it really impossible to require AI tools to be auditable? Or to be able to cite sources? Or to forget sources? Vendors say they can't, but that smells like a lie.
Consent, especially with respect to data sources. Suddenly transcripts are being created for internal conversations, and no one is paying attention to the terms and conditions. Who has to opt in?
What is FOIA-able? (Freedom of Information Act)
Is there a policy opportunity for making robots.txt enforceable?
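One way to make this concrete: robots.txt is already a machine-readable signal of which crawlers may fetch which paths; the open question is whether ignoring it could carry legal consequences rather than remain voluntary. Below is a minimal sketch (the state site URL and the "GPTBot" user-agent are illustrative assumptions, not a real deployment) of how a well-behaved crawler consults the file today.
<syntaxhighlight lang="python">
# Minimal sketch: how a well-behaved crawler consults robots.txt today.
# The site URL and the "GPTBot" user-agent are illustrative assumptions.
from urllib import robotparser

rp = robotparser.RobotFileParser()
rp.set_url("https://example.ca.gov/robots.txt")  # hypothetical state site
rp.read()

# An AI crawler asking whether it may fetch a given page.
allowed = rp.can_fetch("GPTBot", "https://example.ca.gov/public-records/")
print("GPTBot may crawl:", allowed)
</syntaxhighlight>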
What does a marketplace of data brokers look like?
Does auditability / opening the black box help us address algorithmic bias? Is the tool responsible for any bias it absorbed?
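A hedged sketch of what one such audit check could look like, using toy data and hypothetical group labels: compare selection rates across groups against the common "four-fifths" rule of thumb -- a statistic an auditor can compute from a tool's decisions even while the model itself stays a black box.
<syntaxhighlight lang="python">
# Minimal sketch with illustrative toy data, not from any real system.
# Computes per-group selection rates and flags disparity below the 80% rule.
from collections import defaultdict

decisions = [  # (group, approved) pairs a department could export from a tool
    ("group_a", True), ("group_a", True), ("group_a", False),
    ("group_b", True), ("group_b", False), ("group_b", False),
]

counts = defaultdict(lambda: [0, 0])  # group -> [approved, total]
for group, approved in decisions:
    counts[group][0] += int(approved)
    counts[group][1] += 1

rates = {g: approved / total for g, (approved, total) in counts.items()}
ratio = min(rates.values()) / max(rates.values())
print(rates, f"disparity ratio = {ratio:.2f}", "flag" if ratio < 0.8 else "ok")
</syntaxhighlight>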
Requiring a system to delete information
Are we going to hit walls as a state trying to set laws that are stricter than federal law? Can (should?) CA enact policy first to establish the de facto standard?
How does CA protect autonomy? Right now we take guidance from NIST and we're very much in step with them. That may need to change if NIST changes.
What does it look like to co-own government data sets? Data governance ± collective intelligence; what experiments might be fundable?
We don't have a good data consent policy yet -- one simply doesn't exist.
Big projects will require a bake-off -- two competing providers testing implementations and effectiveness. (CA is doing this now and will lock it in as a requirement.)
If an LLM is built on publicly owned data with public funding, who owns the data? (The public, no?)
==Questions on the Table==
Does CA build its own LLM that the state can license to counties? Or to individual agencies?
Fulfillment pathways -- a big problem that needs a big AI solution; a small problem where an existing commercial tool suffices (e.g., grammar checking); or not actually an AI problem at all.
What CA does as a state will ripple nationwide: CA's emissions standards under the Clean Air Act became a de facto national standard. If we say you have to tell us what informed your model, does that kill innovation?
Can we set emissions standards? Or renewable energy standards for energy usage?
==Frameworks==
===Risk===
What is the risk of data being used in bad ways (e.g., using medical data without your consent)? Risk for the organization: if a department uses AI and it gives people bad information, the department is responsible for the information it gives out. Current policy is that agency leadership is responsible for absorbing the risk -- you're responsible for any data leakage, for example.
- Guardrails that promote competition
We do want to keep businesses in California; we want and need the tax revenue. We want to create an environment that allows companies to take risks and experiment.
==Recommended Reading==
- Meredith Whittaker and Timnit Gebru, "Stochastic Parrots"
- This Machine Kills (podcast)