AI Questions on California Bar Exam Spark Legal Debate

Background: February 2025 Exam and AI Disclosure
In late April 2025, the State Bar of California acknowledged that it had used artificial intelligence to draft a subset of questions on its February bar examination. Its contracted psychometric firm, ACS Ventures, used AI assistance to create 23 of the 171 scored multiple-choice items. The remainder comprised 48 questions drawn from the First-Year Law Students' Examination and 100 developed by Kaplan Exam Services under an $8.25 million contract. The disclosure followed weeks of candidate complaints about typos, confusing prompts, and platform outages during the hybrid in-person/remote test.
Technical Breakdown of AI Question Generation
According to State Bar documents obtained by Ars Technica, ACS Ventures employed a fine‑tuned large language model (LLM) based on GPT‑4 Turbo. The workflow included:
- Prompt engineering to target specific legal subtopics (contracts, torts, evidence)
- Automated drafting of scenario stems and answer options
- Initial filtering to remove hallucinations and ensure statutory accuracy
- Integration with an internal review dashboard logging model confidence scores and token usage
The model was capped at 7 billion parameters, with temperature fixed at 0.2 to keep output nearly deterministic. ACS Ventures reported that the LLM produced 80 draft questions per hour, after which human editors culled and refined the pool down to the 23 items approved for scoring.
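For illustration, here is a minimal sketch of what such a drafting call might look like, assuming the OpenAI chat-completions API; the prompt wording, model name, and draft_item helper are hypothetical stand-ins, not details from the State Bar documents:

```python
# Illustrative sketch only: prompt text, model name, and helper names are
# assumptions, not details drawn from the State Bar documents.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPT = (
    "Draft one bar-exam multiple-choice item on {subtopic}: a fact-pattern stem, "
    "four options labeled A-D, the correct answer, and the controlling rule, "
    "so a human editor can verify it against the statute or case law."
)

def draft_item(subtopic: str) -> str:
    """Generate one draft item; low temperature keeps output near-deterministic."""
    response = client.chat.completions.create(
        model="gpt-4-turbo",  # stand-in for the fine-tuned model described above
        temperature=0.2,      # the fixed setting reported in the documents
        messages=[{"role": "user", "content": PROMPT.format(subtopic=subtopic)}],
    )
    return response.choices[0].message.content

for topic in ("contracts", "torts", "evidence"):
    print(draft_item(topic))
```

Drafts produced this way would still need the human filtering step described above, since nothing in the generation call itself verifies legal accuracy.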
Psychometric Considerations and Review Processes
All AI‑assisted questions underwent a multi‑stage validation process:
- Content Validation Panels (nine volunteer law professors) checked legal accuracy and relevance
- A subject-matter expert reviewed each item for clarity, bias, and alignment with the exam's content blueprint
- Item Response Theory (IRT) analyses evaluated difficulty and discrimination indices (a sketch follows this list)
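To make the IRT step concrete, the sketch below screens items under the two-parameter logistic (2PL) model, in which an examinee of ability θ answers correctly with probability 1 / (1 + e^(−a(θ − b))), where a is discrimination and b is difficulty. The item parameters and review cutoffs are invented for illustration; they are not the State Bar's actual criteria.

```python
# Illustrative 2PL screen; item parameters and cutoffs are invented,
# not the State Bar's actual calibration.
import numpy as np

def p_correct(theta: np.ndarray, a: float, b: float) -> np.ndarray:
    """2PL model: probability that an examinee of ability theta answers
    correctly; a is the discrimination (slope), b the difficulty (location)."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

# Hypothetical calibrated (a, b) pairs for three items
items = {"item_101": (1.4, 0.3), "item_102": (0.4, -2.5), "item_103": (0.9, 1.1)}

theta = np.linspace(-3, 3, 7)  # ability scale in standard-deviation units
for name, (a, b) in items.items():
    curve = p_correct(theta, a, b)
    # Flag weak discriminators and extreme difficulties for human review
    flag = "REVIEW" if a < 0.5 or abs(b) > 2.0 else "ok"
    print(f"{name}: a={a:.1f}, b={b:+.1f} [{flag}]", np.round(curve, 2))
```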
However, critics note that the same vendor (ACS Ventures) drafted and evaluated its own AI questions. Dr. Simone Liang, a psychometrician at Stanford, warns that “self‑review can inflate Cronbach’s alpha without truly capturing content validity or fairness across demographic cohorts.”
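For context on the statistic Liang cites: Cronbach's alpha, α = k/(k − 1) · (1 − Σσᵢ² / σ²_total), rises whenever items correlate with one another, regardless of what they measure. The sketch below, run on fabricated response data (an assumption for illustration, not exam data), shows a respectable alpha emerging from a simulation that never checks content at all.

```python
# Cronbach's alpha on fabricated data: high internal consistency without
# any check of content validity or demographic fairness.
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """scores: examinees x items matrix of 0/1 item scores."""
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1).sum()  # sum of per-item variances
    total_var = scores.sum(axis=1).var(ddof=1)    # variance of examinee totals
    return (k / (k - 1)) * (1.0 - item_vars / total_var)

# 200 examinees, 23 items; responses driven by one shared ability factor,
# so the items correlate and alpha comes out high by construction.
rng = np.random.default_rng(7)
ability = rng.normal(size=(200, 1))
responses = (rng.random((200, 23)) < 1 / (1 + np.exp(-ability))).astype(float)
print(f"alpha = {cronbach_alpha(responses):.2f}")
```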
Legal and Policy Implications
Legal educators have expressed alarm. Mary Basick of UC Irvine called the practice “unbelievable,” arguing that AI lacks the nuanced reasoning required for minimal competence assessments. Meanwhile, Katie Moran of USF School of Law highlighted a potential conflict of interest: “The drafter and approver should never be the same entity, especially when remote testing increases security risks.”
On June 15, the California Supreme Court held a hearing on whether to mandate a third‑party audit of AI‑generated content. The Court also considered a proposal to integrate biometric proctoring for remote exams while safeguarding candidate privacy under state data protection laws.
Next Steps and Industry Impact
The Committee of Bar Examiners convened on May 5 and recommended score adjustments for February examinees. It declined to revert to the National Conference of Bar Examiners' Multistate Bar Examination, which is administered only in person, citing remote-testing demand: nearly 45% of applicants favored at-home options. A federal lawsuit against Meazure Learning, the remote-proctoring provider, remains pending after a judge refused to dismiss claims of system outages and discrimination.
Looking ahead, the State Bar has proposed a pilot for generative-AI question writing with independent oversight, plus expanded psychometric audits built on open-source tools. The controversy spotlights broader questions about AI's role in high-stakes testing and underscores the need for transparent governance as educational institutions adopt machine learning.