Ling, Lin (2025) Investigating Social Bias in LLM-Generated Code. Masters thesis, Concordia University.
Text (application/pdf), 1MB
Lin_MA_F2024.pdf - Accepted Version. Available under License Spectrum Terms of Access.
Abstract
Large language models (LLMs) have significantly advanced the field of automated code
generation. However, a notable research gap exists in evaluating social biases that may be
present in the code produced by LLMs. To address this gap, we propose Solar, a novel fairness
framework for assessing and mitigating social biases in LLM-generated code.
Specifically, Solar automatically generates test cases that quantitatively uncover
social biases in code generated by LLMs. To quantify the severity of these biases,
we develop a dataset that covers a diverse set of social problems. We
apply Solar and this dataset to four state-of-the-art LLMs for code generation.
Our evaluation reveals severe bias in the LLM-generated code from all the subject LLMs.
Furthermore, we explore several prompting strategies for mitigating bias, including
Chain-of-Thought (CoT) prompting, combining positive role-playing with CoT prompting, and
dialogue with Solar. Our experiments show that dialogue with Solar can effectively reduce
social bias in LLM-generated code by up to 90%.
Beyond single prompts, we studied social bias in multi-agent LLM workflows using
FlowGen, where agents act as requirement engineers, architects, developers, and testers.
The results show that the design of the workflow, the fairness-aware role instructions, and
the composition of the roles affect the fairness of the code.
Our findings demonstrate that social bias is a systemic issue in LLM-based code generation.
Solar offers a practical tool for assessing bias risks, tracing their origins, and applying
targeted mitigation strategies, including collaborative workflows.
| Divisions: | Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering |
|---|---|
| Item Type: | Thesis (Masters) |
| Authors: | Ling, Lin |
| Institution: | Concordia University |
| Degree Name: | M.A. |
| Program: | Computer Science |
| Date: | March 2025 |
| Thesis Supervisor(s): | Yang, Jinqiu |
| ID Code: | 996196 |
| Deposited By: | Lin Ling |
| Deposited On: | 04 Nov 2025 15:02 |
| Last Modified: | 04 Nov 2025 15:02 |