As large language models (LLMs) increasingly shape decision-making, public discourse, education, healthcare, and governance, a critical question emerges: Whose values are these systems truly reflecting? While today’s generative AI systems demonstrate extraordinary capabilities, they also inherit hidden biases, ethical blind spots, and implicit assumptions embedded within their training data. Existing alignment approaches often remain opaque, resource-intensive, or insufficiently adaptive to diverse societal expectations.
This paper introduces a novel value-driven LLM framework designed to systematically uncover, quantify, and realign the implicit values embedded within LLMs toward socially desirable outcomes. Built upon the AI for Social Good (AIfSG) framework, our proposed methodology operationalizes ethical alignment across six critical domains: reasoning and interpretability, bias removal, transparency and accountability, security and privacy, moral and ethical observations, and public understanding. Using advanced embedding techniques, cosine-based value-difference metrics, and topic-weighted iterative fine-tuning, the framework transforms ethical alignment from an abstract aspiration into a measurable and actionable computational process.
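As a rough illustration of how a cosine-based value-difference metric might be computed, the sketch below assumes the difference is defined as one minus the cosine similarity between a model's value embedding and a reference embedding, and that per-domain differences are combined with topic weights. The abstract does not state the exact formula, so this definition, the function names (`value_difference`, `weighted_misalignment`), and the toy vectors are all hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def value_difference(model_vec, reference_vec):
    """Assumed metric: 1 - cosine similarity (0 means same direction)."""
    return 1.0 - cosine_similarity(model_vec, reference_vec)


def weighted_misalignment(model_vecs, reference_vecs, weights):
    """Topic-weighted average of per-domain value differences.

    All three arguments are dicts keyed by domain name (an assumption
    made for this sketch, not a structure described in the paper).
    """
    total_weight = sum(weights[d] for d in reference_vecs)
    weighted_sum = sum(
        weights[d] * value_difference(model_vecs[d], reference_vecs[d])
        for d in reference_vecs
    )
    return weighted_sum / total_weight


# Toy two-dimensional embeddings; real embeddings would come from an encoder.
model = {"bias removal": [1.0, 0.0], "security and privacy": [1.0, 0.0]}
reference = {"bias removal": [1.0, 0.0], "security and privacy": [0.0, 1.0]}
weights = {"bias removal": 1.0, "security and privacy": 3.0}

score = weighted_misalignment(model, reference, weights)
```

Under these assumptions, a score that falls across fine-tuning rounds would indicate that the model's outputs are moving closer to the reference value system in the weighted domains.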
To demonstrate adaptability across moral and regulatory paradigms, the framework is applied to two distinct reference value systems: the Ten Commandments and the General Data Protection Regulation (GDPR). Experimental results using open-source LLMs, including Llama 3.2 and Gemma 2, reveal substantial reductions in value misalignment, ranging from approximately 25% to 70% across ethical domains, while preserving model flexibility and scalability. Visualizations in value-embedding space further show that model outputs, initially distant from the socially aligned reference values, converge toward them after iterative realignment.
Beyond technical innovation, this work positions value alignment as a foundational challenge for the future governance of generative AI. The proposed framework functions both as a diagnostic instrument for identifying ethical gaps and as an intervention mechanism for adaptive value realignment, offering policymakers, developers, and institutions a scalable pathway toward transparent, accountable, and socially responsible AI systems. By bridging computational methods with moral, legal, and societal principles, this research advances a new paradigm for embedding human-centered values directly into the next generation of intelligent systems.