EXAMINING POLITICAL POLARIZATION IN THE GERMAN BUNDESTAG USING LLMs HISTORICAL TRENDS AND A CONTEMPORARY ANALYSIS

Executive Summary

This work examines political polarization in the German Bundestag (2009–2024), with a focus on the most recent, 20th electoral term (2021–2024), using an ensemble of fine-tuned LLMs (BERT, GPT-4o, LLaMA) alongside sentiment and structural analysis to investigate the prevalence of polarizing speech and associated factors.

Key Findings:

  • Polarization has doubled since 2009, peaking around elections and budget approvals.
  • AfD speakers exhibit the highest degree of polarization, with Die Linke and BSW also increasing recently.
  • Coalition tensions may have been signalled by declining applause exchanged between parties, showing internal strain before the 2024 government breakdown.
  • Economic, energy, and civil rights debates show the highest polarization levels.
  • Men and older politicians tend to be more polarizing.

1. Introduction

In November 2024, Germany witnessed a sudden breakdown of the governing coalition, sparking renewed scrutiny of the role polarization plays in politics. While polarization is often debated in journalistic and public spheres, empirical studies specifically targeting parliamentary speech in the German Bundestag remain limited. Accordingly, we analyze Bundestag speeches from 1949 to 2024, concentrating on the most recent four electoral terms (2009–2024) for a novel Large Language Model (LLM) ensemle based investigation.

Research Questions

  1. (RQ1) Historical Overview: How has political polarization, as reflected in parliamentary discourse, evolved over the last four electoral terms (2009–2024)? Can we use sentiment analysis to contextualize our findings for the entirety of our dataset (1949–2024)?
  2. (RQ2) Contemporary Insights: Which parties, debate topics, and speaker attributes correlate most strongly with polarizing speech during the 20th electoral term (2021–2024)?

By employing an ensemble of fine-tuned LLMs — BERT, GPT-4o-mini, and LLaMA — alongside sentiment- and structural analysis, we offer an insightful perspective on how language in the Bundestag has become more divisive over time, and which factors are most prominently associated with such polarization.


2. Methodology

2.1 Research Design Overview

Our methodology integrates two main analytical components:

  1. A deterministic approach, encompassing sentiment analysis and structural analysis (shouts, applause, interruptions, etc.), extending back to 1949.
  2. An LLM ensemble classification for polarizing vs. non-polarizing speech, focused primarily on 2009–2024.

This dual design allows both a broad historical context and a targeted lens on more recent discourse, where computational tools can feasibly be applied.

Drawing from Bauer (2019) and Roberts (2021), we use the following definition of polarization for our work:

“Political polarization is the growing division among political actors and voters, characterized by an increasing divergence of opinions that shifts towards extreme positions, creating antagonistic political camps”

2.2 Data Sources

We mainly leverage our custom adaptation of the Open Discourse Project to extract all official parliamentary transcripts (1949–2024) from the Bundestag website. After cleaning, structuring, and refining the data, we obtain a comprehensive dataset that includes all speeches, agenda items, contributions (e.g., shouts, laughter), and detailed records of parliamentary compositions.

2.3 Deterministic Approaches

2.3.1. Sentiment Analysis (Bag-of-Words Approach, Dictionary-Based)

  • We generate a sentiment score by counting and aggregating words associated with positive or negative sentiment, using the sentiment dictionary by Rauh (2018).
  • This method is computationally efficient, offering insight into long-term trends, though it does not target polarization directly.

2.3.2. Structural Analysis using contributions data (shouts, applause, interruptions, etc.)

  • We analyze the frequency of applause, interruptions, and speaker transitions.
  • These metrics serve as proxies for collaboration (applause) or conflict (negative interjections), tracking shifts in cross-party dynamics.

2.4 Ensemble LLM Classification

2.4.1. Model Selection

  • BERT: Its bidirectional encoder is adept at handling the nuanced German language, ensuring an accurate grasp of context and semantics.
  • GPT 4o: Implemented as a scaled-down, instruction-tuned variant (GPT-4o-mini), it offers state-of-the-art performance despite being closed source.
  • LLaMA: This open-source transformer, fine-tuned with parameter-efficient methods like LoRa, provides a valuable alternative to larger models.

2.4.2. Fine-Tuning Procedure

  • Corpus Preparation: Speeches are segmented into smaller text chunks to fit model context windows while preserving sentence structure.
  • Labeling: Following guidelines by Reiter (2020), a subset of text fragments is manually annotated as “polarizing” or “non-polarizing.” After peer review and iterative refinement, the final corpus comprises 4,767 training entries (824 polarizing, 3,943 non-polarizing) and 2,254 validation entries (230 polarizing, 2,024 non-polarizing).
  • Training & Validation: The training set is balanced with 824 polarizing and 824 non-polarizing samples (totaling 1,648), while the validation set maintains the natural distribution.

2.4.3. Model Application and Ensemble

  • Fragment Classification: Each fine-tuned model independently classifies speech fragments, with BERT achieving an F1 score of 0.7463, GPT-4o-mini leading with 0.7985, and LLaMA at 0.7453.
  • Ensemble Approach: Using majority voting, the ensemble model achieves an F1-score of 0.8247 — balancing precision and recall by leveraging complementary strengths and mitigating individual biases.
  • Aggregation to Speech Level: If over half the fragments of a speech are classified as polarizing, the entire speech is judged as such. This yields an overall polarizing speech rate of 11.82%.

3. Results and Empirical Findings

3.1 Historical Overview (RQ1, 1949/2009–2024)

  • Historial Sentiment Across Parties: After relative historical stability, net sentiment in Bundestag speeches has steadily increased across major parties since the 2000s. From around 2005, net sentiment began to diverge between parties — suggesting rising rhetorical polarization. Most recently, the governing coalition (e.g., SPD, Bündnis90/Die Grünen, FDP) and opposition parties (e.g., CDU/CSU) exhibit distinct sentiment trends, with Die Linke and AfD consistently scoring lower.
  • Sentiment By Governing Status: When considering net sentiment by governing status, we see our previous insights reaffirmed: Governing factions generally exhibit higher net sentiment scores than the opposition. Moreover, we again see a divergence from around 2005, potentially hinting at higher political polarization.
  • Long-term Increase in Polarization: Polarization has more than doubled (136%) over the last four electoral terms.
  • AfD Impact: The emergence of the AfD corresponds with a significant jump in polarizing rhetoric — AfD speakers tend to engage in polarizing speech at an average frequency of 0.4647, roughly triple that of other factions. Die Linke and BSW also register higher polarizing frequencies.
  • Governing vs. Opposition: Governing factions typically show lower polarization levels, with observable shifts aligning with changes in governing status.
  • External Events: Although global or national events do not uniformly affect polarization, nuances such as session absences or pre-set agenda topics might obscure these effects.
  • Election and Budgeting Cycles: Polarization spikes approximately one month before federal elections (with increases of 253.23%, 343.11%, and 23.17% across terms) and during budget approvals, followed by a normalization phase.

3.2 Contemporary Insights (RQ2, 2021–2024, 20th Term)

  • Reactions Received: The AfD receives the least applause (0.8%), while governing factions benefit from strong inter-coalition applause dynamics. AfD also experiences a higher frequency of personal interjections and shouts, signaling pronounced opposition.
  • Reactions Given: Governing factions predominantly applaud each other. In contrast, AfD gives significantly more shouts (28.6%) and interjections (24.1%), aligning with its role as a polarizing force.
  • Declining Applause and Coalition Tensions: Post-summer 2024, coalition parties (SPD, Die Grünen, FDP) show a noticeable drop in exchanged applause, hinting at growing internal tensions.
  • Rising Polarization: While polarization increases across all factions — with AfD remaining the most polarizing — Die Linke and BSW show notable growth, even as governing parties maintain a stable profile.
  • Polarization by Topic: Debate topics seem to lend themselves to polarization to different degrees: the economy (20.56%) shows high polarization despite infrequent discussion (2.13%), while energy (21.88%) and civil rights (20.24%) combine high polarization with more frequent debate. Notably, we find that Reunification is deemed wholly non-polarising.
  • Age & Gender Effects: Men and older politicians tend to exhibit higher polarization scores (20.38%), suggesting that demographic factors may be associated with differences in rhetorical styles.

4. Conclusion and Implications

  • Methodological Insights: The LLM ensemble effectively captures both overt and subtle signs of polarization, with sentiment and structural analyses providing complementary perspectives.
  • Rising Polarization: Political discourse in the Bundestag has grown increasingly polarized — especially following the entry of the AfD.
  • Associated Factors: Party affiliation (notably AfD), debate topics (energy, economy, civil rights), and demographic factors are associated with higher polarization.
  • Coalition Health: Subtle signals, such as reduced applause, may forecast deeper coalition rifts.
  • Practical Applications: Continuous monitoring of parliamentary discourse could help predict coalition stability, guide campaign strategies, and inform interventions to manage escalating tensions.

We welcome feedback, collaboration, and extensions of this work! Feel free to check out the full codebase on GitHub or get in touch at 58544@novasbe.pt.

Antonius Greiner

Antonius Greiner

Data Scientist