LLM backed Chess Reinforcement Learning Documentation
=============================================

.. figure:: /Documentation/images/intro.png
   :width: 500
   :align: center
   :alt: Documentation Cover

--------------------------------------------------------


Checkers Reinforcement Learning Model
=====================================

This documentation provides an overview of the training process and architecture of a reinforcement learning model designed to play the game of checkers. It incorporates a hybrid Deep Q-Learning approach, combined with custom self-play and evaluation mechanisms, with a core integration of a Language Model (LLM) to limit the action space effectively.

Key Features
------------

- **Reinforcement Learning (RL)**: Utilizes temporal difference learning with a neural network to approximate Q-values for state-action pairs.
- **Self-Play**: The model trains against itself to iteratively improve its gameplay.
- **Custom Opponents**: Supports training against various opponents, including random moves, minimax strategies, and itself.
- **LLM-Driven Action Space Limitation**: A Language Model (LLM) acts as an integral part of the architecture, dynamically reducing the action space to the most promising options.
- **Exploration-Exploitation Balance**: Implements a dynamic exploration parameter to balance random exploration and policy exploitation.
- **Performance Metrics**: Tracks win rates over generations for performance evaluation.


.. toctree::
   :maxdepth: 2
   :caption: Introduction

   Documentation/scripts/Scope/introduction.rst

.. toctree::
   :maxdepth: 2
   :caption: Implementation

   Documentation/scripts/Scope/implementation.rst

.. toctree::
   :maxdepth: 2
   :caption: LLM Integration

   Documentation/scripts/Scope/llm_integration.rst

.. toctree::
   :maxdepth: 2
   :caption: Interface

   Documentation/scripts/Scope/interface.rst