formalproofs/MDPLib

Formalizing Markov Decision Processes in Lean

Blueprint

Verified Lean algorithms for solving tabular MDPs, together with proofs of their properties. The project has two main goals:

  1. Basic algorithms that can solve robust and risk-averse MDPs of moderate size.

  2. Proofs of correctness of the algorithms and of fundamental MDP properties, which can be used independently to prove structural results, such as the optimality of a particular policy class.

Status

Basic probability properties

  • Definition of a probability space
  • Definitions: probability, expectation, conditional probability
  • Independence of random variables
  • Tower property, law of the unconscious statistician
  • Quantile definition and basic properties
  • Quantile under monotone transformation
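
As an illustration of this part of the development, the tower property can be stated in Mathlib style roughly as follows. This is only a statement sketch: the binder names and exact side conditions are illustrative, not the library's actual declarations.

```lean
import Mathlib

open MeasureTheory

-- Tower property: the expectation of E[X | m] equals E[X].
-- Sketch only; exact hypotheses in the formalization may differ.
example {Ω : Type*} {m m₀ : MeasurableSpace Ω} {μ : @Measure Ω m₀}
    (hm : m ≤ m₀) [IsProbabilityMeasure μ] {X : Ω → ℝ}
    (hX : Integrable X μ) :
    ∫ ω, (μ[X|m]) ω ∂μ = ∫ ω, X ω ∂μ := by
  sorry
```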

Value at Risk

  • Definition (non-constructive)
  • Practical O(n^2) implementation and its correctness
  • Fast practical O(n log n) implementation and its correctness
  • VaR is positively homogeneous and monotone
  • VaR is translation (cash) invariant
  • VaR under monotone transformation
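
The non-constructive definition above can be sketched as the lower quantile of a distribution on ℝ. The name `VaR` and the exact signature are illustrative, not necessarily those used in the library:

```lean
import Mathlib

open MeasureTheory

-- VaR at level α: the smallest threshold t whose CDF value reaches α.
-- Sketch only; the formalization's definition may differ in detail.
noncomputable def VaR (μ : Measure ℝ) (α : ℝ) : ℝ :=
  sInf {t : ℝ | α ≤ (μ (Set.Iic t)).toReal}
```

Properties such as monotonicity and translation (cash) invariance follow directly from this infimum form: shifting the distribution by a constant c shifts the set of admissible thresholds by c.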

MDP: Basics

  • Definition of MDP
  • Definition of policies (history, Markov, stationary)
  • Definition of value function (history-dependent)
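
A minimal sketch of a tabular MDP and the three policy classes listed above, assuming a Lean 4 / Mathlib setting; every name here is illustrative:

```lean
import Mathlib

-- Finite state and action spaces; P s a is the next-state distribution.
structure TabMDP (S A : Type) [Fintype S] [Fintype A] where
  P : S → A → PMF S   -- transition kernel
  r : S → A → ℝ       -- one-step reward

-- A stationary policy picks one action per state; a Markov policy may
-- also depend on the time step; a history-dependent policy sees the
-- entire past sequence of states and actions.
def StationaryPolicy (S A : Type) := S → A
def MarkovPolicy (S A : Type) := ℕ → S → A
def HistoryPolicy (S A : Type) := List (S × A) → S → A
```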

Finite Horizon

  • Histories and manipulation
  • Probability space over histories
  • Return and optimal return using histories
  • History-dependent value function and dynamic program
  • Markov optimal value function and optimal policy
  • DP algorithms
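
The backward-induction dynamic program for the optimal finite-horizon value function can be sketched as follows; all names are illustrative, and the transition kernel is given here as a plain real-valued function for brevity:

```lean
import Mathlib

variable {S A : Type} [Fintype S] [Fintype A] [Nonempty A]

-- optValue P r n s: optimal expected return over the next n steps from s.
-- Sketch only; the library's DP algorithms may be stated differently.
noncomputable def optValue (P : S → A → S → ℝ) (r : S → A → ℝ) :
    ℕ → S → ℝ
  | 0, _ => 0
  | n + 1, s => ⨆ a : A, (r s a + ∑ s', P s a s' * optValue P r n s')
```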

Risk-averse finite horizon

  • History-dependent utility functions
  • Augmented value function dynamic program
  • VaR computation from utility function
  • VaR DP decomposition as in Hau et al., 2023
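
The idea behind the augmented dynamic program can be sketched as pairing each state with the reward accumulated so far, which turns a history-dependent utility objective into a Markov one on the larger state space. The names below are illustrative; for the actual decomposition see Hau et al., 2023:

```lean
import Mathlib

-- Augmented state: underlying MDP state plus accumulated reward.
structure AugState (S : Type) where
  base : S
  wealth : ℝ   -- total reward collected so far

-- At the horizon, the value of an augmented state is the utility of its
-- accumulated reward; VaR of the return is then read off from the
-- distribution of this terminal utility.
def terminalValue (S : Type) (U : ℝ → ℝ) (z : AugState S) : ℝ :=
  U z.wealth
```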

Discounted infinite horizon

Average reward infinite horizon

Lean Resources

Most useful

Others
