Relaxed Energy Games


Ritam Raha

Chennai Mathematical Institute


supervised by

Nicolas Markey                        Loïc Hélouët

INRIA Rennes                         INRIA Rennes


CMI
INRIA
Master's thesis in fulfillment of the requirements for the degree of Master of Science

1

Introduction


  • $G= \langle Q_1,Q_2,E,w,q_0,T\rangle$
  • Each edge has integral weights. Initial energy level is always $0$ (Note that, this is no loss of generality).
  • $P1$ has to reach $T$ starting from $q_0$ maintaining energy level within some given bound(s).

2

Different Games


L=0 energy level=0 Single Lower Bound Game: L=0 energy level=2 Single Lower Bound Game: L=0 energy level=6 Single Lower Bound Game: L=0 energy level=3 Single Lower Bound Game: L=0 energy level=6 Single Lower Bound Game:✔ L=0, U=4 energy level= Single Lower Bound Game:✔ Strong Dual Bound Game: L=0, U=4 energy level=2 Single Lower Bound Game:✔ Strong Dual Bound Game: L=0, U=4 energy level=2 Single Lower Bound Game:✔ Strong Dual Bound Game: ✘ L=0, W=4 energy level=2 Single Lower Bound Game:✔ Strong Dual Bound Game: ✘ Weak Dual(Upper) Bound Game: L=0, W=4 energy level=4 Single Lower Bound Game:✔ Strong Dual Bound Game: ✘ Weak Dual(Upper) Bound Game: L=0, W=4 energy level=1 Single Lower Bound Game:✔ Strong Dual Bound Game: ✘ Weak Dual(Upper) Bound Game: L=0, W=4 energy level=4 Single Lower Bound Game:✔ Strong Dual Bound Game: ✘ Weak Dual(Upper) Bound Game: ✔ L=0, RU=4(violated), v = 1 energy level= Single Lower Bound Game:✔ Strong Dual Bound Game: ✘ Weak Dual(Lower) Bound Game: ✔ APNA Game: L=0, RU=4(violated), v = 1 energy level= Single Lower Bound Game:✔ Strong Dual Bound Game: ✘ Weak Dual(Lower) Bound Game: ✔ APNA Game: ✔ L=0, RU=4(violated), v = 0 energy level= Single Lower Bound Game:✔ Strong Dual Bound Game: ✘ Weak Dual(Lower) Bound Game: ✔ APNA Game: ✘ L=0, RU=4(violated), v = 0 energy level= Single Lower Bound Game:✔ Strong Dual Bound Game: ✘ Weak Dual(Lower) Bound Game: ✔ APNA Game: ✘

3

Single Bound Games


$P1$ has to reach $T$ from $q_0$ in a path maintaining energy level at all the prefixes $\geq 0$ given 0 initial credit.

5

Strong Dual Bound Games



INPUT: A game graph, initial vertex $q_0$, target vertex $T$, lower bound $l$, upper bound $U$
QUESTION: Does $P1$ have a winning strategy to reach $T$ starting with 0 from $q_0$ maintaining the energy level $l \leq e \leq U$?

  • One player game: PSPACE-complete (reduction from Reachability of bounded one counter automaton )

  • Two player game: EXPTIME-complete (reduction from Countdown Games )

With strong bounds, reachability and Infinite Games are interreducible!!

6

Weak Dual Bound Games


  • We now relax the upper bound, by making it weak.

  • Formally,

INPUT: A game graph, initial vertex $q_0$, target vertex $T$, lower bound $L$, upper bound $W$(weak bound)
QUESTION: Does $P1$ have a winning strategy to reach $T$ starting with 0 from $q_0$ maintaining the energy level $e \geq L$ such that if at any point $e$ hits $U$, it does not go above $U$?

7

Motivation

Let us look at some motivation:
Access Secret Access Secret Access Secret Access Secret Access Secret +10 +6 +5 Access Secret +10 +6 +5 -2 -20 -4 Access Secret +10 +6 +5 -2 -20 -4
This can be formulated as a two player finite state game. [Hélouët, Marchand $\&$ Mullins '18]

Now, an intruder should not be able to play a large number of normal moves, higher his reward arbitarily and then try to access the security system staying above the strong lower bound. Hence, upper bound should be weak.

8

Infinite vs Reachability in Weak Dual Bound


L=0,W=4 energy level=0 Infinite Weak Dual Bound Game L=0,W=4 energy level=1 Infinite Weak Dual Bound Game L=0,W=4 energy level=0 Infinite Weak Dual Bound Game L=0,W=4 energy level=4 Infinite Weak Dual Bound Game L=0,W=4 energy level=2 Infinite Weak Dual Bound Game L=0,W=4 energy level=2 Infinite Weak Dual Bound Game L=0,W=4 energy level=2 Infinite Weak Dual Bound Game Conceptually easy: find a cycle that can be iterated once with a positive effect; memoryless strategy for both players L=0,W=4 energy level=2 Weak Dual Bound Reachability Game L=0,W=4 energy level=2 Weak Dual Bound Reachability Game L=0,W=4 energy level=0 Weak Dual Bound Reachability Game L=0,W=4 energy level=4 Weak Dual Bound Reachability Game L=0,W=4 energy level=3 Weak Dual Bound Reachability Game L=0,W=4 energy level=3 Weak Dual Bound Reachability Game L=0,W=4 energy level=0 Weak Dual Bound Reachability Game L=0,W=4 energy level=4 Weak Dual Bound Reachability Game L=0,W=4 energy level=4 Weak Dual Bound Reachability Game L=0,W=4 energy level=4 Weak Dual Bound Reachability Game L=0,W=4 energy level=0 Weak Dual Bound Reachability Game 😃 !! L=0,W=4 energy level=0 Weak Dual Bound Reachability Game 😃 !! Keep track of the exact energy level;exponential memory for P1.

9

Memory for P2

λ λ >5 >7 λ >5 >7 >max{5-(-1),7-2} λ >5 >7 >6 λ=winning >5 >7 >6 λ=winning >5 >7 <6 λ=winning >5 >7 <6 λ=winning >5 >7 <6 P2 has memoryless winning strategy!!

10

1 Player Game


  • Consider a winning strategy $\sigma$ of $P1$.

  • Intuitively, any outcome of $\sigma$ will not have any zero cycle or negative cycle.

  • Now, $P1$ has two options:
► win in an acyclic path
► choose a positive cycle-iterate-increase energy to certain level-continue

11

Technical Observations

  • Same cycle can be positive or negative cycle depending on the initial energy level.
Let $W=4$. In this example, if $x=4$, it is a negative cycle, but if $x=1$, it is a positive cycle.

  • Any feasible positive cycle can be iterated and after a certain time output energy stabilizes.
    Why?

12

Positive Cycle



#Iterations can be bounded by W-L.

13

Winning Path

energy level=0 q1 0 energy level=6 q1 0 q2 6 energy level=2 q1 0 q2 6 q1 2 energy level=1 q1 0 q2 6 q1 2 q3 1 energy level=4 q1 0 q2 6 q1 2 q3 1 q4 4 energy level=2 q1 0 q2 6 q1 2 q3 1 q4 4 q3 2 energy level=4 q1 0 q2 6 q1 2 q3 1 q4 4 q3 2 q5 4 energy level=3 q1 0 q2 6 q1 2 q3 1 q4 4 q3 2 q5 4 q3 3 energy level=6 q1 0 q2 6 q1 2 q3 1 q4 4 q3 2 q5 4 q3 3 q4 6 energy level=4 q1 0 q2 6 q1 2 q3 1 q4 4 q3 2 q5 4 q3 3 q4 6 q3 4 energy level=6 q1 0 q2 6 q1 2 q3 1 q4 4 q3 2 q5 4 q3 3 q4 6 q3 4 q1 6 energy level=0 q1 0 q2 6 q1 2 q3 1 q4 4 q3 2 q5 4 q3 3 q4 6 q3 4 q1 6 T 0 energy level=0 q1 0 q2 6 q1 2 q3 1 q4 4 q3 2 q5 4 q3 3 q4 6 q3 4 q1 6 T 0 energy level=0 q1 0 q2 6 q1 2 q3 1 q4 4 q3 2 q5 4 q3 3 q4 6 q3 4 q1 6 T 0 ► A cycle of length >|Q| can be ignored to check smaller cycles. energy level=0 q1 0 q2 6 q1 2 q3 1 q4 4 q3 2 q5 4 q3 3 q4 6 q3 4 q1 6 T 0 ► A cycle of length >|Q| can be ignored to check smaller cycles. energy level=0 q1 0 q2 6 q1 2 q3 1 q4 4 q3 2 q5 4 q3 3 q4 6 q3 4 q1 6 T 0 ► A cycle of length >|Q| can be ignored to check smaller cycles. energy level=0 q1 0 q2 6 q1 2 q3 1 q4 4 q3 2 q5 4 q3 3 q4 6 q3 4 q1 6 T 0 ► A cycle of length >|Q| can be ignored to check smaller cycles. ► If same cycle appears, remove them, iterate the previous one enough. energy level=2 q1 0 q2 6 q1 2 q3 1 q4 4 q3 2 q5 4 q3 3 q4 6 q3 4 q1 6 T 0 ► A cycle of length >|Q| can be ignored to check smaller cycles. ► If same cycle appears, remove them, iterate the previous one enough. energy level=4 (rotating the cycle 2 times) q1 0 q2 6 q1 2 q3 1 q4 4 q3 2 q5 4 q3 3 q4 6 q3 4 q1 6 T 0 ► A cycle of length >|Q| can be ignored to check smaller cycles. ► If same cycle appears, remove them, iterate the previous one enough. energy level=4 (rotating the cycle 2 times) q1 0 q2 6 q1 2 q3 1 q4 4 q3 2 q5 4 q3 3 q4 6 q3 4 q1 6 T 0 ► A cycle of length >|Q| can be ignored to check smaller cycles. ► If same cycle appears, remove them, iterate the previous one enough. ► Winning Path: α1 · ϕ1n ··· αk · ϕkn, where all ϕi's are distinct cycles of length<|Q|, αj's are acyclic paths and n = W-L.

14

Universal Cycle


Universal cycle on $q$: cycle can be taken with initial energy $L$.
L=0, W=4 L=0, W=4 L=0, W=4 abcde ✘ universal L=0, W=4 abcde ✘ universal bcdea ✔ universal (Output= 2) L=0, W=4 abcde ✘ universal bcdea ✔ universal (Output= 2) L=0, W=4 abcde ✘ universal bcdea ✔ universal (Output= 2) bde ✔ universal (Output= 1) L=0, W=4 abcde ✘ universal bcdea ✔ universal (Output= 2) bde ✔ universal (Output= 1) bcdea ▻ bde L=0, W=4 abcde ✘ universal bcdea ✔ universal (Output= 2) bde ✔ universal (Output= 1) bcdea ▻ bde From now on we try to find Optimal Universal Cycles.

15

Optimal Universal Cycle : NP algorithm


  • For every cycle, there is a universal cycle.

  • Winning paths: $\beta_1 \cdot {\tau_1}^{W-L} \cdot\beta_2\cdot{\tau_2}^{W-L} \cdots \beta_k \cdot {\tau_k}^{W-L}$ where, $\tau_j$'s are universal.

  • Use optimal universal cycles: $k<|Q| \Rightarrow$ NP Algorithm!!

  • Cycles(EXPTIME) $\Rightarrow$ Universal Cycles(EXPTIME) $\Rightarrow$ Optimal UC(NP) $\Rightarrow$ P ?

  • Idea for P:
    • Find Optimal UC in P
    • Find the winning path in P

16

Road to P: DAG-construction


energy level=0 q3 q4 q5 q1 T q3 q3 q2 q3 q1 T q3 energy level=0 M=maximal energy seen m=M-current energy q3 q4 q5 q1 T q3 q3 q2 q3 q1 T q3 energy level=3 M=maximal energy seen m=M-current energy q3 q4 q5 q1 T q3 q3 q2 q3 q1 T q3 (0,0) (3,0) energy level=1 M=maximal energy seen m=M-current energy q3 q4 q5 q1 T q3 q3 q2 q3 q1 T q3 (0,0) (3,0) (3,2) energy level=2 M=maximal energy seen m=M-current energy q3 q4 q5 q1 T q3 q3 q2 q3 q1 T q3 (0,0) (3,0) (3,2) (2,0) (2,1) (2,0) (2,1) (6,0) (6,4) (6,5) M=maximal energy seen m=M-current energy q3 q4 q5 q1 T q3 q3 q2 q3 q1 T q3 (0,0) (3,0) (3,2) (2,0) (2,1) (2,0) (2,1) (6,0) (6,4) (6,5) ► If a cycle ends with (M,m), iterating that we blow up the energy to W-m. q3q1q3is one of the optimalUC ! q3 q4 q5 q1 T q3 q3 q2 q3 q1 T q3 (0,0) (3,0) (3,2) (2,0) (2,1) (2,0) (2,1) (6,0) (6,4) (6,5) ► If a cycle ends with (M,m), iterating that we blow up the energy to W-m. q3q1q3is one of the optimalUC ! But still Exponential!! q3 q4 q5 q1 T q3 q3 q2 q3 q1 T q3 (0,0) (3,0) (3,2) (2,0) (2,1) (2,0) (2,1) (6,0) (6,4) (6,5) ► If a cycle ends with (M,m), iterating that we blow up the energy to W-m. q3q1q3is one of the optimalUC ! $(M,m) \prec (M',m'):$ $M − m \leq M' − m'$ and $m' \leq m$. q3 q4 q5 q1 T q3 q3 q2 q3 q1 T q3 (0,0) (3,0) (3,2) (2,0) (2,1) (2,0) (2,1) (6,0) (6,4) (6,5) ► If a cycle ends with (M,m), iterating that we blow up the energy to W-m. q3q1q3is one of the optimalUC ! $(M,m) \prec (M',m'):$ $M − m \leq M' − m'$ and $m' \leq m$. $\rho \leq \rho' \Rightarrow$ we store only $\rho'$ q3 q4 q5 q1 T q3 q3 q2 q3 q1 T q3 (0,0) (3,0) (3,2) (2,0) (2,1) (2,0) (2,1) (6,0) (6,4) (6,5) ► If a cycle ends with (M,m), iterating that we blow up the energy to W-m. q3q1q3is one of the optimalUC ! $(M,m) \prec (M',m'):$ $M − m \leq M' − m'$ and $m' \leq m$. $\rho \leq \rho' \Rightarrow$ we store only $\rho'$ q3 q4 q5 q1 T q3 q3 q2 q3 q1 T q3 (0,0) (3,0) (3,2) (2,0) (2,1) (2,0) (2,1) (6,0) (6,4) (6,5) ► If a cycle ends with (M,m), iterating that we blow up the energy to W-m. ► Only store maximal labels: PTIME!!

17

Polynomial Algorithm


  • For each state q, compute $m_q$ for the optimal universal cycle, if exists.

  • Construct $G'$ adding transition $q \xrightarrow{:=W-m_q} q$.

  • Winning path: $\beta_1 \cdot \upsilon_1 \cdots \beta_k\cdot \upsilon_k$, where $\upsilon 's$ are special transitions. Path size $ \leq (|Q+1|)^2$.

  • Check paths of length less than or equal to $(|Q| + 1)^2$; inductively compute the maximal energy level: similar to DAG construction.
    PTIME!

Corollary: Two player LW-reachability is in coNP.

18

APNA Games

(Relaxed Upper Bound)



We define different violation measures later.

19

Motivation

Let us look at the same banking system scenario:
Access Secret 0 0 0 +2 +20 +4
Clearly, L=0. But, consider a normal user accidentally used some malicious actions, APNA Games come into play.

20

Notations

Violation measures for a path $\rho$ with violations:= $V(\rho) = \{i \in [0;|\rho|]\mid \tilde\rho_i>U\}$:


Given the notations, $APNA^{\sharp}$, $APNA^{\overline{\sharp}}$, $APNA^{\Sigma}$- infinite or reachability games are intuitive:

INPUT: A game graph, lower bound $l$, relaxed upper bound $U$ and a bound $V$
QUESTION: Does $P1$ have a strategy to win infinite or reachability objectives maintaining energy level $\geq L$ and for $RU$, corresponding violation measure $\leq V$?

21

Solving APNA Games

Lets go through the idea of solving APNA Games:

$E_n:$ Current Energy Level; $n:$ No. of violations; $c:$ No. of consecutive violations; $s:$ Sum of overloads; $E_n:$ Current Energy Level; $n:$ No. of violations; $c:$ No. of consecutive violations; $s:$ Sum of overloads; ${E_n}'= E_n + w$, if $\in [L,U+V \cdot w_{max}]$, $= \bot$, otherwise. $E_n:$ Current Energy Level; $n:$ No. of violations; $c:$ No. of consecutive violations; $s:$ Sum of overloads; ${E_n}'= E_n + w$, if $\in [L,U+V \cdot w_{max}]$, $= \bot$, otherwise. $n'=n+1$, if $q'$ violated $\&$ $n+1$< $V$, $= n$, if $q'$ not violated, $= \bot$, if $n+1$> $V$ or ${E_n}'=\bot.$ $E_n:$ Current Energy Level; $n:$ No. of violations; $c:$ No. of consecutive violations; $s:$ Sum of overloads; ${E_n}'= E_n + w$, if $\in [L,U+V \cdot w_{max}]$, $= \bot$, otherwise. $n'=n+1$, if $q'$ violated $\&$ $n+1$< $V$, $= n$, if $q'$ not violated, $= \bot$, if $n+1$> $V$ or ${E_n}'=\bot.$ $c'=c+1$, if $q$ and $q'$ violated $\&$ $c+1$< $V$, $= 0$, if $q'$ not violated, $= \bot$, if $c+1$> $V$ or ${E_n}'=\bot.$ $s'=s+{E_n}'-U$, if ${E_n}' \in (U,U+V \cdot w_{max})$ $ \& $ < $V$, $= s$, if ${E_n}' \in [L,U]$, $= \bot$, if $s+ {E_n}'-U$> $V$ or ${E_n}'=\bot.$ $E_n:$ Current Energy Level; $n:$ No. of violations; $c:$ No. of consecutive violations; $s:$ Sum of overloads; ${E_n}'= E_n + w$, if $\in [L,U+V \cdot w_{max}]$, $= \bot$, otherwise. $n'=n+1$, if $q'$ violated $\&$ $n+1$< $V$, $= n$, if $q'$ not violated, $= \bot$, if $n+1$> $V$ or ${E_n}'=\bot.$ $c'=c+1$, if $q$ and $q'$ violated $\&$ $c+1$< $V$, $= 0$, if $q'$ not violated, $= \bot$, if $c+1$> $V$ or ${E_n}'=\bot.$ $s'=s+{E_n}'-U$, if ${E_n}' \in (U,U+V \cdot w_{max})$ $ \& $ < $V$, $= s$, if ${E_n}' \in [L,U]$, $= \bot$, if $s+ {E_n}'-U$> $V$ or ${E_n}'=\bot.$ ◪ APNA$^{*}$ Games can be solved by solving reachability of the relevant set of states in $G_{APNA}$. $E_n:$ Current Energy Level; $n:$ No. of violations; $c:$ No. of consecutive violations; $s:$ Sum of overloads; ${E_n}'= E_n + w$, if $\in [L,U+V \cdot w_{max}]$, $= \bot$, otherwise. $n'=n+1$, if $q'$ violated $\&$ $n+1$< $V$, $= n$, if $q'$ not violated, $= \bot$, if $n+1$> $V$ or ${E_n}'=\bot.$ $c'=c+1$, if $q$ and $q'$ violated $\&$ $c+1$< $V$, $= 0$, if $q'$ not violated, $= \bot$, if $c+1$> $V$ or ${E_n}'=\bot.$ $s'=s+{E_n}'-U$, if ${E_n}' \in (U,U+V \cdot w_{max})$ $ \& $ < $V$, $= s$, if ${E_n}' \in [L,U]$, $= \bot$, if $s+ {E_n}'-U$> $V$ or ${E_n}'=\bot.$ ◪ APNA$^{*}$ Games can be solved by solving reachability of the relevant set of states in $G_{APNA}$. ◪ 1-player APNA$^{*}$ Game is in PSPACE and 2-player game is in EXPTIME. $E_n:$ Current Energy Level; $n:$ No. of violations; $c:$ No. of consecutive violations; $s:$ Sum of overloads; ${E_n}'= E_n + w$, if $\in [L,U+V \cdot w_{max}]$, $= \bot$, otherwise. $n'=n+1$, if $q'$ violated $\&$ $n+1$< $V$, $= n$, if $q'$ not violated, $= \bot$, if $n+1$> $V$ or ${E_n}'=\bot.$ $c'=c+1$, if $q$ and $q'$ violated $\&$ $c+1$< $V$, $= 0$, if $q'$ not violated, $= \bot$, if $c+1$> $V$ or ${E_n}'=\bot.$ $s'=s+{E_n}'-U$, if ${E_n}' \in (U,U+V \cdot w_{max})$ $ \& $ < $V$, $= s$, if ${E_n}' \in [L,U]$, $= \bot$, if $s+ {E_n}'-U$> $V$ or ${E_n}'=\bot.$ ◪ APNA$^{*}$ Games can be solved by solving reachability of the relevant set of states in $G_{APNA}$. ◪ 1-player APNA$^{*}$ Game is in PSPACE and 2-player game is in EXPTIME. ◪ Hardness reduction comes from LU game with $V=0$. $E_n:$ Current Energy Level; $n:$ No. of violations; $c:$ No. of consecutive violations; $s:$ Sum of overloads; ${E_n}'= E_n + w$, if $\in [L,U+V \cdot w_{max}]$, $= \bot$, otherwise. $n'=n+1$, if $q'$ violated $\&$ $n+1$< $V$, $= n$, if $q'$ not violated, $= \bot$, if $n+1$> $V$ or ${E_n}'=\bot.$ $c'=c+1$, if $q$ and $q'$ violated $\&$ $c+1$< $V$, $= 0$, if $q'$ not violated, $= \bot$, if $c+1$> $V$ or ${E_n}'=\bot.$ $s'=s+{E_n}'-U$, if ${E_n}' \in (U,U+V \cdot w_{max})$ $ \& $ < $V$, $= s$, if ${E_n}' \in [L,U]$, $= \bot$, if $s+ {E_n}'-U$> $V$ or ${E_n}'=\bot.$ ◪ APNA$^{*}$ Games can be solved by solving reachability of the relevant set of states in $G_{APNA}$. ◪ 1-player APNA$^{*}$ Game is in PSPACE and 2-player game is in EXPTIME. ◪ Hardness reduction comes from LU game with $V=0$. ◪ Hence, 1-player APNA$^{*}$ Games are PSPACE-complete and 2-player games are EXP-complete. $E_n:$ Current Energy Level; $n:$ No. of violations; $c:$ No. of consecutive violations; $s:$ Sum of overloads; ${E_n}'= E_n + w$, if $\in [L,U+V \cdot w_{max}]$, $= \bot$, otherwise. $n'=n+1$, if $q'$ violated $\&$ $n+1$< $V$, $= n$, if $q'$ not violated, $= \bot$, if $n+1$> $V$ or ${E_n}'=\bot.$ $c'=c+1$, if $q$ and $q'$ violated $\&$ $c+1$< $V$, $= 0$, if $q'$ not violated, $= \bot$, if $c+1$> $V$ or ${E_n}'=\bot.$ $s'=s+{E_n}'-U$, if ${E_n}' \in (U,U+V \cdot w_{max})$ $ \& $ < $V$, $= s$, if ${E_n}' \in [L,U]$, $= \bot$, if $s+ {E_n}'-U$> $V$ or ${E_n}'=\bot.$ ◪ APNA$^{*}$ Games can be solved by solving reachability of the relevant set of states in $G_{APNA}$. ◪ 1-player APNA$^{*}$ Game is in PSPACE and 2-player game is in EXPTIME. ◪ Hardness reduction comes from LU game with $V=0$. ◪ Hence, 1-player APNA$^{*}$ Games are PSPACE-complete and 2-player games are EXP-complete. ◪ $U+V \cdot w_{max}$ is an overestimation and two natural questions arise:

22

Natural Questions

  • Bound existence: Given $V$, decide if there exists a value $H$ such that $P1$ wins the APNA$^{*}$ games with bound $V$?
  • Minimization: Given $V_{max}$, find the value $H$ such that $P1$ wins the APNA$^{*}$ games with the smallest possible value for $V$.
  • Existence can be solved using the similar idea of $G_{APNA}$ and check reachability problems.
  • Minimization can be solved using binary search method within $[U, U+V_{max} \cdot w_{max}]$.
  • They are also PSPACE-complete for 1-player and EXPTIME-complete for two players.[Reduction from LU Games]

23

Conclusion


We have seen two kinds of games:


Future work:

  • consider energy games with mean payoff function
  • discounted total payoff for the energy level and for the violation constraints, and the associated minimization and existence problems

24