Reachability Games with Relaxed Energy Constraints


Ritam Raha



joint work with

Nicolas Markey                              Loïc Hélouët

INRIA Rennes                               INRIA Rennes


1

Games on Graphs: Qualitative/Quantitative


  • Two players on a graph: typically system ($P_1$) and environment ($P_2$)
  • Qualitative objectives: Reachability, Safety, Büchi
  • Quantitative objectives: Energy, Mean-payoff, Discounted

2

Energy with Reachability



  • Weights are intergers; w.l.o.g we start with $0$.

Objective: $P_1$ has to reach T maintaining energy level within some given bounds

3

Different Settings: Strong vs Weak Bounds


L = 0 energy level = 0 (Strong) Lower Bound Game: L = 0 energy level=2 (Strong) Lower Bound Game: L = 0 energy level=6 (Strong) Lower Bound Game: L = 0 energy level=3 (Strong) Lower Bound Game: L = 0 energy level=6 (Strong) Lower Bound Game:✔ L = 0, U=4 energy level= (Strong) Lower Bound Game:✔ (Strong) Dual Bound Game: L = 0, U=4 energy level=2 (Strong) Lower Bound Game:✔ (Strong) Dual Bound Game: L = 0, U=4 energy level=2 (Strong) Lower Bound Game:✔ (Strong) Dual Bound Game: ✘ L = 0, W = 4 energy level=2 (Strong) Lower Bound Game:✔ (Strong) Dual Bound Game: ✘ (Weak) Dual Bound Game: L = 0, W = 4 energy level=4 (Strong) Lower Bound Game:✔ (Strong) Dual Bound Game: ✘ (Weak) Dual Bound Game: L = 0, W = 4 energy level=1 (Strong) Lower Bound Game:✔ (Strong) Dual Bound Game: ✘ (Weak) Dual Bound Game: L = 0, W = 4 energy level=4 (Strong) Lower Bound Game:✔ (Strong) Dual Bound Game: ✘ (Weak) Dual Bound Game: ✔ L = 0, U = 4, v = 1 energy level= (Strong) Lower Bound Game:✔ (Strong) Dual Bound Game: ✘ (Weak) Dual Bound Game: ✔ Bounded Violation Game: L = 0, U = 4, v = 1 energy level= (Strong) Lower Bound Game:✔ (Strong) Dual Bound Game: ✘ (Weak) Dual Bound Game: ✔ Bounded Violation Game: ✔ L = 0, U = 4, v = 0 energy level= (Strong) Lower Bound Game:✔ (Strong) Dual Bound Game: ✘ (Weak) Dual Bound Game: ✔ Bounded Violation Game: ✘ L = 0, U = 4, v = 0 energy level= (Strong) Lower Bound Game:✔ (Strong) Dual Bound Game: ✘ (Weak) Dual Bound Game: ✔ Bounded Violation Game: ✘ L = 0, U = 4, v = 0 energy level= (Strong) Lower Bound Game:✔ (Strong) Dual Bound Game: ✘ (Weak) Dual Bound Game: ✔ Bounded Violation Game: ✘

4

Strong Lower Bound Games


$P_1$ maintains energy level $\geq L$ & reach $T$

Classical Setting: Infinite path setting without reachability

  • 1-player L-infinite games are in P and 2-player are in NP $\cap$ coNP

  • Idea:
    • $P_1$ finds a non-negative cycle and repeat
    • memoryless strategies for both players

5

Reductions


Reachability Game Reachability Game Infinite Game Reachability Game Infinite Game Reachability Game Infinite Game Reachability Game Infinite Game Reachability Game Infinite Game Reachability Game Infinite Game Reachability Game Infinite Game Reachability Game Infinite Game Reachability Game Infinite Game
  • With reachability: Same complexity
  • 6

    Strong Dual Bound Games


    $P_1$ maintains $L \leq$ energy level $\leq U$ & reach $T$

    • One player game: PSPACE-complete (reduction from Reachability of bounded one counter automaton )

    • Two player game: EXPTIME-complete (reduction from Countdown Games )

    With strong bounds, reachability and infinite Games are interreducible!

    7

    Weak Dual Bound Games


    Relax one of the bounds, say the upper bound

    - $P_1$ has to maintain energy level $\geq L$ & reach $T$
    but
    the upper bound $W$ ($U$) is weak,
    i.e.,
    if energy hits $\geq W$, it stays at $W$.

    8

    Motivation towards Weak Bounds

    Access Secret Access Secret Access Secret Access Secret Access Secret +10 +6 +5 Access Secret +10 +6 +5 -2 -20 -4 Access Secret +10 +6 +5 -2 -20 -4
    Can be formulated as a two player finite state game. [Hélouët et al.'18]

    - An intruder takes a large number of normal actions and then does somethings bad. ✕
    - With weak upper bound, this is not possible. ✔

    8

    Infinite vs Reachability in Weak Dual Bound Games


    L = 0, W = 4 energy level = 0 Infinite Setting L = 0, W = 4 energy level=1 Infinite Setting L = 0, W = 4 energy level = 0 Infinite Setting L = 0, W = 4 energy level=4 Infinite Setting L = 0, W = 4 energy level=2 Infinite Setting L = 0, W = 4 energy level=2 Infinite Setting L = 0, W = 4 energy level=2 Infinite Setting Conceptually easy: find a cycle that can be iterated once with a positive effect; memoryless strategy for both players L = 0, W = 4 energy level=2 Reachability Setting L = 0, W = 4 energy level=2 Reachability Setting L = 0, W = 4 energy level = 0 Reachability Setting L = 0, W = 4 energy level=4 Reachability Setting L = 0, W = 4 energy level=3 Reachability Setting L = 0, W = 4 energy level=3 Reachability Setting L = 0, W = 4 energy level = 0 Reachability Setting L = 0, W = 4 energy level=4 Reachability Setting L = 0, W = 4 energy level=4 Reachability Setting L = 0, W = 4 energy level=4 Reachability Setting L = 0, W = 4 energy level = 0 Reachability Setting 😃 !! L = 0, W = 4 energy level = 0 Reachability Setting 😃 !! Keep track of the exact energy level; exponential memory for P1.

    9

    Memory for $P_2$

    λ λ >5 >7 λ >5 >7 >max{5-(-1),7-2} λ >5 >7 >6 λ=winning >5 >7 >6 λ=winning >5 >7 <6 λ=winning >5 >7 <6 λ=winning >5 >7 <6 P2 has memoryless winning strategy!!

    10

    1 Player Game


    • Consider a winning strategy $\sigma$ of $P_1$.

    • Any outcome of $\sigma$ will not have any zero cycle or negative cycle.

    • Now, $P_1$ has two options:
      - win in an acyclic path
      - choose a positive cycle; iterate enough to increase energy; continue

    11

    Technical Observations


    • Same cycle can be positive or negative cycle depending on the initial energy level.
    $W = 4$.
    • $x=4 \Rightarrow$ a negative cycle
    • $x=1 \Rightarrow$ a positive cycle

    • A feasible positive cycle can be iterated and the output energy stabilizes.
      Why?

    12

    Positive Cycle



    - Reach $s_1$ with energy $\geq 5 (L+a)$ and reach energy level $14 [W-m]$
    - #Iterations can be bounded by W-L.

    13

    Winning Path

    energy level = 0 q1 0 energy level=6 q1 0 q2 6 energy level = 0 q1 0 q2 6 q1 2 q3 1 q4 4 q3 2 q5 4 q3 3 q4 6 q3 4 q1 6 T 0 energy level = 0 q1 0 q2 6 q1 2 q3 1 q4 4 q3 2 q5 4 q3 3 q4 6 q3 4 q1 6 T 0 ► Ignore cycles of size |Q|, check smaller ones energy level = 0 q1 0 q2 6 q1 2 q3 1 q4 4 q3 2 q5 4 q3 3 q4 6 q3 4 q1 6 T 0 ► Ignore cycles of size |Q|, check smaller ones energy level = 0 q1 0 q2 6 q1 2 q3 1 q4 4 q3 2 q5 4 q3 3 q4 6 q3 4 q1 6 T 0 ► Ignore cycles of size |Q|, check smaller ones energy level = 0 q1 0 q2 6 q1 2 q3 1 q4 4 q3 2 q5 4 q3 3 q4 6 q3 4 q1 6 T 0 ► Ignore cycles of size |Q|, check smaller ones ► Ignore same cycles, iterate enough at the first occurrence energy level=2 q1 0 q2 6 q1 2 q3 1 q4 4 q3 2 q5 4 q3 3 q4 6 q3 4 q1 6 T 0 ► Ignore cycles of size |Q|, check smaller ones ► Ignore same cycles, iterate enough at the first occurrence energy level=4 (rotating the cycle 2 times) q1 0 q2 6 q1 2 q3 1 q4 4 q3 2 q5 4 q3 3 q4 6 q3 4 q1 6 T 0 ► Ignore cycles of size |Q|, check smaller ones ► Ignore same cycles, iterate enough at the first occurrence energy level=4 (rotating the cycle 2 times) q1 0 q2 6 q1 2 q3 1 q4 4 q3 2 q5 4 q3 3 q4 6 q3 4 q1 6 T 0 ► Ignore cycles of size |Q|, check smaller ones ► Ignore same cycles, iterate enough at the first occurrence ► Winning Path: α1 · ϕ1n ··· αk · ϕkn, n =W-L

    14

    Universal Cycle


    Universal cycle on $q$: a cycle that can be taken with initial energy $L$. L = 0, W = 4 L = 0, W = 4 L = 0, W = 4 abcde ✘ universal L = 0, W = 4 abcde ✘ universal bcdea ✔ universal (Output= 2) L = 0, W = 4 abcde ✘ universal bcdea ✔ universal (Output= 2) L = 0, W = 4 abcde ✘ universal bcdea ✔ universal (Output= 2) bde ✔ universal (Output= 1) L = 0, W = 4 abcde ✘ universal bcdea ✔ universal (Output= 2) bde ✔ universal (Output= 1) bcdea ▻ bde L = 0, W = 4 abcde ✘ universal bcdea ✔ universal (Output= 2) bde ✔ universal (Output= 1) bcdea ▻ bde Find Optimal Universal Cycles!

    15

    NP algorithm for 1 Player Game


    • For every cycle, there is a universal cycle.

    • Winning paths: $\beta_1 \cdot {\tau_1}^{W-L} \cdot\beta_2\cdot{\tau_2}^{W-L} \cdots \beta_k \cdot {\tau_k}^{W-L}$ where, $\tau_j$'s are universal.

    • Use optimal universal cycles: $k<|Q| \Rightarrow$ NP Algorithm!!

    • Cycles(EXPTIME) $\Rightarrow$ Universal Cycles(EXPTIME) $\Rightarrow$ Optimal UC(NP) $\Rightarrow$ P ?

    • Idea for P:
      • Find Optimal UC in P
      • Find the winning path in P

    16

    Road to P: DAG-construction


    energy level = 0 q3 q4 q5 q1 T q3 q3 q2 q3 q1 T q3 energy level = 0 M=maximal energy seen m=M-current energy q3 q4 q5 q1 T q3 q3 q2 q3 q1 T q3 energy level=3 M=maximal energy seen m=M-current energy q3 q4 q5 q1 T q3 q3 q2 q3 q1 T q3 (0,0) (3,0) energy level=1 M=maximal energy seen m=M-current energy q3 q4 q5 q1 T q3 q3 q2 q3 q1 T q3 (0,0) (3,0) (3,2) energy level=2 M=maximal energy seen m=M-current energy q3 q4 q5 q1 T q3 q3 q2 q3 q1 T q3 (0,0) (3,0) (3,2) (2,0) (2,1) (2,0) (2,1) (6,0) (6,4) (6,5) M=maximal energy seen m=M-current energy q3 q4 q5 q1 T q3 q3 q2 q3 q1 T q3 (0,0) (3,0) (3,2) (2,0) (2,1) (2,0) (2,1) (6,0) (6,4) (6,5) ► If a cycle ends with (M,m), it finally outputs W-m q3q1q3is one of the optimalUC ! q3 q4 q5 q1 T q3 q3 q2 q3 q1 T q3 (0,0) (3,0) (3,2) (2,0) (2,1) (2,0) (2,1) (6,0) (6,4) (6,5) ► If a cycle ends with (M,m), it finally outputs W-m q3q1q3is one of the optimalUC ! But still Exponential!! q3 q4 q5 q1 T q3 q3 q2 q3 q1 T q3 (0,0) (3,0) (3,2) (2,0) (2,1) (2,0) (2,1) (6,0) (6,4) (6,5) ► If a cycle ends with (M,m), it finally outputs W-m q3q1q3is one of the optimalUC ! $(M,m) \prec (M',m'):$ $M − m \leq M' − m'$ and $m' \leq m$. q3 q4 q5 q1 T q3 q3 q2 q3 q1 T q3 (0,0) (3,0) (3,2) (2,0) (2,1) (2,0) (2,1) (6,0) (6,4) (6,5) ► If a cycle ends with (M,m), it finally outputs W-m q3q1q3is one of the optimalUC ! $(M,m) \prec (M',m'):$ $M − m \leq M' − m'$ and $m' \leq m$. $\rho \leq \rho' \Rightarrow$ we store only $\rho'$ q3 q4 q5 q1 T q3 q3 q2 q3 q1 T q3 (0,0) (3,0) (3,2) (2,0) (2,1) (2,0) (2,1) (6,0) (6,4) (6,5) ► If a cycle ends with (M,m), it finally outputs W-m q3q1q3is one of the optimalUC ! $(M,m) \prec (M',m'):$ $M − m \leq M' − m'$ and $m' \leq m$. $\rho \leq \rho' \Rightarrow$ we store only $\rho'$ q3 q4 q5 q1 T q3 q3 q2 q3 q1 T q3 (0,0) (3,0) (3,2) (2,0) (2,1) (2,0) (2,1) (6,0) (6,4) (6,5) ► If a cycle ends with (M,m), it finally outputs W-m ► Only store maximal labels: PTIME!!

    17

    Polynomial Algorithm


    • Compute $m_q$ for each state q if it has an optimal universal cycle

    • Construct $G'$ adding transition $\upsilon_q: q \xrightarrow{:=W-m_q} q$.

    • Winning paths: $\beta_1 \cdot \upsilon_1 \cdots \beta_k\cdot \upsilon_k$

    • path size $ \leq (|Q+1|)^2$:
      PTIME!

    Corollary: Two player LW-reachability is in coNP.

    18

    Conclusion


    19