DVFS Memory Model and Algorithm
SCP (single DVS core processor) and a single memory
Memory-DVS-System-2004-Embedded
Ref: R. Jejurikar, R. Gupta, Dynamic voltage scaling for system-wide energy minimization in real-time embedded systems, in: Proceedings of the International Symposium on Low Power Electronics and Design (ISLPED), 2004, pp. 78–81.
System Model
-
A task set of n periodic real time tasks Γ={τ1,...τn}
-
For each task τi,
- The period of the task Ti
- The relative deadline of the task Di
-
Ti=Di
- The worst case execution cycles of the task at the maximum processor speed Ci
-
All tasks are scheduled by the Earliest Deadline First (EDF) scheduling policy.
-
All tasks are assumed to be independent and preemptive.
-
A task is procrastinated (or delayed) if the processor remains idle despite the presence of the task in the processor ready queue.
-
Processors support variable voltage and frequency levels for energy efficient operation of the system.
-
The available frequencies be {f1 , ...,fn}, in increasing order of frequency
-
The corresponding voltage levels be {v1, ...,vs}
-
A slowdown factor ηi
- the normalized operating frequency
- the ratio of the current frequency to the maximum frequency, fs, of the processor.
-
ηi∈(fsf1,fsf2,...,fsfn)⊑[0,1]
- Task level slowdown as opposed to intra-task slowdown.
- Explicitly, ηik when task τi is assigned a slowdown factor fsfk
-
Power consumption P(CPU,η)
-
The processor is said to be idle if it is not executing a task.
- Support shutdown to reduce the leakage power consumption.
- In the idle state, the processor could be in the shutdown state (no leakage) or in the standby state (active + idle) where leakage power is dissipated.
-
A set of m resources (peripheral devices) R=R1,....Rm.
- Each task τi uses a subset of the resources in R, denoted as Rτi
- Standby state if it is on (active) but idle.
- Power consumption P(Ri), where shutdown power of the resource = 0.
- Assume that the device Rj standby time for each task τi is expressed in number of processor cycles CiRj. ( especially for compiler directed DPM policies).
- If a task doesn't use Rj, then CiRj=0
Goal
Consider task slowdown factors that minimize the total system energy consumption including the resource standby energy consumption.
Not DPM policies, but consider standby energy in computing static slowdown factors.
Energy consumption considering standby resources
Task τi is executed at a speed η
Ei(η)=ηCiP(CPU,η)+Rj∈Rτi∑ηCiRjP(Rj)
Optimization problem
minimizei∑TiEi(η)
Subject to: [Feasibility]
i∑ηiCi≤i∑Ti
The slowdown factor for a task that minimizes its total energy consumption, called the critical speed for the task.
Heuristic Algorithm
We compute the energy consumption of each task at all possible discrete slowdown factors.
- compute the critical speed for each task.
- increase the task slowdown factors if the task set is not feasible.
A heuristic to select a task whose speed is increased. We increase the slowdown of a task that results in the
minimum energy increase per unit time.
The current slowdown factor of a candidate task τi is ηik; the next higher factor is ηik+1.
ΔEi=Ei(ηik+1)−Ei(ηik)
Speedup
Δti=Ci(ηik1−ηik+11)
Find the candidate task with the minimum value of ΔtiΔEi
Procrastination-DVS-System-2004-Embedded
---- Appendix -----
Maximizing the time period of sleep mode (which can be seen as a low power state, e.g., the power-down state and low refresh rate, when the memory is not accessed)
- Nevertheless, we should note that most previous research works either consider optimizing a single power consumption (i.e., multi-core processor power or memory power) or focus on the problem with the system provisioning single-core processor and memory.
- They first showed that the problem is NP-hard if the number of cores is fewer than the number of tasks (and the number of cores is at least 2). Thus, to avoid facing the NP-hardness for the feasibility of the problem, they then turned their attention to the case when the number of cores is unbounded where several optimal solutions are proposed regarding different task models.
[19] J. Zhuo, C. Chakrabarti, System-level energy-efficient dynamic task scheduling, in: Proceedings of the Annual Design Automation Conference (DAC),2005, pp. 628–631.
[20] X. Zhong, C.-Z. Xu, System-wide energy minimization for real-time tasks: lower bound and approximation, ACM Trans. Embed. Comput. Syst. 7 (3) (2008) 28:1–28:24.
[21] X. Zhong, C.-Z. Xu, Frequency-aware energy optimization for real-time periodic and aperiodic tasks, in: Proceedings of the ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES), 2007, pp. 21–30.
SMP and the shared main memory
Sleep and DVS-aware system-wide Energy Minimization (SDEM) problem
A cluster with multiple processor cores, which can share the heavy workload to speed up the task executions. In such architecture, the main memory is usually shared among multiple processor cores.
SMP-AM-2017-Energy-Idle
Ref: Race to idle or not: balancing the memory sleep time with DVS for energy minimization
Model
-
Sleep time of the shared memory is determined by the common idle time of all the cores.
-
The static power for the shared main memory is denoted by αm, due to the leakage current.
-
Break-even time ξ
- Models transition energy overhead
- the length of the time interval where the memory working in the idle but active mode consumes the same energy as the transition overhead
-
Each core has disjont memory area.
-
The number of cores C, the number of tasks n
-
Bounded number of cores when C<n
- NP-hard (reduced from 3-Partition and Multi-Partition)
-
When the number of cores is unbounded C≥n, where cores have negligible and non-negligible static power (two cases)
- for tasks with common release time
- optimal schemes are proposed for two cases respectively O(nlogn),O(n2)
-
for agreeable deadline tasks (where later release time implies a later deadline)
- two DP-based (Dynamic Programming) optimal solutions are developed respectively. O(n4+n2),O(n5+n2)
-
The general task model
- An online heuristic algorithm is proposed.
- When a new task Ti arrives, run the algorithm, and set all unfinished tasks’ release times the same as that of Ti . By applying the scheme presented in Sect. 4.1 (Sect. 4.2), the local optimal solution can be obtained for the current tasks. Keep the memory in the sleep state until a new task arrives or some task has to be executed to guarantee the local optimality.
-
The energy overhead caused by mode transitions between active and sleep modes of the memory and the cores are further considered. With the transition overhead, the corresponding schemes over different models are presented.
When the number of tasks is more than the number of cores (NP-hard)
SMP-A-2021-Energy-SharedMemory
Ref: Minimizing energy on homogeneous processors with shared memory
- When there is a single-core, we present the optimal algorithm ( a polynomial-time approximation algorithm with guaranteed performance ).
- Extend the intuition to the multi-core case, where the problem is tackled in two steps:
- Present an optimal polynomial-time algorithm when the assignment of tasks to cores is given;
- Propose an algorithm to assign tasks to cores, and show that it is a constant approximation algorithm.
SCP-AM-2018-China_Wu-cache
W. Wu, M. Li, K. Wang, H. Huang, E. Chen, Speed scaling problems with memory/cache consideration, J. Sched. 21 (6) (2018) 633–646.
- For the uniform with-cache model with agreeable jobs (later-released jobs do not have earlier deadlines), O(n4) and the approximation algorithm.