English Abstract
With the emergence of web-based services, data centers receive huge volumes of network requests every day. Each request is initiated by an end-user and is then fanned out into hundreds of sub-requests, each destined for a specific server for processing. The aggregation of the sub-request responses forms the final response to the user. As the incoming request load on each server in a data center grows, resource utilization increases, but so does the possibility of long response delays due to queuing. Moreover, increasing the number of physical servers raises further challenges, such as power consumption and unmanaged power-saving features (e.g., processor idle-states). In this thesis, we first introduce the possible sources of delay injection in latency-sensitive network requests. Then, moving into the operating-system domain, we explore the challenges of governing processor idle-states and show their considerable impact on the occurrence of microsecond-scale latencies. We then propose an idle-state governing mechanism that uses online machine learning to provide an efficient trade-off between power consumption and request latency. The evaluation results show that the proposed mechanism achieves up to a 40% reduction in 99th-percentile tail latency and up to a 30% improvement in the server's power consumption. Finally, we demonstrate that achieving performance improvements in user space is trickier due to additional overheads. To this end, we explore the use of user-level threading runtimes as the backend of network applications and present our findings.