What is the C10K problem?
The C10K problem, also known as the ten thousand simultaneous connections problem, takes its name from a numeronym expressing the limit that most servers face on concurrent network connections. The limit comes from the observation that, across a wide range of hardware and software configurations, large servers at the time the problem was identified did not seem capable of supporting more than ten thousand simultaneous connections. It is partly due to constraints in operating systems and in the design of the client-server applications that run on them.
In the specific context of web servers, several solutions to the C10K problem have been proposed since it was identified.
Asynchronous message passing: Erlang
These solutions include web servers implemented in Erlang, Ericsson's language dedicated to the development of concurrent and real-time applications. The language comes with a distribution called Open Telecom Platform, a set of libraries offering advanced facilities for distributed processing and node supervision, as well as a distributed database. Erlang can interface with Java and C/C++.
Richard Jones explains on his blog how to develop a Comet application, based on Mochiweb, that can handle 1 million simultaneous connections. The Comet web application model lets the web server push data to the browser. In this approach, each Mochiweb connection is registered with a router, which pushes messages to every user connected at that moment. Measurements of this Mochiweb-based application show that with 10,000 users connected simultaneously for 24 hours and a transmission rate of 1,000 messages per second (that is, each user receiving 1 message every 10 seconds), memory consumption is 80 MB, or 8 KB per user.
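The router idea can be illustrated with a minimal sketch. It is written in Python with asyncio rather than Erlang, and it is not Richard Jones's code: the names Router, register, push, connection and demo are hypothetical, and each long-lived Comet connection is simulated by a coroutine waiting on a queue.

```python
import asyncio
from collections import defaultdict


class Router:
    """Hypothetical router: maps a user id to the queues of its open connections."""

    def __init__(self):
        self._connections = defaultdict(set)

    def register(self, user_id):
        """Called when a connection opens; returns the queue it will read from."""
        inbox = asyncio.Queue()
        self._connections[user_id].add(inbox)
        return inbox

    def unregister(self, user_id, inbox):
        """Called when a connection closes."""
        self._connections[user_id].discard(inbox)
        if not self._connections[user_id]:
            del self._connections[user_id]

    def push(self, user_id, message):
        """Deliver a message to every open connection of the user."""
        inboxes = self._connections.get(user_id, ())
        for inbox in inboxes:
            inbox.put_nowait(message)
        return len(inboxes)


async def connection(router, user_id, expected):
    """Stands in for one long-lived connection waiting for pushed messages."""
    inbox = router.register(user_id)
    try:
        return [await inbox.get() for _ in range(expected)]
    finally:
        router.unregister(user_id, inbox)


async def demo(n_users=10_000):
    router = Router()
    tasks = [asyncio.create_task(connection(router, uid, 1)) for uid in range(n_users)]
    await asyncio.sleep(0)  # yield once so that every connection registers
    delivered = sum(router.push(uid, f"hello {uid}") for uid in range(n_users))
    received = await asyncio.gather(*tasks)
    return delivered, received


if __name__ == "__main__":
    delivered, received = asyncio.run(demo())
    print(delivered, received[0])
```

Each waiting connection costs only a suspended coroutine and a queue, not an operating system thread, which is the property that keeps per-user memory small in this style of server.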
Implementation of Yaws
Yaws is a high-performance HTTP server particularly suited to dynamic content in web applications. Comparative measurements made by Ali Ghodsi show that Apache collapses completely at 4,000 simultaneous requests, while Yaws continues to operate at 90,000 requests. The comparison covers a Yaws server on an NFS file system, an Apache server also on NFS, and an Apache server on a local file system. Each server receives a 20 kbit request in a loop: as soon as it responds, a new request arrives.
SEDA (staged event-driven architecture)
SEDA is the subject of Matt Welsh's thesis, which presents a new approach to building a robust Internet service platform able to support massive concurrency and more than 10,000 simultaneous connections. The philosophy of SEDA is to break a complex event-driven application down into a set of stages connected by queues. Complex processing is thus broken down into a series of operations, and these operations are in turn broken down into sets of messages to be processed. The messages are then routed to one or more "workers". All of these stages and the dispatching between them are managed by a framework, so the developer only has to implement the business code and can leave the technical complexity to the framework. This design avoids the overhead of the one-thread-per-request model (thread-based concurrency) and decouples event and thread scheduling from application logic.
By performing admission control at each event queue, the service can be well conditioned to load, preventing resources from being overcommitted when demand exceeds the service's capacity. The staged event-driven architecture uses dynamic control to adapt execution parameters automatically (such as the scheduling parameters of each stage), for example to manage load. Breaking services down into a series of stages also encourages code reuse and modularity. A benchmark has been presented showing that Haboob (the SEDA web server) is more efficient than Apache.
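The combination of stages, queues and admission control can be sketched as follows. This is a simplified illustration in Python, not the SEDA framework or Haboob: the Stage class, the run_pipeline function and the three toy handlers are assumptions made for the example, and the dynamic tuning of stage parameters described above is left out.

```python
import queue
import threading


class Stage:
    """One stage: a bounded event queue drained by its own small thread pool."""

    def __init__(self, handler, next_stage=None, capacity=100, workers=2):
        self.handler = handler
        self.next_stage = next_stage
        self.events = queue.Queue(maxsize=capacity)
        self.rejected = 0
        self._lock = threading.Lock()
        for _ in range(workers):
            threading.Thread(target=self._run, daemon=True).start()

    def submit(self, event):
        """Admission control: reject the event when the queue is full."""
        try:
            self.events.put_nowait(event)
            return True
        except queue.Full:
            with self._lock:
                self.rejected += 1
            return False

    def _run(self):
        while True:
            event = self.events.get()
            try:
                result = self.handler(event)
                if self.next_stage is not None:
                    # Blocking put: a full downstream queue slows this stage down.
                    self.next_stage.events.put(result)
            finally:
                self.events.task_done()


def run_pipeline(requests):
    """Push requests through parse -> render -> respond and collect the output."""
    responses = []
    respond = Stage(responses.append)
    render = Stage(lambda name: f"<h1>{name}</h1>", next_stage=respond)
    parse = Stage(str.strip, next_stage=render)
    accepted = sum(parse.submit(request) for request in requests)
    for stage in (parse, render, respond):
        stage.events.join()  # wait until this stage has drained its queue
    return accepted, sorted(responses)


if __name__ == "__main__":
    print(run_pipeline(["  alice ", "bob\n"]))
```

The number of threads is fixed per stage and independent of the number of requests in flight. When the first queue is full, submit returns False, so the caller can shed load or send an error response instead of letting the backlog grow without bound.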
Web infrastructure architecture
Implementation of web servers
Several lightweight open-source web servers have been developed to address the problem of 10,000 simultaneous connections. These web servers have an asynchronous, non-blocking architecture:
Nginx (pronounced "engine x") was developed by Igor Sysoev in 2002 for the needs of a very high-traffic Russian site (rambler.ru). It became known outside Russia only from 2006, after being translated from Russian by Aleksandar Lazic. Its performance in terms of response time, stability and memory consumption quickly earned it a reputation, and its use on the web has continued to grow since then.
Lighttpd (pronounced "lighty"), developed in 2003 by the German student Jan Kneschke, is one of the servers with the smallest memory footprint and the lowest CPU usage, while being very fast at serving both static and dynamic documents. The FastCGI module is included by default, which makes it attractive for languages such as PHP, Python or Ruby. Lighttpd uses a single thread to handle non-blocking I/O. It is used by large sites such as YouTube, SourceForge.net and Wikipedia's image server. It ranks 7th among the most used web servers according to the Netcraft study, and it is part of the software supplied with the Fedora distribution.
Cherokee was developed in 2001 by Alvaro López Ortega. It is now developed and maintained by a community of contributors. Cherokee is very fast, flexible and quick to configure thanks to its graphical administration interface.
Tornado is the open-source version of the web server behind the FriendFeed application, which was acquired by Facebook in 2009. Its asynchronous, non-blocking architecture and its use of epoll allow it to exceed 10,000 simultaneous connections and to keep user connections open for a long time.
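The asynchronous, non-blocking architecture shared by the servers above can be sketched with Python's standard selectors module, which uses epoll on Linux. This is a minimal echo server for illustration only, not the code of Tornado or of any server listed here; the names EchoServer, poll and demo are hypothetical.

```python
import selectors
import socket


class EchoServer:
    """Single-threaded echo server multiplexing all sockets through one selector."""

    def __init__(self, host="127.0.0.1", port=0):
        self.selector = selectors.DefaultSelector()  # epoll on Linux, kqueue on BSD
        self.listener = socket.socket()
        self.listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        self.listener.bind((host, port))  # port 0: let the OS choose a free port
        self.listener.listen(128)
        self.listener.setblocking(False)
        self.selector.register(self.listener, selectors.EVENT_READ, self._accept)
        self.address = self.listener.getsockname()

    def _accept(self, listener):
        conn, _ = listener.accept()
        conn.setblocking(False)
        self.selector.register(conn, selectors.EVENT_READ, self._echo)

    def _echo(self, conn):
        data = conn.recv(4096)
        if data:
            # Simplification: a real server would buffer unsent bytes and
            # wait for EVENT_WRITE instead of assuming the send completes.
            conn.send(data)
        else:  # the peer closed the connection
            self.selector.unregister(conn)
            conn.close()

    def poll(self, timeout=0.1):
        """Run one event-loop iteration; return the number of ready sockets."""
        events = self.selector.select(timeout)
        for key, _ in events:
            key.data(key.fileobj)  # key.data holds the callback registered above
        return len(events)

    def close(self):
        for key in list(self.selector.get_map().values()):
            key.fileobj.close()
        self.selector.close()


def demo(n_clients=50):
    """Open n_clients connections at once and echo one message on each."""
    server = EchoServer()
    clients = [socket.create_connection(server.address, timeout=5) for _ in range(n_clients)]
    for i, client in enumerate(clients):
        client.sendall(f"ping {i}".encode())
    while server.poll(timeout=0.2):  # loop until no socket is ready
        pass
    replies = [client.recv(4096).decode() for client in clients]
    for client in clients:
        client.close()
    server.close()
    return replies


if __name__ == "__main__":
    print(demo()[:3])
```

An idle connection costs only a registered file descriptor, and the single thread does work only for sockets that the kernel reports as ready. This is why connections can stay open for a long time without exhausting threads.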
The term C10K was coined in 1999 by Dan Kegel, citing the Simtel FTP host, cdrom.com, which was serving 10,000 clients at once over 1 gigabit per second Ethernet in that year. The term has since been used for the general issue of large numbers of clients, with similar numeronyms for larger numbers of connections, most recently "C10M" in the 2010s.
By the early 2010s, millions of connections on a single commodity 1U rackmount server had become possible: over 2 million connections (WhatsApp, 24 cores, using Erlang on FreeBSD) and 10–12 million connections (MigratoryData, 12 cores, using Java on Linux).
Common applications that involve very high numbers of connections include pub/sub servers, chat, file servers, web servers, and software-defined networking.