Socket, coroutine and I/O-multiplexing

16 Jul, 2021

This post summarizes sockets, coroutines, and I/O multiplexing. It is not very detailed: just the key points and a few things that confused me for a long time.

Socket

What is a socket?

A socket is one endpoint of a two-way communication link between two programs. It is bound to a network port and is identified by an IP address together with a port number.

More loosely, "socket" also refers to socket programming, the API used for the two-way communication described above.

On the server side, communication (reading data from and writing data to a client) takes six steps: socket(), bind(), listen(), accept(), read()/write(), and close().

Some keys:

Why does the server side need two (or more) socket instances?
Treat a socket as a file that can be read and written.
Remember to close it.

For the first point, there are two explanations that I find most helpful.

Think of the socket returned by socket() as a guard: it listens on the port and lets client connections in (that is what listen() and accept() do). Each admitted connection gets its own socket from accept().
A TCP connection is identified by four values: source IP, source port, destination IP, and destination port. The socket from socket() is bound to only two of them (the local IP and port), while the socket returned by accept() is bound to all four and represents a complete connection.
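The six server steps and the two kinds of socket can be seen side by side in a small sketch. This is only an illustration, written in Python rather than with the raw system calls; the names `serve_once` and `listener_vs_connection_demo` are made up for this example, and the loopback address with port 0 (any free port) is an assumption chosen so that it runs anywhere.

```python
import socket
import threading


def serve_once(listener: socket.socket) -> dict:
    """Accept one client, echo one message, and report both sockets' addresses."""
    conn, _peer = listener.accept()      # accept(): a new socket for this one connection
    with conn:                           # close() happens when the with-block exits
        data = conn.recv(1024)           # read()
        conn.sendall(data)               # write()
        return {
            "listener_local": listener.getsockname(),  # only local IP + port
            "conn_local": conn.getsockname(),          # local IP + port ...
            "conn_peer": conn.getpeername(),           # ... plus remote IP + port
        }


def listener_vs_connection_demo() -> tuple:
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # socket()
    listener.bind(("127.0.0.1", 0))                               # bind(); port 0 = any free port
    listener.listen()                                             # listen()

    result = {}
    server = threading.Thread(target=lambda: result.update(serve_once(listener)))
    server.start()

    # A throwaway client so that accept() has something to return.
    with socket.create_connection(listener.getsockname()) as client:
        client.sendall(b"ping")
        reply = client.recv(1024)
        client_addr = client.getsockname()

    server.join()
    listener.close()                                              # close() the guard socket too
    return reply, result, client_addr
```

The listening socket and the accepted socket share the same local address, but only the accepted one has a peer address, which matches the "two values versus four values" explanation.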

On the client side, there are four steps: socket(), connect(), read()/write(), and close().

Note that in most cases you don't need to specify the client socket's port. The kernel picks one for you automatically.
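A minimal sketch of the four client steps, again in Python. `start_echo_server` is a hypothetical helper that exists only so the client has something to talk to; `client_roundtrip` is likewise a name invented here. The client never calls bind(), so the local port it reports is whatever the kernel assigned.

```python
import socket
import threading


def start_echo_server():
    """Hypothetical helper: a one-shot echo server on a free loopback port."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen()
    address = listener.getsockname()

    def serve():
        conn, _ = listener.accept()
        with conn:
            conn.sendall(conn.recv(1024))
        listener.close()

    thread = threading.Thread(target=serve, daemon=True)
    thread.start()
    return address, thread


def client_roundtrip(server_addr, payload: bytes) -> tuple:
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # socket()
    try:
        client.connect(server_addr)                # connect(); no bind() before it
        _local_ip, local_port = client.getsockname()  # port chosen by the kernel
        client.sendall(payload)                    # write()
        reply = client.recv(1024)                  # read()
    finally:
        client.close()                             # close()
    return local_port, reply
```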

Coroutine

A thread is the smallest unit of scheduling in the operating system. In other words, the OS kernel manages threads and uses its scheduler to allocate CPU time slices and other resources to each of them.

Switching between threads happens in kernel space (even when the thread's code runs in user space), so each switch has a cost.

So what is a coroutine?

A coroutine is a programming construct: a function that can suspend its execution at some point and resume from that point later.

In other words, coroutines let a program change its default execution order. That sounds like multi-processing, multi-threading, multitasking, and similar terms, but the important difference is that a coroutine runs in user space: it is implemented by user code (a framework or a language built-in), not by the OS kernel.
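Suspend and resume are easy to see with Python generators, which are one built-in way to get coroutine-like behaviour. The sketch below is only an illustration: `ticker`, `round_robin`, and `coroutine_demo` are names invented for it, and the "scheduler" is a plain loop in user code, with no kernel threads involved.

```python
def ticker(name, count, log):
    """A coroutine-like generator: does one step of work, then suspends."""
    for i in range(count):
        log.append(f"{name}{i}")
        yield                  # suspend here; the local variable i is preserved


def round_robin(coroutines):
    """A tiny user-space scheduler: resume each coroutine in turn until all finish."""
    queue = list(coroutines)
    while queue:
        co = queue.pop(0)
        try:
            next(co)           # resume until the next yield
        except StopIteration:
            continue           # this coroutine is done; drop it
        queue.append(co)       # not done yet; schedule it again


def coroutine_demo():
    log = []
    round_robin([ticker("a", 2, log), ticker("b", 3, log)])
    return log
```

Calling `coroutine_demo()` returns `['a0', 'b0', 'a1', 'b1', 'b2']`: the two functions interleave even though only one thread is running.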

I/O-multiplexing

A key problem in computer science is how to speed up I/O and reduce the time spent waiting (or rather, appear to reduce it: the wait itself does not go away).

Blocking I/O

read();

In code like this, waiting for data blocks the thread: it is suspended and placed in a wait queue, and it can't do anything else until the I/O finishes or an error occurs.
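To make the blocking visible, here is a small sketch in Python. It assumes a local `socket.socketpair()` as a stand-in for a real connection, and `blocking_read_demo` and the 0.2 second delay are arbitrary choices for the illustration. The reading thread sits inside `recv` until another thread finally sends something.

```python
import socket
import threading
import time


def blocking_read_demo(delay=0.2):
    writer, reader = socket.socketpair()   # two connected sockets in one process

    # Another thread sends the data only after `delay` seconds.
    timer = threading.Timer(delay, writer.sendall, args=(b"late",))
    timer.start()

    start = time.monotonic()
    data = reader.recv(1024)               # blocks: this thread does nothing until data arrives
    waited = time.monotonic() - start

    timer.join()
    writer.close()
    reader.close()
    return data, waited
```

The returned `waited` value is roughly the delay: that is time the calling thread spent suspended.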

Non-Blocking I/O

while(!read()){
    wait();
}

A simple approach is polling, as in the code above: keep checking whether data is ready to read. But polling a single file descriptor per thread is inefficient.
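A runnable version of the polling idea might look like the sketch below, in Python. It assumes a local `socket.socketpair()` in place of a real connection; `try_read` and `polling_demo` are names made up for the example. With the socket in non-blocking mode, a read with nothing available returns immediately with an error instead of suspending the thread.

```python
import socket


def try_read(sock):
    """Return the data if some is ready, or None if the read would have blocked."""
    try:
        return sock.recv(1024)
    except BlockingIOError:
        return None


def polling_demo():
    writer, reader = socket.socketpair()
    reader.setblocking(False)          # reads now return at once instead of blocking

    first = try_read(reader)           # nothing has been sent yet

    writer.sendall(b"hello")
    second = None
    while second is None:              # busy polling: keep asking until data shows up
        second = try_read(reader)

    writer.close()
    reader.close()
    return first, second
```

The loop never blocks, but it burns CPU while it waits, which is the inefficiency described above.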

while(!read(fd1)){} // thread1
while(!read(fd2)){} // thread2
while(!read(fd3)){} // thread3

while(!select(fd1, fd2, fd3, ...)){} // thread4
while(!epoll(fd1, fd2, fd3, ...)){}  // thread5

The OS kernel provides more efficient mechanisms for watching many file descriptors at once, such as select and epoll.
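As a rough illustration of one thread watching several descriptors, the sketch below uses Python's `select.select`, which wraps the select system call. The three socket pairs, `wait_readable`, and `select_demo` are assumptions made for the example, not part of any real server.

```python
import select
import socket


def wait_readable(socks, timeout=1.0):
    """Ask the kernel which of the given sockets can be read without blocking."""
    readable, _writable, _errors = select.select(socks, [], [], timeout)
    return readable


def select_demo():
    pairs = [socket.socketpair() for _ in range(3)]
    readers = [reader for _writer, reader in pairs]

    pairs[1][0].sendall(b"x")          # only the second connection gets data

    ready = wait_readable(readers)     # one call covers all three descriptors
    ready_indexes = [readers.index(sock) for sock in ready]
    message = ready[0].recv(1024)

    for writer, reader in pairs:
        writer.close()
        reader.close()
    return ready_indexes, message
```

One call reports that only the second socket is ready, so a single thread can serve all three without polling each one in a loop.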

More details about select/epoll will be written in another post.

Note that everything mentioned above is still synchronous I/O.

I/O-multiplexing and coroutine

while(1){
    nfds = epoll_wait(...);  // number of ready fds
    for(fd in ready_fds){    // the nfds entries that epoll_wait filled in
        // check fd's state (readable, writable) and handle it
    }
}

Handling one socket request usually takes several reads and writes before all the data is transferred. Between them, the handler's context has to be saved and restored, which is important but troublesome to do by hand.

A coroutine does exactly this efficiently (the context switch happens in user space), and we don't have to care about the implementation details.
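The combination can be sketched in Python with generators and the standard `selectors` module (which uses epoll on Linux). This is a toy illustration under several assumptions, not how fengyoulin/ef or any real framework is implemented: `read_exactly`, `run`, and `multiplex_demo` are invented names, the "connections" are local socket pairs, and a helper thread plays the role of the remote peers.

```python
import selectors
import socket
import threading
import time


def read_exactly(sock, n):
    """Coroutine: collect n bytes, suspending whenever it has to wait for data."""
    buf = b""
    while len(buf) < n:
        yield sock                         # suspend; buf and n live on in the generator
        chunk = sock.recv(n - len(buf))    # the loop resumed us, so sock is readable
        if not chunk:
            break                          # peer closed early
        buf += chunk
    return buf


def run(tasks):
    """Event loop: resume each coroutine only when its socket is readable."""
    sel = selectors.DefaultSelector()
    results = {}

    def step(name, co):
        try:
            sock = next(co)                # run until the coroutine needs to wait
        except StopIteration as stop:
            results[name] = stop.value     # coroutine finished
        else:
            sel.register(sock, selectors.EVENT_READ, (name, co))

    for name, co in tasks.items():
        step(name, co)
    while sel.get_map():                   # some coroutine is still waiting
        for key, _events in sel.select():
            sel.unregister(key.fileobj)
            step(*key.data)
    sel.close()
    return results


def multiplex_demo():
    w1, r1 = socket.socketpair()
    w2, r2 = socket.socketpair()

    def sender():                          # stands in for two remote peers
        w1.sendall(b"he")
        w2.sendall(b"wo")
        time.sleep(0.05)
        w1.sendall(b"llo")
        w2.sendall(b"rld")

    thread = threading.Thread(target=sender)
    thread.start()
    results = run({"first": read_exactly(r1, 5), "second": read_exactly(r2, 5)})
    thread.join()
    for sock in (w1, r1, w2, r2):
        sock.close()
    return results
```

Each `read_exactly` reads like straight-line blocking code, yet the partial buffer survives across several wake-ups without any hand-written state saving. That is the context the paragraph above is talking about.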

See the fengyoulin/ef repository if you have time.

References

books/blogs

TCP/IP网络编程 (TCP/IP Network Programming)
